Friday, June 25, 2021

07.05

Week 7, Day 5

Greatly improved the code organization and added complete documentation to the closest worlds implementation. Wrote up Roger's test case for evaluating this implementation. Also found a transformation of Euclidean distance that yields similarity values in the range [0, 1] using exponentiation: 1/e^distance, so a distance of 0 maps to a similarity of 1 and larger distances decay toward 0. Since I noticed that the similarity scores get very small (<0.0001), I added a power-scaling mechanism that brings the lowest nonzero similarity score up to 0.01.
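To make the transformation and the scaling step concrete, here is a minimal sketch. It is not the code from the closest worlds module: the function names `similarity` and `power_scale` are hypothetical. It also assumes that "power scaling" means raising every score to one shared exponent, chosen so the lowest nonzero score lands exactly on 0.01, and that scaling is skipped when the lowest score is already at or above that floor.

```python
import math


def similarity(distance):
    """Map a non-negative Euclidean distance to a similarity: 1 / e^distance."""
    return math.exp(-distance)


def power_scale(scores, floor=0.01):
    """Raise all scores to one shared exponent so the lowest nonzero score equals `floor`.

    Assumption: scores lie in [0, 1]. Zeros stay 0 and ones stay 1.
    """
    nonzero = [s for s in scores if s > 0]
    if not nonzero:
        return list(scores)
    lowest = min(nonzero)
    if lowest >= floor:
        # Assumption: nothing to lift, so leave the scores untouched.
        return list(scores)
    # Solve lowest ** exponent == floor for the exponent (which lies between 0 and 1).
    exponent = math.log(floor) / math.log(lowest)
    return [s ** exponent for s in scores]


distances = [0.0, 5.0, 10.0]
raw = [similarity(d) for d in distances]
scaled = power_scale(raw)
```

Because the exponent is positive, the ordering of the scores is preserved: the closest world still has the highest similarity after scaling.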

I began looking for real-world datasets and found the sources listed below. Testing the counterfactual module with real-world data will be tricky because of the fundamental problem of causal inference (it's impossible to observe the causal effect on a single unit -- a given person either took the treatment or didn't, so the outcome under the other condition is never observed).