
I'm annoyed that the dataset's B and LSTAT variables have manually chosen hyperparameters baked into the formulas that create them.

It would at least be better if the dataset gave the raw features.



This. It's really encoding the authors' model, with some magic values in there. Not even ethics, just bad stats. The correct step would be to back out the raw proportion of the black population, rather than use the output of that quadratic model.

Edit: D'oh, 62% and 64% black neighborhoods get the same B, so the quadratic can't be uniquely inverted to recover the raw proportion. Couple glasses of wine in already, math is hard...
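
To make the point in the edit concrete, here is a rough Python sketch. It assumes the transform is B = 1000 * (Bk - 0.63)^2, where Bk is the black proportion of the population; the thread itself only says "quadratic model", so the constants and the function names below are assumptions for illustration, not something stated by the commenters.

```python
import math

# Assumed constants: the thread only mentions a "quadratic model".
CENTER = 0.63   # assumed vertex of the quadratic
SCALE = 1000.0  # assumed scaling factor


def b_from_proportion(bk: float) -> float:
    """Forward transform: raw proportion -> B (hypothetical helper)."""
    return SCALE * (bk - CENTER) ** 2


def candidate_proportions(b: float) -> list[float]:
    """Invert B back to every raw proportion in [0, 1] that could produce it."""
    offset = math.sqrt(b / SCALE)
    # A set collapses the two roots into one when offset is zero.
    candidates = {CENTER - offset, CENTER + offset}
    return sorted(p for p in candidates if 0.0 <= p <= 1.0)


if __name__ == "__main__":
    # 62% and 64% sit symmetrically around the vertex, so they collide.
    print(b_from_proportion(0.62), b_from_proportion(0.64))
    # Inverting that B yields two candidates, so the raw value is ambiguous.
    print(candidate_proportions(b_from_proportion(0.62)))
    # Far from the vertex, the second root falls outside [0, 1].
    print(candidate_proportions(b_from_proportion(0.10)))
```

Under those assumptions the inversion is only ambiguous when both roots land in [0, 1], i.e. for raw proportions above roughly 26%; below that, B maps back to a single value.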



