I remember being incredulous in graduate school, learning that very large Ns and data mining were things to be suspicious of. After all, my thinking went, how could more information and an automatic way to recognize connections one might otherwise miss be bad? It turns out that big data sets make it more likely that one will find statistical significance in the absence of a real relationship, and that data mining tends to turn up a lot of spurious, atheoretical correlations. Enter the Big Data and AI Neural Networks movements. They present roughly the same issues: Big Data leads to lots of connections with little explanation, and neural networks are, by their nature, hard to interpret.
It occurs to me, though, that traditional statistical regression analysis could be combined with Big Data and AI Neural Networks as a corrective. Why not start with a traditional statistical model rooted in theory and previous empirical findings and then have a neural network mine the error term? We could develop a set of meta-analyses that clearly state our priors. (This part eventually could even be done by bots.) Then the AIs could do a kind of Bayesian exploration. I'm out of my depth now, though.
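To make the idea concrete, here is a minimal sketch in Python (using scikit-learn and entirely synthetic data of my own invention, so the variable names and numbers are just illustrations): a theory-driven linear regression is fit first, and a small neural network is then trained on its residuals to look for structure the theoretical model missed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic data: two "theoretical" predictors plus one variable the
# theory ignores, which enters the outcome nonlinearly.
n = 5000
X_theory = rng.normal(size=(n, 2))   # predictors justified by prior work
x_extra = rng.normal(size=(n, 1))    # a "big data" variable outside the theory
y = (2.0 * X_theory[:, 0] - 1.0 * X_theory[:, 1]
     + np.sin(3 * x_extra[:, 0])
     + rng.normal(scale=0.5, size=n))

# Step 1: the traditional, theory-driven regression.
theory_model = LinearRegression().fit(X_theory, y)
residuals = y - theory_model.predict(X_theory)

# Step 2: let a neural network mine the error term using the wider data set.
X_mine = np.hstack([X_theory, x_extra])
miner = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                     random_state=0).fit(X_mine, residuals)

print("R^2, theory model alone:     ", r2_score(y, theory_model.predict(X_theory)))
print("R^2, theory + residual miner:", r2_score(y, theory_model.predict(X_theory)
                                                   + miner.predict(X_mine)))
```

If the residual network improves the fit appreciably, I'd read that as a signal that the error term contains systematic structure worth theorizing about, not as a finding in its own right.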