
For me, the exciting part is understanding how the "black box" can learn, and finding a representation of the data that makes it learn.

For instance, I've been working on user profiling, and it's been a challenge to find which features, and in which representation, allow the model to learn. It's fantastic when you make a small change to a feature (for instance, using the median instead of the mean) and your model suddenly gains +5% accuracy.

In the real world, the field is not as simple as making an API call. Data in the wild is really messy; getting value out of it is the real challenge.



"replace the mean with median"

This illustrates the black-box aspect of it. You changed something and the results were affected, but you don't know why. The median has built-in implicit filtering (it's not affected by extreme outliers the way the mean is), so it could simply be that you needed to filter your inputs. But you won't know, because... black box.
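The median's "implicit filtering" is easy to see with a toy example (the feature and values here are made up for illustration): a single extreme outlier drags the mean far from the bulk of the data, while the median barely moves.

```python
import statistics

# Hypothetical session-duration feature (seconds) with one stray outlier
durations = [4, 5, 5, 6, 7, 6, 5, 900]

print(statistics.mean(durations))    # 117.25 -- dragged up by the 900
print(statistics.median(durations))  # 5.5    -- unaffected by the outlier
```

So a model fed the per-user median instead of the mean is effectively getting outlier-filtered inputs, which is one concrete hypothesis for why such a swap can improve accuracy.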


and the results are affected but you don't know why

Well, that's just your assumption without knowing the exact problem, and I think you are missing my point.

You can approach data preparation by randomly changing things, and maybe you'll get some interesting results, but I promise you will fail many, many times. The other way is to understand what it means to swap the mean for the median (to stick with that random example), and I promise you will find better solutions.

The idea is not just "change and test" to see what happens. The interesting part is understanding how the model uses your representation and why one is "better" than another.



