What this makes me wonder is whether, in principle, there should be similar adversarial examples for the human visual system.
Practically speaking it wouldn't fool anyone for more than a split second, not least since our input is video instead of snapshots, but it's an interesting thing to wonder about. Maybe we could build an AI which would be in most senses as smart as us, but which would be more vulnerable to such things?
Prominent examples would be optical illusions and the kind of thing you find on https://reddit.com/r/confusing_perspective/, both of which tend to fool us for longer than a split second.
It's not entirely the same method as these "adversarial noise" inputs, but some optical illusions are pretty close in how they mess with the localized parts of our optical processing (e.g. https://upload.wikimedia.org/wikipedia/commons/d/d2/Caf%C3%A...).
We can't backprop the human vision system to find "nearby" misclassifications as easily, and presumably our own "classifiers" are more robust to such pixel-scale perturbations, but especially lower-resolution images can trip us up quite easily too (see e.g. https://reddit.com/r/misleadingthumbnails/).
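For context, the usual way those "nearby" misclassifications are found against an artificial classifier is something like the fast gradient sign method: backprop the loss to the input pixels and nudge each pixel in the direction that increases it. A minimal sketch, assuming PyTorch and an off-the-shelf pretrained ResNet-18 with an illustrative epsilon (my assumptions, not anything specific from the work discussed here):

    import torch
    import torch.nn.functional as F
    import torchvision.models as models

    # Any pretrained classifier works; ResNet-18 is just an illustrative choice.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

    def fgsm_perturb(image, true_label, epsilon=0.01):
        """image: (1, 3, H, W) tensor normalized for the model;
        true_label: (1,) tensor holding the correct class index."""
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), true_label)
        loss.backward()
        # Step every pixel slightly in the direction that increases the loss;
        # this pixel-scale perturbation is often enough to flip the prediction.
        return (image + epsilon * image.grad.sign()).detach()

The point of the sketch is just that the perturbation comes from gradients flowing back to the input, which is exactly the step we have no analogue for in human vision.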