This AI Senses Humans Through Walls 👀

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. Pose estimation is an interesting area of
research where we typically have a few images or video footage of humans, and we try to
automatically extract the pose a person was taking. In short, the input is one or more photos,
and the output is typically a skeleton of the person. So what is this good for? A lot of things. For instance, we can use these skeletons to
cheaply transfer the gestures of a human onto a virtual character, detect falls among the
elderly, analyze the motion of athletes, and much more. This work showcases a neural network that
measures how the wifi radio signals bounce around in the room and reflect off of the
human body, and from these murky waves, it estimates where we are. Not only that, but it is also accurate enough
to tell us our pose. As you see here, since the wifi signal also travels
in the dark, this pose estimation works really well in poor lighting conditions. That is a remarkable feat. But now, hold on to your papers, because that’s
nothing compared to what you are about to see now. Have a look here. We know that wifi signals go through walls. So perhaps, this means that…that can’t be
true, right? It tracks the pose of this human as he enters
the room, and now, as he disappears, look, the algorithm still knows where he is. That’s right! This means that it can also detect our pose
through walls! What kind of wizardry is that? Now, note that this technique doesn’t look
at the video feed we are now looking at. It is there for us for visual reference. It is also quite remarkable that the signal
being sent out is a thousand times weaker than an actual wifi signal, and the technique can also
detect multiple humans. This is not much of a problem with color images,
because we can clearly see everyone in an image, but the radio signals are more difficult
to read when they reflect off of multiple bodies in the scene. The whole technique works by using a teacher-student
network structure. The teacher is a standard pose estimation
neural network that looks at a color image and predicts the pose of the humans therein. So far, so good, nothing new here. However, there is a student network that looks
at the correct decisions of the teacher, but has the radio signal as an input instead. As a result, it will learn what the different
radio signal distributions mean and how they relate to human positions and poses. As the name says, the teacher shows the student
neural network the correct results, and the student learns how to produce them from radio
signals instead of images. If anyone had said ten years ago that they were working on this
problem, they would likely have ended up in an asylum. Today, it’s reality. What a time to be alive! Also, if you enjoyed this episode, please
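
For readers who want to see the teacher-student idea from this episode in code, here is a minimal toy sketch. It is not the method from the paper: the "teacher" is a fixed made-up function standing in for a pretrained image-based pose network, the "radio" measurement is assumed to be a simple linear mix of the same scene, and the "student" is a two-weight linear model. All function names and numbers are hypothetical and chosen only for illustration.

```python
import random


def teacher_predict(image):
    # Hypothetical stand-in for a pretrained vision pose estimator.
    # The "image" is two numbers and the "pose" is a single coordinate.
    return 0.5 * image[0] - 0.25 * image[1]


def make_paired_sample(rng):
    # One scene recorded by two sensors at the same time.
    image = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    # Toy assumption: the radio measurement is a linear mix of the scene.
    radio = [image[0] + image[1], image[0] - image[1]]
    return image, radio


def student_predict(weights, radio):
    # The student only ever sees the radio measurement.
    return sum(w * r for w, r in zip(weights, radio))


def train_student(num_steps=2000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(num_steps):
        image, radio = make_paired_sample(rng)
        # The teacher looks at the image and provides the training label.
        target = teacher_predict(image)
        # The student tries to reproduce that label from radio alone.
        error = student_predict(weights, radio) - target
        # One gradient step on the squared error.
        weights = [w - learning_rate * error * r
                   for w, r in zip(weights, radio)]
    return weights


if __name__ == "__main__":
    weights = train_student()
    image, radio = make_paired_sample(random.Random(123))
    print("teacher (from image):", teacher_predict(image))
    print("student (from radio):", student_predict(weights, radio))
```

After training, the student can be queried with the radio measurement alone, which mirrors why the video feed is only needed as a reference. A real system would replace both toy functions with deep networks and real sensor data.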
