As with everything, there's good and bad
Computer Vision now enables all kinds of cool and convenient automation, such as snapping pictures when people smile, scanning cheques, performing quick identity checks at airport border control, searching your ever-growing photo collection, and using face ID to unlock your phone, all the way to making cars drive themselves and letting robots see. I'm proud to have played a small role in many of these.
However, like any other tool, Computer Vision can also be used for bad. Orwellian surveillance and offensive military use are the two glaringly obvious examples. Notably, rising Computer Vision star PJ Reddie (Joseph Redmon), main author of the famed YOLO detection models, quit the field of Computer Vision entirely after seeing what his detector was being used for:
I stopped doing CV research because I saw the impact my work was having. I loved the work but the military applications and privacy concerns eventually became impossible to ignore. https://t.co/DMa6evaQZr
— Joseph Redmon (@pjreddie) February 20, 2020
Personal anecdote: tracking
I did my PhD in a computer vision lab that frequently handled the perception part of robotics projects. Part of the reason was that person detection and tracking are key capabilities for social robots, and the lab's leader, Prof. Bastian Leibe, was well known for his work in those areas (none of which I knew).
I eventually had the idea for a completely new formulation of the multi-person tracking problem. It was technically very cool: I derived it from first principles, Bayes' rule and all that, and it seemed like just the start of a whole new family of methods. Basically a PhD student's dream.
Many months later, some people from Samsung came to visit our lab. Like many others, they were interested in our detection and tracking research. I was not actually in the lab that day, so I looked them up afterwards. Did you know that Samsung (specifically Samsung Techwin) not only builds phones and TVs, but also tanks and the first fully autonomous sentry gun (turret)? Yeah, I didn't know either. I stopped all my work on tracking and decided never to work on tracking again.
My mental framework
Over the course of my career I have regularly thought about the implications of my research. For a while, I knew clearly that I would never work on tracking again, yet I was perfectly fine working on detection. Why is that? Isn't a perfect detector essentially a tracker? Why am I fine with one but not the other?
It took a few more years of growing up before I could answer these questions to my satisfaction. The reality is that everything is intertwined:
- A better detector clearly leads to better trackers.
- A better classifier clearly leads to better detectors.
- Better pre-training clearly leads to better classifiers.
- A better optimizer clearly leads to better pre-training.
- Better software tooling clearly leads to accelerated research on all of the above.
So, following this chain of thought, I really shouldn't be working on anything anymore, or I would be participating in creating much better trackers!
The way out of this conundrum, to me at least, is that each of these of course also enables a long list of other things! Better pre-training not only leads to better trackers, but also improves most of the positive uses I mentioned in the introduction. So the net harm or benefit of doing research on a specific topic is a kind of weighted mix of the number and severity of the positive and negative impacts one foresees the research having.
Hence, where in this chain one draws the line is a very personal question which does not require any justification. Different people will draw different lines, and I respect anyone's personal line.
For myself, that's how I drew my line somewhere between detection and tracking. Detection has a vast number of positive applications, some of which I mentioned in the intro, but also some negative ones. For tracking, on the other hand, I mostly see surveillance and military applications. The positives I could see either seem minor in comparison, or would be served just fine by detection.
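To make the "weighted mix" idea a bit more tangible, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the `Impact` class, the `net_impact` function, the two topics, and every number in it are made-up placeholders, not measurements of any real research area. Picking the weights and severities is exactly the personal part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impact:
    """One foreseen use of a research topic (hypothetical, made-up values)."""
    name: str
    severity: float  # > 0 for beneficial uses, < 0 for harmful ones
    weight: float    # how likely or widespread one expects this use to be


def net_impact(impacts: list[Impact]) -> float:
    """Weighted sum of severities; a positive result means net benefit."""
    return sum(i.weight * i.severity for i in impacts)


# Two hypothetical topics with placeholder numbers, purely for illustration.
broad_topic = [
    Impact("beneficial use A", severity=1.0, weight=0.5),
    Impact("beneficial use B", severity=2.0, weight=0.25),
    Impact("harmful use C", severity=-2.0, weight=0.25),
]
narrow_topic = [
    Impact("beneficial use D", severity=1.0, weight=0.25),
    Impact("harmful use E", severity=-2.0, weight=0.5),
]

print(net_impact(broad_topic))   # 0.5
print(net_impact(narrow_topic))  # -0.75
```

With these placeholder values the broad topic comes out net positive and the narrow one net negative, but someone who assigns different weights or severities would land somewhere else, which is the whole point.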
A recent topic: quadruped locomotion
The video below recently made the rounds, exemplifying the insane progress made in quadruped locomotion over the last few years:
Extreme Off-Road | All-Terrain quadruped robotdog
pic.twitter.com/hxoTrvu86c — DEEP Robotics (@DeepRobotics_CN) November 13, 2024
It made me think. Quadruped locomotion seems awfully similar to tracking: it must be a lot of fun to work on, but I see many negative uses of it and only a few, mostly niche, positive ones. I would much rather work on, for example, dexterous manipulation:
Just a nice folding sequence 💅 pic.twitter.com/tIe120IJ4U
— Karol Hausman (@hausman_k) November 25, 2024
Stay safe and hydrated!