
Full-Time / Bengaluru / 3-6 Years
The real job
You'll be teaching computers to understand human behavior through video. Not just "is there a face?" but "is this person confident or just good at faking it?" This is computer vision meets psychology meets real-time processing at scale.
Your actual work will involve:
Building systems that analyze body language, facial expressions, and eye contact during interviews
Creating models that work across different lighting, cameras, and network conditions (spoiler: everyone's webcam sucks)
Detecting whether someone's reading answers off the screen (harder than you think)
Building privacy-preserving vision systems (we analyze but don't store faces)
Optimizing models to run in real time on potato-quality video streams
Fighting with WebRTC, cursing at codec issues, and somehow making it all work
You probably have:
Built CV systems that work outside of perfect lab conditions
Experience with real-time video processing (not just image classification)
Deep knowledge of modern CV architectures (YOLO, Vision Transformers, etc.)
Understanding of edge deployment and model optimization
Battle scars from deploying CV in production
Projects involving human behavior analysis or video understanding
Epic work we're looking for:
Your AR filter went viral
You've built CV systems processing millions of images/videos daily
You've published CV research that people actually implemented
Your open-source CV project has thousands of stars
You've built something like Snapchat filters but better
You made a CV model run on a Raspberry Pi doing something actually useful
Why people fail:
They've only worked with static images. Or they think accuracy is the only metric that matters. Or they can't handle the mess of real-world video data.
