Ramprasaath R. Selvaraju

Senior Researcher

Apple
Sunnyvale, CA

About Me

Hello! I am a Senior Research Scientist at Apple, specializing in Computer Vision. I completed my PhD at Georgia Tech, where I was advised by Prof. Devi Parikh and Prof. Dhruv Batra. My research focuses on advancing AI systems through:

  • developing cutting-edge multimodal models with improved controllability,
  • creating tailored solutions for enterprise applications while addressing model biases,
  • understanding and interpreting model decisions while diagnosing failures,
  • building trust and enabling knowledge transfer between humans and AI,
  • encouraging human-like reasoning and grounded representations.

Check out my papers below and reach out to me if you would like to chat!

Education

  • Ph.D. in Computer Science, 2020

    Georgia Institute of Technology, Atlanta

    Thesis: Explaining model decisions and fixing them via human feedback

  • Master of Science in Physics, 2015

    Birla Institute of Technology and Science (BITS-Pilani), Hyderabad, India

  • Bachelor of Engineering in Electrical & Electronics Engineering, 2015

    Birla Institute of Technology and Science (BITS-Pilani), Hyderabad, India

Awards & Recognition

  • 2022

    Recognized among the top 100 scholars in the AMiner AI 2000 Most Influential Scholars list in Computer Vision, covering 2012–2021.

Publications

  • Development and validation of an AI-derived digital pathology-based biomarker to predict benefit of long-term androgen deprivation therapy with radiotherapy in men with prostate cancer

  • CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision

  • PreViTS: Contrastive Pretraining with Video Tracking Supervision

  • TAG: Boosting Text-VQA via Text-Aware Visual Question-Answer Generation

  • Can domain adaptation make object recognition work for everyone?

  • Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

  • SORT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency

  • CASTing Your Model: Learning to Localize Improves Self-Supervised Representations

  • SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions

  • Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

  • Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded

  • Trick or TReAT: Thematic Reinforcement for Artistic Typography

Research Experience

  • Senior Researcher

    Apple

    May 2023 – Present, Cupertino

  • Founding Member

    Artera AI

    Feb 2022 – Mar 2023, Los Altos

    Led research and development of AI solutions for precision medicine.

  • Senior Research Scientist

    Salesforce Research

    June 2020 – Feb 2022, San Francisco

  • Research Intern

    Adaptive Systems and Interaction Group, Microsoft Research

    Summer 2019, Redmond

    Towards evaluating and encouraging human-like reasoning abilities in deep models.

  • Research Intern

    Tesla Autopilot

    Spring 2019, Palo Alto

    Preventing failures of autonomous systems in rarely occurring scenarios.

  • Research Intern

    Samsung Research America

    Summer 2018, Mountain View

    Leveraging explanations to make AI models more grounded.

  • Research Intern

    Applied Machine Learning, Facebook

    Spring 2017, Menlo Park

    Developing a framework for interpreting and visualizing Facebook's deep models.

  • PhD Research Assistant

    Visual Intelligence Lab, Georgia Tech

    2015 – 2017, Atlanta

    Towards building AI systems that are interpretable, transparent, and unbiased.

  • Research Intern

    Visual Intelligence Lab, Georgia Tech

    Spring 2015, Atlanta

    Building curious systems that ask open-ended natural language questions about an image.

  • Research Intern

    Oxford University

    Fall 2014, Oxford

    Developing interactive augmented reality systems for the visually impaired.

  • Research Intern

    Brown University

    Summer 2013, Providence

    Designing a vision-based navigation system to help the visually impaired navigate indoor environments.