Meta AR/VR Job | Research Scientist Intern, Audio, Speech & Natural Language in On-Device AI (PhD)

Job: Research Scientist Intern, Audio, Speech & Natural Language in On-Device AI (PhD)

Type: Artificial Intelligence | Computer Vision, Machine Learning

Date posted: 2023-2-2


Reality Labs (formerly Facebook Reality Labs) focuses on delivering Meta’s vision through Virtual Reality (VR) and Augmented Reality (AR). Enabling compelling user experiences on Virtual and Augmented Reality devices requires innovation and co-design across all layers of the stack, from novel algorithms to custom silicon. The Meta AR/VR team is driving the state of the art forward with breakthrough work in computer vision, speech, virtual assistants, machine learning, mixed reality, graphics, displays, sensors, and new ways to map the human body, among many others.

We are seeking exceptional research scientists with a background in innovating and developing efficient models for AR/VR applications, including audio, speech, sensor fusion, NLP, and multi-modal AI. Our team explores new applications that leverage audio, speech and language signals to deliver state-of-the-art models under tight constraints on memory, latency and power. The ideal candidate will have practical experience in developing real-time models that achieve high accuracy under deployment constraints such as limited compute and memory resources. In this position, you will get exposure to the full stack, from user experiences and algorithms down to hardware execution blocks. You will work with various teams to understand the challenges, build state-of-the-art models and applications, and then work with the software/hardware team to deploy these solutions on-device.

This is a 2023 internship opportunity with start dates from May to September. To learn more about our research, visit

Requirements

Currently in the process of obtaining a PhD in computer vision, speech recognition, machine learning or a related field

Proven track record in using machine learning for solving audio, speech and natural language processing problems

Experience in object-oriented programming

Experience with prototyping algorithms in Python or other scripting languages

Interpersonal experience: cross-group and cross-cultural collaboration

Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Responsibilities

Define, plan and execute cutting-edge application research to advance AR/VR experiences

Develop novel deep learning techniques to achieve state-of-the-art accuracy within the constraints of on-device and real-time execution

Collaborate with other teams to develop innovative deep learning techniques for vision, speech, user interface and other use cases

Collaborate with software and hardware engineers to develop tradeoff curves for accuracy versus runtime resource constraints such as latency and energy

Communicate research results and recommendations clearly, both within the group and to cross-functional groups

Publish research results in top-tier journals and at leading international conferences

Additional Requirements

Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading conferences or journals

Experience designing and developing audio, speech and natural language processing or machine learning algorithms

Experience with multimodal fusion or deep learning with different modality sensor data

Experience with development on embedded devices and low-power chips

Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions to open-source repositories (e.g. GitHub)