Meta AR/VR Job | Computer Vision Research Scientist – Audio

Job: Computer Vision Research Scientist – Audio

Type: 3D Software Engineering

City: Redmond, WA

Date posted: 2023-10-23


Facebook Reality Labs brings together a world-class team of researchers, developers, and engineers to create the future of virtual and augmented reality, which together will become as universal and essential as smartphones and personal computers are today. And just as personal computers have done over the past 45 years, AR and VR will ultimately change everything about how we work, play, and connect.

We are developing all the technologies needed to enable breakthrough AR glasses and VR headsets, including optics and displays, computer vision, audio, graphics, brain-computer interface, haptic interaction, eye/hand/face/body tracking, perception science, and true telepresence. Some of those will advance much faster than others, but they all need to happen to enable AR and VR that are so compelling that they become an integral part of our lives.

The audio team at Facebook Reality Labs is looking for experts in machine learning and computer vision. This role focuses on the design, development, and engineering of advanced computer vision systems that drive the design and synthesis of audio-visual experiences in AR and VR. An ideal candidate will be passionate both about developing advanced proof-of-concept demonstration platforms and about pushing the state of the art through fundamental research. This is a full-time employee (FTE) position and requires a PhD in computer science, deep learning, machine learning, computer vision, computer engineering, or statistics.


PhD in Deep Learning, Machine Learning, Computer Vision, Computer Science, Computer Engineering, Statistics, or a related field.

Bachelor’s degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.

Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment.

4+ years of experience developing and implementing computer vision or deep learning algorithms.

3+ years of experience with scientific programming languages such as Python, C++, or similar.

Demonstrated experience in implementing and evaluating working end-to-end prototype learning systems.

Interpersonal skills: cross-group and cross-culture collaboration.


Independently implement state-of-the-art models and techniques in PyTorch, TensorFlow, or other platforms.

Independently identify, motivate, and execute on reasonable medium to large hypotheses (each with many tasks) for model improvements through data analysis and domain knowledge, and communicate your learnings effectively.

Design, perform, and analyze online and offline experiments independently with specific and well thought-out hypotheses in mind.

Generate reliable, correct training data with great attention to detail.

Independently and consistently identify and debug common issues in training machine learning models, such as overfitting/underfitting, leakage, and offline/online inconsistency.

Understand the model architecture used, and the consequences of this for different hypotheses tested. In general, you have a good understanding of computer vision from an applied perspective, even though you may not be up to date with the state of the art.

Independently resolve most online and offline issues that affect hypothesis testing.

Be aware of common systems considerations and modeling issues, and factor these into modeling choices.

Additional Requirements

Experience with audio signal processing, audio-visual learning or similar.

Experience with building models on speech or acoustic datasets.

Proven track record of achieving significant results and innovation as demonstrated by first-authored publications and patents.