Meta AR/VR Job | Research Scientist - Efficient Audio Visual Machine Learning | Quest
Job: Research Scientist - Efficient Audio Visual Machine Learning | Quest
Type: Research
City: Redmond, WA
Date: 2024-9-25
Summary
At Meta’s Reality Labs Research, our goal is to make world-class consumer virtual, augmented, and mixed reality experiences. Come work alongside industry-leading scientists and engineers to create the technology that makes VR and AR pervasive and universal. Join the adventure of a lifetime as we make science fiction real and change the world. We are a world-class team of researchers and engineers creating the future of augmented and virtual reality, which together will become as universal and essential as smartphones and personal computers are today. And just as personal computers have done over the past 45 years, AR and VR will ultimately change everything about how we work, play, and connect.
We are developing all the technologies needed to enable breakthrough smart glasses, AR glasses, and VR headsets, including optics and displays, computer vision, audio, graphics, brain-computer interfaces, haptic interaction, eye/hand/face/body tracking, perception science, and true telepresence. Some of those will advance much faster than others, but they all need to happen to enable AR and VR that are so compelling that they become an integral part of our lives.
The Audio team within RL Research is looking for an experienced and innovative Research Scientist with a specialty in real-time, efficient audio-visual learning and machine learning to join our growing team. You will do core and applied research on technologies that improve listeners' hearing abilities under challenging listening conditions using wearable computing, working alongside a team of dedicated researchers, developers, and engineers. You will operate at the intersection of egocentric perception, acoustics, computer vision, and signal processing algorithms with hardware and software co-design.
Qualifications
PhD degree or equivalent experience in Deep Learning, Artificial Intelligence, Machine Learning, Computer Science, Robotics, Computer Vision, Computational Neuroscience, Signal Processing, Speech and Language technologies, or a related field.
4+ years of experience working on applied computer vision methods for wearable computing.
2+ years of experience working on efficient multimodal machine learning algorithms for low-compute and low-power devices.
Research-oriented software engineering skills, including fluency with machine learning libraries (e.g., PyTorch, TensorFlow, Scikit-learn, Pandas) and libraries for scientific computing (e.g., the SciPy ecosystem).
Experience with cross-group and cross-cultural collaboration.
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
Description
Develop novel AI algorithms and associated real-time systems for source tracking, source localization, source diarization, and relevant semantic scene understanding, with application to egocentric wearable computing in AR and VR.
Design and develop efficient AI frameworks and real-time technical systems under constraints on compute, power, and overall system latency.
Lead the development of systems and methods to enable quick prototyping, proof of concept, proof of experience, and demonstrations.
Contribute to dataset design and large-scale data processing for real-time evaluations of efficient audio-visual machine learning methods.
Contribute to the technical strategy and establish new execution methods where relevant for efficient-compute-driven AI systems in audio AR and VR applications.
Summarize technical findings for cross-org collaborators and influence system design and integration decisions for multi-modal AI systems supporting hearing technologies in AR and VR.
Additional Requirements
8+ years of experience working on core and applied computer vision methods.
Experience with real-time AI modeling and systems design for wearable computing.
3+ years of experience working on audio-visual and multi-modal learning methods for egocentric perception.
Experience with real-time statistical modeling, including heuristics-driven computer vision methods for egocentric data processing.
Experience developing end-to-end ML pipelines, including dataset design, dataset preprocessing, model development and evaluation, and software integration into platforms.
Experience bridging and adapting machine learning systems from research into potential tech-transferable packages for production.
Experience with large-scale or distributed cluster computing for training, development, and offline inference of machine learning models.
Experience with interdisciplinary and/or cross-cultural collaboration with domain researchers in speech processing, auditory perception, psychoacoustics, or related fields.