
Meta AR/VR Job | Codec Avatars Systems Engineer | Quest

Job: Codec Avatars Systems Engineer | Quest

Type: Artificial Intelligence

City: Pittsburgh, PA

Date posted: 2023-5-27

Summary

Reality Labs Research (RL-R) brings together a diverse and highly interdisciplinary team of researchers and engineers to create the future of augmented and virtual reality. On the Codec Avatars Infrastructure team, you’ll work on building tools, libraries, and frameworks that will help researchers collaborate with each other and empower their research towards the generation of Codec Avatars.

Our team cultivates an honest and considerate environment where self-motivated individuals thrive. We encourage a strong sense of ownership and embrace the ambiguity that comes with working on the frontiers of research.

In this hybrid systems engineer and software engineer role on the Codec Avatar Research Infrastructure team, you will foster our scientific explorations and generate viable paths to the consumer products that will connect people in meaningful ways for decades to come.

Qualifications

Bachelor’s degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.

Experience working independently, handling large projects simultaneously, and prioritizing team roadmap and deliverables by balancing required effort with resulting impact

5+ years experience in systems engineering

5+ years experience automating the management of infrastructure and services

5+ years experience coding in at least one of the following languages: Python, Ruby, PHP, Rust, or Go

Thorough understanding of Linux operating system internals

Experience with managing HPC scheduler libraries like Slurm, Kubernetes, or LSF

Experience with Python library management systems such as Conda or venv

Description

Build, scale, and secure the Linux environment within Meta research lab HPC infrastructure, a heterogeneous environment containing diverse operating systems and applications

Work side by side with research scientists to enable the infrastructure for large scale training jobs that explore AR/VR

Provide on-call support and lead incident root cause analysis through multiple infrastructure layers (compute, storage, network) for our research lab’s HPC clusters and act as a final escalation point

Apply modern engineering methodologies such as Infrastructure-as-Code, container orchestration, and software-defined storage for large scale compute clusters

Collaborate in a diverse team environment across multiple scientific and engineering disciplines, making the architectural tradeoffs required to rapidly deliver software and infrastructure solutions

Find ways to leverage the scale and complexity of the larger Meta production infrastructure to solve problems for Reality Labs researchers

Provide guidance to other engineers on best practices to build mature services which are highly available, reliable, secure, and scalable

Help others around you move faster by identifying issues and driving them to resolution

Influence outcomes within your immediate team, peer engineering teams, and with cross-functional stakeholders

Additional Requirements

Prior experience in cluster on-call operations, including troubleshooting server/scheduler/storage errors, maintaining compute/storage environments/libraries/tools, helping onboard users to the cluster, and answering general questions from users.

Prior experience supporting configuration management in a multi-region environment

Prior experience building services

Prior experience building PaaS or internal clouds

Prior experience in cluster coordination and strategy planning, including collecting/understanding needs of users, developing tools to improve user experience, providing guidance on best practices, coordinating distribution of compute/storage resources, forecasting compute/storage needs, and developing long-term user experience/compute/storage strategies.

Prior experience with containerization technologies like Docker or Virtual Machines

Prior experience in developing/managing distributed network file systems

Prior experience optimizing multi-tenant HPC clusters for performance and maintenance

Prior academic or development experience with machine learning and/or deep learning

Prior experience in ML libraries such as PyTorch, TensorFlow or cuDNN

Prior experience in computer vision libraries such as OpenCV

Prior experience in GPGPU development with CUDA, OpenCL or DirectCompute
