Meta AR/VR Job | Production Systems Engineer
Job: Production Systems Engineer
Type: Hardware
City: Dublin, Ireland
Date: 2024-01-19
Summary
Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team in Dublin. Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services. The RTP team is responsible for the end-to-end Hardware Lifecycle of all Meta servers, from exploration and development to production health. RTP Engineers work closely with Production Engineering teams, Enterprise Networking, Hardware Designers, Networking Teams, Manufacturers, Vendors, Datacenter Operation teams and New Product Introduction teams to ensure the smooth operation of systems across the planet.
We encounter problems at every scale, from the very smallest (errors within single registers of a CPU) to the very largest (deploying solutions to our entire millions-strong fleet). We look for people with curiosity and drive who want to tackle the hardest problems in the domain.
Typically we will hire engineers from backgrounds such as Site Reliability Engineer (SRE), Software Engineer, Systems Engineer, Systems Development Engineer, DevOps Engineer, Systems Administrator, or similar.
You will have excellent technical ability, be adept at managing complex cross-functional demands, and have a demonstrated ability to drive a program of work to successful business outcomes.
Qualifications
Bachelor's degree in computer science, a related technical discipline, or equivalent work experience
6+ years of experience coding in a higher-level language (Python, PHP, Java, Go, Rust, C++)
Experience building, maintaining, and debugging production services or platforms, usually (but not necessarily) in a Linux/Unix environment
Knowledge of server architecture and components across Compute/Storage/AI Systems/Networking
Scientific approach to troubleshooting, root-cause analysis and investigation
Good communication skills, able to collaborate easily with others
Description
Build and develop tooling solutions to automate business-critical processes in service of managing the health of the Meta production fleet
Troubleshoot, diagnose, and root-cause system failures, working with key partners to identify and deliver solutions
Proactively identify opportunities to fix or enhance tooling, hardware and processes
Build subject matter expertise in one or more of the specialist areas covered by the RTP team in Dublin: Firmware Deployment, Edge/CDN hardware, or Silicon Sustaining