Site Reliability Engineer at SpaceX
Hawthorne, CA, United States
SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.
SITE RELIABILITY ENGINEER
The application software team is the central nervous system of SpaceX - we create mission-critical applications that are used throughout SpaceX to accelerate launch vehicle production and flight, as well as systems that allow Starlink to grow into a fast, reliable, worldwide Internet service.
We are looking for an experienced Site Reliability Engineer to operate and scale custom-built mission-critical software products for engineering, test, and launch. These products are used to deliver the software flying rockets, spacecraft, satellites, and more - every time a Falcon 9 launches, a Dragon capsule docks with the ISS, or a Starlink satellite connects a new community, the software responsible for it was created with the tools you'll build and maintain.
SpaceX relies on our vehicle software being built quickly and correctly, tested rigorously, and rapidly iterated on. This allows us to pioneer technologies that were science fiction a decade ago; you'll work to ensure that software delivery at SpaceX keeps pace with other engineering efforts, to enable our goal of making humanity multi-planetary.
Aerospace experience is not required to be successful here - rather we look for smart, motivated, collaborative engineers who love solving problems and want to make an impact on a super inspiring mission. We are looking for engineers who treat fellow teammates with fairness, respect, and support. You will have full ownership of challenging problems, working with a team of enthusiastic engineers to design and produce solutions that enable SpaceX to move towards our goals at a rapid pace. The success of the missions at SpaceX depends on the software that you and your team produce.
RESPONSIBILITIES:
Deploy, upgrade, operate, maintain, and scale our suite of mission-critical products and services
Manage our underlying infrastructure as code and use modern observability tools to tell a complete story of application health
Closely collaborate with software engineers to create highly operable and maintainable products
Engage in and improve the whole software development lifecycle of services -- from inception and design, through deployment, operation, and refinement
Practice sustainable incident response and blameless postmortems
Provide end-user support for our products to vehicle software engineers
Participate in the team's on-call rotation periodically
Identify performance bottlenecks and apply performance improvement techniques
BASIC QUALIFICATIONS:
Bachelor's degree in computer science, information systems, or engineering; OR 3+ years of professional experience with site reliability or DevOps without a degree
Experience with Linux operating systems
PREFERRED SKILLS AND EXPERIENCE:
5+ years of DevOps, Site Reliability Engineering, or System Administration experience
3+ years of experience with Python and Python-based development frameworks
Experience with source code and version control tools such as Git or Subversion
Experience with infrastructure as code (IaC) products for automatically managing fleets of servers
Experience with build systems (Make, Bazel / Pants / Buck, Gradle, etc.) and package management tools (pip, npm, etc.)
Experience with both container and virtualization technologies (VirtualBox, KVM, Docker, Kubernetes, vSphere, EC2, GCE)
Experience with Terraform, Ansible, Puppet, or other automation frameworks
Knowledge of TCP/IP networking
Experience with databases and data modeling
Experience with workflow and issue management tools such as JIRA
Ability to work with mission-critical and sensitive systems, with a sense of urgency appropriate to the responsibilities
Ability to communicate with customers, peers, management, etc. in both formal and informal situations
ITAR REQUIREMENTS:
SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.
Applicants wishing to view a copy of SpaceX's Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application/interview process should notify the Human Resources Department at (310) 363-6000.