[Remote] Site Reliability Engineer (Senior or Staff), Atlas
Note: This is a remote position open to candidates in the USA. MongoDB is a leading database platform empowering customers to innovate at market speed. The company is seeking an experienced Site Reliability Engineer to support, maintain, and grow the Atlas platform, focusing on building complex systems and ensuring high availability for customer applications.
Responsibilities
- Participate in the development of a reliable and resilient multi-cloud platform that hosts business-critical applications for a wide and varied range of customers
- Collaborate with service-owning teams to provide internal support, solve technical challenges, and adapt or build tooling that addresses novel use cases in a generic fashion
- Participate in a 24/7 on-call rotation to swiftly resolve issues affecting our customer-facing Atlas fleet, ensuring minimal disruption and high availability
Skills
- Have 5+ years of experience running critical systems at scale
- Value efficiency in processes and operations, and display a preference for automation over manual processes ('allergic to ops work')
- Be familiar with a major cloud provider (AWS, Azure, or GCP) and possess the ability to build and operate systems in a multi-cloud environment
- Have a strong understanding of how to run a large-scale Linux environment, including low-level fundamentals
- Have a firm grasp of at least one modern programming language, beyond basic scripting (Go, Ruby, Python)
- Have a solid understanding of web and network protocols and standards (HTTP, TLS, DNS, etc.)
- Be a US Citizen
Benefits
- Equity
- Participation in the employee stock purchase program
- Flexible paid time off
- 20 weeks fully-paid gender-neutral parental leave
- Fertility and adoption assistance
- 401(k) plan
- Mental health counseling
- Access to transgender-inclusive health insurance coverage
- Health benefits
Company Overview