[Remote] Senior Site Reliability Engineer II
Note: This is a remote position open to candidates in the USA. Braze is a leading customer engagement platform that empowers brands to deliver exceptional customer experiences. The Senior Site Reliability Engineer II is responsible for maintaining the reliability of internal-facing services and platforms, collaborating with engineering teams to improve infrastructure, automation, and tooling, managing incidents, and ensuring adherence to enterprise-grade SLAs.
Responsibilities
- Partner with Braze’s engineering teams on:
- + Architecting products to effectively utilize infrastructure platforms in a scalable, reliable manner
- + Debugging reliability and scalability issues across all stack layers, including the products built using our infrastructure platforms
- + Making monitoring and alerting trigger on symptoms rather than on outages
- + Ensuring that Braze meets our strict enterprise-grade SLAs with customers
- Develop Braze’s internal platform infrastructure:
- + Create infrastructure as code using Chef, Terraform, and Kubernetes
- + Develop deployment pipelines for applications in multiple languages using Docker, Kubernetes, etc.
- + Provide centralized/common tooling, services, and automation frameworks that are critical for scaling operations, capacity management, reducing operational pain, and improving the day-to-day workflow of Braze’s engineering teams
- Manage incidents:
- + Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers
- + Use your on-call shift to prevent incidents from ever happening
- + Hold retrospectives on everything that happens to turn lessons into system improvements, changes, automation, etc.
Skills
- 5+ years of experience as a Software, DevOps, or Site Reliability Engineer
- Think about systems: interfaces, boundaries, edge cases, failure modes, behaviors, and specific implementations
- Have an urge to collaborate, document, and deliver quickly:
- + Collaborate across global remote teams, often working asynchronously
- + Document everything so you don't need to learn the same thing (or plan the same work) twice
- + Deliver fast to delight our customers, even internal ones
- Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it
- Have a desire to solve everyday challenges facing software engineers and automate their toil away
- Have an excellent ability to manage multiple tasks and expectations at once
- Know your way around Linux and the Unix shell
- Have strong programming skills (Ruby and/or Go preferred)
- Have experience with Docker, Kubernetes, Terraform, or similar IaC technologies
- Have experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies
Benefits
- On Target Earnings (OTE) between $172,000 and $306,000/year (including bonus or commission)
- Equity grants of restricted stock units (RSUs)
- Competitive compensation that may include equity
- Retirement and Employee Stock Purchase Plans
- Flexible paid time off
- Comprehensive benefit plans covering medical, dental, vision, life, and disability
- Family services that include fertility benefits and equal paid parental leave
- Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
- A curated in-office employee experience, designed to foster community, team connections, and innovation
- Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
- Employee Resource Groups that provide supportive communities within Braze
- Collaborative, transparent, and fun culture recognized as a Great Place to Work®