[Remote] DevOps Engineer

Work from home, full-time role

Note: This is a remote job for candidates in the USA. The company is a major technology company specializing in infrastructure supporting AI research, and it is seeking a DevOps Engineer to help maintain production Kubernetes-based systems. The role focuses on site reliability engineering, observability, and SQL production support, ensuring system reliability and performance across an Azure Stack environment.

Responsibilities

Design, maintain and progressively improve observability solutions, including dashboards and visual reports, with Grafana or comparable monitoring tools
Set up and implement metrics, SLIs, SLOs and alerting approaches to guarantee reliability and transparency across production systems
Deliver business-hours operational support for Kubernetes-based production environments, involving initial troubleshooting, log review and metric-based investigations
Assist with SQL-based systems as part of production operations, contributing to issue examination and performance diagnostics
Examine incidents and system behavior to identify root causes, take part in post-incident reviews and suggest enhancements for monitoring and reliability practices
Work hand in hand with engineering, platform and research teams on observability standards, refine operational processes and strengthen overall system stability
Add to documentation, knowledge-sharing activities and ongoing improvement initiatives within the team

Skills

At least 2 years of relevant hands-on professional experience
Demonstrated track record in Site Reliability Engineering (SRE), DevOps, Production Support or equivalent roles working with production systems
Practical exposure to observability and monitoring stacks, including Grafana or similar tools
Strong knowledge of Linux systems, supported by solid troubleshooting and log analysis capabilities
Working experience supporting Kubernetes-based environments in production settings
Background in delivering SQL production support, including query troubleshooting and basic performance diagnostics
Confident scripting skills in Python, Bash or similar languages for automation and day-to-day operational activities
Capability to investigate incidents, determine underlying causes and drive improvement efforts
Effective communication and teamwork skills for working successfully with distributed and cross-functional teams
Proficient English communication skills, both spoken and written, at a B2+ level or higher
Experience handling APIs and integration patterns to link services together and support system interoperability
Knowledge of databases, covering administration, tuning and production-level support activities
Exposure to Infrastructure as Code development and maintenance for automating environment provisioning and configuration
Practical experience using Azure to manage resources and run production workloads

Benefits

International projects with top brands
Work with global teams of highly skilled, diverse peers
Benefits and employee financial programs
Paid time off and sick leave
Upskilling, reskilling and certification courses
Unlimited access to the Learning library and 22,000+ courses
Global career opportunities
Volunteer and community involvement opportunities
EPAM Employee Groups
Award-winning culture recognized by Glassdoor and others

Company Overview

EPAM leverages its core engineering expertise as a leading global product development and platform engineering services company. It was founded in 1993 and is headquartered in Newtown, Pennsylvania, USA, with a workforce of 10,001+ employees. Its website is https://www.epam.com.

Company H1B Sponsorship

The company has a track record of offering H1B sponsorships, with 11 in 2026, 120 in 2025, 172 in 2024, 232 in 2023, 373 in 2022, 359 in 2021 and 502 in 2020. Please note that this does not guarantee sponsorship for this specific role.
