Senior DevOps Engineer

Work from home Full-time role Hiring

Our team brings a wealth of cutting-edge, specialized expertise in Machine Learning and Speech Technologies that are used daily by hundreds of millions of people worldwide. We already have several major projects underway and are looking to strengthen our team with a DevOps/SRE Engineer!

Requirements

Minimum 5 years of experience in a DevOps and/or Site Reliability Engineering role
Strong hands-on experience with Linux system administration
Extensive experience deploying, operating, and scaling Kubernetes in both cloud and bare-metal environments
Deep expertise and practical experience with at least one major cloud provider (preferably Google Cloud Platform)
Experience with ML inference on GPU/CPU is a strong plus
Proven experience implementing SRE practices and building observability stacks using Grafana, Prometheus, and Loki
Strong adherence to GitOps, Infrastructure as Code (IaC), and CI/CD principles
Advanced expertise in Terraform, Ansible, and Python
Comfortable working in high-uncertainty environments: we are building a new product, requirements evolve quickly, and the ability to rapidly learn new technologies and patterns is essential
Proactive mindset: ability to look beyond DevOps tasks and actively debug and understand the product
Strategic thinking: ability to choose technologies and architectural approaches based on long-term goals rather than short-term compromises

Responsibilities

Deploy, operate, and evolve a microservices-based platform running in Kubernetes clusters across AWS, GCP, and on-prem (Rancher)
Operate and support GPU-based ML inference services (Triton Inference Server, vLLM) deployed on RunPod, Scaleway, and Nebius
Build and maintain Docker images for all microservices and ensure a stable service lifecycle
Maintain and scale development and production Kubernetes clusters, and actively participate in deployment debugging, incident investigation, and performance troubleshooting
Develop, maintain, and evolve custom Helm charts for each service
Design and operate CI/CD pipelines using GitHub (code and pipelines) and GitLab for on-prem customer deployments
Ensure platform compliance with SOC 2 requirements and actively contribute to improving security and compliance processes
Manage cluster access via NetBird VPN, implementing role-based access control using group policies
Deploy and manage infrastructure using IaC practices with Terraform and Ansible
Develop and continuously improve observability systems: Grafana and Prometheus for metrics, ELK stack for centralized log storage and analysis
Continuously optimize infrastructure in the areas of IaC, IAM, observability, and CI/CD
Work with a technology stack that includes Python, Kubernetes, Linux, Docker, GitHub CI/CD, PostgreSQL, ClickHouse, Kafka, Superset, Terraform, and Ansible

What we offer

Experienced team: Aiphoria was formed by a team of enthusiastic professionals who created award-winning devices, voice assistants, and other AI-driven products for BigTech corporations.
Cutting-edge technologies: we build technology using our areas of expertise, including Computer Vision, Speech Technologies, Natural Language Understanding, and Generative AI (including LLMs and diffusion models).
Rapid career progression, facilitated by our team of seasoned senior professionals who come from prestigious, industry-leading companies.
Remote work opportunities.
Prominent clients, with the opportunity to work on different projects and/or to be involved in developing our own proprietary products.
Competitive compensation surpassing market standards.
A company with entrepreneurial spirit: we offer a unique mix of a secure workplace, thanks to our big clients, and a true start-up culture!
