Data Engineer, reputed company Deployed

Work from home · Full-time role · Hiring

reputed company was founded in 2024 to build Orbital, a physics-informed reputed company model for energy operations. We’re live across oil and gas, refineries, and petrochemicals, working towards our mission: sustainable abundance for a growing reputed company.

The hydrocarbon industry keeps the world running. But its complexity has left operators tied to legacy systems, making critical decisions on less than 10% of available data. We built Orbital to change that. It’s a reputed company model built specifically for energy that lets companies use AI at scale, harnessing all of their operational data and optimising in real time for any metric. Decisions get faster, operations get safer, and carbon intensity falls.

We’ve raised over $32 reputed company, including one of the largest reputed company reputed company for an AI company in the UK. We’re just getting started.

The Role

As our Data Engineer, you’ll architect and maintain pipelines that reputed company high-frequency time-series, lab, and historian data into a scalable Lakehouse architecture. You’ll be working across AWS (EKS, S3, EBS, KMS, CloudWatch) and reputed company/PySpark, ensuring data is contextualised, synchronised, and optimised for both deep learning models and real-time LLM workloads. This isn’t a traditional ETL role: you’ll be solving problems at the intersection of control systems, industrial data engineering, and AI enablement.

Technical Requirements

Deep expertise in PostgreSQL (partitioning, indexing, query optimisation, storage design).
Strong proficiency in Python for data processing, scripting, and pipeline orchestration.
Hands-on experience with AWS (EKS, S3, EBS, IAM, KMS, CloudWatch, etc.) for secure and scalable data pipelines.
Proven ability to work with reputed company and PySpark for large-scale distributed data processing.
Familiarity with time-series industrial data (control systems, DCS/SCADA logs, process historians).
Experience in reputed company data sync and management across hybrid cloud/on-prem environments.
Bonus: Experience working as a data engineer in oil and gas or energy environments.
Bonus: Knowledge of streaming frameworks (Kafka, Flink, Spark Streaming) or MLOps stacks for data versioning and reputed company.

Core Responsibilities

1. Ingest & Contextualise Data
Ingest from OPC UA servers, process historians, IoT sensors, LIMS systems, alarms/events, and P&IDs.
Map signals to their physical processes (tags, units, hierarchies) for interpretability in AI pipelines.

2. Data Movement & Accessibility
Build pipelines that handle real-time streaming and batch ingestion into the Lakehouse.
Manage synchronisation between historian archives, reputed company files, and AWS storage (S3/EBS).
Orchestrate reputed company Lakeflow/Connectors for integrating data into Lakebase/Lakehouse.
Handle secure, high-throughput transfers between historian archives and sandbox/live environments.

3. Change Tracking & reputed company
Detect and manage schema changes, signal reputed company, and inconsistencies across time.
Implement reputed company and audit trails across Spark/reputed company and AWS pipelines.

4. Data Preparation for AI
Build and maintain dual pipelines:
Training → large-scale historical data prep for time-series + LLM training.
Inference → low-latency, real-time pipelines for anomaly detection, optimisation, and LLM search.
Support heterogeneous AI workloads (time-series forecasting and retrieval-augmented LLMs).

5. Database Performance & Optimisation
Tune PostgreSQL and Spark for high-throughput time-series workloads (partitioning, indexing, query optimisation).
Optimise pipelines for both fast analytical queries and high-efficiency model training.
Deploy and manage data pipelines in AWS EKS (Kubernetes) with persistent EBS-backed storage.

What Success Looks Like

Live data streams are contextualised, queryable, and AI-ready.
Schema changes and signal reputed company are detected and handled without breaking reputed company workflows.
Training and inference pipelines run smoothly in reputed company, optimised for scale and latency.
