[Remote] Principal Data Engineer
Note: This is a remote position open to candidates in the USA. The company is a leading provider of finance process automation software, specializing in Accounts Payable and Accounts Receivable automation. It is seeking a Principal Data Engineer to lead its Document Intelligence initiatives, focusing on machine learning, data science, and intelligent document processing. The role involves designing systems that convert document data into actionable intelligence and collaborating with various teams to achieve this goal.
Responsibilities
- Lead research and engineering efforts in document intelligence, including OCR post-processing, document classification, information extraction, and layout understanding
- Design and implement scalable machine learning pipelines and data architectures that support document AI workloads in production environments
- Define the technical vision and roadmap for document intelligence capabilities across the organization
- Collaborate with cross-functional teams to translate business requirements into ML system designs, model architectures, and data platform decisions
- Evaluate and adapt state-of-the-art NLP and vision-language models for document understanding tasks
- Establish best practices for ML experimentation, model versioning, evaluation, and deployment (MLOps)
- Mentor and provide technical guidance to engineers and researchers across the team
- Drive data architecture decisions that support both model training pipelines and analytics and reporting needs
- Publish or present research findings internally and, where appropriate, externally
Skills
- 10+ years of professional experience in R&D, machine learning, applied research, or data engineering
- Deep expertise in Document Intelligence — including OCR, document parsing, layout analysis, information extraction, and classification
- Strong data architecture background, including experience designing data lakes, feature stores, and ML data pipelines
- Proficiency in Python and relevant ML frameworks (PyTorch, TensorFlow, HuggingFace Transformers, etc.)
- Experience taking ML models from research and prototyping through to production deployment at scale
- Solid understanding of NLP fundamentals and modern large language and vision-language model architectures
- Experience with cloud-based ML platforms and infrastructure (AWS, GCP, or Azure)
- Strong written and verbal communication skills, including the ability to convey complex technical concepts to both technical and non-technical stakeholders
- PhD or Master's degree in Computer Science, Machine Learning, Computational Linguistics, or a closely related field
- Experience with document AI frameworks such as LayoutLM, Donut, PaddleOCR, Amazon Textract, or similar
- Publications or contributions to peer-reviewed research in NLP, computer vision, or document understanding
- Familiarity with document workflows such as AP automation, contract processing, medical records, or similar domains
- Prior experience in a principal, staff, or lead engineer role with ownership over a technical domain