About the position
ENVIRONMENT:
Our client is transforming the way global shipping contracts are created, executed, and fulfilled. As the leading digital contracting platform for the ocean freight industry, they enable shippers and carriers to improve performance, reduce friction, and increase trust, all powered by data.

They are looking for a Senior Data Engineer to help them scale their data platform, support product innovation, and enable advanced AI, analytics, and compliance initiatives. As a Senior Data Engineer, you'll design and build highly scalable data pipelines, architect foundational data systems, and support machine learning and GenAI capabilities. You'll also contribute to the backend service layer, working with Java, Python, and microservices to ensure seamless data integration between internal systems and their broader platform.
DUTIES:
Platform & Infrastructure Engineering
- Build and maintain robust data pipelines (batch and streaming) using Airflow, AWS Glue, Step Functions, Lambda, and more
- Develop microservices and data-centric APIs in Java, with clean modular architecture and secure data access patterns
- Deploy and monitor services in AWS with infrastructure-as-code tools like Terraform and Docker
Data Modelling, Observability & Lineage
- Design and implement reliable data models to support analytics, data products, and AI workloads
- Establish data lineage, quality monitoring, and testing frameworks using tools like Great Expectations, Marquez, or Monte Carlo
- Maintain metadata management and documentation for compliance and discoverability
Data Science & GenAI Enablement
- Collaborate with data scientists to provision training datasets, feature stores, and model pipelines
- Build orchestration and evaluation workflows to support LLM and GenAI development (e.g., RAG pipelines, embedding search, document intelligence)
- Integrate unstructured data (PDFs, documents, messages) into structured datasets for analytics and AI
Security & Compliance
- Implement best practices aligned with SOC 2, GDPR, and internal infosec standards
- Ensure secure access controls, audit logging, and encrypted storage for sensitive data
- Work with cybersecurity and infrastructure teams to ensure end-to-end data governance
Cross-functional Collaboration
- Partner with engineering, product, analytics, and operations teams to support cross-cutting data initiatives
- Collaborate closely with backend and DevOps engineers to align services, APIs, and deployment patterns
REQUIREMENTS:
- 7+ years of experience in data engineering or backend software development
- Proficiency in Java and Python, with experience developing microservices and scalable APIs
- Strong expertise in SQL, data modelling, and building reliable ETL/ELT pipelines
- Deep familiarity with AWS services (Step Functions, Lambda, Glue, S3, Redshift)
- Hands-on experience with Airflow, dbt, or similar orchestration and transformation tools
- Knowledge of data lineage, quality frameworks, and monitoring systems
- Prior experience working alongside data scientists or ML engineers
It’s a plus if you have:
- Experience with AIOps or GenAI systems
- Familiarity with real-time streaming (e.g., Kafka, Kinesis) and event-driven architectures
- Exposure to data privacy regulations and SOC 2 compliance
- Background in logistics, supply chain, or a data-rich SaaS environment
Desired Skills:
- AWS Glue
- Docker
- Java
- Python
About the Employer:
Our client is a leading platform in the shipping industry that addresses the critical issue of contract fulfilment for carriers, shippers, and NVOCCs. Its mission is to unite these stakeholders through shared digital infrastructure, enhancing performance while reducing manual workloads.