Environment

Our client is transforming the way global shipping contracts are created, executed, and fulfilled. As the leading digital contracting platform for the ocean freight industry, they enable shippers and carriers to improve performance, reduce friction, and increase trust, all powered by data. They are looking for a Senior Data Engineer to help them scale their data platform, support product innovation, and enable advanced AI, analytics, and compliance initiatives.

As a Senior Data Engineer, you’ll design and build highly scalable data pipelines, architect foundational data systems, and support machine learning and GenAI capabilities. You’ll also contribute to the backend service layer, working with Java, Python, and microservices to ensure seamless data integration between internal systems and their broader platform.

Duties

Platform & Infrastructure Engineering
- Build and maintain robust data pipelines (batch and streaming) using Airflow, AWS Glue, Step Functions, Lambda, and more
- Develop microservices and data-centric APIs in Java, with clean modular architecture and secure data access patterns
- Deploy and monitor services in AWS with infrastructure-as-code tools like Terraform and Docker

Data Modelling, Observability & Lineage
- Design and implement reliable data models to support analytics, data products, and AI workloads
- Establish data lineage, quality monitoring, and testing frameworks using tools like Great Expectations, Marquez, or Monte Carlo
- Maintain metadata management and documentation for compliance and discoverability

Data Science & GenAI Enablement
- Collaborate with data scientists to provision training datasets, feature stores, and model pipelines
- Build orchestration and evaluation workflows to support LLM and GenAI development (e.g., RAG pipelines, embedding search, document intelligence)
- Integrate unstructured data (PDFs, documents, messages) into structured datasets for analytics and AI

Security & Compliance
- Implement best practices aligned with SOC 2, GDPR, and internal infosec standards
- Ensure secure access controls, audit logging, and encrypted storage for sensitive data
- Work with cybersecurity and infrastructure teams to ensure end-to-end data governance

Cross-functional Collaboration
- Partner with engineering, product, analytics, and operations teams to support cross-cutting data initiatives
- Collaborate closely with backend and DevOps engineers to align services, APIs, and deployment patterns

Requirements
- 7+ years of experience in data engineering or backend software development
- Proficiency in Java and Python, with experience developing microservices and scalable APIs
- Strong expertise in SQL, data modelling, and building reliable ETL/ELT pipelines
- Deep familiarity with AWS services (Step Functions, Lambda, Glue, S3, Redshift)
- Hands-on experience with Airflow, dbt, or similar orchestration and transformation tools
- Knowledge of data lineage, quality frameworks, and monitoring systems
- Prior experience working alongside data scientists or ML engineers

It’s a plus if you have:
- Experience with AIOps or GenAI systems
- Familiarity with real-time streaming (e.g., Kafka, Kinesis) and event-driven architectures
- Exposure to data privacy regulations and SOC 2 compliance
- Background in logistics, supply chain, or a data-rich SaaS environment
Senior Data Engineer (AI-Driven Engineer) (CPT - Hybrid)
DATAFIN RECRUITMENT
Cape Town, Cape Town
Published 14 days ago