Position: Lead Data Engineer
Contract Type: Fixed term / Contract
Contract Duration:
- Start Date: 25 May 2026
- End Date: December 2026
Work Model: Hybrid (2-3 days a week)
Work Location: Sandton, Johannesburg, South Africa (Hybrid / Office-based as required)

Role Overview

We are seeking a Lead / Senior Data Engineer to design, build, and operate modern Databricks and Lakehouse data platforms that support advanced analytics, AI, and Generative AI use cases.

This is a senior individual contributor role, operating within product-aligned, cross-functional squads. The successful candidate will deliver high-quality, governed, scalable data assets consumed by analytics platforms, machine learning models, and Generative AI solutions, including LLM- and agent-based systems.

Key Responsibilities

1. Databricks & Data Platform Engineering

Design, build, and operate data solutions using Databricks, including:
- Delta Lake
- Databricks Jobs and Workflows
- Unity Catalog
- Notebooks and shared libraries

Develop scalable, reliable Lakehouse architectures supporting analytics and AI workloads.

2. Data Enablement & Consumption

Enable data consumption for:
- Generative AI use cases (e.g. Retrieval-Augmented Generation, AI services, agent workflows)
- Analytics and reporting platforms
- Downstream operational and business systems

Support feature-style and curated data access patterns required by AI and GenAI workloads.

3. Generative AI Data Enablement

Build and maintain data pipelines that feed Generative AI applications, including:
- Curated knowledge and reference datasets
- Structured and semi-structured data sources
- Metadata, lineage, and traceability for AI consumption

Enable common GenAI data patterns such as:
- Retrieval-Augmented Generation (RAG)
- Contextual and prompt data preparation
- Model input, output, and feedback data flows

4. Engineering Standards & Best Practices

Develop production-grade data pipelines using:
- Python
- SQL
- Apache Spark

Implement automated testing, CI/CD, and deployment practices for data workloads.

Ensure data solutions are:
- Observable
- Resilient
- Performant
- Cost-efficient

Continuously improve data quality, reliability, and operational stability.

5. Collaboration & Ways of Working

Act as a senior engineer within a cross-functional product squad.

Collaborate closely with:
- Product Owners
- AI / Machine Learning Engineers
- Analytics teams
- Platform and security teams

Provide engineering input into design discussions and delivery decisions.
Support peer reviews and contribute to shared engineering standards.
Provide mentorship and technical guidance, including involvement in AI Engineer development.

6. Risk, Governance & Run

Ensure all data solutions comply with enterprise security, risk, and governance standards.
Support the operational stability of data pipelines used by analytics and AI workloads.
Participate in incident resolution and root cause analysis.
Maintain appropriate technical documentation and runbooks.

Required Background & Experience

- 10-15 years of industry experience in data engineering or related fields.
- 5+ years' experience operating as a Senior or Lead Data Engineer.

Mandatory Technical Skills (with minimum experience)

- Databricks (hands-on): 2+ years
- Enterprise data lake / lakehouse architecture: 5+ years
- Python: 5+ years
- SQL: 5+ years
- Apache Spark: 5+ years
- Production-grade data platforms: 3+ years
- Enterprise or regulated environments: 5+ years

Mandatory Skills Summary

- Databricks
- Data lake and lakehouse architecture
- Python
- SQL
- Apache Spark
- Production-grade data platforms
- Enterprise or regulated environments

Desirable / Beneficial Skills

- Experience enabling AI, ML, or Generative AI use cases from a data engineering perspective
- Familiarity with RAG data patterns
- Feature-style or AI-serving datasets
- Vector-based or embedding-ready data workflows
- Experience working in Agile, product-aligned squads
- Exposure to cloud-native data platforms such as AWS or Azure

Desired Skills Summary

- AI, ML, or Generative AI
- RAG data patterns
- Feature-style or AI-serving datasets
- Vector or embedding-ready data workflows
- Cloud-native data platforms (AWS or Azure)
Lead Data Engineer
BELAY TALENT SOLUTIONS
sandton, sandton
Published 7 days ago