Position Overview

The Platform Engineer is responsible for designing, building, and maintaining the internal developer platform that enables software engineering teams across Sanlam Fintech to deliver value efficiently and reliably. This role focuses on creating self‑service infrastructure abstractions, standardised tooling, and automated workflows that reduce cognitive load on development teams while enforcing organisational best practices for security, compliance, and operational excellence.

You will work at the intersection of software development and infrastructure operations, applying software engineering principles to infrastructure challenges. Your primary customers are fellow engineers, and your success will be measured by developer productivity, platform adoption, and reduction in operational toil across multiple diverse teams.

What will you do?

Internal Developer Platform & Portal
- Design and implement self‑service developer portals and golden paths that enable development teams to provision infrastructure, deploy applications, and manage services independently
- Build and maintain internal developer portal capabilities that provide service catalogues, documentation, and operational dashboards
- Create intuitive abstractions that hide infrastructure complexity while maintaining flexibility for diverse team requirements

Infrastructure as Code & Cloud Architecture
- Develop and maintain reusable Terraform modules and CloudFormation templates that encode organisational standards for AWS resources
- Design and implement serverless architectures using AWS Lambda, API Gateway, Step Functions, and related services optimised for cost and performance
- Apply clean architecture principles and domain‑driven design to platform component boundaries and infrastructure organisation

CI/CD & Automation
- Build and maintain CI/CD pipelines using GitHub Actions that enforce security scanning, quality gates, and deployment best practices
- Implement GitOps workflows and automated deployment strategies including blue‑green and canary deployments
- Leverage AI tools (Claude, GPT) to accelerate development, improve automation quality, and enhance documentation
- Deploy, operate, and optimise Kubernetes clusters (EKS) and containerised workloads
- Develop Helm charts, Kubernetes operators, and standardised deployment manifests for platform consumers

Observability & Reliability
- Implement comprehensive observability using Datadog including custom dashboards, alerts, and SLO tracking
- Define and monitor platform SLIs, SLOs, and error budgets to ensure reliability standards are met
- Work across multiple diverse teams to identify optimisation opportunities and drive adoption of platform capabilities
- Collaborate with development teams to understand their needs and build appropriate abstractions that improve their workflows
- Maintain comprehensive documentation in Confluence and manage work effectively using Jira and the Atlassian suite

Qualification and Experience
- 5+ years of experience in software engineering, DevOps, platform engineering, or site reliability engineering
- Demonstrated experience building and operating production infrastructure at scale
- Track record of improving developer productivity and implementing self‑service capabilities
- Experience working across multiple teams to drive adoption of platform capabilities
- Relevant tertiary qualification in Computer Science, Engineering, or related field (or equivalent practical experience)

What will make you successful in this role?

Technical Proficiencies
- AWS: Strong proficiency with core services including Lambda, API Gateway, ECS/EKS, S3, RDS, DynamoDB, CloudWatch, IAM, and VPC. Experience designing serverless architectures
- Infrastructure as Code: Expert‑level Terraform skills with experience writing reusable modules. Proficiency with AWS CloudFormation
- Kubernetes: Experience deploying and operating Kubernetes clusters, writing Helm charts, and managing containerised workloads

Development & Automation
- Programming: Strong proficiency in at least one of: Python, Java, Node.js, or Rust. Ability to write production‑quality code for platform tooling and automation
- CI/CD: Experience designing and implementing CI/CD pipelines using GitHub Actions. Understanding of GitOps principles
- Version Control: Advanced GitHub skills including branching strategies, pull request workflows, and repository management
- Cloud Architecture: Understanding of serverless patterns, event‑driven architectures, and cloud‑native design principles
- Software Architecture: Knowledge of clean architecture principles and domain‑driven design (DDD) patterns

Observability & Tooling
- Monitoring: Experience with Datadog (or similar) for metrics, logging, APM, and alerting. Understanding of SLIs, SLOs, and error budgets
- Collaboration Tools: Proficiency with Atlassian suite (Jira, Confluence) for documentation and project management
- AI Tools: Experience using AI assistants (Claude, GPT) effectively for development acceleration and automation

Nice‑to‑Have Skills
- Experience designing and implementing internal developer portals (Backstage, Port, or custom solutions)
- Service mesh experience (Istio, Linkerd) and advanced Kubernetes patterns
- FinOps experience and cloud cost optimisation strategies
- Security automation and DevSecOps practices including SAST, DAST, and container scanning
- Experience with multiple programming languages from the required list
- Financial services or fintech industry experience
- AWS certifications (Solutions Architect, DevOps Engineer)
- Experience with event‑driven architectures using Kafka, EventBridge, or SNS/SQS

Our Commitment to Transformation

The Sanlam Group is committed to achieving transformation and embraces diversity. This commitment is what drives us to achieve a diverse, inclusive and equitable workplace as we believe that these are key components to ensuring a thriving and sustainable business in South Africa. The Group's Employment Equity plan and targets will be considered as part of the selection process.
Senior Platform Engineer
SANLAM LIMITED
Bellville
Published 6 days ago