About the position
Senior Data Engineer (Kafka Streaming, Spark, Iceberg on Kubernetes)
Remote | R100 000 - R110 000 per month
About Our Client
Our client is a fast-growing, technology-driven company building advanced, high-performance data platforms that power analytics, AI, and business intelligence. Operating at the forefront of real-time data streaming and distributed computing, they're known for their strong engineering culture, technical depth, and commitment to innovation. The environment encourages autonomy, collaboration, and continuous learning across global teams.
The Role: Senior Data Engineer
As a Senior Data Engineer, you'll architect and develop real-time data processing systems that push the boundaries of performance and scalability. You'll lead initiatives to design and optimize modern data pipelines and platforms using Kafka, Spark, and Apache Iceberg, all running on Kubernetes. This role offers the opportunity to shape data infrastructure strategy and mentor engineers within a technically elite, innovation-driven team.
Key Responsibilities
- Design, build, and optimize highly scalable, low-latency data pipelines and architectures.
- Develop and manage Iceberg-based data lakes with schema evolution and time-travel capabilities.
- Implement robust streaming and ETL workflows using Apache Spark (Scala/Python) and Kafka Connect/Streams.
- Deploy, monitor, and scale distributed data services on Kubernetes using containerization best practices.
- Optimize performance and resource efficiency across Spark jobs, Kafka clusters, and Iceberg tables.
- Establish and enforce engineering best practices, including CI/CD, testing, and code quality standards.
- Collaborate across data, DevOps, and analytics teams to enable reliable data delivery and governance.
- Mentor engineers and foster a culture of technical excellence and innovation.
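For a flavour of the stack (illustrative only, not part of the role description): streaming jobs like those above are commonly submitted to Kubernetes via the Spark Operator's SparkApplication resource, with Iceberg wired in through Spark catalog configuration. All names, images, and paths in this sketch are placeholders:

```yaml
# Illustrative sketch only: image, namespace, bucket, and file paths are placeholders.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: kafka-to-iceberg-stream
  namespace: data-platform
spec:
  type: Python
  mode: cluster
  image: example.registry/spark-iceberg:3.5.0   # hypothetical image with Iceberg jars included
  mainApplicationFile: local:///opt/app/stream_job.py
  sparkVersion: "3.5.0"
  sparkConf:
    # Register an Iceberg catalog so the job can write Iceberg tables
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
    "spark.sql.catalog.lake": "org.apache.iceberg.spark.SparkCatalog"
    "spark.sql.catalog.lake.type": "hadoop"
    "spark.sql.catalog.lake.warehouse": "s3a://example-bucket/warehouse"
  driver:
    cores: 1
    memory: "2g"
    serviceAccount: spark
  executor:
    instances: 3
    cores: 2
    memory: "4g"
```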
About You
- 5+ years professional experience in data or software engineering.
- Expert in Apache Spark (batch and streaming).
- Proven experience with Apache Kafka (Connect, Streams, or ksqlDB).
- Hands-on knowledge of Apache Iceberg, including table management and optimization.
- Strong programming skills in Python (PySpark) or Scala.
- Experience deploying distributed systems on Kubernetes (Spark Operator advantageous).
- Deep understanding of data modeling, warehousing, and performance optimization.
- Advantageous: Familiarity with AWS, Azure, or GCP; Flink; Trino.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field preferred.