About the position
Data Engineer
JOB DESCRIPTION
Data Engineering & Pipeline Management
- Design, build, and optimize T-SQL stored procedures, functions, and scripts for high-volume data processing and ECM scoring.
- Develop, deploy, and monitor end-to-end ETL/ELT workflows (e.g., SQL Server Agent, SSIS, Azure Data Factory, or Airflow) with checkpoint/rollback, job tracking, and recovery capabilities.
- Perform data cleansing, preparation, and transformation to support business intelligence and machine learning workflows.
- Engineer and maintain reusable feature store tables (per entity/tax type) for ML models and operational scoring.
- Model and maintain data warehouse structures (3NF, dimensional/star/snowflake), ensuring proper documentation of data lineage.
- Prepare and deliver curated, scored datasets for downstream consumption in Power BI dashboards and analytics environments.
- Develop and maintain audit, telemetry, and job tracking tables to ensure data reliability, restartability, and monitoring visibility.
- Support and troubleshoot production pipelines, optimizing query performance via indexing, tuning, and profiling tools.
Data Quality, Governance, and Compliance
- Implement and monitor data validation, reconciliation, and QA frameworks across the data lifecycle.
- Enforce data security, privacy, and compliance controls in line with corporate and regulatory standards.
- Support the implementation of data governance and lineage documentation, ensuring traceability and adherence to EDM policies.
Collaboration and Cross-functional Support
- Collaborate with data analysts, data scientists, software engineers, and business stakeholders to translate business problems into scalable data solutions.
- Provide accessible, well-documented datasets to support analytics and reporting.
- Contribute to all phases of the SDLC, including requirements, design, development, testing, deployment, and maintenance.
JOB REQUIREMENTS
Qualifications and Experience:
- A tertiary qualification in Computer Science, Information Systems, Data Engineering, Analytics, Mathematics, or Statistics; or Matric with 6-8 years of experience in data engineering, database development, or data management in production environments.
- Proven hands-on experience with SQL Server, including advanced T-SQL development, ETL/ELT workflow design, and performance tuning.
- Demonstrated delivery of production data solutions—both batch and near real-time—within enterprise environments.
- Experience in building and maintaining data warehouses, feature stores, and reusable data products.
- Track record of implementing data governance and quality frameworks, ensuring compliance and traceability.
- Experience in orchestrating complex data pipelines using SQL Server Agent, SSIS, Airflow, or Azure Data Factory.
- Familiarity with cloud-based data architectures (Azure preferred) and version control systems (Git).
- Exposure to Power BI or equivalent visualization tools for reporting and analytics enablement.
- Strong understanding of data security, privacy, and regulatory compliance requirements.
Key Competencies:
- Advanced SQL Server Development: Strong proficiency in T-SQL, stored procedure design, query optimization, indexing, and error handling.
- ETL and Data Warehousing: Expertise in ETL/ELT pipeline design and orchestration for batch and near real-time processing using SQL Server Agent, SSIS, or Azure Data Factory.
- Data Modeling: Solid understanding of normalized and dimensional modeling (3NF, star, snowflake) and scalable architecture design.
- Feature Store Development: Ability to design and maintain reusable feature tables supporting machine learning and operational scoring.
- Data Validation and Quality Assurance: Skilled in implementing validation rules, reconciliation checks, and QA frameworks to ensure data integrity.
- Data Governance and Security: Strong knowledge of data governance, privacy, and compliance standards; experience maintaining data lineage documentation.
- Workflow Orchestration: Experience building restartable, traceable workflows with checkpoint and rollback mechanisms.
- Programming and Scripting: Proficiency in SQL; experience in Python or R for automation and data manipulation is beneficial.
- Cloud Platforms: Familiarity with Azure (preferred) or other cloud platforms such as AWS or GCP for data engineering workloads.
- Version Control and CI/CD: Exposure to Git and CI/CD pipelines for managing data workflow deployment.
- Visualization and Reporting (Beneficial): Ability to prepare scored or curated data for BI tools such as Power BI.
- Performance Optimization: Expertise in performance tuning, query profiling, and indexing strategies to optimize large-scale data operations.
- Collaboration and Communication: Ability to work effectively across technical and business teams, translating complex requirements into practical data solutions.
This position is open to persons with disabilities.
Desired Skills:
- Azure Data Factory
- Data Governance and Security
- Exposure to Git and CI/CD pipelines