About the position
JOB DESCRIPTION
JOB REQUIREMENTS
Qualification:
- BTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering, or an equivalent qualification, coupled with 13 years' experience
- BEng/MTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering, or an equivalent qualification, coupled with 9 years' experience
- MEng in Computer Science, Software Engineering, Information Systems, Electronic Engineering, or an equivalent qualification, coupled with 7 years' experience
- PhD in Computer Science, Software Engineering, Information Systems, Electronic Engineering, or an equivalent qualification, coupled with 5 years' experience
Experience:
- 3+ years in a technical leadership or software/system architecture role with direct responsibility for large-scale or platform-scale distributed systems.
- Demonstrated hands-on experience in infrastructure design and automation, distributed systems, observability, CI/CD, container orchestration (e.g., Kubernetes), DevOps/SRE practices, and cloud-native technologies.
- Experience leading teams or initiatives that intersect with data platforms, storage, networking, and systems engineering domains.
Knowledge:
- In-depth understanding of systems engineering principles, including performance optimisation, fault tolerance, and resource scheduling in Linux-based environments.
- Strong knowledge of containerised environments (Docker, Podman), orchestration platforms (Kubernetes, Helm), and runtime architectures (containerd, CRI).
- Expertise in infrastructure-as-code, continuous integration/deployment (CI/CD), and configuration management tools (e.g., GitLab CI, Ansible, Terraform, ArgoCD).
- Advanced understanding of distributed computing and storage architectures, including Ceph, S3, NFS, and local/clustered file systems.
- Operational and architectural fluency in relational and NoSQL database systems (e.g., PostgreSQL, MySQL, MongoDB), including replication, backups, and performance tuning.
- Working knowledge of networking fundamentals, security protocols, and systems-level observability (e.g., Prometheus, Grafana, ELK/EFK stack).
- Familiarity with the HPC ecosystem (e.g., SLURM and other job schedulers) is beneficial for environments supporting scientific or research computing.
Essential Competencies:
- Demonstrated technical leadership (3+ years), leading cross-functional efforts across systems, storage, and database infrastructure, driving technical decisions from architecture through implementation.
- Systems engineering expertise, with a focus on Linux administration, infrastructure automation, service orchestration, and performance optimisation across diverse environments.
- Expertise in distributed systems architecture, including the design and deployment of scalable, resilient services using microservices, event-driven, and cloud-native design patterns.
- Containerisation and orchestration fluency, including production-grade usage of Kubernetes, Docker, and Helm for system and application-level deployments.
- Infrastructure automation and CI/CD, using tools such as GitLab CI, ArgoCD, FluxCD, Jenkins, or GitHub Actions to streamline and secure platform operations.
- Complementary DevOps and SRE practices, blending infrastructure-as-code, configuration management, and release automation (DevOps) with incident response, monitoring, SLIs/SLOs, and system reliability engineering (SRE).
- Linux expertise, including advanced troubleshooting, kernel tuning, system orchestration, and optimisation at scale.
- Technical delivery and planning capabilities, including backlog scoping, cross-team collaboration, and Agile sprint execution.
- Database administration skills, with operational experience in administering relational and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB), including high availability, backups, replication, and performance tuning.
- Diagnostic skills, with a root-cause-first approach, and a strong bias for ownership, accountability, and long-term operational stability.
Desired Skills:
- Problem solving and analysis
- Technical leadership
- Resource management/leadership