About the position
Our client, a leading retail company, is seeking a highly skilled DevOps Engineer to join its high-performing technology team supporting a large-scale retail environment. In this role, you will design, implement, automate, and support software delivery pipelines, infrastructure platforms, and operational processes across on-premises and multi-cloud environments.
This role requires a strong understanding of modern DevOps practices, infrastructure automation, CI/CD, observability, cloud platforms, disaster recovery, and operational excellence. You will play a key role in the business's transformation journey as it migrates from Azure DevOps Server to GitHub Enterprise and standardises Infrastructure as Code (IaC) using Terraform.
Given the nature of retail operations, the role demands a strong focus on platform reliability, resilience, disaster recovery, and 24/7 system availability.
Main duties will include, but are not limited to:
DevOps & CI/CD
- Design, build, and maintain CI/CD pipelines for .NET and related application workloads.
- Support and optimize Azure DevOps Server while contributing to the migration strategy towards GitHub Enterprise.
- Implement DevSecOps best practices across the software development lifecycle.
- Automate build, test, deployment, and release processes.
- Improve deployment frequency, quality, and reliability through automation.
Infrastructure & Platform Engineering
- Manage and support hybrid infrastructure environments spanning:
  - On-premises data centres
  - Microsoft Azure
  - Amazon Web Services (AWS)
- Develop and maintain Infrastructure as Code solutions using Terraform.
- Standardize infrastructure provisioning and configuration management practices.
- Support IIS-hosted application environments and associated platform services.
Cloud Engineering
- Design and implement cloud-native and hybrid solutions.
- Collaborate with architecture and engineering teams to establish cloud best practices.
- Support workload migrations and modernization initiatives across Azure and AWS.
- Ensure cloud environments meet security, governance, and compliance requirements.
Observability & Operational Excellence
- Utilize Dynatrace for monitoring, performance analysis, alerting, and troubleshooting.
- Develop dashboards, alerts, and service-level indicators and objectives (SLIs/SLOs).
- Drive proactive identification and resolution of performance bottlenecks.
- Participate in incident management and root cause analysis activities.
Disaster Recovery & Business Continuity
- Maintain and improve disaster recovery processes and procedures.
- Conduct DR testing and validation exercises.
- Ensure critical systems meet recovery time objectives (RTO) and recovery point objectives (RPO).
- Work closely with infrastructure and application teams to ensure business continuity requirements are met.
Reliability Engineering
- Improve platform resilience, scalability, and availability.
- Support highly available environments including:
  - IIS Web Farms
  - F5 Load Balancers
  - Multi-cloud services
- Participate in on-call support and major incident resolution where required.
Collaboration
- Work closely with Software Engineers, Systems Analysts, Infrastructure Engineers, Security teams, and Architects.
- Mentor development teams on DevOps best practices.
- Promote a culture of automation, continuous improvement, and operational excellence.
Minimum Requirements:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or related field.
- Relevant industry certifications are advantageous, including:
  - Microsoft Azure Certifications
  - Amazon Web Services (AWS) Certifications
  - HashiCorp Terraform Certifications
  - GitHub Certifications
  - ITIL Foundation
- 5+ years of DevOps, Platform Engineering, or Site Reliability Engineering experience.
- Experience managing enterprise CI/CD platforms.
- Experience supporting mission-critical production environments.
- Experience with hybrid cloud and on-premises infrastructure.
- Proven experience implementing Infrastructure as Code.
- Experience supporting high-availability systems operating 24/7.
- Experience in a retail or high-transaction environment would be advantageous.
Technical Skills:
DevOps & Automation
- Azure DevOps Server
- GitHub Enterprise
- Git
- CI/CD pipeline development
- Release management
- Infrastructure automation
Programming & Scripting
- C#, Python, TypeScript
- PowerShell
- Bash
- YAML
- JSON
Infrastructure as Code
- Terraform (preferred)
- ARM Templates
- AWS CloudFormation (advantageous)
Cloud Platforms
- Microsoft Azure
- Amazon Web Services (AWS)
Application Hosting
- Microsoft IIS
- Windows Server Administration
Networking & Load Balancing
- F5 Load Balancers
- DNS
- SSL/TLS
- Networking fundamentals
Monitoring & Observability
- Dynatrace
- Log analysis
- Performance monitoring
- Alerting and incident management
Operating Systems
- Windows Server
- Linux (advantageous)
Desired Skills:
- DevOps
- CI/CD
- Azure DevOps
- MS Azure
- AWS
- Platform Engineering
- Cloud
- Site Reliability Engineering
- C#
- Python
- TypeScript
- Git
Desired Work Experience:
- 5 to 10 years of Software Development
Desired Qualification Level: