Tezza Business Solutions Limited | Full Time | Lagos | Posted 8 months ago
We are recruiting to fill the position below:
Job Title: Site Reliability Engineer (SRE)
Location: Lagos
Department: Information Technology
Job Summary
- We are seeking a skilled and experienced Site Reliability Engineer (SRE) to join our IT team.
- As an SRE, you will play a critical role in ensuring the reliability, performance, and scalability of our systems.
- The ideal candidate should have 3 to 5 years of relevant experience, a strong background in systems architecture, and a passion for implementing best practices in reliability engineering.
Responsibilities
- Collaborate with cross-functional IT teams to define and implement reliability goals for systems and applications.
- Design, implement, and maintain tools for monitoring, alerting, and incident response to ensure system reliability and availability.
- Conduct performance analysis and capacity planning to scale infrastructure and applications proactively.
- Automate deployment, scaling, and management of applications and infrastructure.
- Implement and maintain CI/CD pipelines to ensure efficient and reliable software delivery.
- Collaborate with development teams to optimize application performance, reliability, and scalability.
- Respond to and resolve incidents, identify root causes, and implement preventive measures.
- Participate in on-call rotations to provide 24/7 support for critical systems.
- Implement security best practices and contribute to the development of security-focused tools.
- Stay updated on emerging trends and technologies in site reliability engineering.
Requirements
- Bachelor’s Degree in Computer Science, Information Technology, or a related field.
- 3 to 5 years of proven experience as a Site Reliability Engineer.
- Proficiency in scripting and programming languages (e.g., Python, Bash, Go).
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Strong knowledge of cloud platforms (e.g., AWS, Azure, or Google Cloud).
- Expertise in monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible).
- Understanding of networking principles and protocols.
- Excellent problem-solving and debugging skills.
- Ability to collaborate effectively with cross-functional teams and communicate technical concepts to non-technical stakeholders.
- Proactive attitude towards learning and staying updated on industry trends.
Preferred Qualifications
- Master’s Degree in Computer Science or a related field.
- Relevant certifications (e.g., AWS Certified DevOps Engineer - Professional, Google Cloud Professional Cloud DevOps Engineer).
- Experience with microservices architecture.
- Knowledge of incident response and post-mortem analysis.
- Contribution to open-source projects or a strong portfolio of previous work.
- Familiarity with observability tools (e.g., Jaeger, OpenTelemetry).
Application Closing Date
Not Specified.
Job Features
Job Category | Site Reliability Engineer ( |