Company: Interswitch
JOB TITLE: Site Reliability Engineer
JOB LOCATION: Lagos
Employment Type: Permanent
Department: Technology
JOB DETAILS:
- Manage the Availability and Capacity of the Core Applications. Provide support for the Applications and ensure their optimal performance. Set up new Applications in the company’s environment.
Duties and Responsibilities
- Deployment of Applications
  - Support the deployment of Applications in the production environment
  - Implement projects involving the setup and deployment of new Applications and the enhancement of existing Applications
- Automation
  - Automate the activities involved in the management of Applications
- Application Environment Management
  - Ensure 24×7 Availability of all Core Applications
  - Carry out Capacity planning to ensure Applications are always available to meet demand
  - Create visibility into site health and key performance indicators of the Application Systems
  - Ensure up-to-date patching and full compliance with security standards for the Application Systems
  - Ensure up-to-date documentation on all Core Applications and on changes made to them
  - Balance feature development speed and reliability with well-defined Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
- Monitor Systems
  - Monitor the performance, health, and capacity of:
    - Servers
    - Databases
    - Services
    - Storage
    - Network Links
  - Use a variety of monitoring tools such as Nagios, SolarWinds, Kibana, PagerDuty, and AppDynamics
- Troubleshooting
  - Troubleshoot reported issues and proactively identify areas in need of optimization
  - Work with technical support engineers to resolve critical incidents
  - Create and update clear troubleshooting guides for Applications
- Request Fulfilment
  - Implement requests relevant to the operation and enhancement of the Core Processing Applications
Qualifications
- Academic Qualification(s) – Good First Degree in Computer Science, Computer Engineering, or a related field
- Professional Qualification(s) – Service Management certifications (e.g. ITIL) are an advantage
- Experience (Number of relevant years) – Minimum of one (1) year of relevant experience
Requirements:
- Expertise in Linux and Windows Operating systems and Shell scripting
- Technical experience working with cloud technologies
- Experience with build and deployment management (Jenkins) in a CI/CD workflow
- Experience with Chef, Puppet or Ansible for automating all aspects of system and server management
- Good understanding of distributed systems and of container technologies such as Docker and Kubernetes, including container infrastructure and orchestration
- Good understanding of SLO and SLI for Applications
- Experience with DNS, Networking and High Availability solutions
- Proficient in at least one of the following languages: Python, Ruby, Go
- Ability to work across teams to continuously analyze system performance in production, troubleshoot reported issues, and proactively identify areas in need of optimization
- Previous experience developing and driving real-time monitoring solutions that provide visibility into site health and key performance indicators
- Working knowledge of databases
- Working understanding of load balancing technologies
- Working understanding of IT service management (Incident, Problem, Change and Knowledge management)
- Ability to work within a technical team of support engineers through day-to-day operations and critical incidents
Deadline: September 8, 2022
Job Category: Engineering / Technical