JOB TITLE: Site Reliability Engineer
JOB LOCATION: Lekki Phase 1, Lagos
Employment Type: Full-time
JOB DETAILS:
- Kudi software engineers build solutions that will forever change the face of finance and banking in Africa by bringing affordable banking services to the doorstep of people across the continent.
- We’re looking for engineers who can bring fresh ideas and experience to the table from all areas of expertise, including distributed system design, mobile development, systems architecture, networking, security and more.
- As a Site Reliability Engineer, you will view operations as a software problem and use programming and automation extensively to complete operations tasks including configuring, deploying and provisioning applications and dependencies across all environments.
- You will be responsible for ensuring our services and applications serve customers consistently and reliably at all times.
- You will be part of an operations team which works closely with software engineers, using DevOps processes and principles to quickly and reliably deliver value to customers.
- You will react in real time to production incidents and work to contain and resolve them as quickly as possible.
- You will build and maintain CI pipelines which entirely automate the build, test and deployment of all software changes throughout the organisation.
About the Position
- Ensure your team is immediately aware of production errors and prioritizes their repair.
- Provide architectural input to the teams’ development process from an operations and infrastructure point of view, including but not limited to monitoring, alerting, persistence, and tradeoffs given the state of the available hardware.
- Provision cluster resources, repositories, CI/CD pipelines, and credentials for the teams and systems you are responsible for.
- Provide updates to the entire company during outages, downtime, scheduled maintenance and similar events in a professional, respectful, and timely manner.
- Strive to work at the highest standards possible along with the rest of your team.
About You
- Bachelor’s degree or higher in a STEM field.
- 3 years working as a software engineer/site reliability engineer professionally.
- 3 years developing professionally in Python on Linux/Mac/Unix environments, using git.
- 3 years working with Linux/Unix user environments, e.g. bash, grep, awk, sed, etc.
- 2 years of experience working with cloud infrastructure, e.g. GCP, AWS.
- 2 years working with CI/CD tools, e.g. Jenkins, CircleCI, TravisCI, Semaphore.
- 2 years working with SQL and NoSQL databases, e.g. PostgreSQL, Cassandra, MongoDB.
- 2 years working with infrastructure-as-code tools such as Terraform, Ansible, SaltStack, Chef, Puppet.
- Solid knowledge and experience in networking, e.g. HTTP, TCP, UDP, DNS, VPN (IPsec, WireGuard), routing, firewalls, etc.
- Solid knowledge and experience in encryption and security, e.g. AES, ECC, PKCS, PKI, OpenSSL, JWT.
- Experience with Linux system administration, e.g. systemd, iptables, top, stat commands, kernel tuning, user management.
- Experience working with containers & container orchestration, e.g. Docker, Kubernetes.
- Experience with logging, monitoring, and incident management tools, e.g. Prometheus, Grafana, Cloud Logging, Opsgenie, PagerDuty.
- Experience working with Web Servers/Load Balancers, e.g. Nginx, Apache, HAProxy.
- Love for automation.
- Ability and willingness to pick up new technologies quickly and be productive.
Nice to have:
- Multilingual (programming) skills, in particular Python, Java, JavaScript/TypeScript, Golang.
- Experience with Bazel.
- Experience with identity and access management solutions, e.g. Keycloak.
- Experience implementing PCI DSS, ISO 27001, ISO 22301 policies/standards.
- Experience with Google Bigtable.
- Experience managing GitHub organizations and repositories.
Application Closing Date
Not Specified.
Job Category: Engineering / Technical