Kubernetes Site Reliability Engineer


We are looking for a skilled and motivated Kubernetes DevOps / SRE to join our dynamic team.

As a Kubernetes DevOps / SRE, you will be responsible for ensuring the availability, scalability, and reliability of our Kubernetes infrastructure. You will work closely with our engineering, product, and operations teams to design, implement, and operate our Kubernetes clusters and related services. You will also manage our CI/CD pipelines and other DevOps tools and processes.


  • Design, deploy, and maintain Kubernetes clusters across multiple environments
  • Develop and maintain automation tools for deploying and managing Kubernetes clusters
  • Monitor and troubleshoot Kubernetes clusters to ensure high availability and performance
  • Implement security best practices for Kubernetes infrastructure and services
  • Participate in incident response and work to reduce MTTR over time
  • Continuously improve the reliability, scalability, and performance of our Kubernetes infrastructure
  • Manage our CI/CD pipelines and other DevOps tools and processes
  • Work closely with developers to ensure that our applications are properly deployed and configured in Kubernetes


  • Deep understanding of Kubernetes and containers (e.g. holding the Certified Kubernetes Administrator (CKA) certification, or working towards it)
  • Proficiency in infrastructure as code (IaC) tools such as Terraform and CloudFormation
  • Experience with cloud platforms such as AWS, Azure, or GCP
  • Experience with CI/CD pipelines and tools such as GitHub Actions and Azure DevOps
  • Solid understanding of network and security principles, including VPNs, firewalls, and load balancers
  • Excellent problem-solving skills and ability to work independently or as part of a team
  • Proficiency in the English language, both written and verbal, sufficient for success in a remote and largely asynchronous work environment
  • Comfort working in a highly agile, iterative software development process
  • Self-motivated and self-managing, with strong organizational skills


  • Experience with monitoring and observability tools such as Prometheus, Grafana, and Elasticsearch
  • Experience with logging and log analysis tools such as OpenSearch / Elasticsearch
  • Experience with GitOps principles and tooling such as Flux
  • Experience writing Go, or a desire to learn

Bonus Points for:

  • Contributions to, and/or a passion for, open source
  • Kubernetes operator and controller development