Remote

Site Reliability Engineer



Are you passionate about driving innovation in software operations? Flanksource is on the hunt for a skilled Kubernetes Site Reliability Engineer who thrives on solving complex problems and optimizing systems for reliability and efficiency, and who is eager to dive into the world of open-source, developer-first technologies.

We're looking for someone with a deep understanding of the Kubernetes ecosystem, a knack for automating and improving workflows, and a commitment to service excellence.

Responsibilities:

  • Design, deploy, and maintain Kubernetes clusters across multiple environments
  • Develop and maintain automation tools for deploying and managing Kubernetes clusters
  • Monitor and troubleshoot Kubernetes clusters to ensure high availability and performance
  • Implement security best practices for Kubernetes infrastructure and services
  • Participate in incident response and work to reduce mean time to recovery (MTTR) over time
  • Continuously improve the reliability, scalability, and performance of our Kubernetes infrastructure
  • Manage CI/CD pipelines and other DevOps tools and processes
  • Work closely with developers to ensure that our applications are deployed using best practices

Requirements:

  • Deep understanding of Kubernetes and containers, e.g. as a Certified Kubernetes Administrator (CKA)
  • Experience with two or more infrastructure as code (IaC) tools such as Terraform, Crossplane, Pulumi, or CloudFormation
  • Experience with monitoring and observability tools such as Prometheus, Grafana, ELK, Datadog, or Dynatrace
  • Experience with cloud platforms such as AWS, Azure, or GCP
  • Experience with CI/CD pipelines and tools such as GitHub Actions, GitLab, and Azure DevOps
  • Solid understanding of network and security principles, including VPNs, firewalls, and load balancers
  • Excellent problem-solving skills and ability to work independently or as part of a team
  • Proficiency in the English language, both written and verbal, sufficient for success in a remote and largely asynchronous work environment
  • Comfort working in a highly agile, iterative software development process
  • Self-motivated and self-managing, with strong organizational skills

Preferred:

  • Experience with GitOps principles and tooling such as Flux and Argo CD
  • Experience writing Go, or a desire to learn

Bonus Points for:

  • Contributions to, or a passion for, open source
  • Kubernetes operator and controller development
