Remote/Office
Full time
From 2000 EUR net
~ 4h

Site Reliability Engineer

We are looking for a Site Reliability Engineer to work on a project for Calendly, one of the few unicorn startups in the USA. Calendly is one of the leading companies in the US, helping its users stay organized and schedule their meetings effortlessly!

About the role:

You will help engineering teams improve the reliability, performance, resilience, and security of the services they own. Working with a well-defined continuous delivery process and a reasonably instrumented production environment, you will define SLOs and measure SLIs with an eye toward continuous improvement and evolution at scale. The ideal candidate demonstrates exceptional leadership in communicating patterns and improvements that automate tasks, improve stability, secure systems, and increase performance.

Key Responsibilities:

  • Institute resilient infrastructure through source-code-based configuration (infrastructure as code)

  • Demonstrate skills in evaluating, measuring, and improving rapidly evolving systems

  • Collaborate with engineering teams to understand and improve their systems

  • Organize a holistic ecosystem of infrastructure, tools, and capabilities that effectively provides visibility into the health of each component

  • Operate CI/CD pipelines to provision, track, validate, sign, and securely deploy software

  • Grow expertise in cloud concepts, especially IaaS/PaaS with exposure to virtualization technology in support of building our enterprise container infrastructure

  • Implement high availability systems with automated failover across multiple availability zones

  • Lead postmortems of unexpected incidents to prevent recurrence

  • Participate in an on-call rotation to support critical Calendly infrastructure

  • Foster an environment of learning and knowledge dissemination

  • Prototype new solutions and take them into greenfield implementation

  • Define standard practices and tooling around new services, changes, incidents, postmortems, work, and capacity, and work with engineering teams to adopt those practices

We expect you to have:

  • 3+ years of engineering experience supporting high availability systems in production

  • Experience solving infrastructure problems with software

  • Excellent verbal and written English

  • Strong technical knowledge of cloud infrastructure, distributed systems, and reliability practices

  • Experience working in a Linux environment

  • Experience with GCP and/or AWS

  • RDBMS experience

  • Software development experience

  • Experience deploying containerized services (Docker experience preferred)

  • Experience running and securing Kubernetes in production environments

  • Understanding of CI/CD pipelines and application delivery via GitOps

  • Experience with a variety of software monitoring tools

We offer:

  • 6-hour net workday

  • Remote work

  • Flexible working hours

  • 20 vacation days

  • Private health insurance with full pregnancy/maternity coverage and family members included

  • 10 days of paid paternity leave

  • New MacBook Pro 16"

  • Pet-friendly office

  • Net salary starting at 2000 EUR 

  • Monthly home expenses budget of 50 EUR

  • A dynamic and friendly atmosphere where you can further develop your skills while having fun along the way!

Selection process:

  • Short general questionnaire ~ 5 min

  • Intro call with a team ~ 30 min

  • Client-side screening ~ 30 min

  • Client-side Tech Interview ~ 1h

  • HR & Interview with our Chief Cat ~ 1.5h

Want to know how you can apply for this position? 

If this position sounds appealing, send your resume to careers@fatcatcareers.com with the subject line “Site Reliability Engineer - Debele Macke”.