Open role

DevOps & Cloud Infrastructure Engineer

San Francisco, CA (On-site) · Full-time

As our DevOps & Cloud Infrastructure Engineer, you will be the backbone that keeps Methodic’s platform running smoothly and efficiently. You'll design and manage the cloud environment hosting our multi-tenant SaaS platform, ensuring we meet our scalability and uptime goals. This role covers everything from CI/CD pipeline automation to monitoring production systems and optimizing performance. You will enable the development team to deploy updates rapidly (we aim for new customers to be able to launch in a day) while maintaining a secure and stable infrastructure.

Responsibilities

Architect, implement, and maintain our cloud infrastructure (AWS/Azure/GCP) to ensure high availability, scalability, and security of the Methodic platform.
Build and manage CI/CD pipelines that automatically test and deploy code, allowing for frequent and reliable releases.
Monitor system performance and reliability using logging and alerting tools; proactively address issues to uphold 99.999% uptime for our customers.
Optimize infrastructure for cost and performance, implementing auto-scaling and load-balancing strategies to handle spikes (e.g., bursts in transaction volume) efficiently.
Use Infrastructure-as-Code to manage configurations, ensuring reproducibility of environments and easy recovery.
Collaborate with developers to ensure new services are designed with deployment and operability in mind (for example, with proper health checks).
Lead incident response for infrastructure or operations issues: troubleshoot live problems, coordinate fixes, and perform root cause analysis with an eye toward prevention.

Requirements

5+ years of experience in DevOps, SRE, or Cloud Infrastructure roles.
Expertise with at least one major cloud platform (AWS, Azure, or GCP).
Strong automation skills using scripting or programming (Bash, Python, PowerShell, or similar) and familiarity with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.).
Experience with monitoring and logging frameworks (e.g., Prometheus/Grafana, ELK stack, CloudWatch) and setting up alerting for critical metrics.
Good understanding of networking and security in cloud environments (VPCs, VPN, security groups, IAM roles, encryption of data in transit/at rest).
Proven ability to manage and optimize high-traffic, mission-critical systems.
Calm and effective under pressure — able to rapidly diagnose issues and work with the team to restore service during incidents.

Submit your application

Provide a few details and our hiring team will reach out with next steps.