Lead Site Reliability Engineer
Kubric is reimagining how businesses leverage content. We are building an end-to-end automation stack around content design, development, changes and deployment.
We are seeking a DevOps Lead who cares deeply about uptime, reliability and automation. You strive to help your colleagues deploy services and features quickly. We’re small but growing fast, so being able to operate independently is a must. Our offices are split between Bengaluru, India and San Francisco, USA, so communicating well via Slack, GitHub, etc. is also important.
You will be focused on providing a rock-solid foundation for us to grow our platform on. To do so, you’ll own our infrastructure from end to end, covering CI, deployment, monitoring, infrastructure scaling, and infrastructure optimization. In addition, you’ll be responsible for making sure engineering ships in a way that keeps the platform performant and stable. You hold the baton on our zero-downtime promise to customers.
You'll be working hands-on, managing the infrastructure our applications need to run as smoothly as possible for our customers. You will work with the application development teams to deploy, monitor and scale the distributed platform to handle real-time AI analysis and large volumes of visual data (images and videos in various formats). We're looking for people with extensive DevOps experience and strong systems programming and networking skills.
- 3-6 years of experience in a DevOps/Software Engineering role.
- Define and implement service level objectives for our applications.
- Define and own a roadmap for our infrastructure that aligns with our growth goals.
- Support Engineering with resources, guidance, and the means to rapidly roll out new services.
- Work with Engineering to improve or overhaul our existing infrastructure management framework and general CI/CD process.
- Familiarity with managing and deploying microservices in a containerized environment.
- You should be at home with Kubernetes, service meshes, and API proxies.
- Proficiency with OS and network fundamentals and strong Linux administrator skills.