Sr SRE
Sardine
Who we are:
We are a leader in fraud prevention and AML compliance. Our platform uses device intelligence, behavior biometrics, machine learning, and AI to stop fraud before it happens. Today, over 300 banks, retailers, and fintechs worldwide use Sardine to stop identity fraud, payment fraud, account takeovers, and social engineering scams. We have raised $75M from world-class investors including Andreessen Horowitz, Visa, Experian, FIS, and Google Ventures.
Our culture:
We have hubs in the Bay Area, NYC, Austin, and Toronto. However, we have a remote-first work culture. #WorkFromAnywhere
We hire talented, self-motivated people and get out of their way.
We value performance and not hours worked. We believe you shouldn't have to miss your family dinner, your kid's school play, or doctor's appointments for the sake of adhering to an arbitrary work schedule.
About the Role:
Site Reliability Engineers (SREs) are responsible for keeping all production services running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our operating environments. As an SRE at Sardine, you will build and run the core components that process billions of events to protect financial institutions from fraud and compliance risks. You will also partner with our other engineering teams to help make their services more performant, scalable, observable, and reliable. We believe every engineering team at Sardine should be responsible for the software they build, and SREs play a critical part in providing the tools, practices, and expertise to make that happen.
You will:
Run our infrastructure with Terraform, CI/CD (GitHub and ArgoCD), and Kubernetes together with the DevOps team
Take a proactive rather than reactive approach to monitoring: build monitoring that alerts on symptoms rather than on outages
Participate in on-call rotations, along with every member of the engineering team
Improve and automate operational processes
Continuously improve the security of the product and of our security operations
Debug production issues across services and levels of the stack
Partner with engineering teams to ensure their products meet production standards
Be willing to step outside your comfort zone into unfamiliar territory to solve unique issues
Help shape our company's engineering culture and keep high engineering standards
An ideal candidate has:
5+ years of experience designing, building, and operating large-scale production systems
Experience with Google Cloud Platform
Experience with monitoring tools like Datadog and, preferably, open-source tooling such as Prometheus, Grafana, and Jaeger (tracing)
Elasticsearch experience is good to have
Experience with container orchestration tools like Kubernetes and tools that support Kubernetes deployment, like ArgoCD and Helm
Strong programming skills, primarily in Go and/or other languages
Strong knowledge of database optimization
Good knowledge of security best practices for cloud infrastructure
Benefits we offer:
Generous compensation in cash and equity
Early exercise for all options, including pre-vested
Work from anywhere: Remote-first Culture
Flexible paid time off, year-end break, self-care days off
Health insurance, dental, and vision coverage for employees and dependents - US and Canada specific
4% matching for 401(k) / RRSP - US and Canada specific
MacBook Pro delivered to your door
One-time stipend to set up a home office — desk, chair, screen, etc.
Monthly meal stipend
Monthly social meet-up stipend
Annual health and wellness stipend
Annual learning stipend
Unlimited access to expert financial advisory services
Join a fast-growing company with world-class professionals from around the world. If you are seeking a meaningful career, you have found the right place, and we would love to hear from you.