Director of Site Reliability Engineering (SRE)
eToro
- IT
- Bnei Brak, Israel
- Management
- Full-time
Description
We are seeking a highly skilled and experienced Director of Site Reliability Engineering (SRE) to lead our SRE and QA teams. The ideal candidate will ensure that all software deliverables are developed, tested, and shipped under defined quality and reliability standards. This role will establish a culture of reliability and quality, merging SRE principles with development practices throughout the software development lifecycle (SDLC), from development to production.
Key Responsibilities:
Quality and Reliability Standards:
- Ensure all software deliverables meet the highest quality and reliability standards before being shipped to customers.
- Implement robust testing protocols and reliability benchmarks to maintain product integrity.
Culture of Reliability and Quality:
- Establish and nurture a culture that prioritizes reliability and quality.
- Integrate SRE principles with development practices, ensuring these principles are adhered to throughout the SDLC.
Team Management:
- Oversee the hiring, development, and management of SRE and QA teams.
- Ensure team alignment with organizational goals and maintain effective team dynamics.
- Lead three SRE teams, including reliability infrastructure development.
Encouraging SLO Improvement:
- Encourage and lead Site Reliability Engineers to improve Service Level Objectives (SLOs).
Transparency and Reporting:
- Maintain transparency with executive management and key stakeholders regarding the health, performance, and risks of software products.
- Provide regular updates and insights into operational performance and challenges.
Project Coordination and Reporting:
- Ensure comprehensive end-to-end (E2E) integration and coordination for major cross-functional projects.
- Report progress, milestones, and challenges to the executive team, ensuring timely and effective communication.
Operational Excellence and Cost Efficiency:
- Drive operational excellence across the SRE and QA teams.
- Implement strategies to enhance cost efficiency while maintaining high standards of reliability and performance.
Requirements
Soft Skills:
- Hands-On Mentality: Willingness to be directly involved in technical issues and problem-solving.
- Collaboration: Ability to work collaboratively with cross-functional teams, fostering a culture of shared goals and mutual support.
- Innovation: Proactive approach to identifying and implementing innovative solutions to enhance reliability and efficiency.
- Customer Focus: Commitment to delivering high-quality products that meet or exceed customer expectations.
- Willingness to be part of a mission-critical environment and to own the SLA of a 24/7 financial platform.
Experience:
- At least 3 years in a senior leadership role as an SRE or Monitoring Manager (Senior Lead/Director level).
- Proven track record with, and knowledge of, the SDLC, DevOps practices, and CI/CD.
- Strong experience leading infrastructure operations and development teams.
Leadership Skills:
- Ability to engage business and development teams through strong leadership to drive cultural change, preferably in a matrixed environment.
- Strong leadership skills with a demonstrated ability to manage and inspire high-performing teams.
Technical Expertise:
- Experience managing technology stacks across the entire lifecycle, including provisioning, monitoring, capacity planning, performance management, and automation.
- Familiarity with coding best practices and the ability to guide teams in adhering to them.
Educational Background:
- BA/BS in Computer Science or a related field, or equivalent academic knowledge.