This job is no longer taking applications and is displayed only for reference.
Accountabilities:
Leading and managing multiple teams to steer the overall technical strategy, architecture and development of the observability platforms used across various iQmetrix products
Growing the practices of observability within the Site Reliability Engineering group, as well as the Product & Technology team overall
Directing the creation and optimization of processes and tooling used to provide and maintain observability platforms and services
Collaborating continuously with the product development teams to implement observability platforms that meet the needs of product development
Determining and communicating the architectural and technical strategy to teammates and stakeholders
Analyzing current practices and deficiencies to recommend and implement best practices and emerging concepts, with a strong focus on automation and tooling over processes
Integrating feedback from clients into our observability platforms
Integrating feedback from post-incident analysis into our observability platforms
Qualifications:
3-5 years of experience in a DevOps, SRE or SysOps lead role
Excellent communication and interpersonal skills to build trust and relationships with a variety of technical and business stakeholders
Strong analytical and problem-solving capabilities with a drive for continuous improvement
Experience leading and managing a distributed technical team
Project management skills and the ability to delegate tasks effectively
Expert knowledge of system monitoring and log management tools (e.g. New Relic, Prometheus, Elasticsearch)
Broad automation experience in build, test, configuration and deployment in complex environments (e.g. Ansible, Terraform, Chef, Puppet)
Source code management (GitHub) and CI/CD experience
Cloud experience is an asset (Azure, AWS)
Solid scripting experience (PowerShell, Bash, Ruby)