This job is no longer taking applications and is displayed only for reference.
Yardi Canada is looking for a System Reliability Engineer to join our team and help build and maintain the infrastructure for a large-scale, cloud-based IoT and security system. This product will be offered to our existing multifamily property management client base. The position is part of a globally distributed team of skilled individuals supporting a distributed, high-availability, scalable, cloud-based system. If this sounds like an exciting challenge and you relish the opportunity to use cutting-edge tools, we encourage you to apply!
We are seeking applicants experienced in:
- Supporting public and hybrid cloud solutions
- Cluster management with Kubernetes or Docker Swarm
- Supporting monitoring and observability with the Elastic stack (Elasticsearch, Logstash, Beats, Kibana), Prometheus, collectd, Icinga, Graphite and Grafana
Preference for experience in the following areas:
- Messaging systems such as Kafka or NATS
- Storage systems such as Cassandra, Redis, Memcached
- Monitoring and management of public cloud costs
- Operating deployments of several hundred CPU cores
- Supporting application deployment using Helm, TeamCity, Git
- Supporting Apache Spark, Hadoop, Ignite (Machine Learning)
- Managing fleets of GPU-based computing resources
- Supporting systems requiring millions of simultaneous TCP connections
You will be responsible for:
- Designing, implementing, and operating scalable, high-availability resources
- Working directly with the development team on projects and production troubleshooting
- Learning about the inner workings of a large-scale IoT platform and business
- Collaborating directly with product owners, stakeholders, and developers to understand requirements and create infrastructure plans
- Participating in an enterprise 24x7 support environment
Computer and Technology Knowledge
- Windows
- Linux
- Internet
- Database software
- Hardware
- Device drivers
- Servers
- Security software
- Programming software
- Web service design
- Data analysis
- Programming languages
- Software development
Essential Skills
- Reading text
- Numeracy
- Oral communication
- Working with others
- Problem solving
- Decision making
- Critical thinking
- Job task planning and organizing
- Finding information
- Computer use
- Continuous learning
Specific Skills
- Conduct business and technical studies
- Provide advice on information systems strategy, policy, management and service delivery
- Assess physical and technical security risks to data, software and hardware
- Develop and implement policies and procedures throughout the software development life cycle
Work Setting
- Internet Service Provider (ISP)
Work Conditions and Physical Capabilities
- Fast-paced environment
- Work under pressure
- Tight deadlines
- Attention to detail
- Sitting
Security and Safety
- Basic security clearance