Site Reliability Engineer

Company	Rubix Solutions
Address	Sydney, NSW
Category	Engineering

Job description

Rubix Solutions is seeking a Site Reliability Engineer (SRE) to join our Sydney-based team. As an SRE, you will play a crucial role in ensuring the reliability, availability, and performance of our critical digital services and systems. You will collaborate closely with software engineering teams, employing a blend of coding skills and operational expertise to design and maintain scalable and resilient infrastructure.

Responsibilities:
Chapter Responsibilities:

Learn and improve your skills through dedicated chapter time.
Share experiences within your chapter and across the Engineering practice.
Generate excitement for technologies that aim to change the future.
Evangelize implementations across the organization where beneficial.

Squad Responsibilities:

Design, implement, and maintain highly available and scalable microservices architectures on the Azure and GCP cloud platforms.
Collaborate with development teams to ensure applications are designed with reliability, scalability, and observability in mind.
Develop and maintain automation tools and processes to streamline deployment, monitoring, and incident response.
Participate in on-call rotations to provide 24/7 support for production systems, responding to and resolving incidents promptly.
Implement and maintain observability tooling for monitoring, logging, and tracing to ensure visibility into system performance and health.
Conduct post-incident reviews and implement improvements to prevent recurrence of issues.
Stay updated on industry best practices and emerging technologies related to cloud architecture, microservices, and observability.
Assist teams in identifying and removing manual repetitive tasks from their work.
Embed observability into all aspects of the application ecosystem.
Evaluate platform and service consumption to optimize costs and capacity.

Role Requirements:

Demonstrable experience working in technical delivery teams, preferably teams of 10 or more people.
Knowledge of cloud infrastructure and integration across enterprise platforms.
Experience in both agile and project-based execution.
Ability to switch context between multiple technologies.
Strong communication, collaboration, and influencing skills.
Experience implementing SRE practices and influencing engineers and product owners to prioritize reliability.
Strong experience with SRE practices, DevSecOps, and observability platforms.
Some experience with cloud-native application development, relational and document databases, and service integration patterns.
Good experience with a subset of the technologies in our current stack, including Cloud (Microsoft Azure, Google Cloud Platform), Code (PowerShell, Bash, C#, .NET Core, Node.js, Go), Databases (Microsoft SQL Server, MongoDB, Cosmos DB, PostgreSQL), CI (Azure DevOps, GitHub, Jenkins), Infrastructure (Kubernetes, Terraform), and Observability (App Insights, Dynatrace, Azure Log Analytics, Google Cloud Monitoring).

Benefits:

Long term engagement
Large enterprise end customer
Collaborative work environment

If this sounds like a match, we'd love to talk to you! Feel free to apply directly or reach out to Stephanie - *********@rubixsolutions.net.au

Refer code: 2151462. Rubix Solutions - The previous day - 2024-05-07 10:45
