A permanent opportunity has arisen for an Engineer, based on the Gold Coast, to manage the incident coordination process within the Networking Operations Centre (NOC). This includes monitoring incoming alerts and tickets, performing triage, and managing the appropriate response, which may be first-line resolution or triggering and coordinating the major incident response process.
There is also a focus on proactively identifying and actioning potential issues before they become incidents. As the role involves responding to live customer-facing issues, the Engineer is expected to work well under pressure and have excellent organisational and communication skills. The NOC is a 24/7 operation and so the Engineer will be expected to work on a shift basis.
What do you need to apply:
- Windows Server OS, monitoring and diagnostic tools, SQL query skills and PowerShell
- Experience in using Grafana, PagerDuty, Kibana, Remedy
- Experience in directing/coordinating multiple Engineers in voice bridge calls
- Black Rock certification advantageous
- Sense of humour
- Impeccable communication skills, both written and verbal
High Severity Incident Support Coordinator
Coordinate technical staff from multiple teams in real-time live chat environments for hours at a time to drive investigation of an issue through to resolution.
Coordinate technical staff from multiple teams in real-time live voice bridges for hours at a time to drive investigation of an issue through to resolution.
Liaise in real time via live chat, voice bridges, telephone and email with internal and external senior leadership and executive-level persons.
Draft and submit ongoing (every 30-60 mins) formal, structured written communications updates for internal and external stakeholders.
Coordinate investigations through to resolution of multiple High Severity instances concurrently.
Ensure the closure of all resolved and end-user confirmed Incident records.
Completes Routine Operations Centre Tasks
First-line response and resolution for the following
- monitoring system alerts (e.g. statsd, graphite, updog, elastic)
- Windows OS Issues
- Cluster disk issues
- Linux OS issues
- ESX Host issues
- issues on internal systems (e.g. Citrix, O365, S4B)
- application or product-specific issues (e.g. IIS app pool re-cycle)
Handles all 1st line escalations for different support teams, updates and resolves tickets as necessary.
Manual re-assignment of PagerDuty calls or following up on unactioned PagerDuty calls.
Monitor activity of tickets and follow up on progress.
Monitor and respond to critical incidents.
Proactively monitor alert trends in real time, e.g. multiple concurrent issues at a site.
Analyse data trends on internal tickets, customer contacts, social media, and network monitors to identify potential issues.
Fulfil Incident Support Coordinator role during High Severity Incident response
- initiate collaboration channels (conference call, chat)
- coordinate incident response
- send out comms to stakeholders
Provide audit data to post-mortem investigations or other investigations.
Consulting for
- Grafana issues/questions -…