Cloud Site Reliability Engineer

Zürich - On-site
This project is archived and no longer active.

Description

Your role:

Do you want to protect our business by ensuring the integrity and security of our production platforms at all times? Do you want to develop new tools to solve existing problems in the production environment? Do you strongly believe in automation?

We're looking for a Cloud Site Reliability Engineer to deliver against the bank's Cloud Strategy, with a strong MS bias. You will:

  • Be responsible for all aspects of application production support, deployment and monitoring, as well as the development of tools to support these activities
  • Support mission-critical applications and associated platforms, ensuring the highest levels of availability, security, performance and stability are maintained at all times
  • Design and build tools and solutions with a strong bias towards automating as many aspects of support as possible, so that trivial manual support activity is reduced or eliminated
  • Ensure new systems and services can be integrated with existing monitoring and management tools, so that service performance is instrumented and deviations from normal are easily anticipated
  • Manage Cloud services that span storage, security, networking, and compute cloud capabilities

Must have:

  • Strong Azure skills
  • PowerShell, Python, JSON
  • ARM templates

Key deliverables:

  • Ensure production systems run reliably at all times and that availability, performance and business process SLAs are met or exceeded
  • Spend 50% of your time on tickets and 50% on engineering improvements to instrumentation, ease of deployment, service orchestration and other aspects of production support, reducing the burden of manual work as systems and user volumes scale
  • Partner with service transition managers, development leads and architects to ensure designs of new applications meet expected standards for site reliability. Ensure non-functional production support requirements are considered early in the life cycle of all new applications
  • Work with Azure Portal dashboards, Policies, Log Analytics, Azure DevOps, Resource Manager, Subscriptions, Graph API, PowerShell/CLI, Visual Studio Code, Python, JSON template creation, DevOps and Infrastructure as Code

You:

  • Exceptional development and engineering experience and the ability to apply that knowledge to solve the complex problem of running applications reliably at scale
  • Have a blend of skills including sysadmin, security, automation and the ability to code with a deep knowledge of Operating Systems and Application Source Code, Container Fabrics, Networking, Alerting and Monitoring
  • Deep knowledge of Azure Resource Manager, Monitor, Alerts, Security Center, DevOps, RBAC
  • Deep knowledge of application source code in languages such as Java, C++ or C#
  • Deep understanding of each service across the full IT life cycle, and ready to take requests for infrastructure services, applications, and environments
  • Design of solutions (Monitoring/process orchestration/capacity management/deployment) that not only scale but can potentially be leveraged by other parts of the organisation
  • Hands-on experience working in both Agile and DevOps development methodologies
  • Confident in interacting with developers and deep diving into both Application and Infrastructure code
  • Willingness to challenge the status quo and introduce new ideas that will remove or reduce manual effort in relation to operating large production systems at scale.

If you think this could be the right role for you, please apply online or contact Agnieszka Wojcik directly (see below).

Start: ideally ASAP
Duration: 12 months
From: Harvey Nash IT Recruitment Switzerland
Posted: 16.11.2019
Project ID: 1849171
Contract type: Freelance