Our partner is a global computer security company developing digital security tools for personal computers, servers, and mobile devices. The Cloud Operations Engineer will be primarily responsible for deploying and operating proof-of-concept and development installations that go far beyond traditional or current industry approaches.
Responsibilities
- You will be part of a global team responsible for the Cloud Services that enable continuous protection for endpoint products.
- Responsible for supporting Cloud service measurement, monitoring, and reporting
- Improving overall operational quality through common practices and by working with engineering, QA, IaaS, and product DevOps teams
- Responsible for supporting efforts that improve the operational performance and availability of the client's Production environments
- Responsible for continuous measurement and high availability of the Production environments
- Provide technical support for day-to-day operations of critical Cloud Services as part of an operational support rotation, which requires participation in our on-call rotation
- Part of a 24/7 team providing the first line of operational support, including event response and recovery efforts
- Work closely with Cloud Solution Engineers to ensure system health
- You will have ownership and responsibilities for the high availability of Production environments
- Provide input into the monitoring of systems, applications, and supporting data
- Report on system uptime and availability
- Collaborate with other team members on best practices
- Assist with service deployments to staging & production environments
- Assist with creating and updating runbooks & SOPs
Requirements
- Bachelor’s degree in computer engineering, computer science or information systems & technology or equivalent.
- At least 5 years of hands-on working experience in building & supporting large-scale environments
- 2 or more years of professional work experience supporting complex technical solutions hosted in AWS or GCP.
- Excellent written and verbal communication skills.
- Proven ability to work independently, deploying, testing and troubleshooting systems
- Experience working with and supporting production-level services within public cloud environments
- Strong production support background and experience with in-depth troubleshooting
- Experience working with solutions in both Linux and Windows environments
- Experience using modern monitoring and alerting tools (Prometheus, Grafana, Alerta, Opsgenie, etc.)
- Knowledge of ITIL (IT Service Management) practices, including incident management, problem management, and release management, as well as Agile practices.
- Familiarity with the tools (Jenkins, TeamCity, etc.) and processes used to support a Continuous Integration and Continuous Deployment environment.
- Familiarity with Containerization and associated management tools (Docker, Kubernetes)
- Cloud computing experience (AWS or GCP)
- SQL Server, PostgreSQL or MySQL experience
- Experience with PowerShell or other scripting languages
- At least one AWS certification
- Experience with AWS security (IAM, Security Groups, KMS, etc.)