We have partnered with our client in their search for a DevOps Platform Engineer.
Responsibilities
Write and execute Infrastructure as Code (IaC) pipelines capable of deploying standard cloud services including Virtual Networks, Firewalls, Load Balancers, Storage Accounts, Application Programming Interface (API) Management Gateways, Kubernetes clusters, messaging bus services, managed databases, and Virtual Machines
Write and execute code capable of managing Cloud governance policies, security, and cost management constructs
Implement Configuration Management and GitOps practices to maintain infrastructure using Ansible, Salt, and Terraform
Plan and build cost-effective Platform as a Service (PaaS) solutions including provisions for high availability and disaster recovery
Code and deploy infrastructure leveraging Availability Zones/Availability Sets
Create and maintain infrastructure documentation and operational procedures using tools such as Confluence and Lucidchart
Collaborate with and provide knowledge transfer to operational support teams and colleagues
Create monitoring alerts and remediation workflows
Build auto-remediation capabilities using automation frameworks and serverless functions
Implement autoscaling features, container management policies, and specify virtual hardware to optimize for low cost and high application performance
Monitor, operate, maintain, and improve the cloud environment based on operational metrics, Service Level Agreements (SLAs), and best practices
Skills Required
Bachelor’s Degree in Computer Science or related IT discipline preferred
Strong experience with Microsoft Azure Cloud production environments
Strong experience coding and scripting in one of the following: Python, Bash, PowerShell
Strong experience deploying and operating containerized applications
Experience deploying, scaling, and administering production Kubernetes clusters - Azure Kubernetes Service (AKS) preferred
Excellent knowledge of Continuous Integration (CI)/Continuous Delivery (CD) and Infrastructure as Code (IaC) automation tools
Proficient in IT Infrastructure, networking concepts, IT security, server engineering, virtualization, data center tools, processes, and modern event-driven application architecture
Strong understanding of DR (Disaster Recovery) and HA (High Availability) solutions and their use
Experience working as a Linux Systems Administrator or Windows System Administrator
Experience with Configuration Management platforms: Ansible, Chef, Puppet, Salt
Experience designing and building critical production solutions which are: implemented as code, fully automated, auto-scaled, fully instrumented, and monitored
Experience with VMware vRealize management suite, including vRealize Automation and Orchestration
Experience designing monitoring and alerting strategies, backup, and disaster recovery
Experience operating and leveraging hyperconverged computing platforms such as Nutanix and VxRail
Experience with industry frameworks and methodologies: ITIL/Agile/Scrum/DevOps/SRE