Job Summary: As an Operations Engineer, you will play a crucial role in ensuring the stability, reliability, and efficiency of our systems and infrastructure. You will work closely with cross-functional teams to deploy, manage, and optimize our technology stack, contributing to the overall success of our operations.
Responsibilities
- Infrastructure Management:
  - Design, implement, and maintain the company's infrastructure, including servers, networks, and storage systems.
  - Monitor system performance and proactively identify and resolve issues to ensure optimal uptime.
- Deployment and Automation:
  - Develop and maintain automated deployment processes and tools for application and system provisioning.
  - Implement infrastructure as code (IaC) principles to enable efficient infrastructure management.
- Security and Compliance:
  - Collaborate with the security team to implement and maintain security best practices, including vulnerability assessments and patch management.
  - Ensure systems and processes adhere to industry-specific compliance standards.
- Troubleshooting and Incident Response:
  - Investigate and resolve infrastructure-related incidents and outages, working to minimize downtime and service disruptions.
  - Develop and maintain incident response plans for critical systems.
- Performance Optimization:
  - Continuously optimize infrastructure for performance, scalability, and cost-effectiveness.
  - Identify and implement improvements to enhance system reliability.
- Documentation and Knowledge Sharing:
  - Maintain comprehensive documentation of infrastructure configurations, procedures, and troubleshooting guides.
  - Collaborate with team members to share knowledge and best practices.
- Collaboration:
  - Work closely with cross-functional teams, including developers, DevOps engineers, and IT support, to ensure seamless operations.
  - Participate in on-call rotations and provide 24/7 support as required.
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
- Experience with application monitoring tools such as Redgate and New Relic, plus WAS experience.
- Proven experience as an Operations Engineer or in a similar role in a technology-driven environment.
- Proficiency in scripting and programming languages (e.g., Python, Shell, Ruby).
- Strong knowledge of cloud computing platforms (e.g., AWS, Azure, GCP).
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Familiarity with infrastructure as code (IaC) tools (e.g., Terraform, Ansible).
- Solid understanding of networking, security, and system administration concepts.
- Excellent problem-solving and communication skills.
- Certifications such as AWS Certified DevOps Engineer or Certified Kubernetes Administrator (CKA) are a plus.