IO is a flexible remote organisation with its origins in Cape Town. We design, build and ship scalable digital products. We work collectively on client and startup projects (which makes work at IO exciting, dynamic and agile) and we launch about 4 to 5 digital products per year. We are committed to modern methodologies and team structures and we really do put our people first.
About the role
We design, build and ship scalable products and startups. We are cultivating a company with a collaborative culture and strong values and work ethics, supported by a team-based organisational structure. The products we build are exceptional, and so is our team. We are always on the lookout for people to support our clients, stakeholders, entrepreneurs, product designers, developers and engineers to deliver the quality products and startups we strive for in IO.
We require a Site Reliability Engineer who has a solid track record of working in a collaborative team and is well-versed in Agile practices. Our ideal candidate can research, apply and document pragmatic SRE solutions, tools and software in line with our company's maturity and product requirements.
Our ideal candidate has
- 3 or more years of experience in the field of DevOps or Site Reliability Engineering
- Experience with Cloud providers (AWS, Google Cloud Platform, Microsoft Azure, DigitalOcean or similar)
- Experience with installing, setting up and running Linux distributions (Ubuntu, SUSE, Red Hat or similar)
- A technical background, with coding experience in a language like Python, PHP, JavaScript or similar (previous development experience would be a bonus)
- The ability to use Git as a version control system, and experience with GitLab, GitHub, Bitbucket or similar
- Strong skills in using configuration, serialisation or markup languages like YAML, JSON, HCL and XML
- An understanding of Agile development processes
- Experience in managing environments using Terraform and Kubernetes
- Excellent organisational and time management skills
- Accuracy and attention to detail
- Self-development skills to keep up to date with fast-changing trends
- The ability to continually improve on processes, tools, documentation and their own skill-level
- A true problem-solver and self-starter
- Knowledge of the software development lifecycle (from UX / UI design to deployment)
- Knowledge of project and defect management tooling and SaaS platforms, specifically ClickUp (or something similar like Jira)
- Patient, collaborative and transparent
- Respectful towards their peers
- Well-spoken and articulate
- Pragmatic and logical
- A team player, and understands team dynamics
- Fantastic at prioritisation
- Trustworthy (trust is a major part of this role, and it comes with the territory)
- GitLab (or GitHub / Bitbucket)
- ClickUp (or Jira)
- Google Workspace
- Docker
- GCP, AWS, Azure
- Terraform & Kubernetes
- Ansible, Chef or Puppet
- Availability of applications (Kubernetes) and Cloud infrastructure
- Latency and networking of Cloud infrastructure and applications (Kubernetes)
- Performance of Cloud infrastructure and applications, performance testing, auto-scaling setups
- Efficiency in automating processes and creating reusable code libraries for IaC (Infrastructure as Code)
- Change management using Infrastructure as Code, Git, GitLab, HCL, YAML
- Monitoring of Cloud infrastructure and applications
- Identifying, handling and responding to emergency issues, and drafting security response plans
- Capacity planning of applications and Cloud infrastructure, clusters, VMs, storage
- High-security practices and enforcement (HashiCorp Vault/Boundary), vulnerability scanning and reports, orchestrating penetration testing of applications and infrastructure
- Building, scaling and maintaining the infrastructure for various projects
- Develop, maintain, and configure software to automate processes and improve efficiency
- Testing and optimising systems to create stable, operational environments
- Perform code reviews, maintain and improve code quality
- Compose and maintain documentation of our infrastructure and tooling
- Collaborate with cross-functional teams to define, ship and scale new features
- Above all else, we value an attitude of lifelong self-learning. We are a team of people who keep up to date and continue to educate ourselves through research, mentoring and discussions
- An attitude of openness to keep learning is more important to us than fancy qualifications
- We are looking for highly motivated individuals who are willing to be part of a growing company. You must display a continuous willingness to learn and grow as a team player, and adaptability and flexibility in terms of tech stacks used
- We expect you to take full ownership of your work, and to be a reliable team member, especially when production issues arise and need to be tackled quickly
- We take the time to put good structures, apps and tools in place to make work-life as easy as possible at IO, but your teams will still rely on you to display coping skills when it comes to complexity and real-world deadlines
- We are a very close-knit, supportive and kind team
- We are proud of what we build, and we believe in our products
- We build great products and startups with a fantastic and highly skilled team, focused on standards, quality and efficiency
- We stay ahead of the curve
- We are remote and flexible
- We believe in continuous improvement
- We don't micro-manage
- We have a flat but mutually respectful structure
- We want to assist you to grow at all times
At IO we pride ourselves on our ability to stay current while adopting modern methodologies to achieve the best possible results for our clients and teams.
Principle #1
Customer Value = Business Value
We are as invested in the success of our clients' products as they are themselves. Working together towards a shared goal delivers value and meaning far beyond what is simply represented on an invoice.
Principle #2
Work in short cycles
Short work cycles allow us to quickly learn from our actions and make evidence-based decisions.
Principle #3
Hold regular open retrospectives
We use regular retrospectives to look back at our work as a team and improve our performance and work relationships swiftly.
Principle #4
Go and See
We observe, learn and share. We amplify good patterns and successes and make them part of our daily discussions.
Principle #5
Fast and Flexible
We quickly identify what we need to know, progressing to research and validating critical assumptions when and where it adds the most value.
Principle #6
Work as a modern, balanced team
We have a modern staffing model. Working with our dedicated cross-functional teams, we work on the same things at the same time. We empower and trust our teams to work with autonomy.
Principle #7
Radical Transparency
To produce the best possible work, everyone needs to be on the same page, at all times.
Principle #8
Celebrate Achievements
We provide regular incentives and celebrate exceptional work.
Principle #9
Make learning a first-class citizen of your backlog
Learning is part of our product development process. We document learnings and incorporate them into future developments.