Data Pipelines, Azure Databricks, ETL Tools, Extract, Transform, Load (ETL), Programming Languages, NoSQL, Data Engineering, Distributed Systems, Artificial Intelligence (AI)
Job description
About The Company
BluNova is a subsidiary of Blue Label Telecoms responsible for drawing value from its large heterogeneous corpus of data originating from multiple companies within the Blue Label group. With this data, we identify opportunities in the market and develop sustainable solutions to exploit these opportunities. We leverage scalable cloud technologies to deploy our solutions, with a strong emphasis on automation. Most of our data analysis is performed in a distributed Spark environment, where we have the headroom to spin up large clusters exceeding thousands of cores and terabytes of RAM when needed.
Job Purpose
Responsible for building the organisation's data collection systems and processing pipelines. Oversee the infrastructure, tools and frameworks used to support the delivery of end-to-end solutions to business problems through high-performing data infrastructure. Responsible for expanding and optimising the organisation's data and data pipeline architecture, whilst optimising data flow and collection to support data initiatives.
RESPONSIBILITIES
Data
- Identify shortcomings and suggest improvements to existing processes, systems and procedures, then deliver a plan for a small element of a change management program with guidance from a project/program manager
- Build analytics tools that utilise the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics
- Create data tools for analytics and data scientist team members that assist them in building and optimising xxxxxxxxx into an innovative industry leader
- Monitor existing metrics, analyse data, and lead partnerships with other Data and Analytics teams to identify and implement system and process improvements
- Utilise data to discover tasks that can be automated, and identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
- Develop ETL processes that convert data into formats for consumption
- Execute testing and validation in line with data governance and quality business requirements
- Liaise and collaborate with data analysts, data warehousing engineers, and data scientists in finding and applying best practices within the Data and Analytics department, as well as defining the business data requirements, which will ensure that the collected data is of high quality and optimal for use across the department and the business at large
- Act as a subject matter expert from a data perspective and provide input into all decisions relating to data engineering and the use thereof. Provide guidance in terms of setting governance standards
Behavioural Competencies
- Strategy
- Manages Complexity
- Ensures Accountability
- Collaborates
- Plans and Aligns
- Tech Savvy
- Post Graduate Degree: Information Technology
- Post Graduate Degree: Information Studies
- Master's Degree: Information Technology
- Master's Degree: Information Studies
- Minimum 3 years' experience
- IT Architecture
- Data Integrity
- IT Applications
- Data Analysis
- Knowledge Classification
- C++
- Amazon Web Services
- Amazon S3
- Databricks
SKILLS
Important Hard Skills
- Database systems (SQL and NoSQL)
- A data engineer must know how to work with database management systems (DBMS), the software applications that provide an interface to databases for information storage and retrieval
- Data warehousing solutions
- ETL tools
- ETL (Extract, Transform, Load)
- Data APIs
- A data API allows two applications or machines to communicate with each other to perform a specified task
- Python, Java, R and Scala programming languages
- Understanding the basics of distributed systems
Important Soft Skills
- Communication skills
- Interface with machine learning engineers, data analysts, CTOs, and developers
- Collaboration
- Presentation skills