Data Engineer

Requisition Details & Talent Acquisition Consultant

REQ 134343 Thembi Mtshali

Job Family

Risk, Audit and Compliance

Career Stream

Auditing

Leadership Pipeline

Manage Self: Professional

Job Purpose

The purpose of the Data Engineer is to leverage data expertise and data-related technologies, in line with the Nedbank Data Architecture Roadmap, to advance technical thought leadership for the Enterprise, deliver fit-for-purpose data products, and support data initiatives. In addition, Data Engineers enhance the data infrastructure of the bank to enable advanced analytics, machine learning and artificial intelligence by providing clean, usable data to stakeholders. They also create data pipelines (ingestion, provisioning, streaming, self-service and API) and big data solutions that support the Bank's strategy to become a data-driven organisation.

Job Responsibilities

Responsible for the maintenance, improvement, cleaning, and manipulation of data in the bank's operational and analytics databases.
Data Infrastructure: Build and manage scalable, optimised, supported, tested, secure, and reliable data infrastructure, e.g. using Infrastructure and Databases (DB2, PostgreSQL, MSSQL, HBase, NoSQL, etc.), Data Lake Storage (Azure Data Lake Gen 2), Cloud-based solutions (SAS, Azure Databricks, Azure Data Factory, HDInsight) and Data Platforms (SAS, Ab Initio, Denodo, Netezza, Azure Cloud). Ensure data security and privacy in collaboration with Information Security, CISO and Data Governance.
Data Pipeline Build (Ingestion, Provisioning, Streaming and API): Build and maintain data pipelines to:
integrate data (Data Ingestion, Data Provisioning and Data Streaming) utilising both on-premise and cloud data engineering tool sets
efficiently extract data (Data Acquisition) from Golden Sources, Trusted sources and Writebacks with data integration from multiple sources, formats and structures
load the Nedbank Data Warehouse (Data Reservoir, Atomic Data Warehouse, Enterprise Data Mart)
provide data to the respective Lines of Business Marts, Regulatory Marts and Compliance Marts through self-service data virtualisation
provide data to applications or Nedbank Data consumers
transform data to a common data model for reporting and data analysis, and to provide data in a consistent, usable format to Nedbank data stakeholders
handle big data technologies (Hadoop), streaming (Kafka) and data replication (IBM InfoSphere Data Replication)
drive utilisation of data integration tools (Ab Initio) and Cloud data integration tools (Azure Data Factory and Azure Databricks)
Data Modelling and Schema Build: In collaboration with Data Modellers, create data models and database schemas on the Data Reservoir, Data Lake, Atomic Data Warehouse and Enterprise Data Marts.
Nedbank Data Warehouse Automation: Automate, monitor and improve the performance of data pipelines.
Collaboration: Collaborate with Data Analysts, Software Engineers, Data Modellers, Data Scientists, Scrum Masters and Data Warehouse teams as part of a squad to contribute to the detailed data architecture designs, take end-to-end ownership of Epics, and ensure that data solutions deliver business value.
Data Quality and Data Governance: Ensure that reasonable data quality checks are implemented in the data pipelines to maintain a high level of data accuracy, consistency and security.
Performance and Optimisation: Ensure the performance of the Nedbank data warehouse, integration patterns, batch and real-time jobs, streaming and APIs.
API Development: Build APIs that enable the Data Driven Organisation, ensuring that the data warehouse is optimised for APIs by collaborating with Software Engineers.

Essential Qualifications - NQF Level

Advanced Diplomas/National 1st Degrees

Preferred Qualification

Field of Study: BCom, BSc, BEng

Preferred Certifications

Cloud (Azure, AWS), DevOps or Data Engineering certification. Any Data Science certification will be an added advantage (for example Coursera, Udemy, SAS Data Scientist or Microsoft Data Scientist).

Minimum Experience Level

Total number of years of experience: 3 - 6 years
Experienced at working independently within a squad, with the demonstrated knowledge and skills to deliver data outcomes without supervision.
Experience designing, building, and maintaining data warehouses and data lakes.
Experience with big data technologies such as Hadoop, Spark, and Hive.
Experience with programming languages such as Python, Java, and SQL.
Experience with relational databases and NoSQL databases.
Experience with cloud computing platforms such as AWS, Azure, and GCP.
Experience with data visualisation tools.
Results-driven, analytical and creative thinker, with a demonstrated ability for innovative problem solving.

Technical / Professional Knowledge

Cloud Data Engineering (Azure, AWS, Google)
Data Warehousing
Databases (PostgreSQL, MS SQL, IBM DB2, HBase, MongoDB)
Programming (Python, Java, SQL)
Data Analysis and Data Modelling
Data Pipelines and ETL tools (Ab Initio, ADB, ADF, SAS ETL)
Agile Delivery
Problem solving skills

Behavioural Competencies