A skilled Data Engineer is required to provide central support to the Core Data Engineering Team in developing and processing data deliveries from another department, as well as providing programming and data engineering support for the delivery of a number of elements of business survey redevelopment.
Applying data engineering and data modelling methods to the data is required for the main stage of the product build. This involves using key programming languages such as Python (through Spark), Scala (through Spark) and SQL (through Hive and Impala) in a big data context.
- Collaborating with key members of the Data Engineering team to develop automated coding solutions for a range of ETL, data cleaning, structuring and validation processes.
- Working with large semi-structured datasets to construct linked datasets derived from multiple underlying sources as well as supporting the wider team in delivering a range of data profiles across key strategic administrative data flows.
- Working with area leads across the broader Data Architecture Division to provide ad-hoc coding support on a range of projects underway in Data Architecture that use cross-government data.
- Assisting in a range of ETL and warehousing design projects to migrate data from a number of legacy environments.
Key skills required:
- Extensive proven experience of data engineering and architectural techniques, including data wrangling, data profiling, data preparation, metadata development, and data upload/download;
- Proven experience of 'big data' environments, in particular the Hadoop stack (Cloudera), covering data ingestion, processing and storage using HDFS, Spark, Hive and Impala;
- Extensive hands-on experience of developing ETL functionality in a cloud or on-premise environment;
- Experience of using tools such as Python and SQL (in Spark) to profile, query and structure large-volume data;
- Proven experience of using cloud services, particularly in the context of Hadoop;
- Experience of developing/utilising programming and query languages, e.g. SQL (Hive and Impala specifically), Python (through Spark) and Scala.
This will be an initial 3-month contract. You must be SC Cleared.
For more information, please apply!