Required Skills for CCA159

Data Ingest

The skills to transfer data between external systems and your cluster. This includes the following:

Transform, Stage, Store

Convert a set of data values in a given format stored in HDFS into new data values and/or a new data format and write them into HDFS. This includes writing Spark applications in both Scala and Python.
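
As a rough illustration of this skill, the following PySpark sketch reads delimited text from HDFS, cleans each record, and writes the result back to HDFS as Parquet. The function names, paths, and column names are hypothetical and chosen only for the example; a real job would also need error handling and proper column types.

```python
def normalize_record(line, delimiter=","):
    """Split one delimited text line and trim whitespace around each field."""
    return tuple(field.strip() for field in line.split(delimiter))


def convert_text_to_parquet(input_path, output_path, column_names):
    """Read delimited text from HDFS and write it back to HDFS as Parquet.

    All columns are kept as strings in this sketch.
    """
    # Imported here so the pure-Python helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("text-to-parquet-sketch").getOrCreate()
    try:
        lines = spark.sparkContext.textFile(input_path)
        rows = lines.map(normalize_record)
        df = spark.createDataFrame(rows, schema=column_names)
        df.write.mode("overwrite").parquet(output_path)
    finally:
        spark.stop()


if __name__ == "__main__":
    # Hypothetical HDFS paths and column names, for illustration only.
    convert_text_to_parquet(
        "/user/example/customers_raw",
        "/user/example/customers_parquet",
        ["id", "name", "state"],
    )
```

Keeping the per-record logic in a plain function makes it possible to check the transformation locally before submitting the job to a cluster.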

Data Analysis

Use Data Definition Language (DDL) to create tables in the Hive metastore for use by Hive and Impala.
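
A minimal sketch of what such a DDL statement can look like, assembled here by a small Python helper. The table name, columns, and HDFS location are hypothetical, and the helper does no identifier escaping; it only shows the general shape of a Hive CREATE EXTERNAL TABLE statement over delimited text files.

```python
def build_create_table_ddl(table, columns, location, delimiter=","):
    """Return a Hive-style CREATE EXTERNAL TABLE statement as a string.

    columns is a list of (column_name, hive_type) pairs.
    """
    column_defs = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_defs}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table definition, for illustration only.
ddl = build_create_table_ddl(
    "customers",
    [("id", "INT"), ("name", "STRING")],
    "/user/example/customers",
)
print(ddl)
```

The generated statement could then be run from a Hive client; the table is registered in the metastore while the data files stay at the given location.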