Training is not a commodity: not all training centres are the same. Iverson Associates Sdn Bhd is the most established, the most reputable, and the top professional IT training provider in Malaysia. With a large pool of experienced and certified trainers, state-of-the-art facilities, and well-designed courseware, Iverson offers superior training, a more impactful learning experience and highly effective results.
At Iverson, our focus is on providing high-quality IT training to corporate customers, meeting their learning needs and helping them to achieve their training objectives. Iverson has the flexibility to provide training solutions for anyone from a single individual to the largest corporation, in either a well-paced or an accelerated training programme.
Our courses continue to evolve with fast-changing technology. Our instructor-led training services are available on a public and a private (in-company) basis. Some of our courses are also available as online, on-demand, and hybrid training.
A CCA Data Analyst has proven their core analyst skills to load, transform, and model Hadoop data in order to define relationships and extract meaningful results from the raw input.
You are given eight to twelve customer problems, each with a unique large data set, a CDH cluster, and 120 minutes. For each problem, you must implement a technical solution with a high degree of precision that meets all the requirements. You may use any tool or combination of tools on the cluster (see list below); you get to pick the tool(s) that are right for the job. You must possess enough knowledge to analyze the problem and arrive at an optimal approach given the time allowed. You need to know what you should do and then do it on a live cluster, under a time limit and while being watched by a proctor.
Number of Questions: 8–12 performance-based (hands-on) tasks on a CDH 5 cluster. See below for the full cluster configuration.
Time Limit: 120 minutes
Passing Score: 70%
Candidates for CCA Data Analyst can be SQL developers, data analysts, business intelligence specialists, developers, system architects, and database administrators. There are no prerequisites.
Use Extract, Transform, Load (ETL) processes to prepare data for queries.
Import data from a MySQL database into HDFS using Sqoop
Export data to a MySQL database from HDFS using Sqoop
Move data between tables in the metastore
Transform values, columns, or file formats of incoming data before analysis
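As an illustration of the Sqoop import and export skills listed above, the following is a minimal sketch in Python that only assembles the command lines; it does not run them. The flags used (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--export-dir`, `--fields-terminated-by`, `--input-fields-terminated-by`, `--num-mappers`) are standard Sqoop options, but the host, database, table names, and HDFS paths are hypothetical placeholders, not values from the exam.

```python
import shlex


def sqoop_import_args(jdbc_url, username, table, target_dir, mappers=1):
    """Build the argument list for importing one MySQL table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",  # prompt for the password instead of embedding it
        "--table", table,
        "--target-dir", target_dir,
        "--fields-terminated-by", "\\t",  # Sqoop reads \t as a tab
        "--num-mappers", str(mappers),
    ]


def sqoop_export_args(jdbc_url, username, table, export_dir):
    """Build the argument list for exporting an HDFS directory into MySQL."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "-P",
        "--table", table,
        "--export-dir", export_dir,
        "--input-fields-terminated-by", "\\t",
    ]


# Hypothetical connection details, for illustration only.
import_cmd = sqoop_import_args(
    "jdbc:mysql://dbhost/shop", "analyst", "orders",
    "/user/analyst/orders", mappers=4,
)
export_cmd = sqoop_export_args(
    "jdbc:mysql://dbhost/shop", "analyst", "order_totals",
    "/user/analyst/order_totals",
)

# Print shell-quoted commands that could be pasted into a terminal.
print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```

Building the commands as lists rather than one long string keeps each flag paired with its value and leaves quoting to `shlex.join`.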
Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.
Create tables using a variety of data types, delimiters, and file formats
Create new tables using existing tables to define the schema
Improve query performance by creating partitioned tables in the metastore
Alter tables to modify existing schema
Create views in order to simplify queries
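The DDL skills above can be sketched as a handful of statements. The Python below only renders the statement text, which would then be run in Hive or Impala; the helper `create_table_ddl` and all table, column, and view names are hypothetical and chosen for illustration.

```python
def create_table_ddl(name, columns, partitions=(), delimiter=",",
                     file_format="TEXTFILE"):
    """Render a CREATE TABLE statement for a delimited, optionally partitioned table."""
    cols = ", ".join(f"{col} {dtype}" for col, dtype in columns)
    ddl = f"CREATE TABLE {name} ({cols})"
    if partitions:
        parts = ", ".join(f"{col} {dtype}" for col, dtype in partitions)
        ddl += f" PARTITIONED BY ({parts})"
    ddl += f" ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'"
    ddl += f" STORED AS {file_format}"
    return ddl


# A partitioned table with explicit data types, delimiter, and file format.
orders_ddl = create_table_ddl(
    "orders",
    [("order_id", "INT"), ("total", "DOUBLE")],
    partitions=[("order_year", "INT")],
)

# One statement per remaining skill in the list (names are hypothetical).
other_ddl = {
    "schema_from_existing": "CREATE TABLE orders_copy LIKE orders",
    "alter_schema": "ALTER TABLE orders ADD COLUMNS (coupon STRING)",
    "simplifying_view": (
        "CREATE VIEW big_orders AS "
        "SELECT order_id, total FROM orders WHERE total > 100"
    ),
}

print(orders_ddl)
for statement in other_ddl.values():
    print(statement)
```

Partitioning by a column such as `order_year` lets queries that filter on that column read only the matching partitions, which is where the performance gain comes from.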
Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.
Prepare reports using SELECT commands including unions and subqueries
Calculate aggregate statistics, such as sums and averages, during a query
Create queries against multiple data sources by using join commands
Transform the output format of queries by using built-in functions
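The query skills above (joins, aggregates, built-in functions, subqueries, and unions) can be practised on a small scale before touching a cluster. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made for portability: SQLite is not Hive or Impala, and function names and type handling differ between engines, although the basic SELECT constructs shown here carry over. The tables and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, name TEXT);
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'ana'), (2, 'ben');
INSERT INTO orders VALUES (10, 1, 40.0), (11, 1, 60.0), (12, 2, 30.0);
""")

# Join two sources, aggregate per customer, and reshape the output
# with a built-in function (UPPER).
report = conn.execute("""
SELECT UPPER(c.name) AS customer,
       SUM(o.total)  AS total_spent,
       AVG(o.total)  AS avg_order
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

# Combine a subquery (orders above the overall average) with a second
# result set (small orders) using UNION ALL.
flagged = conn.execute("""
SELECT order_id FROM orders
WHERE total > (SELECT AVG(total) FROM orders)
UNION ALL
SELECT order_id FROM orders
WHERE total < 35
ORDER BY order_id
""").fetchall()

print(report)
print(flagged)
conn.close()
```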