Training is not a commodity – all training centres are not the same. Iverson Associates Sdn Bhd is the most established, the most reputable, and the top professional IT training provider in Malaysia. With a large pool of experienced and certified trainers, state-of-the-art facilities, and well-designed courseware, Iverson offers superior training, a more impactful learning experience and highly effective results.
At Iverson, our focus is on providing high-quality IT training to corporate customers, meeting their learning needs and helping them to achieve their training objectives. Iverson has the flexibility to provide training solutions whether for a single individual or the largest corporation in a well-paced or accelerated training programme.
Our courses continue to evolve along with the fast-changing technological advances. Our instructor-led training services are available on a public and a private (in-company) basis. Some of our courses are also available as online, on demand, and hybrid training.
This four-day instructor-led course begins by introducing Apache Kafka, explaining its key concepts and architecture, and discussing several common use cases. Building on this foundation, you will learn how to plan a Kafka deployment, and then gain hands-on experience by installing and configuring your own cloud-based, multi-node cluster running Kafka on the Cloudera Data Platform (CDP).
You will then use this cluster during more than 20 hands-on exercises that follow, covering a range of essential skills, starting with how to create Kafka topics, producers, and consumers, then continuing through progressively more challenging aspects of Kafka operations and development, such as those related to scalability, reliability, and performance problems. Throughout the course, you will learn and use Cloudera’s recommended tools for working with Kafka, including Cloudera Manager, Schema Registry, Streams Messaging Manager, and Cruise Control.
14-17 Mar 2022
13-16 Jun 2022
3-6 Sep 2022
This course is designed for system administrators, data engineers, and developers.
All students are expected to have basic Linux experience, and basic proficiency with the Java programming language is recommended. No prior experience with Apache Kafka is necessary.
During this course, you learn how to:
This three-day hands-on training course delivers the key concepts and expertise developers need to improve the performance of their Apache Spark applications. During the course, participants will learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring.Apache Spark Application Performance Tuning presents the architecture and concepts behind Apache Spark and underlying data platform, then builds on this foundational understanding by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations illustrate both performance issues and the techniques that address them, followed by hands-on exercises that give students an opportunity to practice what they've learned through an interactive notebook environment. The course applies to Spark 2.4, but also introduces the Spark 3.0 Adaptive Query Execution framework.
4-6 Mar 2022
1-3 Aug 2022
5-7 Dec 2022
This course is designed for software developers, engineers, and data scientists who have experience developing Spark applications and want to learn how to improve the performance of their code. This is not an introduction to Spark.
Spark examples and hands-on exercises are presented in Python and the ability to program in this language is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.
Students who successfully complete this course will be able to:
One of the most critical functions of a data-driven enterprise is the ability to manage ingest and data flow across complex ecosystems. Does your team have the tools and skill sets to succeed at this? Apache NiFi provides this capability and our three-day Cloudera Dataflow: Flow Management with Apache Nifi course delivers the foundational training you'll need to succeed with NiFi. In addition to learning NiFi's key features and concepts, participants will gain hands-on experience creating, executing, managing, and optimizing NiFi dataflows throughout a variety of scenarios.
21-23 Feb 2022
27-29 Jun 2022
26-28 Oct 2022
This course is designed for developers, data engineers, administrators, and others with an interest in learning NiFi's innovative no-code, graphical approach to data ingest.
Although programming experience is not required, basic experience with Linux is presumed, and previous exposure to big data concepts and applications is helpful.
During this course, you learn how to:
Cloudera's four-day administrator training course for CDP Private Cloud Base provides participants with a comprehensive understanding of all the steps necessary to operate and maintain on-premises clusters using Cloudera Manager. From installation and configuration through load balancing and tuning, this Cloudera training course is the best preparation for the real-world challenges faced by administrators who run CDP Private Cloud Base.
This course is best suited to systems administrators who have at least basic Linux experience. Prior knowledge of CDP, nor earlier platforms such as Cloudera’s CDH or Hortonworks HDP, is not required.
Upon completion of the course, attendees are encouraged to continue their studies and register for the CCA Administrator exam. Certification is a great differentiator. It helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.
14-17 Feb 2022
18-21 Apr 2022
20-23 Jun 2022
15-18 Aug 2022
17-20 Oct 2022
19-22 Dec 2022
Through instructor-led discussion and interactive, hands-on exercises, you will learn to
This four-day workshop covers data science and machine learning workflows at scale using Apache Spark 2 and other key components of the Hadoop ecosystem. The workshop emphasizes the use of data science and machine learning methods to address real-world business challenges.
Using scenarios and datasets from a fictional technology company, students discover insights to support critical business decisions and develop data products to transform the business. The material is presented through a sequence of brief lectures, interactive demonstrations, extensive hands-on exercises, and discussions. The Apache Spark demonstrations and exercises are conducted in Python (with PySpark) and R (with sparklyr) using the Cloudera Data Science Workbench (CDSW) environment.
3-6 Jan 2022
29 Feb -3 Mar 2022
30 May -2 Jun 2022
18-21 Jul 2022
19-22 Sep 2022
14-17 Nov 2022
The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful.
Workshop participants should have a basic understanding of Python or R and some experience exploring and analyzing data and developing statistical or machine learning models. Knowledge of Hadoop or Spark is not required.
Individuals who earn the CCA Administrator certification have demonstrated the core systems and cluster administrator skills sought by companies and organizations deploying Cloudera in the enterprise.
There are no prerequisites for taking any Cloudera certification exam; however, a background in system administration, or equivalent training is strongly recommended. The CCA Administrator exam (CCA131) follows the same objectives as Cloudera Administrator Training and the training course is an excellent part of preparation for the exam.
Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.
Perform basic and advanced configuration needed to effectively administer a Hadoop cluster
Maintain and modify the cluster to support day-to-day operations in the enterprise
Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices
Benchmark the cluster operational metrics, test system configuration for operation and efficiency
Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios
A CCA Spark and Hadoop Developer has proven their core skills to ingest, transform, and process data using Apache Spark and core Cloudera Enterprise tools.
There are no prerequisites required to take any Cloudera certification exam. The CCA Spark and Hadoop Developer exam (CCA175) follows the same objectives as Cloudera Developer Training for Spark and Hadoop and the training course is an excellent preparation for the exam.
The skills to transfer data between external systems and your cluster. This includes the following:
Import data from a MySQL database into HDFS using Sqoop
Export data to a MySQL database from HDFS using Sqoop
Change the delimiter and file format of data during import using Sqoop
Ingest real-time and near-real-time streaming data into HDFS
Process streaming data as it is loaded onto the cluster
Convert a set of data values in a given format stored in HDFS into new data values or a new data format and write them into HDFS.
Load RDD data from HDFS for use in Spark applications
Write the results from an RDD back into HDFS using Spark
Read and write files in a variety of file formats
Use Spark SQL to interact with the metastore programmatically in your applications. Generate reports by using queries against loaded data.
Use metastore tables as an input source or an output sink for Spark applications
Understand the fundamentals of querying datasets in Spark
Filter data using Spark
Write queries that calculate aggregate statistics
Join disparate datasets using Spark
This is a practical exam and the candidate should be familiar with all aspects of generating a result, not just writing code.
A CCA Data Analyst has proven their core analyst skills to load, transform, and model Hadoop data in order to define relationships and extract meaningful results from the raw input.
You are given eight to twelve customer problems with a unique large data set, a CDH cluster, and 120 minutes. For each problem, you must implement a technical solution with a high degree of precision that meets all the requirements. You may use any tool or combination of tools on the cluster (see list below) -- you get to pick the tool(s) that are right for the job. You must possess enough knowledge to analyze the problem and arrive at an optimal approach given the time allowed. You need to know what you should do and then do it on a live cluster, including a time limit and while being watched by a proctor.
Number of Questions: 8–12 performance-based (hands-on) tasks on CDH 5 cluster. See below for full cluster configuration
Time Limit: 120 minutes
Passing Score: 70%
Candidates for CCA Data Analyst can be SQL devlopers, data analysts, business intelligence specialists, developers, system architects, and database administrators. There are no prerequisites.
Use Extract, Transfer, Load (ETL) processes to prepare data for queries.
Import data from a MySQL database into HDFS using Sqoop
Export data to a MySQL database from HDFS using Sqoop
Move data between tables in the metastore
Transform values, columns, or file formats of incoming data before analysis
Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.
Create tables using a variety of data types, delimiters, and file formats
Create new tables using existing tables to define the schema
Improve query performance by creating partitioned tables in the metastore
Alter tables to modify existing schema
Create views in order to simplify queries
Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.
Prepare reports using SELECT commands including unions and subqueries
Calculate aggregate statistics, such as sums and averages, during a query
Create queries against multiple data sources by using join commands
Transform the output format of queries by using built-in functions
Cloudera University’s four-day Data Analyst Training course will teach you to apply traditional data analytics and business intelligence skills to big data. This course presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.
Advance Your Ecosystem Expertise
Apache Hive makes transformation and analysis of complex, multi-structured data scalable in Cloudera environments. Apache Impala enables real-time interactive analysis of the data stored in Hadoop using a native SQL environment. Together, they make multi-structured data accessible to analysts, database administrators, and others without Java programming expertise.
24-27 Jan 2022
21-24 Mar 2022
23-26 May 2022
25-28 Jul 2022
26-29 Sep 2022
21-24 Nov 2022
This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Some knowledge of SQL is assumed, as is basic Linux command-line familiarity. Prior knowledge of Apache Hadoop is not required.
Upon completion of the course, attendees are encouraged to continue their study and register for the CCA Data Analyst exam. Certification is a great differentiator. It helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.
This four-day hands-on training course teaches the key concepts and knowledge developers need to use Apache Spark in developing high-performance, parallel applications on the Cloudera Data Platform (CDP).
Hands-on exercises allow students to practice writing Spark applications that integrate with CDP core components, such as Hive and Kafka. Participants will learn how to use Spark SQL to query structured data, use Spark Streaming to perform real-time processing on streaming data, and work with “big data” stored in a distributed file system.
After taking this course, participants will be prepared to face real-world challenges and build applications that make fast and relevant decisions, implementing interactive analysis applied to a wide variety of use cases, architectures, and industries.
10-13 Jan 2022
7-10 Mar 2022
9-12 May 2022
4-7 Jul 2022
5-8 Sep 2022
7-10 Nov 2022
This course is designed for developers and data engineers.
All students are expected to have basic Linux experience and basic proficiency with either Python or Scala programming languages. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required
Through instructor-led discussion and interactive, hands-on exercises, you will learn how to:
PMP, Project Management Professional (PMP), CAPM, Certified Associate in Project Management (CAPM) are registered marks of the Project Management Institute, Inc.