The training has been developed in partnership with Talend and is designed to help you master the Big Data Integration Platform using Talend Open Studio to easily connect and analyze data. You'll use the Talend ETL tool with HDFS, Pig and Hive on real-life case studies.
Role of Open Source ETL Technologies in Big Data :
Learning Objectives - In this module, you will get an overview of the products Talend has offered to date and their relevance to Data Integration and Big Data. You will also cover basic ETL and DWH (data warehouse) concepts, how Talend fits in, and how open source technologies are taking Big Data to the next level. Zero to Pro in minutes is what Talend has to offer in the Big Data arena.
Topics - About Talend and its journey, Overviews of TOS (Talend Open Studio) for Data Integration, TOS for Data Quality, TOS for Master Data Management, TOS for Big Data, ETL concepts, Data warehousing concepts, Quiz session.
Talend: A Revolution in Big Data :
Learning Objectives - In this module, you will get familiar with the TOS for DI tool and its GUI: what is where and what is what. You will also learn how to set up Talend (installation), the most frequently encountered errors and how to fix them, the Talend architecture, and why Hadoop is not a threat to ETL but goes hand in hand with it.
Topics - Why Talend, Features, Advantages, Talend Installation/System Requirements, GUI layout (designer), Understanding its Basic Features, Comparison with other market-leading tools in the ETL domain, Important areas in Talend Architecture: Project, Workspace, Job, Metadata, Propagation, Linking components, Hands On: Creating a simple job and discussion about it, Quiz session.
Talend: Read & Write Various Types of Source/Target Systems :
Learning Objectives - In this module, you will get acquainted with the various types of source and target systems supported by Talend. It includes a demo of the popular CSV/delimited and fixed-width file formats and how to read and write them, how to connect to a database and read, write and update data, and how to read complex source systems like Excel and XML.
Topics - Data Source Connection, File as Source, Create metadata, Database as source, Create metadata, Using MySQL database (create tables, insert, update data from Talend), Read and write into Excel files, into multiple tabs, View data, How to capture logs and navigate around basic errors, Role of tLogRow and how it makes a developer's life easy, Quiz session, Hands on assignments.
Talend: How to Transform your Business: Basic :
Learning Objectives - In this module, you will understand the basic to advanced transformation components offered under TOS for DI. You will also learn: 1- How homogeneous/heterogeneous data sources talk with each other and 2- How to transform data patterns depending on business requirements.
Topics - Using Advanced components like: tMap, tJoin, tFilter, tSortRow, tAggregateRow, tReplicate, tSplit, Lookup, tRowGenerator, Quiz session, Scenarios and assignments: How to join 2 sources and get the matching rows from the second source, rows to columns and columns to rows transformation, Remove Duplicates, Filter based on Business requirement.
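In Talend these scenarios are built by linking components in the graphical designer, not by writing code by hand. To illustrate only the underlying logic of the join, remove-duplicates and filter scenarios, here is a minimal Python sketch. The sample records, field names and helper functions are hypothetical and are not part of Talend or the course material.

```python
# Hypothetical sample data; the field names are illustrative only.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},  # duplicate row
    {"order_id": 3, "customer_id": 99, "amount": 75.0},  # no matching customer
]
customers = [
    {"customer_id": 10, "name": "Asha"},
    {"customer_id": 11, "name": "Ben"},
]


def remove_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def inner_join(main_rows, lookup_rows, key):
    """Keep main rows that have a match in the lookup and merge in its fields."""
    lookup = {row[key]: row for row in lookup_rows}
    return [{**row, **lookup[row[key]]} for row in main_rows if row[key] in lookup]


def filter_rows(rows, min_amount):
    """Keep rows whose amount meets a business threshold."""
    return [row for row in rows if row["amount"] >= min_amount]


unique_orders = remove_duplicates(orders, "order_id")
joined = inner_join(unique_orders, customers, "customer_id")
large_orders = filter_rows(joined, 100.0)
```

Here `remove_duplicates` keeps the first row per key, `inner_join` uses the second source as a lookup, and `filter_rows` applies a single business rule.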
Talend: How to Transform your Business: Advanced 1 :
Learning Objectives - In this module, you will learn to set dependencies between Jobs, set up parameters in a Job, use Functions, deploy jobs from development to production environments in real time, and share across platforms with Talend (how to import and export information).
Topics - Trigger (types) and Row Types, Context Variables (parameterization), Functions (basic to advanced functions to transform business rules such as string, date, mathematical etc.), Accessing job level / component level information within the job, Quiz session, Scenarios and assignments: How to search and replace errors in source data (Data Quality and cleansing), Job Trigger or Action (a possible scenario is kicking off a job as soon as a file arrives).
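As a rough illustration of two ideas from this module, the Python sketch below shows parameterization through per-environment settings (the role context variables play in a job) and search-and-replace cleansing of source values. It is not Talend code. The environment names, paths, host names and cleansing rules are all assumptions made up for the example.

```python
import re

# Hypothetical per-environment settings, standing in for context variables.
CONTEXTS = {
    "dev": {"input_dir": "/tmp/dev/in", "db_host": "localhost"},
    "prod": {"input_dir": "/data/prod/in", "db_host": "db.example.internal"},
}


def get_context(env):
    """Return the settings for one environment so job logic stays unchanged."""
    return CONTEXTS[env]


def clean_phone(value):
    """Remove every non-digit character from a phone number."""
    return re.sub(r"\D", "", value)


def clean_name(value):
    """Collapse repeated whitespace, trim the ends, and normalize the case."""
    return re.sub(r"\s+", " ", value).strip().title()
```

Switching from `get_context("dev")` to `get_context("prod")` changes only the settings. The cleansing functions stay the same in both environments.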
Talend: How to Transform your Business: Advanced 2 :
Learning Objectives - In this module, you will understand transformation and its various steps: How to program looping in Talend, How to search files in a directory and process them one by one, Centralized error handling and debugging mechanism in Talend.
Topics - Type Casting (convert datatypes among source-target platforms), Looping components (like tLoop, tFor), tFileList, tRunJob, How to schedule and run Talend DI jobs externally (not in GUI), Quiz session, Scenarios and assignments: How to redirect errors in a job to central error logging which can be analysed later, How to create output files dynamically based on a field value in the source, How to read files in a directory (in loop) and process them one by one.
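The three scenarios above (looping over files in a directory, creating output files based on a field value, and redirecting errors to a central log) can be sketched in plain Python to show the control flow. This is not how Talend implements them. The function name, the CSV layout and the log format are assumptions for this example only.

```python
import csv
from pathlib import Path


def process_directory(input_dir, output_dir, error_log, split_field):
    """Read each CSV in input_dir and append its rows to one output file
    per distinct value of split_field. Failures are written to error_log
    and the loop moves on to the next file. Returns the number of files
    processed without error."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    processed = 0
    for path in sorted(input_dir.glob("*.csv")):
        try:
            with path.open(newline="") as src:
                for row in csv.DictReader(src):
                    # The output file name is derived from the field value.
                    target = output_dir / f"{row[split_field]}.csv"
                    is_new = not target.exists()
                    with target.open("a", newline="") as out:
                        writer = csv.DictWriter(out, fieldnames=list(row))
                        if is_new:
                            writer.writeheader()
                        writer.writerow(row)
            processed += 1
        except (KeyError, csv.Error, OSError) as exc:
            # Central error log: one line per failed file, for later analysis.
            with open(error_log, "a") as log:
                log.write(f"{path.name}: {exc!r}\n")
    return processed
```

A file that fails partway through is logged, but rows already written from it are not rolled back. A production job would need to decide how to handle such partial output.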
Big Data Concepts: Required for Talend for Big Data :
Learning Objectives - In this module, you will understand the Hadoop knowledge required to be comfortable while learning Talend for Big Data: Basics of Hadoop, HDFS (Hadoop Distributed File System) architecture Overview, MapReduce Concept Overview, Industry standards.
Topics - How modules 1 to 6 help in understanding and performing hands-on work in Talend for Big Data, and how Talend makes Big Data easier to learn and use, Quiz session.
Introduction to Talend for Big Data :
Learning Objectives - In this module, you will learn: What TOS for BD (Talend Open Studio for Big Data) is, How to set up a Big Data environment on your machine, Big Data connectors in TOS for BD (Talend offers some 800+ connectors for the Big Data environment), How to access HDFS from Talend.
Topics - Big Data setup using Hortonworks Sandbox on your personal computer, Explaining the TOS for Big Data Environment, Quiz session, Scenarios and assignments: Basic HDFS commands and Exploring in Sandbox, How to check connectivity to HDFS from Talend, How to read from HDFS in a Talend job, How to write into HDFS from a Talend job.
Hive in Talend for Big Data :
Learning Objectives - In this module, you will learn: What Hive is and its concepts, How to set up a Hive environment in Talend, Hive Big Data connectors in TOS for BD and Use Cases using Hive in Talend.
Topics - How to create and access Hive tables in Talend, Process and Transform data from Hive, Access data from Hive, transform and interact with MySQL tables, Quiz session, Scenarios and assignments: Hive connectors, Use cases using Hive in Talend.
Pig in Talend for Big Data and Project :
Learning Objectives - In this module, you will learn: What Pig is and its concepts, How to set up a Pig environment in Talend, Pig Big Data connectors in TOS for BD, Use cases using Pig in Talend, Project Implementation, Conclusion.
Topics - Quiz session, Scenarios and assignments: Using Pig connectors, Setup, Use case using Pig scripting via Talend. Business requirements: Source/Target/Mapping will be provided and explained, Quiz session and Discussion.
About the Author
Edureka courses are specially curated by experts who monitor the IT industry with a hawk’s eye, and respond to expectations, changes and requirements from the industry, and incorporate them into our courses.
Posted: 06 September, 2017
This is a certification course.
By completing this course, you are eligible for certification opportunities. This course provides the instruction and educational material needed to prepare for a third-party certification exam.
This is a course package.
Course packages provide a comprehensive learning plan at a discounted price, and may lead to certification opportunities.