Training Course

Hadoop: Fundamentals



Course Syllabus (41 Lessons)

Hadoop: Fundamentals - Chapter 01 - Hadoop Architecture
Topic A: Prerequisites - Part 1
3 lessons, 20m 30s
Topic B: Introduction - Part 1
3 lessons, 19m 16s
Topic C: History - Part 1
3 lessons, 8m 27s
Topic D: Architecture - Part 1
3 lessons, 7m 03s
Topic E: Ecosystems - Part 1
3 lessons, 16m 14s
Hadoop: Fundamentals - Chapter 02 - ETL and MapReduce

Course Description

Before you begin, we recommend downloading and exploring some software and utilities; the prerequisites chapter covers all of these tools. For example, the Hadoop sandbox provides a working cluster, and utilities such as PuTTY let us interact with that cluster to run jobs, perform file system operations, and demonstrate Hadoop's capabilities. Hadoop itself runs on Linux.

Once you have all the tools you need to get started, you will learn about the history of Hadoop; how it began as an attempt to create a better open source search engine and how it grew into the powerful data and processing engine it is today.

We’ll explore how Hadoop might fit within a large-scale enterprise, evaluating strengths and weakness of its implementation. We’ll also take a tour of the Hadoop Sandbox using the Ambari graphical user interface.

A core component of Hadoop is the Hadoop Distributed File System (HDFS). We'll talk about how it differs from an ordinary file system and how it supports Hadoop's distributed architecture. We'll take a look at the various nodes of HDFS and their respective roles, and we'll end with a tour of HDFS from the Linux command line.
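As a taste of that tour, everyday HDFS operations mirror familiar Unix commands through the `hdfs dfs` client. A rough sketch, assuming a running sandbox cluster (the user name, paths, and file names below are illustrative, not from the course):

```shell
# List the contents of an HDFS home directory (example path)
hdfs dfs -ls /user/maria_dev

# Create a directory and copy a local file into HDFS
hdfs dfs -mkdir -p /user/maria_dev/data
hdfs dfs -put sales.csv /user/maria_dev/data/

# Print the file back out and check how much space it occupies
hdfs dfs -cat /user/maria_dev/data/sales.csv
hdfs dfs -du -h /user/maria_dev/data
```

Note that these commands operate on the distributed file system, not the local Linux one: the copied file is split into blocks and replicated across the cluster's data nodes behind the scenes.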

We’ll then learn about ETL and MapReduce. ETL is what connects Hadoop to the outside world. Sqoop is an ETL tool in the Hadoop ecosystem for exchanging data between Hadoop and an external database server. We’ll go over how to use Sqoop to pull data from a PostgreSQL database. We’ll demonstrate how to build and run a basic application in Java, and follow it up with a look at one of Hadoop's most important components: MapReduce.
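A typical Sqoop import of the kind described above looks roughly like this. The host, database, table, credentials, and target path are all placeholders, and this sketch assumes the PostgreSQL JDBC driver is available on Sqoop's classpath:

```shell
# Pull a hypothetical "customers" table from PostgreSQL into HDFS
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/salesdb \
  --username sqoop_user -P \
  --table customers \
  --target-dir /user/maria_dev/customers \
  --num-mappers 1   # a single mapper avoids needing a split-by column
```

Under the hood, Sqoop generates a MapReduce job whose mappers each read a slice of the table over JDBC and write the rows out as files in the target HDFS directory.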

Course Details

3h 20m 22s


Kevin McCarty
I’m a computer professional with over 30 years of industry experience as a programmer, project manager, database administrator, architect, and data scientist. I’m a Microsoft Certified Trainer with more than 25 individual certifications in programming and database technologies, and I serve as chapter leader of the Boise SQL Server Users Group. A former army officer and Eagle Scout, I hold a doctorate in computer science and have a lifelong love of learning.
Kevin McCarty, Instructor and Curriculum Developer

