Email: info@araniconsulting.com

Basic

Unit 1: Introduction and Overview of Hadoop

  1. What is Hadoop?
  2. History of Hadoop
  3. Building Blocks – Hadoop Eco-System.
  4. Who is behind Hadoop?
  5. What is Hadoop good for, and what is it not good for?

Unit 2: Hadoop Distributed FileSystem (HDFS)

  1. HDFS Overview and Architecture
  2. HDFS Installation
  3. HDFS Use Cases
  4. Hadoop File System Shell
  5. File System Java API
  6. Hadoop Configuration

Unit 3: HBase – The Hadoop Database

  1. HBase Overview and Architecture
  2. HBase Installation
  3. HBase Shell
  4. Java Client API
  5. Java Administrative API
  6. Filters
  7. Scan Caching and Batching
  8. Key Design
  9. Table Design

Unit 4: Map/Reduce 2.0/YARN

  1. Decomposing Problems into MapReduce Workflow
  2. Using JobControl
  3. Oozie Introduction and Architecture
  4. Oozie Installation
  5. Developing, Deploying, and Executing Oozie Workflows

Unit 5: Pig

  1. Pig Overview
  2. Installation
  3. Pig Latin
  4. Developing Pig Scripts
  5. Processing Big Data with Pig
  6. Joining Data Sets with Pig

Unit 6: Hive

  1. Hive Overview
  2. Installation
  3. Hive QL

Unit 7: Sqoop

  1. Introduction
  2. Sqoop Tools
  3. Sqoop Import
  4. Sqoop Import All Tables
  5. Sqoop Export
  6. Sqoop Job
  7. Sqoop Metastore
  8. Sqoop Eval
  9. Sqoop Codegen
  10. Sqoop List Databases and List Tables
  11. Sqoop Create Hive Table

Advanced

Unit 1: Integrating Hadoop Into The Workflow

  1. Relational Database Management Systems
  2. Storage Systems
  3. Importing Data from RDBMSs With Sqoop
  4. Hands-On Exercise
  5. Importing Real-Time Data with Flume
  6. Accessing HDFS Using FuseDFS and Hoop

Unit 2: Delving Deeper Into The Hadoop API

  1. More about ToolRunner
  2. Testing with MRUnit
  3. Reducing Intermediate Data With Combiners
  4. The configure and close methods for Map/Reduce Setup and Teardown
  5. Writing Partitioners for Better Load Balancing
  6. Hands-On Exercise
  7. Directly Accessing HDFS
  8. Using the Distributed Cache

Unit 3: Common Map Reduce Algorithms

  1. Sorting and Searching
  2. Indexing
  3. Machine Learning With Mahout
  4. Term Frequency – Inverse Document Frequency
  5. Word Co-Occurrence

Unit 4: Using Hive and Pig

  1. Hive Basics
  2. Pig Basics

Unit 5: Practical Development Tips and Techniques

  1. Debugging MapReduce Code
  2. Using LocalJobRunner Mode For Easier Debugging
  3. Retrieving Job Information with Counters
  4. Logging
  5. Splittable File Formats
  6. Determining the Optimal Number of Reducers
  7. Map-Only MapReduce Jobs

Unit 6: More Advanced Map Reduce Programming

  1. Custom Writables and WritableComparables
  2. Saving Binary Data using SequenceFiles and Avro Files
  3. Creating InputFormats and OutputFormats

Unit 7: Joining Data Sets in Map Reduce

  1. Map-Side Joins
  2. The Secondary Sort
  3. Reduce-Side Joins

Unit 8: Graph Manipulation in Hadoop

  1. Introduction to Graph Techniques
  2. Representing Graphs in Hadoop
  3. Implementing a Sample Algorithm: Single Source Shortest Path

Unit 9: Creating Workflows With Oozie

  1. The Motivation for Oozie
  2. Oozie’s Workflow Definition Format