Industrial Training Big Data

prestige
Industrial Training

HADOOP COURSE CONTENT – (HADOOP-1.X, 2.X & 3.X)

(Development, Administration & REAL TIME Projects Implementation)

  • Introduction to Big Data and Hadoop
    • What is Big Data?
    • What is Hadoop?
    • Relationship between Big Data and Hadoop
    • Challenges with Big Data
      • Storage
      • Processing
    • Types of Big Data Projects
      • On-Premises Project
      • Cloud-Integrated Project
      • Differences between On-Premises and Cloud-Integrated Projects
    • Hadoop Installation (Two methods)

 

  • HDFS (Hadoop Distributed File System)
    • Significance of HDFS in Hadoop
    • Features of HDFS
  • HDFS Architecture
    • NameNode and its functionality
    • DataNode and its functionality
  • Replication in Hadoop – Failover Mechanism

 

  • Accessing HDFS
    • CLI (Command Line Interface) and HDFS Commands

 

  • File read operation

 

  • File write operation

 

  • Rack Awareness

 

  • MapReduce
    • Why is MapReduce essential in Hadoop?
    • Processing Daemons of Hadoop
      • NodeManager
      • ResourceManager
    • Keys and Values
    • MapReduce Flow
    • WordCount example
    • Map abstraction
    • Mapper
    • Reduce abstraction
    • Reducer
    • Map only job
    • Combiner
    • Data locality
    • Anatomy of MapReduce
    • Hadoop Data types
    • Input files
    • HDFS Blocks
    • Input format
    • Input split
    • InputSplits and Records
    • InputSplits and Blocks
    • Record reader
    • Partitioner
    • Shuffling
    • Sorting
    • OutputFormat
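
An illustrative sketch of the MapReduce flow for the WordCount example, written in plain Python rather than against the Hadoop API. It only imitates the stages listed above (mapper, combiner, partitioner, shuffle and sort, reducer) in a single process. The function names, the sample input, and the byte-sum partitioner are assumptions made for this sketch; Hadoop's own classes and default partitioner work differently in detail.

```python
from collections import defaultdict


def mapper(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Combiner: pre-aggregate one map task's output before the shuffle."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partitioner(key, num_reducers):
    """Partitioner: pick a reducer for a key.

    Assumption for this sketch: a simple deterministic byte sum,
    not the hash function Hadoop uses.
    """
    return sum(key.encode("utf-8")) % num_reducers


def reducer(word, counts):
    """Reduce: sum all partial counts that arrived for one key."""
    return word, sum(counts)


def run_wordcount(splits, num_reducers=2):
    """Run map, combine, shuffle/sort, and reduce over the input splits."""
    # Shuffle target: one key -> list-of-values mapping per reducer.
    partitions = [defaultdict(list) for _ in range(num_reducers)]

    for split in splits:  # each split stands in for one map task
        mapped = [pair for line in split for pair in mapper(line)]
        for word, count in combiner(mapped):
            partitions[partitioner(word, num_reducers)][word].append(count)

    result = {}
    for part in partitions:  # each partition stands in for one reduce task
        for word in sorted(part):  # keys reach the reducer in sorted order
            key, total = reducer(word, part[word])
            result[key] = total
    return result


# Hypothetical input: two splits, each a list of text records.
splits = [
    ["big data needs hadoop", "hadoop stores big data"],
    ["spark and hadoop process data"],
]
counts = run_wordcount(splits)
print(counts)
```

Removing the call to combiner (passing mapped straight to the shuffle loop) gives the same totals, which is the usual check that a combiner is safe for a given job.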

 

  • HIVE
    • Hive Introduction
    • Need for Apache Hive in Hadoop
    • Hive Query Language (HiveQL)
    • Configuring Hive with MySQL Metastore
    • SQL vs. HiveQL
    • Data Slicing Mechanisms
      • Partitioning vs. Bucketing
  • SQOOP
    • Introduction to Sqoop
    • MySQL Client and Server Installation
    • How to connect to a Relational Database using Sqoop
    • Performance implications of Sqoop Import and how to improve performance
    • Performance implications of Sqoop Export and how to improve performance
  • Projects:
    • File Join operations
    • HR Analysis
    • Web Log Analysis
  • SPARK
    • Introduction to Big Data
    • Introduction to Spark
    • Spark vs. MapReduce Processing
    • Real-Time Examples of Spark
  • Resilient Distributed Dataset
    • What is an RDD and why is it important in Spark?
    • Transformation in RDD
    • Actions in RDD
    • Loading Data through RDD
    • Saving Data
    • Key-Value pair RDD
    • Pair RDD operations