Corporate Training: Big Data

CORPORATE TRAINING ON BIG DATA

Indore’s best institute for corporate training on Big Data

Learn Big Data at Prestige Point, Indore’s best institute for corporate training on Big Data. Our industry-ready curriculum prepares you for Big Data work in the corporate IT industry.

 

Duration: 4 Months

HADOOP COURSE CONTENT – (HADOOP-1.X, 2.X & 3.X)

                (Development, Administration & REAL TIME Project Implementation)

  • Introduction to BIGDATA and HADOOP

    • What is Big Data?
    • What is Hadoop?
    • The relation between Big Data and Hadoop
    • Why do we need Hadoop?
    • Scenarios to adopt Hadoop technology in REAL TIME projects
    • Challenges with Big Data
      • Storage
      • Processing
    • How Hadoop addresses Big Data challenges

  • Comparison with Other Technologies

    • RDBMS

  • Different Components of the Hadoop Ecosystem

    • Storage Components
    • Processing Components
  • Types of Big Data Projects

    • On-premises projects
    • Cloud-integrated projects
    • Differences between on-premises & cloud-integrated projects
  • Hadoop Installation (Two methods)
  • HDFS (Hadoop Distributed File System)

    • Significance of HDFS in Hadoop
    • Features of HDFS
    • Storage aspects of HDFS
      • Block – the basic storage unit in Hadoop
      • How to configure the block size
      • Default Vs configurable block size
      • Why is the HDFS block size so large?
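To make the block-size topic concrete, here is a small Python sketch (illustrative only, not part of Hadoop) showing how a file is divided into HDFS blocks, assuming the Hadoop 2.x/3.x default block size of 128 MB:

```python
import math

# Hadoop 2.x/3.x default block size (Hadoop 1.x used 64 MB).
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024

def hdfs_blocks(file_size_bytes, block_size=DEFAULT_BLOCK_SIZE):
    """Return (number of blocks, size of the last block) for a file."""
    num_blocks = math.ceil(file_size_bytes / block_size)
    last_block = file_size_bytes - (num_blocks - 1) * block_size
    return num_blocks, last_block

# A 300 MB file occupies 3 blocks: 128 MB + 128 MB + 44 MB.
print(hdfs_blocks(300 * 1024 * 1024))
```

Unlike a fixed-size disk block, a partial last HDFS block occupies only its actual size on disk, which is one reason the large default block size does not waste storage.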
  • HDFS Architecture

    • NameNode and its functionality
    • DataNode and its functionality
  • Replication in Hadoop – Fail Over Mechanism

  • Accessing HDFS

    • CLI (Command Line Interface) and HDFS Commands

  • File read operation

  • File write operation

  • Rack Awareness

  • Hadoop Archives

    • Configuration files in Hadoop Installation and the Purpose
    • Difference between Hadoop 1.X.X , Hadoop 2.X.X & 3.X.X version
  • MapReduce

    • Why Map Reduce is essential in Hadoop?
    • Processing Daemons of Hadoop
    • Node Manager
    • Resource Manager
    • Job
    • Task
    • Keys and Values
    • MapReduce Flow
    • Wordcount example
    • Map abstraction
    • Mapper
    • Reduce abstraction
    • Reducer
    • Map only job
    • Combiner
    • Data locality
    • Anatomy of MapReduce
    • Hadoop Data types
    • Input files
    • HDFS Blocks
    • Input format
    • Input split
    • InputSplits and Records
    • InputSplits and Blocks
    • Record reader
    • Partitioner
    • Shuffling
    • Sorting
    • OutputFormat
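The MapReduce flow listed above (map → shuffle & sort → reduce) can be sketched in plain Python. This is a conceptual illustration of the data flow only, not the Hadoop API:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the line.
    for word in line.split():
        yield (word.lower(), 1)

def reducer(word, counts):
    # Reduce phase: sum all counts for one key.
    return (word, sum(counts))

def wordcount(lines):
    # Shuffle & sort: collect and sort all intermediate pairs by key,
    # mimicking what the Hadoop framework does between map and reduce.
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return [reducer(word, [c for _, c in group])
            for word, group in groupby(pairs, key=itemgetter(0))]

print(wordcount(["big data big hadoop", "hadoop big"]))
# [('big', 3), ('data', 1), ('hadoop', 2)]
```

In real Hadoop the mapper and reducer run on different nodes, and the Partitioner decides which reducer receives each key; here everything runs in one process to show the flow.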
  • Apache PIG

    • Introduction to Apache Pig
    • Map Reduce Vs Apache Pig
    • SQL Vs Apache Pig
    • Different data types in Pig
    • Where to Use Map Reduce and PIG in REAL Time Hadoop Projects
    • Modes Of Execution in Pig
      • Local Mode
      • Map Reduce OR Distributed Mode
    • Execution Mechanism
      • Grunt Shell
      • Script
      • Embedded
    • How to write a simple pig script
    • Bags, Tuples and Fields in PIG
    • UDFs in Pig
  • HIVE

    • Hive Introduction
    • Need of Apache HIVE in Hadoop
    • When to choose MAP REDUCE, PIG & HIVE in REAL TIME Projects
    • Hive Architecture
      • Driver
      • Compiler (includes the Semantic Analyzer)
      • Executor
    • Meta Store in Hive
      • Importance Of Hive Meta Store
      • Embedded Metastore VS External Metastore
      • Embedded metastore configuration
      • External metastore configuration
      • Communication mechanism with Metastore and configuration details
      • Drawbacks with Internal/Embedded metastore over External metastore
    • Hive Integration with Hadoop
    • Hive Query Language (HiveQL)
    • Configuring Hive with MySQL MetaStore
    • SQL Vs HiveQL
    • Data Slicing Mechanisms
      • Partitions In Hive
      • Static Partitioning in Hive and its performance trade-offs
      • Dynamic Partitioning in Hive and its performance trade-offs
      • Buckets In Hive
      • Partitioning with Bucketing usage in Real Time Project Use Cases
      • Partitioning Vs Bucketing
      • Real Time Use Cases
    • User Defined Functions (UDFs) in HIVE
      • Need of UDFs in HIVE
    • Hive Serializer/Deserializer – SerDe
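As a rough illustration of bucketing: Hive assigns each row to a bucket by hashing the bucketing column modulo the number of buckets. The Python sketch below mimics that rule with a toy hash (Hive uses its own Java hash function, so real bucket ids will differ):

```python
def bucket_for(key, num_buckets):
    # Conceptual stand-in for Hive's bucketing rule:
    #   bucket = hash(column value) mod number of buckets
    # A simple character-sum hash keeps the example deterministic.
    return sum(ord(c) for c in str(key)) % num_buckets

# Toy employee ids spread across 4 buckets, as with
# CLUSTERED BY (emp_id) INTO 4 BUCKETS in Hive DDL.
for emp_id in ["emp_101", "emp_102", "emp_103", "emp_104"]:
    print(emp_id, "-> bucket", bucket_for(emp_id, 4))
```

Because the same key always hashes to the same bucket, joins and sampling on the bucketed column can skip most of the data, which is the performance win bucketing offers over partitioning for high-cardinality columns.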
  • SQOOP

    • Introduction to Sqoop.
    • MySQL client and Server Installation
    • How to connect to Relational Database using Sqoop
    • Performance Implications in SQOOP Import and how to improve the performance
    • Performance Implications in SQOOP Export and how to improve the performance
    • Different Sqoop Commands
      • Different flavors of Imports
      • Export
      • Hive-Imports
    • SQOOP Incremental Load VS History Load & Limitations in Incremental Load
  • Flume

    • Flume Introduction
    • Flume Architecture
    • Flume Master, Flume Collector and Flume Agent
    • Flume Configurations
    • Real Time Use Case using Apache Flume
  • YARN (Yet Another Resource Negotiator) – Next-Gen MapReduce

    • What is YARN?
    • Difference between Map Reduce & YARN
    • YARN Architecture
      • Resource Manager
      • Application Master
      • Node Manager
    • When should we go ahead with YARN?
    • YARN Process flow
  • Projects:

    • Practical Knowledge of Retail Data Analysis
    • Employee CTC Analysis

  • SPARK

    • Introduction to Big Data
    • Introduction to Spark
    • Motivation for Spark
    • Spark Vs Map Reduce Processing
    • Architecture Of Spark
    • Spark Shell Introduction
    • Real time Examples of Spark
  • Resilient Distributed Dataset

    • What is an RDD and why is it important in Spark?
    • Transformation in RDD
    • Actions in RDD
    • Loading Data through RDD
    • Saving Data
    • Key-Value pair RDD
    • Pair RDD operations
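The distinction between transformations (lazy) and actions (eager) in the RDD topics above can be mimicked with Python's lazy map/filter objects. This is a conceptual sketch only, not the PySpark API:

```python
# Transformations build a pipeline without touching the data,
# just like RDD transformations (map, filter) in Spark.
data = range(1, 6)                              # stands in for a distributed dataset
squared = map(lambda x: x * x, data)            # "transformation": nothing runs yet
evens = filter(lambda x: x % 2 == 0, squared)   # another lazy transformation

# An action (collect/count/reduce in Spark) forces the whole pipeline to execute:
result = list(evens)                            # like rdd.collect()
print(result)                                   # [4, 16]
```

Laziness lets Spark fuse chained transformations into a single pass over the data and skip work whose result is never demanded by an action.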
  • Spark SQL

    • Introduction to Spark SQL
    • The SQL Context
    • Hive Vs Spark SQL
    • Spark SQL support for Text Files, Parquet and JSON files
    • Data Frames
    • Real Time examples of Spark SQL
  • Spark Streaming

    • Introduction to Spark Streaming
    • Architecture of Spark Streaming
    • Spark Streaming Vs Flume
    • Introduction to Kafka
    • Spark Streaming Integration with Kafka Overview
    • Real Time examples of Spark Streaming & Kafka
  • Spark MLlib

  • GraphX

  • Project:

    • WordCount file analysis using Spark
    • SetTopBox Project outline
  • Benefits of doing Corporate Training:

    • Support for real-life project
    • Complimentary Job Assistance
    • Resume & Interview Preparation
    • Personalized career guidance
    • Extra practical projects in both Hadoop & Spark
  • AWS (Amazon Web Services)

    In AWS, we will learn how to set up an Apache Hadoop cluster on Amazon AWS in a master-and-slave fashion.
