• HADOOP 

  • Course duration: 47 Hours
  • Core Topics:

  • Apache Hadoop
  • Data Science
  • Data Analytics
  • R
  • Apache Spark & Scala [Introduction]
  • NoSQL [Cassandra vs. MongoDB vs. Riak vs. Neo4j]
  • Machine Learning
  • Apache Hadoop:

  • HDFS
  • MapReduce
  • Hive
  • Pig
  • Sqoop
  • Web Crawling
  • Apache Tika
  • Spark:

  • Introduction
  • Why Spark?
  • Problems with Traditional Large-Scale Systems
  • Introducing Spark
  • Spark Basics
  • What is Apache Spark?
  • Using the Spark Shell
  • Resilient Distributed Datasets (RDDs)
  • Functional Programming with Spark
  • Working with RDDs
  • RDD Operations
  • Key-Value Pair RDDs
  • MapReduce and Pair RDD Operations
  • The Hadoop Distributed File System
  • Why HDFS?
  • HDFS Architecture
  • Running Spark on a Cluster
  • Overview
  • A Spark Standalone Cluster
  • The Spark Standalone Web UI
  • Using HDFS
  • RDD Partitions and HDFS Data Locality
  • Working With Partitions
  • Executing Parallel Operations
  • Caching and Persistence
  • RDD Lineage
  • Caching Overview
  • Distributed Persistence
  • Writing Spark Applications
  • Spark Applications vs. Spark Shell
  • Creating the SparkContext
  • Configuring Spark Properties
  • Building and Running a Spark Application
  • Logging
  • Spark, Hadoop, and the Enterprise Data Center
  • Overview
  • Spark and the Hadoop Ecosystem
  • Spark and MapReduce
  • Spark Streaming
  • Spark Streaming Overview
  • Example: Streaming Word Count
  • Other Streaming Operations
  • Sliding Window Operations
  • Developing Spark Streaming Applications
  • Common Spark Algorithms
  • Iterative Algorithms
  • Graph Analysis
  • Machine Learning
  • Improving Spark Performance
  • Shared Variables: Broadcast Variables
  • Shared Variables: Accumulators
  • Common Performance Issues

     Data Science:

  • Introduction to Data Science
  • Basics of Statistics
  • Bayesian Algorithm
  •  R
  • Introduction to NLP
  •  Machine Learning

  • Clustering
  • Dimensionality Reduction
  • Support Vector Machines
  • Anomaly Detection
  • K-means
  • R
  •  Basics of R
  •  Key principles
  •  Applying and integrating R with Hadoop
  •  Applying statistics & algorithms in R
  • NoSQL:
  •  Basics of:
  •  Cassandra
  •  MongoDB
  •  HBase
  • Riak
  • Neo4j
  • Apache Spark & Scala

  • Apache Spark & Scala
  • Spark MLlib
  • Advanced Spark Concepts
  • Apache Flink – 4th Generation

  • System Overview
  • DataSet API
  • DataStream API
  • End to End Architecture
  • MongoDB

  • Introduction to MongoDB
  • Installing MongoDB
  • Data Model
  • Working with Data
  • Indexing & Aggregation

Please contact the undersigned for course fees, timelines, etc.

Kalyan D.

kalyan@techEtraining.com

bolokalyan@gmail.com

USA Desk: +1-201-478-8484 (24 x 7)

India: +91-880 1000 880 (WhatsApp)

India: +91-800-800-4053 (WhatsApp)

Skype: techetraining
www.techEtraining.com
