High Availability

OS & Virtualization

Monday, August 26, 2019

Technique of Processing Big Data




Typical Data centre with Hadoop






Sqoop
  • tool designed to transfer data between Hadoop and relational databases or mainframes


E.g.
  • $ sqoop list-databases --connect jdbc:mysql://database.test.com/
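
When Sqoop is driven from a script, it can help to build the command as an argument list rather than a single string. The following is only a sketch: the helper name `build_list_databases_cmd` is made up for illustration, and the optional `--username` / `-P` arguments are an assumption about how the database is secured, not something this post specifies.

```python
import shlex


def build_list_databases_cmd(connect_url, username=None):
    """Return the argument list for `sqoop list-databases` (hypothetical helper)."""
    cmd = ["sqoop", "list-databases", "--connect", connect_url]
    if username is not None:
        # Assumption: the database needs credentials. -P asks Sqoop to
        # prompt for the password instead of reading it from the command line.
        cmd += ["--username", username, "-P"]
    return cmd


cmd = build_list_databases_cmd("jdbc:mysql://database.test.com/")
# Print the shell form of the command; nothing is executed here.
print(shlex.join(cmd))
```

On a machine where Sqoop is installed, the list could be passed to `subprocess.run(cmd, check=True)` to actually run it.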

Pig
  • high-level scripting language for Hadoop
  • runs on the client machine
  • its simple, SQL-like scripting language is called Pig Latin
  • Uses
    • ETL data pipelines
    • Research on raw data
    • Iterative processing
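
To give a feel for the ETL-pipeline use case, here is a rough Python analogue of the load, filter, and group steps that a Pig Latin script typically chains together. It is not Pig code: the log format, the sample rows, and all function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical raw log lines in the form "user,url,response_ms".
RAW_LINES = [
    "alice,/home,120",
    "bob,/home,95",
    "alice,/cart,310",
    "broken line",
    "bob,/cart,280",
]


def load(lines):
    """Load step: parse each line into (user, url, ms), skipping malformed rows."""
    for line in lines:
        parts = line.split(",")
        if len(parts) == 3 and parts[2].isdigit():
            yield parts[0], parts[1], int(parts[2])


def filter_slow(records, threshold_ms):
    """Filter step: keep only requests slower than the threshold."""
    return (r for r in records if r[2] > threshold_ms)


def group_count_by_url(records):
    """Group step: count the remaining records per URL."""
    counts = defaultdict(int)
    for _user, url, _ms in records:
        counts[url] += 1
    return dict(counts)


slow_by_url = group_count_by_url(filter_slow(load(RAW_LINES), 100))
print(slow_by_url)
```

Each stage consumes the output of the previous one, which is the same data-flow style a Pig script expresses, with Pig then running the steps on the cluster instead of in one process.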





Hive
  • high-level abstraction over MapReduce
  • turns HiveQL queries into MapReduce jobs
  • SQL-like query language (HiveQL)
  • Hive makes analysis of data stored in Hadoop easier and more productive than writing MapReduce code by hand
  • HiveQL statements are interpreted by Hive, which produces one or more MapReduce jobs and submits them for execution on the Hadoop cluster
  • Best suited to
    • Analyzing relatively static data
    • Workloads where a fast response time is not required
    • Data that does not change rapidly
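
As a rough illustration of what "turning a query into a MapReduce job" means, the sketch below simulates in plain Python how a query such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be split into map, shuffle, and reduce phases. The table, its rows, and the function names are assumptions made up for this example; Hive's real execution plans are more involved.

```python
from collections import defaultdict

# Hypothetical rows of an "employees" table.
EMPLOYEES = [
    {"name": "Ann", "dept": "eng"},
    {"name": "Raj", "dept": "ops"},
    {"name": "Li", "dept": "eng"},
]


def map_phase(rows):
    """Map: emit a (dept, 1) pair for every row."""
    for row in rows:
        yield row["dept"], 1


def shuffle_phase(pairs):
    """Shuffle: collect all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values per key, which gives COUNT(*) per dept."""
    return {key: sum(values) for key, values in grouped.items()}


counts = reduce_phase(shuffle_phase(map_phase(EMPLOYEES)))
print(counts)
```

With Hive, the one-line query replaces all three hand-written phases, which is where the productivity gain comes from.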

Impala
  • query language similar to HiveQL
  • runs on the Hadoop cluster
  • Impala is meant for interactive computing; Hive is geared more toward batch processing

Real time data to HDFS



Apache Flume is a system used for moving massive quantities of streaming data into HDFS.


Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that let data workers efficiently run streaming, machine learning, or SQL workloads that require fast iterative access to datasets.
