Apache Hadoop Ecosystem

  • Hadoop HDFS - 2007 - A distributed file system for reliably storing huge amounts of unstructured, semi-structured and structured data in the form of files. 
  • Hadoop MapReduce - 2007 - A distributed framework for processing large datasets in parallel on the HDFS filesystem. It runs on a Hadoop cluster and can also work with other data stores such as Cassandra and HBase. 
  • Cassandra - 2008 - A NoSQL key-value database with a column-family data model and asynchronous, masterless replication. 
  • HBase - 2008 - A NoSQL key-value database with a column-family data model and master-slave replication. It uses HDFS as its underlying storage. 
  • ZooKeeper - 2008 - A coordination service for distributed applications. It is based on Zab, an atomic broadcast protocol similar to Paxos. 
  • Pig - 2009 - A scripting interface over MapReduce for developers who prefer scripting to native Java MapReduce programming. 
  • Hive - 2009 - A SQL interface over MapReduce for developers and analysts who prefer SQL to native Java MapReduce programming. 
  • Mahout - 2009 - A library of machine learning algorithms, implemented on top of MapReduce, for finding meaningful patterns in HDFS datasets. 
  • Sqoop - 2010 - A tool to import data from an RDBMS or data warehouse into HDFS or HBase and export it back. 
  • YARN - 2011 - A system to schedule applications and services on a Hadoop cluster and manage cluster resources such as memory and CPU. 
  • Flume - 2011 - A tool to collect, aggregate, reliably move and ingest large amounts of data into HDFS. 
  • Storm - 2011 - A system to process high-velocity streaming data with 'at least once' message semantics. 
  • Spark - 2012 - An in-memory data processing engine that runs a DAG of operations. It provides libraries for machine learning, SQL queries and near-real-time stream processing. 
  • Kafka - 2012 - A distributed messaging system with partitioned topics for very high scalability. 
  • SolrCloud - 2012 - A distributed search engine with a REST-like interface for full-text search. It uses the Lucene library for data indexing.
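
Since several of the tools above (Pig, Hive, Mahout) are layered on MapReduce, a small sketch of the programming model may help. The example below is a single-process Python illustration of the map, shuffle and reduce steps for a word count. It is not Hadoop's Java API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration. On a real cluster the framework distributes these steps across nodes and reads its input from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In Hadoop the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data before the internet", "big data after the internet"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

The point of the split is that map_phase and reduce_phase each see only one line or one key at a time, which is what lets a framework run many copies of them in parallel on different machines.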

