A Review: Big Data Technologies with Hadoop Distributed Filesystem and Implementing M/R
Keywords: Big Data, Hadoop Ecosystem, Hadoop Distributed File System, NameNode, DataNode
Big Data is any set of data too large to be captured, shared, transferred, stored, managed, and analyzed with traditional database tools within an acceptable time frame. From the point of view of service providers, organizations need to deal with large amounts of data for the purpose of analysis, and IT departments face a tremendous challenge in protecting and analyzing these growing volumes of information. Organizations are collecting and storing more data than ever before because their business depends on it. The information being created is no longer traditional database-driven data, referred to as structured data; rather, it is data that includes documents, images, audio, video, and social media content, known as unstructured data or Big Data. Big Data Analytics is a way of extracting value from these huge volumes of information; it drives new market opportunities and maximizes customer retention. This paper focuses on discussing and understanding Big Data technologies and analytics systems with the Hadoop Distributed File System (HDFS). Such understanding can help predict the future, obtain information, take proactive actions, and make way for better strategic decision making.
Copyright (c) 2020 Renas Rajab Asaad, Hawar B. Ahmad, Rasan Ismael Ali
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors retain copyright
The use of a Creative Commons License enables authors/editors to retain copyright to their work. Publications can be reused and redistributed as long as the original author is correctly attributed.
- The researcher(s), whether for a single-author or joint research paper, must sell and transfer to the publisher (the Academic Journal of Nawroz University), for the entire duration of the publication, which starts from the date this Agreement enters into force, the exclusive rights to the research paper/article. These rights include translating, reusing, transmitting, or distributing the paper/article, and using the material or part(s) contained therein for publication in scientific, academic, technical, or professional journals or any other periodicals, including any other works derived from them, all over the world, in English and Arabic, whether in print or electronic editions of such journals and periodicals, in all types of media or formats that exist now or may exist in the future. Rights also include giving a license (or granting permission) to a third party to use the materials and any other works derived from them and to publish them in such journals and periodicals all over the world. The transfer of rights under this Agreement includes the right to modify such materials for use with computer systems and software, to reproduce or publish them in electronic formats, and to incorporate them into retrieval systems.
- Reproduction, reference, transmission, distribution or any other use of the content, or any parts of the subjects included in that content in any manner permitted by this Agreement, must be accompanied by mentioning the source which is (the Academic Journal of Nawroz University) and the publisher in addition to the title of the article, the name of the author (or co-authors), journal’s name, volume or issue, publisher's copyright, and publication year.
- The Academic Journal of Nawroz University reserves all rights to publish research papers/articles issued under a “Creative Commons License (CC BY-NC-ND 4.0)”, which permits unrestricted use, distribution, and reproduction of the paper/article by any means, provided that the original work is correctly cited.
- Reservation of Rights
The researcher(s) retain(s) all intellectual property rights (except for those transferred to the publisher under this Agreement).
- Researcher’s guarantee
The researcher(s) hereby guarantees that the content of the paper/article is original. It has been submitted only to the Academic Journal of Nawroz University and has not been previously published by any other party.
In the event that the paper/article is written jointly with other researchers, the researcher guarantees that he/she has informed the other co-authors about the terms of this Agreement and has obtained their signature or written permission to sign on their behalf.
The author further guarantees:
- The research paper/article does not contain any defamatory statements or illegal comments.
- The research paper/article does not violate others' rights (including but not limited to copyright, patent, and trademark rights).
- The research paper/article does not contain any facts or instructions that could cause damage or harm to others, and publishing it does not lead to the disclosure of any confidential information.