A Review: Big Data Technologies with Hadoop Distributed Filesystem and Implementing M/R


  • Renas Rajab Asaad, Department of Computer Science, Nawroz University, Duhok, Kurdistan Region - Iraq
  • Hawar B. Ahmad, Department of Computer Science, Nawroz University, Duhok, Kurdistan Region - Iraq
  • Rasan Ismael Ali, Department of Computer Science, Nawroz University, Duhok, Kurdistan Region - Iraq




Big Data, Hadoop Ecosystem, Hadoop Distributed File System, NameNode, DataNode


Big Data is any set of data too large to be captured, shared, transferred, stored, managed, and analyzed with traditional database tools within an acceptable time frame. From the point of view of service providers, organizations need to deal with large amounts of data for the purpose of analysis, and IT departments face a tremendous challenge in protecting and analyzing these growing volumes of information. Organizations are collecting and storing more data than ever before because their business depends on it. The information being created is no longer traditional database-driven data, referred to as structured data, but rather data that includes documents, images, audio, video, and social media content, known as unstructured data or Big Data. Big Data Analytics is a way of extracting value from these huge volumes of information; it drives new market opportunities and maximizes customer retention. This paper focuses on discussing and understanding Big Data technologies and analytics systems with the Hadoop Distributed File System (HDFS). This can help predict the future, obtain information, take proactive action, and make way for better strategic decision making.







How to Cite

Asaad, R. R., Ahmad, H. B., & Ali, R. I. (2020). A Review: Big Data Technologies with Hadoop Distributed Filesystem and Implementing M/R. Academic Journal of Nawroz University, 9(1), 25–33. https://doi.org/10.25007/ajnu.v9n1a530