Simulation and analysis applied on virtualization to build Hadoop clusters

Jun 1, 2015
Gabriel Spadon
Ronaldo Celso Messias Correia
Rogério Eduardo Garcia
Celso Olivete
Data growth increases the need for methods and paradigms that provide high scalability, reliability, and fault tolerance when handling large amounts of data. Big Data frameworks are capable of meeting this need. This research uses Apache Hadoop and a Virtual Private Server (VPS) to analyze performance through benchmark tests executed on local, geographically distributed, and centralized Hadoop cluster layouts. The simulation metrics and performance analyses are compared against real servers, and an alternative model is introduced, implemented with a tunneling protocol that enhances the processing power of the cluster.
2015 10th Iberian Conference on Information Systems and Technologies (CISTI)