Lab: Hadoop and HDFS in a simulated production environment

Author

Objectives

This lab accompanies the course on Hadoop, the open-source framework developed and maintained by the Apache Software Foundation. The objective is to use Hadoop and HDFS in a simulated production environment running inside a Docker container.

Step #1: MapReduce, standalone mode.

Step #2: Hadoop installation (using a pre-configured Docker container).

Step #3: MapReduce, Hadoop cluster mode.

Exercise: For those who have time, an exercise on large matrix-vector multiplication.
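
To give a feel for the matrix-vector exercise before starting it, here is a minimal sketch of how the product y = A·v could be phrased as a map step followed by a reduce step. It is a pure-Python simulation, not Hadoop code: the function names (map_phase, reduce_phase, matvec) and the (row, column, value) input format are illustrative assumptions, and the sketch assumes the vector v is small enough to be available to every mapper.

```python
from collections import defaultdict


def map_phase(entries, v):
    """Map: for each matrix entry (i, j, a_ij), emit the pair (i, a_ij * v[j])."""
    for i, j, a_ij in entries:
        yield i, a_ij * v[j]


def reduce_phase(pairs):
    """Reduce: sum all partial products that share the same row index i."""
    sums = defaultdict(float)
    for i, partial in pairs:
        sums[i] += partial
    return dict(sums)


def matvec(entries, v):
    """Chain the two phases (hypothetical helper, stands in for the framework)."""
    return reduce_phase(map_phase(entries, v))


# The matrix [[1, 2], [0, 3]] stored as (row, column, value) triples.
entries = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]
v = [10.0, 20.0]

result = matvec(entries, v)
print(result)  # {0: 50.0, 1: 60.0}
```

In a real cluster the pairs emitted by the mappers would be grouped by key (the row index) by the framework before reaching the reducers; the dictionary in reduce_phase plays that role here.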