Hadoop 2: build HDFS without YARN and MapReduce

I want to make some changes to Hadoop HDFS based on a published paper. After that, I only need to build HDFS and get it running. How can I do that?


Refer to the following Hadoop documentation


This assumes you are building on Linux. On a different OS you may need some extra steps; for details see this (I've never done this on non-Linux myself).

  • Install Git, Java (JDK), Maven and Protocol Buffers (protoc version 2.5.0; the Hadoop 2.x build checks the protoc version and fails on a mismatch)

  • Clone https://github.com/apache/hadoop-common.git by typing something like this in your command line:

    git clone https://github.com/apache/hadoop-common.git

    Note: you may want to use a particular branch corresponding to the version of HDFS you're looking to build. To list all branches, type git branch -a. Then to switch to branch 2.3, for example, type:

    git checkout --track origin/branch-2.3

    If you did everything correctly, you should see a message about tracking the remote branch you've selected.

  • Make whatever changes you need to make in HDFS; the code lives under hadoop-hdfs-project.

  • Compile the project by running the following from the root of your tree:

    mvn install -DskipTests

    This will take some time the first time you do it, but will be a lot quicker during re-runs.
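Before starting the steps above, it can help to confirm that the required tools are actually on your PATH. The following is a minimal sketch and not part of the Hadoop build itself; the function name missing_tools is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each given tool that is NOT found on PATH,
# one per line. Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Typical use before building:
#   missing_tools git java mvn protoc || echo "install the tools listed above"
#   protoc --version    # shows which Protocol Buffers release is installed
```

Running missing_tools git java mvn protoc prints nothing when all four tools are installed; anything it does print still needs to be installed before mvn install can succeed.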

Your final jars will be placed into directories like hadoop-hdfs-project/hadoop-hdfs/target. This is accurate for at least 2.3, but it may have been different in older versions and may change in the future.
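Since only HDFS is of interest here, Maven's standard reactor options can probably limit the build to that module: -pl selects a module by its path, and -am ("also make") builds the modules it depends on as well. Treat the following as a sketch that has not been checked against every Hadoop branch. The names build_hdfs and list_jars are hypothetical helpers, and MVN is an override variable introduced here so the command can be dry-run.

```shell
#!/bin/sh
# Hypothetical helper: build one module (default: the HDFS module) together
# with the modules it depends on. Run it from the root of the source tree.
# Set MVN=echo to print the Maven arguments instead of running Maven.
build_hdfs() {
    "${MVN:-mvn}" install -DskipTests \
        -pl "${1:-hadoop-hdfs-project/hadoop-hdfs}" -am
}

# Hypothetical helper: list the jars directly under <module>/target.
list_jars() {
    for jar in "$1"/target/*.jar; do
        # Skip the unexpanded pattern when no jar exists yet.
        [ -e "$jar" ] && echo "$jar"
    done
    return 0
}

# Typical use:
#   build_hdfs
#   list_jars hadoop-hdfs-project/hadoop-hdfs
```

If the restricted build fails on your branch, the full mvn install -DskipTests from the steps above remains the safe fallback; list_jars works the same way afterwards.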

