Before compiling your first Hadoop program, see the instructions on how to run the WordCount example.
You can get the WordCount example code from GitHub
(make sure you get a version compatible with your Hadoop installation):
wget https://github.com/apache/hadoop-common/raw/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java
Optionally, change the line package org.apache.hadoop.examples;
to package org.janzhou;
(the run command below assumes this package name).
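If you would rather script the package change than edit the file by hand, here is a minimal sketch. The helper name set_package is hypothetical (it is not part of Hadoop), and it assumes the source file has one package declaration that starts at the beginning of a line.

```shell
# Hypothetical helper: rewrite the package declaration of a Java source file.
# Usage: set_package WordCount.java org.janzhou
set_package() {
    file="$1"
    pkg="$2"
    # -i.bak is accepted by both GNU and BSD sed; the backup is removed afterwards.
    sed -i.bak -E "s/^package[[:space:]]+[A-Za-z0-9_.]+;/package ${pkg};/" "$file"
    rm -f "${file}.bak"
}
```

The remaining lines of the file are left untouched, so the imports and the class body stay as downloaded.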
Set the HADOOP_CLASSPATH:
export HADOOP_CLASSPATH=$(bin/hadoop classpath)
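It can help to check what the classpath command returned before compiling. The sketch below wraps the export in a hypothetical helper, export_hadoop_classpath, and prints one classpath entry per line. The HADOOP variable is an assumption for illustration and defaults to bin/hadoop.

```shell
# Path to the hadoop launcher; override it if you are not in the Hadoop directory.
HADOOP="${HADOOP:-bin/hadoop}"

# Hypothetical helper: export HADOOP_CLASSPATH and list its entries one per line.
export_hadoop_classpath() {
    classpath="$("$HADOOP" classpath)" || return 1
    if [ -z "$classpath" ]; then
        echo "hadoop classpath returned nothing" >&2
        return 1
    fi
    export HADOOP_CLASSPATH="$classpath"
    # Entries are separated by ':'; quoting keeps wildcard entries such as lib/* literal.
    printf '%s\n' "$classpath" | tr ':' '\n'
}
```

An empty result usually means the launcher path is wrong, which is easier to spot here than in a later javac error.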
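Compile (create the output directory first, since older javac versions do not create it):
mkdir -p WordCount
javac -classpath "${HADOOP_CLASSPATH}" -d WordCount/ WordCount.java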
Create JAR:
jar -cvf WordCount.jar -C WordCount/ .
Run:
bin/hadoop jar WordCount.jar org.janzhou.WordCount /wordcount/input /wordcount/output
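The job needs input files in HDFS, and it fails if the output directory already exists. The sketch below stages a local file, clears any old output, runs the job, and prints the result. The function name run_wordcount and the HADOOP variable are assumptions for illustration, and the part-r-* file pattern assumes the job writes reducer output under that default naming.

```shell
# Path to the hadoop launcher; override it if you are not in the Hadoop directory.
HADOOP="${HADOOP:-bin/hadoop}"

# Hypothetical helper: stage one local file, run WordCount, and show the result.
# Usage: run_wordcount /path/to/local/input.txt
run_wordcount() {
    "$HADOOP" fs -mkdir -p /wordcount/input &&
    "$HADOOP" fs -put -f "$1" /wordcount/input &&
    # Remove output left over from a previous run, if any.
    "$HADOOP" fs -rm -r -f /wordcount/output &&
    "$HADOOP" jar WordCount.jar org.janzhou.WordCount /wordcount/input /wordcount/output &&
    # The pattern is quoted so that HDFS, not the local shell, expands it.
    "$HADOOP" fs -cat '/wordcount/output/part-r-*'
}
```

Each step runs only if the previous one succeeded, so a failed upload does not start the job.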
Using com.sun.tools.javac.Main
You normally invoke javac from the command line, but you can also invoke it from within a Java program. Use the com.sun.tools.javac.Main
class, located in ${JAVA_HOME}/lib/tools.jar,
and pass it an array of Strings equivalent to the command-line parameters.
See the MapReduce Tutorial for this approach.
Set environment variables:
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
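This approach depends on tools.jar being present, and JDK 9 and later no longer ship that file. A minimal sketch of a guard around the export follows. The helper name use_tools_jar is hypothetical.

```shell
# Hypothetical helper: export HADOOP_CLASSPATH only if tools.jar really exists.
use_tools_jar() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    tools="$JAVA_HOME/lib/tools.jar"
    if [ ! -f "$tools" ]; then
        echo "$tools not found (JDK 9 and later do not include tools.jar)" >&2
        return 1
    fi
    export HADOOP_CLASSPATH="$tools"
}
```

If the check fails, the plain javac route described earlier still works, because it does not need tools.jar.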
Compile WordCount.java and create a jar:
bin/hadoop com.sun.tools.javac.Main -d WordCount/ WordCount.java
jar -cvf WordCount.jar -C WordCount/ .
Makefile
It is also handy to have a Makefile that does this automatically for you.
Here is a simple example:
HADOOP = ${HOME}/hadoop-2.5.1/bin/hadoop
APP = WordCount
SRC = src/*.java
OUT = out
# Recipe lines must begin with a tab character.
$(APP).jar: $(SRC)
	mkdir -p $(OUT)
	javac -classpath "`$(HADOOP) classpath`" -d $(OUT) $(SRC)
	jar -cvf $(APP).jar -C $(OUT) .

.PHONY: clean
clean:
	rm -rf $(OUT) *.jar
You can find more comprehensive examples in Hadoop Example.