1. Please install alex, happy, and groom:

       cabal install alex happy groom

2. Then, build it:

       make

   The compiler command fregel should now have been produced.

3. Compile your Fregel code:

       ./fregel ../sample-programs/re.fgl

   Two Java files should have been created:
     reAll.java for 'reAll' defined in re.fgl
     re3.java   for 're3'   defined in re.fgl

   If you want to compile it into Pregel+ code, run the following command:

       ./fregel -p ../sample-programs/re.fgl

   Two C++ files should have been created:
     reAll.cpp for 'reAll' defined in re.fgl
     re3.cpp   for 're3'   defined in re.fgl

4. Compile the generated code, and run it

4.1 [Giraph] Compile the Java source, make a jar file, and run it

   Assumptions:
     Giraph 1.2.0 is installed in ~/giraph-1.2.0
     Hadoop 1.2.1 is installed in ~/hadoop-1.2.1

   First, make a directory mysrc and extract the class files from the jar
   file of giraph-examples:

       bash
       cd ~
       mkdir mysrc
       cd mysrc
       jar xf ~/giraph-1.2.0/giraph-examples/target/giraph-examples-1.2.0-for-hadoop-1.2.1-jar-with-dependencies.jar
       rm -rf org/apache/giraph/examples/
       exit

   You can reuse this directory 'mysrc' for multiple Fregel programs.

   Next, copy the Java files generated by the Fregel compiler to the
   directory mysrc:

       cp *.java ~/mysrc

   Compile the source to make a jar file:

       bash
       cd ~/mysrc
       javac *.java -cp ~/hadoop-1.2.1/hadoop-core-1.2.1.jar:.
       cd ..
       jar cf myprog.jar -C mysrc/ .
       exit

   Run reAll in the jar file with Hadoop (be careful that the initial
   letter of the class name is capital):

       ~/hadoop-1.2.1/bin/start-all.sh
       ~/hadoop-1.2.1/bin/hadoop dfs -copyFromLocal ../sample-programs/graph4.txt /user/hduser/input/graph4.txt
       ~/hadoop-1.2.1/bin/hadoop dfs -ls /user/hduser/input
       ~/hadoop-1.2.1/bin/hadoop jar ~/myprog.jar org.apache.giraph.GiraphRunner 'ReAll$VertexComputation' -mc 'ReAll$MasterComputation' -vif 'ReAll$InputFormat' -vof 'ReAll$OutputFormat' -vip /user/hduser/input/graph4.txt -op /user/hduser/output/reAll -w 1

   Please replace 'reAll' (and 'ReAll') with other names such as 're3'
   (and 'Re3') to run other programs. For example, if you compiled the
   sssp code (../sample-programs/sssp.fgl), the command is:

       ~/hadoop-1.2.1/bin/hadoop jar ~/myprog.jar org.apache.giraph.GiraphRunner 'Sssp$VertexComputation' -mc 'Sssp$MasterComputation' -vif 'Sssp$InputFormat' -vof 'Sssp$OutputFormat' -vip /user/hduser/input/graph4.txt -op /user/hduser/output/sssp -w 1

4.2 [Pregel+] Compile the C++ source and run it

   Assumptions:
     Hadoop 1.2.1 is installed.
     The Pregel+ system is installed.
     MPICH is installed.
     Your system is amd64 (64-bit Linux).
     The following environment variables are set properly:
       JAVA_HOME        for the Java directory, e.g., /usr/lib/jvm/java-8-oracle/
       HADOOP_HOME      for the Hadoop directory, e.g., ~/hadoop-1.2.1/
       PREGEL_PLUS_HOME for the Pregel+ system directory, e.g., ~/system/
       CLASSPATH        for listing all jar files of Hadoop (see Pregel+'s document)
       LD_LIBRARY_PATH  to include the directories of libhdfs.so and libjvm.so,
                        e.g., $HADOOP_HOME/c++/Linux-amd64-64/lib and
                        $JAVA_HOME/jre/lib/amd64/server/

   For example, the last two can be set by the following script (you may
   append it to your .bashrc):

       JPLATFORM=amd64
       PLATFORM=Linux-amd64-64
       export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${HADOOP_HOME}/c++/${PLATFORM}/lib:${JAVA_HOME}/jre/lib/${JPLATFORM}/server/
       CLASSPATH=$( ls $HADOOP_HOME/*.jar $HADOOP_HOME/lib/*.jar | sed -e 'H;$!d;g;s/\n//;s/\n/:/g' )
       export CLASSPATH

   To compile a Pregel+ code, it is convenient to use a Makefile with the
   following content:

       ### Makefile starts ###
       CCOMPILE=mpic++
       JPLATFORM=amd64
       PLATFORM=Linux-amd64-64
       CPPFLAGS= -I$(HADOOP_HOME)/src/c++/libhdfs -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -I$(PREGEL_PLUS_HOME)
       LIB = -L$(HADOOP_HOME)/c++/$(PLATFORM)/lib -L$(JAVA_HOME)/jre/lib/$(JPLATFORM)/server/
       LDFLAGS = -ljvm -lhdfs -Wno-deprecated -O3
       %: %.cpp
       	$(CCOMPILE) $< $(CPPFLAGS) $(LIB) $(LDFLAGS) -o $@
       ### Makefile ends ###

   Copy the Makefile to the same directory as the generated code, e.g.,
   reAll.cpp and re3.cpp. Then, the following command builds the
   executables:

       make reAll re3

   Next, you need to upload your graph data to a specific directory in
   the HDFS. The directory is /PROGRAM_NAME/input/ for PROGRAM_NAME.cpp.
   For example, you can do it by the following commands for reAll.cpp:

       ${HADOOP_HOME}/bin/start-dfs.sh
       ${HADOOP_HOME}/bin/hadoop dfs -copyFromLocal ../sample-programs/graph4-pp.txt /reAll/input/graph4-pp.txt
       ${HADOOP_HOME}/bin/hadoop dfs -ls /reAll/input/

   Graph data is a text file in which each line represents a vertex in
   the following format:

       vid \t val degree nb1 len1 nb2 len2 ...

   Here, nb1, nb2, ... are the vids of the neighbors, and len1, len2, ...
   are the values associated with the edges to those neighbors.

   Then, you need a machine file that lists the machines to be used in
   the computation. For example, a simple machine file that uses two
   local CPU cores consists of the following single line:

       localhost:2

   For the details, see Pregel+'s document.

   Now you are ready to run the program. Execute the following command,
   in which machines.txt is the machine file and '-n 2' means to use two
   processes:

       mpirun -f machines.txt -n 2 $(pwd)/reAll

   The output is given in the directory /reAll/output/ in the HDFS. You
   can check the result by the following command:

       ${HADOOP_HOME}/bin/hadoop dfs -cat /reAll/output/*

   In general, the result is given in /PROGRAM_NAME/output/ for
   PROGRAM_NAME.cpp.
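   As an aside, the per-vertex input line format described above
   (vid \t val degree nb1 len1 nb2 len2 ...) can be sketched with a
   short Python script. This is only an illustration of the field
   layout; the helper names format_vertex and parse_vertex are
   hypothetical and not part of Fregel or Pregel+:

       # Illustrative sketch (not part of Fregel/Pregel+): write and read
       # one vertex line in the format  vid \t val degree nb1 len1 nb2 len2 ...

       def format_vertex(vid, val, neighbors):
           """neighbors: list of (neighbor_vid, edge_value) pairs."""
           fields = [str(val), str(len(neighbors))]
           for nb, length in neighbors:
               fields.extend([str(nb), str(length)])
           return "%d\t%s" % (vid, " ".join(fields))

       def parse_vertex(line):
           """Inverse of format_vertex: recover (vid, val, neighbors)."""
           vid, rest = line.split("\t")
           tokens = rest.split()
           val, degree = tokens[0], int(tokens[1])
           pairs = tokens[2:]
           neighbors = [(int(pairs[2 * i]), int(pairs[2 * i + 1]))
                        for i in range(degree)]
           return int(vid), val, neighbors

       # Vertex 1 with value 0 and edges to vertices 2 (length 5) and 3 (length 7)
       line = format_vertex(1, 0, [(2, 5), (3, 7)])
       print(line)               # 1<TAB>0 2 2 5 3 7
       print(parse_vertex(line))

   A line produced this way matches the format expected in
   /PROGRAM_NAME/input/, e.g. the graph4-pp.txt file used above.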