Instructions
Compile PVFS2 server and JNI library
- Get the latest version of PVFS2 Layout from CVS
- cvs -d :pserver:anonymous@cvs.parl.clemson.edu:/anoncvs co -r pvfs2-mr-shim pvfs2
- Get the latest version of JNI from SVN
- cd pvfs2/
- rm -rf src/jni/
- svn co svn://gs7619.sp.cs.cmu.edu/pvfs/hadoop-0.20.1-jni/jni src/jni
- Compile and install PVFS2
- ./configure
- make
- make install
- The JNI library is located at pvfs2/lib/libpvfs2-jni.so
Compile Hadoop 0.20.1 with PVFS2-FileSystem library
- Download the hadoop-0.20.1 source code from http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
- wget http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
- tar xzf hadoop-0.20.1.tar.gz
- Get the latest PVFS2-FileSystem library
- cd hadoop-0.20.1
- svn co svn://gs7619.sp.cs.cmu.edu/pvfs/hadoop-0.20.1-jni/src/contrib/pvfs src/contrib/pvfs
- Compile PVFS2-FileSystem library
- The PVFS2-FileSystem library is located at hadoop-0.20.1/build/contrib/pvfs/hadoop-0.20.2-dev-pvfs.jar
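Before moving on to configuration, it can help to confirm that both build outputs named above actually exist. The following is a minimal sketch only: `check_artifacts` is a hypothetical helper name, not part of PVFS2 or Hadoop, and the paths in the usage comment assume that pvfs2/ and hadoop-0.20.1/ sit side by side in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: report whether each expected build artifact exists.
# Returns non-zero if at least one of the given paths is missing.
check_artifacts() {
    status=0
    for path in "$@"; do
        if [ -f "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Usage (paths taken from the steps above; adjust to your layout):
# check_artifacts pvfs2/lib/libpvfs2-jni.so \
#     hadoop-0.20.1/build/contrib/pvfs/hadoop-0.20.2-dev-pvfs.jar
```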
Configure Hadoop and PVFS
- Create PVFS configuration file
- /usr/local/pvfs2/bin/pvfs2-genconfig pvfs2-fs.conf
- Increase the default PVFS block (strip) size by adding the following parameters to pvfs2-fs.conf (example: 67108864 bytes = 64 MiB)
<Filesystem>
<Distribution>
Name simple_stripe
Param strip_size
Value 67108864
</Distribution>
</Filesystem>
- Copy the JNI and PVFS2-FileSystem libraries into the Hadoop library directories
- cp pvfs2/lib/libpvfs2-jni.so hadoop-0.20.1/lib/native/Linux-amd64-64/libpvfs2-jni.so
- cp pvfs2/lib/libpvfs2-jni.so hadoop-0.20.1/lib/native/Linux-i386-32/libpvfs2-jni.so
- cp hadoop-0.20.1/build/contrib/pvfs/hadoop-0.20.2-dev-pvfs.jar hadoop-0.20.1/lib/hadoop-0.20.2-dev-pvfs.jar
- Configure Hadoop for PVFS2 by adding the following properties to core-site.xml (example)
<configuration>
<property>
<name>fs.pvfs2.impl</name>
<value>org.apache.hadoop.fs.pvfs2.PVFS2FileSystem</value>
<description>The FileSystem for pvfs2: uris.</description>
</property>
<property>
<name>fs.default.name</name>
<value>pvfs2://cloud2:3334/</value>
<description>The name of the default file system.</description>
</property>
<property>
<name>fs.local.block.size</name>
<value>67108864</value>
<description>Block size; must be the same as the pvfs2 strip size (strip_size in pvfs2-fs.conf)</description>
</property>
<property>
<name>fs.pvfs2.buffer.size</name>
<value>4194304</value>
<description>The size of each PVFS I/O request </description>
</property>
<property>
<name>fs.pvfs2.layout</name>
<value>default</value>
<description>Layout used for pvfs2 shim. Options are default, roundrobin, roundrobin-singlefile, local, local-singlefile</description>
</property>
</configuration>
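The two example files must agree on the block size: strip_size in pvfs2-fs.conf and fs.local.block.size in core-site.xml are both 67108864 bytes (64 * 1024 * 1024, i.e. 64 MiB), while fs.pvfs2.buffer.size is 4194304 bytes (4 MiB). The sketch below checks that agreement. The function names are hypothetical, and the parsing assumes each `Param`/`Value` pair and each `<name>`/`<value>` element sits on its own line, as in the examples above; it is not a general XML parser.

```shell
#!/bin/sh
# Print the Value that follows "Param strip_size" in a PVFS2 config file.
pvfs_strip_size() {
    awk '$1 == "Param" && $2 == "strip_size" { want = 1; next }
         want && $1 == "Value" { print $2; exit }' "$1"
}

# Print the <value> of a named property in a Hadoop *-site.xml file.
# Assumes <name> and <value> are each on their own line.
hadoop_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }' "$1"
}

# Compare the PVFS2 strip size with Hadoop's fs.local.block.size.
# Arguments: path to pvfs2-fs.conf, path to core-site.xml.
check_block_size() {
    strip=$(pvfs_strip_size "$1")
    block=$(hadoop_property "$2" fs.local.block.size)
    if [ -n "$strip" ] && [ "$strip" = "$block" ]; then
        echo "OK: strip_size and fs.local.block.size are both $strip bytes"
    else
        echo "MISMATCH: strip_size=$strip fs.local.block.size=$block"
        return 1
    fi
}

# Usage (the core-site.xml location is an assumption; adjust as needed):
# check_block_size pvfs2-fs.conf hadoop-0.20.1/conf/core-site.xml
```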
Verify configuration
- Format (-f) and then start PVFS2 on each server
- /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf -f
- /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf
- Run a list command; the output should look like the following
- hadoop-0.20.1/bin/hadoop fs -ls /
Found 4 items
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:57 /.0
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:57 /.1
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:57 /.2
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:56 /lost+found
- hadoop-0.20.1/bin/hadoop fs -copyFromLocal hadoop-0.20.1.tar.gz /hadoop-0.20.1.tar.gz
- hadoop-0.20.1/bin/hadoop fs -lsr /
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:57 /.0
drwxrwxrwx - wtantisi lcs 0 2010-11-03 18:05 /.1
-rw-r--r-- 1 wtantisi lcs 40196756 2010-11-03 18:05 /.1/hadoop-0.20.1.tar.gz
drwxrwxrwx - wtantisi lcs 0 2010-11-03 18:05 /.2
-rw-r--r-- 1 wtantisi lcs 40196756 2010-11-03 18:05 /.2/hadoop-0.20.1.tar.gz
drwxrwxrwx - wtantisi lcs 0 2010-11-03 17:56 /lost+found
-rwxrwxrwx 3 wtantisi lcs 40196756 2010-11-03 18:05 /hadoop-0.20.1.tar.gz
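One way to confirm the copy succeeded is to compare the local file size with the size column in the listing (40196756 bytes in the sample output above). The sketch below pulls that column out of `hadoop fs -ls` output. `listed_size` is a hypothetical helper name, and it assumes the listing format shown above, where the size is the fifth whitespace-separated field and the path is the last.

```shell
#!/bin/sh
# Read listing lines on stdin and print the size (field 5) of the entry
# whose last field equals the given path. Header lines such as
# "Found 4 items" are skipped because their last field is not a path.
listed_size() {
    awk -v path="$1" '$NF == path { print $5; exit }'
}

# Usage:
# local_size=$(wc -c < hadoop-0.20.1.tar.gz)
# remote_size=$(hadoop-0.20.1/bin/hadoop fs -ls /hadoop-0.20.1.tar.gz \
#     | listed_size /hadoop-0.20.1.tar.gz)
# [ "$local_size" -eq "$remote_size" ] && echo "sizes match"
```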