Thursday, January 22, 2015

Hadoop installation in Ubuntu (Single node)

The following steps can be used to install the Hadoop file system on Ubuntu 14.04.1 (64-bit).

I have used the following software versions:

Ubuntu 14.04.1 64 bit
JDK1.7 64 bit
Hadoop 2.5.2

For demo purposes, the OS was installed in an Oracle VM VirtualBox virtual machine.

1. Install Ubuntu in Oracle VM

2. Install JAVA in Ubuntu local path /usr/local/java
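As a rough sketch, step 2 can be done by extracting an Oracle JDK tarball into /usr/local/java. The archive name below is only an example; use whatever JDK 7 64-bit build you actually downloaded.

```shell
# Extract the JDK tarball into /usr/local/java
# (the file name is an example; substitute your download)
sudo mkdir -p /usr/local/java
sudo tar -xzf jdk-7u71-linux-x64.tar.gz -C /usr/local/java --strip-components=1

# Verify the installation
/usr/local/java/bin/java -version
```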

3. Install Hadoop in Ubuntu local path /usr/local/hadoop
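Similarly, for step 3, Hadoop 2.5.2 can be fetched from the Apache archive and extracted to /usr/local/hadoop. The URL below points at the Apache release archive; any Apache mirror carrying 2.5.2 will do.

```shell
# Download and extract Hadoop 2.5.2 into /usr/local/hadoop
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.5.2/hadoop-2.5.2.tar.gz
sudo mkdir -p /usr/local/hadoop
sudo tar -xzf hadoop-2.5.2.tar.gz -C /usr/local/hadoop --strip-components=1
```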

4. Create a user and group for Hadoop execution
     sudo addgroup hadoop
     sudo adduser --ingroup hadoop hduser
     sudo chown -R hduser:hadoop /usr/local/hadoop

5. Log in as the "hduser" created above
          su - hduser

6. Now, install SSH for remote access to the Hadoop file system, and authorize the public key to avoid password prompts.
          sudo apt-get update
          sudo apt-get install ssh
          sudo apt-get install rsync
          ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
          cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

7. Check SSH access using the following command
          ssh localhost

8. Now, configure the JAVA and Hadoop PATH variables.
          vi .bashrc
          Add the following variables to the end of the .bashrc file:
                export JAVA_HOME=/usr/local/java
                export HADOOP_PREFIX=/usr/local/hadoop
                PATH=$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH
                export PATH
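After editing, reload the profile and confirm the tools are on the PATH. This assumes steps 2-3 used the install paths shown above.

```shell
# Reload .bashrc in the current shell and verify both tools resolve
source ~/.bashrc
java -version
hadoop version
```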

9. Configure the Hadoop XML files located in /usr/local/hadoop/etc/hadoop/

vi hadoop-env.sh

export JAVA_HOME=/usr/local/java

vi hdfs-site.xml 

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

vi core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

vi mapred-site.xml
(if mapred-site.xml does not exist, copy it first: cp mapred-site.xml.template mapred-site.xml)

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

vi yarn-site.xml 

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

10. Format the namenode using following command

          hdfs namenode -format

11. Create the hduser home directory in HDFS
         hdfs dfs -mkdir -p /home/hduser/

12. Start the services for the Hadoop file system

         start-dfs.sh
         start-yarn.sh
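If the daemons started correctly, `jps` should list the HDFS and YARN processes; the exact set below assumes the default single-node setup from this guide.

```shell
# List the running Hadoop daemons (PIDs will differ)
jps
# Expected entries: NameNode, DataNode, SecondaryNameNode,
# ResourceManager, NodeManager
```

In Hadoop 2.x the NameNode web UI is also reachable at http://localhost:50070 and the ResourceManager UI at http://localhost:8088.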

13. Test the file system

Create test directory
hadoop fs -mkdir /home/hduser/test

List files in test directory
hadoop fs -ls -R /home/hduser/test

Touch file in test directory
hadoop fs -touchz /home/hduser/test/test1.txt

Copy a file from the local file system
hadoop fs -copyFromLocal testlocal.txt /home/hduser/test/testlocal.txt

Cat file in hadoop
hadoop fs -cat /home/hduser/test/testlocal.txt
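To confirm the copy really round-trips, you can pull the file back out of HDFS and compare it with the local original (file names follow the examples above).

```shell
# Copy the file back from HDFS and compare with the local original
hadoop fs -get /home/hduser/test/testlocal.txt testlocal-copy.txt
diff testlocal.txt testlocal-copy.txt && echo "Files match"
```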

NOTE: In some cases, you have to disable IPv6 using the following settings.

sudo vi /etc/sysctl.conf

net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
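The new settings can be applied without a reboot and verified by reading the kernel flag back; a value of 1 means IPv6 is disabled.

```shell
# Apply /etc/sysctl.conf and check the IPv6 flag
sudo sysctl -p
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
```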

Monday, January 12, 2015

Check server status using a Socket in Java

In some cases, you have to check whether a server is running, much as the telnet command does, before executing commands against the server application. The following code can be used to check whether something is listening on the given IP and port.

import java.net.InetSocketAddress;
import java.net.Socket;

public class Telnet {

    public static void main(String[] args) {
        String ip = "192.168.*.*";
        int port = 23;
        // Attempt a TCP connection; if it succeeds, something is
        // listening on the given IP and port. Give up after 5 seconds.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(ip, port), 5000);
            System.out.println("Connected IP : " + ip + ", Port : " + port);
        } catch (Exception e) {
            System.out.println("Not Connected, check IP and Port.");
        }
    }
}