A Way to Disable SSH for Single-Node Hadoop?

My workplace uses Centrify for authentication, and for some reason when I SSH into a system it does not run .bashrc / .bash_profile. This means JAVA_HOME and a number of other variables are not set when Hadoop SSHes into the local machine to start the node. The SAs have been no help in figuring out why and refuse to remove Centrify from my development machine. Right now I'm trying to run a Hadoop instance that needs the native compression libraries, so JAVA_LIBRARY_PATH isn't set when it SSHes in. And for some reason setting it in hadoop-env.sh does not work.

Is there any way to run Hadoop in single-node mode without using SSH?

Answers


Removing SSH from the scripts is more painful than adding JAVA_HOME and similar variables to the bin/hadoop script. All daemons are started through that script, so it is the main place to make the change if hadoop-env.sh is not working.
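As a rough sketch of that idea, the lines below could be placed near the top of bin/hadoop. Both paths are assumptions: /usr/local/java is borrowed from the example .bashrc in the next answer, and the native library directory is purely hypothetical. It is also worth checking whether your version of the script reassigns JAVA_LIBRARY_PATH further down, since a later assignment would override these values.

```shell
# Fall back to fixed locations when the calling shell did not set them.
# Both paths are illustrative assumptions; adjust them to the local install.
: "${JAVA_HOME:=/usr/local/java}"
: "${JAVA_LIBRARY_PATH:=/usr/local/hadoop/lib/native}"
export JAVA_HOME JAVA_LIBRARY_PATH
```

The ${VAR:=default} form leaves any value that is already in the environment alone, so the same script still behaves normally when it is started from a fully configured login session.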


If .bash_profile and .bashrc are not getting called, it's probably because your ssh calls invoke the shell non-interactively, which has nothing to do with Centrify.

Here's an article that describes this further: http://hacktux.com/bash/bashrc/bash_profile

If you want your .bash_profile and .bashrc to be called, you need to change your ssh calls to start a login shell, as described below.

Why does an SSH remote command get fewer environment variables then when run manually?

Let me take you through an example.

Here's my .bashrc on a system running Centrify and OpenSSH:

$ id
uid=1296041358(simon_schuster) gid=1296041358(simon_schuster) groups=1296040449(domain_u),1296041358(simon_schuster)
[simon_schuster@engcen5 ~]$ cat .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# User specific aliases and functions
JAVA_HOME=/usr/local/java/
export JAVA_HOME

If I try to connect to the system using a remote ssh connection and execute the env command, this ssh connection will call a non-interactive shell and as you can see below, the JAVA_HOME variable is not shown. That's because the .bashrc is not sourced if the shell is called non-interactively.

ssh simon_schuster@engcen5.ocean.net env
CentOS release 5.5 (Final)
Kernel 2.6.18-194.el5 on an x86_64

Password:
SHELL=/bin/bash
SSH_CLIENT=192.168.81.26 44524 22
CDC_PREW2KHOST=engcen5
USER=simon_schuster
MAIL=/var/mail/simon_schuster
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/share/centrifydc/bin
PWD=/home/simon_schuster
USER_PRINCIPAL_NAME=Simon.Schuster@OCEAN.NET
KRB5CCNAME=FILE:/tmp/krb5cc_cdc1296041358_TEcxjC
CDC_JOINED_DC=dc.ocean.net
SHLVL=1
HOME=/home/simon_schuster
CDC_JOINED_SITE=Ocean-Demo
DA_DASH_DEPTH=1
LOGNAME=simon_schuster
SSH_CONNECTION=192.168.81.26 44524 192.168.81.25 22
CDC_JOINED_ZONE=CN=Global,CN=Zones,OU=Unix,DC=ocean,DC=net
CDC_LOCALHOST=engcen5.ocean.net
CDC_JOINED_DOMAIN=ocean.net
_=/usr/bin/env

Here's another session as the same user, but this time the user logs in interactively and types the env command. Notice that the JAVA_HOME variable is displayed. This is because the .bashrc is sourced when an interactive (login) shell is invoked.

[simon_schuster@engcen5 ~]$ env
HOSTNAME=engcen5.ocean.net
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.81.11 49519 22
CDC_PREW2KHOST=engcen5
SSH_TTY=/dev/pts/1
USER=simon_schuster
LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:
MAIL=/var/spool/mail/simon_schuster
PATH=/usr/kerberos/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/share/centrifydc/bin:/home/simon_schuster/bin
INPUTRC=/etc/inputrc
PWD=/home/simon_schuster
JAVA_HOME=/usr/local/java/
LANG=en_US.UTF-8
USER_PRINCIPAL_NAME=Simon.Schuster@OCEAN.NET
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
KRB5CCNAME=FILE:/tmp/krb5cc_1296041358
CDC_JOINED_DC=dc.ocean.net
SHLVL=1
HOME=/home/simon_schuster
CDC_JOINED_SITE=Ocean-Demo
DA_DASH_DEPTH=1
LOGNAME=simon_schuster
SSH_CONNECTION=192.168.81.11 49519 192.168.81.25 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
CDC_JOINED_ZONE=CN=Global,CN=Zones,OU=Unix,DC=ocean,DC=net
CDC_LOCALHOST=engcen5.ocean.net
G_BROKEN_FILENAMES=1
CDC_JOINED_DOMAIN=ocean.net
_=/usr/bin/env

To have ssh source your .bash_profile and .bashrc, you have to call bash with the --login option as shown below. As you can see, the JAVA_HOME variable is now visible when executing the env command remotely via ssh.

ssh simon_schuster@engcen5.ocean.net "bash --login -c env"
CentOS release 5.5 (Final)
Kernel 2.6.18-194.el5 on an x86_64

Password:
HOSTNAME=engcen5.ocean.net
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.81.26 58672 22
CDC_PREW2KHOST=engcen5
USER=simon_schuster
LS_COLORS=
PATH=/usr/kerberos/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/share/centrifydc/bin:/home/simon_schuster/bin
MAIL=/var/spool/mail/simon_schuster
INPUTRC=/etc/inputrc
PWD=/home/simon_schuster
JAVA_HOME=/usr/local/java/
LANG=en_US.UTF-8
USER_PRINCIPAL_NAME=Simon.Schuster@OCEAN.NET
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
KRB5CCNAME=FILE:/tmp/krb5cc_cdc1296041358_qRHqro
CDC_JOINED_DC=dc.ocean.net
HOME=/home/simon_schuster
SHLVL=2
CDC_JOINED_SITE=Ocean-Demo
DA_DASH_DEPTH=2
LOGNAME=simon_schuster
SSH_CONNECTION=192.168.81.26 58672 192.168.81.25 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
CDC_JOINED_ZONE=CN=Global,CN=Zones,OU=Unix,DC=ocean,DC=net
CDC_LOCALHOST=engcen5.ocean.net
G_BROKEN_FILENAMES=1
CDC_JOINED_DOMAIN=ocean.net
_=/usr/bin/env
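The same difference can be reproduced locally without ssh. The following is a minimal sketch, assuming bash and mktemp are available; the throwaway home directory and the variable names demo_home, plain and login are made up for the illustration, and the .bash_profile simply chains to .bashrc the way most distributions' default files do.

```shell
# Build a throwaway home directory so the real dotfiles are untouched.
demo_home=$(mktemp -d)

cat > "$demo_home/.bash_profile" <<'EOF'
# Login shells read this file; hand over to .bashrc if it exists.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
EOF

cat > "$demo_home/.bashrc" <<'EOF'
JAVA_HOME=/usr/local/java/
export JAVA_HOME
EOF

# Non-interactive, non-login shell: no startup files are read.
plain=$(env -i HOME="$demo_home" PATH="$PATH" \
    bash -c 'echo "${JAVA_HOME:-unset}"' < /dev/null)

# Non-interactive login shell: .bash_profile, and through it .bashrc, is read.
# tail keeps only the last line in case /etc/profile prints anything first.
login=$(env -i HOME="$demo_home" PATH="$PATH" \
    bash --login -c 'echo "${JAVA_HOME:-unset}"' < /dev/null | tail -n 1)

echo "plain: $plain"
echo "login: $login"

rm -rf "$demo_home"
```

env -i starts each shell from an empty environment, so a JAVA_HOME already exported in your current session cannot leak into the comparison.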

In summary, the fact that you're not getting your environment variable has nothing to do with Centrify and instead has to do with the fact that your shell is being called non-interactively.

Centrify is only responsible for authentication, access control and policy enforcement, and does not impact the default behavior of OpenSSH or your shells.

Hope this helps.

