Move a compressed file across servers to Hadoop HDFS
I have hundreds of large, LZO-compressed files sitting on a server that I want to copy to Hadoop HDFS. The command I usually use for uncompressed files is
cat someFile | ssh uname@hadoop "hadoop dfs -put - /data/someFile"
I'm assuming this won't work for compressed files (since cat perhaps doesn't make sense for binary data). Do I first need to copy the file over to the NameNode and then put it:
scp someFile.lzo uname@hadoop:~/      # on remote server
hadoop dfs -put someFile.lzo /data/   # on Hadoop server
rm ~/someFile.lzo                     # on Hadoop server
Seems like there should be a better way of doing this.
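(For what it's worth, cat is byte-exact on binary data, so piping a compressed file through ssh should stream it unchanged. A quick local sanity check, using an arbitrary sample file name:)

```shell
# Sanity check: piping binary data through cat does not alter it,
# so `cat someFile.lzo | ssh ...` streams compressed files intact.
head -c 1048576 /dev/urandom > sample.bin            # arbitrary binary sample
piped=$(cat sample.bin | md5sum | awk '{print $1}')  # checksum after piping
direct=$(md5sum sample.bin | awk '{print $1}')       # checksum of the file itself
[ "$piped" = "$direct" ] && echo "cat is binary-safe"
```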
If your client machine (the server that holds the big files) can have the Hadoop client libraries installed, you don't need to cat your file at all.
You can then copy straight into HDFS from the source server:

hadoop dfs -put someFile.lzo hdfs://hdfsipaddress:hdfsport/hdfspath

The NameNode port is usually 9000. Note that hadoop dfs -cp copies between HDFS paths, while -put copies from the local filesystem into HDFS, which is what you want here.
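Put together, the transfer becomes a single command run on the source server. A sketch, assuming the Hadoop client is installed and configured there; the NameNode host and port below are placeholders you'd replace with your cluster's values:

```shell
# Hypothetical NameNode address; substitute your cluster's host/port.
NAMENODE_URI="hdfs://namenode.example.com:9000"

# -put reads from the local filesystem and writes directly into HDFS,
# so no intermediate scp + rm on the Hadoop server is needed.
# (Shown via echo here; drop the echo to actually run it.)
echo hadoop dfs -put someFile.lzo "$NAMENODE_URI/data/someFile.lzo"
```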