Moving a huge number of small local files to HDFS using Java
I have about 20 million files stored on my local file system. Each file is about 5 KB and represents a single tweet.
They are stored as follows:
Example 1: /home/username/tweets/SCP/2014/04/11/9989443342233.txt
Example 2: /home/username/tweets/WDR/2014/02/08/5890321764568.txt
Is it possible to write a MapReduce program in Java that moves all tweets under a certain tag into a single directory in HDFS, one directory per tag?
Any similar examples?
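Build a SequenceFile first, then upload it to HDFS. HDFS copes far better with a few large files than with millions of small ones, because the NameNode keeps the metadata for every file in memory.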
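Before packing anything, the local files have to be grouped by tag. Here is a minimal sketch of that grouping step in plain Java, with no Hadoop dependency. It assumes the layout from the question (root/TAG/yyyy/MM/dd/id.txt) and takes the first directory below the root as the tag. The names TweetPathGrouper, TweetRef, parse and groupByTag are made up for illustration. Writing the records themselves would presumably go through Hadoop's SequenceFile.Writer, for instance with the tweet id as key and the file content as value, which is not shown here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups local tweet files by tag so that each group
// can later be packed into one container file per tag.
final class TweetPathGrouper {

    // Tag and tweet id recovered from a local path (illustrative type).
    static final class TweetRef {
        final String tag;
        final String tweetId;

        TweetRef(String tag, String tweetId) {
            this.tag = tag;
            this.tweetId = tweetId;
        }
    }

    // Assumes the layout <root>/<TAG>/<yyyy>/<MM>/<dd>/<id>.txt from the question.
    static TweetRef parse(Path root, Path file) {
        Path rel = root.relativize(file);
        if (rel.getNameCount() != 5) {
            throw new IllegalArgumentException("Unexpected layout: " + file);
        }
        String tag = rel.getName(0).toString();
        String name = rel.getFileName().toString();
        // Drop the ".txt" extension, if present, to get the tweet id.
        String id = name.endsWith(".txt")
                ? name.substring(0, name.length() - ".txt".length())
                : name;
        return new TweetRef(tag, id);
    }

    // Returns tag -> tweet ids, with tags in sorted order.
    static Map<String, List<String>> groupByTag(Path root, List<Path> files) {
        Map<String, List<String>> byTag = new TreeMap<>();
        for (Path file : files) {
            TweetRef ref = parse(root, file);
            byTag.computeIfAbsent(ref.tag, t -> new ArrayList<>()).add(ref.tweetId);
        }
        return byTag;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/home/username/tweets");
        List<Path> files = Arrays.asList(
                Paths.get("/home/username/tweets/SCP/2014/04/11/9989443342233.txt"),
                Paths.get("/home/username/tweets/WDR/2014/02/08/5890321764568.txt"));
        // Print each tag with the tweet ids that would go into its container.
        groupByTag(root, files).forEach(
                (tag, ids) -> System.out.println(tag + " -> " + ids));
    }
}
```

With 20 million files, you would probably stream the paths (for example with java.nio.file.Files.walk) and write each record as it is read, rather than holding every id in memory as this sketch does.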