Error while copying from S3 to HDFS

I am trying to copy some files from an S3 bucket to the HDFS of my EMR cluster, but I am getting the following error:

Exception in thread "main" java.lang.RuntimeException: Error running job
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.hadoop.util.RunJar.main(
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
    at org.apache.hadoop.mapreduce.Job$
    at org.apache.hadoop.mapreduce.Job$
    at Method)
    at org.apache.hadoop.mapreduce.Job.submit(
    ... 9 more

The command I am using is:

./elastic-mapreduce --jobflow  j-12345678 --jar /home/hadoop/lib/emr-s3distcp-1.0.jar --args '--src,s3n://my-bucket/data/,--dest,hdfs:///data/in,--srcPattern,xyz01-1-1*ped*' --step-name "Copy input files to HDFS" --wait-for-steps

I ran the sample word-count job to check whether there is any issue with HDFS, and it ran fine.

Can anyone please help me with this? If any more info is needed, please let me know and I will update the description.


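Usually it's the --srcPattern '<regex>' argument. The value is a regular expression, not a shell glob, so in xyz01-1-1*ped* each * means "zero or more of the preceding character" rather than "any characters". If the pattern matches no files, there is nothing to copy, which is the likely cause of the "Input path does not exist" error. You can also use hadoop fs -cp s3://src/file1.something /my/output/path/ to test with a single file, and then adjust your regex. Starting the pattern with .* (any character, zero or more times) should also relax the matching.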
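
To see why the original pattern may match nothing, here is a minimal Python sketch. It assumes S3DistCp applies --srcPattern as a regular expression against the whole object path (S3DistCp itself is Java, but these simple patterns behave the same in both regex flavors). The file name is hypothetical, since the question does not list the actual keys, and the relaxed pattern is only one possible rewrite.

```python
import re

# Hypothetical object path; the real file names are not given in the question.
path = "s3n://my-bucket/data/xyz01-1-1_sample_ped_01.txt"

# Pattern from the question. Read as a regex, "1*" means zero or more '1'
# characters and "d*" means zero or more 'd' characters.
glob_style = r"xyz01-1-1*ped*"

# Relaxed pattern: ".*" stands for any run of characters, including the
# bucket/prefix part at the start of the path.
regex_style = r".*xyz01-1-1.*ped.*"


def matches_whole_path(pattern: str, candidate: str) -> bool:
    """Return True if the regex matches the entire candidate string."""
    return re.fullmatch(pattern, candidate) is not None


print(matches_whole_path(glob_style, path))   # False
print(matches_whole_path(regex_style, path))  # True
```

If that assumption holds, the command in the question would use --srcPattern,.*xyz01-1-1.*ped.* inside the single-quoted --args string.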

It would be great to know whether regex non-matches get logged and, if so, where.

