ORCfile storage implementation in Pig
Does anybody know how to use ORC files as input/output in Pig? I found some support for RCFiles in elephant-bird, but it seems the ORC format is not supported. Could you please provide a sample of using Pig to read and store ORC files?
Support for ORC storage in Pig is not yet committed and is still under active development; see Apache JIRA PIG-3558. Once that work lands, you should be able to access ORC files from a Pig script like this:
load 'foo.orc' using OrcStorage(); ... store .. using OrcStorage('-c SNAPPY');
Define an HCatalog table stored as ORC using the HCat CLI. Then LOAD the relation in Pig using org.apache.hcatalog.pig.HCatLoader(), or STORE it using org.apache.hcatalog.pig.HCatStorer().
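A minimal sketch of the HCatalog route, in case it helps. The table names (orc_in, orc_out) and their columns are made up for illustration, and the script assumes both tables already exist and that Pig is started with the HCatalog jars on its classpath (for example via pig -useHCatalog):

```pig
-- Assumption: both tables were created beforehand with the HCat CLI, e.g.
--   hcat -e "CREATE TABLE orc_in  (id INT, name STRING) STORED AS ORC;"
--   hcat -e "CREATE TABLE orc_out (id INT, name STRING) STORED AS ORC;"
-- Assumption: the script is launched as: pig -useHCatalog script.pig

-- Read the ORC-backed table; the schema comes from HCatalog.
rows = LOAD 'orc_in' USING org.apache.hcatalog.pig.HCatLoader();

-- Any ordinary Pig transformation can go here (hypothetical example).
filtered = FILTER rows BY id IS NOT NULL;

-- Write to another ORC-backed table; HCatalog handles the file format.
STORE filtered INTO 'orc_out' USING org.apache.hcatalog.pig.HCatStorer();
```

With this approach the Pig script never mentions ORC at all: the storage format is a property of the table definition, so the same script would work unchanged if the tables were stored in a different format.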