Creating a partitioned external table with Hive: no data available

I have the following file on HDFS:

I create the structure of the external table in Hive:

CREATE EXTERNAL TABLE google_analytics(
  `session` INT)
PARTITIONED BY (date_string string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/flumania/google_analytics';

ALTER TABLE google_analytics ADD PARTITION (date_string = '2016-09-06') LOCATION '/flumania/google_analytics';

After that, the table structure is created in Hive but I cannot see any data:

Since it's an external table, data insertion should be done automatically, right?

Answers


Your file's columns should be in this order:

int,string

But your file's contents are in this order:

string,int

Change your file to the following:

86,"2016-08-20"
78,"2016-08-21"

It should then work. Also, using keywords (such as date) as column names is not recommended.


I think the problem was with the ALTER TABLE command. The code below, which drops the explicit LOCATION from ADD PARTITION, solved my problem:

CREATE EXTERNAL TABLE google_analytics(
  `session` INT)
PARTITIONED BY (date_string string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/flumania/google_analytics/';

ALTER TABLE google_analytics ADD PARTITION (date_string = '2016-09-06');

After these two steps, if there is a date_string=2016-09-06 subfolder under the table location containing a CSV file that matches the table's structure, Hive reads the data from it directly and you can run SELECT queries straight away.
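To check that Hive has picked up the partition, something like the following may help. It is only a sketch, assuming the table defined above and a CSV file sitting under /flumania/google_analytics/date_string=2016-09-06/. The last statement is an optional alternative to adding partitions one at a time.

```sql
-- List the partitions Hive currently knows about for this table.
SHOW PARTITIONS google_analytics;

-- Read a few rows from the partition; Hive reads the files in place.
SELECT `session`, date_string
FROM google_analytics
WHERE date_string = '2016-09-06'
LIMIT 10;

-- Alternative to ALTER TABLE ... ADD PARTITION: register every
-- date_string=... subfolder found under the table location.
MSCK REPAIR TABLE google_analytics;
```

If SHOW PARTITIONS lists the partition but the SELECT returns nothing, the file is probably not in the partition's subfolder.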

Solved!

