Get the unique word count of each word in Hive

I have a table like the following:

select * from tablename;

ID                   sentence
1              This is a sentence
2              This might be a test
3                     America
4                    This this 

I want to write a query that splits each sentence into words and returns the count of each word in descending order. I want output something like this:

word     count    Unique(ids)

This       4          3
a          2          2
might      1          1
.
.
.

where count is the number of times the word occurs in the column and Unique(ids) is the number of distinct users (IDs) whose sentence contains that word.

How can I write a query to do this?

Can anybody help me do this in Hive?

Thanks

Answers


Use a lateral view with explode():

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView

select id, word
from tablename tn lateral view explode( split( tn.sentence, ' ' ) ) tb as word;

The result will be:

1 This
1 is
1 a
1 sentence
2 This
2 might
2 be
2 a
2 test
3 America
4 This
4 this

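Then aggregate this result: group by word, using count(*) for the total number of occurrences and count(distinct id) for the number of unique IDs.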
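
The aggregation step could be sketched as below. This is only one possible way to write it, and it makes a few assumptions that are not stated in the question: words are lower-cased with lower() so that "This" and "this" are counted together (the expected output suggests this, with a count of 4 for "This"), the split uses the regex '\\s+' instead of a single space so that repeated spaces do not produce extra tokens, and empty tokens are filtered out. The aliases w, cnt and unique_ids are made up for illustration.

```sql
-- Sketch only: assumes the table from the question, tablename(id, sentence).
SELECT
    lower(w.word)        AS word,        -- assumption: case-insensitive counting
    count(*)             AS cnt,         -- total occurrences of the word
    count(DISTINCT t.id) AS unique_ids   -- number of distinct IDs containing it
FROM tablename t
LATERAL VIEW explode(split(t.sentence, '\\s+')) w AS word
WHERE w.word <> ''                       -- drop empty tokens from stray whitespace
GROUP BY lower(w.word)
ORDER BY cnt DESC;
```

With the sample table this should put "this" first with cnt 4 and unique_ids 3, followed by "a" with 2 and 2. Note that the word column comes back lower-cased; drop the lower() calls if a case-sensitive count is wanted.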

