Hadoop Data Persistence: in which format?

  1. I have some experience with Lucene, and I'm trying to understand how data is actually stored on a slave server in the Hadoop framework.

  2. Do we create an index on the slave server, with a set of attributes describing the document we are storing? How does it work in practice?

Answers


Data is split into blocks of a fixed, configurable size, and each block is then replicated to other nodes in the cluster for reliability. This process is coordinated by a single "NameNode", which keeps track of which blocks of data have gone where.
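To make the block-and-replica idea concrete, here is a toy sketch in Python. It is not Hadoop code: the names (ToyNameNode, store, read), the 8-byte block size, and the round-robin placement are all assumptions chosen for illustration, and real HDFS uses much larger blocks and a more involved placement policy.

```python
BLOCK_SIZE = 8     # bytes; a tiny illustrative value, not a Hadoop default
REPLICATION = 3    # copies kept of each block (assumed for this sketch)


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut a byte string into consecutive fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class ToyNameNode:
    """Holds metadata only: which nodes store which block of which file."""

    def __init__(self, data_nodes, replication=REPLICATION):
        self.data_nodes = list(data_nodes)
        self.replication = min(replication, len(self.data_nodes))
        self.block_map = {}  # (path, block index) -> list of node names

    def place(self, path, num_blocks):
        """Pick nodes for every block (simple round-robin, an assumption)."""
        total = len(self.data_nodes)
        for idx in range(num_blocks):
            self.block_map[(path, idx)] = [
                self.data_nodes[(idx + r) % total]
                for r in range(self.replication)
            ]


def store(path, data, name_node, storage):
    """Split data, ask the name node where blocks go, then copy them there."""
    blocks = split_into_blocks(data)
    name_node.place(path, len(blocks))
    for idx, block in enumerate(blocks):
        for node in name_node.block_map[(path, idx)]:
            storage[node][(path, idx)] = block
    return len(blocks)


def read(path, name_node, storage):
    """Reassemble a file by fetching each block from its first replica."""
    parts = []
    idx = 0
    while (path, idx) in name_node.block_map:
        first_replica = name_node.block_map[(path, idx)][0]
        parts.append(storage[first_replica][(path, idx)])
        idx += 1
    return b"".join(parts)


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
    storage = {name: {} for name in nodes}            # stands in for local disks
    name_node = ToyNameNode(nodes)
    store("/demo.txt", b"hello hadoop block storage", name_node, storage)
    print(name_node.block_map)
```

Note that the name node in this sketch never holds file contents, only the block-to-node map. The bytes live on the storage nodes, and no per-document index (as in Lucene) is built.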

Hadoop provides you with a virtual filesystem, similar to Unix, which you can query using various Hadoop filesystem tools (ls, get, put, etc.).
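As a rough illustration of those tools, the sketch below drives the `hadoop fs` command line from Python. The helper names (hadoop_fs, run) and all paths are hypothetical. It assumes the `hadoop` binary is on your PATH and quietly does nothing if it is not.

```python
import shutil
import subprocess


def hadoop_fs(action, *args):
    """Build the argument list for a `hadoop fs -<action> ...` call."""
    return ["hadoop", "fs", "-" + action, *args]


def run(cmd):
    """Run cmd if its executable is installed; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    commands = [
        hadoop_fs("put", "local.txt", "/user/demo/local.txt"),  # upload
        hadoop_fs("ls", "/user/demo"),                          # list directory
        hadoop_fs("get", "/user/demo/local.txt", "copy.txt"),   # download
    ]
    for cmd in commands:
        print(" ".join(cmd))
        run(cmd)
```

The same three commands can of course be typed directly into a shell. The Python wrapper is only there to show the shape of the calls.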

This link should give you a comprehensive overview.

