Adding new, bigger hard drives to datanodes

I am running an HDFS cluster with several datanodes; each datanode has 8 x 1TB hard drives.

I want to add 2 x 2TB hard drives to each datanode. I know how to add new hard drives to a datanode, but I am concerned that because the new drives are bigger than the old ones, there may be problems with data distribution among the drives on a datanode.

I am considering creating two 1TB logical volumes on each 2TB drive and mounting them to the OS, so that every datanode data path has the same capacity.

I would appreciate some advice. Thanks for reading!

Answers


If you have mixed-size disks in a datanode, a common problem is that the smaller disks fill up before the bigger ones. This is because the datanode's default volume choosing policy is round robin: the datanode writes new data to each disk in turn, taking no account of the disks' sizes or free space.

There is an alternative volume choosing policy, AvailableSpaceVolumeChoosingPolicy, which is ideal for datanodes with mixed-size disks. I am not sure which Hadoop distribution you are using, but the CDH documentation is here:

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/admin_dn_storage_balancing.html#concept_tws_bbg_2r

If you switch to that policy, then by default 75% of new writes go to the under-used disks until they catch up with the other disks, after which it falls back to round-robin writes.
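For reference, the switch looks roughly like this in hdfs-site.xml. The property names are the stock Apache Hadoop ones (also covered in the linked CDH page); the values shown are the defaults, so treat this as a sketch to adjust rather than a recommended tuning:

```xml
<!-- Switch the datanode from round robin to available-space-based
     volume choosing. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>

<!-- Disks whose free space differs by less than this many bytes are
     considered balanced; 10 GB is the default. -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>

<!-- Fraction of new block writes sent to the disks with more free
     space; 0.75 is the default mentioned above. -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

The datanodes need to be restarted to pick up the new policy.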

