java - Hadoop Map task / Mapper object
As I understand it, the following properties define the number of map/reduce task slots on a data node (TaskTracker): mapred.tasktracker.map.tasks.maximum and mapred.map.tasks.
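For reference, a minimal sketch of how these two properties are usually touched. The slot limit is normally a cluster-side setting in mapred-site.xml, shown here programmatically only for illustration; the property names are the old MapReduce 1 names and the numeric values are just examples.

import org.apache.hadoop.conf.Configuration;

public class SlotConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Upper bound on how many map tasks one TaskTracker runs in parallel
        // (the "map slots"). Normally set once per node, not per job.
        conf.setInt("mapred.tasktracker.map.tasks.maximum", 4);
        // A per-job *hint* for the number of map tasks; the framework may
        // ignore it, because the real count comes from the input splits.
        conf.setInt("mapred.map.tasks", 20);
        System.out.println("map slots per node = "
                + conf.getInt("mapred.tasktracker.map.tasks.maximum", 2));
    }
}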
In addition, the number of Mapper objects is determined by the number of input splits in the MapReduce job. The framework creates a Mapper object for each split, applies the map function to it, and tries to run it as close as possible to the data blocks (data locality).
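To make the "one Mapper object per input split" idea concrete, here is a minimal sketch of a Mapper using the Hadoop new API. The class name WordLengthMapper and the output types are illustrative choices, not from the original question.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// The framework instantiates one Mapper (one map task) per input split
// and feeds it every record in that split.
public class WordLengthMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Emit (line, length) just to have something observable per record.
        context.write(line, new IntWritable(line.getLength()));
    }
}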
So what is the difference between the map task slots and the Mapper objects created by the framework?
Let's say I am storing a 2 GB file across 5 data nodes, so each node holds 400 MB. If I define
dfs.block.size = 100 MB, then each node will hold 400/100 = 4 data blocks. From those 4 data blocks we can ideally have 4 input splits and hence 4 Mapper objects per node (20 in total). And at the same time, how do the map task slots defined on each node relate to these Mapper objects?
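A quick back-of-the-envelope check of that arithmetic. The numbers mirror the example above (2 GB taken as 2000 MB so the division is exact); the class and variable names are mine.

public class SplitMath {
    public static void main(String[] args) {
        long fileSizeMb = 2000;   // ~2 GB file from the example
        int dataNodes = 5;        // file spread over 5 data nodes
        long blockSizeMb = 100;   // dfs.block.size

        long perNodeMb = fileSizeMb / dataNodes;        // 400 MB per node
        long blocksPerNode = perNodeMb / blockSizeMb;   // 400 / 100 = 4 blocks
        long totalSplits = blocksPerNode * dataNodes;   // 4 * 5 = 20 input splits
        // One Mapper object (one map task) per split:
        System.out.println("blocks/splits per node = " + blocksPerNode);
        System.out.println("total map tasks        = " + totalSplits);
    }
}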
Map slots determine how many map tasks a TaskTracker can run in parallel. The number of map tasks itself is determined by the input splits, and the slot setting does not change it. If there are more map tasks than available map slots, the extra map tasks wait in the queue until running tasks finish and free up slots.
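A rough way to picture the slots-versus-tasks relationship for the example above. This is purely illustrative arithmetic; the slot count of 2 per node and the variable names are my assumptions.

public class SlotsVsTasks {
    public static void main(String[] args) {
        int mapTasks = 20;    // 20 input splits -> 20 map tasks, fixed by the input
        int nodes = 5;
        int slotsPerNode = 2; // mapred.tasktracker.map.tasks.maximum on each node

        int totalSlots = nodes * slotsPerNode;                // 10 tasks can run at once
        int waves = (mapTasks + totalSlots - 1) / totalSlots; // ceil(20 / 10) = 2 "waves"
        System.out.println("concurrent map tasks = " + totalSlots);
        System.out.println("scheduling waves     = " + waves);
        // The remaining tasks simply wait until slots free up; changing
        // slotsPerNode changes concurrency, not the total of 20 map tasks.
    }
}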