java - Hadoop Map task/Map object


As I understand the theory, the following parameters define the number of map/reduce task slots on a data node: mapred.tasktracker.map.tasks.maximum | mapred.map.tasks.

In addition, the number of mapper objects is determined by the number of input splits in the MapReduce job. We implement the map/reduce functions, and the framework creates the objects and sends them as close as possible to the data blocks.

So what is the difference between map task slots and the mapper objects created by the framework?

Let's say I am storing a 2 GB file across 5 data nodes, so each node receives 400 MB. If I define dfs.block.size = 100 MB, then each node will hold 400/100 = 4 data blocks. From these 4 data blocks we can ideally have 4 input splits and, in turn, 4 mapper objects per node. At the same time, if I set mapred.tasktracker.map.tasks.maximum = 2 and mapred.map.tasks = 2, what can I conclude from that? Can I say that the 4 mapper objects are shared among 2 map task slots? I might be going in the wrong direction, so any clarification would be helpful.

Map slots determine how many map tasks a TaskTracker can run at the same time. The number of map tasks is determined by the input splits, and you cannot change it. If there are more map tasks than map slots, some map tasks wait until other tasks finish and free a slot.
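To make the numbers from the question concrete, here is a minimal sketch in plain Java. It does not use any Hadoop API; the class `MapSlotSketch` and its methods `splitsFor` and `waves` are hypothetical names invented for this illustration. It assumes one input split per block and that all tasks for a node's blocks run on that node, which real scheduling does not guarantee.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one TaskTracker: map tasks queue up for a fixed number of map slots. */
class MapSlotSketch {

    /** Number of blocks (assumed here to equal the number of input splits) for the given data. */
    static long splitsFor(long dataMb, long blockSizeMb) {
        return (dataMb + blockSizeMb - 1) / blockSizeMb; // ceiling division
    }

    /**
     * Groups task ids into "waves" of at most mapSlots tasks each.
     * Simplification: a real TaskTracker starts the next task as soon as
     * any slot frees up, rather than waiting for a whole wave to finish.
     */
    static List<List<Integer>> waves(int mapTasks, int mapSlots) {
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start < mapTasks; start += mapSlots) {
            List<Integer> wave = new ArrayList<>();
            int end = Math.min(start + mapSlots, mapTasks);
            for (int id = start; id < end; id++) {
                wave.add(id);
            }
            result.add(wave);
        }
        return result;
    }

    public static void main(String[] args) {
        // 400 MB per node, 100 MB blocks, 2 map slots (values from the question).
        int tasks = (int) splitsFor(400, 100);
        List<List<Integer>> schedule = waves(tasks, 2);
        System.out.println(tasks + " map tasks, 2 slots -> " + schedule);
    }
}
```

With the values from the question this prints `4 map tasks, 2 slots -> [[0, 1], [2, 3]]`: four map tasks (one mapper object each) exist for the node, but only two run at any moment, so the other two wait for a free slot.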
