
I need to get the hosts (that is, the actual machines) on which the different tasks (mappers and reducers) of a Hadoop job run. I have a long-running job and need to retrieve the hosts where its tasks are currently running. I need this information in an external program, not inside the job itself.

I know that I can use hadoop job -list-attempt-ids job_201307251119_0004 map running to get the task attempts, but this does not show me the hosts.

I also know that I can use the JobClient to retrieve the host of a finished task. But in my case, the task is still running.

The only solution that came to mind was to parse the JobTracker HTTP interface's HTML page, which contains the host in the URLs that point to the log files. But this does not seem like the right way to go. What are the alternatives?


1 Answer


Since you want the hostname where the mapper/reducer is currently running, you could write a few extra lines of Java in the mapper/reducer itself to find it. Maybe:

String hostname = java.net.InetAddress.getLocalHost().getHostName();

IDK if this is exactly what you need.
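
For what it's worth, here is a minimal sketch of how that one-liner might be wrapped, since getLocalHost() throws the checked UnknownHostException. The class name HostnameLookup, the method localHostname() and the "unknown-host" fallback are made up for illustration and are not part of Hadoop. It has no Hadoop dependency, so it only shows the lookup itself, not how an external program would read the value.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not a Hadoop class.
public class HostnameLookup {

    // Returns the name of the machine this JVM runs on,
    // or a placeholder if the local host cannot be resolved.
    static String localHostname() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Assumed fallback value, chosen only for this sketch.
            return "unknown-host";
        }
    }

    public static void main(String[] args) {
        System.out.println("task host: " + localHostname());
    }
}
```

Inside a job you would presumably call something like localHostname() once when the task starts and write the result somewhere the task already reports to, such as its log output.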

answered 2013-07-25T21:48:53.237