4

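What is a good site for Hadoop best practices? I am looking for a website, not a book, that gives a step-by-step process for creating a new project or a small example. I have not been able to find a single site like this. Please share.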

2 Answers

0

Hadoop is not a single application. It is a distributed processing framework used by several applications that sit on top of it. Pig, Hive, and HBase are a few of the many such applications, each designed for a specific requirement; Cassandra is often mentioned alongside them, although it uses its own storage layer and integrates with Hadoop rather than running on HDFS. Underneath, these applications rely on the Hadoop framework, which mainly consists of a distributed file system (HDFS) and distributed processing (MapReduce).

Technically, once you have a bare-minimum Hadoop cluster (HDFS + MapReduce only), you can start writing MapReduce-based applications to process data, either in Java or in other languages supported through Hadoop Streaming.
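To make the Hadoop Streaming route concrete, here is a minimal word-count sketch in Python. It assumes the usual Streaming contract: the mapper and reducer read lines from standard input and write tab-separated key/value lines to standard output, and the framework sorts the mapper output by key before the reducer sees it. The function names and the `map`/`reduce` command-line argument are my own illustrative choices, not part of Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive records that share a key.

    This relies on the input being sorted by key, so that identical
    words arrive next to each other.
    """
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in sorted_lines
        if line.strip()  # skip blank lines
    )
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention: the first argument selects the phase.
# Without an argument the script does nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved as, say, `wordcount.py` (a hypothetical file name), the same logic can be tried without a cluster by letting the shell play the role of the shuffle step: `cat input.txt | python wordcount.py map | sort | python wordcount.py reduce`. On a real cluster the two phases would be passed to the Hadoop Streaming jar as the mapper and reducer commands.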

What you could do is first download a prebuilt, preconfigured Hadoop virtual machine image from the Cloudera or Hortonworks distribution and get it running on your machine. After that, start learning to write MapReduce jobs in Java and run them in the virtual machine.

Here is the URL to download the Cloudera Hadoop Distribution VM.

Here is the link to learn how to write the simplest wordcount job.

Answered 2013-04-02T22:26:07.057