Betriebssysteme » Linux » Installing and Using Hadoop (ID 25540)
| Sprache: Englisch | |
| Skill: | |
| Rating: | |
| Beschreibung: Hadoop is an open-source distributed computing framework built by the Apache project. It is useful for processing large datasets across one or more computers and includes custom filesystem to store data including replication across multiple nodes. There is no direct access to HDFS filesystem (you can't mount it) included in Hadoop. The flow of a MapReduce job. First the data is split into chunks to be processed in parrallel in the Map job. Then the hadoop framework then takes that data and feeds that into the reduce job. Next the reduce job aggregates all the map jobs back into a single data set. This is discussed in more detail in the streaming jobs section. |
|
Zurück - Tutorial bewerten - Skill bewerten - Tutorial übersetzen - Link defekt!?

