This is the second lab: writing and running a basic Hadoop program.
The program is the basic word count discussed in the lecture.
Install the mrjob library, then run the mrjob.py program, following the quickstart guide:
https://pythonhosted.org/mrjob/guides/quickstart.html
Alter the example program to produce a word count, i.e. the number of times each word occurs.
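As a rough guide to what the altered program has to do, here is a minimal sketch of the map and reduce steps in plain Python, with no mrjob dependency. The names WORD_RE and run_local, the word-splitting regex, and the lowercasing are assumptions made for illustration, not part of the lab's example program; run_local only imitates the grouping that Hadoop performs between the map and reduce phases.

```python
import re
from collections import defaultdict

# Assumed definition of a "word": runs of letters, digits, underscores, apostrophes.
WORD_RE = re.compile(r"[\w']+")


def mapper(_, line):
    """Emit (word, 1) for every word on the line; the input key is ignored."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def reducer(word, counts):
    """Sum all the 1s that were emitted for a single word."""
    yield word, sum(counts)


def run_local(lines):
    """Hypothetical local driver: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(None, line):
            grouped[word].append(count)
    result = {}
    for word in sorted(grouped):
        for key, total in reducer(word, grouped[word]):
            result[key] = total
    return result
```

In mrjob, the same two functions would become the mapper(self, _, line) and reducer(self, word, counts) methods of an MRJob subclass, and the local driver would be replaced by calling the subclass's run() method under the usual __main__ guard.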
hadoop classpath
prints the classpath of the requisite Hadoop libraries, which you need
if you are compiling from the command line.
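One way to use that output, sketched in Python under stated assumptions: capture what hadoop classpath prints and pass it to javac through its -classpath option. The source file name WordCount.java, the output directory classes, and the helper names are hypothetical; substitute whatever the lab's instructions specify.

```python
import subprocess


def hadoop_classpath():
    """Return the classpath string printed by `hadoop classpath`.

    Requires the hadoop command to be on your PATH.
    """
    result = subprocess.run(
        ["hadoop", "classpath"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def javac_command(classpath, sources, out_dir="classes"):
    """Build the javac argument list that compiles sources against classpath."""
    return ["javac", "-classpath", classpath, "-d", out_dir, *sources]


# Example usage (not run here because it needs Hadoop and a JDK installed):
#   cmd = javac_command(hadoop_classpath(), ["WordCount.java"])
#   subprocess.run(cmd, check=True)
```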
Follow the instructions here to make and run the jar
- Run this on Elastic MapReduce (you'll need API keys from your AWS console).