This project shows how to collect data from the Public Data Portal of Korea (https://www.data.go.kr/), Statistics Korea (http://kostat.go.kr/portal/korea/index.action), and other websites, store it in a Hadoop cluster, and analyze it to build a model that predicts monthly manpower demand on Korean pear farms located in Naju and Cheonan.
It also provides a web service where users can view each year's manpower predictions and upload recruitment or job-application notices.
Tools & Frameworks used:
- DataCollecting: Selenium, Pandas, BeautifulSoup, urllib
- DataProcessing: Pandas
- DataManaging & Storing: Hadoop Cluster, Spark, Sqoop, Flume
- RDBMS: MySQL, OracleDBMS
- DataAnalysis: R (Multiple Regression), Keras (RNN LSTM)
- WebService: SpringMVC, ApacheTomcat
- OS: CentOS, Ubuntu
- DevelopmentTool: Jupyter, Zeppelin, RStudio
- CommunicationTool: Trello, Slack, Spring SVM
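
To give a feel for the collect-and-process step described at the top, here is a minimal sketch in Python. It is not the project's actual code: the HTML snippet, the column names (`month`, `region`, `workers`), and the helper names `parse_table` and `monthly_totals` are all hypothetical, and the figures are made up. To stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup and plain dictionaries in place of Pandas.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical sample standing in for a downloaded page; the numbers are made up.
SAMPLE_HTML = """
<table>
  <tr><th>month</th><th>region</th><th>workers</th></tr>
  <tr><td>2020-04</td><td>Naju</td><td>120</td></tr>
  <tr><td>2020-04</td><td>Cheonan</td><td>80</td></tr>
  <tr><td>2020-05</td><td>Naju</td><td>150</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the cell text of every <tr> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def parse_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def monthly_totals(records):
    """Sum the (hypothetical) 'workers' column per month across regions."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["month"]] += int(rec["workers"])
    return dict(sorted(totals.items()))


records = parse_table(SAMPLE_HTML)
totals = monthly_totals(records)
print(totals)
```

In the real pipeline the HTML would come from a request to one of the sources above, and the aggregated rows would be written out for loading into the Hadoop cluster rather than printed.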