
food-chi's Introduction

Food

This is a group project for CS229, in the Natural Language Processing category.

More details are in the project poster.

Motivation

While there are many recommendation systems for movies, audio, books, and restaurants, there are surprisingly few for dishes. As food lovers, we address this gap by exploring models that recommend Chinese dishes to users.
We scraped data from a popular Chinese recipe website and app, Xia Chu Fang Cooking Recipe, and implemented word embedding and recommender system algorithms to build our model.
The input to our algorithm is a set of Xia Chu Fang users together with the dish names saved in their favorites lists.
We then use Word2Vec and collaborative filtering to build the recommendation system. Specifically, we use the Skip-Gram model in Word2Vec to compute dish similarity, and we apply Non-Negative Matrix Factorization and Singular Value Decomposition in collaborative filtering to predict ratings on dishes. The top recommendations for each user are then generated from these predicted ratings.
Limiting our dishes to Chinese cuisine lets us build a more tailored recommendation system while still maintaining high practical value, given the rising popularity and diversity of Chinese cuisine. Through this project, we hope to tackle the less explored topic of using machine learning for dish recommendation and to promote Chinese food.

Method

1. Data crawling

chef_spider.py

  • Utilize parallel crawling and proxies to fetch data more efficiently
  • We ran this spider on Google Cloud using two virtual machines, each with 8 vCPUs.
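The idea of parallel crawling through rotating proxies can be sketched as follows. This is only an illustration under stated assumptions: the function names `fetch` and `crawl`, the round-robin proxy assignment, and the thread pool size are hypothetical and are not taken from chef_spider.py.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from urllib.request import ProxyHandler, build_opener


def fetch(url, proxy=None, timeout=10):
    """Download one page, optionally through an HTTP(S) proxy."""
    handlers = [ProxyHandler({"http": proxy, "https": proxy})] if proxy else []
    opener = build_opener(*handlers)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, proxies, fetch_fn=fetch, max_workers=8):
    """Fetch all URLs in parallel, assigning proxies round-robin.

    Returns a dict mapping each URL to its page body, or to None if the
    request raised an exception.
    """
    # Pair URLs with proxies up front so the assignment is deterministic.
    proxy_pool = cycle(proxies) if proxies else cycle([None])
    jobs = [(url, next(proxy_pool)) for url in urls]

    def safe_fetch(job):
        url, proxy = job
        try:
            return url, fetch_fn(url, proxy)
        except Exception:
            # A failed page should not stop the rest of the crawl.
            return url, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe_fetch, jobs))
```

Passing `fetch_fn` as a parameter keeps the scheduling logic testable without network access; a real crawler would also need rate limiting and retries, which are omitted here.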

2. Map to fixed dictionary

preprocess.py

  • Map users' favorite recipes into a fixed-length dictionary, which has 1871 unique keys representing recipe names.
  • The dictionary also stores the English translation of each dish's Chinese recipe name.
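A minimal sketch of this mapping step, assuming each user's favorites arrive as a list of recipe names. The helper names and the three sample dishes are hypothetical illustrations, not the contents of preprocess.py or of the real 1871-key dictionary.

```python
def build_vocab(recipe_names):
    """Assign each unique recipe name a stable integer index."""
    vocab = {}
    for name in recipe_names:
        if name not in vocab:
            vocab[name] = len(vocab)
    return vocab


def encode_favorites(favorites, vocab):
    """Turn one user's favorites list into a fixed-length 0/1 vector."""
    vector = [0] * len(vocab)
    for name in favorites:
        index = vocab.get(name)
        if index is not None:  # skip recipes outside the dictionary
            vector[index] = 1
    return vector


# Illustrative entries only; the translation table is kept alongside the vocab.
translations = {"麻婆豆腐": "Mapo tofu", "宫保鸡丁": "Kung Pao chicken", "红烧肉": "Red-braised pork"}
vocab = build_vocab(translations)
user_vector = encode_favorites(["红烧肉", "麻婆豆腐"], vocab)
```

Every user ends up with a vector of the same length, so the vectors can be stacked into a user-by-dish matrix for the models in the next step.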

3. Train models

Word2Vec

This content-based method can mitigate the cold-start problem, and it intuitively gives recommendations based on

  • your input keywords
  • other users' tastes in our database
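To make the Skip-Gram idea concrete, the sketch below shows how (center, context) training pairs could be drawn from one user's favorites list and how dishes could be ranked by cosine similarity once embeddings exist. It assumes each favorites list is treated as a "sentence" of dish names; that assumption, the window size, and all function names are illustrative and not taken from word2vec.py.

```python
import math


def skipgram_pairs(sequence, window=2):
    """List (center, context) pairs within `window` positions of each item."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, embeddings, top_n=3):
    """Rank all other dishes by cosine similarity to the query dish."""
    scores = [
        (name, cosine(embeddings[query], vector))
        for name, vector in embeddings.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]
```

In a trained model the `embeddings` dict would hold the learned dish vectors; here it is just a mapping from dish name to a list of floats.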

word2vec.py

Usage

  1. Download our repo, including the trained models in the models directory.
  2. Run the script by typing python word2vec.py

This is our sample output: (image: word2vec)

Tips

  • You may need to install the dependencies first by running pip install -r requirements.txt.

  • To suppress warnings, run python -W ignore word2vec.py instead.

  • The t-SNE visualizations are in the res folder. The two figures (here and here) were generated with different seeds.

    We also include a sample model trained on only part of the data.

Collaborative filtering

  • NMF (Non-negative Matrix Factorization)
  • SVD (Singular Value Decomposition)
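As a rough illustration of how matrix factorization yields predicted ratings and top recommendations, here is a small pure-Python sketch that learns user and item factors by stochastic gradient descent. This is a swapped-in, simplified technique for illustration: it is not the NMF or SVD solver used in the project, and the function names and hyperparameters are assumptions.

```python
import random


def factorize(ratings, n_users, n_items, rank=2, epochs=200, lr=0.05, reg=0.01, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating is the dot product of user and item factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def top_n(P, Q, user, seen, n=2):
    """Rank items the user has not seen by predicted rating."""
    scores = [(predict(P, Q, user, i), i) for i in range(len(Q)) if i not in seen]
    return [i for _, i in sorted(scores, reverse=True)[:n]]
```

Usage follows the same pattern regardless of the solver: fit the factors on observed (user, dish, rating) triples, then call `top_n` for a user while excluding dishes already in their favorites.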

We implement collaborative filtering mainly through matrix factorization, which gives us a more quantitative way to evaluate our model with metrics such as recall. See more details here.

TL;DR

Model | Dev Set RMSE | Test Set RMSE | Dev Set Recall | Test Set Recall
NMF   | 0.4574       | 0.5851        | 0.5081         | 0.5393
SVD   | 0.3317       | 0.3634        | 0.9173         | 0.9301
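For reference, the two metrics in the table can be computed along these lines. This is a sketch under assumptions: it treats the true labels as binary (1 means the dish is in the user's favorites) and counts a prediction as positive when it reaches a hypothetical threshold of 0.5, which may differ from the evaluation actually used to produce the numbers above.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def recall(predicted, actual, threshold=0.5):
    """Fraction of truly liked items (actual == 1) predicted at or above threshold."""
    relevant = [p for p, a in zip(predicted, actual) if a == 1]
    if not relevant:
        return 0.0
    hits = sum(1 for p in relevant if p >= threshold)
    return hits / len(relevant)
```

Lower RMSE and higher recall are better, which is why SVD comes out ahead of NMF on both the dev and test sets.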

Contributors

  • yj022011
  • zengyu714
