oxford-cs-deepnlp-2017 / practical-2



Oxford Deep NLP 2017 course - Practical 2: Text Classification

Home Page: https://www.cs.ox.ac.uk/teaching/courses/2016-2017/dl/

nlp natural-language-processing deep-learning machine-learning oxford


practical-2's Issues

Where can I get access to the TED talks dataset used in this practical?

I would like to use the TED talks dataset for research purposes. Where is a good place to access the data? Also, are there any related publications that use this data?

Link to dataset

Can you guys put this link inside the practical?

http://www.clg.ox.ac.uk/tedcorpus

the first question

My name is Quei-An, an MSc student. For the first question of practical 2, I don't fully understand what "starting from random embeddings" means. You mention GloVe afterwards, so I suppose it refers to word embeddings, but how can we train all the word embeddings with such a limited amount of data (only ~2000 talks)?

I tried to implement the model for the classification problem, but I ran into two issues. First, I can only feed in one input at a time, because each document has a different size (number of words). Second, there is so little data that the accuracy turns out to be very poor. You then mention "training in batches", which is where I started getting confused.

Could you please clarify?

If I use fixed word representations (e.g. word2vec), everything seems OK: I get about 56% accuracy after trying different hyperparameters.

Thank you and have a nice weekend.
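One common way to batch documents of different lengths is to pad each one to the longest document in the batch and keep a mask, so that padding does not affect the document vector. The sketch below is a minimal, framework-free illustration of that idea with randomly initialised embeddings. The names `PAD_ID`, `random_embeddings`, `pad_batch` and `masked_mean` are hypothetical, the token ids are toy values rather than real TED data, and this is not necessarily the approach the practical intends.

```python
import random

PAD_ID = 0  # assumption: vocabulary index 0 is reserved for padding


def random_embeddings(vocab_size, dim, seed=0):
    """Randomly initialised embedding table; the padding row is all zeros."""
    rng = random.Random(seed)
    table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
             for _ in range(vocab_size)]
    table[PAD_ID] = [0.0] * dim
    return table


def pad_batch(docs):
    """Pad token-id lists to the longest document; return ids and a 0/1 mask."""
    max_len = max(len(d) for d in docs)
    ids = [d + [PAD_ID] * (max_len - len(d)) for d in docs]
    mask = [[1] * len(d) + [0] * (max_len - len(d)) for d in docs]
    return ids, mask


def masked_mean(ids, mask, table):
    """Average the embeddings of the real (non-padding) tokens per document."""
    dim = len(table[0])
    out = []
    for row_ids, row_mask in zip(ids, mask):
        total = [0.0] * dim
        for tok, keep in zip(row_ids, row_mask):
            if keep:
                for j in range(dim):
                    total[j] += table[tok][j]
        n = max(sum(row_mask), 1)  # guard against empty documents
        out.append([v / n for v in total])
    return out


# Toy batch: three "documents" of different lengths (hypothetical token ids).
docs = [[3, 5, 2], [4], [1, 2, 3, 4, 5]]
table = random_embeddings(vocab_size=6, dim=4)
ids, mask = pad_batch(docs)
doc_vectors = masked_mean(ids, mask, table)  # one fixed-size vector per document
```

In a framework such as PyTorch, the table would typically be a trainable `nn.Embedding` layer, so "starting from random embeddings" would mean that these random vectors are updated by backpropagation together with the classifier weights, rather than being loaded from a pretrained model such as GloVe.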

my neural net only predicts 'ooo'.

I implemented the most basic neural net (following the instructions), and it is not performing very well.
I'm using a bag-of-means document embedding, built on a Word2Vec model trained on the TED text.

I suspect that I have some sort of bug, as I'm a beginner with PyTorch. If the instructors don't mind, I'd like to share my code: [removed]
It is mostly modeled off this tutorial.
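A classifier that always outputs a single label is often a sign of class imbalance rather than a coding bug, so one quick check is to compare the model against a majority-class baseline. The sketch below assumes that 'ooo' is simply the most frequent label. The label list is toy data, the label names other than 'ooo' are placeholders, and the helper names `majority_baseline` and `inverse_frequency_weights` are hypothetical.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)


def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class frequency.

    Scaled so that a perfectly balanced dataset gives every class weight 1.0.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {lab: total / (num_classes * c) for lab, c in counts.items()}


# Toy label distribution (not the real TED label counts).
labels = ["ooo"] * 6 + ["Too"] * 2 + ["oEo"] + ["ooD"]

label, accuracy = majority_baseline(labels)
weights = inverse_frequency_weights(labels)

print(f"majority label: {label}, baseline accuracy: {accuracy:.2f}")
for lab, w in sorted(weights.items()):
    print(f"{lab}: weight {w:.3f}")
```

If the network's accuracy matches this baseline, it has probably collapsed onto the majority class. Weights like these could then be passed to a weighted loss (for example the `weight` argument of PyTorch's `nn.CrossEntropyLoss`) as one possible remedy, though whether that helps here is untested.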
