Yoga-User-Network-YUN

This repository contains the code for the paper "Do You Do Yoga? Understanding Twitter Users' Types and Motivations using Social and Textual Information", IEEE ICSC 2021.

Data:

  1. Download the 'data' folder from the following link (available on request):

[YUN Data]

  2. Keep the 'data' folder inside the 'Yoga-User-Network-YUN/code/' folder.

data/yoga_user_name_loc_des_mergetweets_yoga_1300_lb.csv contains 6 columns: name, location, description, text, utype, umotivation

  3. Download the pre-trained Word2Vec model from this link and save it inside the 'code/pre-trained/' folder:

[Word2Vec]

code/pre-trained/GoogleNews-vectors-negative300.bin

  4. The pre-trained emoji2vec.bin is already inside the 'code/pre-trained/' folder:

code/pre-trained/emoji2vec.bin

  5. The train and test data are inside the 'data' folder:

data/train.txt

data/test.txt
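
As a quick sanity check on the labelled CSV, the sketch below reads only the header row and reports which of the six columns listed above are absent. It uses the standard library only. The helper name `missing_columns` is hypothetical (it is not part of this repository), and the sketch assumes the file is UTF-8 encoded and has a header row.

```python
import csv

# Column names as listed in the Data section of this README.
EXPECTED_COLUMNS = ["name", "location", "description", "text", "utype", "umotivation"]


def missing_columns(csv_path, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header.

    Assumption: the first row of the file is a header row.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    present = {name.strip() for name in header}
    return [col for col in expected if col not in present]
```

When run from the 'code' directory, `missing_columns("data/yoga_user_name_loc_des_mergetweets_yoga_1300_lb.csv")` should return an empty list if the file matches the description above.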

Computing Machine:

Machine: MacBook Pro, Processor: 2.5 GHz Dual-Core Intel Core i7, Memory: 16 GB 2133 MHz LPDDR3

Software Packages and libraries:

Python 3.6.6
PyTorch 1.1.0
Jupyter Notebook
pandas
gensim
nltk
spacy
emoji
sklearn
matplotlib
numpy
preprocessor
transformers

Run all code from the 'code' directory.

cd code
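
Before starting the long-running steps, it can help to confirm that the files named in the Data section are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the paths are the ones listed in this README, written relative to the 'code' directory.

```python
from pathlib import Path

# Paths taken from the Data section, relative to the 'code' directory.
REQUIRED_FILES = [
    "data/yoga_user_name_loc_des_mergetweets_yoga_1300_lb.csv",
    "data/train.txt",
    "data/test.txt",
    "pre-trained/GoogleNews-vectors-negative300.bin",
    "pre-trained/emoji2vec.bin",
]


def missing_files(code_dir=".", required=REQUIRED_FILES):
    """Return the required relative paths that do not exist under code_dir."""
    root = Path(code_dir)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    # Prints nothing when every file is present.
    for rel in missing_files():
        print("missing:", rel)
```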

Construct User Networks:

  1. Create the user graphs:
python create_yoga_graph.py data/yoga_user_mentioned_yoga_1300_lb.txt

The user graphs are saved in the data/usergraph folder.

  2. Merge all user graphs into a single file:
cat data/usergraph/*.txt > data/yoga_usergraph.txt
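
On systems without `cat` (for example, a plain Windows shell), the merge step can be approximated in Python. This is a sketch under the assumption that a byte-for-byte concatenation in sorted filename order is acceptable; the function name `merge_usergraphs` is hypothetical, and the default paths are the ones used above.

```python
import glob
import os
import shutil


def merge_usergraphs(graph_dir="data/usergraph", out_path="data/yoga_usergraph.txt"):
    """Concatenate every .txt file in graph_dir into out_path.

    Files are appended in sorted filename order. Returns the number of
    files merged. out_path must not be located inside graph_dir.
    """
    paths = sorted(glob.glob(os.path.join(graph_dir, "*.txt")))
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # stream the bytes, no decoding
    return len(paths)
```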

Create Embeddings:

  1. Create the description, location, and tweet embeddings using the pre-trained Word2Vec and Emoji2Vec models. This takes ~2 hours to run on a CPU.
create_embeddings.ipynb

This creates the following three embeddings:

data/locationEmbeddings.pt

data/descriptionEmbeddings.pt

data/tweetsEmbeddings.pt

  2. Create the user network embeddings using Node2Vec. The input graph is data/yoga_usergraph.txt.

Download Node2Vec from this link:

https://github.com/aditya-grover/node2vec

This creates the following embedding:

data/userNetworkEmd.emd
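
The README does not give the exact Node2Vec invocation, so the sketch below is an illustration under stated assumptions rather than the authors' command. It assumes the reference implementation has been cloned into a hypothetical `node2vec/` directory, uses only its `--input` and `--output` options, and leaves every hyperparameter at that tool's defaults. The `load_emd` helper (also hypothetical) assumes the output is in the word2vec text format: a header line with the node count and dimension, followed by one line per node.

```python
def node2vec_command(node2vec_dir="node2vec",
                     graph_path="data/yoga_usergraph.txt",
                     out_path="data/userNetworkEmd.emd"):
    """Build the argument list for the reference node2vec script.

    node2vec_dir is a hypothetical clone location; adjust it as needed.
    """
    return ["python", node2vec_dir + "/src/main.py",
            "--input", graph_path, "--output", out_path]


def load_emd(path):
    """Read a word2vec-style text embedding file.

    Returns (embeddings, dim), where embeddings maps each node id
    (kept as a string) to a list of floats.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        header = f.readline().split()
        num_nodes, dim = int(header[0]), int(header[1])
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            values = [float(x) for x in parts[1:]]
            if len(values) != dim:
                raise ValueError("unexpected vector length for node " + parts[0])
            embeddings[parts[0]] = values
    if len(embeddings) != num_nodes:
        raise ValueError("header node count does not match file contents")
    return embeddings, dim
```

The command list can be passed to `subprocess.run(..., check=True)`; the embedding dimension expected by the notebooks is not stated here, so check it against the notebooks before relying on the defaults.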

Run Models:

  1. Run the joint-embedding, attention-based neural network model, Yoga User Network (YUN) (Des + Loc + Twt + Net). Each notebook takes ~24 hours to run on a CPU.
DLTN_utype_2layer.ipynb

DLTN_umotivation_2layer.ipynb

  2. Run the description-only baseline model. Each notebook takes ~10 minutes to run on a CPU.
Description_BiLSTMAttn_utype_2layer_classifier.ipynb

Description_BiLSTMAttn_umotivation_2layer_classifier.ipynb

  3. Run the location-only baseline model. Each notebook takes ~7 minutes to run on a CPU.
Location_only_utype_2layer_classifier.ipynb

Location_only_umotivation_2layer_classifier.ipynb

  4. Run the tweets-only baseline model. Each notebook takes ~15 hours to run on a CPU.
Tweets_BiLSTMAttn_utype_2layer_classifier.ipynb

Tweets_BiLSTMAttn_umotivation_2layer_classifier.ipynb

  5. Run the user-network-only baseline model. Each notebook takes ~5 minutes to run on a CPU.
Network_only_utype_2layer_classifier.ipynb

Network_only_umotivation_2layer_classifier.ipynb

  6. Run the joint Description and Location (Des + Loc) model. Each notebook takes ~10 minutes to run on a CPU.
DL_utype_2layer_classifier.ipynb

DL_umotivation_2layer_classifier.ipynb

  7. Run the joint Description, Location, and Tweet (Des + Loc + Twt) model. Each notebook takes ~18 hours to run on a CPU.
DLT_utype_2layer_classifier.ipynb

DLT_umotivation_2layer_classifier.ipynb

  8. Run the joint Description, Location, and Network (Des + Loc + Net) model. Each notebook takes ~10 minutes to run on a CPU.
DLN_utype_2layer_classifier.ipynb

DLN_umotivation_2layer_classifier.ipynb

  9. Run the fine-tuned BERT model on Description (Description_BERT). Each notebook takes ~10 minutes to run on a GPU.
baseline_BERT_finetuned_description_utype_preprocessed.ipynb

baseline_BERT_finetuned_description_umotivation_preprocessed.ipynb

  10. Run the fine-tuned BERT model on Location (Location_BERT). Each notebook takes ~10 minutes to run on a GPU.
baseline_BERT_finetuned_location_utype_preprocessed.ipynb

baseline_BERT_finetuned_location_umotivation_preprocessed.ipynb

  11. Run the fine-tuned BERT model on Tweets (Tweets_BERT). Each notebook takes ~15 minutes to run on a GPU.
baseline_BERT_finetuned_tweet_utype_preprocessed_split.ipynb

baseline_BERT_finetuned_tweet_umotivation_preprocessed_split.ipynb
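
The notebooks are meant to be run interactively, but for the multi-hour runs it may be more convenient to execute them unattended. The sketch below is one possible approach, not something this repository provides: it builds `jupyter nbconvert --execute` commands for a list of notebooks and, unless `dry_run` is left on, runs them in order. The function names and the `_executed` output suffix are hypothetical; the per-cell timeout is disabled because some notebooks take many hours.

```python
import os
import subprocess


def nbconvert_command(notebook):
    """Build a command that executes a notebook and saves a separate copy."""
    base, ext = os.path.splitext(notebook)
    executed = base + "_executed" + ext  # hypothetical output name
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute",
            "--ExecutePreprocessor.timeout=-1",  # -1 disables the cell timeout
            "--output", executed, notebook]


def run_notebooks(notebooks, dry_run=True):
    """Return the commands for each notebook; run them when dry_run is False."""
    commands = [nbconvert_command(nb) for nb in notebooks]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    # The two quickest CPU baselines listed above; this only prints the commands.
    quick = ["Location_only_utype_2layer_classifier.ipynb",
             "Network_only_utype_2layer_classifier.ipynb"]
    for cmd in run_notebooks(quick):
        print(" ".join(cmd))
```

Starting with the short baselines is a cheap way to confirm the environment and data paths before committing to the ~24-hour YUN runs.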

Citation:

If you find the paper useful in your work, please cite:

@inproceedings{islam2021you,
  title={Do You Do Yoga? Understanding Twitter Users' Types and Motivations using Social and Textual Information},
  author={Islam, Tunazzina and Goldwasser, Dan},
  booktitle={2021 IEEE 15th International Conference on Semantic Computing (ICSC)},
  pages={362--365},
  year={2021},
  organization={IEEE}
}
