
This project forked from smile-data/smile


SMILE: A Multimodal Dataset for Understanding Laughter


This is the repository for SMILE: A Multimodal Dataset for Understanding Laughter. It contains the SMILE dataset, along with code for describing the dataset and for evaluating laughter reasoning.


Installation

$ conda create -n SMILE python==3.10.11
$ conda activate SMILE

# move to FastChat/ directory
$ cd FastChat

$ pip3 install --upgrade pip  
$ pip3 install -e .
$ pip3 install openai
$ pip3 install scikit-image
$ pip3 install evaluate
$ pip3 install bert-score

Download the SMILE Dataset

  1. We are currently updating the SMILE dataset to v.2. Once that is done, we will also update the laugh reasoning benchmark.

  2. Download the SMILE dataset v.2 here.

  3. Unzip the dataset.

    
    ├── annotations
    |    ├── data_split.json
    |    ├── GT_laughter_reason.json
    |    └── multimodal_textual_representation.json
    |
    └── videos
         └── SMILE_videos.zip
                ├── video_clips
                └── video_segments
    
    
  4. Details about each file

    • data_split.json: key indices for the train, validation, and test splits
    • GT_laughter_reason.json: ground-truth laughter reason for each video clip
    • multimodal_textual_representation.json: multimodal textual representation encoded from each video clip
    • video_clips: 887 video clips from sitcoms and TED talks. Note: sitcom keys contain an underscore, while TED keys do not. You can use this to split the dataset by video type.
    • video_segments: 4482 video segments trimmed from the video clips by utterance.
  5. SMILE dataset v.1 for evaluation

    • We provide the v.1 dataset for evaluation; download it here.
    • Note that sitcom_reasoning_{train/val}.json and ted_reasoning_{train/val}.json are subsets of smile_reasoning_{train/val}.json.
    
    ├── SMILE_v1_evaluation
         ├── smile_reasoning_train.json
         ├── smile_reasoning_val.json
         ├── sitcom_reasoning_train.json
         ├── sitcom_reasoning_val.json
         ├── ted_reasoning_train.json
         └── ted_reasoning_val.json
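
The underscore convention noted for video_clips can be used to separate sitcom keys from TED keys. Below is a minimal sketch, assuming the key indices are available as plain strings. The function name and the sample keys are hypothetical, and nothing is assumed about the internal layout of data_split.json.

```python
def split_keys_by_video_type(keys):
    """Group key indices into sitcom and TED using the underscore convention."""
    groups = {"sitcom": [], "ted": []}
    for key in keys:
        # Per the dataset notes: sitcom keys contain an underscore, TED keys do not.
        video_type = "sitcom" if "_" in key else "ted"
        groups[video_type].append(key)
    return groups


# Hypothetical keys, for illustration only; real keys come from the annotation files.
example_keys = ["1_10004", "AbCdEfG", "2_345"]
groups = split_keys_by_video_type(example_keys)
print(groups)
```

The same function can be applied to the keys of each split separately if per-split sitcom and TED subsets are needed.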
    
    

Evaluation

Laugh reasoning

We provide inference code for the in-context and zero-shot experiments using GPT3.

The fine-tuned GPT3 model can only be accessed with the openai api-key it was fine-tuned under, so we instead provide inference code for a fine-tuned LLaMA model.

Please evaluate the models with the provided v.1 dataset.

In-context and Zero-shot experiment (GPT3)

Note that running GPT3 requires your own openai api-key, and that running the model incurs charges.

Replace the { } with your own information.

$ python gpt3_inferece.py -openai_key {your openai api key} -engine {name of gpt3 model} -shot {fewshot or zeroshot} -val_data {path/for/validation_data} -train_data {path/for/train_data} -random_seed {any integer number} 
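
The -shot, -train_data, and -random_seed flags suggest that few-shot runs draw their in-context examples from the training data under a fixed seed. Here is a minimal sketch of that idea, not the repository's actual implementation. The function name, the number of shots, and the record fields are all hypothetical.

```python
import random


def sample_in_context_examples(train_examples, k, random_seed):
    """Draw k training examples reproducibly for a few-shot prompt."""
    # A local generator keeps the selection reproducible without touching global state.
    rng = random.Random(random_seed)
    return rng.sample(train_examples, k)


# Hypothetical records; the real field names in the v.1 JSON files may differ.
train_examples = [{"id": i, "reason": f"reason {i}"} for i in range(10)]
shots_a = sample_in_context_examples(train_examples, k=3, random_seed=42)
shots_b = sample_in_context_examples(train_examples, k=3, random_seed=42)
print(shots_a == shots_b)  # same seed, same selection
```

Fixing the seed in this way makes a few-shot run repeatable, which matters when each GPT3 call is billed.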

Fine-tuned experiment (LLaMA)

We provide the pre-trained LLaMA weights for research purposes only.

Training data    Link
SMILE            SMILE_checkpoint
SMILE_Sitcom     Sitcom_checkpoint
SMILE_Ted        Ted_checkpoint

 ├── SMILE
      ├── checkpoint
 ├── SMILE_SITCOM
      ├── checkpoint
 ├── SMILE_TED
      ├── checkpoint

Replace the { } with your own information.

Set model_path to the checkpoint directory, e.g., "SMILE/checkpoint".

$ python FastChat/fastchat/serve/inference.py -model_path {path/for/fine-tuned model} -val_data {path/for/validation_data} -train_data {path/for/train_data} -random_seed {any integer number}
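
When running several checkpoints in a row, it can help to assemble the command programmatically. The sketch below uses only the script path and flags from the command above. The helper name is hypothetical, and the example paths assume the checkpoint and v.1 evaluation folders sit in the working directory.

```python
def build_inference_command(model_path, val_data, train_data, random_seed):
    """Assemble the argument list for the LLaMA inference script."""
    return [
        "python",
        "FastChat/fastchat/serve/inference.py",
        "-model_path", str(model_path),
        "-val_data", str(val_data),
        "-train_data", str(train_data),
        "-random_seed", str(random_seed),
    ]


# Example paths are assumptions about where the downloads were unpacked.
cmd = build_inference_command(
    model_path="SMILE/checkpoint",
    val_data="SMILE_v1_evaluation/smile_reasoning_val.json",
    train_data="SMILE_v1_evaluation/smile_reasoning_train.json",
    random_seed=0,
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to launch the script.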

Acknowledgement

We are grateful to the following awesome projects, on which SMILE builds:

  • GPT3: Language Models are Few-Shot Learners
  • LLaMA: LLaMA: Open and Efficient Foundation Language Models
  • Vicuna: Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
  • MUStARD: Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper)
  • UR-FUNNY: UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
