Coder Social home page Coder Social logo

chartllama-code's Introduction

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

                 

Model       Dataset

Yucheng Han*, Chi Zhang*(Corresponding Author), Xin Chen, Xu Yang, Zhibin Wang
Gang Yu, Bin Fu, Hanwang Zhang

(* equal contributions)

From Tencent and Nanyang Technological University.

🔆 Introduction

🤗🤗🤗 We first create an instruction-tuning dataset based on our proposed data generation pipeline. Then, we train ChartLlama on this dataset and achieve the abilities shown in the figure.

Examples about the abilities of ChartLlama.

Redraw the chart according to the given chart, and edit the chart following instructions.

Draw a new chart based on given raw data and instructions

📝 Changelog

  • [2023.11.27]: 🔥🔥 Update the inference code and model weights.
  • [2023.11.27]: Create the git repository.

⚙️ Setup

Refer to the LLaVA-1.5. Since I have uploaded the code, you can just install by

pip install -e .

💫 Inference

You need to first install LLaVA-1.5, then use model_vqa_lora to do inference. The model_path is the path to our Lora checkpoints, the question-file is the json file containing all questions, the image-folder is the folder containing all your images and the answers-file is the output file name.

Here is an example:

CUDA_VISIBLE_DEVICES=1 python -m llava.eval.model_vqa_lora --model-path /your_path_to/LLaVA/checkpoints/${output_name} \
    --question-file /your_path_to/question.json \
    --image-folder ./playground/data/ \
    --answers-file ./playground/data/ans.jsonl \
    --num-chunks $CHUNKS \
    --chunk-idx $IDX \
    --temperature 0 \
    --conv-mode vicuna_v1 &

📖 TO-DO LIST

  • Create and open source a new chart dataset in Chinese.
  • Open source the training scripts and the dataset.
  • Open source the evaluation scripts.
  • Open source the evaluation dataset.
  • Open source the inference script.
  • Open source the model.
  • Create the git repository.

😉 Citation

@misc{han2023chartllama,
      title={ChartLlama: A Multimodal LLM for Chart Understanding and Generation}, 
      author={Yucheng Han and Chi Zhang and Xin Chen and Xu Yang and Zhibin Wang and Gang Yu and Bin Fu and Hanwang Zhang},
      year={2023},
      eprint={2311.16483},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

📢 Disclaimer

We develop this repository for RESEARCH purposes, so it can only be used for personal/research/non-commercial purposes.


chartllama-code's People

Contributors

tingxueronghua avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.