zjukg / meaformer

[Paper][ACM MM 2023] MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid

Home Page: https://arxiv.org/abs/2212.14454

License: MIT License

Python 96.88% Shell 3.12%
entity-alignment meta-learning multi-modal transformer meaformer knowledge-graph alignment modality-alignment

meaformer's Introduction

This paper introduces MEAformer, a multi-modal entity alignment transformer approach for meta modality hybrid, which dynamically predicts the mutual correlation coefficients among modalities for more fine-grained entity-level modality fusion and alignment.

🔬 Dependencies

pip install -r requirement.txt

Details

  • Python (>= 3.7)
  • PyTorch (>= 1.6.0)
  • numpy (>= 1.19.2)
  • Transformers (== 4.21.3)
  • easydict (>= 1.10)
  • unidecode (>= 1.3.6)
  • tensorboard (>= 2.11.0)

🚀 Train

  • Quick start: Using script file (run.sh)
>> cd MEAformer
>> bash run.sh
  • Optional: Using the bash command
>> cd MEAformer
# -----------------------
# ---- non-iterative ----
# -----------------------
# ----  w/o surface  ---- 
# FBDB15K
>> bash run_meaformer.sh 1 FBDB15K norm 0.8 0 
>> bash run_meaformer.sh 1 FBDB15K norm 0.5 0 
>> bash run_meaformer.sh 1 FBDB15K norm 0.2 0 
# FBYG15K
>> bash run_meaformer.sh 1 FBYG15K norm 0.8 0 
>> bash run_meaformer.sh 1 FBYG15K norm 0.5 0 
>> bash run_meaformer.sh 1 FBYG15K norm 0.2 0 
# DBP15K
>> bash run_meaformer.sh 1 DBP15K zh_en 0.3 0 
>> bash run_meaformer.sh 1 DBP15K ja_en 0.3 0 
>> bash run_meaformer.sh 1 DBP15K fr_en 0.3 0
# ----  w/ surface  ---- 
# DBP15K
>> bash run_meaformer.sh 1 DBP15K zh_en 0.3 1 
>> bash run_meaformer.sh 1 DBP15K ja_en 0.3 1 
>> bash run_meaformer.sh 1 DBP15K fr_en 0.3 1
# -----------------------
# ------ iterative ------
# -----------------------
# ----  w/o surface  ---- 
# FBDB15K
>> bash run_meaformer_il.sh 1 FBDB15K norm 0.8 0 
>> bash run_meaformer_il.sh 1 FBDB15K norm 0.5 0 
>> bash run_meaformer_il.sh 1 FBDB15K norm 0.2 0 
# FBYG15K
>> bash run_meaformer_il.sh 1 FBYG15K norm 0.8 0 
>> bash run_meaformer_il.sh 1 FBYG15K norm 0.5 0 
>> bash run_meaformer_il.sh 1 FBYG15K norm 0.2 0 
# DBP15K
>> bash run_meaformer_il.sh 1 DBP15K zh_en 0.3 0 
>> bash run_meaformer_il.sh 1 DBP15K ja_en 0.3 0 
>> bash run_meaformer_il.sh 1 DBP15K fr_en 0.3 0
# ----  w/ surface  ---- 
# DBP15K
>> bash run_meaformer_il.sh 1 DBP15K zh_en 0.3 1 
>> bash run_meaformer_il.sh 1 DBP15K ja_en 0.3 1 
>> bash run_meaformer_il.sh 1 DBP15K fr_en 0.3 1

❗Tips: open run_meaformer.sh or run_meaformer_il.sh to modify the parameters or the training target.
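
The commands listed above follow a regular pattern, so they can be generated instead of typed one by one. The sketch below only rebuilds the same command strings. The parameter names (`iterative`, `surface`, `first_arg`) and the `build_commands` helper are hypothetical labels, not part of the repository; the first positional argument (always 1 in the list above) is passed through unchanged, and its actual meaning is defined by the scripts themselves.

```python
from itertools import product

# Dataset -> (splits, ratios), copied from the commands listed above.
SETTINGS = {
    "FBDB15K": (["norm"], ["0.8", "0.5", "0.2"]),
    "FBYG15K": (["norm"], ["0.8", "0.5", "0.2"]),
    "DBP15K": (["zh_en", "ja_en", "fr_en"], ["0.3"]),
}


def build_commands(iterative=False, surface=0, first_arg=1):
    """Return the run commands for one block of the list above.

    The parameter names are hypothetical labels for the positional
    arguments; check run_meaformer.sh for their actual meaning.
    """
    script = "run_meaformer_il.sh" if iterative else "run_meaformer.sh"
    commands = []
    for dataset, (splits, ratios) in SETTINGS.items():
        # The list above only contains "w/ surface" runs for DBP15K.
        if surface == 1 and dataset != "DBP15K":
            continue
        for split, ratio in product(splits, ratios):
            commands.append(
                f"bash {script} {first_arg} {dataset} {split} {ratio} {surface}"
            )
    return commands


if __name__ == "__main__":
    # Print the non-iterative, w/o surface block.
    for cmd in build_commands(iterative=False, surface=0):
        print(cmd)
```

Each returned string can be pasted into a shell from inside the MEAformer directory, or handed to `subprocess.run(cmd.split(), check=True)` if the runs should be launched from Python.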

🎯 Results

$\bf{H@1}$ performance with the settings w/o surface & non-iterative in UMAEA. We modified part of MSNEA so that it uses only the attribute types, not the content of the attribute values (see the issues for details):

Method      $\bf{DBP15K_{ZH-EN}}$   $\bf{DBP15K_{JA-EN}}$   $\bf{DBP15K_{FR-EN}}$
MSNEA       .609                    .541                    .557
EVA         .683                    .669                    .686
MCLEA       .726                    .719                    .719
MEAformer   .772                    .764                    .771
UMAEA       .800                    .801                    .818

📚 Dataset

  • ❗NOTE: Download the data from GoogleDrive (1.26G) and unzip it so that the files follow this hierarchy:
ROOT
├── data
│   └── mmkg
└── code
    └── MEAformer
  • Case analysis Jupyter script: GoogleDrive (180M), based on the raw images of entities (needs to be unzipped). I hope this gives you a good understanding of the dataset.
  • [ Option ] The raw relations & attributes that appear in DBP15K and the cases from MEAformer can be downloaded from Huggingface (150M).
  • [ Option ] The raw images of the entities that appear in DBP15K can be downloaded from Baidu Cloud Drive (50GB) with the pass code mmea. All images are saved as title-image pairs in dictionaries and can be accessed with the following code:
import pickle

# Load the title -> image dictionary for the Chinese side of DBP15K.
with open("eva_image_resources/dbp15k/zh_dbp15k_link_img_dict_full.pkl", "rb") as f:
    zh_images = pickle.load(f)
print(zh_images["http://zh.dbpedia.org/resource/香港有線電視"].size)
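
Before indexing the dictionary by a specific title, it can help to check how many entries it holds and what the keys look like. This is a minimal sketch that assumes only what is stated above, namely that the pickle holds a dictionary keyed by entity title. `summarize_image_dict` is a hypothetical helper, not part of the repository, and unpickling the real file requires that whatever library produced the stored image objects is importable.

```python
import os
import pickle


def summarize_image_dict(path, limit=3):
    """Load a pickled dict and return (entry count, first few keys)."""
    with open(path, "rb") as f:
        images = pickle.load(f)
    if not isinstance(images, dict):
        raise TypeError(f"expected a dict, got {type(images).__name__}")
    # Keys come back in the order the dictionary was built.
    keys = list(images)[:limit]
    return len(images), keys


if __name__ == "__main__":
    # Path taken from the snippet above; skipped if the file is absent.
    path = "eva_image_resources/dbp15k/zh_dbp15k_link_img_dict_full.pkl"
    if os.path.exists(path):
        count, sample_keys = summarize_image_dict(path)
        print(count, sample_keys)
```

The sample keys show the exact URI form to use for lookups, which avoids a `KeyError` from a mistyped or differently encoded title.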

Code Path

MEAformer
├── config.py
├── main.py
├── requirement.txt
├── run_meaformer.sh
├── run_meaformer_il.sh
├── run.sh
├── model
│   ├── __init__.py
│   ├── layers.py
│   ├── MEAformer_loss.py
│   ├── MEAformer.py
│   ├── MEAformer_tools.py
│   └── Tool_model.py
├── src
│   ├── __init__.py
│   ├── distributed_utils.py
│   ├── data.py
│   └── utils.py
└── torchlight
    ├── __init__.py
    ├── logger.py
    ├── metric.py
    └── utils.py

Data Path

mmkg
├── DBP15K
│   ├── fr_en
│   │   ├── ent_ids_1
│   │   ├── ent_ids_2
│   │   ├── ill_ent_ids
│   │   ├── training_attrs_1
│   │   ├── training_attrs_2
│   │   ├── triples_1
│   │   └── triples_2
│   ├── ja_en
│   │   ├── ent_ids_1
│   │   ├── ent_ids_2
│   │   ├── ill_ent_ids
│   │   ├── training_attrs_1
│   │   ├── training_attrs_2
│   │   ├── triples_1
│   │   └── triples_2
│   ├── translated_ent_name
│   │   ├── dbp_fr_en.json
│   │   ├── dbp_ja_en.json
│   │   └── dbp_zh_en.json
│   └── zh_en
│       ├── ent_ids_1
│       ├── ent_ids_2
│       ├── ill_ent_ids
│       ├── training_attrs_1
│       ├── training_attrs_2
│       ├── triples_1
│       └── triples_2
├── FBDB15K
│   └── norm
│       ├── ent_ids_1
│       ├── ent_ids_2
│       ├── ill_ent_ids
│       ├── training_attrs_1
│       ├── training_attrs_2
│       ├── triples_1
│       └── triples_2
├── FBYG15K
│   └── norm
│       ├── ent_ids_1
│       ├── ent_ids_2
│       ├── ill_ent_ids
│       ├── training_attrs_1
│       ├── training_attrs_2
│       ├── triples_1
│       └── triples_2
├── embedding
│   └── glove.6B.300d.txt
├── pkls
│   ├── dbpedia_wikidata_15k_dense_GA_id_img_feature_dict.pkl
│   ├── dbpedia_wikidata_15k_norm_GA_id_img_feature_dict.pkl
│   ├── FBDB15K_id_img_feature_dict.pkl
│   ├── FBYG15K_id_img_feature_dict.pkl
│   ├── fr_en_GA_id_img_feature_dict.pkl
│   ├── ja_en_GA_id_img_feature_dict.pkl
│   └── zh_en_GA_id_img_feature_dict.pkl
├── MEAformer
└── dump
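
After unzipping, a quick way to confirm that nothing is missing is to compare the directory against the tree above. The sketch below checks only the seven per-split files that the tree lists under every dataset split. `missing_files` is a hypothetical helper, and the `data/mmkg` location in the usage example is an assumption taken from the ROOT hierarchy in the Dataset section.

```python
from pathlib import Path

# File names that appear under every dataset split in the tree above.
SPLIT_FILES = [
    "ent_ids_1", "ent_ids_2", "ill_ent_ids",
    "training_attrs_1", "training_attrs_2",
    "triples_1", "triples_2",
]

# Dataset -> splits, as listed in the tree above.
SPLITS = {
    "DBP15K": ["fr_en", "ja_en", "zh_en"],
    "FBDB15K": ["norm"],
    "FBYG15K": ["norm"],
}


def missing_files(mmkg_dir):
    """Return the expected split files that are absent under mmkg_dir."""
    root = Path(mmkg_dir)
    missing = []
    for dataset, splits in SPLITS.items():
        for split in splits:
            for name in SPLIT_FILES:
                path = root / dataset / split / name
                if not path.is_file():
                    missing.append(path)
    return missing


if __name__ == "__main__":
    # Adjust the path to wherever the archive was unzipped.
    for path in missing_files("data/mmkg"):
        print("missing:", path)
```

An empty result means all 35 split files (5 splits with 7 files each) are in place; the embedding and pkls entries are not checked by this sketch.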

🤝 Cite:

Please consider citing this paper if you use the code or data from our work. Thanks a lot :)

@inproceedings{DBLP:conf/mm/ChenCZGFHZGPSC23,
  author       = {Zhuo Chen and
                  Jiaoyan Chen and
                  Wen Zhang and
                  Lingbing Guo and
                  Yin Fang and
                  Yufeng Huang and
                  Yichi Zhang and
                  Yuxia Geng and
                  Jeff Z. Pan and
                  Wenting Song and
                  Huajun Chen},
  title        = {MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality
                  Hybrid},
  booktitle    = {{ACM} Multimedia},
  pages        = {3317--3327},
  publisher    = {{ACM}},
  year         = {2023}
}

💡 Acknowledgement

We appreciate MCLEA, MSNEA, EVA, MMEA and many other related works for their open-source contributions.

meaformer's People

Contributors

hackerchenzhuo, wencolani

meaformer's Issues

thank you for your work

Hello author, first of all, thank you very much for your work. How long does it take to train on the DBP15K dataset? You mention in the paper that no parallelism is used, so what should batch_size be set to when only a single 3090 card is available? The code you provided uses 3000.

Dataset question

Hello author, does the DBP15K dataset come with a file that maps relations to their ids? I would like to continue my research on the relations, but after searching through many papers I have not found anything that matches this mapping for the relation…

Missing dataset DWY

It seems the dataset allowed by the configuration file is not available on the Google Drive as of 14/09/2023. Does anyone have that data in a compatible format? Or are there any instructions on how to download and prepare that dataset? I am interested in a mono-lingual dataset.

Question about testing the model

Hello author, first of all, thank you very much for your work!!
The test result from the last training epoch is shown in the screenshot below and is close to the numbers reported in the paper.
[screenshot]
However, when I test the trained model separately, the result differs considerably. The screenshot below shows the test result after setting only_test=1 in run_meaformer.sh and loading the trained model in the _load_model method of main.py.
[screenshot]

May I ask the author: is my way of running the test incorrect?
