jingtaozhan / disentangled-retriever

An easy-to-use Python toolkit for flexibly adapting various neural ranking models to any target domain.

License: MIT License

Python 100.00%

dense-retrieval domain-adaption neural-ranking colbert reranker splade unicoil

disentangled-retriever's Introduction

Hi there 👋 This is Jingtao Zhan.

🌱 I’m a third-year PhD student in the Tsinghua IR Group, supervised by Prof. Shaoping Ma and Prof. Yiqun Liu.
🔭 My research lies in Information Retrieval and Web Search. I currently focus on Dense Retrieval with a wide interest in improving its effectiveness, efficiency, and interpretability. The publications are available at my homepage.
📫 Contact me via [email protected] or twitter.

disentangled-retriever's Issues

Design choice w.r.t. the DAM MLM training

Hi Jingtao:

What max_seq_len hyperparameter did you use when running MLM on the target-domain dataset?

The Hugging Face example uses block_size=128. I am curious why you did not use their method directly.

I am also interested in the effect of leaving out [CLS] and [SEP] when preparing the MLM dataloader. Does it make a large difference if these two tokens are not removed?
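For readers comparing the two options raised in this question, the following is a minimal sketch of block-style grouping in the spirit of the Hugging Face example (concatenate token ids, then cut fixed-size blocks), plus a helper that strips [CLS] and [SEP]. It is not the repository's actual preprocessing. The function names are hypothetical, and the ids 101 and 102 are the [CLS] and [SEP] ids of the bert-base-uncased vocabulary, assumed here for illustration.

```python
from itertools import chain

BLOCK_SIZE = 128  # the block_size value quoted in the question


def strip_special_tokens(ids, cls_id=101, sep_id=102):
    """Drop [CLS]/[SEP] ids (101/102 assume the bert-base-uncased vocabulary)."""
    return [t for t in ids if t not in (cls_id, sep_id)]


def group_texts(tokenized_docs, block_size=BLOCK_SIZE):
    """Concatenate token-id lists and split them into fixed-size blocks.

    The trailing remainder shorter than block_size is dropped.
    """
    concatenated = list(chain.from_iterable(tokenized_docs))
    usable = (len(concatenated) // block_size) * block_size
    return [concatenated[i:i + block_size] for i in range(0, usable, block_size)]


if __name__ == "__main__":
    docs = [[101, 5, 6, 102], [101, 7, 8, 9, 102]]
    stripped = [strip_special_tokens(d) for d in docs]
    print(group_texts(stripped, block_size=2))
```

With grouping, document boundaries fall at arbitrary positions inside a block, which is one reason the presence or absence of [CLS] and [SEP] may matter less than it does for per-document truncation at max_seq_len.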

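Error when running the official example

When running the official example, the model.merge_lora step fails with an error saying the attribute does not exist. Has a related dependency version been updated?
AttributeError: 'BertDense' object has no attribute 'merge_lora'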
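Since the question is whether a dependency version changed, a first diagnostic step could be to record which versions are installed. The sketch below uses only the standard library. The helper name and the package list are assumptions chosen for illustration, not requirements stated by the repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Package names are assumptions; adjust them to the environment in use.
    names = ["transformers", "adapter-transformers", "torch"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Checking `hasattr(model, "merge_lora")` right after loading the model would additionally show whether the loaded class exposes the method at all.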

run_contrast.py: AttributeError: 'BertModel' object has no attribute 'add_adapter'

Hi,

I was trying to train my own REM by following the instructions.

output_dir="./data/dense-mlm/english-marco/train_rem/rem-with-hf-dam/contrast"

python -m torch.distributed.launch --nproc_per_node 4 \
    -m disentangled_retriever.dense.finetune.run_contrast \
    --lora_rank 192 --parallel_reduction_factor 4 --new_adapter_name msmarco \
    --pooling average \
    --similarity_metric ip \
    --qrel_path ./data/datasets/msmarco-passage/qrels.train \
    --query_path ./data/datasets/msmarco-passage/query.train \
    --corpus_path ./data/datasets/msmarco-passage/corpus.tsv \
    --negative ./data/datasets/msmarco-passage/msmarco-hard-negatives.tsv \
    --output_dir $output_dir \
    --model_name_or_path jingtao/DAM-bert_base-mlm-msmarco \
    --logging_steps 100 \
    --max_query_len 24 \
    --max_doc_len 128 \
    --per_device_train_batch_size 32 \
    --inv_temperature 1 \
    --gradient_accumulation_steps 1 \
    --fp16 \
    --neg_per_query 3 \
    --learning_rate 2e-5 \
    --num_train_epochs 5 \
    --dataloader_drop_last \
    --overwrite_output_dir \
    --dataloader_num_workers 0 \
    --weight_decay 0 \
    --lr_scheduler_type "constant" \
    --save_strategy "epoch" \
    --optim adamw_torch

However, I then get AttributeError: 'BertModel' object has no attribute 'add_adapter'.

    def add_adapter(self, adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False):
        """
        Adds a new adapter module of the specified type to the model.

        Args:
            adapter_name (str): The name of the adapter module to be added.
            config (str or dict, optional): The adapter configuration, can be either:

                - the string identifier of a pre-defined configuration dictionary
                - a configuration dictionary specifying the full config
                - if not given, the default configuration for this adapter type will be used
            overwrite_ok (bool, optional):
                Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.
            set_active (bool, optional):
                Set the adapter to be the active one. By default (False), the adapter is added but not activated.

        If self.base_model is self, must inherit from a class that implements this method, to preclude infinite
        recursion
        """
        if self.base_model is self:
            super().add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)
        else:
            # error thrown here on the following line
            self.base_model.add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)

Error Stack

[WARNING|modeling_utils.py:3180] 2023-11-04 16:40:20,978 >> Some weights of the model checkpoint at jingtao/DAM-bert_base-mlm-msmarco were not used when initializing BertDense: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertDense from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertDense from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[WARNING|modeling_utils.py:3192] 2023-11-04 16:40:20,978 >> Some weights of BertDense were not initialized from the model checkpoint at jingtao/DAM-bert_base-mlm-msmarco and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2839] 2023-11-04 16:40:21,154 >> Generation config file not found, using a generation config created from the model config.
11/04/2023 16:40:21-INFO-adapter_arg- Add a lora adapter and only train the adapter
11/04/2023 16:40:21-INFO-adapter_arg- Add a parallel adapter and only train the adapter
Traceback (most recent call last):
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\scripts\disentangled-retriever\run_contrast.py", line 203, in <module>
    main()
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\scripts\disentangled-retriever\run_contrast.py", line 145, in main
    model.add_adapter(model_args.new_adapter_name, config=adapter_config)
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\venv\lib\site-packages\transformers\adapters\model_mixin.py", line 1077, in add_adapter
    self.base_model.add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\venv\lib\site-packages\torch\nn\modules\module.py", line 1269, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'BertModel' object has no attribute 'add_adapter'

Is there anything I might be doing wrong?

One small modification I made was to change the import statement, since there is no BertAdapterModel in transformers in my environment. Could that be the cause? I am currently using transformers-4.33.3 with adapter-transformers==3.2.1, on Python 3.10.

Original import:

from transformers import BertAdapterModel

Changed to:

from transformers.adapters import BertAdapterModel

Thank you for your help!
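The traceback shows add_adapter being delegated from the outer model to self.base_model, which then lacks the method. The toy sketch below reproduces that delegation pattern with stand-in classes only. All class and function names are hypothetical, no real library code is used, and it illustrates the failure mode rather than diagnosing this specific environment.

```python
class ToyAdapterMixin:
    """Stand-in for an adapter mixin: store the adapter, or delegate to base_model."""

    def add_adapter(self, name, config=None):
        if self.base_model is self:
            # Innermost model: actually register the adapter.
            self._adapters = getattr(self, "_adapters", {})
            self._adapters[name] = config
        else:
            # Outer wrapper: forward to the inner encoder.
            self.base_model.add_adapter(name, config)


class PlainEncoder:
    """Inner encoder built WITHOUT the mixin."""

    @property
    def base_model(self):
        return self


class AdapterEncoder(ToyAdapterMixin, PlainEncoder):
    """Inner encoder built WITH the mixin."""


class Wrapper(ToyAdapterMixin):
    """Outer model that wraps an encoder, like a retrieval head around an encoder."""

    def __init__(self, encoder):
        self.encoder = encoder

    @property
    def base_model(self):
        return self.encoder


def try_add(model, name):
    """Return 'ok' on success, or the AttributeError message on failure."""
    try:
        model.add_adapter(name)
        return "ok"
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_add(Wrapper(AdapterEncoder()), "msmarco"))
    print(try_add(Wrapper(PlainEncoder()), "msmarco"))
```

If the pattern carries over, the error would suggest that the inner encoder was instantiated from a class without adapter support, which is consistent with a mismatch between the installed transformers and adapter-transformers packages, though the source does not confirm the cause.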
