
adaptive-knn-mt's People

Contributors

zhaoqianfeng, zhengxxn


adaptive-knn-mt's Issues

Notice for any issues

As I have now become a full-time engineer, I can't reply to issues promptly. I highly recommend using NJUNLP/knn-box instead of this repo going forward; it is clearer and easier to use, and also supports visualization. Thanks for their work!

Data preprocessing steps used in the paper

Thanks a lot for this awesome work, and for releasing the code for the same!

I used your repository and was able to reproduce results for vanilla kNN-MT (K=8) and Adaptive kNN-MT (K=4) on the provided preprocessed data for the IT domain.

I have two queries:

  1. I would like to run your model on other datasets (e.g. WMT'19, as mentioned in section 4.1 of the kNN-MT paper). Could you please share the preprocessing scripts I could use for this?

  2. Could you also confirm that the K mentioned in Table 2 of your paper is actually the max-k that the Meta-k network was trained with?

Thanks a lot!
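For reference, the paper's Meta-k network does not pick a single k; it weighs a fixed candidate set consisting of 0 and the powers of two up to the configured max-k. A small sketch of that candidate set (the helper name is hypothetical, not from the repo):

```python
def metak_candidates(max_k):
    """Candidate neighbor counts the Meta-k network chooses among:
    k = 0 (ignore the datastore) plus powers of two up to max_k."""
    ks = [0]
    k = 1
    while k <= max_k:
        ks.append(k)
        k *= 2
    return ks

# e.g. with max_k = 8 the candidates are [0, 1, 2, 4, 8]
```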

Adaptive kNN-MT fails to converge when trained on top of my own translation model

Hello, and thank you very much for your excellent work!
I tried to use your code to train adaptive kNN-MT on top of a translation model I trained myself, but the loss and perplexity are very large and never converge. Strangely, if I skip the adaptive training and run vanilla kNN-MT decoding directly, there is no problem.
I trained the model with fairseq 0.10.1. Do you have any idea what might cause this?

Is the datastore built using the validation set?

While looking through save_datastore.py, I found this line, which reads rather oddly together with dataset = task.dataset(subset). It seems the datastore is built using the validation set, or am I misinterpreting something here? It should be built from the training data, right?

Since the implementation is based on validate.py, maybe this was overlooked when creating the script. I believe it should use args.train_subset instead.
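A minimal sketch of the suggested substitution, assuming fairseq-style args with train_subset and valid_subset attributes (the helper name is hypothetical, not from the repo):

```python
from argparse import Namespace

def pick_datastore_subset(args):
    # The datastore should cover the training data, so prefer
    # args.train_subset over the validation subset that scripts
    # derived from validate.py default to.
    return getattr(args, "train_subset", None) or args.valid_subset

# Hypothetical usage in a save_datastore.py-like script:
args = Namespace(train_subset="train", valid_subset="valid")
subset = pick_datastore_subset(args)
# dataset = task.dataset(subset)  # would now load the "train" split
```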

Error when building faiss index.

I tried to use data with a dstore_size of 750101549 to build the datastore and the faiss index. Building the datastore works, but building the faiss index raises an error.
Is there any way to solve this problem?
The script is:

#!/bin/bash
work_dir=/data/root/knn
PROJECT_PATH=$work_dir/adaptive-knn-mt-main
SIGNATURE=data
DSTORE_PATH=$xianf_dir/datastore/$SIGNATURE
DSTORE_SIZE=750101549
export PYTHONPATH=$PROJECT_PATH:$PYTHONPATH                                              
   
CUDA_VISIBLE_DEVICES=0 python $PROJECT_PATH/train_datastore_gpu.py \
  --dstore_mmap $DSTORE_PATH \
  --dstore_size $DSTORE_SIZE \
  --dstore-fp16 \
  --faiss_index ${DSTORE_PATH}/knn_index \
  --ncentroids 3072 \
  --probe 32 \
  --dimension 1024
Traceback (most recent call last):
  File "/data/root/knn/adaptive-knn-mt-main/train_datastore_gpu.py", line 113, in <module>
    gpu_index.add_with_ids(to_add.astype(np.float32), np.arange(start, end))
  File "/usr/local/python3/lib/python3.6/site-packages/faiss/__init__.py", line 214, in replacement_add_with_ids
    self.add_with_ids_c(n, swig_ptr(x), swig_ptr(ids))
  File "/usr/local/python3/lib/python3.6/site-packages/faiss/swigfaiss.py", line 5756, in add_with_ids
    return _swigfaiss.GpuIndex_add_with_ids(self, n, x, ids)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /__w/faiss-wheels/faiss-wheels/faiss/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type IVFLists dev 0 space Device stream 0x2d6cbc0 size 8388608 bytes (cudaMalloc error out of memory [2])
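The traceback shows cudaMalloc running out of device memory during add_with_ids. One common workaround (a sketch under assumptions, not the repo's code) is to feed the keys to the index in smaller chunks so the peak allocation per call stays bounded; the helper name and batch size here are illustrative:

```python
import numpy as np

def add_in_batches(add_with_ids, keys, batch_size=500_000):
    """Feed keys to an index chunk by chunk instead of all at once.

    add_with_ids: a callable with the shape of
    gpu_index.add_with_ids(vectors, ids), e.g. from a faiss GPU index.
    keys: a 2-D array (or np.memmap) of stored key vectors.
    """
    n = keys.shape[0]
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)
        # Cast each chunk to float32 lazily so a fp16 memmap is never
        # materialized in full, and assign ids matching the row offsets.
        add_with_ids(keys[start:end].astype(np.float32),
                     np.arange(start, end))
```

If even small batches fail, building the index on CPU (skipping the GPU, or using the GPU only to train the quantizer) sidesteps the cudaMalloc limit at the cost of speed.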

The training set of the model

Hi, xin.
I have a question about the training set of the model.
Do you directly use the dev set of the IT domain to train the model and the test set of the IT domain for testing?
That would mean we only need a few in-domain samples to train the meta-network.

Nice Implementation!

Really cool implementation, a lot cleaner and better integrated within fairseq than the implementation provided by the original knn-MT work!

If one wanted to adapt this to the multilingual task as well, do you have an idea of what the forward_and_get_hidden_state_step method would look like and what would need to change in the save_datastore.py script?

meta-k network hyperparameters and training the base model on out-of-domain + in-domain data

Hello,

thanks for the nice work and the clean implementation, it is very easy to use and understand!

I have a couple of (maybe stupid) questions concerning the methodology:

  • In your paper you mention using the valid set to train the meta-k network for about 5k steps. How was the number of steps chosen (given that the valid set is serving as the training set)?
  • Would it be possible to train a base model on a combination of general open-source data + IT + Med + Koran + Law, and then apply the adaptive-knn-mt technique to IT, Med, Koran, and Law using this base model? I could use the in-domain training set to create the datastore and then the valid set to train the meta-k network. This should not cause any data leakage, or am I missing something?

Thanks in advance for the attention,

Z
