zhengxxn / adaptive-knn-mt
License: MIT License
Since I have become a full-time engineer, I can't reply to issues in a timely manner. I highly recommend using NJUNLP/knn-box instead of this repo going forward: it is clearer and easier to use, and it also supports some visualization. Thanks for their work!
Thanks a lot for this awesome work, and for releasing the code!
I used your repository and was able to reproduce results for vanilla kNN-MT (K=8) and Adaptive kNN-MT (K=4) on the provided preprocessed data for the IT domain.
I have two queries:
I would like to run your model on other datasets (e.g., WMT'19, as mentioned in Section 4.1 of the kNN-MT paper). Could you please point me to the preprocessing scripts I could use for this? (A generic sketch follows below.)
Could you also confirm that the K reported in Table 2 of your paper is actually the max-k that the Meta-k network was trained with?
Thanks a lot!
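For context, binarizing a new domain for a pretrained fairseq model usually looks roughly like the sketch below. The paths, the de-en language pair, and the BPE-ed file names are placeholders, not taken from this repo; the key point is to reuse the pretrained model's dictionaries so the datastore and the model share one vocabulary.

# Hedged sketch: binarize new-domain data with the pretrained model's
# dictionaries so datastore keys and model vocabulary match.
fairseq-preprocess \
    --source-lang de --target-lang en \
    --trainpref $DATA/train.bpe --validpref $DATA/valid.bpe --testpref $DATA/test.bpe \
    --srcdict $MODEL/dict.de.txt --tgtdict $MODEL/dict.en.txt \
    --destdir $DATA/data-bin --workers 8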
Hello! Thank you very much for your excellent work.
I'm trying to use your code to train adaptive-knn-mt on top of a translation model I trained myself, but the loss and ppl are extremely large and never converge. Strangely, if I skip the adaptive training and run vanilla kNN-MT decoding directly, everything works fine.
I trained the model with fairseq 0.10.1. Do you have any idea what might be causing this?
While looking through save_datastore.py, I found this line, which reads rather odd together with dataset = task.dataset(subset). It seems like the datastore is built using the validation set, or am I misinterpreting something here? It should be built using the training data, right?
Since the implementation is based on validate.py, maybe this was mistakenly left unadapted when the script was created. I believe it should use args.train_subset instead.
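For concreteness, a minimal sketch of the suspected fix, assuming save_datastore.py follows the same subset-loading pattern as fairseq's validate.py (the exact surrounding code may differ):

# Build the datastore over the training split instead of the validation split.
for subset in args.train_subset.split(','):   # was: args.valid_subset
    task.load_dataset(subset, combine=False, epoch=1)
    dataset = task.dataset(subset)            # now the training data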
I tried to use data with a dstore_size of 750101549 to build the datastore and FAISS index. Building the datastore works fine, but building the FAISS index raises an error.
Is there any way to solve this problem?
The script is:
#!/bin/bash
work_dir=/data/root/knn
PROJECT_PATH=$work_dir/adaptive-knn-mt-main
SIGNATURE=data
DSTORE_PATH=$work_dir/datastore/$SIGNATURE
DSTORE_SIZE=750101549
export PYTHONPATH=$PROJECT_PATH:$PYTHONPATH
CUDA_VISIBLE_DEVICES=0 python $PROJECT_PATH/train_datastore_gpu.py \
--dstore_mmap $DSTORE_PATH \
--dstore_size $DSTORE_SIZE \
--dstore-fp16 \
--faiss_index ${DSTORE_PATH}/knn_index \
--ncentroids 3072 \
--probe 32 \
--dimension 1024
Traceback (most recent call last):
File "/data/root/knn/adaptive-knn-mt-main/train_datastore_gpu.py", line 113, in <module>
gpu_index.add_with_ids(to_add.astype(np.float32), np.arange(start, end))
File "/usr/local/python3/lib/python3.6/site-packages/faiss/__init__.py", line 214, in replacement_add_with_ids
self.add_with_ids_c(n, swig_ptr(x), swig_ptr(ids))
File "/usr/local/python3/lib/python3.6/site-packages/faiss/swigfaiss.py", line 5756, in add_with_ids
return _swigfaiss.GpuIndex_add_with_ids(self, n, x, ids)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /__w/faiss-wheels/faiss-wheels/faiss/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type IVFLists dev 0 space Device stream 0x2d6cbc0 size 8388608 bytes (cudaMalloc error out of memory [2])
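For what it's worth, the allocation that fails here is the GPU-side IVF list storage during add_with_ids, so a 750M-entry datastore simply doesn't fit on the device. One common workaround is to train the index on GPU but add the keys on CPU; the sketch below assumes gpu_index, dstore_size, and the memory-mapped keys array follow train_datastore_gpu.py, and has not been tested at this scale.

# Sketch: move the trained index back to host memory, then add keys in
# batches there, so the vectors never have to fit in GPU memory.
import faiss
import numpy as np

cpu_index = faiss.index_gpu_to_cpu(gpu_index)   # copy the trained index to CPU

batch_size = 1000000
for start in range(0, dstore_size, batch_size):
    end = min(start + batch_size, dstore_size)
    to_add = keys[start:end].copy().astype(np.float32)   # keys: np.memmap of datastore keys
    cpu_index.add_with_ids(to_add, np.arange(start, end))

faiss.write_index(cpu_index, args.faiss_index)   # path from --faiss_index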
Hi Xin,
I have a question about the training set of the model.
Do you directly use the dev set of the IT domain to train the model, and the test set of the IT domain to test?
This would mean we only need a few in-domain samples to train the meta-network.
Really cool implementation, a lot cleaner and better integrated within fairseq than the implementation provided by the original knn-MT work!
If one wanted to adapt this to the multilingual task as well, do you have an idea how the forward_and_get_hidden_state_step method would look and what would need to change in the save_datastore.py script?
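As a hedged sketch of one possible answer (the method body and field names below are assumptions based on fairseq's standard transformer interface, not the author's code): the decoder forward itself barely changes for a multilingual model, because fairseq's multilingual tasks already prepend language tokens when batching, so something like the following could serve as a starting point:

# Hypothetical sketch of the hidden-state extraction step for a multilingual model.
def forward_and_get_hidden_state_step(sample, model):
    encoder_out = model.encoder(
        sample["net_input"]["src_tokens"],
        src_lengths=sample["net_input"]["src_lengths"],
    )
    features, _ = model.decoder(
        sample["net_input"]["prev_output_tokens"],
        encoder_out=encoder_out,
        features_only=True,   # return hidden states (datastore keys), not vocab logits
    )
    return features   # shape: (batch, tgt_len, hidden_dim)

In save_datastore.py, the main change would then presumably be loading the data through a multilingual translation task so that the language tokens end up in src_tokens and prev_output_tokens.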
Hello,
thanks for the nice work and the clean implementation, it is very easy to use and understand!
I have a couple of (maybe stupid) questions concerning the methodology:
Thanks in advance for the attention,
Z