Comments (1)
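Hi,
Thank you for your interest in our work and datasets.
This problem arises because DBpedia and YAGO entities share the same local names. I did not notice this when I first built the datasets and designed MultiKE.
Many later works have since noticed this bug. Related papers: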
- Exploring and Evaluating Attributes, Values, and Structures for Entity Alignment. Zhiyuan Liu, Yixin Cao, Liangming Pan, Juanzi Li, Zhiyuan Liu, Tat-Seng Chua. (EMNLP 2020)
- Cross-lingual Entity Alignment with Incidental Supervision. Muhao Chen, Weijia Shi, Ben Zhou, Dan Roth. (EACL 2020)
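So if you want to use names as a feature for entity alignment, datasets such as DBP-YG are not suitable: they cause test data leakage. They are also not sufficient for testing the real effectiveness and robustness of name-based methods. I suggest switching to another dataset such as DBP15K, or to the 2.0 version of the datasets provided in our OpenEA, in which I encoded the entities' local names, which makes them harder.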
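As a rough way to see how severe this leakage is on a given dataset, one can count how many aligned test pairs already have identical local names. The sketch below is only an illustration: the helper names `local_name` and `name_match_rate` and the toy URIs are assumptions, not part of BootEA, MultiKE, or OpenEA.

```python
from urllib.parse import unquote


def local_name(uri: str) -> str:
    """Return the part of a URI after the last '/' or '#', percent-decoded."""
    tail = uri.rstrip("/").rsplit("/", 1)[-1].rsplit("#", 1)[-1]
    return unquote(tail)


def name_match_rate(pairs) -> float:
    """Fraction of aligned pairs whose local names are identical (case-insensitive)."""
    if not pairs:
        return 0.0
    hits = sum(local_name(a).lower() == local_name(b).lower() for a, b in pairs)
    return hits / len(pairs)


# Toy aligned pairs; these URIs are made up for illustration only.
test_pairs = [
    ("http://dbpedia.org/resource/Berlin", "http://yago-knowledge.org/resource/Berlin"),
    ("http://dbpedia.org/resource/Albert_Einstein", "http://yago-knowledge.org/resource/Albert_Einstein"),
    ("http://dbpedia.org/resource/Paris", "http://yago-knowledge.org/resource/Paris,_France"),
    ("http://dbpedia.org/resource/Rome", "http://yago-knowledge.org/resource/Roma"),
]

print(f"exact local-name matches: {name_match_rate(test_pairs):.2f}")
```

A rate close to 1.0 on the test links would mean that plain string matching already solves most of the test set, so results of name-based methods on that dataset say little about their real effectiveness.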
I have had some discussions with Muhao Chen in issues, for your reference:
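Finally, regarding representation-learning-based entity alignment, I would like to offer some personal suggestions:
- I suggest first considering and studying methods based on relation structure. Many datasets are available for experiments with these methods, and the comparisons are relatively fair.
- If you consider attributes, including names, I suggest paying attention to the robustness and generalization of the method. For experiments, choose suitable datasets such as DBP15K or OpenEA 2.0, or use knowledge bases such as WD whose local names carry no specific meaning. When comparing methods, avoid unfair settings so that the experiments are reliable and valid.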
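To make the idea of local names without specific meaning concrete, here is a minimal sketch that replaces each local name with an opaque code. It is an assumption for illustration only: it uses a salted SHA-256 digest, and it is not necessarily how the OpenEA 2.0 datasets were encoded. The function name `encode_uri` and the salts are hypothetical.

```python
import hashlib


def encode_uri(uri: str, salt: str) -> str:
    """Replace the local name of a URI with a salted, truncated SHA-256 digest."""
    head, sep, tail = uri.rpartition("/")
    digest = hashlib.sha256((salt + tail).encode("utf-8")).hexdigest()[:16]
    return f"{head}{sep}{digest}"


# A different salt per knowledge base, so identical names no longer match.
a = encode_uri("http://dbpedia.org/resource/Berlin", salt="kg1")
b = encode_uri("http://yago-knowledge.org/resource/Berlin", salt="kg2")

print(a)
print(b)
```

With a distinct salt per knowledge base, two entities that originally shared a local name get unrelated codes, so a method can no longer align them by string matching on URIs alone. Names that also appear as attribute values (labels, for example) would need separate handling.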
Good luck with your research! Feel free to email me if you have any questions.
Zequn Sun
from bootea.