
information-retrieval's Introduction

Information Retrieval Course Project

Project Requirements

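  • Implement a retrieval model for patient medical records (20 points)
  • Interface program (no specific requirements; implement the basic functions; a pure command-line bash interface is recommended)
  • Experiment report (10 points)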

Project Content

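  • Patient medical record database, in xml and txt formats. The former is the official standard dataset format; the latter is provided for easier processing. According to the official documentation either may be used, but the xml version is authoritative.
  • Queries: see topic.xml and extra_topics2017.pdf. The usual approach is to use the disease field as the query and the other fields as auxiliary information.
  • Result submission format: <query ID> Q0 <document ID> <document rank> <document score> <system ID>
  • Evaluation metric: P@10. You may write the calculation yourself or use the trec_eval script.
  • 5-fold cross-validation: 3 parts for training, 1 part for validation, 1 part for testing
  • Average the test results over the folds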
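
The submission format, P@10, and the 5-fold split described above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the function names, the four-decimal score formatting, and the round-robin assignment of queries to folds are all assumptions. `precision_at_k` here divides by k even when fewer than k documents are returned.

```python
from typing import List, Set, Tuple


def format_run_line(query_id: str, doc_id: str, rank: int,
                    score: float, system_id: str) -> str:
    """Build one result line: <query ID> Q0 <document ID> <rank> <score> <system ID>."""
    return f"{query_id} Q0 {doc_id} {rank} {score:.4f} {system_id}"


def precision_at_k(ranked_doc_ids: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are judged relevant."""
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if doc_id in relevant)
    return hits / k


def five_fold_splits(query_ids: List[str]) -> List[Tuple[List[str], List[str], List[str]]]:
    """Return five (train, validation, test) splits: 3 folds train, 1 validation, 1 test."""
    # Hypothetical fold assignment: deal the queries out round-robin into 5 folds.
    folds = [query_ids[i::5] for i in range(5)]
    splits = []
    for test_idx in range(5):
        val_idx = (test_idx + 1) % 5
        train = [q for i in range(5) if i not in (test_idx, val_idx) for q in folds[i]]
        splits.append((train, folds[val_idx], folds[test_idx]))
    return splits


def mean(values: List[float]) -> float:
    """Average the per-fold test scores."""
    return sum(values) / len(values)
```

A typical use would be to tune parameters on the validation fold of each split, compute P@10 on that split's test fold, and report `mean` over the five test scores.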

Approach

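  • Build the inverted index (required; 张路 to copy the inverted index and program from 康哲舟; the copy obtained so far is only a partial inverted index)
  • BM25 model (戚亚涛, done)
  • Interface (张家瑞, done)
  • Stemming (戚亚涛, required, in progress)
  • Find a medical corpus (张路, done)
  • Query expansion (张路, optimization, done)
  • Further optimization of query expansion (张路, obtain a larger corpus, in progress)
  • Program refinement (standard result file format, precision calculation, etc.; 戚亚涛, 张家瑞; in progress)
  • Relevance feedback and model training (戚亚涛, 张家瑞; literature search; in progress)
  • Experiment report writing (石瑞聪, 卢丽婧)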
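
As a rough illustration of how the inverted index and the BM25 model fit together, here is a minimal Python sketch. It is not the project's implementation: the whitespace tokenizer (no stemming), the function names, the parameter values k1 = 1.2 and b = 0.75, and the smoothed IDF variant log(1 + (N - df + 0.5) / (df + 0.5)) are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer; a placeholder for real tokenization and stemming."""
    return text.lower().split()


def build_inverted_index(docs: Dict[str, str]) -> Tuple[Dict[str, Dict[str, int]], Dict[str, int]]:
    """Map each term to {doc_id: term frequency}; also record each document's length."""
    index: Dict[str, Dict[str, int]] = defaultdict(dict)
    doc_len: Dict[str, int] = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, doc_len


def bm25_rank(query: str, index: Dict[str, Dict[str, int]], doc_len: Dict[str, int],
              k1: float = 1.2, b: float = 0.75) -> List[Tuple[str, float]]:
    """Score every document containing a query term with BM25; return (doc_id, score) sorted by score."""
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores: Dict[str, float] = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        df = len(postings)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Length normalization: longer-than-average documents are penalized.
            denom = tf + k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / denom
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy documents, invented for the example.
    docs = {"d1": "lung cancer treatment", "d2": "breast cancer", "d3": "diabetes care plan"}
    index, doc_len = build_inverted_index(docs)
    for rank, (doc_id, score) in enumerate(bm25_rank("lung cancer", index, doc_len), start=1):
        print(rank, doc_id, round(score, 4))
```

Stemming and query expansion would plug into this sketch at `tokenize` and at the query string respectively, before scoring.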

File Descriptions

information-retrieval's People

Contributors

lost-person, fatefawkes, zjr35897
