Comments (6)

rmitsch commented on May 22, 2024

Please copy-paste your code and config (formatted with ``` ```) into this thread.

from spacy-llm.

tianchiguaixia commented on May 22, 2024

examples.yml:

- text: 前白蛋白(PA) 302.65 mg/L 180-400
  entities:
    实验室检查的指标:
      - 前白蛋白(PA)
    实验室检查的单位:
      - mg/L
    实验室检查的结果数值:
      - 302.65
    实验室检查的范围值:
      - 180-400

- text: 谷氨酰转肽酶(GGT) 17 IU/L 10-60
  entities:
    实验室检查的指标:
      - 谷氨酰转肽酶(GGT)
    实验室检查的单位:
      - IU/L
    实验室检查的结果数值:
      - 17
    实验室检查的范围值:
      - 10-60

fewshot.cfg:

[paths]
examples = null

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.task.examples]
@misc = "spacy.FewShotReader.v1"
path = ${paths.examples}

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}

zeroshot.cfg:

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}

run_pipeline.py:

import os
from pathlib import Path
from typing import Optional

import typer
from wasabi import msg

from spacy_llm.util import assemble

Arg = typer.Argument
Opt = typer.Option


def run_pipeline(
    # fmt: off
    text: str = Arg("", help="Text to perform named entity recognition on."),
    config_path: Path = Arg(..., help="Path to the configuration file to use."),
    examples_path: Optional[Path] = Arg(None, help="Path to the examples file to use (few-shot only)."),
    verbose: bool = Opt(False, "--verbose", "-v", help="Show extra information."),
    # fmt: on
):
    # Fail early if the OpenAI credentials are missing.
    if not os.getenv("OPENAI_API_KEY", None):
        msg.fail(
            "OPENAI_API_KEY env variable was not found. "
            "Set it by running 'export OPENAI_API_KEY=...' and try again.",
            exits=1,
        )

    msg.text(f"Loading config from {config_path}", show=verbose)
    # Only override paths.examples when an examples file was passed (few-shot).
    nlp = assemble(
        config_path,
        overrides={}
        if examples_path is None
        else {"paths.examples": str(examples_path)},
    )

    doc = nlp(text)
    msg.text(f"Entities: {[(ent.text, ent.label_, ent.start, ent.end) for ent in doc.ents]}")


if __name__ == "__main__":
    typer.run(run_pipeline)

!python run_pipeline.py
"**医学科学院 阜外医院 检验报告单 姓名: 贾全喜 性别:男 年龄: 55岁 门诊:0066000117992 样品号: 科别: 门诊 床号: 诊断: 标本种类:血清 送检项目: 0265 生化全套 项 目 结果 单位 参考值 项 目 结果 单位 1 前白蛋白(PA) 302.65 mg/L 180-400 参考值 2 *总蛋白(TP) 69.9 19*尿酸(URIC) 542.06 umol/L 1 148.8-416.5 g/L 65-85 20 *肌酸激酶(CK) IU/L 0-200 16 3 *白蛋白(溴甲酚绿法)(ALB) 41.6 g/L 40-55 21 肌酸激酶同工酶(CKMB-Mass) 2.06 ng/nL 0-5 4 *丙氨酸氨基转移酶(ALT) 22 IU/L 9-50 22*乳酸脱氢酶(LDH) 149 IU/L 0-250 5 *天门冬氨酸氨基转移酶(AST) 24 IU/L 15-40 23 淀粉酶(AMY) 100 U/L 0-220 6 *碱性磷酸酶(ALP) 85 45-125 24 脂蛋白(a)(Lp(a)) 827.42 ng/L ↑ 10-300 1/0I 7 *谷氨酰转肽酶(GGT) 17 IU/L 10-60 25 超敏C反应蛋白(HSCRP) 1.28 mg/L 0.00-3.00 8 *总胆红素(TBi1) 16.94 umo1/L 5.1-19 26 同型半胱氨酸(HCY) 8.31 umol/L 6-15 9 直接胆红素(DBil) 4.34 μmol/L 0-6.8 27 游离脂肪酸(FFA) 0.65 mmol/L t 0.1-0.6 10*钾(K) 4.41 mmol/L 3.5-5.3 28*甘油三酯(TG) 0.94 mmol/L 0.38-1.76 11*钠(NA) 141.69 mmol/1 137-147 29*总胆固醇(CHOL) 3.38 mmol/L 13.64-5.98 12*氯(CL) 101.89mmol/L 99-110 30*高密度脂蛋白胆固醇(HDL-C) 1.10 mmol/L 0.7-1.59 13二氧化碳(C02) 32.65 mmol/L ↑21.0-31.0 31*低密度脂蛋白胆固醇(LDL-C) 1.86 mmol/L 一般人群<3.37 14*葡萄糖(GLU) 5.01 mmol/L 3.58-6.05 高危人群<2.59 15*磷(P) 0.95 mmol/L ↓0.97-1.50 极高危人群<2.00 16*钙(CA) 2.40 mmol/L 2.2-2.75 32 小密低密度脂蛋白(sdLDL) 0.55 mmol/L 0.23-1.39 17*肌酐(苦味酸法)(CREA) 89.54 umol/L 44-133 33 载脂蛋白A1(apoA1) 1.05 g/L 11.1-1.8 18*尿素氮(BUN) 5.45 mmol/L 2.86-7.90 34 载脂蛋白B(apoB) 0.67 g/L 0.5-1.2 极高危人群:急性冠脉综合征(ACS)或冠心病/缺血性脑卒中/周围动脉硬化合并糖尿病。 申请日期:2021.08.18 采样时间:2021.08.19 09:08 接收时间:2021.08.19 10:08 报告时间:2021.08.19 11:43 申请医师:李子煦 检验者: 邢跃雷 审核者: 苏保满 备 注: 此报告仅对送检样本负责。 *标记项目为北京市三级医院互认项目 实验诊断中心生化 电话: 88398271"
./zeroshot.cfg


tianchiguaixia commented on May 22, 2024

The complete code is included in the attachment.


tianchiguaixia commented on May 22, 2024

Hello, I have found a problem. The pipeline works when I run it as zero-shot, but as soon as I use few-shot examples it reports an error. Why does providing few-shot examples cause an error?


rmitsch commented on May 22, 2024

It's difficult to diagnose why fewshotting yields worse results here. I recommend debugging with one example at a time and looking into the raw output received from the model (see here on how to do that).
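As a rough sketch of what inspecting the raw output could look like: this assumes a spacy-llm version whose `llm` component accepts `save_io = True` in `[components.llm_ner]` and then stores the prompt and response under `doc.user_data["llm_io"][<component name>]`. The helper name `get_llm_io` is hypothetical, and a stand-in object replaces the real `Doc` so the snippet runs without an API key.

```python
from types import SimpleNamespace
from typing import Any, Dict


def get_llm_io(doc: Any, component_name: str = "llm_ner") -> Dict[str, Any]:
    """Return the saved prompt/response for one component, or {} if absent.

    Assumption: the component was configured with `save_io = True`, so that
    spacy-llm stored its I/O under doc.user_data["llm_io"][component_name].
    """
    llm_io = doc.user_data.get("llm_io", {})
    return dict(llm_io.get(component_name, {}))


# Stand-in for a processed Doc; the strings below are placeholders,
# not real model output.
fake_doc = SimpleNamespace(
    user_data={
        "llm_io": {
            "llm_ner": {"prompt": "<prompt text>", "response": "<raw reply>"}
        }
    }
)

io = get_llm_io(fake_doc)
print("PROMPT:\n", io.get("prompt"))
print("RESPONSE:\n", io.get("response"))
```

With the real pipeline, `fake_doc` would be replaced by `doc = nlp(text)` from run_pipeline.py; if the returned dictionary is empty, the assumption about `save_io` probably does not hold for the installed version.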

The fact that you get no entities at all if you include few-shot examples indicates that the LLM might have issues understanding those examples, or that the output produced by the LLM when those examples are included is incoherent and cannot be parsed. Either way, the best way forward is to have a closer look at what both the prompt and the response look like if you add one example at a time.
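One possible way to set up the "one example at a time" runs is sketched below. It copies the two examples from examples.yml into Python, checks that every label is one of the configured labels and that every entity string occurs verbatim in its text, and then writes each example to its own JSON file (`spacy.FewShotReader.v1` also reads `.json`). The helper names and the output file names are illustrative, not part of spacy-llm.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List

# Labels copied from fewshot.cfg.
LABELS = ["实验室检查的指标", "实验室检查的单位", "实验室检查的结果数值", "实验室检查的范围值"]

# The two examples from examples.yml, with all entity values as strings.
EXAMPLES: List[Dict[str, Any]] = [
    {
        "text": "前白蛋白(PA) 302.65 mg/L 180-400",
        "entities": {
            "实验室检查的指标": ["前白蛋白(PA)"],
            "实验室检查的单位": ["mg/L"],
            "实验室检查的结果数值": ["302.65"],
            "实验室检查的范围值": ["180-400"],
        },
    },
    {
        "text": "谷氨酰转肽酶(GGT) 17 IU/L 10-60",
        "entities": {
            "实验室检查的指标": ["谷氨酰转肽酶(GGT)"],
            "实验室检查的单位": ["IU/L"],
            "实验室检查的结果数值": ["17"],
            "实验室检查的范围值": ["10-60"],
        },
    },
]


def check_example(example: Dict[str, Any], labels: List[str]) -> List[str]:
    """Return a list of problems found in one few-shot example."""
    problems = []
    for label, spans in example["entities"].items():
        if label not in labels:
            problems.append(f"unknown label: {label}")
        for span in spans:
            if span not in example["text"]:
                problems.append(f"span not found in text: {span!r}")
    return problems


def write_single_example_files(examples: List[Dict[str, Any]], out_dir: Path) -> List[Path]:
    """Write each example to its own JSON file and return the file paths."""
    paths = []
    for i, example in enumerate(examples):
        path = out_dir / f"example_{i}.json"
        path.write_text(
            json.dumps([example], ensure_ascii=False, indent=2), encoding="utf-8"
        )
        paths.append(path)
    return paths


out_dir = Path(tempfile.mkdtemp())
for example in EXAMPLES:
    print(example["text"], "->", check_example(example, LABELS) or "ok")
paths = write_single_example_files(EXAMPLES, out_dir)
print([p.name for p in paths])
```

Each generated file can then be passed as the third argument to run_pipeline.py together with fewshot.cfg. The values here are written as strings on purpose: in YAML, unquoted `302.65` and `17` are parsed as numbers, so quoting them in examples.yml is a cheap thing to rule out.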


tianchiguaixia commented on May 22, 2024

Thank you very much for your answers. It would be even better if a Chinese model could be added later on.
