
MedLSAM: Localize and Segment Anything Model for 3D Medical Images


Key Features

  • Foundation Model for 3D Medical Image Localization (MedLAM): MedLSAM introduces MedLAM, a foundation model for localizing anatomical structures in 3D medical images.
  • First Fully-Automatic Medical Adaptation of SAM: MedLSAM is the first fully automatic medical adaptation of the Segment Anything Model (SAM). The primary goal of this work is to significantly reduce the annotation workload in medical image segmentation.
  • Segment Any Anatomy Target Without Additional Annotation: MedLSAM segments any anatomical target in 3D medical images without requiring further annotations, making the segmentation process more automatic and efficient.

Updates

  • 2024.1.9: Released the training code.
  • 2023.10.15: Accelerated the inference speed and added Sub-Patch Localization (SPL).
  • 2023.07.01: Code released.

Details

The Segment Anything Model (SAM) has recently emerged as a groundbreaking model in the field of image segmentation. Nevertheless, both the original SAM and its medical adaptations necessitate slice-by-slice annotations, so the annotation workload grows directly with the size of the dataset. We propose MedLSAM to address this issue, ensuring a constant annotation workload irrespective of dataset size and thereby simplifying the annotation process. Our model introduces a few-shot localization framework capable of localizing any target anatomical part within the body. To achieve this, we develop a Localize Anything Model for 3D Medical Images (MedLAM), using two self-supervision tasks, relative distance regression (RDR) and multi-scale similarity (MSS), on a comprehensive dataset of 14,012 CT scans. We then establish a methodology for accurate segmentation by integrating MedLAM with SAM. By annotating only six extreme points across three directions on a few templates, our model can autonomously identify the target anatomical region in all data scheduled for annotation. This allows our framework to generate a 2D bounding box for every slice of the image, which is then leveraged by SAM to carry out segmentation. We conducted experiments on two 3D datasets covering 38 organs and found that MedLSAM matches the performance of SAM and its medical adaptations while requiring only minimal extreme-point annotations for the entire dataset. Furthermore, MedLAM has the potential to be seamlessly integrated with future 3D SAM models, paving the way for enhanced performance.

Fig.1 The overall segmentation pipeline of MedLSAM.
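To make the prompt flow above concrete, here is a minimal, hypothetical sketch (not taken from the MedLSAM code; all names are illustrative) of how six annotated extreme points can be turned into a 3D bounding box and then into one 2D box prompt per axial slice:

import numpy as np

def extreme_points_to_slice_boxes(points_zyx):
    # The enclosing 3D box is the per-axis min/max of the six extreme points.
    points = np.asarray(points_zyx)
    z_min, y_min, x_min = points.min(axis=0)
    z_max, y_max, x_max = points.max(axis=0)
    # One (x_min, y_min, x_max, y_max) prompt box for every axial slice in range.
    return {int(z): (int(x_min), int(y_min), int(x_max), int(y_max))
            for z in range(int(z_min), int(z_max) + 1)}

# Hypothetical extreme points (z, y, x) of one organ on a template scan.
boxes = extreme_points_to_slice_boxes(
    [(40, 120, 90), (75, 118, 95), (60, 80, 100), (58, 160, 92), (55, 130, 60), (57, 131, 140)])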

Feedback and Contact

Get Started

Main Requirements

torch>=1.11.0
tqdm
nibabel
scipy
SimpleITK
monai

Installation

  1. Create a virtual environment with conda create -n medlsam python=3.10 -y and activate it with conda activate medlsam
  2. Install Pytorch
  3. git clone https://github.com/openmedlab/MedLSAM
  4. Enter the MedLSAM folder with cd MedLSAM and run pip install -e .

Download Model

Download the MedLAM checkpoint, the SAM checkpoint, and the MedSAM checkpoint, and place them at checkpoint/medlam.pth, checkpoint/sam_vit_b_01ec64.pth, and checkpoint/medsam_vit_b.pth, respectively.

Inference

GPU requirement

We recommend using a GPU with 12GB or more memory for inference.

Data preparation

Note: You can also download other CT datasets and place them anywhere you like. MedLSAM automatically applies the preprocessing procedure at inference time, so please do not normalize the original CT images!

After downloading the datasets, you should sort the data into "support" and "query" groups. This does not require moving the actual image files. Rather, you need to create separate lists of file paths for each group.

For each group ("support" and "query"), perform the following steps:

  1. Create a .txt file listing the paths to the image files.
  2. Create another .txt file listing the paths to the corresponding label files.

Ensure that the ordering of images and labels aligns in both lists; the file names themselves do not matter. These lists direct MedLSAM to the appropriate files during inference.

Example format for the .txt files:

  • image.txt
/path/to/your/dataset/image_1.nii.gz
...
/path/to/your/dataset/image_n.nii.gz
  • label.txt
/path/to/your/dataset/label_1.nii.gz
...
/path/to/your/dataset/label_n.nii.gz
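If your images and labels live in two folders with matching, sortable names, a short script along the following lines can generate these lists (the directory layout here is only an assumption; adapt the patterns to your data):

import glob

# Hypothetical layout: paired NIfTI files in sibling 'images' and 'labels' folders.
image_paths = sorted(glob.glob("/path/to/your/dataset/images/*.nii.gz"))
label_paths = sorted(glob.glob("/path/to/your/dataset/labels/*.nii.gz"))
assert len(image_paths) == len(label_paths), "every image needs a matching label"

# Sorted order keeps images and labels aligned line by line in the two lists.
with open("image.txt", "w") as f:
    f.write("\n".join(image_paths) + "\n")
with open("label.txt", "w") as f:
    f.write("\n".join(label_paths) + "\n")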

Config preparation

MedLAM and MedLSAM load their configurations from a .txt file. The structure of the file is as follows:

[data]
support_image_ls      =  config/data/StructSeg_HaN/support_image.txt
support_label_ls      =  config/data/StructSeg_HaN/support_label.txt
query_image_ls        =  config/data/StructSeg_HaN/query_image.txt
query_label_ls        =  config/data/StructSeg_HaN/query_label.txt
gt_slice_threshold    = 10
bbox_mode             = SPL
slice_interval        = 2
fg_class              = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]
seg_save_path         = result/npz/StructSeg
seg_png_save_path     = result/png/StructSeg

[vit]
net_type                = vit_b

[weight]
medlam_load_path   = checkpoint/medlam.pth
vit_load_path  = checkpoint/medsam_20230423_vit_b_0.0.1.pth

Each of the parameters is explained as follows:

  • support_image_ls: The path to the list of support image files. It is recommended to use between 3 and 10 support images.
  • support_label_ls: The path to the list of support label files.
  • query_image_ls: The path to the list of query image files.
  • query_label_ls: The path to the list of query label files.
  • gt_slice_threshold: The threshold value for ground truth slice selection.
  • bbox_mode: The bounding box mode. It could be SPL (Sub-Patch Localization) or WPL (Whole-Patch Localization), as shown in Fig.2.
  • slice_interval: The number of slices in each sub-patch; a smaller value results in more patches. Use a positive integer for Sub-Patch Localization (SPL) and set it to False for Whole-Patch Localization (WPL); a sketch illustrating the difference follows Fig.2 below.
  • fg_class: The list of foreground classes to be used for localization and segmentation, given as integers indicating the class labels. You may select only a subset of the available classes as targets.
  • seg_save_path: The path to save the segmentation results in .npz format, only required for MedLSAM.
  • seg_png_save_path: The path to save the segmentation results in .png format, only required for MedLSAM.
  • net_type: The type of vision transformer model to be used, only required for MedLSAM. By default, this is set to vit_b.
  • medlam_load_path: The path to load the pretrained MedLAM model weights.
  • vit_load_path: The path to load the pretrained vision transformer model weights, only required for MedLSAM. You can change it to checkpoint/sam_vit_b_01ec64.pth to use the original SAM model as the segmentation backbone.
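Since the file follows a standard INI layout, a rough illustration of reading it looks like this (this is not the repository's own parser; list-valued fields such as fg_class still need to be converted from their string form):

import ast
import configparser

config = configparser.ConfigParser()
config.read("config/test_config/test_structseg_medlam.txt")

# String fields come back as-is; numeric and list fields need explicit conversion.
support_image_ls = config["data"]["support_image_ls"]
gt_slice_threshold = config["data"].getint("gt_slice_threshold")
fg_class = ast.literal_eval(config["data"]["fg_class"])  # "[1,2,...]" -> [1, 2, ...]
medlam_load_path = config["weight"]["medlam_load_path"]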

Fig.2 Comparison between Whole-Patch Localization (WPL) and Sub-Patch Localization (SPL) strategies.
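As a rough illustration of the difference (not the repository's implementation): WPL prompts the segmentation model with one box spanning the whole localized slice range, while SPL splits that range into chunks of slice_interval slices and localizes a tighter box per chunk. A hypothetical sketch of the splitting step:

def split_into_subpatches(z_min, z_max, slice_interval):
    # WPL: the whole inclusive range [z_min, z_max] is a single patch.
    if not slice_interval:
        return [(z_min, z_max)]
    # SPL: each sub-patch covers at most `slice_interval` consecutive slices.
    patches, start = [], z_min
    while start <= z_max:
        end = min(start + slice_interval - 1, z_max)
        patches.append((start, end))
        start = end + 1
    return patches

print(split_into_subpatches(10, 25, 2))      # SPL with slice_interval = 2
print(split_into_subpatches(10, 25, False))  # WPL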

Inference

  • MedLAM (Localize any anatomy target)
CUDA_VISIBLE_DEVICES=0 python MedLAM_Inference.py --config_file path/to/your/test_medlam_config.txt

Example:

CUDA_VISIBLE_DEVICES=0 python MedLAM_Inference.py --config_file config/test_config/test_structseg_medlam.txt
CUDA_VISIBLE_DEVICES=0 python MedLAM_Inference.py --config_file config/test_config/test_word_medlam.txt
  • MedLSAM (Localize and segment any anatomy target with WPL/SPL)
CUDA_VISIBLE_DEVICES=0 python MedLSAM_WPL_Inference.py --config_file path/to/your/test_medlsam_config.txt
CUDA_VISIBLE_DEVICES=0 python MedLSAM_SPL_Inference.py --config_file path/to/your/test_medlsam_config.txt

Example:

CUDA_VISIBLE_DEVICES=0 python MedLSAM_WPL_Inference.py --config_file config/test_config/test_structseg_medlam_wpl_medsam.txt
CUDA_VISIBLE_DEVICES=0 python MedLSAM_SPL_Inference.py --config_file config/test_config/test_structseg_medlam_spl_sam.txt

Results

  • MedLAM (Localize any anatomy target): MedLAM automatically calculates and saves the mean Intersection over Union (IoU) along with the standard deviation for each category in a .txt file. These files are stored under the result/iou directory.
  • MedLSAM (Localize and segment any anatomy target): MedLSAM automatically calculates and saves the mean Dice Similarity Coefficient (DSC) along with the standard deviation for each category in a .txt file. These files are stored under the result/dsc directory.
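For reference, the DSC reported here measures the overlap between a predicted mask and a ground-truth mask; a minimal illustration of the metric (not the repository's evaluation code) is:

import numpy as np

def dice_score(pred, gt):
    # Dice Similarity Coefficient between two binary masks of the same shape.
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom > 0 else 1.0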

Training

Training Data preparation

  • Create a train/config/ori_nii.txt file listing the paths to the original CT nii files. (MedLAM is trained with self-supervised tasks, so no label files are required during training!)
  • Run python train/dataset_preprocess.py. It will automatically preprocess the CT files. By default, the preprocessed CT files are saved with a new name that appends _pre to the original filename. For example, if your original file is named scan.nii, the preprocessed file will be named scan_pre.nii.
  • After preprocessing, the paths to the preprocessed CT files will be automatically saved in a file named pre_nii.txt located in the train/config/ directory.
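A short, hypothetical helper for the first step (adapt the glob pattern to where your CT scans actually live):

import glob

# Collect the original CT scans; labels are not needed for MedLAM training.
ct_paths = sorted(glob.glob("/path/to/your/ct_data/*.nii*"))
with open("train/config/ori_nii.txt", "w") as f:
    f.write("\n".join(ct_paths) + "\n")
# After running train/dataset_preprocess.py, expect e.g. scan.nii -> scan_pre.nii.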

Training script

python train/train_position_full_size_with_fc.py -c train/config/train_position_full_size_with_fc.txt
  • The .tar checkpoint will be saved in train/checkpoint. It contains both the network weights and the optimizer states.
  • For inference, you need to extract the network weights from the checkpoint and save them as a .pth file. Run python train/extract_weights.py -p train/checkpoint/your.tar (change the path to your own .tar file); it extracts the network weights from the checkpoint and saves them as checkpoint/medlam.pth. A manual equivalent is sketched below.
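If you prefer to inspect or extract the weights manually, the idea is simply to load the .tar checkpoint with torch and save only the network weights; note that the state-dict key below is an assumption, so check your checkpoint's keys first:

import torch

checkpoint = torch.load("train/checkpoint/your.tar", map_location="cpu")
print(checkpoint.keys())  # inspect which entry holds the network weights

# Assumed key name for illustration; adjust it to what your checkpoint actually uses.
torch.save(checkpoint["model_state_dict"], "checkpoint/medlam.pth")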

To do list

  • Support scribble prompts
  • Support MobileSAM

🛡️ License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

🙏 Acknowledgement

  • A lot of code is modified from MedSAM.
  • We highly appreciate all the challenge organizers and dataset owners for providing public datasets to the community.
  • We thank Meta AI for making the source code of segment anything publicly available.

📝 Citation

If you find this repository useful, please consider citing this paper:

@article{Lei2023medlam,
  title={MedLSAM: Localize and Segment Anything Model for 3D Medical Images},
  author={Wenhui Lei and Xu Wei and Xiaofan Zhang and Kang Li and Shaoting Zhang},
  journal={arXiv preprint arXiv:},
  year={2023}
}


Issues

Can I get the desired segmentation mask without providing any prompts?

Thank you for your contribution. I currently want to perform node segmentation, which is a binary segmentation task, and my dataset files are in NIfTI format. Can your code produce the desired segmentation mask without my providing any prompts? Looking forward to your reply.

Pretraining checkpoint

After training, I put the .tar file at vit_load_path, but I get a missing-key error when I try to segment other data (for example: Missing key(s) in state_dict: "image_encoder.pos_embed", "image_encoder.patch_embed.proj.weight"). Did I forget to do anything? Thanks.

Hello

Hello, why is my validation Dice so high?

about medsam_inference.py

Sorry to bother you. I was looking at this code and I am wondering whether MedSAM segments within the wpl_box (using the ground truth to get the x, y, z extent) or segments the whole image. Thanks for taking your own free time to answer me.

Value Error at inference

(screenshot of the error omitted)

How can I resolve this error, please?

Also, I want to know whether this model can be used to segment structures outside the 22 classes mentioned in the paper.

I'm planning to use it for lung nodule segmentation.

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. A comparison of the whole pipeline is summarized in the MobileSAM repository (comparison images omitted here).

Best Wishes,

Qiao

Crashing

We want to test this model on Google Colab, but at the beginning of the execution, when it comes to reorienting the support image, it crashes. Can you help us run this model on Colab?

code block

CUDA_VISIBLE_DEVICES=0 python MedLAM_Inference.py --config_file config/test_config/test_structseg_medlam.txt

error

data support_image_ls config/data/StructSeg_HaN/support_image.txt config/data/StructSeg_HaN/support_image.txt
data support_label_ls config/data/StructSeg_HaN/support_label.txt config/data/StructSeg_HaN/support_label.txt
data query_image_ls config/data/StructSeg_HaN/query_image.txt config/data/StructSeg_HaN/query_image.txt
data query_label_ls config/data/StructSeg_HaN/query_label.txt config/data/StructSeg_HaN/query_label.txt
data gt_slice_threshold 10 10
data fg_class [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22] [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
weight medlam_load_path checkpoint/medlam.pth checkpoint/medlam.pth
data support_image_ls config/data/StructSeg_HaN/support_image.txt config/data/StructSeg_HaN/support_image.txt
data support_label_ls config/data/StructSeg_HaN/support_label.txt config/data/StructSeg_HaN/support_label.txt
data query_image_ls config/data/StructSeg_HaN/query_image.txt config/data/StructSeg_HaN/query_image.txt
data query_label_ls config/data/StructSeg_HaN/query_label.txt config/data/StructSeg_HaN/query_label.txt
data gt_slice_threshold 10 10
data fg_class [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22] [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
weight medlam_load_path checkpoint/medlam.pth checkpoint/medlam.pth
=> loading medlam checkpoint 'checkpoint/medlam.pth'
=> loaded medlam checkpoint 'checkpoint/medlam.pth'
Loading support class: 1 0% 0/5 [00:00<?, ?it/s]
The image is not in the target orientation. Reorienting...
The image is not in the target orientation. Reorienting...
^C

Problem of missing classes

Hi, thanks for your nice work! However, I have several questions about adapting this framework to new datasets concerning the problem of missing classes. For example, suppose I want to segment 4 organs, but some images contain only 3 of them (the other organ may have been excised and be absent from the image).

  1. When we use images with missing classes as support images, an error is raised due to the missing class.
  2. When we run inference on images with missing classes, the output of MedLAM for the missing class seems to have no meaning, since the target does not exist.

Do you have any suggestions?

Best regards.

Segment without ground truth labels

Is there a way to segment specific anatomical structures without ground truth labels? (Is it mandatory to have the two .txt files?) I just want to use this to segment with the pre-trained weights.
Thanks

CT abdomen image segmentation is not good

Thank you for sharing. I tried to perform abdominal segmentation directly on CT images, but the results were not good. Below are my result images; could you tell me what the possible reasons might be?
(result images omitted: Abdomen_medlam_img0001_0_cl7, Abdomen_medlam_img0002_0_cl1)

Suggestion: get_data() function is deprecated

The get_data() function has been deprecated since nibabel version 3.0 and is no longer recommended. In its place is the get_fdata() function, which provides a more predictable return type.
So we need to change get_data() to get_fdata() so that the code runs successfully.

Using own dataset

I was trying to use my own dataset, but the code goes wrong at data_process_func line 95 (getting the zero point); I always get a ValueError (zero-size array to reduction operation maximum which has no identity). I am wondering whether your dataset undergoes some preprocessing or has some label convention I did not notice. Is there any alternative solution? Thanks.
