yzhuoning / libauc

(Archived). Please visit our new repo: https://github.com/Optimization-AI/LibAUC.

Home Page: https://github.com/Optimization-AI/LibAUC

License: GNU General Public License v3.0

auc pytorch tensorflow auroc auprc auc-optimization imbalanced-datasets

libauc's Introduction

Logo by Zhuoning Yuan


Update: Please visit our new repo here!

LibAUC

An end-to-end machine learning library for AUC optimization (AUROC, AUPRC).

Why LibAUC?

Deep AUC Maximization (DAM) is a paradigm for learning a deep neural network by maximizing the AUC score of the model on a dataset. There are several benefits of maximizing AUC score over minimizing the standard losses, e.g., cross-entropy.

  • In many domains, AUC is the default metric for evaluating and comparing methods. Directly maximizing the AUC score can therefore yield the largest improvement in the model's performance.
  • Many real-world datasets are imbalanced. AUC is better suited to imbalanced data distributions, since maximizing AUC aims to rank the prediction score of every positive example higher than that of every negative example.
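The ranking interpretation in the second bullet can be made concrete: the empirical AUROC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score. A minimal plain-Python illustration (the scores and labels below are made up for the example):

```python
def empirical_auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly,
    counting ties as half-correct."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

# Toy imbalanced example: one positive is mis-ranked below a negative.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0]
print(empirical_auroc(scores, labels))  # 0.8333... (5 of 6 pairs ranked correctly)
```

Losses such as AUCMLoss optimize a differentiable surrogate of this pairwise ranking objective, since the pair-counting form above is not directly usable with gradient descent.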


Installation

$ pip install libauc

Usage

Official Tutorials:

  • Creating Imbalanced Benchmark Datasets [Notebook][Script]
  • Optimizing AUROC loss with ResNet20 on Imbalanced CIFAR10 [Notebook][Script]
  • Optimizing AUPRC loss with ResNet18 on Imbalanced CIFAR10 [Notebook][Script]
  • Training with Pytorch Learning Rate Scheduling [Notebook][Script]
  • Optimizing AUROC loss with DenseNet121 on CheXpert [Notebook][Script]
  • Optimizing AUROC loss with DenseNet121 on CIFAR100 in Federated Setting (CODASCA) [Preliminary Release]

Quickstart for Beginners:

Optimizing AUROC (Area Under the Receiver Operating Characteristic)

>>> # import library
>>> from libauc.losses import AUCMLoss
>>> from libauc.optimizers import PESG
...
>>> # define loss and optimizer (constructor arguments omitted here; see the docs)
>>> Loss = AUCMLoss()
>>> optimizer = PESG()
...
>>> # training
>>> model.train()
>>> for data, targets in trainloader:
...     data, targets = data.cuda(), targets.cuda()
...     preds = model(data)
...     loss = Loss(preds, targets)
...     optimizer.zero_grad()
...     loss.backward()
...     optimizer.step()
...
>>> # restart stage
>>> optimizer.update_regularizer()

Optimizing AUPRC (Area Under the Precision-Recall Curve)

>>> # import library
>>> from libauc.losses import APLoss_SH
>>> from libauc.optimizers import SOAP_SGD, SOAP_ADAM
...
>>> # define loss and optimizer (constructor arguments omitted here; see the docs)
>>> Loss = APLoss_SH()
>>> optimizer = SOAP_ADAM()
...
>>> # training (note: the dataloader also yields each sample's index)
>>> model.train()
>>> for index, data, targets in trainloader:
...     data, targets = data.cuda(), targets.cuda()
...     preds = model(data)
...     loss = Loss(preds, targets, index)
...     optimizer.zero_grad()
...     loss.backward()
...     optimizer.step()

Please visit our website or GitHub repository for more examples.

Citation

If you find LibAUC useful in your work, please cite the following paper for our library:

@inproceedings{yuan2021robust,
  title={Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification},
  author={Yuan, Zhuoning and Yan, Yan and Sonka, Milan and Yang, Tianbao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Contact

If you have any questions, please contact us: Zhuoning Yuan [[email protected]] and Tianbao Yang [[email protected]], or open a new issue on GitHub.

libauc's People

Contributors

yzhuoning

libauc's Issues

When to use retain_graph=True?

Hi,

When should retain_graph=True be passed to the backward() call on the loss?

In two of the examples (2 and 4) it is set to True, but not in the others.

I appreciate your time.

How to train multi-label classification tasks? (like chexpert)

I have started using this library and I've read your paper Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification, and I'm still not sure how to train a multi-label classification (MLC) model.

Specifically, how did you fine-tune for the Chexpert multi-label classification task? (i.e. classify 5 diseases, where each image may have presence of 0, 1 or more diseases)

  • The first step pre-training with Cross-entropy loss seems clear to me
  • You mention: "In the second step of AUC maximization, we replace the last classifier layer trained in the first step by random weights and use our DAM method to optimize the last classifier layer and all previous layers.". The new classifier layer is a single or multi-label classifier?
  • In the Appendix I, figure 7 shows only one score as output for Deep AUC maximization (i.e. only one disease)
  • In the code, both AUCMLoss() and APLoss_SH() receive single-label outputs, not multi-label outputs, apparently

How do you train for the 5 diseases? Train sequentially Cardiomegaly, then Edema, and so on? or with 5 losses added up? or something else?

Extend to Multi-class Classification Task and Be compatible with PyTorch scheduler

Hi Zhuoning,

This is an interesting work!
I am wondering whether the DAM method can be extended to multi-class classification with long-tailed, imbalanced data. Intuitively this should be possible, since scikit-learn already provides multi-class AUC scores via the one-versus-rest and one-versus-one techniques.

Besides, it seems that optimizer.update_regularizer() is called only when the learning rate is reduced, so it would be more elegant to fold this call into a PyTorch LR scheduler, e.g.:

scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
scheduler.step(val_loss)    # override step() so it also calls optimizer.update_regularizer()

In the current libauc version, the PESG optimizer is not compatible with the schedulers in torch.optim.lr_scheduler. It would be great if this feature could be supported in the future.
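Until such support lands, one way to approximate the requested behavior without subclassing the scheduler is a small wrapper that compares learning rates before and after scheduler.step() and triggers the restart only when a reduction actually happened. This is a sketch of the idea, not part of the LibAUC API; it assumes only that the optimizer exposes param_groups and an update_regularizer() method:

```python
def step_scheduler_with_restart(optimizer, scheduler, metric):
    """Step a ReduceLROnPlateau-style scheduler, and call the PESG-style
    restart (update_regularizer) whenever the LR was actually reduced."""
    old_lrs = [group["lr"] for group in optimizer.param_groups]
    scheduler.step(metric)
    new_lrs = [group["lr"] for group in optimizer.param_groups]
    # Restart only on an actual LR reduction, mirroring the manual usage.
    if any(new < old for new, old in zip(new_lrs, old_lrs)):
        optimizer.update_regularizer()
```

In the training loop this replaces the bare scheduler.step(val_loss) call, so the restart stage stays synchronized with LR drops automatically.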

Thanks for your work!

AUCMLoss does not use margin argument

I noticed in the AUCMLoss class that the margin argument is not used.
Following the formulation in the paper, the forward function should be changed in line 20 from
2*self.alpha*(self.p*(1-self.p) + \
to
2*self.alpha*(self.p*(1-self.p)*self.margin + \

Using AUCMLoss with imratio>1

I'm not very familiar with the maths in the paper, so please forgive me if I'm asking something obvious.

The AUCMLoss uses the "imbalance ratio" between positive and negative samples.
The ratio is defined as

the ratio of # of positive examples to the # of negative examples

Or imratio = #pos/#neg.

When #pos < #neg, imratio is a value between 0 and 1; but when #pos > #neg, imratio > 1.

Will this break the loss calculations? I have a feeling it would invalidate the many 1 - self.p terms in the LibAUC implementation, but as I'm not familiar with the maths I can't say for sure.

Also, is there any problem (mathematically speaking) with instead computing imratio = #pos/#total_samples, to avoid the issue above? When #pos << #neg, #neg approximates #total_samples.
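The two candidate definitions are easy to compare directly from the labels. As the question observes, #pos/#neg exceeds 1 once positives outnumber negatives, whereas #pos/#total always stays strictly between 0 and 1 (this snippet is for illustration only, not LibAUC code):

```python
def imratio_pos_over_neg(labels):
    # The definition used in the question: #pos / #neg.
    pos = sum(1 for y in labels if y == 1)
    return pos / (len(labels) - pos)

def imratio_pos_over_total(labels):
    # The proposed alternative: #pos / #total_samples.
    pos = sum(1 for y in labels if y == 1)
    return pos / len(labels)

labels = [1, 1, 1, 0]                  # positives outnumber negatives
print(imratio_pos_over_neg(labels))    # 3.0  (> 1)
print(imratio_pos_over_total(labels))  # 0.75 (always in (0, 1))
```

Whether the second definition is mathematically valid inside AUCMLoss is exactly the question posed above; a common practical workaround is simply to swap the positive/negative label convention so that the minority class is treated as positive.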

Example for tensorflow

Thank you for the great library.
Does it currently support TensorFlow? If so, could you provide an example of how it can be used with TensorFlow? Thank you very much.

Where is the source code?

Great library!

Where is the source code located? I only see the examples/ and imgs/ folders in the repo. Am I missing something?

Multiclass

Can the losses be extended to multi-class classification problems?
