CycleSL: Server-Client Cyclical Update Driven Scalable Split Learning

License: MIT License

Python 99.38% Shell 0.62%
collaborative-learning coordinate-descent distributed-learning federated-learning split-learning


CycleSL

CycleSL: Server-Client Cyclical Update Driven Scalable Split Learning. CycleSL is a novel scalable split learning paradigm that can be integrated into existing scalable split learning methods, such as PSL and SFL, to improve convergence rate and quality while placing less load on the server and providing the same privacy guarantee.

This repository is anonymized for review.

🖼️ Teaser

🗼 Pipeline

After receiving smashed data from the clients, CycleSL first forms a global dataset on the server side from the client features, following a feature-as-sample strategy. It then re-samples mini-batches from this dataset and feeds them into the server-part model to train it. Only after the server-part model has been updated are the original feature batches re-used to compute gradients with the latest server-side model. These gradients are then sent back to the clients to update the client-part models.
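The pipeline above can be sketched as a toy round in plain Python. This is only an illustration under strong assumptions, not the repository's implementation: both model parts are single scalar weights, the loss is squared error, and the names cyclesl_round and mean_loss as well as all hyperparameter values are hypothetical.

```python
import random


def mean_loss(w_server, client_weights, client_batches):
    """Mean squared error of the composed model (server o client) over all clients."""
    errors = [
        (w_server * w_c * x - y) ** 2
        for w_c, batch in zip(client_weights, client_batches)
        for x, y in batch
    ]
    return sum(errors) / len(errors)


def cyclesl_round(w_server, client_weights, client_batches,
                  lr=0.05, server_steps=20, batch_size=4, seed=0):
    """One toy round: train the server part first, then send gradients to clients."""
    rng = random.Random(seed)

    # 1. Each client runs its part model and uploads smashed data (feature, label).
    smashed = [
        [(w_c * x, y) for x, y in batch]
        for w_c, batch in zip(client_weights, client_batches)
    ]

    # 2. Feature-as-sample: pool all client features into one server-side dataset.
    pool = [pair for feats in smashed for pair in feats]

    # 3. Re-sample mini-batches from the pool and update the server part model.
    for _ in range(server_steps):
        mini_batch = rng.sample(pool, min(batch_size, len(pool)))
        grad = sum(2 * (w_server * h - y) * h for h, y in mini_batch) / len(mini_batch)
        w_server -= lr * grad

    # 4. Re-use the original feature batches with the updated server model to get
    #    gradients w.r.t. the features, then update each client via the chain rule.
    new_client_weights = []
    for w_c, batch, feats in zip(client_weights, client_batches, smashed):
        feature_grads = [2 * (w_server * h - y) * w_server for h, y in feats]
        grad_c = sum(g * x for g, (x, _) in zip(feature_grads, batch)) / len(batch)
        new_client_weights.append(w_c - lr * grad_c)

    return w_server, new_client_weights


if __name__ == "__main__":
    # Two clients with different inputs; the target function is y = 2 * x.
    batches = [
        [(x, 2 * x) for x in (0.5, 1.0, -0.5, -1.0)],
        [(x, 2 * x) for x in (0.25, 0.75, -0.25, -0.75)],
    ]
    w_s, w_cs = 0.5, [1.0, 1.0]
    for r in range(5):
        w_s, w_cs = cyclesl_round(w_s, w_cs, batches, seed=r)
        print(r, mean_loss(w_s, w_cs, batches))
```

The point of the ordering in steps 3 and 4 is that the gradients returned to the clients come from the server model after its update, not from the model that first received the features.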

Usage

  1. Download data and conduct data preprocessing following the instructions given in the data directory.

  2. Create the conda environment with conda env create -f environment.yml and then activate it with conda activate cyclesl.

  3. Run reproduce_experiments.sh to reproduce our experiments.

For detailed argument and hyperparameter settings, please check utils.py.

🔧 Environment

Important libraries and their versions as of May 4th, 2024:

Library        Version
Python         3.11.9 by Anaconda
PyTorch        2.2.2 for CUDA 12.1
Scikit-Learn   1.4.2
WandB          0.16.6

Others:

  • The program should be run on a computer with at least 16 GB of RAM. If run on an NVIDIA GPU, at least 8 GB of VRAM is required. We obtained our results on a cluster with an AMD EPYC 7763 64-core CPU and 4 x NVIDIA A100 80GB PCIe GPUs.

  • There is no OS requirement for the experiments themselves. However, data preprocessing requires a Python environment on Linux. If data preprocessing is done on Windows Subsystem for Linux (WSL or WSL2), please make sure unzip is installed beforehand, e.g. sudo apt install unzip for WSL2 Ubuntu.

  • We used Weights & Biases (https://wandb.ai/site) for figures instead of TensorBoard. Please install it (already included in environment.yml) and set it up (run wandb login) beforehand.

  • We used the Python match statement in our implementation. This statement only exists in Python >= 3.10 (Python 3.11.9 is already included in environment.yml). Please replace it with if-elif-else statements if needed.
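As a rough illustration of the replacement mentioned in the last item, the sketch below shows a match statement (as a comment, so the block also parses on Python < 3.10) next to an equivalent if-elif-else chain. The function task_for_dataset and its return strings are hypothetical and do not come from the repository; only the dataset names are taken from this README.

```python
def task_for_dataset(name: str) -> str:
    """Map a dataset name to a task label (hypothetical helper for illustration)."""
    # Python >= 3.10 form, kept as a comment so older interpreters can parse this file:
    #
    #     match name:
    #         case "femnist" | "celeba":
    #             return "image_classification"
    #         case "shakespeare":
    #             return "next_char_prediction"
    #         case _:
    #             raise ValueError(f"unknown dataset: {name}")
    #
    # Equivalent if-elif-else form that works on older Python versions:
    if name in ("femnist", "celeba"):
        return "image_classification"
    elif name == "shakespeare":
        return "next_char_prediction"
    else:
        raise ValueError(f"unknown dataset: {name}")
```

An or-pattern such as "femnist" | "celeba" becomes a membership test, and the wildcard case _ becomes the final else branch.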

🗺 Instructions on data preprocessing

We conducted experiments on three datasets: FEMNIST, CelebA, and Shakespeare. All three can be obtained from https://leaf.cmu.edu/, together with bash scripts for a reproducible data split.

Please see the data directory for further instructions.
