code-data-share-for-python's Introduction

code-data-share-for-python

Introduction

The main goal of code-data-share-for-python is to make importing Python modules more efficient, in both time and memory footprint, by persisting the code objects of imported modules in a memory-mapped file.

For information not covered in this README, such as the detailed user and developer guides and performance results, please refer to our wiki page.

Basic Usage

code-data-share-for-python provides mechanisms to determine which packages are imported, dump those packages into a memory-mapped file, and import packages directly from that file.

Install

pip install code-data-share

Determine the imported packages

# tracer: list imported packages to mod.lst
PYCDSMODE=TRACE PYCDSLIST=mod.lst python -c 'import json'

Create the memory-mapped file

# dumper: create archive named mod.img from mod.lst
python -c 'import cds.dump; cds.dump.run_dump("mod.lst", "mod.img")'

Import packages from archive

# replayer
PYCDSMODE=SHARE PYCDSARCHIVE=mod.img python -c 'import json'
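
To check whether the archive helps for a given workload, one option is to time a fresh interpreter with and without the environment variables shown above. The following is a minimal sketch, not part of this package: `time_import` is a hypothetical helper, and it measures the whole interpreter run (startup plus import) rather than the import alone.

```python
import os
import subprocess
import sys
import time


def time_import(module, extra_env=None, repeat=3):
    """Return the best wall-clock time in seconds of `python -c 'import <module>'`.

    Hypothetical helper: each run starts a fresh interpreter, so the
    result includes interpreter startup as well as the import itself.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best


baseline = time_import("json")
print(f"baseline: {baseline * 1000:.1f} ms")
```

With code-data-share installed and mod.img created as above, the shared run could be timed with `time_import("json", {"PYCDSMODE": "SHARE", "PYCDSARCHIVE": "mod.img"})` and compared against the baseline.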

AppCDS in Java

The design is inspired by the Application Class-Data Sharing (AppCDS) feature, introduced in OpenJDK. AppCDS allows a set of application classes to be pre-processed into a shared archive file, which can then be memory-mapped at runtime to reduce startup time and memory footprint.
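
The general idea of persisting code objects and reading them back through a memory-mapped file can be illustrated with the standard library alone. This is only a conceptual sketch: it assumes `marshal` as a stand-in serialization format and copies the mapped bytes before rebuilding the object, whereas the archive format used by this project is not described here and may work differently.

```python
import marshal
import mmap
import os
import tempfile

# Compile a tiny module body into a code object (stand-in for an imported module).
source = "def greet(name):\n    return 'hello ' + name\n"
code = compile(source, "<demo>", "exec")

# "Dump": persist the serialized code object to a file.
archive_path = os.path.join(tempfile.mkdtemp(), "demo.img")
with open(archive_path, "wb") as f:
    f.write(marshal.dumps(code))

# "Share": map the file read-only and rebuild the code object from it.
with open(archive_path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        restored = marshal.loads(mapped[:])  # copies the bytes, then unmarshals

# Execute the restored code object to define greet() in a fresh namespace.
namespace = {}
exec(restored, namespace)
print(namespace["greet"]("cds"))
```

Because the mapping is read-only and file-backed, the operating system can share its pages between processes that map the same file, which is where the footprint reduction described above comes from.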

code-data-share-for-python's People

Contributors

alibaba-oss, dependabot[bot], imzhuhl, oraluben

code-data-share-for-python's Issues

Elaborating documents

Add detailed documentation, including:

  • A README that describes how to install and use this package;
  • A contributing guide that includes instructions for building and testing;
  • Statistics that describe the effectiveness (optionally, use GitHub Pages and push performance results to it);
  • Advanced usage, e.g. manually move in/out;
  • PR template and commit message template (in the wiki);
  • Other items.

Fine-grained CDS.

The current implementation dumps ALL packages imported after the cds module is loaded, which might cause unnecessary memory footprint at load time. We need a mechanism to verify whether this is the case and, if so, a way to get rid of unused packages.
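
One possible way to check for unused entries is to compare the dumped module names against the modules a given workload actually loads. The sketch below is only an illustration under assumptions: `modules_loaded_by` and `unused_entries` are hypothetical helpers, the dumped names are passed in as a plain collection of strings (the format of mod.lst is not specified here), and the workload statement must not print to stdout.

```python
import subprocess
import sys


def modules_loaded_by(statement):
    """Run `statement` in a fresh interpreter and return the names in sys.modules."""
    # The child prints one module name per line after running the statement.
    script = statement + "\nimport sys\nprint('\\n'.join(sorted(sys.modules)))"
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return set(result.stdout.split())


def unused_entries(listed, statement):
    """Return the listed module names that the workload never loads."""
    return set(listed) - modules_loaded_by(statement)


print(unused_entries({"json", "not_a_real_module_xyz"}, "import json"))
```

Names reported by `unused_entries` would be candidates for removal from the list before dumping.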

[Bug] Crash during finalization in venv environment

Describe the bug
Python crashes during finalization.
A test case has been added at d274f88.

Steps to Reproduce
Steps to reproduce the behavior:

  1. Check out the version above
  2. Run nox -s tests_current_venv
  3. See the error

Expected behavior
All tests pass without error.

Python version (sys.version)
CPython 3.9, the 3.10 release, and the latest 3.10 version (python/cpython@0897a0bf9c)

Execution environment

  • OS and version:
  • CPU model:
  • Number of CPU cores:
  • Size of physical memory:
  • Inside Linux container?
    • Linux container name (docker, pouch, etc):
    • Linux container version:

Optimize archive structure

We want to improve both security and performance by optimizing the data structure of the archive file. This includes:

  • Sanity check based on checksum and/or some magic number
  • Use array instead of linked list for better cache behavior
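
The sanity check in the first item could look roughly like the following. This is a sketch under assumptions, not the project's archive format: the magic value `CDS0`, the header layout, and the function names are all hypothetical.

```python
import struct
import zlib

MAGIC = b"CDS0"  # hypothetical magic number, not the real archive's
HEADER = struct.Struct("<4sII")  # magic, payload length, CRC32 of payload


def pack_archive(payload: bytes) -> bytes:
    """Prefix the payload with a header carrying magic, length and checksum."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def unpack_archive(blob: bytes) -> bytes:
    """Validate the header and return the payload, or raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("archive too short")
    magic, length, checksum = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated archive")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload


archive = pack_archive(b"example payload")
print(unpack_archive(archive))
```

Note that CRC32 only detects accidental corruption such as truncated or stale files. Guarding against deliberate tampering would need a cryptographic hash or signature instead.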

Windows support

The current implementation uses POSIX mmap and works on Linux and macOS. It should be trivial to add Windows support by using the Windows equivalent of mmap, e.g. MapViewOfFile.

Support CPython version < 3.9

At least 3.7 and 3.8 are still within their support lifetime, so we should also support these active versions.
There are several (known) issues behind the current lack of 3.8 support:

  1. There seems to be an issue in the parser that causes some cds tests to fail on very long Python expressions, but this might not be fundamental;
  2. We currently use a hashtable implementation from CPython internals, which is only available in >= 3.9 (python/cpython@b617993).
  • 3.8
  • 3.7
