Coder Social home page Coder Social logo

friendly-enigma's Introduction

friendly-enigma

Read a very large file, containing a single continuous string of UTF-8 characters, and sort the string using little memory.

Algo

1.) Read the string in chunks, sort each chunk and write it to a tmp file.

2.) Read the tmp files one char at a time, pushing the read chars onto a heap. Since the tmp files contain sorted strings, we are always reading the 'lowest' char from each tmp file.

3.) Pop the min value from the heap and write it to the output file.

4.) Read a new char from the same file where the popped min value originated.

5.) Loop until all chars are read from all tmp files.

Run tests

python3 test.py

Run program

This program creates a very large number of temporary files. You may need to increase the limit for open file handles your OS allows before you can run it. Eg, on MacOS: ulimit -Sn 10000.

python3 sort.py 'path/to/input_file'

Find the sorted string in './out.txt'.

Memory usage reports

During the read and sort chunks phase, the program prints the memory usage of the current process after writing every sorted chunk to a tmp file.

During the heap sort phase, the program prints the memory usage of the current process after every 10m elements handled.

friendly-enigma's People

Contributors

dilyand avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.