pydis's Introduction

Hi there 👋

pydis's People

Contributors

boramalper

pydis's Issues

Is MSET atomic?

According to the Redis docs:

MSET is atomic, so all given keys are set at once. It is not possible for clients to see that some of the keys were updated while others are unchanged.

Is this true in Pydis?
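
Whether this holds in pydis depends on how its command handler is written. As a general property of single-threaded asyncio code (not a statement about pydis's actual implementation), a handler that sets every key without an `await` in between cannot be interleaved with other clients. A minimal sketch of that property, where `store`, `mset`, `writer` and `reader` are hypothetical names:

```python
import asyncio

store = {}


def mset(*pairs):
    """Set all key/value pairs in one synchronous step (no await inside)."""
    for i in range(0, len(pairs), 2):
        store[pairs[i]] = pairs[i + 1]


async def writer(rounds):
    for n in range(rounds):
        mset("a", n, "b", n)
        await asyncio.sleep(0)  # other tasks can run only here, between commands


async def reader(rounds, seen_partial):
    for _ in range(rounds):
        # A partial update would show up as "a" and "b" holding different values.
        if store.get("a") != store.get("b"):
            seen_partial.append((store.get("a"), store.get("b")))
        await asyncio.sleep(0)


async def main():
    seen_partial = []
    await asyncio.gather(writer(1000), reader(1000, seen_partial))
    return seen_partial


partial_states = asyncio.run(main())
print(partial_states)  # prints []
```

If the handler awaited between assignments, or if several threads or processes shared the dictionary, this guarantee would no longer follow.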

Importance of async?

Excellent work! Just wanted to mention, if straight-line performance is the goal, it may be worth testing a non-async version of the code. Most operations are going to be gated on the dictionary backing the application, so unless idle connections are part of the benchmark, the async component might be introducing more overhead than strictly necessary. What do you think?

Needless creation of a new deque() for each command processed

Perhaps the original author tested this, but the pattern of:

deque = self.dictionary.get(key, collections.deque())
# process command relative to the deque
self.dictionary[key] = deque

results in the creation of a deque with every command, regardless of whether the key is present in the dictionary. It may be faster to initialize self.dictionary with a defaultdict like so:

self.dictionary = collections.defaultdict(collections.deque)

which then simplifies the previous code to:

deque = self.dictionary[key]

which also eliminates the need to re-store the deque into self.dictionary. While the logic in defaultdict that decides whether or not to call the factory might add some time, I suspect that it will be overshadowed by the time saved on object creation/deletion and the additional dict lookup.
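
The suspicion above is easy to measure. A rough sketch with `timeit`, assuming a list-push style command; `push_with_get` and `push_with_defaultdict` are hypothetical helpers, not functions from pydis, and the timings will vary by machine:

```python
import collections
import timeit


def push_with_get(dictionary, key, value):
    # Builds a throwaway deque on every call, even when the key exists.
    d = dictionary.get(key, collections.deque())
    d.append(value)
    dictionary[key] = d


def push_with_defaultdict(dictionary, key, value):
    # The deque factory runs only when the key is missing.
    dictionary[key].append(value)


plain = {}
default = collections.defaultdict(collections.deque)

t_get = timeit.timeit(lambda: push_with_get(plain, b"k", b"v"), number=100_000)
t_default = timeit.timeit(
    lambda: push_with_defaultdict(default, b"k", b"v"), number=100_000
)

print(f"dict.get with default: {t_get:.3f}s")
print(f"defaultdict:           {t_default:.3f}s")
```

One caveat: indexing a defaultdict with a missing key inserts an empty deque, so read-only commands would need `.get()` or an `in` check to avoid creating keys as a side effect.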

Correctness issues

Even though the supported subset of commands is very limited, there are many cases where pydis can't get the results right, which makes it more like a toy that can't do much other than benchmarks.

For example,

127.0.0.1:7878> SET a 1
OK
127.0.0.1:7878> INCR a
Error: Server closed the connection

This fails because INCR only converts str values, while the value is stored as bytes:

  File "pydis.py", line 133, in incr
    value += 1
TypeError: can't concat int to bytes
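
One possible fix, as a sketch rather than the project's actual code: parse the stored bytes with `int()`, which accepts ASCII digits as bytes, and store the result back as bytes. The standalone `incr` function and its `dictionary` argument are assumptions for illustration; the replies follow the Redis wire format for integers and errors.

```python
def incr(dictionary, key):
    """Increment the integer stored at key and return a RESP-encoded reply."""
    raw = dictionary.get(key, b"0")  # a missing key counts as 0, as in Redis
    try:
        value = int(raw)  # int() accepts ASCII digits given as bytes or str
    except ValueError:
        return b"-ERR value is not an integer or out of range\r\n"
    value += 1
    dictionary[key] = str(value).encode()  # keep the stored type as bytes
    return b":%d\r\n" % value


data = {b"a": b"1"}
print(incr(data, b"a"))  # prints b':2\r\n'
```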

After fixing that one, the expiration doesn't really work...

127.0.0.1:7878> SET a 1 EX 1
OK
[wait a few seconds here]
127.0.0.1:7878> INCR a
(integer) 2

I don't know if you'd like to fix those, since the goal is to "disprove some falsehoods about performance." If not, please at least add a warning along the lines of "don't take it seriously." I'm not that interested in the performance toll of the extra checks for expiration, etc., but it should be taken into account when doing benchmarks too.
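
The expiration example above is what one would see if a TTL is accepted but never consulted on access. A minimal sketch of a lazy expiry check, assuming a hypothetical `ExpiringStore` class (not part of pydis) that stores deadlines next to the data; the injectable `clock` parameter exists only to make the behavior easy to test:

```python
import time


class ExpiringStore:
    """Key-value store that drops expired keys lazily, when they are read."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._expires_at = {}
        self._clock = clock

    def set(self, key, value, ex=None):
        self._data[key] = value
        if ex is None:
            self._expires_at.pop(key, None)  # a plain SET clears any old TTL
        else:
            self._expires_at[key] = self._clock() + ex

    def get(self, key):
        deadline = self._expires_at.get(key)
        if deadline is not None and self._clock() >= deadline:
            # The key has expired: remove it before answering.
            self._data.pop(key, None)
            del self._expires_at[key]
        return self._data.get(key)
```

Every read now pays for one extra dictionary lookup and, for keys with a TTL, a clock call, which is the kind of cost a fair benchmark would have to include.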

My comment on this work

This is a good data point. A lot of our beliefs about software are based on "educated guesses" (translation: we just have no clue and make everything up). Thank you for making this, it's really very interesting! I hope to learn more from it too.

Redis is 100,000 lines of .c code plus 50,000 lines of deps (jemalloc mostly, and lua, and then linenoise, which is tiny). It runs at (roughly) 2x the speed of pydis.

pydis is 250 lines of .py, which is very impressive, but it runs on top of Python, which is 400,000 lines of .c code and 777,460 lines of .py.

I would like to see the kind of performance a golang implementation in roughly 250 lines would get (because it's high level like python, but it's also compiled so it might be very fast). How close to 1.0x performance might it achieve?

Benchmark Memory Usage Too

We should benchmark memory usage as well, since it is undoubtedly a very important metric for an in-memory database.
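
As a starting point, here is a rough sketch that uses the standard `tracemalloc` module to estimate how much memory a plain dict of bytes keys and values takes. `measure_dict_memory` is a hypothetical helper, and the key and value shapes are assumptions. It counts only allocations made through Python's allocator, so a comparison with Redis would also need process-level figures such as resident set size.

```python
import tracemalloc


def measure_dict_memory(n_keys):
    """Return (bytes allocated, number of keys) for a dict of n_keys entries."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    data = {b"key:%d" % i: b"value:%d" % i for i in range(n_keys)}
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, len(data)


used, count = measure_dict_memory(100_000)
print(f"{count} keys use about {used / count:.0f} bytes each")
```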

Windows support

Please add this change to support Windows:

# Comment out the top-level import so the module can load on Windows:
# import uvloop
import asyncio
import sys

def main() -> int:
    print("Hello, World!")

    # Pick the event loop based on the platform.
    if sys.platform == 'win32':
        # uvloop does not support Windows, so use the proactor loop instead.
        loop = asyncio.ProactorEventLoop()
        # loop = asyncio.DefaultEventLoopPolicy() does not work
        asyncio.set_event_loop(loop)
    else:
        import uvloop
        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

    loop = asyncio.get_event_loop()

Aim

Hello. I found this project very interesting. The README says:

The aim of this exercise is to prove that interpreted languages can be just as fast as C.

I'm surprised to read that, though, because I had learned that interpreted code would always be slower than compiled code because of interpreter overhead. I would be very excited to see this project reach close to 1.0x performance, but I'm curious why you believe the interpreter overhead would not hold it back?
