Light

Should we allow "bytearray" about asyncio HOT 9 CLOSED

python commented on May 13, 2024

Should we allow "bytearray"

from asyncio.

Comments (9)

GoogleCodeExporter commented on May 13, 2024

I think we should support bytearray in all places where we support bytes, but 
in all cases where we hold on to it after returning from the function we should 
make a copy -- either extend another bytearray or append a bytes() copy to a 
list.

Original comment by [email protected] on 27 Mar 2013 at 11:43

from asyncio.

GoogleCodeExporter commented on May 13, 2024

Original comment by [email protected] on 16 Oct 2013 at 4:42

Added labels: Priority-Low
Removed labels: Priority-Medium

from asyncio.

GoogleCodeExporter commented on May 13, 2024

Issue 87 has been merged into this issue.

Original comment by [email protected] on 21 Nov 2013 at 12:54

from asyncio.

GoogleCodeExporter commented on May 13, 2024

To explain the issue to the creator of issue 87: if we allowed bytearray or 
e.g. memoryview or arrays, we'd have to copy them before storing them in the 
buffer, because otherwise the bytearray's contents might change while we're 
waiting for the I/O to happen. Given that we have to copy anyway, the caller 
might as well do that -- and this restriction on the API then becomes a gentle 
hint that using bytearray may actually be *less* efficient than using bytes.

So I am actually thinking of closing this as won't fix, but I'll entertain some 
discussion first.

(Note that you may be able to avoid the copy if the send() happens inline, 
because in that case the data is never added to the buffer; but (a) we only do 
that for TCP and UDP using selectors; we always append to the buffer when using 
SSL/TLS, and the Windows Proactor also always appends to the buffer. So, again 
the seeming optimization might backfire depending on your transport type.)

Original comment by [email protected] on 21 Nov 2013 at 1:01

from asyncio.

GoogleCodeExporter commented on May 13, 2024

In my case, I use bytearray for performance reasons, so I won't like to have an 
extra copy every time I sent data. Would it make sense to accept memoryview 
objects as well in all write methods? This would also avoid the slice copy when 
the socket only writes part of the bytearray. I would leave to the user the 
responsibility to not alter the data until the bytearray has been completely 
sent. A new method in Transport to wait for all writers in Transport._sock to 
be cleared would help. If the idea makes sense I can submit a patch for it.

Original comment by [email protected] on 21 Nov 2013 at 1:35

from asyncio.

GoogleCodeExporter commented on May 13, 2024

Can you tell me how you create the bytearray? Do you know how much the copy 
would cost you? (It's easy enough to benchmark asyncio with the assert taken 
out if you want to. :-)

Leaving the user the responsibility to not modify the data is unsatisfactory -- 
plus, the user could modify the *size* of the data which could break 
invariants/assumptions of the transport.

Waiting until the buffer is completely flushed can be done using a combination 
of calling transport.set_write_buffer_limits(0) and defining a 
protocol.resume_writing() callback. (Check the definition of drain() in 
streams.py for a usage example.)

Original comment by [email protected] on 21 Nov 2013 at 4:48

from asyncio.

GoogleCodeExporter commented on May 13, 2024

I'll do the benchmark. 

In the meanwhile, I also detected that you cannot write memoryview objects to a 
transport due the "b''.join()" line concatenating the bytes in the buffer. I 
solved that patching the _write_ready method as follows:

    def _write_ready(self):
        while self._buffer:
            data = self._buffer.popleft()
            try:
                n = self._sock.send(data)
            except (BlockingIOError, InterruptedError):
                self._buffer.appendleft(data)  # Still need to write this.
                break
            except Exception as exc:
                self._loop.remove_writer(self._sock_fd)
                self._fatal_error(exc)
            else:
                data = data[n:]
                if data:
                    self._buffer.appendleft(data)  # Still need to write this.
                    break

        self._maybe_resume_protocol()
        if not self._buffer:
            self._loop.remove_writer(self._sock_fd)
            if self._closing:
                self._call_connection_lost(None)
            elif self._eof:
                self._sock.shutdown(socket.SHUT_WR)

It avoids some extra copies when the buffer grows.
Does it break any internal compatibility?

Original comment by [email protected] on 21 Nov 2013 at 6:57

from asyncio.

GoogleCodeExporter commented on May 13, 2024

There's movement here. We may end up changing the buffer to be a bytearray 
rather than a deque. See recent discussion on python-tulip list and this code 
review: https://codereview.appspot.com/33490043/

Original comment by [email protected] on 27 Nov 2013 at 5:36

from asyncio.

GoogleCodeExporter commented on May 13, 2024

This issue was closed by revision c50e09feb51f.

Original comment by [email protected] on 27 Nov 2013 at 10:13

Changed state: Fixed

from asyncio.

Related Issues (20)

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.