Comments (15)
The actual issue is the early return in case the stream isn't readable anymore. Currently we assume a stream is closed once it has been closed the first time.
It just worked before, because the first read was deferred and there was always data to read after that.
byte-stream/lib/ResourceInputStream.php, line 121 in 3a4e493
from byte-stream.
Well, there are two issues. One is the one you are talking about (which should just mean read() will always be empty); the second is that even if the stream is closed, the event loop should return control to the pipe coroutine. I don't know why it doesn't, so execution never leaves this one while loop.
It doesn't, because of your while (true) loop in combination with Success, which is an immediately resolved promise.
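A minimal sketch of why that starves the loop (Amp v2 API assumed: Amp\Loop, Amp\Success; the loop is bounded here so the example terminates). Yielding an already-resolved Success resumes the coroutine inline, so a tight loop of such yields never hands control back to the scheduler:

```php
<?php
use Amp\Loop;
use Amp\Success;

Loop::run(function () {
    Loop::defer(function () {
        // Starved: this only runs once the coroutine below suspends
        // for real or finishes.
        echo "other task\n";
    });

    for ($i = 0; $i < 3; $i++) {
        // An already-resolved promise resumes the coroutine immediately,
        // without returning control to the event loop.
        $chunk = yield new Success("data");
        echo "read: {$chunk}\n";
    }
});
```

With while (true) instead of the bounded for loop, "other task" would never run at all.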
Indeed. Still two problems, though. But yes, both of them would be solved by simply removing the assumption that there are not going to be any more writes to the resource after it reaches EOF.
I noticed you added a test where you solved this issue by adding a delay. That's not ideal. Now try to modify the test case like the following as well, which should reflect my real use case more closely. I cannot assume the same number of reads/writes will be performed during piping and in the while loop. I might not even know the size of the source, since it might not be a file.
--- test/ResourceStreamTest.php (revision aa2ebfe17be166e197375020483fb78f9d497fa6)
+++ test/ResourceStreamTest.php (date 1545781598000)
@@ -260,7 +260,7 @@
     \Amp\ByteStream\pipe(
         new ResourceInputStream(\fopen(__FILE__, 'rb')),
-        new ResourceOutputStream(\fopen($middle, 'wb'))
+        new ResourceOutputStream(\fopen($middle, 'wb'), 1)
     );
     $middleReadStream = new ResourceInputStream(\fopen($middle, 'rb'));
ping @kelunik
@ostrolucky We can't simply remove that assumption; it's how we defined InputStream to work. I'd advise you to work with another API, or write your own stream that never returns null in the first place.
Not sure whether you know it, but file resources (e.g. opened with fopen) will always do blocking reads, never non-blocking reads, because async I/O isn't supported for file handles.
I'm fine with a scheduling change that sometimes does a Loop::delay(0) / Loop::defer to make the scheduling fairer.
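A sketch of that fairness idea (the helper name readFairly is hypothetical; Amp v2 API assumed): after each immediately-available chunk, yield a 0 ms Amp\Delayed so the event loop can run other watchers before the next read.

```php
<?php
use Amp\Delayed;
use Amp\ByteStream\InputStream;

function readFairly(InputStream $stream, callable $onChunk): \Generator
{
    while (null !== $chunk = yield $stream->read()) {
        $onChunk($chunk);
        // Reschedule: a zero-millisecond delay suspends the coroutine
        // for one loop turn, giving other watchers a chance to run even
        // when every read() resolves immediately.
        yield new Delayed(0);
    }
}
```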
file resources (e.g. opened with fopen) will always do blocking reads
Really? http://php.net/manual/en/function.stream-set-blocking.php says:
This function works for any stream that supports non-blocking mode (currently, regular files and socket streams).
Anyway, what I intend to do when solving this issue is to get rid of amphp/file and just use byte-stream, since according to my testing it results in a considerable boost in read/write speed.
@ostrolucky Yes, really. While the read might then return EAGAIN instead of blocking, the actual problem is that polling doesn't work; it always returns immediately, see https://www.remlab.net/op/nonblock.shtml
Have you tried installing ext-eio or ext-uv?
@ostrolucky If you want raw speed, then yes, don't use amphp/file. If you want low overall latency, then by all means go with amphp/file.
amphp/file is slower by design in that it defers work to another thread/process. But while the other thread/process does work on executing your disk I/O, your current process can still do other work.
amphp/file is built to scale, not to be extremely fast, basically.
@trowski What's your thought on the scheduling?
I think the scheduling should remain as-is. There are a few things wrong here that will cause issues no matter what kind of scheduling we do.
The code originally posted is performing reads with no condition. In synchronous PHP code, it's effectively doing this:
$file = fopen('middle', 'r');
while (true) {
    echo fread($file, 8192);
}
A while (true) with no escape condition should result in an infinite loop. Adding delays/defers just hides the problem IMO.
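For contrast, here is a sketch of the synchronous version with a proper exit condition (same hypothetical 'middle' file as above): checking feof() makes the loop terminate at end of file instead of spinning forever.

```php
<?php
$file = fopen('middle', 'r');

// Read in 8 KiB chunks until the end of the file is reached.
while (!feof($file)) {
    echo fread($file, 8192);
}

fclose($file);
```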
ResourceInputStream and ResourceOutputStream were never meant for files (we should probably put this as a warning box on the front page of the docs). The correct way to do this is with amphp/file. Install ext-eio or ext-uv if you're concerned about performance. I would also advise against using pipe if you want to do something else with the data read from the input stream.
ext-eio or ext-uv
- I'm building a simple tool, not a server running 24/7. I don't want to make consumers of my tool install non-standard PHP extensions. Making it work with good performance should be as simple as doing composer require and executing it. And don't get me even started on cross-platform compatibility issues: I did try them, they have various bugs making them unusable for me, and since I don't really want to use them anyway because of the first point, I don't want to invest in creating detailed bug reports.
@bwoebi That's exactly it. I don't need to scale much and hence am interested more in raw speed. I'm fine with blocking the script for a few ms while fread is being called, since there are usually only two tasks running concurrently anyway.
We can't simply remove that assumption, it's how we defined InputStream to work
I don't understand this justification. Wouldn't InputStream continue to work for all current use cases if this assumption were removed? It's obviously a wrong assumption, as I demonstrated: a resource being at EOF does not mean it will be at EOF indefinitely.
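That behaviour is easy to demonstrate with plain file handles (a self-contained sketch using a temp file; the fseek() call is the usual idiom for clearing PHP's sticky EOF flag before re-reading a file that has grown):

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'eof');
file_put_contents($path, 'first');

$reader = fopen($path, 'rb');
echo fread($reader, 8192);          // reads "first"; the EOF flag is now set

// Another writer appends after we hit EOF.
file_put_contents($path, ' second', FILE_APPEND);

fseek($reader, 0, SEEK_CUR);        // clear the EOF flag without moving
echo fread($reader, 8192);          // more data arrives after the "EOF"

fclose($reader);
unlink($path);
```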
All I need is a way to skip execution of this line:
byte-stream/lib/ResourceInputStream.php, line 121 in 6bbfcb6
InputStream returns null from read() to indicate EOF. This is no longer something that can or should be changed. I and many others have written libraries and apps using this API without issue. ResourceInputStream was designed to work with pipes and sockets, not files. If your script were allowed to continue to the fread() and subsequent watcher activation, it would result in 100% CPU usage, as the readable watcher would immediately report the stream as readable (again, because loop watchers do not work properly for file streams). If you were using anything other than NativeDriver for a loop, I believe even creating the watcher would fail.
As I said before, you shouldn't use pipe() if you want to do something with the data read from the source. Against my better judgement, here's some code showing what I meant using file streams with Resource*Stream objects:
use Amp\Loop;
use Amp\ByteStream\ResourceInputStream;
use Amp\ByteStream\ResourceOutputStream;

Loop::run(function () {
    $source = new ResourceInputStream(\fopen('source/file/path', 'rb'));
    $destination = new ResourceOutputStream(\fopen('destination/file/path', 'wb'));

    while (($chunk = yield $source->read()) !== null) {
        echo $chunk; // Or whatever else you wish to do with $chunk.
        yield $destination->write($chunk);
    }
});
I'm certain you could have written that on your own, so I'm confused at your refusal to do so.
I would recommend a different approach to concurrency. Use amphp/parallel and Amp\Parallel\Worker\Task to download each file within a worker. You can even use amphp/file, which uses blocking operations when within a worker. If Worker is too simple or opinionated an API, you can use Amp\Parallel\Context\Process to write your own process while being provided a communication channel back to the parent. Initial file downloads/uploads will add some overhead from starting another process, but file operations within the process will not be encumbered by IPC.
My example code is the simplest reproducer. There is a reason there are 3 resources there: I don't just echo in my real use case, but write to resource 4, which doesn't exist at the time piping between resource 1 and 2 is initiated.
Use amphp/parallel and Amp\Parallel\Worker\Task to download each file within a worker
Resource 1 in my case is stdin and resource 4 is a socket. I believe this makes this not offloadable to a different thread/process.
You can even use amphp/file
I do; it's way slower. That's why I want to use ResourceInputStream instead and submitted this issue. It would be much more performant in my use case to get rid of the amphp/parallel overhead and just not use 100% non-blocking code.
Feel free to experiment yourself with https://github.com/ostrolucky/stdinho and the commands head -c 1G </dev/urandom | bin/stdinho 127.0.0.1:1337 -v and curl 127.0.0.1:1337 > /dev/null. Then try to replace amphp/file with ResourceInputStream inside the \Ostrolucky\Stdinho\Responder class.
it would result in 100% CPU as the readable watcher would immediately report the stream as readable
It's nearly 100% even with amphp/file.
I don't understand why this logic is necessary there; each consumer has its own logic to determine when reading is finished (most often "while chunk is not null") and after that it doesn't attempt to call read() at all. ResourceInputStream is purposely ignoring my requests to read the resource for God knows what reason. It is the consumer's responsibility to stop calling read() after the file is at EOF. If nothing else, this makes it a confusing API. A method shouldn't be named read() if it doesn't even attempt to read the resource.
As you changed the title, and as I already explained that topic: this is by design and won't change. If you don't care about blocking, I'd advise you to use fread directly over using ResourceInputStream.
We can, however, talk about the scheduling of immediate reads if a lot of them happen in a tight loop, as that might result in a blocking situation that isn't expected that way. I've opened #49 for that.