Comments (8)
I was expecting the workload to be distributed among these 10 threads. But after some console logging, I see they are getting the exact same array and they are doing the exact same work, just now 10 times. How does that help? Maybe the example code just doesn't explain this but I was expecting either:
Testing in my console using v4.1.0 shows that the result of your supplied function
const foo = () => {
  var params = {
    'array': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  };
  hamsters.run(params, function() {
    var arr = params.array;
    arr.forEach(function(item) {
      rtn.data.push((item * 120) / 10);
    });
  }, function(output) {
    console.log(output);
  }, 10);
};
foo();
is exactly what it should be: an array containing 10 subarrays as your final output, since you have not asked the library to aggregate your results back together. If you want a single output, you need to change your logic to
const foo = () => {
  var params = {
    'array': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  };
  hamsters.run(params, function() {
    var arr = params.array;
    arr.forEach(function(item) {
      rtn.data.push((item * 120) / 10);
    });
  }, function(output) {
    console.log(output);
  }, 10, true);
};
foo();
As for "But after some console logging, I see they are getting the exact same array and they are doing the exact same work, just now 10 times.":
This isn't the case. Assuming you are defining multiple threads, your array WILL be split across as many threads as you have specified, and the same operation you defined will be executed across all items in the array regardless of which thread handles them. Inspecting http://hamsters.io/performance while pasting your function into the console shows me that each thread is in fact getting a subarray of size Array.size / threads. Perhaps the problem is that you aren't using a real worker implementation and you're seeing the behavior of legacy mode, which honestly should follow the exact same process, so I'm at a loss as to what problem you're seeing.
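To make the Array.size / threads split concrete, here is a minimal sketch in plain JavaScript of dividing an array into one subarray per thread. The splitIntoChunks helper is hypothetical and is not the library's internal code; it only illustrates the shape of the work distribution described above.

```javascript
// Hypothetical helper, not part of Hamsters.js: split `array` into at most
// `threads` contiguous subarrays of roughly equal size.
function splitIntoChunks(array, threads) {
  const size = Math.ceil(array.length / threads);
  const chunks = [];
  for (let start = 0; start < array.length; start += size) {
    chunks.push(array.slice(start, start + size));
  }
  return chunks;
}

const input = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const perThread = splitIntoChunks(input, 10); // one item per thread
const perPair = splitIntoChunks(input, 2);    // five items per thread
console.log(perThread.length, perPair);
```

With 10 items and 10 threads, each subarray holds a single item, which is the distribution the original question expected.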
Could you help me understand how this library should be used? If you need an example, say we have a big array of size n (n > 10k) of positive integers and we are trying to find (x) => x * (x - 1) * (x - 2) * ... * 2 * 1 for every item in the array. Since the data items have no dependency on each other, ideally we could spawn n threads where each one would grab one number and start calculating. Could you illustrate how this could be done with this library?
That's exactly what is illustrated above, you've just forgotten to tell the library you want a single output.
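As a rough sketch of the factorial case, the snippet below reuses the same hamsters.run call shape shown earlier in this thread (params, thread function, callback, thread count, aggregate flag). It assumes the library is loaded as `hamsters` and that `params` and `rtn` are supplied inside the thread function, as in the examples above. The standalone `factorial` function and the `expected` array are only there to check results locally.

```javascript
// Plain factorial, used here only to compute the expected values locally.
function factorial(x) {
  let result = 1;
  for (let i = 2; i <= x; i++) {
    result *= i;
  }
  return result;
}

const params = { array: [1, 2, 3, 4, 5] };
const expected = params.array.map(factorial); // [1, 2, 6, 24, 120]

// Only runs when the library has been loaded as `hamsters`.
if (typeof hamsters !== 'undefined') {
  hamsters.run(params, function () {
    // `params` and `rtn` follow the usage shown earlier in this thread.
    // The loop is written inline on the assumption that outer helpers
    // may not be visible inside a thread.
    params.array.forEach(function (x) {
      var result = 1;
      for (var i = 2; i <= x; i++) {
        result *= i;
      }
      rtn.data.push(result);
    });
  }, function (output) {
    console.log(output); // compare against `expected`
  }, 5, true);
}
```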
from hamsters.js.
Sorry, maybe I didn't make it clear in my first statement. By "some logging" I meant more console.log calls than I showed in the code; I removed all of them to make the code simpler and closer to the original example code in the readme. In my testing it looked more like this:
const foo = () => {
  var params = {
    'array': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  };
  hamsters.run(params, function() {
    console.log('params', params);
    var arr = params.array;
    arr.forEach(function(item) {
      console.log('run');
      rtn.data.push((item * 120) / 10);
    });
  }, function(output) {
    console.log(output);
  }, 10);
};
foo();
I saw "params" in the logs 10 times, each with identical content, and "run" 100 times, which is exactly the size of the array times the thread count I provided, while I was expecting "run" only 10 times.
from hamsters.js.
As for the log from the output function, for me that was:
[ [ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ],
[ 0, 12, 24, 36, 48, 60, 72, 84, 96, 108 ] ]
while I was expecting something like:
[ [ 0 ],
[ 12 ],
[ 24 ],
[ 36 ],
[ 48 ],
[ 60 ],
[ 72 ],
[ 84 ],
[ 96 ],
[ 108 ] ]
from hamsters.js.
You need to provide your entire example source for me to understand why you are seeing different behavior from both the provided benchmark example logic and jasmine tests.
const foo = () => {
  var params = {
    'array': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  };
  hamsters.run(params, function() {
    var arr = params.array;
    arr.forEach(function(item) {
      rtn.data.push((item * 120) / 10);
    });
  }, function(output) {
    console.log(output);
  }, 10, true);
};
Using 4.0.0, this functions exactly as it should, and my final output is in fact what it should be: a single array that looks like [0, 12, 24, 36, 48, 60, 72, 84, 108]
As mentioned before, you are not telling the library to aggregate your output into a single output. Pay special attention to #5 in the "How it works" section of the readme:
This optional argument will tell the library whether or not we want to aggregate our individual thread outputs together after execution, this is only relevant if you are executing across multiple threads and defaults to false.
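As an illustration of what the aggregation flag changes, the sketch below contrasts the two output shapes in plain JavaScript. It assumes that aggregation amounts to concatenating the per-thread arrays in thread order; that is an assumption about the behavior, not the library's actual code.

```javascript
// Per-thread outputs as they might look with 2 threads and no aggregation,
// assuming each thread handled half of [0..9] using (item * 120) / 10.
const threadOutputs = [
  [0, 12, 24, 36, 48],
  [60, 72, 84, 96, 108],
];

// Assumed equivalent of passing `true` as the last argument:
// concatenate the per-thread arrays in thread order.
const aggregated = threadOutputs.reduce(function (all, part) {
  return all.concat(part);
}, []);

console.log(aggregated);
```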
Additionally, the documentation does not say to declare hamsters as a constant. The hamsters object should never be treated as immutable because the library modifies itself during runtime.
var hamsters = require('hamsters.js');
from hamsters.js.
"As mentioned before, you are not telling your output to be aggregated into a single output, pay special attention to #5 on the how it works section of the read me."
I'm aware of that and I didn't expect the result to be aggregated (as I plan to do the aggregation myself). What I was saying was the result contains redundant data that apparently is produced by redundant work.
I'll use the aggregation flag; hopefully that makes it easier for you to see what I mean.
I'm printing the amount of data each thread received, as well as the number of work cycles each one did. With 2 threads, the total number of work cycles should always equal the input data size, which is 10, but we are seeing 20 cycles (runs).
Also, with the aggregation flag set, the output contains redundant data.
from hamsters.js.
So I've managed to reproduce this issue, but only under the following conditions:
- The library is running in legacy mode, which means it is using the main thread.
- The function invoked is using more than 1 thread.
- The function invoked is using a non-typed array; typed arrays do not suffer from the same issue.
My debugging so far leads me to believe this is an inheritance issue and a race condition based on the time-slicing behavior of setTimeout causing multiple "threads" to modify the same array. I'll have a fix ready in the next release, which shouldn't be too long.
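The symptom reported earlier in the thread (100 runs instead of 10) is what one would see if every simulated thread ended up working on the same full array rather than its own slice. The following is a minimal sketch of that contrast in plain JavaScript. runLegacy is a hypothetical stand-in: it does not reproduce the library's setTimeout-based scheduling or the actual bug, only the difference in work done.

```javascript
// Hypothetical stand-in for a main-thread fallback: processes the input once
// per "thread" and counts how many items were handled in total.
function runLegacy(array, threads, shareInput) {
  const size = Math.ceil(array.length / threads);
  let runs = 0;
  const outputs = [];
  for (let t = 0; t < threads; t++) {
    // Either every "thread" sees the whole array, or each gets its own slice.
    const input = shareInput ? array : array.slice(t * size, (t + 1) * size);
    outputs.push(input.map(function (item) {
      runs++;
      return (item * 120) / 10;
    }));
  }
  return { runs: runs, outputs: outputs };
}

const data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const shared = runLegacy(data, 10, true);  // redundant work
const sliced = runLegacy(data, 10, false); // each item handled once
console.log(shared.runs, sliced.runs);
```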
In the meantime, my recommendation is to use a third-party worker implementation with Node.js, as this only affects the legacy mode of the library, and only within Node or browsers that do not support web workers, e.g. IE9.
Thanks for spotting this. Multithreaded logic is very hard to debug, so please understand that my requests for more info were because I wasn't seeing the problem until I recreated every condition of your setup.
from hamsters.js.
I'm absolutely glad to help. I was looking for easy-to-use node parallelism solutions and came across this library and I think it is very promising.
By the way, could you comment on my side note there? I'd assume there's some limitation I'm not aware of that forces you to design the API this way, but wouldn't it be easier for people to understand if the thread function signature looked like this:
function threadFunc (params, report) {
  // Do work here with data provided in `params`.
  var data = params.array.map((v) => v + 1);
  // Send results back with the `report` function.
  report(data);
}
Or simply:
function threadFunc (params) {
  // Do work here with data provided in `params`.
  var data = params.array.map((v) => v + 1);
  // `return`ed data is automatically collected by the main process.
  return data;
}
I know the second case probably makes async work harder to support, in which case Promise could probably be used. My general suggestion is to make the thread function look more like a normal function, instead of using some seemingly undefined "internal variables".
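For illustration only, the return-value style could be wrapped roughly as follows. runWithReturn is a hypothetical runner, not a Hamsters.js API, and it executes everything synchronously on the calling thread, so it says nothing about whether real worker threads could support this signature.

```javascript
// Hypothetical runner (not a Hamsters.js API): splits params.array into up to
// `threads` slices, calls threadFunc on each, and collects what it returns.
function runWithReturn(params, threadFunc, threads) {
  const size = Math.ceil(params.array.length / threads);
  const results = [];
  for (let start = 0; start < params.array.length; start += size) {
    const slice = params.array.slice(start, start + size);
    // Each call gets a copy of params with only its own slice of the array.
    results.push(threadFunc(Object.assign({}, params, { array: slice })));
  }
  return results;
}

function threadFunc(params) {
  // Do work here with data provided in `params`.
  return params.array.map((v) => v + 1);
}

const results = runWithReturn({ array: [1, 2, 3, 4] }, threadFunc, 2);
console.log(results);
```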
from hamsters.js.
I've gone ahead and pushed out v4.1.1. Please update to it and your issues should be resolved.
https://github.com/austinksmith/Hamsters.js/releases/tag/v4.1.1
As for your suggested function signature, it's unfortunately not possible to do it that way unless that functionality were added as a first-class part of the language.
from hamsters.js.