pair-code / federated-learning

Federated learning experiment using TensorFlow.js

License: Apache License 2.0

TypeScript 94.90% Shell 5.10%

federated-learning's Issues

Error in Emoji hunt demo

I get the following error when I run yarn dev to set up the federated server:

RangeError: Invalid typed array length: 424424
    at new Float32Array (<anonymous>)
    at /home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:391:19
    at Array.map (<anonymous>)
    at flatDeserialize (/home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:388:24)
    at FederatedServerDynamicModel.<anonymous> (/home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:334:36)
    at step (/home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:42:23)
    at Object.next (/home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:23:53)
    at fulfilled (/home/psi/flp/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:14:58)

Here is the snippet that the error points to:
381: exports.flatSerialize = flatSerialize;
function flatDeserialize(_a) {
    var data = _a.data, _b = _a.json, meta = _b.meta, byteOffsets = _b.byteOffsets;
    var numels = meta.map(function (_a) {
        var shape = _a.shape;
        return shape.reduce(function (x, y) { return x * y; }, 1);
    });
    var tensors = meta.map(function (_a, i) {
        var shape = _a.shape, dtype = _a.dtype;
        var ctor = common_1.dtypeToTypedArrayCtor[dtype];
        var arr = new ctor(data.buffer, byteOffsets[i], numels[i]);
        return tf.tensor(arr, shape, dtype);
    });
    return tensors;
395: }
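
A RangeError of this kind is what a typed-array constructor raises when the requested view does not fit inside the underlying buffer. The sketch below is not the library's code: `viewFloat32` and `packed` are hypothetical names, and it assumes (the trace does not confirm this) that the serialized bytes arrive as a `Uint8Array` that may itself be a window into a differently sized `ArrayBuffer`. It shows an explicit bounds check and takes `data.byteOffset` into account, which the quoted snippet does not do.

```typescript
// Hypothetical helper: build a Float32Array view over part of `data`,
// failing with a descriptive message instead of a bare RangeError.
function viewFloat32(data: Uint8Array, byteOffset: number, numel: number): Float32Array {
  const bytesNeeded = numel * Float32Array.BYTES_PER_ELEMENT;
  if (byteOffset < 0 || byteOffset + bytesNeeded > data.byteLength) {
    throw new RangeError(
        `tensor needs bytes [${byteOffset}, ${byteOffset + bytesNeeded}) ` +
        `but only ${data.byteLength} bytes were received`);
  }
  // `data` may be a window into a larger ArrayBuffer, so offsets must be
  // taken relative to data.byteOffset, not to the start of the buffer.
  const start = data.byteOffset + byteOffset;
  if (start % Float32Array.BYTES_PER_ELEMENT !== 0) {
    // Unaligned views are not allowed, so copy the bytes into a fresh buffer.
    const copy = new Uint8Array(bytesNeeded);
    copy.set(data.subarray(byteOffset, byteOffset + bytesNeeded));
    return new Float32Array(copy.buffer);
  }
  return new Float32Array(data.buffer, start, numel);
}

// Usage: two float32 values packed after a 4-byte header.
const packed = new Uint8Array(12);
new Float32Array(packed.buffer, 4, 2).set([1.5, -2]);
const values = viewFloat32(packed, 4, 2);
```

If the check fails, the message reports how many bytes were expected versus received, which may help tell a truncated payload apart from a shape mismatch between client and server.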

Error when installing emoji server

common.ts:103:16 - error TS7017: Element implicitly has an 'any' type because type '{ 'float32': Float32ArrayConstructor; 'int32': Int32ArrayConstructor; 'bool': Uint8ArrayConstruct...' has no index signature.

103 const ctor = dtypeToTypedArrayCtor[dtype];
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

models.ts:227:3 - error TS2322: Type '{ data: Uint8Array; json: { meta: { shape: number[] | [number, number] | [number, number, number,...' is not assignable to type 'FlatVars'.
Types of property 'json' are incompatible.
Type '{ meta: { shape: number[] | [number, number] | [number, number, number, number] | [number, number...' is not assignable to type '{ meta: { shape: number[]; dtype: "float32" | "int32" | "bool"; }[]; byteOffsets: number[]; }'.
Types of property 'meta' are incompatible.
Type '{ shape: number[] | [number, number] | [number, number, number, number] | [number, number, number...' is not assignable to type '{ shape: number[]; dtype: "float32" | "int32" | "bool"; }[]'.
Type '{ shape: number[] | [number, number] | [number, number, number, number] | [number, number, number...' is not assignable to type '{ shape: number[]; dtype: "float32" | "int32" | "bool"; }'.
Types of property 'dtype' are incompatible.
Type '"float32" | "int32" | "bool" | "complex64"' is not assignable to type '"float32" | "int32" | "bool"'.
Type '"complex64"' is not assignable to type '"float32" | "int32" | "bool"'.

227 return {data: dataArr, json: {meta, byteOffsets}};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

node_modules/@tensorflow/tfjs-core/dist/kernels/webgl/gpgpu_context.d.ts:15:27 - error TS2304: Cannot find name 'WebGLLoseContext'.

15 loseContextExtension: WebGLLoseContext;
~~~~~~~~~~~~~~~~

error Command failed with exit code 2.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
$ yarn postinstall && parcel build index.html
$ ./postinstall.sh
$ rm -rf dist/ && yarn && yarn build && yalc publish
[1/4] 🔍 Resolving packages...
success Already up-to-date.
$ tsc
[email protected] published in store.
[email protected] linked ==> /Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/client/node_modules/federated-learning-client
/bin/sh: parcel: command not found
error Command failed with exit code 127.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
[email protected] linked ==> /Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/server/node_modules/federated-learning-server
error Command "parcel" not found.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
TypeError: Object prototype may only be an Object or null: undefined
at setPrototypeOf (<anonymous>)
at __extends (/Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/server/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:7:9)
at /Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/server/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:53:5
at Object.<anonymous> (/Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/server/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:1100:2)
at Module._compile (internal/modules/cjs/loader.js:701:30)
at Module._extensions..js (internal/modules/cjs/loader.js:712:10)
at Object.nodeDevHook [as .js] (/Users/vatsa/Desktop/Proj/federated-learning-master/demo/emoji_hunt/server/node_modules/ts-node-dev/lib/hook.js:61:7)
at Module.load (internal/modules/cjs/loader.js:600:32)
at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
at Function.Module._load (internal/modules/cjs/loader.js:531:3)
[ERROR] 03:37:32 TypeError: Object prototype may only be an Object or null: undefined
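
The first compiler error in the log above (TS7017) is the complaint TypeScript raises when an object literal is indexed with a key whose type is wider than the literal's known keys. A minimal sketch of one way to type such a lookup follows. The names `DType`, `TypedArrayCtor`, `isDType`, and `ctorFor` are hypothetical, not the library's own declarations. The sketch does not address the `complex64` or `WebGLLoseContext` errors, which appear to stem from mismatched dependency versions.

```typescript
// Hypothetical types for illustration; the library's real declarations may differ.
type DType = 'float32' | 'int32' | 'bool';
type TypedArrayCtor =
    Float32ArrayConstructor | Int32ArrayConstructor | Uint8ArrayConstructor;

// Record<DType, ...> tells the compiler exactly which keys exist, so indexing
// with a DType value no longer falls back to an implicit `any`.
const dtypeToTypedArrayCtor: Record<DType, TypedArrayCtor> = {
  float32: Float32Array,
  int32: Int32Array,
  bool: Uint8Array,
};

// Narrow an arbitrary string (for example one read from JSON) to a DType.
function isDType(s: string): s is DType {
  return Object.prototype.hasOwnProperty.call(dtypeToTypedArrayCtor, s);
}

function ctorFor(dtype: string): TypedArrayCtor {
  if (!isDType(dtype)) {
    throw new Error(`unsupported dtype: ${dtype}`);
  }
  return dtypeToTypedArrayCtor[dtype];
}

const ctor = ctorFor('int32');
```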

Softmax tilts to one side (trained label) after aggregator mean computation

Hello authors! Thanks for sharing this nice work; your code is awesome.
I tried your emoji_hunt demo with my own trained model.
I think this is an important issue for implementing federated learning.
After the mean computation, the softmax output tilts toward one side (the trained label).

I suspect this happens during the mean aggregation computation.

Different versions of TF.js between library user & consumer cause instanceof check fail error

If I construct a tf.Model with a different instantiation of TensorFlow.js, I get the error "User-defined optimizer must be an instance of tf.Optimizer". The library constructs its own SGDOptimizer, which fails the instanceof check within the tf.Model constructed by the library consumer's version of TensorFlow.js.

I think this is due to the instanceof check in https://github.com/tensorflow/tfjs-layers/blob/de37e29133e2bc11defd3f5931e9e6afcacb7004/src/engine/training.ts#L745 failing.
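
Why a second copy of a library breaks `instanceof` can be shown without TensorFlow.js at all. In the sketch below, `loadCopy` is a hypothetical stand-in for loading the same package twice (for instance from two `node_modules` trees). Each call creates a distinct `Optimizer` class, so an object built by one copy is not an instance of the other copy's class even though the two are structurally identical.

```typescript
// Structural shape shared by both copies.
interface OptimizerLike {
  learningRate: number;
  applyGradients(): void;
}

// Hypothetical stand-in for evaluating the same library module twice.
function loadCopy() {
  class Optimizer implements OptimizerLike {
    constructor(public learningRate: number) {}
    applyGradients(): void {}
  }
  return {Optimizer};
}

const consumerCopy = loadCopy();  // the application's copy
const libraryCopy = loadCopy();   // the copy bundled with a dependency

const sgd: object = new libraryCopy.Optimizer(0.01);

// Identity-based check: fails, because the two classes are different objects.
const passesInstanceof = sgd instanceof consumerCopy.Optimizer;

// Structural ("duck-typed") check: succeeds, because it only inspects the shape.
function looksLikeOptimizer(x: unknown): x is OptimizerLike {
  if (typeof x !== 'object' || x === null) return false;
  const candidate = x as Partial<OptimizerLike>;
  return typeof candidate.learningRate === 'number' &&
      typeof candidate.applyGradients === 'function';
}
const passesDuckTyping = looksLikeOptimizer(sgd);
```

A common remedy in this situation is to make sure only one copy of the library is installed, for example by declaring it as a peer dependency. That is general packaging practice, not something confirmed as this project's fix.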

Unable to Connect to server…

I am trying to run the Hogwarts demo and am stuck at the following screen. Any ideas what the issue could be?

Terminal output:

When I run the Audio demo with yarn, I hit this problem: Could not load source file "../common.ts"

Could not load source file "../common.ts" in source map of "node_modules/federated-learning-client/dist/common.js".

No save handlers found for the URL

Hi, I have been trying to run the tff server and I am facing the issue below. Can anyone please help me out?

(node:19400) UnhandledPromiseRejectionWarning: Error: Cannot find any save handlers for URL 'file://D:Projects/1554387814204'
warning.js:18
    at new ValueError (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\errors.js:36:28)
    at Model.<anonymous> (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:1161:39)
    at step (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:42:23)
    at Object.next (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:23:53)
    at D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:17:71
    at new Promise (<anonymous>)
    at __awaiter (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:13:12)
    at Model.save (D:\Projects\tff_server\node_modules\@tensorflow\tfjs-layers\dist\engine\training.js:1153:16)
    at FederatedServerTfModel.<anonymous> (d:\Projects\tff_server\federated-learning-server\models.ts:111:22)
    at step (D:\Projects\tff_server\federated-learning-server\dist\models.js:42:23)
(node:19400) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
warning.js:18
(node:19400) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
warning.js:18

Error Hogwarts Demo

Hi there, I can't start the Hogwarts demo. I am using macOS.

[email protected] linked ==> /Users/21222433/Documents/TensorFlow_Fed/demo/audio/client/node_modules/federated-learning-client
events.js:167
throw er; // Unhandled 'error' event
^

Error: spawn parcel ENOENT
at Process.ChildProcess._handle.onexit (internal/child_process.js:232:19)
at onErrorNT (internal/child_process.js:407:16)
at process._tickCallback (internal/process/next_tick.js:63:19)
at Function.Module.runMain (internal/modules/cjs/loader.js:745:11)
at startup (internal/bootstrap/node.js:279:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:696:3)
Emitted 'error' event at:
at Process.ChildProcess._handle.onexit (internal/child_process.js:238:12)
at onErrorNT (internal/child_process.js:407:16)
[... lines matching original stack trace ...]
at bootstrapNodeJSCore (internal/bootstrap/node.js:696:3)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
cp: dist/*: No such file or directory
2018-09-12 14:12:26.826263: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
(node:10800) UnhandledPromiseRejectionWarning: FetchError: invalid json response body at https://storage.googleapis.com/tfjs-speech-command-model-14w/model.json reason: Unexpected token < in JSON at position 0
at /Users/u719611/Documents/TensorFlow_Fed/demo/audio/server/node_modules/node-fetch/lib/index.js:239:32
at process._tickCallback (internal/process/next_tick.js:68:7)
(node:10800) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
(node:10800) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Handling multiple weight updates from same client

This might not be something we decide right now, but I'm creating an issue to keep track of the discussion.

In the API right now, we treat weight updates from the same client (whom we identify solely via their autogenerated websocket id) as completely independent. If a client sends 10 weight updates, each with numExamples=1, we assume they labeled 10 separate examples, used the same ModelFitConfig to modify the original weights based on those examples, and sent us the results. This should work and allow us to learn, but another option would be to have the client retrain the model each time (starting from the original weights) using all of the data they've labeled so far. In that case, they would send us successive updates with numExamples=1, then numExamples=2, and so on until numExamples=10, and when it comes time to average, we would only consider the latest update the client sent us.

Here are some thoughts about the strengths of each strategy:

Good things about keeping same-client updates totally independent:

We don't have to store old labeled examples on the client side; the client can throw away data as soon as it's been used to compute a weight update.
Computing weight updates might be a little faster, since the batch size will always be 1.
If we save old examples and a client disconnects and reconnects, it's possible they will either (a) keep the same client ID but lose the examples, or (b) keep the old examples but get assigned a new client ID. If either of these things happens, and we have superseding logic, the server might throw away updates for certain examples (under a) or double count them (under b). Having more persistent IDs might help address this, but it's nice to not have to worry about it.

Good things about letting same-client updates supersede each other:

It might lead to faster learning. The results from a very basic experiment suggest that, when clients take multiple local SGD steps, we learn slightly faster from 10 clients with 3 samples each than we do with 33 clients with 1 sample each. It's not clear how those results will generalize / how significant they are, but it makes intuitive sense that having only a single example would preclude us from taking more than a couple SGD steps. Taking multiple steps on the client is really important for learning faster overall. (Note that we could also experiment with federated averaging using different numbers of local SGD steps based on the number of examples!)
Learning might be comparatively even faster because of privacy concerns. If it turns out that sending an exact weight update for just one example lets us reconstruct the original input on the server, then the client might need to add a lot of extra noise to the weight update. However, if the client ran SGD with multiple examples, maybe they could get away with adding less noise to the new weight update, making it even more useful. Of course, now multiple files on the server will contain information about the same examples, so maybe we introduce a new privacy issue!
In cases where users are using the client app for a long time and actually care about improving the local versions of their model right now, it might be nice to use an updated version of the local model trained on all the data points. Computing weight updates for one example at a time wouldn't help with that, whereas recomputing the update for all the examples would give us the new model for free.

Either way:

We should compute and upload weight updates as soon as we can on the client side, whether they're for a single example or for many, since the user might leave the page at any time.
To prevent race conditions, we should never overwrite records or files on the server. If we have multiple updates from the same client that supersede each other, we just read all of them and take the latest, which is inherently safe / fails gracefully (worst that can happen is we miss an update).
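
The two strategies differ only in which updates reach the averaging step. The sketch below assumes a hypothetical `WeightUpdate` record (client id, receive order, `numExamples`, and a flat weight vector) and a mean weighted by `numExamples`. None of these names come from the library, and the weighting is an assumption about how averaging might be done. `keepLatestPerClient` implements the superseding rule by reading every stored update and picking the newest per client, so nothing on the server is overwritten.

```typescript
// Hypothetical record shape; the real server may store updates differently.
interface WeightUpdate {
  clientId: string;
  receivedAt: number;   // monotonically increasing receive order
  numExamples: number;
  weights: number[];    // flattened model weights
}

// Superseding strategy: keep only the most recent update from each client.
function keepLatestPerClient(updates: WeightUpdate[]): WeightUpdate[] {
  const latest = new Map<string, WeightUpdate>();
  for (const u of updates) {
    const seen = latest.get(u.clientId);
    if (seen === undefined || u.receivedAt > seen.receivedAt) {
      latest.set(u.clientId, u);
    }
  }
  return Array.from(latest.values());
}

// Mean of the weight vectors, weighted by each update's numExamples.
function weightedMean(updates: WeightUpdate[]): number[] {
  if (updates.length === 0) throw new Error('no updates to average');
  const total = updates.reduce((sum, u) => sum + u.numExamples, 0);
  const mean = new Array<number>(updates[0].weights.length).fill(0);
  for (const u of updates) {
    u.weights.forEach((w, i) => { mean[i] += w * u.numExamples / total; });
  }
  return mean;
}

const updates: WeightUpdate[] = [
  {clientId: 'a', receivedAt: 1, numExamples: 1, weights: [1, 0]},
  {clientId: 'a', receivedAt: 2, numExamples: 2, weights: [4, 0]},
  {clientId: 'b', receivedAt: 3, numExamples: 2, weights: [0, 4]},
];

// Independent strategy: every update counts, including both from client 'a'.
const independent = weightedMean(updates);
// Superseding strategy: only the latest update from 'a' and the one from 'b'.
const superseding = weightedMean(keepLatestPerClient(updates));
```

Because the superseding filter is a pure function over the stored updates, a missed or late update only changes which record counts as "latest" and never corrupts an earlier one.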

Sorry if this got super long, but hopefully this will be useful for reference later.

Error while running tff server

Hi, this is great work. I tried running the server using the code below:

import * as http from 'http';
import * as federated from 'federated-learning-server';

const INIT_MODEL = 'file:///initial/model.json';
const webServer = http.createServer(); // can also use https
const fedServer = new federated.Server(webServer, INIT_MODEL);

fedServer.setup().then(() => {
  webServer.listen(80);
});

I passed in my pretrained Keras model, which was converted to tfjs format, and I am getting the error below:
(node:8028) UnhandledPromiseRejectionWarning: Error: Unknown loss categoricalCrossEntropy
    at new ValueError (D:\Projects\federated-learning\src\server\node_modules\@tensorflow\tfjs-layers\dist\errors.js:36:28)
Can you please help out? TIA
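
The identifier in the error message, `categoricalCrossEntropy`, differs by one letter's case from `categoricalCrossentropy`, the camel-case spelling that TensorFlow.js Layers accepts for this loss. Whether that mismatch is the actual cause here is an assumption. The sketch below, using the hypothetical names `KNOWN_LOSSES` and `normalizeLossName`, shows a case-insensitive lookup that maps either spelling to the canonical one; the list is a small illustrative subset, not the library's registry.

```typescript
// Illustrative subset of loss identifiers in their camel-case spelling.
const KNOWN_LOSSES = [
  'meanSquaredError',
  'categoricalCrossentropy',
  'sparseCategoricalCrossentropy',
  'binaryCrossentropy',
];

// Hypothetical helper: resolve a loss name regardless of letter case or
// underscores (so the Keras-style 'categorical_crossentropy' also matches).
function normalizeLossName(name: string): string {
  const key = name.replace(/_/g, '').toLowerCase();
  const match = KNOWN_LOSSES.find((known) => known.toLowerCase() === key);
  if (match === undefined) {
    throw new Error(`Unknown loss ${name}`);
  }
  return match;
}

const fixed = normalizeLossName('categoricalCrossEntropy');
```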

Deployment instructions for hogwarts

When trying to deploy the hogwarts demo using Docker, it fails to build. Is there any specific command you require?

RangeError: Invalid typed array length: 216

I get this error after running yarn dev in the demo/emoji_hunt/server demo.

$ yarn postinstall && ./deploy.sh
$ ./postinstall.sh
$ rm -rf dist/ && yarn && yarn build && yalc publish
[1/4] 🔍  Resolving packages...
success Already up-to-date.
$ tsc
[email protected] published in store.
$ yarn postinstall && parcel build index.html
$ ./postinstall.sh
$ rm -rf dist/ && yarn && yarn build && yalc publish
[1/4] 🔍  Resolving packages...
success Already up-to-date.
$ tsc
[email protected] published in store.
[email protected] linked ==> /Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/client/node_modules/federated-learning-client
[1/4] Resolving packages...
[2/4] Fetching packages...
warning Pattern ["babel-core@^6.26.3"] is trying to unpack in the same destination "/Users/loretoparisi/Library/Caches/Yarn/v1/npm-babel-core-6.26.3-b2e2f09e342d0f0c88e2f02e067794125e75c207" as pattern ["babel-core@^6.25.0","babel-core@^6.26.0"]. This could result in non-deterministic behavior, skipping.
[3/4] Linking dependencies...
warning " > @tensorflow/[email protected]" has unmet peer dependency "@tensorflow/tfjs-core@~0.12.4".
error An unexpected error occurred: "Reduce of empty array with no initial value".
info If you think this is a bug, please open a bug report with the information provided in "/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/client/yarn-error.log".
info Visit https://yarnpkg.com/en/docs/cli/add for documentation about this command.
🚨  /Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/client/index.html: Failed to install babel-core.
    at PromiseQueue.install [as process] (/usr/local/lib/node_modules/parcel/src/utils/installPackage.js:46:11)
    at <anonymous>
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
[email protected] linked ==> /Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/node_modules/federated-learning-server
error Command "parcel" not found.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
2018-10-22 18:56:52.407559: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
1
(node:74723) Warning: N-API is an experimental feature and could change at any time.
RangeError: Invalid typed array length: 216
    at typedArrayConstructByArrayBuffer (<anonymous>)
    at new Float32Array (native)
    at /Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:391:19
    at Array.map (<anonymous>)
    at flatDeserialize (/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:388:24)
    at FederatedServerDynamicModel.<anonymous> (/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:334:36)
    at step (/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:42:23)
    at Object.next (/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:23:53)
    at fulfilled (/Users/loretoparisi/Documents/Projects/AI/federated-learning/demo/emoji_hunt/server/.yalc/federated-learning-server/dist/models.js:14:58)
    at <anonymous>
✨  Done in 21.70s.

Tutorial?

Hello,

Is there a tutorial to get started on this repo? I would like to run it on 1 server and n clients to see how it works.

I started with the demos but could not get them to run; I hit the same issues that other people have reported.

Is there a tutorial or any other help?
