skycoin / skywire

Skywire Node implementation

Dockerfile 0.04% Go 46.83% Shell 0.78% HTML 7.86% Makefile 0.56% JavaScript 1.10% TypeScript 22.77% CSS 0.07% SCSS 2.99% PowerShell 0.17% Batchfile 0.08% Java 16.75%
vpn meshnet software-defined-network

skywire's People

Contributors

0pcom, 4rchim3d3s, alexadhy, arc1999, asgaror, atang152, ayuryshev, bigookie, darkren, dharmendrakariya, ersonp, evanlinjin, fray, gz-c, i-hate-nicknames, ivcosla, jdknives, kifen, mrpalide, mungujn, nkryuchkov, pikomonde, ppcamp, senyoret1, specter25, taras-skycoin, xpecex

skywire's Issues

Make /exec endpoint consistent with others

While all endpoints for node management look like /nodes/{pk}/{action}, the endpoint for command execution looks like /exec/{pk}, which is inconsistent with the others and violates REST.

I think /nodes/{pk}/exec would look better.
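
To make the proposed shape concrete, here is a minimal sketch of a path parser that accepts only /nodes/{pk}/{action}. The function name parseNodeAction and the sample key abc123 are hypothetical and used only for illustration; this is not the hypervisor's actual router.

```go
package main

import (
	"fmt"
	"strings"
)

// parseNodeAction splits a path of the form /nodes/{pk}/{action} into its
// parts. It reports false for any other shape, including the old /exec/{pk}.
// Hypothetical helper, not part of the real hypervisor code.
func parseNodeAction(path string) (pk, action string, ok bool) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 3 || parts[0] != "nodes" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	for _, p := range []string{"/nodes/abc123/exec", "/exec/abc123"} {
		pk, action, ok := parseNodeAction(p)
		fmt.Printf("%-20s pk=%q action=%q ok=%v\n", p, pk, action, ok)
	}
}
```

With a single shape for every node operation, the exec action can be dispatched by the same code path as the other actions.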

Document all the possible Hypervisor auth states

Feature description

Describe the feature
Document all the possible Hypervisor auth states, to make it easier to consume the API correctly.

Is your feature request related to a problem? Please describe.

Describe the solution you'd like
When creating documentation about the Hypervisor API, please add a section about all the possible results related to auth, like what happens when the session is not valid, what is the normal validity duration of a session cookie, if the user must change the password after a certain amount of time and all other relevant information.

Describe alternatives you've considered

Additional context
The information is needed to be able to use the API in an effective way.

Possible implementation

Implement backoff for transport reconnection

Feature description

Currently, when one edge of a transport goes down, the other edge tries to re-establish the transport every three seconds. We should implement a backoff for re-establishing the transport. @evanlinjin suggested using the retry logic in /internal/netutil. We also need to improve logging, so that messages like the following do not clutter stdout:

[2019-11-13T10:59:10+08:00] WARN [tp:03c775]: failed to redial underlying connection: dial tcp 192.168.0.107:7777: connect: connection refused

  • improve logging
  • implement retrial backoff
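
A minimal sketch of what the two tasks could look like, assuming an exponential backoff that starts at a base delay and is capped at a maximum. The names nextDelay, redialWithBackoff and errRefused are hypothetical; this does not reproduce the retry logic in /internal/netutil, and the "log every 10th failure" policy is only one possible way to reduce log noise.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRefused stands in for a failed dial in this sketch.
var errRefused = errors.New("connection refused")

// nextDelay returns the wait before retrying after the given failed attempt
// (0-based): base doubled once per attempt, capped at max.
func nextDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

// redialWithBackoff calls dial until it succeeds or maxAttempts (which must
// be positive) is reached. Only the first failure and every 10th one after
// it are logged, so a long outage does not flood stdout.
func redialWithBackoff(dial func() error, maxAttempts int, base, max time.Duration) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = dial(); err == nil {
			return nil
		}
		if attempt%10 == 0 {
			fmt.Printf("redial failed (attempt %d): %v\n", attempt+1, err)
		}
		if attempt+1 < maxAttempts {
			time.Sleep(nextDelay(attempt, base, max))
		}
	}
	return err
}

func main() {
	calls := 0
	dial := func() error {
		calls++
		if calls < 3 {
			return errRefused
		}
		return nil
	}
	err := redialWithBackoff(dial, 5, 10*time.Millisecond, 40*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
```

In a real transport the base would likely be the current three seconds, with the cap chosen to bound how long a recovered peer waits to be redialed.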

Invalid rule panic reappears.

panic: invalid rule

goroutine 66 [running]:
github.com/skycoin/skywire/pkg/routing.Rule.TransportID(0x7ff4f309d139, 0x36, 0x36, 0x0, 0x0)
        /home/evanlinjin/dev/skycoin/skywire/pkg/routing/rule.go:70 +0x13f
github.com/skycoin/skywire/pkg/router.(*Router).forwardPacket(0xc000176af0, 0xe1c480, 0xc0000c8010, 0xc000358266, 0xa, 0xa, 0x7ff4f309d139, 0x36, 0x36, 0x93c947, ...)
        /home/evanlinjin/dev/skycoin/skywire/pkg/router/router.go:219 +0x28f
github.com/skycoin/skywire/pkg/router.(*Router).handlePacket(0xc000176af0, 0xe1c480, 0xc0000c8010, 0xc000358260, 0x10, 0x10, 0xe0ef20, 0xc000162290)
        /home/evanlinjin/dev/skycoin/skywire/pkg/router/router.go:143 +0x4d4
github.com/skycoin/skywire/pkg/router.(*Router).Serve.func1(0xc000176af0, 0xe1c480, 0xc0000c8010)
        /home/evanlinjin/dev/skycoin/skywire/pkg/router/router.go:116 +0x107
created by github.com/skycoin/skywire/pkg/router.(*Router).Serve
        /home/evanlinjin/dev/skycoin/skywire/pkg/router/router.go:110 +0x109

[M2] Keep-alive packet not being propagated through all the route

Describe the bug
When a node receives a keep-alive packet, it updates the activity of the rule associated with the route. This works fine, but the packet is not propagated further along the route. With 3 nodes, only one of which is an intermediary, there is no problem. But if we add more intermediary nodes, they do not receive the keep-alive packet, so the route breaks as soon as data packets stop flowing through the network.

Actual behavior
Without data packets being transmitted, a route with 2+ intermediary nodes breaks after the rule timeout.

Expected behavior
Keep-alive packets are handled by all the nodes along the route, which updates rule activity and prevents rules from being removed.

Possible implementation
There is a function that handles the keep-alive packet. We just need to forward the packet down the line, which should be easy to implement.
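
The idea can be sketched with a toy model in which each hop refreshes its own rule activity and then passes the packet on. The hop type, its fields and buildRoute are hypothetical stand-ins; the real router works with routing rules and transports rather than a linked list.

```go
package main

import "fmt"

// hop is a hypothetical stand-in for one visor on a route.
type hop struct {
	name         string
	lastActivity int  // logical timestamp of the last activity on the rule
	next         *hop // nil on the last hop of the route
}

// handleKeepAlive refreshes this hop's rule activity and then forwards the
// packet to the next hop, so every node along the route sees it.
func (h *hop) handleKeepAlive(now int) {
	h.lastActivity = now
	if h.next != nil {
		h.next.handleKeepAlive(now)
	}
}

// buildRoute links the named hops into a route, in order.
func buildRoute(names ...string) *hop {
	var head, prev *hop
	for _, n := range names {
		cur := &hop{name: n}
		if prev == nil {
			head = cur
		} else {
			prev.next = cur
		}
		prev = cur
	}
	return head
}

func main() {
	route := buildRoute("A", "B", "C", "D")
	route.handleKeepAlive(42)
	for h := route; h != nil; h = h.next {
		fmt.Println(h.name, h.lastActivity)
	}
}
```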

Improvements for the Manager UI

This is a list of some of the improvements that should be added to the manager:

  • Limit the number of elements (transports, routes and apps) that can be shown in the summary page and create dedicated pages for showing the full lists, with pagination.

  • The UI should show loading animations when getting the data from the hypervisor for populating the screen for the first time, instead of showing incomplete or outdated UI hoping to get the data soon.

  • Fix the data that is shown in the bar at the right of the summary page, as it is currently showing test data and its structure may not suit the data available in the mainnet.

  • The buttons on the bar that is shown at the right of the summary page should be more homogeneous, as the bar currently mixes tabs, options and navigation controls, almost without any visual distinction. The buttons in the bar should all work in a similar way, to avoid confusing the user, so the tabs should be moved elsewhere.

  • The modal window that shows the logs of an app needs pagination options.

  • The design should be modified to make it work well in smaller window sizes.

  • All hardcoded texts must be replaced with variables from the language file.

  • The option for changing the language should be made always visible, or at least easier to find when using the app for the first time. This is not relevant until adding more languages.

  • The language file management system of the Skycoin wallet, for detecting which languages must be updated, should be added to the manager.

  • When making remote operations, the UI should be blocked, to avoid having the user quickly sending multiple requests. Also, the feedback must be better.

  • The structure of the CSS styles should be simplified.

  • Changes should be made in the UI files to ensure more consistency in the design and make it easier to make future modifications to the app. Modularization and following the DRY philosophy would be good for this.

  • The old unused code should be deleted.

  • It should be possible to control how often the data is updated, as it can be done in the testnet.

  • The app should give the user better feedback about the data updates and any problems in the process.

  • The forms should work better with the keyboard, especially for sending the data by pressing the Enter key.

  • The way in which errors are processed and displayed to the user must be standardized.

  • The code that makes periodic operations and retries in case of error should be more robust.

  • All subscriptions must be checked to be sure that all is being handled correctly, including taking measures to avoid problems when multiple subscriptions are added to a single Subscription object and one fails ( skycoin/skycoin-web#572 ).

  • The snackbar should be improved. It should be easier for the user to identify if a message is an error, a warning or a confirmation. It should also avoid hiding the messages too quickly.

  • There should be an option for changing the autostart configuration of the apps. Depends on #27 , due to the API endpoint that must be used.

  • An option for creating routes should be added. (Delayed for a future version)

  • In most cases modal windows should be closed if the app navigates away from the current page, which could happen after some errors.

  • The app should ask for confirmation before making some dangerous operations (mainly deleting things).

In addition to this, dedicated controls should be added for the default apps, in the apps page, just as the testnet manager has dedicated controls for “Connect to Node”, “SSH Server” and “SSH Client”. However, this requires more information about the default apps, and some modifications may be needed in the visor, hypervisor and the skywire-services repository.

There are some minor improvements that were not listed, and some additional major changes may become evident in the future.

Routes are removed after several hours

Describe the bug
When following the instructions in https://github.com/SkycoinProject/skywire-mainnet/tree/mainnet-milestone2/cmd/hypervisor to use the hypervisor API with the skywire-services repo, if some routes are created with curl --data {'"recipient":"'$PK_A'", "message":"Hello Joe!"}' -X POST $CHAT_C, the routes start to be returned by the hypervisor API, but are erased after several hours (around 6 hours), or at least the hypervisor API stops returning them.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
It is difficult to observe the problem and to give specific reproduction instructions, because it takes a long time to occur. However, I think it takes about 6 hours without using the hypervisor API for the problem to appear.

Actual behavior
The routes are deleted after several hours. In addition to that, running curl --data {'"recipient":"'$PK_A'", "message":"Hello Joe!"}' -X POST $CHAT_C does not recreate them.

Expected behavior
Routes should stay.

Additional context

Possible implementation

[M2] Socket files are not removed on visor shutdown

Describe the bug
Socket files (app server and dmsgpty) are not removed on visor shutdown. As a result, if we restart the visor, it will not run because the address is already in use.

Environment information:
Independent

Steps to Reproduce
Steps to reproduce the behavior:

  1. Run skywire visor
  2. Shutdown visor
  3. Run it again

Actual behavior
Visor fails to run because of address already in use

Expected behavior
Visor runs

Possible implementation
Remove socket files on visor shutdown
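
A minimal sketch of the failure and the fix, using only the Go standard library. The helper names cleanupSockets and demo and the file name app.sock are hypothetical. The sketch disables Go's default unlink-on-close behaviour to simulate a shutdown that leaves the socket file behind, shows that a second listen then fails, and shows that removing the file allows the restart.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net"
	"os"
	"path/filepath"
)

// cleanupSockets removes the given socket files, ignoring ones already gone.
// A visor could call this from its shutdown path (hypothetical helper).
func cleanupSockets(paths ...string) error {
	for _, p := range paths {
		if err := os.Remove(p); err != nil && !errors.Is(err, fs.ErrNotExist) {
			return err
		}
	}
	return nil
}

// demo simulates start, unclean shutdown, failed restart, cleanup, restart.
func demo() error {
	dir, err := os.MkdirTemp("", "visor-sock")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "app.sock")

	l, err := net.Listen("unix", path)
	if err != nil {
		return err
	}
	// Simulate a shutdown that leaves the socket file on disk.
	l.(*net.UnixListener).SetUnlinkOnClose(false)
	if err := l.Close(); err != nil {
		return err
	}

	// Restarting now fails because the stale file still occupies the address.
	if l2, err := net.Listen("unix", path); err == nil {
		l2.Close()
		return errors.New("expected the stale socket file to block the restart")
	}

	// Removing the file on shutdown lets the next start succeed.
	if err := cleanupSockets(path); err != nil {
		return err
	}
	l3, err := net.Listen("unix", path)
	if err != nil {
		return err
	}
	return l3.Close()
}

func main() {
	fmt.Println("restart cycle error:", demo())
}
```

An alternative to cleaning up at shutdown is to remove any stale file just before listening at startup, which also covers crashes where the shutdown path never runs.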

Endpoint for restarting visor from hypervisor

Feature description
The hypervisor needs to be able to restart the visor, because it is impractical to restart a large number of visors manually.

Describe the solution you'd like
We can create a new visor process with the same args and kill the current one.

At the visor start:

  1. Save the current working directory.
  2. Save the command line args: the relative binary path, binary args

At the visor restart:

  1. Since args may contain relative paths, either set the working directory to the saved one, or change relative paths in args to absolute ones using the saved working directory.
  2. Run a new visor process with the required args.
  3. Kill the current process.
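
The start and restart steps above can be sketched as follows, assuming the first option for relative paths (reusing the saved working directory). The type restartContext and its methods are hypothetical. The sketch only builds the replacement command; it deliberately does not start it or exit.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// restartContext captures, at startup, what is needed to re-run the binary.
type restartContext struct {
	workDir string   // working directory at startup
	args    []string // os.Args at startup: binary path followed by its args
}

// captureRestartContext should be called early in main, before anything
// changes the working directory.
func captureRestartContext() (*restartContext, error) {
	wd, err := os.Getwd()
	if err != nil {
		return nil, err
	}
	args := make([]string, len(os.Args))
	copy(args, os.Args)
	return &restartContext{workDir: wd, args: args}, nil
}

// command builds (but does not start) the replacement process. Setting Dir
// to the saved working directory keeps relative paths in args valid.
func (c *restartContext) command() *exec.Cmd {
	cmd := exec.Command(c.args[0], c.args[1:]...)
	cmd.Dir = c.workDir
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd
}

func main() {
	ctx, err := captureRestartContext()
	if err != nil {
		fmt.Println("capture failed:", err)
		return
	}
	cmd := ctx.command()
	// A real restart would call cmd.Start() here and then exit the current
	// process once its resources are released.
	fmt.Println("would run:", cmd.Args, "in", cmd.Dir)
}
```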

Describe alternatives you've considered
When going back to the beginning of main(), it might be very complicated to close all open resources.

Additional context
Running a new visor and killing the current one needs further study to find any pitfalls. For example, we need to ensure that all resources needed by the new visor are freed by the old one before the new one starts.

It is not possible to delete transports and routes using the hypervisor API

Describe the bug
In the master branch, when calling the DELETE /api/nodes/{pk}/transports/{tid} and DELETE /api/nodes/{pk}/routes/{routeId} API endpoints for deleting transports and routes, the operation fails.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Start some nodes using the make integration-run-generic command of the skywire-services repository.
  2. Create transports and routes.
  3. Try to delete the transports and routes using the DELETE /api/nodes/{pk}/transports/{tid} and DELETE /api/nodes/{pk}/routes/{routeId} API endpoints.

Actual behavior
When trying to delete the transports and routes, the operation always fails and the response is something similar to {"error":"can not find app of name from node 024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7"}

Expected behavior
The transports and routes should be deleted.

Additional context

Possible implementation

Visors cannot connect to hypervisor

Describe the bug
When running the integration environment, connections between visors and the hypervisor get stuck in dmsg code.

Environment information:
Independent.

Steps to Reproduce
Run the generic integration environment.

Actual behavior
Visors cannot connect to hypervisor.

Expected behavior
Visors connect to hypervisor.

nettest.TestConn fails with stcp.Conn

The test implemented in #42 for stcp.Conn panics with nil pointer dereference.

Sometimes it's TestConn/PingPong:

=== RUN   TestConn/PingPong
SIGN! len(b.Bytes) 396 dc56b62cf8625836528ccd75e696d5c65346df9433f49c8c0bc84d670e31975e
VERIFY! len(b.Bytes) 396 dc56b62cf8625836528ccd75e696d5c65346df9433f49c8c0bc84d670e31975e recovered: [2 5 203 190 126 217 33 250 75 59 226 126 36 57 212 254 13 226 200 213 128 212 136 120 236 13 119 155 170 154 29 162 152] <nil> expected: 0205cbbe7ed921fa4b3be27e2439d4fe0de2c8d580d48878ec0d779baa9a1da298
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13e13db]

goroutine 40 [running]:
github.com/SkycoinProject/skywire-mainnet/pkg/snet/stcp.(*Conn).Read(0x0, 0xc000264e10, 0x8, 0x8, 0xc0000396c0, 0x100be95, 0x1454020)
	<autogenerated>:1 +0x2b
io.ReadAtLeast(0x5809090, 0x0, 0xc000264e10, 0x8, 0x8, 0x8, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.13.1/libexec/src/io/io.go:310 +0x87
io.ReadFull(...)
	/usr/local/Cellar/go/1.13.1/libexec/src/io/io.go:329
golang.org/x/net/nettest.testPingPong.func1(0x15709a0, 0x0)
	/Users/nkryuchkov/skywire-mainnet/vendor/golang.org/x/net/nettest/conntest.go:108 +0x11b
created by golang.org/x/net/nettest.testPingPong
	/Users/nkryuchkov/skywire-mainnet/vendor/golang.org/x/net/nettest/conntest.go:137 +0x130

Sometimes it's TestConn/RacyRead:

=== RUN   TestConn/RacyRead
SIGN! len(b.Bytes) 400 921359adcec09b1e27215ca1d8a3143ebaefdd19188961e8e4839aa93a8cd2ef
VERIFY! len(b.Bytes) 400 921359adcec09b1e27215ca1d8a3143ebaefdd19188961e8e4839aa93a8cd2ef recovered: [3 180 67 123 20 223 155 216 138 227 38 155 141 154 215 180 147 72 69 43 108 148 180 210 149 137 11 66 24 104 184 76 202] <nil> expected: 03b4437b14df9bd88ae3269b8d9ad7b49348452b6c94b4d295890b421868b84cca
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13e162b]

goroutine 22 [running]:
github.com/SkycoinProject/skywire-mainnet/pkg/snet/stcp.(*Conn).Write(0x0, 0xc0001e6c00, 0x400, 0x400, 0x400, 0x0, 0x0)
	<autogenerated>:1 +0x2b
io.copyBuffer(0x1565240, 0xc000184170, 0x1565220, 0xc000184180, 0xc0001e6c00, 0x400, 0x400, 0xc000184180, 0xc000184180, 0x1466ea0)
	/usr/local/Cellar/go/1.13.1/libexec/src/io/io.go:404 +0x1fb
io.CopyBuffer(0x1565240, 0xc000184170, 0x1565220, 0xc000184180, 0xc0001e6c00, 0x400, 0x400, 0x13e0e03, 0x1570b80, 0xc0001ca280)
	/usr/local/Cellar/go/1.13.1/libexec/src/io/io.go:375 +0x82
golang.org/x/net/nettest.chunkedCopy(0x4960288, 0x0, 0x1564320, 0xc0000967b0, 0x10000c0000f9df0, 0x191c9a0)
	/Users/nkryuchkov/skywire-mainnet/vendor/golang.org/x/net/nettest/conntest.go:462 +0x11b
created by golang.org/x/net/nettest.testRacyRead
	/Users/nkryuchkov/skywire-mainnet/vendor/golang.org/x/net/nettest/conntest.go:148 +0xd0

The autostart value of the apps is not being restored after a restart

Describe the bug
If the autostart value of an application is changed by calling the PUT /api/visors/{pk}/apps/{app} API endpoint, the new value is not restored after stopping the visor and starting it again.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Start some nodes using the make integration-run-generic command of the skywire-services repository
  2. Call GET /api/nodes/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7. The autostart property of the skychat will be true
  3. Call PUT /api/nodes/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7/apps/skychat with {autostart: false} as content, to stop the skychat app.
  4. Call GET /api/nodes/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7 again. The autostart property of the skychat will be false
  5. Call make integration-teardown; tmux kill-server in the command window that is running the skywire-services test environment, to stop it. Then call make integration-run-generic to start it again.
  6. Call GET /api/nodes/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7 again. The autostart property of the skychat will be true.

Actual behavior
After restarting the visor, the value set by calling PUT /api/nodes/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7/apps/skychat is ignored.

Expected behavior
The value should survive the restart of the visor.

Additional context

Possible implementation

Add ability to delete entries in transport discovery.

Feature description

Currently, one can only update statuses of entries in transport discovery and cannot actually delete entries.

This causes issues when restarting transports. When we remove transports from visor, we only remove them locally. On visor startup, it polls transport discovery to determine saved transports. Hence, locally deleted transports are revived.

Implementation Details:

  • A transport between nodes A and B can only be deleted by A or B.

Tasks:

  • Add endpoint to transport discovery to delete transport.
  • Update transport discovery client.
  • Update (transport.Manager).DeleteTransport to call transport discovery's delete transport endpoint.
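
The authorization rule from the implementation details can be sketched with an in-memory store. The types entry and discovery, the method deleteTransport and the two error values are hypothetical; the real transport discovery is an HTTP service and would also have to authenticate the requester's public key.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errNotFound  = errors.New("transport not found")
	errForbidden = errors.New("only an edge of the transport may delete it")
)

// entry is a hypothetical transport record holding the two edge public keys.
type entry struct {
	edges [2]string
}

// discovery is an in-memory stand-in for the transport discovery store.
type discovery struct {
	mu      sync.Mutex
	entries map[string]entry
}

// deleteTransport removes the entry only if requesterPK is one of its edges.
func (d *discovery) deleteTransport(id, requesterPK string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	e, ok := d.entries[id]
	if !ok {
		return errNotFound
	}
	if requesterPK != e.edges[0] && requesterPK != e.edges[1] {
		return errForbidden
	}
	delete(d.entries, id)
	return nil
}

func main() {
	d := &discovery{entries: map[string]entry{"tp1": {edges: [2]string{"A", "B"}}}}
	fmt.Println("C deletes tp1:", d.deleteTransport("tp1", "C"))
	fmt.Println("A deletes tp1:", d.deleteTransport("tp1", "A"))
	fmt.Println("A deletes tp1 again:", d.deleteTransport("tp1", "A"))
}
```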

Add deployment flag to gen-config command

Feature description

There are currently two deployments in use:

  1. skycoin.com deployment (production)

  2. skywire.cc deployment (testing)

The default skywire-cli node gen-config command should create a config file pointing to skycoin.com. There should be a flag for generating a config that points to skywire.cc.

Describe the solution you'd like

Skywire.cc Deployment:

  1. http://routefinder.skywire.cc

  2. http://transport.discovery.skywire.cc

  3. http://dmsg.discovery.skywire.cc

  4. Setup Node PK: 026c5a07de617c5c488195b76e8671bf9e7ee654d0633933e202af9e111ffa358d

skycoin.com Deployment:

  1. http://transport.discovery.skywire.skycoin.com

  2. http://routefinder.skywire.skycoin.com

  3. http://messaging.discovery.skywire.skycoin.com

  4. Setup Node PK: 026c5a07de617c5c488195b76e8671bf9e7ee654d0633933e202af9e111ffa358d
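
A minimal sketch of how such a flag could select between the two sets of addresses listed above. The flag name testenv, the deployment struct and deploymentFor are hypothetical; the actual skywire-cli command and config format may differ.

```go
package main

import (
	"flag"
	"fmt"
)

// deployment groups the service addresses a generated config points to.
type deployment struct {
	RouteFinder        string
	TransportDiscovery string
	DmsgDiscovery      string
	SetupNodePK        string
}

// deploymentFor returns the skywire.cc (testing) addresses when testEnv is
// true and the skycoin.com (production) addresses otherwise.
func deploymentFor(testEnv bool) deployment {
	const setupPK = "026c5a07de617c5c488195b76e8671bf9e7ee654d0633933e202af9e111ffa358d"
	if testEnv {
		return deployment{
			RouteFinder:        "http://routefinder.skywire.cc",
			TransportDiscovery: "http://transport.discovery.skywire.cc",
			DmsgDiscovery:      "http://dmsg.discovery.skywire.cc",
			SetupNodePK:        setupPK,
		}
	}
	return deployment{
		RouteFinder:        "http://routefinder.skywire.skycoin.com",
		TransportDiscovery: "http://transport.discovery.skywire.skycoin.com",
		DmsgDiscovery:      "http://messaging.discovery.skywire.skycoin.com",
		SetupNodePK:        setupPK,
	}
}

func main() {
	// "testenv" is an assumed flag name, chosen only for this sketch.
	testEnv := flag.Bool("testenv", false, "generate a config for the skywire.cc deployment")
	flag.Parse()
	fmt.Printf("%+v\n", deploymentFor(*testEnv))
}
```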

Go build Skychat dysfunctional on RPi

Describe the bug
After cloning the repo and running

make build; make install

it returns

GO111MODULE=on go build -race -o ./apps/skychat.v1.0 ./cmd/apps/skychat	
go build: -race is only supported on linux/amd64, linux/ppc64le, linux/arm64, freebsd/amd64, netbsd/amd64, darwin/amd64 and windows/amd64
Makefile:90: recipe for target 'host-apps' failed
make: *** [host-apps] Error 2
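
One way to avoid this failure is to pass -race only on platforms that support it. The sketch below encodes the platform list exactly as quoted in the error message above, which reflects the Go version used at the time and may differ in other releases. The names raceSupported and buildArgs are hypothetical, and the real fix would more likely live in the Makefile.

```go
package main

import (
	"fmt"
	"runtime"
)

// raceSupported lists the GOOS/GOARCH pairs named in the error message.
var raceSupported = map[string]bool{
	"linux/amd64":   true,
	"linux/ppc64le": true,
	"linux/arm64":   true,
	"freebsd/amd64": true,
	"netbsd/amd64":  true,
	"darwin/amd64":  true,
	"windows/amd64": true,
}

// buildArgs returns the arguments for "go build", adding -race only when
// the target platform is in the supported list.
func buildArgs(goos, goarch, out, pkg string) []string {
	args := []string{"build"}
	if raceSupported[goos+"/"+goarch] {
		args = append(args, "-race")
	}
	return append(args, "-o", out, pkg)
}

func main() {
	args := buildArgs(runtime.GOOS, runtime.GOARCH, "./apps/skychat.v1.0", "./cmd/apps/skychat")
	fmt.Println("go", args)
}
```

On a 32-bit ARM Raspberry Pi (linux/arm) this yields a build command without -race.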

[M2] Check behavior of route groups

We need to check the behavior of route groups. Specifically, consider 2 route groups communicating. If the visor hosting one of them goes down and the second one still tries to communicate, what errors will it experience? Will there be errors at all? What should be considered correct behavior in this case?

Improve app configurability from Hypervisor

Feature description

Currently, we can only set the autostart parameter for applications on a visor from the hypervisor and start/stop applications. skyproxy as well as dmsgpty require more control however.

Users should be able to

  • enable/disable authentication and change auth passcode for skyproxy
  • set and manage the whitelist for dmsgpty

Add missing options to the hypervisor

Feature description
The manager UI has some options in the testnet that can not be implemented in the mainnet with the current hypervisor API. The options are:

[screenshot: o1]

Those options are for setting the address of the discovery service, checking updates for the node/visor and restarting the node/visor. Also, the mainnet UI currently has an option for selecting if the visor should act as an exit node, but there is no API endpoint for making it work.

Is your feature request related to a problem? Please describe.
There are some options in the manager that are not working and can not be implemented with the current API.

Describe the solution you'd like
API endpoints must be added to make those options work, or the options should be removed from the UI.

Describe alternatives you've considered

Additional context

Possible implementation

[M2] Transport manager tests fail

Describe the bug
Two transport manager tests are failing randomly. Sometimes they pass, sometimes they don't. The failing tests are:

  • TestNewManager/check_read_write with error:
    image
    Which happens here:
    image

  • TestNewManager/check_tp_logs with error:
    image
    Which happens here:
    image
    This test also sometimes fails with a different error:
    image
    Which happens on these lines:
    image

These 2 are probably connected and happen because of a single bug

  • Also, the test sometimes hangs

Make visor listen for stcp transports automatically

Feature description

Get local IP address and make visor listen for stcp transports on that address and port 7777 by default.

Is your feature request related to a problem? Please describe.
Currently, we have to edit the configuration file manually to set up stcp transports and set the local IP and port to listen on. All of that should be automated.
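
A minimal sketch of the address selection, assuming that "local IP" means the first non-loopback, non-link-local IPv4 address of the host. The function names pickLocalIPv4 and stcpListenAddr are hypothetical, and a host with several interfaces may need a smarter choice than "first match".

```go
package main

import (
	"fmt"
	"net"
)

// pickLocalIPv4 returns the first usable IPv4 address among the given
// CIDR strings (the format produced by net.InterfaceAddrs on most systems),
// skipping loopback and link-local addresses.
func pickLocalIPv4(cidrs []string) (string, bool) {
	for _, c := range cidrs {
		ip, _, err := net.ParseCIDR(c)
		if err != nil || ip.IsLoopback() || ip.IsLinkLocalUnicast() {
			continue
		}
		if v4 := ip.To4(); v4 != nil {
			return v4.String(), true
		}
	}
	return "", false
}

// stcpListenAddr joins the chosen address with the default port 7777.
func stcpListenAddr(cidrs []string) (string, bool) {
	ip, ok := pickLocalIPv4(cidrs)
	if !ok {
		return "", false
	}
	return net.JoinHostPort(ip, "7777"), true
}

func main() {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		fmt.Println("cannot list interface addresses:", err)
		return
	}
	cidrs := make([]string, 0, len(addrs))
	for _, a := range addrs {
		cidrs = append(cidrs, a.String())
	}
	addr, ok := stcpListenAddr(cidrs)
	fmt.Println("listen address:", addr, "found:", ok)
}
```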

Update Skywire Cli

We should rename skywire-cli node to visor for consistency.

  • rename skywire-cli node to skywire-cli visor

Milestone 1 Production Testing

We need to test milestone 1 in production. These are the functionalities we need to test for:

  • run proxy over TCP transport

  • run SSH over TCP transport

  • run chat over TCP transport

  • run hypervisor over production

Error trying to start an app that is already running

Describe the bug
When calling PUT /api/visors/{pk}/apps/{app} with "status": 1 to start the skychat app when it is already running, the console shows Failed to start app skychat: app skychat is already started, but the API does not return anything, so the client waits indefinitely for a response. If the same endpoint is called with "status": 0 while waiting for the response, the visor sometimes stops being accessible using the hypervisor API.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Start some nodes using the make integration-run-generic command of the skywire-services repository

  2. Call http://{localIp}:8080/api/visors/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7/apps/skychat using PUT with { "autostart": true, "status": 1 } as content. You should not get a response from this call.

  3. Still waiting for the response for the request made in step 2, make a similar request again, but changing the content to { "autostart": true, "status": 0 }.

  4. Try repeating steps 2 and 3, so that you keep trying to start an already started app and then stopping it, until you start getting "error": "connection is shut down" as the response. You may not need to repeat the steps for this to happen; you may get that response on the first try.

Actual behavior
The call made in step 2 never returns a valid response, even though the console shows Failed to start app skychat: app skychat is already started, which leaves a client app unable to give feedback to the user.

Also, the whole procedure eventually makes the console show Injected [CLOSE]: Closing stream... received=<type:CLOSE><id:… or Rejected [ACK]: Failed to grow remote window. error="local record of remote window has become invalid:…, after which it is not possible to access the visor using the hypervisor API, as it starts returning "error": "connection is shut down" all the time.

In fact, there are other ways to make the visor inaccessible via the hypervisor API after step 2, but the process is a bit erratic, so I do not have a specific series of steps to make it happen.

Expected behavior
The call made in step 2 should return an error, and the procedure should not make the visor inaccessible.
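
A minimal sketch of the expected behavior, in which an "already started" error is turned into an HTTP response instead of leaving the client waiting. The names appManager, statusFor and writeResult are hypothetical, and the choice of 409 Conflict is an assumption; the real hypervisor may use a different status code or error format.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

var errAlreadyStarted = errors.New("app is already started")

// appManager is a hypothetical stand-in that tracks which apps are running.
type appManager struct {
	mu      sync.Mutex
	running map[string]bool
}

// start marks the app as running, or fails if it already is.
func (m *appManager) start(name string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.running == nil {
		m.running = make(map[string]bool)
	}
	if m.running[name] {
		return errAlreadyStarted
	}
	m.running[name] = true
	return nil
}

// statusFor maps a start result to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errAlreadyStarted):
		return http.StatusConflict
	default:
		return http.StatusInternalServerError
	}
}

// writeResult always writes a response, so the client never hangs.
func writeResult(w http.ResponseWriter, err error) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(statusFor(err))
	body := map[string]string{"status": "ok"}
	if err != nil {
		body = map[string]string{"error": err.Error()}
	}
	json.NewEncoder(w).Encode(body)
}

func main() {
	m := &appManager{}
	for i := 0; i < 2; i++ {
		rec := httptest.NewRecorder()
		writeResult(rec, m.start("skychat"))
		fmt.Print(rec.Code, " ", rec.Body.String())
	}
}
```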

Additional context

Possible implementation

It is not possible to make operations related to the routes with the API

Describe the bug
The hypervisor API has various endpoints for working with routes, like GET /nodes/{pk}/routes/{rid}. The problem is that it is not possible to get the value that must be sent as the {rid} param, as the hypervisor API does not have an endpoint for getting the ID of the routes.

Calling the GET /nodes/{pk}/routes endpoint only returns a key property for each route, which worked before when sent as the {rid} param, but not anymore, as the API responds error: "invalid UUID length: 0".

Possible implementation
2 possible solutions are:

  • Add the ID to the GET /nodes/{pk}/routes endpoint.
  • Make the API endpoints related to the routes work if the key property returned by the GET /nodes/{pk}/routes endpoint is sent as the {rid} param.

Multiple errors running the hypervisor and visor on mainnet-milestone2

Describe the bug
Using the mainnet-milestone2 branch, the hypervisor does not run. Also, the visor has problems connecting to the dmsg server.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Run make install using the mainnet-milestone2 branch.

  2. Run hypervisor gen-config -o hypervisor-config.json, to create a default configuration file for the hypervisor.

  3. Run hypervisor. It will fail with Failed to parse rpc port from rpc address: parse :7080: missing protocol scheme.

  4. Open the configuration file created in the second step and change the value of rpc_addr to localhost:7080.

  5. Run hypervisor again. It will fail with Failed to parse rpc port from rpc address: strconv.ParseUint: parsing "": invalid syntax.

  6. Open the configuration file created in the second step again and change the value of rpc_addr to http://skycoin.net:7080.

  7. Run hypervisor again. The console will start displaying no dms_servers found: trying again in 1s... error="Get /dmsg-discovery/available_servers: unsupported protocol scheme """. The problem with the rpc address appears to be solved.

  8. Open the configuration file created in the second step again and change the value of dmsg_discovery to https://messaging.discovery.skywire.skycoin.net.

  9. Run hypervisor again. The console will start displaying no dms_servers found: trying again in 1s... error="Get https://messaging.discovery.skywire.skycoin.net/dmsg-discovery/available_servers: dial tcp: lookup messaging.discovery.skywire.skycoin.net: no such host".

In a similar way, trying to run a visor with the default config results in no dms_servers found: trying again in 1s... error="json: cannot unmarshal number into Go value of type disc.HTTPMessage" being displayed every second in the console.
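
The first two errors in the steps above suggest that the port is being extracted by parsing rpc_addr as a URL. As a hedged sketch, a host:port value such as the generated default :7080 can instead be split with the standard library's net.SplitHostPort. The function name rpcPort is hypothetical, and this does not reproduce the hypervisor's actual parsing code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// rpcPort extracts the port from a host:port address such as ":7080" or
// "localhost:7080". Values with a URL scheme are rejected by SplitHostPort.
func rpcPort(addr string) (uint16, error) {
	_, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return 0, err
	}
	p, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil {
		return 0, err
	}
	return uint16(p), nil
}

func main() {
	for _, addr := range []string{":7080", "localhost:7080", "http://skycoin.net:7080"} {
		port, err := rpcPort(addr)
		fmt.Printf("%-26q port=%d err=%v\n", addr, port, err)
	}
}
```

With this approach the default value written by gen-config parses without the user having to add a scheme.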

Actual behavior
The errors previously described appear. Also, the hypervisor is not connecting with the visors, as it does in the master branch.

Expected behavior
The hypervisor and the visor should work like in the master branch, with the default config. If it is currently necessary to run a local dmsg server for making the visor and hypervisor work, it would be good to add instructions in the readme, and it could be also useful in other cases.

Additional context

Possible implementation

Service Discovery Client

Feature description

https://github.com/SkycoinPro/skywire-service-proxy-discovery

We need a client for this discovery service. This discovery enables Visors to advertise themselves with their geolocation to other visors wanting to connect.

Is your feature request related to a problem? Please describe.
Users should be able to connect to visors with certain properties (geolocation, connection characteristics etc). Most important for now is the location.

Visor apps fail to reconnect Streams

Under milestone2 branches, after a visor is restarted, communication between apps on different visors fails.

Steps to Reproduce

  1. Run generic integration environment
  2. Run make integration-startup
  3. Run curl --data {'"recipient":"'$PK_A'", "message":"Hello Joe!"}' -X POST $CHAT_C
  4. Check that on visor A the message appears and skychat has trashed the "Hello Joe!" message
  5. Stop and restart visor-c
  6. Run step 3 again. On node A you should now see No stream of given ID messages, so the message cannot be delivered to the skychat app.
    Additionally, if in step 5 we restart visor-b instead of visor-c, visors a and c continuously try to reconnect with it without success.

Attached are logs collected from the described scenario.
visor-a.txt
visor-b.txt
visor-c.txt

Multiple errors in production environment

Describe the bug
Multiple errors can be observed in production logs.

Environment information:
Production environment

Steps to Reproduce
Check production logs.

Actual behavior
Production services run with errors.

Expected behavior
Production services run with no errors.

Additional context

Error initiating server connections by initiator: findOrConnectToServers: all servers failed
read: connection reset by peer)
[2019-12-26T19:20:23Z] WARN [dms_server]: failed to write frame: write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection error="write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection" srcClient=0242880d65dda463b8a0ca630ab7ebbc98e2bd9fb5172536218ddde2f705827901
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53939: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dms_server]: failed to write frame: write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection error="write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection" srcClient=0242880d65dda463b8a0ca630ab7ebbc98e2bd9fb5172536218ddde2f705827901
[2019-12-26T19:20:23Z] INFO [dms_server]: ClosingConn connCount=3 error="read failed: EOF" srcClient=0242880d65dda463b8a0ca630ab7ebbc98e2bd9fb5172536218ddde2f705827901
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53940: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dms_server]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53939: use of closed network connection" srcClient=0242880d65dda463b8a0ca630ab7ebbc98e2bd9fb5172536218ddde2f705827901
[2019-12-26T19:20:23Z] INFO [dms_server]: connection with client 0242880d65dda463b8a0ca630ab7ebbc98e2bd9fb5172536218ddde2f705827901 closed: error(read failed: EOF)
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53939: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dms_server]: failed to write frame: write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection error="write error: write tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection" srcClient=030903ae94ad689968367b2a6618587688d584886bf660151e7a6d3eb477796604
[2019-12-26T19:20:23Z] INFO [dms_server]: ClosingConn connCount=2 error="read failed: EOF" srcClient=030903ae94ad689968367b2a6618587688d584886bf660151e7a6d3eb477796604
[2019-12-26T19:20:23Z] WARN [dms_server]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53940: use of closed network connection" srcClient=030903ae94ad689968367b2a6618587688d584886bf660151e7a6d3eb477796604
[2019-12-26T19:20:23Z] INFO [dms_server]: connection with client 030903ae94ad689968367b2a6618587688d584886bf660151e7a6d3eb477796604 closed: error(read failed: EOF)
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53940: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53939: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T19:20:23Z] WARN [dmsg]: Failed to close connection error="close tcp 10.244.2.149:8080->192.168.143.73:53912: use of closed network connection"
[2019-12-26T20:05:28Z] INFO [dms_server]: ClosingConn connCount=1 error="read failed: read tcp 10.244.2.149:8080->192.168.143.73:53891: read: connection reset by peer" srcClient=029ecd5bd8ff5c591931bb2d0ffc14cb09eb5c4707712df7e4776fe79f565d7a75
[2019-12-26T20:05:28Z] INFO [dms_server]: connection with client 029ecd5bd8ff5c591931bb2d0ffc14cb09eb5c4707712df7e4776fe79f565d7a75 closed: error(read failed: read tcp 10.244.2.149:8080->192.168.143.73:53891: read: connection reset by peer)
[dmsgC_httpS]: no dmsg_servers found: trying again in 1s... error="something unexpected happened"

Fix hypervisor endpoints

The following endpoints seem to be broken:

  • PUT /api/nodes/{pk}/apps/{apps}

  • POST /api/nodes/{pk}/transports

  • DELETE /api/nodes/{pk}/transports/{tid}

Visors are intermittently disconnecting from the hypervisor

Describe the bug
When using the hypervisor to get info about the visors, it is very common to find errors due to the visors being intermittently disconnected from the hypervisor.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
If you call GET http://{localIp}:8080/api/visors frequently, sometimes the response will have one or more visors with most fields empty.

Also, calling any of the API endpoints for getting info about a specific visor sometimes results in getting unexpected EOF or connection is shut down.

Actual behavior
The API is returning invalid responses when the connection to the visors is lost for a brief period of time.

The problem is frequent enough to be quite annoying in the manager UI.

Expected behavior
The API should return the expected responses.

Additional context

Possible implementation
If there is any serious complication for implementing a solution in the hypervisor, the client could implement something like a “noise cancellation” procedure to detect in which cases the hypervisor is just having a temporary disconnection. This would sometimes make the UI slower but should work.

If this is going to be done, it would be good to document the need to do so in some location related to the API, including ways to detect the disconnection and the amount of time in which a reconnection could be expected, so anyone using the API is aware of the need to do something similar.
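
A minimal sketch of what such a client-side “noise cancellation” procedure could look like, assuming the client polls the API periodically and only reports a visor as disconnected after several consecutive failed polls. The Debouncer type, its methods and the threshold value are hypothetical names for illustration; they are not part of the hypervisor or its API.

```go
package main

// Debouncer hides short connection drops from the UI by counting
// consecutive failed polls per visor. Hypothetical helper, not part
// of the hypervisor API.
type Debouncer struct {
	threshold int            // consecutive failures tolerated before reporting
	failures  map[string]int // visor public key -> current failure streak
}

// NewDebouncer returns a Debouncer that reports a visor as disconnected
// once it has failed `threshold` polls in a row.
func NewDebouncer(threshold int) *Debouncer {
	return &Debouncer{threshold: threshold, failures: make(map[string]int)}
}

// Observe records the result of one poll for the visor identified by pk
// and reports whether the visor should now be shown as disconnected.
func (d *Debouncer) Observe(pk string, ok bool) bool {
	if ok {
		// Any successful poll resets the streak.
		delete(d.failures, pk)
		return false
	}
	d.failures[pk]++
	return d.failures[pk] >= d.threshold
}
```

With a threshold of 3, a single response with empty fields or an unexpected EOF would be ignored, and the visor would only be flagged after three failed polls in a row. The price is the slower UI reaction mentioned above: a real disconnection is shown up to threshold polling intervals late.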

Backport master fixes to Milestone2

Describe the bug
Some fixes have been applied to master but not to milestone2. Notably, these include

  • fixes to the hypervisor made by Evan
  • changing app config values from slice to map
  • listening on stcp by default
  • transport deregistration logic changes
  • remove therealssh

Invalidate the Hypervisor session after changing the password

Feature description

Describe the feature
If the Hypervisor is ever going to be used remotely by various users, changing the password should act as a way to instantly block access by unauthorized users at any time. Invalidating all sessions of the user should be part of that.

Is your feature request related to a problem? Please describe.
After changing the password of an account, unauthorized users would still have access as long as they keep an earlier session open.

Describe the solution you'd like
All sessions of a particular account should be invalidated just after changing its password.

Describe alternatives you've considered

Additional context
This is not critical, as system administrators can delete all sessions by restarting the Hypervisor, but it would be more convenient.

Possible implementation
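
A rough sketch of the idea, assuming sessions are kept in memory and keyed by token. SessionStore and its methods are hypothetical names used only for illustration; the real Hypervisor session storage may look different.

```go
package main

import "sync"

// SessionStore is a hypothetical in-memory session table that maps a
// session token to the user name that owns it.
type SessionStore struct {
	mu       sync.Mutex
	sessions map[string]string
}

// NewSessionStore returns an empty store.
func NewSessionStore() *SessionStore {
	return &SessionStore{sessions: make(map[string]string)}
}

// Add registers a new session token for the given user.
func (s *SessionStore) Add(token, user string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[token] = user
}

// Valid reports whether the token belongs to a live session.
func (s *SessionStore) Valid(token string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.sessions[token]
	return ok
}

// InvalidateUser removes every session owned by user and returns how
// many were removed. It would be called right after a password change.
func (s *SessionStore) InvalidateUser(user string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for token, owner := range s.sessions {
		if owner == user {
			delete(s.sessions, token) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}
```

The password change handler would call InvalidateUser for the affected account once the new password is stored, so that any earlier session stops working immediately while sessions of other accounts are left alone.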

It is very difficult to delete the transports

Describe the bug
Using the hypervisor API it is very difficult to delete a transport from a visor, because after removing it from one visor, the other visor will create it again shortly after.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Create a transport with the POST /api/visors/{pk}/transports API endpoint.

  2. Delete the transport with the DELETE /api/visors/{pk}/transports/{tid} API endpoint.

  3. Use the GET /api/visors/{pk}/transports API endpoint to check the current transports list.

Actual behavior
If you call the GET /api/visors/{pk}/transports API endpoint just after deleting the transport, most likely the transport will not be listed, but after some time it will be there again, as the other visor will create it again.

Expected behavior
Transports should stay deleted.

Additional context

Possible implementation
Both visors should be informed when a transport is deleted, to avoid having the transport created again after a short amount of time. Another option is to make the first visor wait for the second one to try to create the transport again and then inform it that the transport should be deleted there too.

As a temporary workaround, when deleting a transport the Manager could try to delete it in both visors if both are connected to the hypervisor, and show a warning to the user if one of the visors is not connected.
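
The temporary workaround could be sketched roughly as follows. DeleteOnBoth, the del callback and the connected map are hypothetical; del stands in for whatever issues DELETE /api/visors/{pk}/transports/{tid}, and how the Manager learns which visors are connected is left open.

```go
package main

import "fmt"

// DeleteOnBoth tries to delete transport tid on both of its endpoint
// visors. del is a hypothetical callback that performs the delete for
// one visor; connected says which visors the hypervisor can currently
// reach. A warning is returned for each visor that is skipped.
func DeleteOnBoth(
	del func(visorPK, tid string) error,
	connected map[string]bool,
	pkA, pkB, tid string,
) ([]string, error) {
	var warnings []string
	for _, pk := range []string{pkA, pkB} {
		if !connected[pk] {
			// The transport may come back once this visor recreates it.
			warnings = append(warnings, fmt.Sprintf(
				"visor %s is not connected; transport %s may be recreated", pk, tid))
			continue
		}
		if err := del(pk, tid); err != nil {
			return warnings, fmt.Errorf("delete transport %s on visor %s: %w", tid, pk, err)
		}
	}
	return warnings, nil
}
```

The warnings are what the Manager would show to the user. This does not remove the race completely: the second visor can still recreate the transport between the two delete calls, which is why it is only a workaround.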

Remove therealssh from codebase

Describe the bug
Dmsgpty support was recently added and there does not seem to be a point in keeping therealssh around. It seems difficult enough to maintain to warrant being removed.

Make install does not work on Windows

Describe the bug
Running make install on Windows fails.

Environment information:

  • OS: Windows 10 (Using MinGW32)
  • Platform: X64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Run make install on the root of the repo.

Actual behavior
The process exits with the following error:

build github.com/SkycoinProject/skywire-mainnet/cmd/skywire-visor: cannot load github.com/sirupsen/logrus/hooks/syslog: no Go source files

Expected behavior
The process should finish correctly.

Additional context
The only go file inside vendor/github.com/sirupsen/logrus/hooks/syslog starts with // +build !windows,!nacl,!plan9, so it is ignored on Windows.

Possible implementation
Apparently, to be fully multiplatform the code on github.com/sirupsen/logrus/hooks/syslog must not be used. The main Skycoin repo also uses logrus, but vendor/github.com/sirupsen/logrus/hooks/syslog was not vendored by dep, so I think the code in the main repo avoided using that part and that is why it works well on Windows.

Setup CLA

We should require signing of a CLA (Contributor License Agreement) for contribution.

Improve naming consistency and documentation

From Discord:

We are currently

  • using messaging in many instances instead of dmsg.
  • using node instead of visor (notably in the skywire-cli command and many more instances in the codebase)
  • using therealproxy where we should probably switch to a more straightforward name of skyproxy
  • talking in documentation about docker setups that do not exist anymore

Document stcp usage

Currently the stcp transport is working, but it is not documented how to get it running, since it requires manual creation of a pktable.

  • document with an example how to setup the stcp transport

The hypervisor returns inaccessible visors

Describe the bug
When calling GET http://{localIp}:8080/api/visors the hypervisor appears to return the list of all the nodes that have been connected to it, even if there is no connection to them anymore. Also, the response does not include any info indicating if the visor is connected to the hypervisor.

Environment information:

  • OS: Linux (Ubuntu 18.04.1)
  • Platform: Linux 4.15.0-65-generic x86_64

Steps to Reproduce
Steps to reproduce the behavior:

  1. Start some nodes using the make integration-run-generic command of the skywire-services repository

  2. Call GET http://{localIp}:8080/api/visors. You should get a list with 3 visors.

  3. Call GET http://{localIp}:8080/api/visors/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7. You should get the basic info of the visor.

  4. Go to the console window used in step 1, press Ctrl+b and then 6 to open the tab of the first visor. Then press Ctrl+c to stop the visor.

  5. Call GET http://{localIp}:8080/api/visors again. You should still get a list with 3 visors, but most of the fields of the 024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7 visor will now be empty.

  6. Call GET http://{localIp}:8080/api/visors/024ec47420176680816e0406250e7156465e4531f5b26057c9f6297bb0303558c7 again. You should get "error": "connection is shut down".

Actual behavior
GET http://{localIp}:8080/api/visors returns nodes that are not connected. One way to tell which nodes are not connected is by checking if the node_version and app_protocol_version fields are empty, but the problem is that the API endpoint sometimes returns those fields empty for valid visors. Maybe that is related to #28.
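
For reference, the empty-field check described above could look roughly like this on the client side. The node_version and app_protocol_version field names come from this report; the local_pk field name and the assumption that the response is a JSON array of objects are guesses and should be checked against the real response. Because valid visors are sometimes returned with these fields empty, the result is only a hint, not a reliable connection state.

```go
package main

import "encoding/json"

// VisorSummary holds the fields of one entry of the GET /api/visors
// response that this sketch needs. local_pk is an assumed field name.
type VisorSummary struct {
	PK                 string `json:"local_pk"`
	NodeVersion        string `json:"node_version"`
	AppProtocolVersion string `json:"app_protocol_version"`
}

// LooksDisconnected applies the heuristic from this report: both
// version fields are empty.
func LooksDisconnected(v VisorSummary) bool {
	return v.NodeVersion == "" && v.AppProtocolVersion == ""
}

// SplitByConnection decodes a response body (assumed to be a JSON
// array) and separates visors that look connected from suspect ones.
func SplitByConnection(body []byte) (connected, suspect []VisorSummary, err error) {
	var all []VisorSummary
	if err = json.Unmarshal(body, &all); err != nil {
		return nil, nil, err
	}
	for _, v := range all {
		if LooksDisconnected(v) {
			suspect = append(suspect, v)
		} else {
			connected = append(connected, v)
		}
	}
	return connected, suspect, nil
}
```

A client would still need to re-check the suspect visors over several polls before hiding them, since a single empty response does not prove that the visor is gone.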

Expected behavior
Only connected visors should be returned. If returning visors that are not connected is a measure to minimize the effects of #28, then a procedure to deal with the problem should be created, because this issue affects the ability to show a list of connected nodes to the user.

Additional context

Possible implementation
Similar to #28 :

If there is any serious complication for implementing a solution in the hypervisor, the client could implement something like a “noise cancellation” procedure to detect in which cases the hypervisor is just having a temporary disconnection. This would sometimes make the UI slower but should work.

If this is going to be done, it would be good to document the need to do so in some location related to the API, including ways to detect the disconnection and the amount of time in which a reconnection could be expected, so anyone implementing the API is aware of the need to do something similar.

[M2] Default route group I/O timeout

Describe the bug
The default I/O timeout for route groups is currently hard-coded as a constant. If we want to fully satisfy the net.Conn interface, we should remove the default timeout, since most client applications expect no timeout unless one is set.
