nais / device

naisdevice is an application suite that enables NAV developers to connect to internal resources in a secure and friendly manner.

Home Page: https://doc.nais.io/device/install

License: MIT License

Shell 2.22% Go 89.19% Makefile 2.14% Smarty 0.17% Jinja 1.21% Dockerfile 0.14% NSIS 4.51% PLpgSQL 0.42%
wireguard go tray-application

device's People

Contributors

ahusby, androa, audunstrand, chinatsu, christeredvartsen, dependabot[bot], erlingjd, frodesundby, henrikhorluck, jhrv, jksolbakken, jrtm, kimtore, mortenlj, muni10, pcmoen, pjwalstrom, rbjornstad, sechmann, starefossen, thokra-nav, toby1knby, tommytroen, toresbe, tronghn, x10an14, x10an14-nav, ybelmekk

device's Issues

Device health checker does not always check for failures

The device health checker only fetches failures for a Kolide device if the device's failure_count attribute is larger than its resolved_failure_count attribute. The problem is that resolved_failure_count only ever increases, while failure_count reflects the current number of failing checks for the device and resets to 0 once the failures have been resolved. After a device has had a few failures resolved, new failures can therefore go unnoticed.
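The mismatch can be illustrated with a small Go sketch. The Device struct and both function names are hypothetical and not taken from the naisdevice code base; the "proposed" condition is only one possible fix, assuming failure_count really is the current number of failing checks.

```go
package main

import "fmt"

// Device is a hypothetical, trimmed-down view of a Kolide device record.
type Device struct {
	FailureCount         int // current number of failing checks
	ResolvedFailureCount int // failures resolved so far; only ever increases
}

// shouldFetchCurrent mirrors the condition described in the issue.
func shouldFetchCurrent(d Device) bool {
	return d.FailureCount > d.ResolvedFailureCount
}

// shouldFetchProposed is one possible alternative (an assumption, not the
// project's actual fix): fetch failures whenever any check is failing.
func shouldFetchProposed(d Device) bool {
	return d.FailureCount > 0
}

func main() {
	// One check is failing now, but five failures were resolved in the past.
	d := Device{FailureCount: 1, ResolvedFailureCount: 5}
	fmt.Println(shouldFetchCurrent(d), shouldFetchProposed(d)) // prints "false true"
}
```

With the current condition the failing device above is skipped, which matches the behaviour reported in the issue.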

AAD token is not refreshed properly

ERRO[2020-06-04T08:17:23+02:00] Unable to get gateway config: getting device config: Get http://10.255.240.1/devices/C02X1CGMJG5J/gateways: oauth2: cannot fetch token: 400 Bad Request
Response: {"error":"invalid_grant","error_description":"AADSTS70043: The refresh token has expired or is invalid due to sign-in frequency checks by conditional access. The token was issued on 2020-06-03T07:09:26.9675714Z and the maximum allowed lifetime for this request is 82800.\r\nTrace ID: 39de2a28-2749-4a6e-8aeb-f05d97ad0701\r\nCorrelation ID: 672adb9a-e472-40ac-a220-c48f11499d90\r\nTimestamp: 2020-06-04 06:17:23Z","error_codes":[70043],"timestamp":"2020-06-04 06:17:23Z","trace_id":"39de2a28-2749-4a6e-8aeb-f05d97ad0701","correlation_id":"672adb9a-e472-40ac-a220-c48f11499d90","suberror":"token_expired"}

Gateway agent: make restart less disruptive

At the moment the gateway-agent always sets up the wg interface from scratch (teardown + setup) when it starts, which disconnects everyone. Instead of tearing it down, we can check whether the wg interface is already set up and skip that step if it is (or allow the ip link add command to fail).
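A minimal sketch of the "check first" variant, using only the Go standard library. The interface name wg0 and the function name interfaceExists are assumptions made for illustration; the real gateway-agent may name and structure this differently.

```go
package main

import (
	"fmt"
	"net"
)

// interfaceExists reports whether a network interface with the given name is
// present on the host. Any lookup error is treated as "not present", so the
// caller falls back to the normal setup path.
func interfaceExists(name string) bool {
	_, err := net.InterfaceByName(name)
	return err == nil
}

func main() {
	const iface = "wg0" // assumed interface name

	if interfaceExists(iface) {
		fmt.Println("interface already present, skipping teardown and setup")
	} else {
		fmt.Println("interface missing, full setup needed")
	}
}
```

Skipping the teardown keeps existing peers connected across a restart; the WireGuard configuration itself would still need to be synced afterwards.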

Improve column names in the API server database

last_check and last_seen should be renamed.

last_check is only updated by the API server, and this occurs when a device is updated. Rename to last_updated?

last_seen is when Kolide last saw the device. Rename to kolide_last_seen?

Automate gateway setup

Today we use Terraform to set up the VM, networking, secrets and service users.
The rest of the setup (packages, systemd and kernel settings) is done manually and should be automated.

Agent seems unable to refresh tokens

After running the agent for a while, this occurs:

INFO[2020-05-27T17:53:56+02:00] Starting device-agent with config:
{APIServer:http://10.255.240.1 Interface:utun69 ConfigDir:/Users/hrv/Library/Application Support/naisdevice BinaryDir:/usr/local/bin BootstrapToken: WireGuardBinary: WireGuardGoBinary: PrivateKeyPath: WireGuardConfigPath: BootstrapConfigPath: LogLevel:info OAuth2Config:{ClientID:8086d321-c6d3-4398-87da-0d54e3d93967 ClientSecret: Endpoint:{AuthURL:https://login.microsoftonline.com/62366534-1ec3-4962-8869-9b5535279d0b/oauth2/v2.0/authorize TokenURL:https://login.microsoftonline.com/62366534-1ec3-4962-8869-9b5535279d0b/oauth2/v2.0/token AuthStyle:0} RedirectURL:http://localhost:51800 Scopes:[openid 6e45010d-2637-4a40-b91d-d4cbb451fb57/.default offline_access]} Platform: BootstrapAPI:https://bootstrap.device.nais.io}
INFO[2020-05-27T17:53:56+02:00] If the browser didn't open, visit this url to sign in: https://login.microsoftonline.com/62366534-1ec3-4962-8869-9b5535279d0b/oauth2/v2.0/authorize?access_type=offline&client_id=8086d321-c6d3-4398-87da-0d54e3d93967&code_challenge=xxx-sg6nXyE4SnuZ0&code_challenge_method=S256&redirect_uri=http%3A%2F%2Flocalhost%3A51800&response_type=code&scope=openid+6e45010d-2637-4a40-b91d-d4cbb451fb57%2F.default+offline_access&state=HrNmLh1i5iNSf9YB
Starting device-agent-helper, you might be prompted for password
Password:
INFO[2020-05-27T17:54:02+02:00] Starting device-agent-helper with config:
{Interface:utun69 BinaryDir: WireGuardBinary:/usr/local/bin/naisdevice-wg WireGuardGoBinary:/usr/local/bin/naisdevice-wireguard-go WireGuardConfigPath:/Users/hrv/Library/Application Support/naisdevice/wg0.conf LogLevel:info DeviceIP:10.255.240.9}
ERRO[2020-05-27T19:04:23+02:00] Unable to get gateway config: getting device config: Get http://10.255.240.1/devices/SERIAL/gateways: oauth2: cannot fetch token: 400 Bad Request
Response: {"error":"invalid_grant","error_description":"AADSTS70043: The refresh token has expired or is invalid due to sign-in frequency checks by conditional access. The token was issued on 2020-05-22T18:38:20.0528180Z and the maximum allowed lifetime for this request is 82800.\r\nTrace ID: 5d0a143c-3d3e-4872-8ac6-be1628421e00\r\nCorrelation ID: 39a0d8d8-368d-4237-ae96-9a8ddf1574f8\r\nTimestamp: 2020-05-27 17:04:23Z","error_codes":[70043],"timestamp":"2020-05-27 17:04:23Z","trace_id":"5d0a143c-3d3e-4872-8ac6-be1628421e00","correlation_id":"39a0d8d8-368d-4237-ae96-9a8ddf1574f8","suberror":"token_expired"}
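One way the agent could react to this response, sketched with the Go standard library only: parse the error body and treat invalid_grant as a signal that the refresh token is no longer usable and a new interactive sign-in is required. The type and function names are hypothetical, and this is not necessarily how device-agent handles it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenErrorResponse holds the one field of interest from the token endpoint
// error body shown in the log above.
type tokenErrorResponse struct {
	Error string `json:"error"`
}

// needsInteractiveLogin reports whether the error body says the grant (here:
// the refresh token) was rejected. Bodies that are not valid JSON return false.
func needsInteractiveLogin(body []byte) bool {
	var resp tokenErrorResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false
	}
	return resp.Error == "invalid_grant"
}

func main() {
	// Shortened version of the response body from the log.
	body := []byte(`{"error":"invalid_grant","error_codes":[70043],"suberror":"token_expired"}`)
	if needsInteractiveLogin(body) {
		fmt.Println("refresh token rejected, a new sign-in is needed")
	}
}
```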

Example of a working on-prem gateway

We need to model information about which CIDRs the gateway should act as a proxy for.
Today we have two types of gateways: one that is only a NAT gateway (e.g. azure-gw) and one that proxies traffic (API servers in GCP). The model must distinguish between these so that gateway-agent can configure iptables correctly depending on which type of gateway it is.

This can possibly be inferred from the gateway not having any routes defined.

if len(gateway.routes) == 0 {
  // proxy type
} else {
  // nat type
}
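The same idea as a self-contained sketch. The Gateway struct, the GatewayType constants and the example values are hypothetical; only the len(routes) == 0 rule comes from the snippet above.

```go
package main

import "fmt"

// GatewayType distinguishes the two kinds of gateways described above.
type GatewayType string

const (
	GatewayTypeProxy GatewayType = "proxy"
	GatewayTypeNAT   GatewayType = "nat"
)

// Gateway is a hypothetical model with only the fields needed here.
type Gateway struct {
	Name   string
	Routes []string // CIDRs routed through this gateway
}

// inferType applies the rule from the issue: a gateway without any routes
// is treated as the proxy type, everything else as the NAT type.
func inferType(g Gateway) GatewayType {
	if len(g.Routes) == 0 {
		return GatewayTypeProxy
	}
	return GatewayTypeNAT
}

func main() {
	gateways := []Gateway{
		{Name: "example-without-routes"},
		{Name: "example-with-routes", Routes: []string{"10.0.0.0/8"}},
	}
	for _, g := range gateways {
		fmt.Printf("%s: %s\n", g.Name, inferType(g))
	}
}
```

Inferring the type keeps the model unchanged, at the cost of making the distinction implicit; an explicit type field would be the alternative.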

Improve start-up time

Currently it takes ~20 seconds for all timers to run before the gateways can be reached:

  1. device-agent-helper takes 10 seconds to run its sync on the bootstrap config
  2. this allows device-agent, about 5 seconds later, to get the gateways from the apiserver
  3. 10 seconds later, these are synced and made available to the user

Clean up routes when tunnel is not working

Currently, if the tunnel stops working, the agent ends up in a never-ending loop because:

  • We add routes that tunnel Microsoft traffic through a gateway
  • We use Azure AD auth on our API server

If the Microsoft gateway stops working, we are unable to fetch new access tokens and therefore unable to communicate with the API, which is what we would need in order to get a working gateway configuration again.

As a naisdevice admin I want a dashboard (with alerts) so I know whether the system is working properly

  • instrument gateway-agent and apiserver with relevant metrics
  • set up a metrics rig: Prometheus on its own server? It must communicate over the tunnel
  • make Prometheus available as a data source for the common Grafana instance

metrics:
  gateways:
    - throughput (gateways)
    - device count
  apiserver:
    - device count
    - healthy/unhealthy count
    - apicalls by code
  healthchecker:
    - platformtype count

alerts:

  • if a gateway or the apiserver goes down
  • if x time has passed since the healthchecker last ran
  • ..

Brutal cleanup of existing `wireguard-go` processes

device-agent-helper sometimes fails because of an existing wireguard-go process that, for unknown reasons, was not killed when a previous instance of device-agent-helper exited.

We could check for and kill existing wireguard-go processes when we start device-agent-helper.
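A sketch of the "check for" half, assuming the process list is available as one "PID COMMAND" pair per line (for example from ps -axo pid=,comm=). The function name findPIDs and the sample data are made up for illustration; the binary name is passed in, since the helper log earlier on this page shows it installed as naisdevice-wireguard-go.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// findPIDs scans process-list output with one "PID COMMAND" pair per line and
// returns the PIDs whose command base name equals name. Lines that do not
// start with a numeric PID are ignored.
func findPIDs(psOutput, name string) []int {
	var pids []int
	for _, line := range strings.Split(psOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		pid, err := strconv.Atoi(fields[0])
		if err != nil {
			continue
		}
		// The command may be a full path; compare the base name only.
		cmd := strings.Join(fields[1:], " ")
		if filepath.Base(cmd) == name {
			pids = append(pids, pid)
		}
	}
	return pids
}

func main() {
	// Made-up sample output, not taken from a real machine.
	sample := "  101 /sbin/launchd\n  202 /usr/local/bin/naisdevice-wireguard-go\n  303 /bin/zsh\n"
	fmt.Println(findPIDs(sample, "naisdevice-wireguard-go")) // prints "[202]"
}
```

Each returned PID could then be terminated with os.FindProcess followed by Kill before the helper starts its own wireguard-go instance.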

Create phar packages for device health update

To simplify installation and use of device health update, phar packages should be generated.

Two packages should be created, one for "update" and one for "get-checks". This can be added to the main workflow for the master branch. Snapshots can also be created for other branches.

Screenlock compliance check

A screen lock check is "missing" from the Linux tables in osquery. We need to discuss how to mitigate or solve this requirement.

Improve login page

Some of our users seem to not appreciate the kekw meme as much as we do.

Use tags in Kolide instead of a static configuration file

Now that Kolide has a concept of tags, we should use these for check severity instead of the static configuration file.

All checks in Kolide have already been tagged with the correct levels, according to the existing severity levels found in the configuration file.
