paz-sh / paz

An open-source, in-house service platform with a PaaS-like workflow, built on Docker, CoreOS, Etcd and Fleet. This repository houses the documentation and installation scripts.

Home Page: http://paz.sh

License: Other

Shell 100.00%

paz's Introduction


Paz

Continuous deployment production environments, built on Docker, CoreOS, etcd and fleet.

THIS PROJECT IS INACTIVE

Paz is an in-house service platform with a PaaS-like workflow.

Paz's documentation can be found here.

Screenshot

What is Paz?

Paz is...

  • Like your own private PaaS that you can host anywhere
  • Free
  • Open-source
  • Simple
  • A web front-end to CoreOS' Fleet with a PaaS-like workflow
  • Like a clustered/multi-host Dokku
  • Alpha software
  • Written in Node.js

Paz is not...

  • A hosted service
  • A complete, enterprise-ready orchestration solution

Features

  • Beautiful web UI
  • Run anywhere (Vagrant, public cloud or bare metal)
  • No special code required in your services
    • i.e. it will run any containerised application unmodified
  • Built for Continuous Deployment
  • Zero-downtime deployments
  • Service discovery
  • Same workflow from dev to production
  • Easy environments

Components

  • Web front-end - A beautiful UI for configuring and monitoring your services.
  • Service directory - A catalog of your services and their configuration.
  • Scheduler - Deploys services onto the platform.
  • Orchestrator - REST API used by the web front-end; presents a unified subset of functionality from Scheduler, Service Directory, Fleet and Etcd.
  • Centralised monitoring and logging.

Service Directory

This is a database of all your services and their configuration (e.g. environment variables, data volumes, port mappings and the number of instances to launch). Ultimately this information will be reduced to a set of systemd unit files (by the scheduler) to be submitted to Fleet for running on the cluster. The service directory is a Node.js API backed by a LevelDB database.
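As a rough illustration of the kind of record the directory holds, here is a hedged sketch of registering a service over HTTP. The /services path and hostname are assumptions (not a documented Paz endpoint); the JSON shape mirrors the demo-api example that appears in the integration test output later on this page.

$ curl -X POST http://paz-service-directory.paz/services \
    -H 'Content-Type: application/json' \
    -d '{"doc":{"name":"demo-api","description":"Very simple HTTP Hello World server","dockerRepository":"lukebond/demo-api","config":{"publicFacing":false,"numInstances":3,"ports":[],"env":{}}}}'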

Scheduler

This service receives HTTP POST commands to deploy services that are defined in the service directory. Using the service data from the directory, it renders unit files and runs them on the CoreOS cluster using Fleet. A history of deployments and their associated configuration is also available from the scheduler.

For each service the scheduler will deploy a container for the service and an announce sidekick container.

The scheduler is a Node.js API backed by a LevelDB database and uses Fleet to launch services.
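The units Paz actually renders aren't reproduced here, but the following is a minimal sketch of the main-unit-plus-announce-sidekick pattern being described, in the usual CoreOS style. All unit names, etcd keys and published values are illustrative assumptions, not Paz's real output.

# demo-api.service (main unit)
[Unit]
Description=demo-api
Requires=docker.service
After=docker.service

[Service]
ExecStartPre=-/usr/bin/docker kill demo-api
ExecStartPre=-/usr/bin/docker rm demo-api
ExecStart=/usr/bin/docker run --name demo-api -P lukebond/demo-api
ExecStop=/usr/bin/docker stop demo-api

# demo-api-announce.service (sidekick: publishes the host into etcd under a TTL'd key;
# a real announce unit would also record the mapped port)
[Unit]
BindsTo=demo-api.service
After=demo-api.service

[Service]
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; do etcdctl set /paz/services/demo-api/%H ${COREOS_PRIVATE_IPV4} --ttl 60; sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /paz/services/demo-api/%H

[X-Fleet]
MachineOf=demo-api.service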

Orchestrator

This is a service that ties all of the other services together, providing a single access point for the front-end to interface with. It also offers a WebSocket endpoint for real-time updates to the web front-end.

The orchestrator is a Node.js API server that communicates with Etcd, Fleet, the scheduler and service directory.

Web Front-End

A beautiful and easy-to-use web UI for managing your services and observing the health of your cluster. Built in Ember.js.

HAProxy

Paz uses Confd to dynamically configure HAProxy based on service availability information declared in Etcd. HAProxy is configured to route external and internal requests to the correct host for the desired service.
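As a rough sketch of the pattern (not Paz's actual configuration), a confd resource and template for one backend might look like this; the etcd key prefix, backend name and reload command are assumptions.

# /etc/confd/conf.d/haproxy.toml
[template]
src        = "haproxy.cfg.tmpl"
dest       = "/etc/haproxy/haproxy.cfg"
keys       = ["/paz/services"]
reload_cmd = "haproxy -f /etc/haproxy/haproxy.cfg -sf $(cat /var/run/haproxy.pid)"

# /etc/confd/templates/haproxy.cfg.tmpl (fragment)
backend demo-api
{{range gets "/paz/services/demo-api/*"}}    server {{base .Key}} {{.Value}} check
{{end}}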

Monitoring and Logging

Currently cAdvisor is used for monitoring, and there is not yet any centralised logging. Monitoring and logging are high-priority features on the roadmap.

Installation

Paz's Docker repositories are hosted at Quay.io, but they are public so you don't need any credentials.

You will need to install fleetctl and etcdctl. On OS X you can install both with brew:

$ brew install etcdctl fleetctl

Vagrant

Clone this repository and run the following from its root directory:

$ ./scripts/install-vagrant.sh

This will bring up a three-node CoreOS Vagrant cluster and install Paz on it. Note that it may take 10 minutes or more to complete.

For extra debug output, run with DEBUG=1 environment variable set.

If you already have a Vagrant cluster running and want to reinstall the units, use:

$ ./scripts/reinstall-units-vagrant.sh

To interact with the units in the cluster via Fleet, just pass the URL of etcd on one of your hosts as a parameter to fleetctl, e.g.:

$ fleetctl -strict-host-key-checking=false -endpoint=http://172.17.9.101:4001 list-units

You can also SSH into one of the VMs and run fleetctl from there:

$ cd coreos-vagrant
$ vagrant ssh core-01

...however, bear in mind that Fleet needs to SSH into the other VMs in order to perform operations that involve calling down to systemd (e.g. journal), and for this you need to have SSHed into the VM running the unit in question. For this reason you may find it simpler (albeit more verbose) to run fleetctl from outside the CoreOS VMs.

DigitalOcean

Paz has been tested on DigitalOcean, but there isn't currently an install script for it.

In short, you need to create your own cluster and then install the Paz units on there.

The first step is to spin up a CoreOS cluster on DigitalOcean with Paz's cloud-config userdata, and then we'll install Paz on it.

  1. Click the "Create Droplet" button in the DigitalOcean console.
  2. Give your droplet a name and choose your droplet size and region.
  3. Tick "Private Networking" and "Enable User Data"
  4. Paste the contents of the digitalocean/userdata file in the yldio/paz repository into the userdata text area.
  5. Go to http://discovery.etcd.io/new, copy the discovery URL it returns, and paste it into the userdata text area in place of the one that is already there.
  6. In the write_files section, in the entry that writes the /etc/environment file, edit the PAZ_DOMAIN, PAZ_DNSIMPLE_APIKEY and PAZ_DNSIMPLE_EMAIL fields, putting in your DNSimple-managed domain name, DNSimple API key and DNSimple account email address, respectively (see the cloud-config sketch at the end of this section).
  7. Before submitting, copy this userdata to a text file or editor, because we'll need to use it again unchanged.
  8. Select the CoreOS version you want to install (e.g. latest stable or beta should be fine).
  9. Add the SSH keys that will be added to the box (under core user).
  10. Click "Create Droplet".
  11. Repeat for the number of nodes you want in the cluster (e.g. 3), using the exact same userdata file (i.e. don't generate a new discovery token etc.).
  12. Once all droplets have booted (test by trying to SSH into each one, run docker ps and observe that paz-dnsmasq, cadvisor and paz-haproxy are all running on each box), you may proceed.
  13. Install Paz:
$ ssh-add ~/.ssh/id_rsa
$ FLEETCTL_TUNNEL=<MACHINE_IP> fleetctl -strict-host-key-checking=false start unitfiles/1/*

...where <MACHINE_IP> is an IP address of any node in your cluster. You can wait for all units to be active/running like so:

$ FLEETCTL_TUNNEL=<MACHINE_IP> watch -n 5 fleetctl -strict-host-key-checking=false list-units

Once they're up you can install the final services:

$ FLEETCTL_TUNNEL=<MACHINE_IP> fleetctl -strict-host-key-checking=false start unitfiles/2/*
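For reference, the /etc/environment entry edited in step 6 above is a standard cloud-config write_files block. The sketch below shows the general shape with placeholder values; the exact entry in the shipped userdata may differ.

#cloud-config
write_files:
  - path: /etc/environment
    permissions: 0644
    content: |
      PAZ_DOMAIN=paz.example.com
      PAZ_DNSIMPLE_APIKEY=your-dnsimple-api-key
      PAZ_DNSIMPLE_EMAIL=you@example.com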

Bare Metal

Paz works fine on a bare metal install, but there is no install script available for it yet.

You need to create your cluster, then add the contents of bare_metal/user-data to your cloud config, and finally submit the unit files.

  1. Create your cluster.
  2. Paste the contents of bare_metal/user-data into your cloud config file. Be sure to alter the networking information to match your setup.
  3. Go to http://discovery.etcd.io/new, copy the discovery URL it returns, and paste it into your cloud-config file in place of the one that is already there.
  4. Install Paz:
$ ssh-add ~/.ssh/id_rsa
$ FLEETCTL_TUNNEL=<MACHINE_IP> fleetctl -strict-host-key-checking=false start unitfiles/1/*

...where <MACHINE_IP> is an IP address of any node in your cluster. You can wait for all units to be active/running like so:

$ FLEETCTL_TUNNEL=<MACHINE_IP> watch -n 5 fleetctl -strict-host-key-checking=false list-units

Once they're up you can install the final services:

$ FLEETCTL_TUNNEL=<MACHINE_IP> fleetctl -strict-host-key-checking=false start unitfiles/2/*

Tests

There is an integration test that brings up a CoreOS Vagrant cluster, installs Paz and then runs a contrived service on it and verifies that it works:

$ cd test
$ ./integration.sh

Each Paz repository (service directory, orchestrator, scheduler) has tests that run on http://paz-ci.yld.io:8080 (in StriderCD), triggered by a GitHub webhook.

Paz Repositories

The various components of Paz are spread across several repositories:

paz's People

Contributors

bfirsh, dscape, enzor, foliveira, hyperbolic2346, jacyzon, jemgold, lukebond, sublimino, tomgco


paz's Issues

cAdvisor unit reporting as "failed"

...yet it is actually running fine.

Looking at the logs, the error appears to be due to a port conflict. Given that 8080 is only used by cAdvisor, I suspect that the systemd unit file is misconfigured: it daemonises the process yet continually tries to start it again.

":( something went wrong"

Hi guys, I attempted to bring up a Vagrant cluster last night and the paz-web.paz dashboard failed with the above message. I know very little about JavaScript/Ember, so I'm not sure how much more debugging I can do; I'm not even sure where to start at this point.

fleetctl --version
fleetctl version 0.9.1
etcdctl --version
etcdctl version 2.0.4
vagrant --version
Vagrant 1.7.2

I started it up with:

./scripts/install-vagrant.sh 
Installing Paz on Vagrant

Checking for existing Vagrant cluster

Creating a new Vagrant cluster
Cloning into 'coreos-vagrant'...
remote: Counting objects: 351, done.
remote: Total 351 (delta 0), reused 0 (delta 0), pack-reused 351
Receiving objects: 100% (351/351), 79.37 KiB | 0 bytes/s, done.
Resolving deltas: 100% (152/152), done.
Checking connectivity... done.
==> core-01: Box 'coreos-beta' not installed, can't check for updates.
==> core-02: Box 'coreos-beta' not installed, can't check for updates.
==> core-03: Box 'coreos-beta' not installed, can't check for updates.
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
==> core-01: Box 'coreos-beta' could not be found. Attempting to find and install...
    core-01: Box Provider: virtualbox
    core-01: Box Version: >= 308.0.1
==> core-01: Loading metadata for box 'http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json'
    core-01: URL: http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json
==> core-01: Adding box 'coreos-beta' (v607.0.0) for provider: virtualbox
    core-01: Downloading: http://beta.release.core-os.net/amd64-usr/607.0.0/coreos_production_vagrant.box
    core-01: Calculating and comparing box checksum...
==> core-01: Successfully added box 'coreos-beta' (v607.0.0) for 'virtualbox'!
==> core-01: Importing base box 'coreos-beta'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-beta' is up to date...
==> core-01: Setting the name of the VM: coreos-vagrant_core-01_1425861758586_22045
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
    core-01: Adapter 1: nat
    core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
    core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...
==> core-01: Running provisioner: file...
==> core-01: Running provisioner: shell...
    core-01: Running: inline script
==> core-02: Box 'coreos-beta' could not be found. Attempting to find and install...
    core-02: Box Provider: virtualbox
    core-02: Box Version: >= 308.0.1
==> core-02: Loading metadata for box 'http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json'
    core-02: URL: http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json
==> core-02: Adding box 'coreos-beta' (v607.0.0) for provider: virtualbox
==> core-02: Importing base box 'coreos-beta'...
==> core-02: Matching MAC address for NAT networking...
==> core-02: Checking if box 'coreos-beta' is up to date...
==> core-02: Setting the name of the VM: coreos-vagrant_core-02_1425861790309_80904
==> core-02: Fixed port collision for 22 => 2222. Now on port 2200.
==> core-02: Clearing any previously set network interfaces...
==> core-02: Preparing network interfaces based on configuration...
    core-02: Adapter 1: nat
    core-02: Adapter 2: hostonly
==> core-02: Forwarding ports...
    core-02: 22 => 2200 (adapter 1)
==> core-02: Running 'pre-boot' VM customizations...
==> core-02: Booting VM...
==> core-02: Waiting for machine to boot. This may take a few minutes...
    core-02: SSH address: 127.0.0.1:2200
    core-02: SSH username: core
    core-02: SSH auth method: private key
    core-02: Warning: Connection timeout. Retrying...
==> core-02: Machine booted and ready!
==> core-02: Setting hostname...
==> core-02: Configuring and enabling network interfaces...
==> core-02: Running provisioner: file...
==> core-02: Running provisioner: shell...
    core-02: Running: inline script
==> core-03: Box 'coreos-beta' could not be found. Attempting to find and install...
    core-03: Box Provider: virtualbox
    core-03: Box Version: >= 308.0.1
==> core-03: Loading metadata for box 'http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json'
    core-03: URL: http://beta.release.core-os.net/amd64-usr/current/coreos_production_vagrant.json
==> core-03: Adding box 'coreos-beta' (v607.0.0) for provider: virtualbox
==> core-03: Importing base box 'coreos-beta'...
==> core-03: Matching MAC address for NAT networking...
==> core-03: Checking if box 'coreos-beta' is up to date...
==> core-03: Setting the name of the VM: coreos-vagrant_core-03_1425861823848_20722
==> core-03: Fixed port collision for 22 => 2222. Now on port 2201.
==> core-03: Clearing any previously set network interfaces...
==> core-03: Preparing network interfaces based on configuration...
    core-03: Adapter 1: nat
    core-03: Adapter 2: hostonly
==> core-03: Forwarding ports...
    core-03: 22 => 2201 (adapter 1)
==> core-03: Running 'pre-boot' VM customizations...
==> core-03: Booting VM...
==> core-03: Waiting for machine to boot. This may take a few minutes...
    core-03: SSH address: 127.0.0.1:2201
    core-03: SSH username: core
    core-03: SSH auth method: private key
    core-03: Warning: Connection timeout. Retrying...
==> core-03: Machine booted and ready!
==> core-03: Setting hostname...
==> core-03: Configuring and enabling network interfaces...
==> core-03: Running provisioner: file...
==> core-03: Running provisioner: shell...
    core-03: Running: inline script
Waiting for Vagrant cluster to be ready...
CoreOS Vagrant cluster is up

Configuring SSH
Identity added: /home/thecatwasnot/.vagrant.d/insecure_private_key (/home/thecatwasnot/.vagrant.d/insecure_private_key)

Starting paz runlevel 1 units
Unit paz-scheduler.service launched on 7641f8b0.../172.17.8.101
Unit paz-orchestrator.service launched on 53f5997f.../172.17.8.102
Unit paz-service-directory-announce.service launched on b9bc6257.../172.17.8.103
Unit paz-service-directory.service launched on b9bc6257.../172.17.8.103
Unit paz-scheduler-announce.service launched on 7641f8b0.../172.17.8.101
Unit paz-orchestrator-announce.service launched on 53f5997f.../172.17.8.102
Successfully started all runlevel 1 paz units on the cluster with Fleet
Waiting for runlevel 1 services to be activated...
Activating: 0 | Active: 6 | Failed: 0.  
All runlevel 1 units successfully activated!

Waiting for orchestrator, scheduler and service directory to be announced

Starting paz runlevel 2 units
Unit paz-web.service launched on 53f5997f.../172.17.8.102
Unit paz-web-announce.service launched on 53f5997f.../172.17.8.102
Successfully started all runlevel 2 paz units on the cluster with Fleet
Waiting for runlevel 2 services to be activated...
Activating: 0 | Active: 8 | Failed: 0...
All runlevel 2 units successfully activated!

You will need to add the following entries to your /etc/hosts:
172.17.8.101 paz-web.paz
172.17.8.101 paz-scheduler.paz
172.17.8.101 paz-orchestrator.paz
172.17.8.101 paz-orchestrator-socket.paz

Paz installation successful

I did edit /etc/hosts.
Fleet reports everything OK:

vagrant ssh core-01
CoreOS beta (607.0.0)
Update Strategy: No Reboots
core@core-01 ~ $ fleetctl list-units
UNIT                    MACHINE             ACTIVE  SUB
paz-orchestrator-announce.service   53f5997f.../172.17.8.102    active  running
paz-orchestrator.service        53f5997f.../172.17.8.102    active  running
paz-scheduler-announce.service      7641f8b0.../172.17.8.101    active  running
paz-scheduler.service           7641f8b0.../172.17.8.101    active  running
paz-service-directory-announce.service  b9bc6257.../172.17.8.103    active  running
paz-service-directory.service       b9bc6257.../172.17.8.103    active  running
paz-web-announce.service        53f5997f.../172.17.8.102    active  running
paz-web.service             53f5997f.../172.17.8.102    active  running

This morning I tried running the integration test:

./integration.sh 
Starting Paz integration test script
./integration.sh: line 18: checkRequiredEnvVars: command not found

Checking for existing Vagrant cluster

Creating a new Vagrant cluster
Cloning into 'coreos-vagrant'...
remote: Counting objects: 351, done.
remote: Total 351 (delta 0), reused 0 (delta 0), pack-reused 351
Receiving objects: 100% (351/351), 79.37 KiB | 0 bytes/s, done.
Resolving deltas: 100% (152/152), done.
Checking connectivity... done.
==> core-01: Checking for updates to 'coreos-beta'
    core-01: Latest installed version: 607.0.0
    core-01: Version constraints: >= 308.0.1
    core-01: Provider: virtualbox
==> core-01: Box 'coreos-beta' (v607.0.0) is running the latest version.
==> core-02: Checking for updates to 'coreos-beta'
    core-02: Latest installed version: 607.0.0
    core-02: Version constraints: >= 308.0.1
    core-02: Provider: virtualbox
==> core-02: Box 'coreos-beta' (v607.0.0) is running the latest version.
==> core-03: Checking for updates to 'coreos-beta'
    core-03: Latest installed version: 607.0.0
    core-03: Version constraints: >= 308.0.1
    core-03: Provider: virtualbox
==> core-03: Box 'coreos-beta' (v607.0.0) is running the latest version.
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
==> core-01: Importing base box 'coreos-beta'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-beta' is up to date...
==> core-01: Setting the name of the VM: coreos-vagrant_core-01_1425905935661_73514
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
    core-01: Adapter 1: nat
    core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
    core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...
==> core-01: Running provisioner: file...
==> core-01: Running provisioner: shell...
    core-01: Running: inline script
==> core-02: Importing base box 'coreos-beta'...
==> core-02: Matching MAC address for NAT networking...
==> core-02: Checking if box 'coreos-beta' is up to date...
==> core-02: Setting the name of the VM: coreos-vagrant_core-02_1425905966683_94058
==> core-02: Fixed port collision for 22 => 2222. Now on port 2200.
==> core-02: Clearing any previously set network interfaces...
==> core-02: Preparing network interfaces based on configuration...
    core-02: Adapter 1: nat
    core-02: Adapter 2: hostonly
==> core-02: Forwarding ports...
    core-02: 22 => 2200 (adapter 1)
==> core-02: Running 'pre-boot' VM customizations...
==> core-02: Booting VM...
==> core-02: Waiting for machine to boot. This may take a few minutes...
    core-02: SSH address: 127.0.0.1:2200
    core-02: SSH username: core
    core-02: SSH auth method: private key
    core-02: Warning: Connection timeout. Retrying...
==> core-02: Machine booted and ready!
==> core-02: Setting hostname...
==> core-02: Configuring and enabling network interfaces...
==> core-02: Running provisioner: file...
==> core-02: Running provisioner: shell...
    core-02: Running: inline script
==> core-03: Importing base box 'coreos-beta'...
==> core-03: Matching MAC address for NAT networking...
==> core-03: Checking if box 'coreos-beta' is up to date...
==> core-03: Setting the name of the VM: coreos-vagrant_core-03_1425905998600_89301
==> core-03: Fixed port collision for 22 => 2222. Now on port 2201.
==> core-03: Clearing any previously set network interfaces...
==> core-03: Preparing network interfaces based on configuration...
    core-03: Adapter 1: nat
    core-03: Adapter 2: hostonly
==> core-03: Forwarding ports...
    core-03: 22 => 2201 (adapter 1)
==> core-03: Running 'pre-boot' VM customizations...
==> core-03: Booting VM...
==> core-03: Waiting for machine to boot. This may take a few minutes...
    core-03: SSH address: 127.0.0.1:2201
    core-03: SSH username: core
    core-03: SSH auth method: private key
    core-03: Warning: Connection timeout. Retrying...
==> core-03: Machine booted and ready!
==> core-03: Setting hostname...
==> core-03: Configuring and enabling network interfaces...
==> core-03: Running provisioner: file...
==> core-03: Running provisioner: shell...
    core-03: Running: inline script
Waiting for Vagrant cluster to be ready...
CoreOS Vagrant cluster is up

Configuring SSH
Identity added: /home/thecatwasnot/.vagrant.d/insecure_private_key (/home/thecatwasnot/.vagrant.d/insecure_private_key)

Starting paz runlevel 1 units
Unit paz-scheduler.service launched on 14dbc022.../172.17.8.101
Unit paz-scheduler-announce.service launched on 14dbc022.../172.17.8.101
Unit paz-orchestrator.service launched on 4f6c57a6.../172.17.8.103
Unit paz-orchestrator-announce.service launched on 4f6c57a6.../172.17.8.103
Unit paz-service-directory.service launched on 2c75bccd.../172.17.8.102
Unit paz-service-directory-announce.service launched on 2c75bccd.../172.17.8.102
Successfully started all runlevel 1 paz units on the cluster with Fleet
Waiting for runlevel 1 services to be activated...
Activating: 0 | Active: 6 | Failed: 0.. 
All runlevel 1 units successfully activated!

Waiting for orchestrator, scheduler and service directory to be announced

Starting paz runlevel 2 units
Unit paz-web.service launched
Unit paz-web-announce.service launched on 14dbc022.../172.17.8.101
Successfully started all runlevel 2 paz units on the cluster with Fleet
Waiting for runlevel 2 services to be activated...
Activating: 1 | Active: 8 | Failed: 0...
All runlevel 2 units successfully activated!

You will need to add the following entries to your /etc/hosts:
172.17.8.101 paz-web.paz
172.17.8.101 paz-scheduler.paz
172.17.8.101 paz-orchestrator.paz
172.17.8.101 paz-orchestrator-socket.paz

Adding service to directory
{"doc":{"name":"demo-api","description":"Very simple HTTP Hello World server","dockerRepository":"lukebond/demo-api","config":{"publicFacing":false,"numInstances":3,"ports":[],"env":{}}}}
Deploying new service with the /hooks/deploy endpoint
{"statusCode":200}
Waiting for service to announce itself

This hung for hours (it was still waiting when I returned 8 hours later).
I've now also tried changing my version of etcdctl to match the one on CoreOS, and no joy.

paz-dnsmasq container not removed on failure

If paz-dnsmasq fails to start for whatever reason, or has previously stopped (e.g. reboot), it can't be restarted later because there is no docker rm in the unit file.

Fix incoming...

Split unit files up into chained steps

I picked up this tip from the Giantswarm guys.

Currently our unit files do docker pull, docker kill, docker run, docker stop etc., all in messy multi-line bash statements in one unit file. If we separate these into separate unit files (e.g. one for pulling, one for starting, one for stopping) and chain them together with systemd Requires/After directives, we get neater unit files and it becomes easier to insert steps in between, such as mounting an EBS volume or starting/connecting to a Weave network. It will also unroll on stop/kill, disconnecting these things in reverse order.
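A minimal sketch of the chained-unit idea (unit names and image are hypothetical): a oneshot "pull" unit that the "run" unit requires and orders itself after.

# demo-api-pull.service
[Unit]
Description=Pull the demo-api image
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/docker pull lukebond/demo-api

# demo-api-run.service
[Unit]
Description=Run demo-api
Requires=demo-api-pull.service
After=demo-api-pull.service

[Service]
ExecStart=/usr/bin/docker run --rm --name demo-api -P lukebond/demo-api
ExecStop=/usr/bin/docker stop demo-api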

Implement service watcher that can start/remove units

We need a little service that watches what's running in the cluster and:

...will start a new service if:

  • there are fewer instances of a service running compared to what is declared in the service directory (e.g. one died and didn't restart)

...will stop services if:

  • all instances of a newer version of the service have been deployed and are healthy, so the old ones can be killed off

This is something like the Kubernetes replication controller.
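A very rough shell sketch of the reconciliation loop being proposed. The directory endpoints, response fields and templated unit naming (<name>@<n>.service) are all assumptions rather than Paz's actual API, and error handling is omitted.

while true; do
  for svc in $(curl -s http://paz-service-directory.paz/services | jq -r '.[].name'); do
    want=$(curl -s "http://paz-service-directory.paz/services/$svc" | jq -r '.config.numInstances')
    have=$(fleetctl list-units -no-legend | grep -c "^${svc}@.*running")
    if [ "$have" -lt "$want" ]; then
      fleetctl start "${svc}@$((have + 1)).service"   # start one missing instance
    fi
  done
  sleep 30
done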

Use Fleet machine metadata for "environments"

Let's say you want dev, QA, staging and production clusters. Rather than running multiple Paz clusters, they could be one cluster that uses Fleet machine metadata to schedule units only onto hosts tagged with the matching environment.

e.g. with 4 environments of 3 nodes each, you might have the following metadata:

Host    Name      Metadata
host1   dev1      environment=dev
host2   dev2      environment=dev
host3   dev3      environment=dev
host4   qa1       environment=qa
host5   qa2       environment=qa
host6   qa3       environment=qa
host7   staging1  environment=staging
host8   staging2  environment=staging
host9   staging3  environment=staging
host10  prod1     environment=prod
host11  prod2     environment=prod
host12  prod3     environment=prod

More can be read about Fleet scheduling with metadata here: https://coreos.com/docs/launching-containers/launching/launching-containers-fleet/#schedule-based-on-machine-metadata
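A minimal sketch of how that looks with standard Fleet configuration: tag each host via its cloud-config, then pin units to an environment in their [X-Fleet] section.

# In each host's cloud-config:
#cloud-config
coreos:
  fleet:
    metadata: environment=qa

# In a unit file, restrict scheduling to that environment:
[X-Fleet]
MachineMetadata=environment=qa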

Credit to @rimusz for the idea.

Getting "Deploy failed" when creating any service...

Hi,

Trying to test out paz, but when I try adding any service, prior to adding an app, I just get "Deploy Failed".

I am trying the Vagrant cluster. It looks like things installed well: I added the entries to hosts and accessed the web panel without issue. But when I try to add the demo-api service from the docs, or the registry container (to make a private registry in the cluster), I get a "deploy failed" error immediately as I click the deploy button...

fleetctl list-units looks like everything is up everywhere?

$ fleetctl -strict-host-key-checking=false -endpoint=http://172.17.8.101:4001 list-units
UNIT                    MACHINE             ACTIVE  SUB
paz-orchestrator-announce.service   6bf8fd0d.../172.17.8.101    active  running
paz-orchestrator.service        6bf8fd0d.../172.17.8.101    active  running
paz-scheduler-announce.service      f069df3f.../172.17.8.102    active  running
paz-scheduler.service           f069df3f.../172.17.8.102    active  running
paz-service-directory-announce.service  6bf8fd0d.../172.17.8.101    active  running
paz-service-directory.service       6bf8fd0d.../172.17.8.101    active  running
paz-web-announce.service        51e62345.../172.17.8.103    active  running
paz-web.service             51e62345.../172.17.8.103    active  running

PS: what is the "public facing" setting that appears when we choose a service? (I tried both true and false with the same results.) It would probably make sense to add a line to http://paz.readme.io/v1.0/docs/deploying-your-first-application-using-paz explaining what "public facing" does.

Improve documentation

The following needs improvement:

  • What is Paz?
  • Getting started / installation
  • How to deploy stuff via CI, Docker hub etc.
  • Technical detail on how it works under the hood

How to install PAZ in Azure Cloud

I created a CoreOS cluster with 3 nodes. I am able to manually run services and use Docker builds. Can anyone help me to install Paz? One more question: is Paz production-ready now?

@lukebond: I came to know about Paz from your London presentation. Can you please suggest something?

Etcd Unavailable when bootstrapping with vagrant

I ran into an issue when bootstrapping a vagrant coreos cluster with paz. Specifically, when running the script install-vagrant.sh, etcd was not available in time.

As a hack, I added a 5 second sleep right before launchAndWaitForUnits which "fixed" the issue on my machine.
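A slightly less brittle alternative to a fixed sleep would be to poll etcd until it responds before submitting units, e.g. (the IP and port match the Vagrant examples used elsewhere on this page):

until curl -fs http://172.17.8.101:4001/version >/dev/null; do
  echo "Waiting for etcd..."
  sleep 1
done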

Investigate running Paz's internal services under rkt

Suggestion from @rimusz

It would make images smaller and startup faster, make Paz more reliable (knocking out the Docker daemon currently kills everything, even the management plane) and be a good way to get started with rkt, with a view to supporting it in the future for user services.

Installation doesn't fail if there is no Internet connection

$ ./integration.sh
Starting Paz integration test script

Checking for existing Vagrant cluster

Creating a new Vagrant cluster
Cloning into 'coreos-vagrant'...
fatal: unable to access 'https://github.com/coreos/coreos-vagrant/': Could not resolve host: github.com
../scripts/helpers.sh: line 35: cd: coreos-vagrant: Not a directory
Can't open config.rb.sample: No such file or directory.
A Vagrant environment or target machine is required to run this
command. Run `vagrant init` to create a new Vagrant environment. Or,
get an ID of a target machine from `vagrant global-status` to run
this command on. A final option is to change to a directory with a
Vagrantfile and to try again.
A Vagrant environment or target machine is required to run this
command. Run `vagrant init` to create a new Vagrant environment. Or,
get an ID of a target machine from `vagrant global-status` to run
this command on. A final option is to change to a directory with a
Vagrantfile and to try again.
Waiting for Vagrant cluster to be ready...
CoreOS Vagrant cluster is up
mkdir: unitfiles: File exists
cp: ../unitfiles/*: No such file or directory
mkdir: scripts: File exists
cp: ../scripts/start-runlevel.sh: No such file or directory

...and so on. It should just be a matter of adding -e to the shebang line in integration.sh.
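Concretely, that change (or the equivalent set -e near the top of the script) looks like this:

#!/bin/bash -e
# ...or, equivalently, near the top of integration.sh:
set -e   # abort the script as soon as any command fails (e.g. the failed git clone above)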

Roadmap

This is a place to discuss the short-to-medium term roadmap. Let's aim to distil it to a list of a handful of items that are doable in a month or two.

To get us started:

  1. CLI
  2. Monitoring with Heapster, InfluxDB and Grafana
  3. Centralised logging
  4. Something to observe what's running like Kubernetes' Replication Controller
  5. Deployment history in the UI somewhere
  6. Use Weave & WeaveDNS to simplify the complex HAProxy plumbing and service discovery
  7. Proper test runner/framework (maybe Gulp)

Implement CLI

Let's discuss the scope and features of the Paz command-line interface functionality.

Naturally, it needs to be called paz.sh :)

Proposed functionality:

  • Installation:
    • Provision a Vagrant/VirtualBox cluster running Paz
    • Provision a DigitalOcean cluster running Paz
  • Register SSH keys
  • List status of Paz internal units
  • List declared services
  • Show status of running services
  • Add/edit/delete/scale services
  • Show status of hosts
  • Administer Paz configuration (e.g. domain/DNS)
  • Scale the cluster

Is this project active?

Hi, I would like to ask if this project is still active, since I am evaluating Docker-related PaaS options. Thanks.

Implement an out-of-the-box monitoring solution

Let's discuss what will become the out-of-the-box monitoring solution for Paz.

Currently we're using cAdvisor from the Kubernetes project. This is a good solution, but used in isolation it is limited because it doesn't store or let you search historical data.

Heapster is the evolution of this project and builds upon cAdvisor to provide a cluster-aware, searchable cAdvisor (effectively). At first glance it appears to be a good solution.

I'm open to anything else; this is not my area of expertise.

Discuss?

Exporting/Importing cluster configuration (templating)

I started playing around with Paz, and I am finding it very useful for some side projects I am working on.
One thing I am missing is the ability to export the current cluster configuration to an external file, just to be able to reimport it later instead of recreating everything anew via the UI.
The feature should be available both from the UI and via the API.
I will try to implement it this weekend, so wish me luck!
Any tips on where I should start?

Investigate simplifying, changing or replacing the Etcd/HAProxy service discovery layer

It's currently a bit brittle, and it's difficult to understand and remember how it works.

Some options:

  • Weave
  • Kubernetes
  • Keep what we have

For me Weave would be really helpful, but there are multiple ways we could use it and we should have a discussion with the Weave developers about this. Some options:

  • Use Weave to give each container a unique IP address and dispense with random Docker ports, simplifying the existing HAProxy "magic" we have
  • Use WeaveDNS as the service discovery mechanism and ditch HAProxy/Confd altogether. This would greatly simplify our stack, leaving the hard networking issues to the experts, at the cost of losing the ability to leverage HAProxy's zero-downtime-deployment features (which we're not really utilising yet, until #32 is done)

Access web app

I have installed Paz on DigitalOcean, but how do I access the web app? Entering the IP directly in the URL (port 80) doesn't work. Tested on stable, beta and alpha versions of CoreOS.

paz.sh

Currently the paz.sh website is still an email-capture page with a few descriptions of Paz. Will this be open-sourced so we can update the site with more up-to-date details and documentation? =D

Adding custom entries to haproxy is problematic

Currently the HAProxy confd template is generated by run.sh in the haproxy container. This means that the only way to change the template is to eclipse that script with your own that generates a different config file. This works fine, but has the unfortunate side-effect that the template only updates when the container is restarted.

This is just a quality of life type of thing.

Split up install and integration test scripts into usable pieces

Currently, once you've installed Paz you have no way of reinstalling it or fixing it if it fails without tearing it down and starting again (which takes ages), apart from SSHing in and fixing it manually, of course.

Split the installation script up into separate pieces: set up the cluster with Vagrant & cloud-config, tear down the cluster, install/reinstall units, and wait for units to start.

Re-running Paz cluster

Right now there's no way (that I'm aware of) of running Paz other than executing the install-vagrant.sh script, which destroys the current cluster and creates a new one (which takes its time).

Running vagrant up in the coreos-vagrant folder gives me random Connection timeout messages, and even when I can start the cluster without any apparent problems, SSHing into each machine shows:

Failed Units: 2
  cadvisor.service
  paz-dnsmasq.service

Implement centralised logging

We want all logs for all services to be tail-able, together and individually, from the command line, and also displayable within the UI. Searching would also be good.

From @sublimino in #19:

one-command cluster monitoring (i.e. journal -f on all units, got some fleet jiggerypokery to do this as there are silly TTY complications)

Is https://github.com/gliderlabs/logspout an option?

create-swap.service failed and how I solved

I have used the minimum DigitalOcean droplet (512 MB RAM), and the default userdata always gives me this error:
[screenshot: create-swap.service failed]

I solved it by changing both occurrences of Environment="SWAPFILE=/2GiB.swap" to 1GiB, and ExecStart=/usr/bin/fallocate -l 2048m ${SWAPFILE} to 1024m, in the userdata.
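The relevant userdata lines after the change would look like this (the rest of the create-swap.service unit is unchanged):

Environment="SWAPFILE=/1GiB.swap"
ExecStart=/usr/bin/fallocate -l 1024m ${SWAPFILE}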

installation

I'm going to do an installation and try to go by the book instead of guessing, so that onboarding new developers gets easier.

Services don't (always?) automatically restart when they stop

If a host is taken out, it's Fleet's responsibility to reschedule. If a container dies, it's systemd's responsibility. Therefore, investigate systemd unit files for internal Paz services and ensure they are configured to automatically restart when they exit.
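The standard systemd mechanism for this is a restart policy in each unit's [Service] section; a minimal sketch (values are suggestions, not what Paz currently ships):

[Service]
Restart=always
RestartSec=10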

nginx proxy in front of paz?

I have a historical nginx setup which proxies all my servers. What I do is publish into etcd, and I have confd watching that and writing out my nginx config file. I do this to keep requests to certain services locked to internal access only, while other services are public. I think this roughly matches what HAProxy does here, but just as a stop-gap until I convert over I was planning on using nginx in front of Paz (HAProxy).

This seems to work, but I do see some issues. The first is that occasionally the page refresh fails, and the second is that the services tab just errors. Looking through the network requests I was able to find that I needed to expose paz-web, paz-orchestrator and paz-orchestrator-socket. I also found that I needed to pass WebSocket connections with:

proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_http_version 1.1;

But I'm not sure where to begin to find out why things are still failing.

Also, please advise if it would just be easier to convert my services to HAProxy; I'm not against that at all. I am concerned about the availability of HAProxy, but I assume I can add in some IP restrictions for the proxied sites?
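For what it's worth, a hedged sketch of an nginx server block for the orchestrator WebSocket host might look like this; the server_name and upstream address are assumptions based on the /etc/hosts entries from the install output, not a tested configuration.

server {
    listen 80;
    server_name paz-orchestrator-socket.paz;

    location / {
        proxy_pass http://172.17.8.101;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;   # HAProxy routes on the Host header
    }
}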

Bare metal scripts and documentation

I'm interested in trying paz, but I have an existing coreos cluster on bare metal. I assume I just need to wget some unit files to pull down and run paz, but I don't see anything in the documentation about this.

It seems as simple as cloning the repo and running:

scripts/start-runlevel.sh 1 && scripts/start-runlevel.sh 2

but I would expect some documentation if it were that simple. Is the documentation just missing for bare metal?

Slow announcer for orchestrator

$ fleetctl -strict-host-key-checking=false -endpoint=http://172.17.8.101:4001 journal paz-orchestrator-announce.service
Feb 26 17:38:59 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:00 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:01 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:02 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:03 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:04 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:05 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:06 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:07 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory
Feb 26 17:39:08 core-01 sh[1195]: grep: HostIp:0.0.0.0: No such file or directory

However this timed out and then:

Feb 26 17:41:31 core-01 systemd[1]: paz-orchestrator-announce.service start-pre operation timed out. Terminating.
Feb 26 17:41:31 core-01 systemd[1]: Failed to start paz-orchestrator announce.
Feb 26 17:41:31 core-01 systemd[1]: Unit paz-orchestrator-announce.service entered failed state.
Feb 26 17:41:31 core-01 systemd[1]: paz-orchestrator-announce.service failed.
Feb 26 17:41:31 core-01 systemd[1]: paz-orchestrator-announce.service holdoff time over, scheduling restart.
Feb 26 17:41:31 core-01 systemd[1]: Stopping paz-orchestrator announce...
Feb 26 17:41:31 core-01 systemd[1]: Starting paz-orchestrator announce...
Feb 26 17:41:31 core-01 sh[26833]: Waiting for 49153/tcp...
Feb 26 17:41:31 core-01 systemd[1]: Started paz-orchestrator announce.
Feb 26 17:41:31 core-01 sh[26857]: Connected to 172.17.8.101:49153/tcp and 172.17.8.101:49154, publishing to etcd..

However the service seemed to be up and running:

$ fleetctl -strict-host-key-checking=false -endpoint=http://172.17.8.101:4001 journal paz-orchestrator.service
-- Logs begin at Thu 2015-02-26 16:40:27 UTC, end at Thu 2015-02-26 17:44:04 UTC. --
Feb 26 17:05:35 core-01 systemd[1]: Started paz-orchestrator: Main API for all paz services and monitor of services in etcd..
Feb 26 17:05:37 core-01 bash[16821]: {}
Feb 26 17:05:37 core-01 bash[16821]: { disabled: 'true',
Feb 26 17:05:37 core-01 bash[16821]: provider: 'dnsimple',
Feb 26 17:05:37 core-01 bash[16821]: email: '[email protected]',
Feb 26 17:05:37 core-01 bash[16821]: apiKey: '312487532487',
Feb 26 17:05:37 core-01 bash[16821]: domain: 'paz' }
Feb 26 17:05:37 core-01 bash[16821]: {"name":"paz-orchestrator_log","hostname":"1add37e0c392","pid":9,"level":30,"msg":"Starting server","time":"2015-02-26T17:05:37.887Z","src":{"file":"/usr/src/app/server.js","line":205},"v":0}
Feb 26 17:05:37 core-01 bash[16821]: {"name":"paz-orchestrator_log","hostname":"1add37e0c392","pid":9,"level":30,"msg":"paz-orchestrator is now running on port 9000","time":"2015-02-26T17:05:37.921Z","src":{"file":"/usr/src/app/server.js","line":194},"v":0}
Feb 26 17:05:37 core-01 bash[16821]: {"name":"paz-orchestrator_log","hostname":"1add37e0c392","pid":9,"level":30,"svcdir-url":"http://paz-service-directory.paz","msg":"","time":"2015-02-26T17:05:37.921Z","src":{"file":"/usr/src/app/server.js","line":195},"v":0}
