
This project forked from greenplum-db/gpupgrade-archive


GPDB major version upgrade utility



gpupgrade

gpupgrade runs pg_upgrade across all segments to upgrade a Greenplum cluster across major versions. For further details read the Greenplum Database Upgrade documentation and blog post. We warmly welcome any feedback and contributions.

Purpose:

Greenplum has several ways of upgrading, including backup & restore and gpcopy. These methods usually require additional disk space for the copy and significant downtime. gpupgrade can do in-place upgrades without additional hardware or disk space, and with less downtime.

Creating an easy upgrade path enables users to upgrade quickly and confidently. This enables Greenplum to have faster release cycles with faster user feedback. Most importantly, it allows Greenplum to reduce its reliance on supporting legacy versions.

Supported Versions:

| Source Cluster | Target Cluster  |
|----------------|-----------------|
| 5              | 6               |
| 6              | 7 (future work) |

Architecture:

gpupgrade consists of three processes that communicate using gRPC and protocol buffers:

  • CLI
    • Runs on the master host
    • Consists of a gRPC client
  • Hub
    • Runs on the master host
    • Upgrades the master
    • Coordinates the agent processes
    • Consists of a gRPC client and server
  • Agents
    • Run on all segment hosts
    • Upgrade the standby, primary, and mirror segments
    • Execute commands received from the hub
    • Consist of a gRPC server
       CLI                     Hub                     Agent
      ------                  ------                  -------
  gRPC client    <-->      gRPC server
                                ^
                                |
                                V
                           gRPC client     <-->      gRPC server
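
The coordination role of the hub can be illustrated with a minimal Go sketch. It is not gpupgrade's actual code and uses no gRPC: the `Agent` interface, `fakeAgent`, `Hub.Dispatch`, the host names, and the command string are all hypothetical stand-ins, chosen only to show one process fanning a command out to every agent and collecting the failures.

```go
package main

import (
	"fmt"
	"sync"
)

// Agent is a hypothetical stand-in for the gRPC server running on a segment host.
type Agent interface {
	Execute(command string) error
}

// fakeAgent simulates an agent; it performs no real work.
type fakeAgent struct {
	host string
	fail bool
}

// Execute pretends to run the command and fails only if the agent is configured to.
func (a *fakeAgent) Execute(command string) error {
	if a.fail {
		return fmt.Errorf("%s: %q failed", a.host, command)
	}
	return nil
}

// Hub is a hypothetical stand-in for the hub process that coordinates the agents.
type Hub struct {
	agents []Agent
}

// Dispatch sends the command to every agent concurrently, waits for all of
// them, and returns the errors reported by the agents that failed.
func (h *Hub) Dispatch(command string) []error {
	results := make([]error, len(h.agents))
	var wg sync.WaitGroup
	for i, agent := range h.agents {
		wg.Add(1)
		go func(i int, agent Agent) {
			defer wg.Done()
			results[i] = agent.Execute(command)
		}(i, agent)
	}
	wg.Wait()

	var failures []error
	for _, err := range results {
		if err != nil {
			failures = append(failures, err)
		}
	}
	return failures
}

func main() {
	hub := &Hub{agents: []Agent{
		&fakeAgent{host: "sdw1"},
		&fakeAgent{host: "sdw2", fail: true},
	}}
	for _, err := range hub.Dispatch("upgrade-primaries") {
		fmt.Println("agent error:", err)
	}
}
```

In this sketch each agent writes only to its own slot of the results slice, so no mutex is needed; a real hub would make the same calls over gRPC connections.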

Steps:

Running gpupgrade consists of three main steps (initialize, execute, and finalize), plus an optional revert:

  • gpupgrade initialize
    • The source cluster can still be running. No downtime.
    • Substeps include creating the gpupgrade state directory, starting the hub and agents, creating the target cluster, and running pre-upgrade checks.
  • gpupgrade execute
    • This step will stop the source cluster. Downtime is needed.
    • Substeps include upgrading the master, copying the master catalog to the segments, and upgrading the primaries.
  • gpupgrade finalize
    • After finalizing, the upgrade cannot be reverted.
    • Substeps include updating the data directories and master catalog, and upgrading the standby and mirrors.
  • gpupgrade revert
    • Restores the cluster to the state before upgrade.
    • Can be run after initialize or execute, but not finalize.
    • Substeps include deleting the target cluster, archiving the gpupgrade log directory, and restoring the source cluster.
  start <---- run migration
    |            scripts  |
run migration             |
  scripts                 ^
    |                     |
    V                     |
initialize ---> revert ----
    |                     ^
    V                     |
 execute  ----> revert ----
    |
    V
 finalize
    |
run migration 
  scripts
    |
    V
   done
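
The transitions in the diagram above can be written down as a small lookup table. The following Go sketch is only an illustration under assumptions: the `Step` type, the `allowed` map, and `CanRun` are hypothetical names, the migration-script stages are omitted, and rerunning the same step after a failure is not modeled.

```go
package main

import "fmt"

// Step names one stage of the upgrade workflow shown in the diagram.
type Step string

const (
	Start      Step = "start"
	Initialize Step = "initialize"
	Execute    Step = "execute"
	Finalize   Step = "finalize"
	Revert     Step = "revert"
)

// allowed maps each step to the steps that may follow it.
// Finalize has no entry for Revert because a finalized upgrade cannot be reverted.
var allowed = map[Step][]Step{
	Start:      {Initialize},
	Initialize: {Execute, Revert},
	Execute:    {Finalize, Revert},
	Finalize:   {},
	Revert:     {Start},
}

// CanRun reports whether requested may follow current.
func CanRun(current, requested Step) bool {
	for _, s := range allowed[current] {
		if s == requested {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("revert after execute allowed:", CanRun(Execute, Revert))
	fmt.Println("revert after finalize allowed:", CanRun(Finalize, Revert))
}
```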

Each substep within a step implements crash-only idempotence. This means that if an error occurs and is fixed then on rerun the step will succeed. This requires each substep to clean up any side effects it creates, or possibly check if the work has been done.

Link vs. Copy Mode:

pg_upgrade supports two upgrade modes: link and copy.

| Attribute | Copy Mode | Link Mode |
|-----------|-----------|-----------|
| Description | Copies source files to the target cluster. | Uses hard links to modify the source cluster data in place. |
| Upgrade Time | Slow, since it copies the data before upgrading. | Fast, since the data is modified in place. |
| Disk Space | ~60% free disk space needed. | ~20% free disk space needed. |
| Revert Speed | Fast, since the source cluster remains untouched. | Slow, since the source files have been modified and the primaries and mirrors need to be rebuilt. |
| Risk | Less risky since the source cluster is untouched. | More risky since the source cluster is modified. |
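
The disk-space figures above can be turned into a simple pre-check. The Go sketch below is illustrative only: `Mode`, `RequiredFreeRatio`, and `HasEnoughSpace` are hypothetical names, and the 0.6 and 0.2 thresholds are taken from the approximate values in the comparison, not from gpupgrade's source.

```go
package main

import "fmt"

// Mode is the pg_upgrade mode being considered.
type Mode string

const (
	Copy Mode = "copy"
	Link Mode = "link"
)

// RequiredFreeRatio returns the approximate fraction of disk that should be
// free for the given mode (about 60% for copy, about 20% for link).
func RequiredFreeRatio(m Mode) (float64, error) {
	switch m {
	case Copy:
		return 0.6, nil
	case Link:
		return 0.2, nil
	default:
		return 0, fmt.Errorf("unknown mode %q", m)
	}
}

// HasEnoughSpace reports whether the free fraction of the disk meets the
// approximate requirement for the given mode.
func HasEnoughSpace(m Mode, freeBytes, totalBytes uint64) (bool, error) {
	if totalBytes == 0 {
		return false, fmt.Errorf("total disk size must be non-zero")
	}
	ratio, err := RequiredFreeRatio(m)
	if err != nil {
		return false, err
	}
	return float64(freeBytes)/float64(totalBytes) >= ratio, nil
}

func main() {
	// A disk that is 30% free: enough for link mode, not enough for copy mode.
	for _, m := range []Mode{Copy, Link} {
		ok, err := HasEnoughSpace(m, 30, 100)
		fmt.Println(m, ok, err)
	}
}
```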

Getting Started

Prerequisites

  • Golang. We currently develop against the latest stable Golang release, which was v1.16 as of October 2020.
  • protoc. This is the compiler for the gRPC protobuf system, which can be installed on macOS with brew install protobuf.
  • Run make && make depend-dev to install other developer dependencies. Note that make must be run first.

Setting up your IDE

For IntelliJ users set up the following:

Imports:

  • Preferences > Editor > Code Style > Go > select "Imports" tab
    • uncheck "Use back quotes for imports"
    • uncheck "Add parentheses for a single import"
    • uncheck "Remove redundant import aliases"
    • Sorting type: gofmt
    • check "Move all imports in a single declaration"
    • check "Group stdlib imports"
      • check "Move all stdlib imports in a single group"
    • check "Group"
      • check "Current project packages"

Copyright:

  • Preferences > Editor > Copyright > Copyright Profiles
    • Add new profile called "vmware" with the following text:

      // Copyright (c) 2017-2021 VMware, Inc. or its affiliates
      // SPDX-License-Identifier: Apache-2.0
      
  • Preferences > Editor > Copyright > Formatting > Go
    • select "Use custom formatting options"
    • For Comment Type: check "Prefix each line" and select "use line comment"
    • For Relative Location: select "Before other comments" and check "Add blank line after"

Formatting:

  • Install the "File Watchers" plugin
  • Preferences > Tools > File Watchers
    • Add a new file watcher choosing "go fmt"
    • Depending on your personal preference check "Auto-save edited files to trigger the watcher"
  • Saving the file (Ctrl+S) will now format it.
  • References: https://blog.jetbrains.com/go/2021/01/12/running-go-fmt-on-save/

Build and Test

make         # builds gpupgrade binary locally
make check   # runs tests
make install # installs gpupgrade into $GOBIN

Running

gpupgrade initialize --file ./gpupgrade_config
OR
gpupgrade initialize --source-gphome "$GPHOME" --target-gphome "$GPHOME" --source-master-port 6000 --disk-free-ratio 0
gpupgrade execute
gpupgrade finalize

Running Tests

Unit tests

make unit

Integration tests

Tests that run against the gpupgrade binary to verify the interaction between components. Before writing a new integration test please review the README.

make integration

Acceptance tests

Tests more end-to-end, acceptance-level behavior between components. The tests are located in the test directory and use the BATS (Bash Automated Testing System) framework, which can be installed on macOS with brew install bats-core. Please review the integrations/README.

# Some tests require GPDB installed and running
make gpupgrade-tests
make pg-upgrade-tests

All local tests

# Runs all local tests
make check

End-to-End tests

Creates a Concourse pipeline that includes various multi-host X-to-Y upgrade and functional tests. These cannot be run locally.

make set-pipeline

Concourse Pipeline

To update the pipeline edit the yaml files in the ci directory and run make set-pipeline.

The yaml files in the ci directory are concatenated to create ci/generated/template.yml. Next, go generate ./ci is executed, which runs go run ./parser/parse_template.go generated/template.yml generated/pipeline.yml to create ci/generated/pipeline.yml. Neither generated file, template.yml nor pipeline.yml, is checked in.

To update the production pipeline: PIPELINE_NAME=gpupgrade FLY_TARGET=prod make set-pipeline

To make the pipeline publicly visible run make expose-pipeline. This will allow anyone to see the pipeline and its status. However, the task details will not be visible unless one logs into Concourse.

Note: If your dev pipeline is failing on the build job while verifying the rpm, the most likely cause is that the latest tags on origin have not been synced to your remote. Syncing them allows the GPDB test rpm to have the correct version number. On your GPDB branch run the following:

$ git fetch --tags origin
$ git push --tags <yourRemoteName>

If you already flew a pipeline before pushing tags, you will likely need to delete it, push the tags, and re-fly it, as Concourse has some weird caching issues.

Generating gRPC code

To recompile proto files to generate gRPC client and server code run go generate ./idl

Bash Completion

To enable tab completion of gpupgrade commands source the cli/bash/gpupgrade.bash script from your ~/.bash_completion config, or copy it into your system's completions directory such as /etc/bash_completion.d.

Building

Cross-compile with:

  • make build_linux
  • make build_mac

Contributors

berlin-ab, bhuvnesh2703, bmdoil, chibin, dgkimura, dkrieger-pvtl, dotyjim-work, jchampio, jmcatamney, kalensk, kyeap-vmware, nadeemg, orhankislal, sambitesh, skahler-vmware, slari-pivotal, soumyadeep2007
