Coder Social home page Coder Social logo

go-mad's Introduction

Go MAD

Latest Release CircleCI codecov Github Actions APACHE-2.0 License

MySQL anonymized dump

This project aims to have a tool that allows mysql repositories to be quickly dump in an anonymized form.

what not to do

CommitStrip Illustration - we all do it, right?

Since we shouldn't, never ever, directly duplicate a production db to QA, testing or dev; this way we can safely perform that regression or performance test.

I have found numerous projects that strive to do somehow the same, but none gave me the tooling that would fit my requirements, which are, have a sort of faker and fill the mysql dump data with them

Notorious Projects that could do similar:

Installation

You can install it using go install or by using one of the pre compiled binaries available in the (releases)[https://github.com/doutorfinancas/go-mad/releases]

go install github.com/doutorfinancas/[email protected]

Usage

from shell, call:

go-mad my_database --config=config_example.yml

if you are using innodb, then the configurations recommended are:

go-mad my_database --config=config_example.yml --single-transaction --quick

It will create a larger dump, but wraps everything around a transaction, disables locks and dumps writes faster To reduce insert impact, you can replace the --quick with --insert-into-limit=10 or whichever limit size would be best for you.

The database argument is required. Currently, only exporting one database is supported

you can use either SQL direct commands or faker on rewrites. Else it's compatible with mtk-dump config

please refer to faker documentation here

Available Flags (all are optional)

Flag (short) Description Type
--host (-h) your MySQL host, default 127.0.0.1 string
--user (-u) your user to authenticate in mysql, no default string
--password (-p) password to authenticate in mysql, no default string
--port (-P) port to your mysql installation, default 3306 string
--config (-c) path to your go-mad config file, example below string
--output (-o) path to the intended output file, default STDOUT string
--char-set uses SET NAMES command with provided charset, default utf8 string
--trigger-definer changes trigger delimiter to the string you pass, default is ';' string
--insert-into-limit defines limit to be used with each insert statement, cannot use with --quick, default 100 int
--debug (-v) turns on verbose mode if passed bool
--quiet (-q) disables log output if passed bool
--skip-lock-tables skips locking mysql tables when dumping bool
--single-transaction does the dump within a single transaction by issuing a BEGIN Command bool
--quick dump writes row by row as opposed to using extended inserts bool
--add-locks add write lock statements to the dump bool
--hex-encode performs hex encoding and respective decode statement for binary values bool
--ignore-generated strips generated columns from create statements bool
--dump-trigger dumps triggers from database bool
--skip-definer skips definer of triggers dumps (used in conjuntion with --dump-trigger) bool

Configuration Example

rewrite:
  users:
    email: "'faker.Internet().Email()'"
    password: "'FAKE_PASSWORD'"
    username: "'faker.Internet().Email()'"
    # name: faker.Person().Name()
    name: "SELECT names FROM random WHERE id = users.id"

nodata:
  - actions
  - exports
  - tokens

ignore:
  - advertisers
  - transactions
  - cache

where:
  users: |-
    id < 5000

Contributing

Feel free to contribute to the project, as in form of opening issues as by submitting a pull request To do so:

  • Clone the project
  • Make sure you have golint-ci installed
  • If you have pre-commit, you can make hook-setup
  • Write your code, run make test and commit (it should be signed)
  • Open pull request and wait for our review :)

Next Steps (ToDos)

  • Adds support for triggers (thank you @shyim)
  • Adds support to exporting multiple databases at a time
  • Exports run in goroutines to accelerate when --parallel is passed
  • Add support for env vars
  • Feel free to expand this list

go-mad's People

Contributors

bolovsky avatar dependabot[bot] avatar espina2 avatar nussjustin avatar shyim avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

go-mad's Issues

cant find column named in Camelcase

When you try to animalize a column with a camelcase header it will not be found, it has to be lowercase. It would be nice if someone can change it so you can animalize columns with camelcase and lowercase

Rewrite only if data exists in the row

Currently, the rewrite feature replaces all rows with the value. I guess that hurts the reproducibility of bugs as the data is not the same.

My idea would be to change the SELECT to IF(colum IS NULL, NULL, REPLACEMNET) as X. With a new option.

What do you think about that? I could provide a PR

Progress bar to show progress

Currently, the only way to see what it is doing is to run SHOW processlist in another terminal.

Make we could add a progress bar which table is currently worked on ๐Ÿ˜…

Ignore generated columns

image

CREATE TABLE `product_keyword_dictionary` (
  `id` binary(16) NOT NULL,
  `language_id` binary(16) NOT NULL,
  `keyword` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `reversed` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci GENERATED ALWAYS AS (reverse(`keyword`)) STORED,
  PRIMARY KEY (`id`,`language_id`),
  UNIQUE KEY `uniq.language_id_keyword` (`language_id`,`keyword`),
  KEY `idx.product_keyword_dictionary.language_id` (`language_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ERROR 3105 (HY000) at line 11000: The value specified for generated column 'reversed' in table 'product_keyword_dictionary' is not allowed.

Is --quick really faster on importing?

First thanks for building this โค๏ธ

I understand from the description that is creating for any record an own INSERT INTO. Does it maybe make more sense to have an insert into row limiter rather than. like --insert-into-limit=50 adds only 50 records to insert into

And I would like to embed this into a cli command

Hex blob columns

Hey,

we use for all tables UUIDs saved in binary(16). Those fields should be hexed in the output SQL

triggers can not be imported using mysql cli

Hi, when triggers are dumped (#40) .. they can not be imported using the mysql cli as they contain the current mysql delimiter ;.

For example a dumped trigger looks like this:

--
-- Trigger `upd_check`
--

CREATE TRIGGER upd_check BEFORE UPDATE ON account
       FOR EACH ROW
       BEGIN
           IF NEW.amount < 0 THEN
               SET NEW.amount = 0;
           ELSEIF NEW.amount > 100 THEN
               SET NEW.amount = 100;
           END IF;
       END;

Which fails on the ; within the trigger -- within the IF statement.

According to the mysql documentation the delimiter needs to be changed to allow for this.

if you use the mysql program to define a trigger that executes multiple statements, it is necessary to redefine the mysql statement delimiter so that you can use the ; statement delimiter within the trigger definition. -- https://dev.mysql.com/doc/refman/5.7/en/trigger-syntax.html

So a trigger dump should maybe look like this:

--
-- Trigger `upd_check`
--
delimiter //
CREATE TRIGGER upd_check BEFORE UPDATE ON account
       FOR EACH ROW
       BEGIN
           IF NEW.amount < 0 THEN
               SET NEW.amount = 0;
           ELSEIF NEW.amount > 100 THEN
               SET NEW.amount = 100;
           END IF;
       END;//
delimiter ;

Then the dump can get imported.

With more than one trigger can wrap all triggers with:

delimiter //
-- triggers; each ending in //
delimiter ;

We use the shopware-cli tool (which uses this library) for creating a dev seed shared between devs and are having to rewrite the seed to accommodate for this.

--
MySQL Version: Server version: 5.7.40 MySQL Community Server (GPL)

FriendsOfShopware/shopware-cli#141

Gzip result file

Maybe we should generate a griped sql file at the end, as the file size can be huge

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.