Coder Social home page Coder Social logo

simonw / sqlite-generate Goto Github PK

View Code? Open in Web Editor NEW
22.0 4.0 0.0 56 KB

Tool for generating demo SQLite databases

Home Page: https://sqlite-generate-demo.datasette.io/

License: Apache License 2.0

Python 100.00%
sqlite datasette-io datasette-tool

sqlite-generate's Introduction

sqlite-generate

PyPI Changelog License

Tool for generating demo SQLite databases

Installation

Install this plugin using pip:

$ pip install sqlite-generate

Demo

You can see a demo of the database generated using this command running in Datasette at https://sqlite-generate-demo.datasette.io/

The demo is generated using the following command:

sqlite-generate demo.db --seed seed --fts --columns=10 --fks=0,3 --pks=0,2

Usage

To generate a SQLite database file called data.db with 10 randomly named tables in it, run the following:

sqlite-generate data.db

You can use the --tables option to generate a different number of tables:

sqlite-generate data.db --tables 20

You can run the command against the same database file multiple times to keep adding new tables, using different settings for each batch of generated tables.

By default each table will contain a random number of rows between 0 and 200. You can customize this with the --rows option:

sqlite-generate data.db --rows 20

This will insert 20 rows into each table.

sqlite-generate data.db --rows 500,2000

This inserts a random number of rows between 500 and 2000 into each table.

Each table will have 5 columns. You can change this using --columns:

sqlite-generate data.db --columns 10

--columns can also accept a range:

sqlite-generate data.db --columns 5,15

You can control the random number seed used with the --seed option. This will result in the exact same database file being created by multiple runs of the tool:

sqlite-generate data.db --seed=myseed

By default each table will contain between 0 and 2 foreign key columns to other tables. You can control this using the --fks option, with either a single number or a range:

sqlite-generate data.db --columns=20 --fks=5,15

Each table will have a single primary key column called id. You can use the --pks= option to change the number of primary key columns on each table. Drop it to 0 to generate rowid tables. Increase it above 1 to generate tables with compound primary keys. Or use a range to get a random selection of different primary key layouts:

sqlite-generate data.db --pks=0,2

To configure SQLite full-text search for all columns of type text, use --fts:

sqlite-generate data.db --fts

This will use FTS5 by default. To use FTS4 instead, use --fts4.

Development

To contribute to this tool, first checkout the code. Then create a new virtual environment:

cd sqlite-generate
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and tests:

pip install -e '.[test]'

To run the tests:

pytest

sqlite-generate's People

Contributors

simonw avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

sqlite-generate's Issues

Add a progress bar

I'm running this and it's taking ages:

sqlite-generate big.db --tables=100 --rows=100000 --columns=4 --seed=fixed

Error: "Cannot choose from an empty sequence"

The foreign key populating code throws an error if a table exists with 0 rows in it:

Generating rows  [####################################]  975/975  100%
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/bin/sqlite-generate", line 11, in <module>
    load_entry_point('sqlite-generate', 'console_scripts', 'sqlite-generate')()
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/sqlite-generate/sqlite_generate/cli.py", line 127, in cli
    {column: fake.random.choice(table_pks_cache[other_table])},
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.8/lib/python3.8/random.py", line 290, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

--fks errors on second run against same database

$ sqlite-generate % sqlite-generate data.db
Generating rows  [####################################]  1035/1035  100%
Populating foreign keys  [###########-------------------------]  7/22   31%
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/bin/sqlite-generate", line 11, in <module>
    load_entry_point('sqlite-generate', 'console_scripts', 'sqlite-generate')()
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/sqlite-generate/sqlite_generate/cli.py", line 113, in cli
    db[table_name].add_foreign_key(column, other_table, "id")
  File "/Users/simon/.local/share/virtualenvs/sqlite-generate-_TC9iTyA/lib/python3.8/site-packages/sqlite_utils/db.py", line 756, in add_foreign_key
    raise AlterError(
sqlite_utils.db.AlterError: Foreign key already exists for return_id => result_beautiful_responsibility.id

Ability to create foreign keys

Maybe an --fks option which causes some of the columns to randomly link to other existing tables with a foreign key relationship.

--columns= option

Right now every table gets just two columns: id and name. We can do something more interesting than that.

Maybe --columns=10 for number of columns per table - and always have a name column, but the other X-2 columns are randomly picked from a family of name/type/faker-generator tuples.

Ability to create composite primary keys, or tables with no explicit primary key

Perhaps via a --pks option, to mirror the syntax in #7:

  • --pks=1 is the default, as I believe is currently the case.
  • --pks=0 results in tables that do not have a named column that could become an alias for the ROWID, meaning that SQLite adds its own implicit primary key column in addition to columns defined by the user. See SQLite docs for more detail here: https://www.sqlite.org/lang_createtable.html#rowid
  • --pks=2 results in tables that have a composite, two-column primary key. For simplicity of implementation, I'd say it's fine to mandate that components columns of the primary key are always of integer type.
  • --pks=0,2 means that tables may have between 0 and 2 named columns in the primary key, analogous to the syntax in #7.

Foreign keys to tables with composite primary keys would obviously have to be composite themselves, which would make the generation logic a bit trickier.

Tables with no explicit primary key cannot have other tables pointing to them via foreign keys, but may point to other tables via their own foreign keys.

--rows=100,1000 option

The tool currently puts between 0 and 200 rows in each generated table:

db[table_name].insert_all(
[{"name": fake.name()} for j in range(fake.random.randint(0, 200))]
)

The --rows argument will customize this. --rows=100,1000 means "between 100 and 1000 rows". --rows=200 means "exactly 200 rows".

Live demo

A demo shipped by every commit would be useful. I can use Vercel for this.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.