rodionovsasha / jfixtures Goto Github PK

A tool for populating relational databases with yml-based test data

License: MIT License

Shell 0.08% Java 25.92% Groovy 74.00%

java testing test-data sql

jfixtures's Introduction

Preface

JFixtures creation is inspired by Ruby On Rails fixtures - it helps to define test data in a human readable YML format and then to transform the data to the SQL language which your database understands. So it is a sort of YML to SQL converter. As for java world, JFixtures could be compared with DBUnit library.

Please read our WIKI for more info

JFixtures VS plain SQL

With plain SQL it is hard to match values to column names even when you format SQL well:

INSERT 
  INTO users(id, first_name, last_name, middle_name, sex, age, is_admin, is_guest) 
  VALUES (5, 'Vladimir', 'Korobkov', 'Vadimovich', 'm', 29, true, false);

With JFixtures it is easier:

user_1:
  id: 5
  first_name: Vladimir
  last_name: Korobkov
  middle_name: Vadimovich
  sex: m
  age: 29
  is_admin: true
  is_guest: false

or the compact form:

vlad: { first_name: 'Vladimir', last_name: 'Korobkov', sex: 'm', age: 29 }
homer: { first_name: 'Homer', last_name: 'Griffin', sex: 'm', age: 45 }

References to other tables are numbers in plain SQL:

INSERT INTO comment(id, ticket_id, user_id, text) VALUES (1, 4, 8, 'Hello, world');

Numbers are hard to remember and to manage, they do not bring any associated and readable information. Like foreign key 4 says nothing about what is in the referred row of user table.

JFixtures allows to define your own text alias for every row and then to refer tables by aliases:

good_comment: # This is the alias for the row below
  text: This service is really great
  rate: 10
  ticket_id: write_wiki_ticket # refers to tickets table by write_wiki_ticket alias
  user_id: vlad_admin # refers to users table by vlad_admin alias

Since every row of every table has an alias, JFixtures takes care of automatic primary keys generation so no need to deal with these numbers at in 99% of cases. For the remaining 1% there is an ability to define primary keys values manually.
JFixtures resolves references to foreign tables by aliases instead of foreign key numeric values
JFixtures takes care of the tables order, finding the dependencies between tables and aligning the dependent tables first. For example, if table comments has foreign keys to tables users and tickets, then users/tickets go first in the result SQL to satisfy comments.
JFixtures provides a nice file system structure to break data into files:

  permissions.yml   # converts to "permissions" table
  addresses.yml     # converts to "addresses" table
- geo/ 
    coordinates.yml # converts to "geo.addresses" table
    locations.yml   # converts to "geo.locations" table
- chat/
    comments.yml    # converts to "chat.comments" table
    rooms.yml       # converts to "chat.rooms" table
  - customer/
      rooms.yml     # converts to "chat.customer.rooms" table
      costs.yml     # converts to "chat.customer.costs" table

JFixtures has circular dependencies detection between tables

Tech details

The project is written in pure Java and needs Java Runtime 8+. It is available on maven central as a library or there is a command line interface for it

JFixtures support a few database dialects:

SQL 99 which covers many databases(Postgres, Oracle, H2, SqlLite, Yandex ClickHouse, Sybase and many others)
MySql
Miscrosoft Sql
Adding a custom database support is easy or submit us an issue

JFixtures can export the result into XML, so it is possible then to transform the XML into any other custom format, either SQL or not

Dependencies

By the moment JFixtures has only one compile time dependency - org.yaml:snakeyaml:jar for parsing source YML files.

Feedback and contribution

We are happy if you contribute or submit an issue

Also, feel free to email us, the emails are in pom.xml

And join us on slack

Please read our WIKI for more info

jfixtures's People

Contributors

Stargazers

Watchers

Forkers

vladkorobkov hpmmrajbs

jfixtures's Issues

Uppercase `null`to `NULL`

Yaml has two valid null types: ~ and null as per snake yaml documentation

Both currently translate to null in output SQL code. We need to uppercase null so it becomes NULL and to cover this scenario with tests:

~ -> NULL
null -> NULL
Null -> NULL
nULL -> NULL
nulla -> 'nulla'

Before inserts custom SQL

Problem:
To be able to write a custom SQL before insert instructions(and after cleanup instruction).

Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    before_inserts:
      - // Doing table $TABLE_NAME
      - BEGIN TRANSACTION;

The before-insert section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

Before-insert instructions order must not be changed, duplicates(if there are) must not be removed.

Fail when foreign key name is the same as PK name

How it is possible to setup the same name for PK and for foreign key in scope of one table. This weird scenario should be checked and a exception should be thrown

Skip existing primary keys when a new PK is auto generating

WHEN JFixures is auto generating a new PK for a row in the table,
AND a row in the table with such a PK already exists,
THEN auto generator should generate a new value(next in sequence) and so on

Unite YmlRowsLoader with FixturesLoader

Pros:

Easier code base

Cons:

No deferred fixtures loading/yaml parsing - is some cases it might give a performance improvement

H2 support

We need to add H2 database support for a few reasons:

An ability to write fast end-to-end integration tests without need to install a heavy databases/dockers/etc
To support one more DB

Acceptance criteria:

Basing on adding h2 support, README.md file should be extended with an instruction of adding a custom database support
H2 support itself (see com.github.vkorobkov.jfixtures.JFixtures, com.github.vkorobkov.jfixtures.sql, com.github.vkorobkov.jfixtures.sql.dialects). I think the main goal is just to escape column/table names properly as well as string literals
Let's add one end-to-end test which will just create a single table with a few records - just to reuse it as a template in the future
Ensure code coverage remains 100%(see target/site/jacoco/index.html)
Add yourself into <developers> section of pom.xml
<version>1.0.5</version> should be increased to 1.0.6 in pom.xml

Computed by row aliases PKs and IDs

Tests need an ability to calculate ID(pk) by row aliases.
Ideally, calculating the id from row alias should be a pure function, like id = by_alias('homer'). This trick allows to use the same ID auto generation mechanism in both JFixtures and the tests within JFixtures.

The algorithm could base on the standard hashCode of String:

final LOWER_BOUND = 100_000;
final HASH_RANGE = Integer.MAX_VALUE - LOWER_BOUND;

public int calculate(String alias) { // do not copy this method name
  int hashWithOffset = alias.hashCode() % HASH_RANGE;
  return LOWER_BOUND + hashWithOffset;
}

Pros:

Pure function(no side effects, does not depend on any external state, the result depends only on the argument)
Easy
Fast
User can define their own ID's in the range 0...LOWER_BOUND which will never intersect with auto generated ones

Cons:

Collisions of auto generated IDs are possible:

=> 4.6587822898051644e-08

This is a probability of collisions when a table has 100 rows(don't think users would ever add more rows manually). The probability is low, so I think we can this approach. For that rare case of collision, we need to offer users to define their own IDs on 0...LOWER_BOUND range.

I think we need to do in a few small steps. Every steps would need a separate PR:

commentService.findByTopic(TOPIC_ID).id == [4, 5, 6, 8]

will look like

commentService.findByTopic(TOPIC_ID).id == HashId.list("homer", "bart", "lisa", "maggy")

Merge develop to master and release
Update core version in command line runner
Update tests in demo project

truncate clean up method

By now JFixtures generates clean up SQL for every table(before inserts):

DELETE FROM $table_name;

Add ability to change the instruction from DELETE to TRUNCATE in order to produce SQL like:

TRUNCATE $table_name;

A new table property is required:

tables:
   no_clean_up:
     tables: /.+
     clean_method: truncate # might be delete or none

Add SQLite syntax support

Configure output SQL formatting parameters

These could be:

The amount of empty lines between: tables, clean instructions, inserts, e.t.c.
Formatting insert instructions anyhow
Adding comments when table starts/ends
Adding comments with row aliases

Disallow dots in folder names

So sub folders with fixtures can not have dots to avoid collisions like below:

+ old/
  + searches/
    log.yaml        # old.searches.log
+ old.searches/
  log.yaml          # also old.searches.log?

Instead, an exception should occur when dot in a folder name is detected.

Clickhouse dialect

I would use your library for writing clickhouse integration tests. Could you add clickhouse dialect?

Escape single quote characters for String values

Singe quote characters ' should be replaced with '' for string literals in SQL statements

Table property for disabling cleanup

By now JFixtures generates clean up SQL for every table(before inserts):

DELETE FROM $table_name;

This behavior is hardcoded but should be switchable off. A new table property is required:

tables:
   no_clean_up:
     applies_to: /.+
     clean_method: none # might be delete or truncate(truncate is out of scope of this ticket)

See com.github.vkorobkov.jfixtures.processor.Processor#cleanupTable

Support Binary type

Yaml has a binary type - http://yaml.org/type/binary.html

Currently if we have a binary type in YAML:

  children: !!binary |
             R0lGODlhDAAMAIQAAP//9/X17unp5WZmZgAAAOfn515eXvPz7Y6OjuDg4J+fn5
             OTk6enp56enmlpaWNjY6Ojo4SEhP/++f/++f/++f/++f/++f/++f/++f/++f/+
             +f/++f/++f/++f/++f/++SH+Dk1hZGUgd2l0aCBHSU1QACwAAAAADAAMAAAFLC
             AgjoEwnuNAFOhpEMTRiggcz4BNJHrv/zCFcLiwMWYNG84BwwEeECcgggoBADs=

It simply concerts into VALUES (99560979, [B@303cf2ba);

We need to find a way to concert it to correct SQL binary field

http://www.ocelot.ca/datatype.htm
http://www.technowlogeek.com/programming/java/jdbc/the-sql-99-types-blob-binary-large-object/

After inserts custom SQL

Problem:
To be able to write a custom SQL after the insert instructions
Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    after_inserts:
      - // Completed table $TABLE_NAME
      - COMMIT TRANSACTION;

The after-insert section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

The instructions order must not be changed, duplicates(if there are) must not be removed.

customize lower bound of auto generated PKs

Currently when JFixtures auto generates primary keys it starts with value 10_000(See com.github.vkorobkov.jfixtures.processor.sequence.IncrementalSequence#LOWER_BOUND) and then increments for every row.

This value should be configurable per table. The minimal acceptable value is 1, otherwise an exception should be thrown.

Move the most part of README to wiki

README should have the overall description of the product;
Documentation should go to WIKI

Remove `type:sql` for column values

Currently when we don't want to escape a string we use type: sql in column values in fixture files:

vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name:
    type:  value
    value: Dan # Gets escaped because of type: value
robot:
  age: 0
  name:
    type: sql
    value: DEFAULT # Does not get escaped because of  type: sql

This could be done easier:

vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name: text:Dan # Gets escaped because of text:
robot:
  age: 0
  name: sql:DEFAULT # Does not get escaped because of `sql:` prefix

text: or sql: does not go to the resulting SQL
values after the prefix is not trimmed, so the spaces go to SQL
if no prefix text: or sql: is specified the value will be interpreted as text and will be escaped
com.github.vkorobkov.jfixtures.util.RowMergeConflictResolver could be removed

Update README to have a link to jfixtures-cmd project

Before cleanup custom SQL

Problem:
To be able to write a custom SQL before the cleanup instruction(if disabled, then before-cleanup means the same as before-inserts):
Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    before-cleanup:
      - // Beginning of the $TABLE_NAME
      - BEGIN TRANSACTION;

The before-cleanup section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

The instructions order must not be changed, duplicates(if there are) must not be removed.

Aliases for column name

If column names are too long or too difficult there may be useful to use shorter/better aliases to them in fixtures files.

Probably the scope of column aliases must be only for yaml fixtures, all the configuration should use original column names just like these are in the DB

Restrict binary and timestamp types

Restrict binary and timestamp types since we don't normally support them at the moment.

It is better to fail gracefully rather than to write to SQL meaningless values.

write page to Wiki based on the demo project

Like "Getting started with JFixtures, Spring Boot and JUnit" with examples of code and with a reference to demo proj

Add TRUNCATE_CASCADE clean option

See https://www.postgresql.org/docs/current/static/sql-truncate.html

Might be really useful when between test launches a table(users) refers to another table(logs), and this another referred table(logs) does not have fixture file and it does get cleaned up, correspondingly.

Is this case, the standard DELETE or TRUNCATE fails because of foreign key restriction.

Foreign key to any column of referred table

Now foreign key refers only to a PK of referred table, though, FK in SQL can point to any column of referred table

Export to custom formats

Export to custom formats which keeps tables ordering: think of YAML, XML, CSV, JSON

Have a mechanism to define tables order AND is different that relationships

Currently the order tables go to the destination SQL is not defined except one rule: referred tables go strictly before the referring ones.

However, except the relationship, there might be other cases when the tables order is important: table can have dependencies on triggers/stored procedures/default values level.

So there should be a way to say that table A goes before table B and these rules should have the highest priority, higher, than table ordering defined by relationships. The circular references should still be detected in the same manner.

A possible format of describing such dependencies in .conf:

# maybe a new <requires> table property
user_needs_roles:
  apply_to: users
  requires: [roles, permissions] # the required tables go before <users>

Add Sybase syntax support

Java9 adoption

Replace null checks with Optional

Replace null checks like == null or != null with Optional

Add loger and minimal verbosity

No new compile-time dependencies should be added - provided scope only.

Probably, at minimum, there might be following logs:

WARN: file ".cong.yml" not found, using defaults
Processing table "users"
Processing table "comments"
Processing table "permissions"
Processing table "photos"

custom increment step for auto generated PK

Now the step is always 1 but could be configured per table(s)

Clean tables which don't have fixtures

If there are tables in the DB which don't have fixtures it's good to have ability to clean them anyway.

Currently, the workaround is to create an empty fixture so the clean up instruction will be executed.
Though, the more elegant way is to specify such tables right in the config:

clean_tables: ['logs', 'statistics', 'garbage', 'slow_sql']

The instructions for cleaning these tables should go first. The order should be preserved, the duplicates should be skipped.

Add MS SQL syntax support

Support .yaml file extension

YML files have two valid possible extensions: .yml and .yaml

Currently we do not support .yaml extension and we need to fix it before the public release.

com.github.vkorobkov.jfixtures.config.ConfigLoader
Should lookup for .conf.yaml first, in case of failure should lookup fpr .conf.yml
com.github.vkorobkov.jfixtures.loader.FixturesLoader#isYml should be fixed:

.endsWith(".yml") || .endsWith(".yaml")

There is an edge case here: if the same fixture exists with both extensions(like users.yaml and users.yml we need to throw an exception with a readable message)

Self defined foreign keys

Probably for small project or database it would be useful to define foreign keys on the place:

Short(or relative) table name:

vlads_comment:
  text: Hello
  # users table should be looked up first in the same directory as current fixture is
  # otherwise the general lookup strategy from fixtures root folder should be user
  user_id: users:vlad # trying to find users table and vlad in it

Table name with schema:

vlads_comment:
  text: Hello
  # the same lookup strategy as above: relative first, canonical after 
  user_id: public.users:vlad # trying to find public.users table and vlad in it

Foreign key to any field other than to PK:

vlads_comment:
  text: Hello
  user_id: public.users:vlad:public_id # refers to public_id column of public.users table

Table relations defined in .conf.yml would be the default behaviour, but manually defined FKs would override it.

Handle Date type correctly

SnakeYaml has implicit conversion of strings like 2001-11-23 and 2001-11-23 15:02:31 to Date type. Currently, the Date type generates invalid SQL value:

INSERT INTO "users" ("id","date") VALUES (99560979, Fri Nov 23 21:03:17 MSK 2001);

Perhaps, we need to investigate this question and to generate more convenient dates for SQL in format YYYY-MM-DD / format: YYYY-MM-DD HH:MI:SS / unix timestamp.

More info:
https://bitbucket.org/asomov/snakeyaml/wiki/Documentation
https://bitbucket.org/asomov/snakeyaml/src/tip/src/test/java/examples/resolver/CustomResolverTest.java?fileviewer=file-view-default
https://www.w3schools.com/sql/sql_dates.asp

Fail when user defined PK already exists in table

WHEN row has a PK which is defined by user
AND table already has a row with such a PK
THEN jfixtures should throw an exception with a human readable message

Computed PK name

Some databases have strange convention for PK name: it is not fixed(like id or ID) but it is dynamic instead and depends on the table name: USER_ID, payment_id.

To support that, we need to inline some scripting when we set a table PK name. It could be a JS since Java has it's interpreter.

To overall design TBD

Custom SQL fragments before/after all

Custom SQL fragments before/after all the fixtures generation. In case user needs to do something common special in the database.

There are few ways to define these fragments:

In config file(inline SQL as well as reference to external SQL file)
File with conventional name which will be auto registered and auto executed:

- .before/
    01-clean-logs.sql
    02-recreate-sequences.sql
    03-apply-migrations.sql
- .after/
    01-do-something.sql

(file are sorted by name ASC and then executed)

Ability to set custom ID generator

Ability to set a custom ID generator per table(s) and globally. Should be a property like:

id_generator: com.package.IdGenerator.generate

next to other id related properties.

The property describes a path to a static method which will be used for ID generation. The method must be public, static, it must consume String and return Object. Otherwise, an exception should be thrown.

A few standard generator could be added to the project: LongId(the same as IntId but with wider range) and StringId(presumably can just return row aliases)

Throw NotSupportedException on Iterable yaml types

When fixture value type is iterable(array, list, map, e.t.c.) we need to throw NotSupportedException(or NotImplementedYesException until we understand how to support these types for different SQL dialects.

Full list of unsopported types/syntax:

list: [LITE, RES_ACID, SUS_DEXT]
map: {hp: 13, sp: 5}
!!omap, !!pairs, !!set, !!seq, !!map

Every type of syntax should be covered with a test.

Also, as a result of this bug, we need to raise epics to investigate/implement collections for particular SQL dialects(like map could be represented as a JSONB field of PG, !!set, !!seq, !!map as arrays of PG)

Write into DB directly in Fluent API?

Add fluent methods to write the result directly in a DB using JDBC driver class, connection url, username, password, or maybe using a provided JDBC connection/pool.

Might be useful in maven plugin so maven can upload the data to the DB by connection properties.

The command line tool should be modified to enable this feature - it should load an external JAR with DB driver.

Mono YAML file so small dataset

Depends and basing on #167
If we read YAML with com.github.vkorobkov.jfixtures.util.YmlUtil#load then we can just pass the output map into #167

For medium sized datasets(a few small tables) it may be the most valuable option to put the data into a single YAML file, so no need to think about directory structure:

# users table
users:
  homer:
    name: Homer
    age: 40
  bart:
    name: Bart
    age: 12

# admin.roles table
# beware the dots in table name should be escaped
"admin.roles": 
  admin:
    reads: true
    writes: true
  user:
    reads: true
    writes: false

$TABLE_NAME placeholder still must be replaced.

Uppercase boolean values in the output SQL

Should be TRUE / FALSE instead of true and false