Comments (9)

goldsam commented on June 8, 2024 12

I ended up solving this by creating a helper script which explicitly deletes every item from one or all tables. I invoke these methods in a beforeEach block to reset database state before each test. Although this works, it has some limitations.

What I quickly discovered was that Jest runs test files in parallel, which becomes problematic for code using a shared resource such as DynamoDB due to race conditions. I ended up solving this (at least temporarily) by restructuring my tests so that all code using a given table is invoked from one root test file and thus executed sequentially.

A better solution would be to create distinct "environments" for each test. I can think of a few approaches:

  1. Within the same DynamoDB instance, create uniquely named tables for each test using some kind of random prefix or suffix. That prefix or suffix could then be injected into the test so it knows what table names to use. This seems ugly to me.
  2. Instead, allow the test code to simply set up and tear down the database state manually. It would be nice to simply invoke a method and pass the tables configuration array, similar to what is supported in jest-dynamodb-config.js. Admittedly, this solution requires no changes to your library.
  3. Spin up a new DynamoDB instance for each test and inject the port number of that instance into the test environment. This has a lot of cost overhead but provides a strong level of data isolation.
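
Option 2 could look roughly like the sketch below. It assumes an AWS SDK v2 style client (methods returning an object with `.promise()`) is passed in, and that `tables` has the same shape as the `tables` array in jest-dynamodb-config.js, i.e. a list of createTable parameter objects. The names `setupTables` and `teardownTables` are hypothetical and not part of jest-dynamodb.

```javascript
// Hypothetical helpers, not part of jest-dynamodb.
// `client` is assumed to be an AWS SDK v2 style DynamoDB client.
// `tables` is assumed to be an array of createTable parameter objects.

// Create every table described in the configuration array, one after another.
async function setupTables(client, tables) {
  for (const table of tables) {
    await client.createTable(table).promise();
  }
}

// Delete every table named in the configuration array, one after another.
async function teardownTables(client, tables) {
  for (const { TableName } of tables) {
    await client.deleteTable({ TableName }).promise();
  }
}
```

A test file would call `setupTables` in `beforeEach` and `teardownTables` in `afterEach`. This alone does not isolate test files that run in parallel and share table names.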

Although this is arguably an unrelated or secondary problem at best, it seemed worthwhile to at least start a discussion.

Below is the dynamodb-utils.ts helper script I mentioned above:

import { DynamoDB } from 'aws-sdk';
import { AttributeMap, KeySchemaElement, Key } from 'aws-sdk/clients/dynamodb';

// batchWriteItem accepts at most 25 requests per call
const MAX_BATCH_SIZE = 25;

// Extract only the key attributes (per the table's key schema) from an item
function itemToKey(item: AttributeMap, keySchema: KeySchemaElement[]): Key {
    const itemKey: Key = {};
    for (const { AttributeName } of keySchema) {
        itemKey[AttributeName] = item[AttributeName];
    }
    return itemKey;
}

export async function clearTable(dynamoDB: DynamoDB, tableName: string): Promise<void> {
    // get the table keys
    const { Table = {} } = await dynamoDB
        .describeTable({ TableName: tableName })
        .promise();

    const keySchema = Table.KeySchema || [];

    // get the items to delete; a scan is paginated, so follow LastEvaluatedKey
    const items: AttributeMap[] = [];
    let exclusiveStartKey: Key | undefined;
    do {
        const scanResult = await dynamoDB.scan({
            AttributesToGet: keySchema.map(key => key.AttributeName),
            TableName: tableName,
            ConsistentRead: true,
            ExclusiveStartKey: exclusiveStartKey,
        }).promise();
        items.push(...(scanResult.Items || []));
        exclusiveStartKey = scanResult.LastEvaluatedKey;
    } while (exclusiveStartKey);

    // delete the items in batches of at most MAX_BATCH_SIZE
    for (let i = 0; i < items.length; i += MAX_BATCH_SIZE) {
        const deleteRequests = items.slice(i, i + MAX_BATCH_SIZE).map(item => ({
            DeleteRequest: { Key: itemToKey(item, keySchema) },
        }));

        await dynamoDB
            .batchWriteItem({ RequestItems: { [tableName]: deleteRequests } })
            .promise();
    }
}

export async function clearAllTables(dynamoDB: DynamoDB): Promise<void> {
    // TableNames is optional in the response type, so default to an empty list
    const { TableNames = [] } = await dynamoDB.listTables().promise();
    for (const tableName of TableNames) {
        await clearTable(dynamoDB, tableName);
    }

    // wait 500 ms before returning
    await new Promise(resolve => setTimeout(resolve, 500));
}
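
One caveat with `batchWriteItem`: instead of failing, it may hand some requests back in `UnprocessedItems`, and the caller is expected to resubmit them. The sketch below shows one way to do that. It assumes an AWS SDK v2 style client is passed in; the helper name `batchWriteWithRetry`, the attempt limit, and the delay are hypothetical choices, not something from the script above.

```javascript
// Hypothetical helper; `client` is assumed to be an AWS SDK v2 style DynamoDB client.
// `requestItems` has the shape of the RequestItems parameter of batchWriteItem.
async function batchWriteWithRetry(client, requestItems, maxAttempts = 5) {
  let pending = requestItems;
  for (let attempt = 1; ; attempt++) {
    const { UnprocessedItems = {} } = await client
      .batchWriteItem({ RequestItems: pending })
      .promise();

    // An empty map means every request in this call was processed.
    if (Object.keys(UnprocessedItems).length === 0) {
      return;
    }
    if (attempt >= maxAttempts) {
      throw new Error(`unprocessed items remain after ${maxAttempts} attempts`);
    }

    // Resubmit only what was handed back, after a short linear backoff.
    pending = UnprocessedItems;
    await new Promise((resolve) => setTimeout(resolve, 50 * attempt));
  }
}
```

In `clearTable`, the direct `batchWriteItem` call could be swapped for this helper if leftover items turn out to be a problem.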

from jest-dynamodb.

freshollie commented on June 8, 2024 10

@goldsam FYI - I rewrote this library with this use case in mind:

https://github.com/freshollie/jest-dynalite

I used dynalite as a mock for dynamo instead of dynamodb-local for several reasons.

Firstly, dynalite is much lighter and so allows us to spin up a single instance for each runner.
Secondly, dynalite allows tables to be created and destroyed quickly.
Thirdly, dynalite does not need Java to run.

jest-dynalite provides isolation between tests and between test suites. Give it a go.

ohmtech-rdi commented on June 8, 2024 4

Just to add to this thread, as we had a lot of discussions about it here (and thanks, everyone, for the hard work on this!):

dynalite doesn't support transactions or DynamoDB Streams, so if you need them, that's an immediate show-stopper.

Unfortunately, you can't really rely on the workaround mentioned earlier in this issue: DeleteRequest operations are writes, and because read operations are eventually consistent (whatever you request, see below), you might get "zombie" items after deleting each table's items, which is probably not what you expect from a unit-test isolation perspective.

This issue is quite rare, but running more than 10,000 tests shows it from time to time, which makes for flaky tests. There is no way around it, as one limitation of AWS DynamoDB local is that it does not honor strongly consistent reads:

Read operations are eventually consistent. However, due to the speed of DynamoDB running on your computer, most reads appear to be strongly consistent.
Source (Emphasis on word "most" added).

Said differently, starting a test by assuming that the database is empty requires strongly consistent reads if you "clean" a table before running a test, and you can't assume those because of DynamoDB local's limitations.

We implemented solution 1 from @goldsam's post above (creating a new table for each test). Even though it was described as ugly, we believe it is, at least conceptually, a classic test isolation strategy (avoiding collisions by partitioning the space), and the best approach available short of spinning up a new database instance each time (as the excellent jest-dynalite does).

This offers some important features:

  • Each single test is truly isolated from others,
  • Because of that, they can run in parallel, which quickly becomes important if you try to run thousands of tests locally in less than a minute, or in less than 10 minutes on CI, over multiple Jest workers.

The following assumes that you follow the single-table design with DynamoDB.

Here is our setup:

// jest-dynamodb-config.js

module.exports = {
   tables: [], // A new table is created before each test, so don't declare anything here
   port: 8000,
   options: [
      '-sharedDb',
      
      // This uses a `:memory:` SQLite in-memory database, which is
      // orders of magnitude faster than the file-based mode.
      // This makes creating a new table instant, and accelerates all database operations.
      // This should probably be a `jest-dynamodb` default.
      '-inMemory', 
   ]
};

In your jest tests:

beforeEach(async () => {
   // reset all modules to isolate every single test...
   jest.resetModules();
   jest.mock('./../store/client');

   const { prepareNewTable } = require('./helper');
   const tableName = await prepareNewTable();
   
   // ... so that your database layer uses the new table name for every test
   process.env.TableName = tableName;
});

helper.js creates a new table based on the CloudFormation template, but gives the table a random name each time to provide isolation:

// helper.js

'use strict';

const crypto = require('crypto');
const fs = require('fs');

const { dynamoDBClient } = require('./../store/client');
const { CreateTableCommand } = require("@aws-sdk/client-dynamodb");
const yaml = require('js-yaml');
const { CLOUDFORMATION_SCHEMA } = require('cloudformation-js-yaml-schema');


// ---------------------------------------------------------------------------

const getCloudFormationDynamoDbTableSchema = () => {
   const templateYaml = '../template.yaml';
   const templateYamlContent = fs.readFileSync(templateYaml, 'utf8');
   const cf = yaml.load(templateYamlContent, { schema: CLOUDFORMATION_SCHEMA });

   const resources = Object.values(cf.Resources);

   const tables = resources
      .filter(r => r.Type === 'AWS::DynamoDB::Table')
      .map(r => {
         let table = r.Properties;
         delete table.TableName;                // will be renamed
         delete table.TimeToLiveSpecification;  // errors on DynamoDB local
         return table;
      });

   return tables[0];  // we have only one table per service
};

const TABLE_SCHEMA = getCloudFormationDynamoDbTableSchema();


// ---------------------------------------------------------------------------

const prepareNewTable = async () => {
   const tableName = crypto.randomBytes(16).toString('hex');

   await dynamoDBClient.send(
      new CreateTableCommand({
         ...TABLE_SCHEMA,
         TableName: tableName,
      })
   );

   return tableName;
};


// ---------------------------------------------------------------------------

module.exports = {
   prepareNewTable,
};

The store implementation:

'use strict';

const {
   GetCommand,
   QueryCommand,
   TransactWriteCommand,
   UpdateCommand,
} = require('@aws-sdk/lib-dynamodb');

const { dynamoDBClient } = require('./client');

// this gets evaluated on each single test,
// because modules are reset for each single test
const { TableName } = process.env;

Running around 1000 tests for a service (with all the rest of the code) takes around 10 seconds on a 10-CPU computer, and around 2 minutes on GitHub Actions, with DynamoDB taking the most time for each test. That's a bit slow, but it means you can probably run 5000 tests in around the ideal 10 minutes for CI, which in most cases should provide an excellent level of unit testing.

Finally, this approach keeps any test infrastructure from leaking into your production code.

vladholubiev commented on June 8, 2024 3

Great job, @freshollie. I've mentioned jest-dynalite in README: https://github.com/shelfio/jest-dynamodb#alternatives

adrians5j commented on June 8, 2024 1

Having the same problem myself. Good thing I found this issue and now I know it's not just me. 🙂

vladholubiev commented on June 8, 2024

Hey Sam!

I totally agree, this will simplify my tests as well. I'll put this on my agenda.

blakedietz commented on June 8, 2024

Is there a way to utilize support for transactions to make this even faster?

goldsam commented on June 8, 2024

@blakedietz Do you mean batch operations? Yes, batch operations would have definitely been faster. Another possible approach might be to:

  1. Read the table schema definition.
  2. Delete the table in one operation.
  3. Recreate the table using the schema definition from (1)
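
The three steps above can be sketched as follows. This assumes an AWS SDK v2 style client is passed in and that the table has no secondary indexes, streams, or TTL settings, since those are not copied. The names `toCreateTableParams` and `recreateTable` are hypothetical.

```javascript
// Hypothetical sketch; `client` is assumed to be an AWS SDK v2 style DynamoDB client.

// Builds createTable parameters from the `Table` object returned by describeTable.
// Only the base table is handled: secondary indexes, streams and TTL are ignored.
function toCreateTableParams(table) {
  const params = {
    TableName: table.TableName,
    KeySchema: table.KeySchema,
    AttributeDefinitions: table.AttributeDefinitions,
  };
  const billingMode = table.BillingModeSummary && table.BillingModeSummary.BillingMode;
  if (billingMode === 'PAY_PER_REQUEST') {
    params.BillingMode = 'PAY_PER_REQUEST';
  } else {
    // describeTable adds read-only fields here, so copy only the two settable ones
    params.ProvisionedThroughput = {
      ReadCapacityUnits: table.ProvisionedThroughput.ReadCapacityUnits,
      WriteCapacityUnits: table.ProvisionedThroughput.WriteCapacityUnits,
    };
  }
  return params;
}

async function recreateTable(client, tableName) {
  // 1. read the table schema definition
  const { Table } = await client.describeTable({ TableName: tableName }).promise();

  // 2. delete the table and wait until it is gone
  await client.deleteTable({ TableName: tableName }).promise();
  await client.waitFor('tableNotExists', { TableName: tableName }).promise();

  // 3. recreate the table from the saved schema and wait until it is usable
  await client.createTable(toCreateTableParams(Table)).promise();
  await client.waitFor('tableExists', { TableName: tableName }).promise();
}
```

Whether this is faster than deleting items one batch at a time depends on table size and on how quickly the local instance creates tables, so it is worth measuring.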

msoffredi commented on June 8, 2024

I'm using @goldsam utils (thank you very much for sharing!), adapted to TypeScript and dynamoose (it was already 95% compatible, though), and I found an interesting issue I want to share here. Keep reading if you are trying the same and running multiple tests fails, but running them individually works just fine.

By clearing your tables all at once on every run, you may end up in a race condition because Jest tries to optimize by running multiple test files simultaneously. You may end up having tests affect other tests by clearing entire tables in the middle of a test run.

If this is the case for you, an easy fix is just to run tests sequentially. I prefer this to other options.

Other options:

  • Ensure tests don't get affected by other tests or existing data (sometimes hard)
  • Make each test clean its own data. This approach prevents you from re-using mock data on multiple tests.
