markitx / dynamo-backup-to-s3 Goto Github PK

View Code? Open in Web Editor NEW

141.0 141.0 106.0 138 KB

Stream DynamoDB backups to S3

License: MIT License

JavaScript 100.00%

dynamo-backup-to-s3's People

Contributors

Stargazers

Watchers

Forkers

rambler jalieven smalltownheroes hiteshjoshi chadkouse nathanwelch bradoyler karolisl pureclarity dharrisio playkot-dev dwkerwin rijalati jianyuwang edgarchinchilla sahilsk smashgg apoorvmalhotra sdesalas endymion00 hekard2l jdhom blackwoodseven nloiselle miles990 mpaalanen awolden koaninc sujithvellat kileynaromw pushpay cyft black4clover qzaidi mousavian pareddy113 redbirdhq celtra peterjuras rupshabagchi minhthuong bhague1281 mllona skycatch gabrielkaputa christophgysin mkrue2015 pesticles johnnynotsolucky jsyang-dev jaysb2710 singhalavi rakeysh17 koenvg timestopper willpickering7 prabz evangelion1204 tayloryork rokka-io f1r3er majindageta artemkv rajatonit tzach agentgoldpaw shark0der therealchristhomas divyavanmahajan tpoingt vladimirsto snolan-uturn hlta fabrik85 boehlkers omerkarabacak bhavana-hk sitem8 lbernardo harshathadkapally ebekker buddhabuddy joetraviskiva jaaromy ziggy6792 sathish97625 markyim-klaytn dayagalanikhil kartik-ramc glee2429 jacksonkasi1

dynamo-backup-to-s3's Issues

undefined error

Output:
Starting to copy table db_test
Error backing up undefined
undefined
Error backing up db_test
[Error: Cannot initiate transfer]
Done copying table db_test
Finished backup
Error backing up undefined
undefined
Error backing up db_test
[Error: Cannot initiate transfer]
Done copying table db_test
nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:30
throw err;
^
Error: Callback was already called.
at nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/async/lib/async.js:30:31
at nodejs/dynamodb/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:119:21
at Uploader. (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:63:16)
at Uploader.emit (events.js:107:17)
at Response. (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/lib/uploader.js:71:17)
at Request. (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:353:18)
at Request.callListeners (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:595:14)
at Request.transition (nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:21:10)

Script :

var DynamoBackup = require('dynamo-backup-to-s3');

var backup = new DynamoBackup({
includedTables: ['db_test'],
bucket: "dynamodb-backup",
awsAccessKey: "xxxxx",
awsSecretKey: "xxxxx",
awsRegion: "us-west-1"
});

backup.on('error', function(data) {
console.log('Error backing up ' + data.tableName);
console.log(data.error);
});

backup.on('start-backup', function(tableName) {
console.log('Starting to copy table ' + tableName);
});

backup.on('end-backup', function(tableName) {
console.log('Done copying table ' + tableName);
});

backup.backupAllTables(function() {
console.log("Finished backup");
});

Modules:-

Missing items in backup

Hi I have a table with ~1200 items and when I back it up I used to only get about half of them on version 0.1.9. I updated to the latest version and now I get more but still not all. Interestingly, the percentage of records backed up goes down as I increase the read throughput. For example, I get ~850 when I have read percentage at 75% but ~1100 at 25%.

I did some initial debugging by modifying the code to emit an event for every item that's processed and I can confirm that all items are processed. Additionally if I modify dynamo-backup.js to save all records to a file and then upload that file, I can confirm they are all processed and uploaded. To me this points to some bug either in the readable stream or in what is handling the S3 upload stream. Given that the percentage of items backed up goes up as the read rate goes down, I'm guessing it's reading from dynamo faster than it can write to S3 and that the readable stream cannot handle this.

Is anyone aware of any issues that would cause this? My temporary fix in my fork (master...smashgg:backup-to-file) is to add an option to backup to file and then upload that file after the dynamo backup has completed.

Node version: Tried in 0.10.x, 0.12.x, and 6.2.2 with fresh installs between each version change
dynamo-backup-to-s3 version: 0.4.x

Support for KMS server-side encryption?

I've made some changes to support KMS encryption. You can just provide a kmsKeyId and it'll set everything up.

Is this interesting enough to warrant a PR?

Is CLI still working?

I try to run the backup CLI but no matter what parameters I supply, I just get an "all done" message right away:

macair$ bin/dynamo-backup-to-s3 -b BBB
Finished backing up DynamoDB

Has anybody tried this CLI recently? Shouldn't it at least give verbose data instead of providing a false message (obviously it hasn't finished backup. Moreover. It hasn't even started doing so)

I'm glad to do a PR but I wanted to be sure I'm not missing something silly first

restore db tables : batch multi thread issue

Hi while I am using this tool to do the restore multiple tables, only the first table restore succeed and then fall into the infinite loop with the message:

Error processing batch, putting back in the queue.
Batch sent. 1 in flight. 0Kb remaining to download...

I am using the command tool, and this is the snippit:

restore_tables() {
    cd bin
    aws s3 ls $S3_RESTORE_LOC | awk '{print $4}' | while read tablenamejson; do
        tablename=$(echo $tablenamejson | cut -d'.' -f1)
        tablesize=$(aws s3 ls $S3_RESTORE_LOC$tablenamejson | awk '{print $3}')
        if [[ $tablesize -eq 0 ]]; then
            echo "table size is 0. skip restore $tablename"
        else
            ./dynamo-restore-from-s3 -s "$S3_RESTORE_LOC$tablenamejson" -t "$tablename" --overwrite \
                --aws-key "$AWS_KEY" --aws-secret "$AWS_SECRET" --aws-region "$REGION_NAME"

            sleep 1 ### even with sleep, it didn't go to this step
            read -r -p "Press any key to continue" ### not go to this step yet while inifitite loop happens
        fi
    done
}

do you have the issue? and how to resolve ... ?

appreciated!

Duration calculation

Hi,
I am interested to know if there has been any reason behind choosing an external lib (moment-range) over using moment's diff?

If it make sense to use moment's difference, I will be happy to make a PR replacing that.

Option save-datapipeline-format not working

The save-datapipeline-format property does not work just the short -d

17:53:26 [bin] Running shell script
17:53:26 + ./dynamo-backup-to-s3 -d true --included-tables TABLE --bucket BUCKET --aws-key KEY --aws-secret SECRET --aws-region eu-central-1
17:53:26 Finished backing up DynamoDB

17:57:54 [bin] Running shell script
17:57:54 + ./dynamo-backup-to-s3 --save-datapipeline-format true --included-tables TABLE --bucket BUCKET --aws-key KEY --aws-secret SECRET --aws-region eu-central-1
17:57:55 
17:57:55   error: unknown option `--save-datapipeline-format'

LSI and GSI

For tables that has GSI , the script does not copy GSI from source to destination, is there a way to copy GSI from source to destination. I am using cli to copy the data

Unexpected end of input - restorartion

Under the test I've tried to restore a table(200Mb - 525k items) from S3 using your tool, however, I came across info pasted below. Making changes in concurrency hasn't resolved the case. I would be grateful if you were able to point the cause and how it should be mitigated.

./bin/dynamo-restore-from-s3 -t ProductRR -c 200 -s s3://xxx/xxx/Products.json --readcapacity 200 --writecapacity 200 --aws-region eu-central-1 --partitionkey Id
...
Batch sent. 51 in flight. 213Mb remaining to download...
Batch sent. 52 in flight. 213Mb remaining to download...
Batch sent. 53 in flight. 213Mb remaining to download...
Batch sent. 54 in flight. 213Mb remaining to download...
undefined:1
{"YPosition":{"S":"55e4fb166e18335069f9b21783a222"},"Title":{"S":"a6643fda61abf5c7d34f"},"PageCount":{"S":"6374237438b50df9ee129b5c22d3c1746"},"Dimensions":{"S":"96a1d5a24bb982406397

SyntaxError: Unexpected end of input
at Object.parse (native)
at DynamoRestore._processLine (/home/nubes/dynamo-backup-to-s3/lib/dynamo-restore.js:173:55)
at emitOne (events.js:77:13)
at Interface.emit (events.js:169:7)
at PassThrough.onend (readline.js:92:12)
at emitNone (events.js:72:20)
at PassThrough.emit (events.js:166:7)
at endReadableNT (_stream_readable.js:905:12)
at nextTickCallbackWith2Args (node.js:441:9)
at process._tickDomainCallback (node.js:396:17)

restore by specific PK

I want to restore data across the multiple table with specific key, is it possible? if so let me know how can I achieved this?

for e.g I have following tables [Users, Transactions, Reports], relation is 1 user has many transactions and multiple transactions has single report so I want to restore data only user which has PK[userId] is 1

Thanks

Handle GlobalSecondaryIndexes throughput

With the -c flag we can provision an higher throughput for writing, but if we have global secondary indexes with a low write capacity this will be useless.

I believe we should also increase the capacity for GSI's and when finished coming back to the same state

[Error: Cannot initiate transfer]

Starting to copy table db_test
Error backing up undefined
{ table: 'db_test',
err: [Error: Cannot initiate transfer] }
Error backing up db_test
{ tableName: 'db_test',
error: [Error: Cannot initiate transfer] }
/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:30
throw err;
^
ReferenceError: stopOnFailure is not defined
at /nodejs/dynamodb/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:113:29
at Uploader. (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:63:16)
at Uploader.emit (events.js:107:17)
at Response. (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/lib/uploader.js:71:17)
at Request. (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:353:18)
at Request.callListeners (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:595:14)
at Request.transition (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:21:10)
at AcceptorStateMachine.runTo (/nodejs/dynamodb/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/state_machine.js:14:12)

script:
var DynamoBackup = require('dynamo-backup-to-s3');

var backup = new DynamoBackup({
includedTables: ['db_test'],
bucket: "dynamodb-backup",
readPercentage: .9,
awsAccessKey: "xxxx",
awsSecretKey: "xxxx",
awsSecretKey: "us-west-1"
});

backup.on('error', function(data) {
console.log('Error backing up ' + data.tableName);
console.log(data);
});

backup.on('start-backup', function(tableName) {
console.log('Starting to copy table ' + tableName);
});

backup.on('end-backup', function(tableName) {
console.log('Done copying table ' + tableName);
});

backup.backupAllTables(function() {
console.log('Finished backing up DynamoDB');
});

Module versions:
├─┬ [email protected]
│ ├── [email protected]
│ ├─┬ [email protected]
│ │ ├─┬ [email protected]
│ │ │ └── [email protected]
│ │ └── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├─┬ [email protected]
│ │ ├─┬ [email protected]
│ │ │ ├── [email protected]
│ │ │ ├── [email protected]
│ │ │ └── [email protected]
│ │ └─┬ [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ └── [email protected]
│ └── [email protected]

TypeError: Cannot read property push of undefined on dynamo-restore-from-s3

Hi Mark,

thanks for taking the time and writing that pice of software.

Backup works find and i'm now able to backup to another colo.

Regarding the restore i run into some trouble. If i try the following:

bin/dynamo-restore-from-s3 --source 's3://XXXXXXXXXXXXXXXXXXXX/20180919_1611/XXXXXXXXXXXXXXXXXXXX-XXXXXXXXXXXXX.json' --table 'XXXXXXXXXXXXXXXXXXXX-XXXXXXXXXXXXX' --overwrite --aws-region eu-central-1 --aws-key 'XXXXXXXXXXXXXXXXXXXX' --aws-secret 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

i get this error ...

WARN: table [XXXXXXXXXXXXXXXXXXXX-XXXXXXXXXXXXX] will be overwritten.
Starting download. 0Kb remaining...
node_modules/dynamo-backup-to-s3/lib/dynamo-restore.js:148
                this.batches.push({
                             ^

TypeError: Cannot read property 'push' of undefined
    at DynamoRestore.<anonymous> (/home/hans/Development/mtribes/backup-service/node_modules/dynamo-backup-to-s3/lib/dynamo-restore.js:148:30)
    at Interface.emit (events.js:187:15)
    at Interface.EventEmitter.emit (domain.js:442:20)
    at Interface.close (readline.js:379:8)
    at PassThrough.onend (readline.js:157:10)
    at PassThrough.emit (events.js:182:13)
    at PassThrough.EventEmitter.emit (domain.js:442:20)
    at endReadableNT (_stream_readable.js:1092:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)

Any idea whats going wrong?

Thanks and Regards,
Sebastian

WriteCapacity limits on restore

Hi guys,

would like to ask what's the expected behavior on restores. Shouldn't it change write capacity limits when using -c flag ?

Thanks

Error restoring binary attribute

I have a table with a binary attribute with base64 encoded strings. When I backup with base64binary : true and then restore, I get a different value in the restored table.

If I backup with base64binary : false, the restore fails with an error:

{ InvalidParameterType: Expected params.RequestItems['dev-rental-events'][24].PutRequest.Item['body'].B to be a string, Buffer, Stream, Blob, or typed array object

The json in the backup (without base64binary) looks like:
{"createdAt":{"S":"2017-05-02T09:50:52.322Z"},"stream":{"S":"rental"},"id":{"S":"2017-05-02T09:50:52.317Z0"},"name":{"S":"register"},"body":{"B":{"type":"Buffer","data":[101,121,74,112,90,67,73,54,73,106,73,53,73,105,119,105,90,109,108,121,99,51,82,102,98,109,70]}}}

While the json with base64binary looks like:

{"createdAt":{"S":"2017-05-02T09:51:19.258Z"},"stream":{"S":"rental"},"id":{"S":"2017-05-02T09:51:19.257Z2"},"name":{"S":"lock-applicant"},"body":{"B":"ZXlKaGNIQnNhV05oYm5RaU9pSXlPU0lzSW1KeWIydGxjaUk2SWtGa1lXMGlMQ0owYVcxbGMzUmhiWEFpT2lJeU1ERTNMVEExTFRBeVZEQTVPalV4T2pFNUxqSTFOMW9pZlE9PQ=="}}
Note that the string value of the body above is the exact same string I see in the dynamodb ui.

large data restore miss

good tool.
when I backup a 100000 records table, then restore it. the new table has only 99563.
I tried twice, always like that.

Any idea?

Don't get the full list of tables.

I'm trying to back up several tables but only a subset are getting included.

I think this is a simple fix: pull request to follow.

Write concurrency is much higher than default during restoration

I use dynamo-restore-from-s3 without setting -c. So by default the write currency should be 200.
However the actual write currency is as high as 1000!

It drops down gradually only because requests got throttled.

The version of dynamo-backup-to-s3 is 0.6.1

Overwrite flag not being honored

Wherever I put the -o or --overwrite flag in the command line for dyanmo-restore-from-s3 I keep seeing Fatal Error. Failed to create new table. ResourceInUseException: Table already exists: <my-table>

It looks like the flag is honored if I remove the --aws-region parameter because it defaults to the ap-southeast-1 region and ends up creating a new DynamoDB table.

I also noticed that it was doing this because it couldn't find the table in my region. I saw that it only found tables that started with an UpperCase Letter as the first character. Don't know if it's a bug in getting the tables or that I have 200+ tables in my region?

Missing credentials in config for lots of parallel requests

I happen to face an issue when using AWS temporary security credentials solution in parallel for many tables which cause Missing credentials in config error randomly.
So I need to set some more general configuration options such as httpOptions and maxRetries.
( some references: 1, 2, 3)

But since param object, which holds aws config options, is not customisable outside of this library, it is not possible.

--
@dylanlingelbach, do you think we can fork, merge these PRs and work around existing issues if you don't get write access in any near future?

Find new owner

I have been unable to get write access to this repo and my time to work on this project is very small. I'd like to see if we can find a new owner to help maintain this.

I think the process should look like:

Fork this project to new org/owner
Comment on existing PRs to get them to move their PR to forked project
I move NPM package to new org/owner

Assigning a few people who have contributed in the past to see if anyone is able/willing to own.

Setting LIMIT on Dynamo scan doesn't give expected read throughput.

When backing up a table, setting the read-percentage doesn't result in expected throughput:

The calc to ascertain what the throughput should be is correct but using that value as the limit on a scan does not result in that throughput.
Setting the limit on scan merely prevents a scan operation from reading more than that many items in a single scan operation.
Items are not equivalent to units of throughput (which is more a measure of data size than count).
Scans are not limited to 1 per second.

So, with a table capacity of 10 and a read-percentage of 0.8, the expected throughput would be 8 units/second.
BUT setting the scan limit to 8 only tells it to consume a max of 8 items (or the max of 1MB). Those items could be of any size. From the dynamo docs:
"One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size".
So, if each item in our example was a little bigger than 4K, consuming 8 items will actually use 16 units of capacity.

So, while memory is being protected by limiting the scan, it doesn't result in the desired read-percentage throughput.

I think what this needs is a delay of some sort. There's a nice approach described here:
https://aws.amazon.com/blogs/developer/rate-limited-scans-in-amazon-dynamodb/
using Google Guava's RateLimiter class...
Essentially, it uses the metadata passed back by dyanmo to calculate consumed throughput and the rate limiter to achieve the throughput required.

I've attached a graph showing consumed throughput on a table during 2 backup operations:

Back up a table with read-percentage set to 0.8 and a table capacity of 10
Back up a table with read-percentage set to 0.8 and a table capacity of 5
As you can see, this isn't quite what was expected :)

Any chance this functionality could be added?

undefined this.batches in restore

when restoring a table the program crashes with this message

WARN: table [<table name>] will be overwritten.
Starting download. 6Kb remaining...
node_modules/aws-sdk/lib/request.js:31
            throw err;
            ^

TypeError: Cannot read property 'length' of undefined
    at DynamoRestore._checkTableReady (node_modules/dynamo-backup-to-s3/lib/dynamo-restore.js:284:28)
    at Request.<anonymous> (node_modules/aws-sdk/lib/request.js:364:18)
    at Request.callListeners (node_modules/aws-sdk/lib/sequential_executor.js:105:20)
    at Request.emit (node_modules/aws-sdk/lib/sequential_executor.js:77:10)
    at Request.emit (node_modules/aws-sdk/lib/request.js:683:14)
    at Request.transition (node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (node_modules/aws-sdk/lib/state_machine.js:14:12)
    at node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (node_modules/aws-sdk/lib/request.js:685:12)

looking at the source code it appears to be a synchronization issue between the s3 file download and the restore batch start, which is happening before while the batches array is still undefined

as a workaround, I raised the timeout on dynamo-restore.js line 109 to 5000
setTimeout(dynamodb.describeTable.bind(dynamodb, { TableName: this.options.table }, this._checkTableReady.bind(this)), 1000);

cannot read property of length undefined

TypeError: Cannot read property 'length' of undefined at DynamoRestore._checkTableReady (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/dynamo-backup-to-s3/lib/dynamo-restore.js:415:28) at Request.<anonymous> (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:364:18) at Request.callListeners (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/sequential_executor.js:105:20) at Request.emit (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/sequential_executor.js:77:10) at Request.emit (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:683:14) at Request.transition (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:22:10) at AcceptorStateMachine.runTo (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/state_machine.js:14:12) at /home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/state_machine.js:26:10 at Request.<anonymous> (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:38:9) at Request.<anonymous> (/home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:685:12) /home/viv1sgp/Documents/dynamo-backup-restore/dynamo-restore/node_modules/aws-sdk/lib/request.js:31 throw err;

When I run multiple nodejs processes to restore multiple tables at the same time, I get this error. I am not sure what is causing the problem. Any help or insight is appreciated.

Working with on-demand provisioning tables

Hi there!

I was reading the docs and considering using your lib, but I haven't found any documentation about not using read-percentage, readcapacity units and write capacity units.

DynamoDB has these on-demand tables that does not contain these RCU and WCU configured.

Is it possible to work with this kind of table using your lib?

Thanks.

[Cannot Initiate Transfer]

Hey, I'm getting this error, I'm wondering if it's policy issue, I have full access for S3 and DynamoDB, do I need other permissions?

here's the error below ( i added some console.logs to see where the error was) :

START RequestId: 3e9af9a8-9ada-11e5-84fb-d732598ecf7e Version: $LATEST
2015-12-04T22:56:25.049Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    backupTable failed celebInfo error is Error: Cannot initiate transfer
2015-12-04T22:56:25.049Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    Error backing up undefined
2015-12-04T22:56:25.066Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    backupAllTables failed error is Error: Cannot initiate transfer
2015-12-04T22:56:25.066Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    Error backing up celebInfo
2015-12-04T22:56:25.066Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    [Error: Cannot initiate transfer]
2015-12-04T22:56:25.068Z    3e9af9a8-9ada-11e5-84fb-d732598ecf7e    Error: Callback was already called.
    at /var/task/node_modules/dynamo-backup-to-s3/node_modules/async/lib/async.js:30:31
    at /var/task/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:126:36
    at Uploader.<anonymous> (/var/task/node_modules/dynamo-backup-to-s3/lib/dynamo-backup.js:65:16)
    at Uploader.emit (events.js:95:17)
    at Response.<anonymous> (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/lib/uploader.js:71:17)
    at Request.<anonymous> (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:353:18)
    at Request.callListeners (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
    at Request.emit (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
    at Request.emit (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:595:14)
    at Request.transition (/var/task/node_modules/dynamo-backup-to-s3/node_modules/s3-streaming-upload/node_modules/aws-sdk/lib/request.js:21:10)
END RequestId: 3e9af9a8-9ada-11e5-84fb-d732598ecf7e
REPORT RequestId: 3e9af9a8-9ada-11e5-84fb-d732598ecf7e  Duration: 208.05 ms Billed Duration: 300 ms     Memory Size: 256 MB Max Memory Used: 19 MB  
Process exited before completing request

Restore did not work because of outdated/incorrect documentation

I've just spent quite a lot of time figuring out why my restore is not working until i found out that in the restore script, the access key and secret fields should be called awsKey and awsSecret and not awsAccessKey and awsSecretKey as the docs suggest. Might be helpful if somebody would check what else is incorrect. (I will do it in a couple of days if nobody else does it before)

maximum backup copies in s3

For our use case,
We take a daily backup of the db,
But we would like to limit it to maximum of 5 days. Its some non critical data, you can say 😉
Do you guys thing it would be helpful if dynamo-backup-to-s3 provides the functionality.
Would like to hear your thoughts 😄

Not all tables get written.

When using the command line tool, not all of the tables get written. I think the problem is that the uploader isn't correctly waiting for the end event from the uploader. I'll submit another pull request.

Folders in S3 Bucket not created on Windows due to wrong Slash

This library used path.join to create the S3 bucket object name. This results in a directory\data.json object name and thus S3 does not create a directory.

Solution would be to either use a regex after the path.join in dynamo-backup.js to replace \ by / or just concatenate them with a /.

NPM package

At present users have to npm install using:

npm install git+https://[email protected]/markitx/dynamo-backup-to-s3.git --save

Would be better to have an npm package published so they can use simpler notation:

npm install dynamo-backup-to-s3 --save

Script runs forever...

Hi,

I use the tool to backup some dynamodb. I notice that some table would cause the script to run forever. Is there any idea why this happens?

Thanks a lot

--partitionkey necessary, even when using --overwrite on restore

On restoring a backup.
When you already have a table and skip auto-creation using the --overwrite option, it still insists you tell it the partitionkey. As this uses bacthwrite to put the records in, I don't think this is necessary?

Resorting a Global Table does not replicate

I am tying to backup one global table to s3 and than restore it to another global table.

restored table does not replicate items to second region

what is recommended restore procedure?

Thanks for this, it works very nicely for backup.

Is there any recommended restore method for the backups produced?

Remove throughput limit for on-demand tables

DynamoDB has a new on demand pricing model, where you pay per request and do not provision any capacity. Since the cost is only per-request and not based on requests per second, it does not make sense to throttle reads from those tables.

An on-demand table has provisioned capacity values of 0:

"ProvisionedThroughput": {
  "NumberOfDecreasesToday": 0,
  "WriteCapacityUnits": 0,
  "ReadCapacityUnits": 0,
  "LastDecreaseDateTime": 1546977951.249
}

Read credentials and config from the shared .ini file

As mentioned here... aws/aws-sdk-js#1391
Latest AWS SDK supports reading config and credentials from the shared .ini file.
I will be doing a pr for this one 🙂

Unable to backup the data from DynamoDB to S3 bucket when the file is triggered from the scheduled event in aws lambda

Steps:

Created the aws lambda handler function.
Scheduled the event to backup the data

Actual Observations :

When running on the single table the functions backup.on('error', function(data){});, backup.on('start-backup', function(tableName, startTime) {});, backup.on('end-backup', function(tableName, backupDuration) {}); are never executed. But when i run the same without aws lambda and handler locally im able to back up the data.
Function backup.backupAllTables(function() {}); is executed but no data is backed up.

Could you please provide inputs on the same.

Thanks,
Zaid

Documentation out of date

The callback for the "error" event is out of date.

The documentation says that "data" will have "tableName" and "error" properties. In fact the properties are called "table" and "err".

Also, "start-backup" has a startTime (not documented) and "end-backup" has a duration, not documented.

If you want me to create a PR for the above then I can..
Thanks!

John