azure / blobxfer

Azure Storage transfer tool and data movement library

License: MIT License

Python 99.41% Shell 0.30% Dockerfile 0.27% PowerShell 0.03%
azure azure-storage azure-blob azure-file python-library python file-transfer azure-blob-storage docker-image data-movement

blobxfer's Introduction


PROJECT STATUS

This project is no longer actively maintained. For tools officially supported by Microsoft please refer to this documentation.

blobxfer

blobxfer is an advanced data movement tool and library for Azure Storage Blob and Files. With blobxfer you can copy your files into or out of Azure Storage with the CLI or integrate the blobxfer data movement library into your own Python scripts.
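
For orientation, here is a minimal sketch of the two basic transfer directions using the 1.x CLI syntax that appears in the issues further below; the account, key, container, and path names are placeholders, not values from this project.

    # upload a local directory into a blob container
    blobxfer upload --storage-account mystorageaccount \
        --storage-account-key "ACCOUNT_KEY" \
        --remote-path mycontainer/backups --local-path /data/to/upload

    # download the same remote path back to local disk
    blobxfer download --storage-account mystorageaccount \
        --storage-account-key "ACCOUNT_KEY" \
        --remote-path mycontainer/backups --local-path /data/restore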

Major Features

  • Command-line interface (CLI) providing data movement capability to and from Azure Blob and File Storage
  • Standalone library for integration with scripts or other Python packages
  • High-performance design with asynchronous transfers and disk I/O
  • Supports ingress, egress and synchronization of entire directories, containers and file shares
  • YAML configuration driven execution support
  • Fine-grained resume support including resuming a broken operation within a file or object
  • Vectored IO support
    • stripe mode allows striping a single file across multiple blobs (even to multiple storage accounts) to break through single blob or fileshare throughput limits
    • replica mode allows replication of a file across multiple destinations including to multiple storage accounts
  • Synchronous copy with cross-mode (object transform) replication support
    • Leverages server-side copies by default
    • Arbitrary URL copy support
  • Client-side encryption support
  • Supports all Azure Blob types and Azure Files for both upload and download
  • Advanced skip options for rsync-like operations
  • Store/restore POSIX filemode and uid/gid
  • Supports reading/piping from stdin, including to page blob destinations
  • Supports reading from blob and file share snapshots for downloading and synchronous copy
  • Support for setting access tier on objects for uploading and synchronous copy
  • Configurable one-shot block upload support
  • Configurable chunk size for both upload and download
  • Automatic block size selection for block blob uploading
  • Automatic uploading of VHD/VHDX files as page blobs
  • Include and exclude filtering support
  • Rsync-like delete support
  • No clobber support in either direction
  • Automatic content type tagging
  • Support for setting the Cache Control property of blobs and files
  • File logging support
  • Support for HTTP proxies

Installation

There are three ways to install blobxfer: from PyPI via pip, as a pre-built Docker image, or as a standalone pre-built binary release (sketched below).

Please refer to the installation guide for more information on how to install blobxfer.
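
As a rough sketch, the three routes referenced throughout the issues below are the PyPI package, the Docker image, and a pre-built binary; exact image tags and release asset names vary by version.

    # 1. from PyPI
    pip3 install blobxfer

    # 2. as a Docker image
    docker pull alfpark/blobxfer

    # 3. as a standalone pre-built binary from the GitHub releases page,
    #    e.g. blobxfer-1.0.0rc2-linux-x86_64 as used in an issue below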

Documentation

Please refer to the blobxfer documentation for more details and usage information.

Change Log

Please see the Change Log for project history.

Support

This project is community supported and not officially supported by Microsoft. There is no defined SLA for addressing features, issues, and bugs, which are serviced exclusively via GitHub issues. For tools officially supported by Microsoft, please refer to this documentation.


Please see this project's Code of Conduct and Contributing guidelines.

blobxfer's People

Contributors

alexandair, alfpark, amishra-dev, microsoft-github-policy-service[bot], rems75, zhodowanec


blobxfer's Issues

Migrate to azure-storage split library

azure-storage is now broken into multiple packages.

  • Update requirements: azure-storage-blob, azure-storage-file
  • Breakage fixups
    • models
    • operations
    • tests
  • Install guide with azure-storage issues?

Add possibility to copy files between containers?

Hi there! First of all thanks much for the great work! :) 👍

Is there actually a possibility to copy files between two containers? For example, with a "file share" as the source and a "blob" container as the destination?

This would be really perfect!

If this currently isn't possible, is it planned?

Thanks much and best regards,
Christian
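
For readers finding this later: the 1.x synccopy command covers server-side copies between containers, including cross-mode (blob to file) replication per the feature list above. A hedged sketch modeled on the synccopy invocation shown in a later issue on this page; all names are placeholders:

    blobxfer synccopy --storage-account sourceaccount --sas "SOURCE_SAS" \
        --remote-path sourcecontainer/path \
        --sync-copy-dest-storage-account destaccount \
        --sync-copy-dest-storage-account-key "DEST_KEY" \
        --sync-copy-dest-remote-path destcontainer/path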

Recursive upload with RSA with UTF8 files completes but download fails

Upload completes flawlessly
blobxfer stopablob container1 /etc --upload --rsapublickey ../public.pem --storageaccountkey blabla

Download fails:
blobxfer stopablob container1 etc1 --remoteresource . --download --rsaprivatekey ../private.pem --storageaccountkey blabla --rsakeypassphrase blabla

remote blob: apparmor.d/tunables/home.d/ubuntu length: 352 bytes, md5: KoiBH3t2PaqWwgsgJpKUpA==
Traceback (most recent call last):
  File "/usr/local/bin/blobxfer", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 2566, in main
    localfile, blob, False, blobdict[blob])
  File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1827, in generate_xferspec_download
    remoteresource, contentlength, contentmd5))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 11: ordinal not in range(128)
=====================================
 azure blobxfer parameters [v0.12.1]
=====================================
             platform: Linux-4.4.0-71-generic-x86_64-with-Ubuntu-16.04-xenial
   python interpreter: CPython 2.7.12
     package versions: az.common=1.1.4 az.sml=0.20.5 az.stor=0.33.0 crypt=1.8.1 req=2.12.3
      subscription id: None
      management cert: None
   transfer direction: Azure->local
       local resource: etc1
      include pattern: None
      remote resource: .
   max num of workers: 3
              timeout: None
      storage account: stopablob
              use SAS: False
  upload as page blob: False
  auto vhd->page blob: False
 upload to file share: False
 container/share name: container1
  container/share URI: https://stopablob.blob.core.windows.net/container1
    compute block MD5: False
     compute file MD5: True
    skip on MD5 match: True
   chunk size (bytes): 4194304
     create container: True
  keep mismatched MD5: False
     recursive if dir: True
component strip on up: 1
        remote delete: False
           collate to: disabled
      local overwrite: True
      encryption mode: file dependent
         RSA key file: ../private.pem
         RSA key type: private
=======================================

Can blobxfer work thru a proxy

I would like to find out if blobxfer can work via a corporate proxy.

Based on our company security setup, we must route all traffic to Azure via proxy servers for all machines that are not in Azure.

Thanks,
crick
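
The feature list above mentions HTTP proxy support, and blobxfer's transfers go through the requests library (visible in the tracebacks elsewhere on this page), which honors the standard proxy environment variables. A sketch, with the proxy host as a placeholder:

    export HTTP_PROXY="http://proxy.corp.example:8080"
    export HTTPS_PROXY="http://proxy.corp.example:8080"
    # then invoke blobxfer as usual from the same shell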

blobxfer getting stuck when copying a container with millions of files

I am trying to download the contents of a virtual directory on Azure storage to my local storage. The virtual directory has ~10M small blobs (< 15 KB each).

I am using the following commands to download the blobs

Command 1:
blobxfer --storageaccountkey "STORAGE_KEY" MY_STORAGE_ACCOUNT MY_CONTAINER . --download --remoteresource .

Command 2:
blobxfer --storageaccountkey "STORAGE_KEY" MY_STORAGE_ACCOUNT MY_CONTAINER . --download --remoteresource . --include 'test/*'

After I type in the command I get stuck at

"attempting to copy entire container pictures to ."

I've been stuck here for 30 minutes. The same operation starts the download instantaneously when using AzCopy on Ubuntu. Maybe the command is trying to compute the total number of files. If that's the case, is there a way to skip that?

413 Client Error: The request body is too large and exceeds the maximum permissible limit

Following up from #5
I am now running the docker image to upload a 191GB file, and hitting this after about 30 minutes (XXX = my redactions):

-rw-r--r-- 1 root root 191G Oct  8 01:39 NA12892_recompressed.bam
Image Repo  alfpark/blobxfer
Image Tag   latest
Image Size  89M
Image ID    sha256:4e0dbe3348a186e8e716388bc2f8d4de80e7893d578a5750e7022d67b37477f1
Last Updated    2016-10-03T15:54:00.876894577Z (4d 9h ago)
Registry    registry-1.docker.io
=====================================
 azure blobxfer parameters [v0.11.5]
=====================================
             platform: Linux-3.13.0-96-generic-x86_64-with
   python interpreter: CPython 3.5.2
     package versions: az.common=1.1.4 az.sml=0.20.5 az.stor=0.33.0 crypt=1.5.2 req=2.11.1
      subscription id: None
      management cert: None
   transfer direction: local->Azure
       local resource: /data/NA12892_recompressed.bam
      include pattern: None
      remote resource: None
   max num of workers: 12
              timeout: None
      storage account: XXX
              use SAS: True
  upload as page blob: False
  auto vhd->page blob: False
 upload to file share: False
 container/share name: XXX
  container/share URI: XXX
    compute block MD5: False
     compute file MD5: False
    skip on MD5 match: True
   chunk size (bytes): 4194304
     create container: False
  keep mismatched MD5: False
     recursive if dir: True
component strip on up: 1
        remote delete: False
           collate to: disabled
      local overwrite: True
      encryption mode: disabled
         RSA key file: disabled
         RSA key type: disabled
=======================================
script start time: 2016-10-08 01:39:37
detected 0 empty files to upload
performing 48889 put blocks/blobs and 1 put block lists
spawning 12 worker threads
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 941, in run
    offset, bytestoxfer, encparam, flock, filedesc)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 1095, in put_storage_data
    timeout=self.args.timeout)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 1418, in azure_request
    return req(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 776, in put_block
    response.raise_for_status()
  File "/usr/lib/python3.5/site-packages/requests/models.py", line 862, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 413 Client Error: The request body is too large and exceeds the maximum permissible limit. for url: XXX
Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 941, in run
    offset, bytestoxfer, encparam, flock, filedesc)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 1095, in put_storage_data
    timeout=self.args.timeout)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 1418, in azure_request
    return req(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blobxfer.py", line 776, in put_block
    response.raise_for_status()
  File "/usr/lib/python3.5/site-packages/requests/models.py", line 862, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 413 Client Error: The request body is too large and exceeds the maximum permissible limit. for url: XXX

SLES 12 SP1 - error : no module named azure.common

Hi,

I have no Python experience, sorry. I just wanted to use blobxfer as a tool to back up some SAP data to a storage account. But whatever I have tried over the last few hours, whenever I call blobxfer it comes back with the error that it cannot find a module named azure.common.

I am on a SLES 12 SP1 VM on Azure. I manually installed, e.g., azure-servicemanagement-legacy==0.20.5 after I also got an error about it; I used pip to install exactly that version and then the error was gone.

But after resolving all those issues I always get stuck with this azure.common error.

Could someone give me a hint on how to solve this?

Thanks

Regards

Hermann

python no module error

could you please help

`(azure)[ec2-user@stagingadmin blobxfer] pip freeze | grep azure
azure-common==1.1.4
azure-nspkg==1.0.0
azure-servicemanagement-legacy==0.20.5
azure-storage==0.33.0

(azure)[ec2-user@stagingadmin blobxfer] ./blobxfer.py -h
Traceback (most recent call last):
File "./blobxfer.py", line 72, in
import azure.common
ImportError: No module named common

(azure)[ec2-user@admin blobxfer] blobxfer
Traceback (most recent call last):
File "/media/ebs/karthikk_src/azure/bin/blobxfer", line 9, in
load_entry_point('blobxfer==0.12.0', 'console_scripts', 'blobxfer')()
File "build/bdist.linux-x86_64/egg/pkg_resources/init.py", line 542, in load_entry_point
File "build/bdist.linux-x86_64/egg/pkg_resources/init.py", line 2569, in load_entry_point
File "build/bdist.linux-x86_64/egg/pkg_resources/init.py", line 2229, in load
File "build/bdist.linux-x86_64/egg/pkg_resources/init.py", line 2235, in resolve
File "build/bdist.linux-x86_64/egg/blobxfer.py", line 72, in
ImportError: No module named common
(azure)[ec2-user@admin blobxfer] which blobxfer
/media/ebs/karthikk_src/azure/bin/blobxfer`
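
One commonly suggested remediation for this class of ImportError, offered here as an assumption rather than a confirmed fix for this specific report, is that the azure namespace packages are in an inconsistent state and need to be removed and reinstalled at the versions shown above:

    pip uninstall -y azure azure-common azure-nspkg azure-storage azure-servicemanagement-legacy
    pip install azure-common==1.1.4 azure-storage==0.33.0 azure-servicemanagement-legacy==0.20.5
    pip install blobxfer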

Better progress indication for computing skip dictionary

I'm running into an issue where blobxfer takes a very long time to start up (i.e. it sits for upwards of 15 minutes after showing the script start time message), and it seems like it's trying to list all of the files in the remote container according to the traceback:

Traceback (most recent call last):
  File "/home/derrick/.local/bin/blobxfer", line 11, in <module>
    sys.exit(main())
  File "/home/derrick/.local/lib/python2.7/site-packages/blobxfer.py", line 2361, in main
    blobskipdict = get_blob_listing(blob_service[0], args)
  File "/home/derrick/.local/lib/python2.7/site-packages/blobxfer.py", line 1637, in get_blob_listing
    for blob in result:
  File "/home/derrick/.local/lib/python2.7/site-packages/azure/storage/models.py", line 112, in __iter__
    resources = self._list_method(*self._list_args, **self._list_kwargs)
  File "/home/derrick/.local/lib/python2.7/site-packages/azure/storage/blob/baseblobservice.py", line 1275, in _list_blobs
    return self._perform_request(request, _convert_xml_to_blob_list, operation_context=_context)
  File "/home/derrick/.local/lib/python2.7/site-packages/azure/storage/storageclient.py", line 234, in _perform_request
    return parser(response)
  File "/home/derrick/.local/lib/python2.7/site-packages/azure/storage/blob/_deserialization.py", line 299, in _convert_xml_to_blob_list
    setattr(blob.properties, info[1], info[2](property_element.text))
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 1168, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 556, in parse
    res, skipped_tokens = self._parse(timestr, **kwargs)
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 675, in _parse
    l = _timelex.split(timestr)         # Splits the timestr into tokens
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 192, in split
    return list(cls(s))
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 188, in next
    return self.__next__()  # Python 2.x support
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 181, in __next__
    token = self.get_token()
  File "/home/derrick/.local/lib/python2.7/site-packages/dateutil/parser.py", line 82, in get_token
    if self.tokenstack:
KeyboardInterrupt

This very long startup time is mitigated if I provide the --no-skiponmatch option, so it seems like it's trying to get all of the MD5s to skip on. I have many files (100k+) in the container, so this process does take quite a while.

Perhaps there should be a progress indicator for this, or some kind of heuristic that only fetches the skiplist MD5s for files that are found in the include / source directory? Some documentation about this edge case might also be useful.
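
As the reporter notes, the long listing phase can be skipped by disabling the skip-on-MD5-match behavior; a sketch reusing the 0.x positional syntax from the earlier download issue on this page:

    blobxfer --storageaccountkey "STORAGE_KEY" MY_STORAGE_ACCOUNT MY_CONTAINER . \
        --download --remoteresource . --no-skiponmatch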

Test improvements

  • tox driven
  • Single directory
  • Per file tests
  • Test 2.7, 3.4 through 3.6

Pre-beta items:

  • Improve common/download test coverage that has diverged since upload/synccopy logic was added
  • Add upload tests
  • Add synccopy tests

Unable to upload large files

I tried to use blobxfer to upload a large file (around 264 GB) to Azure block blob storage. The tool could not upload the file. The error message was:

RuntimeError: 63084 chunks for file my_big_file exceeds Azure Storage limits for a single block blob

I believe that this is due to the limitation on the MAX_BLOB_CHUNK_SIZE_BYTES set to 4194304 (63084 * 4194304 B = 264 GB). According to azure documentation, the maximum number of blocks in a block blob is 50000 blocks of 100 MB. 63084 blocks is of course over the limit for the number of blocks and that is why it fails. But I don't understand this artificial limitation to the maximum block size of 4194304 bytes (which is 4 MB) imposed by blobxfer, which is 25 times less than the maximum in the azure documentation.

Anyway, the quick fix for me was to increase the value of MAX_BLOB_CHUNK_SIZE_BYTES to a higher value and then I could upload my file without any issue.
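
The arithmetic behind the failure: 264 GB divided by the 4 MiB (4,194,304 byte) chunk size is roughly 63,000 blocks, well over the 50,000-block limit, so the chunk size needs to be at least about 264 GB / 50,000 ≈ 5.3 MB. A hedged sketch of raising it from the CLI instead of patching the constant; the 0.x flag name here is an assumption based on the "chunk size (bytes)" parameter shown in other issues on this page:

    # 8 MiB chunks keep a 264 GB file at roughly 31,500 blocks, under the 50,000 limit
    blobxfer myaccount mycontainer /path/to/my_big_file --upload \
        --storageaccountkey "KEY" --chunksizebytes 8388608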

`--sync-copy-dest-mode` does not seem to have any effect

I am trying to transfer files from one blob storage to another using the following command:

blobxfer synccopy --storage-account sourcestorage --sync-copy-dest-storage-account destinationstorage --sync-copy-dest-storage-account-key "destinationstoragekey" --remote-path sourcestoragepath --sync-copy-dest-remote-path destinationfileshare/destinationstoragepath --sas "sastoken" --mode auto --sync-copy-dest-mode file

However, the data gets copied from source blob storage to a destination blob storage container instead of a destination file share.

This is the info log when initiating the transfer:

============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.1.0
                 platform: Darwin-16.7.0-x86_64-i386-64bit
               components: CPython=3.5.4 azstor.blob=0.37.1 azstor.file=0.36.0 crypt=2.1.4 req=2.18.4
       transfer direction: Azure -> Azure
                  workers: disk=0 xfer=16 md5=0 crypto=0
                 log file: None
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=False lmt_ge=False md5=False
        delete extraneous: False
                overwrite: True
                recursive: True
            rename single: False
              access tier: None
============================================

and only specifies a single mode even though I used both source and destination mode arguments. Is this a known issue or am I incorrectly specifying the parameters?

Problem with installing on Azure: Ubuntu16.04

I am trying to install this on an Azure virtual machine, but I am getting an error regarding the associated library click.

Traceback (most recent call last):
  File "/usr/local/bin/blobxfer", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 676, in main
    _verify_python3_env()
  File "/usr/local/lib/python3.5/dist-packages/click/_unicodefun.py", line 118, in _verify_python3_env
    'for mitigation steps.' + extra)
RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment.  Consult http://click.pocoo.org/python3/for mitigation steps.

This system supports the C.UTF-8 locale which is recommended.
You might be able to resolve your issue by exporting the
following environment variables:

    export LC_ALL=C.UTF-8
    export LANG=C.UTF-8

Trying to export locale does not solve this issue.

Can't reproduce locally.

Also, I tried to use pip instead of pip3, but it seems like it didn't have any effect?

#!/bin/bash
set -e

export LANG=en_US.utf8
export LC_ALL=en_US.utf8

sudo apt-get update -y
sudo apt-get install -y build-essential libssl-dev libffi-dev libpython3-dev python3-dev python3-pip
sudo pip3 install --upgrade pip
sudo pip3 install --upgrade azure-batch azure-storage msrestazure # blobxfer
# coz the click library used by blobxfer is bad
sudo pip install --upgrade pip
sudo pip install --upgrade blobxfer

need retry of SysCallError ECONNRESET in azure_request

I've been trying to upload some large files (>100GB) as block blobs and hitting some intermittent errors that I think should be retried. e.g.

=======================================
script start time: 2016-10-07 04:21:52
detected 0 empty files to upload
performing 33567 put blocks/blobs and 1 put block lists
spawning 12 worker threads
...
    timeout=self.args.timeout)
  File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1418, in azure_request
    return req(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 775, in put_block
    data=block, timeout=self.timeout)
  File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1418, in azure_request
    return req(*args, **kwargs)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/api.py", line 122, in put
    return request('put', url, data=data, **kwargs)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 372, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/share/dnanexus/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 171, in recv
    data = self.connection.recv(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 1302, in recv
    self._raise_ssl_error(self._ssl, result)
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 1163, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
SysCallError: (104, 'ECONNRESET')

For whatever reason this ECONNRESET is surfaced as SysCallError, not requests.ConnectionError as the existing retry logic seems to anticipate.

cc @giventocode @rnpandya

--delete flag not taking in the rc1 binary distribution.

It doesn't appear that the --delete flag is taking effect with the 1.0.0rc1 binary distribution (renamed to blobxfer below). Output shows "False" and I confirmed the file is not being deleted on the destination after being deleted from the source.

> [root@vdev-azw-ecom-us-cifs001 ~]# ./blobxfer upload --delete --storage-account gapecombrowsestorage --storage-account-key IHeMfulLFR/M5COcyVvNRjw9+Z3CIFpgPVQ7EdODEEnTU2ZHDfnCckHEhQF7+Jblahblahblah --remote-path test --local-path /etc/yum
2017-10-11 21:45:56.486 INFO -
============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.0.0rc1
                 platform: Linux-3.10.0-514.28.1.el7.x86_64-x86_64-with-redhat-7.3-Maipo
               components: CPython=3.5.3 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.0.3 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=4 xfer=8 md5=0 crypto=0
                 log file: None
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=False lmt_ge=False md5=False
        delete extraneous: False
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /etc/yum
============================================
2017-10-11 21:45:56.486 INFO - blobxfer start time: 2017-10-11 21:45:56.486177+00:00
2017-10-11 21:45:56.486 DEBUG - spawning 4 disk threads
2017-10-11 21:45:56.638 DEBUG - spawning 8 transfer threads
2017-10-11 21:45:58.085 DEBUG - 7 local/remote files processed, waiting for upload completion of approx. 0.0018 MiB
2017-10-11 21:45:58.132 INFO - elapsed upload + verify time and throughput of 0.0000 GiB: 1.138 sec, 0.0126 Mbps (0.002 MiB/s)
2017-10-11 21:45:58.132 INFO - blobxfer end time: 2017-10-11 21:45:58.132789+00:00 (elapsed: 1.647 sec)
[root@vdev-azw-ecom-us-cifs001 ~]#

Notice: Breaking changes with CLI commands

Due to the migration to Click for the CLI in 1.0.0 of blobxfer, the existing argument and parameter structure will change from the 0.x.y releases.

Note that the docker image tagged as latest will not update to 1.0.0 until it reaches release candidate status. PyPI packages in the 1.x release line prior to the final release can only be installed with the --pre flag or by targeting a specific version. You can always pin your pip install, e.g., pip install blobxfer==x.y.z, which will tie your use case to version x.y.z of blobxfer.

Thanks for your understanding as we improve the overall blobxfer experience and feature set.

Feature Request: specify subdirectory for fileshare

When uploading to a file share, it would be great if I could specify a specific subdirectory.
Right now, I have to create the structure locally, which is problematic.

For example, I have 5 machines in a cluster called gamma and I want to backup each one to the same share.

say to

gamma machine0/...
gamma machine1/....
gamma machine2/...
gamma machine3/...
gamma machine4/...
blobxfer cockroachbackup gamma data/ --fileshare --upload

If I run this on each machine, it will overwrite the data in the share. I could use the --collate command, but that assumes my data is flat.

So locally, on the machine, I have to first copy the files to a machineX directory, then call blobxfer.

I'd like to maintain the directory structure and just specify a destination directory on the share. So adding a --dest-dir option or something similar would be perfect.

blobxfer cockroachbackup gamma data/ --fileshare --upload --dest-dir machine0
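
For reference, the 1.x CLI reworked remote addressing so the remote path can include a subdirectory of the share, which effectively covers this request; a hedged sketch (the share and directory names are the ones from this request, everything else is a placeholder):

    blobxfer upload --mode file --storage-account cockroachbackup \
        --storage-account-key "KEY" \
        --remote-path gamma/machine0 --local-path data/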

No module named azure.common

After installation, running blobxfer always results in "ImportError: No module named common" after the script attempts to import azure.common.

This looks suspiciously similar to #11, but following the suggested solution there does not resolve this.

This is on a fresh vanilla SLES 12 SP2 VM, running Python 2.7.9 and after installing blobxfer and its dependencies via pip as suggested. Exact error and version information attached.
error and versions.txt

Any help is much appreciated!

"pip install blobxfer --upgrade" failing on CentOS 7.2

While executing "pip install blobxfer --upgrade" on CentOS 7.2, the command fails with the following error:
build/temp.linux-x86_64-2.7/_openssl.c:12:24: fatal error: pyconfig.h: No such file or directory
# include <pyconfig.h>

Prior to running this command I ran:
yum -y install python3-pip libssl-dev libffi-dev npm
pip install --upgrade pip

Can you please share the way to set up blobxfer on CentOS?
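
The pyconfig.h error means the Python development headers are missing; libssl-dev and libffi-dev are Debian/Ubuntu package names, while CentOS 7 uses different ones. A sketch of the usual prerequisites (package names assumed from standard CentOS 7 naming; python-pip typically comes from EPEL):

    yum -y install gcc openssl-devel libffi-devel python-devel python-pip
    pip install --upgrade pip
    pip install --upgrade blobxfer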

Add Read the Docs documentation

  • Add webhook
  • Activate builds
  • mkdocs.yml
  • symlink index and changelog
  • Reword docs readme
  • Update markdown for compliance
  • Hardlink readme to doc links

Certificate error with version of requests

By default, the version of blobxfer available via pip presented the following error, which an upgrade of requests fixed.
AttributeError: 'X509' object has no attribute '_x509'

Upgrading requests from 2.12.3 to 2.18.2 remedied this issue for me.

blobxfer - account sas token - permission rlac

Team,

I am just exploring blobxfer. I am able to upload and download files to/from azure blob storage using both AccessKey or SAS token.

I have a specific requirement wherein I have a SAS token which has Read-List-Create-Add permissions.
When I tried to upload using that SAS token, it fails in blobxfer, but the same works fine with Azure Storage Explorer.

(screenshots attached: Azure Storage Explorer, Azure portal)

Command Used:
blobxfer STORAGE CONTAINER FILE01 --upload --saskey "?sv=2016-05-31&ss=b&srt=sco&sp=rlac&se=2017-04-18T14:44:55Z&st=2017-04-18T06:31:55Z&spr=https&sig=signature"

HTTPError: 403 Client Error: This request is not authorized to perform this operation using this permission.

Please help me out!

-Robin

Crypto Refactor

  • Port crypto to multiprocess
  • Add support for AES256-GCM?
  • Add support for keys stored in KeyVault?

Directory mirroring

Hey,

cmd: "blobxfer myStorageAcc myContainer /hamachi --upload --fileshare"

problem: on Azure there are multiple folders created instead of one:

  • ham
  • hama
  • hamac
  • hamachi

Using Docker, I tried versions 0.11.0 - 0.11.4 on Ubuntu and phusion/baseimage, playing with all the command-line options available; same result.

any chance there's a bug somewhere around https://github.com/Azure/blobxfer/blob/master/blobxfer.py#L2330 ?

Question on performance

input: 2 GB folder with 400 directories and 10k files (MD5 calculation is disabled)

problem: time to start the file upload is about half an hour

Is this expected? Is there any way to speed up the start of the upload?

"Vectored IO" support

  • Striping across multiple blobs within a single storage account
  • Striping across multiple blobs across multiple storage accounts
  • Replica mode across multiple storage accounts?

The MAC signature found in the HTTP request '' is not the same as any computed signature

Hey,
I was trying to use Blobxfer to upload files from Azure Batch into Azure storage. I installed blobxfer with the following commands:
apt-get install -y build-essential libssl-dev libffi-dev python-dev python3-dev
pip3 install azure-storage blobxfer

I am trying to upload to storage with the command:
blobxfer logs file.txt --storageaccountkey

I keep getting the following error:
blocks/min Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/blobxfer.py", line 967, in run
offset, bytestoxfer, encparam, flock, filedesc)
File "/usr/local/lib/python3.4/dist-packages/blobxfer.py", line 1121, in put_storage_data
timeout=self.args.timeout)
File "/usr/local/lib/python3.4/dist-packages/blobxfer.py", line 1444, in azure_request
return req(*args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/azure/storage/blob/blockblobservice.py", line 175, in put_block
timeout=timeout
File "/usr/local/lib/python3.4/dist-packages/azure/storage/blob/blockblobservice.py", line 848, in _put_block
self._perform_request(request)
File "/usr/local/lib/python3.4/dist-packages/azure/storage/storageclient.py", line 266, in _perform_request
raise ex
File "/usr/local/lib/python3.4/dist-packages/azure/storage/storageclient.py", line 238, in _perform_request
raise ex
File "/usr/local/lib/python3.4/dist-packages/azure/storage/storageclient.py", line 225, in _perform_request
_http_error_handler(HTTPError(response.status, response.message, response.headers, response.body))
File "/usr/local/lib/python3.4/dist-packages/azure/storage/_error.py", line 98, in _http_error_handler
raise AzureHttpError(message, http_error.status)
azure.common.AzureHttpError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

AuthenticationFailedServer failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

RequestId:83bb9646-0001-0038-0c1c-418916000000
Time:2016-11-17T21:51:00.6168829ZThe MAC signature found in the HTTP request 'Mb41HpCZ438MVNPXMDDMqaUyEy20MSYb2XAGHnwoxNw=' is not the same as any computed signature. Server used following string to sign: 'PUT

It did work initially but on subsequent tries it keeps giving me this error. Any help with this would be appreciated.

Build improvements

  • Travis improvements
    • Add OS X to matrix (deferred due to Travis build times)
    • Auto pypi package generator and upload
    • Generate PyInstaller packaged "binary"
  • Create AppVeyor Windows build
    • Create Windows Docker image
    • Generate PyInstaller packaged Windows executable

blobxfer hangs with high CPU usage

Context

  • blobxfer upload
  • AzureGermanCloud

Environments/Versions tried

  • Official Docker image alfpark/blobxfer:1.0.0b2 on MacOS
  • Official Docker image alfpark/blobxfer:1.0.0rc1 on MacOS
  • Official Docker image alfpark/blobxfer:1.0.0rc1 on Linux
  • Release binary 1.0.0rc1 on Linux

Problem

When attempting to use upload mode, blobxfer starts exhausting the CPU and doesn't log any progress.

I suspect this may have to do with the fact that I'm working with AzureGermanCloud but I'm not certain.

CLI usage

With endpoint explicitly defined:

$ docker run -it -v $(pwd):/upload alfpark/blobxfer:1.0.0rc1 upload -v --storage-account {{REMOVED}} --sas "REMOVED" --endpoint "https://{{REMOVED}}.blob.core.cloudapi.de/"  --local-path /upload --remote-path {{REMOVED}} --skip-on-filesize-match --show-config

{
    "azure_storage": {
        "endpoint": "https://{{REMOVED}}.blob.core.cloudapi.de/",
        "accounts": {
            "{{REMOVED}}": "{{REMOVED}}"
        }
    },
    "upload": [
        {
            "source": [
                "/upload"
            ],
            "destination": [
                {
                    "{{REMOVED}}": "{{REMOVED}}"
                }
            ]
        }
    ],
    "options": {
        "log_file": null,
        "progress_bar": true,
        "resume_file": null,
        "timeout": {
            "connect": null,
            "read": null
        },
        "verbose": true,
        "concurrency": {
            "crypto_processes": 0,
            "disk_threads": 0,
            "md5_processes": 0,
            "transfer_threads": 0
        }
    }
}

============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.0.0rc1
                 platform: Linux-4.9.41-moby-x86_64-with
               components: CPython=3.6.1 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.0.3 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=4 xfer=8 md5=0 crypto=0
                 log file: None
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=True lmt_ge=False md5=False
        delete extraneous: False
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /upload
============================================
2017-10-06 10:16:05.293 INFO blobxfer.operations.upload:_run:1016 blobxfer start time: 2017-10-06 10:16:05.293394+00:00
2017-10-06 10:16:05.294 DEBUG blobxfer.operations.upload:_initialize_disk_threads:301 spawning 4 disk threads
2017-10-06 10:16:05.377 DEBUG blobxfer.operations.upload:_initialize_transfer_threads:313 spawning 8 transfer threads
--- STUCK HERE ---

Without endpoint explicitly defined:

$ docker run -it -v $(pwd):/upload alfpark/blobxfer:1.0.0rc1 upload -v --storage-account {{REMOVED}} --sas "REMOVED"  --local-path /upload --remote-path {{REMOVED}} --skip-on-filesize-match --show-config
2017-10-06 10:18:14.322 DEBUG blobxfer:_init_config:121 config:
{
    "azure_storage": {
        "endpoint": null,
        "accounts": {
            "{{REMOVED}}": "{{REMOVED}}"
        }
    },
    "upload": [
        {
            "source": [
                "/upload"
            ],
            "destination": [
                {
                    "{{REMOVED}}": "{{REMOVED}}"
                }
            ]
        }
    ],
    "options": {
        "log_file": null,
        "progress_bar": true,
        "resume_file": null,
        "timeout": {
            "connect": null,
            "read": null
        },
        "verbose": true,
        "concurrency": {
            "crypto_processes": 0,
            "disk_threads": 0,
            "md5_processes": 0,
            "transfer_threads": 0
        }
    }
}
2017-10-06 10:18:14.325 INFO blobxfer.operations.progress:output_parameters:211
============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.0.0rc1
                 platform: Linux-4.9.41-moby-x86_64-with
               components: CPython=3.6.1 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.0.3 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=4 xfer=8 md5=0 crypto=0
                 log file: None
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=True lmt_ge=False md5=False
        delete extraneous: False
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /upload
============================================
2017-10-06 10:18:14.325 INFO blobxfer.operations.upload:_run:1016 blobxfer start time: 2017-10-06 10:18:14.325774+00:00
2017-10-06 10:18:14.326 DEBUG blobxfer.operations.upload:_initialize_disk_threads:301 spawning 4 disk threads
2017-10-06 10:18:14.361 DEBUG blobxfer.operations.upload:_initialize_transfer_threads:313 spawning 8 transfer threads
--- STUCK HERE ---
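
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: in the first run the full blob URL was passed to --endpoint, whereas the tool's default endpoint is the bare suffix core.windows.net, so for the German cloud the suffix form may be what is expected:

    docker run -it -v $(pwd):/upload alfpark/blobxfer:1.0.0rc1 upload -v \
        --storage-account {{REMOVED}} --sas "REMOVED" \
        --endpoint core.cloudapi.de \
        --local-path /upload --remote-path {{REMOVED}} --skip-on-filesize-match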

Trying to download with a SAS

Hi,

I'm running into problems trying to do a basic download. I generated a SAS for a specific blob in Storage Explorer and am running blobxfer 0.12.0 in a Ubuntu 14.04 VM:

blobxfer --download --remoteresource [blob_id] --saskey "https://myaccount.blob.core.windows.net/mycontainer/[blob_id]?st=2016-11-23T14%3A50%3A00Z&se=2016-11-24T14%3A50%3A00Z&sp=rl&sv=2015-12-11&sr=b&sig=[signature]" myaccount mycontainer .

but this results in:

script start time: 2016-11-23 17:36:26
generating local directory structure and pre-allocating space
Traceback (most recent call last):
File "/usr/local/bin/blobxfer", line 11, in
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 2561, in main
localfile, blob, False, blobdict[blob])
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1793, in generate_xferspec_download
container_name=args.container, blob_name=remoteresource)
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1444, in azure_request
return req(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 592, in get_blob_properties
response.raise_for_status()
File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 862, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. for url: https://myaccount.blob.core.windows.net/mycontainer/[blob_id]?https://myaccount.blob.core.windows.net/mycontainer/[blob_id]?st=2016-11-23T14%3A50%3A00Z&se=2016-11-24T14%3A50%3A00Z&sp=rl&sv=2015-12-11&sr=b&sig=[signature]

The URL being reported in the error certainly doesn't look like what I'm passing in (not sure if it's an artifact of the terminal window). I am able to use the supplied SAS with curl with no problems.

More sample usages in the blobxfer README would be greatly appreciated!

Thanks,
Andy
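
The doubled URL in the 403 error suggests the full blob URL was passed as the --saskey value and then appended after the '?' again; the option appears to expect only the SAS query string. A hedged sketch of the same invocation with just the token:

    blobxfer myaccount mycontainer . --download --remoteresource [blob_id] \
        --saskey "st=2016-11-23T14%3A50%3A00Z&se=2016-11-24T14%3A50%3A00Z&sp=rl&sv=2015-12-11&sr=b&sig=[signature]"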

UnicodeEncodeError: 'utf-8' codec can't encode character

blobxfer is barfing after running on a fairly large directory tree. I'm not sure if it dislikes certain filenames on the source side, or if this is a bug. Any thoughts?

         blobxfer version: 1.0.0rc3
                 platform: Linux-3.10.0-327.4.5.el7.x86_64-x86_64-with-centos-7.2.1511-Core
               components: CPython=3.6.3 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.1.2 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=32 xfer=64 md5=0 crypto=0
                 log file: /var/log/blobxfer.log
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=True lmt_ge=False md5=False
        delete extraneous: True
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /mnt/Asset_Archive
============================================
2017-11-06 13:54:35.921 INFO - blobxfer start time: 2017-11-06 13:54:35.921523-05:00
2017-11-06 13:54:35.921 DEBUG - spawning 32 disk threads
2017-11-06 13:54:35.950 DEBUG - spawning 64 transfer threads
2017-11-06 20:44:37.756 ERROR - 'utf-8' codec can't encode character '\udca0' in position 74: surrogates not allowed
Traceback (most recent call last):
  File "blobxfer/operations/upload.py", line 1156, in start
  File "blobxfer/operations/upload.py", line 1064, in _run
  File "blobxfer/operations/upload.py", line 1063, in <listcomp>
  File "blobxfer/operations/upload.py", line 898, in _generate_destination_for_source
  File "blobxfer/operations/upload.py", line 842, in _check_for_existing_remote
  File "blobxfer/operations/azure/blob/__init__.py", line 83, in get_blob_properties
  File "site-packages/azure/storage/blob/baseblobservice.py", line 1453, in get_blob_properties
  File "site-packages/azure/storage/common/storageclient.py", line 231, in _perform_request
  File "site-packages/azure/storage/common/_serialization.py", line 83, in _update_request
  File "urllib/parse.py", line 781, in quote
UnicodeEncodeError: 'utf-8' codec can't encode character '\udca0' in position 74: surrogates not allowed

New Options

  • Mode: Auto, Append, Block, Page, File
    • Read from stdin
  • No clobber in both directions
  • "delete extraneous from target" in both directions
  • Destination path
  • Multiple filters
    • Include
    • Exclude
  • Store file mode, uid, gid

YAML configuration file support

  • Support YAML configuration input
  • "Process" action based on input from yaml
  • Azure storage options
    • Endpoint
    • Multiple accounts
  • General options
    • Store source file mode option
  • Upload KV list
  • Download KV list
  • Source "input" list
  • Skip options
  • Vectored IO options
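
A minimal sketch of what such a YAML configuration might look like, written here as a shell heredoc and mirroring the structure of the JSON config dump shown in the "blobxfer hangs" issue below; any keys beyond those in that dump, and the exact placement of the --config flag, are assumptions:

    cat > config.yaml <<'EOF'
    azure_storage:
      endpoint: core.windows.net
      accounts:
        mystorageaccount: "ACCOUNT_KEY_OR_SAS"
    options:
      progress_bar: true
    upload:
      - source:
          - /path/to/upload
        destination:
          - mystorageaccount: mycontainer/remote/path
    EOF
    blobxfer upload --config config.yaml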

Upload/Download files from Azure Strorage Container using blobxfer

Hello author,
I tried to upload some files into an Azure container. I ran this command on a Linux machine, but there was no response for a long time and it kept showing the same status. What is the issue here? Could anyone suggest where this is going wrong?

Uploading Command : blobxfer accountname containername files_name --saskey "KEY"

This command shows the following: details of the options like timeout, storage account, use SAS, container/share name, transfer direction, etc., and then:
script start time: 2016-09-14 06:57:46

But after a long period, i.e. more than 2 hours, it is still showing the same status. The file size is less than 1 MB. I am facing the same issue when downloading files as well. Please give me some suggestion on how to proceed with uploading/downloading.
Thanks in advance!!

Failed to synchronise

Failed to synchronise an empty local folder with an empty blob container when using --delete.
This command throws an exception saying that it could not authenticate to the server. The probable cause is the empty "include" parameter in the request to the server.

Update:
directory /blobtest is empty and the container is not empty.

blobxfer ACCOUNT test /blobtest --storageaccountkey KEY --delete

script start time: 2016-09-02 11:37:08
creating container, if needed: test
deleting 37 remote blobs
deletion complete.
detected no transfer actions needed to be taken, exiting...

directory /blobtest is empty and container is empty.

blobxfer ACCOUNT test /blobtest --storageaccountkey KEY --delete

script start time: 2016-09-02 11:37:12
Traceback (most recent call last):
  File "/usr/bin/blobxfer", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/blobxfer.py", line 2389, in main
    blob_service[0], args, metadata=False)
  File "/usr/lib/python2.7/site-packages/blobxfer.py", line 1605, in get_blob_listing
    container_name=args.container, marker=marker, include=incl)
  File "/usr/lib/python2.7/site-packages/blobxfer.py", line 1418, in azure_request
    return req(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/azure/storage/blob/baseblobservice.py", line 1177, in list_blobs
    resp = self._list_blobs(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/azure/storage/blob/baseblobservice.py", line 1247, in _list_blobs
    response = self._perform_request(request)
  File "/usr/lib/python2.7/site-packages/azure/storage/storageclient.py", line 195, in _perform_request
    _storage_error_handler(HTTPError(response.status, response.message, response.headers, response.body))
  File "/usr/lib/python2.7/site-packages/azure/storage/_serialization.py", line 125, in _storage_error_handler
    return _general_error_handler(http_error)
  File "/usr/lib/python2.7/site-packages/azure/storage/_error.py", line 74, in _general_error_handler
    raise AzureHttpError(message, http_error.status)
azure.common.AzureHttpError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:7779c4fa-0001-0115-02f5-04dd08000000
Time:2016-09-02T08:37:11.7990970Z</Message><AuthenticationErrorDetail>The MAC signature found in the HTTP request '+JjRFkacbfHXhzsvLYhR2Jci5nty2b8Xtd4hwT1TVVc=' is not the same as any computed signature. Server used following string to sign: 'GET

x-ms-client-request-id:71b1790e-70e8-11e6-9ee8-001c42abe891
x-ms-date:Fri, 02 Sep 2016 08:37:12 GMT
x-ms-version:2015-07-08
/ACCOUNT/test
comp:list
include:
restype:container'.</AuthenticationErrorDetail></Error>

--skip-on-filesize-match option causes Traceback.

Using the latest blobxfer-1.0.0rc2-linux-x86_64 Linux binary, I get a traceback with --skip-on-filesize-match, but it seems to work when that option is not supplied.

> [root@vdev-azw-ecom-us-nfs001 ~]# ./blobxfer-1.0.0rc2-linux-x86_64 upload --skip-on-filesize-match --delete --log-file /tmp/blobxfer.log --progress-bar --storage-account blahblablah --storage-account-key IHeMfulLFR/M5COcyVvNRjw9+Z3CIFpgPVQ7EdODEEnTU2ZHDfnCckblahblahblah --remote-path test --local-path /etc/yum/
============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.0.0rc2
                 platform: Linux-3.10.0-514.28.1.el7.x86_64-x86_64-with-redhat-7.3-Maipo
               components: CPython=3.5.3 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.1.1 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=4 xfer=8 md5=0 crypto=0
                 log file: /tmp/blobxfer.log
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=True lmt_ge=False md5=False
        delete extraneous: True
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /etc/yum
============================================
Traceback (most recent call last):
  File "cli/cli.py", line 906, in <module>
  File "site-packages/click/core.py", line 722, in __call__
  File "site-packages/click/core.py", line 697, in main
  File "site-packages/click/core.py", line 1066, in invoke
  File "site-packages/click/core.py", line 895, in invoke
  File "site-packages/click/core.py", line 535, in invoke
  File "site-packages/click/decorators.py", line 64, in new_func
  File "site-packages/click/core.py", line 535, in invoke
  File "cli/cli.py", line 901, in upload
  File "blobxfer/operations/upload.py", line 1145, in start
  File "blobxfer/operations/upload.py", line 1117, in _run
RuntimeError: upload mismatch: [count=0/-7 bytes=0/-1884]
[40227] Failed to execute script cli
[root@vdev-azw-ecom-us-nfs001 ~]#
[root@vdev-azw-ecom-us-nfs001 ~]# ./blobxfer-1.0.0rc2-linux-x86_64 upload --delete --log-file /tmp/blobxfer.log --progress-bar --storage-account blahblah --storage-account-key IHeMfulLFR/M5COcyVvNRjw9+Z3CIFpgPVQ7EdODEEnTU2Zblahblahblah --remote-path test --local-path /etc/yum/
============================================
         Azure blobxfer parameters
============================================
         blobxfer version: 1.0.0rc2
                 platform: Linux-3.10.0-514.28.1.el7.x86_64-x86_64-with-redhat-7.3-Maipo
               components: CPython=3.5.3 azstor.blob=0.36.0 azstor.file=0.36.0 crypt=2.1.1 req=2.18.4
       transfer direction: local -> Azure
                  workers: disk=4 xfer=8 md5=0 crypto=0
                 log file: /tmp/blobxfer.log
              resume file: None
                  timeout: connect=3.1 read=12.1
                     mode: StorageModes.Auto
                  skip on: fs_match=False lmt_ge=False md5=False
        delete extraneous: True
                overwrite: True
                recursive: True
            rename single: False
         chunk size bytes: 0
           one shot bytes: 0
         strip components: 0
         store properties: attr=False md5=False
           rsa public key: None
       local source paths: /etc/yum
============================================
upload progress: [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100.00%        0.009 MiB/sec, 7/7 uploaded
[root@vdev-azw-ecom-us-nfs001 ~]#

Script Fail on download

Hi there.
When downloading from Azure using blobxfer, I get these errors.

Traceback (most recent call last):
File "/usr/local/bin/blobxfer", line 11, in
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 2566, in main
localfile, blob, False, blobdict[blob])
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1786, in generate_xferspec_download
remoteresource = encode_blobname(args, remoteresource)
File "/usr/local/lib/python2.7/dist-packages/blobxfer.py", line 1524, in encode_blobname
return urlquote(blobname)
File "/usr/lib/python2.7/urllib.py", line 1288, in quote
return ''.join(map(quoter, s))
KeyError: u'\u2560'
hwwadm@hgslhpsv01:/m

The blobs I'm restoring were happily uploaded to Azure using blobxfer, so were happy with all the filenames.

DD_15Fischsta╠łbchen_450g.ai

This is the filename that caused blobxfer to stop; however, Azure Storage Explorer was happy to download it, so it dealt with this odd character. I've also had issues with ( ) in filenames; renaming the files by removing the brackets fixes the issue.

Looking forward to hearing from you!

many thanks.

Logging support

  • Python logging facilities
  • Per module logger
  • Allow redirect of logger output via configuration?

Resumable Transfer Support

Resuming encrypted files is not supported as the Python stdlib SHA256 object is not picklable. This may be supported at a later time with a suitable pure Python replacement.

Operation support:

  • Download
  • Upload
  • Synccopy

Integrity:

  • Without MD5
  • With MD5
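
For context, resume support in 1.x is driven by the resume file shown as "resume file: None" in the parameter dumps elsewhere on this page; a hedged sketch of enabling it (the flag name is an assumption consistent with the other 1.x option names):

    blobxfer upload --storage-account myaccount --storage-account-key "KEY" \
        --remote-path mycontainer/path --local-path /data \
        --resume-file /tmp/blobxfer-resume.db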

Feedback Request

Hello blobxfer users,

I would like to solicit feedback on what you feel is missing or what you would like to see improved in the next version of blobxfer. The next version of blobxfer will be a nearly from-scratch rewrite incorporating feedback collected over the past year. Some ideas include:

  • Restrict to Python 3.4/3.5 for native asyncio (docker image would still be available for those stuck on Python 2.7 or earlier versions of Python 3.x)
  • Rewrite in Go
  • Please see other ideas which have been promoted from thoughts to "maybes" in the associated kanban board: https://github.com/Azure/blobxfer/projects/1

Thanks!
