pyunpack's People

Contributors

eisene, fativi, ponty

pyunpack's Issues

Unable to extract *.tar.xz in Windows

OS version: Windows 10 20H2
Python version: Python 3.9.5 and Anaconda3 2021.5 Python 3.8.8 64-bit
Requirement patool-1.12 already satisfied
MWE:

import os
import sys
from pyunpack import Archive
path_to_file = 'xxx'
target_dir = 'xxx'
Archive(path_to_file).extractall(target_dir)

Problem: the program hangs at the last line and never extracts the archive. The archive is a *.tar.xz file of only 69350 KiB.
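
As a possible workaround while the hang is investigated, a .tar.xz archive can also be extracted with Python's standard tarfile module, which does not start an external process. This is only a minimal sketch, assuming the file is a regular xz-compressed tar; `extract_tar_xz` is a hypothetical helper name and not part of pyunpack:

```python
import tarfile


def extract_tar_xz(path_to_file, target_dir):
    """Extract an xz-compressed tar archive using only the standard library."""
    # "r:xz" makes tarfile read the file through the lzma module.
    with tarfile.open(path_to_file, mode="r:xz") as tar:
        # Member names are not sanitised here, so only use this
        # on archives from a trusted source.
        tar.extractall(target_dir)
```

If this call also hangs or fails, the archive itself is a likelier suspect than the pyunpack/patool subprocess.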

Archive.extractall hangs on password protected .rar file without throwing exception

I'm writing a python script to process some old archives that include many .rar files. I can successfully extract data from them using Archive.extractall from pyunpack. However, occasionally I will hit a password protected file, and the program just hangs. I have the code in a try block, but it's not throwing any exception. Would it be possible to throw an exception in this circumstance so my script can skip the file and continue?

Here's the relevant portion of my code:

import os
import rarfile
from pyunpack import Archive

for subdir, dirs, files in os.walk(directoryToExtract):
    for file in files:
        archive_path = os.path.join(subdir, file)
        if rarfile.is_rarfile(archive_path):
            print("Found rarfile " + archive_path)
            try:
                # Extract into a directory named after the archive minus its 4-character extension.
                Archive(archive_path).extractall(os.path.join(subdir, file[:-4]), auto_create_dir=True)
            except Exception as e:
                print("ERROR: Couldn't unrar " + archive_path)
                print(str(e))
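
Until pyunpack raises an exception here, one way to keep a batch script moving is to run the extractor as an external command with no interactive stdin and a timeout. The sketch below uses only the standard subprocess module; `run_noninteractive` is a hypothetical helper, and whether a given extractor fails cleanly when its password prompt receives end-of-file is an assumption that would need checking per tool:

```python
import subprocess


def run_noninteractive(cmd, timeout):
    """Run cmd with no usable stdin; return None if it exceeds the timeout."""
    try:
        return subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # a password prompt reads EOF instead of waiting
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,           # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
```

A `None` result or a non-zero `returncode` can then be treated as "skip this archive and continue".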

Add password support

Most archive formats support protecting the archive with a password.
It would be great to be able to extract files with a given password.

Problem using pyunpack in an executable made with PyInstaller in combination with try/except

NB: I have already opened the same issue on Stack Overflow here:
https://stackoverflow.com/questions/60655910/use-pyunpack-inside-an-executable-file-made-with-pyinstaller-in-combination-with

I see strange behaviour from pyunpack, a package for unpacking archives, when it runs inside an executable.

I want to do the following:

I have a 7z archive whose extension is .sent instead of .7z.

First I try to extract it directly, which raises an expected error that I catch.

In the error handler, I first add the .7z extension, then extract the file properly into a folder called "grog", and finally give the archive its original name back.

Here is the code below:

# test.py
from os.path import abspath, join, exists, dirname
from os import rename, mkdir
from shutil import copy
import multiprocessing
import pyunpack

multiprocessing.freeze_support()
print(0)
name = "file_to_be_unzipped.sent"
print("a")
path = "C:\\Users\\myname\\eclipse-workspace-tms\\test_unzip_exe"
print(abspath("."))
print("b")
unzip_dest = join(path, "grog")
if not exists(unzip_dest):
    mkdir(unzip_dest)
print("c")
name = join(path, name)
print("d")
print("e")
try:
    print(1)
    pyunpack.Archive(name).extractall(unzip_dest)
    print(2)
except pyunpack.PatoolError as pterr:
    print(3)
    temp_f_name = name + ".7z"
    print(4)
    rename(name, temp_f_name)
    try:
        print(5)
        pyunpack.Archive(temp_f_name).extractall(unzip_dest)
        print(6)
        rename(temp_f_name, name)
        print(7)
    except pyunpack.PatoolError as pterr2:
        # removing useless 7z extension
        print(8)
        rename(temp_f_name, name)
        print(9)
        # Case when the file is already unzipped
        # str.find() returns -1 (truthy) when the text is absent, so test membership instead
        if "Is not archive" in str(pterr2):
            print(10)
            copy(name, unzip_dest)
            print(11)
        print(12)
except ValueError as v:
    print(13)
    print(v)
    print(14)

When I launch the script test.py, I get the expected behaviour:

0
a
C:\Users\myname\eclipse-workspace-tms\test_unzip_exe
b
c
d
e
1
3
4
5
6
7

then I build the executable with the following command line:

pyinstaller --log-level=DEBUG test.spec

and the following spec file:

# -*- mode: python ; coding: utf-8 -*-

block_cipher = None

import pyunpack
import patoolib
from pyunpack import Archive, PatoolError
from patoolib.programs import ar
from patoolib.programs import  arc
from patoolib.programs import  archmage
from patoolib.programs import  arj
from patoolib.programs import  bsdcpio
from patoolib.programs import  bsdtar
from patoolib.programs import  bzip2
from patoolib.programs import  cabextract
from patoolib.programs import  chmlib
from patoolib.programs import  clzip
from patoolib.programs import  compress
from patoolib.programs import  cpio
from patoolib.programs import  dpkg
from patoolib.programs import  flac
from patoolib.programs import  genisoimage
from patoolib.programs import  gzip
from patoolib.programs import  isoinfo
from patoolib.programs import  lbzip2
from patoolib.programs import  lcab
from patoolib.programs import  lha
from patoolib.programs import  lhasa
from patoolib.programs import  lrzip
from patoolib.programs import  lzip
from patoolib.programs import  lzma
from patoolib.programs import  lzop
from patoolib.programs import  mac
from patoolib.programs import  nomarch
from patoolib.programs import  p7azip
from patoolib.programs import  p7rzip
from patoolib.programs import  p7zip
from patoolib.programs import  pbzip2
from patoolib.programs import  pdlzip
from patoolib.programs import  pigz
from patoolib.programs import  plzip
from patoolib.programs import  py_bz2
from patoolib.programs import  py_echo
from patoolib.programs import  py_gzip
from patoolib.programs import  py_lzma
from patoolib.programs import  py_tarfile
from patoolib.programs import  py_zipfile
from patoolib.programs import  rar
from patoolib.programs import  rpm
from patoolib.programs import  rpm2cpio
from patoolib.programs import  rzip
from patoolib.programs import  shar
from patoolib.programs import  shorten
from patoolib.programs import  star
from patoolib.programs import  tar
from patoolib.programs import  unace
from patoolib.programs import  unadf
from patoolib.programs import  unalz
from patoolib.programs import  uncompress
from patoolib.programs import  unrar
from patoolib.programs import  unshar
from patoolib.programs import  unzip
from patoolib.programs import  xdms
from patoolib.programs import  xz
from patoolib.programs import  zip
from patoolib.programs import  zoo
from patoolib.programs import  zopfli
from patoolib.programs import  zpaq


# from pyunpack import Archive, PatoolError

a = Analysis(['test.py'],
             pathex=['C:\\Users\\myname\\eclipse-workspace-tms\\test_unzip_exe'],
             binaries=[],
             datas=[],
             hiddenimports=['pyunpack', 'patoolib',
                             'patoolib.programs.ar',
                             'patoolib.programs.arc',
                             'patoolib.programs.archmage',
                             'patoolib.programs.arj',
                             'patoolib.programs.bsdcpio',
                             'patoolib.programs.bsdtar',
                             'patoolib.programs.bzip2',
                             'patoolib.programs.cabextract',
                             'patoolib.programs.chmlib',
                             'patoolib.programs.clzip',
                             'patoolib.programs.compress',
                             'patoolib.programs.cpio',
                             'patoolib.programs.dpkg',
                             'patoolib.programs.flac',
                             'patoolib.programs.genisoimage',
                             'patoolib.programs.gzip',
                             'patoolib.programs.isoinfo',
                             'patoolib.programs.lbzip2',
                             'patoolib.programs.lcab',
                             'patoolib.programs.lha',
                             'patoolib.programs.lhasa',
                             'patoolib.programs.lrzip',
                             'patoolib.programs.lzip',
                             'patoolib.programs.lzma',
                             'patoolib.programs.lzop',
                             'patoolib.programs.mac',
                             'patoolib.programs.nomarch',
                             'patoolib.programs.p7azip',
                             'patoolib.programs.p7rzip',
                             'patoolib.programs.p7zip',
                             'patoolib.programs.pbzip2',
                             'patoolib.programs.pdlzip',
                             'patoolib.programs.pigz',
                             'patoolib.programs.plzip',
                             'patoolib.programs.py_bz2',
                             'patoolib.programs.py_echo',
                             'patoolib.programs.py_gzip',
                             'patoolib.programs.py_lzma',
                             'patoolib.programs.py_tarfile',
                             'patoolib.programs.py_zipfile',
                             'patoolib.programs.rar',
                             'patoolib.programs.rpm',
                             'patoolib.programs.rpm2cpio',
                             'patoolib.programs.rzip',
                             'patoolib.programs.shar',
                             'patoolib.programs.shorten',
                             'patoolib.programs.star',
                             'patoolib.programs.tar',
                             'patoolib.programs.unace',
                             'patoolib.programs.unadf',
                             'patoolib.programs.unalz',
                             'patoolib.programs.uncompress',
                             'patoolib.programs.unrar',
                             'patoolib.programs.unshar',
                             'patoolib.programs.unzip',
                             'patoolib.programs.xdms',
                             'patoolib.programs.xz',
                             'patoolib.programs.zip',
                             'patoolib.programs.zoo',
                             'patoolib.programs.zopfli',
                             'patoolib.programs.zpaq'],
             # hiddenimports=['Archive', 'PatoolError'],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)
pyz = PYZ(a.pure, a.zipped_data,
             cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          [],
          exclude_binaries=True,
          name='test',
          debug=False,
          bootloader_ignore_signals=False,
          strip=False,
          upx=True,
          console=True )
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               upx_exclude=[],
               name='test')

When I run the executable, after an unexpectedly long time, I get the following:

0
a
C:\Users\myname\eclipse-workspace-tms\test_unzip_exe\dist\test
b
c
d
e
1
2

The file in the destination ("grog") is not extracted as intended but is simply a copy of the archive.

Does anybody have an idea of what is going wrong?

Thanks a lot
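
The long hand-written list of patoolib.programs modules in the spec file could probably be generated instead of typed out. The sketch below uses only the standard pkgutil module; `list_submodules` is a hypothetical helper. PyInstaller also ships a `collect_submodules` function in `PyInstaller.utils.hooks` that serves a similar purpose. Note that this only shortens the spec file and is not claimed to fix the behaviour described above.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the dotted names of all modules found inside a package."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    # walk_packages yields (finder, name, ispkg) entries for every submodule.
    return sorted(
        info.name for info in pkgutil.walk_packages(package.__path__, prefix)
    )
```

In the spec file this might be used as `hiddenimports=['pyunpack', 'patoolib'] + list_submodules('patoolib.programs')`, assuming patoolib is importable in the environment that runs PyInstaller.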

ImportError: No module named patoolib

Hello,
I'm trying to use pyunpack to open a RAR file.

I've installed patool (1.12) in order to add RAR support.
My pyunpack version is 0.0.3.

But using:

>>> from pyunpack import Archive
>>> Archive('my-file.rar').extractall('./')

I get the error: ImportError: No module named patoolib

I'm using Python 2.7 on Mac OS X El Capitan.

What am I missing?
Thanks for any help
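
A common cause of this kind of ImportError is that the package was installed for a different interpreter than the one running the script. A quick way to check is sketched below. It is written for Python 3 (the `importlib.util.find_spec` call does not exist on Python 2.7), and `module_available` is a hypothetical helper name:

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which interpreter is running and whether it can see patoolib.
    print("interpreter:", sys.executable)
    print("patoolib importable:", module_available("patoolib"))
```

If the interpreter printed here is not the one that pip installed patool into, installing with `python -m pip install patool` from that same interpreter is the usual remedy.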

zip "gbk" decode request?

Unpacking an archive that contains files with Chinese characters in their names gives wrong results. Could the ZIP names be decoded as "gbk"?

Archive:
test.zip

├─test
│ └─测试 (subfolder with chinese name)
│ └─test
│ test.txt
│ 测试.txt (file with chinese name)

After running
arch = pyunpack.Archive()
arch.extractall()

I get

└─test
│ └─▓Γ╩╘
│ └─test
│ test.txt
│ ▓Γ╩╘.txt

Every name containing Chinese characters comes out wrong!

test.zip
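
The garbled names above are what results when GBK-encoded bytes are decoded as cp437, which is how Python's zipfile module treats names that are not flagged as UTF-8. Below is a minimal sketch of undoing that, using only the standard library. `fix_zip_name` and `extract_with_encoding` are hypothetical helpers, not part of pyunpack, and the "gbk" default is an assumption about how the archive was created:

```python
import os
import shutil
import zipfile


def fix_zip_name(name, encoding="gbk"):
    """Undo zipfile's cp437 decoding and decode the raw bytes as `encoding`."""
    return name.encode("cp437").decode(encoding)


def extract_with_encoding(zip_path, dest, encoding="gbk"):
    """Extract a ZIP whose legacy (non-UTF-8) names are stored in `encoding`."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = info.filename
            # Bit 11 (0x800) marks names stored as UTF-8; those are already correct.
            if not info.flag_bits & 0x800:
                name = fix_zip_name(name, encoding)
            # Names are not checked for ".." here; use on trusted archives only.
            target = os.path.join(dest, name)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
```

For example, `fix_zip_name("▓Γ╩╘.txt")` gives `"测试.txt"`, the name from the listing above.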

===========================

Also, pyunpack does not seem to handle 7z?
Log:
patool can not unpack
patool error: error extracting could not find an executable program to extract format 7z; candidates are (7z,7za,7zr)
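
The log says that none of the candidate programs (7z, 7za, 7zr) was found, so the 7z format depends on an external executable being installed and on PATH. A small sketch for checking this from Python with the standard shutil module follows; `find_first_executable` is a hypothetical helper name:

```python
import shutil


def find_first_executable(candidates):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        # shutil.which honours PATHEXT on Windows, so "7z" also matches "7z.exe".
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_first_executable(["7z", "7za", "7zr"]))
```

If this prints `None`, installing a 7-Zip command-line package and making sure its directory is on PATH is the likely fix.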

Bug with Chinese characters on Windows 10

PS E:\Project*\test> & E:/Installed/Python/Python37-32/python.exe e:/Project//test/unzipfile1.py
Traceback (most recent call last):
File "e:/Project/***/test/unzipfile1.py", line 7, in <module>
Archive('软件 安排 编辑.rar').extractall('./module/temp')
File "E:\Installed\Python\Python37-32\lib\site-packages\pyunpack\__init__.py", line 113, in extractall
self.extractall_patool(directory, patool_path)
File "E:\Installed\Python\Python37-32\lib\site-packages\pyunpack\__init__.py", line 74, in extractall_patool
raise PatoolError("patool can not unpack\n" + str(p.stderr))
pyunpack.PatoolError: patool can not unpack

********** Oops, I did it again. *************

You have found an internal error in patool. Please write a bug report
at https://github.com/wummel/patool/issues/ and include at least the information below:

Not disclosing some of the information below due to privacy reasons is ok.
I will try to help you nonetheless, but you have to give me something
I can work with ;) .

<class 'UnicodeEncodeError'> 'charmap' codec can't encode characters in position 63-64: character maps to <undefined>
Traceback (most recent call last):
File "E:\Installed\Python\Python37-32\Scripts\patool", line 213, in main
res = globals()["run_%s" % args.command](args)
File "E:\Installed\Python\Python37-32\Scripts\patool", line 33, in run_extract
patoolib.extract_archive(archive, verbosity=args.verbosity, interactive=args.interactive, outdir=args.outdir)
File "E:\Installed\Python\Python37-32\lib\site-packages\patoolib\__init__.py", line 683, in extract_archive
util.log_info("Extracting %s ..." % archive)
File "E:\Installed\Python\Python37-32\lib\site-packages\patoolib\util.py", line 516, in log_info
print("patool:", msg, file=out)
File "E:\Installed\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 63-64: character maps to <undefined>
System info:
patool 1.12
Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)] on win32

sys.argv ['E:\Installed\Python\Python37-32\Scripts\patool', '--non-interactive', 'extract', 'E:\Project\**\test\\u8f6f\u4ef6 \u5b89\u6392 \u7f16\u8f91.rar', '--outdir=E:\Project\****\test\module\temp']
LANG = 'en_US.UTF-8'

******** patool internal error, over and out ********
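
The traceback shows patool failing while printing the archive name to a stream encoded as cp1252. One possible workaround is to force UTF-8 for the child process's standard streams through the PYTHONIOENCODING environment variable. The sketch below only demonstrates the mechanism with a generic child Python process. `run_python` is a hypothetical helper, and it is an assumption that the patool process started by pyunpack inherits the parent's environment:

```python
import os
import subprocess
import sys


def run_python(code, io_encoding):
    """Run `code` in a child Python whose stdio uses `io_encoding`."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


if __name__ == "__main__":
    code = "print('\\u8f6f\\u4ef6')"  # the child prints two Chinese characters
    print(run_python(code, "cp1252").returncode)  # non-zero: UnicodeEncodeError in the child
    print(run_python(code, "utf-8").returncode)   # 0
```

Under that assumption, setting `os.environ["PYTHONIOENCODING"] = "utf-8"` before calling `Archive(...).extractall(...)` might avoid the error.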

EasyProcess sometimes hung when unpacking. I fixed this problem in my own code; I think pyunpack needs a patch

I added the --non-interactive option and a timeout.
My Archive subclass:

class ArchiveWithTimeout(Archive):

    def __init__(self, filename, timeout=None):
        super(ArchiveWithTimeout, self).__init__(filename)
        self.timeout = timeout

    def extractall_patool(self, directory, patool_path):
        log.debug("starting backend patool")
        if not patool_path:
            patool_path=fullpath('patool')
        p = EasyProcess([
            sys.executable,
            patool_path,
            '--non-interactive',
            'extract',
            self.filename,
            '--outdir=' + directory,
            #                     '--verbose',
        ]).call(timeout=self.timeout)
        if p.return_code:
            raise PatoolError("patool can not unpack\n" + str(p.stderr))

Calling patool directly

Hi,
I saw that patool isn't called from Python directly. Is there a reason to call it in a subprocess?
I can provide a pull request.

Password protected ZIPs cannot be extracted

pyunpack currently doesn't extract password protected ZIP files with backend = "auto". But it is able to do it with manually chosen "patool" as backend.

Default config fails:

$ python -m pyunpack.cli  -p "geheim" ./lz_input/pwd_protected.zip . 
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/user/.cache/pypoetry/virtualenvs/mnxmpl-ToTEp6cO-py3.10/lib/python3.10/site-packages/pyunpack/cli.py", line 12, in <module>
    def extractall(
  File "/home/user/.cache/pypoetry/virtualenvs/mnxmpl-ToTEp6cO-py3.10/lib/python3.10/site-packages/entrypoint2/__init__.py", line 382, in entrypoint
    return func(**kwargs)
  File "/home/user/.cache/pypoetry/virtualenvs/mnxmpl-ToTEp6cO-py3.10/lib/python3.10/site-packages/pyunpack/cli.py", line 21, in extractall
    Archive(filename, backend, password=password).extractall(
  File "/home/user/.cache/pypoetry/virtualenvs/mnxmpl-ToTEp6cO-py3.10/lib/python3.10/site-packages/pyunpack/__init__.py", line 111, in extractall
    self.extractall_zipfile(directory)
  File "/home/user/.cache/pypoetry/virtualenvs/mnxmpl-ToTEp6cO-py3.10/lib/python3.10/site-packages/pyunpack/__init__.py", line 79, in extractall_zipfile
    zipfile.ZipFile(self.filename).extractall(
  File "/usr/lib/python3.10/zipfile.py", line 1645, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1698, in _extract_member
    with self.open(member, pwd=pwd) as source, \
  File "/usr/lib/python3.10/zipfile.py", line 1571, in open
    return ZipExtFile(zef_file, mode, zinfo, pwd, True)
  File "/usr/lib/python3.10/zipfile.py", line 800, in __init__
    self._decompressor = _get_decompressor(self._compress_type)
  File "/usr/lib/python3.10/zipfile.py", line 699, in _get_decompressor
    _check_compression(compress_type)
  File "/usr/lib/python3.10/zipfile.py", line 679, in _check_compression
    raise NotImplementedError("That compression method is not supported")
NotImplementedError: That compression method is not supported
$ ls .

This works:

$ python3 -m pyunpack.cli  -p "geheim" -b "patool" ./lz_input/pwd_protected.zip .
$ ls .
file.txt

Info:

Python: 3.10.8
Patool: 1.12 4928f3f
Pyunpack: 0.3
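
The traceback ends in zipfile's NotImplementedError, which is raised when a member uses a compression method the standard library cannot read (AES-encrypted ZIPs, for instance, typically record a method outside the standard set). One conceivable pre-check for the "auto" backend is sketched below. `zipfile_can_extract` is a hypothetical helper and not how pyunpack currently decides:

```python
import zipfile

# Compression methods the standard zipfile module can decompress.
SUPPORTED_METHODS = {
    zipfile.ZIP_STORED,
    zipfile.ZIP_DEFLATED,
    zipfile.ZIP_BZIP2,
    zipfile.ZIP_LZMA,
}


def zipfile_can_extract(path):
    """Return True if every member uses a method zipfile knows how to read."""
    with zipfile.ZipFile(path) as zf:
        return all(info.compress_type in SUPPORTED_METHODS for info in zf.infolist())
```

When this returns False, falling back to the patool backend (which worked in the report above) would be the natural choice.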

PatoolError: patool can not unpack & not found patool

Hi,

I want to use pyunpack to extract a cabinet file.
The code is as follows:

import sys
from pyunpack import Archive
sys.executable = r"C:\Users\<myname>\AppData\Local\Continuum\anaconda3\python.exe"
Archive("folder1//A.cab").extractall("folder1")

Here is the traceback:

Traceback (most recent call last):

File "c:\users\<myname>\documents\repo\<programA>\test.py", line 70, in <module>
Archive("folder1//Alienware_Desktop_064C.cab").extractall("folder1")

File "C:\Users\<myname>\AppData\Local\Continuum\anaconda3\lib\site-packages\pyunpack\__init__.py", line 94, in extractall
self.extractall_patool(directory, patool_path)

File "C:\Users\<myname>\AppData\Local\Continuum\anaconda3\lib\site-packages\pyunpack\__init__.py", line 65, in extractall_patool
raise PatoolError("patool can not unpack\n" + str(p.stderr))

PatoolError: patool can not unpack
File "C:\Users\<myname>\AppData\Local\Continuum\anaconda3\Scripts\patool.exe", line 1
SyntaxError: Non-UTF-8 code starting with '\x90' in file C:\Users\<myname>\AppData\Local\Continuum\anaconda3\Scripts\patool.exe on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

I am trying to extract folder1//A.cab, but pyunpack tries to extract patool.exe instead.

The other issue I found is in __init__.py of pyunpack: line 48 should be
patool_path = _exepath("patool.exe")
otherwise there is an error:
ValueError: patool not found! Please install patool!
even though I have already installed patool.

Environment:

Win10
spyder4.1.4
patool-1.12
pyunpack 0.2.1

Russian language

Unpacking decodes the file names in the archive using cp437, which produces broken names; decoding the file names with cp866 works fine.
Update:
It is a bug in zipfile.py :(

Fails to Identify `patool` Path on Some Windows Versions

It seems that there's a failure in _exepath that doesn't allow it to correctly identify patool.exe on a Windows system. Seems like that could be corrected with the following (or similar) addition:

def _exepath(cmd: str) -> Optional[str]:
    for p in os.environ["PATH"].split(os.pathsep):
        fullp = os.path.join(p, cmd)
        if os.access(fullp, os.X_OK):
            return fullp
+        if os.access(fullp + ".exe", os.X_OK):
+            return fullp + ".exe"
    return None

Misleading Error Message

I made a Python script which searches for .zip and .rar files and unpacks them with pyunpack. At first I was puzzled when pyunpack threw this exception, since the path it showed was in fact valid:

filename.rar will be unpacked
Traceback (most recent call last):
File "ungesichtetaut.py", line 66, in <module>
main(working_dir, sortierung)
File "ungesichtetaut.py", line 61, in main
unpacking = unpack(working_dir)
File "ungesichtetaut.py", line 47, in unpack
auto_create_dir = True)
File "/usr/local/lib/python2.7/dist-packages/pyunpack/__init__.py", line 66, in extractall
"archive file does not exist:" + str(self.filename))
ValueError: archive file does not exist:/media/usb1/ungesichtet/filename.rar

The code that gave me the error was:

        for file in test_to_unpack:
            if (".rar" in file or ".zip" in file or ".001" in file):
                print file, "will be unpacked"
                Archive(file).extractall(working_folder +    "_unpack",
                auto_create_dir = True)

I corrected it to:

        for file in test_to_unpack:
            if (".rar" in file or ".zip" in file or ".001" in file):
                print file, "will be unpacked"
                Archive(working_folder + '/' + file).extractall(working_folder + "_unpack",
                auto_create_dir = True)

Now it works, given the complete path.
At first I assumed pyunpack knew the path, since the ValueError showed the whole path.

Nevertheless thanks for your great work.
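
The confusion here comes from directory listings returning bare file names, which are then resolved against the current working directory rather than the listed folder. The sketch below builds full paths up front. `list_archives` is a hypothetical helper, and it matches on the file suffix instead of a substring test such as `".rar" in file`:

```python
import os

ARCHIVE_SUFFIXES = (".rar", ".zip", ".001")


def list_archives(folder):
    """Return full paths of archive files located directly inside `folder`."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(ARCHIVE_SUFFIXES)
    )
```

Each returned path can be passed to `Archive(...)` directly, independent of the directory the script was started from.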

password support

Does password decompression currently only support ZIP format?

7z unpack

from pyunpack import Archive
Archive(thing).extractall(str(thing[0:thing.rfind('/')]))

raises:

Traceback (most recent call last):
  File "importItAll.py", line 33, in <module>
    Archive(thing).extractall(str(thing[0:thing.rfind('/')]))
  File "/usr/local/lib/python2.7/dist-packages/pyunpack/__init__.py", line 74, in extractall
    self.extractall_patool(directory, patool_path)
  File "/usr/local/lib/python2.7/dist-packages/pyunpack/__init__.py", line 41, in extractall_patool
    '--outdir=' + directory,
  File "/usr/local/lib/python2.7/dist-packages/easyprocess/__init__.py", line 108, in __init__
    self.cmd_as_string = ' '.join(self.cmd)  # TODO: not perfect
TypeError: sequence item 1: expected string, NoneType found

stackoverflow
