Comments (16)

bwoodsend avatar bwoodsend commented on August 18, 2024 2

Hmm, I'm curious to know why in 2.2 _pywrap_tensorflow_internal loses its prefix. I'm guessing that there is yet more sys.path monkey-patching and import redirecting going on in there which they ditched in 2.3.

('_pywrap_tensorflow_internal', '.../site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'EXTENSION')

But the fact that it still works if you keep only the copy in the root leads me to think there is still some magic going on. A straightforward import tensorflow.python._pywrap_tensorflow_internal should be a ModuleNotFoundError. Unless it no longer gets imported as a python extension and the PyInit is just leftover code which they never got round to removing...

And I agree that the tensorflow/python subdirectory is where it belongs. But that path rewriting mess it makes on macOS is a game-changer. That seems to hard-code the assumption that all binary modules get put in the root (which tbh seems a bit short-sighted). We have in the past dodged the DLL scanner by pretending some binaries are datas. I guess we could do something similar here, i.e., convert all extension modules in hiddenimports to datas. But that's hardly appealing.

So overall, I'd probably just go with suppressing inclusion as an extension module using excludedimports = ['tensorflow.python._pywrap_tensorflow_internal'] (and probably need to remove it from hiddenimports too). Maybe explicitly add it to binaries if we don't trust our binary scanner to pick it up automatically.
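A minimal sketch of what that hook change could look like. It assumes the hook builds hiddenimports as a plain list of module names; the helper name drop_duplicated_extension and the sample list are made up for illustration and are not taken from the actual tensorflow hook.

```python
# Hypothetical fragment of a hook-tensorflow.py; only the list handling is shown.
DUPLICATED_EXTENSION = "tensorflow.python._pywrap_tensorflow_internal"


def drop_duplicated_extension(hiddenimports):
    """Return (hiddenimports, excludedimports) with the extension suppressed."""
    kept = [name for name in hiddenimports if name != DUPLICATED_EXTENSION]
    return kept, [DUPLICATED_EXTENSION]


# Sample input (assumption); a real hook would probably collect this list automatically.
collected = [
    "tensorflow.python._pywrap_tfe",
    "tensorflow.python._pywrap_tensorflow_internal",
    "tensorflow.python._pywrap_utils",
]
hiddenimports, excludedimports = drop_duplicated_extension(collected)
print(hiddenimports)
print(excludedimports)
```

The point is that the name has to disappear from hiddenimports and show up in excludedimports at the same time; doing only one of the two leaves the extension in the build.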

from pyinstaller-hooks-contrib.

harshithdwivedi avatar harshithdwivedi commented on August 18, 2024 1

@bwoodsend Sorry about the additional mention! I figured that I should mention Rok since he created #46.
To answer your question, yes; I did that.

rokm avatar rokm commented on August 18, 2024 1

As a quick (and ugly) work-around, you could remove the offending entry from the TOC in the .spec file, e.g.:

a = Analysis(...)

from PyInstaller.building.datastruct import TOC
a.binaries = TOC(filter(lambda x: 'tensorflow.python._pywrap_tensorflow_internal' not in x[0], a.binaries))

pyz = ...

rokm avatar rokm commented on August 18, 2024 1

Ufff, so looking at this a bit further, it actually looks like a rather complicated problem...

With tensorflow 2.3, we end up with two copies of _pywrap_tensorflow_internal on all OSes. This file (_pywrap_tensorflow_internal.so on linux and macOS, _pywrap_tensorflow_internal.pyd on Windows) is a Python C extension module (it has the PyInit__pywrap_tensorflow_internal entry point), but it also seems to serve as a shared library that tensorflow's other extension modules (the other .so/.pyd files in tensorflow/python) are linked against. I suspect this is why we gather it twice, once as an extension and once as a library/binary.

The OS-independent problem here is that _pywrap_tensorflow_internal is by far the largest file in the recent tensorflow versions, so the compiled program ends up with tensorflow that's effectively twice the original size.

To fix this, I guess we would need to adjust the shared library search in pyinstaller to ignore libraries that are also extension modules.
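Roughly, the idea would be something like the sketch below. It is not pyinstaller's actual dependency scanner, just a filter over TOC-style (name, source, typecode) tuples; the function name and the libexample.so entry are invented for the example.

```python
def drop_binaries_that_are_extensions(toc):
    """Remove BINARY entries whose source file is already collected as EXTENSION."""
    extension_sources = {src for _name, src, kind in toc if kind == "EXTENSION"}
    return [
        (name, src, kind)
        for name, src, kind in toc
        if not (kind == "BINARY" and src in extension_sources)
    ]


SOURCE = ".../site-packages/tensorflow/python/_pywrap_tensorflow_internal.so"
toc = [
    ("tensorflow.python._pywrap_tensorflow_internal", SOURCE, "EXTENSION"),
    ("_pywrap_tensorflow_internal.so", SOURCE, "BINARY"),
    # Hypothetical ordinary shared library, which must survive the filter.
    ("libexample.so", ".../lib/libexample.so", "BINARY"),
]
filtered = drop_binaries_that_are_extensions(toc)
for entry in filtered:
    print(entry)
```

Note that this keeps the copy in tensorflow/python and drops the one in the root, so by itself it would not help with the macOS behaviour described below.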

But the fun doesn't end just there. With linux and Windows builds, one can remove either _pywrap_tensorflow_internal.{so,pyd} file, and the program still works (and based on the fact that the file is an extension module, I now think that its proper place is in tensorflow/python subdirectory, not in the program's root directory).

However, on macOS, pyinstaller rewrites the linked-library paths of the binaries it bundles so that they are relative to the program's root directory. This also happens with tensorflow's extension modules in tensorflow/python, which therefore end up looking for _pywrap_tensorflow_internal in ../../ (i.e., the program's root). That's why the compiled program on macOS works only if the copy in tensorflow/python is removed.
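To make the ../../ concrete, here is a small sketch of the relative-path arithmetic. It only models the idea of "point at the library as if it sat in the root"; the function name is made up, and the real rewriting code in pyinstaller may well differ in the details.

```python
import posixpath


def rewritten_reference(module_dest_dir, library_name):
    """Path, relative to the module's own directory, of a library kept in the root."""
    to_root = posixpath.relpath(".", module_dest_dir)
    return posixpath.join(to_root, library_name)


# An extension module placed in tensorflow/python has to climb two levels.
print(rewritten_reference("tensorflow/python", "_pywrap_tensorflow_internal.so"))
# One placed in tensorflow/python/profiler/internal has to climb four.
print(rewritten_reference("tensorflow/python/profiler/internal",
                          "_pywrap_tensorflow_internal.so"))
```

With references like these, a copy that lives in tensorflow/python is never the one the other extension modules resolve to.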

rokm avatar rokm commented on August 18, 2024 1

And for versions like 2.2, where the extension is collected as _pywrap_tensorflow_internal instead of tensorflow.python._pywrap_tensorflow_internal, the file would still end up in the program's root directory by default...

The annoying part is that all other extension modules with _pywrap in their name seem to be collected with their proper name:

('tensorflow.python._pywrap_mlir', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_mlir.so', 'EXTENSION')
('tensorflow.python._pywrap_kernel_registry', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_kernel_registry.so', 'EXTENSION')
('tensorflow.python.profiler.internal._pywrap_traceme', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/profiler/internal/_pywrap_traceme.so', 'EXTENSION')
('tensorflow.python._pywrap_record_io', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_record_io.so', 'EXTENSION')
('tensorflow.python._pywrap_toco_api', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_toco_api.so', 'EXTENSION')
('tensorflow.python._pywrap_stat_summarizer', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_stat_summarizer.so', 'EXTENSION')
('tensorflow.python._pywrap_quantize_training', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_quantize_training.so', 'EXTENSION')
('tensorflow.python._pywrap_tfe', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tfe.so', 'EXTENSION')
('tensorflow.python.profiler.internal._pywrap_profiler', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/profiler/internal/_pywrap_profiler.so', 'EXTENSION')
('_pywrap_tensorflow_internal', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'EXTENSION')
('tensorflow.python._pywrap_stacktrace_handler', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_stacktrace_handler.so', 'EXTENSION')
('tensorflow.python._pywrap_tfprof', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tfprof.so', 'EXTENSION')
('tensorflow.python._pywrap_utils', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_utils.so', 'EXTENSION')
('tensorflow.python._pywrap_tf_session', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tf_session.so', 'EXTENSION')
('tensorflow.python._pywrap_py_exception_registry', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_py_exception_registry.so', 'EXTENSION')
('tensorflow.python._pywrap_tfcompile', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tfcompile.so', 'EXTENSION')
('tensorflow.python._pywrap_bfloat16', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_bfloat16.so', 'EXTENSION')
('tensorflow.python._pywrap_tf_optimizer', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tf_optimizer.so', 'EXTENSION')
('tensorflow.python._pywrap_py_func', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_py_func.so', 'EXTENSION')
('tensorflow.python._pywrap_debug_events_writer', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_debug_events_writer.so', 'EXTENSION')
('tensorflow.python._pywrap_checkpoint_reader', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_checkpoint_reader.so', 'EXTENSION')
('tensorflow.python._pywrap_tf_item', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tf_item.so', 'EXTENSION')
('tensorflow.python._pywrap_device_lib', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_device_lib.so', 'EXTENSION')
('tensorflow.python._pywrap_transform_graph', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_transform_graph.so', 'EXTENSION')
('tensorflow.python._pywrap_events_writer', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_events_writer.so', 'EXTENSION')
('tensorflow.python._pywrap_util_port', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_util_port.so', 'EXTENSION')
('tensorflow.python._pywrap_python_op_gen', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_python_op_gen.so', 'EXTENSION')
('tensorflow.python._pywrap_tf_cluster', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tf_cluster.so', 'EXTENSION')
('tensorflow.python._pywrap_file_io', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_file_io.so', 'EXTENSION')

rokm avatar rokm commented on August 18, 2024 1

Aha, removing it from hiddenimports and adding it to excludedimports did the trick (I was stuck trying each individually).

@harshithdwivedi can you try the following branch: https://github.com/rokm/pyinstaller-hooks-contrib/tree/tensorflow-duplicate-library ? (It should fix the duplication, as well as avoid including some other unnecessary files).

I.e., run pip install -U https://github.com/rokm/pyinstaller-hooks-contrib/archive/tensorflow-duplicate-library.zip to update the hooks and rebuild the program.

harshithdwivedi avatar harshithdwivedi commented on August 18, 2024

@rokm

bwoodsend avatar bwoodsend commented on August 18, 2024

@harshithdwivedi Slightly desperate here - did you remember to use the --clean option after applying #46 ?

rokm avatar rokm commented on August 18, 2024

Those errors

[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/descriptor_database.cc:118] File already exists in database: tensorflow/core/protobuf/master.proto
[libprotobuf FATAL external/com_google_protobuf/src/google/protobuf/descriptor.cc:1379] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size): 
libc++abi.dylib: terminating with uncaught exception of type google::protobuf::FatalException: CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size): 

indicate that tensorflow's protobuf definitions are being registered twice (i.e., by two libraries).

Under tensorflow 2.3, we actually end up with two (identical) copies of _pywrap_tensorflow_internal.so:

(venv) aether:pyi-tensorflow23 rok$ find dist/mnist -name _pywrap_tensorflow_internal.so
dist/mnist/_pywrap_tensorflow_internal.so
dist/mnist/tensorflow/python/_pywrap_tensorflow_internal.so

Under tensorflow 2.2 (which works), there was only one:

(venv) aether:pyi-tensorflow22 rok$ find dist/mnist -name _pywrap_tensorflow_internal.so
dist/mnist/_pywrap_tensorflow_internal.so

After deleting the copy at tensorflow/python/_pywrap_tensorflow_internal.so, I can get the built mnist program to work.

I guess we'll need to figure out why we end up with two copies in 2.3 but a single one in 2.2, whereas the system in both cases contains a single one in the same location:

(venv) aether:pyi-tensorflow22 rok$ find venv/ -name _pywrap_tensorflow_internal.so
venv//lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
(venv) aether:pyi-tensorflow23 rok$ find venv/ -name _pywrap_tensorflow_internal.so
venv//lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
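The find checks above can also be scripted, which helps when comparing builds. This is just a sketch: find_duplicate_files is a made-up helper, and the demo runs on a throwaway directory with placeholder contents rather than on a real dist/mnist.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def find_duplicate_files(root):
    """Map (basename, sha256) -> sorted relative paths, for files present more than once."""
    seen = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            digest = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Hash in 1 MiB chunks; the real file is very large.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            key = (filename, digest.hexdigest())
            seen[key].append(os.path.relpath(full_path, root))
    return {key: sorted(paths) for key, paths in seen.items() if len(paths) > 1}


# Demo on a temporary directory that mimics the layout of the 2.3 build.
with tempfile.TemporaryDirectory() as dist:
    os.makedirs(os.path.join(dist, "tensorflow", "python"))
    for rel in ("_pywrap_tensorflow_internal.so",
                os.path.join("tensorflow", "python", "_pywrap_tensorflow_internal.so")):
        with open(os.path.join(dist, rel), "wb") as handle:
            handle.write(b"placeholder contents")
    duplicates = find_duplicate_files(dist)

for (name, _digest), paths in duplicates.items():
    print(name, paths)
```

Grouping by name and hash together means only truly identical copies are reported, not unrelated files that happen to share a basename.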

harshithdwivedi avatar harshithdwivedi commented on August 18, 2024

@C-Aniruddh can we try this out?

C-Aniruddh avatar C-Aniruddh commented on August 18, 2024

@rokm Somehow, pyinstaller with TensorFlow 2.3 only creates one copy of the _pywrap_tensorflow_internal.pyd when run on Windows.

rokm avatar rokm commented on August 18, 2024

Yes, it works correctly on linux and Windows. This is a macOS-specific problem.

rokm avatar rokm commented on August 18, 2024

Hmm, actually, I see two copies of _pywrap_tensorflow_internal.pyd with tensorflow 2.3 on Windows as well (but it does not have the same effect as on macOS, because those are not dynamic libraries). There's also a warning in the build log:

93314 WARNING: ('_pywrap_tensorflow_internal.pyd',
 'c:\\users\\rok\\development\\pyinst-5184\\venv-fixed\\lib\\site-packages\\tensorflow\\python\\_pywrap_tensorflow_internal.pyd',
 'BINARY')
93315 WARNING: was placed previously at
93316 WARNING: ('tensorflow\\python\\_pywrap_tensorflow_internal.pyd',
 'c:\\users\\rok\\development\\pyinst-5184\\venv-fixed\\lib\\site-packages\\tensorflow\\python\\_pywrap_tensorflow_internal.pyd',
 'EXTENSION')

harshithdwivedi avatar harshithdwivedi commented on August 18, 2024

Maybe we can add a specific check to the hook to delete this duplicate file after tensorflow is packaged?

rokm avatar rokm commented on August 18, 2024

The difference between TOC entries for the duplicated library in a tensorflow 2.2 and tensorflow 2.3 build is:

v.2.2:

('_pywrap_tensorflow_internal', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'EXTENSION')
('_pywrap_tensorflow_internal.so', '/Users/rok/Development/pyi-tf22/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'BINARY')

v.2.3:

('tensorflow.python._pywrap_tensorflow_internal', '/Users/rok/Development/pyi-tf23/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'EXTENSION')
('_pywrap_tensorflow_internal.so', '/Users/rok/Development/pyi-tf23/venv/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so', 'BINARY')

So it looks like we collect it as a BINARY and as an EXTENSION in both cases, but in the case of 2.2, the basename clash probably prevents it from being packaged twice. With 2.3, the extension is collected with its full namespace prefix, which results in duplication.
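A simplified model of that difference, as a sketch only: it assumes an EXTENSION entry lands at its dotted name turned into a directory path (plus the file suffix), and a BINARY entry lands at its name as-is. The destination function is invented for illustration and is not how pyinstaller itself is written.

```python
import posixpath


def destination(name, source, kind):
    """Simplified guess at where a TOC entry ends up inside the bundle."""
    if kind == "EXTENSION":
        suffix = posixpath.splitext(source)[1]
        return name.replace(".", "/") + suffix
    return name


SOURCE = ".../site-packages/tensorflow/python/_pywrap_tensorflow_internal.so"
toc_22 = [
    ("_pywrap_tensorflow_internal", SOURCE, "EXTENSION"),
    ("_pywrap_tensorflow_internal.so", SOURCE, "BINARY"),
]
toc_23 = [
    ("tensorflow.python._pywrap_tensorflow_internal", SOURCE, "EXTENSION"),
    ("_pywrap_tensorflow_internal.so", SOURCE, "BINARY"),
]

# Both 2.2 entries map to the same destination; the 2.3 entries map to two.
dest_22 = sorted({destination(*entry) for entry in toc_22})
dest_23 = sorted({destination(*entry) for entry in toc_23})
print(dest_22)
print(dest_23)
```

Under this model the 2.2 build has a single destination for both entries, while the 2.3 build has one in the root and one in tensorflow/python, which matches the find output from the two dist directories.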

@bwoodsend, any ideas on how to approach this? I'm leaning towards what @harshithdwivedi suggested - we try to catch that specific extension in the hook and exclude it there, if possible...

harshithdwivedi avatar harshithdwivedi commented on August 18, 2024

Sounds great, thanks!
@C-Aniruddh can you try this out?
