
object's People

Contributors

alexcrichton, amanieu, amshafer, bjorn3, chrisdenton, daladim, dependabot-support, ecnelises, esmeyi, fitzgen, ignatenkobrain, jonas-schievink, jrmuizel, jsgf, luser, m4b, mkroening, mstange, nagisa, osiewicz, philipc, roblabla, rocallahan, rreverser, striezel, sunfishcode, tamird, tesuji, vthib, xry111

object's Issues

Change `open` into `parse` on the `Object` trait

This will let us avoid reparsing in the mach_o impl and will make zero-copy impls work much better.

We will also be able to use the memmap crate to mmap the file into memory, and then the Object impls can just parse that as a slice.
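
A minimal sketch of the shape this proposal implies, assuming a hypothetical `Object` trait whose `parse` borrows a byte slice. The names `Object`, `ToyElf` and the error type are illustrative only and are not the crate's actual API. The bytes here come from a `Vec`, but a memory-mapped region would be handed over as the same `&[u8]`.

```rust
/// Hypothetical trait: `parse` borrows the bytes instead of opening a file.
trait Object<'a>: Sized {
    fn parse(data: &'a [u8]) -> Result<Self, &'static str>;
    fn is_little_endian(&self) -> bool;
}

/// Toy ELF view that only keeps a reference to the input (zero-copy).
struct ToyElf<'a> {
    data: &'a [u8],
}

impl<'a> Object<'a> for ToyElf<'a> {
    fn parse(data: &'a [u8]) -> Result<Self, &'static str> {
        // An ELF file starts with 0x7f 'E' 'L' 'F'; we also need byte 5 (EI_DATA).
        if data.len() < 6 || !data.starts_with(b"\x7fELF") {
            return Err("not an ELF file");
        }
        Ok(ToyElf { data })
    }

    fn is_little_endian(&self) -> bool {
        // EI_DATA == 1 means ELFDATA2LSB (little endian).
        self.data[5] == 1
    }
}

fn main() {
    let bytes: Vec<u8> = vec![0x7f, b'E', b'L', b'F', 2, 1];
    let elf = ToyElf::parse(&bytes).expect("valid header");
    println!("little endian: {}", elf.is_little_endian());
}
```

Because the parsed value only borrows the slice, the caller decides how the bytes are obtained (read, mmap, embedded data) and nothing is reparsed or copied.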

Add PE support

It might be good to at least prototype this out before doing a new release to make sure all of the api changes make sense on top of PE.

patch possibly broke cargo-bloat on macs

cargo-bloat wasn't working for me on my mac (RazrFalcon/cargo-bloat#35), and I bisected the bug down to this patch in the object library: c600365. I patched goblin to point at a fork of object, and saw it outputs data before that patch, but not afterwards. With the patch:

% cargo run bloat --crates -n 2
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/cargo-bloat bloat --crates -n 2`
Compiling ...
Analyzing target/debug/cargo-bloat

File  .text Size Name
0.0% 100.0%   0B .text section size, the file size is 13.2MiB

Note: numbers above are a result of guesswork. They are not 100% correct and never will be.

Without:

% cargo run bloat --crates -n 2
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/cargo-bloat bloat --crates -n 2`
Compiling ...
Analyzing target/debug/cargo-bloat

 File  .text   Size Name
12.8%  31.1% 1.7MiB std
 7.6%  18.4% 1.0MiB clap
41.1% 100.0% 5.4MiB .text section size, the file size is 13.2MiB

Note: numbers above are a result of guesswork. They are not 100% correct and never will be.

I'm not familiar with how these tools work, so I'm not sure if this is a bug in cargo-bloat or object. Do you know what's going on?

Implement dynsym

Both ELF and MachO (can) have dynamic symbols (I don't know if PE has something similar). Would it be possible to implement a function dynamic_symbols in addition to symbols?

For now I have only looked at the ELF code, so my suggestions are based on that.
If there are dynamic symbols in PEs, we could simply add such a function to the Object trait.
If that's not the case, I'd suggest the following approaches:

  1. Append all dynamic symbols to the result returned by symbols.
  2. Make dynamic_symbols a function of trait Object returning Vec<Symbol<'a>>. Implement it for ELF and MachO and make it return an empty vec for PE. This behaviour suggests that it's possible to have dynamic symbols in PEs.
  3. Like the former, but panic if it's called on PE. While this makes the contract stricter, it should be the better behaviour, implying that this is not possible for PEs.
  4. Like the former, but make dynamic_symbols return Option<Vec<Symbol<'a>>> and return Some(vec) for ELF and MachO and None for PE. I think that this approach just complicates things.
  5. Add the dynamic_symbols function to both ELF and MachO and let users create the respective struct themselves instead of creating a File.
  6. Like the former, but make inner of File public (or expose it somehow) so that it's possible for the user to instantiate a File but still call dynamic_symbols.

For ELF files I was able to just exchange symtab and strtab with dynsym and dynstrtab respectively, and it worked as expected. Therefore I'd suggest making the current symbols function private and adding two parameters to it for the symbol table and the string table; the new symbols and dynamic_symbols functions would then call it.
I did this as a test (and because I need it) for ElfFile in 3ae266f.
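
The refactoring suggested in the last paragraph could look roughly like the sketch below. It assumes simplified, hypothetical `Symbol`, `RawSym` and `ToyElf` types (the crate's real types carry more fields, and this is not the code from the referenced commit); the point is only that one private helper is parameterised by the symbol table and string table.

```rust
/// Illustrative symbol record; the real `Symbol` has more fields.
#[derive(Debug, PartialEq)]
struct Symbol<'a> {
    name: &'a str,
    address: u64,
}

/// Hypothetical raw table entry: an offset into a string table plus a value.
struct RawSym {
    name_offset: usize,
    value: u64,
}

struct ToyElf<'a> {
    symtab: Vec<RawSym>,
    strtab: &'a str,
    dynsym: Vec<RawSym>,
    dynstrtab: &'a str,
}

impl<'a> ToyElf<'a> {
    /// Private helper shared by both public accessors.
    fn symbols_from(table: &[RawSym], strings: &'a str) -> Vec<Symbol<'a>> {
        table
            .iter()
            .map(|raw| {
                // Names are NUL-terminated inside the string table.
                let rest = &strings[raw.name_offset..];
                let end = rest.find('\0').unwrap_or(rest.len());
                Symbol { name: &rest[..end], address: raw.value }
            })
            .collect()
    }

    fn symbols(&self) -> Vec<Symbol<'a>> {
        Self::symbols_from(&self.symtab, self.strtab)
    }

    fn dynamic_symbols(&self) -> Vec<Symbol<'a>> {
        Self::symbols_from(&self.dynsym, self.dynstrtab)
    }
}

fn main() {
    let elf = ToyElf {
        symtab: vec![RawSym { name_offset: 1, value: 0x1000 }],
        strtab: "\0main\0",
        dynsym: vec![RawSym { name_offset: 1, value: 0 }],
        dynstrtab: "\0malloc\0",
    };
    println!("{:?}", elf.symbols());
    println!("{:?}", elf.dynamic_symbols());
}
```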

Add `Object` trait back in

Since this is growing beyond a simple get_section/is_little_endian, I think we should add the Object trait back in, and split the code up into separate elf/mach files again. The difference from before is that I want to keep support for compiling both ELF and Mach-O at the same time.

macho_fixup_relocation incorrect

let constant = match relocation.kind {
    RelocationKind::Relative
    | RelocationKind::GotRelative
    | RelocationKind::PltRelative => relocation.addend + 4,
    _ => relocation.addend,
};
relocation.addend -= constant;

With this code, relocation.addend will always be either 0 or -4 afterwards, because the constant subtracted from relocation.addend contains relocation.addend itself. The match should have been written like:

let constant = match relocation.kind {
    RelocationKind::Relative
    | RelocationKind::GotRelative
    | RelocationKind::PltRelative => -4,
    _ => 0,
};
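
To make the difference concrete, here is a small standalone sketch of both versions. It uses a cut-down, hypothetical `RelocationKind` enum and assumes the trailing `relocation.addend -= constant;` line is kept unchanged in the corrected version.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum RelocationKind {
    Absolute,
    Relative,
    GotRelative,
    PltRelative,
}

/// The reported behaviour: the constant already contains the addend,
/// so the subtraction cancels the original value.
fn fixup_buggy(kind: RelocationKind, addend: i64) -> i64 {
    let constant = match kind {
        RelocationKind::Relative
        | RelocationKind::GotRelative
        | RelocationKind::PltRelative => addend + 4,
        _ => addend,
    };
    addend - constant
}

/// The proposed fix: the constant is independent of the addend.
fn fixup_fixed(kind: RelocationKind, addend: i64) -> i64 {
    let constant = match kind {
        RelocationKind::Relative
        | RelocationKind::GotRelative
        | RelocationKind::PltRelative => -4,
        _ => 0,
    };
    addend - constant
}

fn main() {
    for kind in [RelocationKind::Absolute, RelocationKind::Relative] {
        println!(
            "{:?}: buggy = {}, fixed = {}",
            kind,
            fixup_buggy(kind, 100),
            fixup_fixed(kind, 100)
        );
    }
}
```

For an input addend of 100, the buggy version yields 0 for `Absolute` and -4 for `Relative`, while the fixed version yields 100 and 104.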

Can we add MIPS variant to Machine enum?

I think this is information that dependent crates might need.

object/src/lib.rs

Lines 87 to 101 in 655dc57

/// The machine type of an object file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Machine {
    /// An unrecognized machine type.
    Other,
    /// ARM
    Arm,
    /// ARM64
    Arm64,
    /// x86
    X86,
    /// x86-64
    #[allow(non_camel_case_types)]
    X86_64,
}
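
A sketch of what the request could look like for ELF input. The `Mips` variant and the `machine_from_elf` helper are assumptions made for illustration and are not the crate's code; the `EM_*` values are the standard ELF `e_machine` constants.

```rust
/// The machine type of an object file, with a hypothetical `Mips` variant added.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Machine {
    Other,
    Arm,
    Arm64,
    X86,
    #[allow(non_camel_case_types)]
    X86_64,
    Mips,
}

// Standard ELF e_machine values.
const EM_386: u16 = 3;
const EM_MIPS: u16 = 8;
const EM_ARM: u16 = 40;
const EM_X86_64: u16 = 62;
const EM_AARCH64: u16 = 183;

/// Hypothetical helper mapping an ELF `e_machine` field to `Machine`.
fn machine_from_elf(e_machine: u16) -> Machine {
    match e_machine {
        EM_386 => Machine::X86,
        EM_MIPS => Machine::Mips,
        EM_ARM => Machine::Arm,
        EM_X86_64 => Machine::X86_64,
        EM_AARCH64 => Machine::Arm64,
        _ => Machine::Other,
    }
}

fn main() {
    println!("{:?}", machine_from_elf(EM_MIPS));
}
```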

Clippy run (never loop)

Clippy output
warning: redundant field names in struct initialization
   --> src/pe.rs:314:17
    |
314 |                 name: name,
    |                 ^^^^^^^^^^ help: replace it with: `name`
    |
    = note: #[warn(clippy::redundant_field_names)] on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_field_names

warning: casting u32 to u64 may become silently lossy if types change
   --> src/elf.rs:343:37
    |
343 |         if (self.section.sh_flags & elf::section_header::SHF_COMPRESSED as u64) == 0 {
    |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `u64::from(elf::section_header::SHF_COMPRESSED)`
    |
    = note: #[warn(clippy::cast_lossless)] on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#cast_lossless

warning: redundant pattern matching, consider using `is_err()`
   --> src/elf.rs:368:16
    |
368 |           if let Err(_) =
    |  _________-      ^^^^^^
369 | |             decompress.decompress_vec(compressed_data, &mut decompressed, FlushDecompress::Finish)
370 | |         {
371 | |             return None;
372 | |         }
    | |_________- help: try this: `if decompress.decompress_vec(compressed_data, &mut decompressed, FlushDecompress::Finish).is_err()`
    |
    = note: #[warn(clippy::redundant_pattern_matching)] on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_pattern_matching

warning: redundant pattern matching, consider using `is_err()`
   --> src/elf.rs:396:16
    |
396 |           if let Err(_) =
    |  _________-      ^^^^^^
397 | |             decompress.decompress_vec(&data[12..], &mut decompressed, FlushDecompress::Finish)
398 | |         {
399 | |             return None;
400 | |         }
    | |_________- help: try this: `if decompress.decompress_vec(&data[12..], &mut decompressed, FlushDecompress::Finish).is_err()`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_pattern_matching

error: this loop never actually loops
   --> src/elf.rs:535:17
    |
535 | /                 while let Some(reloc) = relocations.next() {
536 | |                     let kind = match self.file.elf.header.e_machine {
537 | |                         elf::header::EM_ARM => match reloc.r_type {
538 | |                             elf::reloc::R_ARM_ABS32 => RelocationKind::Direct32,
...   |
566 | |                     ));
567 | |                 }
    | |_________________^
    |
    = note: #[deny(clippy::never_loop)] on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop

error: this loop never actually loops
   --> src/macho.rs:328:17
    |
328 | /                 while let Some(Ok((section, data))) = sections.next() {
329 | |                     return Some(MachOSection {
330 | |                         file: self.file,
331 | |                         section,
332 | |                         data,
333 | |                     });
334 | |                 }
    | |_________________^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop

warning: casting u32 to u64 may become silently lossy if types change
  --> src/wasm.rs:83:58
   |
83 |         self.module.start_section().map_or(u64::MAX, |s| s as u64)
   |                                                          ^^^^^^^^ help: try: `u64::from(s)`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#cast_lossless

warning: use of `unwrap_or` followed by a function call
   --> src/wasm.rs:191:10
    |
191 |         .unwrap_or(Cow::from(&[][..]))
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `unwrap_or_else(|| Cow::from(&[][..]))`
    |
    = note: #[warn(clippy::or_fun_call)] on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call

warning: large size difference between variants
  --> src/lib.rs:89:5
   |
89 |     Elf(ElfFile<'data>),
   |     ^^^^^^^^^^^^^^^^^^^
   |
   = note: #[warn(clippy::large_enum_variant)] on by default
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#large_enum_variant
help: consider boxing the large fields to reduce the total size of the enum
   |
89 |     Elf(Box<ElfFile<'data>>),
   |         ^^^^^^^^^^^^^^^^^^^

error: aborting due to 2 previous errors

error: Could not compile `object`.
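
For reference, the two `never_loop` errors share one shape: a `while let` whose body returns unconditionally on the first iteration. The sketch below reproduces that shape with a simplified, hypothetical iterator of `Result<u32, String>` items (not the crate's section iterator) next to an equivalent `match`.

```rust
/// Same shape as the flagged code: the body returns on the first pass,
/// so the `while` never iterates a second time.
#[allow(clippy::never_loop)]
fn next_ok_looping<I>(items: &mut I) -> Option<u32>
where
    I: Iterator<Item = Result<u32, String>>,
{
    while let Some(Ok(value)) = items.next() {
        return Some(value);
    }
    None
}

/// Equivalent without a loop: take one item and match on it.
fn next_ok<I>(items: &mut I) -> Option<u32>
where
    I: Iterator<Item = Result<u32, String>>,
{
    match items.next() {
        Some(Ok(value)) => Some(value),
        _ => None,
    }
}

fn main() {
    let mut a = vec![Ok(1), Err("bad".to_string()), Ok(3)].into_iter();
    let mut b = a.clone();
    for _ in 0..3 {
        println!("{:?} {:?}", next_ok_looping(&mut a), next_ok(&mut b));
    }
}
```

Both functions consume exactly one item per call and return `None` when that item is an `Err` or the iterator is exhausted, so the rewrite does not change behaviour.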

How to get a section size with .rodata?

Currently, cargo-bloat doesn't support .rodata, and I'm testing it on the encoding crate, which has a lot of big tables. The problem is that I can't figure out how to get the size of .rodata. SectionKind::ReadOnlyData simply doesn't have the required data.

Here are my results on the recode example in debug mode:

Here is a bloaty output:

     VM SIZE                                                                         FILE SIZE
 --------------                                                                   --------------
   4.5%  64.1Ki encoding_index_tradchinese::big5::backward::hec9fca170a69a4b8      64.2Ki   1.3%

The same method using object/examples/nm:

> cargo run --example nm -- ../recode | grep hec9fca170a69a4b8
000000000008f3d0 00000000000000f9 t _ZN26encoding_index_tradchinese4big58backward17hec9fca170a69a4b8E

Only 249B. But big5 is actually big.

Sorting all symbols via object by size:

    let file = object::File::parse(...).unwrap();
    let mut list = Vec::new();
    for symbol in file.symbols() {
        let fn_name = symbol.name().unwrap_or("<unknown>");
        let fn_name = rustc_demangle::demangle(fn_name).to_string();
        list.push((fn_name, symbol.size()));
    }

    list.sort_by(|a, b| b.1.cmp(&a.1));

    for v in list.iter().take(10) {
        println!("{:?}", v);
    }

Outputs:

("encoding::codec::japanese::iso2022jp::raw_feed::h5aad8d4365b7d428", 52642)
("encoding::codec::japanese::eucjp::raw_feed::h49eec0588e0efa26", 16934)
("encoding::codec::simpchinese::gb18030::raw_feed::h645cf08094182a99", 16774)
("encoding::label::encoding_from_whatwg_label::ha039dfc40b9b0e42", 12038)
("getopts::Options::parse::h6cd96ef1d45134b1", 11536)
("read_line_info", 8786)
("recode::main::h4f6550cba8af3809", 8692)
("encoding::codec::tradchinese::bigfive2003::raw_feed::h591f7a2d33a75d32", 6999)
("alloc::str::join_generic_copy::hed5e07989f354dd0", 6849)
("elf_add", 6605)

The biggest symbol is just 50KiB.

Any ideas?

Baremetal binary backend

I'm currently working on symbolication on an odd platform (homebrew on the Nintendo Switch), which doesn't use ELF or Mach (or anything standard for that matter). I was wondering if it'd be a good idea to implement a "dumb" backend that hardcodes common section names, and finds them with linker symbols. For instance, __debug_abbrev_start, __dynamic_start, etc... With weak linkage, we could get something that works pretty well, I think.

So we would create a BaremetalObject (unit?) struct that implements Object.

Would such a PR get accepted ?

#![no_std] support

Looks like the only hard std dependency is std::io::Cursor here, which really should use goblin::peek_bytes but that is by some mistake covered under if_std!. oh wow everything transitively depends on std::io my eyes my eyes are burning

Missing constructor symbols in elf

When I build some Python packages, and then run object's nm on the generated .o files, some symbols are missing

e.g. https://pypi.org/project/pyzmq/ , https://pypi.org/project/gevent/

> nm build/temp.linux-x86_64-3.7/zmq/backend/cython/message.o | grep __pyx_wrapperbase_3zmq_7backend_6cython_7message_5Frame_2__init__
0000000000000038 C __pyx_wrapperbase_3zmq_7backend_6cython_7message_5Frame_2__init__
> ~/projects/rust/object/target/debug/examples/nm build/temp.linux-x86_64-3.7/zmq/backend/cython/message.o | grep __pyx_wrapperbase_3zmq_7backend_6cython_7message_5Frame_2__init__
0000000000000020 0000000000000038 U __pyx_wrapperbase_3zmq_7backend_6cython_7message_5Frame_2__init__

This is blocking indygreg/PyOxidizer#183

https://pypi.org/project/psutil/ appears to be a different problem.

Add support for conditionally building support for each object format

Goblin has a set of features that let you choose which formats are supported. It would be nice if we could have the same for 'object'. This can cut down on code size in situations where you only need to support the native format of the platform you're running on. An example use case for this is backtrace-rs, additionally we'd like this for use in the profiler in Gecko.

Cargo.toml says indexmap 1.0.0 is required, but indexmap 1.2.0 is actually required

error[E0599]: no method named `insert_full` found for type `indexmap::set::IndexSet<&'a [u8]>` in the current scope
  --> /Users/bjorn/.cargo/registry/src/github.com-1ecc6299db9ec823/object-0.14.0/src/write/string.rs:22:31
   |
22 |         let id = self.strings.insert_full(string).0;
   |

Updating Cargo.lock from indexmap 1.0.1 to indexmap 1.2.0 fixed the problem.

Improve parsing efficiency, reliability, and flexibility

The design of goblin and scroll has a few deficiencies:

  • it requires copying of headers before users can access them,
  • it eagerly parses many headers,
  • it requires large allocations.

Changing this in goblin is not feasible because it would be a fundamental and intrusive change. Additionally, the principal author of goblin has previously expressed disinterest in such changes.

I propose replacing the usage of goblin with our own implementation.

The main features should be:

  • use a crate such as zerocopy to access headers in place. A similar alternative is plain, but it has fewer safety checks and its write support is weak,
  • lazily parse headers,
  • require the caller to cache parsing results if needed (but many callers shouldn't need to),
  • allow disabling unneeded file formats (including supporting only one of 32-bit or 64-bit).

This design would bring it closer to how gimli operates.

There is a basic proof of concept at philipc@b658b5a. So far this has not required changes to the API of the object crate. However, I do expect that ElfFile will need to be split into 32-bit and 64-bit variants.
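
As a rough illustration of "access headers in place" and "lazily parse", the sketch below wraps a borrowed byte slice and decodes ELF header fields only when an accessor is called. It deliberately uses plain byte reads from the standard library rather than the zerocopy crate, and `ElfHeaderView` is a hypothetical name, not the proof of concept's type.

```rust
/// Borrowed view of an ELF file header; nothing is copied or decoded up front.
struct ElfHeaderView<'data> {
    bytes: &'data [u8],
}

impl<'data> ElfHeaderView<'data> {
    /// Only checks the magic and that the fields read below are in bounds.
    fn new(bytes: &'data [u8]) -> Option<Self> {
        if bytes.len() >= 20 && bytes.starts_with(b"\x7fELF") {
            Some(ElfHeaderView { bytes })
        } else {
            None
        }
    }

    /// EI_CLASS (byte 4): 1 = 32-bit, 2 = 64-bit.
    fn is_64(&self) -> bool {
        self.bytes[4] == 2
    }

    /// EI_DATA (byte 5): 1 = little endian, 2 = big endian.
    fn is_little_endian(&self) -> bool {
        self.bytes[5] == 1
    }

    /// e_machine lives at offset 18 in both the 32-bit and 64-bit headers.
    fn e_machine(&self) -> u16 {
        let raw = [self.bytes[18], self.bytes[19]];
        if self.is_little_endian() {
            u16::from_le_bytes(raw)
        } else {
            u16::from_be_bytes(raw)
        }
    }
}

fn main() {
    let mut file = [0u8; 20];
    file[..4].copy_from_slice(b"\x7fELF");
    file[4] = 2; // 64-bit
    file[5] = 1; // little endian
    file[18] = 62; // EM_X86_64
    let header = ElfHeaderView::new(&file).expect("valid header");
    println!("64-bit: {}, machine: {}", header.is_64(), header.e_machine());
}
```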

Make a release

There's a bunch of new functionality so it would be nice to have a new release.

Please release a new version on crates.io

I'd like to make use of the debug_file_info() method that was added in #39, but that method was not included in the 0.7.0 release and there hasn't been a new release since then.

`ObjectSection::relocations` is inefficient for ELF

There is no link from an ELF section to its relocations, so in order to find its relocations we currently iterate over all relocation sections, which is O(n). And since the objdump example does this for every section, we get O(n^2) runtime.
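
One possible way to avoid the repeated scan, sketched under the assumption that an index can be built once per file: ELF relocation sections (`SHT_REL`, `SHT_RELA`) store the index of the section they apply to in `sh_info`, so a single pass can build a map from target section to its relocation sections. The `SectionHeader` struct and `build_relocation_index` function are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical subset of an ELF section header.
struct SectionHeader {
    sh_type: u32,
    sh_info: u32,
}

const SHT_RELA: u32 = 4;
const SHT_REL: u32 = 9;

/// One O(n) pass: target section index -> indices of its relocation sections.
fn build_relocation_index(sections: &[SectionHeader]) -> HashMap<usize, Vec<usize>> {
    let mut index: HashMap<usize, Vec<usize>> = HashMap::new();
    for (i, section) in sections.iter().enumerate() {
        if section.sh_type == SHT_RELA || section.sh_type == SHT_REL {
            // For relocation sections, sh_info names the section being relocated.
            index.entry(section.sh_info as usize).or_default().push(i);
        }
    }
    index
}

fn main() {
    let sections = vec![
        SectionHeader { sh_type: 0, sh_info: 0 },        // null section
        SectionHeader { sh_type: 1, sh_info: 0 },        // e.g. .text
        SectionHeader { sh_type: SHT_RELA, sh_info: 1 }, // e.g. .rela.text
    ];
    let index = build_relocation_index(&sections);
    println!("relocation sections for section 1: {:?}", index.get(&1));
}
```

With such an index, looking up the relocations for each section is a hash lookup, so a tool that visits every section does O(n) work overall instead of O(n^2).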

Write support

The ability to write object files is useful for testing the write support in gimli. I do this by reading an existing file, converting the DWARF to gimli's write data structures, writing it out again, and making sure it still works as expected.

So far I have been using faerie, but its API is too high level and not really suitable for converting data from existing files.

My plan is to add a lower-level API in an object-write crate. I'm currently developing that separately, but will move it into this git repo at some stage. This may also prove useful for providing some level of test coverage for the object crate.

Add TLS standard section

.tdata for ELF and I think __DATA,__thread_data for Mach-O, but I don't know what the role of __DATA,__thread_vars is yet. Also needs an associated SectionKind.

mach-o unimplemented symbol

In the same setup as #149 , Cython modules in gevent fail to be parsed on mach-o with the error

unimplemented symbol Symbol { name: [95, 95, 95, 112, 121, 120, 95, 109, 111, 100, 117, 108, 101, 95, 105, 115, 95, 109, 97, 105, 110, 95, 103, 101, 118, 101, 110, 116, 95, 95, 95, 113, 117, 101, 117, 101], value: 0, size: 0, kind: Unknown, scope: Dynamic, weak: false, section: Some(SectionId(2)) }"'

That name is ___pyx_module_is_main_gevent___queue.
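
The decoding can be checked with a few lines of Rust. The byte values are copied from the error message above; `symbol_name` is just an illustrative helper.

```rust
/// Name bytes copied from the `unimplemented symbol` error message.
const NAME: &[u8] = &[
    95, 95, 95, 112, 121, 120, 95, 109, 111, 100, 117, 108, 101, 95, 105, 115, 95, 109,
    97, 105, 110, 95, 103, 101, 118, 101, 110, 116, 95, 95, 95, 113, 117, 101, 117, 101,
];

/// Render symbol name bytes as text, replacing any invalid UTF-8.
fn symbol_name(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    println!("{}", symbol_name(NAME));
}
```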

This symbol comes from Cython generated wrapper code

cython$ git grep is_main
Cython/Compiler/ModuleNode.py:        module_is_main = "%s%s" % (Naming.module_is_main, self.full_module_name.replace('.', '__'))
Cython/Compiler/ModuleNode.py:        code.putln("extern int %s;" % module_is_main)
Cython/Compiler/ModuleNode.py:        code.putln("int %s = 0;" % module_is_main)
Cython/Compiler/ModuleNode.py:        code.putln("if (%s%s) {" % (Naming.module_is_main, self.full_module_name.replace('.', '__')))
Cython/Compiler/ModuleNode.py:        module_is_main = "%s%s" % (Naming.module_is_main, self.full_module_name.replace('.', '__'))
Cython/Compiler/ModuleNode.py:                module_is_main=module_is_main,
Cython/Compiler/Naming.py:module_is_main   = pyrex_prefix + "module_is_main_"
Cython/Utility/Embed.c:      %(module_is_main)s = 1;
bin/cython_freeze:    print("\nextern int __pyx_module_is_main_%s;" % modules[0])
bin/cython_freeze:    __pyx_module_is_main_%(main)s = 1;

The command to compile the offending .o is

$ clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/tmp9f1_qys4/tools/deps/include -I/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/tmp9f1_qys4/tools/deps/include/ncurses -I/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/tmp9f1_qys4/tools/deps/lib/libffi-3.2.1/include -I/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/tmp9f1_qys4/tools/deps/include/uuid -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -F/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks -Werror=unguarded-availability-new -I/usr/local/opt/[email protected]/include -U__llvm__ -I/Users/runner/pyapp/build/target/x86_64-apple-darwin/debug/pyoxidizer/hacked_base/include/python3.7m -I/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pyoxidizer-temp-venv.2zNoVZYSf6Py/venv/include/site/python3.7 -Ideps -I/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pyoxidizer-temp-venv.2zNoVZYSf6Py/venv/include -I/Users/runner/pyapp/build/target/x86_64-apple-darwin/debug/pyoxidizer/hacked_base/include/python3.7m -I/install/include/python3.7m -c src/gevent/queue.c -o build/temp.macosx-10.9-x86_64-3.7/src/gevent/queue.o

Removing -g, -O3 and -fPIC didn't seem to make any difference.
Using gcc instead of clang also didn't fix it.

nm reports it as

2019-11-27T11:55:33.9006300Z 000000000005fd30 S ___pyx_module_is_main_gevent___queue

nm manpage says this is

The symbol is in an uninitialized or zero-initialized data section for small objects.

An initial attempt at using the options -mno-extern-sdata, -mno-local-sdata and -mno-embedded-sdata didn't work: they appeared in unused-argument warnings. It sounds like that might work, so I'll re-attempt that as a temporary workaround.

MSVC debug unimplemented relocation kind: SectionIndex

I've got a bit further with indygreg/PyOxidizer#183 by switching to the debug compiler flags '/nologo', '/Od', '/MDd', '/Zi', '/W3', '/D_DEBUG', but then I encountered the following in out_object.write() on the MSVC objects. This isn't a critical problem because these debug flags are rarely used.

thread 'main' panicked at 'called Result::unwrap() on an Err value: "unimplemented relocation Relocation { offset: 252, size: 16, kind: SectionIndex, encoding: Generic, symbol: SymbolId(164), addend: 0 }

It is a bit surprising that the backtrace doesn't show any object lines, but I guess that is because it is catching goblin errors and returning strings.

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "unimplemented relocation Relocation { offset: 252, size: 16, kind: SectionIndex, encoding: Generic, symbol: SymbolId(164), addend: 0 }"', src\libcore\result.rs:1084:5
stack backtrace:
   0: backtrace::backtrace::trace_unsynchronized
             at C:\Users\VssAdministrator\.cargo\registry\src\github.com-1ecc6299db9ec823\backtrace-0.3.34\src\backtrace\mod.rs:66
  21: std::rt::lang_start_internal::{{closure}}
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libstd\rt.rs:49
  22: std::panicking::try::do_call<closure-0,i32>
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libstd\panicking.rs:296
  23: panic_unwind::__rust_maybe_catch_panic
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libpanic_unwind\lib.rs:80
  24: std::panicking::try
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libstd\panicking.rs:275
  25: std::panic::catch_unwind
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libstd\panic.rs:394
  26: std::rt::lang_start_internal
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\/src\libstd\rt.rs:48
  27: std::rt::lang_start<()>
             at /rustc/625451e376bb2e5283fc4741caa0a3e8a2ca4d54\src\libstd\rt.rs:64
  28: main
  29: invoke_main
             at d:\agent\_work\2\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  30: __scrt_common_main_seh
             at d:\agent\_work\2\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  31: BaseThreadInitThunk
  32: RtlUserThreadStart

objcopy --redefine-sym

indygreg/PyOxidizer#183 contains a lot of boilerplate code to write an object file, substantially copied from https://github.com/gimli-rs/object/blob/master/examples/objcopy.rs .

It would be nice if there was a simple API to achieve the same, with high-level transform operations selected and then the reconstruction iteration process left for object & friends to perform.

In my case the transform I need is renaming of one sym from PyInit_* to PyInit_foo_baz. This is the same as objcopy --redefine-sym old=new, and it is a bit similar to the Mangling use-case.
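
The core of such a transform is small. The sketch below shows only the renaming step, assuming a hypothetical `redefine_sym` helper that would be applied to each symbol name while the output object is rebuilt; it is not an existing API of the object crate, and the symbol names are placeholders.

```rust
use std::collections::HashMap;

/// Hypothetical equivalent of `objcopy --redefine-sym old=new` for one name:
/// return the replacement if one is registered, otherwise the original name.
fn redefine_sym<'a>(name: &'a str, renames: &'a HashMap<String, String>) -> &'a str {
    renames.get(name).map(String::as_str).unwrap_or(name)
}

fn main() {
    let mut renames = HashMap::new();
    renames.insert("PyInit_baz".to_string(), "PyInit_foo_baz".to_string());

    for name in ["PyInit_baz", "PyObject_Call"] {
        println!("{} -> {}", name, redefine_sym(name, &renames));
    }
}
```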

Mach-O: symbol size is always 0

Hi!

And thanks a ton for this amazing crate.

Not sure if I did something wrong, if this is intentional, if this is a bug or simply something that hasn't been implemented, but when loading a Mach-O library, symbols have a correct address but their size is always 0.

This happens no matter what the symbol kind is. With an ELF library, the correct size is returned and symbol_data() works as expected.

Add archive support

Add an iterator over objects in an archive. It would be nice if this same iterator also worked for Mach-O fat binaries, and also returned a single object for files that aren't archives.

Look into how mingw produces DWARF-in-PE

The mingw toolchain produces PE binaries with DWARF in them. I'm not 100% sure how it works, but it should be possible to support with gimli+object. This is a follow-on from #39 .

objcopy should preserve more format-specific attributes

For example, for Mach-O it should preserve file header flags (MH_SUBSECTIONS_VIA_SYMBOLS) and section attributes (LiveSupport (0x80000), NoTOC (0x400000), StripStaticSyms (0x200000) for __eh_frame).

To do this, we first need somewhere in the read API that exposes these, and somewhere in the write API that they can be set. These will necessarily use format-specific values.

Implement symbols

It should be possible to expose a decent cross object symbol api.

Implement relocations

If there is a .rela.debug_info section, then those relocations need to be applied before gimli can parse .debug_info. See atefail/ig_server from the libdwarf regression tests for an example.

coff write causes multiply defined __real@3ff0000000000000

In the same setup as #149, the Cythonised module gevent._queue on Windows parses and writes without error, but then I encounter a linker warning when linking it in with libpythonXY.a

"C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Enterprise\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\lib.exe" "-out:C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\pyoxidizer\\libpythonXY.a" "-nologo" "@C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\pyoxidizer\\libpythonXY.a.args"
...
gevent._queue.0.o : warning LNK4006: __real@3ff0000000000000 already defined in gevent.libuv._corecffi.29.o; second definition ignored

And then an error when generating the final binary

C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Enterprise\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\link.exe" "/NOLOGO" "/NXCOMPAT" "/LIBPATH:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.2viwg4n27f4k3ugl.rcgu.o" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.39xgc2919xpkdbm2.rcgu.o" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.47qx6sds81fagm4s.rcgu.o" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.4tpwh6odkuci599r.rcgu.o" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.4w5ugkvi6xt6vtkj.rcgu.o" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.9zz25lg0q0qieuh.rcgu.o" "/OUT:C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.exe" "C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\pyapp.4tawtyk1l821zf45.rcgu.o" "/OPT:REF,NOICF" "/DEBUG" "/NATVIS:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\etc\\intrinsic.natvis" "/NATVIS:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\etc\\liballoc.natvis" "/NATVIS:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\etc\\libcore.natvis" "/NATVIS:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\etc\\libstd.natvis" "/LIBPATH:C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps" "/LIBPATH:C:/Users/VssAdministrator/pyapp\\build\\target\\debug\\deps" "/LIBPATH:C:/Users/VssAdministrator/pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\pyoxidizer" 
"/LIBPATH:C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libpyembed-4b48f565d7a5da92.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libuuid-daedd81f20ecf0e1.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand-d1e04b03716ec118.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_xorshift-ea2a037dc87488be.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_pcg-b02ce1ff882f9fe6.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_hc-acc7902f7cbc2f51.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_chacha-2281fa3a7851e606.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_isaac-ecc623c48d28a16d.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_core-c14dd5f160e54f1c.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_os-edfe7944a02c998d.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_jitter-1664950d1b77a195.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libwinapi-54541c97a078f74e.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\librand_core-c4447b557c2a0676.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\liblazy_static-9ea7c902478200db.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libcpython-e576a5f71afb40bd.rlib" 
"C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libnum_traits-c0b182d52757fad1.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libbyteorder-0cbdb9b2f37557d2.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\libpython3_sys-91dda59b980cbd2f.rlib" "C:\\Users\\VssAdministrator\\pyapp\\build\\target\\x86_64-pc-windows-msvc\\debug\\deps\\liblibc-516b44f1ea4d5be3.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libstd-d2a75bc74b11e2cc.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libpanic_unwind-915bfcc230c4eda6.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libhashbrown-2d959e942a9e3e5c.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_std_workspace_alloc-d889c9534777e0f9.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libbacktrace-c0632338fc51212a.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_demangle-c23d6f351d09fb94.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libunwind-2098d630f29dcc07.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcfg_if-68cf5df90ef019f9.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liblibc-744c9ad3b176c572.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liballoc-f60261e32bf289cf.rlib" 
"C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\librustc_std_workspace_core-2ef81bd49a7650fd.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcore-5779894fffb6f902.rlib" "C:\\Rust\\.rustup\\toolchains\\nightly-x86_64-pc-windows-msvc\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcompiler_builtins-7b6566d5f1691e05.rlib" "Ole32.lib" "OleAut32.lib" "User32.lib" "cabinet.lib" "msi.lib" "msvcrt.lib" "rpcrt4.lib" "shlwapi.lib" "version.lib" "winmm.lib" "advapi32.lib" "crypt32.lib" "gdi32.lib" "iphlpapi.lib" "libcrypto.lib" "libssl.lib" "psapi.lib" "shell32.lib" "user32.lib" "userenv.lib" "ws2_32.lib" "advapi32.lib" "credui.lib" "kernel32.lib" "secur32.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" 
"pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "pythonXY.lib" "advapi32.lib" "ws2_32.lib" "userenv.lib" "msvcrt.lib"
...
          libpyembed-4b48f565d7a5da92.rlib(gevent._queue.0.o) : error LNK2005: __real@3ff0000000000000 already defined in libpyembed-4b48f565d7a5da92.rlib(longobject.obj)

where longobject.obj is from Cython

Inside object the symbol prints as

Symbol { name: Some("__real@3ff0000000000000"), address: 0, size: 0, kind: Data, section_index: Some(SectionIndex(8)), undefined: false, weak: false, scope: Linkage }

sections according to objdump

build\temp.win-amd64-3.7\Release\src/gevent/queue.obj:     file format pe-x86-64

    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .drectve      00000060  0000000000000000  0000000000000000  000001a4  2**0
                      CONTENTS, READONLY, DEBUGGING, EXCLUDE, NOREAD
      1 .debug$S      000000c8  0000000000000000  0000000000000000  00000204  2**0
                      CONTENTS, READONLY, DEBUGGING
      2 .text$mn      00025783  0000000000000000  0000000000000000  000002cc  2**4
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      3 .data         00007c71  0000000000000000  0000000000000000  00038ed3  2**4
                      CONTENTS, ALLOC, LOAD, RELOC, DATA
      4 .bss          00001110  0000000000000000  0000000000000000  00000000  2**4
                      ALLOC
      5 .rdata        00000dea  0000000000000000  0000000000000000  000427dc  2**4
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      6 .xdata        00001710  0000000000000000  0000000000000000  000435c6  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      7 .pdata        00001098  0000000000000000  0000000000000000  00045b5e  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      8 .rdata        00000008  0000000000000000  0000000000000000  00049572  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA, LINK_ONCE_DISCARD (COMDAT __real@3ff0000000000000 2385)
      9 .chks64       00000050  0000000000000000  0000000000000000  0004957a  2**2
                      CONTENTS, READONLY, DEBUGGING, EXCLUDE, NOREAD

From what I read, these __real@.. symbols are supposed to be weak, and the LINK_ONCE_DISCARD flag in the section header suggests the same thing.

objdump -t ..

...
    [2382](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __ImageBase
    [2383](sec  9)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .rdata
    AUX scnlen 0x8 nreloc 0 nlnno 0 checksum 0xa2dacc80 assoc 0 comdat 2
    [2385](sec  9)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __real@3ff0000000000000
    [2386](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __security_cookie
    [2387](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _fltused
    [2388](sec 10)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .chks64
    AUX scnlen 0x50 nreloc 0 nlnno 0
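
For reference, the `LINK_ONCE_DISCARD` flag and the `comdat 2` field shown by objdump correspond to two PE/COFF fields: the `IMAGE_SCN_LNK_COMDAT` bit in the section characteristics and the selection byte in the section's auxiliary symbol record. Below is a minimal sketch of decoding them. The constant values come from the PE/COFF specification, while the enum and function names are hypothetical and are not part of the object crate.

```rust
/// Section characteristic bit marking a COFF section as COMDAT
/// (IMAGE_SCN_LNK_COMDAT in the PE/COFF specification).
pub const IMAGE_SCN_LNK_COMDAT: u32 = 0x0000_1000;

/// COMDAT selection values stored in the section's auxiliary symbol record.
/// The type and variant names are illustrative, not the object crate's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComdatSelection {
    NoDuplicates, // 1: duplicate definitions are a link error
    Any,          // 2: the linker may keep any one definition
    SameSize,     // 3: duplicates must have the same size
    ExactMatch,   // 4: duplicates must match exactly
    Associative,  // 5: follows another (associated) section
    Largest,      // 6: the linker keeps the largest definition
}

/// Decode the COMDAT selection of a section.
/// Returns `None` if the section is not COMDAT or the value is unknown.
pub fn comdat_selection(characteristics: u32, selection: u8) -> Option<ComdatSelection> {
    if characteristics & IMAGE_SCN_LNK_COMDAT == 0 {
        return None;
    }
    match selection {
        1 => Some(ComdatSelection::NoDuplicates),
        2 => Some(ComdatSelection::Any),
        3 => Some(ComdatSelection::SameSize),
        4 => Some(ComdatSelection::ExactMatch),
        5 => Some(ComdatSelection::Associative),
        6 => Some(ComdatSelection::Largest),
        _ => None,
    }
}
```

With the values above (a COMDAT `.rdata` section and `comdat 2`), this would decode to `ComdatSelection::Any`, meaning the linker is expected to keep one copy and discard the rest rather than report a duplicate definition.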

FWIW, these three problems with "gevent" are likely to be the main batch of PyOxidizer-related problems for quite a while. PyOxidizer only needs to rewrite symbols where two Python packages have used the same name for their module, and my analysis of ~2000 packages indicates this is fairly rare, especially when considering which packages PyOxidizer is likely to support in the near future; the problem also only arises if a PyOxidizer user wants to use both packages in the same project. I've tested object rewriting on many of those conflicts without problems, except for Cython-generated modules like gevent. gevent is especially problematic because it uses the same name as a Python core standard library module, and it is a dependency of so many other packages. In many cases, the Python package authors would be happy to rename their DSO to avoid the conflict, as its name is quite irrelevant to their Python userbase. This is why I believe pressing ahead with PyOxidizer using object is worthwhile: after this initial batch, there will be a manageable trickle of 'bugs' in object, and in most cases the problem can be avoided by a Python package owner renaming their DSO.
(And PyOxidizer users also have the option of manually renaming the symbols, and I can add support for using objcopy, but those should be fallbacks.)

How to replace Symbol::section_kind

Symbol::section_kind was removed without any transition notes. How should this code be converted to work with a newer version?

if symbol.section_kind() != Some(object::SectionKind::Text) {
    continue;
}
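
One possible approach, offered as a hedged sketch rather than an official migration path, is to take the symbol's section index and look the kind up in the file's section table. As far as I know, in newer object releases this corresponds to `symbol.section().index()`, `file.section_by_index(..)` and `section.kind()`, but please check the crate documentation for your version. The code below models the same lookup with stand-in types so that it is self-contained; every type and function name in it is hypothetical.

```rust
// Stand-in types for illustration only; they are NOT the object crate's types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionKind {
    Text,
    Data,
    ReadOnlyData,
    Other,
}

pub struct Section {
    pub name: &'static str,
    pub kind: SectionKind,
}

pub struct Symbol {
    pub name: &'static str,
    /// Index into the section table, or `None` for undefined/absolute symbols.
    pub section_index: Option<usize>,
}

/// Resolve the kind of the section a symbol is defined in by going
/// through the section table instead of asking the symbol directly.
pub fn symbol_section_kind(sections: &[Section], symbol: &Symbol) -> Option<SectionKind> {
    let index = symbol.section_index?;
    sections.get(index).map(|section| section.kind)
}

/// Keep only the symbols defined in text sections, mirroring the
/// `continue` in the original loop.
pub fn text_symbols<'a>(sections: &[Section], symbols: &'a [Symbol]) -> Vec<&'a Symbol> {
    symbols
        .iter()
        .filter(|symbol| symbol_section_kind(sections, symbol) == Some(SectionKind::Text))
        .collect()
}
```

The design point is that the section kind becomes a property of the section rather than of the symbol, so the filter needs access to the file (here, the section slice) as well as the symbol.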

Allow iterating over sections

Currently you need to know the name of the section you want. It would be useful to be able to iterate over all sections and get their names and flags as well.
