fus's Introduction

Purpose

This tool attempts to produce a viable "depsolved" collection of packages.

Unlike earlier tools of this type, it understands the concept of modules and module streams and can incorporate them into the dependency solving algorithm.

It takes as input a list of repositories and a list of artifacts of various types that must be in the output package set. The full list of input types is in "Usage" below, but it includes individual RPMs, comps style groups and categories, and, as mentioned above, modules, either whole or as part of a stream.

It then attempts to output a package set that includes all required artifacts as well as all additional packages needed to allow the required artifacts to be installed.

NOTE: This does not mean that the entire output package set can be installed at once. Many module use cases prevent this, as do some existing basic RPM use cases.

What it does mean is that for each output artifact, there is the potential to resolve the RPM and module-level dependencies to allow it to install.

Known limitations

  • Shared RPM artifacts are not supported

Building

$ meson builddir
$ ninja -C builddir

Usage

$ ./builddir/main --repo NAME,TYPE,PATH --debug ITEM

The --repo option can be given multiple times. NAME is just a label for the repository; TYPE is either lookaside or any other value (the examples in the issues below use regular).
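As an illustration of the NAME,TYPE,PATH format, here is a minimal Python sketch of how a caller might split such an argument before passing it on. The names RepoSpec, parse_repo_spec and is_lookaside are hypothetical and not part of fus, and splitting on only the first two commas (so that PATH may itself contain commas) is an assumption, not documented behaviour.

```python
from typing import NamedTuple


class RepoSpec(NamedTuple):
    """One --repo argument, split into its three fields (hypothetical helper)."""
    name: str
    kind: str  # the TYPE field; called "kind" to avoid shadowing the builtin
    path: str


def parse_repo_spec(spec: str) -> RepoSpec:
    """Split a NAME,TYPE,PATH string into its three fields."""
    # Assumption: split at most twice, so commas inside PATH are preserved.
    parts = spec.split(",", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected NAME,TYPE,PATH, got {spec!r}")
    return RepoSpec(*parts)


def is_lookaside(repo: RepoSpec) -> bool:
    """Only the literal type 'lookaside' is special; anything else is not."""
    return repo.kind == "lookaside"
```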

Each ITEM to include in the input takes one of the following forms:

  • module(NAME) or module(NAME:STREAM) for modules
  • group:foo or category:bar for comps input
  • a plain package name, or a glob matched against package names
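The three item forms above can be told apart mechanically. The following Python sketch shows one way a wrapper script might classify an item before building a command line; classify_item is a hypothetical name, and the real parsing inside fus may differ in details such as which characters are allowed in names.

```python
import re

# module(NAME) or module(NAME:STREAM); the stream part is optional.
_MODULE_RE = re.compile(r"^module\(([^:()]+)(?::([^()]+))?\)$")


def classify_item(item: str):
    """Return a (kind, name, stream) tuple for one input item."""
    match = _MODULE_RE.match(item)
    if match:
        # stream is None when only module(NAME) was given
        return ("module", match.group(1), match.group(2))
    for prefix in ("group", "category"):
        if item.startswith(prefix + ":"):
            return (prefix, item[len(prefix) + 1:], None)
    # Anything else is a package name or a glob over package names.
    return ("package", item, None)
```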

Testing

To run all the available tests:

$ ninja -C builddir test

For more control over which tests are run, use gtester:

$ G_TEST_SRCDIR=$PWD/tests gtester builddir/tests

fus's People

Contributors

dcantrell, ignatenkobrain, imcleod, lubomir, mmathesius, r4f4

fus's Issues

Segmentation fault in parsing repo XML due to missing error handling around solv_xfopen

In my case, the reproducer is insufficient permissions while reading repo data:
(pulp) [vagrant@pulp2 builddir]$ gdb --args ./fus --repo modulerr,regular,/var/lib/pulp/published/yum/master/yum_distributor/modulerr/1538566893.09/ walrus

# ---->%-----------
Reading symbols from ./fus...done.
(gdb)  run
Starting program: /home/vagrant/devel/fus/builddir/fus --repo modulerr,regular,/var/lib/pulp/published/yum/master/yum_distributor/modulerr/1538566893.09/ walrus
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6df1868 in fread () at iofread.c:35
35        if (bytes_requested == 0)
Missing separate debuginfos, use: dnf debuginfo-install elfutils-libelf-0.173-1.fc28.x86_64 glib2-2.56.1-4.fc28.x86_64
(gdb) where
#0  0x00007ffff6df1868 in fread () at iofread.c:35
#1  0x00007ffff73d6605 in fread (__stream=0x0, __n=8192, __size=1, __ptr=0x7fffffffb970) at /usr/include/bits/stdio2.h:294
#2  solv_xmlparser_parse (xmlp=xmlp@entry=0x7fffffffd9e0, fp=fp@entry=0x0) at ../ext/solv_xmlparser.c:303
#3  0x00007ffff73c454e in repo_add_repomdxml (repo=<optimized out>, fp=0x0, flags=0) at ../ext/repo_repomdxml.c:355
#4  0x00000000004048a6 in create_repo (pool=0x61b9b0, name=0x63ca70 "modulerr", path=0x63cab0 "/var/lib/pulp/published/yum/master/yum_distributor/modulerr/1538566893.09/") at ../fus.c:416
#5  0x00000000004063da in fus_depsolve (arch=0x7fffffffdd34 "x86_64", platform=0x0, exclude_packages=0x0, repos=0x61b8b0, solvables=0x61b920, error=0x7fffffffddc8) at ../fus.c:985
#6  0x000000000040691d in main (argc=1, argv=0x7fffffffdec8) at ../main.c:69
(gdb) 

All works as expected once sufficient permissions are given to the program:

(pulp) [vagrant@pulp2 builddir]$ sudo -u apache ./fus --repo modulerr,regular,/var/lib/pulp/published/yum/master/yum_distributor/modulerr/1538566893.09/ walrus
*walrus-5.21-1.noarch@modulerr
module:walrus:5.21:20180704144203:deadbeef.x86_64@modulerr
(null)
(pulp) [vagrant@pulp2 builddir]$ 

Judging from the stack, the offending line is the call to solv_xfopen(), whose return value is not checked. In my case it returns NULL (0x0), which is then passed down to fread() and triggers the SIGSEGV seen in the backtrace.
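Until the return value is checked inside fus itself, a caller can guard against this crash by verifying the repository before running the tool. The Python sketch below is only a caller-side workaround, not the fix for the bug; check_repo_readable is a hypothetical helper, and it assumes the usual repodata/repomd.xml layout under the repo path.

```python
import os


def check_repo_readable(path: str) -> str:
    """Return the path to repomd.xml, or raise if it cannot be read."""
    # Assumption: metadata lives in <path>/repodata/repomd.xml.
    # os.path.join copes with or without a trailing slash on path.
    repomd = os.path.join(path, "repodata", "repomd.xml")
    if not os.path.isfile(repomd):
        raise FileNotFoundError(f"no repomd.xml found at {repomd}")
    if not os.access(repomd, os.R_OK):
        raise PermissionError(f"cannot read {repomd}")
    return repomd
```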

Lookaside is not handled correctly

If the same RPM (with the same NEVRA) is present in both a regular repo and a lookaside repo, it should never appear in the output.

However, that is not the case. If the package is explicitly listed as input, it is taken from whichever repo was given first. In that case the lookaside has to be listed first.

If the package is pulled in as a dependency of another package, I'm not sure how it is selected.

A test case for the first situation follows. The order of repos has to be reversed in tests.c for the test to fail (add normal repo first, and then add the lookaside).

$ cat input 
pkg
$ cat expected 
$ cat packages.repo 
=Ver: 2.0

=Pkg: pkg 1 1 noarch
$ cat lookaside.repo 
=Ver: 2.0

=Pkg: pkg 1 1 noarch

For the second:

$ cat input 
foo
$ cat expected 
foo-1-1.x86_64@repo
$ cat packages.repo 
=Ver: 2.0

=Pkg: foo 1 1 noarch
=Req: pkg

=Pkg: pkg 1 1 noarch
$ cat lookaside.repo 
=Ver: 2.0

=Pkg: pkg 1 1 noarch
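The behaviour expected by both test cases in this issue can be stated as a simple filter over NEVRA strings. This Python sketch only illustrates the expected result, not how fus selects packages internally; drop_lookaside is a hypothetical name.

```python
def drop_lookaside(resolved, lookaside_nevras):
    """Keep only packages whose NEVRA is not available in a lookaside repo."""
    lookaside = set(lookaside_nevras)
    # Order of the resolved list is preserved; repo order plays no role.
    return [nevra for nevra in resolved if nevra not in lookaside]
```

With the first test case, a resolved set containing only pkg yields an empty list; with the second, only foo remains.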

Multilib for modules

Quoting the document:

Modules do not provide multilib by default. Components that are expected to be available as multilib are marked as such in the modulemd components section, which is SRPM based. All produced binary multilib RPMs of the marked components are expected to be included in the module unless filtered out.

This might work now. I need to test and will close this if there is no change required.

Option to exclude particular packages

We need to be able to tell the code to exclude particular packages. It should support globs against package names.

The behaviour is that if a package name matches the glob, it should be completely invisible, as if it was not in the repo at all.
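The requested semantics amount to filtering the set of package names before any solving happens. A minimal Python sketch, assuming shell-style globs as implemented by the standard fnmatch module; visible_packages is a hypothetical name and nothing here reflects an existing fus option.

```python
from fnmatch import fnmatchcase


def visible_packages(names, exclude_globs):
    """Return the names that match none of the exclusion globs."""
    return [
        name
        for name in names
        if not any(fnmatchcase(name, glob) for glob in exclude_globs)
    ]
```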

Enable non-default stream

We have a product that has a non-modular package depending on a non-default stream. The instructions for the product tell users they need to manually enable the module stream before installing the package.

When running fus, this information is not available, so dependency problems are reported.

Example situation:
repo-1: foo package Requires: bar
lookaside-1: bar:non-default module with bar package

It would be nice to have some way to tell fus that bar:non-default will always be enabled.

CC @contyk for more ideas

How are packages returned in output?

What is the output format for packages?
My current assumption is that it's N-E:V-R.A with epoch included only if it's not zero.
Can you confirm please?
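To make the assumption in this question concrete, here it is written out as a Python sketch. It encodes only the format the question proposes, which is not confirmed anywhere in this text; format_nevra is a hypothetical name, and the sketch ignores the leading * and trailing @repo that appear in sample output elsewhere on this page.

```python
def format_nevra(name, epoch, version, release, arch):
    """Build N-E:V-R.A, leaving the epoch out when it is zero."""
    evr = f"{version}-{release}"
    if epoch:  # 0 (or None) means no epoch is printed
        evr = f"{epoch}:{evr}"
    return f"{name}-{evr}.{arch}"
```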

Remove the PID from logs

The logs contain the process ID on lines with debug, warning or error severity. On info lines it is missing.

It would be nice to remove that. It's not really useful and makes diffing the logs a lot harder and less convenient.

Enhance debug output

Can the debug output be enhanced to show why packages are pulled in? Right now I can't find it there.

Having one reason for pulling each package in would be very helpful. It's not complete information, but that's not a problem.

Allow continuing when dependencies are broken

The current behaviour when a package has unsatisfiable dependencies is to print an error and exit. This is fine as a default, but there should be an option to continue even in such a situation and try to resolve as many dependencies as possible.

Without this it would be impossible to use this code in e.g. Fedora Rawhide. There are always some broken dependencies and aborting on them would mean there would never be a successful compose.

Repo argument requires trailing slash

fus segfaults if the repo path given in the argument does not end with a /.

==19415== Invalid read of size 4
==19415==    at 0x5B1572A: fread (in /usr/lib64/libc-2.26.so)
==19415==    by 0x5403A8C: ??? (in /usr/lib64/libsolvext.so.0)
==19415==    by 0x53F2109: repo_add_repomdxml (in /usr/lib64/libsolvext.so.0)
==19415==    by 0x403E9C: create_repo (main.c:342)
==19415==    by 0x4049A8: main (main.c:583)
==19415==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Cleanup ~/.cache/fus metadata before fus exits.

Hi,

I have found out that fus stores cached metadata in ~/.cache/fus. This is quite OK, but it should try to remove the metadata once it has finished using it. I understand that after a crash or some other unpredictable issue the metadata can stay there, but in general it should be removed.

For now I have had to add the following cron job on the composer machine we use to clean that directory:

find ~/.cache/fus -type f -mtime +1 -exec rm {} \;

Add some tests

It would be good to have some tests (for functionality and regression) to avoid bugs like #31. Currently I have a test developed by @lubomir for #34. Other tests are needed though, e.g. to test module alternatives.
I'm thinking about creating package specs and a modules.yaml and using them to generate a repo. Then we'd run fus on this repo, testing the different resolution paths. Another idea is to have a repo per test case.
@ignatenkobrain do you have any input/preferences about this?

Pulling in bare RPM when module contains an older one

Available in the repo:

  • module minput, contains input-package which requires foo; module has requires on module mfoo
  • module mfoo contains foo-1.0-1
  • there's a bare rpm foo-2.0-1

When I ask to install module minput, I get both versions of foo in the output. Is this correct?

CC @contyk

Attached is a data set that can reproduce that:
repr.tar.gz

Indicate source of package in output

The list of packages in the output should include either the name of the repo the package comes from, or a path (or both).

Ultimately Pungi needs to know the path to the package, but can find it in the repo itself.

Segfault on current master

$ git show
commit 9cb602f2bdb505644f414f92d940c60626a00e84 (HEAD -> master, origin/master, origin/HEAD)
$ meson . builddir
$ ninja -C builddir
$ valgrind ./builddir/fus --platform el8 foo
==12222== Memcheck, a memory error detector
==12222== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12222== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==12222== Command: ./builddir/fus --platform el8 foo
==12222== 
==12222== Invalid read of size 4
==12222==    at 0x404ED9: pool_whatprovides (pool.h:334)
==12222==    by 0x404F72: pool_whatprovides_ptr (pool.h:348)
==12222==    by 0x405C3F: _repo_add_modulemd_defaults (repo.c:305)
==12222==    by 0x40717E: add_platform_module (repo.c:824)
==12222==    by 0x4071E0: create_system_repo (repo.c:832)
==12222==    by 0x4096A9: fus_depsolve (fus.c:699)
==12222==    by 0x409DD0: main (main.c:69)
==12222==  Address 0x334 is not stack'd, malloc'd or (recently) free'd
==12222== 
==12222== 
==12222== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==12222==  Access not within mapped region at address 0x334
==12222==    at 0x404ED9: pool_whatprovides (pool.h:334)
==12222==    by 0x404F72: pool_whatprovides_ptr (pool.h:348)
==12222==    by 0x405C3F: _repo_add_modulemd_defaults (repo.c:305)
==12222==    by 0x40717E: add_platform_module (repo.c:824)
==12222==    by 0x4071E0: create_system_repo (repo.c:832)
==12222==    by 0x4096A9: fus_depsolve (fus.c:699)
==12222==    by 0x409DD0: main (main.c:69)
==12222==  If you believe this happened as a result of a stack
==12222==  overflow in your program's main thread (unlikely but
==12222==  possible), you can try to increase the size of the
==12222==  main thread stack using the --main-stacksize= flag.
==12222==  The main thread stack size used in this run was 8388608.
==12222== 
==12222== HEAP SUMMARY:
==12222==     in use at exit: 251,587 bytes in 843 blocks
==12222==   total heap usage: 1,163 allocs, 320 frees, 349,526 bytes allocated
==12222== 
==12222== LEAK SUMMARY:
==12222==    definitely lost: 0 bytes in 0 blocks
==12222==    indirectly lost: 0 bytes in 0 blocks
==12222==      possibly lost: 1,544 bytes in 20 blocks
==12222==    still reachable: 242,123 bytes in 764 blocks
==12222==                       of which reachable via heuristic:
==12222==                         length64           : 384 bytes in 9 blocks
==12222==                         newarray           : 1,600 bytes in 20 blocks
==12222==         suppressed: 0 bytes in 0 blocks
==12222== Rerun with --leak-check=full to see details of leaked memory
==12222== 
==12222== For lists of detected and suppressed errors, rerun with: -s
==12222== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

Read solvables from file

It should be possible to read solvables from a file instead of just command line arguments. The list of arguments can get rather long, and running a command line that uses 8000 arguments and has 160kB in total is just not good.

Include modular packages with broken deps in the output

If the package is listed explicitly as input, it will be included even if it has unresolvable dependencies.

If a modular package has broken deps, it will not be listed. However since it's basically also an input solvable, it would be nice to list it in the output. The NVRA is known, so it should be possible.

Optional packages in comps group are not included

If a comps group includes a package marked as optional, it will not be pulled into the result.

This is possibly wrong, since if the output is used to generate a repo, users of the repo will not have a chance to install the optional packages at all.

Fails to use repomd.xml from local file

With the patch to download repodata from HTTP repos, it seems like it's no longer possible to use local paths. I'm getting errors about failure to load repomd.xml saying "No such file or directory".

The repository exists and contains the file. I think download_to_path is asked to download a local file, so it logs a message that the file is already local and does nothing. The loading code then attempts to load the file from the cache, where it does not exist.

Help output

It would be nice to print usage information in some way.

This has very low priority.

Selfhosting trees

We need an option to follow build dependencies and include them in the result as well. This should be a simple command line option.

Add ability to consume repositories over HTTP

Currently the input repository has to be located on a local filesystem. It would be useful to allow consuming packages from repos served over HTTP. Fus should transparently download the files as needed.

Including modules with different contexts

We want to include two different contexts for a module with the same name, stream and version. That is currently not possible, since only name and stream are accepted as input.

Multilib for non-modular packages

No changes to the traditional multilib handling are expected. However, traditional RPM multilib handling needs to be limited to the non-modular part of the compose and shouldn’t affect modular RPMs.

We need to support multilib for non-modular packages. This should be hidden behind an option, so that it can be turned on only for x86_64 and not elsewhere. Also it will need to have a way to configure a whitelist and blacklist.

Current composes support a couple of methods for determining what should be multilib. In practice only the devel and runtime methods are used. The current implementation is in Python, so it is not directly usable.

Expected behaviour:

  • for each package added into the result set, if it matches the rules for multilib or is on the whitelist, and is not on the blacklist, the multilib version should be added
  • dependencies of the multilib package should be resolved too

The rules are horribly complex, and I don't think reimplementing them makes much sense. It would make sense to have a generic rule, and handle exceptions by using the whitelist (which needs to support globs).

The general rules would be

  • for devel method: any package whose name ends with -devel or -static should be multilib
  • for runtime method: if a package installs a file matching *.so.* into libdir, it should be multilib
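The general rules and the whitelist/blacklist handling described in this issue can be sketched in a few lines of Python. This is an illustration of the proposal only, not the existing compose implementation: all function names are hypothetical, the list of library directories is an assumption (the issue only says "libdir"), and files are assumed to count only when they sit directly in one of those directories.

```python
from fnmatch import fnmatchcase

# Assumed library directories; the issue does not enumerate them.
LIBDIRS = ("/usr/lib", "/usr/lib64", "/lib", "/lib64")


def _matches_any(name, globs):
    return any(fnmatchcase(name, glob) for glob in globs)


def is_devel_multilib(name):
    """devel method: the package name ends with -devel or -static."""
    return name.endswith(("-devel", "-static"))


def is_runtime_multilib(files):
    """runtime method: some file matching *.so.* is installed into a libdir."""
    for path in files:
        directory, _, basename = path.rpartition("/")
        if directory in LIBDIRS and fnmatchcase(basename, "*.so.*"):
            return True
    return False


def wants_multilib(name, files, method, whitelist=(), blacklist=()):
    """Decide whether the multilib variant of a package should be added."""
    if _matches_any(name, blacklist):
        return False  # the blacklist always wins
    if _matches_any(name, whitelist):
        return True
    if method == "devel":
        return is_devel_multilib(name)
    if method == "runtime":
        return is_runtime_multilib(files)
    raise ValueError(f"unknown multilib method: {method!r}")
```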

CC @contyk

Coding style - comments

Please have mercy on anybody who will help maintain the code. Without proper comments it will be painful to understand. This matters even more as long as there is no documentation, which as far as I know is the case. You have probably already planned to add comments later.
With the main() function split into parts, it would be much easier to read. But I won't be the one developing the code further, so I do not mind this much.
Anyway, thanks. This is probably going to be much faster than the current depsolving process.

Multiple instances of fus running at the same time use wrong cache

When two instances of fus are spawned at about the same time and use the same repo names for different repos, one of the processes can fail to download repo metadata.

As a workaround in Pungi, I'm setting XDG_CACHE_HOME to use a fresh cache for each sequence of fus runs that all use the same repos. This cache is deleted afterwards.
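A workaround of this kind can be sketched in Python as a context manager that hands out an environment with a private XDG_CACHE_HOME and deletes the directory afterwards. This is not the actual Pungi code; fresh_cache_env is a hypothetical name, and the command line in the trailing comment is a placeholder built from the usage text above.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def fresh_cache_env(base_env=None):
    """Yield an environment dict whose XDG_CACHE_HOME is a new temp dir."""
    env = dict(os.environ if base_env is None else base_env)
    with tempfile.TemporaryDirectory(prefix="fus-cache-") as cache_dir:
        env["XDG_CACHE_HOME"] = cache_dir
        yield env
    # The temporary directory, and any cache in it, is removed here.


# Possible use for a sequence of runs sharing the same repos:
#
#     import subprocess
#     with fresh_cache_env() as env:
#         subprocess.run(["fus", "--repo", "NAME,TYPE,PATH", "ITEM"],
#                        env=env, check=True)
```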

Occasional segfault

When running, the code randomly segfaults. It does not happen very often, but it's frequent enough to hinder real usage.

A traceback:

(gdb) bt
#0  _int_malloc (av=av@entry=0x7f16f5732760 <main_arena>, bytes=bytes@entry=513) at malloc.c:3415
#1  0x00007f16f53f066c in _int_realloc (av=av@entry=0x7f16f5732760 <main_arena>, oldp=oldp@entry=0x1980210, oldsize=oldsize@entry=400, nb=nb@entry=528) at malloc.c:4292
#2  0x00007f16f53f1e42 in __GI___libc_realloc (oldmem=0x1980220, bytes=bytes@entry=512) at malloc.c:3030
#3  0x00007f16f5c3919e in solv_realloc (old=<optimized out>, len=512) at /usr/src/debug/libsolv-0.6.30/src/util.c:52
#4  0x00007f16f5c39238 in solv_realloc2 (old=<optimized out>, num=<optimized out>, len=len@entry=4) at /usr/src/debug/libsolv-0.6.30/src/util.c:63
#5  0x00007f16f5c278b1 in queue_alloc_one (q=q@entry=0x7ffe160c8260) at /usr/src/debug/libsolv-0.6.30/src/queue.c:97
#6  0x00007f16f5c2856c in queue_push (id=4011, q=0x7ffe160c8260) at /usr/src/debug/libsolv-0.6.30/src/queue.h:78
#7  lookup_idarray_solvable (off=<optimized out>, q=q@entry=0x7ffe160c8260, repo=<optimized out>) at /usr/src/debug/libsolv-0.6.30/src/repo.c:1037
#8  0x00007f16f5c2a35a in repo_lookup_idarray (repo=<optimized out>, entry=<optimized out>, keyname=-171804495, keyname@entry=9, q=q@entry=0x7ffe160c8260) at /usr/src/debug/libsolv-0.6.30/src/repo.c:1066
#9  0x00007f16f5c2a4f8 in repo_lookup_deparray (repo=<optimized out>, entry=<optimized out>, keyname=keyname@entry=9, q=q@entry=0x7ffe160c8260, marker=marker@entry=0)
    at /usr/src/debug/libsolv-0.6.30/src/repo.c:1107
#10 0x00007f16f5c3eaa8 in solvable_lookup_deparray (s=<optimized out>, keyname=keyname@entry=9, q=q@entry=0x7ffe160c8260, marker=marker@entry=0) at /usr/src/debug/libsolv-0.6.30/src/solvable.c:99
#11 0x00007f16f5c2373d in pool_whatcontainsdep (pool=pool@entry=0x130fce0, keyname=keyname@entry=9, dep=-2147483647, q=q@entry=0x7ffe160c8410, marker=marker@entry=0)
    at /usr/src/debug/libsolv-0.6.30/src/pool.c:1476
#12 0x00000000004040a7 in main (argc=1, argv=0x7ffe160c8888) at main.c:895
(gdb) l
890                           Solvable *s = pool_id2solvable (pool, p);
891
892                           g_auto(Queue) q;
893                           queue_init (&q);
894                           Id dep = pool_rel2id (pool, s->name, s->arch, REL_ARCH, 1);
895                           pool_whatcontainsdep (pool, SOLVABLE_REQUIRES, dep, &q, 0);
896
897                           g_auto(Queue) j;
898                           queue_init (&j);
899                           for (int k = 0; k < q.count; k++)

Modules without context are ignored

We still have a few modules that were last built before contexts were introduced. When a repo contains such a module, the solver ignores it and does not include it at all.

This could probably be solved by rebuilding the modules so that they actually get some context. Filing this to have it tracked.
