repomd-parser's Introduction

RepomdParser

RPM repository metadata parser.

For tools that use RepomdParser, see the repo-tools repository.

This gem parses the repomd.xml, primary.xml and deltainfo.xml metadata files of an RPM repository. It gives access to the list of packages in the repository and to the details of each individual package (name, size, checksum, etc.).

Installation

  1. Add gem 'repomd_parser' line to your application's Gemfile;
  2. Execute bundle.

Alternatively, install it directly with gem install repomd_parser.

Usage

RepomdParser::RepomdXmlParser

Parses repomd.xml -- the main repository metadata file, which references other metadata files.

The parse and parse_file methods return an array of RepomdParser::Reference objects.

Using the parse method
File.open('repomd.xml') do |fh|
  metadata_files = RepomdParser::RepomdXmlParser.new.parse(fh)
  metadata_files.each do |metadata_file|
    printf "type: %10s, location: %s\n", metadata_file.type, metadata_file.location
  end
end
Using the parse_file method
metadata_files = RepomdParser::RepomdXmlParser.new.parse_file('repomd.xml')
metadata_files.each do |metadata_file|
  printf "type: %10s, location: %s\n", metadata_file.type, metadata_file.location
end
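
For orientation, the sketch below shows the kind of information a repomd.xml file carries and how it relates to the type and location values printed above. It uses REXML rather than RepomdParser, so it is not the gem's implementation. The sample XML, its checksum values, and the helper names element_children and list_metadata_references are made up for illustration.

```ruby
require 'rexml/document'

# A tiny, hand-written repomd.xml sample (all values are illustrative only).
SAMPLE_REPOMD = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <repomd xmlns="http://linux.duke.edu/metadata/repo">
    <data type="primary">
      <checksum type="sha256">0123abcd</checksum>
      <location href="repodata/primary.xml.gz"/>
      <size>1024</size>
    </data>
    <data type="deltainfo">
      <checksum type="sha256">4567ef01</checksum>
      <location href="repodata/deltainfo.xml.gz"/>
      <size>512</size>
    </data>
  </repomd>
XML

# Returns only the element children of a node, skipping whitespace text nodes.
def element_children(node)
  node.children.grep(REXML::Element)
end

# Hypothetical helper: builds one hash per <data> entry in repomd.xml.
def list_metadata_references(xml)
  root = REXML::Document.new(xml).root
  element_children(root).select { |el| el.name == 'data' }.map do |data|
    fields = element_children(data).to_h { |child| [child.name, child] }
    {
      type: data.attributes['type'].to_sym,
      location: fields['location'].attributes['href'],
      checksum_type: fields['checksum'].attributes['type'],
      checksum: fields['checksum'].text,
      size: fields['size'].text.to_i
    }
  end
end

list_metadata_references(SAMPLE_REPOMD).each do |ref|
  printf "type: %10s, location: %s\n", ref[:type], ref[:location]
end
```

Each location is relative to the repository root, so a caller would typically join it with the repository base URL before downloading the referenced file.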

RepomdParser::PrimaryXmlParser

Parses primary.xml, which contains information about RPM packages in the repository.

The parse and parse_file methods return an array of RepomdParser::Reference objects.

Using the parse method
File.open('primary.xml') do |fh|
  rpm_packages = RepomdParser::PrimaryXmlParser.new.parse(fh)
  rpm_packages.each do |rpm|
    printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
  end
end
Using the parse_file method
rpm_packages = RepomdParser::PrimaryXmlParser.new.parse_file('primary.xml')
rpm_packages.each do |rpm|
  printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
end
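
As a rough picture of where the package details come from, the following stand-alone sketch reads a hand-written primary.xml fragment with REXML. It does not use RepomdParser. The mapping of XML fields to name, version, release, build_time and size is an assumption based on the accessor names in this README, and the helper name package_entries and all sample values are hypothetical.

```ruby
require 'rexml/document'

# A minimal, hand-written primary.xml sample (all values are illustrative only).
SAMPLE_PRIMARY = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <metadata xmlns="http://linux.duke.edu/metadata/common" packages="1">
    <package type="rpm">
      <name>example</name>
      <arch>x86_64</arch>
      <version epoch="0" ver="1.2.3" rel="4"/>
      <checksum type="sha256" pkgid="YES">89abcdef</checksum>
      <time file="1700000000" build="1690000000"/>
      <size package="2048" installed="4096" archive="4200"/>
      <location href="x86_64/example-1.2.3-4.x86_64.rpm"/>
    </package>
  </metadata>
XML

# Hypothetical helper: builds one hash per <package> entry.
def package_entries(xml)
  root = REXML::Document.new(xml).root
  packages = root.children.grep(REXML::Element).select { |el| el.name == 'package' }
  packages.map do |pkg|
    # Index the direct child elements by tag name for easy lookup.
    f = pkg.children.grep(REXML::Element).to_h { |child| [child.name, child] }
    {
      name: f['name'].text,
      arch: f['arch'].text,
      version: f['version'].attributes['ver'],
      release: f['version'].attributes['rel'],
      build_time: Time.at(f['time'].attributes['build'].to_i).utc,
      size: f['size'].attributes['package'].to_i,
      location: f['location'].attributes['href']
    }
  end
end

package_entries(SAMPLE_PRIMARY).each do |rpm|
  printf "arch: %8s, location: %s\n", rpm[:arch], rpm[:location]
end
```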

RepomdParser::DeltainfoXmlParser

Parses deltainfo.xml, which contains information about delta-RPM packages in the repository.

The parse and parse_file methods return an array of RepomdParser::Reference objects.

Using the parse method
File.open('deltainfo.xml') do |fh|
  rpm_packages = RepomdParser::DeltainfoXmlParser.new.parse(fh)
  rpm_packages.each do |rpm|
    printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
  end
end
Using the parse_file method
rpm_packages = RepomdParser::DeltainfoXmlParser.new.parse_file('deltainfo.xml')
rpm_packages.each do |rpm|
  printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
end

Compressed file support

The gzip and Zstandard compression formats are supported. The parse_file method automatically decompresses files based on the filename, e.g.:

rpm_packages = RepomdParser::PrimaryXmlParser.new.parse_file('primary.xml.gz')
rpm_packages.each do |rpm|
  printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
end

The RepomdParser.decompress_io helper is provided to handle decompression of IO objects for use with the parse method:

filename = 'primary.xml.gz'
io = RepomdParser.decompress_io(File.open(filename), filename)

rpm_packages = RepomdParser::PrimaryXmlParser.new.parse(io)
rpm_packages.each do |rpm|
  printf "arch: %8s, location: %s\n", rpm.arch, rpm.location
end
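
The same extension-based idea can be sketched with the standard library alone, which may help when the metadata is held in memory instead of on disk. This is not the gem's decompress_io: the helper maybe_gunzip is hypothetical, it covers gzip only (Ruby's standard library has no Zstandard reader), and whether the resulting reader is accepted by the parse method is an assumption that should be checked against the gem.

```ruby
require 'stringio'
require 'zlib'

# Hypothetical helper: wraps an IO in a gzip reader when the file name ends
# in .gz, and returns the IO unchanged otherwise.
def maybe_gunzip(io, filename)
  File.extname(filename) == '.gz' ? Zlib::GzipReader.new(io) : io
end

# Simulate a downloaded, gzip-compressed response body kept in memory only.
body = Zlib.gzip('<metadata packages="0"/>')

io = maybe_gunzip(StringIO.new(body), 'primary.xml.gz')
puts io.read
```

Because the decision is made from the file name alone, a compressed stream with an unexpected extension would be passed through without decompression.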

RepomdParser::Reference

Represents a file referenced in the metadata file. Has the following accessors:

  • location, relative to the root of the repository.
  • checksum_type, e.g. SHA1, SHA256, MD5.
  • checksum.
  • type, type of the file, e.g. :primary, :deltainfo, :rpm, :drpm.
  • size in bytes.
  • arch.

RPM and DRPM files additionally have the following attributes:

  • name.
  • version.
  • release.
  • build_time.
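
One possible use of the checksum_type and checksum accessors is verifying a downloaded file. The sketch below is self-contained and uses a stand-in struct (FakeReference) instead of a real RepomdParser::Reference. The DIGESTS table and the checksum_matches? helper are hypothetical, and the exact strings the gem returns for checksum_type are an assumption, which is why the lookup is case-insensitive.

```ruby
require 'digest'
require 'tempfile'

# Stand-in for RepomdParser::Reference, limited to the accessors used below.
FakeReference = Struct.new(:location, :checksum_type, :checksum, keyword_init: true)

# Hypothetical mapping from checksum type names to Digest classes.
DIGESTS = {
  'md5' => Digest::MD5,
  'sha1' => Digest::SHA1,
  'sha256' => Digest::SHA256
}.freeze

# Returns true when the file at +path+ hashes to the reference's checksum.
# Raises KeyError for checksum types that are not in DIGESTS.
def checksum_matches?(reference, path)
  digest_class = DIGESTS.fetch(reference.checksum_type.to_s.downcase)
  digest_class.file(path).hexdigest == reference.checksum.downcase
end

# Usage with a temporary file standing in for a downloaded package.
Tempfile.create('example.rpm') do |file|
  file.write('not a real package')
  file.flush

  ref = FakeReference.new(
    location: 'x86_64/example.rpm',
    checksum_type: 'SHA256',
    checksum: Digest::SHA256.hexdigest('not a real package')
  )
  puts checksum_matches?(ref, file.path)
end
```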

Caveats

  • The file extension is used to determine the compression type (the expected extensions are .gz for gzip and .zst for Zstandard).

Development

After checking out the repo, run bundle install to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/ikapelyukhin/repomd-parser.

repomd-parser's People

Contributors

dependabot[bot], felixsch, ikapelyukhin, thutterer

repomd-parser's Issues

bz2 support in repomd_parser

Sorry Ivan, I was not fast enough for 1.0.0.

RMT needs support for bz2 compression which is currently missing. I will provide a pull request shortly!

Question: Support for StringIO

Are there plans for supporting StringIO? The use case is avoiding persisting the downloaded XML to disk.

response = http_client.get('https://a-webserver.here/repodata/primary.xml.gz')
response.body.class # => String

# Then I'd have to persist it on disk here
File.write('primary.xml.gz', response.body)
entries = RepomdParser::PrimaryXmlParser.new('primary.xml.gz').parse

But this doesn't seem to play well with overlayfs, as we sometimes hit DiskPressure. Ideally we would not have to persist it to disk 🤔

missing support for decoding `.zst` compressed repository metadata

There's handling for gzip here:

https://github.com/ikapelyukhin/repomd-parser/blob/ec7ffdcc50458f9825f456c5f026079e0da27dfb/lib/repomd_parser/base_parser.rb#L29C1-L32

However, there is no corresponding handling for .zst (Zstandard) compression. Zstandard is the default compression type used by createrepo_c >= 1.0.0, and many distributions (Fedora, openSUSE) are switching to it because of its higher compression ratios and much higher decompression speed.

Additional fields when parsing primary.xml

Hello @ikapelyukhin 👋

I started looking into this gem with a view to using it for SCC. It looks super promising, but I just noticed that parsing primary.xml doesn't return all the fields we'd need, such as summary, description and license.

Can we have these? Anything planned here already? I can do a draft PR, just wanted to ask your opinion. Should I just add parsing these fields always, or attempt something like optional "extra_fields"?

Thanks!
