mediagrains

A python library for handling grain-based media in a python-native style.

Introduction

Provides constructor functions for various types of grains and classes that nicely wrap those grains, as well as a full serialisation and deserialisation library for the Grain Sequence Format (GSF), a lightweight binary wrapper format for various media types including uncompressed video.

Please read the pydoc documentation for more details. The GSF format is documented in gsf_docs/gsf.md.

Some useful tools for handling the Grain Sequence Format (GSF) file format are also included - see Tools.

Installation

Requirements

  • A working Python 3.10+ installation
  • Docker (needed to run the tests, but not required to use the library)

Steps

# Install from pip
$ pip install mediagrains

# Install directly from source repo
$ git clone git@github.com:bbc/rd-apmm-python-lib-mediagrains.git
$ cd rd-apmm-python-lib-mediagrains
$ make install
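
To check that the installation worked, the import below should succeed (a minimal sketch; the names imported are the ones used in the examples later in this README):

# Quick sanity check that the library imports cleanly
$ python -c "from mediagrains import VideoGrain; from mediagrains.gsf import load, dump; print('mediagrains OK')"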

Usage

As an example of using this library in your own code, the following is a simple program to load the contents of a GSF file and print the timestamp of each grain.

>>> from mediagrains.gsf import load
>>> f = open('examples/video.gsf', "rb")
>>> (head, segments) = load(f)
>>> print('\n'.join(str(grain.origin_timestamp) for grain in segments[1]))
1420102800:0
1420102800:20000000
1420102800:40000000
1420102800:60000000
1420102800:80000000
1420102800:100000000
1420102800:120000000
1420102800:140000000
1420102800:160000000
1420102800:180000000
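
The segments value returned by load is indexed by segment number (segments[1] above) and yields grain objects, so other grain attributes can be inspected directly. A minimal sketch continuing from the example above; the exact values depend on the contents of examples/video.gsf:

>>> grains = segments[1]
>>> len(grains)            # one grain per timestamp printed above
10
>>> grains[0].grain_type   # grains also expose rate, duration, length, data, etc.
'video'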

Alternatively, to create a new video grain in 10-bit planar YUV 4:2:2 and fill it with colour bars:

>>> from mediagrains import VideoGrain
>>> from uuid import uuid1
>>> from mediagrains.cogenums import CogFrameFormat, CogFrameLayout
>>> src_id = uuid1()
>>> flow_id = uuid1()
>>> grain = VideoGrain(src_id=src_id, flow_id=flow_id, cog_frame_format=CogFrameFormat.S16_422_10BIT, width=1920, height=1080)
>>> colours = [
...      (0x3FF, 0x000, 0x3FF),
...      (0x3FF, 0x3FF, 0x000),
...      (0x3FF, 0x000, 0x000),
...      (0x3FF, 0x3FF, 0x3FF),
...      (0x3FF, 0x200, 0x3FF),
...      (0x3FF, 0x3FF, 0x200) ]
>>> x_offset = [0, 0, 0]
>>> for colour in colours:
...     i = 0
...     for c in grain.components:
...             for x in range(0,c.width//len(colours)):
...                     for y in range(0,c.height):
...                             grain.data[c.offset + y*c.stride + (x_offset[i] + x)*2 + 0] = colour[i] & 0xFF
...                             grain.data[c.offset + y*c.stride + (x_offset[i] + x)*2 + 1] = colour[i] >> 8
...             x_offset[i] += c.width//len(colours)
...             i += 1

(A more natural interface for accessing the data exists in the form of numpy arrays; see the Numpy arrays section below.)

The grain object can then be freely used for whatever video processing is desired, or it can be serialised into a GSF file as follows:

>>> from mediagrains.gsf import dump
>>> f = open('dummyfile.gsf', 'wb')
>>> dump(f, [grain])
>>> f.close()
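
The file written by dump can then be read back with load to check the round trip (a minimal sketch, continuing from the example above):

>>> from mediagrains.gsf import load
>>> with open('dummyfile.gsf', 'rb') as f:
...     (head, segments) = load(f)
>>> len(segments[1])        # the single colour bar grain written above
1
>>> segments[1][0].grain_type
'video'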

The encoding module also supports a "progressive" mode, where an encoder object is created and a dump started; grains are then written to the output file as they are added.

>>> from uuid import uuid1
>>> from mediagrains import Grain
>>> from mediagrains.gsf import GSFEncoder
>>> src_id = uuid1()
>>> flow_id = uuid1()
>>> f = open('dummyfile.gsf', 'wb')
>>> enc = GSFEncoder(f)
>>> seg = enc.add_segment()  # This must be done before the call to start_dump
>>> enc.start_dump()  # This writes the file header and starts the export
>>> seg.add_grain(Grain(src_id=src_id, flow_id=flow_id))  # Adds a grain and writes it to the file
>>> seg.add_grain(Grain(src_id=src_id, flow_id=flow_id))  # Adds a grain and writes it to the file
>>> seg.add_grain(Grain(src_id=src_id, flow_id=flow_id))  # Adds a grain and writes it to the file
>>> enc.end_dump()  # This ends the export and finishes off the file
>>> f.close()

If the underlying file is seekable then the end_dump call will update all segment metadata to list the correct grain count; otherwise the counts will be left at -1.
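
For example, encoding progressively into an in-memory, seekable stream and reading it straight back (a minimal sketch, reusing src_id, flow_id and the imports from the example above):

>>> from io import BytesIO
>>> from mediagrains.gsf import load
>>> buf = BytesIO()           # BytesIO is seekable, so end_dump updates the grain counts
>>> enc = GSFEncoder(buf)
>>> seg = enc.add_segment()
>>> enc.start_dump()
>>> seg.add_grain(Grain(src_id=src_id, flow_id=flow_id))
>>> enc.end_dump()
>>> _ = buf.seek(0)
>>> (head, segments) = load(buf)
>>> len(segments[1])
1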

Comparing Grains

In addition, the library contains a relatively rich grain comparison mechanism in the comparison submodule. This submodule exposes methods for comparing two grains, and for comparing the outputs of two grain iterators. Information on advanced comparison features, such as how to include or exclude attributes, expect differences on attributes, and compare PSNR values, can be found in the comparison submodule's documentation.

An example of usage is as follows:

>>> from mediagrains.comparison import compare_grain
>>> print(compare_grain(a, b))
❌   Grains do not match
  ✅   <a/b>.grain_type == 'video'
  ✅   <a/b>.source_id == UUID('9d0a2518-8f39-11ec-bcdd-737806a40a30')
  ✅   <a/b>.flow_id == UUID('a1269208-8f39-11ec-bcdd-737806a40a30')
  ✅   <a/b>.rate == Fraction(25, 1)
  ✅   <a/b>.duration == Fraction(1, 25)
  ✅   <a/b>.length == 6220800
  ❌   a.origin_timestamp - b.origin_timestamp == mediatimestamp.immutable.TimeOffset.from_sec_nsec('-0:160000000'), not the expected mediatimestamp.immutable.TimeOffset.from_sec_nsec('0:0')
  ❌   a.sync_timestamp - b.sync_timestamp == mediatimestamp.immutable.TimeOffset.from_sec_nsec('-0:160000000'), not the expected mediatimestamp.immutable.TimeOffset.from_sec_nsec('0:0')
  ◯   a.creation_timestamp - b.creation_timestamp == mediatimestamp.immutable.TimeOffset.from_sec_nsec('0:0') as expected
  ✅   Lists match
    ✅   len(<a/b>.timelabels) == 0
  ✅   <a/b>.cog_frame_format == CogFrameFormat.U8_444
  ✅   <a/b>.width == 1920
  ✅   <a/b>.height == 1080
  ✅   <a/b>.cog_frame_layout == CogFrameLayout.FULL_FRAME
  ✅   Binary data <a/b>.data are equal

This output gives a relatively detailed breakdown of the differences between the two grains, both as a printed string (as seen above) and in a data-centric fashion as a tree structure that can be interrogated in code.

Numpy arrays

An additional feature is provided in the form of numpy array access to the data in a grain, so the above example of creating colour bars can be done more easily:

>>> from mediagrains.numpy import VideoGrain
>>> from uuid import uuid1
>>> from mediagrains.cogenums import CogFrameFormat, CogFrameLayout
>>> src_id = uuid1()
>>> flow_id = uuid1()
>>> grain = VideoGrain(src_id=src_id, flow_id=flow_id, cog_frame_format=CogFrameFormat.S16_422_10BIT, width=1920, height=1080)
>>> colours = [
...      (0x3FF, 0x000, 0x3FF),
...      (0x3FF, 0x3FF, 0x000),
...      (0x3FF, 0x000, 0x000),
...      (0x3FF, 0x3FF, 0x3FF),
...      (0x3FF, 0x200, 0x3FF),
...      (0x3FF, 0x3FF, 0x200) ]
>>> for c in range(0, 3):
...     for x in range(0, grain.components[c].width):
...         for y in range(0, grain.components[c].height):
...             grain.component_data[c][x, y] = colours[x*len(colours)//grain.components[c].width][c]
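
Since grain.component_data[c] behaves as a two-dimensional numpy array indexed [x, y] (as in the loops above), the same fill can also be expressed without explicit Python loops. A minimal sketch, assuming the grain and colours variables from the example above:

>>> import numpy as np
>>> for c in range(0, 3):
...     width = grain.components[c].width
...     # Map each x position in this component to its colour bar index
...     bar = np.arange(width) * len(colours) // width
...     # One sample value per column for this component
...     column = np.array([colours[b][c] for b in bar])
...     # Broadcast the per-column values down the full height of the component
...     grain.component_data[c][:, :] = column[:, np.newaxis]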

Documentation

The API is well documented in the docstrings of the module mediagrains. A rendered version of this documentation is available here.

Instructions for using the 'new style' grains can be found in new_style_grains.md.

Tools

Some tools are installed with the library to make working with the Grain Sequence Format (GSF) file format easier.

  • wrap_video_in_gsf - Provides a means to read raw video essence and generate a GSF file.
  • wrap_audio_in_gsf - As above, but for audio.
  • extract_from_gsf - Read a GSF file and dump out the raw essence within.
  • gsf_probe - Read metadata about the segments in a GSF file.

For example, to generate a GSF file containing a test pattern from ffmpeg, dump the metadata and then play it out again:

ffmpeg -f lavfi -i testsrc=duration=20:size=1920x1080:rate=25 -pix_fmt yuv422p10le -c:v rawvideo -f rawvideo - | \
wrap_video_in_gsf - output.gsf --size 1920x1080 --format S16_422_10BIT --rate 25
gsf_probe output.gsf
extract_gsf_essence output.gsf - | ffplay -f rawvideo -pixel_format yuv422p10 -video_size 1920x1080 -framerate 25 pipe:0

To do the same with a sine wave:

ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -f s16le -ac 2 - | wrap_audio_in_gsf - output_audio.gsf --sample-rate 44100
gsf_probe output_audio.gsf
extract_gsf_essence output_audio.gsf - | ffplay -f s16le -ac 2 -ar 44100 pipe:0

For convenience, the tools have also been packaged into the mediagrains Docker image, which can be built using:

make tools

You can make use of the tools image with the following command:

$(make -s run-cmd) <TOOL> <ARGS>

Running the command without <TOOL> or <ARGS> will list the available tools. Running a tool without <ARGS> will print its help.

Development

Commontooling

This repository uses a library of makefiles, templates, and other tools for development tooling and CI workflows. To discover operations that may be run against this repo, run the following in the top level of the repo:

$ make

Testing

To run the unittests for this package in a docker container follow these steps:

$ git clone git@github.com:bbc/rd-apmm-python-lib-mediagrains.git
$ cd rd-apmm-python-lib-mediagrains
$ make test

Continuous Integration

This repository includes GitHub Actions workflows for CI. The shared workflows are centrally managed and should not be modified.

Versioning

We use Semantic Versioning for this repository.

Contributing

All contributions are welcome. Before submitting, you must read and sign a copy of the Individual Contributor License Agreement.

Please ensure you have run the test suite before submitting a Pull Request, and include a version bump in line with our Versioning policy.

Authors

  • James Weaver
  • Philip deNier
  • Sam Mesterton-Gibbons
  • Alex Rawcliffe
  • James Sandford

For further information, contact [email protected]

License

See LICENSE.md


rd-apmm-python-lib-mediagrains's Issues

Extensive code duplication in GSF decoding

The GSF synchronous and asynchronous decoding code paths are nearly identical, and yet because IO and processing are heavily intertwined they have to be repeated rather than written once and reused. This is suboptimal, to say the least.

Suggested Fix: Refactor the code-path so that IO to load an entire block worth of data into a memory buffer is performed upon block entry, and then the various read_ methods source data from this memory buffer. That way the IO routine can be swapped out for sync and async code paths, but the rest of the code can be shared between the two.
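
As an illustration only (the class and function names below are hypothetical, not the library's actual API), the suggested split between block IO and block parsing might look something like this:

# Hypothetical sketch: isolate block IO so the parsing logic can be shared
# between synchronous and asynchronous decoding paths.
import struct
from io import BytesIO


class BlockParser:
    """Parses fields out of a block that has already been read into memory."""

    def __init__(self, data: bytes):
        self._buf = BytesIO(data)

    def read_uint32(self) -> int:
        return struct.unpack("<I", self._buf.read(4))[0]

    def read_bytes(self, n: int) -> bytes:
        return self._buf.read(n)


def read_block_sync(stream, block_size: int) -> BlockParser:
    # Synchronous IO: pull the whole block into memory, then parse from the buffer
    return BlockParser(stream.read(block_size))


async def read_block_async(stream, block_size: int) -> BlockParser:
    # Asynchronous IO: only this read differs from the synchronous path
    return BlockParser(await stream.read(block_size))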

Resolve use of submodules

Leaving an issue here so we don't forget about it - hopefully we will be able to mop up after the packaging discussion at the next ways of working meeting

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.