
gsfpy's Introduction

gsfpy - Generic Sensor Format for Python


Python wrapper for the C implementation of the Generic Sensor Format library.

  • Free software: MIT license
  • Notes on licensing: The bundled gsfpy3_0x/libgsf/libgsf03_0x.so binaries are covered by the LGPL v2.1 license. Copies of this license are included in the project at gsfpy3_0x/libgsf/libgsf_LICENSE.md. The top-level MIT licensing of the overall gsfpy project is not affected by this. However, as required by the libgsf license, the libgsf shared object libraries used by the gsfpy3_0x packages at runtime may be replaced with a different version by setting the GSFPY3_08_LIBGSF_PATH and/or GSFPY3_09_LIBGSF_PATH environment variables to the absolute file path of the new library.
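As a rough illustration of the library override described above, the sketch below sets the relevant environment variable. The helper name use_custom_libgsf and the example library path are hypothetical and not part of gsfpy; only the two environment variable names come from this README. It is assumed here that the variable has to be set before the corresponding gsfpy3_0x package is first imported.

```python
import os
from pathlib import PurePosixPath

# Environment variable names as given in the licensing note above.
_LIBGSF_ENV_VARS = {
    "3.08": "GSFPY3_08_LIBGSF_PATH",
    "3.09": "GSFPY3_09_LIBGSF_PATH",
}


def use_custom_libgsf(lib_path, gsf_version="3.08", env=None):
    """Hypothetical helper: point gsfpy at a replacement libgsf build.

    Assumption: the variable should be set before gsfpy3_08 / gsfpy3_09
    is imported for the first time.
    """
    env = os.environ if env is None else env
    # The README asks for an absolute file path to the new library.
    if not PurePosixPath(lib_path).is_absolute():
        raise ValueError("the libgsf path must be absolute: %r" % (lib_path,))
    var_name = _LIBGSF_ENV_VARS[gsf_version]
    env[var_name] = str(lib_path)
    return var_name


# Example with a plain dict standing in for os.environ (the path is made up).
example_env = {}
use_custom_libgsf("/opt/gsf/libgsf03-09.so", "3.09", env=example_env)
print(example_env)
```

Passing a dict instead of os.environ keeps the example free of side effects; in real use the env argument would simply be omitted.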

Namespaces and supported GSF versions

The gsfpy package provides three namespaces: gsfpy, gsfpy3_08 and gsfpy3_09.

The default supported version of GSF is 3.08. Top-level package functionality for 3.08 can be used either via import gsfpy (without setting the DEFAULT_GSF_VERSION environment variable, see below) or via import gsfpy3_08. Note that import gsfpy also works for GSF versions 3.06 and 3.07 (older versions have not been tested).

If you are using GSF v3.09, there are two options:

  • Set the DEFAULT_GSF_VERSION environment variable to "3.09", then import gsfpy
  • Import the 3.09 package directly with import gsfpy3_09
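The version selection described above can be sketched as follows. This mirrors the documented behaviour only; it is not gsfpy's actual implementation, and the function names are hypothetical. Treating every value other than "3.09" as a request for 3.08 is an assumption of this sketch.

```python
import importlib
import os


def select_gsfpy_namespace(env=None):
    """Return the namespace name matching DEFAULT_GSF_VERSION.

    Assumption: any value other than "3.09" (including an unset
    variable) falls back to the 3.08 namespace.
    """
    env = os.environ if env is None else env
    version = env.get("DEFAULT_GSF_VERSION", "3.08")
    if version == "3.09":
        return "gsfpy3_09"
    return "gsfpy3_08"


def import_gsfpy_for_version(env=None):
    """Import and return the selected namespace (requires gsfpy installed)."""
    return importlib.import_module(select_gsfpy_namespace(env))


# The selection step can be checked without gsfpy being installed.
print(select_gsfpy_namespace({}))
print(select_gsfpy_namespace({"DEFAULT_GSF_VERSION": "3.09"}))
```

When relying on import gsfpy itself, the environment variable presumably has to be set before the first import, since it is read by the package rather than by the caller.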

Features

  • The gsfpy(3_0x).bindings modules provide wrappers for all GSFlib functions, including I/O, utility and info functions. Minor exceptions are noted in the sections below.

  • For added convenience the gsfpy top level package provides the following higher level abstractions:

    • open_gsf()
    • GsfFile (class)
    • GsfFile.read()
    • GsfFile.get_number_records()
    • GsfFile.seek()
    • GsfFile.write()
    • GsfFile.close()

Install using pip

From PyPI

pip install gsfpy

From GitHub (SSH)

pip install git+ssh://git@github.com/UKHO/gsfpy.git@master

From GitHub (HTTPS)

pip install git+https://github.com/UKHO/gsfpy.git@master

Examples of usage

Open/close/read from a GSF file (GSF v3.08)

from ctypes import string_at

from gsfpy3_08 import open_gsf
from gsfpy3_08.enums import RecordType

with open_gsf("path/to/file.gsf") as gsf_file:
    # Note - file is closed automatically upon exiting 'with' block
    _, record = gsf_file.read(RecordType.GSF_RECORD_COMMENT)

    # Note use of ctypes.string_at() to access POINTER(c_char) contents of
    # c_gsfComment.comment field.
    print(string_at(record.comment.comment))
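The POINTER(c_char) handling used above can be tried out with ctypes alone, without a GSF file. In the sketch below, the Comment structure is a simplified, hypothetical stand-in for c_gsfComment (it is not the gsfpy definition); it shows that string_at() returns bytes, which usually need decoding before display.

```python
from ctypes import (
    POINTER,
    Structure,
    c_char,
    c_int,
    cast,
    create_string_buffer,
    string_at,
)


class Comment(Structure):
    """Hypothetical stand-in for c_gsfComment, reduced to two fields."""

    _fields_ = [
        ("comment_length", c_int),
        ("comment", POINTER(c_char)),
    ]


payload = b"My comment"
buffer = create_string_buffer(payload)  # NUL-terminated copy of payload

record = Comment(
    comment_length=len(payload),
    comment=cast(buffer, POINTER(c_char)),
)

# Without a size argument, string_at() reads up to the first NUL byte.
raw = string_at(record.comment)
# With a size argument, exactly that many bytes are read.
sized = string_at(record.comment, record.comment_length)

text = raw.decode("ascii")
print(text)
```

Passing the explicit length is the safer choice whenever the buffer is not guaranteed to be NUL-terminated.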

Write to a GSF file (GSF v3.09)

from ctypes import c_int, create_string_buffer

from gsfpy3_09 import open_gsf
from gsfpy3_09.enums import FileMode, RecordType
from gsfpy3_09.gsfRecords import c_gsfRecords

comment = b"My comment"

# Initialize the contents of the record that will be written.
# Note use of ctypes.create_string_buffer() to set POINTER(c_char) contents.
record = c_gsfRecords()
record.comment.comment_time.tvsec = c_int(1000)
record.comment.comment_length = c_int(len(comment))
record.comment.comment = create_string_buffer(comment)

with open_gsf("path/to/file.gsf", mode=FileMode.GSF_CREATE) as gsf_file:
    gsf_file.write(record, RecordType.GSF_RECORD_COMMENT)

Copy GSF records (GSF v3.08 as default)

from ctypes import byref, c_int, pointer

from gsfpy import *


# This example uses the bindings module to illustrate use of the lower level functions
file_handle = c_int(0)
data_id = gsfDataID.c_gsfDataID()
source_records = gsfRecords.c_gsfRecords()
target_records = gsfRecords.c_gsfRecords()

ret_val_open = bindings.gsfOpen(
    b"path/to/file.gsf", enums.FileMode.GSF_READONLY, byref(file_handle)
)

# Note use of ctypes.byref() as a shorthand way of passing POINTER parameters to
# the underlying foreign function call. ctypes.pointer() may also be used.
bytes_read = bindings.gsfRead(
    file_handle,
    enums.RecordType.GSF_RECORD_COMMENT,
    byref(data_id),
    byref(source_records),
)

# Note use of pointer() rather than byref() when passing parameters to
# gsfCopyRecords(). Implementation of this function is in Python as calling
# the native underlying function causes memory ownership clashes. byref()
# is only suitable for passing parameters to foreign function calls (see
# ctypes docs).
ret_val_cpy = bindings.gsfCopyRecords(
    pointer(target_records), pointer(source_records)
)
ret_val_close = bindings.gsfClose(file_handle)
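The difference between byref() and pointer() noted in the comments above can be shown with ctypes alone. The Rec structure and copy_rec() function below are hypothetical and only illustrate why a function implemented in Python needs real pointer objects: it has to dereference them through .contents, which the lightweight object returned by byref() does not offer.

```python
from ctypes import Structure, byref, c_int, pointer


class Rec(Structure):
    """Hypothetical one-field record, used only for this illustration."""

    _fields_ = [("value", c_int)]


def copy_rec(target_ptr, source_ptr):
    """Python-side copy that dereferences both pointers via .contents."""
    target_ptr.contents.value = source_ptr.contents.value
    return 0


source = Rec(7)
target = Rec()

# pointer() builds a full ctypes pointer instance that Python code can use.
ret_val = copy_rec(pointer(target), pointer(source))
print(ret_val, target.value)

# byref() returns a lightweight argument object intended only for passing
# to foreign functions; it has no .contents attribute.
byref_has_contents = hasattr(byref(source), "contents")
print(byref_has_contents)
```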

Troubleshoot

from gsfpy3_09.bindings import gsfIntError, gsfStringError

# The gsfIntError() and gsfStringError() functions are useful for
# diagnostics. They return an error code and corresponding error
# message, respectively.
retValIntError = gsfIntError()
retValStringError = gsfStringError()
print(retValIntError, retValStringError)
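One possible way to use these two functions is to turn a failed call into a Python exception. The sketch below is not the gsfpy implementation: GsfError and check_return_code() are hypothetical names, the failure value of -1 is an assumption about the GSFlib convention, and the two error functions are passed in as callables so that the example runs without gsfpy installed. It also assumes the message may arrive as bytes.

```python
class GsfError(Exception):
    """Hypothetical exception type used only in this sketch."""


GSF_FAILURE = -1  # assumption: value returned by GSFlib calls on failure


def check_return_code(return_code, int_error, string_error):
    """Return return_code unchanged, or raise GsfError on failure.

    int_error and string_error are callables standing in for
    gsfIntError() and gsfStringError().
    """
    if return_code != GSF_FAILURE:
        return return_code
    message = string_error()
    if isinstance(message, bytes):
        message = message.decode("ascii", errors="replace")
    raise GsfError("[%d] %s" % (int_error(), message))


# Stand-ins for the real bindings, so the sketch is self-contained.
def fake_int_error():
    return -23


def fake_string_error():
    return b"GSF End of File Encountered"


try:
    check_return_code(-1, fake_int_error, fake_string_error)
except GsfError as exc:
    caught = str(exc)
    print(caught)
```

With gsfpy installed, the real gsfIntError and gsfStringError functions could be passed in place of the stand-ins.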

Notes on implementation

gsfPrintError()

The gsfPrintError() method of GSFlib is not implemented as there is no FILE* equivalent in Python. Use gsfStringError() instead - this will give the same error message, which can then be written to file as required.
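A minimal sketch of that workaround is shown below. The helper write_gsf_error() is hypothetical; it assumes the message may be returned as bytes and appends a newline, which may or may not match what gsfPrintError() itself writes. An in-memory stream is used so the example runs anywhere; any open text file works the same way.

```python
import io


def write_gsf_error(stream, message):
    """Write one GSF error message per line to a text stream."""
    if isinstance(message, bytes):
        text = message.decode("ascii", errors="replace")
    else:
        text = str(message)
    stream.write(text + "\n")
    return text


# In real use, message would be the value returned by gsfStringError()
# and stream would be an open file, e.g. open("gsf_errors.log", "a").
log = io.StringIO()
write_gsf_error(log, b"GSF End of File Encountered")
print(log.getvalue(), end="")
```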

gsfCopyRecords() and gsfFree()

gsfFree(), the sibling function to gsfCopyRecords() in GSFlib, deallocates memory that is assigned by the library but managed by the calling application. It is not required by gsfpy because memory allocation and deallocation are handled by ctypes. gsfFree() is therefore omitted from the package.

gsf_register_progress_callback()

Implementation of the GSFlib function gsf_register_progress_callback() is not applicable for gsfpy as the DISPLAY_SPINNER macro was not defined during compilation. It is therefore omitted from the package.

Generic Sensor Format Documentation

Generic Sensor Format specification: see e.g. https://github.com/schwehr/generic-sensor-format/blob/master/doc/GSF_spec_03-06.pdf

Generic Sensor Format C library v3.06 specification: see e.g. https://github.com/schwehr/generic-sensor-format/blob/master/doc/GSF_lib_03-06.pdf

More recent versions of these documents can be downloaded from the Leidos website.

Dev Setup

Ensure Poetry is installed before proceeding

Poetry (Recommended)

By default Poetry will create its own virtual environment using your system's Python. This feature can be disabled.

git clone git@github.com:UKHO/gsfpy.git
cd gsfpy
poetry install

Pyenv

A good choice if you want to run a version of Python different from the one available through your system's package manager.

git clone git@github.com:UKHO/gsfpy.git
cd gsfpy
pyenv install 3.8.3
pyenv virtualenv 3.8.3 gsfpy
pyenv local gsfpy
poetry install

Run Tests

make test

Run Checks

make lint

Notes on Security

Some known concerns relating to the underlying GSFlib C library are documented at dwcaress/MB-System#368 and https://github.com/schwehr/generic-sensor-format/issues. Note that gsfpy simply wraps GSFlib and does not purport to stop or mitigate these potential vulnerabilities. It is left to the authors of applications calling gsfpy to assess these risks and mitigate where deemed necessary.

GSF data processed using gsfpy should be sourced from reliable providers and checked for integrity where possible.

Please also refer to the LICENSE file for the terms of use of gsfpy.

Credits

libgsf03-08.so was built from the Leidos C code using the Makefile in UKHO/libgsf

This package was created with Cookiecutter and the UKHO/cookiecutter-pypackage project template.

Related Projects

Also see schwehr/generic-sensor-format


gsfpy's Issues

Use dedicated environment variable for optional loading of shared object library

gsfpy should use its own dedicated environment variable (suggested $GSFPY_GSFLIB_PATH) for loading of the libgsf shared object library.

Suggested approach

  • Search in the path specified by the $GSFPY_GSFLIB_PATH variable for the libgsf library. If present, load it from there, else default to the lib/libgsfxxx.so location at which the bundled version of the library is stored.

ACs

  • gsfpy loads a shared object library located at $GSFPY_GSFLIB_PATH at runtime in preference to the bundled version.
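The suggested lookup order could be sketched as below. This is an illustration of the proposal, not code from gsfpy: resolve_libgsf_path() is a hypothetical name, and the sketch assumes the variable holds the path of the library file itself rather than a directory to search. The fallback path lib/libgsfxxx.so is the placeholder used in this issue.

```python
import os
import tempfile
from pathlib import Path


def resolve_libgsf_path(bundled_path, env=None, var_name="GSFPY_GSFLIB_PATH"):
    """Prefer the library named by the environment variable, if it exists.

    Assumption: the variable contains the path of the shared object file.
    Falls back to the bundled library otherwise.
    """
    env = os.environ if env is None else env
    override = env.get(var_name)
    if override and Path(override).is_file():
        return Path(override)
    return Path(bundled_path)


# Demonstration with an empty temporary file standing in for a custom build.
with tempfile.TemporaryDirectory() as tmp_dir:
    custom = Path(tmp_dir) / "libgsf_custom.so"
    custom.write_bytes(b"")
    chosen = resolve_libgsf_path(
        "lib/libgsfxxx.so", env={"GSFPY_GSFLIB_PATH": str(custom)}
    )
    fallback = resolve_libgsf_path("lib/libgsfxxx.so", env={})

print(chosen.name, fallback)
```

The resolved path would then be handed to ctypes.CDLL() to load the library.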

Rename master to main

In accordance with UKHO policy could we change the name of the master branch to main?

Calling gsfFree() causes segfault

Calling gsfFree() results in a segmentation fault at present.

Inspection of the gsfFree() C code in gsf.c of the GSF library 3.08 reveals that it passes through each of the dynamically allocated member fields of the gsfRecords structure, calling free() on each individually. It is likely that the segfault occurs at one of these steps.

Recommended approach - create a small C program to read a GSF record, then free the memory. If this fails then the issue is with the underlying library and out of scope for gsfpy. If not, attempt to identify the offending python code.

The error can be reproduced by running test_gsfFree_success() in test_bindings.py. This test is currently commented out.

Publish conda package

gsfpy is published as a pip package to PyPI as part of the automated CI process. To support as wide a developer audience as possible, we should also publish to Conda.

ACs

  • GitHub actions updated to publish master version to Conda
  • Following publication, gsfpy can be installed using the command conda install gsfpy
  • Documentation: README updated with instructions on how to install using Conda

Update README with Notes on Security

Action from Threat Modelling session of 03/06/20.

As some security concerns are known to exist around libgsf (see e.g. https://github.com/schwehr/generic-sensor-format/issues), make clear in the README that it is the responsibility of the calling application to mitigate these where necessary.

Likewise, it should be reiterated to users of gsfpy that GSF data processed by the package should come from reputable sources and should be integrity checked where possible, as it is a possible attack vector.

ACs

  • README updated with notes on security including the points mentioned above.

Add support for GSF 3.09

Currently gsfpy is bundled with version 3.08 of libgsf. The sister project UKHO/libgsf provides built artefacts for both 3.08 and 3.09 of libgsf.

Update the gsfpy project documentation with instructions on how to run gsfpy with versions of libgsf other than 3.08.

ACs

  • Documentation: Instructions on how to run gsfpy with libgsf>3.08 are present in project documentation.

Evaluate SAST tooling options for CI pipeline

We currently do not have SAST scanning integrated with our CI pipeline. It is a requirement for any project which is to be made open source (see https://github.com/UKHO/docs/blob/master/docs/open-source-governance-checklist.md).

This ticket covers evaluation of SAST options for Python that can be used with our CI pipeline set-up in GitHub Actions (see #17 CI Pipeline).

Tools to evaluate are:

  • Coverity
  • Pyre
  • Flake8 (including any security plugins)

ACs

  • Decision made as to which SAST tooling option to use in the CI pipeline
  • Documentation: SAST approach added to project Test Approach document.

Create gsfpy README

Self explanatory.

ACs
Must include:

  • Pip install instructions
  • Link to GSF documentation
  • Notes on use, with examples

Document shared object library build process

The shared object library libgsf3_06.so does not currently have any associated documentation regarding how it was built. Create this and add it to the MS Teams site for the project.

Out of scope at this stage is an automated build process for creation of this artefact.

ACs

  • Documentation: Documented process for creation of the shared object library present in MS Teams.

GsfException: [-23] GSF End of File Encountered

  • Generic Sensor Format for Python version:2.0.0
  • Python version:3.7.12
  • Operating System: Google Colab

My Code:

from ctypes import string_at

from gsfpy3_08 import open_gsf
from gsfpy3_08.enums import RecordType

with open_gsf("Path_to_gsf_file.gsf") as gsf_file:
    # Note - file is closed automatically upon exiting 'with' block
    _, record = gsf_file.read(RecordType.GSF_RECORD_COMMENT)


Error

GsfException Traceback (most recent call last)
in ()
6 with open_gsf("20211104_162722-S-3-7500-0001_M3.gsf") as gsf_file:
7 # Note - file is closed automatically upon exiting 'with' block
----> 8 _, record = gsf_file.read(RecordType.GSF_RECORD_COMMENT)

/usr/local/lib/python3.7/dist-packages/gsfpy3_08/__init__.py in read(self, desired_record, record_number)
99
100 _handle_failure(
--> 101 gsfRead(self._handle, desired_record, byref(data_id), byref(records))
102 )
103

/usr/local/lib/python3.7/dist-packages/gsfpy3_08/__init__.py in _handle_failure(return_code)
171 """
172 if return_code == _ERROR_CODE:
--> 173 raise GsfException()

GsfException: [-23] GSF End of File Encountered

Complete implementation of sensor-specific types

The file gsfSensorSpecific.py contains a number of FIXMEs relating to missing attributes of sensor-specific structures. The incomplete implementation is possible at present because the c_gsfSensorSpecific type is a Union type, and the largest possible struct instance of this type - gsfEM3Specific - is fully implemented meaning that memory allocation for the Union is always correct.
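The reasoning about the Union size can be checked with ctypes directly. The types below are hypothetical stand-ins and not the gsfpy definitions: they show that the size of a ctypes Union is governed by its largest member, so leaving fields out of a smaller member does not change how much memory is allocated for the Union.

```python
from ctypes import Structure, Union, c_double, c_int, sizeof


class SmallSpecific(Structure):
    """Hypothetical sensor struct with some attributes still missing."""

    _fields_ = [("mode", c_int)]


class LargeSpecific(Structure):
    """Hypothetical fully implemented struct, the largest in the Union."""

    _fields_ = [
        ("mode", c_int),
        ("samples", c_double * 8),
    ]


class SensorSpecific(Union):
    """Hypothetical stand-in for c_gsfSensorSpecific."""

    _fields_ = [
        ("small", SmallSpecific),
        ("large", LargeSpecific),
    ]


small_size = sizeof(SmallSpecific)
large_size = sizeof(LargeSpecific)
union_size = sizeof(SensorSpecific)

# The Union is as large as its largest member, whatever the smaller ones hold.
print(small_size < large_size, union_size == large_size)
```

The same argument stops holding as soon as an incomplete struct would, once completed, become larger than the current largest member.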

Dependency - Note that this ticket should be implemented after #12 (Upgrade to GSF 3.09), as GSF v3.09 introduces support for new sensors.

ACs

  • All sensor-specific types are fully implemented in the gsfpy wrapper.
  • Testing - All regression tests pass; Coverage meets required threshold.

Add SAST to CI pipeline

Add SAST to the CI pipeline which is present in GitHub Actions. It already includes:

  • Unit test run
  • 3rd-party package vulnerability scanning

The additional step will be:

  • Static analysis including SAST tooling (dependency on #23 Evaluate SAST tooling)

See the project Test Approach document on Teams for more details.

Note - GitHub Actions allows external contributors to see the results of automated builds. This is intentional, and will encourage contributions more than using an Azure pipelines alternative.

ACs

  • UKHO.gsfpy CI pipeline is present in GitHub Actions.
  • CI pipeline is triggered automatically upon repo commit, and required to pass before merge is permitted.
  • Pipeline runs all unit tests
  • Pipeline runs static analysis, including SAST using the selected tool as defined in the Test Approach document
  • Pipeline runs 3rd-party package vulnerability scanning, as defined in the Test Approach document

Upgrade to GSF 3.08

gsfpy is currently based on GSF 3.06. The latest version of the GSF library is 3.09 (https://s3.amazonaws.com/leidos-com-static/software/other/GSF_03-09.zip). Bring gsfpy up to date.

Recommended approach:
Rebuild the shared object library (currently libgsf3_06.so) using the instructions provided for building the library in the MS Teams wiki (see #11). Run regression tests and fix any failures. Version 3.09 of GSF is not thought to introduce any breaking changes so no changes to Python code are expected to be necessary.

ACs

  • libgsf3_06.so is replaced with a new shared object library named libgsf3_09.so built from version 3.09 of the Generic Sensor Format library.
  • All existing unit tests pass with the new shared object library.

Upgrade to GitHub-native Dependabot

Dependabot Preview will be shut down on August 3rd, 2021. In order to keep getting Dependabot updates, please merge this PR and migrate to GitHub-native Dependabot before then.

Dependabot has been fully integrated into GitHub, so you no longer have to install and manage a separate app. This pull request updates your config file to the new syntax. When merged, we'll swap out dependabot-preview (me) for a new dependabot app, and you'll be all set!

With this change, you'll now use the Dependabot page in GitHub, rather than the Dependabot dashboard, to monitor your version updates, and you'll configure Dependabot through the new config file rather than a UI.

Your previous schedule was set to live. This option is no longer supported in the new config file so it has been changed to daily.

You have configured automerging on this repository. There is no automerging support in GitHub-native Dependabot, so these settings will not be added to the new config file. Several 3rd-party GitHub Actions and bots can replicate the automerge feature.

If you've got any questions or feedback for us, please let us know by creating an issue in the dependabot/dependabot-core repository.

Learn more about migrating to GitHub-native Dependabot

Please note that regular @dependabot commands do not work on this pull request.

Update README with additional licensing info

The LGPL v2.1 (https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html), under which the libgsf library is licensed, requires that software bundling works that use it do two things if they wish to publish under different terms (in the case of gsfpy, the MIT license).

From the licence:
  • (Section 6) "…you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications."
  • "You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things:"
  • "b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with."

This PBI covers the updates to the README for compliance with the above:

  1. Add a notice to the README to make clear that libgsf is licensed under LGPL 2.1.
  2. Add a notice to the README to explain that users may replace the bundled version with their own by using the standard convention of setting the LD_LIBRARY_PATH environment variable. The C dynamic (i.e. shared) linking mechanism, as opposed to static linking, is already used.

ACs

  • Updates to README applied as above

Create Threat Model

Create a threat model for gsfpy.

Note that some known vulnerabilities for the underlying gsflib are documented at dwcaress/MB-System#368 and https://github.com/schwehr/generic-sensor-format/issues

Out of scope:

  • Addressing known security vulnerabilities in the gsflib implementation is out of scope. However, these known vulnerabilities should be documented or linked to in the README.md document.

ACs

Add Licence Checking to CI Pipeline

Add Licence Checking to the CI pipeline which is present in GitHub Actions. It already includes:

Approach:

  • Discuss with DDC around licence checking for Python
  • Build any additional checks required to the CI pipeline

ACs

  • UKHO.gsfpy CI pipeline runs licence checking automatically upon triggered builds
  • Documentation: Add licence checking section to the Test Approach document

Publish to PyPI

As part of process of open sourcing, GitHub actions should be updated to publish merges to master to PyPI.

Recommended technology/approach
As decided at the Threat Modelling session on 03/06/20, poetry (https://python-poetry.org/) should be used for publishing to PyPI.

ACs

  • GitHub actions updated to publish master version to PyPI
  • Following publication, gsfpy can be installed using the command pip install gsfpy

Wrap remaining GSFlib functions

The current gsfpy implementation exposes a subset of the functions present in the underlying gsflib. This subset was that necessary for cleansing, but a public project should ensure that all gsflib functionality is exposed.

The GSF Library specification bundled in the library download, as well as the library code, should be used as the definitive references for the GSF library.

In scope:

  • All methods not currently present in gsfpy, but which are present in gsflib, should be implemented in bindings.py at a minimum.
  • Enumerations not currently present in gsfpy, but which are present in gsflib, should be implemented.

Dependency: Ensure that #12 (Upgrade to GSF 3.09) is complete before implementing this ticket.

ACs

  • All gsflib methods are implemented by gsfpy
  • All gsflib enumerations are implemented by gsfpy
  • Documentation: README.md is updated with a note indicating that the GSF wrapper exposes the complete set of functionality available in gsflib.

Publish gsfpy as standalone pip package

A Python wrapper for the Generic Sensor Format library v3.06 has been created, but currently exists only as code. This PBI aims to make the package available via pip for convenient usage by other teams.

Out of scope for this PBI:

  • Publishing the pip package binary to a pip repo. Install from source is sufficient at present.
  • Upgrade the package to support GSF 3.09

ACs

  • Publish gsfpy code to the UKHO/gsfpy repo in a form that allows direct install from source using pip using a command of the form: pip install git+https://github.com/UKHO/gsfpy.git@master
  • Documentation: Create README.md file with install instructions
  • Demo: Install gsfpy using pip as described above into a test project, then import and make a function call.
