ocfl-java's People

Contributors

awoods, bbpennel, dependabot[bot], ives1227, mormonjesus69420, pwinckles, sprater

ocfl-java's Issues

configurable behavior on unknown extensions

Allow users to configure what they want to happen when unsupported extensions are encountered:

  1. Fail on unknown repo extensions
  2. Warn on unknown repo extensions
  3. Fail on unknown object extensions
  4. Warn on unknown object extensions
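The four options above could be modeled as two independent settings, one for repository extensions and one for object extensions. The following is only a sketch: `UnknownExtensionPolicy`, `Behavior`, and the method names are hypothetical and are not part of ocfl-java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names exist in ocfl-java. */
final class UnknownExtensionPolicy {
    enum Behavior { FAIL, WARN }

    private final Behavior repoBehavior;
    private final Behavior objectBehavior;
    // Stand-in for a logger so the sketch stays self-contained.
    private final List<String> warnings = new ArrayList<>();

    UnknownExtensionPolicy(Behavior repoBehavior, Behavior objectBehavior) {
        this.repoBehavior = repoBehavior;
        this.objectBehavior = objectBehavior;
    }

    void onUnknownRepoExtension(String name) {
        handle(repoBehavior, "repository", name);
    }

    void onUnknownObjectExtension(String name) {
        handle(objectBehavior, "object", name);
    }

    List<String> warnings() {
        return new ArrayList<>(warnings);
    }

    private void handle(Behavior behavior, String scope, String name) {
        String message = "Unsupported " + scope + " extension: " + name;
        if (behavior == Behavior.FAIL) {
            throw new IllegalStateException(message);
        }
        warnings.add(message);
    }
}
```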

Cleanup old revision markers

When using the mutable head extension, revision number markers remain in the revisions directory indefinitely. This becomes problematic if someone creates a huge number of revisions (see fcrepo/fcrepo#2024). One solution to this problem is to attempt to delete non-current revision markers after creating a new revision so that the directory only ever contains at most 2 revision markers. I think this should be safe, but requires further investigation.
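A rough sketch of the cleanup step, assuming each revision marker is a plain file in the revisions directory whose name is the revision identifier (for example `r3`). `RevisionMarkerCleanup` and `deleteStaleMarkers` are hypothetical names, and the safety question raised above is not addressed here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; assumes markers are plain files named after the revision. */
final class RevisionMarkerCleanup {

    /** Deletes every marker except the current one and returns how many were removed. */
    static int deleteStaleMarkers(Path revisionsDir, String currentMarker) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> markers = Files.newDirectoryStream(revisionsDir)) {
            for (Path marker : markers) {
                boolean isCurrent = marker.getFileName().toString().equals(currentMarker);
                // deleteIfExists tolerates a marker that another process already removed.
                if (!isCurrent && Files.deleteIfExists(marker)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```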

Investigate Cloud caching

The cloud implementation currently does not cache any files other than object inventories. If possible, it might be desirable to cache object files on the local filesystem to decrease cloud accesses.

Optimize Cloud implementation

The cloud implementation works but is not well optimized. Need to run performance tests, and investigate async clients vs thread parallelization.

Support OCFL 1.1

I created the ocfl-1.1 branch to start working on adding support for OCFL 1.1.

Outstanding work:

calculate inventory digest while serializing

Currently, new inventories are first serialized to a file and then their digest is calculated. This can be optimized slightly by calculating the digest while the inventories are being written to disk.
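One way to do this with the JDK alone is to wrap the file stream in a `java.security.DigestOutputStream`, so the digest is updated as bytes are written. This is a minimal sketch: `DigestingWriter` and `writeAndDigest` are hypothetical names, and the inventory is assumed to be already available as a byte array.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; not part of ocfl-java. */
final class DigestingWriter {

    /** Writes the bytes to target and returns the lowercase hex digest computed during the write. */
    static String writeAndDigest(Path target, byte[] serializedInventory, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (OutputStream out = new DigestOutputStream(Files.newOutputStream(target), digest)) {
            out.write(serializedInventory);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

A serializer that accepts an `OutputStream` could write directly into the wrapped stream instead of building the byte array first.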

change s3 lock implementation

Change the S3 db lock to not lock a row or hold a connection open. Instead, write a record that indicates the lock is held. Fail if the record already exists and indicates a hold. As a partial solution to the problem of locks that are never released, holds are only valid for an hour (or a configurable duration).
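The record-with-expiry logic can be illustrated with an in-memory map standing in for the database table. This is only a sketch of the acquire/expire rules; `RecordLock` is a hypothetical name, and a real implementation would need an atomic insert or conditional update in the database instead of `ConcurrentHashMap.compute`.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for a lock table keyed by object id. */
final class RecordLock {
    private final ConcurrentMap<String, Instant> holds = new ConcurrentHashMap<>();
    private final Duration maxHold;

    RecordLock(Duration maxHold) {
        this.maxHold = maxHold;
    }

    /** Returns true if the lock record was written, false if an unexpired hold exists. */
    boolean tryAcquire(String objectId, Instant now) {
        boolean[] acquired = {false};
        holds.compute(objectId, (id, heldSince) -> {
            boolean expired = heldSince != null && !heldSince.plus(maxHold).isAfter(now);
            if (heldSince == null || expired) {
                acquired[0] = true;
                return now; // write a fresh hold record
            }
            return heldSince; // keep the existing hold
        });
        return acquired[0];
    }

    void release(String objectId) {
        holds.remove(objectId);
    }
}
```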

add a validation api

There should be some sort of validation api that validates an ocfl object's structure and perhaps optionally does a fixity check.

Shorten staging directory names

Shorten staging directory names so that there isn't the possibility of creating a directory name that's longer than the filesystem supports.
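One possible approach is to derive the name from a truncated digest of the object id, so its length no longer depends on the id. The sketch below assumes a SHA-256 prefix plus a caller-supplied uniqueness suffix; `StagingNames`, `stagingDirName`, and the 16-byte prefix length are all hypothetical choices, not what ocfl-java does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; not part of ocfl-java. */
final class StagingNames {
    private static final int PREFIX_BYTES = 16; // 32 hex characters

    /** Returns a fixed-length hex prefix of the id's SHA-256, a dash, and the suffix. */
    static String stagingDirName(String objectId, long uniqueSuffix) {
        byte[] hash;
        try {
            hash = MessageDigest.getInstance("SHA-256")
                    .digest(objectId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < PREFIX_BYTES; i++) {
            name.append(String.format("%02x", hash[i]));
        }
        return name.append('-').append(Long.toUnsignedString(uniqueSuffix)).toString();
    }
}
```

With this scheme the name is at most 53 characters long (32 hex digits, a dash, and up to 20 decimal digits), regardless of how long the object id is.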

ocfl-java release 1.5.0

Tasks before releasing 1.5.0

  • #78
  • Update all dependencies
  • Review String.split() usage
  • Update contact info in pom

multi-object transactions

Consider adding support for multi-object transactions.

  1. Modified objects would need to be write locked
  2. Need a way to stage version changes
  3. Need a way to access staged versions
  4. Cache update logic needs to change
  5. Should transactions expire, and what happens if they do?

clear cache

Add a method for manually clearing the inventory cache.

Support for logs dir

How should the logs directory be used? Currently, nothing is written to the logs directory. How do people want to use it?

unicode normalization

Issue raised by @benc. It may be worth supporting Unicode normalization for logical paths and object ids. For example, ṩ could be represented as \u1E69, \u0073\u0323\u0307, or \u0073\u0307\u0323. Without normalization, it would be difficult to find an object or file that uses such a character without knowing which representation was used.

For logical paths, strings could be normalized for comparison purposes only and written to the inventory in the same form as they were received.

Object ids would need to be normalized as part of the storage layout.

It's unclear how much of an issue this is. If a user consistently encodes their strings, they will not experience any issues.

https://www.unicode.org/reports/tr15/
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/text/Normalizer.html
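As a small illustration of the comparison-only approach for logical paths, the three representations mentioned above can be compared after normalizing with `java.text.Normalizer`. The choice of NFC here is an assumption, and `NormalizedCompare` is a hypothetical helper.

```java
import java.text.Normalizer;

/** Hypothetical helper; NFC is an assumed choice of normalization form. */
final class NormalizedCompare {

    /** Compares two strings by their NFC forms without modifying the originals. */
    static boolean sameAfterNormalization(String a, String b) {
        String left = Normalizer.normalize(a, Normalizer.Form.NFC);
        String right = Normalizer.normalize(b, Normalizer.Form.NFC);
        return left.equals(right);
    }
}
```

For example, `sameAfterNormalization("\u1E69", "\u0073\u0307\u0323")` returns true even though the raw strings differ.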

Enforce non-conflicting logical paths

Within a version, logical paths MUST be unique and non-conflicting, so the logical path for a file cannot appear as the initial part of another logical path.
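A minimal sketch of such a check, assuming logical paths use `/` as the separator: a path conflicts if it is duplicated or if any of its leading directory prefixes is itself a file's logical path. `LogicalPathCheck` and `findConflict` are hypothetical names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper; not part of ocfl-java. */
final class LogicalPathCheck {

    /** Returns a duplicated path, or a path that is also a directory prefix of another path. */
    static Optional<String> findConflict(Collection<String> logicalPaths) {
        Set<String> seen = new HashSet<>();
        for (String path : logicalPaths) {
            if (!seen.add(path)) {
                return Optional.of(path); // not unique
            }
        }
        for (String path : logicalPaths) {
            // Check every proper prefix that ends at a path separator.
            int slash = path.indexOf('/');
            while (slash != -1) {
                String prefix = path.substring(0, slash);
                if (seen.contains(prefix)) {
                    return Optional.of(prefix); // a file is used as a directory
                }
                slash = path.indexOf('/', slash + 1);
            }
        }
        return Optional.empty();
    }
}
```

For example, `a/b` and `a/b/c` conflict because `a/b` would have to be both a file and a directory, while `a/b` and `a/bc` do not.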

Filename length limitation

This limitation surfaced while running the script at the bottom against a Fedora 6 repository.
The issue is that files whose names are 323 characters long fail when persisting to OCFL. I have not narrowed down the exact length limit. This may require no action beyond documenting the file name length limitation on Ubuntu in the README.

  • System
$ uname -a
Linux silver 5.3.0-42-generic #34-Ubuntu SMP Fri Feb 28 05:49:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 19.10
Release:	19.10
Codename:	eoan

Testing script

clear;x=0; r="http://localhost:8080/rest/"; while true; do x=$(($x + 1));echo $x; r=`curl -XPOST $r`; echo "r = $r"; done 

Script output

Add rollback support

Add support for rolling an object back to a previous version. The end result should be that the head version is set to the previous version and all later versions are deleted.

Cannot compare head version of ObjectVersionId

Comparing the head version of ObjectVersionId leads to a NullPointerException (NPE). ObjectVersionId#equals() should check versionNum for null values.

@Test
public void equalsVersionId() {
    assertEquals(ObjectVersionId.head("a"), ObjectVersionId.head("a"));
    assertNotEquals(ObjectVersionId.head("a"), ObjectVersionId.head("b"));
}
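A null-safe comparison can be written with `java.util.Objects`. The class below is a simplified, hypothetical stand-in and not the actual ObjectVersionId source; it assumes that a head reference is represented by a null versionNum, as the description above suggests.

```java
import java.util.Objects;

/** Simplified, hypothetical stand-in for ObjectVersionId. */
final class VersionIdSketch {
    private final String objectId;
    private final String versionNum; // null means "head" in this sketch

    private VersionIdSketch(String objectId, String versionNum) {
        this.objectId = Objects.requireNonNull(objectId, "objectId");
        this.versionNum = versionNum;
    }

    static VersionIdSketch head(String objectId) {
        return new VersionIdSketch(objectId, null);
    }

    static VersionIdSketch version(String objectId, String versionNum) {
        return new VersionIdSketch(objectId, versionNum);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof VersionIdSketch)) {
            return false;
        }
        VersionIdSketch that = (VersionIdSketch) other;
        // Objects.equals treats two nulls as equal and never dereferences a null.
        return objectId.equals(that.objectId) && Objects.equals(versionNum, that.versionNum);
    }

    @Override
    public int hashCode() {
        return Objects.hash(objectId, versionNum);
    }
}
```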

Review ObjectDetails and VersionDetails APIs

Review the ObjectDetails and VersionDetails APIs to make sure that they are exposing information sensibly. For example, VersionDetails should probably have a method for getting its version number, and there might be a better alternative to ObjectDetails.getVersionMap().

MutableOcflRepository fails on writing files with the same hash

Writing a file whose hash matches that of a file already in the object leads to an IllegalArgumentException because the source directory is empty. Before copying, the MutableOcflRepository should check whether the hash is already present in the inventory.json.

java.lang.IllegalArgumentException: Source must exist and be a directory: /tmp/ocfl3291760920107648701/ocfl-work/e03731555a6eece977a708aaf6a3c780-3248953293/content/r1

	at io.ocfl.core.util.FileUtil.moveDirectory(FileUtil.java:90)
	at io.ocfl.core.storage.filesystem.FileSystemStorage.moveDirectoryInto(FileSystemStorage.java:266)
	at io.ocfl.core.storage.DefaultOcflStorage.moveToRevisionDirectory(DefaultOcflStorage.java:756)
	at io.ocfl.core.storage.DefaultOcflStorage.storeNewMutableHeadVersion(DefaultOcflStorage.java:694)
	at io.ocfl.core.storage.DefaultOcflStorage.storeNewVersion(DefaultOcflStorage.java:253)
	...

public class OCFLTestCase {

    @Test
    public void testMutable() throws IOException {
        Path tempDirectory = Files.createTempDirectory("ocfl");
        Path repoDirectoryPath = tempDirectory.resolve("ocfl-repo");
        Path workDirectoryPath = tempDirectory.resolve("ocfl-work");

        Files.createDirectory(repoDirectoryPath);
        Files.createDirectory(workDirectoryPath);

        MutableOcflRepository repository = new OcflRepositoryBuilder()
            .defaultLayoutConfig(new HashedNTupleLayoutConfig())
            .storage(storage -> storage.fileSystem(repoDirectoryPath))
            .workDir(workDirectoryPath)
            .buildMutable();

        String objectId = "object_1";
        ObjectVersionId head = ObjectVersionId.head(objectId);
        repository.stageChanges(head, new VersionInfo(), (updater) -> {
            updater.writeFile(new ByteArrayInputStream(new byte[] { 1 }), "info_1.txt");
        });
        repository.commitStagedChanges(objectId, new VersionInfo());

        repository.stageChanges(head, new VersionInfo(), (updater) -> {
            updater.writeFile(new ByteArrayInputStream(new byte[] { 1 }), "info_2.txt");
        });
        repository.commitStagedChanges(objectId, new VersionInfo());
    }

}

Changing one of the byte[] { 1 } to byte[] { 2 } will work as expected.
