ark-builders / arklib-android

Gradle wrapper for ARKLib, for usage in Android projects

License: MIT License

Rust 11.96% Kotlin 87.80% Shell 0.24%
android gradle kotlin kotlin-library library rust jni jni-android jni-android-library

arklib-android's Introduction

ArkLib for Android

This is a wrapper of ArkLib which enables you to build Android apps powered by resource indexing, preview generation and user metadata support such as tags or scores.

⚠️ WARNING
The following information is only for developers.

Importing the library

GitHub Packages with credentials is a workaround, since JCenter has been shut down.

Add the following script to the project's build.gradle:

allprojects {
    repositories{
        maven {
            name = "GitHubPackages"
            url = "https://maven.pkg.github.com/ARK-Builders/arklib-android"
            credentials {
                username = "token"
                password = "\u0037\u0066\u0066\u0036\u0030\u0039\u0033\u0066\u0032\u0037\u0033\u0036\u0033\u0037\u0064\u0036\u0037\u0066\u0038\u0030\u0034\u0039\u0062\u0030\u0039\u0038\u0039\u0038\u0066\u0034\u0066\u0034\u0031\u0064\u0062\u0033\u0064\u0033\u0038\u0065"
            }
        }
    }
}

Then add the arklib-android dependency to the app module's build.gradle:

implementation 'dev.arkbuilders:arklib:0.3.1'

Development of the library

Prerequisites

  • Rust toolchain
  • Kotlin toolchain
  • Android SDK + NDK r24 (latest)

Build Rust library

You need to have Rust targets installed:

rustup target add armv7-linux-androideabi
rustup target add aarch64-linux-android
rustup target add i686-linux-android
rustup target add x86_64-linux-android

Compile Rust (option 1)

For checking if Rust code compiles without problems, you can use this command:

./gradlew cargoBuild

The above command should generate the libarklib.so file inside the ./arklib/target/<arch>/<buildVariant> folder. If the build fails and no .so files are generated, there is a build alternative that doesn't require you to install extra dependencies:

https://github.com/bbqsrc/cargo-ndk

Compile Rust (option 2)

Using cargo-ndk, you can generate the libarklib.so files in two steps:

- cd arklib
- cargo ndk -o ./jniLibs build

Running the above two commands produces the same .so files as ./gradlew cargoBuild does.

Build AAR

Before making a release build, ensure you have set profile = "release" in the cargo config.

./gradlew lib:assemble

The generated release build is lib/build/outputs/aar/lib-release.aar

Publish New Version

Ensure you have committed your changes.

./gradlew release

Then simply push to the repo.

Debug

Make sure you have switched to the debug profile in the cargo config, which can be found in lib/build.gradle.

Run the command to build

./gradlew lib:assemble

Connect a device or set up an AVD and check the functionality.

./gradlew appmock:connectedCheck

Unit tests

Unit tests require the native ARK library file for the host machine in the project root directory.

  • libarklib.so for Linux
  • libarklib.dylib for Mac
  • libarklib.dll for Windows

Unit tests depend on the buildRustLibForHost Gradle task (Linux, Mac).

You can also build the library manually:

  • Find out the host architecture: rustc -vV | sed -n 's|host: ||p'
  • Change to the arklib directory and build the library: cargo build --target $host_arch
  • Copy the library from arklib/target/$host_arch/debug/libarklib.(so|dylib|dll) to the project root directory

Shortcut for Linux:

ARCH=$(rustc -vV | sed -n 's|host: ||p') && (cd arklib && cargo build --target "$ARCH") && cp "arklib/target/$ARCH/debug/libarklib.so" .

arklib-android's People

Contributors

dependabot[bot], gwendalf, hhio618, hieuwu, j4w3ny, kirillt, mdrlzy, oluiscabral, shubertm, tuancoltech

arklib-android's Issues

Chunked resource index

It could be a good idea to store the index as a collection of files, or chunks. An update of a single resource would then affect a smaller file, so less data would need to be synced using Syncthing. But this feature is debatable. It might be useless once we get rid of Syncthing and implement our own sync mechanism; in that case, we would just broadcast atomic changes to other devices.

Chunked storage type

We have two storage implementations:

  • File-based: all key-value pairs are stored in a single file.
  • Folder-based: each key gets its individual file in a designated folder.

File-based storages tend to fail when dealing with map sizes nearing 10,000 entries, often resulting in slow, sometimes even flawed writing. On the other hand, folder-based storages are presumably inefficient when handling smaller values.

Chunked storage blends both methods: it utilizes a folder containing files, with each file holding multiple entries. We need a strategy to identify the chunks requiring updates after a storage model modification. The use of Merkle trees might be necessary for efficient synchronization of external updates.
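A minimal sketch of how entries could be assigned to chunks and how the chunks needing a rewrite could be tracked. It assumes a fixed chunk count and hash-based assignment; ChunkedStorage, chunkOf and takeDirtyChunks are hypothetical names, not part of arklib-android, and the Merkle tree part is not covered.

```kotlin
// Hypothetical sketch: entries are spread over a fixed number of chunks,
// each chunk corresponding to one file in the storage folder.
class ChunkedStorage<V>(private val chunkCount: Int = 16) {
    private val chunks = List(chunkCount) { mutableMapOf<String, V>() }
    private val dirty = mutableSetOf<Int>()

    // Stable assignment of a key to a chunk index.
    fun chunkOf(key: String): Int = Math.floorMod(key.hashCode(), chunkCount)

    // Stores the value and marks the chunk dirty only if something changed.
    fun put(key: String, value: V) {
        val i = chunkOf(key)
        if (chunks[i].put(key, value) != value) dirty += i
    }

    fun get(key: String): V? = chunks[chunkOf(key)][key]

    // Chunks that must be rewritten on the next flush; clears the dirty set.
    fun takeDirtyChunks(): Set<Int> = dirty.toSet().also { dirty.clear() }
}
```

With this layout, a model modification touches only the chunk files returned by takeDirtyChunks().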

Text layer extraction and storage

For resources of kind "Document", it would be useful to extract and store their text. E.g. for PDF resources, the text layer should be similar to what is emitted by the Linux utility pdftotext. The text layer can be used later for filtering resources by text in their content, or for various text analytics (e.g. counting words).

Implement cache for generated storages

Generated storages, i.e. those which contain only generated data, like MetadataStorage and PreviewStorage, access the filesystem on each locate call. The result should be cached to improve performance.
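One possible shape for such a cache, as a sketch only. CachedLocator and locateOnDisk are hypothetical names; locateOnDisk stands in for whatever filesystem lookup the real storages perform.

```kotlin
import java.nio.file.Path
import java.util.concurrent.ConcurrentHashMap

// Hypothetical wrapper: `locateOnDisk` stands in for the real filesystem lookup.
class CachedLocator<K : Any>(private val locateOnDisk: (K) -> Path?) {
    private val cache = ConcurrentHashMap<K, Path>()

    // Returns the cached path if present, otherwise performs one lookup on disk.
    fun locate(id: K): Path? {
        cache[id]?.let { return it }
        val found = locateOnDisk(id) ?: return null
        cache[id] = found
        return found
    }

    // Must be called when the generated file is removed or regenerated.
    fun invalidate(id: K) {
        cache.remove(id)
    }
}
```

Misses are deliberately not cached here, so a file generated later is found on the next call.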

Provide Android API for previews generation

When arklib is used to generate a preview of a resource, the resulting bitmap must be passed to the Android side, where it will be stored as necessary. Later this will change, but for now it would be a good start.

The Pdfium usage here must be replaced with arklib usage, so that bitmaps are generated by the function implemented in ARK-Builders/arklib#5. Initialization of arklib must be done only once per Android app invocation, i.e. we don't want to re-initialize the lib on 2 subsequent calls to it.

Index projection

There should be an easier and more flexible way to work with "favorites", which use the index of their parent folder but exploit optimized work with the resource collection. Right now, there is an awkward prefix: Path parameter in the methods of the ResourceIndex interface.

Storage pruning

If any resources were removed, the values associated with them in storages should be cleaned up.

This should be optional and decided by the app; the app would probably have a preference for this behavior.
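As a rough sketch of the idea, with the storage modeled as a plain map: prune is a hypothetical helper, and the enabled flag stands in for the app preference mentioned above.

```kotlin
// Hypothetical helper: removes storage entries whose resource id
// is no longer present in the index. Returns the ids that were removed.
fun <Id, V> prune(
    storage: MutableMap<Id, V>,
    indexed: Set<Id>,
    enabled: Boolean
): Set<Id> {
    // The app can opt out of pruning entirely.
    if (!enabled) return emptySet()
    val removed = storage.keys.filterNot { it in indexed }.toSet()
    storage.keys.removeAll(removed)
    return removed
}
```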

Indexing service

We should externalize the processing stages: indexing, metadata extraction, and previews generation. By externalization, we mean an external system entity should undertake these activities for each root folder, with the results subsequently pulled by the applications (Navigator, Shelf, Memo, etc.).

Bump version in the apps

This issue is supposed to track versions of arklib-android used in the apps.

This issue is not supposed to be closed.

Persisted/Replicated index

The index is already implemented in https://github.com/ARK-Builders/arklib as a plain file stored in .ark folders. We need to throw Room away and switch to arklib's index implementation. This will solve ARK-Builders/ARK-Navigator#142 in Navigator.

The index will be persisted and replicated, meaning that it is a file synced by Syncthing or another filesystem sync mechanism. Devices sharing the same root folder will share the index as well and should benefit because indexing will happen less frequently.

FolderStorage$readFromDisk ClassCastException

java.lang.ClassCastException: java.util.LinkedHashMap$LinkedHashMapEntry cannot be cast to java.util.HashMap$TreeNode
	at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1831)
	at java.util.HashMap$TreeNode.treeify(HashMap.java:1948)
	at java.util.HashMap.treeifyBin(HashMap.java:771)
	at java.util.HashMap.putVal(HashMap.java:643)
	at java.util.HashMap.put(HashMap.java:611)
	at space.taran.arklib.domain.storage.FolderStorage$readFromDisk$jobs$2$1.invokeSuspend(FolderStorage.kt:100)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.internal.LimitedDispatcher.run(LimitedDispatcher.kt:42)
	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:95)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@6564017, Dispatchers.IO]

TagStorage can get corrupted during writing

Sometimes, the file serving as tag storage becomes corrupted:

...
117421-975209069:visa
92126-3414598226:hmm
60272-1305102029:music
34362-6340143
(END OF FILE)

In the example above, the last resource id is incomplete and no tags follow it.

In fact, there is an even bigger problem: in such cases, half of the storage is lost. Thanks to the backup mechanism, the loss can be mitigated. However, this must not happen at all.

Atomic writing should be implemented.
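A common way to get atomic writes is to write a temporary file in the same directory and then rename it over the target. The sketch below shows that technique with java.nio.file (on Android this API requires API level 26 or later); writeAtomically is a hypothetical name and this is not the library's current implementation.

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

// Writes all lines to a temporary file next to the target, then renames it
// over the target. Readers see either the old content or the new content,
// never a half-written file.
fun writeAtomically(target: Path, lines: List<String>) {
    val dir = target.toAbsolutePath().parent
    val tmp = Files.createTempFile(dir, target.fileName.toString(), ".tmp")
    try {
        Files.write(tmp, lines)
        // Whether an existing target is replaced under ATOMIC_MOVE is
        // implementation specific; on POSIX filesystems rename replaces it.
        Files.move(
            tmp, target,
            StandardCopyOption.ATOMIC_MOVE,
            StandardCopyOption.REPLACE_EXISTING
        )
    } finally {
        // No-op after a successful move; cleans up after a failed write.
        Files.deleteIfExists(tmp)
    }
}
```

The temporary file must live on the same filesystem as the target, which is why it is created in the target's directory.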

Storage files monitoring

Files backing storages for user data should be monitored, similar to how the resources are monitored.
This would allow us to catch updates from other devices in good time.

Files backing generated storages (metadata, previews) and the index file could be skipped, since we always generate them from the updated resources. The idea of optimizing generation away is wrong here, because even if we just copied the updates we would still have to verify them.

Right now, the updates are handled only when we initialize presenters, i.e. we need to close the folder and open it again to get new values from storages.

Use IndexProjection for performant filters

Using IndexProjection, we can not only open "favorite" folders, but also open other folders with a filter applied to resources.

These filters can be based on properties of Resource:

    val name: String,
    val extension: String,
    val modified: FileTime

or on parts of ResourceId, especially:

    val dataSize: Long

Stale metadata in the cache

Right now we are not removing the metadata of lost resources, which causes a NullPointerException:

06-09 00:56:01.152 E/ACRA    (16329): ACRA caught a NullPointerException for space.taran.arknavigator
06-09 00:56:01.152 E/ACRA    (16329): java.lang.NullPointerException
06-09 00:56:01.152 E/ACRA    (16329): 	at space.taran.arklib.domain.preview.RootPreviewProcessor.initKnownResources(RootPreviewProcessor.kt:128)
06-09 00:56:01.152 E/ACRA    (16329): 	at space.taran.arklib.domain.preview.RootPreviewProcessor.access$initKnownResources(RootPreviewProcessor.kt:15)
06-09 00:56:01.152 E/ACRA    (16329): 	at space.taran.arklib.domain.preview.RootPreviewProcessor$init$2.invokeSuspend(RootPreviewProcessor.kt:44)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
06-09 00:56:01.152 E/ACRA    (16329): 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
06-09 00:56:01.152 E/ACRA    (16329): 	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@a414215, Dispatchers.Default]

    private suspend fun initKnownResources() {
        _busy.emit(true)
        metadata.state().forEach { (id, meta) ->
            val path = index.getPath(id)!!
            generate(id, path, meta)
        }
    ...
    }

RootPreviewProcessor takes the metadata, but there is no such id in the index: the resource was deleted, but its metadata was not.
See ARK-Builders/arklib#40
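Independently of the cleanup itself, the processor could avoid the !! assertion by separating known entries from stale ones. A sketch with stand-in types: splitKnownAndStale and getPath are hypothetical, with getPath playing the role of index.getPath.

```kotlin
// Hypothetical helper: splits metadata entries into those whose resource is
// still indexed (with its path) and those whose resource is gone.
fun <Id, Meta> splitKnownAndStale(
    metadata: Map<Id, Meta>,
    getPath: (Id) -> String?
): Pair<Map<Id, Pair<String, Meta>>, Set<Id>> {
    val known = mutableMapOf<Id, Pair<String, Meta>>()
    val stale = mutableSetOf<Id>()
    for ((id, meta) in metadata) {
        val path = getPath(id)
        if (path == null) {
            stale += id // resource was deleted, metadata is left over
        } else {
            known[id] = path to meta
        }
    }
    return known to stale
}
```

The known entries can be passed to preview generation, and the stale ids can be handed to the metadata storage for removal.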

Redesign index-storage interactions

  • Index-related classes should not be coupled with storages too tightly.
  • Index-related classes should not call any factories or storages:
    1. Storages should consume updates directly or via presenters.
    2. Factories should be invoked by storages when they don't have data or it is outdated.

Examples of improper class interaction:

ResourceIndexRepo depends on MetadataStorage:

class ResourceIndexRepo(
    private val foldersRepo: FoldersRepo,
    //todo any storage should be attached next to index, not into it
    private val metadataStorageRepo: MetadataStorageRepo,
    private val messageFlow: MutableSharedFlow<Message>,
...

PlainIndex during index updates processing invokes GeneralMetadataFactory:

    private suspend fun handleUpdate(update: UpdatedResourcesId) {
        update.deleted.forEach { ... }
        val added = ...
        ...

        //todo MetadataStorage must manage Metadata for resources
        // the same as TagStorage manages their tags
                GeneralMetadataFactory.compute(path, resource)
                    .onFailure {
                        Log.e(
                                RESOURCES_INDEX,
                                "Could not detect kind for " +
                                        path.absolutePathString()
                            )
                        messageFlow.emit(Message.KindDetectFailed(path))
                    }
                    .map { metadata ->
                        resource.metadata = metadata
                        resource
                    }
...

ResourceIndexRepo during initial index loading invokes GeneralMetadataFactory:

    internal suspend fun providePlainIndex(
        root: Path
    ): PlainIndex = provideMutex.withLock {
        ...
        //todo MetadataStorage must manage Metadata for resources
        // the same as TagStorage manages all tags
                GeneralMetadataFactory.compute(path, resource)
                    .map { metadata ->
                        resource.metadata = metadata
                        resource
                    }

Bounty #1 for Homayoun

We need to solve several issues at once:

  1. #20
    Ensure that ResourceID is taken from arklib everywhere; right now it is a CRC32 checksum + file size.
    When we need to print a string with both values, we can just delimit them using a dash (-).

    • Update ARK-Shelf
    • Update ARK-Shelf-Desktop
    • Update ARK-Navigator
      We should also implement a migration in ARK Navigator, i.e. check some value (or index version) in app data and, if there is none (or the version is too low), perform a storage upgrade. Index and previews could just be dropped. Storages must be backed up before performing the upgrade, and old resource ids must be replaced by new resource ids. E.g. 5839051: english, greek must be replaced with 88363-5839051: english, greek in .ark/tags, etc.
  2. #21
    Basically, we just need to take the code from Navigator and move it into arklib-android, ensuring any app can just import a couple of modules and use metadata as well as the index. We don't really need indexing in the apps right now (except Navigator). It could be useful later.

  3. ARK-Builders/ARK-Memo#8
    We need to use the new metadata storage and move the title into it. Also, we can introduce a created-date metadata field.
    This issue is similar to ARK-Builders/ARK-Shelf#21

Restrict the frequency of storage write operations

  • Implement a version counter for coroutines writing to the same storage: a simple integer that increments with each write to the in-memory storage. The counters can initialize at 0 during a single app launch. If a coroutine approaches the mutex and it's occupied, we compare their versions: the more recent one gains access to the storage, while the older one is cancelled.
  • Restrict the creation of coroutines to a rate of 1 per second.
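The version counter from the first point could look roughly like this. The sketch uses a plain lock instead of coroutines to stay self-contained; LatestWinsWriter is a hypothetical name, and the rate limit from the second point is not shown.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical sketch: every in-memory update gets a version, and a pending
// disk write is skipped if a newer update was issued in the meantime.
class LatestWinsWriter {
    private val latest = AtomicLong(0) // starts at 0 on each app launch
    private val lock = Any()

    // Called on each write to the in-memory storage.
    fun nextVersion(): Long = latest.incrementAndGet()

    // Runs `write` only if `version` is still the most recent one.
    // Returns false when the write was dropped as outdated.
    fun writeIfLatest(version: Long, write: () -> Unit): Boolean =
        synchronized(lock) {
            if (version < latest.get()) {
                false
            } else {
                write()
                true
            }
        }
}
```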

Tracked storage and CRDT

This new kind of storage would consist of separate tracks for each device. Each track stores the history of updates from its device. The main storage is the result of consensus between tracks. This should improve conflict resolution. Storages could also generalize from a Monoid structure to any CRDT.

Enabling app developers to extend the Properties class

Currently, an app developer can extend the Properties class. However, they need to create their own version of PropertiesStorage to handle the customized structure. We should consider enhancing PropertiesStorage to support third-party extensions without the need to duplicate the entire code within the app.

A potential solution could be to make PropertiesStorage more flexible and adaptable by generalizing it to operate over a generic P type, rather than having it tightly coupled with the Properties type. This would facilitate easier adaptability and integration with various data structures.

Once we've generalized the PropertiesStorage, we can proceed to shift the existing Properties structure into the ARK Shelf repository.
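To illustrate the generalization, here is a sketch of a storage parameterized over P, where the app supplies the function that merges two values for the same key. GenericStorage, mergeIn and LinkProperties are hypothetical names and fields, not the actual PropertiesStorage API.

```kotlin
// Hypothetical generalization: P is any app-defined properties type,
// and `merge` combines two values stored under the same key.
class GenericStorage<K, P>(private val merge: (P, P) -> P) {
    private val values = mutableMapOf<K, P>()

    fun get(key: K): P? = values[key]

    // Merges the new value into the existing one, or stores it as-is.
    fun mergeIn(key: K, value: P) {
        val old = values[key]
        values[key] = if (old == null) value else merge(old, value)
    }
}

// Example of an app-defined type; the field names are illustrative only.
data class LinkProperties(val titles: Set<String>, val descriptions: Set<String>)
```

An app would then instantiate the storage with its own type and merge rule, for example a field-wise set union for LinkProperties.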

Group generated data under `cache` folder

All output from new Processor classes should be grouped under .ark/cache folder.

E.g. .ark/previews should be moved into .ark/cache/previews.
Same with metadata and thumbnails.

Other processor classes could be added in future.

Continuous storage synchronization

Right now, internal and external changes are synchronized only when the storage is written. In this case, the model is merged with the file content. This is done because the storage file can have updates from external devices.

  • Sync deletes from remote devices
    There is no conflict resolution though, only a plain merge. As a result, a user can't delete any value while any of the devices keeps the app in memory. All deletes will be ignored because we perform deletes only on the local model and storage files, but the storage file will have the deleted value restored from the model on another device.

  • Sync storage continuously
    When a client requests storage values for a key (e.g. getTags(id: ResourceId)), the value is taken solely from the model. The model needs to be updated with external changes in advance. External changes are not merged "on the fly" yet.

Migrate ResourceId from crc32 to (crc32, filesize)

ArkLib defines ResourceId as 2 values of Long type:

  1. CRC-32 checksum
  2. Size of the file

Navigator and arklib-android define ResourceId as just the CRC-32 checksum:

typealias ResourceId = Long

We need to make Navigator use exactly the same definition as ArkLib.
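On the Kotlin side, the two-value id could be modeled as follows. This is a sketch: the field names and the fromString helper are assumptions, and the string form "<dataSize>-<crc32>" is inferred from examples such as 88363-5839051 elsewhere on this page.

```kotlin
// Assumption: the printed form is "<dataSize>-<crc32>", inferred from
// examples like 88363-5839051; field names are illustrative.
data class ResourceId(val dataSize: Long, val crc32: Long) {
    override fun toString(): String = "$dataSize-$crc32"

    companion object {
        // Parses the dash-delimited form back into a ResourceId.
        fun fromString(str: String): ResourceId {
            val parts = str.split("-")
            require(parts.size == 2) { "Malformed resource id: $str" }
            return ResourceId(parts[0].toLong(), parts[1].toLong())
        }
    }
}
```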

Stats storage: labeling statistics

See ARK-Builders/ark-android#16 for the context.
This issue covers metrics 2 and 4 from the list in that issue:

  1. How many times a resource was labeled with the tag
    .ark/stats/tag-labeled-n
  2. How recently a resource was labeled with the tag
    .ark/stats/tag-labeled-ts

Storing the stats

We need to create a new section of the persisted/replicated storage (.ark folder) for tag statistics and update the corresponding files every time we label a resource with a tag:

  • .ark/stats/tag-labeled-n for the total number of times a tag was used; it is a mapping Tag -> Int, where values cannot be negative
  • .ark/stats/tag-labeled-ts for the most recent time each tag was used, although in fact we can maintain it without timestamps by storing the order in which tags were used: if we labeled a resource with tag T right now, we put T on top of the stack, so just List[Tag] is good enough

Using the stats

These stats could be used as metrics for sorting tags in the tag selector. The tag-labeled-ts stats are more important here, since it would be very handy to see the most recently used tags on top. At the same time, tag-labeled-n is less important but could also be handy, because it would push to the top those tags that are used for temporary labeling, e.g. a todo tag.
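An in-memory sketch of the two statistics and the orderings they give. TagStats and its methods are hypothetical names; persisting to the .ark/stats files is left out.

```kotlin
// Hypothetical in-memory model of the two stats files.
class TagStats {
    // Corresponds to tag-labeled-n: tag -> number of labelings, never negative.
    private val counts = mutableMapOf<String, Int>()

    // Corresponds to tag-labeled-ts: most recently used tag first.
    private val recent = ArrayDeque<String>()

    // Called every time a resource is labeled with the tag.
    fun onLabeled(tag: String) {
        counts[tag] = (counts[tag] ?: 0) + 1
        recent.remove(tag)   // move the tag to the top of the stack
        recent.addFirst(tag)
    }

    fun count(tag: String): Int = counts[tag] ?: 0

    // Orderings for the tag selector.
    fun byRecency(): List<String> = recent.toList()
    fun byCount(): List<String> =
        counts.keys.sortedByDescending { counts.getValue(it) }
}
```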

Handle index construction errors

In the file RootIndex.kt:

        if (!BindingIndex.load(root)) {
            Log.e(
                RESOURCES_INDEX,
                "Couldn't provide index from $path"
            )
            throw UnknownError()
        }

We should report the error in a way that lets the dependent app (Navigator, Shelf) report it to the user.
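One option is a dedicated exception type returned through Result, instead of java.lang.UnknownError, which is an Error that apps normally do not catch. The sketch below is only an illustration: IndexLoadException and provideIndex are hypothetical, and load stands in for the native call that returns false on failure.

```kotlin
// Hypothetical error type that an app can catch and show to the user.
class IndexLoadException(val root: String, cause: Throwable? = null) :
    Exception("Couldn't provide index from $root", cause)

// `load` stands in for the native call that returns false on failure.
// The String in the result is a placeholder for the real index object.
fun provideIndex(root: String, load: (String) -> Boolean): Result<String> =
    if (load(root)) Result.success(root)
    else Result.failure(IndexLoadException(root))
```

An app could then call provideIndex(root, loader).onFailure { ... } and display the message to the user.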

Separate generated metadata from user metadata

In our model, we deal with two distinct types of metadata:
1. Generated metadata.
This refers to metadata extracted directly from the data. For example, video dimensions and duration. It's okay if we partially or completely lose this metadata as it can always be re-generated, much like previews. It's crucial that this metadata is generated deterministically, ensuring it's created the same way on any device.

2. User-defined metadata.
This refers to metadata created by the user. This could include tags, scores, or more specific attributes such as a link or document's title. Losing this data is not an option. We must ensure the storage is secure and synchronized. Unlike generated metadata, we cannot regenerate or restore this type of data.

We need to implement a storage for the second kind of metadata.

Pluggable resource kinds

The plain enum ResourceKind type should be replaced by a registry: a map from some id type I to bundles of code as values.

The bundle of code should include implementations of the MetadataExtractor and PreviewGenerator interfaces.

This would allow us to move concrete resource kinds into external dependencies, defined by the apps themselves. The kind identifier type I should be something unique, like a byte vector or a string. Different apps should be able to use the same resource kind just by using the same library defining it.
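A sketch of such a registry with String as the identifier type I. The two interfaces are simplified stand-ins with assumed signatures; the real MetadataExtractor and PreviewGenerator may look different, and KindBundle and KindRegistry are hypothetical names.

```kotlin
// Simplified stand-ins; the real interfaces may have different signatures.
fun interface MetadataExtractor {
    fun extract(path: String): Map<String, String>
}

fun interface PreviewGenerator {
    fun generate(path: String): ByteArray
}

// The bundle of code registered for one resource kind.
class KindBundle(val extractor: MetadataExtractor, val previews: PreviewGenerator)

// Registry replacing the plain enum: kind id -> bundle of code.
class KindRegistry {
    private val kinds = mutableMapOf<String, KindBundle>()

    fun register(id: String, bundle: KindBundle) {
        require(id !in kinds) { "Kind already registered: $id" }
        kinds[id] = bundle
    }

    fun find(id: String): KindBundle? = kinds[id]
}
```

A library defining a kind would expose its id and bundle, and each app would register the kinds it depends on at startup.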

Encrypted storage

Binary interface of FolderStorage looks like a good fit for this.

Don't pass all resources from index to storages multiple times

Somehow, storages depend on knowing all existing resources. This should be reworked in a more performant way.

interface ResourceIndex {
    ...
    // we pass all known resource ids to a storage because
    // 1) any storage exists globally
    // 2) we maintain only 1 storage per root
    // 3) every storage is initialized with resource ids
    suspend fun allIds(): Set<ResourceId> = allIds(null)
    ...
}
