
datafabricrus / textfile-utils


A simple JVM library with utility methods for working with text files of any size, including merge sorting and binary search. The library is based on Java NIO and Kotlin coroutines.

License: Apache License 2.0

Kotlin 100.00%
binary-search file-utility kotlin kotlin-library mergesort text-files

textfile-utils's People

Contributors

sszuev

Forkers

so-dewy

textfile-utils's Issues

Replace `ResourceIterator` with `AutoCloseableSequence`?

Example:

class AutoCloseableSequenceImpl<X>(private val source: Iterator<X>): Sequence<X>, AutoCloseable {
    override fun iterator(): Iterator<X> = source

    override fun close() {
        TODO("Not yet implemented")
    }
}

fun <T, R> AutoCloseableSequenceImpl<T>.map(transform: (T) -> R): AutoCloseableSequenceImpl<R> {
    val theSequence = this as Sequence<T>
    return AutoCloseableSequenceImpl(theSequence.map(transform).iterator())
}
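
A slightly fuller sketch of the same idea, assuming the sequence should own a closeable resource and that mapping should keep that ownership. `CloseableSeq`, `onClose`, and `mapClosing` are hypothetical names used only for illustration; they are not part of the library.

```kotlin
// Hypothetical sketch: a one-shot sequence that owns a resource released by close().
class CloseableSeq<X>(
    private val source: Iterator<X>,
    private val onClose: () -> Unit,
) : Sequence<X>, AutoCloseable {
    private var closed = false

    // The same iterator is returned every time, so the sequence can be consumed only once.
    override fun iterator(): Iterator<X> = source

    // Idempotent: the release action runs at most once.
    override fun close() {
        if (!closed) {
            closed = true
            onClose()
        }
    }
}

// Named mapClosing (not map) so it cannot be confused with the stdlib Sequence.map.
fun <T, R> CloseableSeq<T>.mapClosing(transform: (T) -> R): CloseableSeq<R> {
    val mapped = this.map(transform).iterator()
    // Closing the mapped sequence closes the original one.
    return CloseableSeq(mapped) { this.close() }
}
```

Because the mapped sequence delegates `close()` to the one it was built from, a caller only needs to close the last sequence in the chain.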

OOM when using sort on a 21 MB file

A reproducer test can be found in a commit here: https://github.com/so-dewy/textfile-utils

How to reproduce

  1. In src/test/kotlin/MergeSortTest.kt add a new test:
    @Test
    fun `test sort oom`(@TempDir dir: Path) {
        testDefaultSortResourceFile(dir, "/oom.nt")
    }
  2. In /src/test/resources add the file oom.nt.tar.gz
  3. Run the test from step 1

Stacktrace:

org.gradle.api.internal.tasks.testing.TestSuiteExecutionException: Could not complete execution for Gradle Test Executor 5.
	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:64)
	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at [email protected]/java.lang.reflect.Method.invoke(Method.java:566)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
	at com.sun.proxy.$Proxy2.stop(Unknown Source)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker$3.run(TestWorker.java:193)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
	at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:113)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.concurrent.ArrayBlockingQueue.<init>(ArrayBlockingQueue.java:270)
	at java.base/java.util.concurrent.ArrayBlockingQueue.<init>(ArrayBlockingQueue.java:254)
	at cc.datafabric.textfileutils.files.AsyncLineReader.<init>(LineReader.kt:325)
	at cc.datafabric.textfileutils.files.LineReaderKt.asyncReadByteLines(LineReader.kt:204)
	at cc.datafabric.textfileutils.files.LineReaderKt.readLines(LineReader.kt:148)
	at cc.datafabric.textfileutils.files.LineReaderKt.readLines$default(LineReader.kt:97)
	at cc.datafabric.textfileutils.files.FileMergeKt$mergeFilesInverse$10$1.invoke(FileMerge.kt:234)
	at cc.datafabric.textfileutils.files.FileMergeKt$mergeFilesInverse$10$1.invoke(FileMerge.kt:228)
	at cc.datafabric.textfileutils.files.FilesKt.use(Files.kt:245)
	at cc.datafabric.textfileutils.files.FilesKt.use$default(Files.kt:236)
	at cc.datafabric.textfileutils.files.FileMergeKt$mergeFilesInverse$10.invoke(FileMerge.kt:228)
	at cc.datafabric.textfileutils.files.FileMergeKt$mergeFilesInverse$10.invoke(FileMerge.kt:227)
	at cc.datafabric.textfileutils.files.FilesKt.use(Files.kt:229)
	at cc.datafabric.textfileutils.files.FilesKt.use$default(Files.kt:223)
	at cc.datafabric.textfileutils.files.FileMergeKt.mergeFilesInverse(FileMerge.kt:227)
	at cc.datafabric.textfileutils.files.FileMergeKt.mergeFilesInverse(FileMerge.kt:126)
	at cc.datafabric.textfileutils.files.FileMergeKt.mergeFilesInverse$default(FileMerge.kt:100)
	at cc.datafabric.textfileutils.files.MergeSortKt.suspendSort(MergeSort.kt:304)
	at cc.datafabric.textfileutils.files.MergeSortKt.suspendSort(MergeSort.kt:208)
	at cc.datafabric.textfileutils.files.MergeSortKt$blockingSort$1.invokeSuspend(MergeSort.kt:118)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108)
	at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:115)
	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:103)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:793)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:697)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:684)
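
The innermost frames show the error being thrown from the `ArrayBlockingQueue` constructor, which allocates its whole backing array up front. This may mean the requested capacity was too large, or only that the heap was already nearly exhausted at that point; the trace alone does not say which. The following is only a sketch of one possible mitigation, assuming the queue capacity comes from a computed size. `boundedQueue`, `requestedCapacity`, and `maxCapacity` are hypothetical names, not the library's API.

```kotlin
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical helper: clamp a computed capacity before the queue allocates its array.
fun <T : Any> boundedQueue(requestedCapacity: Long, maxCapacity: Int = 1024): ArrayBlockingQueue<T> {
    require(maxCapacity > 0) { "maxCapacity must be positive" }
    // ArrayBlockingQueue(capacity) eagerly allocates an array with `capacity` slots,
    // so the value is limited to the range 1..maxCapacity first.
    val capacity = requestedCapacity.coerceIn(1L, maxCapacity.toLong()).toInt()
    return ArrayBlockingQueue<T>(capacity)
}
```

With a small bounded queue the producer blocks when the consumer falls behind, which trades some throughput for a predictable memory footprint.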

BinarySearch: Possibly incorrect calculation result if no rows are found

Looks suspicious, needs review:

https://github.com/DataFabricRus/textfile-utils/blob/main/src/main/kotlin/files/ByteArrays.kt#L174

            if (!includeLeftBound && low == sourceStartInclusive) {
                val endExclusive = (high - 1)
                return Lines(startInclusive = -1, endExclusive = endExclusive, lines = emptyList())
            }
            if (!includeRightBound && high == sourceEndExclusive) {
                // end
                val startInclusive = low
                return Lines(startInclusive = startInclusive, endExclusive = -1, lines = emptyList())
            }
            val insertPosition = high
            return Lines(startInclusive = insertPosition, endExclusive = insertPosition, lines = emptyList())
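
For comparison, a minimal sketch of one conventional contract for the not-found case: the search returns an empty range whose start and end both equal the insert position. It works on an in-memory sorted list rather than on the library's byte buffers, and it does not model the `includeLeftBound` / `includeRightBound` flags. `Range`, `partitionPoint`, and `findRange` are hypothetical names.

```kotlin
// Hypothetical result type: a half-open index range [startInclusive, endExclusive).
data class Range(val startInclusive: Int, val endExclusive: Int)

// Index of the first element for which `pred` is false.
// `pred` must hold for a (possibly empty) prefix of the list and for nothing after it.
fun <T> List<T>.partitionPoint(pred: (T) -> Boolean): Int {
    var low = 0
    var high = size
    while (low < high) {
        val mid = (low + high) ushr 1
        if (pred(this[mid])) low = mid + 1 else high = mid
    }
    return low
}

// Range of elements equal to `key` in a sorted list.
// If `key` is absent, the range is empty and both bounds equal the insert position.
fun <T : Comparable<T>> findRange(sorted: List<T>, key: T): Range {
    val start = sorted.partitionPoint { it < key }
    val end = sorted.partitionPoint { it <= key }
    return Range(start, end)
}
```

Under this contract the bounds never take the value -1, so a caller can treat `startInclusive == endExclusive` as the only signal that nothing was found.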
