fs.readdir() with filter, recursion, absolute paths, promises, streams, and more!

Home Page: https://jstools.dev/readdir-enhanced/

License: MIT License


readdir-enhanced's Introduction

Enhanced fs.readdir()


Features

  • Multiple APIs: synchronous, callback, Promise/async-await, async iterator, and streaming
  • Optional recursion into subdirectories, with depth control
  • Filtering by glob pattern, regular expression, or custom function
  • fs.Stats results, custom base paths, custom path separators, and pluggable fs methods
  • Fully backward compatible with Node's built-in fs.readdir()

Example

import readdir from "@jsdevtools/readdir-enhanced";
import through2 from "through2";

// Synchronous API
let files = readdir.sync("my/directory");

// Callback API
readdir.async("my/directory", (err, files) => { ... });

// Promises API
readdir.async("my/directory")
  .then((files) => { ... })
  .catch((err) => { ... });

// Async/Await API
let files = await readdir.async("my/directory");

// Async Iterator API
for await (let item of readdir.iterator("my/directory")) {
  ...
}

// EventEmitter API
readdir.stream("my/directory")
  .on("data", (path) => { ... })
  .on("file", (path) => { ... })
  .on("directory", (path) => { ... })
  .on("symlink", (path) => { ... })
  .on("error", (err) => { ... });

// Streaming API
let stream = readdir.stream("my/directory")
  .pipe(through2.obj(function(data, enc, next) {
    console.log(data);
    this.push(data);
    next();
  }));

Installation

Install using npm:

npm install @jsdevtools/readdir-enhanced

Pick Your API

Readdir Enhanced has multiple APIs, so you can pick whichever one you prefer. Here are some things to consider about each API:

Function                                      Returns          Syntax                                     Blocks the thread?  Buffers results?
readdir.sync() / readdirSync()                Array            Synchronous                                yes                 yes
readdir() / readdir.async() / readdirAsync()  Promise          async/await, Promise.then(), callback      no                  yes
readdir.iterator() / readdirIterator()        Async Iterator   for await...of                             no                  no
readdir.stream() / readdirStream()            Readable Stream  stream.on("data"), stream.read(), .pipe()  no                  no

Blocking the Thread

The synchronous API blocks the thread until all results have been read. Only use this if you know the directory does not contain many items, or if your program needs the results before it can do anything else.
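
To illustrate (a minimal sketch; "my/directory" stands in for a real path), a zero-delay timer scheduled before a synchronous read can only fire after the entire read has completed, because the event loop is blocked:

import readdir from "@jsdevtools/readdir-enhanced";

// Schedule a timer, then read synchronously. The timer callback cannot run
// until readdir.sync() returns, because the synchronous read blocks the
// event loop for the whole directory traversal.
setTimeout(() => console.log("timer fired"), 0);

let files = readdir.sync("my/directory", { deep: true });
console.log(`read ${files.length} items`);

// Output order:
// => read N items
// => timer fired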

Buffered Results

Some APIs buffer the results, which means you get all the results at once (as an array). This can be more convenient to work with, but it can also consume a significant amount of memory, depending on how many results there are. The non-buffered APIs return each result to you one-by-one, which means you can start processing the results even while the directory is still being read.
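
As a rough comparison (a sketch; "my/directory" is a placeholder, and the options object is passed as the second parameter as described below), the buffered Promise API hands you one big array, while the iterator API yields entries as they are read:

import { readdirAsync, readdirIterator } from "@jsdevtools/readdir-enhanced";

// Buffered: nothing is available until every entry has been read,
// and the entire result set is held in memory as a single array.
let files = await readdirAsync("my/directory", { deep: true });
console.log(`got all ${files.length} paths at once`);

// Non-buffered: each entry is yielded as soon as it's read, so processing
// can start immediately and memory usage stays roughly constant.
for await (let path of readdirIterator("my/directory", { deep: true })) {
  console.log(`processing ${path}`);
}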

Alias Exports

The example above imports the readdir default export and uses its properties, such as readdir.sync or readdir.async, to call specific APIs. For convenience, each of the different APIs is also exported as a named function that you can import directly.

  • readdir.sync() is also exported as readdirSync()
  • readdir.async() is also exported as readdirAsync()
  • readdir.iterator() is also exported as readdirIterator()
  • readdir.stream() is also exported as readdirStream()

Here's how to import named exports rather than the default export:

import { readdirSync, readdirAsync, readdirIterator, readdirStream } from "@jsdevtools/readdir-enhanced";

Enhanced Features

Readdir Enhanced adds several features to the built-in fs.readdir() function. All of the enhanced features are opt-in, which makes Readdir Enhanced fully backward compatible by default. You can enable any of the features by passing an options argument as the second parameter.

Crawl Subdirectories

By default, Readdir Enhanced will only return the top-level contents of the starting directory. But you can set the deep option to recursively traverse the subdirectories and return their contents as well.

Crawl ALL subdirectories

The deep option can be set to true to traverse the entire directory structure.

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", {deep: true}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1/file.txt
  // => subdir1/subdir2
  // => subdir1/subdir2/file.txt
  // => subdir1/subdir2/subdir3
  // => subdir1/subdir2/subdir3/file.txt
});

Crawl to a specific depth

The deep option can be set to a number to only traverse that many levels deep. For example, calling readdir("my/directory", {deep: 2}) will return subdir1/file.txt and subdir1/subdir2/file.txt, but it won't return subdir1/subdir2/subdir3/file.txt.

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", {deep: 2}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1/file.txt
  // => subdir1/subdir2
  // => subdir1/subdir2/file.txt
  // => subdir1/subdir2/subdir3
});

Crawl subdirectories by name

For simple use-cases, you can use a regular expression or a glob pattern to crawl only the directories whose path matches the pattern. The path is relative to the starting directory by default, but you can customize this via options.basePath.

NOTE: Glob patterns always use forward-slashes, even on Windows. This does not apply to regular expressions, though. Regular expressions should use the appropriate path separator for the environment. Or, you can match both types of separators using [\\/].

import readdir from "@jsdevtools/readdir-enhanced";

// Only crawl the "lib" and "bin" subdirectories
// (notice that the "node_modules" subdirectory does NOT get crawled)
readdir("my/directory", {deep: /lib|bin/}, (err, files) => {
  console.log(files);
  // => bin
  // => bin/cli.js
  // => lib
  // => lib/index.js
  // => node_modules
  // => package.json
});

Custom recursion logic

For more advanced recursion, you can set the deep option to a function that accepts an fs.Stats object and returns a truthy value if the directory should be crawled.

NOTE: The fs.Stats object that's passed to the function has additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

// Crawl all subdirectories, except "node_modules"
function ignoreNodeModules (stats) {
  return stats.path.indexOf("node_modules") === -1;
}

readdir("my/directory", {deep: ignoreNodeModules}, (err, files) => {
  console.log(files);
  // => bin
  // => bin/cli.js
  // => lib
  // => lib/index.js
  // => node_modules
  // => package.json
});
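
Because the stats object also exposes a depth property, a deep function can combine name checks with a depth limit. Here's a sketch (the cutoff of 2 is arbitrary; the depth values follow the library's counting described in the note above):

import readdir from "@jsdevtools/readdir-enhanced";

// Never crawl "node_modules", and only crawl other directories that are
// fewer than 2 levels beneath the base path (see options.deep).
function shallowCrawl(stats) {
  return stats.depth < 2 && !stats.path.includes("node_modules");
}

readdir("my/directory", {deep: shallowCrawl}, (err, files) => {
  console.log(files);
});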

Filtering

The filter option lets you limit the results based on any criteria you want.

Filter by name

For simple use-cases, you can use a regular expression or a glob pattern to filter items by their path. The path is relative to the starting directory by default, but you can customize this via options.basePath.

NOTE: Glob patterns always use forward-slashes, even on Windows. This does not apply to regular expressions, though. Regular expressions should use the appropriate path separator for the environment. Or, you can match both types of separators using [\\/].

import readdir from "@jsdevtools/readdir-enhanced";

// Find all .txt files
readdir("my/directory", {filter: "*.txt"});

// Find all package.json files
readdir("my/directory", {filter: "**/package.json", deep: true});

// Find everything with at least one number in the name
readdir("my/directory", {filter: /\d+/});

Custom filtering logic

For more advanced filtering, you can specify a filter function that accepts an fs.Stats object and returns a truthy value if the item should be included in the results.

NOTE: The fs.Stats object that's passed to the filter function has additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

// Only return file names containing an underscore
function myFilter(stats) {
  return stats.isFile() && stats.path.indexOf("_") >= 0;
}

readdir("my/directory", {filter: myFilter}, (err, files) => {
  console.log(files);
  // => __myFile.txt
  // => my_other_file.txt
  // => img_1.jpg
  // => node_modules
});

Get fs.Stats objects instead of strings

All of the Readdir Enhanced functions listed above return an array of strings (paths). But in some situations, the path isn't enough information. Setting the stats option returns an array of fs.Stats objects instead of path strings. The fs.Stats object contains all sorts of useful information, such as the size, the creation date/time, and helper methods such as isFile(), isDirectory(), isSymbolicLink(), etc.

NOTE: The fs.Stats objects that are returned also have additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", { stats: true }, (err, stats) => {
  for (let stat of stats) {
    console.log(`${stat.path} was created at ${stat.birthtime}`);
  }
});

Base Path

By default all Readdir Enhanced functions return paths that are relative to the starting directory. But you can use the basePath option to customize this. The basePath will be prepended to all of the returned paths. One common use-case for this is to set basePath to the absolute path of the starting directory, so that all of the returned paths will be absolute.

import readdir from "@jsdevtools/readdir-enhanced";
import { resolve } from "path";

// Get absolute paths
let absPath = resolve("my/directory");
readdir("my/directory", {basePath: absPath}, (err, files) => {
  console.log(files);
  // => /absolute/path/to/my/directory/file1.txt
  // => /absolute/path/to/my/directory/file2.txt
  // => /absolute/path/to/my/directory/subdir
});

// Get paths relative to the working directory
readdir("my/directory", {basePath: "my/directory"}, (err, files) => {
  console.log(files);
  // => my/directory/file1.txt
  // => my/directory/file2.txt
  // => my/directory/subdir
});

Path Separator

By default, Readdir Enhanced uses the correct path separator for your OS (\ on Windows, / on Linux and macOS). But you can set the sep option to any separator character(s) you want to use instead. This is usually used to ensure consistent path separators across different OSes.

import readdir from "@jsdevtools/readdir-enhanced";

// Always use Windows path separators
readdir("my/directory", {sep: "\\", deep: true}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1\file.txt
  // => subdir1\subdir2
  // => subdir1\subdir2\file.txt
  // => subdir1\subdir2\subdir3
  // => subdir1\subdir2\subdir3\file.txt
});

Custom FS methods

By default, Readdir Enhanced uses Node.js' built-in fs module for methods like fs.stat, fs.readdir, and fs.lstat. But in some situations you may want to use your own FS methods (FTP, SSH, a remote drive, etc.). You can provide your own implementation by setting options.fs, or override specific methods, such as options.fs.stat.

import readdir from "@jsdevtools/readdir-enhanced";

function myCustomReaddirMethod(dir, callback) {
  callback(null, ["__myFile.txt"]);
}

let options = {
  fs: {
    readdir: myCustomReaddirMethod
  }
};

readdir("my/directory", options, (err, files) => {
  console.log(files);
  // => __myFile.txt
});

Backward Compatible

Readdir Enhanced is fully backward-compatible with Node.js' built-in fs.readdir() and fs.readdirSync() functions, so you can use it as a drop-in replacement in existing projects without affecting existing functionality, while still being able to use the enhanced features as needed.

import { readdir, readdirSync } from "@jsdevtools/readdir-enhanced";

// Use it just like Node's built-in fs.readdir function
readdir("my/directory", (er,  files) => { ... });

// Use it just like Node's built-in fs.readdirSync function
let files = readdirSync("my/directory");

A Note on Streams

The Readdir Enhanced streaming API follows the Node.js streaming API, so a lot of questions about it are answered by the Node.js documentation. However, we've tried to answer the most common questions here.

Stream Events

All events in the Node.js streaming API are supported by Readdir Enhanced. These events include "end", "close", "drain", "error", plus more. An exhaustive list of events is available in the Node.js documentation.

Detect when the Stream has finished

Using these events, we can detect when the stream has finished reading files.

import readdir from "@jsdevtools/readdir-enhanced";

// Build the stream using the Streaming API
let stream = readdir.stream("my/directory")
  .on("data", (path) => { ... });

// Listen to the end event to detect the end of the stream
stream.on("end", () => {
  console.log("Stream finished!");
});

Paused Streams vs. Flowing Streams

As with all Node.js streams, a Readdir Enhanced stream starts in "paused mode". For the stream to start emitting files, you'll need to switch it to "flowing mode".

There are many ways to trigger flowing mode, such as attaching a "data" event handler, calling stream.pipe(), or calling stream.resume().

Unless you trigger flowing mode, your stream will stay paused and you won't receive any file events.

More information on paused vs. flowing mode can be found in the Node.js documentation.
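
For example, if you only listen to the type-specific events ("file", "directory", "symlink"), you still need to switch the stream into flowing mode; calling stream.resume() is one rough way to do that (a sketch, relying on the behavior described above that a paused stream emits no file events):

import readdir from "@jsdevtools/readdir-enhanced";

let stream = readdir.stream("my/directory")
  .on("file", (path) => console.log(`file: ${path}`))
  .on("end", () => console.log("done"));

// With no "data" handler and no pipe() destination, the stream stays paused
// and no file events are emitted. resume() switches it into flowing mode.
stream.resume();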

Contributing

Contributions, enhancements, and bug-fixes are welcome! Open an issue on GitHub and submit a pull request.

Building

To build the project locally on your computer:

  1. Clone this repo
    git clone https://github.com/JS-DevTools/readdir-enhanced.git

  2. Install dependencies
    npm install

  3. Run the tests
    npm test

License

Readdir Enhanced is 100% free and open-source, under the MIT license. Use it however you want.

This package is Treeware. If you use it in production, then we ask that you buy the world a tree to thank us for our work. By contributing to the Treeware forest you’ll be creating employment for local families and restoring wildlife habitats.

Big Thanks To

Thanks to these awesome companies for their support of Open Source developers ❤

Travis CI SauceLabs Coveralls

readdir-enhanced's People

Contributors

greenkeeperio-bot, jamesmessinger, mrmlnc, papb


readdir-enhanced's Issues

Ability to use depth in the custom deep function

If we use the deep option as a custom function, we cannot control the depth. Maybe the depth could be provided as an argument to the function?

For example:

const options = {
	deep: (entry: IReaddirEntry, depth: number) => ...
};

Maximum call stack size exceeded with filters

Hello,

And it's me again :)

We have a small problem. When working with a large, nested directory, we often apply a filter. For example, the built-in pattern filter:

const re = require('readdir-enhanced');

let files = null;

try {
    files = re.sync('./node_modules', {
        deep: true,
        filter: '**/test-*'
    });
} catch (error) {
    console.log('MEMORY: ' + (process.memoryUsage().heapUsed / 1e6) + ' MB');

    console.dir(error, { colors: true });
}

console.dir(files, { colors: true });

In the example above, I read a directory that contains 20,320 entries:

$ npm i jest ava babel-core standard eslint typescript tslint monaco xterm readdir-enhanced

After running the script we get the following:

MEMORY: 9.37484 MB

node_modules/readdir-enhanced/lib/call.js:51
      throw err;
      ^

RangeError: Maximum call stack size exceeded
    at Object.fs.lstatSync (fs.js:839:18)
    at exports.lstat (node_modules/readdir-enhanced/lib/sync/fs.js:58:20)
    at Object.safeCall [as safe] (node_modules/readdir-enhanced/lib/call.js:24:8)
    at stat (node_modules/readdir-enhanced/lib/stat.js:19:8)
    at DirectoryReader.processItem (node_modules/readdir-enhanced/lib/directory-reader.js:171:5)
    at node_modules/readdir-enhanced/lib/sync/for-each.js:14:5
    at Array.forEach (native)
    at Object.syncForEach [as forEach] (node_modules/readdir-enhanced/lib/sync/for-each.js:13:9)
    at node_modules/readdir-enhanced/lib/directory-reader.js:80:16
    at onceWrapper (node_modules/readdir-enhanced/lib/call.js:45:17)

If I remove any filter (pattern, function), everything works fine.

Question about reading directories

Hello, @BigstickCarpet,

First, thank you very much for this package. It works faster than its analogues and covers all my needs. 🌮

Have you thought about preventing a directory from being read once it has been filtered out? I don't mean filtering the results; I mean skipping the contents of a directory that has been filtered out.

For example,

filter

function filter(stat) {
  console.log('stat: ', stat.path);
  if (stat.isDirectory() && stat.path === 'a') {
    console.log('Hello from the "a/" directory!');
    return false;
  }
  return true;
}

current behaviour

stat:  a
Hello from the "a/" directory!
stat:  a.txt
stat:  a\b                  // <--- but "a" is filtered
stat:  a\b\b.no-txt    // <--- but "a" is filtered
stat:  a\b\c               // <--- but "a" is filtered
stat:  a\b\c\d           // <--- but "a" is filtered
stat:  a\b\c\d\e.txt   // <--- but "a" is filtered

expected behaviour

stat:  a
Hello from the "a/" directory!
stat:  a.txt

I ask because this may severely degrade performance on very large projects.

Add npm install instructions

Should add something that says

Installation

npm install --save readdir-enhanced

This would also help identify the npm package as genuine.

No files are streamed when you omit the 'data' handler.

I want to build a stream and only listen to files. I'm building it like so:

const files = readdirEnhanced.readdirStreamStat('/tmp/test')

files.on('file', (file) => console.log(file))
files.on('end', () => console.log('end'))

If I run the above code, no files pass through the stream. If I add a data listener (even an empty function), things start moving:

const files = readdirEnhanced.readdirStreamStat('/tmp/test')

files.on('data', () => {
  // no-op
})

files.on('file', (file) => console.log(file))
files.on('end', () => console.log('end'))

Is this intended behaviour? If so, I'm happy to update the documentation a bit to make it clearer why the data handler is needed.

Cheers

Knowing when readdir.stream is finished?

Is there a way with readdir.stream() to know when it is finished? I couldn't see any appropriate event for this or anything in the docs.

Note, I did try .on('finish', function () { }), but this did not get called.

readdir.sync.stat should have option to silently fail

I use the following on Windows:

const readdir = require('readdir-enhanced')
const result = readdir.sync.stat('C:/')

On my system this fails because some files are protected (for instance C:\pagefile.sys). Should there be an option to either skip those errors and return the rest of the file stats, or to return the other files as part of the exception payload?

{filter: '*.txt'} returns empty files array

When I use filter: "*.txt" in my code, it always returns an empty array:

readdir(path1, { basePath: path1, filter: "*.txt", deep: true }, function(err, files) { ... });

If I remove the filter, it shows all the files in the directory, including the subdirectories.
