somewhatabstract / checksync

A tool for detecting when related text blocks change

License: MIT License

JavaScript 3.26% Python 0.31% TypeScript 96.43%

linting tools tools-engineering synchronization developer-tools development-workflow development-utils tool code-quality

checksync's Introduction

checksync

Usage

You can install checksync if you want, but the easiest way to use it is via npx.

npx checksync --help

For detailed usage information, run npx checksync --help.

Example workflow

Add synchronization tags to files indicating what sections to synchronize and with which files:

// my-javascriptfile.js
// sync-start:mysyncid ./my-pythonfile.py
/**
 * Some code that needs to be synchronised.
 */
// sync-end:mysyncid

# my-pythonfile.py
# sync-start:mysyncid ./my-javascriptfile.js
'''
Some code that needs to be synchronised.
'''
# sync-end:mysyncid

Use consecutive sync-start tags with the same identifier to target multiple files.

// my-csharpfile.cs
// sync-start:mysyncid ./my-pythonfile.py
// sync-start:mysyncid ./my-javascriptfile.js
/**
 * Some code that needs to be synchronised.
 */
// sync-end:mysyncid

Run checksync to verify the tags are correct:
```
yarn checksync <globs|files|dirs>
```
Run with --update-tags or -u to automatically insert the missing checksums:
```
yarn checksync -u <globs|files|dirs>
```
Add a pre-commit step that runs checksync when committing changes so that you catch when synchronized blocks change. You can do this using a package like husky or pre-commit.
Commit your tagged files!

To get more information about the various arguments that checksync supports as well as information about sync-tags, run yarn checksync --help.

Target file paths

All target paths are relative to your project root directory. By default, checksync uses ancesdir to determine this root: it is the ancestor directory of the files being processed that contains package.json. If you want to specify a different root (for example, if you're syncing across multiple packages in a monorepo), you can specify a custom marker name using the --root-marker argument.
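
As a rough illustration of marker-based root discovery, the sketch below walks up from a starting directory until it finds one containing the marker file. It is not ancesdir's implementation; the function name `findRootDir`, the injectable `exists` parameter, and the choice to stop at the nearest match are assumptions made for this example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: walks up from startDir looking for a directory that
// contains the marker file. Returns null if the filesystem root is reached.
function findRootDir(
    startDir: string,
    marker: string = "package.json",
    exists: (p: string) => boolean = fs.existsSync,
): string | null {
    let dir = path.resolve(startDir);
    for (;;) {
        if (exists(path.join(dir, marker))) {
            return dir;
        }
        const parent = path.dirname(dir);
        if (parent === dir) {
            // path.dirname returns its input at the filesystem root.
            return null;
        }
        dir = parent;
    }
}
```

Passing a different `marker` value here plays the same role as the --root-marker argument does for checksync.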

Contributing

For details on contributing to checksync, check out our contribution guidelines.

checksync's People

Contributors

Stargazers

Watchers

Forkers

jeremywiebe kevinbarabash nexzhu

checksync's Issues

Add node_module caching to build jobs

We can improve our repeat build speeds if we pre-cache node_modules.

Add an option to output detected issues as JSON

Is your feature request related to a problem? Please describe.
I'd like to use checksync to build an eslint plugin for linting sync tags.

Describe the solution you'd like
A --json option that would output any issues detected in a file as a JSON blob.

Describe alternatives you've considered
I considered a direct API, but implementing a --json option should be easier since it shouldn't require any changes to how checksync is packaged.

Allow more than one ignore file

Currently --ignore-file takes a single file. Would be nice to take a list of files (quoted, comma-delimited?).
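
If the comma-delimited form were chosen, splitting the argument could look like the sketch below. The function name `parseIgnoreFiles` is hypothetical, and the trimming and empty-entry handling are assumptions rather than anything checksync is known to do.

```typescript
// Hypothetical helper: splits a comma-delimited argument value into a list
// of ignore-file paths, trimming whitespace and dropping empty entries.
function parseIgnoreFiles(arg: string): string[] {
    return arg
        .split(",")
        .map((entry) => entry.trim())
        .filter((entry) => entry.length > 0);
}
```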

Update --json to report more errors

Is your feature request related to a problem? Please describe.
There are a bunch of errors that we log in marker-parser.js that would be nice to include in the JSON output.

Describe the solution you'd like
I'm not sure what the implementation should look like.

Describe alternatives you've considered
None.

Improve readme and help around tagging

We should have some more examples of how tags work, especially regarding multiple files targeting one another on the same content.

Regarding the readme, some images may help illustrate much more clearly.

Console output should be grouped by message type

Many tools, like linters and such, will group the output by the type of error. Currently, checksync outputs in file order. Instead, it should output in message type order such that warnings, errors, mismatches, etc. are grouped together. It may also be useful to group by the message too so that like messages are together.

This may possibly be a mode of output since not all users may want it. This new output would perhaps drop some of the duplicated messaging to give a clearer overview.

In the new formatting coming in v5 (see #1260), one gets:

Mismatch __examples__/content_after_start/b.py:4
Looks like you changed the target content for sync-tag 'content_after_start' in '__examples__/content_after_start/c.js:3'
Make sure you've made the parallel changes in the source file, if necessary (No checksum != 770446101)

Error    __examples__/content_after_start/c.js:5
Sync-start for 'content_after_start' found after content started

Mismatch __examples__/content_after_start/c.js:3
Looks like you changed the target content for sync-tag 'content_after_start' in '__examples__/content_after_start/b.py:4'
Make sure you've made the parallel changes in the source file, if necessary (No checksum != 249234014)

Error    __examples__/content_after_start/c.js:5
No return tag named 'content_after_start' in '__examples__/content_after_start/a.js'

Error    __examples__/directory_target/example.js:3
Sync-start for 'directory_target' points to '__examples__/directory_target/a_directory', which does not exist or is a directory

Error    __examples__/directory_target/example.js:3
No return tag named 'directory_target' in '__examples__/directory_target/a_directory'

Warning  __examples__/duplicate-target/a.js:5
Duplicate target for sync-tag 'update_me'

Mismatch __examples__/duplicate-target/a.js:4
Looks like you changed the target content for sync-tag 'update_me' in '__examples__/duplicate-target/b.py:3'
Make sure you've made the parallel changes in the source file, if necessary (12352 != 249234014)

Instead, a format like the following might be preferred:

Mismatch
  __examples__/content_after_start/b.py:4
  'content_after_start' -> '__examples__/content_after_start/c.js:3' (No checksum != 770446101)

  __examples__/content_after_start/c.js:3
  'content_after_start' -> '__examples__/content_after_start/b.py:4' (No checksum != 249234014)

  __examples__/duplicate-target/a.js:4
  'update_me' -> '__examples__/duplicate-target/b.py:3' (12352 != 249234014)

Error
  __examples__/content_after_start/c.js:5
  Sync-start for 'content_after_start' found after content started

  __examples__/content_after_start/c.js:5
  No return tag named 'content_after_start' in '__examples__/content_after_start/a.js'

  __examples__/directory_target/example.js:3
  Sync-start for 'directory_target' target '__examples__/directory_target/a_directory' not found

  __examples__/directory_target/example.js:3
  No return tag named 'directory_target' in '__examples__/directory_target/a_directory'

Warning
  __examples__/duplicate-target/a.js:5
  Duplicate target for sync-tag 'update_me'
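
One way to produce this grouped layout is sketched below. The `Message` shape, the function names, and the fixed Mismatch, Error, Warning ordering are assumptions for illustration; they are not checksync's actual types.

```typescript
type MessageType = "Mismatch" | "Error" | "Warning";

// Hypothetical message shape for this sketch.
interface Message {
    type: MessageType;
    location: string; // e.g. "__examples__/duplicate-target/a.js:5"
    text: string;
}

// Buckets messages by type, preserving the original order within each type.
function groupByType(messages: readonly Message[]): Map<MessageType, Message[]> {
    const groups = new Map<MessageType, Message[]>();
    for (const message of messages) {
        const group = groups.get(message.type);
        if (group) {
            group.push(message);
        } else {
            groups.set(message.type, [message]);
        }
    }
    return groups;
}

// Renders one section per message type, skipping types with no messages.
function formatGrouped(messages: readonly Message[]): string {
    const order: MessageType[] = ["Mismatch", "Error", "Warning"];
    const groups = groupByType(messages);
    const sections: string[] = [];
    for (const type of order) {
        const group = groups.get(type);
        if (!group) {
            continue;
        }
        const entries = group.map((m) => `  ${m.location}\n  ${m.text}`);
        sections.push(`${type}\n${entries.join("\n\n")}`);
    }
    return sections.join("\n\n");
}
```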

Add support for Node 14

.checksyncrc file

It would be really helpful if the default args could be overridden with a file.

Speed up by not globbing file paths

Currently, when run with paths that target specific files, we still treat those paths as globs. Instead, we should first check whether each path we are asked to run against is actually a glob. Something like "if it does not contain a * and is an existing file, don't glob".
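
The check described above might be sketched as follows. The names `needsGlobbing` and `isExistingFile` are hypothetical, and only the `*` character is tested because that is all the rule above mentions; a real check would probably need to consider other glob syntax too.

```typescript
import * as fs from "node:fs";

// Returns true only when the path exists and is a regular file.
function isExistingFile(p: string): boolean {
    try {
        return fs.statSync(p).isFile();
    } catch {
        return false;
    }
}

// Hypothetical helper: a path can skip globbing when it has no "*" and
// names an existing file; everything else still goes through the globber.
function needsGlobbing(
    p: string,
    isFile: (p: string) => boolean = isExistingFile,
): boolean {
    return p.indexOf("*") !== -1 || !isFile(p);
}
```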

Duplicate targets should be differentiated by text and line number

Duplicate targets are going to have the same text and therefore share the same auto-fixes which breaks the way our autofixer works. The autofixer should track line numbers and use that to differentiate the targets.

In other words, if a duplicate target has exactly the same text as its predecessor then they will share their fixes, which isn't really what we want since if the first fix is to delete as a duplicate, it might delete both! A TODO is in place to look at addressing this by using both text and line number to apply the correct fix.

See

checksync/src/fix-file.js

Lines 75 to 79 in d96418a

    
    // If there are multiple lines with the exact same text,
    // they will both receive the same fix till we track the line
    // number.
    // TODO: Track the line number so we can work with duplicate
    // text.
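
A minimal sketch of keying fixes by line number rather than by text is shown below. The `Fix` shape and `applyFixes` are hypothetical and do not reflect how fix-file.js is actually structured; the point is only that two lines with identical text can receive different fixes.

```typescript
// Hypothetical fix record: a null replacement means "delete this line".
interface Fix {
    lineNumber: number; // 1-based
    replacement: string | null;
}

// Applies fixes by line number, so identical lines are treated independently.
function applyFixes(lines: readonly string[], fixes: readonly Fix[]): string[] {
    const byLine = new Map<number, string | null>();
    for (const fix of fixes) {
        byLine.set(fix.lineNumber, fix.replacement);
    }
    const output: string[] = [];
    lines.forEach((line, index) => {
        const lineNumber = index + 1;
        if (!byLine.has(lineNumber)) {
            output.push(line);
            return;
        }
        const replacement = byLine.get(lineNumber);
        if (replacement != null) {
            output.push(replacement);
        }
        // A null replacement drops the line entirely.
    });
    return output;
}
```

With this approach, deleting the second of two identical tags leaves the first one untouched.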

Verbose mode for logging

There should be a logging level that outputs every file being parsed, etc. so that we can troubleshoot problems.

Might want to consider moving to a logging framework like winston so we can leverage more log capabilities. However, for now, just having a `--verbose` arg that gates our own logging is sufficient.

Support recursive discovery of ignore files

Is your feature request related to a problem? Please describe.
Currently, it automatically uses the .gitignore of the current directory or a list of explicit ignore files via --ignore-files as well as specific ignore paths with -i.

However, it would be helpful if it used the .gitignore, and possibly .eslintignore, files of all the folders that are being processed. In addition, it would be helpful if one could specify the name of ignore files to match recursively.

Since the existing --ignore-files arg is for specific paths to ignore files, this likely requires a new arg or special syntax for --ignore-files paths that indicates they should be matched recursively.

Describe the solution you'd like
Whatever is done, it should avoid a breaking change to existing executions, which would take --ignore-files .gitignore to mean only the current directory's .gitignore and not "every .gitignore".

Installing from commit SHA should ensure the dist is built on install

When this is a package on NPM, we will have the distribution built as part of the package and it will all "just work" after installation. However, when testing fixes or using the tool while it's not on NPM, we install via a commit SHA, which gets the source but does not build it. We should add an install step that checks for the distribution files and, if they don't exist, installs deps and builds the dist.

Something like:
yarn install && rollup -c

Add support for Node 16

Add --no-ignore option

We default --ignore-file to .gitignore but some folks may not want that, so let's add a --no-ignore option.

Output indicating running the command should use the way the command was invoked

Currently, we output checksync <args> when stating how to rerun the tool after a given run. However, just copying and pasting this to the terminal will not work. We should make this more likely to succeed in this case.

Code coverage isn't always working (noticeable in dependabot jobs)

Though the code coverage step succeeds, it doesn't always seem to trigger the codecov check to pass. Possibly due to outdated codecov integration?

Remove or fix the dependabot badge in the README

Output indicating how to autofix does not include all the arguments

Describe the bug
If you run checksync with a marker file and ignore file, for example, not all of the arguments are correctly listed in the output when it explains how to run checksync to auto-fix.

To Reproduce
Try running checksync with an ignore file argument or other non-standard arg that affects what gets processed.

Expected behavior
Should output a command that can be copied and run to auto-fix.
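
One possible approach is to rebuild the suggested command from the arguments the tool was actually given, as in the sketch below. The function names are hypothetical, and the quoting is deliberately naive (it only wraps arguments containing whitespace in double quotes), so treat it as an illustration rather than robust shell escaping.

```typescript
// Wraps an argument in double quotes if it contains whitespace (naive).
function quoteIfNeeded(arg: string): string {
    return /\s/.test(arg) ? `"${arg}"` : arg;
}

// Hypothetical helper: reuses every original argument and adds
// --update-tags unless an update flag is already present.
function buildAutoFixCommand(command: string, args: readonly string[]): string {
    const alreadyUpdating =
        args.indexOf("-u") !== -1 || args.indexOf("--update-tags") !== -1;
    const fixArgs = alreadyUpdating ? args : ["--update-tags", ...args];
    return [command, ...fixArgs.map(quoteIfNeeded)].join(" ");
}
```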

Update mode appears to add newlines to files it shouldn't be writing

Describe the bug
Saw this in the KA webapp repo. It was changing the endings of json, csv, and sometimes even png files. But it should only be writing the files that have a mismatch. Need to debug.

To Reproduce
checksync --ignore-files lint_blacklist.txt,.gitignore -i dev/linters -m .ka_root -u on webapp

Expected behavior
Nothing should change in files that don't have a MISMATCH.

Desktop (please complete the following information):

OS: macOS 10.15.4
Node Version: 10.19.0
checksync Version: 2.0.0

Running after installing with `yarn global add checksync` fails

Describe the bug
I cannot run checksync if I install it with yarn global add checksync. When I run it in a project directory, I get the following error:

(node:50085) UnhandledPromiseRejectionWarning: Error: ENOTDIR: not a directory, scandir '/usr/local/bin/checksync'
(node:50085) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:50085) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

To Reproduce
Steps to reproduce the behavior:

yarn global add checksync
Switch to project dir
Run checksync

I get the above error.

Expected behavior
Checksync runs and shows me sync region issues.

Desktop (please complete the following information):

OS: macOS
Node Version: v12.18.3
checksync Version 2.2.3

Additional context
When launching checksync on my machine, the launchFilePath is /Users/jeremy/dev/checksync/bin/checksync.dev.js (which is the target of the symlink that yarn global installed).

But the process.argv array is:

[
    "/usr/local/Cellar/node@12/12.18.3/bin/node", 
    "/usr/local/bin/checksync" // the actual symlink
]

I have some customizations of my yarn environment as follows:

$ yarn global bin
/usr/local/bin
$ yarn config current
yarn config v1.22.10
{
  "registryFolders": [
    "node_modules"
  ],
  "linkedModules": [],
  "cache": {},
  "cwd": "/Users/jeremy/khan/webapp",
  "looseSemver": true,
  "commandName": "config",
  "preferOffline": false,
  "globalFolder": "/Users/jeremy/.config/yarn/global",
  "linkFolder": "/Users/jeremy/.config/yarn/link",
  "offline": false,
  "binLinks": true,
  "ignorePlatform": false,
  "ignoreScripts": false,
  "disablePrepublish": false,
  "nonInteractive": false,
  "workspaceRootFolder": null,
  "lockfileFolder": "/Users/jeremy/khan/webapp",
  "networkConcurrency": 8,
  "childConcurrency": 5,
  "networkTimeout": 30000,
  "workspacesEnabled": true,
  "workspacesNohoistEnabled": true,
  "pruneOfflineMirror": false,
  "enableMetaFolder": false,
  "enableLockfileVersions": false,
  "linkFileDependencies": false,
  "cacheFolder": "/Users/jeremy/Library/Caches/Yarn/v6",
  "tempFolder": "/Users/jeremy/Library/Caches/Yarn/v6/.tmp",
  "production": false
}
✨  Done in 0.05s.

Checksync uses current working directory for marker location rather than the passed location or the config file

When running checksync from a location, it uses the current working directory for its discovery; but if a config file is given, or a location is passed, it should use that rather than the current working directory.

Otherwise, it's impossible to use outside of the intended directory, which is counter-intuitive.

Support inter-dependency sync tags

Problem

Sometimes, we may want to sync code between different repos in a manner that is non-programmatic. For example, I may have a golang project and a javascript project in separate repos that need to have some common code, but that we don't really want to create some sort of code dependency between them.

In addition, just coming up with a way to reference these things and then doing the sync check "live" would be dependent on network requests or some need for the code to be available locally all at the same time, which would make automated checks such as in GitHub Actions hard, slow, and possibly impossible.

Proposed Solution

The CheckSync tool is already run across all files in a project to "lint" for sync issues and verify all the checksums are correct.
This run outputs a summary of broken sync tags.

We can update CheckSync to:

Output a verbose summary of ALL sync tags, whether broken or not
Allow configuration to import such summaries as a proxy for scanning the real files

With this mechanism, repo A can make available, in a commit, the complete output defining all sync-tags and checksums for its content. Then repo B can reference that file in its config along with the commit SHA of the last repo A commit it was updated from (based on the repo A commit where the summary file was last changed).

When CheckSync runs in repoB, it can:

Verify that it has the latest version of the sync summary file from repoA
Load the version that it has locally and use that to verify any sync tags that target repoA files.

We can also add the ability to update the repoB source with a new sync file from repoA and update the commit SHA in the config accordingly. And likewise, repoA can use this mechanism to target repoB.

For things that are direct dependencies, like other JavaScript packages, one would expect them to share code somehow rather than need sync-tags. However, if they do require sync-tags (perhaps non-code files that we want to be reminded to update such as docs related stuff), then rather than needing to go look at actual git repos, the code could first check the local package files for the summary file and load it from there.

Illustration

Run checksync on repoB
checksync gets the latest commit in which the repo A summary was changed (likely via GitHub APIs?) and verifies that the SHA matches the summary repo B already has. If it doesn't, it raises a high-level sync issue in the output. At this point, we could fail early, continue as if everything was fine but note the issue, or, if auto-fix is on, download the updated file and update the config
checksync loads the repo A summary to match repo B synctags against and continues on its way
checksync outputs repo B's summary ready for inclusion in the commit - if it didn't change, it won't get committed

Upgrade @hyperjump/json-schema when we drop Node 16

@hyperjump/json-schema requires a minimum of Node 18.

Blocked: We need to support Node 16 for now, so we cannot update as long as that is a requirement.

Should include version in CLI output

Integration tests should run on Windows

Describe the bug
Currently, the snapshots are dependent on the platform path separator. This means that the integration tests fail on Windows because Windows uses a backslash as the path separator instead of the forward slash in the snapshots.

To Reproduce
Run the integration tests on Windows.

Expected behavior
The tests should pass.
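
One way to make the snapshots platform-independent would be to normalize separators in the output before comparing, as sketched below. The function name is hypothetical, and replacing every backslash is an assumption that only holds if the output contains no other meaningful backslashes.

```typescript
// Hypothetical helper: converts Windows-style separators to forward slashes
// so that snapshot text is the same on every platform.
function normalizeSeparators(output: string): string {
    return output.replace(/\\/g, "/");
}
```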

The `parse-gitignore` dependency isn't handling some ignore patterns

Patterns like !dev don't appear to be working. Probably need to change to a different package like https://www.npmjs.com/package/@gerhobbelt/gitignore-parser.

Add --version arg

Is your feature request related to a problem? Please describe.
It's pretty common to expect tools to support --version and sometimes -v for getting the version information.

Describe the solution you'd like
Add --version to output just the version number.

Describe alternatives you've considered
--help outputs the version as part of the help, but it's not machine readable and it is at the top of a large block of text, making it impractical.

When auto-fixing, added line-ending should match line-ending of file being edited

When auto-fixing, we use the \n character as a line-ending, but on Windows, the line-ending is more likely to be \r\n. We could use os.EOL but this may not match the file that we are fixing, so we should probably detect the line ending during parsing and then use that.
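
Detecting the line ending from the file's own content could be as simple as inspecting the first newline, as in this sketch. The function name and the fallback parameter are assumptions; files with mixed line endings would need a more careful policy.

```typescript
type LineEnding = "\n" | "\r\n";

// Hypothetical helper: infers the line ending from the first newline found,
// falling back to the given default when the content has no newline at all.
function detectLineEnding(content: string, fallback: LineEnding = "\n"): LineEnding {
    const firstNewline = content.indexOf("\n");
    if (firstNewline === -1) {
        return fallback;
    }
    const precededByCR = firstNewline > 0 && content[firstNewline - 1] === "\r";
    return precededByCR ? "\r\n" : "\n";
}
```

The fallback could be set to os.EOL for files that contain no newline yet.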

Handle symlinks

Currently, the globs include symlinks. We need to handle these: either remove them or, more likely, resolve them to real paths when opening so that we still process them, since folks may make sync-tags to them in their symlinked location.

Indentation is getting lost when auto-updating checksum

The indentation of an existing tag is being lost when the checksum is auto-updated.
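
A fix along these lines would carry the leading whitespace of the original line over to the rewritten tag, as sketched below. The function name is hypothetical, and the tag strings in the usage example (including the position of the checksum) are made up for illustration.

```typescript
// Hypothetical helper: re-applies the original line's leading spaces or tabs
// to the replacement tag text.
function replacePreservingIndent(originalLine: string, newTag: string): string {
    const match = /^[ \t]*/.exec(originalLine);
    const indent = match ? match[0] : "";
    return indent + newTag.replace(/^\s+/, "");
}

// Usage with made-up tag text:
const updated = replacePreservingIndent(
    "    // sync-start:mysyncid 123 ./a.js",
    "// sync-start:mysyncid 456 ./a.js",
);
```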