korandoru / hawkeye

Simple license header checker and formatter, in multiple distribution forms.

License: Apache License 2.0

hawkeye's Introduction

HawkEye

Usage

You can use HawkEye in GitHub Actions or on your local machine. HawkEye provides three basic commands:

# check license headers
hawkeye check

# format license headers (auto-fix all files that failed the check)
hawkeye format

# remove license headers
hawkeye remove

You can use -h or --help to list all config options.

GitHub Actions

The HawkEye GitHub Action lets users run HawkEye license header checks with a config file.

First of all, add a licenserc.toml file in the root of your project. The simplest config for projects licensed under Apache License 2.0 is as below:

Note

The full configurations can be found in the configuration section.

headerPath = "Apache-2.0.txt"

[properties]
inceptionYear = 2023
copyrightOwner = "tison <[email protected]>"

You should change the copyright line according to your information.

To check license headers in GitHub Actions, add a step in your GitHub workflow:

- name: Check License Header
  uses: korandoru/hawkeye@v5

Docker

Alpine image (~18MB):

docker run -it --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye check

Arch Linux

Note

Reach out to the maintainer (@orhun) of the package or report issues on Arch Linux GitLab in the case of packaging-related problems.

hawkeye can be installed with pacman:

pacman -S hawkeye

Cargo Install

The hawkeye executable can be installed by:

cargo install hawkeye

Prebuilt Binary

Instead of cargo install, you can install hawkeye as a prebuilt binary by:

export VERSION=v5.3.1
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/korandoru/hawkeye/releases/download/$VERSION/hawkeye-installer.sh | sh

The prebuilt binary retains more build info (output by hawkeye -V) than one installed with cargo install.

Build

This step requires the Rust toolchain.

cargo build --workspace --all-features --bins --tests --examples --benches

Build Docker image:

docker build . -t hawkeye

Configurations

Config file

# Base directory for the whole execution.
# All relative paths are resolved against this path.
# default: current working directory
baseDir = "."

# Inline header template.
# Either inlineHeader or headerPath should be configured, and inlineHeader is prioritized.
inlineHeader = "..."

# Path to the header template file.
# Either inlineHeader or headerPath should be configured, and inlineHeader is prioritized.
# This path is resolved by the ResourceFinder. Check ResourceFinder for the concrete strategy.
# The header template file is skipped on any execution.
headerPath = "path/to/header.txt"

# When enabled, check that the license header matches exactly, including whitespace.
# Otherwise, strip the header into one line and check.
# default: true
strictCheck = true

# Whether you use the default excludes. Check Default.EXCLUDES for the complete list.
# To suppress part of the excludes in the list, declare exactly the same pattern in the `includes` list.
# default: true
useDefaultExcludes = true

# The supported patterns of includes and excludes follow gitignore pattern format, plus that:
# 1. `includes` does not support `!`
# 2. backslash does not escape letter
# 3. whitespaces and `#` are normal since we configure line by line
# See also https://git-scm.com/docs/gitignore#_pattern_format

# Patterns of path to be included on execution.
# default: all the files under `baseDir`.
includes = ["..."]

# Patterns of path to be excluded on execution. A leading bang (!) indicates an invert exclude rule.
# default: empty; if useDefaultExcludes is true, append default excludes.
excludes = ["..."]

# Keywords that should occur in the header, case-insensitive.
# default: ["copyright"]
keywords = ["copyright", "..."]

# Whether you use the default mapping. Check DocumentType.defaultMapping() for the complete list.
# default: true
useDefaultMapping = true

# Paths to additional header style files. The model of user-defined header style can be found below.
# default: empty
additionalHeaders = ["..."]

# Mapping rules (repeated).
#
# The key of a mapping rule is a header style type (case-insensitive).
#
# Available header style types consist of those defined at `HeaderType` and user-defined ones in `additionalHeaders`.
# The name of header style type is case-insensitive.
#
# If useDefaultMapping is true, the mapping rules defined here can override the default one.
[mapping.STYLE_NAME]
filenames = ["..."]  # e.g. "Dockerfile.native"
extensions = ["..."] # e.g. "cc"

# Properties to fulfill the template.
# For a defined key-value pair, you can use ${key} in the header template, which will be substituted
# with the corresponding value.
#
# Preset properties:
# * 'hawkeye.core.filename' is the current file name, like: pom.xml.
[properties]
inceptionYear = 2023

# Options to configure Git features.
[git]
# If enabled, do not process files that are ignored by Git; possible value: ['auto', 'enable', 'disable']
# 'auto' means this feature tries to be enabled with:
#   * gix - if `basedir` is in a Git repository.
#   * ignore crate's gitignore rules - if `basedir` is not in a Git repository.
# 'enable' means always enabled with gix; fails if that is impossible.
# default: 'auto'
ignore = 'auto'
# If enabled, populate file attrs determined by Git; possible value: ['auto', 'enable', 'disable']
# Attributes contain:
#   * 'hawkeye.git.fileCreatedYear'
#   * 'hawkeye.git.fileModifiedYear'
# 'auto' means this feature tries to be enabled with:
#   * gix - if `basedir` is in a Git repository.
# 'enable' means always enabled with gix; fails if that is impossible.
# default: 'disable'
attrs = 'disable'
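
The difference between the two strictCheck modes described in the config above can be sketched roughly as follows. This is only an illustration under an assumption: "strip the header in one line" is read here as collapsing every run of whitespace into a single space before comparing. The names header_matches and collapse are hypothetical and are not taken from the HawkEye sources.

```rust
/// Collapse every run of whitespace (including newlines) into one space.
/// Assumption: this approximates "strip the header in one line".
fn collapse(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Compare a header found in a file against the expected header.
/// `strict = true` requires an exact match, whitespace included.
fn header_matches(actual: &str, expected: &str, strict: bool) -> bool {
    if strict {
        actual == expected
    } else {
        collapse(actual) == collapse(expected)
    }
}
```

With this sketch, a header that was re-wrapped onto different lines would pass the non-strict comparison but fail the strict one.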

Header style file

# [REQUIRED] The name of this header.
[my_header_style]

# The first fixed line of this header. Default to none.
firstLine = "..."

# The last fixed line of this header. Default to none.
endLine = "..."

# The characters to prepend before each license header line. Default to empty.
beforeEachLine = "..."

# The characters to append after each license header line. Default to empty.
afterEachLine = "..."

# Only for multi-line comments: specify if blank lines are allowed.
# Default to false because most of the time, a header has some characters on each line.
allowBlankLines = false

# Specify whether this is a multi-line comment style or not.
#
# A multi-line comment style is equivalent to what we have in Java, where a first line and a last line will delimit
# a whole multi-line comment section.
#
# A style that is not multi-line is usually repeating in each line the characters before and after each line
# to delimit a one-line comment.
#
# Default to true.
multipleLines = true

# Only for non multi-line comments: specify if some spaces should be added after the header line and before
# the `afterEachLine` characters so that all the lines are aligned.
#
# Default to false.
padLines = false

# A regex to define a first line in a file that should be skipped and kept untouched, like the XML declaration
# at the top of XML documents.
#
# Default to none.
skipLinePattern = "..."

# [REQUIRED] The regex used to detect the start of a header section or line.
firstLineDetectionPattern = "..."

# [REQUIRED] The regex used to detect the end of a header section or line.
lastLineDetectionPattern = "..."
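
To make the role of firstLine, endLine, beforeEachLine and afterEachLine more concrete, here is a minimal sketch of how a multi-line header could be assembled from them. It is an illustration, not HawkEye's implementation: the HeaderStyle struct and render_header function are hypothetical, and trimming trailing whitespace on each rendered line is an assumption of this sketch.

```rust
/// Hypothetical subset of a header style: only the four fixed-text fields.
struct HeaderStyle<'a> {
    first_line: &'a str,
    end_line: &'a str,
    before_each_line: &'a str,
    after_each_line: &'a str,
}

/// Wrap `license_text` in the comment markers of `style`.
fn render_header(style: &HeaderStyle, license_text: &str) -> String {
    let mut out = String::new();
    if !style.first_line.is_empty() {
        out.push_str(style.first_line);
        out.push('\n');
    }
    for line in license_text.lines() {
        let rendered = format!(
            "{}{}{}",
            style.before_each_line, line, style.after_each_line
        );
        // Assumption: trailing whitespace is trimmed, so blank lines become " *".
        out.push_str(rendered.trim_end());
        out.push('\n');
    }
    if !style.end_line.is_empty() {
        out.push_str(style.end_line);
        out.push('\n');
    }
    out
}
```

For a C-like block comment, one would pass "/*" as first_line, " * " as before_each_line and " */" as end_line.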

License

Apache License 2.0

History

This software originally derives from license-maven-plugin, with an initial motivation to bring it beyond a Maven plugin. The core abstractions like Document, Header, and HeaderParser were originally copied from the license-maven-plugin sources under the terms of Apache License 2.0.

Later, when I started focusing on the Docker image's size and on integrating with Git, I found that Rust is better suited than Java (GraalVM Native Image) for this purpose. So I rewrote the core logic in Rust and keep shipping a slim image.

hawkeye's People

Contributors

byron, diamondmofeng, orhun, tisonkun, yuanyuyuan

hawkeye's Issues

Support for different years

Is there a way to allow different years in the license header? For example, in a project developed over multiple years where some files are not touched, the year in those files could differ from the year in new files.

Distribute as a Maven plugin

So, this project is strongly inspired by license-maven-plugin, with an initial motivation to bring it beyond a Maven plugin.

But, with a core library to support basic check, format, and remove logic, we can evaluate building a maven plugin upon these core functions :)

The significant part is that we should populate properties correctly and do the right overriding. Also, the license-maven-plugin supports a few "multi" configs, while I'd prefer to use multiple executions simply. But this should also be verified.

Move HawkEye project version Java constant to hawkeye-core

It's now under hawkeye-command, but it should be in the core module so that other modules can use it.

import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

public final class CommandConstants {
    public static final String UNKNOWN = "<unknown>";
    public static final String VERSION;

    static {
        // Read the project version from the bundled properties file.
        ClassLoader classLoader = CommandConstants.class.getClassLoader();
        try (InputStream is = classLoader.getResourceAsStream("hawkeye.properties")) {
            final Properties properties = new Properties();
            properties.load(is);
            VERSION = properties.getProperty("project.version", UNKNOWN);
        } catch (IOException e) {
            throw new UncheckedIOException("cannot load hawkeye properties file: hawkeye.properties", e);
        } catch (Exception e) {
            throw new UncheckedIOException("cannot load hawkeye properties file: hawkeye.properties", new IOException(e));
        }
    }
}

Originally made by #79.

format will remove the comment by mistake

The comment "//! Load balancer" is removed by format.
Image ID: 7a11b1a12b1d

docker run -it --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye-native format

before format

// Copyright 2022-2023 CeresDB Project Authors. Licensed under Apache-2.0.

//! Load balancer

use macros::define_result;

after format

// Copyright 2023 The CeresDB Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use macros::define_result;

config

headerPath = "Apache-2.0.txt"

includes = [
    "*.rs",
    "*.toml",
    "*.yml"
]

[properties]
inceptionYear = 2023
copyrightOwner = "The CeresDB Authors"

Support builtin license header template

I guess we can do this with headerPath, since it is also looked up on the classpath.

We can define some templates in the resources folder. One caveat: when building the native image, we may need to specify the resources explicitly.

The docker image is not working

docker run -it --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye-native format
java.lang.Exception: git check-ignore failed:

	at io.korandoru.hawkeye.core.GitHelper.filterIgnoredFiles(GitHelper.java:101)
	at io.korandoru.hawkeye.core.LicenseProcessor.call(LicenseProcessor.java:81)
	at io.korandoru.hawkeye.command.HawkEyeCommandFormat.call(HawkEyeCommandFormat.java:48)
	at io.korandoru.hawkeye.command.HawkEyeCommandFormat.call(HawkEyeCommandFormat.java:29)
	at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
	at picocli.CommandLine.access$1300(CommandLine.java:145)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
	at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
	at picocli.CommandLine.execute(CommandLine.java:2078)
	at io.korandoru.hawkeye.command.HawkEyeCommandMain.main(HawkEyeCommandMain.java:30)

Cache Maven dependencies in CI workflow

Currently, unit tests on macOS take over 20 minutes, which is relatively long for me. It seems the dependency download step takes most of the time. Thus, we can improve this by caching Maven dependencies.

Related to -

- uses: actions/checkout@v3
- name: Setup Java
  uses: actions/setup-java@v3
  with:
    distribution: 'temurin'
    java-version: '17'
- name: Maven verify
  run: ./mvnw clean verify

Ref impl -

      - name: Cache local Maven repository
        uses: actions/cache@v3
        timeout-minutes: 5
        with:
          path: |
            ~/.m2/repository/*/*/*
            !~/.m2/repository/org/apache/pulsar
          key: ${{ runner.os }}-m2-dependencies-core-modules-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-m2-dependencies-core-modules-

Fails to build with Rust stable

Hello! 👋🏼

I packaged this tool for Arch Linux, however I'm hitting:

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> fmt/src/lib.rs:15:1
   |
15 | #![feature(extract_if)]
   | ^^^^^^^^^^^^^^^^^^^^^^^

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> fmt/src/lib.rs:16:1
   |
16 | #![feature(let_chains)]
   | ^^^^^^^^^^^^^^^^^^^^^^^

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> fmt/src/lib.rs:15:12
   |
15 | #![feature(extract_if)]
   |            ^^^^^^^^^^

I solved this by using the cargo-nightly package, but this unfortunately affects reproducibility :/

I was wondering if it is possible to support Rust stable 👀

Evaluate Jackson's NioPathDeserializer on native-image

So, I once tried to use Path as the type of HawkEyeConfig.baseDir and so on.

This works well when packaging with vanilla Java but fails on native-image because some deserializer is not found.

But...just now I found that changing the type to Path simply works. So the remaining work can be improving type signatures.

`hawkeye.core.filename` Property Not Working

According to the documentation, hawkeye.core.filename should have the name of the file. However, it is not there if you try to use it in your header template file.

My header template file is as follows:

${hawkeye.core.filename}
------------------------------------------
Copyright (c) ${year}
${owner}
------------------------------------------

Expected Resulting File (with year = 2024 and owner = "Some_Name"):

/*
 * my_file.c
 * ------------------------------------------
 * Copyright (c) 2024
 * Some_Name
 * ------------------------------------------
 */

Actual Resulting File (with year = 2024 and owner = "Some_Name"):

/*
 * ${hawkeye.core.filename}
 * ------------------------------------------
 * Copyright (c) 2024
 * Some_Name
 * ------------------------------------------
 */
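
For readers unfamiliar with the ${key} mechanism, the substitution step can be sketched as below. This is a hedged illustration, not HawkEye's code: the render function is hypothetical, and leaving unknown keys untouched is an assumption chosen because it reproduces the symptom reported here (a placeholder that survives into the output when its key is never populated).

```rust
use std::collections::HashMap;

/// Replace every `${key}` in `template` with its value from `props`.
/// Keys that are not present in `props` are left in the output unchanged.
fn render(template: &str, props: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let key = &after[..end];
                match props.get(key) {
                    Some(value) => out.push_str(value),
                    None => {
                        // Unknown key: keep the placeholder verbatim.
                        out.push_str("${");
                        out.push_str(key);
                        out.push('}');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

In this model, the expected result would require a per-file entry for hawkeye.core.filename to be inserted into the property map before the template is rendered for each file.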

Write a CONTRIBUTING file

  • How the source files are organized
  • How to add a new license header type (HeaderType)
  • How to add a new file type (DocumentType)
  • How to run CI checks locally (code style, verify)

Factor out the reporting logic

Now we always print the result to the console in a fixed format.

We should build an internal report data structure and adapt it to multiple reporting formats (HTML, XML, console plain text, etc.).
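
One possible shape for this separation is sketched below: a format-agnostic Report value plus a Reporter trait with one implementation per output format. All names here (Status, Report, Reporter, PlainText, Xml) are hypothetical and only illustrate the idea; the actual data model would need to carry whatever the check, format and remove commands produce.

```rust
/// Outcome of processing one file (hypothetical, deliberately minimal).
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Ok,
    Missing,
}

/// Format-agnostic result of one run: (path, status) pairs.
struct Report {
    entries: Vec<(String, Status)>,
}

/// Each output format implements this trait.
trait Reporter {
    fn render(&self, report: &Report) -> String;
}

struct PlainText;
struct Xml;

impl Reporter for PlainText {
    fn render(&self, report: &Report) -> String {
        report
            .entries
            .iter()
            .map(|(path, status)| format!("{:?}: {}\n", status, path))
            .collect()
    }
}

/// Escape the characters that are unsafe inside an XML attribute value.
fn escape(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

impl Reporter for Xml {
    fn render(&self, report: &Report) -> String {
        let mut out = String::from("<report>\n");
        for (path, status) in &report.entries {
            out.push_str(&format!(
                "  <file path=\"{}\" status=\"{:?}\"/>\n",
                escape(path),
                status
            ));
        }
        out.push_str("</report>\n");
        out
    }
}
```

The processing code would then only build a Report, and the choice of reporter could be driven by a command-line option.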

Improve the Gix check ignore code

It really depends how many files this exclusion check is currently run on, as the version I see here probably won't be much faster than git2, maybe even slower, I never tried.

If it's not slower, it might already be OK to keep it, as making this fast is going to need some refactoring. select_files_with_git() should keep the state it needs on the stack, instead of putting it into a separate GitHelper structure (it's not helping in this case). Taking the gix::Repository directly should do the trick.

Originally posted by @Byron in #121 (review)

Refactor the Maven plugin as an aggregator plugin

Ref - https://cwiki.apache.org/confluence/display/MAVENOLD/Aggregator+Plugins

Otherwise, in a multi-module project, the configuration file path is resolved against each module's basedir, which causes unintuitive results:

[INFO] --------------------< io.korandoru.hawkeye:hawkeye >--------------------
[INFO] Building HawkEye 3.3.1-SNAPSHOT                                    [1/6]
[INFO]   from pom.xml
[INFO] --------------------------------[ pom ]---------------------------------
[INFO] 
[INFO] --- hawkeye:3.3.0:check (default-cli) @ hawkeye ---
[INFO] Checking license headers... with cfg: /Users/tison/Brittani/hawkeye/licenserc.toml
[WARNING] Processing unknown file: /Users/tison/Brittani/hawkeye/action.yml.bak
[INFO] No missing header file has been found.
[INFO] 
[INFO] -----------------< io.korandoru.hawkeye:hawkeye-core >------------------
[INFO] Building HawkEye :: Core 3.3.1-SNAPSHOT                            [2/6]
[INFO]   from hawkeye-core/pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- hawkeye:3.3.0:check (default-cli) @ hawkeye-core ---
[INFO] Checking license headers... with cfg: /Users/tison/Brittani/hawkeye/hawkeye-core/licenserc.toml
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for HawkEye 3.3.1-SNAPSHOT:
[INFO] 
[INFO] HawkEye ............................................ SUCCESS [  0.240 s]
[INFO] HawkEye :: Core .................................... FAILURE [  0.001 s]
[INFO] HawkEye :: Command ................................. SKIPPED
[INFO] HawkEye :: Distribution :: Command Line Interface .. SKIPPED
[INFO] HawkEye :: Distribution :: Native Image ............ SKIPPED
[INFO] HawkEye :: Maven Plugin ............................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  0.318 s
[INFO] Finished at: 2023-08-30T20:59:25+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] /Users/tison/Brittani/hawkeye/hawkeye-core/licenserc.toml (No such file or directory)

Support user-defined header definitions

mapping enables users to define their own mapping logic. And it's common for users to want to extend the HeaderDefinition that can be used.

The config structure can be:

[headerDefinitions.my_favorite_style]
firstLine = "..."
endLine = "..."
# ...

... and then referenced in the mapping section:

[mapping]
rs = 'my_favorite_style'

Support pluggable Git integration

license-maven-plugin has a pluggable Git integration license-maven-plugin-git.

We can have a similar plugin to populate Git-related information like first commit time, last commit time and excludes from .gitignore rules, etc.

It will require a plugin strategy, and perhaps dynamic loading. But in the native image and the command line fat jar, we may bundle the plugins. I'm unsure now.

For Git interoperation library, jgit is a good choice: https://www.eclipse.org/jgit/

Refactor configs from different sources

HawkEyeConfig can factor out ConfigFileModel and add new config keys like dryRun that are passed from the command line or other sources and would be unreasonable to configure in the config file.

Revisit includes and excludes logic

Currently, we build includes and excludes in Selection using buildOverrideInclusions, buildExclusions and buildInclusions.

I don't think it's a clean way and I don't agree that we should explicitly specify exact includes to remove default exclusions.

Basically, we should evaluate includes and excludes in priority order, like a painter applying successive layers:

  1. If there are no user-defined includes, filter by the default includes.
  2. If a path is in the default excludes, mark it to be ignored.
  3. If it is in the user-defined includes, bring it back.
  4. If it is in the user-defined excludes, kick it out again.

In this way, we can append and plug in includes and excludes, like the idea in #20 to respect .gitignore configs.
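
The four layers proposed in this issue can be sketched as a single function in which each later rule overrides the earlier ones. This is a minimal illustration under loud assumptions: patterns are matched as plain path suffixes rather than gitignore patterns, the Rules struct and is_selected function are hypothetical, and step 1 is read as "start from nothing when the user supplies their own includes".

```rust
/// Hypothetical rule set; patterns are plain suffixes, NOT gitignore patterns.
struct Rules<'a> {
    default_includes: &'a [&'a str],
    default_excludes: &'a [&'a str],
    user_includes: &'a [&'a str],
    user_excludes: &'a [&'a str],
}

/// True if `path` ends with any of the given suffix patterns.
fn matches_any(path: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| path.ends_with(p))
}

/// Apply the four layers in order; each later layer overrides the earlier ones.
fn is_selected(path: &str, rules: &Rules) -> bool {
    // Step 1: without user-defined includes, start from the default includes.
    let mut selected = if rules.user_includes.is_empty() {
        matches_any(path, rules.default_includes)
    } else {
        false
    };
    // Step 2: default excludes mark the path as ignored.
    if matches_any(path, rules.default_excludes) {
        selected = false;
    }
    // Step 3: user-defined includes bring it back.
    if matches_any(path, rules.user_includes) {
        selected = true;
    }
    // Step 4: user-defined excludes remove it again.
    if matches_any(path, rules.user_excludes) {
        selected = false;
    }
    selected
}
```

Because the layers are applied in sequence, an additional source of rules such as .gitignore could be slotted in as one more layer at the appropriate priority.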

Distinguishing between the copyright years for existing and new files

As a new year begins, updating copyrights is a task that developers find a bit exciting. In simple terms, my understanding is that the year in existing code files can stay as is, while files created in the new year should reflect that year's date. Yet, when following this practice, I didn’t pass the GitHub action check with korandoru/hawkeye.

Could you please clarify if there’s a mistake in my grasp of licensing, or if I've overlooked how to properly set up the checker?

Project: https://github.com/GreptimeTeam/greptimedb
