
johann-petrak / licenseheaders


Simple python script to add/replace license headers in a directory tree of source files

License: BSD 3-Clause "New" or "Revised" License

Python 88.98% CMake 2.92% C++ 1.92% C 1.72% Shell 1.73% Coq 2.73%

licenseheaders's People

Contributors

alswl, aminya, arthurzenika, charlyhue, danyspin97, emgre, igor1306, joaohf, johann-petrak, keckj, liyishuai, m000, paradoxkg, pavlobielousov, rcaroncd, tset-noitamotua, wsciaroni


licenseheaders's Issues

Dockerfile extension rules

Currently the tool will recognize docker files as Dockerfile or .dockerfile.

According to a StackOverflow discussion, the community uses both naming conventions:

  1. <purpose>.Dockerfile
  2. Dockerfile.<purpose>

Is there an existing way to support the second convention?

It seems that the additional-extensions option currently supports a wildcard only as a prefix, not as a suffix: https://github.com/johann-petrak/licenseheaders/blob/master/licenseheaders.py#L817

Proposal

If there is no existing way to do this, I wonder whether we could allow the following:

--additional-extensions docker=Dockerfile.

That is, when the extension ends with a dot, we would do the following instead:

# https://github.com/johann-petrak/licenseheaders/blob/master/licenseheaders.py#L817
patterns.append(ext + "*")

Do you think that's feasible? If so, I can create a pull request.
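
The proposed rule can be sketched as follows. This is only an illustration: `pattern_for_extension` and `matches` are hypothetical helpers, not functions from licenseheaders, and matching is done with the standard `fnmatch` module. An entry that ends with a dot is treated as a file-name prefix; anything else keeps the suffix behavior.

```python
import fnmatch


def pattern_for_extension(ext):
    """Return a glob pattern for one extension entry (hypothetical helper).

    "Dockerfile." (trailing dot) -> "Dockerfile.*"  (prefix match)
    ".dockerfile"                -> "*.dockerfile"  (suffix match)
    """
    if ext.endswith("."):
        return ext + "*"
    return "*" + ext


def matches(filename, extensions):
    """Return True if filename matches a pattern derived from any entry."""
    return any(
        fnmatch.fnmatchcase(filename, pattern_for_extension(ext))
        for ext in extensions
    )
```

With this rule, `matches("Dockerfile.build", ["Dockerfile."])` is true, while a plain `Dockerfile` would still need its own entry.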

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9

Traceback (most recent call last):
  File "/home/arthur/local/licenseheaders/licenseheaders.py", line 470, in <module>
    sys.exit(main())
  File "/home/arthur/local/licenseheaders/licenseheaders.py", line 424, in main
    dict = read_file(file)
  File "/home/arthur/local/licenseheaders/licenseheaders.py", line 288, in read_file
    lines = f.readlines()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 262: invalid continuation byte
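
The traceback suggests that the file is not valid UTF-8 (byte 0xe9 is "é" in Latin-1). One possible workaround, sketched here under the assumption that a fallback encoding is acceptable, is to try several encodings in turn. `read_lines_with_fallback` is a hypothetical helper, not the tool's `read_file`.

```python
def read_lines_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in turn and return (lines, encoding_used)."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.readlines(), enc
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next one
    raise last_error
```

Because Latin-1 can decode any byte sequence, it only makes sense as the last entry in the list.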

pre-commit cannot find built-in templates

Here's my .pre-commit-config.yaml:

---
repos:
  - repo: https://github.com/johann-petrak/licenseheaders.git
    rev: 'aef25543567a9db59448349127260e386619cbf2'
    hooks:
      - id: licenseheaders
        args: ["-t", "mpl-2", "-cy", "-f"]

Error message:

licenseheaders...........................................................Failed
- hook id: licenseheaders
- exit code: 1

licenseheaders_0.8.8 ERROR: Not a built-in template and not a file, cannot proceed: mpl-2
licenseheaders_0.8.8 ERROR: Built in templates:

Executing licenseheaders in command line worked fine.

adding license headers to scripts

Licenseheaders identifies files to modify based on their extension, but some files that need license headers don't have extensions.

I'd suggest adding an option that allows license headers to be added to such files:

    licenseheaders ...other options... --type py script1 script2 script3

Here, --type would take an extension and use it to set the file type for the explicitly listed files.

Code with inline comment disappear after apply licenseheaders

I installed licenseheaders via pip and got licenseheaders_0.8.5.

I ran licenseheaders on my code, and a line of code containing a comment symbol (#) that directly follows the header block disappeared.

Here is my command and log.

$ licenseheaders -v -v -d target -t header.tmpl -b
licenseheaders_0.8.5 INFO: Using file //header.tmpl
licenseheaders_0.8.5 INFO: Processing file target/foo.py as python
licenseheaders_0.8.5 INFO: Backing up file target/foo.py to target/foo.py.bak

foo.py.bak (original)

#
# Copyright (c) 2020 Foo Inc. All rights reserved.
#
from XXX import YYY # Import package

foo.py (after)

#
# Copyright (c) 2020 Foo Inc. All rights reserved.
#

[Info] Using the pre-commit hook

The only way we could get it to work as a pre-commit hook with a directory specified was with pass_filenames: false. The settings:

  - repo: https://github.com/johann-petrak/licenseheaders.git
    rev: '8e2d6f944aea639d62c8d26cd99dab4003c3904d'
    hooks: 
        - id: licenseheaders
          args: [ "-t", "copyright.tmpl", "-d", "./src/"]
          language_version: python3.9
          pass_filenames: false

Hope this helps someone!

KeyError: 'opt_dummy_attribute'

Hi,
I tried the package using the full path to the lgpl-v3.tmpl template file (as using the name does not work as mentioned in another issue). I ended up with the following error:

Traceback (most recent call last):
File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
  return _run_code(code, main_globals, None,
File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
  exec(code, run_globals)
File "C:\Program Files\Python39\lib\site-packages\licenseheaders.py", line 899, in <module>
  sys.exit(main())
File "C:\Program Files\Python39\lib\site-packages\licenseheaders.py", line 799, in main
  template_lines = read_template(os.path.abspath(opt_tmpl), settings, arguments)
File "C:\Program Files\Python39\lib\site-packages\licenseheaders.py", line 512, in read_template
  lines = [Template(line).substitute(vardict) for line in lines]
File "C:\Program Files\Python39\lib\site-packages\licenseheaders.py", line 512, in <listcomp>
  lines = [Template(line).substitute(vardict) for line in lines]
File "C:\Program Files\Python39\lib\string.py", line 121, in substitute
  return self.pattern.sub(convert, self.template)
File "C:\Program Files\Python39\lib\string.py", line 114, in convert
  return str(mapping[named])
KeyError: 'opt_dummy_attribute'
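
The traceback ends in `Template(line).substitute(vardict)`, and `string.Template.substitute` raises `KeyError` for any placeholder that is missing from the mapping. The snippet below reproduces that behavior in isolation and contrasts it with `safe_substitute`; the template line and the values are made up for illustration.

```python
from string import Template

# Illustrative template line and values, not taken from lgpl-v3.tmpl.
line = Template("Copyright (c) ${years} ${owner}. ${opt_dummy_attribute}")
values = {"years": "2021", "owner": "Example Owner"}

try:
    line.substitute(values)
    missing = None
except KeyError as err:
    missing = err.args[0]  # name of the placeholder that had no value

# safe_substitute leaves unknown placeholders untouched instead of raising.
safe = line.safe_substitute(values)
```

The usage text quoted in another issue lists a `--safesubst` flag, which presumably selects the second behavior.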

Single Line Style Comment Headers Break Search for End of Comment Block

While using this script, I noticed that some files had headers written as single-line comments, such as:

// Some
// License
// Header

When I ran the script, it parsed the file until it found the end marker of a block-style comment.

I believe I have a fix for this and will create a branch shortly to attempt to fix this problem.

Below is the terminal output demonstrating the bug:

┌─[wsciaron@gollum]─[~/testDir]
└──╼ $ll
total 16
drwxrwxr-x 2 wsciaron wsciaron 4096 Dec 1 16:29 ./
drwxr-xr-x 35 wsciaron wsciaron 4096 Dec 1 16:22 ../
-rw-rw-r-- 1 wsciaron wsciaron 633 Dec 1 16:26 apache-2.tmpl
-rw-rw-r-- 1 wsciaron wsciaron 285 Dec 1 16:29 file.cpp
┌─[wsciaron@gollum]─[~/testDir]
└──╼ $cat file.cpp

// Top Text
// More Top Text
// Even More Text
// Copyright 2020
// There you go
#include "otherFile.h" // End of line comment after include statement
// Comment After Include
int main() {}
// Comment after some code
/*
 * Some BLOCK comment
 */

int otherCode() {}

// Other Comment

┌─[wsciaron@gollum]─[~/testDir]
└──╼ $~/.local/bin/licenseheaders -b -t apache-2.tmpl -y 2020 -o Will -n BugReport -u https://BugReportURL
┌─[wsciaron@gollum]─[~/testDir]
└──╼ $cat file.cpp

/*
 * Copyright (c) 2020 Will.
 *
 * This file is part of BugReport.
 * See https://BugReportURL for further info.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */

int otherCode() {}

// Other Comment
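
One way to avoid running past the real header is to look only at the leading run of whole-line comments. The sketch below is not the fix from the branch mentioned above; `find_line_comment_header` is a hypothetical helper that assumes the header starts at the first non-blank line and that every header line begins with the line-comment prefix.

```python
def find_line_comment_header(lines, prefix="//"):
    """Return (start, end) so that lines[start:end] is the leading run of
    whole-line comments, or None if the file does not start with one."""
    start = 0
    # Skip blank lines before the header.
    while start < len(lines) and lines[start].strip() == "":
        start += 1
    end = start
    # A line such as '#include "x.h" // note' starts with code, so it
    # ends the run even though it contains the comment prefix.
    while end < len(lines) and lines[end].lstrip().startswith(prefix):
        end += 1
    return (start, end) if end > start else None
```

For the `file.cpp` shown above, this would select only the five `//` lines before the `#include`, leaving the rest of the file alone.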

Add support for read-only files

Some files in projects are auto-generated by external tools (CMake, for example). Those files are often marked as read-only so that they are not accidentally modified by other developers. It would be nice to be able to force license header injection into such files without having to change file permissions manually. I don't think this should be the default behavior (files are marked read-only for a reason), but an option such as --force-overwrite to include read-only files would be nice.
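
A rough sketch of what such an option could do internally, assuming POSIX-style permission bits: make the file writable for the owner, write it, then restore the original mode. `write_forced` is a hypothetical helper, and `--force-overwrite` is only the name suggested in this issue.

```python
import os
import stat


def write_forced(path, text, encoding="utf-8"):
    """Overwrite path even if it is read-only, then restore its mode."""
    original_mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, original_mode | stat.S_IWUSR)  # temporarily writable
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
    finally:
        os.chmod(path, original_mode)  # put the original permissions back
```

Restoring the mode in a `finally` block keeps the file read-only even if the write fails.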

New Release on PyPI?

When I run the CLI using 0.8.8 on PyPI I get the following:

#
# © 2022 XXX, Inc. All rights reserved.#

but when I run the source I get:

# © 2022 XXX, Inc. All rights reserved.

Can a new release be generated so I can easily get the new clean comment? @johann-petrak

More flexibility in choosing the files to process

Currently all the extensions defined for any of the file types are processed (and in addition the extensions added for a type).

As we add more types, it becomes more likely that this is not wanted or that extensions can be used for several different types.

So it would be good to limit what is processed by type, by extension, or even exclude whole file path patterns. On the other hand it may be useful to define file path patterns which should be processed in a particular way (so maybe because of the base name, not just the extension).
One example is something like "robots.txt" which is a specific type of file, but not just a general text (.txt) file.

For this, it may also be useful to change how file types are defined: instead of using only the last extension, use arbitrary path patterns, of which extensions are a special case.

Option to ensure that all files have a license header

I needed a way of ensuring that all files have a license header in my continuous integration tests and failing if they don't, but was having trouble doing this with just the output produced by --dry -v -v -v.

For now, I've made a new argument --check which behaves like --dry, but throws and outputs an error if it encounters a file that doesn't have a license. I already have a fork with the change described. Can I open a PR?
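
The idea behind such a check can be sketched independently of the fork. The code below is an assumption-laden illustration: it treats a file as licensed if the word "Copyright" appears in its first 20 lines, and `files_missing_header` and `check` are hypothetical names.

```python
import itertools


def files_missing_header(paths, marker="Copyright", max_lines=20, encoding="utf-8"):
    """Return the paths whose first max_lines lines do not contain marker."""
    missing = []
    for path in paths:
        with open(path, "r", encoding=encoding, errors="replace") as f:
            head = list(itertools.islice(f, max_lines))
        if not any(marker in line for line in head):
            missing.append(path)
    return missing


def check(paths):
    """Return an exit code: 0 if every file has a header, 1 otherwise."""
    missing = files_missing_header(paths)
    for path in missing:
        print("missing license header: " + path)
    return 1 if missing else 0
```

A CI job could then call `sys.exit(check(paths))` so that the build fails when any file lacks a header.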

Matlab Support for .m Files?

Howdy, I noticed that .m files are assumed to be Objective-C. I'd like to instead treat these as Matlab source files. Maybe this would be possible by adding support for the Matlab syntax and somehow overriding the default associations?

Remove import in .Java files.

Hello.
I just used the latest release, v0.5. It works perfectly for adding a license header, but it removes imports and static imports.

Shebang at top of javascript files is not preserved

I have a project with some javascript entrypoints containing a top line of

#!/usr/bin/env node

which seems to be a common (but maybe not well specified) extension that allows shell shebang lines in JavaScript. I can point to examples of its use in many NPM packages, but it seems to be something Node handles rather than part of the ECMAScript standard.

It would be good if the licenseheaders tool preserved these top line shebangs the same way they are preserved for shell, python, etc. types of files when inserting the copyright header.
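
The requested behavior amounts to inserting the header after the first line whenever that line starts with `#!`. A minimal sketch, using a hypothetical `insert_header` helper that works on lists of lines:

```python
def insert_header(lines, header_lines):
    """Insert header_lines at the top, but after a leading shebang line."""
    if lines and lines[0].startswith("#!"):
        return lines[:1] + header_lines + lines[1:]
    return header_lines + lines
```

For example, `insert_header(["#!/usr/bin/env node\n", "main();\n"], ["// Copyright\n"])` keeps the shebang as the first line and places the comment directly below it.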

Line Endings updates

Hi,

I have a question concerning line endings.

My source files use ‘LF’ line endings. When I run the LicenseHeaders tool on a Windows host, the whole content of my files is rewritten with ‘CRLF’ line endings. I would like to keep my line endings unchanged.

Would it be possible to add, in a future version, an option to control the line ending type (the newline option in the file opening mode)?

Best regards.

Laurent
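
In Python, passing `newline=""` to `open()` disables newline translation for both reading and writing, so existing line endings survive a rewrite on any platform. The sketch below shows the idea; `rewrite_preserving_newlines` is a hypothetical helper, and `transform` stands in for whatever adds the header.

```python
def rewrite_preserving_newlines(path, transform, encoding="utf-8"):
    """Rewrite a file in place without translating its line endings."""
    # newline="" returns and writes line endings exactly as they are.
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(transform(text))
```

The inserted header would still need to use the same line ending as the rest of the file.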

Reimplement the whole thing!

OK, the current code is rather ugly and it shows that this was my first Python project when I learned the language. It also was just meant to be a quick and dirty solution for my own needs, shared with others in case it may be useful.
When I look at it now, I feel kind of embarrassed so I am thinking of doing this a bit more properly.

This has always been a free-time project of mine and sadly my free time is often very limited.

Still, I have created a branch to explore how this could be done better.

--dry doesn't produce any output

It seems to me that --dry doesn't produce any output because it writes its messages through the logger at info level.
What I would like to achieve is to ensure that all files have a license header in my continuous integration. I thought I could check whether --dry gave any output, but it never does.
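
The suspected cause can be reproduced with the standard `logging` module alone: a logger whose effective level is WARNING (which is also the default level of an unconfigured root logger) silently drops INFO messages. The logger name and message below are made up for the demonstration.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("dry_run_demo")  # hypothetical logger name
logger.propagate = False                    # keep the demo self-contained
logger.addHandler(logging.StreamHandler(stream))

logger.setLevel(logging.WARNING)
logger.info("would update file.py")   # dropped: INFO is below WARNING
quiet_output = stream.getvalue()

logger.setLevel(logging.INFO)
logger.info("would update file.py")   # emitted now
verbose_output = stream.getvalue()
```

Here `quiet_output` stays empty and only the second call reaches the stream.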

Avoid whitespaces on empty lines

For an empty line in the template, "# " (with a trailing space) is inserted in Python files. It would be nice to avoid the trailing whitespace. (Contribution incoming.)
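
A small sketch of the requested behavior, assuming template lines are handled without their trailing newline: strip trailing whitespace after adding the prefix. `comment_line` is a hypothetical helper, not the tool's own function.

```python
def comment_line(text, prefix="# "):
    """Prefix a template line with a comment marker, without trailing spaces."""
    return (prefix + text).rstrip()
```

An empty template line then becomes `#` rather than `# `, while non-empty lines are unchanged.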

[BUG] Encoding command-line parameter

Description

Specifying an --enc argument results in an error (see below). Therefore, the encoding used to read / write files cannot be changed.

To reproduce:

Call python -m licenseheaders --enc utf8 -f

An error is raised: TypeError: open() argument 'encoding' must be str or None, not list

How to fix the bug

See https://github.com/johann-petrak/licenseheaders/blob/master/licenseheaders.py#L412 for reference. The add_argument call for the "--enc" parameter should not specify nargs, because specifying it causes the encoding to be wrapped in a list (even for nargs=1).
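
The `argparse` behavior described here is easy to confirm in isolation. The two parsers below are reduced stand-ins, not the tool's actual argument definitions.

```python
import argparse

# With nargs=1 the parsed value is a one-element list.
buggy = argparse.ArgumentParser()
buggy.add_argument("--enc", nargs=1, default="utf-8")

# Without nargs the parsed value is a plain string.
fixed = argparse.ArgumentParser()
fixed.add_argument("--enc", default="utf-8")

buggy_value = buggy.parse_args(["--enc", "utf8"]).enc
fixed_value = fixed.parse_args(["--enc", "utf8"]).enc
```

Passing `buggy_value` as `encoding=` to `open()` fails with the `TypeError` quoted above, whereas `fixed_value` is accepted.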

"Not a built-in template" error, and no built-in templates found

Hi! I have installed licenseheaders-0.8.4 on macOS via pip3, like this:

pip3 install licenseheaders

Installation logs seem fine:

Collecting licenseheaders
  Downloading https://files.pythonhosted.org/packages/80/01/3bcd4063e4ed517ff7e72aad98ab4135b8307585b0022be18e093178cf38/licenseheaders-0.8.4-py3-none-any.whl
Collecting regex (from licenseheaders)
  Downloading https://files.pythonhosted.org/packages/4c/e7/eee73c42c1193fecc0e91361a163cbb8dfbea62c3db7618ad986e5b43a14/regex-2020.4.4.tar.gz (695kB)
     |████████████████████████████████| 696kB 1.0MB/s
Building wheels for collected packages: regex
  Building wheel for regex (setup.py) ... done
  Stored in directory: /Users/folex/Library/Caches/pip/wheels/e6/9b/ae/2972da29cc7759b71dee015813b7c6931917d6a51e64ed5e79
Successfully built regex
Installing collected packages: regex, licenseheaders
Successfully installed licenseheaders-0.8.4 regex-2020.4.4

$ licenseheaders -h
usage: licenseheaders [-h] [-V] [-v] [-d DIR] [-b] [-t TMPL] [-y YEARS]
[-o OWNER] [-n PROJECTNAME] [-u PROJECTURL]
[--enc ENCODING] [--dry] [--safesubst] [-D]
[-E [E [E ...]]]
[--additional-extensions ADDITIONAL_EXTENSIONS [ADDITIONAL_EXTENSIONS ...]]
[-x [EXCLUDE [EXCLUDE ...]]]
...

Then, I tried to run it (copied example from README):

$ licenseheaders -t lgpl3 -o "Eager Hacker"
licenseheaders_0.8.4 ERROR: Not a built-in template and not a file, cannot proceed: lgpl3
licenseheaders_0.8.4 ERROR: Built in templates:

Then via python3 -m, same:

$ python3 -m licenseheaders -t lgpl3 -o "Eager Hacker"
licenseheaders_0.8.4 ERROR: Not a built-in template and not a file, cannot proceed: lgpl3
licenseheaders_0.8.4 ERROR: Built in templates:

Reinstall (pip3 uninstall and then pip3 install) didn't help.

Here's dist-info/ content:

$ ls -lh /usr/local/lib/python3.7/site-packages/licenseheaders-0.8.4.dist-info/
total 64
-rw-r--r--  1 folex  admin     4B May  5 17:56 INSTALLER
-rw-r--r--  1 folex  admin   1.1K May  5 17:56 LICENSE.txt
-rw-r--r--  1 folex  admin   6.4K May  5 17:56 METADATA
-rw-r--r--  1 folex  admin   833B May  5 17:56 RECORD
-rw-r--r--  1 folex  admin    92B May  5 17:56 WHEEL
-rw-r--r--  1 folex  admin    56B May  5 17:56 entry_points.txt
-rw-r--r--  1 folex  admin    15B May  5 17:56 top_level.txt

Hope my issue will help. Thanks for your project! If you need any info, I'll try to help.


One-liner to reproduce it with docker

docker run --rm -it python:slim bash -c 'pip install licenseheaders && mkdir -p test && cd test && touch a.java && licenseheaders -t lgpl-v3 -y 2012-2014 -o ThisNiceCompany -n ProjectName -u http://the.projectname.com'

Getting KeyError after running any command

I tried running the commands using Python 3.8, but I get the following error:

$ licenseheaders --help
Traceback (most recent call last):
  File "/usr/local/bin/licenseheaders", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/licenseheaders.py", line 612, in main
    arguments = parse_command_line(sys.argv)
  File "/usr/local/lib/python3.8/site-packages/licenseheaders.py", line 304, in parse_command_line
    known_extensions = [ftype+":"+",".join(conf["extensions"]) for ftype, conf in typeSettings.items()]
  File "/usr/local/lib/python3.8/site-packages/licenseheaders.py", line 304, in <listcomp>
    known_extensions = [ftype+":"+",".join(conf["extensions"]) for ftype, conf in typeSettings.items()]
KeyError: 'extensions'

Fix logging

Some logging messages appear twice right now. Also maybe make pylint happy about how to format the logging messages.

--enc is not applied in read_template

The --enc option does not apply the selected encoding when the template is read. This may cause problems on Windows, which does not use UTF-8 as the default encoding when open() is called.

command line option to exclude files

Add a command line option that will read in one or more UNIX-style patterns. Any file name matching this pattern will be ignored. No license will be incorporated.

licenseheaders ... -x "_version.py,*not_this_one*"
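
The matching itself can be sketched with the standard `fnmatch` module. The helpers below are hypothetical and assume the comma-separated form shown in the example; each pattern is tested against both the base name and the full path.

```python
import fnmatch
import os


def parse_patterns(arg):
    """Split a comma-separated pattern string into a list of patterns."""
    return [p.strip() for p in arg.split(",") if p.strip()]


def is_excluded(path, patterns):
    """Return True if the base name or the whole path matches a pattern."""
    name = os.path.basename(path)
    return any(
        fnmatch.fnmatchcase(name, p) or fnmatch.fnmatchcase(path, p)
        for p in patterns
    )
```

With the patterns from the example, `src/_version.py` and any file whose name contains `not_this_one` would be skipped.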

Running this script as a pre build event causes recompilation of whole codebase

Brief
Even if a file already contains the right copyright header, the 'licenseheaders.py' script overwrites the file with the same content, which makes the build system treat the file as "changed" and recompile it.

Details
Suppose we have a C++ codebase where all files already have the right copyright headers. After we run the 'licenseheaders.py' script, the C++ build system will treat all C++ files as "changed" even though no changes were introduced, which leads to recompilation of the whole codebase.
Because of that, it is not practical to use the 'licenseheaders.py' script as a pre-build event.

Possible solution
The algorithm to ensure that the script touches files only when it is needed:

  1. Write the output to a temporary file
  2. If the temporary file differs from the original (meaning that the copyright header has been updated), replace the original file with the temporary one
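
The same effect can be sketched without a temporary file by comparing the new content with the existing content in memory and writing only when they differ. This is a simplified variant of the two steps above, not the script's current behavior, and `write_if_changed` is a hypothetical helper.

```python
def write_if_changed(path, new_text, encoding="utf-8"):
    """Write new_text to path only if it differs; return True if written."""
    try:
        with open(path, "r", encoding=encoding, newline="") as f:
            if f.read() == new_text:
                return False  # identical content: leave the file untouched
    except FileNotFoundError:
        pass  # no existing file, so it has to be written
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(new_text)
    return True
```

Because an unchanged file is never opened for writing, its modification time stays the same and the build system has no reason to recompile it.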
