tinder / bazel-diff

Performs Bazel Target Diffing between two revisions in Git, allowing for Test Target Selection and Selective Building

License: Other

Starlark 8.77% Shell 1.21% Kotlin 90.01% TypeScript 0.02%
bazel target-selection test-selection

bazel-diff's People

Contributors

andre-alves, balestrapatrick, bz-canva, chenrui333, fa93hws, fahhem, jaimelennox, jmthvt, kevinjiao, lalten, malinskiy, mehran-prs, molar, morozov, naveenonarayanan, nikhilbirmiwal, onioni, panfilwk, purkhusid, sanju-naik, tgeng, thirtyseven, tinder-maxwellelliott, tinder-yukisawa, vcase

bazel-diff's Issues

Add support for bazel query output formats for the get-impacted-targets command output

In our setup we enable specific CI pipelines depending on the types of targets that were changed. To get this kind of metadata from the output of bazel-diff get-impacted-targets, we need to run another set of bazel queries because the list of targets may be too large for a single bazel query. Splitting up one large bazel query into multiple smaller ones is one of the USPs for bazel-diff, and so I was hoping to add the functionality here.

The functionality could come as part of an output format for get-impacted-targets. Another option would be to add a new subcommand that transforms the current output into a standard bazel query output (streamed_proto/xml/..).
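To make the batching idea concrete, here is a minimal sketch in Python. It is not how bazel-diff does it: the function name batched_kind_queries, the default batch size, and the example rule kind are all hypothetical. Only kind(...) and set(...) are real Bazel query functions.

```python
from typing import Iterable, Iterator, List


def _kind_query(rule_kind: str, batch: List[str]) -> str:
    """Format one query: kind("<rule_kind>", set(<space separated labels>))."""
    joined = " ".join(batch)
    return f'kind("{rule_kind}", set({joined}))'


def batched_kind_queries(
    targets: Iterable[str], rule_kind: str, batch_size: int = 500
) -> Iterator[str]:
    """Yield one Bazel query expression per batch of impacted targets.

    No single query grows with the full size of the impacted-target list.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for target in targets:
        batch.append(target)
        if len(batch) == batch_size:
            yield _kind_query(rule_kind, batch)
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield _kind_query(rule_kind, batch)
```

Each yielded string would be passed to a separate bazel query invocation and the results concatenated. Labels containing unusual characters may need quoting inside set(...).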

What do you think?

Infinite loop during bazel-diff

Sorry for this rather useless issue, but maybe it is just enough info to fix the bug. Otherwise feel free to close it.

I hit an infinite loop in bazel-diff, but only on CI. I could not reproduce it locally, hence the minimal information.
CI hung during this step: bazel run @bazel_diff//:bazel-diff -- -sh "$starting_hashes_json" -fh "$final_hashes_json" -w "$workspace_path" -b "$bazel_path" -o "$impacted_targets_path".

I was able to work around it by running bazel build ... on CI first. The only difference I see between this build and other CI builds that did succeed is that in this case a relatively big external dependency was added. Maybe the issue arises when it takes a few minutes before a bazel query returns its first result?

Invalid target throws silent error

Changing a target name without updating the dependencies that use the old name fails silently.

Example

# Package A depends on "//packages/FOO"
ts_library(
    ...
    deps = [
        "//packages/FOO",  # the full name of this target is //packages/FOO:FOO
    ],
)

# Package FOO
# Change FOO:FOO to FOO:BAR in BUILD file
ts_library(
    name = "BAR",  # used to be FOO
    ...
)

In the above example, Package A depends on an invalid target after the rename. bazel-diff currently fails silently and the impacted files output is blank.

The expected behavior is to report an error so the user is aware that they have made an invalid change to the dependency graph.

Unknown option: '-t'

Hello!

I'm trying out version 2.0.2 (update: happening also on 2.1.0) but running the provided bazel-diff-example.sh causes an error when running the command $bazel_path run //tools:bazel_diff -- -sh $starting_hashes_json -fh $final_hashes_json -w $workspace_path -b $bazel_path -o $impacted_test_targets_path -t in our repo.

The error is as follows:

Unknown option: '-t'
Usage: bazel-diff [-hV] [-aq=<avoidQuery>] -b=<bazelPath>
                  [-co=<bazelCommandOptions>] [-fh=<finalHashesJSONPath>]
                  [-o=<outputPath>] [-sh=<startingHashesJSONPath>]
                  [-so=<bazelStartupOptions>] -w=<workspacePath> [COMMAND]
Writes to a file the impacted targets between two Bazel graph JSON files
      -aq, --avoid-query=<avoidQuery>
                  A Bazel query string, any targets that pass this query will
                    be removed from the returned set of targets
  -b, --bazelPath=<bazelPath>
                  Path to Bazel binary
      -co, --bazelCommandOptions=<bazelCommandOptions>
                  Additional space separated Bazel command options used when
                    invoking Bazel
      -fh, --finalHashes=<finalHashesJSONPath>
                  The path to the JSON file of target hashes for the final
                    revision. Run 'generate-hashes' to get this value.
  -h, --help      Show this help message and exit.
  -o, --output=<outputPath>
                  Filepath to write the impacted Bazel targets to, newline
                    separated
      -sh, --startingHashes=<startingHashesJSONPath>
                  The path to the JSON file of target hashes for the initial
                    revision. Run 'generate-hashes' to get this value.
      -so, --bazelStartupOptions=<bazelStartupOptions>
                  Additional space separated Bazel client startup options used
                    when invoking Bazel
  -V, --version   Print version information and exit.
  -w, --workspacePath=<workspacePath>
                  Path to Bazel workspace directory.
Commands:
  generate-hashes     Writes to a file the SHA256 hashes for each Bazel Target
                        in the provided workspace.
  modified-filepaths  Writes to the file the modified filepaths between two
                        revisions.

All other commands work just fine, but the -t is unrecognized for some reason. Have you ever seen this before?

Filtering

Hey, this project looks great! Fills a much needed gap in the Bazel ecosystem. I was looking over the code and I was wondering if there is some nice way to filter targets using bazel-diff?

My use case is that I need to find all our changed deployment targets by rule kind, and also our changed artifact targets by specific tags.
Is this somehow possible with the current release?
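One way to approach the tag half of this outside the tool is to post-filter the impacted targets with a Bazel query. The sketch below is in Python and only builds the query string. The helper name tag_filter_query is hypothetical, and the regex follows the pattern the Bazel query documentation suggests for list attributes such as tags. kind(pattern, expr) is the analogous query function for filtering by rule kind.

```python
import re
from typing import Sequence


def tag_filter_query(targets: Sequence[str], tag: str) -> str:
    """Build a Bazel query that keeps only the targets carrying `tag`.

    List attributes such as tags are matched against their string form
    "[a, b, c]", so the pattern anchors the tag between list delimiters.
    """
    pattern = r"[\[ ]" + re.escape(tag) + r"[,\]]"
    joined = " ".join(targets)
    return f'attr("tags", "{pattern}", set({joined}))'
```

The resulting string is meant to be passed as a single argument to bazel query. When going through a shell, the backslashes and brackets need to be quoted.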

NPE when -o flag is not provided

I am getting a NullPointerException when the starting hashes and final hashes are the same:
Update: it turned out that the NPE is caused by the missing -o flag.

C:\bin>java -jar bazel-diff_deploy.jar  -w C:\W\workspace -b .\bazel.exe -sh a.txt -fh b.txt
java.lang.NullPointerException
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:228)
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:187)
        at java.base/java.io.FileWriter.<init>(FileWriter.java:96)
        at com.bazel_diff.BazelDiff.call(main.java:153)
        at com.bazel_diff.BazelDiff.call(main.java:71)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1853)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2255)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2249)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2213)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2080)
        at picocli.CommandLine.execute(CommandLine.java:1978)
        at com.bazel_diff.BazelDiff.main(main.java:169)

v3.2.3 - https://github.com/Tinder/bazel-diff/tree/3.2.3

Throws an exception when executing Generating Hashes

bazel_diff version: 4.0.8, 4.0.5, 4.0.2
Reference documentation: https://github.com/Tinder/bazel-diff/blob/4.0.8/README.md

I integrated bazel_diff into my own project using the recommended way in README.md: https://github.com/Tinder/bazel-diff/blob/4.0.8/README.md#integrate-into-your-project-recommended

After bazel_diff is generated, an error is reported when executing $bazel_diff generate-hashes -w $workspace_path -b $bazel_path $starting_hashes_json. The error is as follows:

Generating Hashes for Revision '****************'
Loading: 0 packages loaded
Loading: 1384 packages loaded
    currently loading: ******************* ... (42 packages)
WARNING: --keep_going specified, ignoring errors.  Results may be inaccurate
Loading: 1426 packages loaded
Loading: 1426 packages loaded
[Error] Unexpected error during generation of hashes
java.lang.RuntimeException: Bazel query failed, exit code 3
	at com.bazel_diff.bazel.BazelQueryService.query(BazelQueryService.kt:60)
	at com.bazel_diff.bazel.BazelClient.queryAllSourcefileTargets(BazelClient.kt:41)
	at com.bazel_diff.hash.BuildGraphHasher$hashAllBazelTargetsAndSourcefiles$1$sourceTargetsFuture$1.invokeSuspend(BuildGraphHasher.kt:40)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)

I tested bazel_diff versions 4.0.8, 4.0.5, and 4.0.2, and all of them show the same error, while earlier versions such as 3.5.0 and 3.4.2 work normally.

Can someone please help find the reason for this? Thanks.

Adding dependency to WORKSPACE interpreted as changing almost all targets

Hi!

After upgrading from 2.3.0 to 2.4.0, we encountered an issue where a change to WORKSPACE resulted in almost all targets being marked as changed. The change only added a new dependency.

I'm not sure how this could have been introduced by the change set.

Let me know if there are more details I can provide.

Dependencies between generated targets sometimes missing

Hi again,

There's a subtle bug in the way dependencies on generated files are handled in the implementation of #82 resulting in missing impacted targets when the generating target is lexicographically before the generated target.

I have an example here (https://github.com/KevinJiao/gen-file-repro) with logs:

~/projects/bazel-diff-source$ ./bazel-diff-example.sh ~projects/genrule-repro /usr/bin/bazel master^ master
Generating Hashes for Revision 'master^'
Starting local Bazel server and connecting to it...
...
Generating Hashes for Revision 'master'
...
Determining Impacted Targets
Impacted Targets between master^ and master:
//:gen_test //:gen_test.sh //:aaa_test.txt

Here I would also expect //:compare_files to be impacted, as it depends on //:aaa_test.txt

Initial bazel-diff run is resource hungry and time intensive

I build my projects using Jenkins pipeline jobs executed in fresh containers. Running the bazel-diff tool on a fresh clone of the repo can take an incredibly long time, even when there are 4 cores and 8 GB of memory available. Subsequent runs are much faster (almost instantaneous).

Some questions:

  • Is this normal behavior?
  • What are the benchmarks for bazel-diff?
  • Is there anything I can do to speed up execution?
  • If work is required on the tool itself, what features need to be worked on? I'd love to contribute if possible.

Thanks

Not all changes detected

I've got the following BUILD.bazel file

load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//docker/util:run.bzl", "container_run_and_commit_layer")
load("@io_bazel_rules_docker//docker/package_managers:download_pkgs.bzl", "download_pkgs")
load("@io_bazel_rules_docker//docker/package_managers:install_pkgs.bzl", "install_pkgs")

# Base docker image for our Scala services
download_pkgs(
    name = "base_packages",
    image_tar = "@openjdk_11_slim//image",
    packages = [
        "wget",
    ],
)

install_pkgs(
    name = "install_base_packages",
    image_tar = "@openjdk_11_slim//image",
    installables_tar = ":base_packages.tar",
    installation_cleanup_commands = "rm -rf /var/lib/apt/lists/*",
    output_image_name = "install_base_packages",
)

container_run_and_commit_layer(
    name = "base_run_commands",
    commands = [
        "wget -O /bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/v0.3.6/grpc_health_probe-linux-amd64",
        "chmod +x /bin/grpc_health_probe",
    ],
    image = "install_base_packages.tar",
)

container_image(
    name = "scala_base_image",
    base = "@openjdk_11_slim//image",
    layers = ["base_run_commands"],
    # This is needed because the scala_image expects the java executable
    # to be available at /usr/bin/java. We then use this image as a base
    # for the app image
    symlinks = {
        "/usr/bin/java": "/usr/local/openjdk-11/bin/java",
    },
    visibility = ["//visibility:public"],
)

If I change, for example, the download_pkgs target to download curl as well, bazel-diff does not detect the changed targets.

Query errors are silently ignored

Bazel queries are executed by starting a process and reading the output but existing code doesn't check for bazel errors. When I ran this in a workspace with some repository rule errors, bazel-diff exited with status 0 (and no error messages) after printing partial output.

I discovered my error by adding pb.redirectError(ProcessBuilder.Redirect.INHERIT). However, maybe you'd want to do something a little more sophisticated (like check bazel exit status and only print stderr if there's a problem), and also set bazel-diff's error code.
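The "check exit status and only print stderr if there's a problem" idea could look roughly like this. It is a Python sketch of the pattern, not the tool's Java code, and run_checked is a hypothetical helper name.

```python
import subprocess
import sys
from typing import Sequence


def run_checked(command: Sequence[str]) -> str:
    """Run a command and return its stdout, failing loudly on a non-zero exit.

    stderr is captured and only surfaced when the command fails, so a
    successful run stays quiet while a failed one cannot pass unnoticed.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stderr)
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {' '.join(command)}"
        )
    return result.stdout
```

A caller would catch the RuntimeError at the top level and exit with a non-zero status of its own, so CI sees the failure instead of partial output.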

(Also, thanks for sharing this tool publicly!)

Breaks when non-bazel files are included in modified paths

It seems like the tool silently fails when comparing commits that include files that are not a part of the Bazel workspace.

In my case I had a script in the root of the repo that I changed along with a source file that is tracked by Bazel. When I ran bazel-diff with the example script it completed without errors but reported no impacted targets.

Hash all targets has poor performance with rules_pip

Hi!

When upgrading to 3.0.0, I'm seeing large performance regressions in "hash all targets" query with my targets that depend on a rules_pip generated external dependency. The key appears to be in the kind('source file', deps(//...)) query, which is not restricted to internal targets, and therefore will result in downloading every single external dependency and attempting to hash all the source files for external dependencies.

Pre upgrade, changes in external dependencies would be picked up by the hash implementation in BazelRule.java, without needing to actually download anything. For reference, this is the change in question: #72

I have an example repository at https://github.com/KevinJiao/pip-diff-repro, and some timings:

$ bazel version
Build label: 3.7.1
$ cd bazel-diff
$ git checkout 2.4.0
$ bazel build :bazel-diff
$ time bazel run :bazel-diff --config=verbose -- generate-hashes -b /usr/bin/bazel -w /home/kjiao/projects/pip-diff-repro /tmp/hashes_fast
INFO: Build completed successfully, 2 total actions
Executing Query: '//external:all-targets' + '//...:all-targets'

real    0m5.610s
user    0m0.748s
sys     0m0.160s

$ git checkout master
$ bazel build :bazel-diff
$ sudo rm -rf /data/home/kjiao/.cache/bazel/_bazel_kjiao/35b52e3c66a2af30496de6e275263eab
$ time bazel run :bazel-diff --config=verbose -- generate-hashes -b /usr/bin/bazel -w /home/kjiao/projects/pip-diff-repro /tmp/hash
INFO: Build completed successfully, 2 total actions
Executing Query: kind('source file', deps(//...))
Executing Query: '//external:all-targets' + '//...:all-targets'

real    0m24.143s
user    0m1.125s
sys     0m0.175s

24 seconds may not seem like a lot but the actual repo I'm working with has a few hundred pip dependencies and it isn't performant to have to download all pip packages every time to determine if anything changed.

Would it be possible to either
a. Bring back the hashing using the --modifiedFilepaths flag or
b. structure the source file query to not include source files of external dependencies?

I tried to find a query that worked for option b, but I wasn't able to. I imagine that in theory something like "rdeps(//..., //...:all-targets)" should work, but that query still results in downloading all packages.

Transitive dependencies of repository rules do not change bazel-diff output hashes

In this minimal repo https://github.com/tingilee/repro-pip-bazel-diff/, we have found that transitive dependencies in the repository rules do not change the output hashes.

https://github.com/tingilee/repro-pip-bazel-diff/blob/master/BUILD#L15 :lib depends on testcontainers, and testcontainers depends on wrapt.

I ran the following at commit tingilee/repro-pip-bazel-diff@697f909 and one commit prior.

bazel run @bazel_diff//:bazel-diff -- generate-hashes -w /Users/jacqueline.lee/repro-pip-bazel-diff -b /usr/local/bin/bazel /tmp/hashes.json

And then ran

bazel run @bazel_diff//:bazel-diff  -- -sh /tmp/starting_hashes_before_change.json -fh /tmp/starting_hashes_after_change.json -w /Users/jacqueline.lee/repro-pip-bazel-diff -b /usr/local/bin/bazel -o /tmp/impacted

The expectation is that :lib should be in the impacted targets since its transitive dependency has changed, but it is not.

OutOfMemory Error when constructing hashes

I'm using bazel-diff 3.3.0 and Bazel 4.2.1. Whenever I run the provided example script over our moderately sized Go repo, the script is able to generate the starting hashes but the second set of hashes generates the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.Arrays.copyOf(Arrays.java:3536)
        at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100)
        at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:130)
        at java.base/java.io.OutputStream.write(OutputStream.java:127)
        at com.bazel_diff.BazelSourceFileTargetImpl.<init>(BazelSourceFileTarget.java:30)
        at com.bazel_diff.BazelClientImpl.processBazelSourcefileTargets(BazelClient.java:74)
        at com.bazel_diff.BazelClientImpl.queryAllSourcefileTargets(BazelClient.java:58)
        at com.bazel_diff.TargetHashingClientImpl.hashAllBazelTargetsAndSourcefiles(TargetHashingClient.java:28)
        at com.bazel_diff.GenerateHashes.call(main.java:58)
        at com.bazel_diff.GenerateHashes.call(main.java:22)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1853)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2255)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2249)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2213)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2080)
        at picocli.CommandLine.execute(CommandLine.java:1978)
        at com.bazel_diff.BazelDiff.main(main.java:169)

I have tried increasing the heap size available to Bazel in the startup options but I think this is an OOM from the JAR itself?

Add ability to disable warnings

Some of the warnings being produced right now are not very useful for our project, including warnings about not being able to hash various built-in types that won't affect build output anyway:

[Warning] Unsupported target type in the build graph: PACKAGE_GROUP
[Warning] Unable to calculate digest for input //visibility:private for rule //external:maven

It would be nice if we could turn these off to reduce noise in our CI.

Potential bug in hash function

In the file TargetHashingClient.java, the function createHashForRule creates a hash for a rule by considering the rule's inputs and if necessary recursively hashing inputs that are rules. A hashmap ruleHashes is used to cache previously computed rule hashes. I think there is a bug here, as at least on my system, the MessageDigest object is reset when .digest() is called. So if the rule is in the cache, the first time it is found it will be reset when digest is called to update the parent's digest. Any other rules that use it will subsequently update with the wrong hash.

Am I missing something here?
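For context, Java's MessageDigest.digest() is documented to reset the digest after it completes, so a cache that holds live MessageDigest objects hands out a reset object on later lookups. A common way around this is to cache the finalized digest bytes instead. The sketch below shows that pattern in Python with a made-up graph representation. Python's hashlib does not reset on digest(), so this illustrates the caching pattern rather than reproducing the bug, and rule_digest is a hypothetical name, not the function in TargetHashingClient.java.

```python
import hashlib
from typing import Dict, List


def rule_digest(
    rule: str, inputs: Dict[str, List[str]], cache: Dict[str, bytes]
) -> bytes:
    """Return a SHA-256 digest for `rule` that covers its transitive inputs.

    The cache stores finalized digest bytes, never a live hasher object, so
    reading a cached entry cannot alter or reset it. Assumes an acyclic graph.
    """
    if rule in cache:
        return cache[rule]
    hasher = hashlib.sha256(rule.encode("utf-8"))
    for dep in inputs.get(rule, []):
        # Fold each input's finished digest into this rule's hash.
        hasher.update(rule_digest(dep, inputs, cache))
    cache[rule] = hasher.digest()
    return cache[rule]
```

With bytes in the cache, every parent that shares an input folds in the same value no matter how many times the entry has been read.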

Usage: The use of `//external:` in the list impacted targets

This is an amazing tool with great potential.

My approach to using this in CI is something along the lines of

bazel test $(cat /tmp/impacted_targets.txt)

However, I get

Found reference to a workspace rule in a context where a build rule was expected; probably a reference to a target in that external repository, properly specified as @reponame//path/to/package:target, should have been specified by the requesting rule

for all //external/ targets.

Just wondering why we have //external in the list of impacted targets.

Otherwise, my natural next step is to remove all the //external targets before invoking bazel.
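That filtering step could be sketched as below, in Python, assuming the impacted-targets file has one label per line. drop_external is a hypothetical helper, not something bazel-diff provides.

```python
from typing import Iterable, List


def drop_external(targets: Iterable[str]) -> List[str]:
    """Return the labels that do not live under the //external package."""
    kept = []
    for line in targets:
        label = line.strip()
        # Skip blank lines and workspace-rule labels such as //external:maven.
        if label and not label.startswith("//external:"):
            kept.append(label)
    return kept
```

Reading /tmp/impacted_targets.txt line by line through this function gives a list that can be handed to bazel test.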

RFC: Bazel Query Service

In the BazelCon talk that inspired this repo, at this timestamp:
https://youtu.be/9Dk7mtIm7_A?t=1875
Benjamin talks about how Dropbox operationalized the bazel-diff tool by hosting it as a service. This issue proposes that we implement such a thing in this repo.

Language: Java, since that's what's already used in this repo

Storage: for the cache behavior, we need to store the hashes.json files at a given Git SHA. It should persist over server shutdowns since cache misses introduce a lot of latency. Can make this configurable but AWS S3 seems like the obvious choice most users would want.

Hosting: Ben points out in the talk that a custom load balancer can be needed for this service. So it's not enough to just ship a Docker (OCI) image that has a runnable service with networking. We probably need a k8s manifest that also describes how to run a few instances of the query service, health/load checks, and a load balancer that finds an available instance to send requests to. Maybe even dynamic scaling to adjust the number of instances.

Getting the code: we'd have to use a git client (probably assume one is on the $PATH and call it as a subprocess). Then we have to check out the workspace at various SHAs. When a server comes up it should do an initial fetch of the repo before reporting healthy to accept requests. Need to give the user configurability to reach their git server (auth keys, etc.). Also have to deal with bad git state (maybe just detect it and lame-duck the server rather than try to repair it).

Prior art:

  • Google has a service "skyframe" that basically gives you this "Bazel query at scale", partly based on bazelbuild/bazel#11194 and then a bunch of google-internal mechanics around it. I think it's safe to say that no one at Google has time or motivation to refactor that into an open-source shape. Also our scope here would be smaller, not supporting arbitrary bazel queries but only the affectedness calculation.
  • Dropbox has the implementation Benjamin describes in the talk. Maybe worth discussing with them if they can justify spending time to make that available.

feature: experimental_hash_all_targets

Hey @tinder-maxwellelliott, I encountered an interesting issue which appears to be coming from the hash function. I've put together a small repro below showing that the hashes from generate-hashes are not changing even when the source files are changing. Have you seen this before? Based on my understanding of the rule_implementation_hash bug, I don't think this is related but my exposure to this is pretty limited right now.

https://github.com/jonahgeorge/bazel-diff-repro

 ± git rev-parse HEAD
0d42f9d1b0b514382825ddf1272a226ec5a6bff0

 ± bazel run //:bazel-diff -- \
  --workspacePath $(pwd) --bazelPath $(which bazel) generate-hashes /dev/stdout | grep repro

INFO: Invocation ID: d4a882ba-027c-4361-a6c8-fb66c2e269ea
Loading: 
Loading: 0 packages loaded
Analyzing: target //:bazel-diff (0 packages loaded, 0 targets configured)
INFO: Analyzed target //:bazel-diff (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[1 / 6] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //:bazel-diff up-to-date:
  bazel-bin/bazel-diff.jar
  bazel-bin/bazel-diff
INFO: Elapsed time: 0.169s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/bazel-diff --workspacePath /Users/jonahgeorge/Workspace/button/bazel-diff-repro --bazelPath /usr/local/bin/bazel generate-hashes /dev/stdout
INFO: Build completed successfully, 1 total action
  "//:repro": "5bc3def9bc1f63785fe2b84dee005d62e8d20de42be50e796b6c57e45ed58632",
  "//:repro.go": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "//:repro_lib": "7e462ed495dd9d56d742eab9d831ff19ef18d5bdb9fba0e105f5ce24914c70b1",

 ± echo '\nfunc init() { fmt.Println("should change the hash") }' >> repro.go

 ± git add repro.go

 ± git commit -m 'modify repro.go'
[master bf6d672] modify repro.go
 1 file changed, 2 insertions(+)

 ± bazel run //:bazel-diff -- \                                              
  --workspacePath $(pwd) --bazelPath $(which bazel) generate-hashes /dev/stdout | grep repro

INFO: Invocation ID: fb25ffac-0adc-4806-bf1e-1ee7fabd30d2
Loading: 
Loading: 0 packages loaded
Analyzing: target //:bazel-diff (0 packages loaded, 0 targets configured)
INFO: Analyzed target //:bazel-diff (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[1 / 7] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //:bazel-diff up-to-date:
  bazel-bin/bazel-diff.jar
  bazel-bin/bazel-diff
INFO: Elapsed time: 0.168s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/bazel-diff --workspacePath /Users/jonahgeorge/Workspace/button/bazel-diff-repro --bazelPath /usr/local/bin/bazel generate-hashes /dev/stdout
INFO: Build completed successfully, 1 total action
  "//:repro": "5bc3def9bc1f63785fe2b84dee005d62e8d20de42be50e796b6c57e45ed58632",
  "//:repro.go": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "//:repro_lib": "7e462ed495dd9d56d742eab9d831ff19ef18d5bdb9fba0e105f5ce24914c70b1",

Memory "leak": Whole source directory is loaded into memory to lazy-generate digests

BazelSourceFileTarget's constructor loads the entire contents of the given source file, plus some digest, into memory. These contents are only used to later get the SHA256 digest of it, and these objects are kept around presumably for the whole runtime of bazel-diff.

This means that Java needs a heap size >= the size of your source tree. Since only the digest of these contents is needed, we should calculate it immediately and store that instead. That would allow us to need a heap size of only O(digest_size * num_files).

In our repository, bazel-diff takes between 8 and 12GB of RAM (we pass -Xms12g -Xmx12g, and it doesn't work with 8g). With my PR (I'll send it after submitting this issue), bazel-diff takes less than 512MB (but more than 256MB).
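The idea of computing the digest immediately instead of holding file contents can be sketched as follows. This is a Python illustration of the technique, not the PR itself: file_sha256 and the 64 KiB chunk size are assumptions.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 64 * 1024) -> bytes:
    """Hash a file in fixed-size chunks so its contents never sit in memory whole."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.digest()
```

Only the 32-byte digest per file needs to be retained, which matches the O(digest_size * num_files) bound described above.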

Infinite "loading packages" step

Hello,

I can't manage to reproduce this locally because it is currently working on my machine, but in our CI/CD pipeline when I'm generating hashes, the command seems to get stuck while loading packages during one of the Bazel queries.

Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:
Loading: 504 packages loaded currently loading:

Eventually it will reach 505 but at that point it just sits at Loading: 505 packages loaded for an indeterminate amount of time (has gone 20+ minutes before)

Any idea why this might happen?

Performance issues for large diffs

In our repo, we have thousands of generated JSON files that are not tracked by Bazel.

On diffs with ~1000 untracked files we're finding some performance problems stemming from the non-bazel query fix: #9.

Moving the partition to 100 improves the performance, but of course the original #8 bug occurs in this situation.

Any thoughts on how to keep the large partition size but still fix the query bug?

Logging switch for bazel calls and main

Our Bazel workspace does a lot of container pre-fetching and other setup tasks. As bazel-diff runs silently it is difficult to determine what Bazel is currently running, particularly on large diffs or critical changes.

This is a feature request for a --verbose switch that can re-enable Bazel logging for the queries, and just general logging around the stages of bazel-diff. Happy to assist!
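As a rough picture of what such a switch might look like from the command-line side, here is a small Python sketch. The flag name, the program name diff-wrapper, and the chosen log levels are all hypothetical. bazel-diff itself uses picocli, so the real option would be declared there.

```python
import argparse
import logging
from typing import List, Optional


def configure_logging(argv: Optional[List[str]] = None) -> int:
    """Parse a hypothetical --verbose flag and return the chosen log level."""
    parser = argparse.ArgumentParser(prog="diff-wrapper")
    parser.add_argument(
        "-v",
        "--verbose",
        action="store_true",
        help="log each stage and the Bazel commands being run",
    )
    args = parser.parse_args(argv)
    # Quiet by default; DEBUG when the user asks for detail.
    level = logging.DEBUG if args.verbose else logging.WARNING
    logging.getLogger("diff-wrapper").setLevel(level)
    return level
```

Stage messages ("generating hashes", "running query ...") would then be emitted at DEBUG so that default runs stay as quiet as they are today.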

`--version` still says `3.4.0`

Deploy jar for 4.0.8 still says version is 3.4.0

❱❱❱ wget -O /tmp/tools/bazel-diff.jar https://github.com/Tinder/bazel-diff/releases/download/4.0.8/bazel-diff_deploy.jar

...

Saving to: ‘/tmp/tools/bazel-diff.jar’

/tmp/tools/bazel-diff.jar                                     100%[==============================================================================================================================================>]   9.26M  13.9MB/s    in 0.7s

2022-09-16 09:21:51 (13.9 MB/s) - ‘/tmp/tools/bazel-diff.jar’ saved [9706614/9706614]

❱❱❱ java -jar /tmp/tools/bazel-diff.jar --version
3.4.0

Change in rule attribute not detected

Hello! I've noticed that one type of change isn't detected by bazel-diff. The situation is as follows:
We have a custom rule with a tool as a default executable attribute. It looks something like this:

def _my_rule_impl(ctx):
    # TODO
    pass

_my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "_tool": attr.label(
            default = "//tools/release:ios_release",
            executable = True,
            cfg = "host",
        ),
    },
    executable = True,
)

def my_rule(name, **args):
    # Rules must be instantiated with keyword arguments.
    _my_rule(name = name, **args)

I can see in a diff that changes to //tools/release:ios_release are correctly identified. I would also expect all my_rule targets to show up in the diff, since they all depend on //tools/release:ios_release, but they do not.

Stack overflow error when generating hashes

Hello!

After updating to 2.4.0, we've started seeing StackOverflowErrors when running generate-hashes (using modified-filepaths):

Exception in thread "main" java.lang.StackOverflowError
	at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)
	at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4804)
	at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4749)
	at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
	at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
	at java.base/java.util.regex.Pattern$BranchConn.match(Pattern.java:4713)
	at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4863)
	at java.base/java.util.regex.Pattern$BmpCharPropertyGreedy.match(Pattern.java:4344)
	at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4804)
	at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4749)
	at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
	at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3964)
	at java.base/java.util.regex.Pattern$Start.match(Pattern.java:3619)
	at java.base/java.util.regex.Matcher.search(Matcher.java:1729)
	at java.base/java.util.regex.Matcher.find(Matcher.java:773)
	at java.base/java.util.Formatter.parse(Formatter.java:2702)
	at java.base/java.util.Formatter.format(Formatter.java:2655)
	at java.base/java.util.Formatter.format(Formatter.java:2609)
	at java.base/java.lang.String.format(String.java:2897)
	at com.bazel_diff.BazelRuleImpl.transformRuleInput(BazelRule.java:53)
	at com.bazel_diff.BazelRuleImpl.lambda$getRuleInputList$0(BazelRule.java:38)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
	at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
	at com.bazel_diff.BazelRuleImpl.getRuleInputList(BazelRule.java:39)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:101)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
	at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:106)
       (etc)

I had a look at the code and saw the recursive call in createDigestForRule, which seems to be the culprit. However, I'm also curious whether we might be doing something wrong to get into this state, since I haven't seen any reports of similar issues yet.

Any thoughts?
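For illustration, here is a minimal sketch of how a per-target digest could be computed with an explicit stack instead of recursion, so that very deep dependency chains cannot exhaust the call stack. This is not bazel-diff's implementation: digest_all, deps, and seeds are hypothetical names, and the sketch assumes the dependency graph is acyclic.

```python
import hashlib


def digest_all(deps, seeds):
    """Compute a digest per target without recursion.

    deps:  hypothetical dict mapping a target label to the labels it depends on.
           Assumed to be acyclic; a cycle would make this loop forever.
    seeds: hypothetical dict mapping a label to bytes describing its own content.
    """
    digests = {}
    for root in deps:
        stack = [root]
        while stack:
            label = stack[-1]
            if label in digests:
                stack.pop()
                continue
            pending = [d for d in deps.get(label, []) if d not in digests]
            if pending:
                # Visit dependencies first. The explicit stack is an ordinary
                # list, so its depth is not bounded by the thread's call stack.
                stack.extend(pending)
                continue
            h = hashlib.sha256(seeds.get(label, b""))
            for dep in sorted(deps.get(label, [])):
                h.update(digests[dep].encode())
            digests[label] = h.hexdigest()
            stack.pop()
    return digests
```

A chain of tens of thousands of targets is processed without hitting a recursion limit, and a change to a leaf's seed still propagates to every target that depends on it.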

Over-triggering with rules_go and Gazelle under certain circumstances

This issue is basically a better documented version of #70

Example repository: https://github.com/mikberg/bazel-diff-problem

This repository sets up rules_go with Gazelle, a go binary with an external dependency and some rule which depends on the WORKSPACE file. I think all three are important.

The second commit adds a comment to the WORKSPACE file, demonstrating that any change to WORKSPACE now results in bazel-diff regarding all external Go dependencies as impacted. This is presumably also why the go_binary is regarded as impacted.

./bazel-diff-example.sh <path-to>/bazel-diff-test /usr/local/bin/bazelisk d075588217ec39bac4064dc7feb7026cd899a1b0 d6e1e901dcd5dd9e9b32c2ce575938946839055f
Generating Hashes for Revision 'd075588217ec39bac4064dc7feb7026cd899a1b0'
Generating Hashes for Revision 'd6e1e901dcd5dd9e9b32c2ce575938946839055f'
Determining Impacted Targets
Impacted Targets between d075588217ec39bac4064dc7feb7026cd899a1b0 and d6e1e901dcd5dd9e9b32c2ce575938946839055f:
//:project_lib //external:com_github_kr_pretty //external:com_github_bazelbuild_buildtools //external:in_gopkg_check_v1 //external:com_github_pelletier_go_toml //external:org_golang_x_mod //external:bazel_gazelle_go_repository_config //external:com_github_fsnotify_fsnotify //external:com_github_rs_zerolog //:some-script //external:in_gopkg_yaml_v2 //external:com_github_bazelbuild_rules_go //external:com_github_kr_text //:project //external:com_github_davecgh_go_spew //external:com_github_bmatcuk_doublestar //external:org_golang_x_net //external:org_golang_x_text //:WORKSPACE //external:com_github_kr_pty //external:com_github_pmezard_go_difflib //external:org_golang_x_sync //external:com_github_burntsushi_toml //external:com_github_google_go_cmp //external:org_golang_x_crypto

I haven't been able to nail this down further.

As far as I can tell, this problem was introduced between 2.3.0 and 2.4.0.

Thank you for a fantastic tool!

Filter Targets tagged as "manual"

When running the example bazel-diff-example.sh on my repository, I also get the targets tagged as manual.

Bazel already filters them out when wildcard target patterns are used, and it would be great to have a configuration option so that bazel-diff can filter them automatically too.
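Until such an option exists, the filtering can be done as a post-processing step on the list of impacted targets. The sketch below assumes a hypothetical tags_by_label mapping from label to tags (built separately, for example from bazel query output); drop_manual is not part of bazel-diff.

```python
def drop_manual(impacted, tags_by_label):
    """Return the impacted labels that are not tagged "manual".

    impacted:      iterable of target labels, e.g. lines of impacted_targets.txt.
    tags_by_label: hypothetical dict mapping a label to its list of tags.
                   Labels missing from the dict are treated as untagged.
    """
    return [
        label
        for label in impacted
        if "manual" not in tags_by_label.get(label, ())
    ]
```

The order of the input list is preserved, so the result can be written back out and passed to bazel test unchanged.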

BTW. This tool is great, thank you for open sourcing it 😄

Provide the final shell script as part of distribution

Right now users copy-paste the bazel-diff-example.sh to stitch together the bazel-diff commands. This means users end up modifying and having diverging copies, not able to upstream their improvements to share with other users.

Ideally some "shrink-wrapped" distribution would include the top-level entry point (maybe still using Bash).

Generates same hashes

I have a C/C++ project, and I run bazel-diff as follows:

  1. bazel build //... -> builds all files
  2. run bazel-diff: java -jar bazel-diff_deploy.jar generate-hashes -b .\bazel.exe -w C:\project A.txt
  3. change single .c file in a project
  4. run bazel-diff: java -jar bazel-diff_deploy.jar generate-hashes -b .\bazel.exe -w C:\project B.txt
  5. bazel build //... -> this rebuilds this single .c file - bazel detected change correctly
  6. run bazel-diff: java -jar bazel-diff_deploy.jar generate-hashes -b .\bazel.exe -w C:\project C.txt

All three files, {A,B,C}.txt, end up identical. What am I doing wrong? Why doesn't bazel-diff detect that the configuration changed?

My goal is to find a list of labels/files that changed between 2 git commits.
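Conceptually, the list of changed labels comes from comparing the two hash files. The sketch below assumes each file is a JSON object mapping a target label to its hash; load_hashes and impacted_targets are hypothetical helper names, not bazel-diff APIs. In this sketch, labels that exist only in the final revision count as impacted, and removed labels are ignored.

```python
import json


def load_hashes(path):
    """Read a hash file, assumed to be a JSON object of label -> hash."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def impacted_targets(start_hashes, final_hashes):
    """Return labels whose hash is new or differs between the two revisions."""
    return sorted(
        label
        for label, digest in final_hashes.items()
        if start_hashes.get(label) != digest
    )
```

If the two inputs are identical, as with the {A,B,C}.txt files above, the result is an empty list, so identical hash files always mean no impacted targets are reported.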

Error uncommitted changes but git showing no changes

After following the setup guide, adding bazel-diff to the BUILD file, and running bazel run //:bazel-diff -- modified-filepaths -w . -b $(which bazel) "HEAD^" HEAD a, the command exits with: There are active changes in '.', please commit these changes before running modified-filepaths. However, running git status --porcelain immediately after the error, in the same directory, prints no non-whitespace characters.

I'm running on macOS.

WORKSPACE

http_jar(
    name = "bazel_diff",
    urls = [
        "https://github.com/Tinder/bazel-diff/releases/download/2.1.2/bazel-diff_deploy.jar",
    ],
    sha256 = "a01d8e26dc0abfacd282f44a72434613d9799270bff406848e0ba8df7bae2081"
)

BUILD

java_binary(
    name = "bazel-diff",
    main_class = "com.bazel_diff.BazelDiff",
    runtime_deps = ["@bazel_diff//jar"],
)

Blocked by transitive dep causing `CircularDependencyException`, while Bazel builds fine

Transitive dependency causing circular dependency error in 4.0.8

//external:libuv is not a target that we define in our repo. It's a transitive dependency from GRPC: https://github.com/grpc/grpc/blob/master/bazel/grpc_deps.bzl#L183-L186

[Error] Unexpected error during generation of hashes
com.bazel_diff.hash.RuleHasher$CircularDependencyException: Circular dependency detected:
//external:libuv -> //external:libuv
	at com.bazel_diff.hash.RuleHasher.raiseCircularDependency(RuleHasher.kt:23)
	at com.bazel_diff.hash.RuleHasher.digest(RuleHasher.kt:36)
	at com.bazel_diff.hash.RuleHasher$digest$finalHashValue$1.invoke(RuleHasher.kt:54)
	at com.bazel_diff.hash.RuleHasher$digest$finalHashValue$1.invoke(RuleHasher.kt:41)
	at com.bazel_diff.hash.HashingExtensionsKt.sha256(HashingExtensions.kt:14)
	at com.bazel_diff.hash.RuleHasher.digest(RuleHasher.kt:41)
	at com.bazel_diff.hash.TargetHasher.digest(TargetHasher.kt:38)
	at com.bazel_diff.hash.BuildGraphHasher.hashAllTargets$lambda-6(BuildGraphHasher.kt:107)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
	at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

PR #138 states that a dependency depending on itself would keep Bazel from building, but that's not necessarily the case: here it does not interfere with Bazel building.

I don't have the full context on this, but if the only reason this exception is raised is the assumption that Bazel wouldn't build, then I think we should change it to a warning message.
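As a rough illustration of the warning-based behaviour suggested above, the following sketch skips an edge that points back into the dependency chain currently being hashed and emits a warning instead of raising. It is not the RuleHasher code from the stack trace; digest, deps, and seeds are hypothetical names.

```python
import hashlib
import warnings


def digest(label, deps, seeds, cache=None, in_progress=None):
    """Hash a target and its dependencies, warning on circular edges.

    deps:  hypothetical dict mapping a label to the labels it depends on.
    seeds: hypothetical dict mapping a label to bytes describing its content.
    """
    cache = {} if cache is None else cache
    in_progress = set() if in_progress is None else in_progress
    if label in cache:
        return cache[label]
    in_progress.add(label)
    h = hashlib.sha256(seeds.get(label, b""))
    for dep in sorted(deps.get(label, [])):
        if dep in in_progress:
            # The edge closes a cycle: skip it rather than abort the whole run.
            warnings.warn(f"Circular dependency skipped: {label} -> {dep}")
            continue
        h.update(digest(dep, deps, seeds, cache, in_progress).encode())
    in_progress.discard(label)
    cache[label] = h.hexdigest()
    return cache[label]
```

For a self-referencing target such as //external:libuv this yields a stable hash plus one warning. For cycles spanning several targets, the cached hashes depend on where the traversal entered the cycle, so a real implementation would need a more careful strategy.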

bazelrc change is not reflected in bazel-diff

If I change build options (such as how dexing works on Android) in my bazelrc file, all affected targets should be rebuilt. However, bazel-diff currently cannot detect this change.

I'm thinking we should add a skip-list of file name patterns: if a changed file matches one of them, bazel-diff returns all targets.
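A minimal sketch of that idea, assuming shell-style patterns and hypothetical helper names (needs_full_rebuild and select_targets are not bazel-diff functions):

```python
from fnmatch import fnmatchcase


def needs_full_rebuild(changed_files, patterns):
    """True if any changed file path matches any skip-list pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


def select_targets(changed_files, patterns, all_targets, impacted):
    """Return every target when a skip-list file changed, else only the impacted ones."""
    if needs_full_rebuild(changed_files, patterns):
        return sorted(all_targets)
    return sorted(impacted)
```

With a pattern such as "*.bazelrc", a change to .bazelrc or tools/ci.bazelrc selects all targets, because fnmatchcase lets "*" match path separators and leading dots.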

Exitcode is not propagated

It looks like the exit code is not being propagated correctly, i.e. the tool fails silently with exit 0.
I think the problem is in main:

    public static void main(String[] args) {
        new CommandLine(new BazelDiff()).execute(args);
    }

I assume the fix would be something like:

    public static void main(String[] args) {
        Integer exitCode = new CommandLine(new BazelDiff()).execute(args);
        if (exitCode == null) {
            return;
        }
        System.exit(exitCode);
    }

Let me know if this is actually an issue, as my local setup might be off. If it is, I will follow up with a PR.

Usage instructions aren't accurate since 4.0.6

The recommended way of using bazel-diff doesn't work after 4.0.5 because zip files are now being published, but http_jar requires jar files.

http_jar(
    name = "bazel_diff",
    urls = [
        "https://github.com/Tinder/bazel-diff/releases/download/4.0.5/bazel-diff_deploy.jar",
    ],
    sha256 = "59f2a614f90b4c2a6c83f1e6146d8722dfaac3a1d8f42734dcbb6ccf373a1cbd",
)

Change in external repo not detected

I've found a case that bazel-diff does not handle properly and I've got a repro here: https://github.com/purkhusid/bazel-diff-repro

You can check out the repository and run:

./bazel-diff.sh $(pwd) bazel $(git rev-parse HEAD~1) $(git rev-parse HEAD)

bazel-diff should have marked //:yo as changed since it does depend on @scuttle//:bin which was changed between commits.

Unable to run due to failed to download zlib 1.2.11.tar.gz

openjdk version "11.0.5" 2019-10-15 LTS
OpenJDK Runtime Environment Zulu11.35+15-CA (build 11.0.5+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.35+15-CA (build 11.0.5+10-LTS, mixed mode)

When running bazel build //:bazel-diff I got

WARNING: Download from https://zlib.net/zlib-1.2.11.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
ERROR: An error occurred during the fetch of repository 'zlib':

I've checked, and it seems that only https://zlib.net/zlib-1.2.12.tar.gz is available now; 1.2.11 is no longer there.

stackOverflow issue. A fix for the issue 71 fix.

I wonder what the best way to debug this problem is. I think it happens because bazel-diff cannot parse some of the rules and ends up in a recursive loop. I want to find out which rule is causing the problem so that I can perhaps fix the rule.

==========

[2021-07-21T08:14:49Z] Exception in thread "main" java.lang.StackOverflowError

  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4804)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4749)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$BranchConn.match(Pattern.java:4713)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4863)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$BmpCharPropertyGreedy.match(Pattern.java:4344)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4804)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4749)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4747)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3964)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Pattern$Start.match(Pattern.java:3619)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Matcher.search(Matcher.java:1729)
  | [2021-07-21T08:14:49Z] at java.base/java.util.regex.Matcher.find(Matcher.java:773)
  | [2021-07-21T08:14:49Z] at java.base/java.util.Formatter.parse(Formatter.java:2702)
  | [2021-07-21T08:14:49Z] at java.base/java.util.Formatter.format(Formatter.java:2655)
  | [2021-07-21T08:14:49Z] at java.base/java.util.Formatter.format(Formatter.java:2609)
  | [2021-07-21T08:14:49Z] at java.base/java.lang.String.format(String.java:2897)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.BazelRuleImpl.transformRuleInput(BazelRule.java:53)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.BazelRuleImpl.lambda$getRuleInputList$0(BazelRule.java:38)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
  | [2021-07-21T08:14:49Z] at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  | [2021-07-21T08:14:49Z] at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.BazelRuleImpl.getRuleInputList(BazelRule.java:39)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:88)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
  | [2021-07-21T08:14:49Z] at com.bazel_diff.TargetHashingClientImpl.createDigestForRule(TargetHashingClient.java:93)
.... same call to createDigestForRule 100+ times ....

Filter targets

I see the avoid_query functionality (#26, #28) that allows targets to not be included in the set of targets generated by bazel-diff doesn't exist in the latest release.

I have a use case that I believe to be very common: I pass the targets that bazel-diff generates to bazel test, and I don't want Docker-related targets to be built since that would take far too much time.

Impacted targets always empty

Hello!

I'm trying out the example script (using the latest version with the fix released in https://github.com/Tinder/bazel-diff/releases/tag/2.1.1), but the /tmp/impacted_targets.txt file is always empty in my case. My test case contains only two commits: commit A has some changes, and commit B has some other changes in another package.

I then run ./bazel-diff-example.sh ~/src/myrepo bazelisk A-SHA B-SHA.

Here's the result:

  • /tmp/modified_filepaths.txt correctly contains the paths of the files changed in commit B (8 files were modified in my example).
  • /tmp/starting_hashes.json correctly contains some hashes.
  • /tmp/final_hashes.json contains exactly the same hashes as the starting json, this seems wrong to me.
  • /tmp/impacted_targets.txt is empty.

I'm not really sure what the problem could be. I tried running bazel-diff from source with some added log statements, and I see that the impacted targets array is empty, so something must be wrong in the diffing or in the computation of the hashes. Did you ever see something like this? When I run bazelisk query '//external:all-targets' + '//...:all-targets', I can see the target that I would expect to be run with the changes in commit B.

Change in toolchain definition is not detected?

Hi!

I am working with a C++ project and wanted to apply bazel-diff there. When I make changes to a toolchain definition (e.g. in a typical cc_toolchain_config.bzl file), they do not seem to be detected.

Bazel handles toolchain resolution implicitly, outside the target definitions. Could it be that bazel-diff misses those dependencies when it runs a normal query? In our project we add toolchains using the command line parameter --extra_toolchains, which is not supported when running a bazel query.

[Feature Request] Allow to provide a content hash json file

Background

bazel-diff is an awesome tool that helps us a lot compared to bazel query rdeps. However, we have a very large repo, and generating content hashes takes ~2 minutes in bazel-diff. More than half of the files in our repo are translation files, so we can apply a shortcut to them that reduces the hashing time from ~2 minutes to 50 seconds.

Feature description

We would like to be able to provide a JSON file that contains some of the content hashes, such as:

{
  "web/src/pages/login/button/button.tsx": "891ad1e682a7e6538603291963d62740f8fce4244771f66b1f977ad546e88868",
  "foo-services/src/java/com/xx/foo.java": "c09a57550d151625ea9d8d9ffb0344f16f243c74fb4770dcee6a530a14cb5cb5"
}

When running bazel-diff generate-hashes, we would like bazel-diff to use the content hash we provided if the file is listed in our JSON file; otherwise it should fall back to the existing behavior.
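The lookup-with-fallback behaviour described above could look roughly like this. It is only a sketch of the request, not an existing bazel-diff feature: content_hash, provided, and read_bytes are hypothetical names, and SHA-256 is assumed because the example hashes above are 64 hex characters long.

```python
import hashlib
from pathlib import Path


def _read_file(path):
    """Default reader: load the file's bytes from disk."""
    return Path(path).read_bytes()


def content_hash(path, provided, read_bytes=_read_file):
    """Return the content hash for a source file.

    provided:   hypothetical dict loaded from the user-supplied JSON file,
                mapping a workspace-relative path to a precomputed hash.
    read_bytes: callable used to load file content when no hash was provided.
    """
    if path in provided:
        # Trust the caller-supplied hash and skip reading the file entirely.
        return provided[path]
    return hashlib.sha256(read_bytes(path)).hexdigest()
```

Files listed in the JSON file are never opened, which is where the time saving for the translation files would come from; every other file is hashed as before.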

Others

I can help implement this feature if the intent looks good to you.
