clangbuildanalyzer's People

Contributors

aras-p, artemist, dutor, i-ky, jdumas, jspam, lectem, llunak, mj-xmr, modkin, mrexodia, neroburner, nexuapex, ot, rootkiller, shua27, sigiesec, stolyaroleh, trass3r, vitaut, vittorioromeo, warchant, waywardmonkeys

clangbuildanalyzer's Issues

Exception on less-than operator in template argument

With code like

template<typename T, unsigned N>
void foo(typename std::enable_if<(N < sizeof(T)), T>::type argument) {}

void bar() { foo<int, 2>(3); }

the parsing in collapseName breaks down, because < and > no longer match up. Then we go into retval.append(pos, new_pos); with pos > new_pos, which causes a std::length_error to be thrown in libstdc++.

Note that while C++11 syntax requires parentheses around comparisons in template arguments (like in the code above), they don't seem to appear in the string that we have here, which looks like std::enable_if<(2u) < (sizeof(int)), int>::type.
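
One defensive option would be to check that the angle brackets balance before trying to collapse a name, and to leave the name untouched when they do not. The following is only a minimal sketch of that idea; `AngleBracketsBalanced` is a hypothetical helper, not a function from the ClangBuildAnalyzer source, and it is a guard rather than a real parser (a name containing both a `<` and a `>` comparison could still balance by accident).

```cpp
#include <string_view>

// Hypothetical helper (not part of ClangBuildAnalyzer): returns true only when
// every '<' in the name is closed by a later '>' and no '>' appears unmatched.
bool AngleBracketsBalanced(std::string_view name)
{
    int depth = 0;
    for (char c : name)
    {
        if (c == '<')
            ++depth;
        else if (c == '>' && --depth < 0)
            return false; // a '>' with no matching '<'
    }
    return depth == 0;
}
```

With the string from this report, the stray less-than operator leaves one bracket open, so the check fails and the caller could return the name unchanged instead of calling append with an inverted range.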

Add a `--version` command line argument

There doesn't appear to be a way to get the ClangBuildAnalyzer version installed. Could a --version command line argument be added for this purpose? Thanks in advance 🙂

Consider upgrading to simdjson 0.4.0

Version 0.4 of simdjson is now available

Highlights

  • Test coverage has been greatly improved and we have resolved many static-analysis warnings on different systems.

New features:

  • We added a fast (8GB/s) minifier that works directly on JSON strings.
  • We added a fast (10GB/s) UTF-8 validator that works directly on strings (any strings, including non-JSON).
  • The array and object elements have a constant-time size() method.

Performance:

  • Performance improvements to the API (type(), get<>()).
  • The parse_many function (ndjson) has been entirely reworked. It now uses a single secondary thread instead of several new threads.
  • We have introduced a faster UTF-8 validation algorithm (lookup3) for all kernels (ARM, x64 SSE, x64 AVX).

System support:

  • C++11 support for older compilers and systems.
  • FreeBSD support (and tests).
  • We support the clang front-end compiler (clangcl) under Visual Studio.
  • It is now possible to target ARM platforms under Visual Studio.
  • The simdjson library will never abort or print to standard output/error.

Version 0.3 of simdjson is now available

Highlights

  • Multi-Document Parsing: Read a bundle of JSON documents (ndjson) 2-4x faster than doing it individually. API docs / Design Details
  • Simplified API: The API has been completely revamped for ease of use, including a new JSON navigation API and fluent support for error code and exception styles of error handling with a single API. Docs
  • Exact Float Parsing: Now simdjson parses floats flawlessly without any performance loss (simdjson/simdjson#558).
    Blog Post
  • Even Faster: The fastest parser got faster! With a shiny new UTF-8 validator
    and meticulously refactored SIMD core, simdjson 0.3 is 15% faster than before, running at 2.5 GB/s (where 0.2 ran at 2.2 GB/s).

Minor Highlights

  • Fallback implementation: simdjson now has a non-SIMD fallback implementation, and can run even on very old 64-bit machines.
  • Automatic allocation: as part of API simplification, the parser no longer has to be preallocated; it will adjust automatically when it encounters larger files.
  • Runtime selection API: We've exposed simdjson's runtime CPU detection and implementation selection as an API, so you can tell what implementation we detected and test with other implementations.
  • Error handling your way: Whether you use exceptions or check error codes, simdjson lets you handle errors in your style. APIs that can fail return simdjson_result, letting you check the error code before using the result. But if you are more comfortable with exceptions, skip the error code and cast straight to T, and exceptions will be thrown automatically if an error happens. Use the same API either way!
  • Error chaining: We also worked to keep non-exception error-handling short and sweet. Instead of having to check the error code after every single operation, now you can chain JSON navigation calls like looking up an object field or array element, or casting to a string, so that you only have to check the error code once at the very end.

distinguish external headers

I guess it could come in handy to have a separate section "Expensive external headers" for headers not coming from the current source tree, to find out which headers could easily go into a PCH.

Not sure about that though. If internal headers dominate, as in LLVM, it's harder to use a PCH.

Same file is reported multiple times when relative paths are used

Here is the output I've got when analyzing one project:

*** Expensive headers:
4826 ms: ../../../include/common.h (included 82 times, avg 58 ms), included via:
  ...

1657 ms: ../../../../include/common.h (included 28 times, avg 59 ms), included via:
  ...

938 ms: ../../include/common.h (included 16 times, avg 58 ms), included via:
  ...

This is actually the same header and in the project it is always included via

#include "common.h"

Project folder structure looks like this:

├ include/
└ src/
  └ foo/
    └ bar/
      └ baz/

... with each directory having its own Makefile listing subdirectories and make is called recursively. Top-level make gets -Iinclude and down the line we get -I../include, -I../../include, etc.

I think ClangBuildAnalyzer should report either absolute paths to files or paths relative to its working directory combining paths from *.json files and relative locations of *.json files themselves. Path fragments like <some directory>/../ should be squashed.
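
The squashing part could be sketched with C++17 `std::filesystem`. This is a minimal illustration, assuming the header path recorded in a trace is relative to some known base directory (for example the directory the compiler ran in); `NormalizeHeaderPath` and the example paths are hypothetical.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: join a recorded header path onto the directory it is
// relative to, then squash "<dir>/../" fragments purely lexically.
// lexically_normal() does not touch the file system, so symlinks are not resolved.
std::string NormalizeHeaderPath(const std::filesystem::path& baseDir,
                                const std::filesystem::path& recorded)
{
    return (baseDir / recorded).lexically_normal().generic_string();
}
```

Under that assumption, "../../include/common.h" seen from /proj/src/foo and "../../../../include/common.h" seen from /proj/src/foo/bar/baz both normalize to /proj/include/common.h, so the three entries above would collapse into one.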

Wall-clock vs CPU clock

While looking at the clang patch that added -ftime-trace, I noticed that it uses std::chrono::steady_clock. That is most certainly a wall clock, not a CPU clock. That means the numbers obtained when running a parallel build (which is almost unavoidable on a large project) may be unreliable, depending on scheduling. Did you consider this issue?
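
To illustrate the difference between the two kinds of clock, here is a small sketch that times a sleep with both `std::chrono::steady_clock` and `std::clock`. `MeasureSleep` is a hypothetical name and this is not how clang measures anything; also note that on POSIX systems `std::clock` reports processor time, while some C runtimes (MSVC's, as far as I know) report elapsed time instead.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Elapsed
{
    double wallMs; // elapsed time according to steady_clock
    double cpuMs;  // processor time according to std::clock
};

// Hypothetical helper: sleep for the given duration and report how much
// wall-clock time and how much CPU time passed while doing so.
Elapsed MeasureSleep(std::chrono::milliseconds duration)
{
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();

    std::this_thread::sleep_for(duration); // the thread is idle here

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    Elapsed result;
    result.wallMs = std::chrono::duration<double, std::milli>(wallEnd - wallStart).count();
    result.cpuMs = 1000.0 * static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    return result;
}
```

A process that is descheduled during a heavily parallel build looks like the sleeping thread here: wall time keeps advancing while little or no CPU time is consumed.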

MinGW support

I changed this to build with MinGW:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 71f708f..56223a3 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -22,4 +22,4 @@ set(SRC
 )
 add_executable(ClangBuildAnalyzer "${SRC}")
 target_compile_features(ClangBuildAnalyzer PRIVATE cxx_std_17)
-target_link_libraries(ClangBuildAnalyzer -lrt -lpthread)
+target_link_libraries(ClangBuildAnalyzer -lpthread)
diff --git a/src/main.cpp b/src/main.cpp
index 3ef1fc1..68e46d6 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -96,7 +96,7 @@ static int RunStart(int argc, const char* argv[])
     return 0;
 }
 
-#ifdef _MSC_VER
+#ifdef _WIN32
 static time_t FiletimeToTime(const FILETIME& ft)
 {
     ULARGE_INTEGER ull;
@@ -128,7 +128,7 @@ struct JsonFileFinder
         if (!cf_get_file_time(f->path, &mtime))
             return;
         time_t fileModTime;
-#ifdef _MSC_VER
+#ifdef _WIN32
         fileModTime = FiletimeToTime(mtime.time);
 #else
         fileModTime = mtime.time;

../src/external/cute_files.h:427: int cf_dir_open(cf_dir_t*, const char*): Assertion `0' failed on ClangBuildAnalyzer --stop

Recently, when trying to use ClangBuildAnalyzer to analyze a firefox build, I got the following error and assertion failure on ClangBuildAnalyzer --stop:

ERROR: Failed to open directory (obj-x86_64-pc-linux-gnu-trace/dist/bin/chrome/en-US/locale/en-US/mozapps/downloads/settingsChange.dtd): No such file or directory.
ClangBuildAnalyzer: ../src/external/cute_files.h:427: int cf_dir_open(cf_dir_t*, const char*): Assertion `0' failed.

obj-x86_64-pc-linux-gnu-trace/dist/bin/chrome/en-US/locale/en-US/mozapps/downloads/settingsChange.dtd exists, but is a symbolic link to a file, not a directory:

[simon@sigibln fuzzy]$ stat obj-x86_64-pc-linux-gnu-trace/dist/bin/chrome/en-US/locale/en-US/mozapps/downloads/settingsChange.dtd
  File: obj-x86_64-pc-linux-gnu-trace/dist/bin/chrome/en-US/locale/en-US/mozapps/downloads/settingsChange.dtd -> /home/simon/work/fuzzy/toolkit/locales/en-US/chrome/mozapps/downloads/settingsChange.dtd
  Size: 88              Blocks: 8          IO Block: 4096   symbolic link
Device: fd02h/64770d    Inode: 34560445    Links: 1
Access: (0777/lrwxrwxrwx)  Uid: ( 1000/   simon)   Gid: ( 1000/   simon)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2020-10-21 18:42:33.948517287 +0200
Modify: 2020-10-14 18:48:03.012265746 +0200
Change: 2020-10-14 18:48:03.012265746 +0200
 Birth: 2020-10-14 18:48:03.012265746 +0200
[simon@sigibln fuzzy]$ stat /home/simon/work/fuzzy/toolkit/locales/en-US/chrome/mozapps/downloads/settingsChange.dtd
  File: /home/simon/work/fuzzy/toolkit/locales/en-US/chrome/mozapps/downloads/settingsChange.dtd
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd02h/64770d    Inode: 49026953    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   simon)   Gid: ( 1000/   simon)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2020-10-22 09:43:44.048022446 +0200
Modify: 2020-10-22 09:43:34.870015064 +0200
Change: 2020-10-22 09:43:34.870015064 +0200
 Birth: 2020-10-22 09:43:29.366010639 +0200

Fails on very large capture files

We ran this on a large codebase and the final capture file was 3.8GB. After several seconds, the analyze phase fails with the output:

Analyzing build trace from './results'...
ERROR: JSON parse error expected end of input.
  no trace events found.

The ./results file appears to be intact. Running the tool on a smaller subset of our codebase worked just fine.

Thanks for the great project! It's already helping us immensely.

Unknown trace events with clang 11

When using ClangBuildAnalyzer, there are lots of "unknown trace event" warnings, including:

PassManager<llvm::Function>
PassManager<llvm::Module>
LoopLoadEliminationPass
Float2IntPass
ModuleToFunctionPassAdaptor<llvm::PassManager<llvm::Function>>
GlobalDCEPass
CGProfilePass
ConstantMergePass

Not sure if these are meaningful to handle in ClangBuildAnalyzer, but at least they should not produce warnings.

Create a folder automatically when using `--start`

Right now, you get an error when trying to use --start with a folder name that does not exist:

❯ clangbuildanalyzer --start artifacts
ERROR: failed to create session file at 'artifacts/ClangBuildAnalyzerSession.txt'.

Creating a folder manually solves this, but I think it should be done automatically (ideally recursively, so that it works when multiple folders need to be created):

~/Documents/Git/godotengine/godot master
❯ mkdir artifacts                                            

~/Documents/Git/godotengine/godot master
❯ clangbuildanalyzer --start artifacts
Build tracing started. Do some Clang builds with '-ftime-trace', then run 'ClangBuildAnalyzer --stop artifacts <filename>' to stop tracing and save session to a file.
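
For what it's worth, the recursive creation could be sketched with C++17 `std::filesystem::create_directories`. `EnsureDirectoryExists` is a hypothetical helper name, and where it would be called in the --start code path is an assumption on my part.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: make sure the artifacts folder exists before the
// session file is written. create_directories() also creates any missing
// parent folders and does not report an error if the folder already exists.
bool EnsureDirectoryExists(const std::filesystem::path& dir)
{
    std::error_code ec;
    std::filesystem::create_directories(dir, ec);
    // Fails if, for example, a regular file with that name is in the way.
    return !ec && std::filesystem::is_directory(dir, ec);
}
```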

crash when running --all on a very large project

ClangBuildAnalyzer is a great tool and it has really helped me improve the build time of a large project I am working on.

My project is so large that the tool crashes when loading the JSON files. I believe it is running out of memory while parsing the JSON DOM. My project has 83GB of JSON files. I have run ClangBuildAnalyzer on all the files, and I can analyze all of them if I do it in two groups, with the 51 largest JSON files in one group and the other 241 files in the other group.

While debugging, I changed the max tasks to 1 in main.cpp, hoping that parsing 1 json at a time would reduce the memory footprint. It may have gotten farther, but ClangBuildAnalyzer still crashed.

I tried using gdb to see the crash, but the app is killed due to out of memory, instead of core dumping, so gdb didn't give me any useful information.

I wondered if there was a memory leak or other bug in the json parsing. I manually copied the latest release of simdjson (0.9.4) into ClangBuildAnalyzer and rebuilt. I still got the same crashes.

I am currently using clang-11. I had more luck with ClangBuildAnalyzer on this project in the past and may have been using clang-10. I wonder if clang-10 writes fewer events? I do not see any warnings about unknown events while running ClangBuildAnalyzer with clang-11.

Could ClangBuildAnalyzer add a batch mode to support large projects? For example, I could run ClangBuildAnalyzer --all <subdirN> on the N subdirs of my build. Then, I would run ClangBuildAnalyzer --analyze with N file names instead of 1. ClangBuildAnalyzer would then read all N binary event files and concatenate all the data together before producing the statistics over the combined set of all events.

I have studied the LoadBuildEvents() method and believe it could be possible to have it append new files to the current event and name lists, but I haven't had time to do it yet, especially making sure the event-to-name index is updated correctly.

ERROR: no clang -ftime-trace .json files found under <dir>

I am not able to get a simple trace running successfully through this tool. From a freshly cloned and built tree, I run:

# ./ClangBuildAnalyzer  --start /tmp
Build tracing started. Do some Clang builds with '-ftime-trace', then run 'ClangBuildAnalyzer --stop /tmp <filename>' to stop tracing and save session to a file.
# clang <various options...> -o /tmp/main.cc.o -c /tmp/main.cc -ftime-trace -ftime-report
ls /tmp
# ls /tmp/main*
/tmp/main.cc.json  /tmp/main.cc.o
# ./ClangBuildAnalyzer --stop /tmp /tmp/report
Stopping build tracing and saving to '/tmp/report'...
ERROR: no clang -ftime-trace .json files found under '/tmp'.

Seems to find the json file and fail while parsing it (within ParseBuildEvents in the code). The clang version is 11/head, downloaded and built today. Chrome can read and display the file successfully within chrome://tracing.

Any ideas? My apologies if I'm missing something obvious, and thank you for producing this tool and -ftime-trace. Incredibly useful.

New trace events

It looks like clang 13 added new events to trace its backend more closely, and ClangBuildAnalyzer does not recognise them.
For example:

WARN: unknown trace event 'AlwaysInlinerPass' in 'redacted.cpp.json', skipping.
WARN: unknown trace event 'ModuleToFunctionPassAdaptor' in 'redacted.cpp.json', skipping.
WARN: unknown trace event 'ModuleToFunctionPassAdaptor' in 'redacted.cpp.json', skipping.
WARN: unknown trace event 'AddressSanitizerPass' in 'redacted.cpp.json', skipping.
WARN: unknown trace event 'AddressSanitizerPass' in 'redacted.cpp.json', skipping.
...

Should they just be added as fallthrough after

else if (StrEqual(name, "RunLoopPass"))
?

Mismatch between printf format string and arguments

While compiling on Linux with GCC 9, I get the following warnings:

src/BuildEvents.cpp: In function ‘void DebugPrintEvents(const BuildEvents&, const BuildNames&)’:
src/BuildEvents.cpp:15:26: warning: format ‘%i’ expects argument of type ‘int’, but argument 3 has type ‘BuildEventType’ [-Wformat=]
   15 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), names[event.detailIndex].substr(0,130).c_str());
      |                         ~^                                                 ~~~~~~~~~~
      |                          |                                                       |
      |                          int                                                     BuildEventType
src/BuildEvents.cpp:15:35: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 4 has type ‘int64_t’ {aka ‘long int’} [-Wformat=]
   15 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), names[event.detailIndex].substr(0,130).c_str());
      |                               ~~~~^                                                    ~~~~~~~~
      |                                   |                                                          |
      |                                   long long unsigned int                                     int64_t {aka long int}
      |                               %7lu
src/BuildEvents.cpp:15:44: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘long int’ [-Wformat=]
   15 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), names[event.detailIndex].substr(0,130).c_str());
      |                                        ~~~~^                                                     ~~~~~~~~~~~~~~~~~~
      |                                            |                                                             |
      |                                            long long unsigned int                                        long int
      |                                        %7lu
$ g++ --version
g++ (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008

Self timings vs. hierarchical timings

I'm noticing that the timings don't add up, or are spread across multiple cores. One issue appears to be that the top offenders are just the hierarchical timings of the calls, without subtracting the children.

It would be nice to have self timings sorted out, so I can see which headers are specifically at fault for the build slowdowns. Maybe there's a mode for this, and I'm just too much of a newbie using the tool. But I do love it. So thanks for making this open-source.

Here's an example from our traces. hash_table is a part of unordered_map, so I doubt the two take 137 s and 123 s independently; more likely hash_table accounts for 123 of unordered_map's 137 s. Self timings would portray the offender more accurately, although both do have a huge number of instantiations. When I look at -ftime-trace flame graphs, the times for our other functions are likewise not self times and overlap a lot.

**** Template sets that took longest to instantiate:
137531 ms: std::unordered_map<$> (16285 times, avg 8 ms)
123470 ms: std::__hash_table<$> (18299 times, avg 6 ms)
 65691 ms: std::function<$>::function<$> (1805 times, avg 36 ms)
 65352 ms: std::__function::__value_func<$>::__value_func<$> (1805 times, avg 36 ms)
 54030 ms: std::__function::__func<$>::__func (1805 times, avg 29 ms)
 50907 ms: std::forward_as_tuple<$> (8135 times, avg 6 ms)
 48883 ms: std::allocator_traits<$> (43333 times, avg 1 ms)
 47541 ms: std::__function::__alloc_func<$>::__alloc_func (5415 times, avg 8 ms)
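
To make the request concrete: self time is the inclusive duration of an event minus the inclusive durations of its direct children. Below is a minimal sketch of that computation; `TimedEvent` is a simplified stand-in and not the tool's actual event structure.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified, hypothetical event record: inclusive duration plus the
// indices of the event's direct children in the same vector.
struct TimedEvent
{
    int64_t dur;
    std::vector<int> children;
};

// Self time = inclusive duration minus the inclusive durations of the
// direct children (grandchildren are already counted inside the children).
std::vector<int64_t> ComputeSelfTimes(const std::vector<TimedEvent>& events)
{
    std::vector<int64_t> self(events.size());
    for (std::size_t i = 0; i < events.size(); ++i)
    {
        int64_t childTotal = 0;
        for (int child : events[i].children)
            childTotal += events[static_cast<std::size_t>(child)].dur;
        self[i] = events[i].dur - childTotal;
    }
    return self;
}
```

For a parent of duration 100 with children of 60 and 30, the parent's self time comes out as 10, which is the number a "self timings" report would sort by.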

Compilation on CentOS6 fails at link time with undefined reference errors

I know CentOS6 is not very current, but I needed to add the -lrt flag to the linker command in projects/make/Makefile; otherwise I would get undefined reference to `clock_gettime' errors.

As in:

diff --git a/projects/make/Makefile b/projects/make/Makefile
index f7b14dd..2e79ff2 100644
--- a/projects/make/Makefile
+++ b/projects/make/Makefile
@@ -30,7 +30,7 @@ clean:
        rm -f build/ClangBuildAnalyzer $(C_OBJS) $(CPP_OBJS)

 build/ClangBuildAnalyzer: $(C_OBJS) $(CPP_OBJS)
-       $(CXX) -o $@ $(C_OBJS) $(CPP_OBJS) $(LDFLAGS) -lpthread
+       $(CXX) -o $@ $(C_OBJS) $(CPP_OBJS) $(LDFLAGS) -lrt -lpthread

 build/%.o: %.c
        mkdir -p $(dir $@)

Support scanning for existing json files

It would be nice to have an option to just run the tool on an existing collection of profiling files in the build directory, instead of capturing them only between the start/stop cycle. The current workflow does not play well with incremental builds, where most of the sources may already have been processed and their build time stats already captured. To use clang's profiling information, one currently has to start the build from scratch every time, which doesn't seem strictly necessary.

-Wformat warnings on gcc 9.3.0

When I compiled the latest release (v1.2.0) I got some interesting warnings:

/opt/Workspace/ClangBuildAnalyzer/src/BuildEvents.cpp:55:26: warning: format ‘%i’ expects argument of type ‘int’, but argument 3 has type ‘BuildEventType’ [-Wformat=]
   55 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), std::string(names[event.detailIndex].substr(0,130)).c_str());
      |                         ~^                                                 ~~~~~~~~~~
      |                          |                                                       |
      |                          int                                                     BuildEventType
/opt/Workspace/ClangBuildAnalyzer/src/BuildEvents.cpp:55:35: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 4 has type ‘int64_t’ {aka ‘long int’} [-Wformat=]
   55 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), std::string(names[event.detailIndex].substr(0,130)).c_str());
      |                               ~~~~^                                                    ~~~~~~~~
      |                                   |                                                          |
      |                                   long long unsigned int                                     int64_t {aka long int}
      |                               %7lu
/opt/Workspace/ClangBuildAnalyzer/src/BuildEvents.cpp:55:44: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘long int’ [-Wformat=]
   55 |         printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), std::string(names[event.detailIndex].substr(0,130)).c_str());
      |                                        ~~~~^                                                     ~~~~~~~~~~~~~~~~~~
      |                                            |                                                             |
      |                                            long long unsigned int                                        long int
      |                                        %7lu

How about using "{fmt}" for such formatted output? It might be overkill, though, to add another external library just for this.
At the least, the warnings can be fixed if, instead of passing all the arguments inline, we declare constants for each of them with the types that the printf format string expects:

        const auto eventType = static_cast<int>(event.type); // %i
        const auto t1 = static_cast<long long unsigned int>(event.ts); // %7llu
        const auto t2 = static_cast<long long unsigned int>(event.ts+event.dur); // %7llu
        const auto par = static_cast<int>(event.parent.idx); // %4i
        const auto ch = static_cast<size_t>(event.children.size());
        const auto det = std::string(names[event.detailIndex].substr(0,130)); // %s
        printf("%4zi: t=%i t1=%7llu t2=%7llu par=%4i ch=%4zi det=%s\n", i, eventType, t1, t2, par, ch, det.c_str());

Display template instantiation origin

In the report, we can see:

**** Templates that took longest to instantiate:
  1079 ms: std::variant<sf::Event::Empty, sf::Event::Closed, sf::Event::Resized... (18 times, avg 59 ms)
   961 ms: std::__detail::__variant::_Variant_base<sf::Event::Empty, sf::Event:... (18 times, avg 53 ms)
   ...

**** Template sets that took longest to instantiate:
  1582 ms: std::vector<$> (740 times, avg 2 ms)
  1273 ms: std::unique_ptr<$> (115 times, avg 11 ms)
  ...

It would be nice if we could somehow figure out where the bulk of these instantiations come from. For example, it would be useful if the report showed something like the following:

  1079 ms: std::variant<sf::Event::Empty, sf::Event::Closed, sf::Event::Resized... (18 times, avg 59 ms)
       - 12 times in Foo.cpp
       - 6 times in Bar.cpp
   ...

**** Template sets that took longest to instantiate:
  1582 ms: std::vector<$> (740 times, avg 2 ms)
      - 420 times in Baz.cpp
      - 320 times in Abc.cpp
  ...

Not sure how feasible/difficult this would be to implement.
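
The aggregation itself seems simple, assuming each instantiation event can be tied back to the translation unit whose trace it came from. Here is a rough sketch; the `Instantiation` record and `CountByOrigin` are hypothetical and say nothing about how the tool stores events.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one template instantiation and the translation
// unit whose trace file it was found in.
struct Instantiation
{
    std::string templateName;
    std::string sourceFile;
};

// Counts, for every template name, how many instantiations each source
// file contributed.
std::map<std::string, std::map<std::string, int>>
CountByOrigin(const std::vector<Instantiation>& items)
{
    std::map<std::string, std::map<std::string, int>> counts;
    for (const Instantiation& item : items)
        ++counts[item.templateName][item.sourceFile];
    return counts;
}
```

The per-file counts could then be sorted in descending order and printed under each template line, as in the mock-up above.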

Is it possible to get tool's output in a machine-readable format (json, csv)?

This tool was very helpful in optimizing build times. However, I would like to be able to analyze the output a little more thoroughly: for example, how much of the total time is taken by each header or translation unit. That would greatly help with cost-benefit analysis when attempting any kind of refactoring. To do that I need the raw aggregated data of the build analysis. Is it possible to get it from the tool at the moment? If not, how hard would it be to add, and where should I look to do that?

Checksum mismatch

I'm attempting to profile fuchsia compilations and am running into:

Analyzing build trace from '/tmp/capture-file'...
ERROR: corrupt input file '/tmp/capture-file' (checksum mismatch)

Do you have any suggestions for debugging this? This is from a fresh ClangBuildAnalyzer checkout, and there didn't seem to be any issues with building or running it. I can confirm that -ftime-trace is working as intended, since I can see the .json files in my build. The capture file I made is about 470 MB.

Build failure with CMake because of #define min/max in Windows.h

ClangBuildAnalyzer pulls in Windows.h at several points, leading to these compilation errors:

src\main.cpp(175): error C2589: '(': illegal token on right side of '::'
src\external\flat_hash_map\flat_hash_map.hpp(1282): error C2589: '(': illegal token on right side of '::'

It looks like you've avoided this in the Visual Studio project by adding a global NOMINMAX define.

This should be done for the CMake build as well. I fixed it locally by adding this to CMakeLists:

target_compile_definitions(ClangBuildAnalyzer PRIVATE "NOMINMAX")

Error no json files found on macOS 12/ LLVM clang 13

Looking at an older issue #45 I'm guessing LLVM clang 13 is not identified by this tool.

The error is

Stopping build tracing and saving to '1'...
ERROR: no clang -ftime-trace .json files found under './'.

Clang version:

clang version 13.0.0
Target: x86_64-apple-darwin21.2.0
Thread model: posix

An example JSON file:

{
    "traceEvents": [
        {
            "pid": 16764,
            "tid": 259,
            "ph": "X",
            "ts": 6343,
            "dur": 831,
            "name": "Source",
            "args": {
                "detail": "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.1.sdk/usr/include/sys/cdefs.h"
            }
        },
        {
            "pid": 16764,
            "tid": 259,
            "ph": "X",
            "ts": 3786,
            "dur": 242254,
            "name": "Frontend"
        },
        {
            "pid": 16764,
            "tid": 259,
            "ph": "X",
            "ts": 125,
            "dur": 258862,
            "name": "ExecuteCompiler"
        },
        {
            "pid": 16764,
            "tid": 260,
            "ph": "X",
            "ts": 0,
            "dur": 258861,
            "name": "Total ExecuteCompiler",
            "args": {
                "count": 1,
                "avg ms": 258
            }
        },
        {
            "pid": 16764,
            "tid": 261,
            "ph": "X",
            "ts": 0,
            "dur": 242253,
            "name": "Total Frontend",
            "args": {
                "count": 1,
                "avg ms": 242
            }
        },
        {
            "pid": 16764,
            "tid": 262,
            "ph": "X",
            "ts": 0,
            "dur": 217862,
            "name": "Total Source",
            "args": {
                "count": 28,
                "avg ms": 7
            }
        },
        {
            "pid": 16764,
            "tid": 263,
            "ph": "X",
            "ts": 0,
            "dur": 2,
            "name": "Total PerformPendingInstantiations",
            "args": {
                "count": 1,
                "avg ms": 0
            }
        },
        {
            "cat": "",
            "pid": 16764,
            "tid": 259,
            "ts": 0,
            "ph": "M",
            "name": "process_name",
            "args": {
                "name": "clang"
            }
        },
        {
            "cat": "",
            "pid": 16764,
            "tid": 259,
            "ts": 0,
            "ph": "M",
            "name": "thread_name",
            "args": {
                "name": ""
            }
        }
    ],
    "beginningOfTime": 1641965546191608
}

Need to improve error handling

I mistakenly passed my artefacts folder path to the "--analyze" argument, and ClangBuildAnalyzer crashed with std::bad_alloc from BufferedReader(FILE* f) because "fsize" was calculated as 9223372036854775807.
I think it would be better to add some checks for simple mistakes like this. At the very least, the tool should check that the path is valid and that it points to a file.
Also, error handling around fseek, ftello64, etc. is skipped entirely: ftello64 returns a signed int64_t, and its result is not checked before being implicitly cast to size_t.
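
A check along these lines could catch the mistake before any buffer is allocated. This is only a sketch using C++17 `std::filesystem`; `RegularFileSize` is a hypothetical helper and not the tool's BufferedReader code.

```cpp
#include <cstdint>
#include <filesystem>
#include <optional>
#include <system_error>

// Hypothetical helper: returns the size of the file at 'path', or nothing
// if the path does not exist, is not a regular file (for example a
// directory), or its size cannot be queried.
std::optional<std::uintmax_t> RegularFileSize(const std::filesystem::path& path)
{
    std::error_code ec;
    if (!std::filesystem::is_regular_file(path, ec) || ec)
        return std::nullopt;

    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec)
        return std::nullopt;
    return size;
}
```

The caller can then print a clear error when no size comes back, instead of sizing an allocation from an unchecked ftello64 result.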

Several "WARN: unknown trace event" with trunk clang.

I'm using clang built at http://reviews.llvm.org/rG64a362e7216a43e (from Oct 2). With that, ClangBuildAnalyzer prints these for every TU:

WARN: unknown trace event 'PerFunctionPasses' in 'out/src/gn/ninja_generated_file_target_writer_unittest.json', skipping.
WARN: unknown trace event 'PerModulePasses' in 'out/src/gn/ninja_generated_file_target_writer_unittest.json', skipping.
WARN: unknown trace event 'CodeGenPasses' in 'out/src/gn/ninja_generated_file_target_writer_unittest.json', skipping.

These were added in https://reviews.llvm.org/D68161 .

Looks like https://reviews.llvm.org/D69750 also adds "DebugType", but my compiler doesn't have that yet.

Latest tagged release does not have support for --all flag

The README mentions the flag (merged to master in Oct 2020), but the latest release v1.2.0 (March 2020) doesn't have support for it. Discovered this after much confusion, and then looking at the pull requests & source code. Can we get a new release/tag? :)

--all was introduced in #47

Allow collapsing non-ambiguous operator symbols

I have a case where I have a bunch of templated classes that have potentially expensive operator() implementations. However, due to how we collapse names, we never collapse any operators at all, so these don't end up grouped when highlighting expensive template instantiation patterns. The relevant code is here:

...
static std::string_view CollapseName(const std::string_view& elt)
{
    // Parsing op<, op<<, op>, and op>> seems hard.  Just skip'm all
    if (elt.find("operator") != std::string::npos)
      return elt;
...

I'm wondering if we can change this to only prevent collapsing if we find either 'operator<' or 'operator>' in the name so that we can surface something like this.
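
As a sketch of the narrower check (hypothetical, not a tested patch): besides 'operator<' and 'operator>', 'operator->' probably needs to be excluded too, since it also contains a '>' that would not match any '<'. `HasAmbiguousOperator` is a made-up name for illustration.

```cpp
#include <string_view>

// Hypothetical predicate: true when the name contains an operator whose
// spelling includes '<' or '>' and could therefore confuse bracket matching.
// "operator<" also covers <<, <=, <=> and <<=; "operator>" also covers >>, >=
// and >>=; "operator->" covers -> and ->*.
bool HasAmbiguousOperator(std::string_view name)
{
    return name.find("operator<") != std::string_view::npos
        || name.find("operator>") != std::string_view::npos
        || name.find("operator->") != std::string_view::npos;
}
```

CollapseName could then return early only when this predicate is true, which would let names ending in operator() be collapsed. The substring test is deliberately conservative: an identifier that merely ends in "operator" followed by a template argument list would also be skipped.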

More fine-grained compilation stats

Not sure if this is the correct place for this issue, but here it goes.
It would be good to see a more detailed overview of the back-end phase.
In particular, it would be great to have at least a global "**** Passes that took longest:" section.
Possibly something more fine-grained too (e.g. top-10 pass-function or pass-file pairs).

Assertion failure on --stop command

Command:
~/ClangBuildAnalyzer/build/ClangBuildAnalyzer --stop ~/build-Debug profile_result.json

Result:

ERROR: String ".PropertyMover1DTo3DDataFlowTest" too long to copy on line 231 in file src/external/cute_files.h (max length of 32).
ClangBuildAnalyzer: src/external/cute_files.h:215: int cf_safe_strcpy_internal(char *, const char *, int, int, const char *, int): Assertion `0' failed.
./profile_compilation.sh: line 4: 165782 Aborted                 (core dumped) ~/ClangBuildAnalyzer/build/ClangBuildAnalyzer --stop ~/build-Debug profile_result.json

Ubuntu 18.04 with Clang 9

Wrong printf format specifiers for int64_t

In DebugPrintEvents %7ld is used to print a pair of int64_t values. When compiling with Visual Studio that leads to printf mismatch warnings because on Windows (32-bit or 64-bit) long is a 32-bit value.

I believe that %7lld is a portable 64-bit format value. It definitely works for 32-bit and 64-bit Windows processes. This is the updated line of code that I used to fix the warnings:

    printf("%4zi: t=%i t1=%7lld t2=%7lld par=%4i ch=%4zi det=%s\n", i, (int) event.type, event.ts, event.ts+event.dur, event.parent.idx, event.children.size(), std::string(names[event.detailIndex].substr(0,130)).c_str());
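
One caveat: on 64-bit Linux int64_t is typically long, so %lld can itself trigger format warnings there. A sketch of an alternative that should match int64_t on every platform is the PRId64 macro from <cinttypes>. FormatEventTimes is a hypothetical helper used only to show the format string; it is not the actual DebugPrintEvents code.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a begin/end timestamp pair.
// PRId64 expands to the correct length modifier for int64_t
// ("ld" or "lld" depending on the platform).
static std::string FormatEventTimes(int64_t t1, int64_t t2)
{
    char buf[64];
    const int written = std::snprintf(buf, sizeof(buf),
                                      "t1=%7" PRId64 " t2=%7" PRId64, t1, t2);
    // Two 64-bit values plus the labels always fit in 64 bytes.
    assert(written > 0 && written < static_cast<int>(sizeof(buf)));
    return std::string(buf, static_cast<size_t>(written));
}
```

The same `"%7" PRId64` pieces can be dropped into the existing printf format string in place of %7ld.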

Missing diagnostic for JSON errors

I built my project with clang-cl version 16.0.6 and -ftime-trace. When I run ClangBuildAnalyzer on it, I get:

$ ClangBuildAnalyzer.exe --all ninja_project_Release buildAnalyze.bin
Processing all files and saving to 'buildAnalyze.bin'...
WARN: JSON parse error TAPE_ERROR: The JSON document has an improper structure: missing or superfluous commas, braces, missing keys, etc..
WARN: JSON parse error TAPE_ERROR: The JSON document has an improper structure: missing or superfluous commas, braces, missing keys, etc..
WARN: JSON parse error F_ATOM_ERROR: Problem while parsing an atom starting with the letter 'f'.
WARN: JSON parse error TAPE_ERROR: The JSON document has an improper structure: missing or superfluous commas, braces, missing keys, etc..
  done in 86.6s. Run 'ClangBuildAnalyzer --analyze buildAnalyze.bin' to analyze it.

I'm not sure what the problem is, but it would be helpful if the tool at least printed which JSON file had the issue, along with the line number if possible.

Segmentation fault during --stop

Unfortunately, when using it with our project, I get "Segmentation fault (core dumped)" during --stop.
I could provide the data you need, but I don't know exactly what to attach.

Our project is quite small: less than 100K LOC, and the number of files is not big either.

I noticed in htop that the utility starts to use a lot of RAM; maybe this could help.

CMake does not work on macOS due to rt requirement

The rt library is unconditionally added as a requirement in CMakeLists.txt:

target_link_libraries(ClangBuildAnalyzer -lrt -lpthread)

This library is not present on macOS, and hence the build fails.

When I remove the -lrt entry, building succeeds.

The line above should be changed to something like:

if (APPLE)
  target_link_libraries(ClangBuildAnalyzer -lpthread)
else()
  target_link_libraries(ClangBuildAnalyzer -lrt -lpthread)
endif()

I can make a PR if requested.

Not working with Xcode

> ~/Downloads/ClangBuildAnalyzer-mac --start clang_analyzer
Build tracing started. Do some Clang builds with '-ftime-trace', then run 'ClangBuildAnalyzer --stop clang_analyzer <filename>' to stop tracing and save session to a file.

> ~/Downloads/ClangBuildAnalyzer-mac --stop clang_analyzer clang_analyzer_report
Stopping build tracing and saving to 'clang_analyzer_report'...
ERROR: no .json files found under 'clang_analyzer'.

macOS Catalina 10.15.5 (19F101)

ClangBuildAnalyzerTest.zip

Output file silently truncated when not enough space is available

Thank you very much for this tool (and your work on -ftime-trace)!

I tested it on our medium-sized project, where --stop creates a 3.8GB output file. I did not have enough free space in the output folder (as I later realized). However, the --stop command succeeded (exit code 0) and did not print an error message. Only when --analyze failed to parse the JSON file did I realize that the output had been truncated.

Other file writing errors are detected, such as missing write permissions.
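
For illustration, a minimal sketch of how a truncated write could be detected, assuming the output is written through C stdio (I have not checked how the tool actually writes the file). WriteFileChecked is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: writes `data` to `path` and returns false if any step fails.
static bool WriteFileChecked(const std::string& path, const std::string& data)
{
    FILE* f = std::fopen(path.c_str(), "wb");
    if (!f)
        return false;

    // A short write (e.g. disk full) shows up as a count smaller than requested.
    const bool wrote = std::fwrite(data.data(), 1, data.size(), f) == data.size();

    // Buffered data may only reach the disk on flush/close, so a full disk
    // can also surface as an fclose failure.
    const bool closed = std::fclose(f) == 0;

    if (!(wrote && closed))
    {
        std::remove(path.c_str()); // do not leave a truncated file behind
        return false;
    }
    return true;
}
```

The caller can then print an error and return a non-zero exit code when this returns false.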

Improve performance / resource usage for big codebases

The analyzer can't read large json files due to this code:

diff --git a/src/main.cpp b/src/main.cpp
index 380b26f..fdfafbc 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -26,7 +26,7 @@ static std::string ReadFileToString(const std::string& path)
     if (!f)
         return "";
     fseek(f, 0, SEEK_END);
-    size_t fsize = ftell(f);
+    size_t fsize = _ftelli64(f);
     fseek(f, 0, SEEK_SET);
     std::string str;
     str.resize(fsize);

Maybe it should just use memory-mapped files.
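
Short of memory mapping, a portable alternative to the Windows-specific _ftelli64 might be std::filesystem::file_size, which returns a std::uintmax_t and so does not depend on the width of long. A sketch, assuming C++17; ReadLargeFile is a hypothetical helper and not the project's ReadFileToString:

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper: reads the whole file into `out`, returns false on any error.
static bool ReadLargeFile(const std::string& path, std::string& out)
{
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec)
        return false; // missing file, directory, permission problem, ...
    if (size > out.max_size())
        return false; // cannot be held in a std::string on this platform

    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;

    out.resize(static_cast<size_t>(size));
    in.read(out.data(), static_cast<std::streamsize>(out.size()));
    // gcount() reports how many bytes were actually read.
    return static_cast<std::uintmax_t>(in.gcount()) == size;
}
```

This still loads the entire file into memory, so it only addresses the size truncation, not the overall memory usage.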
