Coder Social home page Coder Social logo

logan's Introduction

What is this?

LOG ANalizer ('logan') is specifically designed for log files produced by applications based on the Mozilla Gecko Platform through MOZ_LOG. This mainly means the Firefox browser.

The main focus is to search for objects of selected classes and their properties (e.g. url). Secondary focus is to walk the object linking chain to reach the "the line" while everything uninterested is filtered off.

The analyzer is based on logan-rules.js file containing set of matching rules to track objects lifetime, relations and properties. It's hugely generic, thus very powerful and pretty much open to enhancements by anyone. See below the Rules definition reference section for details.

The code has originally been published on GitHub and a live instance is running on my site as well as a beta staging here.

Current state

logan works for most use cases and logs produced with current Firefox Nightly for mainly diagnosing networking issues. It can process parent and child logs together as well as rotated logs automatically.

Missing functionality

  • way to easily customize the rules when using a life-staged instance [#7]
  • no worker to do parsing and searching off the main thread (expect your browser to be freezing with large logs)

How to use it

1. Select log files

The first screen has a single purpose - to drop or browse for log files. After the files are selected, you are brought to a secondary screen and can watch the loading progress bar in the heading. When load is done, you can start searching. Note that a 64-bit browser is recommended for larger logs.

2. Search for an object by its class name and properties

First, select a class name. Second, select one of the properties that have been captured on the objects. You can also search by a line captured on an object. Third, select the compare method and value you are looking for (e.g. a URI.) Pressing the [ Search ] button brings you to the next screen with results.

(Note about the state property: objects that have been released from memory (a destructor has been found in the log for them) has state released. Rules for an object (see the Rules definition reference section and logan-rules.js) may change the state property during the object lifetime. An example is nsHttpTransaction that has been put on nsHttpConnection - its state is then active. If rules don't change the state, object's state is then created between its creation and destruction.)

3. Results exploration

If some objects have matched the search criteria you will see a list of them on the screen. Each line represents a summary of the object found, with few selected properties displayed right away. The objects are ordered according the position in the log file (in general - by it's creation order.)

Each search result line has a check box at its left side:

  • 2017-06-01 18:51:01.616 │ nsHttpTransaction │ 14D4C400 │ released │ - │ - │ https://example.com/

When you check it, all lines captured from the log belonging to the object are revealed and given a distinct color.

Revealing more than one object interleaves the log lines with previously revealed object lines, always ordered according the position in the whole log file(s).

4. Links to referred or referring objects

Among lines of a revealed object you may find 'references' to and from other objects. Those lines again have their own checkboxes:

  • nsHttpChannel @13ED1800 --> nsHttpTransaction @14D4C400

Checking it reveals the related object. The object gets a new distinctive color. These references allows you to explore the object chaining from the top to the bottom. (A good example is the chain of nsHttpChannel linking nsHttpTransaction linking nsHttpConnection etc. Revealing the nsHttpConnection object may then show what all other transactions have been dispatched on that connection.)

5. Breadcrumbs

At the top of the results view there is a sticky area listing all objects that have been expanded.

Clicking a breadcrumb shows all properties captured on the object. By default this shows the final state of the object (as it was at the end of the log.) This can be changed using the seek function, see following section.

6. Changing the position in the log via 'seek'

It may be useful to search for an object being in a certain state at an exact line in the log. The seek controls can be found under the [ Search ] button, appearing as:

seek: tail

A line to seek the log to can be easily selected by clicking the text tail and then picking a line from the results.

After that details of previously searched objects will show the properties state at the new seek position.

Any new search will look for objects and their property values at the current seek position.

To seek back to the tail of the log, click the red ⭳ icon.

Note: lines in the results view that are past the seek point are marked with a red bar in the front.

7. Network diagnostics

This is experimental, but may be really helpful when finding or verifying scheduling and prioritization problems and enhancements. This works only for nsHttpChannel objects. The parent log must be loaded along with all the child logs. The log files must have been captured on Firefox 57 with following modules:

MOZ_LOG=timestamp,sync,nsHttp:5,cache2:5,DocumentLeak:5,PresShell:5,DocLoader:5,nsDocShellLeak:5,RequestContext:5,LoadGroup:5,nsSocketTransport:5

The details pane of an nsHttpChannel - opened with a breadcrumb click - has a [diagnose] button at the top. It opens a diagnostic page that puts this channel to its request context, listing various timing and other properties of the channel. The page also lists all other channels running in parallel to or before the diagnosed channel (within the page laod), broke down by various conditions. It's better to try this live than to describe here, also because this is still developing. All should be hopefully self-explanatory.

The code lives in logan-netdiag.js with API referred from the rules (in a bit quick-hacky-duplication way, subject to change.)

Rules definition reference

The rules are defined in logan-rules.js file in a hierarchy of a schema (a top level name-space), and modules within the schema. A module is an equivalent of a mozilla log module (e.g. nsHttp, cache2). See logan-rules.js heading for a life example.

There are multiple schemes. You can load MOZ_LOG generated files, raw console output, a try run console output, logs from TaskCluster (try server). Nice thing is that these "wrapping" schemes are capable of "forwarding" the unwrapped input to lower level schemes such as MOZ_LOG or 'text console', which means everything is nicely merged together - for instance DOM window counters and MOZ_LOG tracked objects.

A schema pre-processes every line with a pre-processing function. The MOZ_LOG schema parses and separates the time stamp, module name, thread name and the actual log text. Lines w/o the timestamp/thread/module prefix will use the last known timestamp and thread values.

The rules themselves need a bit more thorough explanation.

There are 3 types of rules that you can define:

  1. a simple rule by a printf formatting - via module.rule()
  2. a more general printf rule conditioned by a state evaluation - via schema.ruleIf() or module.ruleIf()
  3. a general rule (no string matching) only conditioned by a state - via schema.plainIf()

A simple rule

An example of its definition:

module.rule("nsHttpChannel %p created nsHttpTransaction %p", function(channel, transaction) {
  this.obj(channel).capture().link(transaction);
});

The consuming function is called only when the line in the log file matches the formatting string. Note that in 99% cases the rule string is simply a copy of the C++ LOG() formatting string in question.

For convenience the called consumer function is given the found values as arguments - containing strings, not objects or numbers directly.

this inside the function is assigned the processing state object. Some of its properties and methods are:

  • this.thread: an object (a property Bag, see below) representing the thread as found on the current line, this simple object lives through out the file processing and you can store properties on it at will to build ruleIf() and plainIf() conditions based on it (more on it below)
  • this.thread.name: obviously the name of the thread
  • this.line: the currently processed line, stripped the timestamp, thread name and module name
  • this.timestamp: a Date object holding the time as read from the log line
  • this.obj(identifier): this method returns a JS object representing the given identifier that can be then conveniently worked with, more below ; note: the same object is always returned since its first call for the same identifier until destroy() is called on that object
  • this.objIf(identifier): same as above, but only a temporary object is returned when the object didn't exist before; this prevents null-checks in the rule code
  • this.service(className): this conveniently creates a service object - a singleton, bound to the log file (think of it as 'bound to the process') which then behaves as an object created with this.obj
  • this.duration(timestamp): calculates number of milliseconds since timestamp till now, timestamp is expected to be a Date object, the result is a number of milliseconds
  • this.global: a property Bag object that is carried through out the whole parsing process being common (global) to all files and threads; you can store any global data you need on it

Property Bag helper object

Some of the data fields on the processing state (thread, global) and objects you create with this.obj() are so called "property Bags". It's nothing more then a simple JS object (of a class Bag) for keeping arbitrary properties and providing following convenient methods:

  • .on("property", handler): a convenience method to perform a conditioned operation + change value or nullify the property in one easy step ; the handler is called when the property is non-null on the object you call .on() for, the handler is passed value of that property as its single argument, and the return value of the handler replaces the value of that property on the thread (note that when you don't return a value it effectively deletes the property from the object)
  • .data("key", value): a convenience function that ensures a property "key" on the object exists, [object].key = {}, and then ensures [object].key.value = new Bag(). You can then store and access arbitrary properties on it.

Working with objects

To access an object a rule consumer function calls this.obj(identifier) as described above. That returns an instance of Obj prototype.

Obj (an object) methods:

  • .create("classname"): called from constructors, this puts the object to a 'created' state and assigns its class name; such a created object lives until .destroy() is called on it; if called on an already created object, that existing object is first destroyed and then a new plain object is created and returned; a warning is shown in the web console that an object's been recreated

  • .destroy(["classname"]): called from destructors, this sets the state of the object to 'released' and removes the object from the processing state; it means that a following call to this.obj() in rules with the same identifier value will return a new blank object; if "classname" is provided, the object is destroyed only when the object's class name is identical to it

  • .capture("string" or no argument): this adds a line to the object so that it then appears in the results when the object is expanded in the results view; when there is no argument passed, the currently processed line is automatically added

  • .alias("alias"): an object can be identified by multiple values sometimes thanks static_cast pointer shifts, wrapping helper classes ("handlers"), or simply by a unique key side by a pointer; this method allows you to define such an alias so that calls to this.obj("alias") will resolve to this object

  • .unalias(): when there is an alias to an object that is a 'handle' or 'wrapper' object owning it and such a wrapper object can live shorter than the real object, you must un-alias it when its destroyed using this method so that there is no conflict when the pointer gets recycled

  • .inherits(base object, "this derived object class"): for objects inheriting from a base class that both log their constructors - use in the derived class constructor, you need to have the base object already in hands (passed e.g. via this.thread) ; the method changes the object's class to the "derived class" and logan then sees the base and derived objects as one

  • .grep(): this conveniently instructs the object to capture all lines that contain the object's pointer or any of its aliases that are in form of a pointer

  • .link("identifier" or object): this adds a 'this object links to other object' line, as described in the Links to referred or referring objects section above, the argument can be an identifier or an alias (will be resolved) or directly an object as returned by this.obj(); note that the link is automatically added to both objects with the correct vector

  • .mention("identifier" or object): this simply adds a line that mentions the given object so that it can be expanded in the results view - via a line with a checkbox; this doesn't establish any relation between the two objects

  • .prop("name", value, merge = false): sets or deletes a property on an object

    • when value has a value, it will be set under the name as a property on the object that you can then search by and examine
    • when value has a value and merge = true, it will be joined with the pre-existing value with ','
    • when value is undefined, the property will be removed from the object
    • when value is a function it will be called with two argument, the first being the existing property value or 0 (a number, convenient for counters) when the property has not yet been set, the second is the object, the result of the function is stored as a new property value (ignoring the merge argument!);
    • the merge argument can be a function too, called with one argument being the object
    • note that reading a property back is only possible via direct access on object's props simple object (a hashtable); it's strongly discouraged to modify this array directly as it would break properties history capture (seek)
  • .propIfNull("name", value): sets the property but only when it's not already present; use to set a property only once

  • .propIf("name", value, cond, merge): sets the property only when cond evaluates to true; the cond function is called with the object as the only argument

  • .props: property Bag - a simple object - holding all the currently captured properties for reading, provides .on("property", handler) method for convenience, see this.thread.on above for details

  • .state(value or ommited): this is a shorthand to the "state" property of the object, if value has a value it's set on the object's "state", if called without arguments the method returns the current object's "state" value

  • .expect("format", [consumer[, unmatch]]): use this to process lines following the current line on the same thread; optional consumer and unmatch handlers will be called as long as their results evaluate to true

    consumer is called only when a line matches the format string, the arguments are:

    • this: the processing state
    • obj: the object this follow has been initiated for
    • found values: passed the same way as for a rule consumer (see A simple rule section)
    • proc: the processing state, once more
    • result: true to continue the follow, false to stop it

    if consumer is not not specified, the default is capture of the line on the object and then stop the follow by returning false

    the optional unmatch handler is called when a line doesn't match the format with following arguments:

    • this: the processing state
    • obj: the object this follow has been initiated for
    • line: the line being currently processed
    • result: true to continue the follow, false to stop it

    if unmatch is not specified, the default is a function returning always true to not break the follow

  • .follow("format", [consumer[, unmatch]]): similar to .expect() but stops when any rule from the same module (or non-prefixed lines) matches a line on the same thread where this follow has been started, this is convenient for cases one doesn't know if the line matching "format" will or will not follow the currently processed line

    if consumer is not not specified, the default is capture of the matching line on the object and continue of the follow (this is different from expect)

    unmatch behaves the same as for expect

  • .follow(consumer): similar to the above form of .follow() but without a rule-like formating; the consumer function is called for lines with the same module (or non-prefixed lines) on the same thread as long as no other rule from the same module matches on the thread and as long as the consumer's result is evaluating to true (note that the consumer function can do anything it wants, not just capturing) ; the arguments are:

    • obj: the object this follow has been initiated for
    • line: the line being currently processed
    • proc: the processing state as described above
    • result: true to continue the follow, false to stop it
  • .follow(n): this will simply capture n following lines on this thread and module or non-prefixed lines, the follow will stop sooner if a rule matches on the thread

  • .ipcid(id): this sets a globally unique interprocess identifier on the object so that .send() and .recv() synchronization will then work between parent and child process log files on different objects with the same ipcid

  • .ipcid(): returns the assigned id, if any

  • .send("message"): sends a message (unblocks corresponding recv(), see below) from one log file to another, has an effect only when all of:

    • ipcid has been assigned on this object
    • there are parent and child log files loaded in logan
  • .recv("message", handler): used for synchronization between child and parent logs, the handler is called only when all of:

    • ipcid has been assigned on this object
    • there are parent and child log files loaded in logan
    • the corresponding .send("message") has been called; correspondence here means the sender has the same ipcid as the receiver and the message string is identical

    In case we hit a line with a .recv() call sooner than the corresponding .send() line in another log file (because of tight or non-synchronous timestamps) recv() stops parsing of this log file until the corresponding .send() is hit in one of the other log files. Only then the handler is called with two arguments: receiver and sender where receiver is the object recv() has been called on and sender is the object that called the corresponding send().

    In case the corresponding send() has already been hit, the handler is called immediately.

  • .class("name"): gives a class name to objects that are not tracked (no rules defined for them) or are partial in the log (long-living) which has been started later during the session; calling this on an object that has not been create()ed will give it a class name "name" by which you can then search the object for, state is set to "partial" and "missing-constructor" property is set to true; calling this on an object that has been create()ed doesn't do anything

  • .on("object_property", handler): see this.thread.on above for details, note this is working with JS properties you may have set directly on the Obj instance and not what has been set with the .prop() method!

  • .call(function, ...): a convenience method to call a function with the following arguments:

    • this being the processing state, same as in rule handling functions
    • the first argument is always the object call is being called on
    • additional arguments to call are passed as additional arguments to the function
  • .createOnce(className[, onCreate]): behave the same way as create() when the object doesn't exist already, but will just do nothing (except capturing the log line by default) when the object already exists

    • onCreate is an optional function called when the object is actually created or ignored when already existing and being reused, it has only one argument: the object

For convenience each of these methods (modulo documented exceptions) return back the object, so that they can be chained in the jQuery promise style.

To make the capturing process reliable there is a suggested ordering of calls when chained on one line:

  1. create()
  2. alias()
  3. state() or prop()
  4. recv()
  5. capture()
  6. send(), link() or mention()
  7. follow()
  8. destroy() or unalias()

A conditional rule

This can be used for adding rules for lines that are too general or indistinguishable or when the object to relate it to cannot be determined from the line itself.

An example of a conditional rule definition:

module.ruleIf("uri=%s", proc => proc.thread.httpchannel, function(url, channel) {
  delete this.thread.httpchannel; // we only want to hit this once
  channel.prop("url", url);
});

The rule assumes that a rule executed just before has set thread.httpchannel on the processing state to the object we want to assign the URL to.

The first argument to module.ruleIf() is equal to what is being passed to module.rule(). The second argument is a condition function that is evaluated prior to evaluating the formatting string. It has an only argument - the processing state as described above. If the condition function returns anything evaluating to true and the string matches, the function is called the same way as in the simple rule case with one added argument at the end - the result of the condition for convenience.

Note: you can define more than one conditional rule with the same formatting string.

Note: you can define an "If" rule on either the schema (for raw lines w/o a module) or on a module (to not conflict and be more efficient).

A conditional plain rule

Execution of the consumer of such a rule is conditioned only by evaluation of the condition function and nothing else.

schema.plainIf(proc => proc.thread.someCondition, function(line, condition) {
  ...
});

The line argument is holding the currently processed line. The condition argument keeps the result of the condition evaluation (proc.thread.someCondition in this case.)

To process a plain line you can use logan.parse("input", "format", consumer[, failure]) method:

  • the input argument is the unprocessed input string (e.g. line from the example above)
  • format is a printf formatting, same as in case of a rule definition to process the input
  • consumer is called, when input is matching format, with arguments filled with resolved format parameters - the same way as in case of a rule consumer
  • optional failure, if provided, is called when the input doesn't match the format, with one argument being the input

License

This is for general public use and spread at the moment.

logan's People

Contributors

mayhemer avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

logan's Issues

Think .follow() a bit more through

follow is a bit misleading such as that it looks like it bounds somehow to the object. but it's actually not. by default follow() is canceled automatically when any other rule matches. if a code logs something matching between the rule that started the follow and the follow actually hits, that follow will be canceled silently.

hence, the auto-cancel condition should be more sophisticated.

Add ability to link to the current view of the log

I would love to have the ability to share my findings with others via a link, similar to the Gecko Profiler.
Logan doesn't currently host profiles, so it would need to have the ability to load profiles from a file on the internet, and I guess all of the filters would be provided by query params.

Have a query language instead of search fields

I think it's time to add a bit more then a flat search or searches. I think we want to look for objects in a more structured way like "find if there is a transaction that has ever been throttled on a particular h2 session".

Search by log line is broken (regressed by #16 and #17)

Since we no longer keep log lines on objects search by log line matching is broken.

One possible fix could be to simply walk the global capture array (not captures of individual objects) and search the string. All the files have to be reloaded but I think it's a reasonable price to pay.

I'll remove the search by log line feature for now until this is fixed.

Remove netdiag

Since Backtrack becomes serious, there is no longer a need to try to get the network scheduling information from logs.

Make higlight toggling less confusing

  • make it more clear that clicking a line from an individual object that has been revealed by hierarchy unroll makes that object distinct
  • selection toggles highlighting

Rewrite expanders, add checkbox to breadcrumbs

Now when there are two expanders in the results that reveal the same object, it's pretty confusing on how it expands/collapses that object - there are conflicts.

Instead of a checkbox there should be a button to simply expand. Breadcrumbs should have a way to hide (collapse) only but leave the object in the breadcrumbs list as well as to collapse+remove it from the breadcrumbs list.

Reveal history

Remember opened reveals to collapse them conveniently back

Allow saving the session (searches + expanded objects)

I am not happy with sharing finding (aka objects) through the URI. I also don't intend to remove it as it is somewhat useful - e.g. when you are diagnosing a single simple test failure, you just reload and see the same set of results.

What I want is to be able to add more data (like a line number to scroll to, or even highlight data + any manually added lines) which would simply produce too long URLs.

As we have full support for zip files now (even loaded from a URL!) we can add a 'logan.json' file with whatever data we wish.

There would be a button "save this session" to get that json file and add to the zip file t share.

Add support for Http2PushedStreams

This one may be a bit tricky (or may not), because due to inheritance, some (many?) log lines from Http2PushedStreams show up as Http2Stream. It would be really useful to have, though. Probably also want to add Http2PushTransactionBuffer at the same time.

Improve "search up to line" functionality

Right now:

  • it limits the search but summary shows final properties values
  • when it's changed the results get unexpectedly mixed

No idea how to make this more useful or clear.

List searches at the top

Make each performed search be visible as a 'tab' with 'delete' and 'enable' and maybe 'highlight' switches.

Blocks/interacts with #9 - when the 'state at' line is reselected, the search results should be updated

Obj.grep()

Allow autocapture of all lines that contains pointer (or any alias) of this object. Can have performance implications for both parsing and view, but may greatly reduce rule writing and simply provide way more info.

Error when trying to view a log

Entered a log so I could try to track down the source of a bug. Parsing got something like 99.9% of the way there, and then the progress bar just stopped, and I was never able to do anything with the log. The log was created from startup using the appropriate environment variables (not about:networking). The URL that was loaded was from bug 1397621, and loading was cancelled by hitting escape. The browser was then cleanly shut down. The problematic log (from a fresh profile, so no PII) is
moz.log.busted.gz

The following error was in the devtools console:

while processing '2017-11-03 21:49:53.100025 UTC - [Socket Thread]: I/nsHttp Http2Stream::WriteSegments 0x108cc8c80 Buffered 80470007 0'
at moz.log.busted:382128 (line#s are inaccurate) logan.js:737:14

Safari words it as (not sure if this will be helpful):

while processing '2017-11-03 21:49:53.100025 UTC - [Socket Thread]: I/nsHttp Http2Stream::WriteSegments 0x108cc8c80 Buffered 80470007 0'
at nspr.log:382128 (line#s are inaccurate)
	processRule (logan.js:1048)
	processLineByRules (logan.js:1028)
	processLine (logan.js:980)
	consumeLineByRules (logan.js:967)
	consumeLine (logan.js:945)
	(anonymous function) (logan.js:919)
	asyncFunctionResume
	(anonymous function)
	promiseReactionJob

Process rotated logs

Such has to be processed as a single file, ordered by the first line timestamp (it might be a bad idea to use lastModified of the file, since files could be tempered with or simply improperly copied w/o the time information)

Optimize processing

split and map rules by modules and by [word, separator, word, the rest] so that we don't have to iterate the whole rule array over every line in the log - O(n x m).

Benchmarking support

Hi, I'm using this tool as a benchmark for JavaScript performance. To make things easier could you add calls to performance.mark() and performance.measure() https://developer.mozilla.org/en-US/docs/Web/API/User_Timing_API at the approprite places so that we can measure how long it takes to load and process files?

I attempted this myself but could not obviously find the right places. I placed the calls in consumeFiles() but did not get the right places, I think because this is asynchronous and I don't see where loading finishes.

Could you do that for https://janbambas.cz/moz/bug1391710/ also? Maybe maintain it as a seperate branch.

Thanks.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.