rueckstiess / mtools
A collection of scripts to set up MongoDB test environments and parse and visualize MongoDB log files.
License: Apache License 2.0
This would replace the --no-legend startup parameter.
Currently it is only possible to color by namespace. Refactor to make it possible to color by different aspects, e.g. by operation (update, remove, query, ...).
Possible usage:
mplotqueries logfile --color namespace (default)
mplotqueries logfile --color operation
filtering by namespace, operation, thread, ...
mlogfilter logfile --namespace "test.collection"
mlogfilter logfile --operation update query
mlogfilter logfile --thread conn123
To filter only databases, --ns and --exclude-ns should accept wildcards.
Then it would be possible to write
--ns "admin.*" to include all admin collections
--exclude-ns "bla*" to exclude all databases starting with bla
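A minimal sketch of how such wildcard matching could work, assuming shell-style patterns like "admin.*" and "bla*". The function name namespace_selected and its parameters are made up for illustration and are not part of mtools.

```python
from fnmatch import fnmatchcase


def namespace_selected(namespace, include=None, exclude=None):
    """Return True if `namespace` passes the include/exclude patterns.

    Hypothetical helper: `include` and `exclude` are lists of shell-style
    patterns, as they might be collected from --ns and --exclude-ns.
    """
    # No include patterns means "include everything".
    if include and not any(fnmatchcase(namespace, p) for p in include):
        return False
    # Any matching exclude pattern rejects the namespace.
    if exclude and any(fnmatchcase(namespace, p) for p in exclude):
        return False
    return True


print(namespace_selected("admin.system.users", include=["admin.*"]))
print(namespace_selected("blabla.coll", exclude=["bla*"]))
```

Note that fnmatch treats square brackets as character classes, so namespaces containing "[" would need escaping.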
And in addition to --kill
also offer --remove
which kills the instances and removes the data folder.
If the dbpath exists already (from previous mlaunch), then compare the startup options. If they are different, exit with warning that existing data will be overwritten and that mlaunch --restore
should be used to restart.
Should we allow the script to run if the same number of nodes are started? What about with a different number of mongos? more shards?
Perhaps offer --force
to force overwrite.
millisecond parsing (input) is already implemented through LogLine. output should detect if any of the files use millisecond format and output accordingly.
Big goal, but if this project continues to grow, which I hope it will, we'll need to have tests in place so we don't break anything. Tests would also make us more confident to try things without being scared of breaking stuff.
mlogfilter should read and understand the new format for time hh:mm:ss.uuu where uuu are the milliseconds.
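A rough sketch of reading both time formats, assuming the time appears as hh:mm:ss with an optional .uuu suffix somewhere in the line. The name parse_log_time is hypothetical and this is not how LogLine does it. The returned flag could be used on output to decide whether milliseconds should be written.

```python
import re
from datetime import time

# hh:mm:ss with an optional .uuu millisecond suffix
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2})(?:\.(\d{3}))?")


def parse_log_time(text):
    """Return (datetime.time, has_milliseconds) for the first time found,
    or None if the text contains no hh:mm:ss pattern."""
    match = TIME_RE.search(text)
    if match is None:
        return None
    hours, minutes, seconds, millis = match.groups()
    microseconds = int(millis) * 1000 if millis else 0
    parsed = time(int(hours), int(minutes), int(seconds), microseconds)
    return parsed, millis is not None
```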
If no restart (or start) message found, return a warning or error
for stdin, provide buffer that makes the input seekable (line numbers, start and end of file).
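One possible way to do this, sketched under the assumption that the input is text and may be large: copy the stream into a spooled temporary file that stays in memory up to a size limit. The helper name make_seekable and the 16 MB limit are arbitrary choices for illustration.

```python
import tempfile


def make_seekable(stream, max_memory=16 * 1024 * 1024):
    """Return a seekable text stream with the same contents as `stream`.

    Seekable streams are returned unchanged; anything else (e.g. a pipe on
    stdin) is copied into a temporary buffer first.
    """
    try:
        if stream.seekable():
            return stream
    except AttributeError:
        pass  # object without seekable(); fall through and copy it
    # Kept in memory up to max_memory bytes, then spilled to disk.
    buffered = tempfile.SpooledTemporaryFile(max_size=max_memory, mode="w+")
    for line in stream:
        buffered.write(line)
    buffered.seek(0)
    return buffered


# Possible usage: logfile = make_seekable(sys.stdin)
```

The copy means the whole input must be read before any output is produced, which may not be acceptable for live piping.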
also support DB storage (mongod).
Should go in LICENSE.md and linked from README.md
If two data points are further apart than the gap-threshold, the range bar is stopped at the last point and started again with the new point.
For this we also need to find a way to pass customized arguments to a plot type.
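The gap-threshold splitting described above could look roughly like this. It is only a sketch: split_ranges is a made-up name, and timestamps are assumed to be sorted numbers in the same unit as the threshold.

```python
def split_ranges(timestamps, gap_threshold):
    """Split sorted timestamps into (start, end) ranges wherever the gap
    between neighbouring points exceeds `gap_threshold`."""
    ranges = []
    start = previous = None
    for ts in timestamps:
        if start is None:
            start = ts
        elif ts - previous > gap_threshold:
            # Gap too large: close the bar at the last point, start a new one.
            ranges.append((start, previous))
            start = ts
        previous = ts
    if start is not None:
        ranges.append((start, previous))
    return ranges


print(split_ranges([0, 5, 10, 100, 105, 300], gap_threshold=60))
# → [(0, 10), (100, 105), (300, 300)]
```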
Especially with some matplotlib complications.
$ mlaunch --single --verbose --helpz
creating directory: ./data/db
launching: mongod --dbpath ./data/db --logpath ./data/mongod.log --port 27017 --logappend --helpz --fork
waiting for mongod to start up...
# takes a long time
mongod at Gianfranco-10gen.local:27017 running.
but doesn't start it
same for
mlaunch --replica --verbose --helpz
mlaunch --replicaset --verbose --helpz
Instead of setting $PATH before running mlaunch, it would be nice to allow a user to specify the path to the correct binary to use.
Similar to other open source tools, suggest adding "created by ..." tagline at the bottom of generated charts. This should include version & github url, eg:
"Created by mplotqueries v0.31 (https://github.com/rueckstiess/mtools)"
Would also add a command line option to disable this (--nobanner), but would keep it enabled by default.
This will help others discover mtools to create their own charts ;)
This problem started after implementing the number toggling [0-9].
Something
why does this not work:
mlogmerge logfile --timezone "-5" --pos eol
any option that mlaunch doesn't understand directly should be passed on to the mongod mongos launch.
This could include -vvv, --nojournal, etc.
Make sure to pass options only to the process that understands it (filter before).
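A sketch of the pass-through idea using argparse's parse_known_args, which returns the arguments it does not recognize. The two option sets below are hypothetical whitelists for illustration only, and the parser defines just two of mlaunch's options. Options that take a value are not handled here.

```python
import argparse

# Hypothetical whitelists: which pass-through options each binary accepts.
MONGOD_ONLY = {"--nojournal"}
SHARED = {"-v", "-vv", "-vvv"}


def split_arguments(argv):
    """Parse the options mlaunch knows and sort the rest by target process."""
    parser = argparse.ArgumentParser(prog="mlaunch")
    parser.add_argument("--single", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    # Unknown options end up in `unknown` instead of raising an error.
    known, unknown = parser.parse_known_args(argv)
    mongod_args = [a for a in unknown if a in MONGOD_ONLY | SHARED]
    mongos_args = [a for a in unknown if a in SHARED]
    return known, mongod_args, mongos_args


known, mongod_args, mongos_args = split_arguments(["--single", "-vvv", "--nojournal"])
print(mongod_args, mongos_args)
```

An option that appears in neither whitelist is silently dropped in this sketch; a real implementation would probably want to warn about it.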
If a group is invisible from the plot, clicking in the graph should not output any log lines from this group. This is confusing and prevents useful filtering with the numbers.
Sometimes I kill instances and it takes a minute or so before mlaunch lets me start a new set on the old ports. I'm not sure if the socket test is reliable or what the problem is.
We should also check for all ports before starting any processes. Sometimes, it starts a port on 27017 but then complains that 27018 is still in use. But the first process is already running and needs to be killed off (which again can take some time).
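Checking every port up front might look like the sketch below. It assumes a bind-based test on localhost, which is not necessarily what mlaunch does today; port_is_free and busy_ports are made-up names.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind to (host, port), i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports, host="127.0.0.1"):
    """Return the subset of `ports` that cannot be bound right now."""
    return [p for p in ports if not port_is_free(p, host)]


# Possible usage: refuse to launch anything if any port is taken.
# blocked = busy_ports([27017, 27018, 27019])
# if blocked:
#     raise SystemExit("ports still in use: %s" % blocked)
```

This only reduces the window for the problem: a port can still be taken between the check and the actual start of the process.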
Currently, if there are no more data points, the plot only reaches to the last data point. To compare different plots, the x-axis range should always cover the complete file. This can be fixed with a call to plt.xlim() setting the whole range.
for example:
mlogfilter logfile.log --from "-2h" should grab the last two hours of the logfile.
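A sketch of resolving such relative values. The request above only mentions "-2h" relative to the end of the file; the unit letters s, m, h, d and the "+" form (relative to the start of the file) are assumptions, as is the name resolve_offset.

```python
import re
from datetime import datetime, timedelta

# Assumed unit letters; only "h" appears in the request itself.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
OFFSET_RE = re.compile(r"^([+-])(\d+)([smhd])$")


def resolve_offset(expr, log_start, log_end):
    """Turn '-2h' into log_end - 2 hours, or '+30m' into log_start + 30 minutes."""
    match = OFFSET_RE.match(expr.strip())
    if match is None:
        raise ValueError("not a relative offset: %r" % expr)
    sign, amount, unit = match.groups()
    delta = timedelta(**{UNITS[unit]: int(amount)})
    return log_end - delta if sign == "-" else log_start + delta


start = datetime(2013, 2, 13, 8, 0, 0)
end = datetime(2013, 2, 13, 12, 0, 0)
print(resolve_offset("-2h", start, end))
```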
(I searched the repository for "nscanned" and didn't see anything relevant for mplotqueries, so I hope I didn't miss an existing feature request! Feel free to close if I did miss an existing issue.)
Currently, the y-axis of charts produced by mplotqueries is some function of the runtime of slow queries being logged. It'd be great if there was an option to show the ratio of nscanned to n (maybe either logarithmic, if that makes sense?), nscanned, or just n.
Obviously, there's a correlation of nscanned/n to query runtime, but there are cases when nscanned/n is very high but overall runtime is not a problem from the perspective of the developer. However, when many of the same or similar queries are run simultaneously, the inefficiency of nscanned/n becomes much more apparent.
mtools/util/logfile.py
Currently, starting mlaunch --single stores the data in ./data/, whereas starting mlaunch --single foo stores the data in ./foo/data/.
The behavior should be changed to store the second case in ./foo/ directly, without the nested data folder.
--mongos X (where X is the number of mongos to start. X=1 default)
For example mention branching model (develop / master).
Including mlogfilter, which did its own thing before.
Remove usage from each script section. Create new documentation (either in the wiki or as separate pages) to explain each of the tools in detail.
Should make it easier to maintain as well, as each change doesn't require README.md to be updated.
Add version numbers for mtools releases. Should be included in the --help output, as well as a simple display with --version.
bucketing namespace/operations, so you can see whether a large number of operations occurred within a given unit of time.
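The bucketing could be sketched as a simple counter keyed by time bucket, namespace and operation. The input format (tuples of datetime, namespace, operation) and the name bucket_counts are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_counts(events, bucket_seconds=60):
    """Count events per (bucket start, namespace, operation).

    `events` is an iterable of (datetime, namespace, operation) tuples.
    """
    counts = Counter()
    for when, namespace, operation in events:
        seconds = int((when - EPOCH).total_seconds())
        # Round down to the start of the bucket this event falls into.
        bucket_start = EPOCH + timedelta(seconds=seconds - seconds % bucket_seconds)
        counts[(bucket_start, namespace, operation)] += 1
    return counts
```

A large count in a single bucket then points at a burst of operations on that namespace.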
Add the number in the legend that needs to be pressed to toggle that plot.
Show which ones are visible/invisible in the legend.
Extend to 18 plots with Shift-1 - Shift-9.
use numbers 0-9 to hide and unhide plots.
start counting at 1 for real plots.
use 0 to hide/unhide all plots.
If you follow the usage help by putting --exclude-ns
before the filename it errors out.
./mplotqueries.py --exclude-ns "(command)" mongo-live-a-4_2-13_mongodb.log
usage: mplotqueries.py [-h] [--ns [NS [NS ...]]] [--log]
[--exclude-ns [NS [NS ...]]]
filename
mplotqueries.py: error: too few arguments
If you put it after it works fine
./mplotqueries.py mongo-live-a-4_2-13_mongodb.log --exclude-ns "(command)"
{'exclude_ns': ['(command)'], 'ns': None, 'log': False, 'filename': 'mongo-live-a-4_2-13_mongodb.log'}
0 live.player_action 1
...
I think the fix is just moving lines 88 and 89 after, or changing the usage help.
mlogvisjs is a stand-alone library based on the mlogvis javascript code. We don't want to keep both versions around. The js library is further advanced now, so I'd like to try to use only that.
Necessary steps:
index.html
file each time, similar to how the data is written
Add option -h / --human to enable human-readable format.
This would convert each line's ms numbers to min, hours, days, ...
It would also insert commas for very large values of nscanned, nreturned, ...
Make sure that --slow, --scan, and other mtools, like mplotqueries, still work.
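The two conversions could be sketched like this. The unit labels and the names humanize_ms and with_commas are arbitrary choices, not an existing mtools interface.

```python
def humanize_ms(ms):
    """Convert a millisecond count into a short human-readable duration."""
    units = [("d", 86_400_000), ("h", 3_600_000), ("min", 60_000), ("s", 1_000)]
    parts = []
    remainder = int(ms)
    for label, size in units:
        count, remainder = divmod(remainder, size)
        if count:
            parts.append(f"{count}{label}")
    # Keep leftover milliseconds, and never return an empty string.
    if remainder or not parts:
        parts.append(f"{remainder}ms")
    return " ".join(parts)


def with_commas(number):
    """Insert thousands separators, e.g. for nscanned or nreturned."""
    return f"{number:,}"


print(humanize_ms(3723004))   # → 1h 2min 3s 4ms
print(with_commas(1234567))   # → 1,234,567
```

Since other tools parse the raw ms values, the conversion would probably have to be applied only at output time.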
Sometimes it would be useful to plot log lines that aren't necessarily timed operations. Let's say you filtered/grepped for a small set of "events" in the logfile (e.g. replica set status changes, etc) and want to visualize those as well.
I could see this as a useful scenario:
grep "is now in state" mongod.log | mplotqueries --plot-untimed
To visualize, they could draw vertical thin lines instead of dots (because they don't have a y-axis value).
Ideally, I'd like to overlay such a plot with the original timed plot. Need to work out how that would be possible.
This should work:
mplotqueries mongod.log --ns admin.$cmd
Currently, you must use instead:
mplotqueries mongod.log --ns admin.\$cmd
Group slow queries and show counts, to easily identify the most inefficient queries. Needs to be smart enough to group similar queries together but allow for value changes. Maybe use similar logic as the query optimizer uses.
Possible output
<query type> #occurrences average time total time avg nscanned avg nreturned
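One simple way to group similar queries, sketched here, is to strip all values from the query document and group by what remains. This is a plain value-stripping approach, not the logic the MongoDB query optimizer uses, and the input format (namespace, query dict, duration in ms) is assumed for illustration.

```python
from collections import defaultdict


def query_shape(query):
    """Replace every leaf value with a placeholder so that queries differing
    only in their values share one shape."""
    if isinstance(query, dict):
        return {key: query_shape(value) for key, value in sorted(query.items())}
    if isinstance(query, list):
        return [query_shape(item) for item in query]
    return "?"


def group_queries(entries):
    """Group (namespace, query, duration_ms) entries by query shape.

    Returns rows of (namespace, shape, occurrences, average ms, total ms),
    sorted by total time, largest first.
    """
    groups = defaultdict(list)
    for namespace, query, duration in entries:
        groups[(namespace, repr(query_shape(query)))].append(duration)
    rows = [
        (namespace, shape, len(times), sum(times) / len(times), sum(times))
        for (namespace, shape), times in groups.items()
    ]
    return sorted(rows, key=lambda row: row[4], reverse=True)
```

Lists keep their length in this sketch, so two $in queries with a different number of values would land in different groups.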
(retry)
This can be useful for simply time-shifting a single file.