
memory_profiler's Introduction

https://travis-ci.org/pythonprofilers/memory_profiler.svg?branch=master

Memory Profiler

Note: This package is no longer actively maintained. I won't be actively responding to issues.

This is a python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for python programs. It is a pure python module which depends on the psutil module.

Installation

Install via pip:

$ pip install -U memory_profiler

The package is also available on conda-forge.

To install from source, download the package, extract it, and run:

$ pip install .

Quick Start

Use mprof to generate a full memory usage report of your executable and to plot it.

mprof run executable
mprof plot

The plot will look something like this:

https://i.stack.imgur.com/ixCH4.png

Usage

line-by-line memory usage

The line-by-line memory usage mode is used in much the same way as line_profiler: first decorate the function you would like to profile with @profile, then run the script with specific arguments to the Python interpreter.

In the following example, we create a simple function my_func that allocates lists a, b and then deletes b:

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

if __name__ == '__main__':
    my_func()

Execute the code passing the option -m memory_profiler to the Python interpreter to load the memory_profiler module and print the line-by-line analysis to stdout. If the file name is example.py, this results in:

$ python -m memory_profiler example.py

Output will follow:

Line #    Mem usage    Increment  Occurrences   Line Contents
============================================================
     3   38.816 MiB   38.816 MiB           1   @profile
     4                                         def my_func():
     5   46.492 MiB    7.676 MiB           1       a = [1] * (10 ** 6)
     6  199.117 MiB  152.625 MiB           1       b = [2] * (2 * 10 ** 7)
     7   46.629 MiB -152.488 MiB           1       del b
     8   46.629 MiB    0.000 MiB           1       return a

The first column is the line number of the profiled code, and the second column (Mem usage) the memory usage of the Python interpreter after that line has been executed. The third column (Increment) is the difference in memory relative to the previous line, the fourth column (Occurrences) the number of times the profiler executed each line, and the last column (Line Contents) the profiled code itself.

Decorator

A function decorator is also available. Use as follows:

from memory_profiler import profile

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

In this case the script can be run without specifying -m memory_profiler in the command line.
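For example, if the decorated code lives in example.py, a plain invocation is enough:

$ python example.py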

When using the decorator, you can also specify the precision as an argument to the decorator function. Use as follows:

from memory_profiler import profile

@profile(precision=4)
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

If a python script with decorator @profile is called using -m memory_profiler in the command line, the precision parameter is ignored.

Time-based memory usage

Sometimes it is useful to have full memory usage reports as a function of time (not line-by-line) of external processes (be it Python scripts or not). In this case the executable mprof might be useful. Use it like:

mprof run <executable>
mprof plot

The first line runs the executable and records memory usage over time, in a file written to the current directory. Once it's done, a plot can be obtained using the second line. The recorded file contains timestamps, which allows several profiles to be kept at the same time.

Help on each mprof subcommand can be obtained with the -h flag, e.g. mprof run -h.

In the case of a Python script, using the previous command does not give you any information on which function is executed at a given time. Depending on the case, it can be difficult to identify the part of the code that is causing the highest memory usage.

Adding the @profile decorator to a function (ensure there is no from memory_profiler import profile statement) and running the Python script with

mprof run --python python <script>

will record timestamps when entering/leaving the profiled function. Running

mprof plot

afterward will plot the result, making plots (using matplotlib) similar to these:

https://camo.githubusercontent.com/3a584c7cfbae38c9220a755aa21b5ef926c1031d/68747470733a2f2f662e636c6f75642e6769746875622e636f6d2f6173736574732f313930383631382f3836313332302f63623865376337382d663563632d313165322d386531652d3539373237623636663462322e706e67

or, with mprof plot --flame (the function and timestamp names will appear on hover):

./images/flamegraph.png

A discussion of these capabilities can be found here.

Warning

If your Python file imports the memory profiler from memory_profiler import profile these timestamps will not be recorded. Comment out the import, leave your functions decorated, and re-run.

The available commands for mprof are:

  • mprof run: running an executable, recording memory usage
  • mprof plot: plotting one of the recorded memory usage files (by default, the last one)
  • mprof list: listing all recorded memory usage files in a user-friendly way
  • mprof clean: removing all recorded memory usage files
  • mprof rm: removing specific recorded memory usage files

Tracking forked child processes

In a multiprocessing context the main process will spawn child processes whose system resources are allocated separately from the parent process. This can lead to an inaccurate report of memory usage since by default only the parent process is tracked. The mprof utility provides two mechanisms to track the usage of child processes: summing the memory of all children into the parent's usage, and tracking each child individually.

To create a report that combines memory usage of all the children and the parent, use the include-children flag in either the profile decorator or as a command line argument to mprof:

mprof run --include-children <script>

The second method tracks each child independently of the main process, serializing child rows by index to the output stream. Use the multiprocess flag and plot as follows:

mprof run --multiprocess <script>
mprof plot

This will create a plot using matplotlib similar to this:

https://cloud.githubusercontent.com/assets/745966/24075879/2e85b43a-0bfa-11e7-8dfe-654320dbd2ce.png

You can combine both the include-children and multiprocess flags to show the total memory of the program as well as each child individually. If using the API directly, note that the return from memory_usage will include the child memory in a nested list along with the main process memory.
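If you use the API directly, a minimal sketch of combined tracking might look like this (the worker function and pool size are illustrative assumptions, not part of memory_profiler):

from multiprocessing import Pool

from memory_profiler import memory_usage

def worker(n):
    # allocate something measurable in the child process
    return len([0] * n)

def run_pool():
    with Pool(processes=2) as pool:
        pool.map(worker, [10 ** 6] * 2)

if __name__ == '__main__':
    # include_children folds child memory into the parent's samples;
    # multiprocess additionally reports each child, so samples become
    # nested lists rather than plain floats, as noted above.
    usage = memory_usage((run_pool, (), {}),
                         include_children=True, multiprocess=True)
    print(usage)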

Plot settings

By default, the command line call is set as the graph title. If you wish to customize it, you can use the -t option to manually set the figure title.

mprof plot -t 'Recorded memory usage'

You can also hide the function timestamps using the -n flag, such as

mprof plot -n

Trend lines and their numeric slopes can be plotted using the -s flag, such as

mprof plot -s

./images/trend_slope.png

The intended usage of the -s switch is to check the numerical slope shown in the labels over a significant time period:

  • >0: might indicate a memory leak.
  • ~0: if the slope is 0 or near 0, the memory usage may be considered stable.
  • <0: to be interpreted depending on the expected process memory usage patterns; it might also mean that the sampling period is too short.

The trend lines are for illustrative purposes and are plotted as (very) small dashed lines.

Setting debugger breakpoints

It is possible to set breakpoints depending on the amount of memory used. That is, you can specify a threshold, and as soon as the program uses more memory than the threshold it will stop execution and drop into the pdb debugger. To use it, decorate the function as in the previous section with @profile and run your script with the option -m memory_profiler --pdb-mmem=X, where X is a number representing the memory threshold in MB. For example:

$ python -m memory_profiler --pdb-mmem=100 my_script.py

will run my_script.py and step into the pdb debugger as soon as the code uses more than 100 MB in the decorated function.

API

memory_profiler exposes a number of functions to be used in third-party code.

memory_usage(proc=-1, interval=.1, timeout=None) returns the memory usage over a time interval. The first argument, proc represents what should be monitored. This can either be the PID of a process (not necessarily a Python program), a string containing some python code to be evaluated or a tuple (f, args, kw) containing a function and its arguments to be evaluated as f(*args, **kw). For example,

>>> from memory_profiler import memory_usage
>>> mem_usage = memory_usage(-1, interval=.2, timeout=1)
>>> print(mem_usage)
    [7.296875, 7.296875, 7.296875, 7.296875, 7.296875]

Here I've told memory_profiler to get the memory consumption of the current process over a period of 1 second with a time interval of 0.2 seconds. As PID I've given it -1, which is a special number (PIDs are usually positive) that means current process, that is, I'm getting the memory usage of the current Python interpreter. Thus I'm getting around 7MB of memory usage from a plain python interpreter. If I try the same thing on IPython (console) I get 29MB, and if I try the same thing on the IPython notebook it scales up to 44MB.

If you'd like to get the memory consumption of a Python function, then you should specify the function and its arguments in the tuple (f, args, kw). For example:

>>> # define a simple function
>>> def f(a, n=100):
...     import time
...     time.sleep(2)
...     b = [a] * n
...     time.sleep(1)
...     return b
...
>>> from memory_profiler import memory_usage
>>> memory_usage((f, (1,), {'n' : int(1e6)}))

This will execute the code f(1, n=int(1e6)) and return the memory consumption during this execution.

REPORTING

The output can be redirected to a log file by passing an IO stream as a parameter to the decorator, e.g. @profile(stream=fp):

>>> fp = open('memory_profiler.log', 'w+')
>>> @profile(stream=fp)
... def my_func():
...     a = [1] * (10 ** 6)
...     b = [2] * (2 * 10 ** 7)
...     del b
...     return a

For details, refer to examples/reporting_file.py.

Reporting via the logging module:

Sometimes it is convenient to use the logging module, especially when we need a RotatingFileHandler.

The output can be redirected to the logging module by making use of the LogFile class of the memory_profiler module.

>>> from memory_profiler import LogFile
>>> import sys
>>> sys.stdout = LogFile('memory_profile_log')
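For instance, a sketch combining LogFile with a RotatingFileHandler (the logger name and rotation limits below are illustrative choices, not requirements of memory_profiler):

import logging
import logging.handlers
import sys

from memory_profiler import LogFile, profile

# Configure the logger that LogFile will write to, capping the log size.
logger = logging.getLogger('memory_profile_log')
logger.setLevel(logging.DEBUG)
handler = logging.handlers.RotatingFileHandler(
    'memory_profile.log', maxBytes=100000, backupCount=3)
logger.addHandler(handler)

# Redirect the profiler's stdout output through the logger.
sys.stdout = LogFile('memory_profile_log')

@profile
def my_func():
    a = [1] * (10 ** 6)
    del a

if __name__ == '__main__':
    my_func()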

Customized reporting:

Sending everything to the log file while running memory_profiler can be cumbersome; one can choose to keep only entries with increments by passing True as reportIncrementFlag, a parameter of the memory_profiler module's LogFile class.

>>> from memory_profiler import LogFile
>>> import sys
>>> sys.stdout = LogFile('memory_profile_log', reportIncrementFlag=False)

For details, refer to examples/reporting_logger.py.

IPython integration

After installing the module, if you use IPython, you can use the %mprun, %%mprun, %memit and %%memit magics.

For IPython 0.11+, you can use the module directly as an extension, with %load_ext memory_profiler

To activate it whenever you start IPython, edit the configuration file for your IPython profile, ~/.ipython/profile_default/ipython_config.py, to register the extension like this (If you already have other extensions, just add this one to the list):

c.InteractiveShellApp.extensions = [
    'memory_profiler',
]

(If the config file doesn't already exist, run ipython profile create in a terminal.)

It can then be used directly from IPython to obtain a line-by-line report using the %mprun or %%mprun magic commands. In this case, you can skip the @profile decorator and instead use the -f parameter, as below. Note, however, that the function my_func must be defined in a file (it cannot have been defined interactively in the Python interpreter):

In [1]: from example import my_func, my_func_2

In [2]: %mprun -f my_func my_func()

or in cell mode:

In [3]: %%mprun -f my_func -f my_func_2
   ...: my_func()
   ...: my_func_2()

Another useful magic that we define is %memit, which is analogous to %timeit. It can be used as follows:

In [1]: %memit range(10000)
peak memory: 21.42 MiB, increment: 0.41 MiB

In [2]: %memit range(1000000)
peak memory: 52.10 MiB, increment: 31.08 MiB

or in cell mode (with setup code):

In [3]: %%memit l=range(1000000)
   ...: len(l)
   ...:
peak memory: 52.14 MiB, increment: 0.08 MiB

For more details, see the docstrings of the magics.

For IPython 0.10, you can install it by editing the IPython configuration file ~/.ipython/ipy_user_conf.py to add the following lines:

# These two lines are standard and probably already there.
import IPython.ipapi
ip = IPython.ipapi.get()

# These two are the important ones.
import memory_profiler
memory_profiler.load_ipython_extension(ip)

Memory tracking backends

memory_profiler supports different memory tracking backends, including: 'psutil', 'psutil_pss', 'psutil_uss', 'posix', 'tracemalloc'. If no backend is specified, the default is "psutil", which measures RSS, aka "Resident Set Size". In some cases (particularly when tracking child processes) RSS may overestimate memory usage (see examples/example_psutil_memory_full_info.py for an example). For more information on "psutil_pss" (measuring PSS) and "psutil_uss", please refer to: https://psutil.readthedocs.io/en/latest/index.html?highlight=memory_info#psutil.Process.memory_full_info

Currently, the backend can be set via the CLI

$ python -m memory_profiler --backend psutil my_script.py

and is exposed by the API

>>> from memory_profiler import memory_usage
>>> mem_usage = memory_usage(-1, interval=.2, timeout=1, backend="psutil")

Frequently Asked Questions

  • Q: How accurate are the results?
  • A: This module gets the memory consumption by querying the operating system kernel about the amount of memory the current process has allocated, which might differ slightly from the amount actually used by the Python interpreter. Also, because of how the garbage collector works in Python, the result might differ between platforms and even between runs.
  • Q: Does it work under Windows?
  • A: Yes, thanks to the psutil module.

Support, bugs & wish list

For support, please ask your question on Stack Overflow and add the memory-profiling tag. Send issues, proposals, etc. to GitHub's issue tracker.

If you've got questions regarding development, you can email me directly at [email protected]

http://fa.bianp.net/static/tux_memory_small.png

Development

Latest sources are available from github:

https://github.com/pythonprofilers/memory_profiler

Projects using memory_profiler

Benchy

IPython memory usage

PySpeedIT (uses a reduced version of memory_profiler)

pydio-sync (uses custom wrapper on top of memory_profiler)

Authors

This module was written by Fabian Pedregosa and Philippe Gervais inspired by Robert Kern's line profiler.

Tom added Windows support and speed improvements via the psutil module.

Victor added Python 3 support, bugfixes and general cleanup.

Vlad Niculae added the %mprun and %memit IPython magics.

Thomas Kluyver added the IPython extension.

Sagar UDAY KUMAR added Report generation feature and examples.

Dmitriy Novozhilov and Sergei Lebedev added support for tracemalloc.

Benjamin Bengfort added support for tracking the usage of individual child processes and plotting them.

Muhammad Haseeb Tariq fixed issue #152, which made the whole interpreter hang on functions that launched an exception.

Juan Luis Cano modernized the infrastructure and helped with various things.

Martin Becker added PSS and USS tracking via the psutil backend.

License

BSD License, see file COPYING for full text.


memory_profiler's Issues

Reported memory usage is sometimes wrong.

In some cases, memory is not reclaimed because of memory profiler. Here's a small example exhibiting the problem:

Line #    Mem usage    Increment   Line Contents
================================================
     6                             @profile
     7    20.254 MB     0.000 MB   def random_array(shape):
     8    43.148 MB    22.895 MB       arr1 = np.random.randn(*shape)
     9   138.320 MB    95.172 MB       arr = scipy.signal.detrend(arr1, axis=1)
    10   138.320 MB     0.000 MB       del arr1
    11   138.320 MB     0.000 MB       gc.collect()
    12                             
    13   138.320 MB     0.000 MB       col_mean = np.mean(arr, axis=1)
    14   138.320 MB     0.000 MB       np.testing.assert_array_less(abs(col_mean), 1e-15)
    15   138.320 MB     0.000 MB       return arr

arr1 is a numpy array weighing 22.9 MB. Detrending creates an array of exactly the same size (arr), but memory usage increases by 95 MB and does not go down, even after line 10.

I have checked that this is a side-effect of the memory_profiler module, by monitoring global memory usage without using memory_profiler at all. Memory usage does rise to 96 MB during execution of scipy.signal.detrend, but also decreases just after its execution.

memory_profiler might be keeping a reference to the arr1 array somehow, but I wasn't able to find how even with guppy/heapy (which complains somehow when memory_profiler is loaded).

This seems to me like a tricky issue, but it is really important, since reported memory usage can be completely different than the value obtained without profiling.

EDIT: here is a snapshot of the memory usage graph.

Memory usage does not work with class methods?

I want to profile time and memory usage of class method.
When I try to use partial from functools I got this error:

File "/usr/lib/python2.7/site-packages/memory_profiler.py", line 126, in memory_usage
  aspec = inspect.getargspec(f)
File "/usr/lib64/python2.7/inspect.py", line 815, in getargspec
  raise TypeError('{!r} is not a Python function'.format(func))
TypeError: <functools.partial object at 0x252da48> is not a Python function

By the way exactly the same approach works fine with timeit function.

When I try to use a lambda instead, I got this error:

File "/usr/lib/python2.7/site-packages/memory_profiler.py", line 141, in memory_usage
  ret = parent_conn.recv()
IOError: [Errno 4] Interrupted system call

How can I handle class methods with memory_profiler? Are there any (even dirty) ways?

I asked this question on SO: http://stackoverflow.com/questions/16593246/how-to-use-memory-profiler-python-module-with-class-methods

UPD: fixed broken link to SO
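One workaround sketch (not an official answer from the maintainers; the wrapper function here is an assumption): wrap the bound method in a plain module-level function, so memory_usage receives an ordinary callable that inspect can handle:

from memory_profiler import memory_usage

class Worker(object):
    def build(self, n):
        return [0] * n

w = Worker()

def call_build():
    # plain function wrapping the bound method
    return w.build(10 ** 6)

print(memory_usage((call_build, (), {})))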

%memit example sometimes doesn't work

In [2]: import numpy as np

In [3]: %memit np.zeros(1e2)
maximum of 1: 28.300781 MB per loop

In [4]: %memit np.zeros(1e2)
maximum of 1: 28.320312 MB per loop

In [5]: %memit np.zeros(1e2)
maximum of 1: 28.320312 MB per loop

In [6]: %memit np.zeros(1e4)
maximum of 1: 28.328125 MB per loop

In [7]: %memit np.zeros(1e7)
maximum of 1: 28.406250 MB per loop

In [8]: %memit np.zeros(1e7)
maximum of 1: 104.710938 MB per loop

Memory reported as MB should be MiB

I know, this is pretty much aesthetic.

Anyways, very useful little profiler. Allowed me to plot some great graphs about memory usage of some functions. Thank you!

Documentation is outdated (profile.timestamp)

Hi!

The documentation is outdated in the "Executing external scripts" section.
When it comes to timestamps, there is a piece of code under the line "It is also possible to timestamp a portion of code using a context manager like this:". It does not work in the latest version of memory_profiler.

Is there a possibility to still add custom timestamps and see plots like here?

Unicode strings upset the timestamp context manager

Using a unicode string inside a timestamp context manager:
with profile.timestamp(u"Adding_jobs"):
causes:
TypeError: __name__ must be set to a string object

This can be caused with a normal-looking Python string if you use from __future__ import unicode_literals (as I originally did), and the error message isn't as informative as it could be. The solution, if using the __future__ import, is to force a binary sequence:
with profile.timestamp(b"Adding_jobs"):

I mention this more just to help others spot the problem if they hit this error message.

Error when profiling pandas.io.parsers.read_csv

Here's a simple script using read_csv from pandas.io.parsers:

from pandas.io.parsers import read_csv

@profile
def test_read_csv():
  a = read_csv('dummy.txt')
  return a


if __name__ == '__main__':
  test_read_csv()

But when I call python -m memory_profiler profile_read_csv.py, the following error appears:

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Library/Python/2.7/site-packages/memory_profiler.py", line 272, in <module>
    execfile(__file__, locals(), locals())
  File "profile_read_csv.py", line 10, in <module>
    test_read_csv()
  File "/Library/Python/2.7/site-packages/memory_profiler.py", line 158, in f
    result = func(*args, **kwds)
  File "profile_read_csv.py", line 5, in test_read_csv
    a = read_csv('dummy.txt')
  File "/Library/Python/2.7/site-packages/pandas-0.7.3-py2.7-macosx-10.7-intel.egg/pandas/io/parsers.py", line 187, in read_csv
    return _read(TextParser, filepath_or_buffer, kwds)
  File "/Library/Python/2.7/site-packages/pandas-0.7.3-py2.7-macosx-10.7-intel.egg/pandas/io/parsers.py", line 153, in _read
    parser = cls(f, **kwds)
TypeError: __init__() got an unexpected keyword argument 'kwds'

--pdb-mmem failures

sample script:

@profile
def f():
    import numpy as np
    print "about to allocate"
    a = np.ones(1e8)
    print "done"

f()

Please add full license text

I see the license of this module is Simplified BSD in the README, but I couldn't find the full license text in the source tree. To make it clearer, please add the full license text in a COPYING or LICENSE file.

Does not work with generator functions?

It appears that memory_profiler does not produce any output at all if asked to profile a generator function (I couldn't find this documented anywhere, so I assume it's a bug).

I took the simple example code snippet from https://pypi.python.org/pypi/memory_profiler and saved it as example.py. I then took a copy and modified it as follows and saved it to example2.py:

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    yield a

if __name__ == '__main__':
    next(my_func())

(i.e. replaced the "return" with a "yield" instead). I got the following results:

$ python3 -m memory_profiler example.py 
Filename: example.py

Line #    Mem usage    Increment   Line Contents
================================================
     1                             @profile
     2     8.969 MB     0.000 MB   def my_func():
     3    16.699 MB     7.730 MB       a = [1] * (10 ** 6)
     4   169.324 MB   152.625 MB       b = [2] * (2 * 10 ** 7)
     5    16.738 MB  -152.586 MB       del b
     6    16.738 MB     0.000 MB       return a


$ python3 -m memory_profiler example2.py 
$ 

wrong results when calling function twice

running test/test_func.py


Line #    Mem usage  Increment   Line Contents
==============================================
     2                           @profile
     3                           def test_1():
     4      7.73 MB    0.00 MB       # .. will be called twice ..
     5      7.73 MB    0.00 MB       a = 2.
     6      7.73 MB    0.00 MB       b = 3
     7      7.61 MB   -0.12 MB       c = {}
     8      7.80 MB    0.19 MB       for i in range(1000):
     9      7.80 MB    0.00 MB           c[i] = 2
    10      7.80 MB    0.00 MB       c[0] = 2.

but correct results should be the same as calling the function once:


Line #    Mem usage  Increment   Line Contents
==============================================
     2                           @profile
     3                           def test_1():
     4      7.57 MB    0.00 MB       # .. will be called twice ..
     5      7.59 MB    0.02 MB       a = 2.
     6      7.59 MB    0.00 MB       b = 3
     7      7.59 MB    0.00 MB       c = {}
     8      7.73 MB    0.14 MB       for i in range(1000):
     9      7.73 MB    0.00 MB           c[i] = 2
    10      7.73 MB    0.00 MB       c[0] = 2.

Can the plot be scaled to show fine detail?

I have an issue with making the plot from an mprof run legible. This is how mine looks (see the attached plot).

I'd like to be able to stretch/zoom the plot - make it much larger, so the function markers don't overlap.

I have tried changing matplotlib's savefig.dpi and figure.figsize values, and both result in the graph being scaled, rather than the canvas being larger and the line/function markers becoming thinner and separated, and the text smaller.

I tried a really wide figure using these settings in my matplotlibrc:

figure.figsize   : 200, 10    # figure size in inches
savefig.dpi      : 100

but it still plotted at 1400x600.

Do you know a way to make this possible?
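One workaround sketch (not a built-in mprof feature): read the .dat file yourself and plot it at whatever canvas size you like. This assumes the "MEM <MiB> <unix-time>" line format visible in the .dat excerpts elsewhere on this page:

import matplotlib.pyplot as plt

times, mem = [], []
with open('mprofile_20150224005550.dat') as f:  # substitute your .dat file
    for line in f:
        if line.startswith('MEM'):
            _, m, t = line.split()
            mem.append(float(m))
            times.append(float(t))

fig, ax = plt.subplots(figsize=(40, 6))  # as wide a canvas as you want
ax.plot([t - times[0] for t in times], mem, linewidth=0.8)
ax.set_xlabel('time (s)')
ax.set_ylabel('memory (MiB)')
fig.savefig('mprofile_wide.png', dpi=100)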

memory profiling and compute profiling the same code

I really like the output of this memory profiler.

However, I think that people who are interested in memory efficiency may also be interested in execution time. And in requiring the decorator (or some other memory profiling mechanism), we end up with code that requires modification each time we want to profile for both compute and memory. Finally, the memory profiler slows down execution time, so before shipping a final product we must remove the memory profiling mechanism. Making updates to code just for capturing profiling data is cumbersome and definitely not efficient.

Ideally, the memory profiler would require no updates to the code and would operate on it in a similar manner to the cProfile module.

Python3 support

memory_profiler doesn't print memory usage with Python 3 (no error messages either), just empty output.

Possible race condition with psutil

Hi again,

I think that I may have found a possible race condition when counting the memory with psutil of a process using the include_children option. The problem (I think) is in this piece of code in _get_memory:

if include_children:
    for p in process.get_children(recursive=True):
        mem += p.get_memory_info()[0] / _TWO_20

The method get_children returns a list that is iterated over to calculate the total memory. It may happen, though, that one of the child processes dies or finishes before the sum has finished, resulting in an error like this:

Reading configuration from '/pica/h1/guilc/repos/facs/tests/data/bin/fastq_screen.conf'
Using 1 threads for searches
Adding database phiX
Processing /pica/h1/guilc/repos/facs/tests/data/synthetic_fastq/simngs_phiX_100.fastq
Output file /pica/h1/guilc/repos/facs/tests/data/tmp/simngs_phiX_100_screen.txt already exists - skipping
Processing complete
Process MemTimer-2:
Traceback (most recent call last):
  File "/sw/comp/python/2.7_kalkyl/lib/python2.7/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/pica/h1/guilc/.virtualenvs/facs/lib/python2.7/site-packages/memory_profiler.py", line 124, in run
    include_children=self.include_children)
  File "/pica/h1/guilc/.virtualenvs/facs/lib/python2.7/site-packages/memory_profiler.py", line 52, in _get_memory
    mem += p.get_memory_info()[0] / _TWO_20
  File "/pica/h1/guilc/.virtualenvs/facs/lib/python2.7/site-packages/psutil/__init__.py", line 758, in get_memory_info
    return self._platform_impl.get_memory_info()
  File "/pica/h1/guilc/.virtualenvs/facs/lib/python2.7/site-packages/psutil/_pslinux.py", line 470, in wrapper
    raise NoSuchProcess(self.pid, self._process_name)
NoSuchProcess: process no longer exists (pid=17442)

It happens randomly, and can be solved by wrapping the sum in a try/except statement:

if include_children:
    for p in process.get_children(recursive=True):
        try:
            mem += p.get_memory_info()[0] / _TWO_20
        except NoSuchProcess:
            pass

I'm not sure that this is the best solution though... any comments/ideas? @fabianp @brainstorm

Thanks!

strange behavior for growing dictionaries/tuples

The allocation of 2.05 MB on line 6 seems strange: it should be a small allocation, and the 2.05 MB allocation should be on the next line (on a for loop we take the max). The lines are probably mixed up.

└─[$] python memory_profiler.py examples/example_loop.py
Line #    Mem usage  Increment   Line Contents
==============================================
     4                           @profile
     5      7.56 MB    0.00 MB   def my_func_dict():
     6      9.61 MB    2.05 MB       a = {}
     7      9.61 MB    0.00 MB       for i in range(10000):
     8      9.61 MB    0.00 MB           a[i] =  i + 1
     9      9.61 MB    0.00 MB       return

memory_profiler.py: error: no such option

When running python -m memory_profiler script_file.py --some-args, memory_profiler assumes that --some-args is intended for it rather than for script_file.py. This is easy to fix by adding the following single line immediately after creating the OptionParser:

parser.disable_interspersed_args()
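For context, a sketch of where that call sits (the usage string is illustrative):

from optparse import OptionParser

parser = OptionParser(usage='python -m memory_profiler script_file.py [args]')
# Stop option parsing at the first positional argument, so everything
# after the script name is left for the profiled script itself.
parser.disable_interspersed_args()
options, args = parser.parse_args()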

fix sys.argv

does not correctly pass arguments into executed script

Calling `itervalues` and `iteritems` on dict breaks mprof in Python3

These iter* methods have been removed, and generators became the default behaviour in Python 3.

% mprof plot
Traceback (most recent call last):
  File "/usr/bin/mprof", line 467, in <module>
    actions[get_action()]()
  File "/usr/bin/mprof", line 436, in plot_action
    mprofile = plot_file(filename, index=n, timestamps=timestamps)
  File "/usr/bin/mprof", line 338, in plot_file
    for values in ts.itervalues():
AttributeError: 'dict' object has no attribute 'itervalues'

The Python version is 3.3.3.

UnboundLocalError

this should work

(Pdb) import memory_profiler
(Pdb) memory_profiler.memory_usage()
*** UnboundLocalError: local variable 'num' referenced before assignment

Incorrect output for first line of a function call

memory_profiler does not seem to correctly catch the memory allocation for the first line of a function.

Here are two versions of a script (I was trying to illustrate the failings of sys.getsizeof), differing only in an initial print 'hello world' statement:

$ cat ~/tmp/lists.py
import random
import sys
import numpy as np

@profile
def test_random_mem_usage():
    c = [np.zeros(50000) for x in range(1000)]
    print sys.getsizeof(c)
    print sum(map(len, c))
    print sum(map(sys.getsizeof, c))

if __name__ == '__main__':
    test_random_mem_usage()

and

$ cat ~/tmp/lists2.py
import random
import sys
import numpy as np

@profile
def test_random_mem_usage():
    print 'hello world'
    c = [np.zeros(50000) for x in range(1000)]
    print sys.getsizeof(c)
    print sum(map(len, c))
    print sum(map(sys.getsizeof, c))

if __name__ == '__main__':
    test_random_mem_usage()

Compare the outputs:

$ python -m memory_profiler ~/tmp/lists.py
9032
50000000
80000
Line #    Mem usage  Increment   Line Contents
==============================================
     5                           @profile
     6    399.39 MB    0.00 MB   def test_random_mem_usage():
     7    399.39 MB    0.00 MB       c = [np.zeros(50000) for x in range(1000)]
     8    399.39 MB    0.00 MB       print sys.getsizeof(c)
     9    399.40 MB    0.01 MB       print sum(map(len, c))
    10    399.40 MB    0.00 MB       print sum(map(sys.getsizeof, c))
$ python -m memory_profiler ~/tmp/lists2.py
hello world
9032
50000000
80000
Line #    Mem usage  Increment   Line Contents
==============================================
     5                           @profile
     6     16.43 MB    0.00 MB   def test_random_mem_usage():
     7     16.43 MB    0.00 MB       print 'hello world'
     8    399.39 MB  382.96 MB       c = [np.zeros(50000) for x in range(1000)]
     9    399.39 MB    0.00 MB       print sys.getsizeof(c)
    10    399.40 MB    0.01 MB       print sum(map(len, c))
    11    399.40 MB    0.00 MB       print sum(map(sys.getsizeof, c))

Installing from pip or setup.py gives error

Tested on both OS X 10.11 and Debian GNU/Linux 7.

Collecting memory-profiler
Downloading memory_profiler-0.38.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 20, in <module>
  File "/private/tmp/pip-build-BV2ywJ/memory-profiler/setup.py", line 1, in <module>
    import memory_profiler
  File "memory_profiler.py", line 863, in <module>
    magic_mprun = MemoryProfilerMagics().mprun.func
TypeError: __init__() takes exactly 2 arguments (1 given)

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /private/tmp/pip-build-BV2ywJ/memory-profiler

Profiling multithreaded functions doesn't work

Hi,

I was trying to profile a function that executes a subprocess.check_call(), which in turn calls a multithreaded program; the multithreaded program is what I actually want to profile.

I am profiling with memory_usage, and it always returns the same value, around 9 MB. I guess that is the memory used by the function that creates the threads, but not by all the threads together.

The real function that I'm trying to benchmark would be this one. It is quite heavy to test, so I've written an ugly small script to reproduce the scenario. Here it is:

import sys
from multiprocessing import Pool
from memory_profiler import memory_usage

def test(n):
    l = [i for i in range(n)]

def test_multip(n, np):
    p = Pool(processes=np)
    results = p.map(test, [n]*np)

if __name__=="__main__":
    t = str(sys.argv[1])
    n = int(sys.argv[2])
    if t == "test":
        print memory_usage((test, (n, )), max_usage=True, include_children=True)
    else:
        np = int(sys.argv[3])
        print memory_usage((test_multip, (n, np, )), max_usage=True, include_children=True)

If you benchmark the test function directly, memory_usage behaves as expected:

(master)guilc@milou-b:~/tests$ python multithread.py test 10000
[9.421875]
(master)guilc@milou-b:~/tests$ python multithread.py test 100000
[12.765625]
(master)guilc@milou-b:~/tests$ python multithread.py test 1000000
[47.67578125]
(master)guilc@milou-b:~/tests$ python multithread.py test 10000000
[396.0625]
(master)guilc@milou-b:~/tests$ python multithread.py test 100000000
[3879.73828125]

However, if you test the multithreaded version, this is what happens:

(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000 1
[9.5234375]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 100000 1
[9.51953125]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 1000000 1
[9.4296875]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 1
[9.31640625]

And the same if you try with more than one thread:

(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 1
[9.31640625]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 2
[9.3203125]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 3
[9.3203125]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 4
[9.3203125]
(master)guilc@milou-b:~/tests$ python multithread.py test_multip 10000000 5
[9.328125]

I hope that this is enough to understand and reproduce the problem. If you have any suggestions/ideas on how to fix this, I can help with that.

Thanks a lot!

Previous trace function is not restored when disabling LineProfiler

    def disable(self):
        self.last_time = {}
        sys.settrace(None)

Instead, this should execute sys.settrace(previous_fn) to restore the previous callback that was set before LineProfiler was enabled. See how it's done in the Nostrils class from http://reminiscential.wordpress.com/2012/04/17/use-pythons-sys-settrace-for-fun-and-for-profit/ as an example.

Even more: we may want LineProfiler::trace_memory_usage to run the original trace callback (if it's code coverage, for example, and we don't want it skipped).
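A minimal sketch of the proposed fix (the enable() body and the stub trace callback are assumptions about the surrounding class):

import sys

class LineProfiler(object):
    def trace_memory_usage(self, frame, event, arg):
        # ... record memory usage for this line (stub) ...
        return self.trace_memory_usage

    def enable(self):
        # remember whatever trace function was active before us
        self._previous_trace = sys.gettrace()
        sys.settrace(self.trace_memory_usage)

    def disable(self):
        self.last_time = {}
        # restore the earlier callback instead of clearing tracing outright
        sys.settrace(self._previous_trace)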

ignores last line

sometimes there's no measurement for the last line (maybe when there's no return statement?)


     4                           @profile
     5      7.58 MB    0.00 MB   def my_func_dict():
     6      9.62 MB    2.05 MB       a = {}
     7      9.62 MB    0.00 MB       for i in range(10000):
     8                                   a[i] =  i + 1

Does not remove itself from sys.argv

When invoking via command line with -m option, memory_profiler does not remove itself from sys.argv, so it messes up the profiled program's argument parsing:

$ cat mp.py
import sys
print sys.argv

$ python mp.py --foo
['mp.py', '--foo']

$ python -m memory_profiler mp.py --foo
['/home/jneely/dev/env/betl/lib/python2.6/site-packages/memory_profiler.py', 'mp.py', '--foo']

By contrast:

$ python -m pdb mp.py --foo
> /home/jneely/tmp/mp.py(1)<module>()
-> import sys
(Pdb) c
['mp.py', '--foo']
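A sketch of the kind of fix being requested (its exact placement inside memory_profiler's __main__ handling is an assumption):

import sys

# After memory_profiler has consumed its own entry and options,
# shift them off so the profiled script sees only its own arguments:
sys.argv[:] = sys.argv[1:]
# sys.argv is now ['mp.py', '--foo'], matching what pdb reports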

Truncation issue in _get_memory

Hi,

When I was reading your code, I found a small truncation issue
in _get_memory() (the psutil version; I didn't check the other).

Here a gist to reproduce it:
https://gist.github.com/3665731

It is due to the integer division in Python 2.x.
Judging from the float() call, it seems unintended, so I'm reporting it.
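A quick illustration of the truncation (the _TWO_20 name follows the snippets quoted earlier on this page; making it a float is one possible fix):

# Python 2: dividing two integers truncates toward zero.
rss_bytes = 123456789
print(rss_bytes / (2 ** 20))   # 117 in Python 2 -- the fractional MiB is lost

_TWO_20 = float(2 ** 20)
print(rss_bytes / _TWO_20)     # 117.73..., as intended
# Alternatively: from __future__ import division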

max_iter = float('inf') looks wrong

it is used in memory_usage function like this:

    if timeout is not None:
        max_iter = int(timeout / interval)
    elif isinstance(proc, int):
        # external process and no timeout
        max_iter = 1
    else:
        # for a Python function wait until it finishes
        max_iter = float('inf') # <--------------------------

    if isinstance(proc, (list, tuple)):
        # ... (snip)
    else:
        # external process
        if proc == -1:
            proc = os.getpid()
        if max_iter == -1:
            max_iter = 1
        for _ in range(max_iter):   # <----------------
            ret.append(_get_memory(proc))
            time.sleep(interval)
    return ret

range(float('inf')) is an error, and max_iter is not used for anything else here.
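One way to express "loop until the function finishes" without range(float('inf')) (a sketch, not necessarily the fix that was merged): fall back to an unbounded iterator.

from itertools import count

def sample_indices(max_iter):
    # unbounded iterator when no timeout was given, bounded range otherwise
    if max_iter == float('inf'):
        return count()
    return range(int(max_iter))

for _ in sample_indices(3):
    pass  # take one memory sample per iteration here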

import module as name doesn't work with Python 3

Here is an example script:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import math as m

@profile
def f():
    o = m.sqrt(2013)
    return o

print(f())

And here is the output with Python 2:

~$ python2 -m memory_profiler ./tmpr.py 
44.8664685483
Filename: ./tmpr.py

Line #    Mem usage    Increment   Line Contents
================================================
     7                             @profile
     8     9.668 MB     0.000 MB   def f():
     9     9.676 MB     0.008 MB       o = m.sqrt(2013)
    10     9.676 MB     0.000 MB       return o

And here with Python 3:

~$ python3 -m memory_profiler ./tmpr.py 
Traceback (most recent call last):
  File "/usr/lib64/python3.2/runpy.py", line 161, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python3.2/runpy.py", line 74, in _run_code
    exec(code, run_globals)
  File "/usr/lib64/python3.2/site-packages/memory_profiler.py", line 615, in <module>
    ns, copy(globals()))
  File "./tmpr.py", line 13, in <module>
    print(f())
  File "/usr/lib64/python3.2/site-packages/memory_profiler.py", line 576, in wrapper
    val = prof(func)(*args, **kwargs)
  File "/usr/lib64/python3.2/site-packages/memory_profiler.py", line 229, in f
    result = func(*args, **kwds)
  File "./tmpr.py", line 9, in f
    o = m.sqrt(2013)
NameError: global name 'm' is not defined

plot_action not taking the output .dat file with smaller intervals

I ran the following test code with the command:
mprof run -T 0.001 mprof_example.py

import time


def test1():
    n = 10000
    a = [1] * n
    time.sleep(1)
    return a

def test2():
    n = 100000
    b = [1] * n
    time.sleep(1)
    return b

if __name__ == "__main__":
    test1()
    test2()

I got the following output file:
CMDLINE /usr/local/local/python-2.7.5/bin/python2.7 mprof_example.py
MEM 5.476562 1438698997.6944
MEM 7.613281 1438698997.7504
MEM 7.613281 1438698997.8072
MEM 7.613281 1438698997.8678
MEM 7.613281 1438698997.9238
MEM 7.613281 1438698997.9789
MEM 7.613281 1438698998.0327
MEM 7.613281 1438698998.0876
MEM 7.613281 1438698998.1430
MEM 7.613281 1438698998.1976
MEM 7.613281 1438698998.2512
MEM 7.613281 1438698998.3066
MEM 7.613281 1438698998.3623
MEM 7.613281 1438698998.4171
MEM 7.613281 1438698998.4711
MEM 7.613281 1438698998.5262
MEM 7.613281 1438698998.5816
MEM 7.613281 1438698998.6397
MEM 7.613281 1438698998.6970
MEM 8.378906 1438698998.7522
MEM 8.378906 1438698998.8076
MEM 8.378906 1438698998.8626
MEM 8.378906 1438698998.9165
MEM 8.378906 1438698998.9765
MEM 8.378906 1438698999.0322
MEM 8.378906 1438698999.0871
MEM 8.378906 1438698999.1414
MEM 8.378906 1438698999.1967
MEM 8.378906 1438698999.2529
MEM 8.378906 1438698999.3094
MEM 8.378906 1438698999.3658
MEM 8.378906 1438698999.4282
MEM 8.378906 1438698999.4831
MEM 8.378906 1438698999.5372
MEM 8.378906 1438698999.5924
MEM 8.378906 1438698999.6475
MEM 8.378906 1438698999.7022
MEM 0.000000 1438698999.7563

I tried plotting the result and saw the following error message:

mprof plot mprofile_mprofex.dat
/usr/lib/pymodules/python2.7/matplotlib/axes.py:4601: UserWarning: No labeled objects found. Use label='...' kwarg on individual plots.
  warnings.warn("No labeled objects found. "
Traceback (most recent call last):
  File "/home/jey/memory_profiler-0.33/mprof", line 490, in <module>
    actions[get_action()]()
  File "/home/jey/memory_profiler-0.33/mprof", line 470, in plot_action
    leg.get_frame().set_alpha(0.5)
AttributeError: 'NoneType' object has no attribute 'get_frame'

Not able to plot the graph from "./mplot run"

I've done "mprof run --python " and post that I am trying to plot the graph("mprof plot"). But I don't see any graph being plotted.

vikas@host:/home/vikas/memory_profiler-0.32$ ./mprof run --python ../asl
mprof: Sampling memory every 0.1s
running as a Python program...

vikas@host:/home/vikas/memory_profiler-0.32$ cat mprofile_20150224005550.dat
CMDLINE python ../asl
MEM 1.316406 1424768150.5671
MEM 6.539062 1424768150.6675
MEM 8.812500 1424768150.7678
MEM 8.812500 1424768150.8681
MEM 8.812500 1424768150.9684

"mprof plot" with labels does not like spaces in the label

mprof can use a context manager to place a label. If the label contains a space e.g. "my label" then a ValueError is raised as shown below. If the space is removed (e.g. "my_label") then mprof plot displays the label.

It might be easier to make a note in the README stating that spaces aren't allowed, if this disturbs your parsing code! Or catch the ValueError and hint that spaces aren't allowed (to give the user a clue).

$ mprof plot
Traceback (most recent call last):
  File "/home/ian/workspace/virtualenvs/high_performance_python_orielly/shared_github/raw_code/ian/env/bin/mprof", line 467, in <module>
    actions[get_action()]()
  File "/home/ian/workspace/virtualenvs/high_performance_python_orielly/shared_github/raw_code/ian/env/bin/mprof", line 436, in plot_action
    mprofile = plot_file(filename, index=n, timestamps=timestamps)
  File "/home/ian/workspace/virtualenvs/high_performance_python_orielly/shared_github/raw_code/ian/env/bin/mprof", line 322, in plot_file
    mprofile = read_mprofile_file(filename)
  File "/home/ian/workspace/virtualenvs/high_performance_python_orielly/shared_github/raw_code/ian/env/bin/mprof", line 299, in read_mprofile_file
    ts.append([float(start), float(end),
ValueError: could not convert string to float: list

Incompatible with psutil 3.0.0

psutil 3.0.0 was recently released. Attempting to profile code using this version of psutil results in an exception:

Traceback (most recent call last):
  File "/usr/local/bin/mprof", line 472, in <module>
    actions[get_action()]()
  File "/usr/local/bin/mprof", line 220, in run_action
    include_children=options.include_children, stream=f)
  File "/usr/local/lib/python2.7/dist-packages/memory_profiler.py", line 243, in memory_usage
    include_children=include_children)
  File "/usr/local/lib/python2.7/dist-packages/memory_profiler.py", line 48, in _get_memory
    mem_info = getattr(process, 'memory_info', process.get_memory_info)
AttributeError: 'Process' object has no attribute 'get_memory_info'

(memory_profiler does appear to work correctly with the previous version of psutil, 2.2.1)

[not a bug] memory_profiler could also record Disk I/O and Network I/O

I've been playing with psutil; I'm attaching two images showing disk I/O and network I/O measurements (both hacky proofs of concept).

The disk usage graph writes 10 files of 10 MB each (with flushes); we can see some odd caching behaviour which maybe needs some more work?

The network graph reads a 1.6MB file from wikipedia 5 times.

Both charts exhibit a spike at the end of their 'with' block which I don't understand.

Is there interest in merging this code into the main project? Obviously this goes beyond the remit of a memory profiler! All I've done is changed a few lines with psutil in memory_profiler.py and fixed 1 line in mprof for the plotting.

(Attached charts: disk_used, io_used.)
