summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorDaniƫl van Noord <13665637+DanielNoord@users.noreply.github.com>2021-12-27 10:28:46 +0100
committerGitHub <noreply@github.com>2021-12-27 10:28:46 +0100
commit99a8f70b50182a1d67705d0f148db9c6dbf7418c (patch)
tree747485be647b0338e545217c661a99ce7fd56a2e
parentb4bf5168621f0a4bf7cca795862e2b5c139fc8cb (diff)
downloadpylint-git-99a8f70b50182a1d67705d0f148db9c6dbf7418c.tar.gz
Add documentation about profiling and performance analysis (#5597)
Co-authored-by: Andreas Finkler <3929834+DudeNr33@users.noreply.github.com>
-rw-r--r--doc/development_guide/index.rst1
-rw-r--r--doc/development_guide/profiling.rst118
2 files changed, 119 insertions, 0 deletions
diff --git a/doc/development_guide/index.rst b/doc/development_guide/index.rst
index 50343e4c9..190efa369 100644
--- a/doc/development_guide/index.rst
+++ b/doc/development_guide/index.rst
@@ -8,3 +8,4 @@ Development
contribute
testing
+ profiling
diff --git a/doc/development_guide/profiling.rst b/doc/development_guide/profiling.rst
new file mode 100644
index 000000000..13a8bbaf4
--- /dev/null
+++ b/doc/development_guide/profiling.rst
@@ -0,0 +1,118 @@
+.. -*- coding: utf-8 -*-
+.. _profiling:
+
+===================================
+ Profiling and performance analysis
+===================================
+
+Performance analysis for Pylint
+-------------------------------
+
+To analyse the performance of Pylint we recommend to use the ``cProfile`` module
+from ``stdlib``. Together with the ``pstats`` module this should give you all the tools
+you need to profile a Pylint run and see which functions take how long to run.
+
+The documentation for both modules can be found at cProfile_.
+
+To profile a run of Pylint over itself you can use the following code and run it from the base directory.
+Note that ``cProfile`` will create a document called ``stats`` that is then read by ``pstats``. The
+human-readable output will be stored by ``pstats`` in ``./profiler_stats``. It will be sorted by
+``cumulative time``:
+
+.. sourcecode:: python
+
+ import cProfile
+ import pstats
+ import sys
+
+ sys.argv = ["pylint", "pylint"]
+ cProfile.run("from pylint import __main__", "stats")
+
+ with open("profiler_stats", "w", encoding="utf-8") as file:
+ stats = pstats.Stats("stats", stream=file)
+ stats.sort_stats("cumtime")
+ stats.print_stats()
+
+You can also interact with the stats object by sorting or restricting the output.
+For example, to only print functions from the ``pylint`` module and sort by cumulative time you could
+use:
+
+.. sourcecode:: python
+
+ import cProfile
+ import pstats
+ import sys
+
+ sys.argv = ["pylint", "pylint"]
+ cProfile.run("from pylint import __main__", "stats")
+
+ with open("profiler_stats", "w", encoding="utf-8") as file:
+ stats = pstats.Stats("stats", stream=file)
+ stats.sort_stats("cumtime")
+ stats.print_stats("pylint/pylint")
+
+Lastly, to profile a run over your own module or code you can use:
+
+.. sourcecode:: python
+
+ import cProfile
+ import pstats
+ import sys
+
+ sys.argv = ["pylint", "your_dir/your_file"]
+ cProfile.run("from pylint import __main__", "stats")
+
+ with open("profiler_stats", "w", encoding="utf-8") as file:
+ stats = pstats.Stats("stats", stream=file)
+ stats.sort_stats("cumtime")
+ stats.print_stats()
+
+The documentation of the ``pstats`` module discusses other possibilites to interact with
+the profiling output.
+
+
+Performance analysis of a specific checker
+------------------------------------------
+
+To analyse the performance of specific checker within Pylint we can use the human-readable output
+created by ``pstats``.
+
+If you search in the ``profiler_stats`` file for the file name of the checker you will find all functional
+calls from functions within the checker. Let's say we want to check the ``visit_importfrom`` method of the
+``variables`` checker::
+
+ ncalls tottime percall cumtime percall filename:lineno(function)
+ 622 0.006 0.000 8.039 0.013 /MY_PROGRAMMING_DIR/pylint/pylint/checkers/variables.py:1445(visit_importfrom)
+
+The previous line tells us that this method was called 622 times during the profile and we were inside the
+function itself for 6 ms in total. The time per call is less than a millisecond (0.006 / 622)
+and thus is displayed as being 0.
+
+Often you are more interested in the cumulative time (per call). This refers to the time spend within the function
+and any of the functions it called or they functions they called (etc.). In our example, the ``visit_importfrom``
+method and all of its child-functions took a little over 8 seconds to exectute, with an execution time of
+0.013 ms per call.
+
+You can also search the ``profiler_stats`` for an individual function you want to check. For example
+``_analyse_fallback_blocks``, a function called by ``visit_importfrom`` in the ``variables`` checker. This
+allows more detailed analysis of specific functions::
+
+ ncalls tottime percall cumtime percall filename:lineno(function)
+ 1 0.000 0.000 0.000 0.000 /MY_PROGRAMMING_DIR/pylint/pylint/checkers/variables.py:1511(_analyse_fallback_blocks)
+
+
+Parsing the profiler stats with other tools
+-------------------------------------------
+
+Often you might want to create a visual representation of your profiling stats. A good tool
+to do this is gprof2dot_. This tool can create a ``.dot`` file from the profiling stats
+created by ``cProfile`` and ``pstats``. You can then convert the ``.dot`` file to a ``.png``
+file with one of the many converters found online.
+
+You can read the gprof2dot_ documentation for installation instructions for your specific environment.
+
+Another option would be snakeviz_.
+
+.. _cProfile: https://docs.python.org/3/library/profile.html
+.. _gprof2dot: https://github.com/jrfonseca/gprof2dot
+.. _snakeviz: https://jiffyclub.github.io/snakeviz/