Python comes with a built-in way of collecting profile information, documented at https://docs.python.org/3/library/profile.html.
In a nutshell, it boils down to:
python3 -m cProfile -o profile.out command.py arguments…
This post is an attempt to document what's in profile.out
, since the python
documentation does not cover it. I have tried to figure this out by looking at
the sources of the pstats
module,
the sources of gprof2dot
,
and trying to make sense of the decoded structure, and navigate it.
Decoding Python profile data
The data collected by python's profile
/cProfile
is a data structure
encoding using the marshal
module1.
Loading the data is simple:
>>> import marshal
>>> with open("profile.out", "rb") as fd:
>>> data = marshal.load(fd)
Structure of Python profile data
Decoded profile data is a dictionary indexed by tuples representing functions or scopes:
# File name, Line number, Function or scope name
Func = Tuple[str, int, str]
For example:
("./staticsite/page.py", 1, "<module>")
("./staticsite/features/markdown.py", 138, "load_dir")
('~', 0, "<method 'items' of 'collections.OrderedDict' objects>")
('/usr/lib/python3.7/pathlib.py', 1156, 'stat')
('~', 0, …)
represents a built-in function.
The signature of the decoded dict seems to be:
Dict[
# Function of scope
Func,
Tuple[
# Aggregated statistics
int, int, float, float,
# Callers
Dict[
# Caller function or scope
Func,
# Call statistics collected for this caller → callee
Tuple[int, int, float, float]
]
]
]
The four int, int, float, float
statistics are normally unpacked as
cc, nc, tt, ct
, and stand for:
cc
: number of primitive (non recursive) calls. Sometimes listed aspcalls
.nc
: total number of calls (inculding recursive), sometimes listed asncalls
.tt
: total time, sometimes listed astottime
: total time spent in the given function, and excluding time made in calls to sub-functions.ct
: cumulative time, Sometimes listed ascumtime
: cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.
Phrases in italic are quotations from profile's documentation.
Building a call graph
This is some example code to turn the collected statistics into a call graph structure:
class Node:
"""
Node in the caller → callee graph
"""
def __init__(self, func: Func):
self.fname, self.lineno, self.scope = func
self.callees: List["Call"] = []
self.callers: List["Call"] = []
self.cc = 0
self.nc = 0
self.tt = 0.0
self.ct = 0.0
def __str__(self):
# Builtin function
if self.fname == "~" and self.lineno == 0:
return f"[builtin]:{self.scope}"
# Shorten file names from system libraries
self.shortname = self.fname
for path in sorted(sys.path, key=lambda x: -len(x)):
if not path:
continue
if self.fname.startswith(path):
return f"[sys]{self.fname[len(path):]}:{self.lineno}:{self.scope}"
# File in the local project
return f"{self.fname}:{self.lineno}:{self.scope}"
class Call:
"""
Arc in the caller → callee graph
"""
def __init__(self, caller: Node, callee: Node, cc: int, nc: int, tt: float, ct: float):
self.caller = caller
self.callee = callee
self.cc = cc
self.nc = nc
self.tt = tt
self.ct = ct
class Graph:
"""
Graph of callers and callees.
Each node in the graph represents a function, with its aggregated
statistics.
Each arc in the graph represents each collected caller → callee statistics
"""
def __init__(self, stats: Stats):
# Index of all nodes in the graph
self.nodes: Dict[Func, Node] = {}
# Total execution time
self.total_time = 0.0
# Build the graph
for callee, (cc, nc, tt, ct, callers) in stats.items():
self.total_time += tt
# Get the callee and fill its aggregated stats
ncallee = self.node(callee)
ncallee.cc = cc
ncallee.nc = nc
ncallee.tt = tt
ncallee.ct = ct
# Create caller → callee arcs
for caller, (cc, nc, tt, ct) in callers.items():
ncaller = self.node(caller)
call = Call(ncaller, ncallee, cc, nc, tt, ct)
ncallee.callers.append(call)
ncaller.callees.append(call)
def node(self, fun: Func) -> Node:
"""
Lookup or create a node
"""
res = self.nodes.get(fun)
if res is None:
res = Node(fun)
self.nodes[fun] = res
return res
@classmethod
def load(cls, pathname: str) -> "Graph":
"""
Builds a Graph from profile statistics saved on a file
"""
with open(pathname, "rb") as fd:
return cls(marshal.load(fd))
@classmethod
def from_pstats(cls, stats: pstats.Stats) -> "Graph":
"""
Builds a Graph from an existing pstats.Stats structure
"""
return cls(stats.stats)
Other python profiling links
- pyinstrument: call stack profiler for Python
- austin: Python frame stack sampler for CPython
- py-spy: Sampling profiler for Python programs
- FlameGraph: stack trace visualizer