Heapy: A Memory Profiler and Debugger for Python

This report presents background, design, implementation, rationale and some use cases for Heapy version 0.1, a toolset and library for the Python programming language providing object and heap memory sizing, profiling and other analysis.

Excessive memory use may cause severe performance problems and system crashes. Without appropriate tools, it may be difficult or impossible to determine why a program is using too much memory. This applies even though Python provides automatic memory management: garbage collection can help avoid many memory allocation bugs, but only to a certain extent, because the collector lacks information during program execution. There is still a need for tools that help the programmer understand the memory behaviour of programs, especially in complicated situations. The primary motivation for Heapy is that there has been a lack of such tools for Python.

The main questions addressed by Heapy are how much memory is used by objects, which objects are of most interest for optimization purposes, and why objects are kept in memory. Memory leaks are often of special interest and may be found by comparing snapshots of the heap population taken at different times. Memory profiles, using different kinds of classifiers that may include retainer information, can provide quick overviews revealing optimization possibilities not thought of beforehand. Reference patterns and shortest reference paths provide different perspectives on object access patterns to help explain why objects are kept in memory.
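The idea of finding leaks by comparing heap snapshots can be illustrated with a minimal sketch. It uses only the Python standard library (the gc module), not Heapy's own API, and it classifies objects by type name only. The names snapshot, growth, Node and demo are hypothetical and chosen for this illustration.

```python
import gc
from collections import Counter


def snapshot():
    """Count the objects currently tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after):
    """Return the type names whose object count increased between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if after[name] > before[name]
    }


class Node:
    """Hypothetical object type that the demo leaks on purpose."""


def demo():
    leaked = []                                # container that keeps the objects alive
    before = snapshot()
    leaked.extend(Node() for _ in range(1000))  # simulate a leak
    after = snapshot()
    return growth(before, after)


if __name__ == "__main__":
    for name, count in sorted(demo().items(), key=lambda item: -item[1]):
        print(name, count)
```

A per-type count answers only the "how many" question. It does not report sizes or explain why the objects are retained, which is where the retainer information and reference paths mentioned above come in.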

Contents

1 Introduction
1.1 Background
1.2 Problem definition
1.3 Design approach
1.4 Related work
1.5 Scope of this report
2 Background
2.1 Memory leaks
2.2 Memory profiling
2.3 The WHY question
2.4 Managing complexity
3 Diving into Python Internals
3.1 Python
3.2 Finding objects
3.3 Interpreter structures
3.4 Simplified assumption
3.5 The memory allocation in Python
3.6 No general procedure
3.7 The compromise
4 Design, implementation and rationale
4.1 General Concepts
4.1.1 Session context
4.1.2 Universal set
4.1.3 Set of objects
4.1.4 Kind object
4.1.5 Equivalence relation
4.1.6 Path
4.1.7 Reference pattern
4.1.8 Remote monitor
4.1.9 Memory profiling
4.1.10 Profile browser
4.2 Summary of API & UI
4.2.1 Creating the Heapy Session Context
4.2.2 Commonly used operations on a session context
4.2.3 Common operations on IdentitySet objects
4.3 Implementation overview
4.4 Extension modules
4.4.1 heapyc.so
4.4.2 setsc.so
4.5 Python modules in guppy.heapy
4.5.1 Use.py
4.5.2 UniSet.py
4.5.3 Classifiers.py
4.5.4 Part.py
4.5.5 Paths.py
4.5.6 RefPat.py
4.5.7 View.py
4.5.8 Prof.py
4.5.9 Monitor.py
4.5.10 Remote.py
4.6 Python modules in guppy.etc
4.6.1 Glue.py
4.7 Rationale
4.7.1 Why sets?
4.7.2 Why equivalence relations?
4.7.3 What is this term Partition used when printing tables?
4.7.4 Why session context – why not global variables?
4.7.5 Why API == UI?
4.7.6 Why a variety of C functions?
4.7.7 Why nodesets?
4.7.8 Why not importing Use directly?
4.7.9 Why family objects, why not many subclasses of UniSet?
4.7.10 Why is monitor a server?
4.7.11 Why Glue.py?
5 Sets, kinds and equivalence relations
5.1 Sets
5.2 Kinds
5.3 Equivalence relations and classes
5.4 Partitions
5.5 The heap() method
5.6 The Via classifier
5.7 An optimization possibility
6 Using Heapy to find and seal a memory leak
6.1 Background
6.2 Debugging approach
7 Conclusions and future work
7.1 Future work
A API specification (extract)

Author: Nilsson, Sverker

Source: Linköping University
