guppy/heapy in Python

guppy3 is a Python package that reports the status of the current heap and of the objects it holds. This information helps developers with memory profiling and analysis. The guppy package contains a sub-package named heapy, which provides the current heap status along with a set of methods for inspecting it.

When asked for the heap status, guppy3 returns a special set object built on nodesets implemented in C. This object holds the heap status for each object available in memory. The package can also report on the objects that are reachable or unreachable in the heap, returned as a special type of list. For each entry, it gives the number of objects, the percentage of memory they occupy, their size in bytes, and their type.

Apart from this, the guppy3 package offers useful methods that let us access an individual element of the heap status, find the difference between two heap statuses, find the size of objects in bytes, and more.

In this tutorial, we will work through several examples to understand how the guppy3 package can be used to gather information about memory usage in Python. So, let's get started.

Understanding the guppy3 package

The guppy3 package is a Python programming environment and heap analysis toolset. It consists of the following sub-packages:

1. etc
2. gsl
3. heapy
4. sets
The descriptions for the above sub-packages are given below:

1. etc - This sub-package contains support modules, including the Glue protocol module.
2. gsl - This sub-package contains the implementation of the Guppy Specification Language. It can be used to create documents and tests from a common source.
3. heapy - This sub-package contains the heap analysis toolset. It can be used to find information about the objects in a heap and to display that information in different ways.
4. sets - This sub-package contains the Bitsets and 'nodesets' implemented in the C language.

The guppy3 package is a fork of Guppy-PE, created by Sverker Nilsson for Python 2.

Requirements for the guppy3 package

We should have Python 3.6, 3.7, 3.8, or 3.9. The guppy3 package is CPython only; thus, PyPy and other Python implementations are not supported. Support for Python 2 is available from guppy-pe by Sverker Nilsson, from which this package is forked.

The graphical browser of this package requires the Tkinter library. Moreover, threading must be available in order to use the remote monitoring feature.

How to install the guppy3 package?

We can install the guppy3 package either with the pip installer or with conda. Both methods are shown below:

Installation using pip

    pip install guppy3

Installation using conda

    conda install -c conda-forge guppy3

Verifying the Installation

Once the module is installed, we can verify it by creating an empty Python program file and writing an import statement as follows:

File: verify.py

    import guppy

Now, save the above file and execute it using the following command in a terminal:

Syntax:

    python verify.py

If the above Python program file does not return any error, the module is installed properly. However, if an exception is raised, try reinstalling the module and refer to its official documentation.
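As an alternative to a bare import, the check can be done without raising an exception. The following is a minimal sketch, assuming the package is importable under the name guppy (the name used by the examples in this tutorial); the helper name is_guppy_installed is hypothetical and not part of guppy3.

```python
import importlib.util


def is_guppy_installed():
    """Return True if the import system can locate the 'guppy' package."""
    # find_spec returns None for a top-level module that cannot be found,
    # so nothing is imported and no ImportError is raised here.
    return importlib.util.find_spec("guppy") is not None


if is_guppy_installed():
    print("guppy3 is installed.")
else:
    # Assumed install command; conda users would use their own channel.
    print("guppy3 was not found; try: pip install guppy3")
```

This only confirms that the package can be located; it does not confirm that the C extensions load correctly, which an actual import would.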
Understanding some methods and attributes of the guppy3 package

The following table lists the methods and attributes available to us through the guppy3 package:
Let us now consider some examples to understand how the above methods and attributes of the guppy3 package are used to profile memory usage in Python.

Some examples based on the Python guppy3 package

Example 1:

In the following example, we will access the heap status using the guppy.hpy(), heap() and setref() methods of the guppy module.

Code:

Output:

Heap Status At Starting :
Heap Size : 12023800 bytes

Partition of a set of 85790 objects. Total size = 12023800 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0   25003  29   4135160  34    4135160   34 str
     1   18832  22   1346584  11    5481744   46 tuple
     2    6269   7   1107919   9    6589663   55 types.CodeType
     3   12133  14    916798   8    7506461   62 bytes
     4    5748   7    781728   7    8288189   69 function
     5     863   1    734088   6    9022277   75 type
     6     244   0    494712   4    9516989   79 dict of module
     7     863   1    461784   4    9978773   83 dict of type
     8    1319   2    415880   3   10394653   86 dict (no owner)
     9     150   0    219792   2   10614445   88 set
<282 more rows. Type e.g. '_.more' to view.>

Heap Status After Setting Reference Point :
Heap Size : 616 bytes

Partition of a set of 3 objects. Total size = 616 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1  33       408  66        408   66 types.FrameType
     1       1  33       136  22        544   88 function
     2       1  33        72  12        616  100 builtins.weakref

Heap Status After Creating Few Objects :
Heap Size : 56632 bytes

Partition of a set of 1333 objects. Total size = 56632 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0    1328 100     37184  66      37184   66 int
     1       1   0     12728  22      49912   88 list
     2       1   0      6104  11      56016   99 numpy.ndarray
     3       1   0       408   1      56424  100 types.FrameType
     4       1   0       136   0      56560  100 function
     5       1   0        72   0      56632  100 builtins.weakref

Memory Usage After Creation Of Objects : 56016 bytes

Explanation:

The above snippet of code demonstrates the guppy.hpy(), heap() and setref() methods. We first collected the heap status at the start of the script.
We then set the reference point and retrieved the heap status again. Next, we created some objects such as a list, a string, and a NumPy array of random numbers. Once these objects were created, we called the heap() method again to get a heap status that includes them. The output shows the number of objects and the total size of the whole heap and, for each object type, the object count, the size, the percentage of memory used, and the type information. The second heap status shows very little because almost nothing happened after the reference point was set. The third heap status contains the objects created after the reference point was set.

Example 2:

In the following example, we will use the heapu() method to access the objects that are unreachable from the root of the heap.

Code:

Output:

GC Collectable Objects Which Are not Reachable from Root of Heap
Total Objects : 587
Total Size : 82948 Bytes
Number of Entries : 17

Entries :
 Index   Count      Size  Cumulative Size Object Name
     0      56     29088            29088 dict
     1      46     18768            47856 type
     2     130      9360            57216 types.WrapperDescriptorType
     3      92      4888            62104 tuple
     4      65      4160            66264 types.MemberDescriptorType
     5       3      4072            70336 list
     6      93      3906            74242 bytes
     7      52      3744            77986 types.MethodDescriptorType
     8      16      2610            80596 str
     9      21      1512            82108 types.BuiltinMethodType
    10       6       384            82492 types.GetSetDescriptorType
    11       1       160            82652 sys.flags
    12       2       144            82796 types.ClassMethodDescriptorType
    13       1        64            82860 types.MethodType
    14       1        48            82908 encodings.utf_8.IncrementalDecoder
    15       1        24            82932 builtins.stderrprinter
    16       1        16            82948 Token.MISSING

First 5 Entries :
 Index   Count      Size  Cumulative Size Object Name
     0      56     29088            29088 dict
     1      46     18768            47856 type
     2     130      9360            57216 types.WrapperDescriptorType
     3      92      4888            62104 tuple
     4      65      4160            66264 types.MemberDescriptorType

Directly Printing Results Without Iteration

Partition of a set of 725 objects.
Total size = 99179 bytes.
 Index   Count   %      Size   % Cumulative    % Type
     0      66   9     34496  35      34496   35 dict
     1      54   7     22032  22      56528   57 type
     2     147  20     10584  11      67112   68 types.WrapperDescriptorType
     3     113  16      6064   6      73176   74 tuple
     4      74  10      5328   5      78504   79 types.MethodDescriptorType
     5      66   9      4224   4      82728   83 types.MemberDescriptorType
     6       3   0      4072   4      86800   88 list
     7      93  13      3906   4      90706   91 bytes
     8      22   3      3089   3      93795   95 str
     9      40   6      2880   3      96675   97 types.BuiltinMethodType
<39 more rows. Type e.g. '_.more' to view.>

Measuring Unreachable Objects From This Reference Point Onwards

Partition of a set of 725 objects. Total size = 99179 bytes.
 Index   Count   %      Size   % Cumulative    % Type
     0      66   9     34496  35      34496   35 dict
     1      54   7     22032  22      56528   57 type
     2     147  20     10584  11      67112   68 types.WrapperDescriptorType
     3     113  16      6064   6      73176   74 tuple
     4      74  10      5328   5      78504   79 types.MethodDescriptorType
     5      66   9      4224   4      82728   83 types.MemberDescriptorType
     6       3   0      4072   4      86800   88 list
     7      93  13      3906   4      90706   91 bytes
     8      22   3      3089   3      93795   95 str
     9      40   6      2880   3      96675   97 types.BuiltinMethodType
<39 more rows. Type e.g. '_.more' to view.>

Explanation:

In the above snippet of code, we first retrieved the list of unreachable objects from the heap. We then used different methods to access individual rows of the status and retrieve the information for each data type.

Example 3:

In the following example, we will retrieve an individual entry from the whole heap status object.

Code:

Output:

Heap Status At Starting :

Partition of a set of 87291 objects. Total size = 21079156 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0      69   0   9007150  43    9007150   43 numpy.ndarray
     1   25008  29   4135434  20   13142584   62 str
     2   18834  22   1346760   6   14489344   69 tuple
     3    6270   7   1108095   5   15597439   74 types.CodeType
     4   12135  14    916976   4   16514415   78 bytes
     5    5748   7    781728   4   17296143   82 function
     6     863   1    734088   3   18030231   86 type
     7     244   0    494712   2   18524943   88 dict of module
     8     863   1    461784   2   18986727   90 dict of type
     9    1319   2    415880   2   19402607   92 dict (no owner)
<282 more rows. Type e.g. '_.more' to view.>

Accessing Individual Element of Heap

First Element :
Partition of a set of 69 objects. Total size = 9007150 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0      69 100   9007150 100    9007150  100 numpy.ndarray

Second Element :
Partition of a set of 25008 objects. Total size = 4135434 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     1   25008 100   4135434 100   13142584  318 str

Third Element :
Partition of a set of 18834 objects. Total size = 1346760 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     2   18834 100   1346760 100   14489344 1076 tuple

Total Heap Size : 20.10 MB
Size Of Object : 0 - 8.59 MB
Size Of Object : 1 - 3.94 MB
Size Of Object : 2 - 1.28 MB
Size Of Object : 3 - 1.06 MB
Size Of Object : 4 - 895.48 KB
Size Of Object : 5 - 763.41 KB
Size Of Object : 6 - 717.28 KB
Size Of Object : 7 - 483.12 KB
Size Of Object : 8 - 450.96 KB
Size Of Object : 9 - 406.13 KB
Size Of Object : 10 - 214.64 KB
Size Of Object : 11 - 151.73 KB
Size Of Object : 12 - 132.05 KB
Size Of Object : 13 - 98.14 KB
Size Of Object : 14 - 91.27 KB

Explanation:

In the above snippet of code, we retrieved the heap status after creating some lists. An individual entry can be accessed from the heap status object with ordinary list indexing, and we printed several such entries. We also created a simple helper that accepts a size in bytes and returns it in KB/MB/GB.
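The size-conversion helper described above can be sketched as follows. This is a minimal version, assuming 1024-based units and two decimal places, which is consistent with the figures in the output (for example, 21079156 bytes shown as 20.10 MB). The function name format_size is hypothetical.

```python
def format_size(num_bytes):
    """Return num_bytes as a readable string using 1024-based units."""
    size = float(num_bytes)
    for unit in ("Bytes", "KB", "MB", "GB"):
        # Stop at the first unit where the value drops below 1024,
        # or at GB, which is the largest unit handled here.
        if size < 1024 or unit == "GB":
            return f"{size:.2f} {unit}"
        size /= 1024


print(format_size(21079156))  # 20.10 MB (total heap size in the output above)
print(format_size(9007150))   # 8.59 MB  (entry 0, numpy.ndarray)
print(format_size(916976))    # 895.48 KB (entry 4, bytes)
```

The helper takes a plain integer, so it can be applied to any byte count, such as the size attribute of a heap status entry.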
Example 4:

In the following example, we will find the difference between two heap statuses to check how many objects were created between the two calls. For this example, we will use the diff() method and the disjoint() method.

Code:

Output:

Heap Status At Starting :
Heap Size : 12024920 bytes

Partition of a set of 85803 objects. Total size = 12024920 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0   25016  29   4136028  34    4136028   34 str
     1   18832  22   1346712  11    5482740   46 tuple
     2    6269   7   1107919   9    6590659   55 types.CodeType
     3   12133  14    916898   8    7507557   62 bytes
     4    5748   7    781728   7    8289285   69 function
     5     863   1    734088   6    9023373   75 type
     6     244   0    494712   4    9518085   79 dict of module
     7     863   1    461784   4    9979869   83 dict of type
     8    1319   2    415880   3   10395749   86 dict (no owner)
     9     150   0    219792   2   10615541   88 set
<282 more rows. Type e.g. '_.more' to view.>

Heap Status After Creating Few Objects :
Heap Size : 12081288 bytes

Partition of a set of 87135 objects. Total size = 12081288 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0   25016  29   4136028  34    4136028   34 str
     1   18832  22   1346712  11    5482740   45 tuple
     2    6269   7   1108167   9    6590907   55 types.CodeType
     3   12133  14    916898   8    7507805   62 bytes
     4    5748   7    781728   6    8289533   69 function
     5     863   1    734088   6    9023621   75 type
     6     244   0    494712   4    9518333   79 dict of module
     7     863   1    461784   4    9980117   83 dict of type
     8    1319   2    415880   3   10395997   86 dict (no owner)
     9     150   0    219792   2   10615789   88 set
<282 more rows. Type e.g. '_.more' to view.>

Memory Usage After Creation Of Objects : 56120 bytes

Finding Out Difference Between Two Heap Status :
Whether Two Heap Status Are Disjoint : False
Total Objects : 1332
Total Size : 56368 Bytes
Number of Entries : 8

Entries :
 Index   Count      Size  Cumulative Size Object Name
     0    1329     37216            37216 int
     1       1     12728            49944 list
     2       1      6104            56048 numpy.ndarray
     3       0       248            56296 types.CodeType
     4       1        72            56368 builtins.weakref
     5       0         0            56368 ctypes.CFunctionType
     6       0         0            56368 ctypes._FuncPtr
     7       0         0            56368 dict of ctypes._FuncPtr

Explanation:

In the above snippet of code, we first took the heap status at the beginning. We then created some list and string objects and, once they existed, took a second heap status. Next, we called the diff() method on the second heap status object, passing it the first, to get the difference between the two snapshots of the heap. Finally, we looped through the stats object and printed the difference for each object type.

Example 5:

In the following example, we will look at some attributes like count, size, referents, referrers, and stat, along with some methods like dump() and load().

Code:

Output:

Few Important Properties/Methods of Heap Status Object(

Explanation:

The above snippet of code demonstrates the attributes count, size, referents, referrers, and stat, along with the dump() and load() methods.

Example 6:

In the following example, we will look at the attributes of the heap status object that let us group heap status entries by different criteria such as type, size, referrers, memory address, and more.

Code:

Output:

=========== Heap Status At Starting : ============
Partition of a set of 42072 objects. Total size = 5086505 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0   12893  31   1168594  23    1168594   23 str
     1    8414  20    582912  11    1751506   34 tuple
     2     610   1    515280  10    2266786   45 type
     3    2910   7    514056  10    2780842   55 types.CodeType
     4    5565  13    396264   8    3177106   62 bytes
     5    2669   6    362984   7    3540090   70 function
     6     610   1    297456   6    3837546   75 dict of type
     7     106   0    174272   3    4011818   79 dict of module
     8      98   0    169136   3    4180954   82 set
     9     377   1    134664   3    4315618   85 dict (no owner)
<147 more rows. Type e.g. '_.more' to view.>

============ Heap Status Grouped By Type : ==========
Partition of a set of 42072 objects. Total size = 5086505 bytes.
 Index   Count   %      Size   % Cumulative    % Type
     0   12893  31   1168594  23    1168594   23 str
     1    1766   4    736256  14    1904850   37 dict
     2    8414  20    582912  11    2487762   49 tuple
     3     610   1    515280  10    3003042   59 type
     4    2910   7    514056  10    3517098   69 types.CodeType
     5    5565  13    396264   8    3913362   77 bytes
     6    2669   6    362984   7    4276346   84 function
     7      98   0    169136   3    4445482   87 set
     8    1170   3     84240   2    4529722   89 types.WrapperDescriptorType
     9     941   2     67752   1    4597474   90 builtins.weakref
<96 more rows. Type e.g. '_.more' to view.>

==== Heap Status Grouped By Referrers of kind(class/dict of class) : ===
Partition of a set of 42072 objects. Total size = 5087009 bytes.
 Index   Count   %      Size   % Cumulative    % Referrers by Kind (class / dict of class)
     0   11255  27    824572  16     824572   16 types.CodeType
     1    5314  13    667128  13    1491700   29 function
     2    5040  12    533219  10    2024919   40 dict of type
     3    1972   5    410976   8    2435895   48 type
     4    5191  12    365199   7    2801094   55 tuple
     5    1480   4    247408   5    3048502   60 dict of module
     6     207   0    186655   4    3235157   64 dict of module, tuple
     7     718   2    181110   4    3416267   67 function, tuple
     8      75   0    153856   3    3570123   70 dict of _frozen_importlib_external.FileFinder
     9    2251   5    138361   3    3708484   73 set
<546 more rows. Type e.g. '_.more' to view.>

========== Heap Status Grouped By Module : ============
Partition of a set of 42072 objects. Total size = 5087526 bytes.
 Index   Count   %      Size   % Cumulative    % Module
     0   41966 100   5079894 100    5079894  100 ~module
     1       1   0        72   0    5079966  100 __main__
     2       1   0        72   0    5080038  100 _abc
     3       1   0        72   0    5080110  100 _ast
     4       1   0        72   0    5080182  100 _bootlocale
     5       1   0        72   0    5080254  100 _codecs
     6       1   0        72   0    5080326  100 _collections
     7       1   0        72   0    5080398  100 _ctypes
     8       1   0        72   0    5080470  100 _functools
     9       1   0        72   0    5080542  100 _heapq
<97 more rows. Type e.g. '_.more' to view.>

========== Heap Status Grouped By Individual Size : ==============
Partition of a set of 42072 objects. Total size = 5087894 bytes.
 Index   Count   %      Size   % Cumulative    % Individual Size
     0    2809   7    494752  10     494752   10 176
     1     401   1    426664   8     921416   18 1064
     2    2734   6    371824   7    1293240   25 136
     3    4836  11    348192   7    1641432   32 72
     4    2709   6    173376   3    1814808   36 64
     5    2919   7    163464   3    1978272   39 56
     6     614   1    142448   3    2120720   42 232
     7    2575   6    123600   2    2244320   44 48
     8     184   0    117760   2    2362080   46 640
     9      94   0    110544   2    2472624   49 1176
<625 more rows. Type e.g. '_.more' to view.>

========= Heap Status Grouped By Total Size : ============
Partition of a set of 42072 objects. Total size = 5087931 bytes.
 Index   Count   %      Size   % Cumulative    %

Explanation:

The above snippet of code demonstrates the grouping attributes bytype, byrcs, bymodule, bysize, byunity, byvia, byidset, and byid.

Example 7:

In the following example, we will look at two supporting methods available through the heap objects: iso() and idset().

Code:

Output:

============== ISO Method Examples ====================
Partition of a set of 1 object. Total size = 12029656 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100  12029656 100   12029656  100 list

Partition of a set of 1 object. Total size = 50 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100        50 100         50  100 str

Partition of a set of 1 object. Total size = 59 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100        59 100         59  100 str

Partition of a set of 1 object. Total size = 56 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100        56 100         56  100 list

Partition of a set of 1 object. Total size = 49 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100        49 100         49  100 str

============== IDSET Method Examples ===================
Partition of a set of 1500000 objects. Total size = 41999996 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0 1500000 100  41999996 100   41999996  100 int

 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       1 100        50 100         50  100 str

Partition of a set of 8 objects. Total size = 400 bytes.
 Index   Count   %      Size   % Cumulative    % Kind (class / dict of class)
     0       8 100       400 100        400  100 str

Explanation:

The above snippet of code demonstrates the iso() and idset() methods, which accept one or more objects as input and return a status describing the size of those objects. The output also shows how the results of the two methods differ.

Example 8:

In the following example, we will check the documentation of specific methods and attributes of the objects available through guppy.

Code:

Output:

============== Heap Documents ====================
Top level interface to Heapy. Available attributes:
Anything            Prod                Via                 iso
Clodo               Rcs                 doc                 load
Id                  Root                findex              monitor
Idset               Size                heap                pb
Module              Type                heapu               setref
Nothing             Unity               idset               test
Use eg: the_heap.doc.

Explanation:

In the above snippet of code, we accessed the documentation of the heap by reading the doc attribute of a heap object. It displays documentation for the whole heap object and lists all the methods and attributes available through it. We also accessed the documentation of an individual method or attribute by reading that name on the doc attribute, and we accessed the documentation of the heap status object in the same way.
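The documentation lookup described above can be sketched as follows. This is a minimal example rather than the original listing; it assumes guppy3 is installed and that attribute names such as heap appear in the list of available attributes shown in the output. The helper name heapy_docs is hypothetical, and the import guard is only there so that the sketch does not fail when guppy3 is missing.

```python
try:
    from guppy import hpy
except ImportError:
    hpy = None  # guppy3 is not installed in this environment


def heapy_docs(attribute=None):
    """Return Heapy's built-in documentation as text.

    With no argument, return the top-level overview. With an attribute
    name such as "heap", return the documentation for that attribute.
    Return None when guppy3 is not available.
    """
    if hpy is None:
        return None
    doc = hpy().doc  # overview of the top-level Heapy object
    if attribute is not None:
        # Reading a name on the doc object gives that attribute's docs.
        doc = getattr(doc, attribute)
    return str(doc)


overview = heapy_docs()
if overview is None:
    print("guppy3 is not installed.")
else:
    print(overview)
    print(heapy_docs("heap"))
```

In an interactive session the same information is usually read directly, for example by evaluating the doc attribute on the object returned by hpy().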