Instance PGA memory leak detection (Linux/Solaris)

Instance PGA memory leak detection (Linux/Solaris)

Instance PGA memory leak detction.

The purpuse of this KB is to illustrate on Linux and Solaris how to track and diagnose PGA memory leaks. This KB will only focus on dedicated server connections (most common case).
When an Oracle client process is created via a connection to the database (either via sqlplus or listener), the spawned process contains several in memory structures. The connection spawned is an instantiation of the $ORACLE_HOME/bin/oracle binary, that access several dynamically linked libraries (shared objects).

Background

For each client process, the following memory structures are created:

Name
Description
stack
working stack
lib
Shared libraries
anon
PGA (non file descriptiors)
process
the oracle binary
shared
SGA size (shared memory segment)

Let's first create an oracle database connection via sqlplus:
  1. [oracle@ostest ~]$ sqlplus test/First1_Database2_@192.168.2.32:1521/91590229bb55fccee0532002a8c04128.sasidevsubnet.sasidevdb.oraclevcn.com
    SQL*Plus: Release 12.2.0.1.0 Production on Sat Aug 31 13:29:58 2019
    Copyright (c) 1982, 2016, Oracle.  All rights reserved.
    Last Successful login time: Sat Aug 31 2019 13:03:59 +00:00
    Connected to:
    Oracle Database 12c EE High Perf Release 12.2.0.1.0 - 64bit Production
    SQL>

Using a script called procmm.sh, the totals of each segment type can calculated and displayed. This scripts uses pmap to get details about the process.
For oracle database connections,  the memory structure breakdown looks like this:
  1. [root@ostest ~]# ./procmm.sh 58238
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder (
    http://tech.e2sn.com/ )
    -- All numbers are shown in megabytes
    -- ps info: PID=58238 VSZ=2281564 RSS=75636 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE VIRTUAL RSS
    ------             --------------------- ------------ ------------
     58238        sga                     1796      0
     58238        stack                  0            0
     58238        oracle                338        63
     58238        anon                  5            3
     58238        lib                       88         8
    ------             --------------------- ------------ ------------
     58238        Total  (MB)         2228    75

Important things to note from this output:
  1. the sga segment type is the shared memory segment that the connection process attaches too. It can be seen that this matches exactly the SGA size from below - 1.7G
  2. the anon segment is actually the PGA for the process. An example of anon is memory not connected to a file. For Oracle DB connections, when the process needs an in memory sort or creates PL/SQL memory arrays, the PGA size will grow as needed.
  3. The lib segment is the shared object or dynamic libraries the process needs.
  4. The VSZ (VIRTUAL) value is the total size of the all memory structures  (including shared memory segments).
  5. The RSS (Resident Set Size) is the process loaded into actual memory (or swap).
  6. The oracle segment type is the  "oracle" binary executable. This can be verified by
  1. [oracle@ostest ~]$ ls -lh $ORACLE_HOME/bin/oracle
    -rwsr-s--x 1 oracle asmadmin 391M Aug 30 16:07 /u01/app/oracle/product/12.2.0.1/dbhome_1/bin/oracle
    [oracle@ostest ~]$

Is is noted that the reported total size of the process is 2228M or 2.2G. This is not the true process memory size, as it counts the shared memory it is attached too, as it's own memory. This leads to "double counting" for all the oracle processes - a common error when assessing oracle memory on linux and solaris.

Therefore, when calculating the actual process memory size, the shared memory segment size must not be included in the calculation.

Another handy tool is ps_mem.py, which directly queries the /proc/$pic/smaps memory structures. Using the example above, the actual memory consumed is:

  1. [root@ostest ~]# python ps_mem.py -p 58238
     Private  +   Shared  =  RAM used    Program

      5.2 MiB +   7.3 MiB =  12.5 MiB       oracle_58238_os
    ------------------------------------------------------
                                         12.5 MiB
    =================================
    [root@ostest ~]#

At connection creation, the true memory consumed by the oracle database connection process is 12.5M for v12.2.0.1 connection (Linux). The shared mentioned above is not SGA, rather internal inter process communication (ipc).

The typical way to size a databases' total memory requirement is to calculate:
 ->     size the SGA  + (total number of connections expected x each connections pga requirements)

An examples would be:
   50G + 1000 connections x (15M per connection PGA)
= 50G + 15G
= 65G memory required


  1. The actual (true) memory consumed by a Linux or Solaris database connection cannot be calculated using top or ps.

Setup

We need to know the size of the SGA before we start. This can be checked via the init.ora parameters or via sqlplus:
  1. [oracle@ostest ~]$ sqlplus  / as sysdba

    SQL*Plus: Release 12.2.0.1.0 Production on Sat Aug 31 13:18:33 2019
    Copyright (c) 1982, 2016, Oracle.  All rights reserved.
    Connected to:
    Oracle Database 12c EE High Perf Release 12.2.0.1.0 - 64bit Production

    SQL> alter session set container = OSTEST_PDB1;
    Session altered.
    SQL> show sga
    Total System Global Area 1879048192 bytes
    Fixed Size            8794072 bytes
    Variable Size          704643112 bytes
    Database Buffers     1140850688 bytes
    Redo Buffers           24760320 bytes
    SQL>

We can also verify the shared memory segment for the instance, that all client connections / sessions attached to:
  1. [root@ostest ~]# ipcs -a

    ------ Shared Memory Segments --------
    key                    shmid            owner      perms    bytes      nattch     status
    0x89cce19c        65536          grid        600        24576             45
    0x8ce1fab0        98305           grid        600        8192                21
    0x00000000     1048578       oracle     600        10485760       77
    0x00000000     1081347       oracle     600        1845493760   77
    0x00000000     1114116          oracle     600        25165824       77
    0xb3a2fb0c        1146885       oracle     600        2097152         77

From this, we can see the shared memory segment (SYSV) with key 0x00000000 is size 1845493760/1024/1024/1024 = 1.72G.
This figure matches the SGA figures seen via sqlplus.

In database version 12c, Oracle introduced a new parameter called "pga_aggregate_limit".
This parameter now creates a hard limit for the total of all client PGAs. The parameter "PGA_AGGREGATE_TARGET" only provides a soft limit.

The rules for pga_aggregate_limit values:
  1. If MEMORY_TARGET is set, then PGA_AGGREGATE_LIMIT defaults to the MEMORY_MAX_TARGET value.

    If MEMORY_TARGET is not set, then PGA_AGGREGATE_LIMIT defaults to 200% of PGA_AGGREGATE_TARGET.

    If MEMORY_TARGET is not set, and PGA_AGGREGATE_TARGET is explicitly set to 0, then the value of PGA_AGGREGATE_LIMIT is set to 90% of the physical memory size minus the total SGA size.

  1. If a value of 0 is specified, it means there is no limit to the aggregate PGA memory consumed by the instance
The Warning above is crucial - if set to 0, the PGA memory will grow till server memory is exhausted.

Testing PGA growth

To verify there are no pga leaks, we need to reconcile the databases' view of memory consumed by a connection process with the OS version of the processes memory.

For a new database connection, this is the memory consumed:
  1. SQL> @pga_memory
                  Process    PGA         PGA
    Session                       ID         Curr(M)         Max(M)
    ------------------------- ---------- --------------- ---------------
      196 - TESTostest 17722        1.98        3.80
    SQL>

As we can see, the current PGA usage is 1.98M and the high water mark usage (max) is 3.8M.

This can be reconciled with the OS view:
  1. [root@ostest ~]# ./procmm.sh 58238
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder (
    http://tech.e2sn.com/ )
    -- All numbers are shown in megabytes
    -- ps info: PID=58238 VSZ=2281564 RSS=75636 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE VIRTUAL RSS
    ------             --------------------- ------------ ------------
     58238        sga                     1796      0
     58238        stack                  0            0
     58238        oracle                338        63
     58238        anon                  5            3
     58238        lib                       88         8
    ------             --------------------- ------------ ------------
     58238        Total  (MB)         2228    75

As can be seen above, the anon (which corresponds to PGA) is 3M big.

A procedure containing a PL/SQL block is used to create a large in memory array. The array in this case has 6 million elements, corresponding to PGA requirements of 1.5G.
  1. SQL> @pga_memory
    SQL> execute p1;

While this memory array inside the process is being created and extended (PGA), we monitor the growth via another session:
  1. SQL> @pga_memory
                 Process    PGA        PGA
    SSESSION         ID     Curr(M)     Max(M)
    -------------------- ---------- --------------- ---------------
       43 - TESTostest      67220       37.61       37.61

    SQL> @pga_memory
                 Process    PGA        PGA
    SSESSION         ID     Curr(M)     Max(M)
    -------------------- ---------- --------------- ---------------
       43 - TESTostest      67220       61.55       61.55

    SQL> @pga_memory
                                         Process    PGA        PGA
    SSESSION                   ID             Curr(M)     Max(M)
    --------------------              ----------     --------------- ---------------
       43 - TESTostest      67220         359.61      359.61

    SQL> @pga_memory

As can be seen above, the current PGA and Max PGA are growing.
At the same time, the anon memory (PGA) is growing.

  1. [root@ostest ~]# ./procmm.sh 16883
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder ( http://tech.e2sn.com/ )
    -- All numbers are shown in kilobytes
    -- ps info: PID=16883 VSZ=2460496 RSS=261068 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE      VIRTUAL          RSS
    ------ --------------------- ------------ ------------
     16883                   sga      1839104            0
     16883                 stack          488          296
     16883                oracle       346512        78548
     16883                   aio           12           12
     16883                  anon       192224       173948
     16883                   lib        90288         8648
    ------ --------------------- ------------ ------------
     16883        Total Real(kB)       629524       261452

    [root@ostest ~]# ./procmm.sh 16883
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder ( http://tech.e2sn.com/ )
    -- All numbers are shown in kilobytes
    -- ps info: PID=16883 VSZ=2476880 RSS=273104 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE      VIRTUAL          RSS
    ------ --------------------- ------------ ------------
     16883                   sga      1839104            0
     16883                 stack          488          296
     16883                oracle       346512        78548
     16883                   aio           12           12
     16883                  anon       200480       185856
     16883                   lib        90288         8648
    ------ --------------------- ------------ ------------
     16883        Total Real(kB)       637780       273360

    [root@ostest ~]# ./procmm.sh 16883
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder ( http://tech.e2sn.com/ )
    -- All numbers are shown in kilobytes
    -- ps info: PID=16883 VSZ=2476880 RSS=285520 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE      VIRTUAL          RSS
    ------ --------------------- ------------ ------------
     16883                   sga      1839104            0
     16883                 stack          488          296
     16883                oracle       346512        78548
     16883                   aio           12           12
     16883                  anon       216864       198592
     16883                   lib        90288         8648
    ------ --------------------- ------------ ------------
     16883        Total Real(kB)       654164       286096

    [root@ostest ~]# ./procmm.sh 16883

Eventually, the server runs out of memory:
  1. SQL>@pga_leak
    SQL> execute p1;
    BEGIN p1; END;

    *
    ERROR at line 1:
    ORA-27103: internal error
    Additional information: 11818
    Additional information: 9
    ORA-27102: out of memory
    Linux-x86_64 Error: 12: Cannot allocate memory
    Additional information: 10529
    Additional information: 65536
    ORA-06512: at "TEST.P1", line 21
    ORA-06512: at line 1

At this point in time, the DB view of the process is:
  1. SQL>@pga_memory
  2.               Process    PGA         PGA
    Session           ID         Curr(M)         Max(M)
    ------------------------- ---------- --------------- ---------------
      196 - TESTostest           17722        1,548.61        1,552.73

    SQL>

This show that the max requested memory for PGA was 1.5G.
The OS view at this stage was:
  1. [root@ostest ~]# ./procmm.sh 17722
    -- procmm.sh: Process Memory Matrix v1.02 by Tanel Poder (
    http://tech.e2sn.com/ )
    -- All numbers are shown in megabytes
    -- ps info: PID=17722 VSZ=3884204 RSS=1611416 ARGS=oracleOSTEST

       PID          SEGMENT_TYPE      VIRTUAL          RSS
    ------ --------------------- ------------ ------------
     17722                   sga         1796            0
     17722                 stack            0            0
     17722                oracle          351           20
     17722                   aio            0            0
     17722                  anon         1556        
    1550
     17722                   lib           88            3
    ------ --------------------- ------------ ------------
     17722        Total Real(MB)         1997         1574
  2. [root@ostest ~]# python ps_mem.py -p 17722
    Private  +   Shared  =  RAM used    Program

      1.5 GiB +  13.8 MiB =   1.5 GiB    oracle_17722_os
    ---------------------------------
                             
    1.5 GiB
    =================================

As can be seen, the the figures above in green show the PGA memory allocated to the process as 1.5G.
After the server runs out of memory, the PL/SQL block is destroyed and the PGA is automatically reduced in size (pmon does the clean up):
  1. SQL> @pga_memory
                  Process    PGA         PGA
    Session           ID         Curr(M)         Max(M)
    ------------------------- ---------- --------------- ---------------
      196 - TESTostest           17722          
    39.30        1,552.73

    SQL> /
    [root@ostest ~]# python ps_mem.py -p 17722
     Private  +   Shared  =  RAM used    Program

     26.7 MiB +  12.7 MiB =  39.3 MiB    oracle_17722_os
    ---------------------------------
                             39.3 MiB


So it can be seen from the analysis above, that the process memory consumed reconciles from the database and OS points of view.

If we set a limit so the total PGA is restricted, let's see what happens:
  1. SQL> alter system set pga_aggregate_limit=1194M;
    System altered.
    SQL>

And let's run the procedure again:
  1. SQL>@pga_leak
    SQL> execute p1;
    BEGIN p1; END;*
    ERROR at line 1:
    ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
    SQL>
The memory limit set has been reached, and the procedure aborts.
  1. PGA_AGGREGATE_LIMIT should always be set so server memory is not exhausted.
 

Summary

Memory allocation on Linux and Solaris for database connections is traditionally difficult to investigate and diagnose faults or issues. Several utilities are available to allow details of process memory to be detailed and analyzed. The tool "pmap" on Linux and Solaris provides a breakdown of the memory structures, allowing reconcilliation with the databases' view of the process memory.