Thursday, 25 June 2020

system_stats extension for PostgreSQL

One of my colleagues at EDB spent some time working on a new extension for PostgreSQL and EDB Postgres Advanced Server as part of a proof of concept that I asked him to look into. I'm pleased to say that we've decided to make that work Open Source, releasing it under the PostgreSQL licence.

The system_stats extension offers a number of stored procedures that are useful when monitoring Postgres. They expose various system metrics to the database server, allowing you to extend your database monitoring to include information about CPU, memory, disk and network usage without having to add a separate mechanism to gain access to that information.

A number of functions are included:

pg_sys_os_info()
This interface allows the user to get operating system statistics.

postgres=# SELECT * FROM pg_sys_os_info();
-[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------
name                | Darwin
version             | Darwin Kernel Version 19.4.0: Wed Mar  4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64
host_name           | hal.local
domain_name         | 
handle_count        | 
process_count       | 531
thread_count        | 
architecture        | x86_64
last_bootup_time    | 
os_up_since_seconds | 281466

pg_sys_cpu_info()
This interface allows the user to get CPU information.

postgres=# SELECT * FROM pg_sys_cpu_info();
-[ RECORD 1 ]------+---------------
vendor             | 
description        | 
model_name         | MacBookPro15,1
processor_type     | 
logical_processor  | 12
physical_processor | 6
no_of_cores        | 12
architecture       | x86_64
clock_speed_hz     | 2900000000
cpu_type           | 7
cpu_family         | 260141638
byte_order         | 1234
l1dcache_size      | 32
l1icache_size      | 32
l2cache_size       | 256
l3cache_size       | 12288

pg_sys_cpu_usage_info()
This interface allows the user to get CPU usage information. Values are a percentage of time spent by CPUs for all operations.

postgres=# SELECT * FROM pg_sys_cpu_usage_info();
-[ RECORD 1 ]-------------------+----------
usermode_normal_process_percent | 1.6806724
usermode_niced_process_percent  | 0
kernelmode_process_percent      | 0
idle_mode_percent               | 98.31933
io_completion_percent           | 
servicing_irq_percent           | 
servicing_softirq_percent       | 
user_time_percent               | 
processor_time_percent          | 
privileged_time_percent         | 
interrupt_time_percent          | 

pg_sys_memory_info()
This interface allows the user to get memory usage information. All the values are in bytes.

postgres=# SELECT * FROM pg_sys_memory_info();
-[ RECORD 1 ]----+------------
total_memory     | 34359738368
used_memory      | 31668727808
free_memory      | 2691010560
swap_total       | 3221225472
swap_used        | 2672033792
swap_free        | 549191680
cache_total      | 
kernel_total     | 
kernel_paged     | 
kernel_non_paged | 
total_page_file  | 
avail_page_file  |

pg_sys_io_analysis_info()
This interface allows the user to get an I/O analysis of block devices.

postgres=# SELECT * FROM pg_sys_io_analysis_info();
-[ RECORD 1 ]-+-------------
device_name   | disk0
total_reads   | 9044867
total_writes  | 6688535
read_bytes    | 127989366784
write_bytes   | 127197491200
read_time_ms  | 3270007
write_time_ms | 3959392
-[ RECORD 2 ]-+-------------
device_name   | disk2
total_reads   | 2222011
total_writes  | 3213728
read_bytes    | 25915008512
write_bytes   | 99796177408
read_time_ms  | 22302918
write_time_ms | 65947889

pg_sys_disk_info()
This interface allows the user to get the disk information.

postgres=# SELECT * FROM pg_sys_disk_info();
-[ RECORD 1 ]----+---------------------
mount_point      | /
file_system      | apfs
drive_letter     | 
drive_type       | 
file_system_type | /dev/disk1s5
total_space      | 2000796545024
used_space       | 11124047872
free_space       | 1328550260736
total_inodes     | 19539028760
used_inodes      | 487630
free_inodes      | 19538541130
-[ RECORD 2 ]----+---------------------
mount_point      | /System/Volumes/Data
file_system      | apfs
drive_letter     | 
drive_type       | 
file_system_type | /dev/disk1s1
total_space      | 2000796545024
used_space       | 11124047872
free_space       | 1328550260736
total_inodes     | 19539028760
used_inodes      | 487630
free_inodes      | 19538541130
-[ RECORD 3 ]----+---------------------
mount_point      | /private/var/vm
file_system      | apfs
drive_letter     | 
drive_type       | 
file_system_type | /dev/disk1s4
total_space      | 2000796545024
used_space       | 11124047872
free_space       | 1328550260736
total_inodes     | 19539028760
used_inodes      | 487630
free_inodes      | 19538541130
-[ RECORD 4 ]----+---------------------
mount_point      | /Volumes/Backup
file_system      | hfs
drive_letter     | 
drive_type       | 
file_system_type | /dev/disk3
total_space      | 2000796545024
used_space       | 11124047872
free_space       | 1328550260736
total_inodes     | 19539028760
used_inodes      | 487630
free_inodes      | 19538541130

pg_sys_load_avg_info()
This interface allows the user to get the average load of the system over 1, 5, 10 and 15 minute intervals.

postgres=# SELECT * FROM pg_sys_load_avg_info();
-[ RECORD 1 ]------------+----------
load_avg_one_minute      | 1.4116211
load_avg_five_minutes    | 1.5273438
load_avg_ten_minutes     | 
load_avg_fifteen_minutes | 1.5668945

pg_sys_process_info()
This interface allows the user to get process information.

postgres=# SELECT * FROM pg_sys_process_info();
-[ RECORD 1 ]------+----
total_processes    | 531
running_processes  | 528
sleeping_processes | 0
stopped_processes  | 0
zombie_processes   | 2

pg_sys_network_info()
This interface allows the user to get network interface information.

postgres=# SELECT * FROM pg_sys_network_info();
-[ RECORD 1 ]---+-------------
interface_name  | lo0
ip_address      | 127.0.0.1
tx_bytes        | 281804800
tx_packets      | 882164
tx_errors       | 0
tx_dropped      | 0
rx_bytes        | 281804800
rx_packets      | 882164
rx_errors       | 0
rx_dropped      | 0
link_speed_mbps | 0
-[ RECORD 2 ]---+-------------
interface_name  | en0
ip_address      | 192.168.0.2
tx_bytes        | 3770546176
tx_packets      | 13311181
tx_errors       | 1164
tx_dropped      | 0
rx_bytes        | 7829563392
rx_packets      | 9070534
rx_errors       | 0
rx_dropped      | 0
link_speed_mbps | 0
-[ RECORD 3 ]---+-------------
interface_name  | utun4
ip_address      | 172.24.64.89
tx_bytes        | 5830656
tx_packets      | 27350
tx_errors       | 0
tx_dropped      | 0
rx_bytes        | 75328512
rx_packets      | 124340
rx_errors       | 0
rx_dropped      | 0
link_speed_mbps | 0

pg_sys_cpu_memory_by_process()
This interface allows the user to get the CPU and memory information for each process ID.

postgres=# SELECT * FROM pg_sys_cpu_memory_by_process();
-[ RECORD 1 ]---------+-----------------
pid                   | 62633
name                  | postgres
running_since_seconds | 
cpu_usage             | 4.48
memory_usage          | 0.02
memory_bytes          | 6848512
-[ RECORD 2 ]---------+-----------------
pid                   | 62632
name                  | psql
running_since_seconds | 
cpu_usage             | 0
memory_usage          | 0.01
memory_bytes          | 2453504
-[ RECORD 3 ]---------+-----------------
pid                   | 62594
name                  | Google Chrome He
running_since_seconds | 
cpu_usage             | 0
memory_usage          | 0.09
memory_bytes          | 31694848
...
...

All functions are restricted for use by superusers and members of the monitor_system_stats role due to the potentially sensitive nature of the information they make available. To grant a user permission to use the extension, grant them that role. For example:

GRANT monitor_system_stats TO nagios;

The extension is currently supported on Linux, macOS and Windows, though not all statistics are available or relevant on all platforms, so you may see some NULL values being returned.

Pull Requests for additional platform support, additional functions, or bug fixes are welcome, as are bug reports.

You can find the code and a source release on Github; please see the README for more information. Devrim Gündüz has also made RPMs available on yum.postgresql.org.