Dave's Postgres Blog
Postgres related musings and ramblings
Monday, 16 November 2020
macOS Big Sur upgrade breaking PostgreSQL installations
Tuesday, 25 August 2020
Testing pgAdmin
Like any software, pgAdmin has bugs. At the time of writing there are 119 new or in-progress issues in the bug tracker, which is pretty good for a piece of software with so many moving parts, and regular-as-clockwork monthly releases. Over 10% of these issues are currently assigned to the reporter to gather further information, plus there will undoubtedly be some duplicates and support issues/feature requests in that number that haven’t yet been picked up in our weekly backlog refinement meeting and either closed or moved to the feature or support trackers.
Manual Testing
We have one person, Fahar, who is permanently assigned as QA to the pgAdmin team at EDB, though he is also responsible for a number of other product releases such as the PostgreSQL installers for Windows and macOS (pgAdmin is the vast majority of his work however).
Fahar primarily does four things:
Works to verify that issues reported to users are reproducible, when they require complex setup or the developers have failed to reproduce them.
Verifies that bugs that have been fixed by the developers actually do fix the reported problem.
Smoke tests release builds; that is, ensures the packages can be installed and basic functionality works prior to public release.
Tests new features once development is believed to be complete.
Automation
All pgAdmin tests and builds are automated using Jenkins. The system will notify the development team of any failures via email and via a dedicated Slack channel (internal to EDB). There are also large screens mounted on the wall in development offices to show the status of the test and build runs in real time.
Automated Testing
We have three primary automated test suites for pgAdmin, which are essential in allowing us to release quickly and regularly, with minimal issues.
Python Tests
The Python tests (sometimes referred to as the API tests) are intended to test the Python application that implements the pgAdmin server. The test suite will instantiate the server and then make hundreds of calls to the server with different inputs to ensure itr responds as expected. A small number of these tests operate directly on the Python code, and do not run through the API (e.g. for testing server import/export).
One of the Python tests is the RE-SQL (Reverse Engineered SQL) test. This is a special case as it’s a single test which is actually comprised of around 460 sub-tests, designed to test the SQL generation in pgAdmin; each sub-test will create or modify an object, and then validate that the SQL generated by pgAdmin is correct.
The Python tests are run on every check in to the pgAdmin GIT repository. There are nearly 1200 tests (plus the 460 RE-SQL tests), which are run on Windows, macOS and Linux against all supported database servers (PostgreSQL 9.5 through 12, and EDB Postgres Advanced Server 9.5 through 12). This equates to nearly 100,000 Python tests being run on every check in. The exact number isn’t quite that because some tests only run on certain database server versions.
Jasmine Tests
The Jasmine tests are unit tests for the Javascript code used in the frontend user interface of pgAdmin, named after the framework that is used to run them. At the time of writing, there are 866 Javascript unit tests, each of which is run on Windows, macOS and Linux. That means that 2,598 Javascript unit tests are run on every check in.
Feature Tests
Feature tests are “full system” tests in which an instance of pgAdmin is started and loaded into a browser, and then inputs are automatically given to the browser to simulate end-to-end operation of the application, with the displayed output then being verified. The feature tests are run on a nightly basis as they take a long time to execute. They are run using Selenoid using Google Chrome as the browser (we plan to add Firefox in the future), and run against all supported database server versions.
There are only around 15 feature tests in the suite (which actually runs from the Python test framework), however each test may actually verify anything from 1 to 20 or more individual pieces of functionality.
Linting
Various tasks run as part of the automated tests to check the code formatting, including Javascript linting and Python PEP-8 validation.
Automated Builds
Test Builds
Test builds of all packages (Docker, tarballs, docs, RPMs, DEBs, installers etc) are made with every check in to the GIT repository. These are discarded after the build completes as their primary purpose is to verify that all the code builds without error on every platform, and that packages can be successfully generated.
Snapshot Builds
Snapshot builds are created on a nightly basis and made available for testing and use by anyone that is interested in contributing or needs a fix in a hurry.
QA Builds
QA builds are produced on the Monday before one of the monthly releases. They are uploaded to a server from where Fahar will download and smoke test them to ensure there are no obvious show-stopping issues. These builds will also be tested to ensure they are carrying the latest version of certain dependencies, such as the OpenSSL libraries.
Static Code Analysis
EDB maintains a SonarQube server for performing static code analysis of all the product code within the company, including the Open Source pgAdmin code. This allows us to monitor the state of a number of key metrics that can be detected through analysis:
Bugs
Security vulnerabilities
Security hotspots (code that should be reviewed as it’s a possible source of a vulnerability)
Code smells (code that looks badly structured or implemented)
Code duplications
The current state of the pgAdmin code can be seen below:
Note that the sharp rise on July 19th was the result of a SonarQube upgrade that added/changed a number of the analysis rules.
We’re in the process of working through the different issues, with focus currently on code smells (mostly overly complex functions that need to be refactored) and security hotspots (mostly regexps in the code). Each of the issues reported may or may not involve a code change - at the very least they will be reviewed and marked as “Won’t fix”.
Developer Testing
All the developers are required to test patches before they are submitted for inclusion in pgAdmin. This includes running all relevant existing tests (there’s no point in running Jasmine tests for a change that only touches Python code for example), and linting tests (e.g. the PEP-8 checks). They are also expected to include suitable tests (or updates to tests) to validate changes they’ve made.
Many of the developers also use static code analysis in their development environments, either the tools built directly into products like PyCharm, or personal SonarQube scans.
Statistics
There are various statistics we can look at as some form of code quality metric.
Team
The team size is interesting because it gives us an idea of how much each member of the team is responsible for on average.
There are currently 6 people working on pgAdmin full time. In addition, we have a number of people at EDB who are primarily focussed on Postgres Enterprise Manager, which as it is based on pgAdmin, occasionally means that that team will contribute to pgAdmin as well. We also have a number of community contributors who submit the occasional patch.
Counting only the full time developers, that’s 19.8 not-yet-fixed bugs per developer.
Users
The number of users is interesting because the bigger the user base, the more likely we are to get issues reported.
Unfortunately, we don’t know how many users we have because pgAdmin is freely available. We do know that Docker is currently reporting over 50 million pulls of the official pgAdmin container, and that’s almost certainly not where most people get their copy of pgAdmin from.
Counting only the Docker users, and assuming each one of them has pulled every one of the 34 releases since the first one uploaded there, that's 1.47 million users (yeah, OK, we definitely don’t have that many users, but still…). That works out at 0.000081 not-yet-fixed bugs per user, though arguably this number is also pretty meaningless. A better one might be the number of bugs reported over the lifetime of the codebase (which is 1856, including issues in new features that ended up being fixed prior to release), which gives us 0.001263 reported and verified bugs per user.
Code
Looking at the code, we see the following stats:
dpage@hal:~/git/pgadmin4$ perl ~/bin/cloc.pl .
3650 text files.
3148 unique files.
1115 files ignored.
https://github.com/AlDanial/cloc v 1.66 T=12.18 s (217.1 files/s, 27242.3 lines/s)
--------------------------------------------------------------------
Language files blank comment code
--------------------------------------------------------------------
Python 714 21204 30011 92424
JavaScript 285 10216 9984 75844
SQL 1384 5383 2889 38516
JSON 108 31 0 24752
SASS 34 995 16 5210
make 2 111 37 3717
CSS 39 376 66 2012
HTML 28 42 20 1548
XML 13 0 0 1208
C++ 9 342 247 1177
Bourne Shell 12 298 204 1096
Qt 3 0 0 945
DOS Batch 1 85 8 284
C/C++ Header 10 94 107 244
Qt Project 1 25 7 99
Objective C++ 1 2 11 14
Windows Resource File 1 0 0 1
--------------------------------------------------------------------
SUM: 2645 39204 43607 249091
--------------------------------------------------------------------
So, there are a shade under a quarter of a million lines of code in pgAdmin, which equates to one not-yet-fixed bug per 2,093 lines of code. By way of comparison, this Stackoverflow answer cites a book that in turn cites a number of studies into industry average bug rates, which are believed to be in the order of 1-25 bugs per 1000 lines of code.
Conclusion
pgAdmin has a relatively small number of developers, but a large amount of code and number of users. From the data we have, we can assume that with such a large user base, bugs are more likely to be reported than if there was a user base of tens or hundreds of users. Despite this, the number of issues per 1000 lines of code seems to be very low in comparison to industry studies.
Despite the huge amount of effort that goes into automated testing and builds, there will always be bugs. We rely on users to give us feedback when they find issues, and aim to fix everything as soon as possible, prioritising based on impact and likelihood of any given user running into each issue. We also try to balance time spent on bug fixes with new features and housekeeping tasks (aimed at keeping the code maintainable).
None of this would be possible without the automation in use for building and testing the code, or without the hard working developers (including Fahar) and users that report and help them fix issues, to whom I owe a huge debt of gratitude.
Thursday, 25 June 2020
system_stats extension for PostgreSQL
The system_stats extension offers a number of stored procedures that are useful when monitoring Postgres. They expose various system metrics to the database server, allowing you to extend your database monitoring to include information about CPU, memory, disk and network usage without having to add a separate mechanism to gain access to that information.
A number of functions are included:
pg_sys_os_info()
This interface allows the user to get operating system statistics.
postgres=# SELECT * FROM pg_sys_os_info();
-[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------
name | Darwin
version | Darwin Kernel Version 19.4.0: Wed Mar 4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64
host_name | hal.local
domain_name |
handle_count |
process_count | 531
thread_count |
architecture | x86_64
last_bootup_time |
os_up_since_seconds | 281466
pg_sys_cpu_info()
This interface allows the user to get CPU information.
postgres=# SELECT * FROM pg_sys_cpu_info();
-[ RECORD 1 ]------+---------------
vendor |
description |
model_name | MacBookPro15,1
processor_type |
logical_processor | 12
physical_processor | 6
no_of_cores | 12
architecture | x86_64
clock_speed_hz | 2900000000
cpu_type | 7
cpu_family | 260141638
byte_order | 1234
l1dcache_size | 32
l1icache_size | 32
l2cache_size | 256
l3cache_size | 12288
pg_sys_cpu_usage_info()
This interface allows the user to get CPU usage information. Values are a percentage of time spent by CPUs for all operations.
postgres=# SELECT * FROM pg_sys_cpu_usage_info();
-[ RECORD 1 ]-------------------+----------
usermode_normal_process_percent | 1.6806724
usermode_niced_process_percent | 0
kernelmode_process_percent | 0
idle_mode_percent | 98.31933
io_completion_percent |
servicing_irq_percent |
servicing_softirq_percent |
user_time_percent |
processor_time_percent |
privileged_time_percent |
interrupt_time_percent |
pg_sys_memory_info()
This interface allows the user to get memory usage information. All the values are in bytes.
postgres=# SELECT * FROM pg_sys_memory_info();
-[ RECORD 1 ]----+------------
total_memory | 34359738368
used_memory | 31668727808
free_memory | 2691010560
swap_total | 3221225472
swap_used | 2672033792
swap_free | 549191680
cache_total |
kernel_total |
kernel_paged |
kernel_non_paged |
total_page_file |
avail_page_file |
pg_sys_io_analysis_info()
This interface allows the user to get an I/O analysis of block devices.
postgres=# SELECT * FROM pg_sys_io_analysis_info();
-[ RECORD 1 ]-+-------------
device_name | disk0
total_reads | 9044867
total_writes | 6688535
read_bytes | 127989366784
write_bytes | 127197491200
read_time_ms | 3270007
write_time_ms | 3959392
-[ RECORD 2 ]-+-------------
device_name | disk2
total_reads | 2222011
total_writes | 3213728
read_bytes | 25915008512
write_bytes | 99796177408
read_time_ms | 22302918
write_time_ms | 65947889
pg_sys_disk_info()
This interface allows the user to get the disk information.
postgres=# SELECT * FROM pg_sys_disk_info();
-[ RECORD 1 ]----+---------------------
mount_point | /
file_system | apfs
drive_letter |
drive_type |
file_system_type | /dev/disk1s5
total_space | 2000796545024
used_space | 11124047872
free_space | 1328550260736
total_inodes | 19539028760
used_inodes | 487630
free_inodes | 19538541130
-[ RECORD 2 ]----+---------------------
mount_point | /System/Volumes/Data
file_system | apfs
drive_letter |
drive_type |
file_system_type | /dev/disk1s1
total_space | 2000796545024
used_space | 11124047872
free_space | 1328550260736
total_inodes | 19539028760
used_inodes | 487630
free_inodes | 19538541130
-[ RECORD 3 ]----+---------------------
mount_point | /private/var/vm
file_system | apfs
drive_letter |
drive_type |
file_system_type | /dev/disk1s4
total_space | 2000796545024
used_space | 11124047872
free_space | 1328550260736
total_inodes | 19539028760
used_inodes | 487630
free_inodes | 19538541130
-[ RECORD 4 ]----+---------------------
mount_point | /Volumes/Backup
file_system | hfs
drive_letter |
drive_type |
file_system_type | /dev/disk3
total_space | 2000796545024
used_space | 11124047872
free_space | 1328550260736
total_inodes | 19539028760
used_inodes | 487630
free_inodes | 19538541130
pg_sys_load_avg_info()
This interface allows the user to get the average load of the system over 1, 5, 10 and 15 minute intervals.
postgres=# SELECT * FROM pg_sys_load_avg_info();
-[ RECORD 1 ]------------+----------
load_avg_one_minute | 1.4116211
load_avg_five_minutes | 1.5273438
load_avg_ten_minutes |
load_avg_fifteen_minutes | 1.5668945
pg_sys_process_info()
This interface allows the user to get process information.
postgres=# SELECT * FROM pg_sys_process_info();
-[ RECORD 1 ]------+----
total_processes | 531
running_processes | 528
sleeping_processes | 0
stopped_processes | 0
zombie_processes | 2
pg_sys_network_info()
This interface allows the user to get network interface information.
postgres=# SELECT * FROM pg_sys_network_info();
-[ RECORD 1 ]---+-------------
interface_name | lo0
ip_address | 127.0.0.1
tx_bytes | 281804800
tx_packets | 882164
tx_errors | 0
tx_dropped | 0
rx_bytes | 281804800
rx_packets | 882164
rx_errors | 0
rx_dropped | 0
link_speed_mbps | 0
-[ RECORD 2 ]---+-------------
interface_name | en0
ip_address | 192.168.0.2
tx_bytes | 3770546176
tx_packets | 13311181
tx_errors | 1164
tx_dropped | 0
rx_bytes | 7829563392
rx_packets | 9070534
rx_errors | 0
rx_dropped | 0
link_speed_mbps | 0
-[ RECORD 3 ]---+-------------
interface_name | utun4
ip_address | 172.24.64.89
tx_bytes | 5830656
tx_packets | 27350
tx_errors | 0
tx_dropped | 0
rx_bytes | 75328512
rx_packets | 124340
rx_errors | 0
rx_dropped | 0
link_speed_mbps | 0
pg_sys_cpu_memory_by_process()
This interface allows the user to get the CPU and memory information for each process ID.
postgres=# SELECT * FROM pg_sys_cpu_memory_by_process();
-[ RECORD 1 ]---------+-----------------
pid | 62633
name | postgres
running_since_seconds |
cpu_usage | 4.48
memory_usage | 0.02
memory_bytes | 6848512
-[ RECORD 2 ]---------+-----------------
pid | 62632
name | psql
running_since_seconds |
cpu_usage | 0
memory_usage | 0.01
memory_bytes | 2453504
-[ RECORD 3 ]---------+-----------------
pid | 62594
name | Google Chrome He
running_since_seconds |
cpu_usage | 0
memory_usage | 0.09
memory_bytes | 31694848
...
...
GRANT monitor_system_stats TO nagios;
The extension is currently supported on Linux, macOS and Windows, though not all statistics are available or relevant on all platforms, so you may see some NULL values being returned.
Pull Requests for additional platform support, additional functions, or bug fixes are welcome, as are bug reports.
You can find the code and a source release on Github; please see the README for more information. Devrim Gündüz has also made RPMs available on yum.postgresql.org.
Wednesday, 10 July 2019
Reverse Proxying to pgAdmin
Because of the number of questions asked, I spent a little time over the last couple of days doing some testing and updating the documentation with some examples. Here's a blog-ified version of that work.
Nginx
docker pull dpage/pgadmin4 docker run -p 5050:80 \ -e "PGADMIN_DEFAULT_EMAIL=user@domain.com" \ -e "PGADMIN_DEFAULT_PASSWORD=SuperSecret" \ -d dpage/pgadmin4
server { listen 80; server_name _; location / { proxy_set_header Host $host; proxy_pass http://localhost:5050/; proxy_redirect off; } }
server { listen 80; server_name _; location /pgadmin4/ { proxy_set_header X-Script-Name /pgadmin4; proxy_set_header Host $host; proxy_pass http://localhost:5050/; proxy_redirect off; } }
server { listen 80; return 301 https://$host$request_uri; } server { listen 443; server_name _; ssl_certificate /etc/nginx/server.crt; ssl_certificate_key /etc/nginx/server.key; ssl on; ssl_session_cache builtin:1000 shared:SSL:10m; ssl_protocols TLSv1 TLSv1.1 TLSv1.2; ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4; ssl_prefer_server_ciphers on; location /pgadmin4/ { proxy_set_header X-Script-Name /pgadmin4; proxy_set_header X-Scheme $scheme; proxy_set_header Host $host; proxy_pass http://localhost:5050/; proxy_redirect off; } }
Traefik
defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.http.redirect] entryPoint = "https" [entryPoints.https] address = ":443" [entryPoints.https.tls] [docker] domain = "domain_name" watch = true
docker pull dpage/pgadmin4 docker run --name "pgadmin4" \ -e "PGADMIN_DEFAULT_EMAIL=user@domain.com" \ -e "PGADMIN_DEFAULT_PASSWORD=SuperSecret" \ -d dpage/pgadmin4
docker pull dpage/pgadmin4 docker run --name "pgadmin4" \ -e "PGADMIN_DEFAULT_EMAIL=user@domain.com" \ -e "PGADMIN_DEFAULT_PASSWORD=SuperSecret" \ -e "SCRIPT_NAME=/pgadmin4" \ -l "traefik.frontend.rule=PathPrefix:/pgadmin4" \ -d dpage/pgadmin4