PostgreSQL custom SQL collations

I had used this feature before and forgotten about it, but it’s really cool.

In Postgres you can use expressions in order-by clauses.  This is incredibly powerful because it lets you define your own custom collation, or sort sequence, for a query.

For example, I have a simple table of people and tshirt sizes:


tshirts=# \d tshirts
Table "public.tshirts"
   Column   |     Type     |                      Modifiers
------------+--------------+------------------------------------------------------
 id         | integer      | not null default nextval('tshirts_id_seq'::regclass)
 name       | text         | not null
 shirt_size | character(2) |

Indexes:
 "tshirts_pkey" PRIMARY KEY, btree (id)

Check constraints:
 "tshirts_shirt_size_check" CHECK (shirt_size = ANY (ARRAY['S'::bpchar, 'M'::bpchar, 'L'::bpchar, 'XL'::bpchar, '2X'::bpchar, '3X'::bpchar, '4X'::bpchar, '5X'::bpchar]))

Here is the query most people would use to get tshirt counts to populate a tshirt order form:


select shirt_size
     , count(*)
  from tshirts
 group by shirt_size
 order by shirt_size
;

It returns this:


 shirt_size | count
------------+-------
 2X         |    13
 3X         |     7
 4X         |     1
 5X         |     1
 L          |    38
 M          |    45
 S          |     5
 XL         |    28

But I’d rather have it in a more natural order like below:


 shirt_size | count
------------+-------
 S          |     5
 M          |    45
 L          |    38
 XL         |    28
 2X         |    13
 3X         |     7
 4X         |     1
 5X         |     1

Here’s one way to accomplish that easily:


select shirt_size
     , count(*)
  from tshirts
 group by shirt_size
 order by case shirt_size
            when 'S' then 1
            when 'M' then 2
            when 'L' then 3
            when 'XL' then 4
            when '2X' then 5
            when '3X' then 6
            when '4X' then 7
            when '5X' then 8
            else 9
          end   asc
;

I threw in the “else” clause to capture anything I may have missed, or any bad data that made it into the table somehow.

And it works exactly like I want.

This is a very simple example, but the technique can be very useful in some cases.

Here’s a more real-world example.  I use this SQL to show me postgres threads, putting the active ones at the top, idle-in-transaction next, then idle threads at the end, all sorted oldest-first within each group:


select a.pid as "PID"
     , a.datname as "DB"
     , a.usename as "User"
     , case when a.client_addr is null
         then 'local'
         else host(a.client_addr)
       end as "IP"
     , date_trunc('seconds',a.query_start) as "Start"
     , case waiting when false then ' ' when true then 'Y' end as "W"
     , case state when 'idle in transaction' then 'idle-trn' else state end as "State"
     , case state
         when 'active' then ltrim(regexp_replace(a.query,E'[\n\f\t\r ]+',' ','g'))
         when 'idle in transaction' then ltrim(regexp_replace(a.query,E'[\n\f\t\r ]+',' ','g'))
         else ''
       end as "Active Query"
  from pg_stat_activity a
 where a.pid != pg_backend_pid()
 order by state != 'idle' desc
        , state != 'idle in transaction' desc
        , query_start
 limit 20
;

And here’s what it gives me:


 PID  |   DB    |  User   |  IP   |         Start          | W |  State   |                               Active Query
------+---------+---------+-------+------------------------+---+----------+--------------------------------------------------------------------------
 4745 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | active   | delete from pgbench_history where mtime < now() - '3 minutes'::interval;
 4772 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | active   | vacuum;
 4778 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | active   | select pg_sleep(9);
 4844 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 | Y | active   | UPDATE deadlock_test_table SET data='goodbye' WHERE id=1;
 4856 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 | Y | active   | select count(*) from pglocktest_user;
 4863 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 |   | active   | select pg_sleep(4);
 4858 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 | Y | active   | truncate pgbench_history
 4903 | pgbench | pgbench | local | 2017-05-23 16:03:34-05 | Y | active   | vacuum;
 4914 | pgbench | pgbench | local | 2017-05-23 16:03:34-05 |   | active   | select pg_sleep(5);
 4846 | pgbench | pgbench | local | 2017-05-23 16:03:35-05 | Y | active   | UPDATE deadlock_test_table SET data='goodbye' WHERE id=2;
 4784 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | idle-trn | BEGIN;
 4845 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | idle-trn | LOCK TABLE pglocktest_user;
 4888 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 |   | idle-trn | BEGIN;
 4924 | pgbench | pgbench | local | 2017-05-23 16:03:35-05 |   | idle-trn | BEGIN;
 4785 | pgbench | pgbench | local | 2017-05-23 16:03:32-05 |   | idle     |
 4889 | pgbench | pgbench | local | 2017-05-23 16:03:33-05 |   | idle     |
(16 rows)
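
The trick in the order-by is that you can sort directly on a boolean expression: false sorts before true, so “desc” floats the rows where the condition holds to the top.  Here is a tiny standalone sketch of just that mechanic, using made-up inline values:


-- booleans sort false before true, so "desc" puts matching rows first
select s as state
  from (values ('idle'), ('active'), ('idle in transaction')) v(s)
 order by s != 'idle' desc
        , s != 'idle in transaction' desc
;

It returns active first, idle-in-transaction second, and idle last, just like the full query above.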

I hope you have fun with it and find it useful!

Selecting Non-Existent Columns

Since we run a bunch of postgres instances on several different versions, it has become somewhat of a chore to update my monitoring to work with each new version of PostgreSQL that gets released.

As postgres evolves, the system views change to accommodate new features or provide more insight into its operations, which usually means column changes.  I had to add conditional logic to my monitoring to run the correct version of a query for the correct version of postgres.

Well… I used to have to do that.

A few weeks ago I stumbled across a post (I wish I had bookmarked it, because I can’t find it now) about how to write a SQL statement that works even when the tables it runs against have columns that may or may not exist.

It’s something developers use in code every day, but I had never thought about using it with SQL… SCOPE RULES!

Yup, scope rules exist in SQL too!  If a column isn’t found in a sub-query, Postgres looks in the outer query for a matching column.
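
Here is a tiny standalone illustration; the table “t” and the column “color” are made-up names just for the demo.  Since “t” has no “color” column, the reference inside the sub-query falls through to the outer query’s “dummy.color”:


create temp table t (id int);
insert into t values (1), (2);

-- "color" isn't in t, so it resolves to dummy.color in the outer query;
-- 'blue' <> 'red', so the sub-query counts zero rows
select (select count(*) from t where color = 'red') as red_count
  from (select 'blue'::text as color) as dummy;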

For example, as part of my monitoring I want to monitor the number of threads waiting on locks.  Starting with PostgreSQL v8.2, the boolean “waiting” column in the “pg_stat_activity” view could be used to easily query how many threads were waiting on a lock.  But in v9.6 that column was replaced with two columns, “wait_event” and “wait_event_type”.

Using scope rules I can easily accommodate the change in SQL without the need to add condition coding or interrogate the postgres version or any of that nonsense.

Here’s an example:

-- lock wait count on PG v8.2 thru v9.5
select (select count(*)
          from pg_stat_activity
         where waiting = true)

-- add lock wait count on PG v9.6 and up
    +  (select count(*)
          from pg_stat_activity
         where wait_event_type is not null)

    as wait_count

-- fallback values: a column missing above is inherited from here
from (select false as waiting
           , null  as wait_event_type
     ) as dummy
;

Now it doesn’t matter what version of postgres I query; I get the correct result either way.

In the example above, when a column doesn’t exist in the “pg_stat_activity” view, the column and its value are inherited from the outer “dummy” table.

You need to be careful about choosing what values you inherit so they make sense for your query, but in my case it works perfectly.

Nested Exceptions in PL/pgSQL

Our Zabbix monitoring system uses PostgreSQL as the underlying database, and we decided to implement partitioning since we’re expecting it to grow fairly large.  We’re also already seeing the Zabbix housekeeper’s CPU usage run fairly high.

So my colleague found this article about auto partitioning for Zabbix. I tested it out on our test system and everything appeared to work perfectly.

Then we installed it in production. Everything looked fine until I checked the DB logs the next day and discovered these errors from midnight:

ERROR:  relation "history_p2017_05_17" already exists
CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS partitions.history_p2017_05_17 (CHECK ((clock >= '1494979200' AND clock < '1495065600'))) INHERITS (history)"
PL/pgSQL function trg_partition() line 39 at EXECUTE
insert into history (itemid,clock,ns,value) values (36958,1494979198,640977203,0.000000);

I immediately realized what was happening.  Our production Zabbix was much busier than our test Zabbix, and we were seeing race conditions when it was time to create new partitions.  Two threads were executing the trigger function to create the new partition table at the same time; one succeeded and the other failed.  The failure resulted in the insert being discarded… not a disaster for monitoring, but not good either.

I decided to fix the procedure, and here’s what I did:

Original Function Pseudo-Code:

BEGIN
  INSERT INTO table_partN values(x,x,x);
  RETURN null;
EXCEPTION
  WHEN undefined_table THEN
    CREATE TABLE table_partN ... INHERITS(table);
    CREATE INDEX table_partN_1 on table_partN ...;
    INSERT INTO table_partN values(x,x,x); 
    RETURN null;
END;

What I needed to do was trap the error when two or more threads all tried to create the table at the same time and only one succeeded.  I needed a nested exception trap.

New Function Pseudo-Code:

BEGIN
  BEGIN
    INSERT INTO table_partN values(x,x,x);
    RETURN null;
  EXCEPTION
    WHEN undefined_table THEN
      CREATE TABLE table_partN ... INHERITS(table);
      CREATE INDEX table_partN_1 on table_partN ...;
      INSERT INTO table_partN values(x,x,x); 
      RETURN null;
  END;
EXCEPTION
  WHEN duplicate_table THEN
    INSERT INTO table_partN values(x,x,x);
    RETURN null;
END;

Now if the race condition occurs, the winner will create the table and index and perform the insert, while the losers will get a “duplicate_table” exception and retry the insert.

Any other SQL errors encountered will still propagate back out to the caller.
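
For reference, here is a minimal runnable sketch of the pattern as a PL/pgSQL trigger function.  The function name and the daily partition scheme are illustrative placeholders rather than the actual Zabbix functions, and the CHECK constraint and index creation are omitted for brevity:


-- illustrative sketch; not the actual Zabbix partitioning function
CREATE OR REPLACE FUNCTION trg_partition_sketch() RETURNS trigger AS $$
DECLARE
    -- hypothetical daily partition name, e.g. history_p2017_05_17
    part text := TG_TABLE_NAME || '_p' || to_char(now(), 'YYYY_MM_DD');
BEGIN
    BEGIN
        -- normal path: the partition already exists
        EXECUTE format('INSERT INTO %I SELECT ($1).*', part) USING NEW;
        RETURN NULL;
    EXCEPTION WHEN undefined_table THEN
        -- first row of the period: create the partition, then insert
        EXECUTE format('CREATE TABLE %I () INHERITS (%I)', part, TG_TABLE_NAME);
        EXECUTE format('INSERT INTO %I SELECT ($1).*', part) USING NEW;
        RETURN NULL;
    END;
EXCEPTION WHEN duplicate_table THEN
    -- lost the race: another backend created the partition first,
    -- so just retry the insert
    EXECUTE format('INSERT INTO %I SELECT ($1).*', part) USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

You would attach it to the parent table with a BEFORE INSERT … FOR EACH ROW trigger; returning NULL keeps the redirected row out of the parent table itself.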

I’ve put together my own version of these functions and wrapped them in Bash scripts to make things easier.  You can find them on my github site.

Postgres crontab on active/passive clusters

My first blog post…

If you’ve had to deal with PostgreSQL on an active/passive cluster using Pacemaker or some other clustering software, you’ve probably discovered that setting up local cron jobs can be a little tricky.

The problem with active/passive DB nodes is that there isn’t a simple way for your cron job to determine whether the current node is the active node where Postgres is currently running.

You could just run it on every node, but then you’d get error emails from the passive nodes that don’t have a database running.

You could write a shell wrapper that checks cluster configs and statuses, or run psql and check connectivity, or modify the cluster config to switch crontabs along with the resources.  If you want Pacemaker to control the crontab, see the article “Managing Cron Jobs With Pacemaker”.

But I use a simple check that works well for me:  I just look for the postgres socket file as part of the cron command.  It’s very fast and doesn’t require any special permissions.

Below are example crontabs running pg_dumpall at midnight daily.


On Red Hat distros like Centos:

0 0 * * * test -S /tmp/.s.PGSQL.5432 && pg_dumpall > dump.sql

On Debian distros like Ubuntu:

0 0 * * * test -S /var/run/postgresql/.s.PGSQL.5432 && pg_dumpall > dump.sql


The socket file should only exist on the active PostgreSQL node (unless Postgres crashed hard).  The “test -S” checks whether the socket file exists, returning success if found and failure if not.  The double-ampersand then runs the next command only if the test found the socket.

I like simple solutions.