Postgres 9.3 min/max for the cidr data type

I've become a big fan of PostgreSQL (aka PostgresDB or just Postgres depending on where you look). Over the years they have added some of the best database technology to the point where if I do need a relational database, I will use PostgreSQL without a second thought.

I also believe that you should use a relational database for data that has relationships - you know, when you might want to compare two pieces of data against each other in some way or other.

I've found that all data is relational... ;)

I digress. This post is about the CIDR data type that PostgreSQL makes available, and how (in 9.3 and below) there is no min or max function built into the data type. That may be coming in 9.5 I think, but it's not what we are using today.

CIDR, or Classless Inter-Domain Routing if you want to be all formal about it, is a common way of writing out IP subnet ranges. Sadly PostgreSQL in version 9.3 and below does not have a built-in method for determining the minimum and maximum IP addresses within a CIDR range. Using a 'trick' it's actually quite easy to determine this for yourself, but it's a) non-obvious, and b) not documented in a straightforward way that I've seen. Hence this post...

So on to my use-case. MaxMind helpfully provide a set of GeoIP databases detailing which countries and/or cities any specific IP address is associated with. GeoIP identification is a common thing to use in an application, and the free country geoIP db from MaxMind specifically is pretty accurate (higher than 90%, I've seen said on the Internets, so it must be true!).

Their geoIP listing is pretty good - for example if you take the first few rows of the 'blocks' data, you get the following:

network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider  
1.0.0.0/24,2077456,2077456,,0,0  
1.0.1.0/24,1814991,1814991,,0,0  
1.0.2.0/23,1814991,1814991,,0,0  
1.0.4.0/22,2077456,2077456,,0,0  
1.0.8.0/21,1814991,1814991,,0,0  

geoname_id and the like relate to the locations data (country, region, that sort of thing). The key focus here in the blocks table is the network column.

1.0.0.0/24 of course actually means the 256 IP addresses between 1.0.0.0 and 1.0.0.255. In the networking world the 1.0.0.255 address would actually be used for broadcast purposes, but it's still an IP of note - it's just unlikely to ever be seen by a service you are running and wanting to know the geoIP location for.

If you can use PostgreSQL natively, this entire article is pretty irrelevant, as you can use some of the more interesting operators to figure out if an IP address of 1.0.0.57 is within a CIDR range. For example:

SELECT geoname_id FROM geoip_blocks WHERE INET '1.0.0.57' <<= network;  

This will return the geoname_id of 2077456.

Not an issue right? Except if you are using the Java Persistence API (JPA), and therefore the Java Persistence Query Language (JPQL), you will quickly find that <<= (which means 'is contained within or equals') and the like are not supported. PostgreSQL is just too advanced...

To get around the lack of advanced query operators, you have to fall back to the older BETWEEN functionality. Except that by default you can't. With no MIN or MAX available, it's really hard to use BETWEEN against a single column that actually represents a whole range of IP addresses.

Ideally you would want MIN or MAX functions that return the minimum or maximum IP address of a CIDR range. In PostgreSQL 9.3 and below they do not do this, so you are left figuring out the MIN or MAX for yourself based on the network CIDR data.

Roll in a little-known (or at least not very well documented) trick, and you can do this very easily. Knowing the trick is the majority of the battle. It turns out that if you subtract 0.0.0.0 from a CIDR range, you get back the number of IP addresses that come before the start of the range - in other words, the integer value of the range's lowest address. In this case:

SELECT CIDR '1.0.0.0/24' - INET '0.0.0.0';  
returns:  
16777216  

Well, that was the easy part right? Getting the minimum is really straightforward. So is the maximum when you use the same technique, but the thinking process you need to go through isn't quite as obvious. If you have worked in the networking field for a while, you will know that the /24 of a CIDR range is the same as the subnet mask 255.255.255.0. You have probably seen this in the ipconfig output of your own network device. What the subnet mask really means is that the first three octets (the 255.255.255. part) are fixed, right down to a single value each. The last octet is 0, which actually means any value from 0 to 255 can go there - so 256 possibilities in total.

This almost helps us right? You could create a CIDR lookup table and do some math alongside the minimum value we just calculated to figure all this out. Thankfully PostgreSQL also provides a function called hostmask, which returns the inverse of the subnet mask - 0.0.0.255 in this case. This shows us which octets have how many possible values.
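
To see what hostmask gives back on its own:

SELECT hostmask(CIDR '1.0.0.0/24');  
returns:  
0.0.0.255  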

This is helpful because, using the trick above, you can do the following to get the integer value of the highest IP address within a single CIDR range:

SELECT (CIDR '1.0.0.0/24' - INET '0.0.0.0') + (hostmask(CIDR '1.0.0.0/24') - INET '0.0.0.0');  
returns:  
16777471  

So now we have a way of getting the minimum or maximum of a specific range, and by translating a single IP address into a number, we can use that information and the BETWEEN clause in a SELECT statement to figure out which network CIDR range an IP address is located within. As an example, our test value of 1.0.0.57 just happens to sit 16,777,273 addresses above 0.0.0.0 in the IPv4 world.

SELECT INET '1.0.0.57' - INET '0.0.0.0';  
returns:  
16777273  

16777273 of course is between 16777216 and 16777471. This is good enough for JPQL as long as the minimum and maximum data is in a table already. Given we have to create and then update the geoIP data tables from time to time anyway (monthly for the free tables, more often for the paid-for subscription from MaxMind), we can easily add a couple of extra columns and update them with the minimum and maximum values.

It's all well and good running commands against literal values rather than real table data - so an example of running an update across an entire table to populate the min/max fields would actually look like this:

UPDATE geoip.geoip_blocks_temp SET min_ip = network - inet '0.0.0.0', max_ip = (network - inet '0.0.0.0') + (hostmask(network) - inet '0.0.0.0');  
returns:  
UPDATE 1776115  

Yes, there are 1,776,115 different network blocks in the MaxMind Country geoIP database as I write this.
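
With those columns populated, the lookup we originally wanted - the equivalent of the <<= query from earlier, but in a shape JPQL can cope with - is just a BETWEEN over the numeric value of the IP address:

SELECT geoname_id FROM geoip.geoip_blocks_temp WHERE (INET '1.0.0.57' - INET '0.0.0.0') BETWEEN min_ip AND max_ip;  
returns:  
2077456  

The JPQL version is the same shape - convert the IP address to its integer value in application code and pass it in as a query parameter.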

So, how best to do this from start to finish? Why, with Ansible of course! Following is the general way I write an Ansible script, and thankfully MaxMind provide an MD5 checksum of their data files, so I use that to see if I need to download the data or not.

First in my process, I start with what I need to accomplish in pseudo code:

keep copy last md5 checksum for comparison  
get latest md5 checksum  
compare the md5 files  
if different delete old data files  
if different download new data files  
if different extract new data files  
if different import into database  

This really isn't something I feel a role is good for (it's an infrequent set of commands that we do not need to run very often - maybe once a day at most), so this becomes its very own playbook. I'm using Ansible 1.7.x in this example as there are still a few backwards-incompatibility bugs in 1.8.x as I write this, so just be aware this probably does not work in 1.8.x or higher (specifically the include that uses a variable for its directory). I also don't use some of the core modules (wget directly is probably a sin to use in the eyes of the creator of Ansible ;) ) for good reasons - I don't always need to be idempotent, and some of the core modules do bizarre things in my OS of choice.

Then I go through and expand each line (either multiple commands or just one depending on how good my pseudo code skills are on the day). As an example, what originally was

get latest md5 checksum  

as a specific item became:

- name: get latest md5 checksum
  raw: "wget -O {{ geoipfiles }}GeoLite2-Country-CSV.zip.md5 http://geolite.maxmind.com/download/geoip/database/GeoLite2-Country-CSV.zip.md5"
  register: resultingdata
- debug: var=resultingdata

Yes, I used raw - it gives me a consistent result every time I run it across Ansible version updates (something I've had issues with more than once), and in this case I know I have to run the command - it really can't be idempotent at this point of the script.

Don't worry if the below is confusing - I will eventually explain it all in future posts...

What I ended up with, of course, was something slightly longer-winded (some lines sanitised!):

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars:
    myplaybookname: apb-api-epicdata-prod-geoip-refresh
  roles:
    - globalvars
  tasks:
    - include: "{{ ansiblescripts }}inc-slack.yaml"
      vars:
        slackmessage: "Playbook {{ myplaybookname }} Start"
        slackcolor: "warning"
        aignore_errors_default: no

    - name: ensure oldest md5 checksum is deleted
      file:
        path: "{{ geoipfiles }}GeoLite2-Country-CSV.zip.md5.old"
        state: absent
      register: resultingdata
    - debug: var=resultingdata

    - name: keep copy last md5 checksum for comparison
      copy:
        src: "{{ geoipfiles }}GeoLite2-Country-CSV.zip.md5"
        dest: "{{ geoipfiles }}GeoLite2-Country-CSV.zip.md5.old"
      ignore_errors: true
      register: resultingdata
    - debug: var=resultingdata

    - name: ensure old md5 checksum is deleted
      file:
        path: "{{ geoipfiles }}GeoLite2-Country-CSV.zip.md5"
        state: absent
      register: resultingdata
    - debug: var=resultingdata

    - name: get latest md5 checksum
      raw: "wget -O {{ geoipfiles }}GeoLite2-Country-CSV.zip.md5 http://geolite.maxmind.com/download/geoip/database/GeoLite2-Country-CSV.zip.md5"
      register: resultingdata
    - debug: var=resultingdata

    - name: get old md5 checksum
      raw: "cat {{ geoipfiles }}GeoLite2-Country-CSV.zip.md5.old"
      ignore_errors: true
      register: md5old
    - debug: var=md5old

    - name: get new md5 checksum
      raw: "cat {{ geoipfiles }}GeoLite2-Country-CSV.zip.md5"
      register: md5new
    - debug: var=md5new

    - name: if different delete old data files
      file:
        path: "{{ geoipfiles }}IPByCountry"
        state: absent
      when: md5new.stdout != md5old.stdout
      register: resultingdata
    - debug: var=resultingdata

    - name: if different download new data files
      raw: "wget -O {{ geoipfiles }}GeoLite2-Country-CSV.zip http://geolite.maxmind.com/download/geoip/database/GeoLite2-Country-CSV.zip"
      when: md5new.stdout != md5old.stdout
      register: resultingdata
    - debug: var=resultingdata

    - name: if different extract new data files
      raw: "unzip -j {{ geoipfiles }}GeoLite2-Country-CSV.zip -d {{ geoipfiles }}IPByCountry"
      when: md5new.stdout != md5old.stdout
      register: resultingdata
    - debug: var=resultingdata

    - name: if different import into database
      raw: "time psql -d dbname -h dbhostname.rds.amazonaws.com -p 5432 -U dbusername -t -f {{ ansiblefiles }}geoip-tables.psql"
      when: md5new.stdout != md5old.stdout
      register: psqloutput
    - debug: var=psqloutput

    - include: "{{ ansiblescripts }}inc-slack.yaml"
      vars:
        slackmessage: "Playbook {{ myplaybookname }} SQL stderr\n{{ psqloutput.stderr }}\nSQL stdout\n{{ psqloutput.stdout }}"
      when: md5new.stdout != md5old.stdout

    - include: "{{ ansiblescripts }}inc-slack.yaml"
      vars:
        slackmessage: "Playbook {{ myplaybookname }} End with changed data result {{ md5new.stdout != md5old.stdout }}"
        slackcolor: "good"
        aignore_errors_default: no

If you are used to reading Ansible scripts, you will have already seen that my conditional on whether I should download the bigger data zip is based on a when clause:

when: md5new.stdout != md5old.stdout  

It would be great to do that with a simple 'set a variable so that I can compare the md5 files', but that's not the direction Ansible went, so I improvised and used cat to accomplish the same thing.

cat as a command is pretty useful for doing that comparison - there are many other ways too, of course! set_fact is not one of those ways, or at least wasn't in any of the attempts I've made to use that particular module (/rant!).

You will also have noticed that there is a SQL file referenced. And for the very astute, no db password :) I use another trick for storing db passwords (a .pgpass file in the home directory of the user running the SQL commands) as I hate them being listed in Ansible directly, for what are hopefully obvious reasons.
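
For reference, a .pgpass entry is a single colon-separated line in the format hostname:port:database:username:password, and libpq will ignore the file unless it is readable only by its owner (chmod 600). Using the same sanitised names as the playbook above, it would look something like:

dbhostname.rds.amazonaws.com:5432:dbname:dbusername:not-a-real-password  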

The SQL file is a bit more involved, as we have to do a few things (back to pseudo code!):

make sure schema exists  
drop old temporary tables if they still exist from an interrupted run  
create new tables  
go overboard on indexes  
import data  
alter blocks table for min/max values  
update blocks table with min/max values  
delete old data tables  
rename temp tables  
rename temp index and constraint names  

To make sure everything works as intended, and to minimise production interruptions, all this is run in such a way that if it fails, it fails before the table switching occurs. The SQL could also easily be broken up into two consecutive commands: one that creates the temporary data, and one that only runs the table switch and renames at the end if all is good with the create. That would be sensible for imported data sources where you can't trust that the data will be in the same format (or consistent in quality!) each time. MaxMind in my experience give consistently high-quality data, so I didn't feel it was necessary to go to that extreme in this example (my testing did, but that's because I didn't trust my own SQL writing skills!).

The entire script takes ~5 seconds to run, but the really important piece is when the tables are switched, and by changing the method to drop/rename rather than drop/create, our total outage time is less than 0.1 seconds. If desired, the swap can also be wrapped in a single transaction so it is all-or-nothing, but if your application engineers are good, they will be able to handle a 0.1 second outage within their code. I would never ask them to handle a 5 second outage though - that's just asking for trouble :P
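
To illustrate that last point, here is a minimal sketch of wrapping just the swap in one transaction - the index and constraint renames from the full script below would sit inside the same block:

BEGIN;  
DROP TABLE IF EXISTS geoip.geoip_blocks;  
DROP TABLE IF EXISTS geoip.geoip_locations;  
ALTER TABLE geoip.geoip_locations_temp RENAME TO geoip_locations;  
ALTER TABLE geoip.geoip_blocks_temp RENAME TO geoip_blocks;  
COMMIT;  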

So on to the SQL code:

CREATE SCHEMA IF NOT EXISTS geoip;  
DROP TABLE IF EXISTS geoip.geoip_blocks_temp;  
DROP TABLE IF EXISTS geoip.geoip_locations_temp;

-- GeoLite2-Country-Locations-en.csv
-- geoname_id,locale_code,continent_code,continent_name,country_iso_code,country_name
-- bigint     varchar(2)  varchar(2)     varchar(50)    varchar(2)       varchar(50)

CREATE TABLE geoip.geoip_locations_temp  
(
  geoname_id bigint NOT NULL,
  locale_code                          character varying(2) NOT NULL,
  continent_code                       character varying(2),
  continent_name                       text,
  country_iso_code                     character varying(2),
  country_name                         text,
  CONSTRAINT geoip_locations_pkey_temp PRIMARY KEY (geoname_id)
);

-- GeoLite2-Country-Blocks-IPv4.csv
-- network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider
-- CIDR    bigint     bigint                        bigint                         int                int

CREATE TABLE geoip.geoip_blocks_temp  
(
  network                           cidr NOT NULL,
  geoname_id                        bigint,
  registered_country_geoname_id     bigint,
  represented_country_geoname_id    bigint,
  is_anonymous_proxy                int NOT NULL,
  is_satellite_provider             int NOT NULL,
  CONSTRAINT geoip_blocks_pkey_temp PRIMARY KEY (network),
  FOREIGN KEY (geoname_id)          REFERENCES geoip.geoip_locations_temp (geoname_id)
);

CREATE UNIQUE INDEX index_geoip_locations_geoname_id_temp       ON geoip.geoip_locations_temp (geoname_id);  
CREATE        INDEX index_geoip_locations_locale_code_temp      ON geoip.geoip_locations_temp (locale_code);  
CREATE        INDEX index_geoip_locations_country_iso_code_temp ON geoip.geoip_locations_temp (country_iso_code);  
CREATE UNIQUE INDEX index_geoip_blocks_network_temp             ON geoip.geoip_blocks_temp (network);  
CREATE        INDEX index_geoip_blocks_geoname_id_temp          ON geoip.geoip_blocks_temp (geoname_id);  
CREATE        INDEX index_geoip_blocks_is_anonymous_proxy_temp  ON geoip.geoip_blocks_temp (is_anonymous_proxy);

-- First the locations table due to the foreign key constraint in blocks
\COPY geoip.geoip_locations_temp FROM '/usrdeploy/IPByCountry/GeoLite2-Country-Locations-en.csv' WITH CSV HEADER;

-- Then the blocks table
\COPY geoip.geoip_blocks_temp FROM '/usrdeploy/IPByCountry/GeoLite2-Country-Blocks-IPv4.csv' WITH CSV HEADER;

-- add min_ip and max_ip columns to blocks
ALTER TABLE geoip.geoip_blocks_temp ADD COLUMN min_ip bigint;  
ALTER TABLE geoip.geoip_blocks_temp ADD COLUMN max_ip bigint;

-- populate min/max IPs
UPDATE geoip.geoip_blocks_temp SET min_ip = network - inet '0.0.0.0', max_ip = (network - inet '0.0.0.0') + (hostmask(network) - inet '0.0.0.0');

-- create indexes for min/max IPs
CREATE UNIQUE INDEX index_geoip_blocks_min_ip_temp ON geoip.geoip_blocks_temp (min_ip);  
CREATE UNIQUE INDEX index_geoip_blocks_max_ip_temp ON geoip.geoip_blocks_temp (max_ip);

-- drop old tables
DROP TABLE IF EXISTS geoip.geoip_blocks;  
DROP TABLE IF EXISTS geoip.geoip_locations;  
-- rename temp tables
ALTER TABLE geoip.geoip_locations_temp RENAME TO geoip_locations;  
ALTER TABLE geoip.geoip_blocks_temp RENAME TO geoip_blocks;

ALTER INDEX geoip.index_geoip_locations_geoname_id_temp       RENAME TO index_geoip_locations_geoname_id;  
ALTER INDEX geoip.index_geoip_locations_locale_code_temp      RENAME TO index_geoip_locations_locale_code;  
ALTER INDEX geoip.index_geoip_locations_country_iso_code_temp RENAME TO index_geoip_locations_country_iso_code;  
ALTER INDEX geoip.index_geoip_blocks_network_temp             RENAME TO index_geoip_blocks_network;  
ALTER INDEX geoip.index_geoip_blocks_min_ip_temp              RENAME TO index_geoip_blocks_min_ip;  
ALTER INDEX geoip.index_geoip_blocks_max_ip_temp              RENAME TO index_geoip_blocks_max_ip;  
ALTER INDEX geoip.index_geoip_blocks_geoname_id_temp          RENAME TO index_geoip_blocks_geoname_id;  
ALTER INDEX geoip.index_geoip_blocks_is_anonymous_proxy_temp  RENAME TO index_geoip_blocks_is_anonymous_proxy;  
ALTER TABLE geoip.geoip_blocks                                RENAME CONSTRAINT geoip_blocks_pkey_temp    TO geoip_blocks_pkey;  
ALTER TABLE geoip.geoip_locations                             RENAME CONSTRAINT geoip_locations_pkey_temp TO geoip_locations_pkey;  

It's also possible to run the above as a full schema swap (i.e. tables/indexes/constraints keep their names, and all you do is drop the old schema and rename the temp schema). From a timing perspective this takes about the same time, and in some of my testing it was actually 10 to 20 milliseconds slower. Of course some of the time it was slightly faster, so I'll write that off as networking variances within AWS.
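
A rough sketch of that variant, assuming the fresh tables had been loaded into a geoip_temp schema rather than into _temp table names:

ALTER SCHEMA geoip RENAME TO geoip_old;  
ALTER SCHEMA geoip_temp RENAME TO geoip;  
DROP SCHEMA geoip_old CASCADE;  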

Well there you have it - a real-world example of creating MIN/MAX for JPQL usage, while keeping all the awesome functionality of the CIDR type within PostgreSQL for when it can actually be used natively. One day we'll get MIN/MAX functions within PostgreSQL - one day soon, I think - but we needed geoIP queries via JPQL, and given the above, we already have that.