
Written By Paul Johnson, VLDB Founder & Technical Lead
There I was perusing LinkedIn, about a week ago, when I came across a post by Lasha Dolenjashvili covering the ‘Gaps and Islands problem in SQL’.
This got my attention for several reasons.
First of all, I’m an Ess-Queue-Ell (SQL) nerd, so how could I possibly resist?
The conundrum in question also reminded me of a similar challenge from a few years ago.
The challenge then was to align share price sequences to compute daily price changes. Pesky weekends, public holidays and missing data made this a ‘gaps and islands’ type problem.
Lasha’s solution also uses nested CTEs, DENSE_RANK and a single SQL statement.
“No way would I do it like that, surely?” I thought.
Only one way to find out.
Disclaimer: I am *not* criticising the suggested solution. I set out to explore how I would solve the same problem. Nothing more, nothing less. There is *always* more than one way to skin the SQL cat.
TL;DR – there is no mention of AI in this article, nor was AI used to create any of the content. Sorry, not sorry.
Create/Populate Base Table
This step creates the 10 row base table containing nothing more than customer identifiers and login dates.

So far, so good.
Compute Previous/Next Dates & Flag Logins/Logouts
This step uses LAG to compute previous and next LOGIN_DATE values for each LOGIN_DATE.
Where applicable a LOGIN_DATE is flagged as a LOGIN (first in series) or LOGOUT (last in series) event.

A physical table is used to store the results, rather than a transient CTE.
Not only does this make testing easier, one day the presence of intermediate results from the last run of a process will save the day.
Compute Interval Start/End & Duration
This step uses the output from the previous step (gap_island_1) to compute start/end and duration of each interval.

The output is stored in table gap_island_2.
Display Answer Set
The final step filters out the rows not needed from gap_island_2 to deliver the answer set.

This consists of a row for each ‘island’ (start/end of a logged in period) and the duration in number of consecutive logged in days.
Running The SQL
The psql client on MacOS was used to run the SQL against a local Postgres v18 instance:

The response is almost instantaneous, as might be expected.
It *might* even be correct.
I’m sure eagle-eyed readers will soon let me know if it isn’t…play nicely folks!!!

