I have a huge SQL table (more than 1 billion rows) of user transactions.
I'd like to add a binary column that indicates whether or not the current row is 40 minutes or less after the previous row for the same user_id.
For instance:
user_id | date
--------+--------------------
      1 | 2011-01-01 12:15:00
      1 | 2011-01-01 12:00:00
      8 | 2011-01-01 15:00:00
      8 | 2011-01-01 14:00:00
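The result of the query would be (new is 0 when the row is within 40 minutes of the same user's previous row, and 1 otherwise, including when there is no previous row):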
user_id | date                | new
--------+---------------------+----
      1 | 2011-01-01 12:15:00 |   0
      1 | 2011-01-01 12:00:00 |   1
      8 | 2011-01-01 15:00:00 |   1
      8 | 2011-01-01 14:00:00 |   1
I'd like to avoid joining the entire table to itself, and maybe use a side table or an analytic function (OVER with PARTITION BY) instead.
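
To make the analytic-function idea concrete, here is a minimal sketch of what I have in mind, not a tested solution for the real table. It assumes the table is called transactions (a hypothetical name) and uses an in-memory SQLite database from Python only so that the example is runnable; LAG with PARTITION BY fetches each user's previous date without a self-join. The date arithmetic (strftime('%s', ...)) is SQLite-specific and would need to be replaced with the equivalent for the actual database.

```python
import sqlite3

# Hypothetical table name and in-memory database, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1, "2011-01-01 12:15:00"),
        (1, "2011-01-01 12:00:00"),
        (8, "2011-01-01 15:00:00"),
        (8, "2011-01-01 14:00:00"),
    ],
)

# LAG(date) returns the same user's previous date, or NULL for the user's first row.
# A NULL difference makes the WHEN condition unknown, so the first row falls to ELSE 1.
QUERY = """
SELECT user_id,
       date,
       CASE
           WHEN CAST(strftime('%s', date) AS INTEGER)
                - CAST(strftime('%s', LAG(date) OVER (
                      PARTITION BY user_id ORDER BY date)) AS INTEGER) <= 40 * 60
           THEN 0
           ELSE 1
       END AS new
FROM transactions
ORDER BY user_id, date DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Window functions require SQLite 3.25 or later. Whether this performs acceptably on a billion rows would depend on the database and on having an index or sort order on (user_id, date), which I have not verified.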