I have a table with the following columns:
ref_year, ref_no, inv_id, start_date, end_date
The first two columns (ref_year and ref_no) combine to form a primary key in other tables and I will refer to them as 'reference' from now on, but in this table they can appear multiple times. The third (inv_id) is a foreign key. The final two columns represent the date an inv_id became attached to the reference and, where appropriate, the date it ceased being attached to that reference.
I want to return exactly one row for each reference: the row whose inv_id has the earliest start_date among those attached to that reference with a null end_date. It's the end_date part that's causing me problems. Here's what I've got so far:
SELECT
    t1.*
FROM
    involvements t1
LEFT OUTER JOIN
    involvements t2
ON
    (t1.ref_year = t2.ref_year
    AND
    t1.ref_no = t2.ref_no
    AND
    t2.start_date < t1.start_date)  -- t2 is any earlier row for the same reference
WHERE
    t2.ref_year IS NULL  -- keep t1 only when no earlier row exists
    AND
    t2.ref_no IS NULL
This selects the inv_id with the earliest start_date perfectly, but I can't figure out how to handle the cases where that inv_id has a non-null end_date. In those cases I'd want the query to check the next-oldest inv_id for that reference instead, and so on until it finds one with a null end_date. I tried creating a temporary table containing only the rows with null end_dates and inner joining to it as a sub-query, but of course couldn't because the WHERE clause came before the sub-query. Is there an efficient way to get my desired behaviour?
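
To make the desired behaviour concrete, here is a rough sketch of one variation that might do it. It restricts both sides of the self-join to rows with a null end_date: the t1 filter goes in the WHERE clause, while the t2 filter goes in the ON clause so that the LEFT JOIN can still produce unmatched rows. The column types and the three sample rows are made up purely for illustration, and the sketch assumes that start_date is unique per reference among the open rows (ties would return more than one row).

```sql
-- Hypothetical schema and sample data, for illustration only.
CREATE TABLE involvements (
    ref_year   INTEGER,
    ref_no     INTEGER,
    inv_id     INTEGER,
    start_date DATE,
    end_date   DATE
);

INSERT INTO involvements VALUES (2020, 1, 10, '2020-01-01', '2020-02-01');  -- earliest, but closed
INSERT INTO involvements VALUES (2020, 1, 11, '2020-03-01', NULL);          -- earliest open row
INSERT INTO involvements VALUES (2020, 1, 12, '2020-04-01', NULL);          -- later open row

SELECT
    t1.*
FROM
    involvements t1
LEFT OUTER JOIN
    involvements t2
ON
    (t1.ref_year = t2.ref_year
    AND
    t1.ref_no = t2.ref_no
    AND
    t2.start_date < t1.start_date  -- t2 is an earlier row for the same reference
    AND
    t2.end_date IS NULL)           -- ...and only open rows count as competitors
WHERE
    t1.end_date IS NULL            -- the candidate row itself must be open
    AND
    t2.ref_year IS NULL;           -- no earlier open row was found
```

With the sample rows above, the only row returned should be the one with inv_id 11: the row with inv_id 10 is excluded because its end_date is set, and the row with inv_id 12 is excluded because an earlier open row exists.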