Typically, there is an orders
table to go with that, with exactly one row per order.
If we can further assume that there is always at least one transaction for every order, this would do the job:
SELECT o.id
FROM orders o
WHERE NOT EXISTS (
SELECT FROM transactions -- SELECT list can be empty for EXISTS test
WHERE order_id = o.id
AND product_id <> ALL ('{a,b,c}')
);
That's good for very common product_id's or long lists.
For short lists or rare products, it will be faster to start with a positive selection first. Like:
SELECT order_id
FROM (
SELECT DISTINCT order_id
FROM transactions
WHERE product_id = ANY ('{a,b,c}')
) t
WHERE NOT EXISTS (
SELECT FROM transactions
WHERE order_id = t.order_id
AND product_id <> ALL ('{a,b,c}')
);
An index on (product_id)
is essential for performance. Better yet, a multicolumn index on (product_id, order_id)
, plus another one on (order_id, product_id)
. See:
The manual about array literals:
About the ANY
and ALL
constructs:
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…