I have a PostgreSQL function that returns a query result to pgadmin results grid REALLY FAST.
Internally, this is a simple function that uses a
dblink to connect to another database and does a query return so that I can simply run
SELECT * FROM get_customer_trans();
And it runs just like a basic table query.
The issue is when I use the
NOT IN clause. So I want to run the following query, but it takes forever:
SELECT * FROM get_customer_trans() WHERE user_email NOT IN (SELECT do_not_email_address FROM do_not_email_tbl);
How can I speed this up? Anything faster than a
NOT IN clause for this scenario?
get_customer_trans() is not a table - probably some stored procedure, so query is not really trivial. You'd need to look at what this stored procedure really does to understand why it might work slow.
However, regardless of stored procedure behavior, adding following index should help a lot:
CREATE INDEX do_not_email_tbl_idx1 ON do_not_email_tbl(do_not_email_address);
This index lets
NOT IN query to quickly return answer. However,
NOT IN is known to have issues in older PostgreSQL versions - so make sure that you are running at least PostgreSQL 9.1 or later.
<strong>UPDATE</strong>. Try to change your query to:
SELECT t.* FROM get_customer_trans() AS t WHERE NOT EXISTS ( SELECT 1 FROM do_not_email_tbl WHERE do_not_email_address = t.user_email LIMIT 1 )
This query does not use
NOT IN, and should work fast.
I think that in PostgreSQL 9.2 this query should work as fast as one with
NOT IN though.
Just do it this way:
SELECT * FROM get_customer_trans() as t1 left join do_not_email_tbl as t2 on user_email = do_not_email_address where t2.do_not_email_address is null