I have two tables like this
A B ---- ------- col1 col2 col1 col2 ---------- -----------
A table contains 300k rows B table contains 400k rows
I need to count the col1 for table A if it is matching col1 for table B I have written a query like this:
select count(distinct ab.col1) from A ab join B bc on(ab.col1=bc.col1)
but this takes too much time
could try a group by... Also ensure that the col1 is indexed in both tables
SELECT COUNT (col1 ) FROM ( SELECT aa.col1 FROM A aa JOIN B bb on aa.col1 = bb.col1 GROUP BY (aa.col1) )
It's difficult to answer without you positing more details: did you analyze the tables? Do you have an index on col1 on each table? How many rows are you counting?
That being said, there aren'y so many potential query plans for your query. You likely have two seq scans that are hash joined together, which is about the best you can do... If you've a material numbers of rows, you'll be counting a gazillion rows, and this takes time.
Perhaps you could rewrite the query differently? If every B.col1 is in A.col1, you could get the same result without the join:
select count(distinct col1) from B
If A has low cardinality, it might be faster to rely on exists():
with vals as ( select distinct A.col1 as val from A ) select count(*) from vals where exists(select 1 from B where B.col1 = vals.val)
Or, if you know every possible value from A.col1 and it's reasonably small, you could unnest an array without querying A at all:
select count(*) from unnest(Array[val1, val2, ...]) as vals (val) where exists(select 1 from B where B.col1 = vals.val)
Or vice-versa, in each of the above, if every B holds the reference values.