I'm looking into the possibility to use BigQuery and its API to do on-site queries depending on content that is viewed by our visitors. Therefore the response time is crucial. I have loaded a very simple structured dataset of 10k rows (4columns) and run a very simple query and that takes between 1 and 2 seconds. My questions is hopefully pretty simple to answer, will I ever be able to get a <1sec response time with the BQ API by optimising the data in someway or not?
Thanks a lot in advance!
BigQuery is not an online transaction processing (OLTP) database but an interactive analysis database making possible to scan terabytes of data within seconds. While the query time is pretty consistent, since it is a shared service, the query time is not guaranteed, i.e. query running for 2 seconds might run 1.5 seconds or 3 seconds at different periods of time. Due to the nature and internals of BigQuery, query time of < 1s is not realistic as of today.
One of the popular design patterns is to let BigQuery do the heavy lifting of complex analysis of your data and then storing results in OLTP (like mySQL) or even in-memory (like Redis) database and serve the results to clients from there. You can periodically update the data by running the queries in the background.
BigQuery is designed for interactive analysis of MASSIVELY large datasets.
Processing time is expected within the seconds.
At the same time, expecting that running query against smaller dataset will bring the runtime under the second – is not realistic just because it does not design this way.
No matter what data size you have - you still “pay” all expenses as if you would process big dataset.
So, short answer unfortunatelly:
Today – Not! Tomorrow – Not! Day after – you never know, but I don’t think so, unless BigQuery will evolve into something else