GCP FinOps #003: Boring Predictability with BigQuery Slots

tl;dr

BigQuery Slots offer predictable BigQuery analysis costs for those spending more than $2000 a month on on-demand queries.

BigQuery

BigQuery is a multi-cloud data warehouse. It is cost-effective, serverless, and highly scalable. It differs from more traditional analytical databases in its architectural choices. BigQuery has a clear division of responsibilities: data ingest, data storage, and data processing. This decoupled architecture allows for massive horizontal scalability, as storage and compute are joined by a petabit network. This is why querying terabytes of data in just a few seconds is the norm.

[Figure: BigQuery Architecture]

BigQuery Pricing

Decoupled architecture also means decoupled pricing. The prices below are for the US multi-region. Other regions are likely to be more expensive, so always check the latest pricing on the official BigQuery pricing page.

Ingest

  • Free for batch insert
  • $50 per TB of streaming insert

Storage

  • Active storage $20 per TB/month
  • Long-term storage $10 per TB/month (applies to data that was stored without modifications for 90 consecutive days)

Analysis

  • $5 per TB of bytes processed by on-demand queries
  • $4 per hour per 100 slots (Flex Slots)
  • $2000 per month per 100 slots (monthly commitment)
  • $1700 per month per 100 slots (yearly commitment)

Extract

  • Batch exports are free
  • Streaming reads are $1.10 per TB exported

On-demand vs Flat-rate

Analysis pricing is split into two categories: on-demand and flat-rate.

[Figure: Slots and Cost, flat-rate slots vs. on-demand]

On-demand

  • Default option
  • Bills you for the volume of bytes processed
  • Has up to 2000 slots available for processing

Flat-rate

  • Requires additional configuration
  • Supports 1-minute, 1-month, and 1-year minimum commitments
  • Slots are purchased in increments of 100
  • The volume of bytes processed no longer impacts your billing

What do slots mean?

The best comparison is Serverless (Cloud Functions or Cloud Run) vs a GCE Instance. On-demand is like serverless: it scales nearly infinitely, in very little time, but you pay a premium for each compute second consumed. Slots are more like a GCE Instance. Whether you utilize it or not, it will cost you money at the end of the month. It also cannot suddenly absorb a huge increase in traffic and will slow you down, but it handles a steady load well.

So, 2000 on-demand slots are like having 2000 vCPUs available to you that you only pay for when querying. With 2000 flat-rate slots, you always pay for 2000 vCPUs, as they are dedicated to your usage only.

You can use Cloud Monitoring to view Slot usage.

How do I save money with this?

A monthly BigQuery analysis spend exceeding $2000 is a good starting point to consider slots. It is possible to save costs even when your monthly bill is under $2000 by using Flex Slots, but you will have to trade off savings against engineering effort.

By purchasing a minimum slot commitment (100 slots for 1 month), your BigQuery analysis cost becomes a flat $2000 per month. Any analysis bill that previously exceeded $2000 is now capped at that amount, and the difference becomes savings.
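To make the break-even point concrete, here is a minimal sketch of the arithmetic, using only the US multi-region prices listed earlier. The function names are made up for illustration, and the prices are a snapshot, so treat the numbers as an assumption to re-check against current pricing.

```python
# Prices quoted earlier for the US multi-region (assumed still current).
ON_DEMAND_USD_PER_TB = 5.0
MONTHLY_COMMIT_USD_PER_100_SLOTS = 2000.0
YEARLY_COMMIT_USD_PER_100_SLOTS = 1700.0


def break_even_tb(flat_rate_usd, on_demand_usd_per_tb=ON_DEMAND_USD_PER_TB):
    """TB processed per month at which on-demand cost equals the flat-rate cost."""
    return flat_rate_usd / on_demand_usd_per_tb


def monthly_savings(tb_processed, flat_rate_usd):
    """Positive when flat-rate is cheaper than on-demand for this volume."""
    return tb_processed * ON_DEMAND_USD_PER_TB - flat_rate_usd


if __name__ == "__main__":
    print(break_even_tb(MONTHLY_COMMIT_USD_PER_100_SLOTS))         # 400.0
    print(break_even_tb(YEARLY_COMMIT_USD_PER_100_SLOTS))          # 340.0
    print(monthly_savings(600, MONTHLY_COMMIT_USD_PER_100_SLOTS))  # 1000.0
```

In other words, at these prices 100 monthly-commitment slots pay for themselves once you process more than about 400 TB a month, provided 100 slots are enough to get the work done.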

Pros and Cons

Pros

  • Predictable cost - any service that is billed on-demand and allows for vast unrestricted consumption of service in seconds is only a while-loop away from bankrupting your company.
  • Psychological safety - all consumers of BigQuery will surely be relieved to know that the biggest damage they can do with an inefficient query is to slow down their peers.

Cons

  • Extra configuration - it is fairly straightforward, but it has to be done.
  • Noisy neighbor problem - as Slots are usually shared across the organization and projects, it is possible that several heavy consumers might compete for slots at the same time, slowing everything down.
  • Green flag to inefficient queries - with the removal of the cost as a penalty, BigQuery Slots might relax users into writing less efficient queries, which will further contribute to the noisy neighbor problem.
  • Slower processing - 100 slots are not 2000, so some workloads will take longer to complete. In my experience, a 20x reduction in maximum slots did not slow everything down 20-fold.

Further optimizations

Hybrid

It is possible to combine flat-rate and on-demand pricing, as the choice is configurable per project. This way, projects that scan vast amounts of data (a high volume of bytes queried) but do not perform a lot of transformations (not many slots are needed) can be made flat-rate. At the same time, projects that need maximum query performance but do not process a lot of bytes can stay on-demand.
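One way to reason about the split is a rough per-project heuristic like the sketch below. Everything here is hypothetical: the `ProjectUsage` fields, the `suggest_pricing` function, and the example projects are invented for illustration, and the rule treats each project in isolation rather than modelling a shared reservation.

```python
from dataclasses import dataclass

ON_DEMAND_USD_PER_TB = 5.0  # US multi-region price quoted earlier


@dataclass
class ProjectUsage:
    name: str
    tb_processed: float    # TB scanned per month
    avg_slots_used: float  # average concurrent slots, e.g. read from Cloud Monitoring


def suggest_pricing(project, reserved_slots=100, flat_rate_usd=2000.0):
    """Suggest flat-rate only if it is cheaper AND the workload fits the reservation."""
    on_demand_cost = project.tb_processed * ON_DEMAND_USD_PER_TB
    fits = project.avg_slots_used <= reserved_slots
    return "flat-rate" if on_demand_cost > flat_rate_usd and fits else "on-demand"


if __name__ == "__main__":
    # Hypothetical projects: one scan-heavy, one compute-heavy.
    for p in (ProjectUsage("reporting", 900, 60), ProjectUsage("adhoc", 50, 1500)):
        print(p.name, suggest_pricing(p))
```

The scan-heavy project lands on flat-rate and the compute-heavy one stays on-demand, which matches the reasoning above. Real decisions would also need peak slot usage and query deadlines, which this sketch ignores.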

Trigger Flex Slots dynamically

Flex Slots are the latest, entry-level addition to BigQuery flat-rate pricing. Flex Slots have a minimum 1-minute commitment and are billed by the second.

If you run nightly ETL or batch processing with a well-known query cost, you can enable Flex Slots before processing starts and disable them once it completes, reducing idle time.
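A quick sketch of what such a nightly window might cost, based on the $4 per hour per 100 slots price listed earlier and the 1-minute minimum. The `flex_cost` helper and the 500-slot, 2-hour job are assumptions for illustration, not measurements.

```python
FLEX_USD_PER_HOUR_PER_100_SLOTS = 4.0  # US multi-region price quoted earlier
MIN_COMMIT_SECONDS = 60                # Flex Slots minimum commitment


def flex_cost(slots, seconds):
    """Cost in USD of holding `slots` Flex Slots for `seconds`, billed per second."""
    if slots <= 0 or slots % 100 != 0:
        raise ValueError("slots are purchased in increments of 100")
    billed_seconds = max(seconds, MIN_COMMIT_SECONDS)
    return (slots / 100) * FLEX_USD_PER_HOUR_PER_100_SLOTS * billed_seconds / 3600


if __name__ == "__main__":
    nightly = flex_cost(500, 2 * 3600)   # hypothetical 2-hour job on 500 slots
    print(nightly)                       # 40.0 per night
    print(nightly * 30)                  # 1200.0 per 30-day month
    print(flex_cost(100, 30 * 24 * 3600))  # 2880.0 if left running all month
```

The last line shows why switching Flex Slots off matters: 100 Flex Slots left running for a full 30-day month cost more than the $2000 monthly commitment.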

Top Up

Similar to the above scenario, you might have monthly or yearly slots, say 200 of them. However, each month you need to run a large forecasting analysis that normally takes hours. Using Flex Slots, you can temporarily boost the available slots to speed up processing while still keeping costs predictable.
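As a rough illustration of the top-up scenario, the sketch below adds a Flex Slots burst on top of a monthly baseline, using the prices listed earlier. The function name and the 800-slot, 3-hour burst are hypothetical.

```python
MONTHLY_USD_PER_100_SLOTS = 2000.0       # monthly commitment, US multi-region
FLEX_USD_PER_HOUR_PER_100_SLOTS = 4.0    # Flex Slots, US multi-region


def monthly_cost_with_top_up(baseline_slots, burst_slots, burst_hours):
    """Baseline monthly commitment plus one Flex Slots burst, in USD."""
    baseline = baseline_slots / 100 * MONTHLY_USD_PER_100_SLOTS
    burst = burst_slots / 100 * FLEX_USD_PER_HOUR_PER_100_SLOTS * burst_hours
    return baseline + burst


if __name__ == "__main__":
    # 200 committed slots, plus 800 extra Flex Slots for a 3-hour forecast run.
    print(monthly_cost_with_top_up(200, 800, 3))  # 4096.0
```

Under these assumptions the burst adds $96 to a $4000 baseline, so the bill stays predictable while the heavy job gets five times the slots for its duration.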

Summary

BigQuery Slots are great for companies that are getting serious about BigQuery and whose bills are starting to reflect it. There is a trade-off between predictability and savings on one side and performance on the other, but in my experience the performance difference is usually not great.