How to Forecast Snowflake Credit Burn with SQL & Python: A Practical FinOps Guide

10 minutes to read
Get free consultation

 

If unpredictable Snowflake bills are holding back your business, you are not alone. Data analysts, engineers, and FinOps professionals often struggle with budget overruns, finance pressure, and the rigid limitations of spreadsheet-based forecasts. This hands-on guide will help you forecast Snowflake credit burn with a scalable approach by combining SQL (for data extraction) and Python (for advanced time-series modeling). You will also see how to use Cortex, Snowflake’s native ML feature, for in-platform forecasting.

Why a Programmatic Forecast Beats Manual Spreadsheets

Manual spreadsheets are no match for the dynamic, usage-based Snowflake billing model. Spreadsheets are static, slow to update, and especially prone to human error – making them unreliable for companies where daily credit burn fluctuates due to warehouse resizing, increasing workloads, or evolving business operations.

A programmatic approach provides:

Organizations surveyed by the FinOps Foundation consistently report lower cost overruns when they automate forecasting and integrate alerts for unexpected usage. In short, shifting to SQL and Python not only improves operational efficiency, it brings clarity and credibility to budget management.

Step 1: Extract Historical Credit Usage with SQL

Accurate Snowflake cost forecasts begin with precise, granular historical data. The ACCOUNT_USAGE schema exposes detailed metering via the WAREHOUSE_METERING_HISTORY view. This view logs every credit burned by each virtual warehouse, including start/end timestamps, making it easy to aggregate, trend, and diagnose contributors to cloud costs.

Querying WAREHOUSE_METERING_HISTORY for Raw Data

Focus your extraction on SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY. For greater precision, incorporate both START_TIME and END_TIME and aggregate by warehouse where needed.

Best Practice Tips:

For reference, Snowflake warehouse credits per hour depend on size: a Medium warehouse uses 4 credits/hour, Large uses 8, etc. (Source). Changes in warehouse size directly impact daily credit burn trends.

SQL Query to Create a Daily Credit Burn Trend

Here is a step-by-step, ready-to-use SQL query to extract a daily credit usage trend from your environment:

SELECT
  DATE_TRUNC('day', START_TIME) AS usage_date,
  WAREHOUSE_NAME,
  SUM(CREDITS_USED) AS daily_credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE START_TIME >= DATEADD('day', -365, CURRENT_DATE())
  AND CREDITS_USED > 0
GROUP BY usage_date, WAREHOUSE_NAME
ORDER BY usage_date, WAREHOUSE_NAME;

If you’re preparing for a platform migration, be mindful that workload mapping may not directly align between legacy and Snowflake virtual warehouses, complicating historical comparisons.

Step 2: Build Your Predictive Model in Python

Now with a daily time series of credits burned, you are ready to forecast. Python excels at flexible, robust modeling, allowing you to account for seasonality, workload shifts, and outlier events that break spreadsheet logic. Prophet is a popular, open-source library for time-series forecasting, widely used in FinOps teams (see latest documentation).

Environment Setup (Pandas, Prophet, Snowflake Connector)

Install all needed tools:

pip install pandas prophet snowflake-connector-python matplotlib

A Python Script for Time-Series Forecasting

Below is a complete script to fetch, model, and visualize your forecast:

import pandas as pd
from prophet import Prophet
import matplotlib.pyplot as plt
import snowflake.connector

# 1. Connect to Snowflake & Query Data
db_params = {
    'user': '<USERNAME>',
    'password': '<PASSWORD>',
    'account': '<ACCOUNT_ID>',
    'warehouse': '<WAREHOUSE>',
    'database': '<DATABASE>',
    'schema': 'ACCOUNT_USAGE'
}

conn = snowflake.connector.connect(**db_params)
query = '''
    SELECT DATE_TRUNC('day', START_TIME) AS ds, SUM(CREDITS_USED) AS y
    FROM WAREHOUSE_METERING_HISTORY
    WHERE START_TIME >= DATEADD('day', -365, CURRENT_DATE())
    GROUP BY ds
    ORDER BY ds;
'''
df = pd.read_sql(query, conn)
conn.close()

df = df.dropna()  # Remove days with missing data

# 2. Fit Prophet Model
model = Prophet(daily_seasonality=True)
model.fit(df)

# 3. Forecast 30 Days Ahead
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)

# 4. Visualize
fig = model.plot(forecast)
plt.title('Snowflake Daily Credit Burn Forecast')
plt.xlabel('Date')
plt.ylabel('Credits Used')
plt.show()

Step 3: Visualize and Validate Your Forecast

Visualization reveals forecast trends, seasonality, and anomalies. Overlay predicted vs. actual credit usage to demonstrate transparency and build trust with your stakeholders. The Matplotlib chart above plots historical data with the Prophet prediction and its confidence intervals.

Charting Forecast vs. Actuals and Calculating MAPE

For error measurement, Mean Absolute Percentage Error (MAPE) is common in FinOps settings:

# Only compare forecasted values on historical dates
actuals = df['y']
predicted = forecast['yhat'][:len(df)]
mape = (abs(actuals - predicted) / actuals).mean() * 100
print(f"MAPE: {mape:.2f}%")

Pro Tip: You can visualize trends in the Snowflake UI charts if you create worksheets visualizing your daily aggregates, useful for teams with no Python experience.

Evaluating the Accuracy of Forecasting Models

Validate your forecast against real usage week by week. Regularly recalibrate the model after sizable business events (like product launches or new team onboarding) or warehouse resizes. Inaccuracy typically arises from:

Industry benchmarks put forecasting error (MAPE) for mature FinOps programs at 7-15% (FinOps Foundation), but regular monitoring is crucial for trust and budget accountability.

Step 4: Operationalize Your Forecast for FinOps

A reliable forecast only creates real value when it informs budgeting, governance, and anomaly detection.

Presenting Forecasts to Leadership

Decision makers want actionable numbers and insights, not raw CSVs. Present:

Set a cadence (e.g., monthly) for reporting, and use narrative alongside charts to tie forecast trends to business events and savings opportunities. Stellans partners with clients to implement these controls as part of transparent, compliant cost governance.

Setting Budgets and Anomaly Detection Thresholds

With an accurate forecast, use the predicted median daily or weekly spend to set forward-looking budgets aligned to your operational profile. Use upper/lower confidence intervals to guide dynamic alert thresholds – for example, trigger Slack or email alerts when actual daily spend exceeds the 90th percentile of the forecast.

Alerts can be automated inside Snowflake using Tasks and Alerts, or integrated into orchestration tools like Airflow. Regular true-up with actuals supports defensible, proactive cost governance and eliminates reactive fire drills.

Alternative Method: Using Snowflake Cortex for SQL-Based Forecasting

If you want to predict Snowflake credit usage entirely in SQL and the Snowflake UI, Snowflake Cortex provides time-series forecasting directly within Snowflake’s SQL layer, generally available since early 2024. You can use Cortex ML-powered functions for rapid, no-code/low-code prediction—ideal for non-Python users or quick prototypes.

Sample Cortex SQL:

SELECT * FROM SNOWFLAKE.ML.FORECAST(
  TABLE(
    SELECT DATE_TRUNC('DAY', START_TIME) AS usage_date, SUM(CREDITS_USED) AS credits
    FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
    WHERE START_TIME >= DATEADD('day', -365, CURRENT_DATE())
    GROUP BY 1
  ),
  'USAGE_DATE',  -- time_col
  'CREDITS',     -- target_col
  FORECAST_LENGTH => 30  -- days to forecast
);

Cortex vs. Custom Python Model: A Trade-off Analysis

Hybrid Approach: Start with Cortex for a quick win, then upgrade to Python-based modeling for deeper analytics, better handling of migrations, and richer business context alignment.

Frequently Asked Questions

What role does virtual warehouse sizing play in predicting Snowflake costs?

Virtual warehouse sizing dictates the hourly credit cost: For example, a Medium warehouse uses 4 credits/hour, and a 2X-Large uses 64 credits/hour. Right-sizing (aligning warehouse size to workload need) is the fastest way to control cost and produce predictable trends for forecasting. Sudden resizing, however, can create forecast discrepancies. Always document warehouse changes for the cleanest predictions. More info

How can FinOps teams use forecasts to control Snowflake spending?

Forecasts move FinOps teams from reactively chasing overruns to proactively setting budgets and surfacing anomalies before they impact the bottom line. With programmatic forecasting, you can alert on variance, allocate budgets aligned to business cycles, and provide quantifiable, transparent justification for platform investments.

How accurate are forecasting models for Snowflake credit consumption?

State-of-the-art models (Prophet, Cortex) routinely achieve MAPE below 10-15% in stable environments, matching industry benchmarks (FinOps Foundation). Key risks to accuracy include abrupt warehouse resizing, sudden shifts in workload mix, and untracked business events. Always monitor and recalibrate after major changes.

 

Ready to bring predictability to your Snowflake spend?

Article By:

https://stellans.io/wp-content/uploads/2024/08/image-63346022-1.png
Ilya Novoselskiy

Lead Data Scientist at Stellans

Related Posts

    Get a Free Data Audit

    * You can attach up to 3 files, each up to 3MB, in doc, docx, pdf, ppt, or pptx format.