Control Snowflake Time-Travel Cost: A Practical Guide to Retention Policies

Has your monthly Snowflake bill ever made you do a double-take? You see the compute credits, which are understandable, but then you notice a surprisingly high figure for storage. For many, this creeping storage cost is a source of frustration, often stemming from one of Snowflake’s most powerful features: Time Travel. While it’s an incredible safety net for data recovery, its default settings can quietly inflate your storage consumption and, consequently, your expenses.

This article provides a strategic framework for mastering Snowflake’s data retention features. We’ll move beyond the technical “what” and dive into the business-critical “why” and “how.” Our goal is to empower you to manage Time Travel and Fail-safe proactively, ensuring you can control costs while maintaining robust data recoverability and compliance. Effective retention policies are the key to unlocking the full value of Time Travel without overspending.

What Are Snowflake Time Travel and Fail-safe?

Before we can control the costs, we need to understand the features driving them. Snowflake provides two distinct mechanisms for historical data recovery: Time Travel and Fail-safe. Though related, they serve different purposes and have vastly different implications for your budget.

How Time Travel Works: Your 90-Day Safety Net

Snowflake Time Travel is a feature that allows you to access and restore historical versions of your data that have been modified or deleted. Think of it as a version control system for your tables. Within a configurable window, you can run queries on data as it existed in the past, restore dropped tables, or clone entire tables from a specific point in time.
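The three operations described above can be sketched in Snowflake SQL as follows. The object names (your_database, your_schema, orders, orders_restored) and the timestamp are hypothetical placeholders, and each statement is an independent example rather than a script to run top to bottom.

```sql
-- Query the table as it existed one hour ago (the offset is in seconds).
SELECT *
FROM your_database.your_schema.orders AT(OFFSET => -60*60);

-- Query the table as of a specific point in time (hypothetical timestamp).
SELECT *
FROM your_database.your_schema.orders
  AT(TIMESTAMP => '2024-06-01 08:00:00'::TIMESTAMP_LTZ);

-- Restore a table that was dropped while still inside its retention period.
UNDROP TABLE your_database.your_schema.orders;

-- Clone the table as it existed 24 hours ago into a new table.
CREATE TABLE your_database.your_schema.orders_restored
  CLONE your_database.your_schema.orders AT(OFFSET => -60*60*24);
```

Each of these only works while the requested point in time is still inside the table's retention period. Once the window has passed, the statement returns an error instead of historical data.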

The retention period is the key lever here. The default is one day on every edition. On Snowflake Standard edition, one day is also the maximum (the period can only be set to 0 or 1). On Enterprise and higher editions, the retention period for permanent tables can be set anywhere from 0 to 90 days, where 0 disables Time Travel for the object. This extended window provides a powerful safety net for recovering from accidental data deletions or correcting operational errors.

Understanding Fail-safe: The Last Resort for Data Recovery

Fail-safe is a separate, non-configurable recovery period that begins immediately after the Time Travel window ends. It is a 7-day, last-resort safety net designed for disaster recovery scenarios, managed exclusively by Snowflake. You cannot directly query or access data within the Fail-safe window. Data recovery from this state must be initiated through a support request to Snowflake. Its primary purpose is to protect against catastrophic data loss, not for routine operational recovery.

Time Travel vs. Fail-safe: Key Differences at a Glance

It’s crucial to distinguish between these two features, as they have different levels of control and cost impact. A common pitfall we help organizations avoid is confusing the two, leading to misguided assumptions about data retention and costs.

| Feature | Time Travel | Fail-safe |
|---|---|---|
| Duration | Configurable, 0 to 90 days (depending on edition) | Fixed, 7 days |
| Configurability | Fully user-configurable at multiple levels | Not configurable; managed by Snowflake |
| Cost Control | Directly controllable by setting retention policies | Indirectly impacted by Time Travel but not directly controllable |
| Intended Use Case | Operational recovery, querying historical data, cloning | Disaster recovery for critical data loss |

The Hidden Costs: How Retention Policies Impact Your Snowflake Bill

Understanding the “what” is the first step. Now, let’s connect it to the “why” and the reason your storage bill is higher than you expected. The answer lies in how Snowflake calculates storage costs.

Deconstructing Snowflake Storage Costs

Snowflake’s pricing model for storage is based on the average amount of data stored per day. This isn’t just your active, live data. The total billed storage is a sum of three components:

  1. Active Data: The current data in your tables.
  2. Time Travel Data: Historical versions of data retained within the DATA_RETENTION_TIME_IN_DAYS window.
  3. Fail-safe Data: Data held in the 7-day Fail-safe period after the Time Travel window closes.

At an on-demand rate of around $23 per terabyte per month (price varies by region and plan), these additional layers of historical data can add up quickly, especially when retention policies are not actively managed.

Cost Scenario: The Financial Impact of a 90-Day Retention Policy

Let’s illustrate this with a simple, practical example. Imagine you have a 1 TB table that is fully refreshed with new data every day.
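A rough way to put numbers on this scenario is the small calculation below. It is a sketch under stated assumptions, not a reproduction of Snowflake's billing: every daily refresh is assumed to leave one full 1 TB prior version in Time Travel, storage is priced at the approximate $23 per TB per month quoted earlier, and Fail-safe storage is left out. The alias and column names are illustrative.

```sql
-- Assumptions: 1 TB of active data, one full 1 TB historical version kept
-- per day of retention, $23 per TB per month, Fail-safe not counted.
SELECT
    retention_days,
    1 + retention_days        AS billed_tb,        -- active + Time Travel history
    (1 + retention_days) * 23 AS monthly_cost_usd
FROM (VALUES (1), (90)) AS scenario (retention_days)
ORDER BY retention_days;
```

Under these assumptions, a 1-day policy comes to 2 TB (about $46 per month) and a 90-day policy comes to 91 TB (about $2,093 per month), a ratio of roughly 45.5. For a permanent table, Fail-safe storage would add to both figures.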

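A long retention period on a frequently changing table can multiply costs dramatically. Counting active and Time Travel storage only, a 1-day policy keeps roughly 2 TB on the bill while a 90-day policy keeps roughly 91 TB, so the same 1 TB table can cost over 45 times more just by changing a single retention parameter.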

Why “Set and Forget” Is a Costly Mistake

The most common issue we see with clients is that they either use Snowflake’s default settings or apply a blanket 90-day retention policy to all data “just in case.” This “set and forget” approach is a costly mistake. Non-critical data, such as staging tables, development environments, or transient logs, rarely requires a long retention history. Applying a long retention period to this data provides little business value but creates significant and unnecessary cost bloat over time.

A Strategic Framework for Setting Retention Policies

Controlling Time Travel costs isn’t about turning the feature off; it’s about applying it intelligently. Our recommended approach is to move from a reactive, one-size-fits-all model to a proactive, strategic governance framework. This involves three key steps.

Step 1: Classify Your Data

First, you cannot create an effective policy without understanding your data. Not all data is created equal. We work with clients to segment their tables based on business criticality and usage patterns. A simple classification model separates business-critical production tables from non-critical data such as staging tables, development environments, and transient logs.
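As a starting point for that segmentation, a query along these lines lists each table with its size, table type, and current retention setting. It is a sketch that assumes the INFORMATION_SCHEMA.TABLES view of the database being reviewed; your_database is a placeholder.

```sql
-- Inventory of tables with their current retention, largest first.
SELECT
    table_schema,
    table_name,
    is_transient,
    retention_time,                                  -- current retention in days
    ROUND(bytes / POWER(1024, 3), 2) AS active_gb
FROM your_database.information_schema.tables
WHERE table_type = 'BASE TABLE'
ORDER BY bytes DESC;
```

Large tables that carry a long retention period but hold easily reproducible data are usually the first candidates for a shorter policy.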

Step 2: Align Retention with Business & Compliance Needs

Once your data is classified, you can define appropriate retention periods for each category. This decision should balance cost, operational needs, and regulatory requirements.

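As a starting point, reserve long retention periods for business-critical production data, and keep non-critical data such as staging tables at a short retention period (for example, 1 day).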

Step 3: Implement Policies with a Scalpel, Not a Sledgehammer

Snowflake provides granular control over retention policies. You can set the DATA_RETENTION_TIME_IN_DAYS parameter at four different levels: Account, Database, Schema, and Table. This hierarchy allows you to set a conservative default at a higher level (e.g., 7 days at the database level) and then override it with a more aggressive, cost-saving policy at the table level (e.g., 1 day for a specific staging table).

Our approach is to always define retention at the most specific level possible. A blanket policy is easy to implement, but it’s inefficient and expensive. Here is the SQL command to set a policy for a specific table:

-- Set the retention period for a specific table to 7 days
ALTER TABLE your_database.your_schema.your_table 
SET DATA_RETENTION_TIME_IN_DAYS = 7;
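
The same parameter can be set higher up the hierarchy and then checked per object. The sketch below assumes a hypothetical staging schema and raw_events table; the values are examples, not recommendations.

```sql
-- Conservative default for every object in the database
-- that has no explicit setting of its own.
ALTER DATABASE your_database SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Tighter default for a schema that holds only staging data.
ALTER SCHEMA your_database.staging SET DATA_RETENTION_TIME_IN_DAYS = 1;

-- Show the value in effect for one table and the level it was set at.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS'
  IN TABLE your_database.staging.raw_events;
```

A value set explicitly on a lower-level object takes precedence over the inherited default. The level column in the SHOW PARAMETERS output is therefore a quick way to see where a table's retention actually comes from.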

By applying this three-step framework, you transform retention management from a technical task into a strategic business process that directly impacts your bottom line.

Advanced Strategies for Cost Optimization

Beyond setting retention policies, you can leverage other Snowflake features to further minimize storage costs while maintaining flexibility.

Choosing the Right Table Type: Permanent vs. Transient vs. Temporary

Snowflake offers three main table types, each with different implications for Time Travel and Fail-safe. Choosing the right one for your use case is a powerful cost-control lever.

| Table Type | Time Travel | Fail-safe | Cost & Use Case |
|---|---|---|---|
| Permanent | Yes (0-90 days, depending on edition) | Yes (7 days) | Default type. Highest cost, full data protection. Use for all critical production data. |
| Transient | Yes (0-1 day) | No | Lower cost. No Fail-safe protection. Ideal for staging data or any data that doesn’t need long-term recovery. |
| Temporary | Yes (0-1 day) | No | Lowest cost. Table exists only for the duration of the session. Use for ETL/ELT job scratch space. |

For example, by recreating a large staging table as TRANSIENT instead of PERMANENT (a table’s type cannot be changed in place), you eliminate Fail-safe storage costs for that table going forward, as Transient tables do not have a Fail-safe period.
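
One possible way to carry out such a conversion is sketched below, using CREATE TABLE ... AS SELECT followed by a rename. The table names are hypothetical, and the new names are fully qualified so the tables stay in the same schema. Note that a plain copy of this kind does not carry over grants or other table-level settings, which would need to be reapplied.

```sql
-- 1. Build a transient copy of the staging table.
CREATE TRANSIENT TABLE your_database.your_schema.stg_events_transient AS
SELECT * FROM your_database.your_schema.stg_events;

-- 2. Optionally shorten Time Travel on the new table as well.
ALTER TABLE your_database.your_schema.stg_events_transient
  SET DATA_RETENTION_TIME_IN_DAYS = 0;

-- 3. Put the transient table in place of the original.
ALTER TABLE your_database.your_schema.stg_events
  RENAME TO your_database.your_schema.stg_events_permanent_old;
ALTER TABLE your_database.your_schema.stg_events_transient
  RENAME TO your_database.your_schema.stg_events;

-- 4. Drop the old permanent table once the new one has been verified.
DROP TABLE your_database.your_schema.stg_events_permanent_old;
```

The dropped permanent table still passes through its remaining Time Travel and Fail-safe periods before its storage is released, so the saving shows up gradually rather than on the day of the change.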

Leveraging Zero-Copy Cloning for Development and Testing

A common source of cost is duplicating large production tables to create development or testing environments. Snowflake’s Zero-Copy Cloning feature lets you clone a table, schema, or database almost instantly without duplicating the underlying storage. Creating a clone is a metadata-only operation: the clone points to the original data, and you incur storage costs only for data that is added or modified in the clone. This is an efficient way to give developers fresh data without doubling your storage and the associated Time Travel costs.
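
A minimal sketch of cloning for a development environment follows. The names prod_db, dev_db, and analytics.orders are hypothetical.

```sql
-- Clone an entire production database for development.
CREATE DATABASE dev_db CLONE prod_db;

-- Keep Time Travel short on the development copy.
ALTER DATABASE dev_db SET DATA_RETENTION_TIME_IN_DAYS = 1;

-- Later, refresh a single development table from production.
CREATE OR REPLACE TABLE dev_db.analytics.orders
  CLONE prod_db.analytics.orders;
```

Data written or changed in dev_db after cloning is stored and billed separately, so a short retention period on the clone keeps test workloads from building up their own Time Travel history.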

Monitoring Your Storage Consumption

You can’t optimize what you can’t measure. Snowflake provides views to help you track storage usage. The TABLE_STORAGE_METRICS view is particularly useful, as it breaks down storage by active bytes, Time Travel bytes, and Fail-safe bytes. Regularly monitoring this view helps you identify tables that are the biggest contributors to your storage costs and validate that your retention policies are working as expected. This forms the foundation for continuous Snowflake cost monitoring and optimization.
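
A query along the following lines surfaces the tables carrying the most historical storage. It is a sketch that assumes the account-level version of the view at SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS, which requires suitable privileges and can lag behind real-time usage.

```sql
-- Top 20 tables by storage held in Time Travel and Fail-safe.
SELECT
    table_catalog,
    table_schema,
    table_name,
    ROUND(active_bytes      / POWER(1024, 3), 2) AS active_gb,
    ROUND(time_travel_bytes / POWER(1024, 3), 2) AS time_travel_gb,
    ROUND(failsafe_bytes    / POWER(1024, 3), 2) AS failsafe_gb
FROM snowflake.account_usage.table_storage_metrics
WHERE time_travel_bytes + failsafe_bytes > 0
ORDER BY time_travel_bytes + failsafe_bytes DESC
LIMIT 20;
```

Tables where time_travel_gb or failsafe_gb far exceeds active_gb are typically high-churn tables with long retention, which makes them natural candidates for a shorter retention period or a Transient table type.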

Conclusion: From Reactive Cost Cutting to Proactive Governance

Snowflake Time Travel is an indispensable feature for data resilience, but its financial implications cannot be ignored. The key to controlling costs is to shift from a reactive “set and forget” mindset to one of proactive governance. By classifying your data, aligning retention policies with business needs, and leveraging the right Snowflake features like Transient tables and Zero-Copy Cloning, you can strike the perfect balance between data protection and cost efficiency.

Ultimately, managing Time Travel is a governance challenge, not just a technical one. It requires a deep understanding of your data lifecycle and a commitment to continuous optimization.

Struggling to balance costs and recovery? Our experts can help you build a robust Snowflake Cost Governance framework. Contact us for a complimentary retention policy health check.

FAQs

  1. What is the difference between Snowflake Time Travel and Fail-safe? Snowflake Time Travel is a configurable feature that lets you query and restore historical data for up to 90 days (up to 1 day on Standard edition), designed for operational recovery. Fail-safe is a non-configurable, 7-day disaster recovery period that begins after the Time Travel window ends and is managed directly by Snowflake.
  2. How does data retention impact Snowflake storage costs? Snowflake bills for the average daily storage used, which includes active data plus all historical data retained by Time Travel and Fail-safe. Longer retention periods mean more historical data is stored, which can significantly increase your monthly storage costs.
  3. How can I change the data retention period in Snowflake? You can change the data retention period by setting the DATA_RETENTION_TIME_IN_DAYS parameter at the account, database, schema, or table level using an ALTER statement. For example, to set it for a specific table, you would use: ALTER TABLE your_table SET DATA_RETENTION_TIME_IN_DAYS = 7;

 

Article By:

David Ashirov

Co-founder & CTO
