dbt MERGE vs DELETE+INSERT


The Advanced Guide to Incremental Model Performance

 

Analytics engineers rely on dbt (data build tool) for scalable, reliable transformations. As data volume grows, the choice between MERGE and DELETE+INSERT incremental model strategies isn't just academic; it is a major lever for performance optimization and cost control. This detailed guide taps into firsthand experience from client projects at Stellans, arming you with expert benchmarks, clear decision frameworks, and actionable warehouse-specific advice.

If your goal is to optimize dbt incremental model performance for business impact, whether on Snowflake, BigQuery, or Redshift, read on for a comprehensive comparison that goes beyond the basics. For additional strategies on data modernization, see how clients benefit in our Customer Success Stories.

Why Your Incremental Strategy is a Critical Cost & Performance Lever

Selecting the right dbt incremental strategy directly shapes your analytics workflows' speed, reliability, and total warehouse spend. Many teams find that models run smoothly at first, then slow down or bottleneck as datasets scale.

Beyond the Basics: When Standard Incremental Models Start to Fail

Relying on default incremental options works at first, but cracks begin to show as data volume rises: runtimes stretch, scans grow, and warehouse bills climb.

At Stellans, we’ve frequently helped clients overcome incremental model slowdowns that surfaced only as data size crossed the threshold into millions or billions of records.

The Hidden Costs of an Inefficient Strategy

Performance bottlenecks multiply when incremental loads are slow, delaying downstream analytics and driving up warehouse spend with every run.

The right incremental model approach is a business decision with real economic impact.


A Deep Dive into the MERGE Strategy

How MERGE Works Under the Hood

The MERGE statement is an atomic SQL operation that can insert, update, and sometimes delete rows in a single statement based on a unique key. With the merge strategy, dbt stages the new and changed rows, then issues a statement along these lines (column names here are illustrative):

MERGE INTO target_table AS dest
USING staging_table AS src
  ON dest.pk = src.pk
WHEN MATCHED THEN
  UPDATE SET dest.status = src.status, dest.updated_at = src.updated_at
WHEN NOT MATCHED THEN
  INSERT (pk, status, updated_at)
  VALUES (src.pk, src.status, src.updated_at)

MERGE is supported on Snowflake, BigQuery, Redshift, and Databricks. dbt manages the temp tables and key logic behind the scenes. For up-to-date best practices, refer to dbt’s incremental strategy documentation.
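As a concrete sketch (the model, source, and column names below are hypothetical), a merge-strategy incremental model typically looks like this:

```sql
-- models/events_incremental.sql -- illustrative; names are hypothetical
{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='pk'
  )
}}

select pk, status, updated_at
from {{ source('raw', 'events') }}
{% if is_incremental() %}
-- on incremental runs, only pull rows newer than what the target already holds
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

On the first run dbt builds the full table; on subsequent runs it merges only the filtered rows on the unique key.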

Pros: Simplicity and Handling Updated Rows

A single atomic statement covers both inserts and updates, so late-arriving corrections and slowly changing attributes are handled without extra model logic, and readers never see a half-applied batch.

Cons: Performance Degradation on Large, Unpartitioned Data

Matching on the unique key forces the warehouse to scan far more of the target table than actually changes; without clustering or partition pruning, runtimes and cost grow with table size rather than with batch size.

In client engagements, we consistently observe MERGE as the pragmatic starting point, but growing tables often require performance tuning or a switch to DELETE+INSERT.

A Deep Dive into the DELETE+INSERT Strategy

The Two-Step Logic of DELETE+INSERT

DELETE+INSERT splits the upsert into two steps:

  1. DELETE phase: Remove rows in the target where the unique_key matches the staging data
  2. INSERT phase: Insert all staged rows (new and updates) into the target table
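Concretely, for a model with a unique_key set, the generated SQL is roughly as follows (table names are illustrative; dbt builds the staging relation for you):

```sql
-- Step 1: remove target rows that are about to be replaced
delete from target_table
where pk in (select pk from staging_table);

-- Step 2: bulk-append every staged row, new and updated alike
insert into target_table
select * from staging_table;
```

Because the delete can be pruned by partition and the insert is a plain append, neither step requires the row-by-row matching that MERGE performs.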

Enable this in dbt with:

{{
  config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key='your_primary_key'
  )
}}

Note: delete+insert is available on adapters such as Snowflake and Redshift; on BigQuery, the partition-level equivalent is insert_overwrite combined with partition_by.

For warehouse-specific config, refer to dbt’s incremental models docs.

Pros: Speed and Efficiency on Partitioned Data

When deletes are scoped to the affected partitions, the warehouse prunes everything else, and the subsequent insert is a cheap bulk append; data scanned tracks the size of the batch, not the table.

Cons: Potential for Write Amplification and Transactional Complexity

Every changed row is deleted and fully rewritten even if only one column moved, and because the work spans two statements, behavior between the delete and the insert depends on your warehouse's transaction handling.

Our empirical benchmarks across Snowflake and BigQuery confirm that with careful partitioning, DELETE+INSERT typically delivers the lowest cost-per-row processed for bulk update jobs or nightly loads in 2024 deployments.

The Decision Framework: When to Use MERGE vs. DELETE+INSERT

Selecting your incremental strategy is not one-size-fits-all. It should match your update patterns, schema design, and underlying warehouse capabilities. Our Stellans framework guides analytics engineers to the optimal choice:

Scenario 1: Few Updates and Append-Only Data (Use insert_overwrite or DELETE+INSERT)

When incoming data is mostly append-only and existing records rarely change, a partition-level replacement strategy avoids expensive row matching entirely.

Example config for partitioned loads (delete+insert on adapters that support it; on BigQuery, use insert_overwrite with partition_by instead, since delete+insert is not available there):

models:
  my_incremental_model:
    +materialized: incremental
    +incremental_strategy: delete+insert
    +unique_key: id

Scenario 2: Frequent Updates on a Unique Key (MERGE is Your Starting Point)

If you routinely update existing rows, such as handling slowly changing dimensions or correcting event data, begin with MERGE.
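A minimal starting config for this scenario (the model name is hypothetical):

```yaml
models:
  my_dimension_model:
    +materialized: incremental
    +incremental_strategy: merge
    +unique_key: id
```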

Monitor run times and warehouse consumption; if execution times surge, move to the next scenario.

Scenario 3: MERGE is Slow on Large Tables (Switch to DELETE+INSERT on Partition Key)

If MERGE performance degrades, especially on tables exceeding 100M rows, switch to DELETE+INSERT and constrain the delete to the partitions that actually receive changes.
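One way to scope the delete is dbt's incremental_predicates config, which injects extra filters into the generated statements; adapter support and exact syntax vary, and the column name and 7-day window below are illustrative:

```sql
{{
  config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key='id',
    incremental_predicates=[
      "event_date >= dateadd(day, -7, current_date)"
    ]
  )
}}
```

With the predicate in place, the DELETE touches only recent data instead of scanning the whole target.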

Quick-Reference Decision Tree

graph TD;
    A[What is your update pattern?] -->|Append-only or rare updates| B[Use DELETE+INSERT or insert_overwrite on partitions];
    A -->|Frequent updates on unique keys| C[Start with MERGE];
    C -->|MERGE slow on large tables?| D[Switch to DELETE+INSERT targeting partitions];

Tip: Regularly benchmark model run times and warehouse resource usage. This catches regressions early, before they impact analytics stakeholders.


Warehouse-Specific Tuning for Optimal Performance

Your cloud warehouse architecture dictates your dbt incremental model’s ceiling. Insights below are drawn directly from Stellans’ empirical data:

Snowflake: The Power of Clustering with MERGE

Pro tip: For massive tables, combine clustering keys with DELETE+INSERT predicates that restrict work to recent data. Because warehouse compute time scales with the micro-partitions scanned and rewritten, this pruning frequently halves monthly spend.
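In dbt-snowflake, clustering is a one-line model config (the clustering column below is illustrative):

```sql
{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id',
    cluster_by=['event_date']
  )
}}
```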

BigQuery: Partitioning and the Nuances of MERGE API

Best practice config:

models:
  my_model:
    +materialized: incremental
    +partition_by: {"field": "event_date", "data_type": "date"}
    +unique_key: id

Pro tip: On tables over 1B rows, a partition-scoped rebuild (insert_overwrite on BigQuery) that processes just the last 7 days often runs 4–5x faster than a full-table MERGE (Stellans 2023–24 client benchmarks).
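On BigQuery this pattern is expressed with insert_overwrite and a static partition list; each list entry names one partition to replace, so extend the list to cover your lookback window (the example below is illustrative):

```sql
-- one entry per partition to replace; add entries to cover the full window
{% set partitions_to_replace = [
  'current_date',
  'date_sub(current_date, interval 1 day)'
] %}
{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'event_date', 'data_type': 'date'},
    partitions=partitions_to_replace
  )
}}
```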

Redshift: Leveraging Distribution and Sort Keys

Poor key selection is a root cause of model slowdowns on Redshift, especially for MERGE operations on growing datasets.
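In dbt-redshift, distribution and sort keys are set directly in the model config; the key choices below are illustrative (a common heuristic is dist on the unique key and sort on the incremental filter column):

```sql
{{
  config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key='id',
    dist='id',
    sort='event_date'
  )
}}
```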

Conclusion: Strategy, Not Just Syntax

Optimizing dbt incremental models isn't just about picking a familiar SQL command. It's about matching the strategy to your update patterns, partitioning scheme, and warehouse architecture, then benchmarking continuously as data grows.

Stellans’ Analytics Engineering Performance Tuning has helped analytics teams scale from gigabytes to petabytes—cutting model runtimes and cloud warehouse expenses by more than 50 percent in documented cases.

If your dbt models are slow or legacy strategies are failing you, use our framework and tailored guidance to unlock new performance and ROI.

Frequently Asked Questions

What is the difference between MERGE and DELETE+INSERT in dbt?

MERGE is a single atomic operation that matches records via a unique key, applying inserts and updates together. DELETE+INSERT separates the steps: first deleting matching rows using the unique key, then inserting new or updated rows. See more detail in dbt’s incremental models documentation.

Which incremental strategy is more efficient for large datasets?

For large, partitioned datasets, DELETE+INSERT generally outperforms MERGE as it enables partition pruning and reduced data scans. MERGE is most efficient on smaller, well-indexed (or clustered) tables with frequent updates.

How do these strategies affect warehouse cost?

Efficient use of DELETE+INSERT targeted to affected partitions keeps costs in check by limiting compute and data scanned. Poorly configured MERGE can drive up charges as table size increases and more data is scanned for each upsert cycle.

 

Article By:

Vitaly Lilich

Co-founder and CEO of Stellans
