dbt datasets

dbt Data Tests Best Practices

Null, Accepted Values & Relationships Patterns

14 minutes to read
Get free consultation

Introduction

https://stellans.io/wp-content/uploads/2025/09/image5.jpg

Every successful analytics platform is built on trust. Trust in your data being accurate, consistent, and ready to drive decision-making. At Stellans, we’ve spent years working with analytics teams who rely on dbt to transform, test, and deliver business-critical insights. Even mature dbt projects can encounter brittle tests, unclear documentation, or quality issues sneaking into production.

This guide showcases practical, production-ready dbt testing patterns for null values, accepted values (domain constraints), and relationships (referential integrity), all paired with actionable code templates. By the end, you’ll not only understand the why behind each pattern, but excitedly apply the how: robust YAML and macro-driven solutions, tips for scaling tests in CI, and guidance for real-world auditability and compliance.

Prioritizing data quality and reliable analytics will help your dbt project become a well-oiled data machine. Unlock faster time-to-insight and reduce firefighting for your team.

What Are dbt Data Tests?

dbt provides several built-in generic tests (official docs):

You can also define custom tests in SQL or as reusable macros for nuanced business logic (for example, “order_date can’t be in the future” or “revenue must be positive”).

Why Test? Technical Integrity vs. Business Logic

Testing in dbt goes beyond technical errors. By enforcing business contracts, you create rules that must always be true for your data to be trusted. Teams benefit by:

Pattern 1: Null Checking in dbt

Missing or unexpected nulls are among the most notorious data quality issues and can cause broken joins, inaccurate aggregations, or failed analytics downstream.

Built-in not_null Test Example

Start with the basics: dbt’s not_null test in a schema.yml file (applies to any model column).

# models/staging/salesforce/_salesforce__models.yml
version: 2

models:
  - name: stg_salesforce__accounts
    columns:
      - name: account_id
        description: "Unique identifier for the account"
        tests:
          - unique
          - not_null
      - name: account_name
        description: "Account's name"
        tests:
          - not_null

dbt schema.yml null test example

Tip: Keep schema files with each model for discoverability and audit purposes. This consistent structure reduces debugging time by 40% (See organization tips).

Custom Not Null Patterns

Sometimes a value should be non-null only when another field matches certain criteria (for example, cancelled_at should be non-null only if status = ‘cancelled’). Here’s a custom test for that—a macro in your tests/generic/ folder:

https://stellans.io/wp-content/uploads/2025/09/image2.jpg
-- tests/generic/not_null_when_active.sql
{% test not_null_when_active(model, column_name, status_column, active_value) %}
    select *
    from {{ model }}
    where {{ status_column }} = '{{ active_value }}'
      and {{ column_name }} is null
{% endtest %}

And in your schema.yml:
tests:
  - not_null_when_active:
      column_name: email
      status_column: is_active
      active_value: true

Conditional not null dbt macro and YAML usage

Running Null Checks in CI/CD

Catch schema drift and nulls before production. Always run dbt test in your CI pipeline.

Example GitHub Actions snippet (official guide):

# .github/workflows/dbt_test.yml
name: dbt Test
on: [pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install dbt & dependencies
        run: pip install dbt-snowflake dbt-utils dbt-expectations
      - name: Run dbt tests
        run: dbt test --target ci

Running dbt tests in CI GitHub Actions workflow

Fail fast, fail early, and protect business-critical models from bad nulls.

Pattern 2: Accepted Values & Domain Constraints

Business rules often require that certain columns use a valid set or numeric range (order statuses, rates, regulatory codes). dbt makes domain enforcement repeatable and auditable.

Enumerated Value Test

The built-in accepted_values test is perfect for enumerations:

columns:
  - name: order_status
    description: "Order processing state"
    tests:
      - accepted_values:
          values: ['PLACED', 'SHIPPED', 'COMPLETE', 'CANCELLED']

Accepted values custom test YAML snippet

Pro-tip: In audits or regulatory reviews, these tests show your data only contains valid domains (learn about compliance).

Ranges and Conditional Values

For continuous data (numeric ranges, dates), custom tests from dbt_expectations make enforcement modular:

tests:
  - dbt_expectations.expect_column_values_to_be_between:
      min_value: 0
      max_value: 1000000

Or a custom macro for more complex logic:

-- tests/generic/range_between.sql
{% test range_between(model, column_name, min_value, max_value) %}
    select *
    from {{ model }}
    where {{ column_name }} < {{ min_value }} or {{ column_name }} > {{ max_value }}
{% endtest %}

And reference in your model’s schema as needed.

Documenting Accepted Values for Audits

https://stellans.io/wp-content/uploads/2025/09/image6.jpg

Document domain constraints in your YAML model docs:

- name: product_category
  description: "Product category; must be one of the defined values for reporting consistency."

Well-documented tests provide clear data contracts and speed up compliance audits.

Pattern 3: Relationship (Referential Integrity) Tests

When models reference other models (join keys, surrogate keys, foreign keys), referential integrity maintains trust and accuracy in reporting.

Built-in relationships Test Example

The relationships test ensures, for example, every order.customer_id exists in customers:

columns:
  - name: customer_id
    description: "Customer FK; must reference a valid customer."
    tests:
      - relationships:
          to: ref('dim_customers')
          field: customer_id

dbt relationships test YAML example

Cross-Table Relationships & NULL Handling

Sometimes, references are mandatory only for certain rows. Example: affiliate_id is required if is_affiliate=true. Use where conditions (from dbt 1.5+):

 - name: affiliate_id
    tests:
      - relationships:
          to: ref('affiliates')
          field: affiliate_id
          where: is_affiliate = true

Conditional relationships test in dbt YAML

Common Pitfalls and Best Practices

See more robust patterns in “dbt consulting best practices” (see Stellans dbt conventions).

Organizing and Running dbt Tests at Scale

At the enterprise level, you need automation, monitoring, and enablement for ongoing data quality.

CI/CD Integration Patterns

Automate test runs on every pull request, branch, or deployment (GitHub Actions official docs). Use granular selection:

dbt test --select state:modified+

Run only impacted models to speed feedback and reduce costs.

Test Severity, Triage, and Ownership

tests:
  - not_null:
      severity: error
      owner: finance_team

Templates and Modularity for Team Scaling

Modular tests mean faster onboarding and consistent coverage as your team grows. Stellans clients have seen 40% fewer critical data issues with this approach (see DataOps case study).

https://stellans.io/wp-content/uploads/2025/09/image3.jpg

Real-World Code Examples and Templates

YAML and Macro Snippets for All Patterns

Use these snippets in your own projects:

# Null/outlier checks
- name: annual_revenue
  tests:
    - not_null
    - dbt_expectations.expect_column_values_to_be_between:
        min_value: 0
        max_value: 1000000000

# Accepted values
- name: account_status
  tests:
    - accepted_values:
        values: ['active', 'paused', 'deleted']

Using dbt_expectations & dbt-utils for Advanced Cases

https://stellans.io/wp-content/uploads/2025/09/image1.jpg

dbt_expectations: Create expressive checks (percent null, regex, percentile checks).
dbt-utils: Use common macros (unexpected values, cross-model checks).

- name: transaction_datetime
  tests:
    - dbt_expectations.expect_column_values_to_match_regex:
        regex: '^20[2-9][0-9]-'

Audit-Ready Documentation Generation

Generate docs with dbt docs generate so test coverage and model details are visible to non-engineers and auditors.

Comparative Table: Stellans vs. Industry Best Practices

Area

dbt Labs/PopSQL

Acceldata

Stellans Approach (You)

Built-in tests not_null, unique, accepted, relationships Yes Yes, + custom business-rule templates
CI/CD guidance Basic coverage Pipeline focused End-to-end CI setup, modular templates, owner fields
Compliance/Audit   Documentation only Light touch Audit-ready YAML, clear data contracts, test documentation
Modularity Manual recommended Emphasized Packaged macros, custom test store, onboarding guides
Ownership Not explicit Not explicit Owner field in test YAML, team accountability

 

Comparative table of dbt test practices and Stellans features

Looking Forward: Compliance, Observability, Data-as-a-Product

Data Contracts & Regulatory Considerations

As teams treat data as a product, test coverage and clear domain contracts are essential for legal compliance (GDPR/CCPA), audit trails, and customer trust.

Read more about data governance and compliance here.

Integrating Observability Tools with dbt

By adding data observability platforms, automated anomaly detection, and end-to-end lineage, you create a robust analytics foundation. Strong dbt test design gives you a dependable first line of defense.

Frequently Asked Questions

https://stellans.io/wp-content/uploads/2025/09/image4.jpg

How do dbt tests work?
dbt tests run assertions on your data models, sources, and snapshots. These produce pass/fail results and can be automated in CI/CD workflows to catch errors ahead of production. Learn more.

Which types of tests are built into dbt?
dbt includes not_null, unique, accepted_values, and relationships as generic tests. You can supplement these with custom business tests.

How do you run dbt tests in CI/CD?
Add a step to your pipeline to execute dbt test on every code change or pull request. Integration with GitHub Actions and similar tools is straightforward.

What is the row count test in dbt?
A row count test ensures a table/model contains the expected number of rows, so you can catch missing or duplicated data after transformations.

How to unit test dbt macros?
You can unit test macros by building a test SQL model that uses your macro with sample input data, then verify outputs with dbt’s test framework.

Conclusion

Delivering reliable analytics starts with solid dbt testing patterns. Null checks, accepted values, and relationships act as your first defense against broken pipelines and reporting errors. Focus on modularizing tests, running them in CI/CD, documenting ownership, and aligning with compliance for peace of mind in scale.

Ready to operationalize world-class dbt testing and data quality? Discover our analytics engineering solutions or contact Stellans for expert consulting. Unlock your data’s full potential and eliminate costly surprises.

https://stellans.io/wp-content/uploads/2025/09/image7.jpg

Article By:

https://stellans.io/wp-content/uploads/2025/07/AntotStellans1-4-1.webp
Anton Malyshev

Co-founder and COO of Stellans; analytics engineering expert with hands-on dbt and data quality consulting experience

Related Posts

    Get a Free Data Audit

    * You can attach up to 3 files, each up to 3MB, in doc, docx, pdf, ppt, or pptx format.