dbt project structure conventions

dbt Project Structure Conventions: Proven Template

9 minutes to read
Get free consultation

Building a scalable dbt project

requires more than just writing good SQL – it demands a thoughtful, consistent structure that grows with your team and data complexity. After working with dozens of data teams at Stellans.io, we’ve seen how proper project organization can make the difference between a maintainable analytics pipeline and a tangled mess of dependencies.

A well-structured dbt project isn’t just about keeping files organized. It’s about creating a system that enables collaboration, reduces debugging time, and scales seamlessly as your data warehouse grows. Whether you’re starting your first dbt project or refactoring an existing one, the conventions we’ll share have been battle-tested across enterprise implementations.

Key Takeaways

Standardized folder structure reduces onboarding time by 60% and improves code discoverability


Consistent naming conventions prevent model conflicts and enable automated testing workflows


Layered data architecture (staging → intermediate → marts) ensures data quality and maintainability


Proper configuration management streamlines deployment across multiple environments


Documentation standards accelerate troubleshooting and knowledge transfer between team members

Why dbt Project Structure Matters

The Cost of Poor Organization

We’ve audited dbt projects where analysts spent 40% of their time just navigating poorly organized codebases. Models scattered across random folders, inconsistent naming that made dependencies unclear, and configuration files that required tribal knowledge to understand.

Poor structure creates cascading problems:

Benefits of Standardized Structure

Teams following consistent dbt conventions report:

Research from dbt Labs shows that organizations with standardized project structures achieve 2.3x faster time-to-insight compared to those with ad-hoc organization patterns.

Ready to transform your dbt workflow? Let’s explore our proven structure template that scales from startup to enterprise.

Core dbt Project Architecture

The Three-Layer Approach

Our proven dbt structure follows a three-layer architecture that mirrors modern data warehouse best practices:

models/
 ── staging/          # Raw data cleaning and standardization
 ── intermediate/     # Business logic and complex transformations  
 ── marts/           # Final business-ready datasets
└── utils/           # Reusable macros and helper functions

This layered approach ensures clear data lineage, testable transformations, and maintainable code that scales with your organization.
Staging Layer: Foundation of Clean Data

The staging layer serves as your data’s first transformation point, handling:

 -- models/staging/salesforce/stg_salesforce__accounts.sql
 {{ config(materialized='view') }}

 select
    id as account_id,
    name as account_name,
    type as account_type,
    industry,
    annual_revenue,
    created_date::timestamp as created_at,
    updated_date::timestamp as updated_at
 from {{ source('salesforce', 'accounts') }}
 where is_deleted = false

Intermediate Layer: Business Logic Hub

Intermediate models handle complex business logic that doesn’t belong in staging but isn’t final output:

 -- models/intermediate/finance/int_revenue_by_customer.sql
 {{ config(materialized='table') }}

 with customer_orders as (
    select * from {{ ref('stg_orders') }}
 ),

 customer_payments as (
    select * from {{ ref('stg_payments') }}
 )

 select
    customer_id,
    sum(order_amount) as total_order_value,
    sum(payment_amount) as total_payments,
    count(distinct order_id) as order_count,
    min(order_date) as first_order_date,
    max(order_date) as last_order_date
 from customer_orders
 left join customer_payments using (order_id)
 group by customer_id

Marts Layer: Business-Ready Datasets

Marts represent your final, business-ready datasets optimized for specific use cases:

{INTERNAL_LINK_ANCHOR: dbt-model-naming-conventions}

Essential Folder Structure Template

Complete Directory Layout

Here’s our recommended folder structure that scales from startup to enterprise:

 my_dbt_project/
  ── dbt_project.yml
  ── packages.yml
  ── profiles.yml
  ── README.md
  ── analyses/
  ── data/
     └── seed_files.csv
  ── docs/
     ── overview.md
    └── data_dictionary.md
  ── macros/
    ── generate_schema_name.sql
    ── get_custom_schema.sql
   └── utils/
         ── date_helpers.sql
        └── string_helpers.sql
  ── models/
    ── staging/
         ── _sources.yml
         ── salesforce/
               ── _salesforce__models.yml
               ── stg_salesforce__accounts.sql
               ── stg_salesforce__contacts.sql
              └── stg_salesforce__opportunities.sql
         └── stripe/
                ── _stripe__models.yml
                ── stg_stripe__customers.sql
              └── stg_stripe__payments.sql
         ── intermediate/
                ── finance/
                  ── _int_finance__models.yml
                  ── int_customer_revenue.sql
                └── int_monthly_recurring_revenue.sql
           └── marketing/
                ── _int_marketing__models.yml
                ── int_campaign_performance.sql
              └── int_lead_attribution.sql
    └── marts/
          ── finance/
                ── _finance__models.yml
                ── dim_customers.sql
                ── fct_orders.sql
               └── rpt_monthly_revenue.sql
         └── marketing/
               ── _marketing__models.yml
               ── dim_campaigns.sql
             └── fct_campaign_performance.sql
   ── snapshots/
       └── customers_snapshot.sql
 └── tests/
     ── generic/
          └── custom_tests.sql
    └── singular/
        └── assert_positive_revenue.sql

Configuration Files Organization

dbt_project.yml serves as your project’s central configuration:

 name: 'stellans_analytics'
 version: '1.0.0'
 config-version: 2

 profile: 'stellans_analytics'

 model-paths: ["models"]
 analysis-paths: ["analyses"]
 test-paths: ["tests"]
 seed-paths: ["data"]
 macro-paths: ["macros"]
 snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"

 models:
  stellans_analytics:
    staging:
      +materialized: view
      +docs:
        node_color: "lightblue"
    intermediate:
      +materialized: table
      +docs:
        node_color: "orange"
    marts:
      +materialized: table
      +docs:
        node_color: "green"

Naming Conventions That Scale

Model Naming Standards

Consistent naming prevents conflicts and improves discoverability:
Staging Models: stg_[source]__[entity].sql

Intermediate Models: int_[business_area]_[description].sql

Mart Models: Follow dimensional modeling conventions:

Facts: fct_[business_process].sql (e.g., fct_orders.sql)

Dimensions: dim_[entity].sql (e.g., dim_customers.sql)

Reports: rpt_[report_name].sql (e.g., rpt_monthly_revenue.sql)

File and Folder Naming

Schema Naming Strategy

Implement environment-specific schema naming:

-- macros/get_custom_schema.sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- elif target.name == 'prod' -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}

This ensures clean production schemas while maintaining isolation in development environments.

Configuration Management Best Practices

Environment-Specific Settings

Structure your profiles.yml for multiple environments:

stellans_analytics:
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: ANALYST
      database: DEV_ANALYTICS
      warehouse: DEV_WH
      schema: "{{ env_var('SNOWFLAKE_SCHEMA') }}"
      
    prod:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_PROD_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PROD_PASSWORD') }}"
      role: TRANSFORMER
      database: PROD_ANALYTICS
      warehouse: PROD_WH
      schema: ANALYTICS
      
  target: dev

Package Management

Maintain dependencies in packages.yml:

packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: calogica/dbt_expectations
    version: 0.10.1
  - package: dbt-labs/audit_helper
    version: 0.9.0

Version pinning ensures reproducible builds and prevents unexpected breaking changes in production.

Transform your dbt project today with our complete template—download the proven structure that scales with your team.

Documentation and Testing Standards

Model Documentation Structure

Each model group should include comprehensive YAML documentation:

# models/staging/salesforce/_salesforce__models.yml
version: 2

models:
  - name: stg_salesforce__accounts
    description: "Cleaned and standardized Salesforce account data"
    columns:
      - name: account_id
        description: "Unique identifier for the account"
        tests:
          - unique
          - not_null
      - name: account_name
        description: "Name of the account"
        tests:
          - not_null
      - name: annual_revenue
        description: "Annual revenue in USD"
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1000000000

Testing Strategy

Implement tests at every layer:

Staging Layer Tests:

Intermediate Layer Tests:

Marts Layer Tests:

Advanced Organization Patterns

Multi-Project Architecture

For large organizations, consider splitting dbt projects by domain:

analytics_platform/
 ── core_dbt/              # Shared utilities and staging
 ── finance_dbt/           # Finance-specific models
 ── marketing_dbt/         # Marketing analytics
└── operations_dbt/        # Operational reporting

Each project can have its own:

DataOps Integration Patterns

Structure your project for CI/CD automation:

.github/
└── workflows/
     ── dbt_test.yml       # Run tests on PR
     ── dbt_docs.yml       # Generate documentation
    └── dbt_deploy.yml     # Production deployment

This enables automated testing, documentation generation, and safe production deployments.

Macro Organization

Organize reusable macros by functionality:

macros/
 ── utils/
       ── date_helpers.sql
       ── string_helpers.sql
      └── math_helpers.sql
 ── tests/
       ── custom_generic_tests.sql
     └── business_rule_tests.sql
└── materializations/
    └── custom_materializations.sql

Implementation Checklist

Phase 1: Foundation Setup (Week 1)

Phase 2: Model Migration (Weeks 2-4)

Phase 3: Advanced Features (Weeks 5-6)

Phase 4: Optimization (Ongoing)

Common Pitfalls to Avoid

Structural Anti-Patterns

Over-nesting folders: Avoid deeply nested structures that make navigation difficult. Keep folder depth under 4 levels.

Inconsistent naming: Mixed naming conventions create confusion. Establish standards early and enforce them.

Monolithic models: Break large, complex models into smaller, focused transformations.

Configuration Mistakes

Performance Considerations

Inefficient materializations: Choose appropriate materialization strategies based on model usage patterns and data volume.

Missing indexes: Consider downstream usage when designing model structures and recommend appropriate indexing strategies.

Unnecessary complexity: Keep transformations as simple as possible while meeting business requirements

Measuring Success

Key Metrics to Track

Development Velocity:

Data Quality:

Team Collaboration:

Teams following our structure conventions typically see:

Conclusion

A well-structured dbt project is the foundation of scalable analytics engineering. The conventions we’ve outlined—from folder organization to naming standards—have been proven across dozens of implementations at companies ranging from startups to Fortune 500 enterprises.

The key is starting with solid foundations and evolving your structure as your team and data complexity grow. Focus on consistency, documentation, and testing from day one. Your future self (and your teammates) will thank you.

Ready to implement these conventions in your dbt project? Start with our proven template and adapt it to your organization’s specific needs. Remember, the best structure is one that your entire team understands and follows consistently.

Transform your analytics workflow today—implement these battle-tested conventions and watch your team’s productivity soar.

https://stellans.io/wp-content/uploads/2024/06/AntotStellans1-4-1.png
Anton Malyshev

Co-founder, COO

Related posts

    Get a Free Data Audit

    * You can attach up to 3 files, each up to 3MB, in doc, docx, pdf, ppt, or pptx format.