Deploy dbt with GitHub Actions: A Complete CI/CD Tutorial for Agile Data Teams

12 minutes to read

Introduction: The Search for Faster, Safer Data

If you’re running a data team, you’ve likely felt the friction. A business stakeholder needs a new metric, an analyst writes the dbt model, and then… what? A manual dbt run on a laptop? A prayer that it doesn’t break a production dashboard? This process is slow, risky, and simply doesn’t scale. Manual dbt runs are a bottleneck that keeps valuable insights locked away and exposes your production environment to human error.

The solution is to build an automated, well-oiled data machine. We can achieve this by combining the technical power of dbt Core with the automation of GitHub Actions for CI/CD (Continuous Integration/Continuous Deployment) and the process framework of agile rituals. This isn’t just a technical fix; it’s the foundation for a high-performing analytics team. In our experience, teams that adopt this integrated approach see their development and deployment workflows become twice as fast and significantly more reliable.

At Stellans, our philosophy is built on process optimization. We believe the right tools are only half the battle. The other half is creating efficient team operations that empower people to use those tools effectively. This guide will not only show you how to build the pipeline but also why each piece matters for making your team more collaborative, agile, and impactful.

Why Your Data Team Needs More Than Just a CI/CD Pipeline

Automating your dbt builds is a huge step forward, but it’s still just code automation. True transformation happens when this automation fuels better team collaboration and a more predictable workflow. A CI/CD pipeline is a tool, but agile practices provide the framework for using that tool to its full potential.

Introducing Agile Rituals for Analytics Teams

Software development teams have used agile rituals for years to ship better code faster. We can adapt these same principles for the world of analytics to create a transparent, collaborative, and continuously improving data culture.

Prerequisites: Setting the Stage for Success

Before we dive into the YAML, let’s ensure your project is ready for automation. A little preparation here will save you a lot of headaches later.

A dbt Core Project Ready for Automation

Your dbt project should be structured correctly, with your dbt_project.yml file defining your project configuration and your profiles.yml set up to handle different environments (such as development, CI, and production). The CI/CD pipeline relies on this configuration to know how and where to run your dbt commands.
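For reference, here is a sketch of what an environment-variable-driven profiles.yml can look like. The profile name and the Snowflake adapter are assumptions (match them to your own project); the environment variable names mirror the ones used in the GitHub Actions workflows later in this guide:

```yaml
# profiles.yml (sketch) — dbt resolves env_var() at runtime, so no
# credentials ever live in the repository.
my_project:            # must match the "profile:" key in dbt_project.yml
  target: ci
  outputs:
    ci:
      type: snowflake
      account: "{{ env_var('DBT_ACCOUNT') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      warehouse: "{{ env_var('DBT_WAREHOUSE') }}"
      database: "{{ env_var('DBT_DATABASE') }}"
      schema: "{{ env_var('DBT_SCHEMA') }}"
      threads: 4
    # dev and prod outputs follow the same pattern with their own variables.
```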

Essential GitHub Repository Settings

Protect your main branch. In your GitHub repository settings, implement branch protection rules that require pull requests (PRs) and successful status checks before merging. This enforces the workflow where all changes are peer-reviewed and automatically tested before they can impact your production models.

Securely Managing Your Credentials with GitHub Secrets

Never, ever hardcode credentials in your code. Your database user, password, and other sensitive information should be stored as GitHub encrypted secrets. Our workflow will securely access these secrets as environment variables at runtime, ensuring your production credentials are never exposed.

Step-by-Step: Building Your dbt CI/CD Workflow with GitHub Actions

Now for the main event. We’ll create a GitHub Actions workflow that automatically lints and tests your dbt models whenever a Pull Request is opened. This CI (Continuous Integration) pipeline ensures that every proposed change meets your quality standards before it can even be considered for merging.

Create a new file in your dbt repository at .github/workflows/dbt_ci.yml.

The Full YAML Workflow File

Here is the complete, copy-paste-ready YAML file we use. We’ll break down what each part does below.

name: dbt CI on Pull Request

on:
  pull_request:
    branches:
      - main

jobs:
  run_dbt_and_lint:
    name: Run dbt build and SQL linting
    runs-on: ubuntu-latest

    env:
      DBT_USER: ${{ secrets.DBT_USER }}
      DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}
      DBT_ACCOUNT: ${{ secrets.DBT_ACCOUNT }}
      DBT_WAREHOUSE: ${{ secrets.DBT_WAREHOUSE }}
      DBT_DATABASE: ${{ secrets.DBT_DATABASE }}
      DBT_SCHEMA: "CI_BRANCH_${{ github.head_ref }}"

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'
      
      - name: Install Dependencies
        run: |
          pip install dbt-core dbt-snowflake sqlfluff
          dbt deps

      - name: Lint SQL files
        run: sqlfluff lint models

      - name: Run dbt build (Slim CI)
        id: dbt_build
        run: |
          # state:modified needs a baseline: a manifest.json from a previous
          # production run must already be in ./target (fetched in an earlier
          # step, e.g. from an artifact or object storage).
          dbt build --select state:modified+ --defer --state ./target 2>&1 | tee dbt_build.log || status=$?
          {
            echo 'stdout<<EOF'
            cat dbt_build.log
            echo 'EOF'
          } >> "$GITHUB_OUTPUT"
          exit "${status:-0}"

      - name: Post PR Comment with dbt build results
        if: always()
        uses: mshick/add-pr-comment@v2
        with:
          message: |
            **dbt build results:**
            ```
            ${{ steps.dbt_build.outputs.stdout }}
            ```
          repo-token: ${{ secrets.GITHUB_TOKEN }}

Breaking It Down: Triggering the Workflow on Pull Requests

The on: pull_request section tells GitHub to run this job anytime a PR targeting the main branch is opened or updated. This is the entry point to our automated quality control.
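One detail worth noting from the workflow above: DBT_SCHEMA is built from the raw branch name, but branch names often contain characters such as / or - that are not valid in schema identifiers. A small sanitization step helps; here is a sketch in shell (the exact set of legal characters is an assumption and depends on your warehouse):

```shell
# Turn a branch name into a warehouse-safe schema identifier by replacing
# anything outside [A-Za-z0-9] with "_" and uppercasing the result.
branch="feature/new-metric"   # in the workflow this would come from $GITHUB_HEAD_REF
schema="CI_$(printf '%s' "$branch" | tr -c 'A-Za-z0-9' '_' | tr 'a-z' 'A-Z')"
echo "$schema"                # CI_FEATURE_NEW_METRIC
```

In the workflow itself, the same pipeline could run in an early step that appends the sanitized value to $GITHUB_ENV.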

Step 1: Installing Dependencies (dbt deps)

After checking out the code and setting up Python, we pip-install dbt, the Snowflake adapter, and sqlfluff, and then run dbt deps. This installs all the packages your dbt project relies on, ensuring the environment is ready for the subsequent steps.
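dbt deps installs whatever is pinned in your project’s packages.yml. For reference, a minimal one might look like this (the package and version shown are illustrative):

```yaml
# packages.yml (sketch) — pin versions so CI builds are reproducible.
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
```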

Step 2: Linting Your SQL with SQLFluff for Quality Control

Code quality and consistency are critical for a scalable data project. We use sqlfluff to automatically lint all our SQL files. This catches style inconsistencies, bad practices, and potential errors before they become a real problem. If the linter fails, the entire CI check fails, forcing the developer to fix their code.
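sqlfluff reads its configuration from a .sqlfluff file at the project root. A minimal sketch, assuming the Snowflake dialect (if your models lean heavily on Jinja, the dbt templater shown here additionally requires the separate sqlfluff-templater-dbt package):

```ini
# .sqlfluff (sketch)
[sqlfluff]
dialect = snowflake
templater = dbt
```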

Step 3: Running dbt build --select state:modified+ (Slim CI)

This is the heart of our CI pipeline and a dbt best practice known as “Slim CI”. Building and testing your entire dbt project on every single PR can be slow and expensive. Instead, we do something smarter.

The command dbt build --select state:modified+ tells dbt to run and test only the models you’ve actually changed in your branch (state:modified), plus any models that depend on them downstream (the trailing +). The comparison needs a baseline: --state points dbt at a manifest.json from a previous production run, and --defer lets unchanged upstream models resolve to their production versions instead of being rebuilt. This dramatically speeds up your pipeline. For more information, you can always refer to the official dbt documentation.
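One practical wrinkle: the --state flag in the workflow points at ./target, which is empty after a fresh checkout, so an earlier step has to supply a production manifest. A hypothetical step for this (where the manifest lives is an assumption; common choices are object storage or a CI artifact):

```yaml
# Hypothetical step: place it before "Run dbt build (Slim CI)".
- name: Fetch production manifest
  run: |
    mkdir -p ./target
    # Assumption: the CD job published manifest.json to a bucket you control.
    aws s3 cp "s3://$MANIFEST_BUCKET/manifest.json" ./target/manifest.json
```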

Step 4: Posting Automated Comments back to the PR

To close the feedback loop, the final step uses an action to post the results of the dbt build command directly as a comment on the pull request. This means your data analysts don’t have to dig through logs to see if their changes worked. The results are right there in the PR, making the review process transparent and efficient.

Integrating the Pipeline into Your Agile Workflow

With our CI pipeline in place, let’s connect it back to our agile team rituals.

The Agile Data Team Kanban Board in Action

Your Kanban board is no longer just a to-do list; it’s a living map of your data development process, powered by your CI/CD pipeline. Here is a structure we’ve found to be incredibly effective:

- Backlog: prioritized requests and ideas, not yet started
- In Development: a dbt model being built on a feature branch
- In Review (CI Running): a PR is open, and the CI pipeline is linting and testing the change
- Deployed: merged to main and built in production by the CD workflow

This visualization instantly tells every team member the exact status of any given task and highlights bottlenecks. If cards are piling up in the “In Review” column, it’s a signal that CI jobs are failing or peer reviews are slow, and the team needs to swarm to fix it. This is a core tenet of the Kanban methodology.

Structuring Your Analytics Standups Around CI/CD Results

Your daily standup is now a focused, 15-minute meeting centered on the Kanban board. For each person, the discussion becomes: what did I move across the board yesterday, what will I move today, and is anything blocked (for example, a failing CI check)?

How to Handle Unpredictable Research Tasks in an Agile Framework (Spikes)

Not all data work is predictable. Sometimes, you need to do exploratory analysis or research that doesn’t have a clear outcome. In agile, these are called “spikes.” The best way to handle them is to timebox them. Create a task on your Kanban board for the spike (e.g., “Research user activity patterns”) and allocate a fixed amount of time (e.g., 2 days). The goal of the spike isn’t a finished dbt model, but an answer: a recommendation on whether a full model-building effort is justified.

Beyond CI: Setting Up Continuous Deployment (CD)

Once your PR is approved and merged, you want to automatically deploy it to production. This is Continuous Deployment (CD). We’ll do this with a second, separate workflow file at .github/workflows/dbt_cd.yml.

Creating a Separate Workflow for Merges to main

This workflow is simpler. It triggers on any push to the main branch; with branch protection enabled, that means only merged PRs.

name: dbt CD on Merge to Main

on:
  push:
    branches:
      - main

jobs:
  run_dbt_production:
    name: Run dbt build in Production
    runs-on: ubuntu-latest

    env:
      DBT_USER: ${{ secrets.DBT_PROD_USER }}
      DBT_PASSWORD: ${{ secrets.DBT_PROD_PASSWORD }}
      DBT_ACCOUNT: ${{ secrets.DBT_ACCOUNT }}
      DBT_WAREHOUSE: ${{ secrets.DBT_PROD_WAREHOUSE }}
      DBT_DATABASE: ${{ secrets.DBT_PROD_DATABASE }}
      DBT_SCHEMA: "PROD_SCHEMA" # Your production schema

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'
      
      - name: Install Dependencies
        run: |
          pip install dbt-core dbt-snowflake
          dbt deps

      - name: Run dbt build for Production
        run: dbt build

Best Practices for Production Deployments

Notice that this workflow uses a different set of secrets (e.g., DBT_PROD_USER) and runs a full dbt build against your production environment. This ensures that the newly merged code integrates correctly with your entire production project. The key best practice here is absolute separation of concerns: the CI workflow tests changes in isolation, while the CD workflow executes the final, trusted build in production.
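The Slim CI step in the CI workflow compares against a production manifest.json, and this CD run is the natural place to publish it. A sketch of a final CD step (the artifact name is an assumption; pushing the file to object storage works equally well):

```yaml
# Hypothetical final CD step: publish the manifest so the CI job's
# state:modified comparison has a production baseline to diff against.
- name: Upload manifest for Slim CI
  uses: actions/upload-artifact@v3
  with:
    name: dbt-manifest
    path: target/manifest.json
```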

Powering High-Performance Teams

These automated, agile processes are the bedrock of elite data teams. They are particularly powerful for modern team structures. For the Fractional & Embedded Data Teams that we build and manage for our clients, these practices are not an afterthought; they are the out-of-the-box standard. It allows our embedded experts to integrate seamlessly with client teams and deliver value from day one.

This system of automation and process removes ambiguity and empowers analysts to focus on what they do best: delivering insights. If setting up and managing these workflows feels daunting, let Stellans optimize your data operations for you.

Conclusion: From Automated Code to an Elite Team

We’ve covered a lot of ground, from the technical details of a YAML file to the collaborative principles of an agile standup. The key takeaway is this: a CI/CD pipeline automates your code, but an agile process automates your team’s success. By combining dbt, GitHub Actions, and agile rituals, you create a powerful system for delivering faster, higher-quality data insights.

This integration transforms your data team from a reactive support function into a proactive, strategic powerhouse.

Ready to transform your data team’s workflow? Contact Stellans today to learn more about our Data Team Process Optimization services.

Frequently Asked Questions

How do you securely manage secrets for dbt in GitHub Actions? Secrets for dbt, such as database credentials, should be stored as GitHub Encrypted Secrets at the repository or organization level. These secrets are then securely passed to the workflow as environment variables.

What is Slim CI for dbt projects? Slim CI is a best practice for dbt CI/CD pipelines where you only build and test the models that have been modified in a pull request, along with their downstream dependencies. This is achieved using the command dbt build --select state:modified+, which makes pipelines faster and more cost-effective.

How can Kanban boards be used for dbt model prioritization? A Kanban board can visually represent the lifecycle of a dbt model. Columns can be set up for ‘Backlog’, ‘In Development’, ‘In Review (CI Running)’, and ‘Deployed’. This allows team leads to prioritize tasks, identify bottlenecks in the CI/CD process, and track progress during daily standups.

Article By:

Anton Malyshev

Co-founder
