Automate dbt Docs to GitHub Pages in 15 Minutes


dbt documentation is a powerful asset for any data team. It provides a clear, interactive map of your data models, sources, and tests. But let’s be honest, for that documentation to be useful, it must be up to date and easily accessible. Keeping your dbt docs fresh after every model change can be a tedious chore that often falls by the wayside.

What if you could set up a system that automatically publishes your dbt docs to a free, shareable website every time you update your project?

In this guide, we’ll show you exactly how to do that. We’ll walk through a simple, automated workflow to publish your dbt docs to GitHub Pages. The best part? You can have the entire system up and running in about 15 minutes. Let’s build a solution that makes your documentation a living, breathing resource, not a forgotten artifact.

Why Automate Your dbt Documentation Publishing?

Before we dive into the “how,” let’s quickly cover the “why.” Automating your documentation isn’t just a technical convenience; it’s a strategic move that fundamentally improves how your team and your stakeholders interact with data.

The Friction of Manual Documentation

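A manual documentation process causes predictable problems downstream: the docs drift out of date after model changes, regenerating them becomes a chore that gets skipped, and the people who need them have no shared place to find them.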

Key Benefits of an Automated Workflow

Implementing a “set it and forget it” pipeline with GitHub Actions and GitHub Pages delivers immediate and lasting value.

Prerequisites: What You’ll Need

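This tutorial is designed to be fast and straightforward. To follow along, you’ll need just a few things in place: a dbt project that runs locally, a GitHub repository that contains it, and permission to change that repository’s settings.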

That’s it. If you have a dbt project in a GitHub repo, you’re ready to go.

The 4-Step Guide to Publishing dbt Docs to GitHub Pages

Now, let’s get to the main event. We’ll break this down into four simple steps that will take you from manual chaos to automated bliss in minutes.

Step 1: Generate Your dbt Documentation Locally (2 minutes)

First, let’s ensure you can generate the documentation on your own machine. This confirms your dbt project is set up correctly and helps you understand what files we’ll be automating.

Navigate to your dbt project’s root directory in your terminal and run the core documentation command:

dbt docs generate

 

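This command compiles your project and generates the static HTML and JSON files that make up your documentation site. Once it completes, you’ll find them inside the target/ directory. The key files are index.html (the site itself), manifest.json (your project’s models, sources, and tests), and catalog.json (metadata read from the database, such as column names and types).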

You can open the target/index.html file in a web browser to see the local version of your documentation. Our goal is to get this exact site onto a public URL.
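
If you’d like a quick sanity check before moving on, a small shell function can confirm the generated files are where we expect them. This is a minimal sketch: the function name check_docs_artifacts is our own invention, and it assumes the site is built from index.html, manifest.json, and catalog.json in the default target/ directory.

```shell
#!/bin/sh
# check_docs_artifacts DIR
# Reports whether DIR holds the three files the dbt docs site is built from.
# Returns 0 when all are present, 1 otherwise. DIR defaults to "target".
check_docs_artifacts() {
  dir="${1:-target}"
  missing=0
  for f in index.html manifest.json catalog.json; do
    if [ -f "$dir/$f" ]; then
      echo "ok       $dir/$f"
    else
      echo "missing  $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (after running `dbt docs generate` in the project root):
#   check_docs_artifacts target
```

A non-zero exit code makes the function easy to reuse later as a guard in CI, so a failed generate step never publishes a half-built site.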

Step 2: Configure Your GitHub Repository for GitHub Pages (3 minutes)

Next, we need to tell GitHub where to find the files for your documentation website. We will configure it to serve files from a specific branch called gh-pages. While this branch doesn’t exist yet, our automation will create and populate it for us later.

  1. In your GitHub repository, go to the Settings tab.
  2. In the left sidebar, click on Pages.
  3. Under the “Build and deployment” section, for the Source, select Deploy from a branch.
  4. Under “Branch,” select gh-pages and keep the folder as / (root). If gh-pages isn’t an option yet, don’t worry. The dropdown only lists branches that already exist, so leave this setting for now and come back to select gh-pages once the workflow from Step 3 has run and created the branch.
  5. Click Save.

Your repository is now ready to host a site. All we need to do is push our documentation files to that gh-pages branch.

Step 3: Create the GitHub Actions Workflow for Automation (8 minutes)

This is the heart of our automation. We will create a GitHub Actions workflow. This is a small YAML file that tells GitHub what commands to run whenever we push code to our main branch.

  1. In your code editor, create a new directory structure in your dbt project’s root: .github/workflows/.
  2. Inside that workflows directory, create a new file named publish-docs.yml.
  3. Copy and paste the following code into publish-docs.yml. Here’s the exact YAML configuration we recommend for a robust setup:
name: Publish dbt Docs

on:
  push:
    branches:
      - main # Or your default branch, e.g., master

jobs:
  build-and-deploy-docs:
    runs-on: ubuntu-latest
    permissions:
      contents: write # Lets GITHUB_TOKEN push to the gh-pages branch

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      # dbt needs a profiles.yml to run. This step writes a temporary one into the
      # current directory, where dbt looks before falling back to ~/.dbt/.
      # Replace "default" with the `profile:` name from your dbt_project.yml.
      - name: Set up dbt profile
        run: |
          echo "default:" > profiles.yml
          echo "  target: dev" >> profiles.yml
          echo "  outputs:" >> profiles.yml
          echo "    dev:" >> profiles.yml
          echo "      type: duckdb" >> profiles.yml
          echo "      path: ':memory:'" >> profiles.yml
        # Note: We use DuckDB here because it requires no credentials.
        # 'dbt docs generate' only needs a valid profile to parse the project;
        # it does not need to connect to your actual data warehouse.
        # The trade-off: the catalog is built from this empty database, so the site
        # shows models, lineage, and descriptions but no warehouse column types.

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10' # Match your dbt project's Python version
          # cache: 'pip' # Enable only if the repo has a requirements.txt or pyproject.toml; the step fails without one

      - name: Install dbt
        run: |
          pip install dbt-core dbt-duckdb # Add any other adapters your project needs for parsing

      - name: Generate dbt docs
        run: dbt docs generate # If the project has a packages.yml, run `dbt deps` first

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./target
          # This will commit the generated docs to the 'gh-pages' branch
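
One detail that’s easy to trip over is the profile name: the temporary profiles.yml only works if its top-level key matches the profile: entry in your dbt_project.yml. As a hedged alternative to the chain of echo commands, the sketch below reads that name and writes the same DuckDB profile with a heredoc. The function name write_ci_profile is hypothetical, and the parsing assumes profile: appears as a simple top-level key on a single line.

```shell
#!/bin/sh
# write_ci_profile PROJECT_DIR
# Writes PROJECT_DIR/profiles.yml with an in-memory DuckDB target, named after
# the `profile:` key in PROJECT_DIR/dbt_project.yml so dbt can find it.
write_ci_profile() {
  project_dir="${1:-.}"
  # Take the first top-level `profile:` line, then strip the key, any trailing
  # comment, quotes, spaces, and carriage returns.
  profile_name=$(sed -n 's/^profile:[[:space:]]*//p' "$project_dir/dbt_project.yml" \
    | head -n 1 | sed 's/[[:space:]]*#.*$//' | tr -d "\"' \r")
  if [ -z "$profile_name" ]; then
    echo "no profile: key found in $project_dir/dbt_project.yml" >&2
    return 1
  fi
  # Unquoted EOF so $profile_name is expanded inside the heredoc.
  cat > "$project_dir/profiles.yml" <<EOF
$profile_name:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: ':memory:'
EOF
}

# Usage, from the project root:
#   write_ci_profile .
```

You could save this as a script in the repository and call it from the “Set up dbt profile” step, which keeps the workflow file short and removes the hard-coded profile name.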

Breaking Down the Workflow File

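Let’s quickly review what each part of this file does. The on block triggers the workflow on every push to main. The job runs on a fresh ubuntu-latest runner, and its steps check out your code, write a temporary DuckDB profile, install Python and dbt, and run dbt docs generate. The final step hands the target/ directory to peaceiris/actions-gh-pages, which commits it to the gh-pages branch using the built-in GITHUB_TOKEN.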

Step 4: Commit, Push, and Verify Your Live dbt Docs (2 minutes)

Now for the final step. All that’s left is to commit our new workflow file and push it to GitHub.

  1. Save the publish-docs.yml file.
  2. In your terminal, add the new file to git, commit it, and push it to your main branch.
git add .github/workflows/publish-docs.yml
git commit -m "feat: Add GitHub Action to publish dbt docs"
git push origin main

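As soon as you push this commit, GitHub will automatically detect the new workflow file and trigger a run. To verify it, open the Actions tab in your repository and wait for the Publish dbt Docs run to finish. Then go back to Settings > Pages, select the gh-pages branch if you couldn’t in Step 2, and GitHub will show the URL of your published site.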

Click the link, and you should see your fully interactive dbt documentation site, live and accessible to anyone you share the URL with.
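
If you’d rather work out the address from your terminal, the sketch below derives it from the repository’s git remote. It assumes the default project-site pattern, https://<owner>.github.io/<repo>/, so it won’t match a custom domain, a repository named <owner>.github.io, or a privately published site. The function name pages_url is ours, not a GitHub tool.

```shell
#!/bin/sh
# pages_url REMOTE_URL
# Prints the default GitHub Pages address for a repository, given its git remote
# in either SSH (git@github.com:owner/repo.git) or HTTPS form.
pages_url() {
  # Drop everything up to and including "github.com" plus the ":" or "/" after it,
  # then drop a trailing ".git" if present, leaving "owner/repo".
  slug=$(printf '%s\n' "$1" | sed -e 's#^.*github\.com[:/]##' -e 's#\.git$##')
  owner=${slug%%/*}
  repo=${slug#*/}
  # Hostnames are case-insensitive, so lowercasing the owner is safe.
  owner=$(printf '%s' "$owner" | tr '[:upper:]' '[:lower:]')
  echo "https://$owner.github.io/$repo/"
}

# Usage, from inside the repository:
#   pages_url "$(git remote get-url origin)"
```

Treat the result as a guess to try in the browser; the URL shown under Settings > Pages remains the authoritative one.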

Best Practices and Next Steps

Congratulations! You’ve built a robust, automated pipeline for your dbt documentation. Here are a few tips to enhance it further.

Securing Your dbt Docs

By default, GitHub Pages sites are public, and that holds even when the repository itself is private. Restricting who can view a Pages site requires GitHub Enterprise Cloud, where you can publish the site privately so that only people with read access to the repository can open it. If your documentation contains sensitive information and you’re on another plan, host it somewhere that sits behind your own authentication instead.

Custom Domains

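Want a more professional URL? You can configure a custom domain (e.g., dbt-docs.yourcompany.com) for your GitHub Pages site. This involves adding a CNAME file to the published branch and updating DNS records with your domain provider. Because the workflow republishes gh-pages on every run, the CNAME file has to be part of what gets deployed; the peaceiris/actions-gh-pages action has a cname input that writes it for you.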

Handling Multiple Environments (Dev/Prod)

For more advanced setups, you might want to publish documentation for different dbt targets (e.g., dev vs. prod). You can adapt the workflow to handle this by using different branches as triggers or by passing variables into your dbt commands to change the target.
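
As a rough sketch of the second approach, the function below runs dbt docs generate once per target and copies the site files into one sub-folder per environment. The function name build_docs_per_target and the site/ output folder are hypothetical, and the target names in the usage line are placeholders that must exist in your profiles.yml.

```shell
#!/bin/sh
# build_docs_per_target OUT_DIR TARGET...
# Runs `dbt docs generate` once per target and copies the site files into
# OUT_DIR/<target>/, so each environment gets its own sub-path.
build_docs_per_target() {
  out_dir="$1"
  shift
  for target in "$@"; do
    # Stop at the first target that fails to build.
    dbt docs generate --target "$target" || return 1
    mkdir -p "$out_dir/$target"
    cp target/index.html target/manifest.json target/catalog.json \
      "$out_dir/$target/" || return 1
  done
}

# Usage (placeholder target names):
#   build_docs_per_target site dev prod
```

With this layout you would point publish_dir at ./site instead of ./target, and each environment’s docs would live under its own path on the same Pages site, for example /dev/ and /prod/.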

Beyond Documentation: Full-Stack Data Engineering Automation

Automating your documentation is a fantastic first step toward building a well-oiled data machine. It solves a specific, high-visibility problem and introduces your team to the power of CI/CD. But this is just the beginning. The same principles can be applied to automate testing, deployments, and data quality monitoring across your entire data stack.

At Stellans, we help teams build these kinds of robust, end-to-end automated systems. We move beyond simple tasks and help you implement a full-stack data platform where reliability and efficiency are built-in, not bolted on.

Ready to automate more than just your documentation? Learn more about our Data Engineering services.

Conclusion

In just a few short steps, you’ve transformed your dbt documentation from a static, quickly outdated artifact into a dynamic, self-updating resource. By leveraging the power of GitHub Actions, you’ve created a seamless pipeline that ensures your entire organization has access to a single source of truth for your data. This small investment in automation not only improves efficiency but also fosters a culture of data literacy and trust, empowering everyone to make better, data-informed decisions.

Frequently Asked Questions

How do I host dbt docs on GitHub Pages?

You can host dbt docs on GitHub Pages by first generating the static site using the dbt docs generate command. Then, use a GitHub Actions workflow to automatically deploy the output files from your target/ folder to a dedicated gh-pages branch in your repository, which GitHub Pages will then serve as a website.

Can I automate dbt documentation deployment?

Yes, you can fully automate the deployment of dbt documentation using a CI/CD tool like GitHub Actions. By creating a simple workflow YAML file in your repository, you can configure an action that triggers on every push to your main branch, automatically generates the latest docs, and publishes them to a hosting service like GitHub Pages.

Can I make my dbt docs private?

Only on the right plan. A GitHub Pages site is public by default, even when its repository is private. Organizations on GitHub Enterprise Cloud can publish a site privately, which limits it to people with read access to the repository. On other plans, keep sensitive documentation off GitHub Pages and host it behind your own authentication.


Article By:

Vitaly Lilich

Co-founder and CEO of Stellans
