How to Set Up Bitbucket Pipelines: Examples and Code Snippets

Jim Kutz
March 20, 2026


What is Bitbucket Pipelines and how does it fit into a Git-based software repository?

Bitbucket Pipelines is a CI/CD service baked into Bitbucket Cloud that runs your pipeline steps in Linux containers defined by YAML stored alongside your code. It ties directly to your Git workflow: each commit, branch, or tag can trigger builds, unit testing, packaging, and deployments. Because configuration lives in your repository, the process is versioned, reviewable, and reproducible. Pipelines manages images, caches, artifacts, and variables to keep automation consistent across languages and libraries.

Core concepts you should align on first

Before authoring YAML, define the lifecycle you intend to automate: build, unit testing, package, and deploy are the usual stages. Each step executes inside a container image you choose, with optional services for databases and queues. Caches speed up dependency installs, while artifacts move outputs between steps. Conditions determine when pipelines run by branch, tag, or pull request, and variables let you parameterize environment-specific behavior cleanly.

How execution works from commit to container

A push to Git selects a pipeline using your YAML rules. Bitbucket schedules a container for the step’s image, checks out the repository, restores caches, and runs the script. If services are declared, they start as sidecar containers. Artifacts persist between steps; caches persist between runs. Logs stream to the UI and statuses annotate commits and pull requests, so you can gate merges and track the health of each change.

Components and where they appear in YAML

This table summarizes key Bitbucket Pipelines components and the YAML fields where you configure them.

| Component | Purpose | YAML location | Notes |
| --- | --- | --- | --- |
| Step | Unit of execution | pipelines > branches/tags/custom > step | Defines image, script, caches, services, artifacts |
| Image | Linux container to run | step.image | Choose per language/toolchain |
| Cache | Persist directories between runs | definitions.caches and step.caches | Keyed by name; maps to paths |
| Artifact | Files passed between steps | step.artifacts | Useful for build outputs and test reports |
| Variable | Parameterize config | Repository/Workspace/Deployment variables | Mark secrets as secured |

How do you enable and configure Bitbucket Pipelines with YAML for a new repo?

Getting started is quick: enable Pipelines in repository settings, add a bitbucket-pipelines.yml at the root of your Git repo, and push a commit. Start with a minimal step to validate the environment. Then progressively introduce caches (for dependency directories), artifacts (for build outputs), and more steps. Mature the configuration by adding conditions for branches/tags and separate validation from deployment as your process stabilizes.

Prerequisites, permissions, and repository settings

Ensure you have repository admin permissions to enable Pipelines and manage variables. Confirm that your build doesn’t require privileged operations, or plan for a runner if it does. Align on branch and tag naming to drive pipeline conditions, and document who can trigger deploys. If you rely on private registries or services, prepare credentials as secured variables rather than embedding them in YAML.

Creating your first bitbucket-pipelines.yml

Begin with a minimal file that proves your image and scripts run as expected. Commit on a feature branch, check logs, and iterate. Add caches and artifacts once the basics work to reduce runtime and increase reuse across steps.

```yaml
# bitbucket-pipelines.yml (minimal)
pipelines:
  branches:
    main:
      - step:
          name: Build and test
          image: node:20-alpine
          caches:
            - node
          script:
            - set -euo pipefail
            - node --version
            - npm ci
            - npm test -- --ci

definitions:
  caches:
    node: ~/.npm
```

Choosing container images and shell environment

Select images that match your language runtime and build tools: Node.js for front-end assets (CSS, fonts), Python for analytics, Java for services. Prefer slim images to reduce pull time and attack surface. Set strict shell flags (for example, set -euo pipefail) so scripts fail predictably. Install only what you need inside each step to keep containers lean and reproducible.
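As a sketch of these recommendations (the image tag and linter pin are illustrative, not prescribed by Bitbucket), a lean step that pins its toolchain and fails fast might look like:

```yaml
pipelines:
  branches:
    main:
      - step:
          name: Lint with a slim, pinned image
          # A pinned slim image keeps pulls fast and builds reproducible
          image: python:3.12-slim
          script:
            # Fail on any error, unset variable, or broken pipe
            - set -euo pipefail
            - python --version
            # Install only what this step needs, with a pinned version
            - pip install --no-cache-dir ruff==0.4.4
            - ruff check .
```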

Which Bitbucket Pipelines building blocks should data engineers use first?

Data engineers benefit from a small set of primitives that compose into reliable pipelines. Decide early on step granularity, service usage for ephemeral databases, artifact boundaries between build and test, and a cache strategy tied to lockfiles. Writing this down once reduces drift across repositories, libraries, and teams as your platform scales.

Steps, services, artifacts, and dependencies

Divide the workflow so each step has a clear purpose, with outputs captured as artifacts for downstream steps. Use services for databases or Kafka during integration tests to keep state isolated. Keep step scripts short and explicit, delegating complex logic to Makefiles or task runners versioned with the code. This keeps the YAML focused on orchestration while preserving reuse across repositories.
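A sketch of this division, assuming a Makefile with `build` and `integration-test` targets (hypothetical names) carries the complex logic:

```yaml
definitions:
  services:
    postgres:
      image: postgres:16
      variables:
        POSTGRES_DB: testdb
        POSTGRES_PASSWORD: test   # test-only credential, not a secret

pipelines:
  branches:
    main:
      - step:
          name: Build
          script:
            - make build           # complex logic lives in the versioned Makefile
          artifacts:
            - dist/**              # outputs captured for the next step
      - step:
          name: Integration tests
          services:
            - postgres             # sidecar DB, reachable on localhost
          script:
            - make integration-test
```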

Variables, secured secrets, and deployment scoping

Variables make pipelines portable across environments. Use repository/workspace variables for shared settings and deployment variables for environment-specific values. Mark credentials as secured and avoid printing them. Scope cloud roles and keys to the minimum necessary, and prefer short-lived tokens when available to reduce long-term risk and audit overhead.

Conditions for branches, tags, and pull requests

Conditions let you control costs and risk. Run quick validation on feature branches, full suites on main, and deploy only from protected branches or signed tags. For pull requests, keep checks fast and focused to deliver actionable feedback, and reserve slower integration paths for merge gates or scheduled builds.
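One way to express this split in YAML (the `make` targets and `v*` tag convention are assumptions, not requirements):

```yaml
pipelines:
  pull-requests:
    '**':                     # fast, focused checks on every PR
      - step:
          name: Quick validation
          script:
            - make lint test-fast
  branches:
    main:                     # full suite on the mainline
      - step:
          name: Full suite
          script:
            - make test-all
  tags:
    'v*':                     # deploy only from release tags
      - step:
          name: Release
          deployment: production
          script:
            - make release
```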

Quick-reference: component-to-purpose mapping

This table recaps which building block to use for common goals.

| Goal | Use | Why |
| --- | --- | --- |
| Speed up installs | Cache directories | Reuses dependency downloads |
| Pass build outputs | Artifacts | Avoids rebuilding across steps |
| Test with DB | Services | Spins up sidecar database |
| Parameterize envs | Variables | Keeps YAML reusable |

How do you structure Bitbucket Pipelines YAML for multi-language monorepos and libraries?

Monorepos and polyglot stacks complicate builds: multiple languages, shared libraries, and varying release cadences. An effective structure isolates concerns, runs only what changed, and keeps per-project YAML readable. Choose per-language images, define clear artifact boundaries, and centralize shared logic so services and libraries evolve without duplication or hidden coupling.

Organizing steps by language and responsibility

Group steps first by language (for example, Node.js for bundling CSS/fonts, Python for ETL libraries, Java/Go for services), then by responsibility (build, unit testing, package, deploy). Use separate images to keep toolchains small and predictable. Artifacts pass compiled assets, wheels, or JARs between steps without rework, aiding reproducibility and shorter feedback loops.
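A sketch of this grouping for a two-language monorepo (the `frontend/` and `etl/` directory layout is an assumption):

```yaml
pipelines:
  branches:
    main:
      - parallel:
          - step:
              name: Node.js assets
              image: node:20-alpine
              script:
                - npm ci --prefix frontend
                - npm run build --prefix frontend
              artifacts:
                - frontend/dist/**    # compiled assets for later steps
          - step:
              name: Python ETL library
              image: python:3.12-slim
              script:
                - pip install build
                - python -m build etl/
              artifacts:
                - etl/dist/*.whl      # wheels passed downstream
```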

Reusing logic with templates and scripts

Promote repeatability by moving common commands into shell scripts or Makefiles checked into the repo. Reference them from YAML with a consistent interface via environment variables. This keeps pipelines concise and ensures libraries and services align on the same linting, testing, and packaging standards while remaining easy to evolve.
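For instance, a step can stay orchestration-only by delegating to a checked-in script (the `ci/package.sh` path is a hypothetical example):

```yaml
- step:
    name: Package service
    script:
      # All packaging logic lives in the versioned script;
      # the commit hash is passed in as a parameter
      - ./ci/package.sh "$BITBUCKET_COMMIT"
    artifacts:
      - build/**
```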

Table: common language images and cache directories

This table lists typical base images and cache paths you can map; verify paths in your environment before adopting.

| Language | Example image | Common cache paths |
| --- | --- | --- |
| Node.js | node:lts | ~/.npm, ~/.cache/yarn, node_modules |
| Python | python:3.x | ~/.cache/pip, venv directories, .mypy_cache |
| Java | eclipse-temurin:17-jdk | ~/.m2, ~/.gradle |
| Go | golang:1.x | GOPATH/pkg/mod, $GOCACHE |

What is the right way to manage variables, secrets, and caches in Bitbucket Pipelines?

Strong hygiene on variables and caches improves reliability and minimizes leakage. Use the right scope for each variable, limit who can edit them, and avoid printing sensitive values. Design caches as performance hints, not as sources of truth, and ensure your builds remain deterministic when caches are cold or invalidated.

Variable scoping and secured handling

Repository variables capture settings unique to the project, while workspace variables apply across multiple repositories. Deployment variables express environment-specific configuration for dev, staging, and prod. Mark secrets as secured to prevent exposure, and prefer short-lived tokens or federated identities where possible. Review who can manage variables as part of repository governance.

Designing a cache strategy

Cache dependency directories that are safe to reuse, and key them to lockfiles or version pins to minimize stale results. Ensure scripts can rebuild from scratch to preserve determinism and debuggability. When in doubt, scope separate caches to separate package managers to avoid contamination across languages.
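Bitbucket Pipelines supports file-keyed custom caches, which lets you tie invalidation to a lockfile; a sketch for npm (the cache name is arbitrary):

```yaml
definitions:
  caches:
    npm-lockfile:
      key:
        files:
          - package-lock.json   # cache is rebuilt when the lockfile changes
      path: ~/.npm

pipelines:
  branches:
    main:
      - step:
          name: Install with keyed cache
          image: node:20-alpine
          caches:
            - npm-lockfile
          script:
            - npm ci
```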

Artifacts versus caches: when to use each

Artifacts pass outputs within a single run—compiled bundles, test reports, or packages. Caches persist directories across runs to accelerate repeated tasks like dependency downloads or compiler intermediates. Use artifacts to preserve provenance for downstream steps, and caches to optimize repeated work without changing the build result.
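The distinction in one step, as a sketch: the cache accelerates repeated installs and is safe to lose, while the artifact records exactly what this run produced.

```yaml
- step:
    name: Build
    caches:
      - node            # speeds up npm ci across runs; disposable
    script:
      - npm ci
      - npm run build
    artifacts:
      - dist/**         # this run's outputs, preserved for later steps
```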

How do you run unit testing, code quality, and security scans in Bitbucket Pipelines?

Testing and scanning benefit from containerized repeatability. Keep tooling versions pinned, run fast checks early, and publish structured outputs (like JUnit XML) as artifacts. Integrate code quality and dependency scanning into CI to catch regressions before deployment and anchor pull request gating on clear, auditable signals.

Unit testing across languages and libraries

Choose images with your test runtimes and install only required frameworks. Persist reports as artifacts for review and downstream processing. For integration tests, declare services for databases and seed data deterministically.

```yaml
pipelines:
  pull-requests:
    '**':
      - step:
          name: Node.js unit tests
          image: node:20-alpine
          caches: [node]
          script:
            - set -euo pipefail
            - npm ci
            - npm run test:ci -- --reporters=jest-junit
          artifacts:
            - reports/junit.xml
```
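For the integration-test case with a declared database service, a sketch (the `test:integration` script and database names are assumptions):

```yaml
definitions:
  services:
    postgres:
      image: postgres:16-alpine
      variables:
        POSTGRES_DB: app_test
        POSTGRES_PASSWORD: ci-only   # throwaway test credential

pipelines:
  branches:
    main:
      - step:
          name: Integration tests
          image: node:20-alpine
          services: [postgres]
          script:
            - set -euo pipefail
            - npm ci
            # the service container is reachable on localhost
            - DATABASE_URL=postgres://postgres:ci-only@localhost:5432/app_test npm run test:integration
          artifacts:
            - reports/**
```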

Code quality and dependency scanning

Run linters and formatters with fixed versions to ensure consistent results. Add dependency scanners suitable for your language ecosystem and treat high-severity findings as failures. Keep outputs as artifacts and link to them from pull requests for actionable triage without rerunning jobs.
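A sketch of a combined lint-and-audit step for a Node.js project (report paths are illustrative; `npm audit --audit-level=high` fails the step on high-severity advisories):

```yaml
- step:
    name: Lint and audit
    image: node:20-alpine
    script:
      - set -euo pipefail
      - npm ci
      # structured lint output for PR triage
      - npx eslint . --format junit --output-file reports/eslint.xml
      # treat high-severity dependency findings as failures
      - npm audit --audit-level=high
    artifacts:
      - reports/**
```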

Reporting and gating pull requests

Surface test and scan statuses directly on pull requests and require them for merge via branch protections. Store logs and structured reports as artifacts, and provide concise summaries with links so reviewers can dive deeper only when needed.

How can Bitbucket Pipelines build and deploy data and analytics services on Linux?

Most data services package as containers deployed to orchestrators. Bitbucket Pipelines can build Docker images, push them to a registry, and trigger your deployment system. Post-deploy hooks can run migrations, warm caches, rebuild indexes, and notify stakeholders by email or chat. Parameterize all environment details with variables to keep YAML generic and secure.

Building and pushing Docker images

Use steps that enable Docker to build and push images tagged by commit or build number. Keep Dockerfiles minimal and prefer multi-stage builds for deterministic outputs.

```yaml
pipelines:
  branches:
    main:
      - step:
          name: Build and push
          services: [docker]
          script:
            - docker build -t "$DOCKER_REPO/my-data-api:${BITBUCKET_COMMIT}" .
            - docker push "$DOCKER_REPO/my-data-api:${BITBUCKET_COMMIT}"
```

Deploying to environments and running migrations

Trigger deploys using your platform’s CLI or API, gating production on protected branches or tags. After rollout, run schema migrations and smoke tests before marking success. Keep credentials in secured deployment variables and avoid inline secrets.

```yaml
- step:
    name: Deploy to staging
    script:
      - ./infra/cli deploy --env=staging --image "$DOCKER_REPO/my-data-api:${BITBUCKET_COMMIT}"
      - ./infra/cli migrate --env=staging
      - ./infra/cli smoke-test --env=staging
```

Post-deploy tasks and notifications

Warm application caches, rebuild analytical aggregates, or precompute features if your service benefits from it. Send notifications via email or chat APIs with clear links to logs, image tags, and version notes so on-call engineers and stakeholders have immediate context.
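As one sketch of a chat notification, assuming a Slack-style incoming webhook stored as a secured variable named SLACK_WEBHOOK_URL (both the variable name and the webhook are assumptions):

```yaml
- step:
    name: Notify
    script:
      # Post a short deploy summary; -f makes curl fail the step on HTTP errors
      - >-
        curl -sf -X POST -H 'Content-Type: application/json'
        -d "{\"text\": \"Deployed ${BITBUCKET_COMMIT} to staging\"}"
        "$SLACK_WEBHOOK_URL"
```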

How do you optimize Bitbucket Pipelines for speed, cost, and reliability?

Optimization starts with identifying the longest paths and removing unnecessary work. Shrink images, tighten caches, and split independent tasks into parallel steps. Keep builds deterministic and transparent so incidents are debuggable. Measure step durations over time and adjust granularity as your repository and team evolve.

Speed tactics that respect determinism

Small, purpose-built images reduce cold-start time. Cache only safe, versioned dependencies, and store heavy outputs as artifacts to avoid recomputation. On branches, run incremental checks; on main, run the full suite. Avoid fetching large, rarely changed assets every run by pinning and caching them appropriately.

Managing concurrency and parallelism

Decompose long workflows into steps that can run in parallel, then merge via artifact consumption. Make scripts idempotent so retries don’t cause side effects. Use conditions and manual gates to avoid concurrent deployments to the same environment that could contend for state.
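A sketch of this fan-out/fan-in shape (the `make` targets are placeholders): two independent steps run in parallel, and a later step consumes their results.

```yaml
pipelines:
  branches:
    main:
      - parallel:
          - step:
              name: Unit tests
              script:
                - make test-unit
          - step:
              name: Lint
              script:
                - make lint
      - step:
          name: Package       # starts only after both parallel steps succeed
          script:
            - make package
          artifacts:
            - dist/**
```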

Observability and failure isolation

Annotate logs with clear phases and exit quickly on failures. Persist diagnostics—test reports, core dumps, or trace logs—as artifacts for postmortems. Quarantine flaky tests or unstable external dependencies into isolated steps to reduce noise and improve signal for reviewers.

Which Bitbucket Pipelines setup fits your repository and team?

Your choice depends on repository topology, release frequency, and ownership boundaries. A small service or library can rely on one lean pipeline, while a platform with multiple languages and deploy targets benefits from conditional pipelines and promotions. Evaluate governance, compliance, and on-call requirements before you standardize.

When a single pipeline per repo is sufficient

If you ship infrequently and support one primary language, a single linear pipeline with build, unit testing, and publish is often optimal. Use deployment variables for environment differences and gate publishing on main or signed tags. Keep scripts simple and visible to minimize maintenance overhead.

When to adopt multi-pipeline and environment promotions

For polyglot repos or multiple targets, define conditional pipelines per branch/tag and custom pipelines for manual promotions. Separate validate/build/test from deploy and enforce approvals for production. This structure reduces risk and aligns ownership with the services responsible for each deployable.
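A custom pipeline gives you a manual promotion path, triggered from the Bitbucket UI; this sketch reuses the hypothetical `./infra/cli` from the deploy example above:

```yaml
pipelines:
  custom:
    promote-to-production:       # run manually from the Pipelines UI
      - step:
          name: Deploy to production
          deployment: production  # picks up production-scoped variables
          script:
            - ./infra/cli deploy --env=production --image "$DOCKER_REPO/my-data-api:${BITBUCKET_COMMIT}"
```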

When to consider self-hosted runners

Evaluate runners when you need specialized hardware, private networking, or deeper control over execution. They let you run steps on your infrastructure while retaining Bitbucket Pipelines orchestration. Factor in operational effort, patching cadence, and security posture before adopting.
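Directing a step to a self-hosted runner is done with `runs-on` labels; a minimal sketch (the `make build` command is a placeholder):

```yaml
- step:
    name: Build on self-hosted runner
    runs-on:
      - self.hosted   # required label for self-hosted runners
      - linux
    script:
      - make build
```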

How Does Airbyte Help With Bitbucket Pipelines in Data Sync and Connector CI?

Once Bitbucket Pipelines is in place, you may want to orchestrate downstream data syncs or manage connector builds as part of CI/CD. Airbyte approaches this by exposing a REST API to trigger syncs and check job status, which you can call from a pipeline step after deployments or database migrations. This enables your application releases and operational data flows to stay coordinated.

One way to address connector lifecycle is through Airbyte’s containerized connectors. You can build and push connector Docker images in Bitbucket Pipelines as part of CI, test them, and publish to your registry. You can also optionally wait for a sync to complete and fail the pipeline when a job fails, giving a clear, automated gate on data readiness.
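As a sketch of triggering a sync from a pipeline step, assuming AIRBYTE_API_URL, AIRBYTE_API_TOKEN, and AIRBYTE_CONNECTION_ID are secured deployment variables (all three names, and the exact endpoint shape, are assumptions; check the API reference for your Airbyte deployment):

```yaml
- step:
    name: Trigger Airbyte sync
    script:
      # Kick off a sync for one connection; -f fails the step on HTTP errors
      - >-
        curl -sf -X POST "$AIRBYTE_API_URL/v1/jobs"
        -H "Authorization: Bearer $AIRBYTE_API_TOKEN"
        -H "Content-Type: application/json"
        -d "{\"connectionId\": \"$AIRBYTE_CONNECTION_ID\", \"jobType\": \"sync\"}"
```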

FAQs

What base image should I start with for Bitbucket Pipelines?

Choose the smallest image that includes your language runtime and required build tools. Slim images start faster and reduce surface area.

Can I run database-backed tests in Bitbucket Pipelines?

Yes. Add a service container for your database, seed test data in setup, and connect via the service hostname in your tests.

How should I store secrets for Bitbucket Pipelines?

Use repository or deployment variables marked as secured. Avoid printing them in logs and rotate credentials regularly.

What should I cache to speed up Bitbucket Pipelines?

Cache dependency directories keyed to lockfiles. Avoid caching build outputs that vary by environment or timestamp.

How do I surface test results in pull requests with Bitbucket Pipelines?

Store JUnit XML and coverage reports as artifacts and enable PR checks. Link to artifacts so reviewers can inspect details quickly.

Can Bitbucket Pipelines send notifications on pipeline events?

Bitbucket surfaces pipeline status natively. You can also call chat or email APIs from a post-step script to send targeted alerts.
