Improve GitLab Pipeline Performance with DAGs

Directed Acyclic Graph (DAG) style dependencies between individual jobs in a continuous deployment pipeline allow for a more flexible workflow and make better use of available computational resources.

Imagine a simple pipeline consisting of three jobs:

  1. A syntax check
  2. A code complexity check
  3. Running all unit tests

You may be tempted to group those into two stages: A) Build (consisting of jobs 1 and 2) and B) Test (consisting of the unit tests).

Traditional Sequences

In plain old GitLab pipelines, you would define that stage A needs to execute before stage B and everyone would be happy.
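As a rough sketch, this stage-based setup could be written in .gitlab-ci.yml as shown here. The job names complexity:build and unittests:test and the echo commands are placeholders chosen for illustration, not taken from a real project.

```yaml
# Stage-based ordering: every job in "build" must finish before "test" starts.
stages:
  - build
  - test

syntax:build:
  stage: build
  script:
    - echo "run the syntax check here"       # placeholder command

complexity:build:                             # hypothetical job name
  stage: build
  script:
    - echo "run the complexity check here"   # placeholder command

unittests:test:                               # hypothetical job name
  stage: test
  script:
    - echo "run the unit tests here"         # placeholder command
```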

The problem shows up when the syntax check is quite fast (let’s assume 30 seconds) while the code complexity check is very slow (say 4 minutes). The unit tests then have to wait max(30 sec, 4 min) = 4 minutes before they can start, which slows down the whole pipeline.

One way to optimize this is to move the unit tests into stage A. However, it clearly does not make sense to run the unit tests when there are syntax errors, since that would just waste computational resources. Therefore, this is not a feasible solution.

DAGs Allow More Fine-Grained Control

A better solution is to let the unit tests wait only for the fast syntax check and not for the code complexity check. DAGs allow this kind of fine-grained control.

In such a pipeline, the only dependency (shown as an arrow in the pipeline graph) exists between the unit tests and the syntax check. This means that the unit tests only wait until the syntax check has completed; they do not wait for the code complexity check.

Implementing such a DAG is easy. In the .gitlab-ci.yml file, use the “needs” keyword to specify dependencies (i.e., each arrow in the pipeline graph represents a “needs” relationship):

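  # Job names other than "syntax:build" are placeholders; script sections are omitted.
  syntax:build:
    stage: build

  complexity:build:
    stage: build

  unittests:test:
    stage: test
    needs: ["syntax:build"]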
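A fuller sketch of the same idea, including the stages list and script sections, might look like this. Only the job name syntax:build comes from the snippet above; complexity:build, unittests:test, and the echo commands are assumptions made for illustration.

```yaml
stages:
  - build
  - test

syntax:build:
  stage: build
  script:
    - echo "run the syntax check here"       # placeholder command

complexity:build:                             # hypothetical job name
  stage: build
  script:
    - echo "run the complexity check here"   # placeholder command

unittests:test:                               # hypothetical job name
  stage: test
  # Start as soon as syntax:build has finished, without waiting for
  # the other jobs in the build stage.
  needs: ["syntax:build"]
  script:
    - echo "run the unit tests here"         # placeholder command
```

With this configuration, the stage of each job still controls how it is grouped in the pipeline view, while “needs” decides when the unit tests are allowed to start.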

This simple optimization can drastically improve your pipeline performance: in the example, the unit tests can start after 30 seconds instead of 4 minutes.


Bernhard Knasmüller on Software Development