Kestra: Ops automation beyond CI/CD
CI/CD pipelines are optimized for code deployments. Long-running operational processes and self-service workflows can be orchestrated more flexibly with Kestra.
- Philip Lorenz
Monday, 9 AM. A developer needs a test server. They open the ticketing system, enter the hostname, operating system, and cost center, and wait. One day, sometimes three. Eventually, the VM lands in their inbox, configured according to the state from two weeks ago, because nobody has touched the template since then.
The reflex to solve this problem with existing tools is understandable. A pipeline in Azure DevOps or GitHub Actions can be written quickly. However, CI/CD tools are built for something else: building, testing, and deploying code. They follow a commit, run, and are done. Long-running workflows, retry logic, and state management are not their domain.
Kestra fills exactly this gap. The tool has been developed by Kestra Technologies since 2021 and has been publicly available since February 2022, primarily as an open-source project under the Apache 2.0 license. Commercial enterprise features complement the OSS core. Kestra sees itself as a universal orchestrator: for Ops automation, data pipelines, event-driven workflows, and increasingly also for AI agents and workflows, all declaratively in YAML, all versionable in Git. This article shows what this looks like in practice with a concrete example: A developer requests a Hetzner VM via a form, and Kestra provisions it fully automatically using Terraform. All code for Terraform, the Kestra setup, and the Kestra Flows can be found in the author's GitHub repository.
Orchestration is not the same as automation
When you first hear about Kestra, you might assume it is just another tool that wraps shell scripts in a nice UI. That is not accurate. The difference is conceptual. A shell script or a CI/CD pipeline runs sequentially: step A is followed by B, then C, and at the end the status is either finished or failed. What happens in between does not concern the tool. Kestra, on the other hand, orchestrates: it knows the state of each individual step, can wait for external events, handle errors in a targeted way, parallelize steps, and resume a workflow after hours or days exactly where it was interrupted.
This is not purely an academic distinction. In practice, it means that Kestra can map workflows that cannot be represented with classic pipeline tools: a provisioning job that waits until an external system is ready, a data pipeline that pauses for three minutes on an HTTP error and then tries again, a deployment that only proceeds after manual approval. What other tools call a “pipeline” is a “flow” in Kestra.
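Two of these patterns, retrying after an HTTP error and waiting for manual approval, can be sketched as a flow. This is a minimal sketch, not taken from the author's repository. It assumes the core task types io.kestra.plugin.core.http.Request and io.kestra.plugin.core.flow.Pause as named in recent Kestra releases; the flow ID, task IDs, and URL are placeholders, and property names can differ between versions.

```yaml
id: wait-and-approve              # hypothetical flow ID
namespace: demo

tasks:
  # Call an external system. If the request fails, try again every
  # three minutes, for at most 15 minutes in total.
  - id: check_backend
    type: io.kestra.plugin.core.http.Request
    uri: https://example.com/health   # placeholder URL
    retry:
      type: constant
      interval: PT3M
      maxDuration: PT15M

  # Without a configured delay, the execution stays paused
  # until someone resumes it manually.
  - id: wait_for_approval
    type: io.kestra.plugin.core.flow.Pause

  - id: proceed
    type: io.kestra.plugin.core.log.Log
    message: "Approved, continuing"
```

The retry block belongs to the individual task, not to the flow, so only the failed step is repeated.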
The development team behind the project deliberately positions Kestra for a broad range of applications. The main areas are:
- Ops Automation: Provisioning infrastructure, setting up clusters, automating recurring operational tasks.
- Data Pipelines: Kestra directly competes here with Apache Airflow and Prefect.
- Event-driven Workflows: Flows that react to webhooks, queue messages, or file changes.
- AI Workflows: Kestra orchestrates AI agents just like classic Ops tasks.
The Four Building Blocks
Kestra has four concepts that developers should internalize first to better understand the tool's overall approach.
A Flow is the highest unit, comparable to a pipeline definition, but with significantly more expressiveness. Each flow has an ID, a namespace for structuring, and a list of tasks. It exists as a YAML file in Git and can be versioned and deployed through it.
Tasks are the individual steps within a flow. Kestra comes with a comprehensive library of built-in task types: HTTP requests, file operations, shell commands, container execution, loops, conditions, and parallelization. Tasks can produce outputs that subsequent tasks use as inputs. This creates a traceable data flow between the individual steps of the flow.
Triggers define when a flow starts. Options include schedules (cron), webhooks, other flows as triggers, or manual execution via the UI. The manual trigger should not be underestimated: it automatically opens an input form in the Kestra UI for all declared inputs.
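As a sketch of two of the trigger types named above, the following fragment would sit at the top level of a flow, next to its tasks. It assumes the trigger types io.kestra.plugin.core.trigger.Schedule and io.kestra.plugin.core.trigger.Webhook from recent Kestra releases; the trigger IDs and the webhook key are placeholders.

```yaml
triggers:
  # Start the flow every Monday at 09:00.
  - id: monday_morning
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * 1"

  # Start the flow when an HTTP request arrives at the
  # webhook URL that contains this key.
  - id: on_request
    type: io.kestra.plugin.core.trigger.Webhook
    key: replace-with-a-random-key   # placeholder, treat like a password
```

A flow without any triggers block can still be started manually via the UI.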
Plug-ins extend Kestra with external integrations. The official plug-in ecosystem includes Terraform, Kubernetes, AWS, Azure, GCP, Slack, dbt, and Airbyte, among others. Plug-ins are versioned JARs that Kestra loads at startup, without separate dependency management or pip install.
The following listing shows a minimal flow:
id: hello-kestra
namespace: demo

inputs:
  - id: message
    type: STRING
    defaults: "Hello from Kestra"

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ inputs.message }}"
This flow demonstrates the essentials: The namespace demo structures flows in the UI similarly to folders. The input message appears as a text field during manual execution. The task references the input using Pebble template syntax {{ inputs.message }}. The same syntax works everywhere in the flow for secrets, for outputs of previous tasks, and for execution metadata.
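The other uses of the template syntax can be sketched in the same way. The following flow is an illustration under assumptions: it uses the core task io.kestra.plugin.core.debug.Return, which exposes its rendered text as the output value, and the IDs are made up for this example.

```yaml
id: expressions-demo              # hypothetical flow ID
namespace: demo

tasks:
  # Produce an output that later tasks can reference.
  - id: build_name
    type: io.kestra.plugin.core.debug.Return
    format: "vm-{{ execution.id }}"      # execution metadata

  # Read the output of the previous task and flow metadata.
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "Flow {{ flow.id }} produced {{ outputs.build_name.value }}"

  # Secrets are read with the secret() function, for example
  # "{{ secret('API_TOKEN') }}"; they should not be written to logs.
```

Task IDs with underscores keep the dot notation in outputs simple; IDs containing hyphens need bracket notation instead.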
Not a Replacement, but a Supplement
The question inevitably arises: Don't we already have Azure DevOps? Do we really need this? The answer is yes and no. Azure DevOps and GitHub Actions remain unbeaten at what they do: compiling code, running tests, building containers, deploying artifacts. Anyone trying to replace them for these tasks is solving the wrong problem, because Kestra addresses a different area.
CI/CD remains the right layer for everything that comes from the repository: build, test, deliver. However, many operational processes start with a request rather than a commit: an environment needs to be created, a change needs to be reviewed, a resource needs to be released, or an external service needs to be integrated. This is precisely where Kestra complements the existing CI/CD landscape.
A concrete example: A developer wants to provision a Kubernetes cluster. This can be technically implemented in an Azure DevOps pipeline with terraform apply as a script task. What's missing is everything else: there's no structured input form for developers, and no clean retry logic if the cloud provider doesn't respond. There's also no log showing who requested which resource and when. Most importantly, a pipeline treats Terraform in a “fire and forget” manner: start script, let it run, hope for a result – what happens in between is not traceable. Kestra, on the other hand, completely encapsulates each execution with its parameters: state, logs, and retry behavior are clearly assigned to a specific execution and are traceable in the UI. If a step fails, Kestra knows where, with what inputs, and in what state, and can react specifically instead of blindly starting over.
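How inputs, retry behavior, and a Terraform run come together in one execution can be sketched as follows. This is not the flow from the author's repository but a minimal sketch under assumptions: it uses the TerraformCLI task from the official Terraform plug-in, expects the .tf files to be stored as namespace files, and passes the Hetzner token through the HCLOUD_TOKEN environment variable. The flow ID, the input names, and the server types are placeholders.

```yaml
id: provision-vm                  # hypothetical flow ID
namespace: platform

inputs:
  # Declared inputs are rendered as a form on manual execution.
  - id: hostname
    type: STRING
  - id: server_type
    type: SELECT
    values:
      - cx22                      # placeholder server types
      - cx32
    defaults: cx22

tasks:
  - id: terraform_apply
    type: io.kestra.plugin.terraform.cli.TerraformCLI
    namespaceFiles:
      enabled: true               # assumes the .tf files live in the namespace
    env:
      HCLOUD_TOKEN: "{{ secret('HCLOUD_TOKEN') }}"
      # TF_VAR_ variables are picked up by Terraform as input variables.
      TF_VAR_hostname: "{{ inputs.hostname }}"
      TF_VAR_server_type: "{{ inputs.server_type }}"
    beforeCommands:
      - terraform init
    commands:
      - terraform apply -auto-approve
    # Retry if the provider does not respond.
    retry:
      type: constant
      interval: PT1M
      maxDuration: PT10M
```

A real setup would also need a remote state backend so that the Terraform state outlives the container in which the task runs.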
Comparison with Airflow and Prefect: The Perspective Matters
Which alternatives to Kestra come into question depends on the perspective, and that differs significantly between data engineering and platform engineering. Those coming from data engineering usually think first of Apache Airflow, now a top-level project of the Apache Software Foundation, or Prefect, developed by the company of the same name. Airflow is the established standard for DAG-based data pipelines with dependencies, scheduling, and retry logic. Prefect is a more modern alternative with lower operational overhead and a leaner API. Kestra occupies the same space but is less Python-centric: flows are defined in YAML, tasks are plug-in calls, and Python knowledge is not required. This makes Kestra particularly accessible to mixed teams.
In platform engineering, however, completely different tools are used. Common ones include Argo Workflows as a Kubernetes-native orchestrator, the Ansible Automation Platform for operational tasks and configuration management, or Terraform Cloud for Infrastructure-as-Code workflows. All these tools solve real problems, but each is firmly anchored in its paradigm. Terraform Cloud does Terraform, Argo does Kubernetes workflows, the Ansible Automation Platform does Ansible. Kestra deliberately aims to be more universal: a flow can start a Terraform container, then execute a PowerShell script, call an HTTP API, and report the result in Slack – all in a single YAML flow, without the tool needing to consider the underlying technologies.
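Such a mixed flow can be sketched roughly as follows; the Terraform step is left out for brevity. The task types are assumptions based on the official script and notification plug-ins, and their package names have changed between plug-in versions. The flow ID, the URL, the request body, and the secret name are placeholders.

```yaml
id: mixed-toolchain               # hypothetical flow ID
namespace: platform

tasks:
  # Run a PowerShell command in a container.
  - id: collect_info
    type: io.kestra.plugin.scripts.powershell.Commands
    commands:
      - Write-Output "Collecting host information"

  # Call an HTTP API.
  - id: register_host
    type: io.kestra.plugin.core.http.Request
    uri: https://example.com/api/hosts   # placeholder URL
    method: POST
    contentType: application/json
    body: '{"hostname": "test-01"}'

  # Report the result to a Slack channel.
  - id: notify
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    payload: '{"text": "Flow {{ flow.id }} finished (execution {{ execution.id }})"}'
```

Each step is a separate task with its own logs and state, regardless of which technology runs behind it.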
From a developer's perspective, Azure DevOps and GitHub Actions remain the first choice for the CI/CD layer. Airflow or Prefect can continue to serve their purpose in data-driven environments. Kestra fills the space where none of these tools is at home: as an orchestration layer that holds heterogeneous tool landscapes together. It can, however, also replace individual tools.
Test Locally, Operate in the Cluster
Kestra can be started locally using Docker Compose. The complete docker-compose.yml is available in the accompanying GitHub repository. The command docker compose up -d is sufficient; the UI is then immediately accessible at http://localhost:8080.
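For orientation, a single-container Compose file could look like the following. This is not the file from the repository but a minimal sketch, assuming the public kestra/kestra image and its server local mode with an embedded database, which is meant for trying things out and not for production.

```yaml
services:
  kestra:
    image: kestra/kestra:latest
    # "server local" starts all components in one process
    # with an embedded database.
    command: server local
    user: root
    ports:
      - "8080:8080"
    volumes:
      # Lets Kestra start task containers on the host's Docker daemon.
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp:/tmp
```

A setup closer to production would add a separate database service such as PostgreSQL and persistent storage.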
For production operation on Kubernetes, there is an official Helm chart. The components (server, scheduler, executor, and worker) run as separate deployments and can be scaled independently. Workers handle the actual task execution and are the primary scaling lever: more parallel executions require more worker replicas, while the rest of the infrastructure remains unchanged. For those who also want to manage Kestra resources (flows, namespaces, secrets, users, and roles) as code, there is the official Terraform provider: it maps the Kestra configuration to Terraform resources and can be integrated into existing Infrastructure-as-Code pipelines.