Decisions and RFCs
RFC process
Want to write a new, technical RFC? Please follow the RFC and decision making process.
Existing RFCs
| Created (newest first) | Title | State |
|---|---|---|
| 2026-01-30 | Wildcard TLS Certificates for Envoy Gateway on Management Clusters Use wildcard TLS certificates on management clusters with Envoy Gateway to eliminate per-service domain configuration in gateway-api-config. | approved |
| 2026-01-21 | Teleport Cluster Version Upgrade Testing Pipeline Establish an automated testing pipeline to validate Teleport cluster version upgrades before applying to production. Deploy an ephemeral Teleport cluster with the current production version, connect ephemeral MCs, validate connectivity, upgrade to the target version, and re-validate. | review |
| 2025-12-11 | Using semVer tags for automatic app upgrades in different release stages We want to use the automatic upgrade capabilities of Flux and flux-operator to roll out app upgrades across the different release stages, so we don’t have to handle those rollouts manually or through extra automation. | approved |
| 2025-11-27 | Semantic Versioning of Upstream Software Describes our approach for versioning our own packages of upstream software. | review |
| 2025-11-11 | Move collections to management-cluster-bases, introduce shared collection and support stages Move collections to management-cluster-bases, introduce shared-collection and support stages. | approved |
| 2025-07-15 | Strategic placement and standardization of operational runbooks This RFC addresses the strategic placement of operational runbooks to improve accessibility, maintainability, and security while ensuring consistent structure and content sanitization. Part of the work involves renaming the “Ops recipes” to “Runbooks” for clarity and alignment with industry standards. | approved |
| 2025-02-25 | Title This RFC proposes a new label to identify pods by their platform subsystem, aiming to simplify operations and maintenance. The label will use free-form text values, with a documented list of common values. Implementation includes adding the label to a repository and documentation. | review |
| 2025-02-04 | Deploy observability silences using GitOps Deploy observability silences using GitOps | review |
| 2024-12-06 | Configuration rendering controller This RFC describes the vision to evolve our current configuration management system to make it more flexible, reusable, locally reproducible, and easier to extend in the future while making other parts of our platform slightly simpler. | approved |
| 2024-09-26 | App prefix Decide whether apps in CAPI clusters should have a prefix, and whether we need to enforce it. | approved |
| 2024-05-22 | Structured way to propose, discuss, and formalize technical decisions within an organization This contains the process for proposing, discussing, and formalizing technical decisions. It also introduces the RFC structure. | approved |
| 2024-02-19 | Revamp our docs Revamp our docs to describe the Dev Platform product, considering the new Cluster API (CAPI) architecture. Temporarily move the vintage content to a subpath and create the new content at the top level. The docs entry point can still point to the old content until the renovation is over. | approved |
| 2023-10-26 | Importing EKS/AKS/GKE clusters to CAPI using crossplane So that Giant Swarm can import/adopt customer clusters on bring-your-own infrastructure, use Crossplane ObserveOnly functionality to discover customers’ existing infrastructure without managing it. Use the clusters.x-k8s.io/managed-by: crossplane annotation to prevent CAPI from reconciling clusters. Do not rely on “paused” objects. | approved |
| 2023-10-12 | Leaving docker hub and simplifying registries architecture Switch to Azure Container Registry (ACR), even for China. Instead of replicating images across our multiple registries, trust this single provider to solve high availability. Run a local pull-through proxy to fall back on during a provider outage. | approved |
| 2023-10-10 | RFC and decision making process This contains the explicit procedure to follow for creating an RFC and having it reviewed. Introduce a structured YAML header for the Markdown file. List of RFCs gets rendered in the handbook. | approved |
| 2023-10-09 | PSS migration orchestration Describes the implementation of early Policy API features to assist with customer migrations to Kyverno-enforced Pod Security Standards. | approved |
| 2023-10-09 | Policy Orchestration System Introduces the Policy API as an abstraction for declaratively managing several external tools through a single customer interface. | approved |
| 2023-08-24 | Extension to Giant Swarm CRD management via Flux Each management cluster gets its own crds Flux kustomization for flexibility in CRD management, replacing the shared all composite with a common one. Vintage provider-specific CRDs are versioned based on the apiextensions library. | approved |
| 2023-08-07 | Manage essential CRDs via MCB Use kustomize in management-cluster-bases to manage essential CRDs (Giant Swarm, Flux, VPA, etc.) in a unified way. Replace opsctl ensure crds and provide a framework for adding any new CRDs without further migrations. | approved |
| 2023-05-31 | Simplify baseDomain usage in our applications | approved |
| 2023-05-08 | Default PSS and Policy Exceptions with Kyverno Outlines the suggested replacement of Pod Security Policies with a Kyverno-backed implementation of the official Pod Security Standards guidelines in Giant Swarm clusters. | approved |
| 2023-03-01 | Container registry configuration Since Docker Hub has an image download rate limit which can lead to unhealthy clusters, configure containerd to use our Azure Container Registry as a mirror and keep Docker Hub as primary. Also make this the default on WCs. Use a per-MC authenticated Giant Swarm account for our registries to avoid public pulls using up all our limits. See the RFC for details on secret handling and the cluster chart values to configure this. | obsolete |
| 2023-01-25 | Ensure no single point of failure in management cluster access Introduce Azure AD as a second SSO identity provider alongside GitHub to eliminate the single point of failure for management cluster access. Keep 1Password kubeconfig as an emergency fallback and integrate it into opsctl login. Automate SSO setup for both management and workload clusters. | approved |
| 2023-01-12 | Logging infrastructure Adopt a distributed Loki setup with one instance per installation to comply with customer data residency requirements. Evaluate object storage solutions per provider and build a POC starting with one provider. | approved |
| 2022-11-09 | Crossplane MVP on Management Clusters | approved |
| 2022-10-18 | Making parts of the intranet public | approved |
| 2022-09-07 | Assigning installation names | approved |
| 2022-07-08 | SIG Meeting Improvement Initiative | approved |
| 2022-06-20 | Multi layer app configs | approved |
| 2022-05-11 | Classifying clusters based on priority Use the label giantswarm.io/service-priority={highest,medium,lowest} on Cluster objects to specify the importance of a customer cluster. | approved |
| 2022-04-15 | Automatic App upgrades Use Flux’s watch features such as ImagePolicy to automatically upgrade to newer app versions. This change was not performed, but we use *-collection repos (on MCs) and cluster default apps (on MCs/WCs) instead, so this RFC is obsolete. | obsolete |
| 2022-04-01 | Merging config in a gitops context | approved |
| 2022-03-24 | RFCs Related to Kyverno Policy Management and Deployment RFCs related to how Giant Swarm stores, versions, and shares Kyverno policies with customers. Partially superseded by Policy API concepts. | obsolete |
| 2022-02-15 | A better customer email management solution Add the alias support+customer@giantswarm.io to forward e-mail to the customer’s Slack channel. Also add the alias urgent+area@giantswarm.io to forward to the area in Opsgenie. This change was introduced, but later reverted, so this RFC is obsolete. | obsolete |
| 2021-11-24 | Managed Apps Vision Spreads managed app ownership from one team to multiple teams for holistic ownership, enabling deeper operations, automated upgrades, and reduced customer operational load. | obsolete |
| 2021-11-10 | Configuration management with Cluster API Config maps to represent YAML templates in CAPI cluster releases. Share a Git repo for configuration with the customer. Later add cluster definitions to GitOps as well. | obsolete |
| 2021-09-17 | Automatic workload cluster upgrades As part of the Cluster API hackathon, we brainstormed how to automate cluster upgrades, taking into account customer requirements and the technical limitations introduced by the new upstream implementation. | approved |
| 2021-07-26 | Monitoring System End To End Tests | approved |
| 2021-07-23 | Defaulting of CAPI clusters with webhooks Use Kubernetes mutating webhooks to centralize CAPI cluster defaulting. Webhooks source default values from config-controller, enabling consistent defaults across all cluster creation methods. | obsolete |
| 2021-07-19 | Road to Cluster API (over the potholes) | approved |
| 2021-07-12 | Enable customers to use gitops in management clusters | approved |