Extension to Giant Swarm CRD management via Flux

Created: 2023-08-24
State: approved
Summary: -

Related to: Manage essential CRDs via MCB

Context

The original solution leaves no room for provider-specific resources: the crds Flux kustomization reconciles bases/crds in the CMC, meaning all management clusters in a given CMC have the same set of CRDs installed. (Note that with the current solution the crds kustomization has to point to a file in the CMC, because the Flux source is the CMC repository; from there we point to the remote resources in MCB.)

This is fine for CAPx-based management clusters: their provider-specific CRDs are managed separately in cluster-api-app, and in mc-bootstrap we install the common set of Giant Swarm CRDs on all CAPx clusters.

For vintage clusters, however, there are provider-specific CRDs located in apiextensions for aws, azure and kvm.

Solutions

Each MC has their own crds kustomization

Each management cluster in each CMC repository already has a management-clusters/<MC_NAME>/catalogs kustomization. The same convention is used for flux-extras and crossplane-providers.

We follow this pattern: each MC references its own set of CRDs to install before everything else.

Since all CAPx clusters are the same and need nothing special, they can reference the all set from MCB.

For vintage, we can prepare the provider-specific CRDs so that they only need to be referenced where needed.
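As a rough sketch of this convention (not the actual manifests), a per-MC crds setup could consist of a Flux Kustomization pointing into the CMC repository, plus a kustomization.yaml at that location that pulls the shared CRDs from MCB. The namespace, source name, interval, prune setting, and the MCB URL and path below are assumptions made for illustration.

```yaml
# Flux Kustomization reconciling the per-MC crds directory.
# All names and values here are hypothetical.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-giantswarm        # assumption
spec:
  interval: 1h                      # assumption
  path: ./management-clusters/<MC_NAME>/crds
  prune: false                      # assumption: do not garbage-collect CRDs
  sourceRef:
    kind: GitRepository
    name: cmc-repository            # assumption: the existing CMC source
---
# management-clusters/<MC_NAME>/crds/kustomization.yaml in the CMC repository.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # A CAPx cluster references the shared set from MCB (hypothetical URL and path).
  - https://github.com/<ORG>/<MCB_REPO>//bases/crds/all?ref=main
```

With this layout, adding or removing CRDs for a single MC would only require editing that MC's kustomization.yaml.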

Pros

  • We already do something like this with catalogs, and with flux-extras and crossplane too. We set the convention that something needs to exist at a specific location, which is good guidance both for UX and for support.
  • Provides customers a nice extension point
  • Easy to use, very generic solution
  • Easy to implement (maybe some hassle with auto_branches as always)

Cons

Unique crds kustomization per MCB bases/provider via new MCB flux source

Instead of a single, concrete crds Flux kustomization resource, there is only a convention that a kustomization called crds exists. Each provider in MCB under bases/provider is then responsible for creating it.

For this to work we need a new MCB Flux source that is ideally deployed to all clusters, though for this issue alone we could deploy it to vintage clusters only. However, there are already other use cases where having the MCB source in the cluster would be useful, such as triggering reconciliation of e.g. the flux kustomization immediately when MCB changes.
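To make the idea concrete, the following is a minimal sketch of what an MCB Flux source and a provider-owned crds kustomization could look like. The source name, namespace, URL, branch, intervals, and the path under bases/provider are assumptions, not the actual layout of MCB.

```yaml
# New Flux source pointing at the MCB repository (hypothetical name and URL).
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: management-cluster-bases    # assumption
  namespace: flux-giantswarm        # assumption
spec:
  interval: 5m                      # assumption
  url: https://github.com/<ORG>/<MCB_REPO>
  ref:
    branch: main                    # assumption
---
# crds kustomization created by the provider base in MCB; only the name
# "crds" is fixed by the convention, the path is a hypothetical example.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-giantswarm        # assumption
spec:
  interval: 1h                      # assumption
  path: ./bases/provider/<PROVIDER>/crds
  prune: false                      # assumption
  sourceRef:
    kind: GitRepository
    name: management-cluster-bases  # reconciles directly from the MCB source
```

Because the kustomization reads from the MCB source directly, a change in MCB would be picked up without waiting for the CMC source to be reconciled.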

Related links to reconciliation trigger by different source workaround:

Pros

  • Sort of drop-in and implicit for customers (though this can be seen as a con too in some use cases)
  • We probably need the MCB source soon anyway

Cons

  • We take away control and the extension point from customers
  • Complex and leads to some duplication in MCB repository

Sidestep - Handle Vintage provider specific CRDs from new crds-vintage kustomization

In MCB we create a new Flux kustomization for the vintage providers that points to a k8s kustomization in MCB, which in turn lists the resources stored in the respective helm/crds-<PROVIDER> folder in apiextensions.
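A sketch of this sidestep, under stated assumptions, might look as follows. Only the name crds-vintage and the helm/crds-<PROVIDER> folder come from the text above; the namespace, source name, MCB path, and the exact URL form and file layout inside apiextensions are hypothetical.

```yaml
# Additional Flux Kustomization deployed to vintage clusters only.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds-vintage
  namespace: flux-giantswarm        # assumption
spec:
  interval: 1h                      # assumption
  path: ./bases/crds/vintage/<PROVIDER>   # hypothetical path in MCB
  prune: false                      # assumption
  sourceRef:
    kind: GitRepository
    name: management-cluster-bases  # assumption: the MCB source
---
# k8s kustomization in MCB at the path above, listing the provider CRDs.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # One entry per CRD manifest; file names and URL form are placeholders.
  - https://raw.githubusercontent.com/<ORG>/apiextensions/<REF>/helm/crds-<PROVIDER>/<CRD_FILE>.yaml
```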

Pros

  • Easier to implement, smaller impact
  • Isolated to work on an extension to vintage clusters only, which also makes it easier to clean up later
  • We probably need the MCB source soon anyway

Cons

  • Not a generic solution, should CAPx clusters need something like this in the future
  • A new Flux kustomization is added to vintage clusters, which also means a new dependency for the flux kustomization
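The extra dependency mentioned in the last point could be expressed with dependsOn, as in this sketch. It assumes that a Flux Kustomization named flux exists and already depends on crds; the namespace, path, and source name are hypothetical.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux
  namespace: flux-giantswarm        # assumption
spec:
  interval: 1h                      # assumption
  path: ./management-clusters/<MC_NAME>/flux   # hypothetical path
  prune: false                      # assumption
  sourceRef:
    kind: GitRepository
    name: cmc-repository            # assumption
  dependsOn:
    - name: crds                    # assumed existing dependency
    - name: crds-vintage            # new dependency, vintage clusters only
```

Flux applies a kustomization only after everything listed in dependsOn is ready, so on vintage clusters the provider CRDs would be in place before flux is reconciled.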

Decision

The chosen solution is: “Each MC has their own crds kustomization”

This gives us a lot of flexibility in the future: for example, we can easily make exceptions or do rolling migrations.

Additionally, we decided to version the vintage CRDs based on the apiextensions library, because some vintage clusters still seem to run slightly older versions of the CRDs. The common giantswarm and flux-app CRDs are kept at latest for now, shared across all clusters in the same state, but nothing prevents us from versioning them in the long run as well.

Since the exception mechanism means we can no longer really talk about an all composite kustomization, it is replaced by the common composite, which will contain giantswarm and flux-app for now.
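Under this decision, the crds kustomization.yaml of a vintage MC might look roughly like the sketch below: the common composite tracked at latest, plus provider CRDs pinned to an apiextensions version. The repository URL and paths are assumptions, and the version is deliberately left as a placeholder.

```yaml
# management-clusters/<MC_NAME>/crds/kustomization.yaml for a vintage MC (sketch).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Common composite (giantswarm and flux-app CRDs), kept at latest.
  - https://github.com/<ORG>/<MCB_REPO>//bases/crds/common?ref=main
  # Provider-specific vintage CRDs, versioned after the apiextensions
  # library; the path scheme is hypothetical.
  - https://github.com/<ORG>/<MCB_REPO>//bases/crds/<PROVIDER>/<APIEXTENSIONS_VERSION>?ref=main
```

A CAPx MC would list only the common entry.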

Next steps

Implement the solution and migrate all CMC repositories, improve mc-bootstrap, and finally get rid of opsctl ensure crds.

Last modified November 21, 2023: Update rendered RFCs (#176) (7f6b6e4)