
RFC : How Do We Best Leverage K8s as a Runtime? #22

Closed

Conversation

topherbullock
Member

Proposal

This is an initial proposal to consider whether Tekton or native K8s resources would be the better path forward for Concourse's K8s runtime implementation.

To make the path forward clearer, we may want to include a POC which spells out, in practical terms, how a build plan would be converted to a set of K8s objects.

Signed-off-by: Topher Bullock [email protected]

Signed-off-by: Topher Bullock <[email protected]>
@topherbullock topherbullock changed the title RFC : How Do We Best Leverage K8s as a Runtime RFC : How Do We Best Leverage K8s as a Runtime? Mar 28, 2019
@cirocosta
Member

cirocosta commented Mar 28, 2019

Thanks for putting this up, @topherbullock!

Some quick (and possibly stupid) thoughts:

Is it right to rephrase those three "problems" into

  • "volume" (produced by steps) -> image (consumed by k8s)
  • task configs that get dynamically discovered (need to create "units of execution" as their details are discovered), and
  • storing & consuming inputs & outputs

?

In terms of 1 (volume --> image), it feels like once we figure out 3 (storing & consuming inputs & outputs), 1 gets solved by providing an interface to those bits that k8s can consume (i.e., the OCI Distribution spec).

For 2 though (dynamically discoverable configs), 🤷‍♂️ seems quite hard 🤔

does that make any sense?

Thanks!

@topherbullock
Member Author

Yeah, that rephrasing makes sense to me @cirocosta!

It seems like using outputs as images is in the Tekton backlog, so this would get us a little further if they implement it (or be a good place for us to jump in to contribute): tektoncd/pipeline#639 , tektoncd/pipeline#216

I think for now we can start looking at a simple pipeline to exercise those use cases, and move forward with a POC using Tekton (and maybe later do a bake-off experiment against native K8s) to run a single build of a job in that pipeline.

Here's the simplest concourse example that exercises the first case of using images:

```yaml
---
resources:
  - name: ubuntu
    type: registry-image
    source:
      repository: ubuntu

jobs:
  - name: fetch-and-use-ubuntu
    plan:
      - get: ubuntu
      - task: use-it
        image: ubuntu
        config:
          platform: linux
          run:
            path: echo
            args: ["Hello, world!"]
```

@ddadlani

ddadlani commented Apr 4, 2019

Here are some things @chenbh and I discovered during our investigation:

Things we've found:

  • in a tekton Task, the workspace volume is shared by all steps (different containers on the same pod)
  • in a tekton Pipeline, persistent volumes can be shared explicitly, either as PipelineResources or as PVCs. This supports the use case of passing the output of one Concourse step as the input of another (see the PVC sketch after this list).
  • for the above two reasons we believe the tekton Pipeline is a better abstraction for a Concourse job
  • Volume caching can be imitated through PVCs; however, things like COW volumes are not directly supported. We would need to default to creating a copy of a volume every time it is used.
  • we're able to attach (for intercept) and cp (for uploading one-off build artifacts) via kubectl directly
  • the ATC basically executes a bunch of kubectl commands
  • taints and tolerations could perhaps be used for team workers / worker tags (changes pod affinity for a given node)
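
To make the volume-sharing point concrete, here's a minimal sketch of passing an output between two Tasks through a shared PVC. All of the names are made up for illustration, and it glosses over PVC lifecycle/GC entirely:

```yaml
# hypothetical PVC standing in for a Concourse output/input volume
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: build-artifacts
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: produce-output
spec:
  steps:
    - name: write
      image: ubuntu
      command: ["sh", "-c", "echo hello > /output/result"]
      volumeMounts:
        - name: artifacts
          mountPath: /output
  volumes:
    - name: artifacts
      persistentVolumeClaim:
        claimName: build-artifacts
---
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: consume-input
spec:
  steps:
    - name: read
      image: ubuntu
      command: ["cat", "/input/result"]
      volumeMounts:
        - name: artifacts
          mountPath: /input
  volumes:
    # same claim: the output of produce-output becomes the input of this task
    - name: artifacts
      persistentVolumeClaim:
        claimName: build-artifacts
```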

tekton Qs:

  • Can we define custom PipelineResources
  • How to mount volumes from one Task to another (nop PipelineResource)
    • PersistentVolumeResource?
  • Easy/painless way to pipe to stdin on a pod/container
  • Where are the PipelineResource implementations
  • Any way to override the default PVC size for a PipelineResource (I wanna pull a 100gb image)

k8s Qs:

  • COW PVCs? How do?

limitations:

  • No current way of pre-populating a readonly disk (COW) (may be possible with some resources)
  • tekton does not clean up after itself (one-time use objects such as PipelineRuns stick around)
  • do we support the use case where an ATC is deployed outside of the cluster? How do we handle stdin/out redirection (e.g. for check containers)?
  • PipelineResources do not currently support both gets and puts (different for every PipelineResource implementation)
  • Persistent Volumes/Claims/Disk lifecycle needs to be explicitly managed
  • no parallel Tasks (currently; the Tekton roadmap says it'll be implemented sometime in 2019)
  • resizing PVCs for larger gets? (alpha in k8s 1.13, not in tekton yet; see the StorageClass sketch after this list)
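
On the PVC-sizing point, plain k8s does at least expose a knob for expansion. A sketch, assuming a provisioner that supports resizing (all names made up):

```yaml
# hypothetical StorageClass that permits resizing claims after creation
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: kubernetes.io/gce-pd   # assumption: any provisioner that supports expansion
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: big-get
spec:
  storageClassName: expandable
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi   # bump this and re-apply to request a resize
```

Whether Tekton would ever let us plug a claim like this into a PipelineResource is still an open question.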

Concourse Qs:

  • How do we determine the amount of resources to allocate to a Task/Pipeline/Pod/PVC

@chenbh
Contributor

chenbh commented Apr 9, 2019

@kcmannem and I did some more digging

  • creds:

    • kubernetes provides an interface for specifying which secrets to load into a volume, but this cannot be reused to adapt Concourse credential management.
    • can be passed in as env vars
  • baggageclaim:

    • k8s allows creation of PersistentVolumeClaims which can be mounted onto any pod/container
    • must be backed by a PersistentVolume (k8s talk for vendor specific hardware disk)
    • will need something similar to baggageclaim for managing k8s pvc/pv
  • garden:

    • containers:
      • replaced completely by k8s runtime
      • see lifecycle of pods
    • images:
      • must come from an image repository (no rootfs_uri)
        • can specify any reachable repo (not limited to hardcoded sources)
        • concourse will have to extract the name from the resources section and just put it in the tekton task definition
      • will be cached on the nodes (vms) by kubernetes
      • does not support progress bars for pulling images
  • lifecycle management:

    • volumes
      • persistent volume claims:
        • concourse will need to handle the creation, mounting, and gc of pvc and potentially the underlying pv
        • can be dynamically generated with bare bones params (size)
      • persistent volumes:
        • underlying hardware/API is IaaS controlled
        • have the ability to outlive the k8s cluster (not sure if we care about this)
        • explicitly defined (name, file system, permissions, size)
    • pods
      • goes away after the taskrun it was created from is deleted
      • stays around after execution finishes in a 'Completed' state
      • cannot ssh into a 'Completed' pod
      • can grab the definition of a 'Completed' pod
      • kubernetes shouldn't allow anybody to restart a 'Completed' pod (more investigation needed; important in case credentials were put on the pod)
    • tekton tasks
      • lives independent of taskruns and pipelines
      • must be manually deleted
    • tekton taskruns
      • dependent on pipelineruns
      • on delete cascade
      • must be deleted and re-applied if trying to trigger with no configuration changes
    • tekton pipelines
      • lives independent of pipelineruns
      • must be manually deleted
    • tekton pipelineruns
      • will not remove itself on completion
      • must be manually deleted
      • must be deleted and re-applied if trying to trigger with no configuration changes
      • can just update name to force re-trigger pipeline
  • concourse:

    • workers:
      • no longer needed
      • no external workers
      • same with tsa
    • tracking state of the kubernetes cluster
      • do we want to keep the last known state of the k8s stuff in the concourse db?
      • or just rely on multiple calls to kubectl to get the current state every time?
    • timing:
      • everything is done asynchronously: applying returns immediately on successfully uploading the configuration; we need to poll to figure out whether the pod was created/finished successfully
  • fly:

    • hijacking
      • currently garden and docker provide a workdir to run the hijacked process (/bin/bash) from. We noticed kubectl has a similar command, kubectl exec, that maps to docker's docker exec;
        however, kubectl doesn't provide a -w flag, so we can't preset the workdir when we hijack.
      • we found out that the pod spec allows you to specify a workingDir, which will be respected when you exec into a container of that image (see the pod sketch at the end of this comment)
    • commands that would no longer make sense
      • workers
      • land-worker
      • prune-worker
      • check-resource-type -> for custom resources and is used to check against an image repo
    • commands that might need to be repurposed/redesigned
      • containers
      • volumes
      • clear-task-cache
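
To tie two of the points above together (creds passed in as env vars, and workingDir for hijacking), here's a minimal pod sketch. All of the names are made up for illustration:

```yaml
# hypothetical step container: credentials injected as env vars from a Secret,
# and workingDir set so `kubectl exec` lands in the build's working directory
apiVersion: v1
kind: Pod
metadata:
  name: example-step
spec:
  restartPolicy: Never
  containers:
    - name: task
      image: ubuntu
      workingDir: /tmp/build            # respected by `kubectl exec`, no -w flag needed
      command: ["sh", "-c", "echo $SECRET_TOKEN"]
      env:
        - name: SECRET_TOKEN
          valueFrom:
            secretKeyRef:
              name: pipeline-creds      # assumed Secret name
              key: token
```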

@topherbullock
Member Author

topherbullock commented Apr 10, 2019

Thinking more about build logs / piping stdin, stdout, stderr :

Easy/painless way to pipe to stdin on a pod/container

This would be possible using the attach / exec API, which isn't properly supported by the current k8s go client. :c

do we support the use case where an ATC is deployed outside of the cluster? How do we handle stdin/out redirection (e.g. for check containers)?

We would need to support in-cluster configuration as well as out-of-cluster config for our K8s client.

In order to retrieve build events from stdout for a given step, the runtime would need to:

  • watch for the pod which the step is executing to exist
  • start following the logs from the step's container
  • populate build events for each log line

This relies on the K8s Logging Architecture being configured for the level of build output logging a Concourse cluster would produce. I'm not entirely sure of the details of the underlying architecture in terms of scaling to follow logs from many containers.

- Convert concourse pipeline yml to a tekton version
- Put is WIP and untested
- Get done but untested
- Currently building booklit.yml

Signed-off-by: Bohan Chen <[email protected]>
Co-authored-by: Sameer Vohra <[email protected]>
@voor

voor commented Jul 4, 2019

Something of note regarding "no external workers": GitLab Runner has the ability to effectively run workers in other clusters, which is a nifty feature for leveraging native services in different clusters (like a database inaccessible from outside a k8s cluster)

@aledegano

Hi,
Thank you for this RFC: I'm really excited by the idea of having an external scheduler!
I would like to point out something that's probably obvious but that I didn't see in the discussion: non-Linux platform support, which basically means we need a way to have tasks land on Windows and macOS.

That is, of course, hoping Concourse is going to keep supporting these platforms!

@vito
Member

vito commented May 5, 2020

Going to close this for the same reason as #20 (comment)

@aledeganopix4d The likely approach there will be to support K8s workers alongside non-K8s workers, i.e. to support multiple runtimes. I'm sure there will be trickier parts to work out, like how to schedule Windows work on K8s, but we fully intend to support all platforms now and forever.
