control-service: Allow job builder run as non-root #625

Merged (2 commits) on Feb 10, 2022

@doks5 (Contributor) commented Dec 14, 2021

Currently, the data job builder requires root privileges to
run, which is not ideal when users have access only
to Kubernetes clusters that allow pods to run only under a SecurityContext.

This change allows the job builder pods to run under a security context by assigning
ownership of the working directories to UID 1000, which is typically the first regular
(non-root) user created on a Linux machine.
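
To illustrate why directory ownership matters for a non-root user, here is a minimal sketch of a pre-flight check that tests whether the current process can actually write to a working directory. The class `WorkDirCheck` and the method `isUsableWorkDir` are hypothetical names used for illustration only; they are not part of this PR or of the Control Service, and the real builder may perform no such check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical pre-flight check; not part of the Control Service code base. */
class WorkDirCheck {

    /**
     * Returns true if dir is an existing directory and the current process
     * can create and delete a file inside it.
     */
    static boolean isUsableWorkDir(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try {
            // Probe with a real file: this fails if the directory is owned by
            // another user and is not writable for the current UID/GID.
            Path probe = Files.createTempFile(dir, "probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the JVM temp directory when no path is given.
        Path dir = Path.of(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " usable by current user: " + isUsableWorkDir(dir));
    }
}
```

Run as UID 1000 against a directory still owned by root (and not group- or world-writable), a check like this would be expected to report the directory as unusable, which is the situation the ownership change is meant to avoid.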

Testing Done: Manually tested the change by updating the KubernetesService class
to set a security context on the data job builder pod. The change was made to the
createJob() method by adding the following lines and replacing the V1PodTemplateSpecBuilder:

```java
// Linux capabilities granted to the builder container.
List<String> addedCapabilities = Arrays.asList(
        "AUDIT_WRITE",
        "CHOWN",
        "DAC_OVERRIDE",
        "FOWNER",
        "FSETID",
        "KILL",
        "MKNOD",
        "NET_BIND_SERVICE",
        "NET_RAW",
        "SETFCAP",
        "SETGID",
        "SETPCAP",
        "SETUID",
        "SYS_CHROOT"
);
// Linux capabilities explicitly dropped from the builder container.
List<String> droppedCapabilities = Arrays.asList(
        "AUDIT_CONTROL",
        "BLOCK_SUSPEND",
        "DAC_READ_SEARCH",
        "IPC_OWNER",
        "LEASE",
        "LINUX_IMMUTABLE",
        "MAC_ADMIN",
        "MAC_OVERRIDE",
        "NET_ADMIN",
        "NET_BROADCAST",
        "SYSLOG",
        "SYS_ADMIN",
        "SYS_BOOT",
        "SYS_MODULE",
        "SYS_NICE",
        "SYS_PACCT",
        "SYS_PTRACE",
        "SYS_RAWIO",
        "SYS_RESOURCE",
        "SYS_TIME",
        "SYS_TTY_CONFIG",
        "WAKE_ALARM"
);
// Pod-level IDs: run as non-root UID 1000 with GID 2000.
long fsGroup = 2000;
long runAsGroup = 2000;
long runAsUser = 1000;
long supGroup = 1;
List<Long> supplementalGroups = Arrays.asList(supGroup);

log.debug("Creating k8s job name:{}, image:{}", name, image);
var template = new V1PodTemplateSpecBuilder()
        .withSpec(new V1PodSpecBuilder()
                .withRestartPolicy("Never")
                .withContainers(container(name, image, privileged,
                        envs, args, volumeMounts, imagePullPolicy,
                        request, limit, null)
                        // Container-level security context: no privilege escalation, not privileged.
                        .securityContext(new V1SecurityContext()
                                .allowPrivilegeEscalation(false)
                                .capabilities(new V1Capabilities()
                                        .add(addedCapabilities)
                                        .drop(droppedCapabilities)
                                )
                                .privileged(false)
                        ))
                .withVolumes(volumes)
                // Pod-level security context with the non-root user and groups.
                .withSecurityContext(
                        new V1PodSecurityContext()
                                .fsGroup(fsGroup)
                                .runAsGroup(runAsGroup)
                                .runAsUser(runAsUser)
                                .supplementalGroups(supplementalGroups)
                )
                .build())
        .build();
```

With this change in place, a local instance of the Control Service was run, and the
builder job was verified to run successfully under the SecurityContext.
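
When maintaining long capability lists like the two in the test snippet above, one easy mistake is to list the same capability as both added and dropped. The following is a minimal sketch of a sanity check for that, using only the Java standard library. The class `CapabilityLists` and the method `overlap` are hypothetical helpers introduced here for illustration; they are not part of this PR, the Control Service, or the Kubernetes Java client, and the short lists in `main` are just a sample subset.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of the Control Service or the Kubernetes client. */
class CapabilityLists {

    /** Returns the capabilities present in both lists, in the order they occur in added. */
    static Set<String> overlap(List<String> added, List<String> dropped) {
        Set<String> common = new LinkedHashSet<>(added);
        common.retainAll(dropped);
        return common;
    }

    public static void main(String[] args) {
        // Sample subset of the lists used in the manual test.
        List<String> added = List.of("CHOWN", "SETGID", "SETUID");
        List<String> dropped = List.of("NET_ADMIN", "SYS_ADMIN", "SYS_PTRACE");

        Set<String> conflicts = overlap(added, dropped);
        if (!conflicts.isEmpty()) {
            throw new IllegalStateException("Capabilities both added and dropped: " + conflicts);
        }
        System.out.println("No capability is both added and dropped.");
    }
}
```

A check of this kind could run in a unit test before the lists are passed to `V1Capabilities`, so that a contradictory configuration is caught before a pod is created.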

Signed-off-by: Andon Andonov [email protected]

@doks5 force-pushed the person/andonova/builder-security-context branch from bbb19be to 007dd7e (December 25, 2021 12:01)
@doks5 force-pushed the person/andonova/builder-security-context branch 3 times, most recently from a3455b0 to 0d84723 (January 13, 2022 17:07)
@doks5 marked this pull request as ready for review (January 13, 2022 17:10)
@doks5 force-pushed the person/andonova/builder-security-context branch from 0d84723 to 205ace8 (January 13, 2022 21:43)
@mivanov1988 (Collaborator) commented

LGTM

@doks5 force-pushed the person/andonova/builder-security-context branch 2 times, most recently from 527514a to 55a33f5 (January 14, 2022 09:24)
@mivanov1988 force-pushed the person/andonova/builder-security-context branch 2 times, most recently from 2bc6aac to 20bd79d (February 8, 2022 10:56)
@mivanov1988 force-pushed the person/andonova/builder-security-context branch from 20bd79d to 630a803 (February 8, 2022 18:13)
@mivanov1988 enabled auto-merge (squash) (February 10, 2022 14:33)
@mivanov1988 merged commit 0c1f696 into main (Feb 10, 2022)
@mivanov1988 deleted the person/andonova/builder-security-context branch (February 10, 2022 14:35)