
nomad copy / to alloc directory #1478

Closed
waterdudu opened this issue Jul 28, 2016 · 8 comments
@waterdudu commented Jul 28, 2016
When I run Nomad with this job, it copies all files to the alloc folder.

job "exec-demo" {
  type        = "service"
  datacenters = ["dc1"]

  group "exec-demo" {
    count = 1

    task "guesswhat" {
      driver = "exec"

      config {
        command = "nosuchcommand"
      }

      resources {
        cpu    = 100
        memory = 100

        network {
          mbits = 1
        }
      }
    }
  }
}

My Nomad version is 0.4.0.

@dadgar (Contributor) commented Jul 28, 2016

Can you paste the output of `nomad fs <alloc-id> <task-name>`, and also tell us what OS you are on?

@waterdudu (Author)

I'm using Vagrant (ubuntu/trusty64); here is the Nomad output.

vagrant@server1:~$ nomad run z.nomad
==> Monitoring evaluation "3171bf8e"
    Evaluation triggered by job "exec-demo"
    Allocation "78edb35c" created: node "6e51f8ac", group "exec-demo"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "3171bf8e" finished with status "complete"
vagrant@server1:~$ nomad status
ID         Type     Priority  Status
exec-demo  service  50        dead
vagrant@server1:~$ nomad status exec-demo
ID          = exec-demo
Name        = exec-demo
Type        = service
Priority    = 50
Datacenters = dc1
Status      = dead
Periodic    = false

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status
0d166f90  bfea1d6a  6e51f8ac  exec-demo   run      failed
78edb35c  3171bf8e  6e51f8ac  exec-demo   run      failed

vagrant@server1:~$ nomad fs ls 78edb35c
Error querying allocation: Unexpected response code: 500 (rpc error: alloc lookup failed: index error: Invalid UUID: encoding/hex: invalid byte: U+006C 'l')

The job is scheduled onto machine docker1; here is the alloc info.

root@docker1:/opt/nomad/data/alloc/78edb35c-d137-6eb0-1265-465bd28784c8/guesswhat# ll
total 52
drwxrwxrwx  14 nobody nogroup 4096 Jul 28 16:43 ./
drwxr-xr-x   4 root   root    4096 Jul 28 16:43 ../
drwxrwxrwx   5 nobody nogroup 4096 Jul 28 16:43 alloc/
drwxr-xr-x   2 root   root    4096 Jul 28 16:43 bin/
drwxr-xr-x  14 root   root    4000 Jul 28 16:41 dev/
drwxr-xr-x 103 root   root    4096 Jul 28 16:43 etc/
-rw-r--r--   1 root   root     135 Jul 28 16:43 guesswhat-executor.out
drwxr-xr-x  22 root   root    4096 Jul 28 16:43 lib/
drwxr-xr-x   2 root   root    4096 Jul 28 16:43 lib64/
drwxrwxrwx   2 nobody nogroup 4096 Jul 28 16:43 local/
dr-xr-xr-x  94 root   root       0 Jul 28 16:41 proc/
drwxr-xr-x   3 root   root    4096 Jul 28 16:43 run/
drwxr-xr-x   2 root   root    4096 Jul 28 16:43 sbin/
drwxrwxrwx   2 nobody nogroup 4096 Jul 28 16:43 tmp/
drwxr-xr-x  10 root   root    4096 Jul 28 16:43 usr/

root@docker1:/opt/nomad/data/alloc/78edb35c-d137-6eb0-1265-465bd28784c8/guesswhat# cat guesswhat-executor.out
2016/07/28 16:43:15 [DEBUG] executor: launching command nosuchcommand
2016/07/28 16:43:15 [DEBUG] executor: running command as nobody

@dadgar (Contributor) commented Jul 28, 2016

If you look here you will see the set of directories we add to the chroot: https://www.nomadproject.io/docs/drivers/exec.html. We also map `alloc/` and `local/` as part of Nomad's directory structure, and `dev/`, `proc/`, `tmp/`, and `run/`, since most applications need access to all of these to run properly.

It looks normal to me! Let me know if you'd like to reopen!
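For reference, the set of directories copied into the chroot can also be narrowed per client. The sketch below uses Nomad's `chroot_env` client configuration option; the option itself exists, but the specific paths here are illustrative and depend on what your tasks actually need:

```hcl
# Hypothetical client config restricting the exec driver's chroot
# to a handful of paths instead of the full default set.
# The paths are examples only; include whatever your tasks require.
client {
  chroot_env {
    "/bin"              = "/bin"
    "/lib"              = "/lib"
    "/lib64"            = "/lib64"
    "/usr"              = "/usr"
    "/etc/ld.so.cache"  = "/etc/ld.so.cache"
    "/etc/ld.so.conf"   = "/etc/ld.so.conf"
    "/etc/ld.so.conf.d" = "/etc/ld.so.conf.d"
  }
}
```

A smaller `chroot_env` reduces both the copy time and the per-allocation disk footprint discussed later in this thread.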

@dadgar dadgar closed this as completed Jul 28, 2016
@iconara (Contributor) commented Nov 29, 2016

We're running into this issue too, I think. It looks like all of /usr, /etc and all the other directories in the chroot list you linked to are copied into each allocation.

I'm completely new to Nomad, so I'm probably misunderstanding something here, because I don't really understand how this can scale. On our machines /usr is 1 GB and the disk is 30 GB, so after 30 allocations the disk will be full. If you update a service 30 times the disk will be full, and if you run batch jobs a machine is basically unusable after 30 jobs. Is this really the way it's supposed to work?
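The arithmetic above can be sketched directly. The numbers are the illustrative figures from this comment, not measurements:

```python
# Back-of-envelope estimate of how many exec allocations fit on a node
# if each allocation copies the full chroot, as reported in this issue.
CHROOT_COPY_GB = 1   # e.g. /usr alone on the reporter's machines
DISK_GB = 30

max_allocs = DISK_GB // CHROOT_COPY_GB
print(max_allocs)  # 30 allocations and the disk is full
```

The estimate ignores logs and task-local data, so in practice the node fills up even sooner.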

@dadgar (Contributor) commented Dec 1, 2016 via email

@iconara (Contributor) commented Dec 2, 2016

@dadgar Increasing the disk is not an option, unfortunately; we're on EC2 but we don't use EBS (expensive, unnecessary, slow, risky, etc.), so we'll have to figure something out or wait for the fix.

I'll see if I can get some time to write up a PR for the documentation, because I was very surprised by this behaviour. Now that I know about it, I can sort of see that the docs never claim it works otherwise, but they also require you to extrapolate from a sentence or two that you will need very big disks to use the exec driver.

@dadgar (Contributor) commented Dec 2, 2016

@iconara Appreciate it. The fix is being worked on as we speak, so you won't have to wait too long!

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 17, 2022