Skip to content
This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Cannot reference a port of a resident application to define a readiness check #5137

Closed
hantuzun opened this issue Feb 8, 2017 · 13 comments
Closed
Assignees

Comments

@hantuzun
Copy link

hantuzun commented Feb 8, 2017

I can't set "readinessCheck" of an application of mine. I'm following the readinessCheck documentation while working with Marathon version 1.1.2-dcos-1 and DCOS v. 1.7.3.


Here's an example response I get while trying to set "readinessChecks":

Request
$ curl -XPUT http://marathon.mesos:8080/v2/apps/elasticsearch --header "Content-type: application/json" --data '
{
    "readinessChecks": [
      {
        "path": "/"
      }
    ]
}'
Response
{"message":"Object is not valid","details":[{"path":"/value(0)/portName","errors":["is not one of ()"]}]}

The same error is persisted however many combinations I've tried. The same error is returned on Marathon .UI as well.


How can I set readinessCheck?

@jdef
Copy link
Contributor

jdef commented Feb 8, 2017 via email

@hantuzun
Copy link
Author

hantuzun commented Feb 9, 2017

Hi @jdef,

We want to add readiness check for a resident task of Elasticsearch.

Our issue is turned out to be about setting portDefinitions. Readiness

Health checks API on ports is like this:

For MESOS_HTTP, MESOS_HTTPS, MESOS_TCP, TCP, HTTP and HTTPS health checks, either port or portIndex may be used. If none is provided, portIndex is assumed. If port is provided, it takes precedence overriding any portIndex option.

  • portIndex (Optional. Default: 0): Index in this app’s ports or portDefinitions array to be used for health requests. An index is used so the app can use random ports, like [0, 0, 0] for example, and tasks could be started with port environment variables like $PORT1.
  • port (Optional. Default: None): Port number to be used for health requests.
    https://mesosphere.github.io/marathon/docs/health-checks.html

Readiness check API on ports is just the below:

Our Elasticsearch task have the following portDefinitions now:

"portDefinitions":[
   {
      "port":10104,
      "protocol":"tcp",
      "labels":{

      }
   },
   {
      "port":10105,
      "protocol":"tcp",
      "labels":{

      }
   }
]

Marathon is generated them and we're interested in the first one of them (port 10104). The issue is we cannot update them since our task is resident. We cannot add these ports names neither. Note that they don't have default names as well.

Therefore, we're only left with the option of having Elasticsearch downtime for about 5 minutes where we will add http-api name to the first port definitions. Could we have another solution?

This could have been prevented by having the health check port API for readiness checks as well. Having a mechanism for updating port definitions of resident tasks could also have helped.

@aquamatthias
Copy link
Contributor

aquamatthias commented Feb 9, 2017

@hantuzun Marathon needs to know on which port to run the readiness check.
Therefore you need to define a name in portDefinition and reference this port in the readiness check definition - eg:

{
  "portDefinitions": [
    {
      "port": 0,
      "protocol": "tcp",
      "name": "myport"
    }
  ],
  "readinessChecks": [
    {
      "path": "/v1/plan",
      "portName": "myport"
    }
  ]
}

@jdef
Copy link
Contributor

jdef commented Feb 9, 2017 via email

@aquamatthias
Copy link
Contributor

@jdef added the definition - thanks for the hint.

@hantuzun
Copy link
Author

Hi @aquamatthias ,

Thanks for the response. Have you read the second post of mine on this thread? I didn't say there's no need to provide a port for readiness checks.

What I am saying is:

  • We can't just modify the port definitions of a resident task. We need to stop it.
  • These port definitions Marathon created does not come up with a default name. Therefore, I can't refer to them.
  • The readiness check API does not accept portDefinitions index like the healthcheck API. Therefore, I can't refer to the first one of these ports defined.
  • The readiness check API does not accept a port number just like the healthcheck API. Therefore, I can't just write the readiness check with the port.

Is the issue clear? I'm kindly asking the members to reopen the issue.

@jdef
Copy link
Contributor

jdef commented Feb 10, 2017 via email

@hantuzun
Copy link
Author

Yes @jdef, that's all right!

The solution you propose would have solve our problem :)

i just want to point out the case we're having right now. The readiness API seems to be not looked after for a while.

We'll have some down time anyway since we use DC/OS and it'd take too long to have an update to fix this. We could have updated our Marathon version with a hot-fix or we could have created a second Elasticsearch for this transition, but no need for our case.

However, not being able to add readiness checks to resident tasks could cause more trouble for others.

@aquamatthias
Copy link
Contributor

@hantuzun I see. The problem arises with resident tasks since it is not allowed to change resources after they are created. You actually do not want to change resources, but only assign a name.

There are 2 things to improve:

  • validation should only check the port number, not the whole port definition
  • the default port should have a name

I will change the title of this ticket to reflect this more clearly.

@aquamatthias aquamatthias reopened this Feb 13, 2017
@aquamatthias aquamatthias changed the title Cannot add readinessCheck to an application Cannot reference a port of a resident application to define a readiness check Feb 13, 2017
@aquamatthias aquamatthias added this to the Marathon 1.5 milestone Feb 13, 2017
@aquamatthias
Copy link
Contributor

Patch is here: https://phabricator.mesosphere.com/D511

@aquamatthias aquamatthias self-assigned this Feb 13, 2017
@aquamatthias aquamatthias modified the milestones: Marathon 1.4, Marathon 1.5 Feb 13, 2017
@jdef
Copy link
Contributor

jdef commented Feb 13, 2017 via email

@aquamatthias
Copy link
Contributor

@jdef D511 refines the validation for not changing resources.
Adding/Changing/Removing a name to the port definition results in a restart of the task.

aquamatthias added a commit that referenced this issue Feb 15, 2017
…dation logic for resource changes of resident apps.

Summary:
See #5137
Problem #1: the default port has no name: define `default` as name for the default port.
Problem #2: the validation logic compares the whole port definition to idenify resource changes, where only the number matters.

Test Plan:
- Start Marathon
- Create an app with:
```
{
  "id": "/foo",
  "instances": 0,
  "cmd": "sleep 1000",
  "cpus": 0.1,
  "disk": 0,
  "mem": 16,
  "container": {
    "type": "MESOS",
    "volumes": [
      {
        "containerPath": "docker_storage",
        "mode": "RW",
        "persistent": {
          "size": 10
        }
      }
    ]
  }
}
```
- Update the app with:
```
{
  "id": "/foo",
  "instances": 0,
  "cmd": "sleep 1000",
  "cpus": 0.1,
  "disk": 0,
  "mem": 16,
  "portDefinitions": [
    {
      "name": "myport",
      "port": 0,
      "protocol": "tcp"
    }
  ],
  "readinessChecks": [
    {
      "name": "myReadyCheck",
      "protocol": "HTTP",
      "path": "/v1/plan",
      "portName": "myport",
      "intervalSeconds": 10,
      "timeoutSeconds": 3,
      "httpStatusCodesForReady": [ 200 ],
      "preserveLastResponse": true
    }
  ]
}
```

Reviewers: unterstein, jdef, jenkins

Reviewed By: unterstein, jdef, jenkins

Subscribers: marathon-team

Differential Revision: https://phabricator.mesosphere.com/D511
@hantuzun
Copy link
Author

Thanks a lot @aquamatthias! That's a helpful fix.

@d2iq-archive d2iq-archive locked and limited conversation to collaborators Mar 27, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants