-
Notifications
You must be signed in to change notification settings - Fork 66
Fixed #703: restart using rest api, iterate hosts to verify #770
Fixed #703: restart using rest api, iterate hosts to verify #770
Conversation
Since the REST API restart is not available on ML7 should we keep the old restart method in there? Additionally, this fails on ML 8.0-5.1 on Windows for me so I am going to attempt to upgrade to test it. Something to keep in mind for users who may have an environment on an older version of ML8.. |
Hmm, good point, forgot about that! |
Maybe push away till next release, to give us more time to think about this? |
Given the time crunch, that sounds good to me. |
One sec, I might have an idea.. |
a6f0027
to
53fbcb8
Compare
OK, this works against an ML 7 cluster, but like before it only checks ml.server.. |
Upgraded to the latest ML8 on Windows and ran restart and got this output: $ ./ml local restart -v When I forced the ML7 method of restart to run I got this output : $ ./ml local restart -v So this leads me to believe there's something not working correctly with the old and new timestamp check. However, both restarts appear to be successful. I am digging into it... In order to run the ML7 restart I added a param called ml.force-old-restart and set that true. Simple change on 550 to make that an or to get it to work. Recommend implementing this for users in a situation where they are stuck on an older ML8. "Best" approach would be to automatically try the ML7 method when the ML8 one fails, but I think including this property would be the bare minimum for now. |
I dumped the value of new_timestamp and old_timestamp and I got (in that order) : robspc.attlocal.net2017-06-07T00:18:07.063157-05:00 Continuing to look into it.. |
Did you check if the hostname inside ml works as dns name outside ml? |
Potentially found the issue stepping through the function. I did not have "ml.verify_retry_max" defined. Setting that to a value of 10 made the function work correctly. So just add that to default.properties . |
Regarding old restart, maybe some flags instead of properties? I could look into that.. |
Hmm, never tried what happens if you set retry to zero.. |
Flags could work too. Setting server_version=7 throws the CSRF error so we need a better way since this doesnt work on every version of 8 and could potentially affect folks in production or staging with older versions of 8. |
I'd expect it to work just fine on older ml8, but would not hurt to check. I might have a vm running old ml8, or else i'll spin one up. Will have to wait though, afk now.. :) |
On ML8.0-5.1 it failed for me with this error: $ ./ml local restart -v Somewhat unrelated : I highly recommend switching from VMs to containers. I just discovered them last week and it's been a lifesaver for letting me quickly spin up and down multiple versions of ML in parallel to each other. |
I use (ml)vagrant with which I can spin up a complete cluster from scratch with any specific ML version for which I happen to have a installer for. I already have plenty around, and it takes just one command-line command to spin up or down. I'll switch to docker as soon as I have worked out how to use the Vagrant docker provider.. ;-) |
Works just fine on 8.0-6.3 for me, but let me try some more versions.. |
I checked:
The CSRF change was introduced in 8.0-4. I think you were set on the wrong foot initially because of the missing |
53fbcb8
to
3aa8a69
Compare
@RobertSzkutak We already have the --verify=false flag. I see no particular need for not using management rest api for ML8+. WDYT? |
Rereading your 8.0-5.1 issue. That must have been a regression in that specific patch release, as 8.0-3 as well as 8.0-6 worked fine. For the sake of production, some flag to not use manage rest api does make sense after all, just for the actual restart (not the verification, since that uses the old admin ui timestamp endpoint, which probably goes back to ml5 or older).. I'm thinking about |
3bb1561
to
888a8c3
Compare
OK, added the legacy flag, WDYT? |
Looks good to me! Doing some testing on it... |
Could you add a definition for max in default.properties so its defined by default? That may seem more intuitive than "SKIPPED" for our customers. |
I thought it already was defined! Let me check again.. |
Yes, max and sleep are defined here: https://github.com/marklogic/roxy/blob/dev/deploy/default.properties#L278 (part of latest dev) |
I was piling on top of the existing single-server restart verification.. |
Thanks! git merge fail. |
Fixes #703