This repository has been archived by the owner on Dec 29, 2022. It is now read-only.

Implement actual timeouts for requests #552

Merged
merged 1 commit into rust-lang:master from request-timeouts on Nov 15, 2017

Conversation

alexheretic
Member

Related to the discussion in #540, this PR fixes requests.rs to time out after the set duration and respond with a default response. The code previously looked like it was doing this, but wasn't (see #540 for the explanation).

Since the main thread can now be expected to create many concurrent threads, it makes sense to cap that number. I used a rayon thread pool to this effect.

Summary:

  • Use channels to provide timeouts
  • Use a rayon thread pool to avoid an unbounded thread count
  • warn! on thread panics

@Xanewok
Member

Xanewok commented Nov 6, 2017

I'm not familiar with how rayon works, but does it try to keep the pool full even after a worker thread panics (does it spawn a new thread then)?

@alexheretic
Member Author

Rayon generally propagates the panic, though in this case I'm using a pool configured with a panic handler:

rayon::Configuration::default()
    .thread_name(|num| format!("request-worker-{}", num))
    .panic_handler(|err| warn!("{:?}", err))

I would be very surprised if panics handled this way depleted the thread pool. Since the docs don't mention this directly, it's probably easiest to demonstrate it working on the playground.

#[cfg(not(test))]
const COMPILER_TIMEOUT: u64 = 1500;

// Timeout for potentially very slow CPU CI boxes
#[cfg(test)]
const COMPILER_TIMEOUT: u64 = 3_600_000;
Member

A 1-hour timeout per single request in the testing environment seems... overly excessive. I think we could bump it up to 10s or similar here, but for a test-level timeout I'd rather adjust the TEST_TIMEOUT_IN_SEC value.

Member Author

This is just to deal with potentially slow CI environments, as tests currently (and generally) won't exercise timeouts. And if you did test a timeout, you wouldn't do it by waiting 2 seconds.

I haven't actually spotted this being an issue, this is just me being defensive since this PR introduces the timeouts having an effect.

I just used an arbitrarily large number because we expect this never to happen. Maybe I should have used u64::MAX.

let ctx = ctx.inited();
let file_path = parse_file_path!(&params.text_document.uri, "symbols")?;

let analysis = ctx.analysis.clone();

let rustw_handle = thread::spawn(move || {
let receiver = receive_from_thread(move|| {
Member

style nit: I'd keep the space-separated `move ||` version to be more consistent with the style used in other places in the codebase

Member Author

Will do

},
None => Ok(vec![]),

match receiver.recv_timeout(max_wait - elapsed) {
Member

If I'm not mistaken, this changes the semantics to treat the analysis data and Racer as equals and use whichever returns first, whereas the previous code preferred the analysis data if possible. Probably not a bad thing, but something to keep in mind.

Member Author

That's right. It seemed the only sensible thing to do. Waiting the entire timeout is not something we ever want to happen; it's very much a measure to cap the worst-case scenario.

I think it would be cleaner to just use one or the other if you want full clarity. Waiting the maximum request timeout for the racer results hardly seems worthwhile. Note, though, that if the compiler fast-fails then this logic will work as before.

Member

It's just that with the goto_def_racer_fallback cfg option enabled, Racer will stop acting as a fallback and both threads will race to return the answer to be used. Since Racer is more of a heuristic compared to the analysis data, we might be sacrificing correctness for speed in some cases.

@nrc should chime in about that

Member

I would like to preserve the existing behaviour (or at least the intention of the existing behaviour):

  • Start both threads at the same time
  • If the compiler thread completes, return a result immediately
  • If we hit the timeout, use the Racer result if it exists; otherwise return nothing.

t.unpark();

result
let (sender, receiver) = mpsc::channel();
Member

If we were to still prioritize analysis data, I think we could simplify this a great deal with receive_from_thread, as used elsewhere. We could start both threads, wait with a timeout on the analysis one (this should succeed or fail fast in practice), and if needed wait for the remainder on the Racer one (IIRC it can also take longer and is less reliable). It would also remove the need to use WORK_POOL directly and do the explicit gymnastics with counters.

Member Author

To be fair, I would have generalised the racing logic if it were used more than once.

As I wrote above, I think you need to either race the computations or just use one or the other. If the compiler fails fast then it'll work fine in this case too, except here we can handle a compiler timeout and a fast racer with grace.

I see that this is a change in functionality, though; racer could never beat the compiler before. Let me know what you guys want to do with this.

@Xanewok
Member

Xanewok commented Nov 7, 2017

Thanks for the PR, this definitely looks like an improvement! I left some comments inline; @nrc should take a closer look as well, as the changes made may be somewhat delicate.

@alexheretic
Member Author

Thanks for reviewing the PR @Xanewok. You guys just let me know what you'd like to do about the racer-compiler race; that's just my idea of how to do it, not a big deal.

As I said, this PR just puts a cap on the worst case of a request blocking the RLS forever, so it's a small win. But since the code that tried to do this was already there, it seems important we make sure it works.

@alexheretic
Member Author

This PR can be closed if you guys want to merge #565, as that includes this. I actually reworked the Definition functionality back to resemble how it was, to cut down on the number of questions the change would raise.

@nrc
Member

nrc commented Nov 9, 2017

Hey, I was looking at this PR and #565, they both look like great improvements! Given that they are both complex and fairly big, I think it would be good to keep the two separate so I'll comment separately.

@nrc nrc left a comment

This basically looks good! I think it will need to change a little bit to accommodate the intended compiler/Racer semantics, so I haven't looked too closely at that part.

Have you observed a benefit from using a Rayon thread pool? I'm a bit worried about adding another dep + complexity. But if it makes things observably better, then it is worth it.

@alexheretic
Member Author

I think it will need to change a little bit to accommodate the intended compiler/Racer semantics

I've reverted the compiler/racer behaviour to the previous style.

Have you observed a benefit from using a Rayon thread pool?

I think we benefit from having a thread pool: it prevents us from having an unbounded number of threads open in the RLS (possible currently if tasks take a very long time to complete). Rayon is just a decent work-stealing pool, which in my opinion is a good fit for this situation.

@alexheretic alexheretic mentioned this pull request Nov 9, 2017
@alexheretic alexheretic force-pushed the request-timeouts branch 2 times, most recently from 44a66ea to 2f0b96c on November 13, 2017
Use rayon thread pool to avoid unbound thread count
Use channels to provide timeouts
warn! on thread panics
Add long request-timeout for CI testing
Use consistent `move ||` code style
Use compiler-then-racer impl for `Definition`
@nrc nrc merged commit 37f9c4d into rust-lang:master Nov 15, 2017
@nrc
Member

nrc commented Nov 15, 2017

Cool, thanks for the PR and changes!

@alexheretic alexheretic deleted the request-timeouts branch November 16, 2017 16:42