LTeX gets stuck in an almost infinite loop in very long documents #253

flindeberg · 2021-02-15T15:48:48Z

Describe the bug

It seems possible to cause an infinite loop in large documents by triggering a spellcheck while another spellcheck is still running. I have a book (thesis) with 20 000-word chapters, one chapter per file, which causes issues.

As I have managed to work around it by massively increasing the sentence cache and the heap size, the issue is not critical to me. But it indicates, AFAIK, that something is a bit fishy with the behavior of cancelling RPCs.

If needed I can reset to default values and recreate an infinite log, but I'd rather avoid it :-)

Steps to reproduce
Steps to reproduce the behavior:

1. Create a larger, around 20 000 word, .tex document with default settings for heap and cache size
2. Make small edits and induce spellchecking in quick succession (i.e. if ltex.checkFrequency is manual or save do whatever you have to do to make the spellchecking start)
3. Wait and monitor your CPU-load

Expected behavior

I expect the spellcheck to finish in reasonable time, i.e. a couple of minutes at bootup and a couple of seconds after that, and in particular to finish at all.

I seem to be able to avoid it by increasing ltex.sentenceCacheSize to something large (I set it to 50 000). Currently I am also running with "ltex.java.maximumHeapSize": 8192 and I cannot recreate the issue with those settings.

Sample document

A 20 000 word document with proper sentences (I do not seem to induce it with a 20 000 word Lorem ipsum, possibly because the sentences repeat?)

LTeX configuration

"ltex.additionalRules.enablePickyRules": true,
"ltex.additionalRules.motherTongue": "sv",
"ltex.latex.commands": {},
"ltex.statusBarItem": true,
"ltex.hiddenFalsePositives": {},

"LTeX Language Server" log file

The log below is spread over 10 minutes or so. Sometimes it gets stuck "forever"; I have witnessed 24h+ of accumulated CPU time when I let it run overnight. I do not have a log from these 24h+ sessions of repeated spellchecking.

Usually the log "stops" (i.e. waits for the next iteration) at the first paragraph of the document, which confuses me:

FINE: Checking the following text in language 'en-US' via LanguageTool: "The ecology\n\n Blah blah "... (truncated to 100 characters)
Feb 15, 2021 4:09:04 PM org.bsplines.ltexls.server.DocumentChecker checkAnnotatedTextFragment
FINE: Obtained 682 rule matches

Does it check the document in reverse or is this a clue for something?

"The ecology\n\n Blah blah" represents the start of the chapter, i.e.

\chapter{The ecology}\label{cha:ecology}

\epigraph{Blah blah etc

Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 59
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 60
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 54
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 58
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 61
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 62
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 63
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 64
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 65

Feb 15, 2021 4:08:57 PM org.bsplines.ltexls.server.DocumentChecker checkAnnotatedTextFragment
FINE: Obtained 0 rule matches
Feb 15, 2021 4:08:57 PM org.bsplines.ltexls.server.DocumentChecker checkAnnotatedTextFragment
FINE: Checking the following text in language 'en-US' via LanguageTool: "lorem ipsum ... "... (truncated to 100 characters)
Feb 15, 2021 4:09:04 PM org.bsplines.ltexls.server.DocumentChecker checkAnnotatedTextFragment
FINE: Obtained 682 rule matches

Feb 15, 2021 4:09:34 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 76
Feb 15, 2021 4:09:34 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 77
Feb 15, 2021 4:09:34 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 78

I also wonder whether all the cancel notifications can provide a clue. (There can be hundreds of them, and they only show up in the log if the server actually finishes.) Are messages like the following a piece of the puzzle?

Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 65

Version information
List here the version information of the relevant software.

Operating system: Linux Manjaro, 5.10.14 running a Xanmod- and Manjaro-patched kernel.
VS Code: 1.53.2
vscode-ltex: 9.0.0

Additional context/information

As mentioned, increasing the heap size and the cache size seems to do the trick.

    "ltex.java.maximumHeapSize": 8192,
    "ltex.sentenceCacheSize": 50000,

I guess many spellcheck requests get queued up since the cache is not large enough, so each individual request takes quite a bit of time. And then something goes completely haywire when I manage to queue up enough spellchecks to keep it running all night?

Maybe something is buggy with the RPC calls for cancelling already running requests?

valentjn · 2021-02-15T17:29:15Z

The docs for ltex.sentenceCacheSize say “If you set this too small, checking time may increase significantly.” That's a part of what you're seeing here.

From the description of the problem, it sounds like it's as you say: each keystroke causes LTeX to run one check, and these checks stack up. I experienced the same problem when I experimented with having Pandoc as a Markdown backend (which might cost ~1s per conversion; LTeX checks will take a lot longer if ltex.sentenceCacheSize is too small). The problem should occur as soon as you type faster than LTeX can check. I can look into whether we can postpone checks until we don't get any user input anymore for some small period of time. Hopefully, we'll still be able to see diagnostics while we type (otherwise I might go for having a maximum document length; that won't help you of course).
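The idea of postponing checks until the user stops typing is usually called debouncing. Here is a minimal sketch of it, assuming a hypothetical `CheckDebouncer` class and a made-up quiet period of 200 ms. None of these names come from ltex-ls or vscode-ltex, and this is not how the fix was eventually implemented.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of ltex-ls): runs a check only after edits pause.
final class CheckDebouncer {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final long quietMillis;
  private ScheduledFuture<?> pending;

  CheckDebouncer(long quietMillis) {
    this.quietMillis = quietMillis;
  }

  // Called on every edit: drop the check scheduled for the previous edit
  // (if it has not started yet) and schedule a fresh one.
  synchronized void onEdit(Runnable check) {
    if (pending != null) {
      pending.cancel(false);
    }
    pending = scheduler.schedule(check, quietMillis, TimeUnit.MILLISECONDS);
  }

  void shutdown() {
    scheduler.shutdown();
  }

  // Simulates a burst of five edits and returns how many checks actually ran.
  static int demo() {
    AtomicInteger checksRun = new AtomicInteger();
    CheckDebouncer debouncer = new CheckDebouncer(200); // assumed quiet period
    for (int i = 0; i < 5; i++) {
      debouncer.onEdit(checksRun::incrementAndGet);
    }
    try {
      Thread.sleep(1000); // wait well past the quiet period
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      debouncer.shutdown();
    }
    return checksRun.get();
  }
}
```

In this sketch a burst of edits leads to a single check once typing pauses. The trade-off is the one mentioned above: diagnostics only appear after the quiet period, not while typing.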

In the meantime, as a workaround, either increase ltex.sentenceCacheSize or split up your document into multiple *.tex files.

The reason why ltex.sentenceCacheSize is not larger by default is that people complained about RAM usage before (see #15, #29), and the documents of most people are shorter than 20,000 sentences.

flindeberg · 2021-02-16T07:50:46Z

Just to clarify, I think the issue might be the requests which don't get cancelled. The large document etc. is just context: it describes the situation in which I noticed that requests seem to get queued up.

I.e.:

Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 59
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 60
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 54
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 58
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 61
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 62
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 63
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 64
Feb 15, 2021 4:07:16 PM org.eclipse.lsp4j.jsonrpc.RemoteEndpoint handleCancellation
WARNING: Unmatched cancel notification for request id 65

This makes it possible to stack up a lot of spellcheck requests. I noticed that something was fishy not because of CPU usage (my workstation is a bit over the top for LaTeX) but because the spellcheck came back "wrong", which could be seen since the wrong words and passages were marked in the text (due to positions etc. being off, since the text had been modified after the request for spellcheck was sent).

I.e., I believe, without fully going into all the logs, that the spellchecker backend might have been working on, let's say, request 43 while the front-end vscode-ltex had just sent request 163 or so. This indicates that older requests don't get cancelled asynchronously, although the logs indicate that they should be.

If they were cancelled correctly, I believe that I would not have seen such horrendous results for multiple large files.

valentjn · 2021-02-16T16:19:12Z

The cancellation requests are not handled correctly because LTeX checks synchronously. Making the check process asynchronous would of course be another option, but that's probably quite hard as most of the existing code is not thread-safe.
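To illustrate why cancellation needs an asynchronous check, here is a minimal sketch, assuming a hypothetical `CancellableChecker` that runs each check on a worker thread and interrupts the previous check when a new one arrives. The class, the 2 ms stand-in for checking a sentence, and the `demo()` method are all made up for illustration; this is not the ltex-ls code.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch (not ltex-ls code): each new check cancels the previous one.
final class CancellableChecker {
  private final ExecutorService worker = Executors.newSingleThreadExecutor();
  private final AtomicInteger completedChecks = new AtomicInteger();
  private Future<?> running;

  // Called from the thread that receives edits; returns immediately.
  synchronized Future<?> submit(List<String> sentences) {
    if (running != null) {
      running.cancel(true); // interrupt the outdated check, or drop it if queued
    }
    running = worker.submit(() -> {
      for (String sentence : sentences) {
        checkSentence(sentence);
        if (Thread.currentThread().isInterrupted()) {
          return; // superseded by a newer edit
        }
      }
      completedChecks.incrementAndGet();
    });
    return running;
  }

  // Stand-in for the real per-sentence work (assumed to take about 2 ms).
  private static void checkSentence(String sentence) {
    try {
      Thread.sleep(2);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the flag so the loop stops
    }
  }

  // Sends three checks in quick succession and returns how many ran to the end.
  static int demo() {
    CancellableChecker checker = new CancellableChecker();
    List<String> document = Collections.nCopies(50, "A sentence.");
    Future<?> last = null;
    for (int edit = 0; edit < 3; edit++) {
      last = checker.submit(document);
    }
    try {
      last.get(); // wait for the newest check only
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      checker.worker.shutdown();
    }
    return checker.completedChecks.get();
  }
}
```

Because `submit` returns right away, the receiving thread stays free to handle the next edit or cancel notification. With a synchronous check it would be blocked until the check finishes, so the cancel would arrive too late. The sketch sidesteps thread safety by using a single worker thread, which the real code could not assume so easily.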

flindeberg · 2021-02-23T14:49:37Z

I think I'm going to have a look, since this annoys me :-)

Can you move this issue to LTeX, or should I close this one and open a new one at LTeX?

valentjn · 2021-03-21T18:24:09Z

You mean ltex-ls (LTeX is usually vscode-ltex + ltex-ls). There's not really a strict separation in terms of issues (many issues affect both vscode-ltex and ltex-ls). But in this case, we can keep this issue here for visibility.

I've come to the conclusion that you have to have asynchronous checks, otherwise there's no way to cancel them. I implemented a basic version for this on a new branch of ltex-ls, but sometimes the program waits if a check is currently running in the checking thread, before handing the edit event over to ltex-ls and giving us the chance to cancel it. I'm a bit stuck right now.

aloispichler · 2021-04-22T12:25:07Z

Same problem here: impossible to use LTeX for longer documents.

valentjn · 2021-04-22T12:45:48Z

This issue is already confirmed. Please upvote the issue using the thumbs-up reaction to help reduce noise. Thank you!

valentjn · 2021-04-23T19:27:28Z

After greatly improving debugging facilities, I was able to see the problem with the code on my branch. It seems Java somehow reuses threads if they are currently waiting. If I understand it correctly, the RPC thread waited for the checking thread to return a result, but the CompletableFutures inside the checking thread were run in the RPC thread instead of a new one. That led to the RPC thread not processing incoming client notifications (e.g., cancel notifications) anymore. It seems to work if I tell Java to use different threads for the CompletableFutures.
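As a rough illustration of "telling Java to use different threads", here is a sketch that passes an explicit executor to `CompletableFuture.supplyAsync`. The class name and the thread name `checking-thread` are hypothetical, and this only shows the general mechanism, not the actual change in ltex-ls. For background, the `CompletableFuture` documentation notes that dependent actions attached with non-async methods (such as `thenApply`) may be performed by the thread that completes the future or by another caller of a completion method, which is one way work can end up on an unexpected thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo (not ltex-ls code): run a future's work on a dedicated thread.
final class DedicatedExecutorDemo {
  // Returns {name of the calling thread, name of the thread that ran the task}.
  static String[] runOnce() {
    ExecutorService checkingPool =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "checking-thread"));
    try {
      String caller = Thread.currentThread().getName();
      // The second argument forces the supplier onto checkingPool
      // instead of the default pool.
      CompletableFuture<String> future = CompletableFuture.supplyAsync(
          () -> Thread.currentThread().getName(), checkingPool);
      String worker = future.join();
      return new String[] {caller, worker};
    } finally {
      checkingPool.shutdown();
    }
  }
}
```

With the explicit executor, the supplier runs on `checking-thread` regardless of which thread created the future, so the caller (here standing in for the RPC thread) is not the one doing the work.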

Some testing still needs to be done as I'm not yet entirely sure everything is working as intended.

@flindeberg I'm not sure about your “expected behavior” section. The implemented solution tries to cancel the checking process if a new edit is sent to the server. Then, the updated text (after the edit) is checked. If there are multiple edits, then they don't stack anymore; only the last edit is checked. However, the root problem is that the sentence cache is too small (if it wasn't too small, checking would only take milliseconds). This means the checking of the updated text will take as long as the initial checking, minus the time for the cached sentences. For example, if you have a document of 20000 sentences, but your sentence cache only has size 2000, and every sentence takes 0.01s, then the initial checking will take 200s, but the subsequent checks will also take 180s, so almost the same (as 18000 sentences have to be checked). There is no way that these subsequent checks only take “a couple of seconds” as you write, especially if you exceed the size of the sentence cache by a lot as in my example. This issue is just about the almost infinite loop due to the stacking of the checks. If you want fast checking, you cannot get around increasing the sentence cache size.
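The arithmetic in this example can be written down as a small sketch. It assumes the simplified model used above: every sentence that does not fit in the cache costs the same fixed time, and cached sentences cost nothing. The class and method names are hypothetical, and the 0.01s per sentence is expressed as 10 ms.

```java
// Hypothetical helper: estimates check times under the simplified model above.
final class RecheckEstimate {
  // Initial check: nothing is cached yet, so every sentence is checked.
  static long initialMillis(int sentences, long millisPerSentence) {
    return (long) sentences * millisPerSentence;
  }

  // Subsequent check: only sentences that did not fit in the cache are rechecked.
  // This ignores the freshly edited sentence, which is always rechecked.
  static long recheckMillis(int sentences, int cacheSize, long millisPerSentence) {
    int uncached = Math.max(0, sentences - cacheSize);
    return (long) uncached * millisPerSentence;
  }

  public static void main(String[] args) {
    System.out.println(initialMillis(20000, 10));        // 200000 ms = 200 s
    System.out.println(recheckMillis(20000, 2000, 10));  // 180000 ms = 180 s
    System.out.println(recheckMillis(20000, 50000, 10)); // 0 ms: everything fits
  }
}
```

Under this model, the recheck time only becomes small once the cache size reaches the number of sentences in the document, which is why increasing ltex.sentenceCacheSize is the effective fix for slow checks.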

Maybe it's possible to detect that LTeX takes a lot of time, and offer to increase (e.g., double) the sentence cache size.

valentjn · 2021-05-05T07:35:45Z

Fix released in 10.2.0.

aloispichler · 2021-05-06T07:27:05Z

This seems to work now.
Thank you!

orzechow · 2021-11-25T15:44:45Z

First of all, a huge thank you @valentjn for providing this very helpful plugin! It boosts writing my PhD thesis ❤️

I ran into this issue with the current version of vscode-ltex (13.0.0) and ltex-ls (15.0.0) & jdk (11.0.12+7) auto-installed by the extension.
Besides unresponsive ltex suggestions, the extension blocks any VSCode suggestions/auto-completion (like code snippets, etc).

My document currently contains ~15.000 words (don't know how many sentences), but increasing ltex.sentenceCacheSize, ltex.java.maximumHeapSize and ltex.java.initialHeapSize did not help. Any idea why this problem occurs again?

valentjn · 2021-11-25T15:55:44Z

@orzechow Please open a new issue with an example document to reproduce the issue. It's not possible to help you with just this info.

flindeberg added the labels 1-bug 🐛 (Issue type: Bug report, something isn't working as expected) and 2-unconfirmed (Issue status: Bug that needs to be reproduced; all new bugs have this label) Feb 15, 2021

valentjn added a commit to valentjn/ltex-ls that referenced this issue Mar 21, 2021

Add support for cancelation of checking requests

991defe

Fixes valentjn/vscode-ltex#253.

valentjn added the label 2-confirmed (Issue status: Confirmed, reproducible bug in LTeX) and removed the label 2-unconfirmed (Issue status: Bug that needs to be reproduced; all new bugs have this label) Mar 21, 2021

valentjn changed the title ~~LTeX gets stuck in an infinite loop on rapid changes.~~ LTeX gets stuck in an almost infinite loop in very long documents Mar 21, 2021

valentjn closed this as completed in valentjn/ltex-ls@1e06217 Apr 23, 2021

valentjn added the label 3-fixed (Issue resolution: Issue has been fixed on the develop branch) Apr 23, 2021

valentjn self-assigned this Apr 25, 2021

valentjn added this to the 10.2.0 milestone May 5, 2021

me-johnomar added a commit to me-johnomar/ltex-ls that referenced this issue Jan 31, 2024

Add support for cancellation of checking requests

9500b71

Fixes valentjn/vscode-ltex#253.
