Redrawing room list buffer can be slow when in many rooms #298
Comments
Additionally, since I named all the bridges I use, I've tried stopping all of those so the bot accounts don't get a chance to update anything themselves. This didn't solve anything. |
You can fully expand a node in the profile report with C-u RET. You want to read C-h i g |
Right, that makes sense, should've read the manual thoroughly...
|
I think you'll want to test 2212b38 |
Maybe I should've been more specific. Yes, I know about it, and I've tested it. alphapapa made it when I told them about the lags. It makes things a bit better, since Emacs doesn't lag while I am doing operations quickly, but once I am idle for only a second, it usually lags again, if not right after that second, then after a few more seconds. |
You can, of course, If you have |
There are a few fundamental issues here:
So there are various possible ways to address these issues:
None of these are great options, but they're all I know of now. |
@Rutherther, you should look into all the above suggestions first; but if you've tried everything you can and still can't get a good result, another option could be #294 (comment). That (or rather a previous version of that code) was my workaround back when I was afflicted by a Matrix server bug which was having a somewhat similar effect to your issue. |
Thanks for the suggestions, I will try to look into some of them, and probably as the most used workaround for now I will just close this buffer, as I do not need it very often. |
When a sync response is received from the server, the room list buffer is updated, and that re-renders the whole buffer, because anything that was previously rendered could have changed.
Yes, see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=71725
Ok, but that function is not what renders the room buffer, it just calls the functions that do. So you would need to verify that all of the relevant functions are native-compiled.
Ok. |
@alphapapa, I started using I started writing some more complicated idle-time code, and then thought "why not just check how long it's been since the last refresh vs the duration set by the user, and not refresh if it hasn't been long enough?" So I did that, and then realised that didn't quite work either (as a long period between syncs could prevent the refresh from happening in any timely fashion). Then I wrote myself this comment, and turned off my laptop for the day :)
And re-reading that, I'd be inclined to:
Does that sound like a good approach? |
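The timing problem described above (skip a refresh that comes too soon, yet never lose it when syncs are far apart) can be sketched as a rate limit with one trailing timer. This is a minimal illustration only, not the code from phil-s's branch: every `my/` name is hypothetical, and the refresh function is passed in rather than assuming any Ement function.

```elisp
(defvar my/refresh-min-interval 2.0
  "Hypothetical option: minimum number of seconds between refreshes.")
(defvar my/refresh-last-time 0.0
  "Time (as a float) of the most recent refresh.")
(defvar my/refresh-timer nil
  "Pending trailing-refresh timer, or nil.")

(defun my/refresh--fire (refresh-fn)
  "Record the refresh time, clear the pending timer, and call REFRESH-FN."
  (setq my/refresh-timer nil
        my/refresh-last-time (float-time))
  (funcall refresh-fn))

(defun my/refresh-debounced (refresh-fn)
  "Call REFRESH-FN now if enough time has passed, else schedule one call.
Returns `ran', `scheduled', or `pending' (a call was already scheduled)."
  (let ((elapsed (- (float-time) my/refresh-last-time)))
    (cond ((>= elapsed my/refresh-min-interval)
           ;; Long enough since the last refresh: run immediately.
           (when (timerp my/refresh-timer)
             (cancel-timer my/refresh-timer))
           (my/refresh--fire refresh-fn)
           'ran)
          ((timerp my/refresh-timer)
           ;; A trailing refresh is already queued; it will cover this request.
           'pending)
          (t
           ;; Too soon: queue exactly one refresh for when the interval ends,
           ;; so the update still happens even if no further sync arrives.
           (setq my/refresh-timer
                 (run-at-time (- my/refresh-min-interval elapsed) nil
                              #'my/refresh--fire refresh-fn))
           'scheduled))))
```

The trailing timer is what keeps a skipped refresh from being delayed until the next sync; whether a real implementation should use a wall-clock timer or an idle timer here is a separate design choice.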
I don't know. We seem to be making it more complicated, which I'd like to avoid. But another factor to consider is whether the room list buffer is visible. If it's not in a visible window, then it doesn't seem as urgent to update it, so an idle timer would seem to make sense. If the buffer is visible, I feel like it ought to be updated in real time, otherwise we'd be showing incorrect information to the user. So we might also need to use a function on the window configuration hook to update the buffer immediately if it becomes visible.
I still don't fully understand what the performance problem is. I am in 127 rooms--which isn't as many as 200, but if 200 were causing a significant delay, I'd expect 127 to cause a noticeable one too--and I notice no pauses from updating the buffer. So I also wonder whether users who are reporting problems are running the relevant code as interpreted, byte-compiled, or native-compiled, or some combination. I don't feel like we have enough information to understand what's really going on.
There could also be something unusual going on, e.g. if a room's avatar image were very large, and it were having to be resized down to icon size on every redraw (which shouldn't happen, since that image does get cached, but who knows), then only users in such a room would have the problem. IOW this doesn't seem like a universal problem, which suggests there's more to it.
The ultimate solution might be to refactor taxy-magit-section to more effectively cache values between redraws. But that would be a lot of work, so I don't intend to do that soon. |
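The visibility idea above (update at once when the buffer is shown, defer otherwise, and catch up when it becomes visible) might look roughly like the sketch below. It is an illustration under assumptions, not Ement's implementation: the `my/` names are hypothetical, and the buffer and refresh function are passed in as arguments.

```elisp
(defvar my/room-list-stale nil
  "Non-nil when an update was deferred because the buffer was not visible.")

(defun my/room-list-update (buffer refresh-fn)
  "Refresh BUFFER via REFRESH-FN only if it is shown in a visible window.
Otherwise mark it stale.  Returns `refreshed' or `deferred'."
  (if (get-buffer-window buffer 'visible)
      (progn
        (funcall refresh-fn)
        (setq my/room-list-stale nil)
        'refreshed)
    (setq my/room-list-stale t)
    'deferred))

(defun my/room-list-catch-up (buffer refresh-fn)
  "Refresh BUFFER if an update was deferred and it is now visible.
Meant to be called from a function on `window-configuration-change-hook'."
  (when (and my/room-list-stale
             (buffer-live-p buffer)
             (get-buffer-window buffer 'visible))
    (funcall refresh-fn)
    (setq my/room-list-stale nil)
    'refreshed))
```

A caller would add a small wrapper to `window-configuration-change-hook` that invokes `my/room-list-catch-up` with the actual buffer and refresh function.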
I'd intermittently experienced some performance issues from it in the past week or so, which is why I initially added your patch the other day, but I don't really have any insights -- most of the time it was fine, so it just seemed like the server was deciding to temporarily be super chatty for some reason. I don't recall having similar issues anytime recently, but maybe the server was updated and introduced something relatively noisy.
Yep, good idea. The buffer-local hook value works well for that sort of thing. |
For future reference, I pushed v0.14.2 of taxy-magit-section today, which includes a minor performance optimization: https://github.com/alphapapa/taxy.el/tree/package/taxy-magit-section?tab=readme-ov-file#0142 I don't know if it will make a noticeable difference, but it may help a little. |
@alphapapa Okay, so how should I convince you it's native-compiled? is there an Emacs function that will say if the whole tree is native compiled, or should I send each function description output as I did for the one I already sent as an example? |
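As far as I know there is no single built-in command that reports on a whole call tree, but individual function definitions can be classified programmatically. The sketch below is a hypothetical helper (the `my/` names are not part of Emacs or Ement). It guards both `native-comp-function-p` and `subr-native-elisp-p` with `fboundp`, because which of the two is available depends on the Emacs version.

```elisp
(defun my/function-compilation-kind (symbol)
  "Classify the function definition of SYMBOL.
Returns `native', `byte', `interpreted', `primitive', `macro',
`autoload', or nil if SYMBOL has no function definition."
  (let ((def (and (fboundp symbol) (indirect-function symbol))))
    (cond ((null def) nil)
          ((autoloadp def) 'autoload)   ; not loaded yet, kind unknown
          ((macrop def) 'macro)
          ;; Either predicate may be missing depending on the Emacs version,
          ;; so call them indirectly and only when they are defined.
          ((and (fboundp 'native-comp-function-p)
                (funcall 'native-comp-function-p def))
           'native)
          ((and (fboundp 'subr-native-elisp-p)
                (funcall 'subr-native-elisp-p def))
           'native)
          ((subrp def) 'primitive)      ; implemented in C
          ((byte-code-function-p def) 'byte)
          (t 'interpreted))))

(defun my/function-compilation-report (symbols)
  "Return an alist mapping each of SYMBOLS to its compilation kind."
  (mapcar (lambda (symbol)
            (cons symbol (my/function-compilation-kind symbol)))
          symbols))
```

Calling `my/function-compilation-report` on the list of function names that appear in a profiler report would give a quick overview of which ones are not native-compiled.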
Switched to v0.14.2 and no observable difference for me |
@Rutherther would you like to give phil-s@6e696d5 a try?
The user options I'm keen to know whether / to what extent this alleviates things for you. |
Hello @phil-s, thanks for that. Yes, debouncing is solving part of the problem. But the underlying problem of rendering being slow is not solved. So Emacs still lags when it syncs, but not only that. Thanks to it being delayed I can actually start doing something in the ement room list, and notice that even toggling sections is causing this slowness for me. I suppose that also does the redraw of the whole buffer. |
@alphapapa here, as per your suggestion, I re-evaluated everything and obtained a profiler report for that. I hope that with interpretation the bottleneck is in the same place as with compilation. This also isn't obtained from the timer sync, but from tabbing the sections. There the lag also occurs. Based on this I've tried redefining |
The defun I presented is obviously not a good alternative to the function, but I am now trying
and although it is still not 100% smooth with this, and that debounce is probably good to have, it lowers the delay considerably to something much more pleasant for me. |
Well that's weird -- in the regular function there's a hash table caching the results of the expensive look-ups, so that should be fast. Ah -- the hash table is not updated if it uses the fallback case. So that must be what's happening -- every time it tries and fails at the expensive look-up. |
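The failure mode described above, where a cache is bypassed whenever the fallback path is taken, can be illustrated in isolation. This is a generic sketch, not the Ement or taxy-magit-section code: `my/expensive-lookup` is a stand-in for whatever look-up is slow, and all names are hypothetical. The point is that the fallback result is stored too, so a failing look-up is attempted only once per key.

```elisp
(require 'cl-lib)

(defvar my/lookup-cache (make-hash-table :test #'equal)
  "Cache of look-up results, including fallback results.")
(defvar my/expensive-calls 0
  "Counter showing how often the expensive look-up actually ran.")

(defun my/expensive-lookup (key)
  "Stand-in for a slow look-up; return nil when nothing is found for KEY."
  (cl-incf my/expensive-calls)
  (and (string-prefix-p "ok" key) (upcase key)))

(defun my/cached-lookup (key fallback)
  "Return the cached value for KEY, computing and caching it on a miss.
When the expensive look-up fails, FALLBACK is cached as well."
  (let ((cached (gethash key my/lookup-cache 'my/miss)))
    (if (not (eq cached 'my/miss))
        cached
      ;; `puthash' returns the stored value, so this caches and returns it.
      (puthash key
               (or (my/expensive-lookup key) fallback)
               my/lookup-cache))))
```

A distinct miss marker (`my/miss`) is used instead of nil so that a cached nil would not be mistaken for an absent entry.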
(by changed function I mean the one doing logging, then for the second test I used elpaca-build after reverting the commit to build the whole file, and restarted emacs) |
What I mean is, the changes in b20fda0 need to be native-compiled (or at least, you must be careful to load the same kind of code when conducting the before/after tests). The function that's changed by the diff I shared, which times the execution and prints the message, needn't be native-compiled. |
Experiencing the same issue but with a lot fewer rooms, so I thought I should evaluate/compile the code in #298 (comment). What other logs do y'all need me to gather? Emacs info:
|
I guess the first question is where is this coming from, and is it a factor? Edit: Answers are (a) image.c:
static void
image_size_error (void)
{
  image_error ("Invalid image size (see `max-image-size')");
}
and (b): I'm guessing it's not a factor, but I might be wrong -- and it does kinda look like the same aborted processing is happening repeatedly, which could be having a similar effect to the uncached values we saw earlier in this discussion. You probably want to figure out which images are causing that, and see if you can test in their absence. |
Thanks. I'd suggest disabling |
I've been seeing this myself sometimes, and I haven't mentioned it because it's so mysterious. My impression is that it started happening with this commit: alphapapa/taxy.el@9e76b7f Which reminds me: @Icy-Thought What version of taxy-magit-section do you currently have loaded?
Anyway, I added some debugging code to show what value To be more specific, the computed image width I'm referring to is from Also, if that commit did trigger the error, I don't understand why...although I suddenly have an idea: all of my visible frames are maximized, but maybe there's an invisible one that's tiny, and maybe it sometimes gets selected by the loop first...? ...Yeah, looks like it might be something like that: even though I only have one visible Emacs frame at the moment, look what this code returns:
(cl-loop for frame in (frame-list)
         when (member (framep frame) '(x w32 ns pgtk))
         collect (list frame (frame-width frame) (frame-height frame)))
;; ((#<frame 0x110250a0> 28 10) (#<frame *scratch* - GNU Emacs 0x277b0b0> 211 52))
Now those numbers still look like they shouldn't cause an error, but if that first frame is some invisible one, who knows what its sizes might be at various times. So I can probably fix that in taxy-magit-section soon by making sure the frame actually exists as a tangible (but not necessarily visible) one or something like that.
But IME that image-size error wasn't causing a performance problem, so I don't know if it's causing one for @Icy-Thought here. We'll have to wait for the profiler report. |
I don't know how to profile individual functions tbh, so I just attached the whole profiling data to this message.
installed version: 0.14.2 b7b60a4 |
Yep, there's always one frame at minimum, which means that if you run Emacs as a daemon there's always that invisible initial frame behind the scenes. (This is a notorious reason for people's frame-based tweaks in their init file suddenly 'breaking' when they start using a daemon, as things start getting applied to the invisible frame instead.) Edit: Oh, no, your case is something different? Or, at least, I see |
@phil-s Yep, I think that explains it.
It is? Why do you say that? |
Too quick for me :) I just added this:
|
I still see automatic sync responses in that profiler report, so it would seem that Also, there are many byte-compiled functions involved. It would help to |
Yeah, now I'm a bit confused too: |
Anyway, it seems like the key is
(cl-loop for frame in (frame-list)
         when (member (framep frame) '(x w32 ns pgtk))
         collect (list frame (frame-visible-p frame) (frame-width frame) (frame-height frame)))
;; ((#<frame 0x18cde470> nil 25 10)
;;  (#<frame *scratch* 0x97bbe98> t 211 52)
;;  (#<frame *Ement Notifications* 0x277b0b0> t 211 52))
So I think the code in taxy-magit-section needs to be:
(cl-loop for frame in (frame-list)
         when (and (frame-visible-p frame)
                   (memq (framep frame)
                         '(x w32 ns pgtk)))
         return frame)
I'm guessing that will fix the image-size error. I'm not sure if that will have any effect on performance, but we'll see. |
When server-mode is enabled, an invisible frame is present, which still returns X from FRAMEP, but which should not be used for calculating image sizes. This should fix an "Invalid image size" error that was happening (when such an invisible frame happened to be sorted first in FRAME-LIST). See <alphapapa/ement.el#298 (comment)>.
Just pushed v0.14.3 of taxy-magit-section, which I hope will fix the "invalid image size" error (which might also help with performance, but I don't know). |
That's good to hear. If you have time to use the code in #298 (comment) to time the redrawing now, it would be good to know what the hard numbers are.
Unfortunately, I don't see anything in that report about reverting the room list buffer. What's needed is:
That way the only thing that's being profiled is the reverting of the room list buffer (plus minor UI like completion, but that can't be avoided). |
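To get hard numbers for a single redraw, the revert can be wrapped in a timer or in the profiler so that nothing else is measured. The sketch below is an assumption-laden illustration, not the timing code referenced earlier in this thread: the `my/` functions are hypothetical and take the buffer as an argument, so no particular buffer name is assumed. It relies only on the built-in `benchmark-run`, `revert-buffer`, `profiler-start`, `profiler-report` and `profiler-stop`.

```elisp
(require 'benchmark)
(require 'profiler)

(defun my/time-revert (buffer)
  "Revert BUFFER once and return the elapsed time in seconds."
  (with-current-buffer buffer
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    ;; The second argument to `revert-buffer' skips the confirmation prompt.
    (car (benchmark-run 1 (revert-buffer nil t)))))

(defun my/profile-revert (buffer)
  "Profile a single revert of BUFFER and display the CPU profiler report."
  (profiler-start 'cpu)
  (unwind-protect
      (progn
        (with-current-buffer buffer
          (revert-buffer nil t))
        ;; Generate the report while the profiler is still running.
        (profiler-report))
    (profiler-stop)))
```

For example, evaluating `(my/time-revert (current-buffer))` via M-: while the room list buffer is current returns the number of seconds one redraw took.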
Profiler report: (only relevant bits. If you still want the whole report, please tell me!)
|
@Icy-Thought Please post the whole report, otherwise we don't have a full picture of what Emacs is doing. |
Apologies. I had assumed that you were not interested in the remaining profiling data. I have attached the profiler.txt file as usual within this comment. If you want me to perform more profiles, please ping me! :) |
@Icy-Thought No problem, thanks. From that report it seems that it must be not taking long at all. How many rooms are you in? And how long is it taking to revert the buffer? |
Glad to provide the logs! :) |
Yeah, with that few rooms, updating the room buffer should be imperceptibly fast. Thanks. AFAIK this is solved now (other than the idea to debounce the automatic reversion, which is tracked elsewhere), so closing now. |
Thank you so much for taking the time to fix this issue! Have an awesome day! 🎉 |
@Icy-Thought Thanks for the kind words, and for your help in debugging it. |
Anytime! Just ping and I'll try to help out where I can! 😊 |
OS/platform
Guix System or NixOS
Emacs version and provenance
Tried from both Nix and Guix, both times with jit native compilation enabled.
GNU Emacs 29.4 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.41, cairo version 1.18.0)
Emacs command
./with-emacs.sh --dir /tmp/test
Emacs frame type
GUI
Ement package version and provenance
I've used the method in the README to obtain the latest commit. That is 3f87a95 when trying this.
Actions taken
Observed results
After a few seconds, Emacs lags for a few seconds, during which I cannot do anything. This repeats, with only a few seconds between the lags.
Expected results
I should be able to use emacs normally when having ement room list open.
Backtrace
No response
Etc.
I have not used the profiler at all until now, and I am not sure how to get relevant information. I can run it for specific functions, but I don't know what to target.
I observe this behavior both in my regular Emacs installation and in a fresh Emacs instance obtained with the with-emacs.sh script.
I am on Conduit server. I also have sliding sync running on the server. I am in more than 200 rooms. Few of the rooms are bigger groups. But most of them are bridged DM rooms. I use Mautrix Discord (no guild bridged, only DMs), Mautrix Telegram, Mautrix Whatsapp, Mautrix Meta, Heisenbridge (bridging a few irc groups).
I have tried registering a separate account and joining a few rooms (ones I am also in on my main account) that have more members. I did not observe this behavior on that account.
I have already reached alphapapa through Ement.el room, where it was suggested to open an issue. This was like two weeks ago (sorry for taking so long)