Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming mode #126

Merged
merged 10 commits into from
Apr 30, 2024
Merged

Streaming mode #126

merged 10 commits into from
Apr 30, 2024

Conversation

DrJosh9000
Copy link
Contributor

@DrJosh9000 DrJosh9000 commented Dec 21, 2023

The goal

Allow terminal output to be processed in a streaming manner, at the expense of not buffering the full history of the input. Enable much larger input to be processed without exhausting memory.

Why

The memory improvements of the last couple of PRs (#121, #124) whittle the memory allocated per run for the NPM benchmark from ~24MiB down to ~14MiB, but on a ~0.5MiB input that's still a hefty blowup.

The overwhelming amount of typical terminal output writes to new lines. Sometimes a command on a PTY will erase the previous line (e.g. git). Some tools will rewrite the last 10 or so lines (bazel, docker). In all cases, sensible CLI tools should assume the terminal window is no larger than a few hundred lines.

So why do we need to buffer more lines than that?

By streaming out all the lines we no longer need to touch, they don't need to be kept in memory, so they can be freed. The total memory needed to process input can be limited, rather than having to scale with the size of the input. This will enable displaying much larger logs.

How

The key functionality was added in #118; the main change is to tidy up the main edge case (preserving parser state between chunks of input) and default to using it. Pass --buffer-max-lines=0 to disable the line limit. It applies to all modes: stdin, file, and the web service.

But while I'm here...

  • Add some tests and benchmarks for the new streaming approach
  • Use strings.Builder instead of bytes.Buffer, and avoid a few casts to []byte - it's slightly faster
  • Change how the scroll-out works, to reuse previously allocated screen lines
  • Remove the strings.Replace that inserted   by generating   for every line that would have been empty.

Show me the benchmarks!

goos: darwin
goarch: arm64
pkg: github.com/buildkite/terminal-to-html/v3
                │ before.txt  │              after.txt              │
                │   sec/op    │   sec/op     vs base                │
RendererNpm-10    12.88m ± 1%   12.02m ± 0%   -6.64% (p=0.000 n=30)
StreamingNpm-10   13.64m ± 0%   11.31m ± 1%  -17.08% (p=0.000 n=30)
geomean           13.25m        11.66m       -12.02%

                │  before.txt   │              after.txt               │
                │     B/op      │     B/op      vs base                │
RendererNpm-10     13.92Mi ± 0%   11.19Mi ± 0%  -19.56% (p=0.000 n=30)
StreamingNpm-10   10.633Mi ± 0%   5.675Mi ± 0%  -46.62% (p=0.000 n=30)
geomean            12.16Mi        7.971Mi       -34.47%

                │ before.txt  │             after.txt              │
                │  allocs/op  │  allocs/op   vs base               │
RendererNpm-10    158.9k ± 0%   163.9k ± 0%  +3.14% (p=0.000 n=30)
StreamingNpm-10   158.9k ± 0%   158.2k ± 0%  -0.46% (p=0.000 n=30)
geomean           158.9k        161.0k       +1.32%

The streaming benchmark doesn't exist before this PR, so I copied it into main temporarily (and fixed it so it would build). The speedups are mostly to do with strings.Builder and avoiding casts to []byte. The 46% memory saving before/after in the streaming benchmark is primarily due to the screen line recycling.

@DrJosh9000 DrJosh9000 changed the title WIP: File/stdin streaming mode Streaming mode Jan 17, 2024
@DrJosh9000 DrJosh9000 marked this pull request as ready for review January 17, 2024 06:20
@DrJosh9000 DrJosh9000 force-pushed the full-stream-ahead branch 3 times, most recently from 37d5ebd to 6476034 Compare January 17, 2024 06:32
@dabarrell
Copy link

Is my understanding that this would effectively limit the maximum memory used by t2html - meaning the limit is determined by the maxLines, not by the size of the input? Have you tried putting a bigger input file through the benchmarks to see what that looks like? e.g. 5mb compared to 0.5mb

@DrJosh9000
Copy link
Contributor Author

That is the goal, to limit the memory needed to process larger and larger input. In practice it will still allocate quite a lot, but should garbage collect more of it.

Here's the same benchmark on a 17MiB file (several copies of fixtures/npm.sh.raw):

BenchmarkRendererBig-10                3         356336375 ns/op        392413698 B/op   5244420 allocs/op
BenchmarkStreamingBig-10               4         322766854 ns/op        166112616 B/op   5042078 allocs/op

166 MB of memory allocations per run still sounds alarming, but that number is "total" - it doesn't decrease as the garbage collector frees memory. If I stick a little diagnostic code at the end of parseToScreen to see what's up:

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	log.Printf("TotalAlloc: %d\tHeapAlloc: %d\tHeapInuse: %d", m.TotalAlloc, m.HeapAlloc, m.HeapInuse)

and do a single run with the input streamed into the parser (io.Copy):

2024/01/18 11:26:16 TotalAlloc: 839816  HeapAlloc: 839816       HeapInuse: 1441792
2024/01/18 11:26:16 TotalAlloc: 1212656 HeapAlloc: 1212656      HeapInuse: 1761280
2024/01/18 11:26:16 TotalAlloc: 1558336 HeapAlloc: 1558336      HeapInuse: 2097152
...snip...
2024/01/18 11:26:17 TotalAlloc: 173126152       HeapAlloc: 1782400      HeapInuse: 2441216
2024/01/18 11:26:17 TotalAlloc: 173426456       HeapAlloc: 2082704      HeapInuse: 2727936
2024/01/18 11:26:17 TotalAlloc: 173618552       HeapAlloc: 2274800      HeapInuse: 2924544

HeapInuse is the most useful, and hovered ± 1.7 MB around a mean of 3.1 MB.

The same thing without using MaxLines:

2024/01/18 11:34:43 TotalAlloc: 836776  HeapAlloc: 836776       HeapInuse: 1359872
2024/01/18 11:34:43 TotalAlloc: 1222880 HeapAlloc: 1222880      HeapInuse: 1744896
2024/01/18 11:34:43 TotalAlloc: 1527456 HeapAlloc: 1527456      HeapInuse: 2072576
...snip...
2024/01/18 11:34:43 TotalAlloc: 248025120       HeapAlloc: 154157880    HeapInuse: 166273024
2024/01/18 11:34:43 TotalAlloc: 248375520       HeapAlloc: 154508280    HeapInuse: 166453248
2024/01/18 11:34:43 TotalAlloc: 248631216       HeapAlloc: 154763976    HeapInuse: 166600704

i.e. heap usage grows, does not shrink, and in the end has used 166 MB without being able to free it.

@DrJosh9000 DrJosh9000 requested review from pda, triarius and moskyb January 18, 2024 03:19
@DrJosh9000 DrJosh9000 force-pushed the full-stream-ahead branch 3 times, most recently from cfbba30 to 5bbd91b Compare April 29, 2024 04:36
@DrJosh9000 DrJosh9000 requested review from a team and removed request for triarius April 29, 2024 04:39
Copy link
Contributor

@tessereth tessereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I'm excited to try it out.

cmd/terminal-to-html/terminal-to-html.go Outdated Show resolved Hide resolved
DrJosh9000 and others added 10 commits April 30, 2024 14:18
* Reorganise terminal-to-html.go
* Reorganise Screen and parser a bit
* Change screen.Parse to screen.Write (works as destination for io.Copy)
* Screen owns the parser: handles incremental writes
* Support setting max lines in all of HTTP, stdin, and file modes
* Add comments to parser methods
* Change AsHTML return to string, avoiding cast to []byte
* Change bytes.Replace to strings.Replace
Also preallocate lines []string slice, saves time and memory in benchmark
Replace was used to put ` ` inside otherwise blank lines (`\n\n` -> `\n \n`). I went looking through history to understand. It used to be implemented as a regexp replace from `^$`. That originated in the Ruby version going back to ef20b4a, before that there was something funky going on with divs.

Changing it to generate ` ` on the fly broke two tests. If the rationale for ` ` is to ensure empty lines aren't lost, then the two tests were arguably wrong:
* "clears everything after the \x1b[0K" clears the "hello" line, but because that blank line is at the start of the output it wasn't between two newlines, missing the Replace
* a similar problem occurs with Pikachu, but at both the start and the end (if you are unconvinced about the end, note there are two blank lines after Pikachu if you cat fixtures/pikachu.sh.raw).
No need to garbage collect lots of slices if they can be reused right away
This checks that the streaming approach produces the same output as the classical buffer-everything approach, for all existing cases.
This avoids a copy when the existing buffer is empty.

Also split the "recycle" line in two to make it clear in memory profiles that append is responsible for allocation, not creating the new line itself.
And mention that it can be turned off in the flag description
Co-authored-by: Tessa Bradbury <[email protected]>
@DrJosh9000 DrJosh9000 merged commit 54d249f into main Apr 30, 2024
1 check passed
@DrJosh9000 DrJosh9000 deleted the full-stream-ahead branch April 30, 2024 04:27
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this pull request Jul 5, 2024
- Update the `github.com/buildkite/terminal-to-html/v3` dependency from
version v3.10.1 to v3.13.0.
- Version v3.12.0 introduced an incompatible change, the return type of
`AsHTML` changed from `[]byte` to `string`. That same version also
introduced streaming mode
buildkite/terminal-to-html#126, which allows us
to avoid reading the whole input into memory.
- Closes #4313
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants