Streaming mode #126

DrJosh9000 · 2023-12-21T04:35:53Z

The goal

Allow terminal output to be processed in a streaming manner, at the expense of not buffering the full history of the input. Enable much larger input to be processed without exhausting memory.

Why

The memory improvements of the last couple of PRs (#121, #124) whittle the memory allocated per run for the NPM benchmark from ~24MiB down to ~14MiB, but on a ~0.5MiB input that's still a hefty blowup.

The overwhelming amount of typical terminal output writes to new lines. Sometimes a command on a PTY will erase the previous line (e.g. git). Some tools will rewrite the last 10 or so lines (bazel, docker). In all cases, sensible CLI tools should assume the terminal window is no larger than a few hundred lines.

So why do we need to buffer more lines than that?

By streaming out all the lines we no longer need to touch, they don't need to be kept in memory, so they can be freed. The total memory needed to process input can be limited, rather than having to scale with the size of the input. This will enable displaying much larger logs.

How

The key functionality was added in #118; the main change is to tidy up the main edge case (preserving parser state between chunks of input) and default to using it. Pass --buffer-max-lines=0 to disable the line limit. It applies to all modes: stdin, file, and the web service.

But while I'm here...

Add some tests and benchmarks for the new streaming approach
Use strings.Builder instead of bytes.Buffer, and avoid a few casts to []byte - it's slightly faster
Change how the scroll-out works, to reuse previously allocated screen lines
Remove the strings.Replace that inserted   by generating   for every line that would have been empty.

Show me the benchmarks!

goos: darwin
goarch: arm64
pkg: github.com/buildkite/terminal-to-html/v3
                │ before.txt  │              after.txt              │
                │   sec/op    │   sec/op     vs base                │
RendererNpm-10    12.88m ± 1%   12.02m ± 0%   -6.64% (p=0.000 n=30)
StreamingNpm-10   13.64m ± 0%   11.31m ± 1%  -17.08% (p=0.000 n=30)
geomean           13.25m        11.66m       -12.02%

                │  before.txt   │              after.txt               │
                │     B/op      │     B/op      vs base                │
RendererNpm-10     13.92Mi ± 0%   11.19Mi ± 0%  -19.56% (p=0.000 n=30)
StreamingNpm-10   10.633Mi ± 0%   5.675Mi ± 0%  -46.62% (p=0.000 n=30)
geomean            12.16Mi        7.971Mi       -34.47%

                │ before.txt  │             after.txt              │
                │  allocs/op  │  allocs/op   vs base               │
RendererNpm-10    158.9k ± 0%   163.9k ± 0%  +3.14% (p=0.000 n=30)
StreamingNpm-10   158.9k ± 0%   158.2k ± 0%  -0.46% (p=0.000 n=30)
geomean           158.9k        161.0k       +1.32%

The streaming benchmark doesn't exist before this PR, so I copied it into main temporarily (and fixed it so it would build). The speedups are mostly to do with strings.Builder and avoiding casts to []byte. The 46% memory saving before/after in the streaming benchmark is primarily due to the screen line recycling.

dabarrell · 2024-01-17T07:17:51Z

Is my understanding that this would effectively limit the maximum memory used by t2html - meaning the limit is determined by the maxLines, not by the size of the input? Have you tried putting a bigger input file through the benchmarks to see what that looks like? e.g. 5mb compared to 0.5mb

DrJosh9000 · 2024-01-18T00:45:17Z

That is the goal, to limit the memory needed to process larger and larger input. In practice it will still allocate quite a lot, but should garbage collect more of it.

Here's the same benchmark on a 17MiB file (several copies of fixtures/npm.sh.raw):

BenchmarkRendererBig-10                3         356336375 ns/op        392413698 B/op   5244420 allocs/op
BenchmarkStreamingBig-10               4         322766854 ns/op        166112616 B/op   5042078 allocs/op

166 MB of memory allocations per run still sounds alarming, but that number is "total" - it doesn't decrease as the garbage collector frees memory. If I stick a little diagnostic code at the end of parseToScreen to see what's up:

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	log.Printf("TotalAlloc: %d\tHeapAlloc: %d\tHeapInuse: %d", m.TotalAlloc, m.HeapAlloc, m.HeapInuse)

and do a single run with the input streamed into the parser (io.Copy):

2024/01/18 11:26:16 TotalAlloc: 839816  HeapAlloc: 839816       HeapInuse: 1441792
2024/01/18 11:26:16 TotalAlloc: 1212656 HeapAlloc: 1212656      HeapInuse: 1761280
2024/01/18 11:26:16 TotalAlloc: 1558336 HeapAlloc: 1558336      HeapInuse: 2097152
...snip...
2024/01/18 11:26:17 TotalAlloc: 173126152       HeapAlloc: 1782400      HeapInuse: 2441216
2024/01/18 11:26:17 TotalAlloc: 173426456       HeapAlloc: 2082704      HeapInuse: 2727936
2024/01/18 11:26:17 TotalAlloc: 173618552       HeapAlloc: 2274800      HeapInuse: 2924544

HeapInuse is the most useful, and hovered ± 1.7 MB around a mean of 3.1 MB.

The same thing without using MaxLines:

2024/01/18 11:34:43 TotalAlloc: 836776  HeapAlloc: 836776       HeapInuse: 1359872
2024/01/18 11:34:43 TotalAlloc: 1222880 HeapAlloc: 1222880      HeapInuse: 1744896
2024/01/18 11:34:43 TotalAlloc: 1527456 HeapAlloc: 1527456      HeapInuse: 2072576
...snip...
2024/01/18 11:34:43 TotalAlloc: 248025120       HeapAlloc: 154157880    HeapInuse: 166273024
2024/01/18 11:34:43 TotalAlloc: 248375520       HeapAlloc: 154508280    HeapInuse: 166453248
2024/01/18 11:34:43 TotalAlloc: 248631216       HeapAlloc: 154763976    HeapInuse: 166600704

i.e. heap usage grows, does not shrink, and in the end has used 166 MB without being able to free it.

tessereth

Looks great! I'm excited to try it out.

cmd/terminal-to-html/terminal-to-html.go

* Reorganise terminal-to-html.go * Reorganise Screen and parser a bit * Change screen.Parse to screen.Write (works as destination for io.Copy) * Screen owns the parser: handles incremental writes * Support setting max lines in all of HTTP, stdin, and file modes * Add comments to parser methods * Change AsHTML return to string, avoiding cast to []byte * Change bytes.Replace to strings.Replace

Also preallocate lines []string slice, saves time and memory in benchmark

Replace was used to put ` ` inside otherwise blank lines (`\n\n` -> `\n \n`). I went looking through history to understand. It used to be implemented as a regexp replace from `^$`. That originated in the Ruby version going back to ef20b4a, before that there was something funky going on with divs. Changing it to generate ` ` on the fly broke two tests. If the rationale for ` ` is to ensure empty lines aren't lost, then the two tests were arguably wrong: * "clears everything after the \x1b[0K" clears the "hello" line, but because that blank line is at the start of the output it wasn't between two newlines, missing the Replace * a similar problem occurs with Pikachu, but at both the start and the end (if you are unconvinced about the end, note there are two blank lines after Pikachu if you cat fixtures/pikachu.sh.raw).

No need to garbage collect lots of slices if they can be reused right away

This checks that the streaming approach produces the same output as the classical buffer-everything approach, for all existing cases.

This avoids a copy when the existing buffer is empty. Also split the "recycle" line in two to make it clear in memory profiles that append is responsible for allocation, not creating the new line itself.

And mention that it can be turned off in the flag description

Co-authored-by: Tessa Bradbury <[email protected]>

- Update the `github.com/buildkite/terminal-to-html/v3` dependency from version v3.10.1 to v3.13.0. - Version v3.12.0 introduced an incompatible change, the return type of `AsHTML` changed from `[]byte` to `string`. That same version also introduced streaming mode buildkite/terminal-to-html#126, which allows us to avoid reading the whole input into memory. - Closes #4313

DrJosh9000 changed the title ~~WIP: File/stdin streaming mode~~ Streaming mode Jan 17, 2024

DrJosh9000 force-pushed the full-stream-ahead branch from eb5f762 to 4a1680d Compare January 17, 2024 06:20

DrJosh9000 marked this pull request as ready for review January 17, 2024 06:20

DrJosh9000 force-pushed the full-stream-ahead branch 3 times, most recently from 37d5ebd to 6476034 Compare January 17, 2024 06:32

DrJosh9000 requested review from pda, triarius and moskyb January 18, 2024 03:19

DrJosh9000 mentioned this pull request Feb 21, 2024

Report resource and processing stats to stderr #129

Merged

DrJosh9000 force-pushed the full-stream-ahead branch 3 times, most recently from cfbba30 to 5bbd91b Compare April 29, 2024 04:36

DrJosh9000 requested review from a team and removed request for triarius April 29, 2024 04:39

tessereth approved these changes Apr 30, 2024

View reviewed changes

cmd/terminal-to-html/terminal-to-html.go Outdated Show resolved Hide resolved

DrJosh9000 and others added 10 commits April 30, 2024 14:18

Prefer strings.Builder and WriteString

35af5f7

outputLineAsHTML -> screenLine.asHTML

9e6c26f

Also preallocate lines []string slice, saves time and memory in benchmark

Add streaming benchmarks

71ce798

Recycle line nodes slices

28199e9

No need to garbage collect lots of slices if they can be reused right away

Add streaming renderer tests

8914d39

This checks that the streaming approach produces the same output as the classical buffer-everything approach, for all existing cases.

Minor parseToScreen optimisation

17c1893

This avoids a copy when the existing buffer is empty. Also split the "recycle" line in two to make it clear in memory profiles that append is responsible for allocation, not creating the new line itself.

Make 300 the default for max-lines

4e164c3

And mention that it can be turned off in the flag description

Use fpath var

382f44c

Co-authored-by: Tessa Bradbury <[email protected]>

DrJosh9000 force-pushed the full-stream-ahead branch from bd535aa to 382f44c Compare April 30, 2024 04:18

DrJosh9000 merged commit 54d249f into main Apr 30, 2024
1 check passed

DrJosh9000 deleted the full-stream-ahead branch April 30, 2024 04:27

DrJosh9000 mentioned this pull request May 7, 2024

Bump version to v3.12.0 #150

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Streaming mode #126

Streaming mode #126

DrJosh9000 commented Dec 21, 2023 •

edited

Loading

dabarrell commented Jan 17, 2024

DrJosh9000 commented Jan 18, 2024

tessereth left a comment

Streaming mode #126

Streaming mode #126

Conversation

DrJosh9000 commented Dec 21, 2023 • edited Loading

The goal

Why

How

But while I'm here...

Show me the benchmarks!

dabarrell commented Jan 17, 2024

DrJosh9000 commented Jan 18, 2024

tessereth left a comment

Choose a reason for hiding this comment

DrJosh9000 commented Dec 21, 2023 •

edited

Loading