Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve --heap-size-hint arg handling #48050

Merged
merged 15 commits into from
Feb 23, 2024
Merged
7 changes: 4 additions & 3 deletions doc/man/julia.1
Original file line number Diff line number Diff line change
Expand Up @@ -228,9 +228,10 @@ fallbacks to the latest compatible BugReporting.jl if not. For more information,
--bug-report=help.

.TP
--heap-size-hint=<size>
Forces garbage collection if memory usage is higher than that value. The value can be
specified in units of K, M, G, T, or % of physical memory.
--heap-size-hint=<value>
Forces garbage collection if memory usage is higher than the given value.
The value may be specified as a number of bytes, optionally in units of
KB, MB, GB, or TB, or as a percentage of physical memory with %.

.TP
--compile={yes*|no|all|min}
Expand Down
2 changes: 1 addition & 1 deletion doc/src/manual/command-line-interface.md
Original file line number Diff line number Diff line change
Expand Up @@ -213,7 +213,7 @@ The following is a complete list of command-line switches available when launchi
|`--output-incremental={yes\|no*}` |Generate an incremental output file (rather than complete)|
|`--trace-compile={stderr,name}` |Print precompile statements for methods compiled during execution or save to a path|
|`--image-codegen` |Force generate code in imaging mode|
|`--heap-size-hint=<size>` |Forces garbage collection if memory usage is higher than that value. The memory hint might be specified in megabytes (e.g., 500M) or gigabytes (e.g., 1G)|
|`--heap-size-hint=<size>` |Forces garbage collection if memory usage is higher than the given value. The value may be specified as a number of bytes, optionally in units of KB, MB, GB, or TB, or as a percentage of physical memory with %.|


!!! compat "Julia 1.1"
Expand Down
90 changes: 52 additions & 38 deletions src/jloptions.c
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@

#include <unistd.h>
#include <getopt.h>

#include "julia_assert.h"

#ifdef _OS_WINDOWS_
Expand All @@ -18,6 +19,15 @@ char *shlib_ext = ".dylib";
char *shlib_ext = ".so";
#endif

/* This simple hand-crafted tolower exists to avoid locale-dependent effects in
* behaviors (and utf8proc_tolower wasn't linking properly on all platforms) */
static char ascii_tolower(char c)
{
if ('A' <= c && c <= 'Z')
return c - 'A' + 'a';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, let me recap: this PR changed things to use tolower, which has issues, so it was suggested to use utf8proc_tolower instead which then was replaced by this hand-written function...

How about my original suggestion to just stick with what the code did before this PR, and simply have two cases in the switch for each letter? That seems by far simplest and is guaranteed to have no UTF-8 related issues or anything else and requires no additional explanation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, yeah. But this works too, and there are a couple extra cases now, including one that is not in a switch statement, so it is not entirely obvious to me that it is significantly better one way or the other; just a style preference.

return c;
}

static const char system_image_path[256] = "\0" JL_SYSTEM_IMAGE_PATH;
JL_DLLEXPORT const char *jl_get_default_sysimg_path(void)
{
Expand Down Expand Up @@ -187,9 +197,9 @@ static const char opts[] =
" expressions. It first tries to use BugReporting.jl installed in current environment and\n"
" fallbacks to the latest compatible BugReporting.jl if not. For more information, see\n"
" --bug-report=help.\n\n"

" --heap-size-hint=<size> Forces garbage collection if memory usage is higher than that value.\n"
" The value can be specified in units of K, M, G, T, or % of physical memory.\n\n"
" --heap-size-hint=<size> Forces garbage collection if memory usage is higher than the given value.\n"
" The value may be specified as a number of bytes, optionally in units of\n"
" KB, MB, GB, or TB, or as a percentage of physical memory with %.\n\n"
;

static const char opts_hidden[] =
Expand Down Expand Up @@ -801,43 +811,47 @@ JL_DLLEXPORT void jl_parse_opts(int *argcp, char ***argvp)
break;
case opt_heap_size_hint:
if (optarg != NULL) {
size_t endof = strlen(optarg);
long double value = 0.0;
if (sscanf(optarg, "%Lf", &value) == 1 && value > 1e-7) {
char unit = optarg[endof - 1];
uint64_t multiplier = 1ull;
switch (unit) {
case 'k':
case 'K':
multiplier <<= 10;
break;
case 'm':
case 'M':
multiplier <<= 20;
break;
case 'g':
case 'G':
multiplier <<= 30;
break;
case 't':
case 'T':
multiplier <<= 40;
break;
case '%':
if (value > 100)
jl_errorf("julia: invalid percentage specified in --heap-size-hint");
uint64_t mem = uv_get_total_memory();
uint64_t cmem = uv_get_constrained_memory();
if (cmem > 0 && cmem < mem)
mem = cmem;
multiplier = mem/100;
break;
default:
jl_errorf("julia: invalid unit specified in --heap-size-hint");
break;
}
jl_options.heap_size_hint = (uint64_t)(value * multiplier);
char unit[4] = {0};
int nparsed = sscanf(optarg, "%Lf%3s", &value, unit);
if (nparsed == 0 || strlen(unit) > 2 || (strlen(unit) == 2 && ascii_tolower(unit[1]) != 'b')) {
jl_errorf("julia: invalid argument to --heap-size-hint (%s)", optarg);
}
uint64_t multiplier = 1ull;
switch (ascii_tolower(unit[0])) {
case '\0':
case 'b':
break;
case 'k':
multiplier <<= 10;
break;
case 'm':
multiplier <<= 20;
break;
case 'g':
multiplier <<= 30;
break;
case 't':
multiplier <<= 40;
break;
case '%':
if (value > 100)
jl_errorf("julia: invalid percentage specified in --heap-size-hint");
uint64_t mem = uv_get_total_memory();
uint64_t cmem = uv_get_constrained_memory();
if (cmem > 0 && cmem < mem)
mem = cmem;
multiplier = mem/100;
break;
default:
jl_errorf("julia: invalid argument to --heap-size-hint (%s)", optarg);
break;
}
long double sz = value * multiplier;
if (isnan(sz) || sz < 0) {
jl_errorf("julia: invalid argument to --heap-size-hint (%s)", optarg);
}
jl_options.heap_size_hint = sz < UINT64_MAX ? (uint64_t)sz : UINT64_MAX;
}
if (jl_options.heap_size_hint == 0)
jl_errorf("julia: invalid memory size specified in --heap-size-hint");
Expand Down
16 changes: 16 additions & 0 deletions test/cmdlineargs.jl
Original file line number Diff line number Diff line change
Expand Up @@ -1148,3 +1148,19 @@ if Sys.islinux() && Sys.ARCH in (:i686, :x86_64) # rr is only available on these
"_RR_TRACE_DIR" => temp_trace_dir); #=stderr, stdout=#))
end
end

@testset "--heap-size-hint" begin
exename = `$(Base.julia_cmd())`
@test errors_not_signals(`$exename --heap-size-hint -e "exit(0)"`)
@testset "--heap-size-hint=$str" for str in ["asdf","","0","1.2vb","b","GB","2.5GB̂","1.2gb2","42gigabytes","5gig","2GiB","NaNt"]
@test errors_not_signals(`$exename --heap-size-hint=$str -e "exit(0)"`)
end
k = 1024
m = 1024k
g = 1024m
t = 1024g
@testset "--heap-size-hint=$str" for (str, val) in [("1", 1), ("1e7", 1e7), ("2.5e7", 2.5e7), ("1MB", 1m), ("2.5g", 2.5g), ("1e4kB", 1e4k),
("1e100", typemax(UInt64)), ("1e500g", typemax(UInt64)), ("1e-12t", 1), ("500000000b", 500000000)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, clearly I did not look closely enough before: I thought this was about adding support for inputs like 2.5g which seems reasonable to me

But is a heap size hint of the form 1e-12t or 1e100 really something we want to / should support?

I guess it would be more work to not support them, though. Ah well.

Copy link
Member Author

@mbauman mbauman Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't change the numeric syntaxes supported here — it just adds tests for them.

The big change here isn't about the supported numeric values, but rather the byte unit. Before Julia would silently discard/ignore unrecognized units. After this PR, Julia errors. That's the big improvement.

Currently:

▶ julia --heap-size-hint=2.5GB -E "Base.JLOptions().heap_size_hint"
0x0000000000000002

▶ julia --heap-size-hint=2.5Gibberish -E "Base.JLOptions().heap_size_hint"
0x0000000000000002

▶ julia --heap-size-hint=2.5Gobbledeegook -E "Base.JLOptions().heap_size_hint"
0x0000000000000a00

After:

▶ ./julia --heap-size-hint=2.5GB -E "Base.JLOptions().heap_size_hint"
0x00000000a0000000

▶ ./julia --heap-size-hint=2.5Gibberish -E "Base.JLOptions().heap_size_hint"
ERROR: julia: invalid argument to --heap-size-hint (2.5Gibberish)

▶ ./julia --heap-size-hint=2.5Gobbledeegook -E "Base.JLOptions().heap_size_hint"
ERROR: julia: invalid argument to --heap-size-hint (2.5Gobbledeegook)

@test parse(UInt64,read(`$exename --heap-size-hint=$str -E "Base.JLOptions().heap_size_hint"`, String)) == val
end
end