overhaul std.fmt formatting api #1358
Comments
|
it should be restricted to packed types then. |
I'd like to further propose:
|
runtime zfilling is useful. i wanted that feature for this project: https://github.com/thejoshwolfe/hexdump-zip . when that tool was written in javascript, i would determine the digit count for the highest memory address value (which depends on the user-provided input file size), then zfill all memory address representations to that width. the zig implementation of that tool can't easily do that, so i just zfill everything to the maximum conceivable memory address, which is way bulky. |
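One way to get the runtime zero-fill described above without any support from the format string is to format the digits first and pad by hand. This is only a sketch, assuming Zig 0.11 or later (for the slice forms of `@memset` and `@memcpy`); `hexDigitCount` and `zfillHex` are hypothetical helper names, not part of `std.fmt`.

```zig
const std = @import("std");

/// Hypothetical helper: number of hex digits needed to print `max_value` (at least 1).
fn hexDigitCount(max_value: u64) usize {
    var count: usize = 1;
    var rest = max_value >> 4;
    while (rest != 0) : (rest >>= 4) count += 1;
    return count;
}

/// Hypothetical helper: writes `value` as lowercase hex into `out`,
/// left-padded with '0' to a width that is only known at runtime.
fn zfillHex(out: []u8, value: u64, width: usize) ![]const u8 {
    var tmp: [16]u8 = undefined; // a u64 never needs more than 16 hex digits
    const digits = try std.fmt.bufPrint(&tmp, "{x}", .{value});
    const total = @max(width, digits.len);
    if (total > out.len) return error.NoSpaceLeft;
    const pad = total - digits.len;
    @memset(out[0..pad], '0');
    @memcpy(out[pad..total], digits);
    return out[0..total];
}

test "addresses are padded to the width of the largest address" {
    var buf: [16]u8 = undefined;
    const file_size: u64 = 0x12345; // stand-in for a user-provided input size
    const width = hexDigitCount(file_size - 1); // highest address is 0x12344
    try std.testing.expectEqual(@as(usize, 5), width);
    try std.testing.expectEqualStrings("0001f", try zfillHex(&buf, 0x1f, width));
}
```

The width is computed once from the largest address and then reused for every line, which mirrors what the JavaScript version of the tool is described as doing.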
Supports {x} for lowercase and {X} for uppercase;
allow bytes to be printed-out as hex (#1358)
Small note but the api should take an extra
|
I'm toying with Zig on a STM32F103C8 MCU (64kB/128kB flash, 20kB RAM) aka Bluepill. So far it's been a success story after overcoming some road bumps (see #1290 for reference)! Thank you guys for your work! Zig is the kind of fun I was missing in my life recently! 😄 For debugging purposes I've used std.fmt and encountered a strange problem. The following example freezes the MCU:
The problem is in https://github.com/ziglang/zig/blob/master/std/fmt.zig
For a yet unknown reason the I thought that the stack overwrites something but the stack pointer should be properly initialized to 0x20005000 at startup. The call is not particularly deep, so unless there's something I'm missing, it shouldn't be the cause. Later on I'm going to set up a proper debugging environment and inspect deeper. I'm writing this as a reminder and a hurray that Zig can be used just fine on resource-constrained platforms for embedded projects. Could you please bear it in mind when designing the standard library? I think a unified |
the -0x8000000000000000 =>
"-1000000000000000000000000000000000000000000000000000000000000000" indeed that seems overkill if we know that we're not going to need that. Seems like a straightforward optimization to make with comptime code that uses the bitcount of the integer type. |
The number was wrong anyway because it was from before we had arbitrary sized ints, and now you can print a u128 as binary. I pushed a commit to make it use the integer bit count as @thejoshwolfe suggested. |
It works! Thank you gentlemen! |
This removes the odd width and precision specifiers and replaces them with the more consistent API described in #1358. Take the following example: {1:5.9} This refers to the argument at index 1 (0-indexed, so the second argument) in the argument list. It will be printed with a minimum width of 5 and will have a precision of 9 (if applicable). Not all types correctly use these parameters just yet. There are still some gaps to fill in. Fill characters and alignment have yet to be implemented.
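As a rough illustration of the position, width and precision fields described in that commit message, the sketch below formats into a fixed buffer with `std.fmt.bufPrint`. It assumes a later Zig release in which fill and alignment are implemented, and it spells out the `d` specifier, fill and alignment explicitly instead of relying on defaults; the values are made up for the example.

```zig
const std = @import("std");

test "positional argument with fill, width and precision" {
    var buf: [32]u8 = undefined;
    // {1d:0>8.3}: argument at index 1, decimal, fill '0', right-aligned,
    // minimum width 8, precision 3. {0d}: argument at index 0, decimal.
    const out = try std.fmt.bufPrint(
        &buf,
        "{1d:0>8.3}|{0d}",
        .{ @as(u32, 7), @as(f64, 3.14159) },
    );
    try std.testing.expectEqualStrings("0003.142|7", out);
}
```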
@tiehuis If you're open to outside contributions, how do you feel about these suggestions:
|
@hryx Happy with that. I'll write up the remaining parts tonight. |
Remaining things required in this issue:
|
from IRC;
|
I just stumbled over this (for me) unexpected result:
> cat src/main.zig
const std = @import("std");
pub fn main() anyerror!void {
    const value = @as(u8, 0b00001101); // 13
    std.log.info("value is '{b:08}'", .{value});
}
> zig build run
info: value is '    1101'
The format parser just ignored the leading zero in the format specifier.
One option is to provide some comptime assistance and produce a compile error for leading zeros in the width parameter.
diff --git a/lib/std/fmt.zig b/lib/std/fmt.zig
index 97dfcc78b..34ac8c940 100644
--- a/lib/std/fmt.zig
+++ b/lib/std/fmt.zig
@@ -309,6 +309,11 @@ pub fn format(
break :init @field(args, fields_info[arg_index].name);
} else {
+ if (parser.peek(0)) |ch| {
+ if (ch == '0') {
+ @compileError("Leading 0 is not valid for the width parameter. If you intended to zero-fill, try specifying the alignment parameter like this: {" ++ parser.buf[0..parser.pos] ++ "0>" ++ parser.buf[parser.pos + 1 ..] ++ "}");
+ }
+ }
break :init parser.number();
}
};
Another option is to allow the fill character to be specified without an explicit alignment parameter, if that fill character is suitable (probably violates "Only one obvious way to do things").
diff --git a/lib/std/fmt.zig b/lib/std/fmt.zig
index 34ac8c940..79fdd3023 100644
--- a/lib/std/fmt.zig
+++ b/lib/std/fmt.zig
@@ -268,10 +268,12 @@ pub fn format(
// Parse the fill character
// The fill parameter requires the alignment parameter to be specified
- // too
- if (comptime parser.peek(1)) |ch| {
- if (comptime mem.indexOfScalar(u8, "<^>", ch) != null) {
- options.fill = comptime parser.char().?;
+ // too, unless the fill character is unambiguous, such as '0'.
+ if (comptime parser.peek(0)) |ch| {
+ const has_alignment_parameter = mem.indexOfScalar(u8, "<^>", parser.peek(1) orelse 0) != null;
+ if (has_alignment_parameter or mem.indexOfScalar(u8, "123456789<^>", ch) == null) {
+ options.fill = ch;
+ _ = comptime parser.char();
}
}
I'd be fine with either, but I think ignoring this will produce many a frowny face. |
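For reference, the spelling that the proposed compile error points users toward, with an explicit fill character and alignment, can be sketched as follows. It uses `std.fmt.bufPrint` and `std.testing` so the result can be checked without a logger; the test name is made up.

```zig
const std = @import("std");

test "zero-fill a binary value with explicit fill and alignment" {
    var buf: [16]u8 = undefined;
    const value: u8 = 0b00001101; // 13
    // fill '0', alignment '>' (right), minimum width 8
    const out = try std.fmt.bufPrint(&buf, "{b:0>8}", .{value});
    try std.testing.expectEqualStrings("00001101", out);
}
```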
Trying to use fmt left me annoyed that 2s complement is not represented as how the memory actually looks.
test {
    const print = @import("std").debug.print;
    const min_usable = -2147483647;
    const max_usable = 2147483647;
    print("\n", .{});
    print("min_usable: {d}, {x}\n", .{ min_usable, min_usable });
    print("max_usable: {d}, {x}\n", .{ max_usable, max_usable });
}
i.e. when you quickly want the actual memory representation of an integer in hex or binary to use as a test case.
min_usable: -2147483647, -7fffffff
max_usable: 2147483647, 7fffffff
But I need If I want to properly print the number to copy-paste into a test case, the intuitive approach with absInt also does not work, because
const print = @import("std").debug.print;
const math = @import("std").math;
const test1 = -0x7fffffff;
print("test: {d}, -0x{x}\n", .{ math.absInt(test1), math.absInt(test1) });
I would prefer, if Personally I don't understand the design decision not to represent binary and hex the way the memory actually looks for 2s complement. For integer this is however understandable (they are intended for human inspection). PS: The workaround to cast to unsigned does also not work. |
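A possible way to see the two's complement bit pattern, sketched under the assumption that the value has a sized type such as i32 (the snippets above use untyped comptime_int literals, which have no fixed bit width to reinterpret) and that Zig 0.11 or later is used, where `@bitCast` infers its result type. This may not be the cast the commenter tried.

```zig
const std = @import("std");

test "print the two's complement bit pattern of a signed integer" {
    var buf: [16]u8 = undefined;
    const min_usable: i32 = -2147483647;
    // Reinterpret the same 32 bits as unsigned; no arithmetic conversion happens.
    const bits: u32 = @bitCast(min_usable);
    const out = try std.fmt.bufPrint(&buf, "{x:0>8}", .{bits});
    try std.testing.expectEqualStrings("80000001", out);
}
```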
How does Zig do it? |
Can we add onto this proposal a way to override the default max depth through either a format string parameter or a formatter? |
Closing as complete with the remaining TODO split into #12313. Any other missing alignment/padding should be reported as new bugs. |
Please put these format specifiers in the documentation. It is hard to find string formatting information for Zig. A frequent use case like printing debug information should be easy to find in the documentation. It took me a while to figure out that printing a floating point number with nothing after the dot is "{d:.0}". Maybe there should be examples in the docs, because this (or something similar) seems to be quite a common case. |
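The kind of documentation example being asked for might look like the sketch below, which checks the "{d:.0}" case mentioned above next to a two-digit precision. The value 2.75 is chosen only for illustration.

```zig
const std = @import("std");

test "float precision in a format string" {
    var buf: [16]u8 = undefined;
    const x: f64 = 2.75;
    // precision 0: no digits after the dot, value is rounded
    try std.testing.expectEqualStrings("3", try std.fmt.bufPrint(&buf, "{d:.0}", .{x}));
    // precision 2: exactly two digits after the dot
    try std.testing.expectEqualStrings("2.75", try std.fmt.bufPrint(&buf, "{d:.2}", .{x}));
}
```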
wrong place to post that but it is at https://ziglang.org/documentation/master/std/#A;std:fmt.format |
Remaining Work.
This is a proposal for the formatting interface exposed via std.fmt, which shows up in most printing functions (e.g. std.debug.warn).
This is largely based on Rust's std::fmt (which in turn is similar to Python3), so see that for a more in-depth reference for certain parts.
Formatting Options
We do take the following formatting options from Rust:
- positional arguments ("{0}")
- fill and alignment ("{:<} {:<5} {:0^10}")
- width ("{:5}")
- precision ("{:.5} {:.0}")
We do not take the following:
- # alternate printing forms
- +, -, 0 sign flags (NOTE: may actually want these)
- named arguments (format!("{arg1}", arg1 = "example"))
- precision taken from an argument (format!("{:.*}", 3, 5.0923412)) (NOTE: could add this in if reasonable demand)
- width taken from an argument (format!("{0:1$}", 5.0923412, 3))
Format Specifiers
These are largely unchanged but a few are:
- {} (primitives): print the default primitive representation (if it exists)
- {c} (int): print as an ascii character
- {b} (int): print as binary
- {x} (int): print as lowercase hex
- {X} (int): print as uppercase hex
- {o} (int): print as octal
- {e} (float): print in exponent form
- {d} (int/float): print in base10/decimal form
- {s} ([]u8/*u8): print as null-terminated string
- {*} (any): print as a pointer (hex) (NOTE: does & make more sense here?)
- {?} (any): print full debug representation (e.g. traverse structs etc to primitive fields)
- {#} (any): print raw bytes of the value (hex) (NOTE: do we need this? how often is it used?)
These format specifiers are removed from the current implementation:
- {.} (float): was to specify decimal float, now {d} replaces this
- {e10} (float): precision was attached to format specifier. The new format specifier type would replace this.
- {B} (any): printed raw bytes of value, replaced by {#}. This is to ensure it cannot be shadowed by a user defined function.
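A few of the single-character specifiers listed above can be tried out with a sketch like the following. It only covers specifiers that exist in released versions of `std.fmt` ({x}, {X}, {b}, {o}, {c}, {d}, {s}); the proposed {?} and {#} are left out because their final spelling is not settled in this proposal.

```zig
const std = @import("std");

test "integer and string format specifiers" {
    var buf: [64]u8 = undefined;
    const n: u8 = 255;
    const out = try std.fmt.bufPrint(
        &buf,
        "{x} {X} {b} {o} {c} {d} {s}",
        .{ n, n, n, n, @as(u8, 'A'), n, "hi" },
    );
    try std.testing.expectEqualStrings("ff FF 11111111 377 A 255 hi", out);
}
```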
User-defined functions
Alongside this I propose a change in the way format functions are defined.
The current function to implement is of the form:
I instead propose changing this to be of the form:
Format specifiers should be simple and ensuring they are only 1 character
at least enforces consistency and simpler format strings. This also makes
switching on the format cases much easier for an implementation and avoids
some easy edge cases.
format is null for the {} case. If the function does not handle the format specifier they can return null and std.fmt will handle an appropriate message.
Old Example
New Example
One extra thing that comes to mind is whether we want to allow access to the
formatting specifiers for user-defined functions, passing the values to each.
An example use-case for the above would be allowing access to the precision
field and printing the vector components with that precision instead of
hardcoding. One concern is format functions don't necessarily have to use that
information for the correct purpose and could use it poorly. This is minor,
though.
Shortcomings/Extras
Leftside format-specifier type
With this proposal {s} becomes {:s}. Is this fine? Since we only accept one character and don't want named arguments we could put this on the leftside of : alongside the positional argument. This would mean the common case is the same as now and fairly clean. With a positional parameter this would change from:
This is still unambiguous.
Grammar
End
Feel free to make any other suggestions and/or highlight any issues. I'd
prefer to keep this as simple as reasonable as long as it covers all the common
use-cases reasonably.