Skip to content

implement DIP1027 - Easy String Interpolation #15722

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

WalterBright
Copy link
Member

I implemented https://github.com/dlang/DIPs/blob/master/DIPs/rejected/DIP1027.md#description so people can try it out.

The implementation:

  1. adds a new token, TOK.istring
  2. adds lexer code to tokenize an i-string
  3. adds a new expression node, IStringExp
  4. adds parser code to turn the i-string token into an IStringExp
  5. adds a semantic routine for IStringExp to turn it into a tuple

And that's really about all there is to it.

As shown in the test case, pragma(msg, istring) is a convenient way to see what tuple the compiler turned your i-string into.

The test cases use printf (boo, hiss!) because the test suite isn't supposed to rely on Phobos, and writefln is in Phobos.

Anyhow, I had fun writing it, hope it's fun to use, too!

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @WalterBright!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#15722"

@WalterBright WalterBright force-pushed the dip1027 branch 3 times, most recently from 80cd64f to 842de6e Compare October 22, 2023 06:30
@WalterBright
Copy link
Member Author

Most unhelpful error message ever:

Failed to compile the `unit` test executable! (exit code 1)

@WalterBright WalterBright mentioned this pull request Oct 22, 2023
@thewilsonator
Copy link
Contributor

That generally you either missed something when updating the headers, or you changed the way that existing behaviour works, and you should either revert that, or update the tests to accomodate to the new behaviour.

@Imperatorn
Copy link
Contributor

Imperatorn commented Oct 22, 2023

@WalterBright is it just to rdmd build.d + put the new dmd where it should be and try it out basically?

I want to try it out :)

Is there any shorthand notation for directly getting the string from the interpolated string?

@WalterBright
Copy link
Member Author

The failures here are in the C++ interface, dmd will work fine if you use build.d.

@WalterBright
Copy link
Member Author

Seems to be working now!

@Imperatorn
Copy link
Contributor

Is there any shorthand notation for directly getting the string from the interpolated string?

@tgehr
Copy link
Contributor

tgehr commented Oct 22, 2023

Is there any shorthand notation for directly getting the string from the interpolated string?

You'll have to call format.

@Imperatorn
Copy link
Contributor

Imperatorn commented Oct 22, 2023

@WalterBright

Would be nice if something like this was built in. Code just for demonstration purposes.

string s(Args...)(Args args) {

	return args.format;
}

scope const(char*) p(Args...)(Args args) {

	return args.s.toStringz;
}

void main()
{
	int apples = 24;
	int bananas = 42;

	auto s1 = i"I ate ${%d}apples and ${%d}bananas totalling ${%d}(apples + bananas) fruit".s;
	auto s2 = i"I ate ${%d}apples and ${%d}bananas totalling ${%d}(apples + bananas) fruit".p;

	writeln(typeid(s1));
	writeln(typeid(s2));

	writeln(s1);
	printf(s2);
}

It would be nice to have some convenience and be able to write something like this instead (the letter is not important, just for the sake of demonstrating)

        // Gets the interpolated object
	auto s0 = i"I ate ${%d}apples and ${%d}bananas totalling ${%d}(apples + bananas) fruit";
	
        // Gets the interpolated string (normal D string)
	auto s1 = s"I ate ${%d}apples and ${%d}bananas totalling ${%d}(apples + bananas) fruit";
	
        // Gets the interpolated C-string
	auto s2 = p"I ate ${%d}apples and ${%d}bananas totalling ${%d}(apples + bananas) fruit";

Of course in a way that doesn't compromise the entire compiler etc. Just for convenience and such that you don't have to import std.conv and std.string everywhere.

Since i"" is the base, all those others are just .format or .format.toStringz, that's why I thought maybe some clever solution could be done to accomplish that.

@WalterBright
Copy link
Member Author

@Imperatorn your idea is good, and I thought about it for a while. The issue is that the compiler would have to have builtin the equivalent of format(), which is a fairly large and complex piece of code. However, you can get pretty close to your suggestion with:

import core.stdc.stdio;
import std.format;

alias f = std.format.format;

enum x = "betty";
enum s = i"hello $x\n".f;

void main()
{
    puts(s.ptr);
}

@dd86k
Copy link

dd86k commented Oct 22, 2023

I thought we wanted to move the complexity towards Phobos. What prompted this experimentation? (if other than curiosity)

@WalterBright
Copy link
Member Author

#dd86k not sure what you mean, this relies on Phobos to do the actual formatting.

@dd86k
Copy link

dd86k commented Oct 22, 2023

not sure what you mean, this relies on Phobos to do the actual formatting.

My bad. I was trying to point out the i string prefix (over a function or template). Then again, this seems like a pretty minimal change, so I guess why not.

@Imperatorn
Copy link
Contributor

@Imperatorn your idea is good, and I thought about it for a while. The issue is that the compiler would have to have builtin the equivalent of format(), which is a fairly large and complex piece of code. However, you can get pretty close to your suggestion with:

import core.stdc.stdio;
import std.format;

alias f = std.format.format;

enum x = "betty";
enum s = i"hello $x\n".f;

void main()
{
    puts(s.ptr);
}

Ok, I see. I was just curious.

Thanks for looking into it.

@p0nce
Copy link
Contributor

p0nce commented Oct 23, 2023

We do all logging in printf (or printf-like) so I'm glad to be able to use that.

@adamdruppe
Copy link
Contributor

Reminder to commenters: this change was overwhelmingly REJECTED on review every time Walter has tried to push it through. It has several fatal flaws, including that it does the wrong thing by default. It is memory unsafe if you use it with printf without manual attention. It is type unsafe using it with most other functions with no help from the compiler - just giving random runtime results. This implementation addresses zero of those concerns brought up by community review.

If you want interpolation in D, that actually addresses these concerns among other lessons learned from real world use (it incorporates ideas from Andrei Alexandrescu and John Colvin from experience with Symmetry) as well as studying community review over the years (it also incorporates lessons from me and Steven Schveighoffer from previous DIPs), see here: #15715 See the link in the first comment for several example use cases. Even printf works there - we also incorporated the dip1027 use cases!

@WalterBright
Copy link
Member Author

have the potential to just work

Not for any non-default formats such as %03d.

pragma(msg, "Your name is ", name, " and you are ", age, " years old");

does indeed work and isn't that much different from:

pragma(msg, "Your name is $name and you are $age years old");

Of course, we could add such special casing for default formatting, but that's more special cases and complexity to learn.

@WalterBright
Copy link
Member Author

In any case, discussion of what to do with pragma(msg) is not relevant to this PR.

@nordlow
Copy link
Contributor

nordlow commented Dec 19, 2023

adam

Can you describe or link to a description of what's unsafe about this implementation?

@adamdruppe
Copy link
Contributor

This has been discussed to death multiple times. Read the review thread of the rejected DIP.

@nordlow
Copy link
Contributor

nordlow commented Dec 19, 2023

This has been discussed to death multiple times. Read the review thread of the rejected DIP.

Got it. Thanks.

@anon17
Copy link

anon17 commented Dec 26, 2023

At least this one is simpler, not described as "6 years of accumulated bikeshedding" and is a better starting point for improvements. That said, did DLF express an opinion about overengineered features?

.f suffix isn't very long, but instead you can generate call to an ambient function: .formatInterpolatedString(tuple), if the function isn't there, then no luck, std.format would have alias format formatInterpolatedString. Interpolated strings are used mostly for string creation and logging optimization, so I think it should be a string by default, and functions should opt in to receive a tuple, say, with an attribute. Only a few functions need it, and they probably already have it for printf diagnostic.

@formatlike
void logInfo(string sformat, ...);

@adamdruppe
Copy link
Contributor

At least this one is simpler, not described as "6 years of accumulated bikeshedding" and is a better starting point for improvements.

This proposal is actually from the middle of the "6 years of accumulated bikeshedding", and is, in reality, more complex than other implementation, since it does various string transformations in the compiler (which then library code has to undo to actually use it) instead of just wrapping them in a library type from the get go.

@anon17
Copy link

anon17 commented Dec 26, 2023

Or alias format _d_format_interpolated_string how compiler hooks look like.

@anon17
Copy link

anon17 commented Dec 26, 2023

various string transformations

You mean this transformation is heavier than instantiation of InterpolatedExpression!"baz + 4"() template?

@schveiguy
Copy link
Member

You mean this transformation is heavier than instantiation of InterpolatedExpression!"baz + 4"() template?

You lose the information, and have to re-parse the thing at runtime, that was already parsed by the compiler. In essence, you give the library more work to do -- the same work that was already done (parsing out interpolation tokens).

Granted, there is a tad more work to do if you instantiate a template with the string literal (which could be minimized with effort). But the cost is humungous on the library/runtime side compared to that.

@RazvanN7
Copy link
Contributor

@WalterBright You have approved #15715 . Where does that leave us with regards to this PR?

@schveiguy
Copy link
Member

I think it should be closed, reopen if I'm wrong.

@schveiguy schveiguy closed this Jan 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.