Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to ryu for float to string #120

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Switch to ryu for float to string #120

wants to merge 1 commit into from

Conversation

kylebarron
Copy link
Member

@kylebarron kylebarron commented Oct 8, 2024

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.
  • I ran cargo fmt

I'm not sure that this is using ryu for all geometry types, because I had to update all the point/linestring tests because ryu was printing out 1.0 instead of 1. But I didn't have to change the polygon tests:

"POLYGON((0 0,20 40,40 0,0 0),(5 5,20 30,30 5,5 5))",

Closes #118

@kylebarron kylebarron requested a review from michaelkirk October 8, 2024 02:51
@lnicola
Copy link
Member

lnicola commented Oct 8, 2024

I'm not convinced this is a good idea. From the rustc issue:

As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so the performance is better on short numbers as reflected in the table. Ryū generates an exact representation and then goes right to left removing noise digits, so the performance is better on the longest numbers.

The Ryū benchmarks all focus on a random distribution of numbers, and those have on average a lot of digits.

dtoa might be a better choice, then, given that real-world coordinates should have fewer fractional digits?

Also: https://xkcd.com/2170/.

@michaelkirk
Copy link
Member

Since this is seemingly a straight forward performance proposal, we should be able to measure it. Can you include any benchmarks using plausible real world data that shows improvement?

Related: #121

Note: There were no measurable perf differences for me with the existing write benchmark. It's likely that the existing data we've included in the benchmarks is not representative of real world data, so I'd be happy to include different data in the benchmarks if it's useful to show the improvements here.

That said, if we have to contrive really crazy data to get a measurable perf improvement — e.g. if it's only an improvement for people providing gigabytes of molecular level GPS precision (as an extreme example), and a regression for everybody else, then obviously we shouldn't do it.

@@ -42,11 +43,12 @@ where
///
/// let point: Point<f64> = point!(x: 1., y: 2.);
///
/// assert_eq!(point.wkt_string(), "POINT(1 2)");
/// assert_eq!(point.wkt_string(), "POINT(1.0 2.0)");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to feel about this change in behavior. On the one hand it makes our serialization less efficient which is bad, OTOH it conveys to the reader that this was serialized from floating point, so I guess that's useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of not leaving ambiguous comments, I'll say I prefer the old version, which serialized efficiently, but I'd be willing to let that small demerit go if there are substantial other wins.

That said, if there's a way to use faster float serialization while still maintaining the integer display for .0, I think that'd be best.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ryu doesn't have any options as far as I can tell. So if we switched to ryu I don't think we'd be able to serialized floats without the .0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can still check if the string ends with .0 and strip that off.

let mut buffer = ryu::Buffer::new();
let y = buffer.format(self.y);

write!(f, "{} {}", x, y)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I suspect a sequence of f.write_str calls would be a little faster here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, it might muddy the waters on perf improvements though. I'd prefer to keep everything except ryu the same in this PR.

@lnicola
Copy link
Member

lnicola commented Oct 8, 2024

There were no measurable perf differences for me with the existing write benchmark.

We could set up Criterion, I guess. And it might be worth comparing to dtoa, which has the same API.

@michaelkirk
Copy link
Member

There were no measurable perf differences for me with the existing write benchmark.

We could set up Criterion, I guess. And it might be worth comparing to dtoa, which has the same API.

What do you mean? We're already using criterion AFAICT.

@lnicola
Copy link
Member

lnicola commented Oct 8, 2024

We're already using criterion AFAICT

Not my finest day 🤦.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use ryu for faster float-to-string conversion
3 participants