Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Palindrome Product Exercise #501

Merged
merged 24 commits into from
Apr 27, 2018
Merged

Palindrome Product Exercise #501

merged 24 commits into from
Apr 27, 2018

Conversation

Menkir
Copy link
Contributor

@Menkir Menkir commented Apr 15, 2018

Hey Guys,

i just added the exercise palindrome product https://github.com/exercism/problem-specifications/blob/master/exercises/palindrome-products/description.md

Please give me feedback :)

Menkir and others added 6 commits April 12, 2018 08:13
	- pass 3 range failure tests by if statement in lib.rs
	- add exmaple.rs ref. solution
	- move ref solution from lib to example.rs
@coriolinus
Copy link
Member

Thanks @Dev7353 for this work!

Just going to summarize the Travis results here, as you may not be familiar with interpreting its output:

  • All but the first test should have #[ignore] attributes
  • The filename of the README should be README.md, not README.MD (note capitalization)

I'll perform a proper review later once Travis passes, but here are some items I noted in a quick glance:

  • You haven't assigned a difficulty in config.json. It actually surprises me that configlet lint hasn't flagged that as an issue. Please assign an appropriate difficulty and insert the exercise in the appropriate location within config.json.
  • I see that you've placed this exercise in config.json among the difficulty 10 exercises. That ranking surprises me; the example is fewer than 100 lines of straightforward code. I'm willing to be convinced that that difficulty is appropriate, but I'd like to hear your justification for it.
  • Track policy prefers an actual stub in the stub file, not simply a blank file.

@Menkir
Copy link
Contributor Author

Menkir commented Apr 16, 2018

Thanks, @coriolinus for the detailed feedback. I'm wondered as well why lint doesn't flag anything.
The reason why I set the difficulty for 10 was retrospectively nonsense. I struggled a bit with the duration of the testing because I needed to check millions of numbers so the test was running very long. Because I don't find a way to decrease the testing time so I set the difficulty higher.

Another Question: When I write the stub in the lib.rs file. Do I need also to specify the return value? Because then I need to specify the struct as well which is part of the return value.

@coriolinus
Copy link
Member

We'll see what we can do about the testing run time. There are options to run the automated tests in release mode, which may be applicable here, and it may prove possible to improve the algorithm used in the example (though I haven't actually looked at that in any detail yet, so this is only speculation). In the meantime, you should pick a difficulty more in line with how challenging this exercise is to solve.

Yes, you'll need to specify all appropriate types and return values in the file. The Forth exercise, another difficulty-10 problem, does it well: all necessary types are defined, including the error enum, in a way which does not in any way reduce the difficulty of completing the exercise.

@Menkir
Copy link
Contributor Author

Menkir commented Apr 17, 2018

Hi @coriolinus ,
I just checked out the travis log and it says:

Exercises with out-of-date README.md:$
palindrome-products

What does that mean? What is out of date?

@coriolinus
Copy link
Member

coriolinus commented Apr 17, 2018

@Dev7353 That error means that README.md is out of sync with the data from upstream. The best solution here is to generate the readme instead of attempting to edit it by hand, because it's not always obvious what the differences are. Assuming you're on a unixy system, you can do so like this:

# start in the project root
# get configlet and put it into the bin directory
bin/fetch-configlet
bin/configlet generate . --only "palindrome-products"

Copy link
Member

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Dev7353 for contributing this! I've added a few notes about particular lines. Of these, the most critical (and the one which will require the most work) is the API. I'm not saying that you have to rewrite things to exactly the API I proposed, but I do believe that bare minimum, we can improve things' names. Palindrome is explicit in a way that ReturnVals is not.

Of the changes requested below, I consider the API design stuff the most critical.

@@ -0,0 +1,6 @@
[package]
name = "palindrome-products"
version = "0.1.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there exists canonical data, we use this version field to indicate when the problem was last synchronized to the canonical data. As such, we'd expect this to be "1.1.0", because that's the canonical data version.

pub factors: Vec<(i32, i32)>,
}

pub fn get_smallest_palindrome_product(_min: i32, _max:i32)->Result<ReturnVals, Error>{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of having get_smallest_palindrome_product and get_largest_palindrome_product as separate functions. This forces the machine to do extra work, which slows down the tests. Given that every test in the canonical data examines both the smallest and largest results, how about the following API:

pub struct Palindrome {
    pub value: i32,
    pub factors: Vec<(i32, i32)>,
}

pub type Palindromes = Vec<Palindrome>;

// in the case of an empty range or no results, we can just return an empty Vec, which simplifies the API
pub fn get_palindrome_products(min: i32, max: i32) -> Palindromes {
    unimplemented!("Find all palindromic numbers which are products of numbers in the inclusive range ({}..{})", min, max)
}

pub fn min(palindromes: &Palindromes) -> Option<Palindrome> {
    unimplemented!("Return the palindrome of minimal value from the supplied list: {:?}", palindromes)
}

pub fn max(palindromes: &Palindromes) -> Option<Palindrome> {
    unimplemented!("Return the palindrome of maximal value from the supplied list: {:?}", palindromes)
}

The tests could then be written in such a way that for any given test range, the computation is then performed only once.

pub factors: Vec<(i32, i32)>,
}

pub fn get_smallest_palindrome_product(_min: i32, _max:i32)->Result<ReturnVals, Error>{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per #476, we're avoiding the use of underscore-prefixed variables in favor of descriptive text in the unimplemented!() block.

Largest,
}

fn test(e: GET, (min, max): (i32, i32))->Result<ReturnVals, Error>{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function appears not to do very much. One possibility would be to remove it entirely. Another would be to expand it into a macro which writes individually-ignorable tests from the input data. I find the second option much cooler, and am willing to help you work on the macro if you need help. See the perfect-numbers tests to see how that might look.

}

fn get_smallest_palindrome(min: i32, max:i32)-> Result<i32, Error>{
let l:Vec<i32> = (min*min .. max*max).collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach isn't very efficient, particularly as the difference between min and max grows. You'd do better with a nested for loop:

for i in min..(max+1) {
    for j in i..(max + 1) {
        if is_palindrome(i*j) {
            ...
        }
    }
}

#[derive(Debug, PartialEq)]
pub struct ReturnVals{
pub result: i32,
pub factors: Vec<(i32, i32)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not obvious to me what value there is in returning the factors here. Best case, there exist certain implementations for which one can get the factorization more or less for free when generating the palindrome. Worst case, the student has to go back and factorize their solutions after generation, which is at best an orthogonal problem.

I see that the canonical data has them. There's no real explanation of why. I therefore propose that for the Rust track, we diverge from the canonical data by dropping the factors from the implementation and the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this, we'd end up with a much nicer type for Palindrome:

pub type Palindrome = u64;

}
#[derive(Debug, PartialEq)]
pub struct ReturnVals{
pub result: i32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No negative number can be a palindrome, so let's use an unsigned type for the palindromes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use u64, in fact, because it's easier than you might expect to overflow a u32. It's fine if min and max are u32, however; this helps reduce the chance of an overflow.

RangeFailure,
}
#[derive(Debug, PartialEq)]
pub struct ReturnVals{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be named Palindrome, which is much more descriptive than ReturnVals.

Menkir added 4 commits April 17, 2018 22:11
	- change correspondongin tests
	- apply api in lib.rs and reference  method arugments to avoid '_'
@petertseng
Copy link
Member

To keep this PR focused on adding the Palindrome Product exercise instead of unrelated changes, let's not configlet fmt in this PR please.

Copy link
Member

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to look really good. I've made a line comment below about a specific change I'd like to see, but the code here is very nearly there.

Please exclude the configlet fmt changes, though: they introduce quite a lot of unrelated noise to this PR.

unimplemented!("Find all palindromic numbers which are products of numbers in the inclusive range ({}..{})", min, max)
}

pub fn min(palindromes: Vec<Palindrome>)->Option<Palindrome>{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accepting Vec<Palindrome> here transfers ownership of that vector, negating the potential performance improvement of computing once and then getting min/max.

It would be possible to accept a &Vec<Palindrome> instead and get that back, but that's not considered idiomatic: it locks the caller into a Vec implementation, when it may be more efficient for them to generate a slice in other ways.

It's therefore best to just accept a slice reference directly: &[Palindrome].

This applies to max as well.

Copy link
Contributor Author

@Menkir Menkir Apr 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coriolinus I'm struggling a bit with the lifetimes. So when I return a slice over a vector e.g. palindrome.as_slice() in get_palindrome_product() then I need to specify that the slice lives long enough. But I have no reference lifetime in my arguments. Should I give min and max as references or is there a more elegant approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't really need to use lifetimes here for this to work. I've thrown up a quick branch to demonstrate: lib.rs, palindrome-products.rs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the performance improvement is because you give a reference instead of a copy of the vector?

Copy link
Member

@coriolinus coriolinus Apr 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few performance improvements in there:

  • 5: j begins its lower bound at i instead of min. This eliminates roughly half the search space.
  • 15, 19: clone only the single item we care about, not the entire list. Doesn't matter so much for Copy types like u64, but for largeish things like strings, this really does speed things up.
  • 26: Perform to_string() only once. Do not copy the string. Producing a string requires heap allocation, which isn't quick.
  • 26: Iterate over the bytes of the string instead of the characters. This is faster because when you use the chars() string iterator, it must produce a four-byte char on each iteration. For strings for which all chars are bytes, converting it into a byte vec and then iterating is faster. Naturally, if you can't guarantee that all chars in your string are in fact bytes, then you don't have any choice but to pay that price. Here, however, we know that all digits produced by the base-10 representation of a number are in fact a single byte wide.

Another performance improvement which I haven't implemented in this branch, but which should roughly halve the time required:

  • Restructure tests to persist the list of palindrome products for min and max. That is, instead of having a pair of tests:

    #[test]
    fn smallest_palindrome_double_digits() {
        assert_eq!(min(&get_palindrome_products(10, 99)), Some(121));
    }
    #[test]
    fn largest_palindrome_double_digits() {
        assert_eq!(max(&get_palindrome_products(10, 99)), Some(9009));
    }

    We could persist the list of palindromes in a single test:

    #[test]
    fn double_digits() {
        let pp = get_palindrome_products(10, 99);
        assert_eq!(min(&pp), Some(121));
        assert_eq!(max(&pp), Some(9009));
    }

    The part of these tests which takes the longest is searching a large number-space for palindrome products. By persisting the results of that search between the min and max checks, we can significantly reduce the time requires.

    There's a tradeoff to that approach, of course: it reduces the granularity of the tests. For long-running tests such as these, though, I believe that it's probably worth it.

Note that only the performance improvement which I didn't actually implement requires the use of a reference in the function API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another unimplemented performance improvement: zip up only the first half of the byes checked in is_palindrome with the second half of the bytes checked there. Right now, each byte pair is checked twice; that's not necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth mentioning that non-reference function arguments have fundamentally different semantics than in, say, go. In go, a non-reference function argument produces a copy, which would be expensive for a vector. In Rust, it moves the variable. This is fast, but it means that the calling code no longer has access to that variable.


pub fn min(palindromes: Vec<Palindrome>)->Option<Palindrome>{
if palindromes.is_empty(){
return None;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary? https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.min already returns None if empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, you're right. Thank you for the feedback!


pub fn max(palindromes: Vec<Palindrome>)->Option<Palindrome>{
if palindromes.is_empty(){
return None;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary? https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.max already returns None if empty.

@coriolinus
Copy link
Member

@Menkir On my machine, testing in release mode takes just over 8 seconds, while debug mode takes almost 2 minutes. That's a big enough discrepancy that it's worth testing this exercise in release mode. To accomplish this, please add a file .meta/test-in-release-mode which contains a short comment describing the timing.

This is important because the CI runs every exercise's tests multiple times, and for those which take non-trivial time, this can really speed up the CI results.

@Menkir
Copy link
Contributor Author

Menkir commented Apr 20, 2018

@coriolinus

Please exclude the configlet fmt changes, though: they introduce quite a lot of unrelated noise to this PR.

Would it be enough to revert this commit by a new one or should i rebase it?

@coriolinus
Copy link
Member

Simple reversion will be fine; we're going to squash-merge this in the end.

Copy link
Member

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more minor change is necessary, and then I think all the required work is done.

There is an optional change: run rustfmt on your rust files to establish a consistent code style. We have no policy requiring this, but in my opinion, its output is nicer than what most people (including myself!) write by hand.

use palindrome_products::*;

#[test]
#[ignore]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first test in the tests file should be the one not ignored. There are at least two ways to accomplish this:

  • move the single_digits test to the top of the file.
  • ignore single_digits and unignore empty_result_for_smallest_palindrome.

I do not have a preference between these options.

	- put the first running test on the top
Copy link
Member

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting: apparently rustfmt has suggested a change that broke previously-working code! We should probably report that to the rustfmt team.

Once that's fixed, I believe this PR is ready to merge. Thanks @Menkir for all your work! The next steps are a bit anticlimactic from here: I'm going to let this one sit for a week to ensure that the other maintainers have a chance to look the PR over.

If no objections surface in that time, and if the travis issue is resolved, then I intend to squash and merge this on Friday the 27th. If on the 28th I have not yet done so, please feel free to ping me and remind me about this.

@Menkir
Copy link
Contributor Author

Menkir commented Apr 21, 2018

@coriolinus thank you for the big support.
I should note that i used rustfmt 0.60 and it say that a line of size 101 chars is not allowed (only <= 100). So i split the string by hand. Maybe this is the issue?

@coriolinus
Copy link
Member

coriolinus commented Apr 21, 2018 via email

@Menkir Menkir changed the title New Exercise: Palindrome Product Palindrome Product Exercise Apr 27, 2018
@coriolinus coriolinus merged commit bf3375b into exercism:master Apr 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants