Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto_categorize and method_missing #135

Closed
ibnesayeed opened this issue Jan 18, 2017 · 10 comments
Closed

auto_categorize and method_missing #135

ibnesayeed opened this issue Jan 18, 2017 · 10 comments

Comments

@ibnesayeed
Copy link
Contributor

The magic train and untrain methods based on method_missing only work if the category already exists. This raises error if the auto_categorization is on and the category in question is a new one.

@Ch4s3
Copy link
Member

Ch4s3 commented Jan 18, 2017

That is not surprising at all. Frankly, I don't like the magic methods but a lot of people complained when I proposed deprecating them. We should probably add some sensible behavior in this case.

@ibnesayeed
Copy link
Contributor Author

I don't like it either. It simply adds more burden for documentation and not very useful when training model from a script (why would one bother to concatenate strings to make the method name).

I see two ways to go forward; 1) make adjustments so that it behave as expected when auto-categorization is on or 2) keep it the way it is for backwards compatibility, but do not document it so that new people are not encouraged to use it. Later when we introduce breaking API changes (e.g., 3.x version), we can completely deprecate it.

@Ch4s3
Copy link
Member

Ch4s3 commented Jan 18, 2017

My plan going in was to create a category on the fly, but after looking at this I realized that if I do that, you could easily end up with a misspelled category by mistake. I think the thing to do is to remove it from the docs and add a deprecation warning. Then remove it in 3.0.

@ibnesayeed
Copy link
Contributor Author

When we update the documentation, we can also change the tests to not use these deprecated methods, except for the case where we add a test to check if deprecation warning is shown.

@Ch4s3
Copy link
Member

Ch4s3 commented Jan 18, 2017

In progress

@Ch4s3
Copy link
Member

Ch4s3 commented Jan 18, 2017

wow... they return different things...

 pry(main)>classifier.train_interesting 'love'
=>["love"]
 pry(main)>classifier.train('interesting', 'love')
=> {:love=>1}

I think the latter is the correct behavior.

@ibnesayeed
Copy link
Contributor Author

Well, that's the consequence of Ruby returning the last evaluated expression. None of those two returned values are any useful or representative of anything significant in my opinion.

@Ch4s3
Copy link
Member

Ch4s3 commented Jan 18, 2017

I agree that the returns aren't particularly useful, but we shouldn't have 2 functionally equivalent functions that return different things.

@ibnesayeed
Copy link
Contributor Author

In case of pure function, this would have been an essential requirement. Additionally, it would have been important in case where the return value was used to predict the behavior of the method when called. However, I wont be too religious about a method who's sole job is to change the state of the object.

Having said that, it occurs to me that we can actually return something meaningful from the train and untrain methods (I would perhaps not care about the method_missing here) to reflect how did they go. For example, if the training or untraining was skipped due to emptiness of the word_hash, we can return nil, which means changing return if word_hash.empty? to return nil if word_hash.empty?. Similarly, in case of success we can show the count of total trainings for the category that was trained/untrained. This basically means we just need to move the @backend.update_category_training_count line at the end of the method (and corresponding line in the untraining method as well). If you do move them at the end, please also consider moving @backend.update_total_trainings lines at the end (just switch the order) so that the intended line is the last one. This will be a very nice way to telling the user if the training/untraining was successful or skipped, without specifically printing any warnings or info.

@ibnesayeed
Copy link
Contributor Author

PR #137 should do the trick of what I explained in the last comment.

@Ch4s3 Ch4s3 closed this as completed Jan 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants