-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update sklearn API for Gensim models #1473
Changes from 6 commits
c55fcc9
dde234f
721806b
3cdcfde
99dba85
4d1eaf4
155a1ec
ae6c0f3
3c78873
341ed1f
9113f82
2935680
9628f99
3849d06
6097349
5b21875
6bfdb4d
9f0be87
6004eee
3262ec2
d4e560e
ad3f1f7
877632e
0871b50
3f363a1
c0894bc
9b7402d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/usr/bin/env python | ||
# -*- coding: utf-8 -*- | ||
# | ||
# Copyright (C) 2011 Radim Rehurek <[email protected]> | ||
# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html | ||
"""Scikit learn wrapper for gensim. | ||
Contains various gensim based implementations which match with scikit-learn standards. | ||
See [1] for complete set of conventions. | ||
[1] http://scikit-learn.org/stable/developers/ | ||
""" | ||
|
||
|
||
from .basemodel import BaseTransformer # noqa: F401 | ||
from .ldamodel import LdaTransformer # noqa: F401 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is all that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A line having There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It feels wrong to litter the gensim code with such constructs -- the gensim code is correct, this is essentially working around some idiosyncrasy (bug?) of an unrelated library. By the way, how come we don't these errors from all the other There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @piskvorky because flake8 analyze the only diff every time, if you change same lines in any other There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, so the fake warnings will go away immediately? Why did we add these comments then. Let's remove it. |
||
from .lsimodel import LsiTransformer # noqa: F401 | ||
from .rpmodel import RpTransformer # noqa: F401 | ||
from .ldaseqmodel import LdaSeqTransformer # noqa: F401 | ||
from .w2vmodel import W2VTransformer # noqa: F401 | ||
from .atmodel import AuthorTopicTransformer # noqa: F401 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,7 +11,7 @@ | |
from abc import ABCMeta, abstractmethod | ||
|
||
|
||
class BaseSklearnWrapper(object): | ||
class BaseTransformer(object): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With this refactoring... I'm again wondering, is this class even helpful? Its There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gojomo scikit-learn indeed does not impose such a hierarchy among its classes. However, as you stated above, in addition to not having to implement the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But, the existing And then, the only benefit of this shared superclass is forcing an error as a reminder of what to implement. But if sklearn doesn't do that itself, is it that important to do so? It's also over-enforcing – it is not a strict requirement that all transformers support There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In my head I expect all sklearn models to have Also There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, providing a place to include a error/warning is a potential reason to implement But with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hi, jumping in the conversation a little late here, but I do agree that the old
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would struggle to find a gensim algorithm for which I haven't seen a "is it possible to update the model post-training?" question on the mailing list... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, it's always asked, but not always supported. The scikit-learn convention for not supporting incremental training is to simply not implement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with @gojomo 's point of view here. The point of these adapters is to assume the "philosophy" of another package, whatever structure or idiosyncrasies that may have. The least amount of surprises and novelty for users of that package. It is not the place of the adapters to introduce their own philosophy on class structures or missing methods or whatever. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have removed the |
||
""" | ||
Base sklearn wrapper module | ||
""" | ||
|
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chinmayapancholi13 please add yourself as the author. This is your baby :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hehe. Ok! 😄