-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MultiModalData class improvement #789
Conversation
Hello @andreygetmanov! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2022-09-05 14:00:25 UTC |
c6a7966
to
c11ba62
Compare
Codecov Report
@@ Coverage Diff @@
## master #789 +/- ##
==========================================
+ Coverage 87.73% 87.78% +0.04%
==========================================
Files 193 194 +1
Lines 13230 13343 +113
==========================================
+ Hits 11608 11713 +105
- Misses 1622 1630 +8
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
6844f9e
to
93c3c58
Compare
This comment was marked as resolved.
This comment was marked as resolved.
59526fb
to
ff59b62
Compare
a108477
to
a5c3751
Compare
a5c3751
to
e2554e7
Compare
ALLOWED_NAN_PERCENT = 0.9 | ||
|
||
|
||
class DataDetector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Не совсем понял предназначение данного класса
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Не совсем понял предназначение данного класса
Если ты про DataDetector, то это абстракция двух последующих классов. В них есть схожие по механике методы, поэтому решил на это указать, создав абстрактный класс
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Схожие по механике методы вижу. Так может один раз реализовать в DataDetector и использовать от туда, чем для каждого по-отдельности? Или это как-то по другому работает?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Схожие по механике методы вижу. Так может один раз реализовать в DataDetector и использовать от туда, чем для каждого по-отдельности? Или это как-то по другому работает?
Это шаблон абстракции для этих классов. В этом классе методы просто унифицируются. Думаю в будущем расширять по мере необходимости
87e78dd
to
38c8796
Compare
a2da375
to
64d320d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Поддерживаю идею разделить код для разных типов данных
Возможно, можно было обойтись разными модулями вместо разных классов, но на данном этапе не принципиально, не стоит всё пытаться сделать сразу идеально. По мере усложнения станет понятнее, как удобно.
UPD: Я тут писал про ошибку в property MultiModalData
, но сам ошибся. Там всё ок
Now csv files with text and table data can be read just in one motion - from_csv method added - text fields are defined automatically - tests are added
- added docstring for _column_contains_text - multimodal_wine dataset is moved to more appropriate place - protected funcs of multi_modal.py are now protected
- refactoring of text preprocessing - refactoring of multimodal data test
- now if text column contains a lot of nans, it's dropped
…ing DataDetector parent in data_detection.py
- added description for multimodal_text_num_example.py
- added docstrings
- DataDetection classes now look better - refactoring of prepare_multimodal_ts logic - preprocessing is now excludes some data types using decorators
64d320d
to
ef31bae
Compare
- minor changes
Now csv files with text and table columns can be read and separated to various data sources just in one motion