Adaptation of API for multimodal data #663

Merged
merged 22 commits into from
Jul 15, 2022

Conversation

andreygetmanov (Collaborator) commented Apr 29, 2022

The overall purpose of this PR is to make the Fedot API work with multimodal data.

Changes:

  • multimodal datasets can now be run via the Fedot API as easily as unimodal datasets
  • the initial pipeline now depends on the data type (for example, a text data source is always followed by a vectorizing node)
  • slightly changed the logic of the operation filter: operations are now chosen not only by task but also by data type (each data source will get its own operations list in upcoming PRs)
  • MultiModalStrategy now works more effectively and is more stable
  • AssumptionsBuilder now supports multimodal data and creates a sub-builder for each data source
  • added pretrained embeddings as vectorization models; the model type can be chosen automatically during pipeline tuning
  • other parameters of text models are now tunable too
  • added tests for the new functionality
  • added a new example based on a text+table dataset
  • deleted the deprecated imdb case
  • fixed issue #630: Fix composer and tuner work in the multimodal case
  • fixed issue #626: Adapt multimodal example to run by API
  • fixed issue #627: Fix CNN work in the multimodal case
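
The data-type-aware behaviour described in the list above can be illustrated with a small, self-contained sketch. It does not use FEDOT's actual classes: the registry `OPERATIONS`, the helpers `filter_operations` and `initial_pipeline_for`, and the operation names are all hypothetical, chosen only to show operations being selected by both task and data type, and each data source getting its own initial chain with a vectorizing step after text sources.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical operation registry: name -> (supported tasks, supported data types).
OPERATIONS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "tfidf": ({"classification", "regression"}, {"text"}),
    "scaling": ({"classification", "regression"}, {"table"}),
    "logit": ({"classification"}, {"table"}),
    "ridge": ({"regression"}, {"table"}),
}

# Hypothetical first step placed right after each data source node.
FIRST_STEP: Dict[str, str] = {"text": "tfidf", "table": "scaling"}


def filter_operations(task: str, data_type: str) -> List[str]:
    """Return operations that match both the task and the data type."""
    return sorted(
        name
        for name, (tasks, data_types) in OPERATIONS.items()
        if task in tasks and data_type in data_types
    )


def initial_pipeline_for(sources: Dict[str, str], task: str) -> List[List[str]]:
    """Build one chain per data source, each ending in the same final model.

    `sources` maps a source name to its data type, e.g. {"reviews": "text"}.
    """
    final_model = "logit" if task == "classification" else "ridge"
    chains = []
    for source_name, data_type in sources.items():
        if data_type not in FIRST_STEP:
            raise ValueError(f"Unsupported data type: {data_type}")
        chains.append([f"source:{source_name}", FIRST_STEP[data_type], final_model])
    return chains
```

Calling `initial_pipeline_for({"reviews": "text", "features": "table"}, "classification")` yields one chain per source, with the text chain passing through the vectorizer before the shared final model.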

pep8speaks commented Apr 29, 2022

Hello @andreygetmanov! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-07-15 09:40:51 UTC

@andreygetmanov force-pushed the multimodal_API branch 2 times, most recently from fedee4c to e69054b, on April 29, 2022 15:41
Dreamlone (Collaborator) commented

Please post a message in our code review chat; maybe some of the team will want to review it. At least one person should.

One more request from me: since my code review was requested, please give me time to look at it (preferably not late in the evening or on a weekend, as happened with the previous PR, where essentially all the fixes were made while I was commuting home from work and everything was quickly merged into master). Don't merge the PR as soon as you get approval from one reviewer; it is worth critically assessing all the changes and not rushing.

@andreygetmanov removed the request for review from Dreamlone on April 29, 2022 18:27
Dreamlone (Collaborator) left a comment

Don't forget to fix the MultiModalStrategy class, since that is where the data source nodes are defined. The current logic is heavily truncated and does not use the data type; the data type needs to be passed down to it and used there.
See the line https://github.com/nccr-itmo/FEDOT/blob/master/fedot/api/api_utils/data_definition.py#L137
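
One way to read this request is sketched below: the data type detected for each source is passed to the function that names the data source node, so it is no longer ignored. This is a standalone illustration, not the code of MultiModalStrategy; the `DataType` enum, the helpers `data_source_node_name` and `build_source_nodes`, and the node-name prefixes are assumptions made for the example.

```python
from enum import Enum
from typing import Dict


class DataType(Enum):
    # Hypothetical subset of data types, for illustration only.
    TABLE = "table"
    TEXT = "text"
    IMAGE = "img"


def data_source_node_name(source_name: str, data_type: DataType) -> str:
    """Compose a node name that encodes the data type of the source."""
    return f"data_source_{data_type.value}/{source_name}"


def build_source_nodes(sources: Dict[str, DataType]) -> Dict[str, str]:
    """Map each source name to the name of its data source node."""
    return {
        name: data_source_node_name(name, data_type)
        for name, data_type in sources.items()
    }
```

With the data type carried along like this, later steps can pick type-specific operations for each source instead of treating every source the same way.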

@andreygetmanov force-pushed the multimodal_API branch 5 times, most recently from 0f77f29 to a96a60f, on May 12, 2022 14:22
@andreygetmanov requested a review from Dreamlone on May 12, 2022 14:44
Dreamlone (Collaborator) left a comment

Within this PR, please also think about the example examples/simple/multitask_classification_regression_api.py. It may be worth providing a mechanism right now for the case where the user wants to solve two different tasks at once (classification and regression, as in this example).

@andreygetmanov force-pushed the multimodal_API branch 2 times, most recently from bcc992c to 22b2971, on May 20, 2022 13:48
codecov bot commented May 20, 2022

Codecov Report

Merging #663 (0adccc4) into master (e08d2e4) will increase coverage by 0.44%.
The diff coverage is 88.25%.

@@            Coverage Diff             @@
##           master     #663      +/-   ##
==========================================
+ Coverage   87.71%   88.16%   +0.44%     
==========================================
  Files         172      181       +9     
  Lines       12098    12467     +369     
==========================================
+ Hits        10612    10991     +379     
+ Misses       1486     1476      -10     
Impacted Files                                         Coverage Δ
fedot/core/composer/composer.py                        89.74% <ø> (ø)
...ot/core/composer/gp_composer/specific_operators.py  93.93% <ø> (ø)
.../core/optimisers/archive/individuals_containers.py  89.39% <ø> (-0.32%) ⬇️
...ore/optimisers/gp_comp/operators/regularization.py  48.57% <0.00%> (+2.62%) ⬆️
fedot/core/pipelines/tuning/hyperparams.py             91.83% <0.00%> (+0.34%) ⬆️
fedot/core/pipelines/tuning/search_space.py            100.00% <ø> (ø)
fedot/core/pipelines/tuning/tuner_interface.py         92.36% <ø> (+0.05%) ⬆️
fedot/core/serializers/coders/any_serialization.py     100.00% <ø> (ø)
fedot/core/serializers/serializer.py                   96.42% <ø> (ø)
fedot/core/optimisers/adapters.py                      88.77% <33.33%> (ø)
... and 106 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 703f7b3...0adccc4.

@andreygetmanov force-pushed the multimodal_API branch 2 times, most recently from 54fa8b4 to d510824, on June 9, 2022 15:07
@andreygetmanov force-pushed the multimodal_API branch 2 times, most recently from 8145bed to 0a22b97, on July 13, 2022 14:48
@andreygetmanov merged commit 128c6e6 into master on Jul 15, 2022
@andreygetmanov deleted the multimodal_API branch on July 15, 2022 13:16
6 participants