PreTrained on ImageNet #8

Open
Lwt-diamond opened this issue Mar 4, 2023 · 1 comment

Comments

Lwt-diamond commented Mar 4, 2023

Did you use Res2Net pretrained on ImageNet as the backbone? The Res2Net pretrained on ImageNet expects a 224x224 input, but the input size in your code (also named train_size) is 352x352. In my opinion, the ImageNet-pretrained weights are therefore not suitable for this model. Could you share your opinion? Thank you very much.

thograce (Owner) commented Mar 6, 2023

Yes, I used Res2Net pre-trained on ImageNet as the backbone. Let me explain the choice of training input size:
1. The common baseline method for camouflaged object detection (COD), SINet, uses 352x352 as its input size. For a fair comparison, many subsequent methods follow SINet's input size, and so does C2FNet.
2. COD is not a classification task. Res2Net is used only as a feature extractor, not a classifier, so its FC layer is discarded. Because the remaining backbone is fully convolutional, changing the input size does not affect its operation (a minimal sketch follows this list).
3. COD is a pixel-level segmentation task, which needs high resolution to retain fine detail. The datasets, represented by COD10K, all have very high resolutions, so compressing every image to 224x224 is clearly inappropriate.
4. Related research also shows that increasing the resolution brings clear gains; the evolution of input sizes across the YOLO series is a good illustration and will help you understand transfer learning and fine-tuning.
5. In general, for non-classification tasks, it is a domain consensus that the input size is not limited by the pre-trained model.
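
To make point 2 concrete, here is a minimal sketch (not the repository's own loading code) assuming PyTorch and the timm package's res2net50_26w_4s weights: once the classification head is dropped, the ImageNet-pretrained backbone accepts a 352x352 tensor without modifying any weights.

```python
# Sketch only: show that an ImageNet-pretrained Res2Net runs on 352x352 inputs
# once the classification head is removed. Uses timm's res2net50_26w_4s weights
# as a stand-in for the Res2Net backbone used in C2FNet.
import torch
import timm

# features_only=True drops global pooling and the FC layer, so the model
# behaves as a pure (fully convolutional) feature extractor.
backbone = timm.create_model('res2net50_26w_4s', pretrained=True, features_only=True)
backbone.eval()

x = torch.randn(1, 3, 352, 352)  # C2FNet's train_size, not ImageNet's 224x224
with torch.no_grad():
    feats = backbone(x)

# Only the spatial size of the feature maps depends on the input resolution;
# the pretrained convolutional weights load unchanged.
for f in feats:
    print(tuple(f.shape))  # deepest stage is roughly (1, 2048, 11, 11), since 352/32 = 11
```

Only the feature-map resolution changes with the input size; the convolutional kernels themselves are resolution-agnostic, which is why the ImageNet weights still transfer at 352x352.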
