Can the number of frames during training differ from testing? #635
-
Hi, I noticed that the total number of frames used in training and testing does not match. For example, your TSN config uses 3 frames for training but 25 frames for testing. Increasing the number of frames in both training and testing can obviously improve accuracy, but as reported in FineGym, accuracy drops sharply when there is a large gap between the number of training frames and the number of testing frames. I have read several papers, such as TEA and TPN. They say they use 8 or 16 frames during training, but they never state whether the same number is used during testing. This really confuses me: in TSN the two can differ, but in the other configs they seem to be equal. Could you please tell me whether, in most current papers, the number of frames used in training is the same as the number used in testing? If not, why are they the same in your code except for some TSN configs? Thanks a lot for your contribution!
Replies: 3 comments
-
The number of frames used in training and testing can differ. Which setting to use depends on the method. Generally speaking, people use the setting that produces the highest score, so you can try different numbers for your use case.
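For reference, here is a rough sketch of how the 3 vs. 25 frame setting from the question might look as a frame-sampling config. The dict keys (`type`, `clip_len`, `frame_interval`, `num_clips`, `test_mode`) and the variable names are assumptions written in the style of this repo's TSN configs, not copied from a specific file; only the frame counts come from the question above.

```python
# Illustrative fragment only: key names are assumed, the counts (3 and 25)
# are the ones mentioned in the question.
train_sampler = dict(
    type='SampleFrames',
    clip_len=1,          # one frame per clip
    frame_interval=1,
    num_clips=3,         # 3 frames per video during training
)

test_sampler = dict(
    type='SampleFrames',
    clip_len=1,
    frame_interval=1,
    num_clips=25,        # 25 frames per video during testing
    test_mode=True,
)

# Total frames sampled per video = clip_len * num_clips
train_frames = train_sampler['clip_len'] * train_sampler['num_clips']
test_frames = test_sampler['clip_len'] * test_sampler['num_clips']
print(train_frames, test_frames)
```

To try a different number for your use case, you would change `num_clips` in the training or testing sampler and compare the resulting scores.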
-
It depends on whether the model you use is 2D or 3D. In brief, #frames can vary for TSN and should be consistent for other models.
-
I think you are right! I emailed the TEA and SmallBig authors, and they said the number of frames used in training and testing is the same. Thank you very much.
It depends on whether the model you use is 2D or 3D.
Here I define a 3D model as any model with an operation along the temporal dimension (convolution, shifting, etc., but not averaging or max). For 3D models, #frames should be consistent between training and testing.
For 2D models (by the above definition, only TSN is 2D in this repo), #frames can vary between training and testing.
In brief, #frames can vary for TSN and should be consistent for other models.
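To make the 2D vs. 3D distinction concrete, here is a toy NumPy sketch. It is not code from this repo: `per_frame_scores`, `temporal_scores`, the weight matrix `W`, and the feature sizes are all made up for illustration. The first function scores each frame independently and averages the results (the "2D" case); the second mixes neighbouring frames with a simple temporal difference before scoring (a stand-in for a temporal convolution or shift).

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, NUM_CLASSES = 8, 4
# Shared weights, standing in for a trained backbone + classifier head.
W = rng.standard_normal((FEAT_DIM, NUM_CLASSES))


def per_frame_scores(frames):
    """'2D' case: frames has shape (T, FEAT_DIM); each frame is scored alone,
    then the per-frame scores are averaged over T."""
    return np.maximum(frames @ W, 0).mean(axis=0)


def temporal_scores(frames):
    """'3D' case: each frame is first mixed with its neighbours
    (x[t+1] - x[t-1], zero-padded at the ends), then scored and averaged."""
    padded = np.pad(frames, ((1, 1), (0, 0)))
    mixed = padded[2:] - padded[:-2]          # shape (T, FEAT_DIM)
    return np.maximum(mixed @ W, 0).mean(axis=0)


# A 3-frame "training-style" input and a 6-frame input that shows the same
# content sampled twice as densely (each frame repeated once).
frames = rng.standard_normal((3, FEAT_DIM))
dense = np.repeat(frames, 2, axis=0)

same_2d = np.allclose(per_frame_scores(frames), per_frame_scores(dense))
same_3d = np.allclose(temporal_scores(frames), temporal_scores(dense))
print("per-frame model unchanged:", same_2d)
print("temporal model unchanged:", same_3d)
```

In this sketch the per-frame model sees exactly the same kind of input no matter how many frames are sampled, while the temporal model's notion of "neighbouring frame" changes with the sampling density. That is the intuition behind keeping #frames consistent for models with temporal operations; the sketch shows the mechanism only and says nothing about actual accuracy.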