Code is too complicated #7
Comments
Hi @Crazylov3, thanks a lot for the feedback. Unfortunately, there is no immediate plan to rewrite a simplified version of the codebase. However, this could change if there is popular demand for it. In the meantime, if you could provide a few examples of where you find the code complicated or difficult to follow, I could address those issues one by one and make sure the documentation and codebase are improved. |
Firstly, thank you to the team for the awesome work, and for releasing it under a permissive license. I would love to create a minimalistic/simplified repo, with only the inference code. It's great that you have made silk fully reproducible, however I also think that a lot of people (myself included!) just want the "meat". |
I've created a stripped-down version of the repository here: https://github.com/CGCooke/silk @Crazylov3, you might find this a little easier to get started with. Thanks again for releasing this. |
Completely understandable.
Oh wow. Thanks a lot for this. |
@CGCooke Thanks a lot |
@gleize @CGCooke I have one more question about your decoder, specifically how it extracts keypoint coordinates. Since you don't use padding, the detector heatmap output has a different shape from the input. How do you extract the correct keypoint positions from the heatmap? Reading the code, I found the class "LinearCoordinateMapping", which maps keypoint coordinates from the heatmap using a scale factor and a bias. What are the scale factor and bias in this case? |
Hi @Crazylov3,
We have a function called
This means that the scale factor is 1 and the bias is 9 for the default VGG-4 backbone, on both the x and y dimensions. |
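To illustrate where a scale of 1 and a bias of 9 could come from, here is a minimal sketch of composing per-layer coordinate mappings. `LinearMapping` and `then_conv` are hypothetical names, not the repository's `LinearCoordinateMapping` API, and the stack of nine unpadded, stride-1, 3x3 convolutions is an assumption chosen to reproduce the numbers quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearMapping:
    """Hypothetical stand-in: x_input = scale * x_output + bias."""

    scale: float = 1.0
    bias: float = 0.0

    def apply(self, x: float) -> float:
        return self.scale * x + self.bias

    def then_conv(self, kernel: int, stride: int = 1, padding: int = 0) -> "LinearMapping":
        # A conv output at index j is centred on its input at
        # stride * j + (kernel - 1) / 2 - padding.
        layer_scale = stride
        layer_bias = (kernel - 1) / 2 - padding
        # Compose: first undo the new layer, then apply the existing mapping.
        return LinearMapping(
            scale=self.scale * layer_scale,
            bias=self.scale * layer_bias + self.bias,
        )


# Assumption: nine unpadded 3x3 convolutions with stride 1.
mapping = LinearMapping()
for _ in range(9):
    mapping = mapping.then_conv(kernel=3)

print(mapping)             # scale stays 1, bias accumulates 1 per layer
print(mapping.apply(0.0))  # heatmap index 0 lands 9 pixels into the input
```

Each unpadded 3x3 layer trims one pixel from every border, so the heatmap origin moves one pixel further into the image per layer while the scale is unchanged.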
@gleize thanks for the quick response. |
Yes that's correct. Is the confusion coming from the sign ? (i.e. being I should have run
Sorry for the confusion. |
Many thanks |
@gleize One more question about "x <- tensor([1., 1.]) x + tensor([9., 9.])". If we do that, it means we never get a keypoint inside the 9-pixel margin at the top-left of the image, doesn't it? I am just curious why you do that instead of interpolating the heatmap to the original image size. Are there any technical details in the training process that force us to do that? |
Hi @Crazylov3,
No worry.
Yes you can. The ResFPN we've tried actually downsamples the input. You can check it out as a reference.
Yes.
Interpolating the heatmap would displace the final keypoint positions, making them incorrect. I'm curious why that would be a problem for your task? |
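The displacement mentioned above can be seen with a small arithmetic sketch. It compares the shift-by-bias mapping with a naive coordinate rescaling, which is a simplification of what resizing the heatmap to the input size would do. The function names and the 100-pixel input width are assumptions for illustration only.

```python
def shift_mapping(x_heatmap: float, bias: float = 9.0) -> float:
    """Map a heatmap coordinate to the input image by adding the bias."""
    return x_heatmap + bias


def resize_mapping(x_heatmap: float, in_size: float, bias: float = 9.0) -> float:
    """Naive alternative: stretch the smaller heatmap to the input size."""
    heatmap_size = in_size - 2 * bias  # unpadded convs trim `bias` pixels per side
    return x_heatmap * in_size / heatmap_size


in_size = 100.0  # hypothetical input width, so the heatmap is 82 wide
for x in (0.0, 41.0, 81.0):
    print(x, shift_mapping(x), round(resize_mapping(x, in_size), 2))
```

Under these assumptions the two mappings agree only near the centre of the image. Towards the borders the rescaled coordinates drift by up to roughly 9 pixels from the positions the network actually responded to.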
The model backbone is the main problem for my task. I am working on a Rockchip device using its NPU (which is poorly supported). I need real-time keypoint matching, but your backbone is too computationally heavy (at least for my task). I currently use a backbone almost the same as SuperPoint's (downsampling to H/16 x W/16), with a description map of shape (256, H/16, W/16). I convert the 256 float channels to 32 dims (uint8, for faster matching), and I use pixel shuffle to convert (256, H/16, W/16) -> (1, H, W) for the detector. In short, the decoder (which contains upsampling and hurts speed) and a high-resolution description map must be avoided in my task. I know that downsampling without a decoder hurts performance a lot. Now I try |
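For readers unfamiliar with the pixel shuffle step described above, here is a dependency-free sketch of the rearrangement for a single output channel. It follows the channel ordering used by PyTorch's pixel shuffle (channel index `dy * r + dx` fills sub-pixel `(dy, dx)`). The function name and the tiny 2x2 example are illustrative assumptions, not the commenter's code.

```python
def pixel_shuffle_single(channels, r):
    """Rearrange r*r maps of size h x w into one map of size (h*r) x (w*r)."""
    assert len(channels) == r * r, "need exactly r*r input channels"
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for dy in range(r):
        for dx in range(r):
            plane = channels[dy * r + dx]
            for y in range(h):
                for x in range(w):
                    # Each coarse cell (y, x) expands into an r x r block.
                    out[y * r + dy][x * r + dx] = plane[y][x]
    return out


# Toy case: 4 channels of size 1x1 with r = 2 give a 2x2 map.
cells = [[[float(c)]] for c in range(4)]
heatmap = pixel_shuffle_single(cells, r=2)
print(heatmap)
```

With r = 16, the same rearrangement turns 256 channels at H/16 x W/16 into a single H x W detector map, since 256 = 16 * 16.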
I see. This seems to be a good use case for testing our smallest backbone VGG

    CHECKPOINT_PATH = os.path.join(os.path.dirname(__file__), "../../assets/models/silk/analysis/alpha/pvgg-micro.ckpt")

    SILK_BACKBONE = ParametricVGG(
        use_max_pooling=False,
        padding=0,
        normalization_fn=[torch.nn.BatchNorm2d(64)],
        channels=(64,),
    )

    model = SiLK(
        in_channels=1,
        backbone=deepcopy(SILK_BACKBONE),
        detection_threshold=SILK_THRESHOLD,
        detection_top_k=SILK_TOP_K,
        nms_dist=nms,
        border_dist=SILK_BORDER,
        default_outputs=default_outputs,
        descriptor_scale_factor=SILK_SCALE_FACTOR,
        padding=0,
        lat_channels=32,
        desc_channels=32,
        feat_channels=64,
    ) |
Answers moved to FAQ. Closing now. |
Thank you for sharing your beautiful work.
Do you have any plans to release a simplified version of the source code? I am struggling to follow the repo.