
SAM2 degraded results compared to SAM #93

Open
omrastogi opened this issue Aug 1, 2024 · 5 comments

SAM

  • Version: vit_l
  • Input type: box input
  • Multimask Output: True

[attached images: viz_7_sam, viz_13_sam]

SAM2

  • Version: large
  • Input type: box input
  • Multimask Output: True

[attached images: viz_7_sam2, viz_13_sam2]
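
For reference, the setup above corresponds roughly to the following (a minimal sketch; the image path, checkpoint/config names, and the example box are placeholders, not the exact values used):

```python
# Rough sketch of the two configurations listed above (all paths/values are placeholders).
import numpy as np
from PIL import Image

image = np.array(Image.open("example.jpg").convert("RGB"))  # H x W x 3 uint8 RGB
box = np.array([100, 150, 400, 500])                        # x_min, y_min, x_max, y_max

# --- SAM (v1), vit_l, box input, multimask output ---
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="sam_vit_l_0b3195.pth")
sam_predictor = SamPredictor(sam)
sam_predictor.set_image(image)
masks_v1, scores_v1, _ = sam_predictor.predict(box=box, multimask_output=True)

# --- SAM2, large, box input, multimask output ---
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

sam2 = build_sam2("sam2_hiera_l.yaml", "sam2_hiera_large.pt")
sam2_predictor = SAM2ImagePredictor(sam2)
sam2_predictor.set_image(image)
masks_v2, scores_v2, _ = sam2_predictor.predict(box=box, multimask_output=True)
```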


heyoeyo commented Aug 2, 2024

It may be that the box is defined backwards, i.e. the top-left/bottom-right coordinates are swapped? That might explain why the mask looks reversed. It's also worth checking the other masks (from the multi-mask output), since it may just be that one of them is giving this odd-looking result.
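
If you want to rule the box order out, a quick check is to sort the corners before passing the box in (just a sketch, assuming the usual (x1, y1, x2, y2) pixel convention and a predictor set up as in the first post):

```python
import numpy as np

def normalize_box(box):
    """Reorder a box to (x_min, y_min, x_max, y_max), i.e. top-left then bottom-right."""
    x1, y1, x2, y2 = box
    return np.array([min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)])

box = normalize_box(box)
masks, scores, _ = predictor.predict(box=box, multimask_output=True)
```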

From what I've seen, the results from v2 are generally similar to v1, but a bit more prone to weird artifacts. However, the new models scale to larger image sizes using a lot less VRAM than the v1 models, so they can give cleaner/smoother outlines.

@WaterKnight1998

I am also seeing worse performance with points prediction


heyoeyo commented Aug 5, 2024

> I am also seeing worse performance with points prediction

From what I've seen, between the different sized SAMv2 models, there can be significant differences in which masks (i.e. whole object, sub-components of object etc.) end up in the different indexes of the multi-mask output.
For example, the 0-th index mask of the large model tends to pick the smallest sub-component around the point prompt, while the same 0-th mask of the base-plus model tends to pick the 'whole' object. So you might be able to get a better result by picking a different mask output.
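
As a sketch (using the standard predictor API; the point coordinates here are placeholders), you can inspect all of the multi-mask outputs and pick by predicted IoU or by area instead of hard-coding index 0:

```python
import numpy as np

masks, scores, _ = predictor.predict(
    point_coords=np.array([[px, py]]),  # placeholder point, (x, y) in image pixels
    point_labels=np.array([1]),         # 1 = foreground point
    multimask_output=True,
)

# Look at every candidate rather than trusting index 0, since which index holds
# the "whole object" vs. a sub-component differs between model sizes.
for i, (m, s) in enumerate(zip(masks, scores)):
    print(f"mask {i}: predicted IoU={s:.3f}, area={int(m.sum())} px")

best_mask = masks[np.argmax(scores)]  # or choose by area / visual inspection
```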


rdfong commented Dec 5, 2024

[attached screenshot: Screenshot from 2024-12-05 13-53-38]

It also tends to output strange, artifact-like results, as seen here, which was not an issue with SAM v1. This is using a point prompt.


heyoeyo commented Dec 6, 2024

It could be that the point coordinates don't match the size of the image given to the model, since the mask seems to be selecting a different area than the point (or maybe the point is just drawn in the wrong spot?).
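
A quick sanity check for that (a sketch; `point_xy` is a placeholder for the prompt point, expected as (x, y) pixels of the same image passed to set_image):

```python
h, w = image.shape[:2]  # the exact array given to predictor.set_image(...)
px, py = point_xy       # prompt point, expected as (x, y) in that image's pixel space

assert 0 <= px < w and 0 <= py < h, "point falls outside the image"

# Common sources of a mismatch:
#  - point taken from a resized/display copy instead of the original image
#  - (row, col) i.e. (y, x) order instead of (x, y)
#  - coordinates normalized to 0..1 instead of pixels
```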

If that point is correct, you can probably get a better result by using one of the other mask outputs. For example, if you're using the large model, that ground patch is cleanly segmented in the last-most mask (using multi-mask output):
[attached image: ground_patch_example]
The non-multi-mask + 3 multi-mask outputs are shown on the right. You can see in one of the masks (second-last of multi-mask) it gives a patchy mask a bit like the example you showed, so switching to a different one may fix the problem.
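
Something like this reproduces that kind of side-by-side view (a sketch; `pt`/`lbl` are placeholder point coordinates/labels for the same prompt):

```python
import matplotlib.pyplot as plt

# Single-mask output plus the 3 multi-mask outputs for the same point prompt
single_mask, _, _ = predictor.predict(point_coords=pt, point_labels=lbl, multimask_output=False)
multi_masks, multi_scores, _ = predictor.predict(point_coords=pt, point_labels=lbl, multimask_output=True)

all_masks = list(single_mask) + list(multi_masks)
titles = ["single"] + [f"multi {i} (IoU~{s:.2f})" for i, s in enumerate(multi_scores)]

fig, axes = plt.subplots(1, len(all_masks), figsize=(4 * len(all_masks), 4))
for ax, mask, title in zip(axes, all_masks, titles):
    ax.imshow(image)
    ax.imshow(mask, alpha=0.5)  # overlay the binary mask
    ax.set_title(title)
    ax.axis("off")
plt.show()
```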

The last-most mask isn't always the best though, for example the full car would be segmented by the second-last mask (and the last-most mask is patchy):
[attached image: car_example]

Which mask is best also depends on the model size that's being used (the examples above are only for the v2.1 large model, the results can be very different for other model sizes).
