
# Kernel Transformer

A transformer model based on a sliding-kernel self-attention mechanism. The code builds on an implementation of the Swin Transformer; see the Swin Transformer repository for the original implementation.
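As a rough illustration of the idea, here is a minimal NumPy sketch of sliding-window (local) self-attention over a 1D sequence, where each query attends only to keys inside a fixed window around it. This is a simplified, hypothetical example for intuition only; the actual model operates on 2D feature maps and the function and parameter names below are not from this repository.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=3):
    """Illustrative local self-attention: each query position i attends
    only to key/value positions within a window centered at i.
    q, k, v: arrays of shape (n, d). Returns an array of shape (n, d)."""
    n, d = q.shape
    half = window // 2
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        # scaled dot-product scores against the local window of keys
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)
        # softmax over the window (subtract max for numerical stability)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # output is a convex combination of the local values
        out[i] = weights @ v[lo:hi]
    return out
```

Compared with full self-attention, the cost per query is bounded by the window size rather than the sequence length, which is the basic motivation for window- or kernel-based attention.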

## Preliminary comparisons on CIFAR10

| Model | Params | Val. Acc. |
| --- | --- | --- |
| Swin Transformer (tiny) | 26,598,166 | 82.19% at 200 epochs |
| Swin Transformer (tiny) | 26,598,166 | 83.34% at 300 epochs |
| Kernel Transformer (tiny) | 26,600,362 | 85.83% at 300 epochs |
*(Plot: kernel_vs_swin — validation accuracy of the Kernel Transformer vs. the Swin Transformer)*