<!doctype html>
<html lang="en">
<head>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-42711199-4"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-42711199-4');
</script>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<!-- Bootstrap CSS -->
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js" integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p" crossorigin="anonymous"></script>
<link href="./assets/css/common.css" rel="stylesheet">
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
<link rel="icon" type="image/png" href="favicon.png">
<!-- Roboto -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Roboto&display=swap" rel="stylesheet">
<title>Sanghyuk Chun - NAVER AI Lab</title>
</head>
<body>
<div class="container-wrapper">
<div class="container">
<div class="mt-5 mb-5">
<div class="row profile_header">
<div class="col-md-2 d-none d-lg-block d-xl-block">
<img src="assets/img/profile3.jpg" width="85%" max-width="250px" alt="profile" class="rounded-circle shadow-lg">
</div>
<div class="col-md-10">
<h1>Sanghyuk Chun</h1>
<ul class="list-unstyled">
<li>Lead Research Scientist</li>
<li>NAVER AI Lab</li>
<li>sanghyuk.chun [at] gmail.com</li>
<li><a href="https://scholar.google.co.kr/citations?user=4_uj0xcAAAAJ">Google Scholar</a> | <a href="https://github.com/SanghyukChun">Github</a> | <a href="https://twitter.com/SanghyukChun">Twitter</a></li>
<li><a href="media/CV_Sanghyuk_Chun_Public_05Dec23.pdf">CV</a> (as of December 5th, 2023) | <a href="media/research_statement.pdf">research statement</a> (as of December 2023)</li>
<!--<li><a href="https://scholar.google.co.kr/citations?user=4_uj0xcAAAAJ">Google Scholar</a> | <a href="https://www.semanticscholar.org/author/Sanghyuk-Chun/2647582">Semantic Scholar</a> | <a href="https://github.com/SanghyukChun">Github</a> | <a href="https://twitter.com/SanghyukChun">Twitter</a> | <a href="media/CV_Sanghyuk_Chun_Public_18Dec22.pdf">CV</a> (as of December 18th, 2022)</li>-->
</ul>
</div>
</div>
</div>
<div class="mt-5 mb-5 on-click-all-active">
<p>I am currently serving as a lead research scientist at the <a href="https://naver-career.gitbook.io/en/positions/ai-ml/ml-research">ML Research</a> team in <a href="https://naver-career.gitbook.io/en/teams/clova-cic/ai-lab">NAVER AI Lab</a>, where my focus lies in machine learning, multi-modal learning (e.g., vision-language, language-audio, and audio-visual), and computer vision. At NAVER, my primary research goal is to develop generalizable machine learning models for challenging yet practical scenarios. Prior to joining NAVER, I was a research engineer at KAKAO Corp from 2016 to 2018, where my work focused on recommendation systems and machine learning applications.</p>
</div>
<div class="mt-5 mb-5">
<p><b class="text-danger">NOTICE!</b> NAVER AI Lab is hiring research scientists (full-time) and internship students (maximum 6 months). Please check the job description in the <a href="https://naver-career.gitbook.io/en/positions/ai-ml/ml-research#application-process-and-contact">ML Research introduction page</a> for more details. Our other teams (<a href="https://naver-career.gitbook.io/en/positions/ai-ml/backbone-research">Backbone Research</a>, <a href="https://naver-career.gitbook.io/en/positions/ai-ml/generation-research">Generation Research</a>, <a href="https://naver-career.gitbook.io/en/positions/ai-ml/language-research">Language Research</a> and <a href="https://naver-career.gitbook.io/en/positions/ai-ml/human-computer-interaction-research">HCI Research</a>) are also hiring!</p>
</div>
<nav class="navbar navbar-expand sticky-top navbar-light bg-white top-bottom-border">
<ul class="navbar-nav me-auto mb-0">
<li class="nav-item">
<a class="nav-link active ps-0 pe-4" aria-current="page" href="#research">Research</a>
</li>
<li class="nav-item">
<a class="nav-link active ps-0 pe-4" aria-current="page" href="#papers">Publications</a>
</li>
<li class="nav-item">
<a class="nav-link active ps-0 pe-4" aria-current="page" href="#activities">Activities</a>
</li>
</ul>
</nav>
<h3 id="news" class="mt-5">News</h3>
<div class="mb-5 on-click-all-active">
<ul class="pl15 mb-0">
<li>_1/2025 : 1 paper <sup><a href="#probabilistic-language-image-pre-training">[ProLIP]</a></sup> is accepted at ICLR 2025.</li>
<li>_1/2025: Reaching a research milestone of 10,000 citations at <a href="https://scholar.google.co.kr/citations?user=4_uj0xcAAAAJ">Google Scholar</a>!</li>
</ul>
<details>
<summary><strong>See older news</strong></summary>
<ul class="pl15 mb-0">
<li>12/2024 : Giving a talk at POSTECH AI Day (topic: Probabilistic Language-Image Pre-training) <a href="https://docs.google.com/presentation/d/1BEHEphXxdg0TjUsI3Cv8Xr3kLX6sbAlGytDEiN5iW7s/edit?usp=sharing">[slide]</a></li>
<li>12/2024 : I will serve as an area chair at <a href="https://icml.cc/Conferences/2025">ICML 2025</a></li>
<li>12/2024 : 1 paper <sup><a href="#read-watch-and-scream-sound-generation-from-text-and-video">[ReWaS]</a></sup> is accepted at AAAI 2025.</li>
<li>11/2024 : 1 paper <sup><a href="#fairdro-group-fairness-regularization-via-classwise-robust-optim">[FairDRO extension]</a></sup> is accepted at <a href="https://www.sciencedirect.com/journal/neural-networks">Neural Networks</a>.</li>
<li>10/2024 : 1 paper <sup><a href="#read-watch-and-scream-sound-generation-from-text-and-video">[ReWaS]</a></sup> is accepted at NeurIPS 2024 Workshop on Video-Language Models.</li>
<li>10/2024 : I will serve as an area chair at <a href="https://aistats.org/aistats2025/">AISTATS 2025</a></li>
<li>_9/2024 : 1 paper <sup><a href="#do-counterfactually-fair-image-classifiers-satisfy-group-fairnes">[CKD]</a></sup> is accepted at NeurIPS 2024 D&B track.</li>
<li>_9/2024 : Giving a talk at SKKU (topic: "Realistic challenges and limitations of AI") <a href="https://docs.google.com/presentation/d/1s_7f3Uu6CtYrucFYQLhCJyV8l3Nhz7QZebyxpi5IwLs/edit?usp=sharing">[slide]</a></li>
<li>_8/2024 : RoCOCO<sup><a href="#rococo-robust-benchmark-of-ms-coco-to-stress-test-robustness-of">[RoCOCO]</a></sup> is accepted at the <a href="https://syntheticdata4cv.wordpress.com/">ECCV 2024 Synthetic Data for Computer Vision Workshop</a> and selected as an oral presentation!</li>
<li>_8/2024 : Giving a talk at <a href="https://soict.hust.edu.vn/summer-school">HUST AI Summer School on "Generative AI"</a> (topic: "CompoDiff") <a href="https://docs.google.com/presentation/d/1GEVu5aZUxeJg3B6AU4iORlfQ1qCA5tIQjePkk0q6c1c/edit?usp=sharing">[slide]</a></li>
<li>_8/2024 : I will serve as an area chair at <a href="https://iclr.cc/Conferences/2025">ICLR 2025</a></li>
<li>_8/2024 : HYPE<sup><a href="#hype-hyperbolic-entailment-filtering-for-underspecified-images-a">[HYPE]</a></sup> is selected as an oral presentation at ECCV 2024!</li>
<li>_7/2024 : 1 paper <sup><a href="#compodiff-versatile-composed-image-retrieval-with-latent-diffusi">[CompoDiff]</a></sup> is accepted at <a href="https://openreview.net/forum?id=mKtlzW0bWc">TMLR</a>.</li>
<li>_7/2024 : 3 papers <sup><a href="#hype-hyperbolic-entailment-filtering-for-underspecified-images-a">[HYPE]</a></sup><sup><a href="#similarity-of-neural-architectures-using-adversarial-attack-tran">[SAT]</a></sup><sup><a href="#learning-with-unmasked-tokens-drives-stronger-vision-learners">[LUT]</a></sup> are accepted at ECCV 2024.</li>
<li>_4/2024 : 1 paper <sup><a href="#compodiff-versatile-composed-image-retrieval-with-latent-diffusi">[CompoDiff]</a></sup> is accepted at <a href="https://syndata4cv.github.io/">CVPR 2024 SynData4CV Workshop</a>.</li>
<li>_4/2024 : I will serve as an area chair at <a href="https://nips.cc/Conferences/2024/CallForDatasetsBenchmarks">NeurIPS 2024 Datasets and Benchmarks Track</a>.</li>
<li>_3/2024 : I will serve as an area chair at <a href="https://nips.cc/Conferences/2024">NeurIPS 2024</a>.</li>
<li>_3/2024 : 1 paper <sup><a href="#toward-interactive-regional-understanding-in-vision-large-langua">[RegionVLP]</a></sup> is accepted at NAACL 2024 main track.</li>
<li>_3/2024 : Giving a talk at UNIST (topic: "Probabilistic Image-Text Representations") <a href="https://docs.google.com/presentation/d/1IB-2A8w--jjQ9TAp1Xn8ANkfq_NyXAKaHtX3dh2e9_4/edit?usp=sharing">[slide]</a></li>
<li>_2/2024 : 1 paper <sup><a href="#language-only-efficient-training-of-zero-shot-composed-image-ret">[LinCIR]</a></sup> is accepted at CVPR 2024.</li>
<li>_2/2024 : Giving a talk at <a href="https://www.theieie.org/events/?part=03&c_id=872">IEIE AI Signal Processing Society Winter School</a> (topic: "Probabilistic Image-Text Representations") <a href="https://docs.google.com/presentation/d/1yelrDSN11rnChAk-gtU2YzSNX49XPHFIgROzj_lRA4Q/edit?usp=sharing">[slide]</a></li>
<li>_1/2024 : 2 papers <sup><a href="#improved-probabilistic-image-text-representations">[PCME++]</a></sup><sup><a href="#what-does-automatic-differentiation-compute-for-neural-networks">[AD Correctness]</a></sup> are accepted at ICLR 2024. One paper<sup><a href="#improved-probabilistic-image-text-representations">[PCME++]</a></sup> is my solo-authored paper 🤗, and one paper<sup><a href="#what-does-automatic-differentiation-compute-for-neural-networks">[AD Correctness]</a></sup> is selected as a spotlight (top-5% of papers)!</li>
<li>12/2023 : Giving a talk at Dankook University (topic: "Probabilistic Image-Text Representations") <a href="https://docs.google.com/presentation/d/1IB-2A8w--jjQ9TAp1Xn8ANkfq_NyXAKaHtX3dh2e9_4/edit?usp=sharing">[slide]</a></li>
<li>12/2023 : We finally released the <a href="https://huggingface.co/datasets/navervision/SynthTriplets18M">SynthTriplets18M dataset</a>!</li>
<li>11/2023 : Being nominated as one of the <a href="https://neurips.cc/Conferences/2023/ProgramCommittee#top-reivewers">NeurIPS 2023 top reviewers (10%)</a>.</li>
<li>_9/2023 : Giving a talk at <a href="https://soict.hust.edu.vn/summer-school">HUST AI Summer School on "Modern Machine Learning: Foundations and Applications"</a> (topic: "Probabilistic Image-Text Representations") <a href="https://docs.google.com/presentation/d/1IB-2A8w--jjQ9TAp1Xn8ANkfq_NyXAKaHtX3dh2e9_4/edit?usp=sharing">[slide]</a></li>
<li>_9/2023 : 1 paper <sup><a href="#improved-probabilistic-image-text-representations">[PCME++ short]</a></sup> is accepted at the non-archival track of <a href="https://iccv-clvl.github.io/2023/">ICCV 2023 Workshop on Closing The Loop Between Vision And Language (CLVL)</a>.</li>
<li>_8/2023 : Giving a talk at Yonsei University (topic: "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion") <a href="https://docs.google.com/presentation/d/1VTJlrHqnLAcQP3aHydnlFXNeZpsPMGa9-L-Oaigi_6M/edit?usp=sharing">[slide]</a></li>
<li>_7/2023 : 1 paper <sup><a href="#seit-storage-efficient-vision-training-with-tokens-using-1-of-pi">[SeiT]</a></sup> is accepted at ICCV 2023.</li>
<li>_7/2023 : Serving as a <a href="https://jmlr.org/tmlr/editorial-board.html">TMLR Action Editor</a>.</li>
<li>_6/2023 : Being nominated as a <a href="https://openreview.net/group?id=TMLR/Expert_Reviewers">TMLR Expert Reviewer</a>.</li>
<li>_6/2023 : Giving a talk at Sogang University (topic: "Probabilistic Image-Text Representations") <a href="https://docs.google.com/presentation/d/1hLCAGuY3HuJYzo20Puugw79aFSAIz8BMej-WJxCIsJk/edit?usp=sharing">[slide]</a></li>
<li>_5/2023 : Serving as an area chair at <a href="https://nips.cc/Conferences/2023/CallForDatasetsBenchmarks">NeurIPS 2023 Datasets and Benchmarks Track</a>.</li>
<li>_4/2023 : We released "Graphit: A Unified Framework for Diverse Image Editing Tasks" <a href="https://github.com/navervision/Graphit">[GitHub]</a> <sup><a href="#graphit-a-unified-framework-for-diverse-image-editing-tasks">[Graphit]</a></sup>. The technical report will be released soon!</li>
<li>_4/2023 : 1 paper <sup><a href="#three-recipes-for-better-3d-pseudo-gts-of-3d-human-mesh-estimati">[3D-Pseudo-Gts]</a></sup> is accepted at CVPR 2023 Workshop on Computer Vision for Mixed Reality (CV4MR).</li>
<li>_1/2023 : 1 paper <sup><a href="#re-weighting-based-group-fairness-regularization-via-classwise-r">[FairDRO]</a></sup> is accepted at ICLR 2023.</li>
<li>_9/2022 : Giving a talk at Sogang University (topic: "ECCV Caption") <a href="https://docs.google.com/presentation/d/1OKaWPlNblepiXF57oWs2miGgYb5kuu1qxNqV_-hDddU/edit?usp=sharing">[slide]</a></li>
<li>_9/2022 : 1 paper <sup><a href="#a-unified-analysis-of-mixed-sample-data-augmentation-a-loss-func">[MSDA theorem]</a></sup> is accepted at NeurIPS 2022.</li>
<li>_8/2022 : Starting a new chapter in life with <a href="https://8uos.github.io/">Song Park</a> 🤵❤️👰.</li>
<li>_7/2022 : 1 paper <sup><a href="#few-shot-font-generation-with-weakly-supervised-localized-repres">[LF-Font journal]</a></sup> is accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).</li>
<li>_7/2022 : 2 papers <sup><a href="#eccv-caption-correcting-false-negatives-by-collecting-machine-an">[ECCV Caption]</a></sup> <sup><a href="#domain-generalization-by-mutual-information-regularization-with">[MIRO]</a></sup> are accepted at ECCV 2022.</li>
<li>_7/2022 : Giving a talk at UNIST AIGS (topic: "Towards Reliable Machine Learning: Challenges, Examples, Solutions") <a href="https://docs.google.com/presentation/d/1SK2XwkQX5TPbkObDGnGgY4LWg6K-0dUdqyuosyVv2EI/edit?usp=sharing">[slide]</a></li>
<li>_6/2022 : Giving a tutorial on "Shortcut learning in Machine Learning: Challenges, Analysis, Solutions" at <a href="https://facctconference.org/2022/schedule.html">FAccT 2022</a>. [ <a href="https://sites.google.com/view/facct22-shortcut-learning/home">tutorial homepage</a> | <a href="https://docs.google.com/presentation/d/1UP-unGwtOhiO5rihMzNaXSCtn9fF7J5gLqmsY6jLve0/edit?usp=sharing">slide</a> | <a href="https://www.youtube.com/watch?v=8830gv2mIss">video</a> ]</li>
<li>_5/2022 : Receiving <strong><span class="text-danger">an outstanding reviewer award</span></strong> at CVPR 2022 <a href="https://cvpr2022.thecvf.com/outstanding-reviewers">[link]</a>.</li>
<li>_5/2022 : 1 paper <sup><a href="#dataset-condensation-with-contrastive-signals">[DCC]</a></sup> is accepted at ICML 2022.</li>
<li>_4/2022 : 1 paper <sup><a href="#evaluation-for-weakly-supervised-object-localization-protocol-me">[WSOL Eval journal]</a></sup> is accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).</li>
<li>_4/2022 : Organizing ICLR 2022 ML in Korea Social</li>
<li>_3/2022 : Giving guest lectures at KAIST and SNU (topic: "Towards Reliable Machine Learning") <a href="https://docs.google.com/presentation/d/1Z1XNVl6LfslCyTsERQQRzTb6qMpuyc-8wkMIYuu7IAY/edit?usp=sharing">[slide]</a></li>
<li>_3/2022 : Co-organizing FAccT 2022 Translation/Dialogue Tutorial: "Shortcut learning in Machine Learning: Challenges, Analysis, Solutions" (slides, videos and web pages will be released soon)</li>
<li>_3/2022 : 1 paper <sup><a href="#learning-fair-classifiers-with-partially-annotated-group-labels">[CGL]</a></sup> is accepted at CVPR 2022.</li>
<li>_2/2022 : Giving a talk at <a href="https://sites.google.com/view/pair-ml-winter-seminar-2022/home">POSTECH AI Research (PAIR) ML Winter Seminar 2022</a> (topic: "Shortcut learning in Machine Learning: Challenges, Examples, Solutions") <a href="https://docs.google.com/presentation/d/1LGtjxaXwkk_Z6OaJFUEjputi6eEtkq3h1sXB2yx7gPk/edit?usp=sharing">[slide]</a></li>
<li>_1/2022 : 2 papers <sup><a href="#vidt-an-efficient-and-effective-fully-transformer-based-object-d">[ViDT]</a></sup> <sup><a href="#which-shortcut-cues-will-dnns-choose-a-study-from-the-parameter">[WCST-ML]</a></sup> are accepted at ICLR 2022.</li>
<li>12/2021 : Co-hosting NeurIPS'21 workshop on <a href="#neurips-2021-workshop-on-imagenet-past-present-and-future">ImageNet: Past, Present, and Future</a> with 400+ attendees!</li>
<li>12/2021 : Giving a talk at University of Seoul (topic: "Realistic challenges and limitations of AI") <a href="https://docs.google.com/presentation/d/1FOIMm4bYWN6b80_H0El1N4PzGu-2gfNqfR6bgjU8-58/edit?usp=sharing">[slide]</a></li>
<li>11/2021 : Giving a talk at NAVER and NAVER Labs Europe (topic: Mitigating dataset biases in Real-world ML applications) <a href="https://docs.google.com/presentation/d/1jyrfZfSEwbgzSuVKUURfmIEo3baksOL2GxSrBSzR4qo/edit">[slide]</a></li>
<li>11/2021 : Giving a guest lecture at UNIST (topic: Limits and Challenges in Deep Learning Optimizers) <a href="https://docs.google.com/presentation/d/1NNftqS6BcCPd52tv8gWEjB34retYhP0FToOFBd9ewkQ/edit">[slide]</a></li>
<li>10/2021 : Releasing a unified few-shot font generation framework! <a href="https://github.com/clovaai/fewshot-font-generation">[code]</a></li>
<li>_9/2021 : 2 papers <sup><a href="#swad-domain-generalization-by-seeking-flat-minima">[SWAD]</a></sup> <sup><a href="#neural-hybrid-automata-learning-dynamics-with-multiple-modes-and">[NHA]</a></sup> are accepted at NeurIPS 2021.</li>
<li>_8/2021: Reaching a research milestone of 1,000 citations at <a href="https://scholar.google.co.kr/citations?user=4_uj0xcAAAAJ">Google Scholar</a> and <a href="https://www.semanticscholar.org/author/Sanghyuk-Chun/2647582">Semantic Scholar</a>!</li>
<li>_7/2021 : Co-organizing the NeurIPS Workshop on ImageNet: Past, Present, and Future! <a href="https://sites.google.com/view/imagenet-workshop/home">[webpage]</a></li>
<li>_7/2021 : 2 papers <sup><a href="#multiple-heads-are-better-than-one-few-shot-font-generation-with">[MX-Font]</a></sup> <sup><a href="#rethinking-spatial-dimensions-of-vision-transformers">[PiT]</a></sup> are accepted at ICCV 2021.</li>
<li>_7/2021 : Giving a talk at Computer Vision Centre (CVC), UAB (topic: PCME and AdamP) <a href="http://www.cvc.uab.es/?p=7778">[info]</a> <a href="https://docs.google.com/presentation/d/1dGQUqud3iMld-UgMMlRQuqA7JhzndMfzFoKaFAwZ58I/edit?usp=sharing">[slide]</a></li>
<li>_6/2021 : Giving a talk at KSIAM 2021 (topic: AdamP). <a href="https://docs.google.com/presentation/d/1s9zgQ22WFnhEj6POL_0ecrTiED__qmL9XgL0Nv3zNP4/edit?usp=sharing">[slide]</a></li>
<li>_6/2021 : Giving a guest lecture at Seoul National University (topic: few-shot font generation). <a href="https://docs.google.com/presentation/d/13WoYS9r9C751s3nZ2yF5WdestRXh3V7LFQDOJyCrt9s/edit?usp=sharing">[slide]</a></li>
<li>_5/2021 : Receiving <strong><span class="text-danger">an outstanding reviewer award</span></strong> at CVPR 2021 <a href="https://cvpr2021.thecvf.com/node/184">[link]</a>.</li>
<li>_4/2021 : 1 paper <sup><a href="#few-shot-font-generation-with-localized-style-representations-an">[LF-Font]</a></sup> is accepted at CVPR 2021 workshop (also appeared at AAAI).</li>
<li>_3/2021 : 2 papers <sup><a href="#probabilistic-embeddings-for-cross-modal-retrieval">[PCME]</a></sup> <sup><a href="#re-labeling-imagenet-from-single-to-multi-labels-from-global-to">[ReLabel]</a></sup> are accepted at CVPR 2021.</li>
<li>_1/2021 : 1 paper <sup><a href="#adamp-slowing-down-the-slowdown-for-momentum-optimizers-on-scale">[AdamP]</a></sup> is accepted at ICLR 2021.</li>
<li>12/2020 : 1 paper <sup><a href="#few-shot-font-generation-with-localized-style-representations-an">[LF-Font]</a></sup> is accepted at AAAI 2021.</li>
<li>_7/2020 : 1 paper <sup><a href="#few-shot-compositional-font-generation-with-dual-memory">[DM-Font]</a></sup> is accepted at ECCV 2020.</li>
<li>_6/2020 : Receiving <strong><span class="text-danger">the best paper runner-up award</span></strong> at AICCW CVPR 2020.</li>
<li>_6/2020 : Receiving <strong><span class="text-danger">an outstanding reviewer award</span></strong> at CVPR 2020 <a href="https://cvpr2020.thecvf.com/reviewer-acknowledgements">[link]</a>.</li>
<li>_6/2020 : Giving a talk at the CVPR 2020 NAVER interactive session.</li>
<li>_6/2020 : 1 paper <sup><a href="#learning-de-biased-representations-with-biased-representations">[ReBias]</a></sup> is accepted at ICML 2020.</li>
<li>_4/2020 : 1 paper <sup><a href="#toward-high-quality-few-shot-font-generation-with-dual-memory">[DM-Font short]</a></sup> is accepted at CVPR 2020 workshop.</li>
<li>_2/2020 : 1 paper <sup><a href="#evaluating-weakly-supervised-object-localization-methods-right">[wsoleval]</a></sup> is accepted at CVPR 2020.</li>
<li>_1/2020 : 1 paper <sup><a href="#data-driven-harmonic-filters-for-audio-representation-learning">[HCNN]</a></sup> is accepted at ICASSP 2020.</li>
<li>10/2019 : 1 paper <sup><a href="#automatic-music-tagging-with-harmonic-cnn">[HCNN short]</a></sup> is accepted at the ISMIR late-breaking demo session.</li>
<li>10/2019 : Working at Naver Labs Europe as a visiting researcher (Oct - Dec 2019)</li>
<li>_7/2019 : 2 papers <sup><a href="#cutmix-regularization-strategy-to-train-strong-classifiers-with">[CutMix]</a> <a href="#photorealistic-style-transfer-via-wavelet-transforms">[WCT2]</a></sup> are accepted at ICCV 2019 (1 oral presentation).</li>
<li>_6/2019 : Giving a talk at ICML 2019 Expo workshop.</li>
<li>_5/2019 : 2 papers <sup><a href="#visualizing-and-understanding-self-attention-based-music-tagging">[MTSA]</a> <a href="#an-empirical-evaluation-on-robustness-and-uncertainty-of-regular">[RegEval]</a></sup> are accepted at ICML 2019 workshops (1 oral presentation).</li>
<li>_5/2019 : Giving a talk at the ICLR 2019 Expo.</li>
<li>_3/2019 : 1 paper <sup><a href="#where-to-be-adversarial-perturbations-added-investigating-and-ma">[PRM]</a></sup> is accepted at ICLR 2019 workshop.</li>
</ul>
</details>
</div>
<div id="current_service" class="mt-5 mb-5 on-click-all-active">
<h3>Current Service Appointments</h3>
<ul>
<li><strong>Area Chair</strong> - <span class="badge bg-danger">ICLR 2025</span> <span class="badge bg-danger">AISTATS 2025</span> <span class="badge bg-danger">ICML 2025</span></li>
<li><strong>Action Editor</strong> - <span class="badge bg-success">TMLR</span></li>
<li><strong>Reviewer</strong> - <span class="badge bg-warning">CVPR 2025</span> <span class="badge bg-warning">ICLR Blog track 2025</span></li>
</ul>
</div>
<div id="research" class="mt-5 mb-5 on-click-all-active">
<!--<h3>Research: <small><span class="text-warning">Scalable</span> and <span class="text-success">Reliable Machine Learning</span> with <span class="text-danger">Language-guided Representation Learning</span></small></h3>-->
<h3>Research: <small>Scalable and Reliable Machine Learning with Language-guided Representation Learning</small></h3>
<p>To ensure the real-world applicability of machine learning (ML) models, we need the ability to generalize effectively to unseen scenarios encountered beyond the training phase. Three scenarios are frequently encountered in practical applications: (1) when input data significantly differs from the training data; (2) when the model faces target behaviors beyond the scope of its training targets, such as unexplored labels; and (3) when the application needs human opinions or subjective value judgments. Addressing all three scenarios relies on more than just massive datasets; it demands the inclusion of human knowledge that extends beyond web-crawled content. Yet, the question remains: how can we effectively integrate large-scale training and human knowledge guidance? To answer this question, my research aims to develop large-scale ML models exhibiting greater controllability and interpretability, thereby enabling human intervention to guide model behavior, even beyond the training phase. My work revolves around three main research themes towards this goal: <span class="text-danger">Language-combined Representation Learning</span>, <span class="text-success">Machine learning reliability</span> and <span class="text-warning">Optimization techniques for large-scale ML</span>.</p>
<p>A more detailed statement can be found in <a href="media/research_statement.pdf">my research statement</a>.</p>
<details>
<summary><strong style="font-size: 1.2em">Click here to read more about my research</strong></summary>
<p class="mt-4"><span class="text-danger">Language-combined Representation Learning.</span> Language serves as the most natural method for encoding human knowledge. If our ML model can comprehend human language alongside the target modality, we can understand the model better by interventing the space with human language. However, as language descriptions are the product of conscious choices of the key relevant concepts to report from input data, language-combined representation learning methods often suffer from the multiplicity (or many-to-many problem) between modalities. My recent works address this problem through <span class="text-danger">understanding and addressing the multiplicity problem by probabilistic representation learning</span>. In this paradigm, an input is mapped to a probabilistic distribution, rather than a deterministic vector. This approach enhances the interpretability of datasets and user controllability. Likewise, as another possible direction, I have explored adding more information or modality to the existing language-X models which enables more controllability. Furthermore, I have worked on establishing a robust <span class="text-danger">evaluation framework for vision-language models</span> in terms of their multiplicity and robustness.</p>
<ul>
<li><i><strong>S. Chun</strong>, <a href="#improved-probabilistic-image-text-representations">Improved Probabilistic Image-Text Representations</a>, <strong>ICLR 2024</strong></i></li>
<li><i><strong>S. Chun</strong>, S. J. Oh, R. S. de Rezende, Y. Kalantidis, D. Larlus, <a href="#probabilistic-embeddings-for-cross-modal-retrieval">Probabilistic Embeddings for Cross-Modal Retrieval</a>, <strong>CVPR 2021</strong></i></li>
<li><i><strong>S. Chun</strong>, W. Kim, S. Park, M. Chang, S. J. Oh, <a href="#eccv-caption-correcting-false-negatives-by-collecting-machine-an">ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO</a>, <strong>ECCV 2022</strong></i></li>
<li><i>W. Kim, <strong>S. Chun</strong>, W. Kim, T. Kim, D. Han, S. Yun, <a href="#hype-hyperbolic-entailment-filtering-for-underspecified-images-a">HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts</a>, <strong>ECCV 2024</strong></i></li>
<li><i>J. Lee, <strong>S. Chun</strong>, S. Yun, <a href="#toward-interactive-regional-understanding-in-vision-large-langua">Toward Interactive Regional Understanding in Vision-Large Language Models</a>, <strong>NAACL 2024</strong></i></li>
<li><i>S. Park, D. Um, H. Yoon, <strong>S. Chun</strong>, S. Yun, J. Y. Choi, <a href="#rococo-robust-benchmark-ms-coco-to-stress-test-robustness-of-ima">RoCOCO: Robust Benchmark of MS-COCO to Stress-test Robustness of Image-Text Matching Models</a>, <strong>ECCV 2024 Synthetic Data for Computer Vision Workshop</strong></i></li>
</ul>
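<p>To make the idea of probabilistic representations concrete, below is a minimal, illustrative sketch in PyTorch-style pseudocode of my own; it is not the exact PCME/PCME++ implementation, and the module/function names are placeholders. Each input is encoded as a diagonal Gaussian (a mean and a log-sigma head), and two inputs are matched via the expected distance between their distributions.</p>
<pre><code>import torch
import torch.nn as nn

class ProbabilisticHead(nn.Module):
    """Maps a backbone feature to a Gaussian embedding (mean, log-sigma)."""
    def __init__(self, feat_dim, embed_dim):
        super().__init__()
        self.mu = nn.Linear(feat_dim, embed_dim)         # mean of the embedding
        self.log_sigma = nn.Linear(feat_dim, embed_dim)  # log std (uncertainty)

    def forward(self, feats):
        return self.mu(feats), self.log_sigma(feats)

def match_probability(mu1, log_sigma1, mu2, log_sigma2, a=1.0, b=0.0):
    # Expected squared distance between two diagonal Gaussians:
    # ||mu1 - mu2||^2 + tr(Sigma1) + tr(Sigma2).
    var1, var2 = (2 * log_sigma1).exp(), (2 * log_sigma2).exp()
    dist = ((mu1 - mu2) ** 2 + var1 + var2).sum(-1)
    # Calibrate the distance into a soft match probability (a and b are
    # learnable scalars in practice). Ambiguous inputs get larger sigma,
    # which lowers their confidence for any single match.
    return torch.sigmoid(-a * dist + b)
</code></pre>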
<p>How can we make a model comprehend human language alongside the target modality? To answer this question, I have recently worked on <span class="text-danger">text-conditioned diffusion models</span>. In particular, I am interested in utilizing the power of recent diffusion models for text-conditioned feature transforms or data augmentation. However, we need more versatility and controllability to adapt diffusion models to the desired tasks, e.g., localized conditions provided via region masks. My recent works have focused on the versatility and controllability of diffusion models, and on applying diffusion models to non-generative downstream tasks, such as composed image retrieval (CIR).</p>
<ul>
<li><i>G. Gu<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, W. Kim, Y. Kang, S. Yun, <a href="#language-only-efficient-training-of-zero-shot-composed-image-ret">Language-only Efficient Training of Zero-shot Composed Image Retrieval</a>, <strong>CVPR 2024</strong></i></li>
<li><i>G. Gu<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, W. Kim, H. Jun, Y. Kang, S. Yun, <a href="#compodiff-versatile-composed-image-retrieval-with-latent-diffusi">CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion</a>, <strong>TMLR</strong></i></li>
<li><i>J. Byun<sup>❋</sup>, S. Jeong<sup>❋</sup>, W. Kim, <strong>S. Chun<sup>†</sup></strong>, T. Moon<sup>†</sup>, <a href="#reducing-task-discrepancy-of-text-encoders-for-zero-shot-compose">Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval</a>, <strong>preprint</strong></i></li>
<li><i>G. Gu, <strong>S. Chun</strong>, W. Kim, H. Jun, S. Yun, Y. Kang, <a href="#graphit-a-unified-framework-for-diverse-image-editing-tasks">Graphit: A Unified Framework for Diverse Image Editing Tasks</a>, <strong>open source</strong></i></li>
</ul>
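<p>For intuition on what a composed image retrieval (CIR) query looks like, here is a tiny, hypothetical late-fusion baseline. It is <em>not</em> the CompoDiff or LinCIR method; the function name and the embedding/gallery tensors are stand-ins for a CLIP-like dual encoder. The reference-image embedding and the modification-text embedding are fused into a single query and matched against gallery image embeddings.</p>
<pre><code>import torch
import torch.nn.functional as F

def cir_query(ref_image_emb, mod_text_emb, gallery_embs, topk=5):
    """ref_image_emb, mod_text_emb: (D,) embeddings; gallery_embs: (N, D)."""
    # Late fusion: simply add the two embeddings and renormalize.
    query = F.normalize(ref_image_emb + mod_text_emb, dim=-1)
    gallery = F.normalize(gallery_embs, dim=-1)
    scores = gallery @ query           # cosine similarity to each gallery image
    return scores.topk(topk).indices   # indices of the best-matching images
</code></pre>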
<p class="mt-4"><span class="text-success">Machine learning reliability.</span> Existing machine learning models cannot understand the problem itself <sup><a href="#facct-2022-translation-dialogue-tutorial-shortcut-learning-in-ma">[Shortcut learning tutorial]</a></sup>. This causes many realistic problems, such as discrimination by machines, poor generalizability to unseen (or minor) corruptions / environments / groups. Current state-of-the-art machines only do "predict", rather than "logical thinking based on logical reasoning". As models prefer to learn by shortcuts <sup><a href="#which-shortcut-cues-will-dnns-choose-a-study-from-the-parameter">[WCST-ML]</a></sup>, just training models as usual will lead to biased models. One of my research interest is to investigate these phenomena with various tools.</p>
<ul>
<li><i><strong>S. Chun</strong>, K. Song, Y. Jung, <a href="#facct-2022-translation-dialogue-tutorial-shortcut-learning-in-ma">FAccT 2022 Translation/Dialogue Tutorial: "Shortcut learning in Machine Learning: Challenges, Analysis, Solutions"</a></i></li>
<li><i>L. Scimeca<sup>❋</sup>, S. J. Oh<sup>❋</sup>, <strong>S. Chun</strong>, M. Poli, S. Yun, <a href="#which-shortcut-cues-will-dnns-choose-a-study-from-the-parameter">Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective</a>, <strong>ICLR 2022</strong></i></li>
</ul>
<p>If it is difficult to make machines understand the problem itself, what can we do? Our models should not learn undesirable shortcut features <sup><a href="#learning-de-biased-representations-with-biased-representations">[ReBias]</a> <a href="#styleaugment-learning-texture-de-biased-representations-by-style">[StyleAugment]</a></sup>, and should be robust to unseen corruptions <sup><a href="#cutmix-regularization-strategy-to-train-strong-classifiers-with">[CutMix]</a> <a href="#an-empirical-evaluation-on-robustness-and-uncertainty-of-regular">[RegEval]</a> <a href="#re-labeling-imagenet-from-single-to-multi-labels-from-global-to">[ReLabel]</a> <a href="#rethinking-spatial-dimensions-of-vision-transformers">[PiT]</a></sup> and significant distribution shifts <sup><a href="#swad-domain-generalization-by-seeking-flat-minima">[SWAD]</a></sup> <sup><a href="#domain-generalization-by-mutual-information-regularization-with">[MIRO]</a></sup>. We also need machines that do not discriminate against certain demographic groups <sup><a href="#learning-fair-classifiers-with-partially-annotated-group-labels">[CGL]</a></sup> <sup><a href="#re-weighting-based-group-fairness-regularization-via-classwise-r">[FairDRO]</a></sup>. We expect a model to say "I don't know" when it gets unexpected inputs <sup><a href="#probabilistic-embeddings-for-cross-modal-retrieval">[PCME]</a></sup> <sup><a href="#improved-probabilistic-image-text-representations">[PCME++]</a></sup>. At the very least, we expect a model to explain why it makes such a decision <sup><a href="#toward-interpretable-music-tagging-with-self-attention">[MTSA]</a> <a href="#visualizing-and-understanding-self-attention-based-music-tagging">[MTSA WS]</a> <a href="#evaluating-weakly-supervised-object-localization-methods-right">[WSOL eval]</a> <a href="#evaluation-for-weakly-supervised-object-localization-protocol-me">[WSOL Eval journal]</a></sup>, how different model design choices will change model decisions <sup><a href="#similarity-of-neural-architectures-based-on-input-gradient-trans">[NetSim]</a></sup>, and how it can be fixed (e.g., more data collection? more annotations? filtering?). My research focuses on expanding machine knowledge from "just prediction" to "logical reasoning". In particular, my recent research has concentrated on tackling various generalization downstream tasks, such as <span class="text-success">de-biasing, domain generalization, algorithmic fairness, and adversarial robustness.</span></p>
<ul>
<li>De-biasing</li>
<ul>
<li><i>H. Bahng, <strong>S. Chun</strong>, S. Yun, J. Choo, S. J. Oh, <a href="#learning-de-biased-representations-with-biased-representations">Learning De-biased Representations with Biased Representations</a>, <strong>ICML 2020</strong></i></li>
<li><i><strong>S. Chun</strong>, S. Park, <a href="#styleaugment-learning-texture-de-biased-representations-by-style">StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures</a>, <strong>preprint</strong></i></li>
</ul>
<li>Domain Generalization</li>
<ul>
<li><i>J. Cha, <strong>S. Chun<sup>❋</sup></strong>, K. Lee <sup>❋</sup>, H. C. Cho, S. Park, Y. Lee, S. Park <a href="#swad-domain-generalization-by-seeking-flat-minima">SWAD: Domain Generalization by Seeking Flat Minima</a>, <strong>NeurIPS 2021</strong></i></li>
<li><i>J. Cha, K. Lee, S. Park, <strong>S. Chun</strong> <a href="#domain-generalization-by-mutual-information-regularization-with">Domain Generalization by Mutual-Information Regularization with Pre-trained Models</a>, <strong>ECCV 2022</strong></i></li>
</ul>
<li>Algorithmic Fairness</li>
<ul>
<li><i>S. Jung, <strong>S. Chun<sup>❋</sup></strong>, T. Moon<sup>❋</sup>, <a href="#learning-fair-classifiers-with-partially-annotated-group-labels">Learning Fair Classifiers with Partially Annotated Group Labels</a>, <strong>CVPR 2022</strong></i></li>
<li><i>S. Jung<sup>❋</sup>, T. Park<sup>❋</sup>, <strong>S. Chun</strong>, T. Moon, <a href="#re-weighting-based-group-fairness-regularization-via-classwise-r">Re-weighting based Group Fairness Regularization via Classwise Robust Optimization</a>, <strong>ICLR 2023</strong></i></li>
</ul>
<li>Adversarial Robustness</li>
<ul>
<li><i>J. Hwang, D. Han, B. Heo, S. Park, <strong>S. Chun<sup>❋</sup></strong>, J. S. Lee<sup>❋</sup> <a href="#similarity-of-neural-architectures-using-adversarial-attack-tran">Similarity of Neural Architectures using Adversarial Attack Transferability</a>, <strong>ECCV 2024</strong></i></li>
<li><i>J. Hwang<sup>❋</sup>, Y. Kim<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, J. Yoom, J. H. Kim, D. Han, <a href="#where-to-be-adversarial-perturbations-added-investigating-and-ma">Where To Be Adversarial Perturbations Added? Investigating and Manipulating Pixel Robustness Using Input Gradients</a>, <strong>ICLR Workshop 2019</strong></i></li>
</ul>
</ul>
<p>Correct and fair evaluation is crucial for research development. However, existing evaluation protocols and metrics often lack the reliability to measure whether machines learn proper knowledge. I have also actively engaged in addressing this issue by working on <span class="text-success">fair evaluation benchmarks and metrics.</span></p>
<ul>
<li><i><strong>S. Chun</strong>, W. Kim, S. Park, M. Chang, S. J. Oh, <a href="#eccv-caption-correcting-false-negatives-by-collecting-machine-an">ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO</a>, <strong>ECCV 2022</strong></i></li>
<li><i><strong>S. Chun</strong>, S. J. Oh, S. Yun, D. Han, J. Choe, Y. Yoo, <a href="#an-empirical-evaluation-on-robustness-and-uncertainty-of-regular">An Empirical Evaluation on Robustness and Uncertainty of Regularization methods</a>, <strong>ICML 2019 Workshop</strong></i></li>
<li><i>J. Choe<sup>❋</sup>, S. J. Oh<sup>❋</sup>, <strong>S. Chun</strong>, S. Lee, Z. Akata, H. Shim, <a href="#evaluation-for-weakly-supervised-object-localization-protocol-me">Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets</a>, <strong>PAMI</strong></i></li>
<li><i>J. Choe<sup>❋</sup>, S. J. Oh<sup>❋</sup>, S. Lee, <strong>S. Chun</strong>, Z. Akata, H. Shim, <a href="#evaluating-weakly-supervised-object-localization-methods-right">Evaluating Weakly Supervised Object Localization Methods Right</a>, <strong>CVPR 2020</strong></i></li>
</ul>
<p class="mt-4"><span class="text-warning">Optimization techniques for large-scale ML.</span> Last but not least, I have actively worked on developing <span class="text-warning">general optimization techniques for large-scale machine learning models</span>, including data augmentation, optimizer, network architecture, objective function. My research emphasizes two key objectives: empirical impact and theoretical soundness. Especially, my aim is to develop easy-to-use techniques that seamlessly function as plug-and-play solutions.</p>
<ul>
<li>Data augmentation</li>
<ul>
<li><i>S. Yun, D. Han, S. J. Oh, <strong>S. Chun</strong>, J. Choe, Y. Yoo, <a href="#cutmix-regularization-strategy-to-train-strong-classifiers-with">CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features</a>, <strong>ICCV 2019</strong></i></li>
<li><i>C. Park<sup>❋</sup>, S. Yun<sup>❋</sup>, <strong>S. Chun</strong>, <a href="#a-unified-analysis-of-mixed-sample-data-augmentation-a-loss-func">A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective</a>, <strong>NeurIPS 2022</strong></i></li>
<li><i><strong>S. Chun</strong>, S. Park, <a href="#styleaugment-learning-texture-de-biased-representations-by-style">StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures</a>, <strong>preprint</strong></i></li>
</ul>
<li>Storage efficient learning</li>
<ul>
<li><i>S. Park<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, B. Heo, W. Kim, S. Yun, <a href="#seit-storage-efficient-vision-training-with-tokens-using-1-of-pi">SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage</a>, <strong>ICCV 2023</strong></i></li>
<li><i>S. Yun, S. J. Oh, B. Heo, D. Han, J. Choe, <strong>S. Chun</strong>, <a href="#re-labeling-imagenet-from-single-to-multi-labels-from-global-to">Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels</a>, <strong>CVPR 2021</strong></i></li>
<li><i>S. Lee, <strong>S. Chun</strong>, S. Jung, S. Yun, S. Yoon, <a href="#dataset-condensation-with-contrastive-signals">Dataset Condensation with Contrastive Signals</a>, <strong>ICML 2022</strong></i></li>
</ul>
<li>Optimizer</li>
<ul>
<li><i>B. Heo<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, S. J. Oh, D. Han, S. Yun, G. Kim, Y. Uh, J. W. Ha, <a href="#adamp-slowing-down-the-slowdown-for-momentum-optimizers-on-scale">AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights</a>, <strong>ICLR 2021</strong></i></li>
<li><i>J. Cha, <strong>S. Chun<sup>❋</sup></strong>, K. Lee <sup>❋</sup>, H. C. Cho, S. Park, Y. Lee, S. Park <a href="#swad-domain-generalization-by-seeking-flat-minima">SWAD: Domain Generalization by Seeking Flat Minima</a>, <strong>NeurIPS 2021</strong></i></li>
</ul>
<li>Network architecture</li>
<ul>
<li><i>B. Heo, S. Yun, D. Han, <strong>S. Chun</strong>, J. Choe, S. J. Oh, <a href="#rethinking-spatial-dimensions-of-vision-transformers">Rethinking Spatial Dimensions of Vision Transformers</a>, <strong>ICCV 2021</strong></i></li>
<li><i>H. Song, D. Sun, <strong>S. Chun</strong>, V. Jampani, D. Han, B. Heo, W. Kim, M. H. Yang, <a href="#vidt-an-efficient-and-effective-fully-transformer-based-object-d">ViDT: An Efficient and Effective Fully Transformer-based Object Detector</a>, <strong>ICLR 2022</strong></i></li>
<li><i>H. Song, D. Sun, <strong>S. Chun</strong>, V. Jampani, D. Han, B. Heo, W. Kim, M. H. Yang, <a href="#an-extendable-efficient-and-effective-transformer-based-object-d">An Extendable, Efficient and Effective Transformer-based Object Detector</a>, <strong>preprint</strong></i></li>
<li><i>J. Hwang, D. Han, B. Heo, S. Park, <strong>S. Chun<sup>❋</sup></strong>, J. S. Lee<sup>❋</sup> <a href="#similarity-of-neural-architectures-using-adversarial-attack-tran">Similarity of Neural Architectures using Adversarial Attack Transferability</a>, <strong>ECCV 2024</strong></i></li>
</ul>
</ul>
<p>Lastly, I have also worked on <span class="text-warning">domain-specific optimization techniques</span> that utilize properties of the given data, e.g., the compositionality of Korean/Chinese letters, low- and high-frequency information for better audio understanding, or harmonic information for multi-source audio understanding.</p>
<ul>
<li><i>S. Park, <strong>S. Chun</strong>, J. Cha, B. Lee, H. Shim, <a href="#multiple-heads-are-better-than-one-few-shot-font-generation-with">Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts</a>, <strong>ICCV 2021</strong></i></li>
<li><i>S. Park <sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, J. Cha, B. Lee, H. Shim, <a href="#few-shot-font-generation-with-weakly-supervised-localized-repres">Few-shot Font Generation with Weakly Supervised Localized Representations</a>, <strong>PAMI</strong></i></li>
<li><i>S. Park <sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, J. Cha, B. Lee, H. Shim, <a href="#few-shot-font-generation-with-localized-style-representations-an">Few-shot Font Generation with Localized Style Representations and Factorization</a>, <strong>AAAI 2021</strong></i></li>
<li><i>J. Cha, <strong>S. Chun</strong>, G. Lee, B. Lee, S. Kim, H. Lee, <a href="#few-shot-compositional-font-generation-with-dual-memory">Few-shot Compositional Font Generation with Dual Memory</a>, <strong>ECCV 2020</strong></i></li>
<li><i> J. Yoo<sup>❋</sup>, Y. Uh<sup>❋</sup>, <strong>S. Chun<sup>❋</sup></strong>, B. Kang, J. W. Ha, <a href="#photorealistic-style-transfer-via-wavelet-transforms">Photorealistic Style Transfer via Wavelet Transforms</a>, <strong>ICCV 2019</strong></i></li>
<li><i>M. Won, <strong>S. Chun</strong>, O. Nieto, X. Serra, <a href="#data-driven-harmonic-filters-for-audio-representation-learning">Data-driven Harmonic Filters for Audio Representation Learning</a>, <strong>ICASSP 2020</strong></i></li>
<li><i>M. Won, <strong>S. Chun</strong>, X. Serra, <a href="#toward-interpretable-music-tagging-with-self-attention">Toward Interpretable Music Tagging with Self-attention</a>, <strong>preprint</strong></i></li>
<li><i>J. H. Kim<sup>❋</sup>, J. Yoo<sup>❋</sup>, <strong>S. Chun</strong>, A. Kim, J. W. Ha, <a href="#multi-domain-processing-via-hybrid-denoising-networks-for-speech">Multi-Domain Processing via Hybrid Denoising Networks for Speech Enhancement</a>, <strong>preprint</strong></i></li>
</ul>
</details>
</div>
<hr>
<h3 id="papers" class="mt-5">Publications</h3>
<p>
<strong>(C: peer-reviewed conference, W: peer-reviewed workshop, A: arXiv preprint, O: others)<br>
<span class="text-danger">(<sup>❋</sup>authors contributed equally)</span></strong><br>
See also my <a href="https://scholar.google.co.kr/citations?user=4_uj0xcAAAAJ">Google Scholar</a>.
</p>
<div class="mb-4">
<nav>
<div class="nav nav-tabs" id="nav-tab" role="tablist">
<button class="nav-link active" id="nav-selected-tab" data-bs-toggle="tab" data-bs-target="#nav-selected" type="button" role="tab" aria-controls="nav-selected" aria-selected="true">SELECTED PAPERS</button>
<button class="nav-link" id="nav-all-tab" data-bs-toggle="tab" data-bs-target="#nav-all" type="button" role="tab" aria-controls="nav-all" aria-selected="false">ALL</button>
<button class="nav-link" id="nav-vl-tab" data-bs-toggle="tab" data-bs-target="#nav-vl" type="button" role="tab" aria-controls="nav-vl" aria-selected="false">VISION-LANGUAGE</button>
<button class="nav-link" id="nav-reliability-tab" data-bs-toggle="tab" data-bs-target="#nav-reliability" type="button" role="tab" aria-controls="nav-reliability" aria-selected="false">ML RELIABILITY</button>
<button class="nav-link" id="nav-optimization-tab" data-bs-toggle="tab" data-bs-target="#nav-optimization" type="button" role="tab" aria-controls="nav-optimization" aria-selected="false">LARGE-SCALE OPTIMIZATION</button>
</div>
</nav>
<div class="tab-content" id="nav-tabContent">
<div class="tab-pane fade show active" id="nav-selected" role="tabpanel" aria-labelledby="nav-selected-tab">
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item">
<strong>Probabilistic Language-Image Pre-Training.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Wonjae Kim, Song Park, Sangdoo Yun</li>
<li><strong><em>ICLR 2025.</em></strong> <a href="media/papers/chun2024prolip.pdf">paper</a> | <a href="https://github.com/naver-ai/prolip/">code</a> | <a href="https://huggingface.co/collections/SanghyukChun/prolip-6712595dfc87fd8597350291">pre-trained models 🤗</a> | <a href="https://docs.google.com/presentation/d/1BEHEphXxdg0TjUsI3Cv8Xr3kLX6sbAlGytDEiN5iW7s/edit?usp=sharing">slide</a> | <a href="media/bibtex/chun2024prolip.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Improved Probabilistic Image-Text Representations.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong></li>
<li><strong><em>ICLR 2024.</em></strong> <strong><em>ICCV CLVL 2023.</em></strong> <a href="media/papers/chun2024iclr_pcmepp.pdf">paper</a> | <a href="https://github.com/naver-ai/pcmepp/">code</a> | <a href="https://naver-ai.github.io/pcmepp/">project page</a> | <a href="https://docs.google.com/presentation/d/1yelrDSN11rnChAk-gtU2YzSNX49XPHFIgROzj_lRA4Q/edit?usp=sharing">slide</a> | <a href="media/bibtex/chun2023pcmepp.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Probabilistic Embeddings for Cross-Modal Retrieval.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus</li>
<li><strong><em>CVPR 2021.</em></strong> <a href="media/papers/chun2021cvpr_pcme.pdf">paper</a> | <a href="https://github.com/naver-ai/pcme">code</a> | <a href="https://www.youtube.com/watch?v=J_DaqSLEcVk">video</a> | <a href="https://docs.google.com/presentation/d/1Tyac3fRvEGYkmbB9iUELU5jjv8IWo5oU_eQUGOkiCuk/edit?usp=sharing">slide (short talk)</a> | <a href="https://docs.google.com/presentation/d/1dGQUqud3iMld-UgMMlRQuqA7JhzndMfzFoKaFAwZ58I/edit?usp=sharing">slide (long talk)</a> | <a href="media/bibtex/chun2021pcme.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh</li>
<li><strong><em>ECCV 2022.</em></strong> <a href="media/papers/chun2022eccv_caption.pdf">paper</a> | <a href="https://github.com/naver-ai/eccv-caption">code</a> | <a href="https://pypi.org/project/eccv-caption/">pypi</a> | <a href="https://docs.google.com/presentation/d/1zyLL49_2-F6mQFaMIumPfdE7el_r048XtidLnehepHo/edit?usp=sharing">slide (short talk)</a> | <a href="https://docs.google.com/presentation/d/1OKaWPlNblepiXF57oWs2miGgYb5kuu1qxNqV_-hDddU/edit?usp=sharing">slide (long talk)</a> | <a href="media/bibtex/chun2022eccv_caption.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Language-only Efficient Training of Zero-shot Composed Image Retrieval.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>CVPR 2024.</em></strong> <a href="media/papers/gu2024cvpr_lincir.pdf">paper</a> | <a href="https://github.com/navervision/lincir">code</a> | <a href="https://huggingface.co/spaces/navervision/LinCIR">demo 🤗</a> | <a href="media/bibtex/gu2024lincir.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>TMLR.</em></strong> <strong><em>CVPR 2024 SynData4CV Workshop.</em></strong> <a href="media/papers/gu2024compodiff.pdf">paper</a> | <a href="https://github.com/navervision/CompoDiff">code</a> | <a href="https://huggingface.co/datasets/navervision/SynthTriplets18M">SynthTriplets18M dataset</a> | <a href="https://huggingface.co/spaces/navervision/CompoDiff-Aesthetic">demo 🤗</a> | <a href="https://docs.google.com/presentation/d/1VTJlrHqnLAcQP3aHydnlFXNeZpsPMGa9-L-Oaigi_6M/edit?usp=sharing">slide</a> | <a href="media/bibtex/gu2024compodiff.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Wonjae Kim, <strong>Sanghyuk Chun</strong>, Taekyung Kim, Dongyoon Han, Sangdoo Yun</li>
<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/kim2024hype.pdf">paper</a> | <a href="https://github.com/naver-ai/hype">code</a> | <a href="media/bibtex/kim2024hype.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval.</strong>
<ul>
<li>Jaeseok Byun<sup>❋</sup>, Seokhyeon Jeong<sup>❋</sup>, Wonjae Kim, <strong>Sanghyuk Chun</strong><sup>†</sup>, Taesup Moon<sup>†</sup></li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/byun2024rtd.pdf">paper</a> | <a href="media/bibtex/byun2024rtd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Toward Interactive Regional Understanding in Vision-Large Language Models</strong>
<ul>
<li>Jungbeom Lee, <strong>Sanghyuk Chun<sup>❋</sup></strong>, Sangdoo Yun<sup>❋</sup></li>
<li><strong><em>NAACL 2024.</em></strong> <a href="media/papers/lee2024naacl_region_vlm.pdf">paper</a> | <a href="https://github.com/jbeomlee93/RegionVLM">code</a> | <a href="media/bibtex/lee2024vlm.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights.</strong>
<ul>
<li>Byeongho Heo<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha</li>
<li><strong><em>ICLR 2021.</em></strong> <a href="media/papers/heo2021iclr_adamp.pdf">paper</a> | <a href="https://github.com/ClovaAI/AdamP">code</a> | <a href="https://clovaai.github.io/AdamP/">project page</a> | <a href="https://pypi.org/project/adamp/">pypi</a> | <a href="https://docs.google.com/presentation/d/1s9zgQ22WFnhEj6POL_0ecrTiED__qmL9XgL0Nv3zNP4/edit?usp=sharing">slide</a> | <a href="media/bibtex/heo2021adamp.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>SWAD: Domain Generalization by Seeking Flat Minima.</strong>
<ul>
<li>Junbum Cha, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Kyungjae Lee<sup>❋</sup>, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park</li>
<li><strong><em>NeurIPS 2021.</em></strong> <a href="media/papers/cha2021neurips_swad.pdf">paper</a> | <a href="https://github.com/khanrc/swad">code</a> | <a href="media/bibtex/cha2021neurips_swad.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Learning De-biased Representations with Biased Representations.</strong>
<ul>
<li>Hyojin Bahng, <strong>Sanghyuk Chun</strong>, Sangdoo Yun, Jaegul Choo, Seong Joon Oh</li>
<li><strong><em>ICML 2020.</em></strong> <a href="media/papers/bahng2020icml_rebias.pdf">paper</a> | <a href="https://github.com/clovaai/rebias">code</a> | <a href="https://twitter.com/SanghyukChun/status/1278357842362155008">tweet</a> | <a href="https://youtu.be/lkjMxZDGubA">video</a> | <a href="https://docs.google.com/presentation/d/1edTv6-gHs-HCF10f2gUCIXiWJ-nSZWg7dzp5q-eKeRk/edit?usp=sharing">slide</a> | <a href="media/bibtex/bahng2020rebias.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Learning Fair Classifiers with Partially Annotated Group Labels.</strong>
<ul>
<li>Sangwon Jung, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Taesup Moon<sup>❋</sup></li>
<li><strong><em>CVPR 2022.</em></strong> <a href="media/papers/jung2022cvpr_cgl.pdf">paper</a> | <a href="https://github.com/naver-ai/cgl_fairness">code</a> | <a href="media/bibtex/jung2022cgl.txt">bibtex</a></li>
</ul>
</li>
<!--<li class="list-group-item">-->
<!--<strong>Domain Generalization by Mutual-Information Regularization with Pre-trained Models.</strong>-->
<!--<ul>-->
<!--<li>Junbum Cha, Kyungjae Lee, Sungrae Park, <strong>Sanghyuk Chun</strong></li>-->
<!--<li><strong><em>ECCV 2022.</em></strong> <a href="media/papers/cha2022miro.pdf">paper</a> | <a href="https://github.com/kakaobrain/miro">code</a> | <a href="media/bibtex/cha2022miro.txt">bibtex</a></li>-->
<!--</ul>-->
<!--</li>-->
<!--<li class="list-group-item">-->
<!--<strong>Re-weighting based Group Fairness Regularization via Classwise Robust Optimization.</strong>-->
<!--<ul>-->
<!--<li>Sangwon Jung<sup>❋</sup>, Taeeon Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Taesup Moon</li>-->
<!--<li><strong><em>ICLR 2023.</em></strong> <a href="media/papers/jung2023iclr_fairdro.pdf">paper</a> | <a href="https://github.com/sangwon-jung94/FairDRO">code</a> | <a href="media/bibtex/jung2023iclr_fairdro.txt">bibtex</a></li>-->
<!--</ul>-->
<!--</li>-->
<li class="list-group-item">
<strong>CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Sangdoo Yun, Dongyoon Han, Seong Joon Oh, <strong>Sanghyuk Chun</strong>, Junsuk Choe, Youngjoon Yoo</li>
<li><strong><em>ICCV 2019.</em></strong> <a href="media/papers/yun2019iccv_cutmix.pdf">paper</a> | <a href="https://github.com/ClovaAI/CutMix-PyTorch">code and pretrained models</a> | <a href="https://clova-ai.blog/2019/07/15/cutmix-regularization-strategy-to-train-strong-classifiers-with-localizable-features/">blog</a> | <a href="media/bibtex/yun2019cutmix.txt">bibtex</a></li>
</ul>
</li>
<!--<li class="list-group-item">-->
<!--<strong>An Empirical Evaluation on Robustness and Uncertainty of Regularization methods.</strong>-->
<!--<ul>-->
<!--<li><strong>Sanghyuk Chun</strong>, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo</li>-->
<!--<li><strong><em>ICML Workshop 2019.</em></strong> <a href="media/papers/chun2019icmlws.pdf">paper</a> | <a href="media/bibtex/chun2019icmlw.txt">bibtex</a></li>-->
<!--</ul>-->
<!--</li>-->
<!--<li class="list-group-item">-->
<!--<strong>Similarity of Neural Architectures using Adversarial Attack Transferability.</strong>-->
<!--<ul>-->
<!--<li>Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Jong-Seok Lee<sup>❋</sup></li>-->
<!--<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/hwang2024sat.pdf">paper</a> | <a href="https://github.com/J-H-Hwang/SAT">code</a> | <a href="media/bibtex/hwang2024sat.txt">bibtex</a></li>-->
<!--</ul>-->
<!--</li>-->
<!--<li class="list-group-item">-->
<!--<strong>SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage.</strong>-->
<!--<ul>-->
<!--<li>Song Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Byeongho Heo, Wonjae Kim, Sangdoo Yun</li>-->
<!--<li><strong><em>ICCV 2023.</em></strong> <a href="media/papers/park2023seit.pdf">paper</a> | <a href="https://github.com/naver-ai/seit">code</a> | <a href="https://docs.google.com/presentation/d/1oHjYjvbC3QINuwR5MQAteik0ifsIxR3LTuCus-BTKwY/edit#slide=id.p">slide</a> | <a href="media/bibtex/park2023seit.txt">bibtex</a></li>-->
<!--</ul>-->
<!--</li>-->
</ul>
</div>
<div class="tab-pane fade" id="nav-all" role="tabpanel" aria-labelledby="nav-all-tab">
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2024">2025</h4>
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-conference">
<strong class="anchor-strong">Probabilistic Language-Image Pre-Training.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Wonjae Kim, Song Park, Sangdoo Yun</li>
<li><strong><em>ICLR 2025.</em></strong> <a href="media/papers/chun2024prolip.pdf">paper</a> | <a href="https://github.com/naver-ai/prolip/">code</a> | <a href="https://huggingface.co/collections/SanghyukChun/prolip-6712595dfc87fd8597350291">pre-trained models 🤗</a> | <a href="https://docs.google.com/presentation/d/1BEHEphXxdg0TjUsI3Cv8Xr3kLX6sbAlGytDEiN5iW7s/edit?usp=sharing">slide</a> | <a href="media/bibtex/chun2024prolip.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference-workshop">
<strong class="anchor-strong">Read, Watch and Scream! Sound Generation from Text and Video.</strong>
<ul>
<li>Yujin Jeong, Yunji Kim, <strong>Sanghyuk Chun</strong>, Jiyoung Lee</li>
<li><strong><em>AAAI 2025 | NeurIPS 2024 Workshop on Video-Language Models.</em></strong> <a href="media/papers/jeong2025aaai_rewas.pdf">paper</a> | <a href="https://naver-ai.github.io/rewas/">project page</a> | <a href="media/bibtex/jeong2025aaai_rewas.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2024">2024</h4>
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-conference">
<strong class="anchor-strong">Do Counterfactually Fair Image Classifiers Satisfy Group Fairness? -- A Theoretical and Empirical Study.</strong>
<ul>
<li>Sangwon Jung, Sumin Yu, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Taesup Moon<sup>❋</sup></li>
<li><strong><em>NeurIPS 2024 D&amp;B.</em></strong> <a href="media/papers/jung2024neurips_ckd.pdf">paper</a> | <a href="https://github.com/sumin-yu/CKD">code and dataset</a> | <a href="media/bibtex/jung2024neurips_ckd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Wonjae Kim, <strong>Sanghyuk Chun</strong>, Taekyung Kim, Dongyoon Han, Sangdoo Yun</li>
<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/kim2024hype.pdf">paper</a> | <a href="https://github.com/naver-ai/hype">code</a> | <a href="media/bibtex/kim2024hype.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Similarity of Neural Architectures using Adversarial Attack Transferability.</strong>
<ul>
<li>Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Jong-Seok Lee<sup>❋</sup></li>
<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/hwang2024sat.pdf">paper</a> | <a href="https://github.com/J-H-Hwang/SAT">code</a> | <a href="media/bibtex/hwang2024sat.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Learning with Unmasked Tokens Drives Stronger Vision Learners.</strong>
<ul>
<li>Taekyung Kim<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Byeongho Heo, Dongyoon Han<sup>❋</sup></li>
<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/kim2024lut.pdf">paper</a> | <a href="media/bibtex/kim2024lut.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">RoCOCO: Robust Benchmark of MS-COCO to Stress-test Robustness of Image-Text Matching Models.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Seulki Park, Daeho Um, Hajung Yoon, <strong>Sanghyuk Chun</strong>, Sangdoo Yun</li>
<li><strong><em>ECCV 2024 Synthetic Data for Computer Vision Workshop (SyntheticData4CV 2024).</em></strong> <a href="media/papers/park2023rococo.pdf">paper</a> | <a href="https://github.com/pseulki/rococo">code</a> | <a href="media/bibtex/park2023rococo.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval.</strong>
<ul>
<li>Jaeseok Byun<sup>❋</sup>, Seokhyeon Jeong<sup>❋</sup>, Wonjae Kim, <strong>Sanghyuk Chun</strong><sup>†</sup>, Taesup Moon<sup>†</sup></li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/byun2024rtd.pdf">paper</a> | <a href="media/bibtex/byun2024rtd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Toward Interactive Regional Understanding in Vision-Large Language Models</strong>
<ul>
<li>Jungbeom Lee, <strong>Sanghyuk Chun<sup>❋</sup></strong>, Sangdoo Yun<sup>❋</sup></li>
<li><strong><em>NAACL 2024.</em></strong> <a href="media/papers/lee2024naacl_region_vlm.pdf">paper</a> | <a href="https://github.com/jbeomlee93/RegionVLM">code</a> | <a href="media/bibtex/lee2024vlm.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Language-only Efficient Training of Zero-shot Composed Image Retrieval.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>CVPR 2024.</em></strong> <a href="media/papers/gu2024cvpr_lincir.pdf">paper</a> | <a href="https://github.com/navervision/lincir">code</a> | <a href="media/bibtex/gu2024lincir.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">What Does Automatic Differentiation Compute for Neural Networks?</strong> <span class="badge bg-info">Spotlight presentation</span>
<ul>
<li>Sejun Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonyeol Lee</li>
<li><strong><em>ICLR 2024.</em></strong> <a href="media/papers/park2024ad_correctness.pdf">paper</a> | <a href="https://github.com/SanghyukChun/ad_correctness">code</a> | <a href="media/bibtex/park2024ad_correctness.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference-workshop">
<strong class="anchor-strong">Improved Probabilistic Image-Text Representations.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong></li>
<li><strong><em>ICLR 2024.</em></strong> <strong><em>ICCV CLVL 2023.</em></strong> <a href="media/papers/chun2024iclr_pcmepp.pdf">paper</a> | <a href="https://github.com/naver-ai/pcmepp/">code</a> | <a href="https://naver-ai.github.io/pcmepp/">project page</a> | <a href="https://docs.google.com/presentation/d/1yelrDSN11rnChAk-gtU2YzSNX49XPHFIgROzj_lRA4Q/edit?usp=sharing">slide</a> | <a href="media/bibtex/chun2023pcmepp.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2023">2023</h4>
<!--<span class="float-left"><strong> 1 conference papers (1 ICLR)</strong></span>-->
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-conference">
<strong class="anchor-strong">SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage.</strong>
<ul>
<li>Song Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Byeongho Heo, Wonjae Kim, Sangdoo Yun</li>
<li><strong><em>ICCV 2023.</em></strong> <a href="media/papers/park2023seit.pdf">paper</a> | <a href="https://github.com/naver-ai/seit">code</a> | <a href="https://docs.google.com/presentation/d/1oHjYjvbC3QINuwR5MQAteik0ifsIxR3LTuCus-BTKwY/edit#slide=id.p">slide</a> | <a href="media/bibtex/park2023seit.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">Three Recipes for Better 3D Pseudo-GTs of 3D Human Mesh Estimation in the Wild.</strong>
<ul>
<li>Gyeongsik Moon, Hongsuk Choi, <strong>Sanghyuk Chun</strong>, Jiyoung Lee, Sangdoo Yun</li>
<li><strong><em>CVPR 2023 Workshop on Computer Vision for Mixed Reality (CV4MR).</em></strong> <a href="media/papers/moon2023pseudo_gt_recipes.pdf">paper</a> | <a href="https://github.com/mks0601/NeuralAnnot_RELEASE">code</a> | <a href="media/bibtex/moon2023pseudo_gt_recipes.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Re-weighting based Group Fairness Regularization via Classwise Robust Optimization.</strong>
<ul>
<li>Sangwon Jung<sup>❋</sup>, Taeeon Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Taesup Moon</li>
<li><strong><em>ICLR 2023.</em></strong> <a href="media/papers/jung2023iclr_fairdro.pdf">paper</a> | <a href="https://github.com/sangwon-jung94/FairDRO">code</a> | <a href="media/bibtex/jung2023iclr_fairdro.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2022">2022</h4>
<!--<span class="float-left"><strong> 7 conference papers (2 ICLR, 1 CVPR, 1 ICML, 2 ECCV, 1 NeurIPS), 2 journal papers (WSOL eval journal, LF-Font journal)</strong></span>-->
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">Group Generalized Mean Pooling for Vision Transformer.</strong>
<ul>
<li>Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, <strong>Sanghyuk Chun</strong>, Geonmo Gu, Wonjae Kim</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/ko2022ggem.pdf">paper</a> | <a href="media/bibtex/ko2022ggem.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective.</strong>
<ul>
<li>Chanwoo Park<sup>❋</sup>, Sangdoo Yun<sup>❋</sup>, <strong>Sanghyuk Chun</strong></li>
<li><strong><em>NeurIPS 2022.</em></strong> <a href="media/papers/park2022msda.pdf">paper</a> | <a href="https://github.com/naver-ai/hmix-gmix">code</a> | <a href="media/bibtex/park2022neurips_msda.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh</li>
<li><strong><em>ECCV 2022.</em></strong> <a href="media/papers/chun2022eccv_caption.pdf">paper</a> | <a href="https://github.com/naver-ai/eccv-caption">code</a> | <a href="https://pypi.org/project/eccv-caption/">pypi</a> | <a href="https://docs.google.com/presentation/d/1zyLL49_2-F6mQFaMIumPfdE7el_r048XtidLnehepHo/edit?usp=sharing">slide (short talk)</a> | <a href="https://docs.google.com/presentation/d/1OKaWPlNblepiXF57oWs2miGgYb5kuu1qxNqV_-hDddU/edit?usp=sharing">slide (long talk)</a> | <a href="media/bibtex/chun2022eccv_caption.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Domain Generalization by Mutual-Information Regularization with Pre-trained Models.</strong>
<ul>
<li>Junbum Cha, Kyungjae Lee, Sungrae Park, <strong>Sanghyuk Chun</strong></li>
<li><strong><em>ECCV 2022.</em></strong> <a href="media/papers/cha2022miro.pdf">paper</a> | <a href="https://github.com/kakaobrain/miro">code</a> | <a href="media/bibtex/cha2022miro.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">An Extendable, Efficient and Effective Transformer-based Object Detector.</strong>
<ul>
<li>Hwanjun Song, Deqing Sun, <strong>Sanghyuk Chun</strong>, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/song2022vidtplus.pdf">paper</a> | <a href="https://github.com/naver-ai/vidt">code</a> | <a href="media/bibtex/song2022vidtplus.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Dataset Condensation with Contrastive Signals.</strong>
<ul>
<li>Saehyung Lee, <strong>Sanghyuk Chun</strong>, Sangwon Jung, Sangdoo Yun, Sungroh Yoon</li>
<li><strong><em>ICML 2022.</em></strong> <a href="media/papers/lee2022icml_dcc.pdf">paper</a> | <a href="media/bibtex/lee2022dcc.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Learning Fair Classifiers with Partially Annotated Group Labels.</strong>
<ul>
<li>Sangwon Jung, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Taesup Moon<sup>❋</sup></li>
<li><strong><em>CVPR 2022.</em></strong> <a href="media/papers/jung2022cvpr_cgl.pdf">paper</a> | <a href="https://github.com/naver-ai/cgl_fairness">code</a> | <a href="media/bibtex/jung2022cgl.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">ViDT: An Efficient and Effective Fully Transformer-based Object Detector.</strong>
<ul>
<li>Hwanjun Song, Deqing Sun, <strong>Sanghyuk Chun</strong>, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang</li>
<li><strong><em>ICLR 2022.</em></strong> <a href="media/papers/song2022iclr_vidt.pdf">paper</a> | <a href="https://github.com/naver-ai/vidt">code</a> | <a href="media/bibtex/song2022vidt.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective.</strong>
<ul>
<li>Luca Scimeca<sup>❋</sup>, Seong Joon Oh<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Michael Poli, Sangdoo Yun</li>
<li><strong><em>ICLR 2022.</em></strong> <a href="media/papers/scimeca2021wcst-ml.pdf">paper</a> | <a href="media/bibtex/scimeca2022wcst-ml.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2021">2021</h4>
<!--<span class="float-left"><strong> 8 conference papers and 1 workshop paper (1 AAAI, 1 ICLR, 2 CVPR, 1 CVPR Workshop, 2 ICCV, 2 NeurIPS)</strong></span>-->
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-conference">
<strong class="anchor-strong">SWAD: Domain Generalization by Seeking Flat Minima.</strong>
<ul>
<li>Junbum Cha, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Kyungjae Lee<sup>❋</sup>, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park</li>
<li><strong><em>NeurIPS 2021.</em></strong> <a href="media/papers/cha2021neurips_swad.pdf">paper</a> | <a href="https://github.com/khanrc/swad">code</a> | <a href="media/bibtex/cha2021neurips_swad.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Neural Hybrid Automata: Learning Dynamics with Multiple Modes and Stochastic Transitions.</strong>
<ul>
<li>Michael Poli<sup>❋</sup>, Stefano Massaroli<sup>❋</sup>, Luca Scimeca, Seong Joon Oh, <strong>Sanghyuk Chun</strong>, Atsushi Yamashita, Hajime Asama, Jinkyoo Park, Animesh Garg</li>
<li><strong><em>NeurIPS 2021.</em></strong> <a href="media/papers/poli2021nha.pdf">paper</a> | <a href="media/bibtex/poli2021nha.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Song Park</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/chun2021styleaugment.pdf">paper</a> | <a href="media/bibtex/chun2021styleaugment.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Rethinking Spatial Dimensions of Vision Transformers.</strong>
<ul>
<li>Byeongho Heo, Sangdoo Yun, Dongyoon Han, <strong>Sanghyuk Chun</strong>, Junsuk Choe, Seong Joon Oh</li>
<li><strong><em>ICCV 2021.</em></strong> <a href="media/papers/heo2021iccv_pit.pdf">paper</a> | <a href="https://github.com/naver-ai/pit">code</a> | <a href="https://twitter.com/SanghyukChun/status/1377125468999049216">tweet</a> | <a href="media/bibtex/heo2021pit.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts.</strong>
<ul>
<li>Song Park, <strong>Sanghyuk Chun</strong>, Junbum Cha, Bado Lee, Hyunjung Shim</li>
<li><strong><em>ICCV 2021.</em></strong> <a href="media/papers/park2021iccv_mxfont.pdf">paper</a> | <a href="https://github.com/clovaai/mxfont">code</a> | <a href="media/bibtex/park2021mxfont.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Probabilistic Embeddings for Cross-Modal Retrieval.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus</li>
<li><strong><em>CVPR 2021.</em></strong> <a href="media/papers/chun2021cvpr_pcme.pdf">paper</a> | <a href="https://github.com/naver-ai/pcme">code</a> | <a href="https://www.youtube.com/watch?v=J_DaqSLEcVk">video</a> | <a href="https://docs.google.com/presentation/d/1Tyac3fRvEGYkmbB9iUELU5jjv8IWo5oU_eQUGOkiCuk/edit?usp=sharing">slide (short talk)</a> | <a href="https://docs.google.com/presentation/d/1dGQUqud3iMld-UgMMlRQuqA7JhzndMfzFoKaFAwZ58I/edit?usp=sharing">slide (long talk)</a> | <a href="media/bibtex/chun2021pcme.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels.</strong>
<ul>
<li>Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, <strong>Sanghyuk Chun</strong></li>
<li><strong><em>CVPR 2021.</em></strong> <a href="media/papers/yun2021cvpr_relabel.pdf">paper</a> | <a href="https://github.com/naver-ai/relabel_imagenet">code</a> | <a href="media/bibtex/yun2021relabel.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights.</strong>
<ul>
<li>Byeongho Heo<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha</li>
<li><strong><em>ICLR 2021.</em></strong> <a href="media/papers/heo2021iclr_adamp.pdf">paper</a> | <a href="https://github.com/ClovaAI/AdamP">code</a> | <a href="https://clovaai.github.io/AdamP/">project page</a> | <a href="https://pypi.org/project/adamp/">pypi</a> | <a href="https://docs.google.com/presentation/d/1s9zgQ22WFnhEj6POL_0ecrTiED__qmL9XgL0Nv3zNP4/edit?usp=sharing">slide</a> | <a href="media/bibtex/heo2021adamp.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference-workshop">
<strong class="anchor-strong">Few-shot Font Generation with Localized Style Representations and Factorization.</strong>
<ul>
<li>Song Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Junbum Cha, Bado Lee, Hyunjung Shim</li>
<li><strong><em>AAAI 2021.</em></strong> <strong><em>CVPR Workshop 2021.</em></strong> <a href="media/papers/park2021aaai_lffont.pdf">paper</a> | <a href="https://github.com/clovaai/lffont">code</a> | <a href="https://cvml.yonsei.ac.kr/projects/few-shot-font-generation">project page</a> | <a href="media/bibtex/park2021lffont.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2020">2020</h4>
<!--<span class="float-left"><strong> 4 conference papers, 1 workshop paper (1 ICASSP, 1 CVPR, 1 ICML, 1 CVPR WS, 1 ECCV)</strong></span>-->
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-conference">
<strong class="anchor-strong">Few-shot Compositional Font Generation with Dual Memory.</strong>
<ul>
<li>Junbum Cha, <strong>Sanghyuk Chun</strong>, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee</li>
<li><strong><em>ECCV 2020.</em></strong> <a href="media/papers/cha2020eccv_dmfont.pdf">paper</a> | <a href="https://github.com/clovaai/dmfont">code</a> | <a href="https://www.youtube.com/watch?v=VMrMJf21XEA">video</a> | <a href="media/bibtex/cha2020dmfont.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Learning De-biased Representations with Biased Representations.</strong>
<ul>
<li>Hyojin Bahng, <strong>Sanghyuk Chun</strong>, Sangdoo Yun, Jaegul Choo, Seong Joon Oh</li>
<li><strong><em>ICML 2020.</em></strong> <a href="media/papers/bahng2020icml_rebias.pdf">paper</a> | <a href="https://github.com/clovaai/rebias">code</a> | <a href="https://twitter.com/SanghyukChun/status/1278357842362155008">tweet</a> | <a href="https://youtu.be/lkjMxZDGubA">video</a> | <a href="https://docs.google.com/presentation/d/1edTv6-gHs-HCF10f2gUCIXiWJ-nSZWg7dzp5q-eKeRk/edit?usp=sharing">slide</a> | <a href="media/bibtex/bahng2020rebias.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">Toward High-quality Few-shot Font Generation with Dual Memory.</strong> <span class="badge bg-danger">Oral presentation</span> <span class="badge bg-warning">The best paper runner-up award</span>
<ul>
<li>Junbum Cha, <strong>Sanghyuk Chun</strong>, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee</li>
<li><strong><em>CVPR Workshop 2020.</em></strong> <a href="media/papers/cha2020cvprws.pdf">paper</a> | <a href="media/bibtex/cha2020cvprw.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Evaluating Weakly Supervised Object Localization Methods Right.</strong>
<ul>
<li>Junsuk Choe<sup>❋</sup>, Seong Joon Oh<sup>❋</sup>, Seongho Lee, <strong>Sanghyuk Chun</strong>, Zeynep Akata, Hyunjung Shim</li>
<li><strong><em>CVPR 2020.</em></strong> <a href="media/papers/choe2020cvpr_wsoleval.pdf">paper</a> | <a href="https://github.com/ClovaAI/wsolevaluation">code and dataset</a> | <a href="https://twitter.com/SanghyukChun/status/1271329234217099264">tweet</a> | <a href="media/slides/CVPR20_slide_evaluating_wsol_methods_right.pdf">slide</a> | <a href="https://www.youtube.com/watch?v=Vy1NcMxUi_Y">video on CVPR</a> | <a href="https://youtu.be/D_dEkeb-fto">video on ECCV tutorial</a> | <a href="media/bibtex/choe2020wsoleval.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Data-driven Harmonic Filters for Audio Representation Learning.</strong>
<ul>
<li>Minz Won, <strong>Sanghyuk Chun</strong>, Oriol Nieto, Xavier Serra</li>
<li><strong><em>ICASSP 2020.</em></strong> <a href="media/papers/won2020icassp_hcnn.pdf">paper</a> | <a href="https://github.com/minzwon/sota-music-tagging-models">code and pretrained models</a> | <a href="https://youtu.be/BXHEYb_Axus">video</a> | <a href="media/bibtex/won2020harmonic.txt">bibtex</a></li>
</ul>
</li>
</ul>
<div class="card-header no-border">
<h4 class="float-left mb-0" id="papers-2019">2019</h4>
<!--<span class="float-left"><strong> 2 conference papers, 4 workshop papers (2 ICCV, 1 ICLR WS, 2 ICML WS, 1 ISMIR LBD)</strong></span>-->
<div class="clearfix"></div>
</div>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">Neural Approximation of Auto-Regressive Process through Confidence Guided Sampling.</strong>
<ul>
<li>YoungJoon Yoo, <strong>Sanghyuk Chun</strong>, Sangdoo Yun, Jung-Woo Ha, Jaejun Yoo</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/yoo2019nara.pdf">paper</a> | <a href="media/bibtex/yoo2019nara.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">Toward Interpretable Music Tagging with Self-attention.</strong>
<ul>
<li>Minz Won, <strong>Sanghyuk Chun</strong>, Xavier Serra</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/won2019mtsa.pdf">paper</a> | <a href="https://github.com/minzwon/sota-music-tagging-models">code and pretrained models</a> | <a href="media/bibtex/won2019attention.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Sangdoo Yun, Dongyoon Han, Seong Joon Oh, <strong>Sanghyuk Chun</strong>, Junsuk Choe, Youngjoon Yoo</li>
<li><strong><em>ICCV 2019.</em></strong> <a href="media/papers/yun2019iccv_cutmix.pdf">paper</a> | <a href="https://github.com/ClovaAI/CutMix-PyTorch">code and pretrained models</a> | <a href="https://clova-ai.blog/2019/07/15/cutmix-regularization-strategy-to-train-strong-classifiers-with-localizable-features/">blog</a> | <a href="media/bibtex/yun2019cutmix.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">Photorealistic Style Transfer via Wavelet Transforms.</strong>
<ul>
<li>Jaejun Yoo<sup>❋</sup>, Youngjung Uh<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Byungkyu Kang, Jung-Woo Ha</li>
<li><strong><em>ICCV 2019.</em></strong> <a href="media/papers/yoo2019iccv_wct2.pdf">paper</a> | <a href="https://github.com/ClovaAI/WCT2">code and model weights</a> | <a href="https://youtu.be/o-AgHt1VA30">video</a> | <a href="https://clova-ai.blog/2019/08/06/photorealistic-style-transfer-via-wavelet-transforms-iccv-2019/">blog</a> | <a href="media/bibtex/yoo2019wct2.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">Automatic Music Tagging with Harmonic CNN.</strong>
<ul>
<li>Minz Won, <strong>Sanghyuk Chun</strong>, Oriol Nieto, Xavier Serra</li>
<li><strong><em>ISMIR LBD 2019.</em></strong> <a href="media/papers/won2019ismirlbd.pdf">paper</a> | <a href="https://github.com/minzwon/fast-harmonic-cnn">code</a> | <a href="media/bibtex/won2019lbd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">An Empirical Evaluation on Robustness and Uncertainty of Regularization methods.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo</li>
<li><strong><em>ICML Workshop 2019.</em></strong> <a href="media/papers/chun2019icmlws.pdf">paper</a> | <a href="media/bibtex/chun2019icmlw.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">Visualizing and Understanding Self-attention based Music Tagging.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Minz Won, <strong>Sanghyuk Chun</strong>, Xavier Serra</li>
<li><strong><em>ICML Workshop 2019.</em></strong> <a href="media/papers/won2019icmlws.pdf">paper</a> | <a href="https://github.com/minzwon/sota-music-tagging-models">code</a> | <a href="https://slideslive.com/38917230/visualizing-and-understanding-selfattention-based-music-tagging">talk video</a> | <a href="media/bibtex/won2019icmlw.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-workshop">
<strong class="anchor-strong">Where To Be Adversarial Perturbations Added? Investigating and Manipulating Pixel Robustness Using Input Gradients.</strong>
<ul>
<li>Jisung Hwang<sup>❋</sup>, Younghoon Kim<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Jaejun Yoo, Ji-Hoon Kim, Dongyoon Han</li>
<li><strong><em>ICLR Workshop 2019.</em></strong> <a href="media/papers/hwang2019iclrws.pdf">paper</a> | <a href="media/bibtex/hwang2019prm.txt">bibtex</a></li>
</ul>
</li>
</ul>
<h4 class="card-header no-border" id="papers-2018">~ 2018</h4>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-arxiv">
<strong class="anchor-strong">Multi-Domain Processing via Hybrid Denoising Networks for Speech Enhancement.</strong>
<ul>
<li>Jang-Hyun Kim<sup>❋</sup>, Jaejun Yoo<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Adrian Kim, Jung-Woo Ha</li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/kim2018mdphd.pdf">paper</a> | <a href="https://mdphdnet.github.io/">project page</a> | <a href="media/bibtex/kim2018mdphd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-conference">
<strong class="anchor-strong">A Study on Intelligent Personalized Push Notification with User History.</strong>
<ul>
<li>Hyunjong Lee, Youngin Jo, <strong>Sanghyuk Chun</strong>, Kwangseob Kim</li>
<li><strong><em>Big Data 2017.</em></strong> <a href="https://ieeexplore.ieee.org/abstract/document/8258081/">paper</a> | <a href="media/bibtex/lee2017ippn.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-others">
<strong class="anchor-strong">Scalable Iterative Algorithm for Robust Subspace Clustering: Convergence and Initialization.</strong>
<ul>
<li>Master's Thesis, Korea Advanced Institute of Science and Technology, 2016 (advised by <a href="http://alinlab.kaist.ac.kr/shin.html">Jinwoo Shin</a>)</li>
<li><a href="media/papers/chun2016scsi.pdf">paper</a> | <a href="http://github.com/SanghyukChun/SC_SI">code</a></li>
</ul>
</li>
</ul>
<h4 class="card-header no-border" id="papers-journal">Journals</h4>
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item li-journal">
<strong class="anchor-strong">FairDRO: Group Fairness Regularization via Classwise Robust Optimization.</strong>
<ul>
<li>Taeeon Park, Sangwon Jung, <strong>Sanghyuk Chun</strong>, Taesup Moon</li>
<li><strong><em>Neural Networks.</em></strong> <a href="https://www.sciencedirect.com/science/article/abs/pii/S0893608024008207">paper</a> | <a href="https://github.com/sangwon-jung94/FairDRO">code</a> | <a href="media/bibtex/park2024fairdro_journal.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-journal">
<strong class="anchor-strong">CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>TMLR.</em></strong> <strong><em>CVPR 2024 SynData4CV Workshop.</em></strong> <a href="media/papers/gu2024compodiff.pdf">paper</a> | <a href="https://github.com/navervision/CompoDiff">code</a> | <a href="https://huggingface.co/datasets/navervision/SynthTriplets18M">SynthTriplets18M dataset</a> | <a href="https://huggingface.co/spaces/navervision/CompoDiff-Aesthetic">demo 🤗</a> | <a href="https://docs.google.com/presentation/d/1VTJlrHqnLAcQP3aHydnlFXNeZpsPMGa9-L-Oaigi_6M/edit?usp=sharing">slide</a> | <a href="media/bibtex/gu2024compodiff.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-journal">
<strong class="anchor-strong">Few-shot Font Generation with Weakly Supervised Localized Representations.</strong>
<ul>
<li>Song Park<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Junbum Cha, Bado Lee, Hyunjung Shim</li>
<li>IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 2022. <strong><span class="text-danger">(IF:24.314)</span></strong></li>
<li><strong><em>PAMI.</em></strong> <a href="media/papers/park2021lffont_extension.pdf">paper</a> | <a href="https://github.com/clovaai/lffont">code (old)</a> | <a href="https://github.com/clovaai/fewshot-font-generation">code (new)</a> | <a href="https://cvml.yonsei.ac.kr/projects/few-shot-font-generation">project page</a> | <a href="media/bibtex/park2021lffont_extension.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item li-journal">
<strong class="anchor-strong">Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets.</strong>
<ul>
<li>Junsuk Choe<sup>❋</sup>, Seong Joon Oh<sup>❋</sup>, <strong>Sanghyuk Chun</strong>, Seungho Lee, Zeynep Akata, Hyunjung Shim</li>
<li>IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 2022. <strong><span class="text-danger">(IF:24.314)</span></strong></li>
<li><strong><em>PAMI.</em></strong> <a href="media/papers/choe2020wsoleval_extension.pdf">paper</a> | <a href="https://github.com/ClovaAI/wsolevaluation">code and dataset</a> | <a href="media/bibtex/choe2020wsoleval_extension.txt">bibtex</a></li>
</ul>
</li>
</ul>
</div>
<div class="tab-pane fade" id="nav-vl" role="tabpanel" aria-labelledby="nav-vl-tab">
<ul class="list-group list-group-flush list-no-border">
<li class="list-group-item">
<strong>Probabilistic Language-Image Pre-Training.</strong>
<ul>
<li><strong>Sanghyuk Chun</strong>, Wonjae Kim, Song Park, Sangdoo Yun</li>
<li><strong><em>ICLR 2025.</em></strong> <a href="media/papers/chun2024prolip.pdf">paper</a> | <a href="https://github.com/naver-ai/prolip/">code</a> | <a href="https://huggingface.co/collections/SanghyukChun/prolip-6712595dfc87fd8597350291">pre-trained models 🤗</a> | <a href="https://docs.google.com/presentation/d/1BEHEphXxdg0TjUsI3Cv8Xr3kLX6sbAlGytDEiN5iW7s/edit?usp=sharing">slide</a> | <a href="media/bibtex/chun2024prolip.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Read, Watch and Scream! Sound Generation from Text and Video.</strong>
<ul>
<li>Yujin Jeong, Yunji Kim, <strong>Sanghyuk Chun</strong>, Jiyoung Lee</li>
<li><strong><em>AAAI 2025 | NeurIPS 2024 Workshop on Video-Language Models.</em></strong> <a href="media/papers/jeong2025aaai_rewas.pdf">paper</a> | <a href="https://naver-ai.github.io/rewas/">project page</a> | <a href="media/bibtex/jeong2025aaai_rewas.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Wonjae Kim, <strong>Sanghyuk Chun</strong>, Taekyung Kim, Dongyoon Han, Sangdoo Yun</li>
<li><strong><em>ECCV 2024.</em></strong> <a href="media/papers/kim2024hype.pdf">paper</a> | <a href="https://github.com/naver-ai/hype">code</a> | <a href="media/bibtex/kim2024hype.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>RoCOCO: Robust Benchmark of MS-COCO to Stress-test Robustness of Image-Text Matching Models.</strong> <span class="badge bg-danger">Oral presentation</span>
<ul>
<li>Seulki Park, Daeho Um, Hajung Yoon, <strong>Sanghyuk Chun</strong>, Sangdoo Yun</li>
<li><strong><em>ECCV 2024 Synthetic Data for Computer Vision Workshop (SyntheticData4CV 2024).</em></strong> <a href="media/papers/park2023rococo.pdf">paper</a> | <a href="https://github.com/pseulki/rococo">code</a> | <a href="media/bibtex/park2023rococo.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval.</strong>
<ul>
<li>Jaeseok Byun<sup>❋</sup>, Seokhyeon Jeong<sup>❋</sup>, Wonjae Kim, <strong>Sanghyuk Chun</strong><sup>†</sup>, Taesup Moon<sup>†</sup></li>
<li><strong><em>preprint.</em></strong> <a href="media/papers/byun2024rtd.pdf">paper</a> | <a href="media/bibtex/byun2024rtd.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>TMLR.</em></strong> <strong><em>CVPR 2024 SynData4CV Workshop.</em></strong> <a href="media/papers/gu2024compodiff.pdf">paper</a> | <a href="https://github.com/navervision/CompoDiff">code</a> | <a href="https://huggingface.co/datasets/navervision/SynthTriplets18M">SynthTriplets18M dataset</a> | <a href="https://huggingface.co/spaces/navervision/CompoDiff-Aesthetic">demo 🤗</a> | <a href="https://docs.google.com/presentation/d/1VTJlrHqnLAcQP3aHydnlFXNeZpsPMGa9-L-Oaigi_6M/edit?usp=sharing">slide</a> | <a href="media/bibtex/gu2024compodiff.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Toward Interactive Regional Understanding in Vision-Large Language Models.</strong>
<ul>
<li>Jungbeom Lee, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Sangdoo Yun<sup>❋</sup></li>
<li><strong><em>NAACL 2024.</em></strong> <a href="media/papers/lee2024naacl_region_vlm.pdf">paper</a> | <a href="https://github.com/jbeomlee93/RegionVLM">code</a> | <a href="media/bibtex/lee2024vlm.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Language-only Efficient Training of Zero-shot Composed Image Retrieval.</strong>
<ul>
<li>Geonmo Gu<sup>❋</sup>, <strong>Sanghyuk Chun</strong><sup>❋</sup>, Wonjae Kim, Yoohoon Kang, Sangdoo Yun</li>
<li><strong><em>CVPR 2024.</em></strong> <a href="media/papers/gu2024cvpr_lincir.pdf">paper</a> | <a href="https://github.com/navervision/lincir">code</a> | <a href="https://huggingface.co/spaces/navervision/LinCIR">demo 🤗</a> | <a href="media/bibtex/gu2024lincir.txt">bibtex</a></li>
</ul>
</li>
<li class="list-group-item">
<strong>Improved Probabilistic Image-Text Representations.</strong>
<ul>