14.5.10.6.1 Patch Based Vision Transformers


Kim, B.[Boah], Kim, J.[Jeongsol], Ye, J.C.[Jong Chul],
Task-Agnostic Vision Transformer for Distributed Learning of Image Processing,
IP(32), 2023, pp. 203-218.
IEEE DOI 2301
Task analysis, Transformers, Servers, Distance learning, Computer aided instruction, Tail, Head, Distributed learning, task-agnostic learning BibRef

Park, S.[Sangjoon], Ye, J.C.[Jong Chul],
Multi-Task Distributed Learning Using Vision Transformer With Random Patch Permutation,
MedImg(42), No. 7, July 2023, pp. 2091-2105.
IEEE DOI 2307
Task analysis, Transformers, Head, Tail, Servers, Multitasking, Distance learning, Federated learning, split learning, privacy preservation BibRef

Kim, B.J.[Bum Jun], Choi, H.[Hyeyeon], Jang, H.[Hyeonah], Lee, D.G.[Dong Gu], Jeong, W.[Wonseok], Kim, S.W.[Sang Woo],
Improved robustness of vision transformers via prelayernorm in patch embedding,
PR(141), 2023, pp. 109659.
Elsevier DOI 2306
Vision transformer, Patch embedding, Contrast enhancement, Robustness, Layer normalization, Convolutional neural network, Deep learning BibRef

Kang, J.Y.[Jun-Yong], Heo, B.[Byeongho], Choe, J.[Junsuk],
Improving ViT interpretability with patch-level mask prediction,
PRL(187), 2025, pp. 73-79.
Elsevier DOI 2501
Vision Transformer, Interpretability, Weak supervision, Object localization BibRef

Yu, Q.[Qing], Tanaka, M.[Mikihiro], Fujiwara, K.[Kent],
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches,
CVPR24(937-946)
IEEE DOI 2410
Training, Solid modeling, Computational modeling, Transfer learning, Transformers, Motion-Language Models, Text-Motion Retrieval BibRef

Guo, Y.[Yong], Stutz, D.[David], Schiele, B.[Bernt],
Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions,
CVPR23(4108-4118)
IEEE DOI 2309
BibRef

Nalmpantis, A.[Angelos], Panagiotopoulos, A.[Apostolos], Gkountouras, J.[John], Papakostas, K.[Konstantinos], Aziz, W.[Wilker],
Vision DiffMask: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking,
XAI4CV23(3756-3763)
IEEE DOI 2309
BibRef

Beyer, L.[Lucas], Izmailov, P.[Pavel], Kolesnikov, A.[Alexander], Caron, M.[Mathilde], Kornblith, S.[Simon], Zhai, X.H.[Xiao-Hua], Minderer, M.[Matthias], Tschannen, M.[Michael], Alabdulmohsin, I.[Ibrahim], Pavetic, F.[Filip],
FlexiViT: One Model for All Patch Sizes,
CVPR23(14496-14506)
IEEE DOI 2309
BibRef

Chang, S.N.[Shu-Ning], Wang, P.[Pichao], Lin, M.[Ming], Wang, F.[Fan], Zhang, D.J.H.[David Jun-Hao], Jin, R.[Rong], Shou, M.Z.[Mike Zheng],
Making Vision Transformers Efficient from A Token Sparsification View,
CVPR23(6195-6205)
IEEE DOI 2309
BibRef

Phan, L.[Lam], Nguyen, H.T.H.[Hiep Thi Hong], Warrier, H.[Harikrishna], Gupta, Y.[Yogesh],
Patch Embedding as Local Features: Unifying Deep Local and Global Features via Vision Transformer for Image Retrieval,
ACCV22(II:204-221).
Springer DOI 2307
BibRef

Liu, Y.[Yue], Matsoukas, C.[Christos], Strand, F.[Fredrik], Azizpour, H.[Hossein], Smith, K.[Kevin],
PatchDropout: Economizing Vision Transformers Using Patch Dropout,
WACV23(3942-3951)
IEEE DOI 2302
Training, Image resolution, Computational modeling, Biological system modeling, Memory management, Transformers, Biomedical/healthcare/medicine BibRef

Gu, J.D.[Jin-Dong], Tresp, V.[Volker], Qin, Y.[Yao],
Are Vision Transformers Robust to Patch Perturbations?,
ECCV22(XII:404-421).
Springer DOI 2211
BibRef

Li, Z.K.[Zhi-Kai], Ma, L.P.[Li-Ping], Chen, M.J.[Meng-Juan], Xiao, J.R.[Jun-Rui], Gu, Q.Y.[Qing-Yi],
Patch Similarity Aware Data-Free Quantization for Vision Transformers,
ECCV22(XI:154-170).
Springer DOI 2211
BibRef

Yun, S.[Sukmin], Lee, H.[Hankook], Kim, J.[Jaehyung], Shin, J.[Jinwoo],
Patch-level Representation Learning for Self-supervised Vision Transformers,
CVPR22(8344-8353)
IEEE DOI 2210
Training, Representation learning, Visualization, Neural networks, Object detection, Self-supervised learning, Transformers, Self- semi- meta- unsupervised learning BibRef

Salman, H.[Hadi], Jain, S.[Saachi], Wong, E.[Eric], Madry, A.[Aleksander],
Certified Patch Robustness via Smoothed Vision Transformers,
CVPR22(15116-15126)
IEEE DOI 2210
Visualization, Smoothing methods, Costs, Computational modeling, Transformers, Adversarial attack and defense BibRef

Tang, Y.[Yehui], Han, K.[Kai], Wang, Y.H.[Yun-He], Xu, C.[Chang], Guo, J.Y.[Jian-Yuan], Xu, C.[Chao], Tao, D.C.[Da-Cheng],
Patch Slimming for Efficient Vision Transformers,
CVPR22(12155-12164)
IEEE DOI 2210
Visualization, Quantization (signal), Computational modeling, Aggregates, Benchmark testing, Representation learning BibRef

Chen, Z.Y.[Zhao-Yu], Li, B.[Bo], Wu, S.[Shuang], Xu, J.H.[Jiang-He], Ding, S.H.[Shou-Hong], Zhang, W.Q.[Wen-Qiang],
Shape Matters: Deformable Patch Attack,
ECCV22(IV:529-548).
Springer DOI 2211
BibRef

Chen, Z.Y.[Zhao-Yu], Li, B.[Bo], Xu, J.H.[Jiang-He], Wu, S.[Shuang], Ding, S.H.[Shou-Hong], Zhang, W.Q.[Wen-Qiang],
Towards Practical Certifiable Patch Defense with Vision Transformer,
CVPR22(15127-15137)
IEEE DOI 2210
Smoothing methods, Toy manufacturing industry, Semantics, Network architecture, Transformers, Robustness, Adversarial attack and defense BibRef

Chapter on Pattern Recognition, Clustering, Statistics, Grammars, Learning, Neural Nets, Genetic Algorithms continues in
Attention in Vision Transformers.

Last update: Jan 20, 2025 at 11:36:25