Bazi, Y.[Yakoub],
Bashmal, L.[Laila],
Al Rahhal, M.M.[Mohamad M.],
Al Dayil, R.[Reham],
Al Ajlan, N.[Naif],
Vision Transformers for Remote Sensing Image Classification,
RS(13), No. 3, 2021, pp. xx-yy.
DOI Link
2102
BibRef
Li, T.[Tao],
Zhang, Z.[Zheng],
Pei, L.[Lishen],
Gan, Y.[Yan],
HashFormer: Vision Transformer Based Deep Hashing for Image Retrieval,
SPLetters(29), 2022, pp. 827-831.
IEEE DOI
2204
Transformers, Binary codes, Task analysis, Training, Image retrieval,
Feature extraction, Databases, Binary embedding, image retrieval
BibRef
Jiang, B.[Bo],
Zhao, K.K.[Kang-Kang],
Tang, J.[Jin],
RGTransformer: Region-Graph Transformer for Image Representation and
Few-Shot Classification,
SPLetters(29), 2022, pp. 792-796.
IEEE DOI
2204
Measurement, Transformers, Image representation,
Feature extraction, Visualization, transformer
BibRef
Chen, Z.M.[Zhao-Min],
Cui, Q.[Quan],
Zhao, B.[Borui],
Song, R.J.[Ren-Jie],
Zhang, X.Q.[Xiao-Qin],
Yoshie, O.[Osamu],
SST: Spatial and Semantic Transformers for Multi-Label Image
Recognition,
IP(31), 2022, pp. 2570-2583.
IEEE DOI
2204
Correlation, Semantics, Transformers, Image recognition,
Task analysis, Training, Feature extraction, label correlation
BibRef
Wang, G.H.[Guang-Hui],
Li, B.[Bin],
Zhang, T.[Tao],
Zhang, S.[Shubi],
A Network Combining a Transformer and a Convolutional Neural Network
for Remote Sensing Image Change Detection,
RS(14), No. 9, 2022, pp. xx-yy.
DOI Link
2205
BibRef
Luo, G.[Gen],
Zhou, Y.[Yiyi],
Sun, X.S.[Xiao-Shuai],
Wang, Y.[Yan],
Cao, L.J.[Liu-Juan],
Wu, Y.J.[Yong-Jian],
Huang, F.Y.[Fei-Yue],
Ji, R.R.[Rong-Rong],
Towards Lightweight Transformer Via Group-Wise Transformation for
Vision-and-Language Tasks,
IP(31), 2022, pp. 3386-3398.
IEEE DOI
2205
Transformers, Task analysis, Computational modeling,
Benchmark testing, Visualization, Convolution, Head,
reference expression comprehension
BibRef
Wang, J.Y.[Jia-Yun],
Chakraborty, R.[Rudrasis],
Yu, S.X.[Stella X.],
Transformer for 3D Point Clouds,
PAMI(44), No. 8, August 2022, pp. 4419-4431.
IEEE DOI
2207
Convolution, Feature extraction, Shape, Semantics, Task analysis,
Measurement, point cloud, transformation, deformable, segmentation, 3D detection
BibRef
Li, Z.K.[Ze-Kun],
Liu, Y.F.[Yu-Fan],
Li, B.[Bing],
Feng, B.L.[Bai-Lan],
Wu, K.[Kebin],
Peng, C.W.[Cheng-Wei],
Hu, W.M.[Wei-Ming],
SDTP: Semantic-Aware Decoupled Transformer Pyramid for Dense Image
Prediction,
CirSysVideo(32), No. 9, September 2022, pp. 6160-6173.
IEEE DOI
2209
Transformers, Semantics, Task analysis, Detectors,
Image segmentation, Head, Convolution, Transformer, dense prediction,
multi-level interaction
BibRef
Wu, J.J.[Jia-Jing],
Wei, Z.Q.[Zhi-Qiang],
Zhang, J.P.[Jin-Peng],
Zhang, Y.S.[Yu-Shi],
Jia, D.N.[Dong-Ning],
Yin, B.[Bo],
Yu, Y.C.[Yun-Chao],
Full-Coupled Convolutional Transformer for Surface-Based Duct
Refractivity Inversion,
RS(14), No. 17, 2022, pp. xx-yy.
DOI Link
2209
BibRef
Jiang, K.[Kai],
Peng, P.[Peng],
Lian, Y.[Youzao],
Xu, W.S.[Wei-Sheng],
The encoding method of position embeddings in vision transformer,
JVCIR(89), 2022, pp. 103664.
Elsevier DOI
2212
Vision transformer, Position embeddings, Gabor filters
BibRef
Han, K.[Kai],
Wang, Y.H.[Yun-He],
Chen, H.T.[Han-Ting],
Chen, X.H.[Xing-Hao],
Guo, J.Y.[Jian-Yuan],
Liu, Z.H.[Zhen-Hua],
Tang, Y.[Yehui],
Xiao, A.[An],
Xu, C.J.[Chun-Jing],
Xu, Y.X.[Yi-Xing],
Yang, Z.H.[Zhao-Hui],
Zhang, Y.[Yiman],
Tao, D.C.[Da-Cheng],
A Survey on Vision Transformer,
PAMI(45), No. 1, January 2023, pp. 87-110.
IEEE DOI
2212
Survey, Vision Transformer. Transformers, Task analysis, Encoding, Computational modeling,
Visualization, Object detection, high-level vision,
video
BibRef
Hou, Q.[Qibin],
Jiang, Z.H.[Zi-Hang],
Yuan, L.[Li],
Cheng, M.M.[Ming-Ming],
Yan, S.C.[Shui-Cheng],
Feng, J.S.[Jia-Shi],
Vision Permutator:
A Permutable MLP-Like Architecture for Visual Recognition,
PAMI(45), No. 1, January 2023, pp. 1328-1334.
IEEE DOI
2212
Transformers, Encoding, Visualization, Convolutional codes, Mixers,
Computer architecture, Training data, Vision permutator, deep neural network
BibRef
Yu, W.H.[Wei-Hao],
Si, C.Y.[Chen-Yang],
Zhou, P.[Pan],
Luo, M.[Mi],
Zhou, Y.C.[Yi-Chen],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
Wang, X.C.[Xin-Chao],
MetaFormer Baselines for Vision,
PAMI(46), No. 2, February 2024, pp. 896-912.
IEEE DOI
2401
BibRef
And: A1, A4, A3, A2, A5, A8, A6, A7:
MetaFormer is Actually What You Need for Vision,
CVPR22(10809-10819)
IEEE DOI
2210
The abstracted architecture of Transformer.
Computational modeling, Focusing,
Transformers, Pattern recognition, Task analysis, retrieval
BibRef
Zhou, D.[Daquan],
Hou, Q.[Qibin],
Yang, L.J.[Lin-Jie],
Jin, X.J.[Xiao-Jie],
Feng, J.S.[Jia-Shi],
Token Selection is a Simple Booster for Vision Transformers,
PAMI(45), No. 11, November 2023, pp. 12738-12746.
IEEE DOI
2310
BibRef
Yuan, L.[Li],
Hou, Q.[Qibin],
Jiang, Z.H.[Zi-Hang],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
VOLO: Vision Outlooker for Visual Recognition,
PAMI(45), No. 5, May 2023, pp. 6575-6586.
IEEE DOI
2304
Transformers, Computer architecture, Computational modeling,
Training, Data models, Task analysis, Visualization,
image classification
BibRef
Ren, S.[Sucheng],
Zhou, D.[Daquan],
He, S.F.[Sheng-Feng],
Feng, J.S.[Jia-Shi],
Wang, X.C.[Xin-Chao],
Shunted Self-Attention via Multi-Scale Token Aggregation,
CVPR22(10843-10852)
IEEE DOI
2210
Degradation, Deep learning, Costs, Computational modeling, Merging,
Efficient learning and inferences
BibRef
Wu, Y.H.[Yu-Huan],
Liu, Y.[Yun],
Zhan, X.[Xin],
Cheng, M.M.[Ming-Ming],
P2T: Pyramid Pooling Transformer for Scene Understanding,
PAMI(45), No. 11, November 2023, pp. 12760-12771.
IEEE DOI
2310
BibRef
Li, Y.[Yehao],
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Mei, T.[Tao],
Contextual Transformer Networks for Visual Recognition,
PAMI(45), No. 2, February 2023, pp. 1489-1500.
IEEE DOI
2301
Transformers, Convolution, Visualization, Task analysis,
Image recognition, Object detection, Transformer, image recognition
BibRef
Wang, H.[Hang],
Du, Y.[Youtian],
Zhang, Y.[Yabin],
Li, S.[Shuai],
Zhang, L.[Lei],
One-Stage Visual Relationship Referring With Transformers and
Adaptive Message Passing,
IP(32), 2023, pp. 190-202.
IEEE DOI
2301
Visualization, Proposals, Transformers, Task analysis, Detectors,
Message passing, Predictive models, gated message passing
BibRef
Kim, B.[Boah],
Kim, J.[Jeongsol],
Ye, J.C.[Jong Chul],
Task-Agnostic Vision Transformer for Distributed Learning of Image
Processing,
IP(32), 2023, pp. 203-218.
IEEE DOI
2301
Task analysis, Transformers, Servers, Distance learning,
Computer aided instruction, Tail, Head, Distributed learning,
task-agnostic learning
BibRef
Park, S.[Sangjoon],
Ye, J.C.[Jong Chul],
Multi-Task Distributed Learning Using Vision Transformer With Random
Patch Permutation,
MedImg(42), No. 7, July 2023, pp. 2091-2105.
IEEE DOI
2307
Task analysis, Transformers, Head, Tail, Servers, Multitasking,
Distance learning, Federated learning, split learning,
privacy preservation
BibRef
Kiya, H.[Hitoshi],
Iijima, R.[Ryota],
Maungmaung, A.[Aprilpyone],
Kinoshita, Y.[Yuma],
Image and Model Transformation with Secret Key for Vision Transformer,
IEICE(E106-D), No. 1, January 2023, pp. 2-11.
WWW Link.
2301
BibRef
Zhang, H.F.[Hao-Fei],
Mao, F.[Feng],
Xue, M.Q.[Meng-Qi],
Fang, G.F.[Gong-Fan],
Feng, Z.L.[Zun-Lei],
Song, J.[Jie],
Song, M.L.[Ming-Li],
Knowledge Amalgamation for Object Detection With Transformers,
IP(32), 2023, pp. 2093-2106.
IEEE DOI
2304
Transformers, Task analysis, Object detection, Detectors, Training,
Feature extraction, Model reusing, vision transformers
BibRef
Li, Y.[Ying],
Chen, K.[Kehan],
Sun, S.L.[Shi-Lei],
He, C.[Chu],
Multi-scale homography estimation based on dual feature aggregation
transformer,
IET-IPR(17), No. 5, 2023, pp. 1403-1416.
DOI Link
2304
image matching, image registration
BibRef
Wang, G.Q.[Guan-Qun],
Chen, H.[He],
Chen, L.[Liang],
Zhuang, Y.[Yin],
Zhang, S.H.[Shang-Hang],
Zhang, T.[Tong],
Dong, H.[Hao],
Gao, P.[Peng],
P2FEViT: Plug-and-Play CNN Feature Embedded Hybrid Vision Transformer
for Remote Sensing Image Classification,
RS(15), No. 7, 2023, pp. 1773.
DOI Link
2304
BibRef
Zhang, Q.M.[Qi-Ming],
Xu, Y.F.[Yu-Fei],
Zhang, J.[Jing],
Tao, D.C.[Da-Cheng],
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for
Image Recognition and Beyond,
IJCV(131), No. 5, May 2023, pp. 1141-1162.
Springer DOI
2305
BibRef
Zhang, J.N.[Jiang-Ning],
Li, X.T.[Xiang-Tai],
Wang, Y.B.[Ya-Biao],
Wang, C.J.[Cheng-Jie],
Yang, Y.B.[Yi-Bo],
Liu, Y.[Yong],
Tao, D.C.[Da-Cheng],
EATFormer: Improving Vision Transformer Inspired by Evolutionary
Algorithm,
IJCV(132), No. 1, January 2024, pp. 3509-3536.
Springer DOI
2409
BibRef
Fan, X.Y.[Xin-Yi],
Liu, H.J.[Hua-Jun],
FlexFormer: Flexible Transformer for efficient visual recognition,
PRL(169), 2023, pp. 95-101.
Elsevier DOI
2305
Vision transformer, Frequency analysis, Image classification
BibRef
Cho, S.[Seokju],
Hong, S.[Sunghwan],
Kim, S.[Seungryong],
CATs++: Boosting Cost Aggregation With Convolutions and Transformers,
PAMI(45), No. 6, June 2023, pp. 7174-7194.
IEEE DOI
WWW Link.
2305
Costs, Transformers, Correlation, Semantics, Feature extraction,
Task analysis, Cost aggregation, efficient transformer,
semantic visual correspondence
BibRef
Kim, B.J.[Bum Jun],
Choi, H.[Hyeyeon],
Jang, H.[Hyeonah],
Lee, D.G.[Dong Gu],
Jeong, W.[Wonseok],
Kim, S.W.[Sang Woo],
Improved robustness of vision transformers via prelayernorm in patch
embedding,
PR(141), 2023, pp. 109659.
Elsevier DOI
2306
Vision transformer, Patch embedding, Contrast enhancement,
Robustness, Layer normalization, Convolutional neural network, Deep learning
BibRef
Wang, Z.W.[Zi-Wei],
Wang, C.Y.[Chang-Yuan],
Xu, X.W.[Xiu-Wei],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Quantformer: Learning Extremely Low-Precision Vision Transformers,
PAMI(45), No. 7, July 2023, pp. 8813-8826.
IEEE DOI
2306
Quantization (signal), Transformers, Computational modeling,
Search problems, Object detection, Image color analysis,
vision transformers
BibRef
Sun, S.Y.[Shu-Yang],
Yue, X.Y.[Xiao-Yu],
Zhao, H.S.[Heng-Shuang],
Torr, P.H.S.[Philip H.S.],
Bai, S.[Song],
Patch-Based Separable Transformer for Visual Recognition,
PAMI(45), No. 7, July 2023, pp. 9241-9247.
IEEE DOI
2306
Task analysis, Current transformers, Visualization,
Feature extraction, Convolutional neural networks,
instance segmentation
BibRef
Yue, X.Y.[Xiao-Yu],
Sun, S.Y.[Shu-Yang],
Kuang, Z.H.[Zhang-Hui],
Wei, M.[Meng],
Torr, P.H.S.[Philip H.S.],
Zhang, W.[Wayne],
Lin, D.[Dahua],
Vision Transformer with Progressive Sampling,
ICCV21(377-386)
IEEE DOI
2203
Codes, Computational modeling, Interference,
Transformers, Feature extraction, Recognition and classification,
Representation learning
BibRef
Peng, Z.L.[Zhi-Liang],
Guo, Z.H.[Zong-Hao],
Huang, W.[Wei],
Wang, Y.W.[Yao-Wei],
Xie, L.X.[Ling-Xi],
Jiao, J.B.[Jian-Bin],
Tian, Q.[Qi],
Ye, Q.X.[Qi-Xiang],
Conformer: Local Features Coupling Global Representations for
Recognition and Detection,
PAMI(45), No. 8, August 2023, pp. 9454-9468.
IEEE DOI
2307
Transformers, Feature extraction, Couplings, Visualization,
Detectors, Convolution, Object detection, Feature fusion,
vision transformer
BibRef
Peng, Z.L.[Zhi-Liang],
Huang, W.[Wei],
Gu, S.Z.[Shan-Zhi],
Xie, L.X.[Ling-Xi],
Wang, Y.[Yaowei],
Jiao, J.B.[Jian-Bin],
Ye, Q.X.[Qi-Xiang],
Conformer: Local Features Coupling Global Representations for Visual
Recognition,
ICCV21(357-366)
IEEE DOI
2203
Couplings, Representation learning, Visualization, Fuses,
Convolution, Object detection, Transformers,
Representation learning
BibRef
Feng, Z.Z.[Zhan-Zhou],
Zhang, S.L.[Shi-Liang],
Efficient Vision Transformer via Token Merger,
IP(32), 2023, pp. 4156-4169.
IEEE DOI
2307
Corporate acquisitions, Transformers, Semantics, Task analysis,
Visualization, Merging, Computational efficiency, sparse representation
BibRef
Yang, J.H.[Jia-Hao],
Li, X.Y.[Xiang-Yang],
Zheng, M.[Mao],
Wang, Z.H.[Zi-Han],
Zhu, Y.Q.[Yong-Qing],
Guo, X.Q.[Xiao-Qian],
Yuan, Y.C.[Yu-Chen],
Chai, Z.[Zifeng],
Jiang, S.Q.[Shu-Qiang],
MemBridge: Video-Language Pre-Training With Memory-Augmented
Inter-Modality Bridge,
IP(32), 2023, pp. 4073-4087.
IEEE DOI
2307
WWW Link.
Bridges, Transformers, Computer architecture, Task analysis,
Visualization, Feature extraction, Memory modules, memory module
BibRef
Wang, D.L.[Duo-Lin],
Chen, Y.[Yadang],
Naz, B.[Bushra],
Sun, L.[Le],
Li, B.Z.[Bao-Zhu],
Spatial-Aware Transformer (SAT): Enhancing Global Modeling in
Transformer Segmentation for Remote Sensing Images,
RS(15), No. 14, 2023, pp. 3607.
DOI Link
2307
BibRef
Huang, X.Y.[Xin-Yan],
Liu, F.[Fang],
Cui, Y.H.[Yuan-Hao],
Chen, P.[Puhua],
Li, L.L.[Ling-Ling],
Li, P.F.[Peng-Fang],
Faster and Better: A Lightweight Transformer Network for Remote
Sensing Scene Classification,
RS(15), No. 14, 2023, pp. 3645.
DOI Link
2307
BibRef
Yao, T.[Ting],
Li, Y.[Yehao],
Pan, Y.W.[Ying-Wei],
Wang, Y.[Yu],
Zhang, X.P.[Xiao-Ping],
Mei, T.[Tao],
Dual Vision Transformer,
PAMI(45), No. 9, September 2023, pp. 10870-10882.
IEEE DOI
2309
Survey, Vision Transformer.
BibRef
Rao, Y.M.[Yong-Ming],
Liu, Z.[Zuyan],
Zhao, W.L.[Wen-Liang],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Dynamic Spatial Sparsification for Efficient Vision Transformers and
Convolutional Neural Networks,
PAMI(45), No. 9, September 2023, pp. 10883-10897.
IEEE DOI
2309
BibRef
Li, J.[Jie],
Liu, Z.[Zhao],
Li, L.[Li],
Lin, J.Q.[Jun-Qin],
Yao, J.[Jian],
Tu, J.[Jingmin],
Multi-view convolutional vision transformer for 3D object recognition,
JVCIR(95), 2023, pp. 103906.
Elsevier DOI
2309
Multi-view, 3D object recognition, Feature fusion, Convolutional neural networks
BibRef
Shang, J.H.[Jing-Huan],
Li, X.[Xiang],
Kahatapitiya, K.[Kumara],
Lee, Y.C.[Yu-Cheol],
Ryoo, M.S.[Michael S.],
StARformer: Transformer With State-Action-Reward Representations for
Robot Learning,
PAMI(45), No. 11, November 2023, pp. 12862-12877.
IEEE DOI
2310
BibRef
Earlier: A1, A3, A2, A5, Only:
StARformer: Transformer with State-Action-Reward Representations for
Visual Reinforcement Learning,
ECCV22(XXIX:462-479).
Springer DOI
2211
BibRef
Duan, H.R.[Hao-Ran],
Long, Y.[Yang],
Wang, S.D.[Shi-Dong],
Zhang, H.F.[Hao-Feng],
Willcocks, C.G.[Chris G.],
Shao, L.[Ling],
Dynamic Unary Convolution in Transformers,
PAMI(45), No. 11, November 2023, pp. 12747-12759.
IEEE DOI
2310
BibRef
Chen, S.M.[Shi-Ming],
Hong, Z.M.[Zi-Ming],
Hou, W.J.[Wen-Jin],
Xie, G.S.[Guo-Sen],
Song, Y.B.[Yi-Bing],
Zhao, J.[Jian],
You, X.G.[Xin-Ge],
Yan, S.C.[Shui-Cheng],
Shao, L.[Ling],
TransZero++:
Cross Attribute-Guided Transformer for Zero-Shot Learning,
PAMI(45), No. 11, November 2023, pp. 12844-12861.
IEEE DOI
2310
BibRef
Qian, S.J.[Sheng-Ju],
Zhu, Y.[Yi],
Li, W.B.[Wen-Bo],
Li, M.[Mu],
Jia, J.Y.[Jia-Ya],
What Makes for Good Tokenizers in Vision Transformer?,
PAMI(45), No. 11, November 2023, pp. 13011-13023.
IEEE DOI
2310
BibRef
Sun, W.X.[Wei-Xuan],
Qin, Z.[Zhen],
Deng, H.[Hui],
Wang, J.[Jianyuan],
Zhang, Y.[Yi],
Zhang, K.[Kaihao],
Barnes, N.[Nick],
Birchfield, S.[Stan],
Kong, L.P.[Ling-Peng],
Zhong, Y.[Yiran],
Vicinity Vision Transformer,
PAMI(45), No. 10, October 2023, pp. 12635-12649.
IEEE DOI
2310
BibRef
Cao, C.J.[Chen-Jie],
Dong, Q.[Qiaole],
Fu, Y.W.[Yan-Wei],
ZITS++: Image Inpainting by Improving the Incremental Transformer on
Structural Priors,
PAMI(45), No. 10, October 2023, pp. 12667-12684.
IEEE DOI
2310
BibRef
Fang, Y.X.[Yu-Xin],
Wang, X.G.[Xing-Gang],
Wu, R.[Rui],
Liu, W.Y.[Wen-Yu],
What Makes for Hierarchical Vision Transformer?,
PAMI(45), No. 10, October 2023, pp. 12714-12720.
IEEE DOI
2310
BibRef
Xu, P.[Peng],
Zhu, X.T.[Xia-Tian],
Clifton, D.A.[David A.],
Multimodal Learning With Transformers: A Survey,
PAMI(45), No. 10, October 2023, pp. 12113-12132.
IEEE DOI
2310
BibRef
Liu, J.[Jun],
Guo, H.R.[Hao-Ran],
He, Y.[Yile],
Li, H.L.[Hua-Li],
Vision Transformer-Based Ensemble Learning for Hyperspectral Image
Classification,
RS(15), No. 21, 2023, pp. 5208.
DOI Link
2311
BibRef
Lin, M.B.[Ming-Bao],
Chen, M.Z.[Meng-Zhao],
Zhang, Y.X.[Yu-Xin],
Shen, C.H.[Chun-Hua],
Ji, R.R.[Rong-Rong],
Cao, L.J.[Liu-Juan],
Super Vision Transformer,
IJCV(131), No. 12, December 2023, pp. 3136-3151.
Springer DOI
2311
BibRef
Li, Z.Y.[Zhong-Yu],
Gao, S.H.[Shang-Hua],
Cheng, M.M.[Ming-Ming],
SERE: Exploring Feature Self-Relation for Self-Supervised Transformer,
PAMI(45), No. 12, December 2023, pp. 15619-15631.
IEEE DOI
2311
BibRef
Yuan, Y.H.[Yu-Hui],
Liang, W.C.[Wei-Cong],
Ding, H.H.[Heng-Hui],
Liang, Z.H.[Zhan-Hao],
Zhang, C.[Chao],
Hu, H.[Han],
Expediting Large-Scale Vision Transformer for Dense Prediction
Without Fine-Tuning,
PAMI(46), No. 1, January 2024, pp. 250-266.
IEEE DOI
2312
BibRef
Jiao, J.[Jiayu],
Tang, Y.M.[Yu-Ming],
Lin, K.Y.[Kun-Yu],
Gao, Y.P.[Yi-Peng],
Ma, A.J.[Andy J.],
Wang, Y.W.[Yao-Wei],
Zheng, W.S.[Wei-Shi],
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition,
MultMed(25), 2023, pp. 8906-8919.
IEEE DOI Code:
HTML Version.
2312
BibRef
Li, Z.H.[Zi-Han],
Li, Y.X.[Yun-Xiang],
Li, Q.D.[Qing-De],
Wang, P.[Puyang],
Guo, D.[Dazhou],
Lu, L.[Le],
Jin, D.[Dakai],
Zhang, Y.[You],
Hong, Q.Q.[Qing-Qi],
LViT: Language Meets Vision Transformer in Medical Image Segmentation,
MedImg(43), No. 1, January 2024, pp. 96-107.
IEEE DOI Code:
WWW Link.
2401
BibRef
Fu, K.[Kexue],
Yuan, M.Z.[Ming-Zhi],
Liu, S.L.[Shao-Lei],
Wang, M.[Manning],
Boosting Point-BERT by Multi-Choice Tokens,
CirSysVideo(34), No. 1, January 2024, pp. 438-447.
IEEE DOI
2401
self-supervised pre-training task.
See also Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling.
BibRef
Ghosal, S.S.[Soumya Suvra],
Li, Y.X.[Yi-Xuan],
Are Vision Transformers Robust to Spurious Correlations?,
IJCV(132), No. 3, March 2024, pp. 689-709.
Springer DOI
2402
BibRef
Yan, F.Y.[Fang-Yuan],
Yan, B.[Bin],
Liang, W.[Wei],
Pei, M.T.[Ming-Tao],
Token labeling-guided multi-scale medical image classification,
PRL(178), 2024, pp. 28-34.
Elsevier DOI
2402
Medical image classification, Vision transformer, Token labeling
BibRef
Li, Y.X.[Yue-Xiang],
Huang, Y.W.[Ya-Wen],
He, N.[Nanjun],
Ma, K.[Kai],
Zheng, Y.F.[Ye-Feng],
Improving vision transformer for medical image classification via
token-wise perturbation,
JVCIR(98), 2024, pp. 104022.
Elsevier DOI
2402
Self-supervised learning, Vision transformer, Image classification
BibRef
Nguyen, H.[Hung],
Kim, C.[Chanho],
Li, F.[Fuxin],
Space-time recurrent memory network,
CVIU(241), 2024, pp. 103943.
Elsevier DOI
2403
Deep learning architectures and techniques, Segmentation,
Memory network, Transformer
BibRef
Kheldouni, A.[Amine],
Boumhidi, J.[Jaouad],
A Study of Bidirectional Encoder Representations from Transformers
for Sequential Recommendations,
ISCV22(1-5)
IEEE DOI
2208
Knowledge engineering, Recurrent neural networks,
Predictive models, Markov processes
BibRef
Chen, Z.[Ziyi],
Bai, C.Y.[Chen-Yao],
Zhu, Y.L.[Yun-Long],
Lu, X.W.[Xi-Wen],
TUT: Template-Augmented U-Net Transformer for Unsupervised Anomaly
Detection,
SPLetters(31), 2024, pp. 780-784.
IEEE DOI
2404
Image reconstruction, Decoding, Convolution, Vectors, Anomaly detection,
Head, Self-supervised learning, unsupervised learning
BibRef
Xiao, Q.[Qiao],
Zhang, Y.[Yu],
Yang, Q.[Qiang],
Selective Random Walk for Transfer Learning in Heterogeneous Label
Spaces,
PAMI(46), No. 6, June 2024, pp. 4476-4488.
IEEE DOI
2405
Transfer learning, Bridges, Metalearning, Adaptation models,
Training, Task analysis, Transfer learning, selective random walk
BibRef
Zhang, J.S.[Jin-Song],
Gu, L.F.[Ling-Feng],
Lai, Y.K.[Yu-Kun],
Wang, X.Y.[Xue-Yang],
Li, K.[Kun],
Toward Grouping in Large Scenes With Occlusion-Aware Spatio-Temporal
Transformers,
CirSysVideo(34), No. 5, May 2024, pp. 3919-3929.
IEEE DOI
2405
Feature extraction, Trajectory, Transformers, Task analysis,
Data mining, Video sequences, Group detection, large-scale scenes,
spatio-temporal transformers
BibRef
Akkaya, I.B.[Ibrahim Batuhan],
Kathiresan, S.S.[Senthilkumar S.],
Arani, E.[Elahe],
Zonooz, B.[Bahram],
Enhancing performance of vision transformers on small datasets
through local inductive bias incorporation,
PR(153), 2024, pp. 110510.
Elsevier DOI Code:
WWW Link.
2405
Vision transformer, Inductive bias, Locality, Small dataset
BibRef
Yao, T.[Ting],
Li, Y.[Yehao],
Pan, Y.W.[Ying-Wei],
Mei, T.[Tao],
HIRI-ViT: Scaling Vision Transformer With High Resolution Inputs,
PAMI(46), No. 9, September 2024, pp. 6431-6442.
IEEE DOI
2408
Transformers, Convolution, Convolutional neural networks,
Computational efficiency, Spatial resolution, Visualization, vision transformer
BibRef
Lu, J.C.[Jia-Chen],
Zhang, J.G.[Jun-Ge],
Zhu, X.T.[Xia-Tian],
Feng, J.F.[Jian-Feng],
Xiang, T.[Tao],
Zhang, L.[Li],
Softmax-Free Linear Transformers,
IJCV(132), No. 8, August 2024, pp. 3355-3374.
Springer DOI Code:
WWW Link.
2408
Approximates the self-attention by a linear function.
BibRef
Xu, G.Y.[Guang-Yi],
Ye, J.Y.[Jun-Yong],
Liu, X.Y.[Xin-Yuan],
Wen, X.B.[Xu-Bin],
Li, Y.[Youwei],
Wang, J.J.[Jing-Jing],
Lv-Adapter: Adapting Vision Transformers for Visual Classification
with Linear-layers and Vectors,
CVIU(246), 2024, pp. 104049.
Elsevier DOI
2408
Deep learning, Vision Transformers, Fine-tuning, Plug and play,
Transfer learning
BibRef
Li, C.H.[Cheng-Hao],
Zhang, C.N.[Chao-Ning],
Toward a deeper understanding: RetNet viewed through Convolution,
PR(155), 2024, pp. 110625.
Elsevier DOI Code:
WWW Link.
2408
Boosts the local response of ViT.
Convolutional neural network, Vision transformer, RetNet
BibRef
Yan, L.Q.[Long-Quan],
Yan, R.X.[Rui-Xiang],
Chai, B.[Bosong],
Geng, G.H.[Guo-Hua],
Zhou, P.[Pengbo],
Gao, J.[Jian],
DM-GAN: CNN hybrid vits for training GANs under limited data,
PR(156), 2024, pp. 110810.
Elsevier DOI
2408
GAN, Few-shot, Vision transformer, Proprietary artifact image
BibRef
Feng, Q.H.[Qi-Hua],
Li, P.Y.[Pei-Ya],
Lu, Z.X.[Zhi-Xun],
Li, C.Z.[Chao-Zhuo],
Wang, Z.[Zefan],
Liu, Z.Q.[Zhi-Quan],
Duan, C.H.[Chun-Hui],
Huang, F.[Feiran],
Weng, J.[Jian],
Yu, P.S.[Philip S.],
EViT: Privacy-Preserving Image Retrieval via Encrypted Vision
Transformer in Cloud Computing,
CirSysVideo(34), No. 8, August 2024, pp. 7467-7483.
IEEE DOI Code:
WWW Link.
2408
Feature extraction, Encryption, Codes, Cloud computing, Transform coding,
Streaming media, Ciphers, Image retrieval, self-supervised learning
BibRef
Liao, H.X.[Hui-Xian],
Li, X.[Xiaosen],
Qin, X.[Xiao],
Wang, W.J.[Wen-Ji],
He, G.[Guodui],
Huang, H.J.[Hao-Jie],
Guo, X.[Xu],
Chun, X.[Xin],
Zhang, J.[Jinyong],
Fu, Y.Q.[Yun-Qin],
Qin, Z.Y.[Zheng-You],
EPSViTs: A hybrid architecture for image classification based on
parameter-shared multi-head self-attention,
IVC(149), 2024, pp. 105130.
Elsevier DOI
2408
Image classification, Multi-head self-attention,
Parameter-shared, Hybrid architecture
BibRef
Yang, F.F.[Fei-Fan],
Chen, G.[Gang],
Duan, J.S.[Jian-Shu],
Skip-Encoder and Skip-Decoder for Detection Transformer in Optical
Remote Sensing,
RS(16), No. 16, 2024, pp. 2884.
DOI Link
2408
BibRef
Naeem, M.F.[Muhammad Ferjad],
Xian, Y.Q.[Yong-Qin],
Van Gool, L.J.[Luc J.],
Tombari, F.[Federico],
I2DFormer+: Learning Image to Document Summary Attention for Zero-Shot
Image Classification,
IJCV(132), No. 1, January 2024, pp. 3806-3822.
Springer DOI
2409
BibRef
Naeem, M.F.[Muhammad Ferjad],
Khan, M.G.Z.A.[Muhammad Gul Zain Ali],
Xian, Y.Q.[Yong-Qin],
Afzal, M.Z.[Muhammad Zeshan],
Stricker, D.[Didier],
Van Gool, L.J.[Luc J.],
Tombari, F.[Federico],
I2MVFormer: Large Language Model Generated Multi-View Document
Supervision for Zero-Shot Image Classification,
CVPR23(15169-15179)
IEEE DOI
2309
BibRef
Kim, S.[Sunpil],
Yoon, G.J.[Gang-Joon],
Song, J.[Jinjoo],
Yoon, S.M.[Sang Min],
Simultaneous image patch attention and pruning for patch selective
transformer,
IVC(150), 2024, pp. 105239.
Elsevier DOI
2409
Patch pruning, Patch emphasis, Attentive patch selection, Vision transformer
BibRef
Wang, H.Y.[Hong-Yu],
Ma, S.M.[Shu-Ming],
Dong, L.[Li],
Huang, S.[Shaohan],
Zhang, D.D.[Dong-Dong],
Wei, F.[Furu],
DeepNet: Scaling Transformers to 1,000 Layers,
PAMI(46), No. 10, October 2024, pp. 6761-6774.
IEEE DOI
2409
Transformers, Training, Optimization, Stability analysis,
Machine translation, Decoding, Computational modeling, Big models,
transformers
BibRef
Herzig, R.[Roei],
Abramovich, O.[Ofir],
Ben Avraham, E.[Elad],
Arbelle, A.[Assaf],
Karlinsky, L.[Leonid],
Shamir, A.[Ariel],
Darrell, T.J.[Trevor J.],
Globerson, A.[Amir],
PromptonomyViT: Multi-Task Prompt Learning Improves Video
Transformers using Synthetic Scene Data,
WACV24(6789-6801)
IEEE DOI Code:
WWW Link.
2404
Graphics, Solid modeling, Annotations, Transformers, Multitasking,
Task analysis, Algorithms, Video recognition and understanding,
Image recognition and understanding
BibRef
Marouf, I.E.[Imad Eddine],
Tartaglione, E.[Enzo],
Lathuilière, S.[Stéphane],
Mini but Mighty: Finetuning ViTs with Mini Adapters,
WACV24(1721-1730)
IEEE DOI
2404
Training, Costs, Neurons, Transfer learning, Estimation,
Computer architecture, Algorithms
BibRef
Kim, G.[Gihyun],
Kim, J.[Juyeop],
Lee, J.S.[Jong-Seok],
Exploring Adversarial Robustness of Vision Transformers in the
Spectral Perspective,
WACV24(3964-3973)
IEEE DOI
2404
Deep learning, Perturbation methods, Frequency-domain analysis,
Linearity, Transformers, Robustness, High frequency, Algorithms,
adversarial attack and defense methods
BibRef
Xu, X.[Xuwei],
Wang, S.[Sen],
Chen, Y.D.[Yu-Dong],
Zheng, Y.P.[Yan-Ping],
Wei, Z.W.[Zhe-Wei],
Liu, J.J.[Jia-Jun],
GTP-ViT: Efficient Vision Transformers via Graph-based Token
Propagation,
WACV24(86-95)
IEEE DOI Code:
WWW Link.
2404
Source coding, Computational modeling, Merging, Broadcasting,
Transformers, Computational complexity, Algorithms
BibRef
Han, Q.[Qiu],
Zhang, G.J.[Gong-Jie],
Huang, J.X.[Jia-Xing],
Gao, P.[Peng],
Wei, Z.[Zhang],
Lu, S.J.[Shi-Jian],
Efficient MAE towards Large-Scale Vision Transformers,
WACV24(595-604)
IEEE DOI
2404
Measurement, Degradation, Visualization, Runtime,
Computational modeling, Transformers, Algorithms
BibRef
Park, J.W.[Jong-Woo],
Kahatapitiya, K.[Kumara],
Kim, D.H.[Dong-Hyun],
Sudalairaj, S.[Shivchander],
Fan, Q.F.[Quan-Fu],
Ryoo, M.S.[Michael S.],
Grafting Vision Transformers,
WACV24(1134-1143)
IEEE DOI Code:
WWW Link.
2404
Codes, Computational modeling, Semantics, Information sharing,
Computer architecture, Transformers, Algorithms,
Image recognition and understanding
BibRef
Shimizu, S.[Shuki],
Tamaki, T.[Toru],
Joint learning of images and videos with a single Vision Transformer,
MVA23(1-6)
DOI Link
2403
Training, Image recognition, Machine vision, Transformers, Tuning, Videos
BibRef
Li, K.C.[Kun-Chang],
Wang, Y.[Yali],
Li, Y.Z.[Yi-Zhuo],
Wang, Y.[Yi],
He, Y.[Yinan],
Wang, L.M.[Li-Min],
Qiao, Y.[Yu],
Unmasked Teacher: Towards Training-Efficient Video Foundation Models,
ICCV23(19891-19903)
IEEE DOI
2401
BibRef
Ding, S.R.[Shuang-Rui],
Zhao, P.S.[Pei-Sen],
Zhang, X.P.[Xiao-Peng],
Qian, R.[Rui],
Xiong, H.K.[Hong-Kai],
Tian, Q.[Qi],
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation,
ICCV23(16899-16910)
IEEE DOI Code:
WWW Link.
2401
BibRef
Chen, M.Z.[Meng-Zhao],
Lin, M.[Mingbao],
Lin, Z.H.[Zhi-Hang],
Zhang, Y.X.[Yu-Xin],
Chao, F.[Fei],
Ji, R.R.[Rong-Rong],
SMMix: Self-Motivated Image Mixing for Vision Transformers,
ICCV23(17214-17224)
IEEE DOI Code:
WWW Link.
2401
BibRef
Kim, D.[Dahun],
Angelova, A.[Anelia],
Kuo, W.C.[Wei-Cheng],
Contrastive Feature Masking Open-Vocabulary Vision Transformer,
ICCV23(15556-15566)
IEEE DOI
2401
BibRef
Zhang, Y.[Yuke],
Chen, D.[Dake],
Kundu, S.[Souvik],
Li, C.H.[Cheng-Hao],
Beerel, P.A.[Peter A.],
SAL-ViT: Towards Latency Efficient Private Inference on ViT using
Selective Attention Search with a Learnable Softmax Approximation,
ICCV23(5093-5102)
IEEE DOI
2401
BibRef
Li, Z.K.[Zhi-Kai],
Gu, Q.Y.[Qing-Yi],
I-ViT: Integer-only Quantization for Efficient Vision Transformer
Inference,
ICCV23(17019-17029)
IEEE DOI Code:
WWW Link.
2401
BibRef
Frumkin, N.[Natalia],
Gope, D.[Dibakar],
Marculescu, D.[Diana],
Jumping through Local Minima: Quantization in the Loss Landscape of
Vision Transformers,
ICCV23(16932-16942)
IEEE DOI Code:
WWW Link.
2401
BibRef
Li, Z.K.[Zhi-Kai],
Xiao, J.R.[Jun-Rui],
Yang, L.W.[Lian-Wei],
Gu, Q.Y.[Qing-Yi],
RepQ-ViT: Scale Reparameterization for Post-Training Quantization of
Vision Transformers,
ICCV23(17181-17190)
IEEE DOI Code:
WWW Link.
2401
BibRef
Havtorn, J.D.[Jakob Drachmann],
Royer, A.[Amélie],
Blankevoort, T.[Tijmen],
Bejnordi, B.E.[Babak Ehteshami],
MSViT: Dynamic Mixed-scale Tokenization for Vision Transformers,
NIVT23(838-848)
IEEE DOI
2401
BibRef
Haurum, J.B.[Joakim Bruslund],
Escalera, S.[Sergio],
Taylor, G.W.[Graham W.],
Moeslund, T.B.[Thomas B.],
Which Tokens to Use? Investigating Token Reduction in Vision
Transformers,
NIVT23(773-783)
IEEE DOI Code:
WWW Link.
2401
BibRef
Wang, X.[Xijun],
Chu, X.J.[Xiao-Jie],
Han, C.[Chunrui],
Zhang, X.Y.[Xiang-Yu],
SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs
and Transformers,
NIVT23(731-741)
IEEE DOI
2401
BibRef
Chen, Y.H.[Yi-Hsin],
Weng, Y.C.[Ying-Chieh],
Kao, C.H.[Chia-Hao],
Chien, C.[Cheng],
Chiu, W.C.[Wei-Chen],
Peng, W.H.[Wen-Hsiao],
TransTIC: Transferring Transformer-based Image Compression from Human
Perception to Machine Perception,
ICCV23(23240-23250)
IEEE DOI
2401
BibRef
Li, Y.[Yanyu],
Hu, J.[Ju],
Wen, Y.[Yang],
Evangelidis, G.[Georgios],
Salahi, K.[Kamyar],
Wang, Y.Z.[Yan-Zhi],
Tulyakov, S.[Sergey],
Ren, J.[Jian],
Rethinking Vision Transformers for MobileNet Size and Speed,
ICCV23(16843-16854)
IEEE DOI
2401
BibRef
Nurgazin, M.[Maxat],
Tu, N.A.[Nguyen Anh],
A Comparative Study of Vision Transformer Encoders and Few-shot
Learning for Medical Image Classification,
CVAMD23(2505-2513)
IEEE DOI
2401
BibRef
Yeganeh, Y.[Yousef],
Farshad, A.[Azade],
Weinberger, P.[Peter],
Ahmadi, S.A.[Seyed-Ahmad],
Adeli, E.[Ehsan],
Navab, N.[Nassir],
Transformers Pay Attention to Convolutions Leveraging Emerging
Properties of ViTs by Dual Attention-Image Network,
CVAMD23(2296-2307)
IEEE DOI
2401
BibRef
Zheng, J.H.[Jia-Hao],
Yang, L.Q.[Long-Qi],
Li, Y.[Yiying],
Yang, K.[Ke],
Wang, Z.Y.[Zhi-Yuan],
Zhou, J.[Jun],
Lightweight Vision Transformer with Spatial and Channel Enhanced
Self-Attention,
REDLCV23(1484-1488)
IEEE DOI
2401
BibRef
Xie, W.[Wei],
Zhao, Z.[Zimeng],
Li, S.Y.[Shi-Ying],
Zuo, B.H.[Bing-Hui],
Wang, Y.G.[Yan-Gang],
Nonrigid Object Contact Estimation With Regional Unwrapping
Transformer,
ICCV23(9308-9317)
IEEE DOI
2401
BibRef
Vasu, P.K.A.[Pavan Kumar Anasosalu],
Gabriel, J.[James],
Zhu, J.[Jeff],
Tuzel, O.[Oncel],
Ranjan, A.[Anurag],
FastViT: A Fast Hybrid Vision Transformer using Structural
Reparameterization,
ICCV23(5762-5772)
IEEE DOI Code:
WWW Link.
2401
BibRef
Hyeon-Woo, N.[Nam],
Yu-Ji, K.[Kim],
Heo, B.[Byeongho],
Han, D.Y.[Dong-Yoon],
Oh, S.J.[Seong Joon],
Oh, T.H.[Tae-Hyun],
Scratching Visual Transformer's Back with Uniform Attention,
ICCV23(5784-5795)
IEEE DOI
2401
BibRef
Tang, C.[Chen],
Zhang, L.L.[Li Lyna],
Jiang, H.Q.[Hui-Qiang],
Xu, J.H.[Jia-Hang],
Cao, T.[Ting],
Zhang, Q.[Quanlu],
Yang, Y.Q.[Yu-Qing],
Wang, Z.[Zhi],
Yang, M.[Mao],
ElasticViT: Conflict-aware Supernet Training for Deploying Fast
Vision Transformer on Diverse Mobile Devices,
ICCV23(5806-5817)
IEEE DOI
2401
BibRef
Ren, S.[Sucheng],
Yang, X.Y.[Xing-Yi],
Liu, S.[Songhua],
Wang, X.C.[Xin-Chao],
SG-Former: Self-guided Transformer with Evolving Token Reallocation,
ICCV23(5980-5991)
IEEE DOI Code:
WWW Link.
2401
BibRef
Lin, W.F.[Wei-Feng],
Wu, Z.H.[Zi-Heng],
Chen, J.[Jiayu],
Huang, J.[Jun],
Jin, L.W.[Lian-Wen],
Scale-Aware Modulation Meet Transformer,
ICCV23(5992-6003)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zhang, H.K.[Hao-Kui],
Hu, W.Z.[Wen-Ze],
Wang, X.Y.[Xiao-Yu],
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer,
ICCV23(6037-6046)
IEEE DOI Code:
WWW Link.
2401
BibRef
He, Y.F.[Ye-Fei],
Lou, Z.Y.[Zhen-Yu],
Zhang, L.[Luoming],
Liu, J.[Jing],
Wu, W.J.[Wei-Jia],
Zhou, H.[Hong],
Zhuang, B.[Bohan],
BiViT: Extremely Compressed Binary Vision Transformers,
ICCV23(5628-5640)
IEEE DOI
2401
BibRef
Dutson, M.[Matthew],
Li, Y.[Yin],
Gupta, M.[Mohit],
Eventful Transformers:
Leveraging Temporal Redundancy in Vision Transformers,
ICCV23(16865-16877)
IEEE DOI
2401
BibRef
Wang, Z.Q.[Zi-Qing],
Fang, Y.T.[Yue-Tong],
Cao, J.H.[Jia-Hang],
Zhang, Q.[Qiang],
Wang, Z.[Zhongrui],
Xu, R.[Renjing],
Masked Spiking Transformer,
ICCV23(1761-1771)
IEEE DOI Code:
WWW Link.
2401
BibRef
Peebles, W.[William],
Xie, S.[Saining],
Scalable Diffusion Models with Transformers,
ICCV23(4172-4182)
IEEE DOI
2401
BibRef
Zeng, W.X.[Wen-Xuan],
Li, M.[Meng],
Xiong, W.J.[Wen-Jie],
Tong, T.[Tong],
Lu, W.J.[Wen-Jie],
Tan, J.[Jin],
Wang, R.S.[Run-Sheng],
Huang, R.[Ru],
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision
Transformer with Heterogeneous Attention,
ICCV23(5029-5040)
IEEE DOI Code:
WWW Link.
2401
BibRef
Mentzer, F.[Fabian],
Agustsson, E.[Eirikur],
Tschannen, M.[Michael],
M2T: Masking Transformers Twice for Faster Decoding,
ICCV23(5317-5326)
IEEE DOI
2401
BibRef
Psomas, B.[Bill],
Kakogeorgiou, I.[Ioannis],
Karantzalos, K.[Konstantinos],
Avrithis, Y.[Yannis],
Keep It SimPool: Who Said Supervised Transformers Suffer from
Attention Deficit?,
ICCV23(5327-5337)
IEEE DOI Code:
WWW Link.
2401
BibRef
Xiao, H.[Han],
Zheng, W.Z.[Wen-Zhao],
Zhu, Z.[Zheng],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Token-Label Alignment for Vision Transformers,
ICCV23(5472-5481)
IEEE DOI Code:
WWW Link.
2401
BibRef
Yu, R.Y.[Run-Yi],
Wang, Z.N.[Zhen-Nan],
Wang, Y.H.[Yin-Huai],
Li, K.[Kehan],
Liu, C.[Chang],
Duan, H.[Haoyi],
Ji, X.Y.[Xiang-Yang],
Chen, J.[Jie],
LaPE: Layer-adaptive Position Embedding for Vision Transformers with
Independent Layer Normalization,
ICCV23(5863-5873)
IEEE DOI
2401
BibRef
Roy, A.[Anurag],
Verma, V.K.[Vinay K.],
Voonna, S.[Sravan],
Ghosh, K.[Kripabandhu],
Ghosh, S.[Saptarshi],
Das, A.[Abir],
Exemplar-Free Continual Transformer with Convolutions,
ICCV23(5874-5884)
IEEE DOI
2401
BibRef
Xu, Y.X.[Yi-Xing],
Li, C.[Chao],
Li, D.[Dong],
Sheng, X.[Xiao],
Jiang, F.[Fan],
Tian, L.[Lu],
Sirasao, A.[Ashish],
FDViT: Improve the Hierarchical Architecture of Vision Transformer,
ICCV23(5927-5937)
IEEE DOI
2401
BibRef
Han, D.C.[Dong-Chen],
Pan, X.[Xuran],
Han, Y.Z.[Yi-Zeng],
Song, S.[Shiji],
Huang, G.[Gao],
FLatten Transformer: Vision Transformer using Focused Linear
Attention,
ICCV23(5938-5948)
IEEE DOI Code:
WWW Link.
2401
BibRef
Chen, Y.J.[Yong-Jie],
Liu, H.M.[Hong-Min],
Yin, H.R.[Hao-Ran],
Fan, B.[Bin],
Building Vision Transformers with Hierarchy Aware Feature Aggregation,
ICCV23(5885-5895)
IEEE DOI
2401
BibRef
Quétu, V.[Victor],
Milovanovic, M.[Marta],
Tartaglione, E.[Enzo],
Sparse Double Descent in Vision Transformers: Real or Phantom Threat?,
CIAP23(II:490-502).
Springer DOI
2312
BibRef
Ak, K.E.[Kenan Emir],
Lee, G.G.[Gwang-Gook],
Xu, Y.[Yan],
Shen, M.W.[Ming-Wei],
Leveraging Efficient Training and Feature Fusion in Transformers for
Multimodal Classification,
ICIP23(1420-1424)
IEEE DOI
2312
BibRef
Popovic, N.[Nikola],
Paudel, D.P.[Danda Pani],
Probst, T.[Thomas],
Van Gool, L.J.[Luc J.],
Token-Consistent Dropout For Calibrated Vision Transformers,
ICIP23(1030-1034)
IEEE DOI
2312
BibRef
Sajjadi, M.S.M.[Mehdi S. M.],
Mahendran, A.[Aravindh],
Kipf, T.[Thomas],
Pot, E.[Etienne],
Duckworth, D.[Daniel],
Lucic, M.[Mario],
Greff, K.[Klaus],
RUST: Latent Neural Scene Representations from Unposed Imagery,
CVPR23(17297-17306)
IEEE DOI
2309
BibRef
Bowman, B.[Benjamin],
Achille, A.[Alessandro],
Zancato, L.[Luca],
Trager, M.[Matthew],
Perera, P.[Pramuditha],
Paolini, G.[Giovanni],
Soatto, S.[Stefano],
À-la-carte Prompt Tuning (APT):
Combining Distinct Data Via Composable Prompting,
CVPR23(14984-14993)
IEEE DOI
2309
BibRef
Nakhli, R.[Ramin],
Moghadam, P.A.[Puria Azadi],
Mi, H.Y.[Hao-Yang],
Farahani, H.[Hossein],
Baras, A.[Alexander],
Gilks, B.[Blake],
Bashashati, A.[Ali],
Sparse Multi-Modal Graph Transformer with Shared-Context Processing
for Representation Learning of Giga-pixel Images,
CVPR23(11547-11557)
IEEE DOI
2309
BibRef
Gärtner, E.[Erik],
Metz, L.[Luke],
Andriluka, M.[Mykhaylo],
Freeman, C.D.[C. Daniel],
Sminchisescu, C.[Cristian],
Transformer-Based Learned Optimization,
CVPR23(11970-11979)
IEEE DOI
2309
BibRef
Li, J.C.[Jia-Chen],
Hassani, A.[Ali],
Walton, S.[Steven],
Shi, H.[Humphrey],
ConvMLP: Hierarchical Convolutional MLPs for Vision,
WFM23(6307-6316)
IEEE DOI
2309
multi-layer perceptron
BibRef
Walmer, M.[Matthew],
Suri, S.[Saksham],
Gupta, K.[Kamal],
Shrivastava, A.[Abhinav],
Teaching Matters:
Investigating the Role of Supervision in Vision Transformers,
CVPR23(7486-7496)
IEEE DOI
2309
BibRef
Wang, S.G.[Shi-Guang],
Xie, T.[Tao],
Cheng, J.[Jian],
Zhang, X.C.[Xing-Cheng],
Liu, H.J.[Hai-Jun],
MDL-NAS: A Joint Multi-domain Learning Framework for Vision
Transformer,
CVPR23(20094-20104)
IEEE DOI
2309
BibRef
Ko, D.[Dohwan],
Choi, J.[Joonmyung],
Choi, H.K.[Hyeong Kyu],
On, K.W.[Kyoung-Woon],
Roh, B.[Byungseok],
Kim, H.W.J.[Hyun-Woo J.],
MELTR: Meta Loss Transformer for Learning to Fine-tune Video
Foundation Models,
CVPR23(20105-20115)
IEEE DOI
2309
BibRef
Ren, S.[Sucheng],
Wei, F.Y.[Fang-Yun],
Zhang, Z.[Zheng],
Hu, H.[Han],
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models,
CVPR23(3687-3697)
IEEE DOI
2309
BibRef
He, J.F.[Jian-Feng],
Gao, Y.[Yuan],
Zhang, T.Z.[Tian-Zhu],
Zhang, Z.[Zhe],
Wu, F.[Feng],
D2Former: Jointly Learning Hierarchical Detectors and Contextual
Descriptors via Agent-Based Transformers,
CVPR23(2904-2914)
IEEE DOI
2309
BibRef
Chen, X.Y.[Xuan-Yao],
Liu, Z.J.[Zhi-Jian],
Tang, H.T.[Hao-Tian],
Yi, L.[Li],
Zhao, H.[Hang],
Han, S.[Song],
SparseViT: Revisiting Activation Sparsity for Efficient
High-Resolution Vision Transformer,
CVPR23(2061-2070)
IEEE DOI
2309
BibRef
Wei, S.Y.[Si-Yuan],
Ye, T.Z.[Tian-Zhu],
Zhang, S.[Shen],
Tang, Y.[Yao],
Liang, J.J.[Jia-Jun],
Joint Token Pruning and Squeezing Towards More Aggressive Compression
of Vision Transformers,
CVPR23(2092-2101)
IEEE DOI
2309
BibRef
Lin, Y.B.[Yan-Bo],
Sung, Y.L.[Yi-Lin],
Lei, J.[Jie],
Bansal, M.[Mohit],
Bertasius, G.[Gedas],
Vision Transformers are Parameter-Efficient Audio-Visual Learners,
CVPR23(2299-2309)
IEEE DOI
2309
BibRef
Das, R.[Rajshekhar],
Dukler, Y.[Yonatan],
Ravichandran, A.[Avinash],
Swaminathan, A.[Ashwin],
Learning Expressive Prompting With Residuals for Vision Transformers,
CVPR23(3366-3377)
IEEE DOI
2309
BibRef
Zheng, M.X.[Meng-Xin],
Lou, Q.[Qian],
Jiang, L.[Lei],
TrojViT: Trojan Insertion in Vision Transformers,
CVPR23(4025-4034)
IEEE DOI
2309
BibRef
Guo, Y.[Yong],
Stutz, D.[David],
Schiele, B.[Bernt],
Improving Robustness of Vision Transformers by Reducing Sensitivity
to Patch Corruptions,
CVPR23(4108-4118)
IEEE DOI
2309
BibRef
Li, Y.X.[Yan-Xi],
Xu, C.[Chang],
Trade-off between Robustness and Accuracy of Vision Transformers,
CVPR23(7558-7568)
IEEE DOI
2309
BibRef
Tarasiou, M.[Michail],
Chavez, E.[Erik],
Zafeiriou, S.[Stefanos],
ViTs for SITS: Vision Transformers for Satellite Image Time Series,
CVPR23(10418-10428)
IEEE DOI
2309
BibRef
Yu, Z.Z.[Zhong-Zhi],
Wu, S.[Shang],
Fu, Y.G.[Yong-Gan],
Zhang, S.[Shunyao],
Lin, Y.Y.C.[Ying-Yan Celine],
Hint-Aug: Drawing Hints from Foundation Vision Transformers towards
Boosted Few-shot Parameter-Efficient Tuning,
CVPR23(11102-11112)
IEEE DOI
2309
BibRef
Kim, D.[Dahun],
Angelova, A.[Anelia],
Kuo, W.C.[Wei-Cheng],
Region-Aware Pretraining for Open-Vocabulary Object Detection with
Vision Transformers,
CVPR23(11144-11154)
IEEE DOI
2309
BibRef
Hou, J.[Ji],
Dai, X.L.[Xiao-Liang],
He, Z.J.[Zi-Jian],
Dai, A.[Angela],
Nießner, M.[Matthias],
Mask3D: Pretraining 2D Vision Transformers by Learning Masked 3D
Priors,
CVPR23(13510-13519)
IEEE DOI
2309
BibRef
Xu, Z.Z.[Zheng-Zhuo],
Liu, R.K.[Rui-Kang],
Yang, S.[Shuo],
Chai, Z.H.[Zeng-Hao],
Yuan, C.[Chun],
Learning Imbalanced Data with Vision Transformers,
CVPR23(15793-15803)
IEEE DOI
2309
BibRef
Zhang, J.P.[Jian-Ping],
Huang, Y.Z.[Yi-Zhan],
Wu, W.B.[Wei-Bin],
Lyu, M.R.[Michael R.],
Transferable Adversarial Attacks on Vision Transformers with Token
Gradient Regularization,
CVPR23(16415-16424)
IEEE DOI
2309
BibRef
Yang, H.[Huanrui],
Yin, H.X.[Hong-Xu],
Shen, M.[Maying],
Molchanov, P.[Pavlo],
Li, H.[Hai],
Kautz, J.[Jan],
Global Vision Transformer Pruning with Hessian-Aware Saliency,
CVPR23(18547-18557)
IEEE DOI
2309
BibRef
Nakamura, R.[Ryo],
Kataoka, H.[Hirokatsu],
Takashima, S.[Sora],
Noriega, E.J.M.[Edgar Josafat Martinez],
Yokota, R.[Rio],
Inoue, N.[Nakamasa],
Pre-training Vision Transformers with Very Limited Synthesized Images,
ICCV23(20303-20312)
IEEE DOI
2401
BibRef
Takashima, S.[Sora],
Hayamizu, R.[Ryo],
Inoue, N.[Nakamasa],
Kataoka, H.[Hirokatsu],
Yokota, R.[Rio],
Visual Atoms: Pre-Training Vision Transformers with Sinusoidal Waves,
CVPR23(18579-18588)
IEEE DOI
2309
BibRef
Kang, D.[Dahyun],
Koniusz, P.[Piotr],
Cho, M.[Minsu],
Murray, N.[Naila],
Distilling Self-Supervised Vision Transformers for Weakly-Supervised
Few-Shot Classification and Segmentation,
CVPR23(19627-19638)
IEEE DOI
2309
BibRef
Liu, Y.J.[Yi-Jiang],
Yang, H.R.[Huan-Rui],
Dong, Z.[Zhen],
Keutzer, K.[Kurt],
Du, L.[Li],
Zhang, S.H.[Shang-Hang],
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization
for Vision Transformers,
CVPR23(20321-20330)
IEEE DOI
2309
BibRef
Park, J.[Jeongsoo],
Johnson, J.[Justin],
RGB No More: Minimally-Decoded JPEG Vision Transformers,
CVPR23(22334-22346)
IEEE DOI
2309
BibRef
Yu, C.[Chong],
Chen, T.[Tao],
Gan, Z.X.[Zhong-Xue],
Fan, J.Y.[Jia-Yuan],
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization,
CVPR23(22658-22668)
IEEE DOI
2309
BibRef
Bao, F.[Fan],
Nie, S.[Shen],
Xue, K.W.[Kai-Wen],
Cao, Y.[Yue],
Li, C.X.[Chong-Xuan],
Su, H.[Hang],
Zhu, J.[Jun],
All are Worth Words: A ViT Backbone for Diffusion Models,
CVPR23(22669-22679)
IEEE DOI
2309
BibRef
Li, B.[Bonan],
Hu, Y.[Yinhan],
Nie, X.C.[Xue-Cheng],
Han, C.Y.[Cong-Ying],
Jiang, X.J.[Xiang-Jian],
Guo, T.D.[Tian-De],
Liu, L.Q.[Luo-Qi],
DropKey for Vision Transformer,
CVPR23(22700-22709)
IEEE DOI
2309
BibRef
Lan, S.Y.[Shi-Yi],
Yang, X.[Xitong],
Yu, Z.[Zhiding],
Wu, Z.[Zuxuan],
Alvarez, J.M.[Jose M.],
Anandkumar, A.[Anima],
Vision Transformers are Good Mask Auto-Labelers,
CVPR23(23745-23755)
IEEE DOI
2309
BibRef
Yu, L.[Lu],
Xiang, W.[Wei],
X-Pruner: eXplainable Pruning for Vision Transformers,
CVPR23(24355-24363)
IEEE DOI
2309
BibRef
Singh, A.[Apoorv],
Training Strategies for Vision Transformers for Object Detection,
WAD23(110-118)
IEEE DOI
2309
BibRef
Hukkelås, H.[Håkon],
Lindseth, F.[Frank],
Does Image Anonymization Impact Computer Vision Training?,
WAD23(140-150)
IEEE DOI
2309
BibRef
Marnissi, M.A.[Mohamed Amine],
Revolutionizing Thermal Imaging: GAN-Based Vision Transformers for
Image Enhancement,
ICIP23(2735-2739)
IEEE DOI
2312
BibRef
Marnissi, M.A.[Mohamed Amine],
Fathallah, A.[Abir],
GAN-based Vision Transformer for High-Quality Thermal Image
Enhancement,
GCV23(817-825)
IEEE DOI
2309
BibRef
Scheibenreif, L.[Linus],
Mommert, M.[Michael],
Borth, D.[Damian],
Masked Vision Transformers for Hyperspectral Image Classification,
EarthVision23(2166-2176)
IEEE DOI
2309
BibRef
Komorowski, P.[Piotr],
Baniecki, H.[Hubert],
Biecek, P.[Przemyslaw],
Towards Evaluating Explanations of Vision Transformers for Medical
Imaging,
XAI4CV23(3726-3732)
IEEE DOI
2309
BibRef
Nalmpantis, A.[Angelos],
Panagiotopoulos, A.[Apostolos],
Gkountouras, J.[John],
Papakostas, K.[Konstantinos],
Aziz, W.[Wilker],
Vision DiffMask: Faithful Interpretation of Vision Transformers with
Differentiable Patch Masking,
XAI4CV23(3756-3763)
IEEE DOI
2309
BibRef
Ronen, T.[Tomer],
Levy, O.[Omer],
Golbert, A.[Avram],
Vision Transformers with Mixed-Resolution Tokenization,
ECV23(4613-4622)
IEEE DOI
2309
BibRef
Le, P.H.C.[Phuoc-Hoan Charles],
Li, X.[Xinlin],
BinaryViT: Pushing Binary Vision Transformers Towards Convolutional
Models,
ECV23(4665-4674)
IEEE DOI
2309
BibRef
Ma, D.[Dongning],
Zhao, P.F.[Peng-Fei],
Jiao, X.[Xun],
PerfHD: Efficient ViT Architecture Performance Ranking using
Hyperdimensional Computing,
NAS23(2230-2237)
IEEE DOI
2309
BibRef
Wang, J.[Jun],
Alamayreh, O.[Omran],
Tondi, B.[Benedetta],
Barni, M.[Mauro],
Open Set Classification of GAN-based Image Manipulations via a
ViT-based Hybrid Architecture,
WMF23(953-962)
IEEE DOI
2309
BibRef
Tian, R.[Rui],
Wu, Z.[Zuxuan],
Dai, Q.[Qi],
Hu, H.[Han],
Qiao, Y.[Yu],
Jiang, Y.G.[Yu-Gang],
ResFormer: Scaling ViTs with Multi-Resolution Training,
CVPR23(22721-22731)
IEEE DOI
2309
BibRef
Li, Y.[Yi],
Min, K.[Kyle],
Tripathi, S.[Subarna],
Vasconcelos, N.M.[Nuno M.],
SViTT: Temporal Learning of Sparse Video-Text Transformers,
CVPR23(18919-18929)
IEEE DOI
2309
BibRef
Beyer, L.[Lucas],
Izmailov, P.[Pavel],
Kolesnikov, A.[Alexander],
Caron, M.[Mathilde],
Kornblith, S.[Simon],
Zhai, X.H.[Xiao-Hua],
Minderer, M.[Matthias],
Tschannen, M.[Michael],
Alabdulmohsin, I.[Ibrahim],
Pavetic, F.[Filip],
FlexiViT: One Model for All Patch Sizes,
CVPR23(14496-14506)
IEEE DOI
2309
BibRef
Chang, S.N.[Shu-Ning],
Wang, P.[Pichao],
Lin, M.[Ming],
Wang, F.[Fan],
Zhang, D.J.H.[David Jun-Hao],
Jin, R.[Rong],
Shou, M.Z.[Mike Zheng],
Making Vision Transformers Efficient from A Token Sparsification View,
CVPR23(6195-6205)
IEEE DOI
2309
BibRef
Phan, L.[Lam],
Nguyen, H.T.H.[Hiep Thi Hong],
Warrier, H.[Harikrishna],
Gupta, Y.[Yogesh],
Patch Embedding as Local Features: Unifying Deep Local and Global
Features via Vision Transformer for Image Retrieval,
ACCV22(II:204-221).
Springer DOI
2307
BibRef
Guo, X.D.[Xin-Dong],
Sun, Y.[Yu],
Zhao, R.[Rong],
Kuang, L.Q.[Li-Qun],
Han, X.[Xie],
SWPT: Spherical Window-based Point Cloud Transformer,
ACCV22(I:396-412).
Springer DOI
2307
BibRef
Wang, W.J.[Wen-Ju],
Chen, G.[Gang],
Zhou, H.R.[Hao-Ran],
Wang, X.L.[Xiao-Lin],
OVPT: Optimal Viewset Pooling Transformer for 3d Object Recognition,
ACCV22(I:486-503).
Springer DOI
2307
BibRef
Kim, D.[Daeho],
Kim, J.[Jaeil],
Vision Transformer Compression and Architecture Exploration with
Efficient Embedding Space Search,
ACCV22(III:524-540).
Springer DOI
2307
BibRef
Lee, Y.S.[Yun-Sung],
Lee, G.[Gyuseong],
Ryoo, K.[Kwangrok],
Go, H.[Hyojun],
Park, J.[Jihye],
Kim, S.[Seungryong],
Towards Flexible Inductive Bias via Progressive Reparameterization
Scheduling,
VIPriors22(706-720).
Springer DOI
2304
Transformers vs. CNNs: different benefits. Best of both.
BibRef
Amir, S.[Shir],
Gandelsman, Y.[Yossi],
Bagon, S.[Shai],
Dekel, T.[Tali],
On the Effectiveness of ViT Features as Local Semantic Descriptors,
SelfLearn22(39-55).
Springer DOI
2304
BibRef
Deng, X.[Xuran],
Liu, C.B.[Chuan-Bin],
Lu, Z.Y.[Zhi-Ying],
Recombining Vision Transformer Architecture for Fine-grained Visual
Categorization,
MMMod23(II: 127-138).
Springer DOI
2304
BibRef
Tonkes, V.[Vincent],
Sabatelli, M.[Matthia],
How Well Do Vision Transformers (VTs) Transfer to the Non-natural Image
Domain? An Empirical Study Involving Art Classification,
VisArt22(234-250).
Springer DOI
2304
BibRef
Rangrej, S.B.[Samrudhdhi B.],
Liang, K.J.[Kevin J.],
Hassner, T.[Tal],
Clark, J.J.[James J.],
GliTr: Glimpse Transformers with Spatiotemporal Consistency for
Online Action Prediction,
WACV23(3402-3412)
IEEE DOI
2302
Predictive models, Transformers, Cameras, Spatiotemporal phenomena,
Sensors, Observability
BibRef
Liu, Y.[Yue],
Matsoukas, C.[Christos],
Strand, F.[Fredrik],
Azizpour, H.[Hossein],
Smith, K.[Kevin],
PatchDropout: Economizing Vision Transformers Using Patch Dropout,
WACV23(3942-3951)
IEEE DOI
2302
Training, Image resolution, Computational modeling,
Biological system modeling, Memory management, Transformers,
Biomedical/healthcare/medicine
BibRef
Song, C.H.[Chull Hwan],
Yoon, J.Y.[Joo-Young],
Choi, S.[Shunghyun],
Avrithis, Y.[Yannis],
Boosting vision transformers for image retrieval,
WACV23(107-117)
IEEE DOI
2302
Training, Location awareness, Image retrieval,
Self-supervised learning, Image representation, Transformers
BibRef
Yang, J.[Jinyu],
Liu, J.J.[Jing-Jing],
Xu, N.[Ning],
Huang, J.Z.[Jun-Zhou],
TVT: Transferable Vision Transformer for Unsupervised Domain
Adaptation,
WACV23(520-530)
IEEE DOI
2302
Benchmark testing, Image representation, Transformers,
Convolutional neural networks, Task analysis,
and algorithms (including transfer)
BibRef
Saavedra-Ruiz, M.[Miguel],
Morin, S.[Sacha],
Paull, L.[Liam],
Monocular Robot Navigation with Self-Supervised Pretrained Vision
Transformers,
CRV22(197-204)
IEEE DOI
2301
Adaptation models, Image segmentation, Image resolution,
Navigation, Transformers, Robot sensing systems, Visual Servoing
BibRef
Patel, K.[Krushi],
Bur, A.M.[Andrés M.],
Li, F.J.[Feng-Jun],
Wang, G.H.[Guang-Hui],
Aggregating Global Features into Local Vision Transformer,
ICPR22(1141-1147)
IEEE DOI
2212
Source coding, Computational modeling,
Information processing, Performance gain, Transformers
BibRef
Shen, Z.Q.[Zhi-Qiang],
Liu, Z.[Zechun],
Xing, E.[Eric],
Sliced Recursive Transformer,
ECCV22(XXIV:727-744).
Springer DOI
2211
BibRef
Shao, Y.[Yidi],
Loy, C.C.[Chen Change],
Dai, B.[Bo],
Transformer with Implicit Edges for Particle-Based Physics Simulation,
ECCV22(XIX:549-564).
Springer DOI
2211
BibRef
Wang, W.[Wen],
Zhang, J.[Jing],
Cao, Y.[Yang],
Shen, Y.L.[Yong-Liang],
Tao, D.C.[Da-Cheng],
Towards Data-Efficient Detection Transformers,
ECCV22(IX:88-105).
Springer DOI
2211
BibRef
Lorenzana, M.B.[Marlon Bran],
Engstrom, C.[Craig],
Chandra, S.S.[Shekhar S.],
Transformer Compressed Sensing Via Global Image Tokens,
ICIP22(3011-3015)
IEEE DOI
2211
Training, Limiting, Image resolution, Neural networks,
Image representation, Transformers, MRI
BibRef
Lu, X.Y.[Xiao-Yong],
Du, S.[Songlin],
NCTR: Neighborhood Consensus Transformer for Feature Matching,
ICIP22(2726-2730)
IEEE DOI
2211
Learning systems, Impedance matching, Aggregates, Pose estimation,
Neural networks, Transformers, Local feature matching,
graph neural network
BibRef
Jeny, A.A.[Afsana Ahsan],
Junayed, M.S.[Masum Shah],
Islam, M.B.[Md Baharul],
An Efficient End-To-End Image Compression Transformer,
ICIP22(1786-1790)
IEEE DOI
2211
Image coding, Correlation, Limiting, Computational modeling,
Rate-distortion, Video compression, Transformers, entropy model
BibRef
Bai, J.W.[Jia-Wang],
Yuan, L.[Li],
Xia, S.T.[Shu-Tao],
Yan, S.C.[Shui-Cheng],
Li, Z.F.[Zhi-Feng],
Liu, W.[Wei],
Improving Vision Transformers by Revisiting High-Frequency Components,
ECCV22(XXIV:1-18).
Springer DOI
2211
BibRef
Li, K.[Kehan],
Yu, R.[Runyi],
Wang, Z.[Zhennan],
Yuan, L.[Li],
Song, G.[Guoli],
Chen, J.[Jie],
Locality Guidance for Improving Vision Transformers on Tiny Datasets,
ECCV22(XXIV:110-127).
Springer DOI
2211
BibRef
Tu, Z.Z.[Zheng-Zhong],
Talebi, H.[Hossein],
Zhang, H.[Han],
Yang, F.[Feng],
Milanfar, P.[Peyman],
Bovik, A.C.[Alan C.],
Li, Y.[Yinxiao],
MaxViT: Multi-axis Vision Transformer,
ECCV22(XXIV:459-479).
Springer DOI
2211
BibRef
Yang, R.[Rui],
Ma, H.L.[Hai-Long],
Wu, J.[Jie],
Tang, Y.S.[Yan-Song],
Xiao, X.F.[Xue-Feng],
Zheng, M.[Min],
Li, X.[Xiu],
ScalableViT: Rethinking the Context-Oriented Generalization of Vision
Transformer,
ECCV22(XXIV:480-496).
Springer DOI
2211
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
El-Nouby, A.[Alaaeldin],
Verbeek, J.[Jakob],
Jégou, H.[Hervé],
Three Things Everyone Should Know About Vision Transformers,
ECCV22(XXIV:497-515).
Springer DOI
2211
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
Jégou, H.[Hervé],
DeiT III: Revenge of the ViT,
ECCV22(XXIV:516-533).
Springer DOI
2211
BibRef
Li, Y.H.[Yang-Hao],
Mao, H.Z.[Han-Zi],
Girshick, R.[Ross],
He, K.M.[Kai-Ming],
Exploring Plain Vision Transformer Backbones for Object Detection,
ECCV22(IX:280-296).
Springer DOI
2211
BibRef
Yu, Q.H.[Qi-Hang],
Wang, H.Y.[Hui-Yu],
Qiao, S.Y.[Si-Yuan],
Collins, M.[Maxwell],
Zhu, Y.K.[Yu-Kun],
Adam, H.[Hartwig],
Yuille, A.L.[Alan L.],
Chen, L.C.[Liang-Chieh],
k-means Mask Transformer,
ECCV22(XXIX:288-307).
Springer DOI
2211
BibRef
Pham, K.[Khoi],
Kafle, K.[Kushal],
Lin, Z.[Zhe],
Ding, Z.H.[Zhi-Hong],
Cohen, S.[Scott],
Tran, Q.[Quan],
Shrivastava, A.[Abhinav],
Improving Closed and Open-Vocabulary Attribute Prediction Using
Transformers,
ECCV22(XXV:201-219).
Springer DOI
2211
BibRef
Yu, W.X.[Wen-Xin],
Zhang, H.[Hongru],
Lan, T.X.[Tian-Xiang],
Hu, Y.C.[Yu-Cheng],
Yin, D.[Dong],
CBPT: A New Backbone for Enhancing Information Transmission of Vision
Transformers,
ICIP22(156-160)
IEEE DOI
2211
Merging, Information processing, Object detection, Transformers,
Computational complexity, Vision Transformer, Backbone
BibRef
Takeda, M.[Mana],
Yanai, K.[Keiji],
Continual Learning in Vision Transformer,
ICIP22(616-620)
IEEE DOI
2211
Learning systems, Image recognition, Transformers,
Natural language processing, Convolutional neural networks, Vision Transformer
BibRef
Zhou, W.L.[Wei-Lian],
Kamata, S.I.[Sei-Ichiro],
Luo, Z.[Zhengbo],
Xue, X.[Xi],
Rethinking Unified Spectral-Spatial-Based Hyperspectral Image
Classification Under 3D Configuration of Vision Transformer,
ICIP22(711-715)
IEEE DOI
2211
Flowcharts, Correlation, Convolution, Transformers,
Hyperspectral image classification, 3D coordinate positional embedding
BibRef
Li, J.[Junbo],
Zhang, H.[Huan],
Xie, C.[Cihang],
ViP: Unified Certified Detection and Recovery for Patch Attack with
Vision Transformers,
ECCV22(XXV:573-587).
Springer DOI
2211
BibRef
Cao, Y.H.[Yun-Hao],
Yu, H.[Hao],
Wu, J.X.[Jian-Xin],
Training Vision Transformers with only 2040 Images,
ECCV22(XXV:220-237).
Springer DOI
2211
BibRef
Wang, C.[Cong],
Xu, H.M.[Hong-Min],
Zhang, X.[Xiong],
Wang, L.[Li],
Zheng, Z.[Zhitong],
Liu, H.F.[Hai-Feng],
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger,
ECCV22(XX:739-756).
Springer DOI
2211
BibRef
Wu, B.[Boxi],
Gu, J.D.[Jin-Dong],
Li, Z.F.[Zhi-Feng],
Cai, D.[Deng],
He, X.F.[Xiao-Fei],
Liu, W.[Wei],
Towards Efficient Adversarial Training on Vision Transformers,
ECCV22(XIII:307-325).
Springer DOI
2211
BibRef
Gu, J.D.[Jin-Dong],
Tresp, V.[Volker],
Qin, Y.[Yao],
Are Vision Transformers Robust to Patch Perturbations?,
ECCV22(XII:404-421).
Springer DOI
2211
BibRef
Zong, Z.[Zhuofan],
Li, K.[Kunchang],
Song, G.[Guanglu],
Wang, Y.[Yali],
Qiao, Y.[Yu],
Leng, B.[Biao],
Liu, Y.[Yu],
Self-slimmed Vision Transformer,
ECCV22(XI:432-448).
Springer DOI
2211
BibRef
Fayyaz, M.[Mohsen],
Koohpayegani, S.A.[Soroush Abbasi],
Jafari, F.R.[Farnoush Rezaei],
Sengupta, S.[Sunando],
Joze, H.R.V.[Hamid Reza Vaezi],
Sommerlade, E.[Eric],
Pirsiavash, H.[Hamed],
Gall, J.[Jürgen],
Adaptive Token Sampling for Efficient Vision Transformers,
ECCV22(XI:396-414).
Springer DOI
2211
BibRef
Li, Z.K.[Zhi-Kai],
Ma, L.P.[Li-Ping],
Chen, M.J.[Meng-Juan],
Xiao, J.R.[Jun-Rui],
Gu, Q.Y.[Qing-Yi],
Patch Similarity Aware Data-Free Quantization for Vision Transformers,
ECCV22(XI:154-170).
Springer DOI
2211
BibRef
Weng, Z.J.[Ze-Jia],
Yang, X.T.[Xi-Tong],
Li, A.[Ang],
Wu, Z.X.[Zu-Xuan],
Jiang, Y.G.[Yu-Gang],
Semi-supervised Vision Transformers,
ECCV22(XXX:605-620).
Springer DOI
2211
BibRef
Su, T.[Tong],
Ye, S.[Shuo],
Song, C.Q.[Cheng-Qun],
Cheng, J.[Jun],
Mask-Vit: An Object Mask Embedding in Vision Transformer for
Fine-Grained Visual Classification,
ICIP22(1626-1630)
IEEE DOI
2211
Knowledge engineering, Visualization, Focusing, Interference,
Benchmark testing, Transformers, Feature extraction, Knowledge Embedding
BibRef
Gai, L.[Lulu],
Chen, W.[Wei],
Gao, R.[Rui],
Chen, Y.W.[Yan-Wei],
Qiao, X.[Xu],
Using Vision Transformers in 3-D Medical Image Classifications,
ICIP22(696-700)
IEEE DOI
2211
Deep learning, Training, Visualization, Transfer learning,
Optimization methods, Self-supervised learning, Transformers,
3-D medical image classifications
BibRef
Wu, K.[Kan],
Zhang, J.[Jinnian],
Peng, H.[Houwen],
Liu, M.C.[Meng-Chen],
Xiao, B.[Bin],
Fu, J.L.[Jian-Long],
Yuan, L.[Lu],
TinyViT: Fast Pretraining Distillation for Small Vision Transformers,
ECCV22(XXI:68-85).
Springer DOI
2211
BibRef
Gao, L.[Li],
Nie, D.[Dong],
Li, B.[Bo],
Ren, X.F.[Xiao-Feng],
Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with
Local Representation,
ECCV22(XXIII:744-761).
Springer DOI
2211
BibRef
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Li, Y.[Yehao],
Ngo, C.W.[Chong-Wah],
Mei, T.[Tao],
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation
Learning,
ECCV22(XXV:328-345).
Springer DOI
2211
BibRef
Yuan, Z.H.[Zhi-Hang],
Xue, C.H.[Chen-Hao],
Chen, Y.Q.[Yi-Qi],
Wu, Q.[Qiang],
Sun, G.Y.[Guang-Yu],
PTQ4ViT: Post-training Quantization for Vision Transformers with Twin
Uniform Quantization,
ECCV22(XII:191-207).
Springer DOI
2211
BibRef
Kong, Z.L.[Zheng-Lun],
Dong, P.Y.[Pei-Yan],
Ma, X.L.[Xiao-Long],
Meng, X.[Xin],
Niu, W.[Wei],
Sun, M.S.[Meng-Shu],
Shen, X.[Xuan],
Yuan, G.[Geng],
Ren, B.[Bin],
Tang, H.[Hao],
Qin, M.H.[Ming-Hai],
Wang, Y.Z.[Yan-Zhi],
SPViT:
Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning,
ECCV22(XI:620-640).
Springer DOI
2211
BibRef
Pan, J.T.[Jun-Ting],
Bulat, A.[Adrian],
Tan, F.[Fuwen],
Zhu, X.T.[Xia-Tian],
Dudziak, L.[Lukasz],
Li, H.S.[Hong-Sheng],
Tzimiropoulos, G.[Georgios],
Martinez, B.[Brais],
EdgeViTs: Competing Light-Weight CNNs on Mobile Devices with Vision
Transformers,
ECCV22(XI:294-311).
Springer DOI
2211
BibRef
Xiang, H.[Hao],
Xu, R.S.[Run-Sheng],
Ma, J.Q.[Jia-Qi],
HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative Perception with
Vision Transformer,
ICCV23(284-295)
IEEE DOI Code:
WWW Link.
2401
BibRef
Xu, R.S.[Run-Sheng],
Xiang, H.[Hao],
Tu, Z.Z.[Zheng-Zhong],
Xia, X.[Xin],
Yang, M.H.[Ming-Hsuan],
Ma, J.Q.[Jia-Qi],
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision
Transformer,
ECCV22(XXIX:107-124).
Springer DOI
2211
BibRef
Liu, Y.[Yong],
Mai, S.Q.[Si-Qi],
Chen, X.N.[Xiang-Ning],
Hsieh, C.J.[Cho-Jui],
You, Y.[Yang],
Towards Efficient and Scalable Sharpness-Aware Minimization,
CVPR22(12350-12360)
IEEE DOI
2210
WWW Link.
Training, Schedules, Scalability, Perturbation methods,
Stochastic processes, Transformers, Minimization,
Vision applications and systems
BibRef
Ren, P.Z.[Peng-Zhen],
Li, C.[Changlin],
Wang, G.[Guangrun],
Xiao, Y.[Yun],
Du, Q.[Qing],
Liang, X.D.[Xiao-Dan],
Chang, X.J.[Xiao-Jun],
Beyond Fixation: Dynamic Window Visual Transformer,
CVPR22(11977-11987)
IEEE DOI
2210
Performance evaluation, Visualization, Systematics,
Computational modeling, Scalability, Transformers,
Deep learning architectures and techniques
BibRef
Bhattacharjee, D.[Deblina],
Zhang, T.[Tong],
Süsstrunk, S.[Sabine],
Salzmann, M.[Mathieu],
MulT: An End-to-End Multitask Learning Transformer,
CVPR22(12021-12031)
IEEE DOI
2210
Heart, Image segmentation, Computational modeling,
Image edge detection, Semantics, Estimation, Predictive models,
Scene analysis and understanding
BibRef
Fang, J.[Jiemin],
Xie, L.X.[Ling-Xi],
Wang, X.G.[Xing-Gang],
Zhang, X.P.[Xiao-Peng],
Liu, W.Y.[Wen-Yu],
Tian, Q.[Qi],
MSG-Transformer:
Exchanging Local Spatial Information by Manipulating Messenger Tokens,
CVPR22(12053-12062)
IEEE DOI
2210
Deep learning, Visualization, Neural networks,
Graphics processing units, retrieval
BibRef
Sandler, M.[Mark],
Zhmoginov, A.[Andrey],
Vladymyrov, M.[Max],
Jackson, A.[Andrew],
Fine-tuning Image Transformers using Learnable Memory,
CVPR22(12145-12154)
IEEE DOI
2210
Deep learning, Adaptation models, Costs, Computational modeling,
Memory management, Transformers, Transfer/low-shot/long-tail learning
BibRef
Yu, X.[Xumin],
Tang, L.[Lulu],
Rao, Y.M.[Yong-Ming],
Huang, T.J.[Tie-Jun],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked
Point Modeling,
CVPR22(19291-19300)
IEEE DOI
2210
Point cloud compression, Solid modeling, Computational modeling,
Bit error rate, Transformers, Pattern recognition,
Deep learning architectures and techniques
BibRef
Park, C.[Chunghyun],
Jeong, Y.[Yoonwoo],
Cho, M.[Minsu],
Park, J.[Jaesik],
Fast Point Transformer,
CVPR22(16928-16937)
IEEE DOI
2210
Point cloud compression, Shape, Semantics, Neural networks,
Transformers, grouping and shape analysis
BibRef
Zeng, W.[Wang],
Jin, S.[Sheng],
Liu, W.T.[Wen-Tao],
Qian, C.[Chen],
Luo, P.[Ping],
Ouyang, W.L.[Wan-Li],
Wang, X.G.[Xiao-Gang],
Not All Tokens Are Equal:
Human-centric Visual Analysis via Token Clustering Transformer,
CVPR22(11091-11101)
IEEE DOI
2210
Visualization, Shape, Pose estimation, Semantics,
Pose estimation and tracking,
Deep learning architectures and techniques
BibRef
Tu, Z.Z.[Zheng-Zhong],
Talebi, H.[Hossein],
Zhang, H.[Han],
Yang, F.[Feng],
Milanfar, P.[Peyman],
Bovik, A.[Alan],
Li, Y.X.[Yin-Xiao],
MAXIM: Multi-Axis MLP for Image Processing,
CVPR22(5759-5770)
IEEE DOI
2210
WWW Link.
Training, Photography, Adaptation models, Visualization,
Computational modeling, Transformers, Low-level vision,
Computational photography
BibRef
Yun, S.[Sukmin],
Lee, H.[Hankook],
Kim, J.[Jaehyung],
Shin, J.[Jinwoo],
Patch-level Representation Learning for Self-supervised Vision
Transformers,
CVPR22(8344-8353)
IEEE DOI
2210
Training, Representation learning, Visualization, Neural networks,
Object detection, Self-supervised learning, Transformers,
Self- semi- meta- unsupervised learning
BibRef
Hou, Z.J.[Ze-Jiang],
Kung, S.Y.[Sun-Yuan],
Multi-Dimensional Vision Transformer Compression via Dependency
Guided Gaussian Process Search,
EVW22(3668-3677)
IEEE DOI
2210
Adaptation models, Image coding, Head, Computational modeling,
Neurons, Gaussian processes, Transformers
BibRef
Salman, H.[Hadi],
Jain, S.[Saachi],
Wong, E.[Eric],
Madry, A.[Aleksander],
Certified Patch Robustness via Smoothed Vision Transformers,
CVPR22(15116-15126)
IEEE DOI
2210
Visualization, Smoothing methods, Costs, Computational modeling,
Transformers, Adversarial attack and defense
BibRef
Wang, Y.K.[Yi-Kai],
Chen, X.H.[Xing-Hao],
Cao, L.[Lele],
Huang, W.B.[Wen-Bing],
Sun, F.C.[Fu-Chun],
Wang, Y.H.[Yun-He],
Multimodal Token Fusion for Vision Transformers,
CVPR22(12176-12185)
IEEE DOI
2210
Point cloud compression, Image segmentation, Shape, Semantics,
Object detection, Vision+X
BibRef
Tang, Y.[Yehui],
Han, K.[Kai],
Wang, Y.H.[Yun-He],
Xu, C.[Chang],
Guo, J.Y.[Jian-Yuan],
Xu, C.[Chao],
Tao, D.C.[Da-Cheng],
Patch Slimming for Efficient Vision Transformers,
CVPR22(12155-12164)
IEEE DOI
2210
Visualization, Quantization (signal), Computational modeling,
Aggregates, Benchmark testing,
Representation learning
BibRef
Zhang, J.[Jinnian],
Peng, H.[Houwen],
Wu, K.[Kan],
Liu, M.C.[Meng-Chen],
Xiao, B.[Bin],
Fu, J.L.[Jian-Long],
Yuan, L.[Lu],
MiniViT: Compressing Vision Transformers with Weight Multiplexing,
CVPR22(12135-12144)
IEEE DOI
2210
Multiplexing, Performance evaluation, Image coding, Codes,
Computational modeling, Benchmark testing,
Vision applications and systems
BibRef
Chen, T.L.[Tian-Long],
Zhang, Z.Y.[Zhen-Yu],
Cheng, Y.[Yu],
Awadallah, A.[Ahmed],
Wang, Z.Y.[Zhang-Yang],
The Principle of Diversity: Training Stronger Vision Transformers
Calls for Reducing All Levels of Redundancy,
CVPR22(12010-12020)
IEEE DOI
2210
Training, Convolutional codes, Deep learning,
Computational modeling, Redundancy, Deep learning architectures and techniques
BibRef
Yin, H.X.[Hong-Xu],
Vahdat, A.[Arash],
Alvarez, J.M.[Jose M.],
Mallya, A.[Arun],
Kautz, J.[Jan],
Molchanov, P.[Pavlo],
A-ViT: Adaptive Tokens for Efficient Vision Transformer,
CVPR22(10799-10808)
IEEE DOI
2210
Training, Adaptive systems, Network architecture, Transformers,
Throughput, Hardware, Complexity theory,
Efficient learning and inferences
BibRef
Lu, J.H.[Jia-Hao],
Zhang, X.S.[Xi Sheryl],
Zhao, T.L.[Tian-Li],
He, X.Y.[Xiang-Yu],
Cheng, J.[Jian],
APRIL: Finding the Achilles' Heel on Privacy for Vision Transformers,
CVPR22(10041-10050)
IEEE DOI
2210
Privacy, Data privacy, Federated learning, Computational modeling,
Training data, Transformers, Market research, Privacy and federated learning
BibRef
Hatamizadeh, A.[Ali],
Yin, H.X.[Hong-Xu],
Roth, H.[Holger],
Li, W.Q.[Wen-Qi],
Kautz, J.[Jan],
Xu, D.[Daguang],
Molchanov, P.[Pavlo],
GradViT: Gradient Inversion of Vision Transformers,
CVPR22(10011-10020)
IEEE DOI
2210
Measurement, Differential privacy, Neural networks, Transformers,
Pattern recognition, Security, Iterative methods, Privacy and federated learning
BibRef
Zhang, H.F.[Hao-Fei],
Duan, J.R.[Jia-Rui],
Xue, M.Q.[Meng-Qi],
Song, J.[Jie],
Sun, L.[Li],
Song, M.L.[Ming-Li],
Bootstrapping ViTs: Towards Liberating Vision Transformers from
Pre-training,
CVPR22(8934-8943)
IEEE DOI
2210
Training, Upper bound, Neural networks, Training data,
Network architecture, Transformers, Computer vision theory,
Efficient learning and inferences
BibRef
Chavan, A.[Arnav],
Shen, Z.Q.[Zhi-Qiang],
Liu, Z.[Zhuang],
Liu, Z.[Zechun],
Cheng, K.T.[Kwang-Ting],
Xing, E.[Eric],
Vision Transformer Slimming:
Multi-Dimension Searching in Continuous Optimization Space,
CVPR22(4921-4931)
IEEE DOI
2210
Training, Performance evaluation, Image coding, Force,
Graphics processing units,
Vision applications and systems
BibRef
Chen, Z.Y.[Zhao-Yu],
Li, B.[Bo],
Wu, S.[Shuang],
Xu, J.H.[Jiang-He],
Ding, S.H.[Shou-Hong],
Zhang, W.Q.[Wen-Qiang],
Shape Matters: Deformable Patch Attack,
ECCV22(IV:529-548).
Springer DOI
2211
BibRef
Chen, Z.Y.[Zhao-Yu],
Li, B.[Bo],
Xu, J.H.[Jiang-He],
Wu, S.[Shuang],
Ding, S.H.[Shou-Hong],
Zhang, W.Q.[Wen-Qiang],
Towards Practical Certifiable Patch Defense with Vision Transformer,
CVPR22(15127-15137)
IEEE DOI
2210
Smoothing methods, Toy manufacturing industry, Semantics,
Network architecture, Transformers, Robustness,
Adversarial attack and defense
BibRef
Chen, R.J.[Richard J.],
Chen, C.[Chengkuan],
Li, Y.C.[Yi-Cong],
Chen, T.Y.[Tiffany Y.],
Trister, A.D.[Andrew D.],
Krishnan, R.G.[Rahul G.],
Mahmood, F.[Faisal],
Scaling Vision Transformers to Gigapixel Images via Hierarchical
Self-Supervised Learning,
CVPR22(16123-16134)
IEEE DOI
2210
Training, Visualization, Self-supervised learning,
Image representation, Transformers,
Self- semi- meta- unsupervised learning
BibRef
Yang, Z.[Zhao],
Wang, J.Q.[Jia-Qi],
Tang, Y.S.[Yan-Song],
Chen, K.[Kai],
Zhao, H.S.[Heng-Shuang],
Torr, P.H.S.[Philip H.S.],
LAVT: Language-Aware Vision Transformer for Referring Image
Segmentation,
CVPR22(18134-18144)
IEEE DOI
2210
Image segmentation, Visualization, Image coding, Shape, Linguistics,
Transformers, Feature extraction, Segmentation, grouping and shape analysis
BibRef
Scheibenreif, L.[Linus],
Hanna, J.[Joëlle],
Mommert, M.[Michael],
Borth, D.[Damian],
Self-supervised Vision Transformers for Land-cover Segmentation and
Classification,
EarthVision22(1421-1430)
IEEE DOI
2210
Training, Earth, Image segmentation, Computational modeling,
Conferences, Transformers
BibRef
Zhai, X.H.[Xiao-Hua],
Kolesnikov, A.[Alexander],
Houlsby, N.[Neil],
Beyer, L.[Lucas],
Scaling Vision Transformers,
CVPR22(1204-1213)
IEEE DOI
2210
Training, Error analysis, Computational modeling, Neural networks,
Memory management, Training data,
Transfer/low-shot/long-tail learning
BibRef
Guo, J.Y.[Jian-Yuan],
Han, K.[Kai],
Wu, H.[Han],
Tang, Y.[Yehui],
Chen, X.H.[Xing-Hao],
Wang, Y.H.[Yun-He],
Xu, C.[Chang],
CMT: Convolutional Neural Networks Meet Vision Transformers,
CVPR22(12165-12175)
IEEE DOI
2210
Visualization, Image recognition, Force,
Object detection, Transformers,
Representation learning
BibRef
Meng, L.C.[Ling-Chen],
Li, H.D.[Heng-Duo],
Chen, B.C.[Bor-Chun],
Lan, S.Y.[Shi-Yi],
Wu, Z.X.[Zu-Xuan],
Jiang, Y.G.[Yu-Gang],
Lim, S.N.[Ser-Nam],
AdaViT: Adaptive Vision Transformers for Efficient Image Recognition,
CVPR22(12299-12308)
IEEE DOI
2210
Image recognition, Head, Law enforcement, Computational modeling,
Redundancy, Transformers, Efficient learning and inferences,
retrieval
BibRef
Herrmann, C.[Charles],
Sargent, K.[Kyle],
Jiang, L.[Lu],
Zabih, R.[Ramin],
Chang, H.[Huiwen],
Liu, C.[Ce],
Krishnan, D.[Dilip],
Sun, D.Q.[De-Qing],
Pyramid Adversarial Training Improves ViT Performance,
CVPR22(13409-13419)
IEEE DOI
2210
Training, Image recognition, Stochastic processes,
Transformers, Robustness, retrieval,
Recognition: detection
BibRef
Li, C.L.[Chang-Lin],
Zhuang, B.[Bohan],
Wang, G.R.[Guang-Run],
Liang, X.D.[Xiao-Dan],
Chang, X.J.[Xiao-Jun],
Yang, Y.[Yi],
Automated Progressive Learning for Efficient Training of Vision
Transformers,
CVPR22(12476-12486)
IEEE DOI
2210
Training, Adaptation models, Schedules, Computational modeling,
Estimation, Manuals, Transformers, Representation learning
BibRef
Pu, M.Y.[Meng-Yang],
Huang, Y.P.[Ya-Ping],
Liu, Y.M.[Yu-Ming],
Guan, Q.J.[Qing-Ji],
Ling, H.B.[Hai-Bin],
EDTER: Edge Detection with Transformer,
CVPR22(1392-1402)
IEEE DOI
2210
Head, Image edge detection, Semantics, Detectors, Transformers,
Feature extraction, Segmentation, grouping and shape analysis,
Scene analysis and understanding
BibRef
Zhu, R.[Rui],
Li, Z.Q.[Zheng-Qin],
Matai, J.[Janarbek],
Porikli, F.M.[Fatih M.],
Chandraker, M.[Manmohan],
IRISformer: Dense Vision Transformers for Single-Image Inverse
Rendering in Indoor Scenes,
CVPR22(2812-2821)
IEEE DOI
2210
Photorealism, Shape, Computational modeling, Lighting,
Transformers,
Physics-based vision and shape-from-X
BibRef
Ermolov, A.[Aleksandr],
Mirvakhabova, L.[Leyla],
Khrulkov, V.[Valentin],
Sebe, N.[Nicu],
Oseledets, I.[Ivan],
Hyperbolic Vision Transformers: Combining Improvements in Metric
Learning,
CVPR22(7399-7409)
IEEE DOI
2210
Measurement, Geometry, Visualization, Semantics, Self-supervised learning,
Transformer cores, Transformers, Representation learning
BibRef
Zhang, C.Z.[Chong-Zhi],
Zhang, M.Y.[Ming-Yuan],
Zhang, S.H.[Shang-Hang],
Jin, D.S.[Dai-Sheng],
Zhou, Q.[Qiang],
Cai, Z.A.[Zhong-Ang],
Zhao, H.[Haiyu],
Liu, X.L.[Xiang-Long],
Liu, Z.W.[Zi-Wei],
Delving Deep into the Generalization of Vision Transformers under
Distribution Shifts,
CVPR22(7267-7276)
IEEE DOI
2210
Training, Representation learning, Systematics, Shape, Taxonomy,
Self-supervised learning, Transformers, Recognition: detection,
Representation learning
BibRef
Hou, Z.[Zhi],
Yu, B.[Baosheng],
Tao, D.C.[Da-Cheng],
BatchFormer: Learning to Explore Sample Relationships for Robust
Representation Learning,
CVPR22(7246-7256)
IEEE DOI
2210
Training, Deep learning, Representation learning, Neural networks,
Tail, Transformers, Transfer/low-shot/long-tail learning,
Self- semi- meta- unsupervised learning
BibRef
Zamir, S.W.[Syed Waqas],
Arora, A.[Aditya],
Khan, S.[Salman],
Hayat, M.[Munawar],
Khan, F.S.[Fahad Shahbaz],
Yang, M.H.[Ming-Hsuan],
Restormer: Efficient Transformer for High-Resolution Image
Restoration,
CVPR22(5718-5729)
IEEE DOI
2210
Computational modeling, Transformer cores,
Transformers, Data models, Image restoration, Task analysis,
Deep learning architectures and techniques
BibRef
Lin, K.[Kevin],
Wang, L.J.[Li-Juan],
Liu, Z.C.[Zi-Cheng],
Mesh Graphormer,
ICCV21(12919-12928)
IEEE DOI
2203
Convolutional codes, Solid modeling, Network topology,
Transformers, Gestures and body pose
BibRef
Casey, E.[Evan],
Pérez, V.[Víctor],
Li, Z.R.[Zhuo-Ru],
The Animation Transformer: Visual Correspondence via Segment Matching,
ICCV21(11303-11312)
IEEE DOI
2203
Visualization, Image segmentation, Image color analysis,
Production, Animation, Transformers,
grouping and shape
BibRef
Reizenstein, J.[Jeremy],
Shapovalov, R.[Roman],
Henzler, P.[Philipp],
Sbordone, L.[Luca],
Labatut, P.[Patrick],
Novotny, D.[David],
Common Objects in 3D: Large-Scale Learning and Evaluation of
Real-life 3D Category Reconstruction,
ICCV21(10881-10891)
IEEE DOI
2203
Award, Marr Prize, HM. Point cloud compression, Transformers,
Rendering (computer graphics), Cameras, Image reconstruction,
3D from multiview and other sensors
BibRef
Feng, W.X.[Wei-Xin],
Wang, Y.J.[Yuan-Jiang],
Ma, L.H.[Li-Hua],
Yuan, Y.[Ye],
Zhang, C.[Chi],
Temporal Knowledge Consistency for Unsupervised Visual Representation
Learning,
ICCV21(10150-10160)
IEEE DOI
2203
Training, Representation learning, Visualization, Protocols,
Object detection, Semisupervised learning, Transformers,
Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Wu, H.P.[Hai-Ping],
Xiao, B.[Bin],
Codella, N.[Noel],
Liu, M.C.[Meng-Chen],
Dai, X.Y.[Xi-Yang],
Yuan, L.[Lu],
Zhang, L.[Lei],
CvT: Introducing Convolutions to Vision Transformers,
ICCV21(22-31)
IEEE DOI
2203
Code, Vision Transformer.
WWW Link. Convolutional codes, Image resolution, Image recognition,
Performance gain, Transformers, Distortion,
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
Sablayrolles, A.[Alexandre],
Synnaeve, G.[Gabriel],
Jégou, H.[Hervé],
Going deeper with Image Transformers,
ICCV21(32-42)
IEEE DOI
2203
Training, Neural networks, Training data,
Data models, Circuit faults, Recognition and classification,
Optimization and learning methods
BibRef
Zhao, J.W.[Jia-Wei],
Yan, K.[Ke],
Zhao, Y.F.[Yi-Fan],
Guo, X.W.[Xiao-Wei],
Huang, F.Y.[Fei-Yue],
Li, J.[Jia],
Transformer-based Dual Relation Graph for Multi-label Image
Recognition,
ICCV21(163-172)
IEEE DOI
2203
Image recognition, Correlation, Computational modeling, Semantics,
Benchmark testing, Representation learning
BibRef
Pan, Z.Z.[Zi-Zheng],
Zhuang, B.[Bohan],
Liu, J.[Jing],
He, H.Y.[Hao-Yu],
Cai, J.F.[Jian-Fei],
Scalable Vision Transformers with Hierarchical Pooling,
ICCV21(367-376)
IEEE DOI
2203
Visualization, Image recognition, Computational modeling,
Scalability, Transformers, Computational efficiency,
Efficient training and inference methods
BibRef
Yuan, L.[Li],
Chen, Y.P.[Yun-Peng],
Wang, T.[Tao],
Yu, W.H.[Wei-Hao],
Shi, Y.J.[Yu-Jun],
Jiang, Z.H.[Zi-Hang],
Tay, F.E.H.[Francis E. H.],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
Tokens-to-Token ViT:
Training Vision Transformers from Scratch on ImageNet,
ICCV21(538-547)
IEEE DOI
2203
Training, Image resolution, Computational modeling,
Image edge detection, Transformers,
BibRef
Wu, B.[Bichen],
Xu, C.F.[Chen-Feng],
Dai, X.L.[Xiao-Liang],
Wan, A.[Alvin],
Zhang, P.Z.[Pei-Zhao],
Yan, Z.C.[Zhi-Cheng],
Tomizuka, M.[Masayoshi],
Gonzalez, J.[Joseph],
Keutzer, K.[Kurt],
Vajda, P.[Peter],
Visual Transformers: Where Do Transformers Really Belong in Vision
Models?,
ICCV21(579-589)
IEEE DOI
2203
Training, Visualization, Image segmentation, Lips,
Computational modeling, Semantics, Vision applications and systems
BibRef
Hu, R.H.[Rong-Hang],
Singh, A.[Amanpreet],
UniT: Multimodal Multitask Learning with a Unified Transformer,
ICCV21(1419-1429)
IEEE DOI
2203
Training, Natural languages,
Object detection, Predictive models, Transformers, Multitasking,
Representation learning
BibRef
Qiu, Y.[Yue],
Yamamoto, S.[Shintaro],
Nakashima, K.[Kodai],
Suzuki, R.[Ryota],
Iwata, K.[Kenji],
Kataoka, H.[Hirokatsu],
Satoh, Y.[Yutaka],
Describing and Localizing Multiple Changes with Transformers,
ICCV21(1951-1960)
IEEE DOI
2203
Measurement, Location awareness, Codes, Natural languages,
Benchmark testing, Transformers,
Vision applications and systems
BibRef
Song, M.[Myungseo],
Choi, J.[Jinyoung],
Han, B.H.[Bo-Hyung],
Variable-Rate Deep Image Compression through Spatially-Adaptive
Feature Transform,
ICCV21(2360-2369)
IEEE DOI
2203
Training, Image coding, Neural networks, Rate-distortion, Transforms,
Network architecture, Computational photography,
Low-level and physics-based vision
BibRef
Sheng, H.[Hualian],
Cai, S.[Sijia],
Liu, Y.[Yuan],
Deng, B.[Bing],
Huang, J.Q.[Jian-Qiang],
Hua, X.S.[Xian-Sheng],
Zhao, M.J.[Min-Jian],
Improving 3D Object Detection with Channel-wise Transformer,
ICCV21(2723-2732)
IEEE DOI
2203
Point cloud compression, Object detection, Detectors, Transforms,
Transformers, Encoding, Detection and localization in 2D and 3D,
BibRef
Zhang, P.C.[Peng-Chuan],
Dai, X.[Xiyang],
Yang, J.W.[Jian-Wei],
Xiao, B.[Bin],
Yuan, L.[Lu],
Zhang, L.[Lei],
Gao, J.F.[Jian-Feng],
Multi-Scale Vision Longformer: A New Vision Transformer for
High-Resolution Image Encoding,
ICCV21(2978-2988)
IEEE DOI
2203
Image segmentation, Image coding, Computational modeling,
Memory management, Object detection, Transformers,
Representation learning
BibRef
Dong, Q.[Qi],
Tu, Z.W.[Zhuo-Wen],
Liao, H.F.[Hao-Fu],
Zhang, Y.T.[Yu-Ting],
Mahadevan, V.[Vijay],
Soatto, S.[Stefano],
Visual Relationship Detection Using Part-and-Sum Transformers with
Composite Queries,
ICCV21(3530-3539)
IEEE DOI
2203
Visualization, Detectors, Transformers, Task analysis, Standards,
Detection and localization in 2D and 3D,
Representation learning
BibRef
Fan, H.Q.[Hao-Qi],
Xiong, B.[Bo],
Mangalam, K.[Karttikeya],
Li, Y.[Yanghao],
Yan, Z.C.[Zhi-Cheng],
Malik, J.[Jitendra],
Feichtenhofer, C.[Christoph],
Multiscale Vision Transformers,
ICCV21(6804-6815)
IEEE DOI
2203
Visualization, Image recognition, Codes, Computational modeling,
Transformers, Complexity theory,
Recognition and classification
BibRef
Mahmood, K.[Kaleel],
Mahmood, R.[Rigel],
van Dijk, M.[Marten],
On the Robustness of Vision Transformers to Adversarial Examples,
ICCV21(7818-7827)
IEEE DOI
2203
Transformers, Robustness,
Adversarial machine learning, Security,
Machine learning architectures and formulations
BibRef
Chen, X.L.[Xin-Lei],
Xie, S.[Saining],
He, K.[Kaiming],
An Empirical Study of Training Self-Supervised Vision Transformers,
ICCV21(9620-9629)
IEEE DOI
2203
Training, Benchmark testing, Transformers, Standards,
Representation learning, Recognition and classification, Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Yuan, Y.[Ye],
Weng, X.[Xinshuo],
Ou, Y.[Yanglan],
Kitani, K.[Kris],
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent
Forecasting,
ICCV21(9793-9803)
IEEE DOI
2203
Uncertainty, Stochastic processes, Predictive models, Transformers,
Encoding, Trajectory, Motion and tracking,
Vision for robotics and autonomous vehicles
BibRef
Wu, K.[Kan],
Peng, H.W.[Hou-Wen],
Chen, M.H.[Ming-Hao],
Fu, J.L.[Jian-Long],
Chao, H.Y.[Hong-Yang],
Rethinking and Improving Relative Position Encoding for Vision
Transformer,
ICCV21(10013-10021)
IEEE DOI
2203
Image coding, Codes, Computational modeling, Transformers, Encoding,
Natural language processing, Datasets and evaluation,
Recognition and classification
BibRef
Bhojanapalli, S.[Srinadh],
Chakrabarti, A.[Ayan],
Glasner, D.[Daniel],
Li, D.[Daliang],
Unterthiner, T.[Thomas],
Veit, A.[Andreas],
Understanding Robustness of Transformers for Image Classification,
ICCV21(10211-10221)
IEEE DOI
2203
Perturbation methods, Transformers,
Robustness, Data models, Convolutional neural networks,
Recognition and classification
BibRef
Yan, B.[Bin],
Peng, H.[Houwen],
Fu, J.L.[Jian-Long],
Wang, D.[Dong],
Lu, H.C.[Hu-Chuan],
Learning Spatio-Temporal Transformer for Visual Tracking,
ICCV21(10428-10437)
IEEE DOI
2203
Visualization, Target tracking, Smoothing methods, Pipelines,
Benchmark testing, Transformers,
BibRef
Heo, B.[Byeongho],
Yun, S.[Sangdoo],
Han, D.Y.[Dong-Yoon],
Chun, S.[Sanghyuk],
Choe, J.[Junsuk],
Oh, S.J.[Seong Joon],
Rethinking Spatial Dimensions of Vision Transformers,
ICCV21(11916-11925)
IEEE DOI
2203
Dimensionality reduction, Computational modeling,
Object detection, Transformers, Robustness,
Recognition and classification
BibRef
Voskou, A.[Andreas],
Panousis, K.P.[Konstantinos P.],
Kosmopoulos, D.[Dimitrios],
Metaxas, D.N.[Dimitris N.],
Chatzis, S.[Sotirios],
Stochastic Transformer Networks with Linear Competing Units:
Application to end-to-end SL Translation,
ICCV21(11926-11935)
IEEE DOI
2203
Training, Memory management, Stochastic processes,
Gesture recognition, Benchmark testing, Assistive technologies,
BibRef
Ranftl, R.[René],
Bochkovskiy, A.[Alexey],
Koltun, V.[Vladlen],
Vision Transformers for Dense Prediction,
ICCV21(12159-12168)
IEEE DOI
2203
Image resolution, Semantics, Neural networks, Estimation,
Training data, grouping and shape
BibRef
Chen, M.H.[Ming-Hao],
Peng, H.W.[Hou-Wen],
Fu, J.L.[Jian-Long],
Ling, H.B.[Hai-Bin],
AutoFormer: Searching Transformers for Visual Recognition,
ICCV21(12250-12260)
IEEE DOI
2203
Training, Convolutional codes, Visualization, Head, Search methods,
Manuals, Recognition and classification
BibRef
Yuan, K.[Kun],
Guo, S.P.[Shao-Peng],
Liu, Z.W.[Zi-Wei],
Zhou, A.[Aojun],
Yu, F.W.[Feng-Wei],
Wu, W.[Wei],
Incorporating Convolution Designs into Visual Transformers,
ICCV21(559-568)
IEEE DOI
2203
Training, Visualization, Costs, Convolution, Training data,
Transformers, Feature extraction, Recognition and classification,
Efficient training and inference methods
BibRef
Chen, Z.[Zhengsu],
Xie, L.X.[Ling-Xi],
Niu, J.W.[Jian-Wei],
Liu, X.F.[Xue-Feng],
Wei, L.[Longhui],
Tian, Q.[Qi],
Visformer: The Vision-friendly Transformer,
ICCV21(569-578)
IEEE DOI
2203
Convolutional codes, Training, Visualization, Protocols,
Computational modeling, Fitting, Recognition and classification,
Representation learning
BibRef
Yao, Z.L.[Zhu-Liang],
Cao, Y.[Yue],
Lin, Y.T.[Yu-Tong],
Liu, Z.[Ze],
Zhang, Z.[Zheng],
Hu, H.[Han],
Leveraging Batch Normalization for Vision Transformers,
NeurArch21(413-422)
IEEE DOI
2112
Training, Transformers, Feeds
BibRef
Graham, B.[Ben],
El-Nouby, A.[Alaaeldin],
Touvron, H.[Hugo],
Stock, P.[Pierre],
Joulin, A.[Armand],
Jégou, H.[Hervé],
Douze, M.[Matthijs],
LeViT: a Vision Transformer in ConvNet's Clothing for Faster
Inference,
ICCV21(12239-12249)
IEEE DOI
2203
Training, Image resolution, Neural networks,
Parallel processing, Transformers, Feature extraction,
Representation learning
BibRef
Horváth, J.[János],
Baireddy, S.[Sriram],
Hao, H.X.[Han-Xiang],
Montserrat, D.M.[Daniel Mas],
Delp, E.J.[Edward J.],
Manipulation Detection in Satellite Images Using Vision Transformer,
WMF21(1032-1041)
IEEE DOI
2109
BibRef
Earlier: A1, A4, A3, A5, Only:
Manipulation Detection in Satellite Images Using Deep Belief Networks,
WMF20(2832-2840)
IEEE DOI
2008
Image sensors, Satellites, Splicing, Forestry, Tools.
Satellites, Image reconstruction, Training, Forgery,
Heating systems, Feature extraction
BibRef
Beal, J.[Josh],
Wu, H.Y.[Hao-Yu],
Park, D.H.[Dong Huk],
Zhai, A.[Andrew],
Kislyuk, D.[Dmitry],
Billion-Scale Pretraining with Vision Transformers for Multi-Task
Visual Representations,
WACV22(1431-1440)
IEEE DOI
2202
Visualization, Solid modeling, Systematics,
Computational modeling, Transformers,
Semi- and Un- supervised Learning
BibRef
Chapter on Pattern Recognition, Clustering, Statistics, Grammars, Learning, Neural Nets, Genetic Algorithms continues in
Attention in Vision Transformers .