Hu, H.Q.[Hao-Qi],
Lu, X.F.[Xiao-Feng],
Zhang, X.P.[Xin-Peng],
Zhang, T.X.[Tian-Xing],
Sun, G.L.[Guang-Ling],
Inheritance Attention Matrix-Based Universal Adversarial
Perturbations on Vision Transformers,
SPLetters(28), 2021, pp. 1923-1927.
IEEE DOI
2110
Perturbation methods, Robustness, Visualization, Transformers,
Optimization, Task analysis, Head, Vision Transformers, self-attention
BibRef
Xue, Z.X.[Zhi-Xiang],
Tan, X.[Xiong],
Yu, X.[Xuchu],
Liu, B.[Bing],
Yu, A.[Anzhu],
Zhang, P.Q.[Peng-Qiang],
Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data
Classification,
IP(31), 2022, pp. 3095-3110.
IEEE DOI
2205
Feature extraction, Transformers, Hyperspectral imaging,
Laser radar, Data mining, Collaboration, Data models,
cross attention fusion
BibRef
Heo, J.[Jiseong],
Wang, Y.[Yooseung],
Park, J.[Jihun],
Occlusion-aware spatial attention transformer for occluded object
recognition,
PRL(159), 2022, pp. 70-76.
Elsevier DOI
2206
Occluded object recognition, Visual transformer, Spatial attention
BibRef
Yu, X.H.[Xiao-Han],
Wang, J.[Jun],
Zhao, Y.[Yang],
Gao, Y.S.[Yong-Sheng],
Mix-ViT: Mixing attentive vision transformer for ultra-fine-grained
visual categorization,
PR(135), 2023, pp. 109131.
Elsevier DOI
2212
Ultra-fine-grained visual categorization, Vision transformer,
Self-supervised learning, Attentive mixing
BibRef
Wu, G.[Gaojie],
Zheng, W.S.[Wei-Shi],
Lu, Y.T.[Yu-Tong],
Tian, Q.[Qi],
PSLT: A Light-Weight Vision Transformer With Ladder Self-Attention
and Progressive Shift,
PAMI(45), No. 9, September 2023, pp. 11120-11135.
IEEE DOI
2309
BibRef
Li, K.C.[Kun-Chang],
Wang, Y.[Yali],
Zhang, J.H.[Jun-Hao],
Gao, P.[Peng],
Song, G.L.[Guang-Lu],
Liu, Y.[Yu],
Li, H.S.[Hong-Sheng],
Qiao, Y.[Yu],
UniFormer: Unifying Convolution and Self-Attention for Visual
Recognition,
PAMI(45), No. 10, October 2023, pp. 12581-12600.
IEEE DOI
2310
Unifies CNNs and Transformers.
BibRef
Li, H.L.[Hao-Ling],
Xue, M.Q.[Meng-Qi],
Song, J.[Jie],
Zhang, H.F.[Hao-Fei],
Huang, W.Q.[Wen-Qi],
Liang, L.Y.[Ling-Yu],
Song, M.L.[Ming-Li],
Constituent Attention for Vision Transformers,
CVIU(237), 2023, pp. 103838.
Elsevier DOI Code:
WWW Link.
2311
Vision Transformer, Attention mechanism, Classification,
Interpretability for deep learning
BibRef
Qin, R.[Ruiru],
Wang, C.Z.[Chuan-Zhi],
Wu, Y.M.[Yong-Mei],
Du, H.[Huafei],
Lv, M.Y.[Ming-Yun],
A U-Shaped Convolution-Aided Transformer with Double Attention for
Hyperspectral Image Classification,
RS(16), No. 2, 2024, pp. 288.
DOI Link
2402
BibRef
Wang, W.X.[Wen-Xiao],
Chen, W.[Wei],
Qiu, Q.[Qibo],
Chen, L.[Long],
Wu, B.X.[Bo-Xi],
Lin, B.B.[Bin-Bin],
He, X.F.[Xiao-Fei],
Liu, W.[Wei],
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-Scale
Attention,
PAMI(46), No. 5, May 2024, pp. 3123-3136.
IEEE DOI
2404
Transformers, Task analysis, Feature extraction, Visualization,
Object detection, Costs, Adaptation models, Image classification,
vision transformer
BibRef
Zhang, Q.M.[Qi-Ming],
Zhang, J.[Jing],
Xu, Y.F.[Yu-Fei],
Tao, D.C.[Da-Cheng],
Vision Transformer With Quadrangle Attention,
PAMI(46), No. 5, May 2024, pp. 3608-3624.
IEEE DOI
2404
Transformers, Task analysis, Shape, Feature extraction,
Adaptation models, Semantic segmentation, vision transformer
BibRef
Huang, L.[Lan],
Bai, X.Y.[Xing-Yu],
Zeng, J.[Jia],
Yu, M.Q.[Meng-Qiang],
Pang, W.[Wei],
Wang, K.P.[Kang-Ping],
FAM: Improving columnar vision transformer with feature attention
mechanism,
CVIU(242), 2024, pp. 103981.
Elsevier DOI
2404
Vision transformer, Feature adjustment, Network structure improvement
BibRef
Li, M.X.[Ming-Xiu],
Yu, W.[Wei],
Liu, Q.L.[Qing-Lin],
Li, Z.L.[Zong-Lin],
Li, R.[Ru],
Zhong, B.[Bineng],
Zhang, S.P.[Sheng-Ping],
Hybrid Transformers With Attention-Guided Spatial Embeddings for
Makeup Transfer and Removal,
CirSysVideo(34), No. 4, April 2024, pp. 2876-2890.
IEEE DOI
2404
Faces, Feature extraction, Semantics, Transformers, Shape,
Image color analysis, Data mining, Makeup transfer, makeup removal,
vision transformer
BibRef
Nie, X.S.[Xue-Song],
Jin, H.Y.[Hao-Yuan],
Yan, Y.F.[Yun-Feng],
Chen, X.[Xi],
Zhu, Z.H.[Zhi-Hang],
Qi, D.L.[Dong-Lian],
ScopeViT: Scale-Aware Vision Transformer,
PR(153), 2024, pp. 110470.
Elsevier DOI
2405
Vision transformer, Multi-scale features, Efficient attention mechanism
BibRef
Hanyu, T.[Taisei],
Yamazaki, K.[Kashu],
Tran, M.[Minh],
McCann, R.A.[Roy A.],
Liao, H.T.[Hai-Tao],
Rainwater, C.[Chase],
Adkins, M.[Meredith],
Cothren, J.[Jackson],
Le, N.[Ngan],
AerialFormer: Multi-Resolution Transformer for Aerial Image
Segmentation,
RS(16), No. 16, 2024, pp. 2930.
DOI Link
2408
BibRef
Wang, D.Z.[De-Zheng],
Wei, X.Y.[Xiao-Yi],
Chen, C.Y.[Cong-Yan],
CAST: An innovative framework for Cross-dimensional Attention
Structure in Transformers,
PR(159), 2025, pp. 111153.
Elsevier DOI
2412
Cross-dimensional attention structure,
Static attention mechanism, Time series forecasting
BibRef
van Engelenhoven, A.[Adjorn],
Strisciuglio, N.[Nicola],
Talavera, E.[Estefanía],
CAST: Clustering Self-Attention using Surrogate Tokens for Efficient
Transformers,
PRL(186), 2024, pp. 30-36.
Elsevier DOI
2412
Self-attention mechanism, Clustering self-attention mechanism,
Complexity, Efficient transformers, LRA benchmark
BibRef
Zheng, G.Y.[Guang-Yao],
Zang, B.[Bo],
Yang, P.H.[Peng-Hui],
Zhang, W.B.[Wen-Bo],
Li, B.[Bin],
FE-SKViT: A Feature-Enhanced ViT Model with Skip Attention for
Automatic Modulation Recognition,
RS(16), No. 22, 2024, pp. 4204.
DOI Link
2412
BibRef
Lu, J.C.[Jia-Chen],
Zhang, J.G.[Jun-Ge],
Zhu, X.T.[Xia-Tian],
Feng, J.F.[Jian-Feng],
Xiang, T.[Tao],
Zhang, L.[Li],
Softmax-Free Linear Transformers,
IJCV(132), No. 8, August 2024, pp. 3355-3374.
Springer DOI Code:
WWW Link.
2408
Approximates self-attention by a linear function.
BibRef
Li, C.H.[Cheng-Hao],
Zhang, C.N.[Chao-Ning],
Toward a deeper understanding: RetNet viewed through Convolution,
PR(155), 2024, pp. 110625.
Elsevier DOI Code:
WWW Link.
2408
Boosts the local response of ViT.
Convolutional neural network, Vision transformer, RetNet
BibRef
Liao, H.X.[Hui-Xian],
Li, X.S.[Xiao-Sen],
Qin, X.[Xiao],
Wang, W.J.[Wen-Ji],
He, G.D.[Guo-Dui],
Huang, H.J.[Hao-Jie],
Guo, X.[Xu],
Chun, X.[Xin],
Zhang, J.Y.[Jin-Yong],
Fu, Y.Q.[Yun-Qin],
Qin, Z.Y.[Zheng-You],
EPSViTs: A hybrid architecture for image classification based on
parameter-shared multi-head self-attention,
IVC(149), 2024, pp. 105130.
Elsevier DOI
2408
Image classification, Multi-head self-attention,
Parameter-shared, Hybrid architecture
BibRef
Sa, J.W.[Jae-Won],
Ryu, J.[Junhwan],
Kim, H.[Heegon],
ECTFormer: An efficient Conv-Transformer model design for image
recognition,
PR(159), 2025, pp. 111092.
Elsevier DOI
2412
Conv-Transformer network, Lightweight architecture,
Dynamic kernel sizes, Efficient overlapping patchify,
Efficient self-attention mechanism
BibRef
Li, J.F.[Jin-Feng],
Feng, M.L.[Mei-Ling],
Xia, C.Y.[Cheng-Yi],
DBCvT: Double Branch Convolutional Transformer for Medical Image
Classification,
PRL(186), 2024, pp. 250-257.
Elsevier DOI
2412
Convolutional Neural Networks, Transformer, Self-attention,
Channel attention, Medical Image Classification
BibRef
Gong, H.H.[Hui-Hui],
Dong, M.J.[Min-Jing],
Ma, S.Q.[Si-Qi],
Camtepe, S.[Seyit],
Nepal, S.[Surya],
Xu, C.[Chang],
Random Entangled Tokens for Adversarially Robust Vision Transformer,
CVPR24(24554-24563)
IEEE DOI
2410
Training, Computer architecture, Benchmark testing, Transformers,
Robustness, Vision Transformers, Self-Attention Mechanism
BibRef
Lee, S.[Sanghyeok],
Choi, J.[Joonmyung],
Kim, H.W.J.[Hyun-Woo J.],
Multi-Criteria Token Fusion with One-Step-Ahead Attention for
Efficient Vision Transformers,
CVPR24(15741-15750)
IEEE DOI Code:
WWW Link.
2410
Training, Degradation, Costs, Fuses, Computational modeling,
Transformers, Efficient ViTs, Token Fusion, Token Reduction, Token Merging
BibRef
Zhang, S.X.[Shuo-Xi],
Liu, H.P.[Han-Peng],
Lin, S.[Stephen],
He, K.[Kun],
You Only Need Less Attention at Each Stage in Vision Transformers,
CVPR24(6057-6066)
IEEE DOI
2410
Deep learning, Computational modeling,
Transformers, Computational efficiency, efficient training
BibRef
Li, L.[Lujun],
Wei, Z.[Zimian],
Dong, P.[Peijie],
Luo, W.H.[Wen-Han],
Xue, W.[Wei],
Liu, Q.F.[Qi-Feng],
Guo, Y.[Yike],
AttnZero: Efficient Attention Discovery for Vision Transformers,
ECCV24(V: 20-37).
Springer DOI
2412
BibRef
Bao-Long, N.H.[Nguyen-Huu],
Zhang, C.Y.[Chen-Yu],
Shi, Y.Z.[Yu-Zhi],
Hirakawa, T.[Tsubasa],
Yamashita, T.[Takayoshi],
Matsui, T.[Tohgoroh],
Fujiyoshi, H.[Hironobu],
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing
Attention,
ACCV24(X: 445-462).
Springer DOI
2412
BibRef
Yang, X.[Xuan],
Yuan, L.Z.[Liang-Zhe],
Wilber, K.[Kimberly],
Sharma, A.[Astuti],
Gu, X.Y.[Xiu-Ye],
Qiao, S.Y.[Si-Yuan],
Debats, S.[Stephanie],
Wang, H.S.[Hui-Sheng],
Adam, H.[Hartwig],
Sirotenko, M.[Mikhail],
Chen, L.C.[Liang-Chieh],
PolyMaX: General Dense Prediction with Mask Transformer,
WACV24(1039-1050)
IEEE DOI
2404
Codes, Image synthesis, Semantic segmentation, Estimation,
Computer architecture, Benchmark testing, Algorithms,
Image recognition and understanding
BibRef
Nie, X.S.[Xue-Song],
Chen, X.[Xi],
Jin, H.Y.[Hao-Yuan],
Zhu, Z.H.[Zhi-Hang],
Yan, Y.F.[Yun-Feng],
Qi, D.L.[Dong-Lian],
Triplet Attention Transformer for Spatiotemporal Predictive Learning,
WACV24(7021-7030)
IEEE DOI
2404
Computational modeling, Self-supervised learning,
Predictive models, Parallel processing, Transformers, Algorithms
BibRef
Cai, H.[Han],
Li, J.[Junyan],
Hu, M.[Muyan],
Gan, C.[Chuang],
Han, S.[Song],
EfficientViT: Lightweight Multi-Scale Attention for High-Resolution
Dense Prediction,
ICCV23(17256-17267)
IEEE DOI
2401
BibRef
Ryu, J.[Jongbin],
Han, D.Y.[Dong-Yoon],
Lim, J.W.[Jong-Woo],
Gramian Attention Heads are Strong yet Efficient Vision Learners,
ICCV23(5818-5828)
IEEE DOI Code:
WWW Link.
2401
BibRef
Xu, R.H.[Rui-Han],
Zhang, H.[Haokui],
Hu, W.Z.[Wen-Ze],
Zhang, S.L.[Shi-Liang],
Wang, X.Y.[Xiao-Yu],
ParCNetV2: Oversized Kernel with Enhanced Attention,
ICCV23(5729-5739)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zhao, B.Y.[Bing-Yin],
Yu, Z.[Zhiding],
Lan, S.Y.[Shi-Yi],
Cheng, Y.[Yutao],
Anandkumar, A.[Anima],
Lao, Y.J.[Ying-Jie],
Alvarez, J.M.[Jose M.],
Fully Attentional Networks with Self-emerging Token Labeling,
ICCV23(5562-5572)
IEEE DOI
2401
BibRef
Guo, Y.[Yong],
Stutz, D.[David],
Schiele, B.[Bernt],
Robustifying Token Attention for Vision Transformers,
ICCV23(17511-17522)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zhao, Y.[Youpeng],
Tang, H.D.[Hua-Dong],
Jiang, Y.Y.[Ying-Ying],
A, Y.[Yong],
Wu, Q.[Qiang],
Wang, J.[Jun],
Parameter-Efficient Vision Transformer with Linear Attention,
ICIP23(1275-1279)
IEEE DOI
2312
BibRef
Shi, L.[Lili],
Huang, H.D.[Hai-Duo],
Song, B.[Bowei],
Tan, M.[Meng],
Zhao, W.Z.[Wen-Zhe],
Xia, T.[Tian],
Ren, P.J.[Peng-Ju],
TAQ: Top-K Attention-Aware Quantization for Vision Transformers,
ICIP23(1750-1754)
IEEE DOI
2312
BibRef
Baili, N.[Nada],
Frigui, H.[Hichem],
ADA-VIT: Attention-Guided Data Augmentation for Vision Transformers,
ICIP23(385-389)
IEEE DOI
2312
BibRef
Ding, M.Y.[Ming-Yu],
Shen, Y.[Yikang],
Fan, L.J.[Li-Jie],
Chen, Z.F.[Zhen-Fang],
Chen, Z.[Zitian],
Luo, P.[Ping],
Tenenbaum, J.[Josh],
Gan, C.[Chuang],
Visual Dependency Transformers:
Dependency Tree Emerges from Reversed Attention,
CVPR23(14528-14539)
IEEE DOI
2309
BibRef
Song, J.C.[Jie-Chong],
Mou, C.[Chong],
Wang, S.Q.[Shi-Qi],
Ma, S.W.[Si-Wei],
Zhang, J.[Jian],
Optimization-Inspired Cross-Attention Transformer for Compressive
Sensing,
CVPR23(6174-6184)
IEEE DOI
2309
BibRef
Hassani, A.[Ali],
Walton, S.[Steven],
Li, J.C.[Jia-Chen],
Li, S.[Shen],
Shi, H.[Humphrey],
Neighborhood Attention Transformer,
CVPR23(6185-6194)
IEEE DOI
2309
BibRef
Liu, Z.J.[Zhi-Jian],
Yang, X.Y.[Xin-Yu],
Tang, H.T.[Hao-Tian],
Yang, S.[Shang],
Han, S.[Song],
FlatFormer: Flattened Window Attention for Efficient Point Cloud
Transformer,
CVPR23(1200-1211)
IEEE DOI
2309
BibRef
Pan, X.[Xuran],
Ye, T.Z.[Tian-Zhu],
Xia, Z.F.[Zhuo-Fan],
Song, S.[Shiji],
Huang, G.[Gao],
Slide-Transformer: Hierarchical Vision Transformer with Local
Self-Attention,
CVPR23(2082-2091)
IEEE DOI
2309
BibRef
Zhu, L.[Lei],
Wang, X.J.[Xin-Jiang],
Ke, Z.H.[Zhang-Han],
Zhang, W.[Wayne],
Lau, R.[Rynson],
BiFormer: Vision Transformer with Bi-Level Routing Attention,
CVPR23(10323-10333)
IEEE DOI
2309
BibRef
Long, S.[Sifan],
Zhao, Z.[Zhen],
Pi, J.[Jimin],
Wang, S.S.[Sheng-Sheng],
Wang, J.D.[Jing-Dong],
Beyond Attentive Tokens: Incorporating Token Importance and Diversity
for Efficient Vision Transformers,
CVPR23(10334-10343)
IEEE DOI
2309
BibRef
Liu, X.Y.[Xin-Yu],
Peng, H.[Houwen],
Zheng, N.X.[Ning-Xin],
Yang, Y.Q.[Yu-Qing],
Hu, H.[Han],
Yuan, Y.X.[Yi-Xuan],
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group
Attention,
CVPR23(14420-14430)
IEEE DOI
2309
BibRef
You, H.R.[Hao-Ran],
Xiong, Y.[Yunyang],
Dai, X.L.[Xiao-Liang],
Wu, B.[Bichen],
Zhang, P.Z.[Pei-Zhao],
Fan, H.Q.[Hao-Qi],
Vajda, P.[Peter],
Lin, Y.Y.C.[Ying-Yan Celine],
Castling-ViT: Compressing Self-Attention via Switching Towards
Linear-Angular Attention at Vision Transformer Inference,
CVPR23(14431-14442)
IEEE DOI
2309
BibRef
Grainger, R.[Ryan],
Paniagua, T.[Thomas],
Song, X.[Xi],
Cuntoor, N.[Naresh],
Lee, M.W.[Mun Wai],
Wu, T.F.[Tian-Fu],
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers,
CVPR23(18568-18578)
IEEE DOI
2309
BibRef
Wei, C.[Cong],
Duke, B.[Brendan],
Jiang, R.[Ruowei],
Aarabi, P.[Parham],
Taylor, G.W.[Graham W.],
Shkurti, F.[Florian],
Sparsifiner: Learning Sparse Instance-Dependent Attention for
Efficient Vision Transformers,
CVPR23(22680-22689)
IEEE DOI
2309
BibRef
Bhattacharyya, M.[Mayukh],
Chattopadhyay, S.[Soumitri],
Nag, S.[Sayan],
DeCAtt: Efficient Vision Transformers with Decorrelated Attention
Heads,
ECV23(4695-4699)
IEEE DOI
2309
BibRef
Zhang, Y.[Yuke],
Chen, D.[Dake],
Kundu, S.[Souvik],
Li, C.H.[Cheng-Hao],
Beerel, P.A.[Peter A.],
SAL-ViT: Towards Latency Efficient Private Inference on ViT using
Selective Attention Search with a Learnable Softmax Approximation,
ICCV23(5093-5102)
IEEE DOI
2401
BibRef
Yeganeh, Y.[Yousef],
Farshad, A.[Azade],
Weinberger, P.[Peter],
Ahmadi, S.A.[Seyed-Ahmad],
Adeli, E.[Ehsan],
Navab, N.[Nassir],
Transformers Pay Attention to Convolutions Leveraging Emerging
Properties of ViTs by Dual Attention-Image Network,
CVAMD23(2296-2307)
IEEE DOI
2401
BibRef
Zheng, J.H.[Jia-Hao],
Yang, L.Q.[Long-Qi],
Li, Y.[Yiying],
Yang, K.[Ke],
Wang, Z.Y.[Zhi-Yuan],
Zhou, J.[Jun],
Lightweight Vision Transformer with Spatial and Channel Enhanced
Self-Attention,
REDLCV23(1484-1488)
IEEE DOI
2401
BibRef
Hyeon-Woo, N.[Nam],
Yu-Ji, K.[Kim],
Heo, B.[Byeongho],
Han, D.Y.[Dong-Yoon],
Oh, S.J.[Seong Joon],
Oh, T.H.[Tae-Hyun],
Scratching Visual Transformer's Back with Uniform Attention,
ICCV23(5784-5795)
IEEE DOI
2401
BibRef
Zhang, H.K.[Hao-Kui],
Hu, W.Z.[Wen-Ze],
Wang, X.Y.[Xiao-Yu],
FcaFormer: Forward Cross Attention in Hybrid Vision Transformer,
ICCV23(6037-6046)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zeng, W.X.[Wen-Xuan],
Li, M.[Meng],
Xiong, W.J.[Wen-Jie],
Tong, T.[Tong],
Lu, W.J.[Wen-Jie],
Tan, J.[Jin],
Wang, R.S.[Run-Sheng],
Huang, R.[Ru],
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision
Transformer with Heterogeneous Attention,
ICCV23(5029-5040)
IEEE DOI Code:
WWW Link.
2401
BibRef
Psomas, B.[Bill],
Kakogeorgiou, I.[Ioannis],
Karantzalos, K.[Konstantinos],
Avrithis, Y.[Yannis],
Keep It SimPool: Who Said Supervised Transformers Suffer from
Attention Deficit?,
ICCV23(5327-5337)
IEEE DOI Code:
WWW Link.
2401
BibRef
Han, D.C.[Dong-Chen],
Pan, X.[Xuran],
Han, Y.Z.[Yi-Zeng],
Song, S.[Shiji],
Huang, G.[Gao],
FLatten Transformer: Vision Transformer using Focused Linear
Attention,
ICCV23(5938-5948)
IEEE DOI Code:
WWW Link.
2401
BibRef
Tatsunami, Y.[Yuki],
Taki, M.[Masato],
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial
Locality?,
ACCV22(VI:459-475).
Springer DOI
2307
WWW Link.
Addresses computational complexity.
BibRef
Bolya, D.[Daniel],
Fu, C.Y.[Cheng-Yang],
Dai, X.L.[Xiao-Liang],
Zhang, P.Z.[Pei-Zhao],
Hoffman, J.[Judy],
Hydra Attention: Efficient Attention with Many Heads,
CADL22(35-49).
Springer DOI
2304
Transformer computation explodes with large images; addressed with multiple heads.
BibRef
Chen, X.Y.[Xiang-Yu],
Hu, Q.[Qinghao],
Li, K.[Kaidong],
Zhong, C.[Cuncong],
Wang, G.H.[Guang-Hui],
Accumulated Trivial Attention Matters in Vision Transformers on Small
Datasets,
WACV23(3973-3981)
IEEE DOI
2302
Codes, Focusing, Transformers, Convolutional neural networks,
Task analysis, Algorithms: Machine learning architectures,
and algorithms (including transfer)
BibRef
Lan, H.[Hai],
Wang, X.[Xihao],
Shen, H.[Hao],
Liang, P.[Peidong],
Wei, X.[Xian],
Couplformer: Rethinking Vision Transformer with Coupling Attention,
WACV23(6464-6473)
IEEE DOI
2302
Couplings, Visualization, Image segmentation,
Computational modeling, Memory management, Object detection
BibRef
Debnath, B.[Biplob],
Po, O.[Oliver],
Chowdhury, F.A.[Farhan Asif],
Chakradhar, S.[Srimat],
Cosine Similarity based Few-Shot Video Classifier with
Attention-based Aggregation,
ICPR22(1273-1279)
IEEE DOI
2212
Training, Head, Pipelines, Benchmark testing, Feature extraction,
Transformers
BibRef
Mari, C.R.[Carlos Roig],
Gonzalez, D.V.[David Varas],
Bou-Balust, E.[Elisenda],
Multi-Scale Transformer-Based Feature Combination for Image Retrieval,
ICIP22(3166-3170)
IEEE DOI
2211
Visualization, Semantics, Image retrieval, Feature extraction,
Transformers, Internet, Attention, Multi-scale,
Feature combination
BibRef
Furukawa, R.[Ryouichi],
Hotta, K.[Kazuhiro],
Local Embedding for Axial Attention,
ICIP22(2586-2590)
IEEE DOI
2211
Deep learning, Image segmentation, Visualization,
Computational modeling, Neural networks, Transformers
BibRef
Ding, M.Y.[Ming-Yu],
Xiao, B.[Bin],
Codella, N.[Noel],
Luo, P.[Ping],
Wang, J.D.[Jing-Dong],
Yuan, L.[Lu],
DaViT: Dual Attention Vision Transformers,
ECCV22(XXIV:74-92).
Springer DOI
2211
BibRef
Wang, P.C.[Pi-Chao],
Wang, X.[Xue],
Wang, F.[Fan],
Lin, M.[Ming],
Chang, S.N.[Shu-Ning],
Li, H.[Hao],
Jin, R.[Rong],
KVT: k-NN Attention for Boosting Vision Transformers,
ECCV22(XXIV:285-302).
Springer DOI
2211
BibRef
Rao, Y.M.[Yong-Ming],
Zhao, W.L.[Wen-Liang],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
AMixer:
Adaptive Weight Mixing for Self-Attention Free Vision Transformers,
ECCV22(XXI:50-67).
Springer DOI
2211
BibRef
Li, A.[Ang],
Jiao, J.[Jichao],
Li, N.[Ning],
Qi, W.[Wangjing],
Xu, W.[Wei],
Pang, M.[Min],
ConMW Transformer: A General Vision Transformer Backbone With
Merged-Window Attention,
ICIP22(1551-1555)
IEEE DOI
2211
Image resolution, Convolution, Transformers, Feature extraction, Tokenization,
Computational efficiency, Vision Transformer, hybrid architecture
BibRef
Zhang, Q.M.[Qi-Ming],
Xu, Y.F.[Yu-Fei],
Zhang, J.[Jing],
Tao, D.C.[Da-Cheng],
VSA: Learning Varied-Size Window Attention in Vision Transformers,
ECCV22(XXV:466-483).
Springer DOI
2211
BibRef
Mallick, R.[Rupayan],
Benois-Pineau, J.[Jenny],
Zemmari, A.[Akka],
I Saw: A Self-Attention Weighted Method for Explanation of Visual
Transformers,
ICIP22(3271-3275)
IEEE DOI
2211
Measurement, Correlation coefficient, Visualization,
Image segmentation, Databases, Object detection, Transformers,
Gaze Fixation Density Maps
BibRef
Song, Z.K.[Zi-Kai],
Yu, J.Q.[Jun-Qing],
Chen, Y.P.P.[Yi-Ping Phoebe],
Yang, W.[Wei],
Transformer Tracking with Cyclic Shifting Window Attention,
CVPR22(8781-8790)
IEEE DOI
2210
WWW Link.
Visualization, Target tracking, Image recognition,
Optimization methods, Benchmark testing
BibRef
Yang, C.L.[Cheng-Lin],
Wang, Y.L.[Yi-Lin],
Zhang, J.M.[Jian-Ming],
Zhang, H.[He],
Wei, Z.J.[Zi-Jun],
Lin, Z.[Zhe],
Yuille, A.L.[Alan L.],
Lite Vision Transformer with Enhanced Self-Attention,
CVPR22(11988-11998)
IEEE DOI
2210
Convolutional codes, Image segmentation, Visualization,
Convolution, Semantics, Merging, Predictive models,
Deep learning architectures and techniques
BibRef
Xia, Z.F.[Zhuo-Fan],
Pan, X.[Xuran],
Song, S.[Shiji],
Li, L.E.[Li Erran],
Huang, G.[Gao],
Vision Transformer with Deformable Attention,
CVPR22(4784-4793)
IEEE DOI
2210
Deformable models, Adaptation models, Computational modeling,
Predictive models, Transformers, Data models,
grouping and shape analysis
BibRef
Yu, T.[Tong],
Khalitov, R.[Ruslan],
Cheng, L.[Lei],
Yang, Z.R.[Zhi-Rong],
Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better
than Dot-Product Self-Attention,
CVPR22(681-690)
IEEE DOI
2210
Protocols, Costs, Scalability, Neural networks, Stacking, Genomics,
Transformers, Deep learning architectures and techniques,
Representation learning
BibRef
Cheng, B.[Bowen],
Misra, I.[Ishan],
Schwing, A.G.[Alexander G.],
Kirillov, A.[Alexander],
Girdhar, R.[Rohit],
Masked-attention Mask Transformer for Universal Image Segmentation,
CVPR22(1280-1289)
IEEE DOI
2210
Image segmentation, Shape, Computational modeling, Semantics,
Transformers, Feature extraction, retrieval
BibRef
Rangrej, S.B.[Samrudhdhi B.],
Srinidhi, C.L.[Chetan L.],
Clark, J.J.[James J.],
Consistency driven Sequential Transformers Attention Model for
Partially Observable Scenes,
CVPR22(2508-2517)
IEEE DOI
2210
Training, Computational modeling, Imaging, Predictive models,
Transformers, Prediction algorithms, Visual reasoning
BibRef
Chen, C.F.R.[Chun-Fu Richard],
Fan, Q.F.[Quan-Fu],
Panda, R.[Rameswar],
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image
Classification,
ICCV21(347-356)
IEEE DOI
2203
Image segmentation, Image recognition, Computational modeling,
Semantics, Memory management, Object detection, Representation learning
BibRef
Chefer, H.[Hila],
Gur, S.[Shir],
Wolf, L.B.[Lior B.],
Generic Attention-model Explainability for Interpreting Bi-Modal and
Encoder-Decoder Transformers,
ICCV21(387-396)
IEEE DOI
2203
Measurement, Visualization, Image segmentation,
Computational modeling, Object detection
BibRef
Xu, W.J.[Wei-Jian],
Xu, Y.F.[Yi-Fan],
Chang, T.[Tyler],
Tu, Z.W.[Zhuo-Wen],
Co-Scale Conv-Attentional Image Transformers,
ICCV21(9961-9970)
IEEE DOI
2203
Image segmentation, Computational modeling, Object detection,
Transformers, Convolutional neural networks, Task analysis,
Recognition and classification
BibRef
Yang, G.L.[Guang-Lei],
Tang, H.[Hao],
Ding, M.L.[Ming-Li],
Sebe, N.[Nicu],
Ricci, E.[Elisa],
Transformer-Based Attention Networks for Continuous Pixel-Wise
Prediction,
ICCV21(16249-16259)
IEEE DOI
2203
Correlation, Estimation, Logic gates,
Transformers, Natural language processing,
Vision applications and systems
BibRef
Kim, K.[Kyungmin],
Wu, B.C.[Bi-Chen],
Dai, X.L.[Xiao-Liang],
Zhang, P.Z.[Pei-Zhao],
Yan, Z.C.[Zhi-Cheng],
Vajda, P.[Peter],
Kim, S.[Seon],
Rethinking the Self-Attention in Vision Transformers,
ECV21(3065-3069)
IEEE DOI
2109
Computational modeling
BibRef
Chapter on Pattern Recognition, Clustering, Statistics, Grammars, Learning, Neural Nets, Genetic Algorithms continues in
Video Transformers.