Journals starting with wacv

* Assessing the Authorship Confidence of Handwritten Items
* Achieving Accurate Colour Image Segmentation in 2D and 3D with LVQ Classifiers and Partial ACSR
* Agent Network for Microburst Detection, An
* Analysis of traffic flow in urban areas using web cameras
* Augmented Geophysical Data Interpretation Through Automated Velocity Picking in Semblance Velocity Images
* Automated Analysis of DNA Hybridization Images
* Automated Performance Evaluation of Range Image Segmentation
* Automatic Description of Complex Buildings with Multiple Images
* Automatic Image Segmentation and Classification Using On-line Shape Learning
* Bootstrap a Statistical Brain Atlas
* Creating 3D Models with Uncalibrated Cameras
* Customizable MPEG-4 Face Player Using Real-time 2D Image Sequence
* Decision Combination of Multiple Classifiers for Pattern Classification: Hybridisation of Majority Voting and Divide and Conquer Techniques
* Detection of Side-View Faces in Color Images
* Film scratch removal using Kalman filtering and Bayesian restoration
* Fingerprint image matching by minimization of a thin-plate energy using a two-step iterative algorithm with auxiliary variables
* Flame recognition in video
* Flexible Eyetracker for Psychological Applications, A
* Global Matching Criterion and Color Segmentation Based Stereo
* Head tracking using stereo
* Image Sequence Analysis for Real-Time Underwater Cable Tracking
* Learning Partitioned Least Squares Filters for Fingerprint Enhancement
* Modeling 3-D Complex Buildings With User Assistance
* Motion Segmentation and Pose Recognition with Motion History Gradients
* Multimodal Image Registration Using Local Frequency
* Real-time Visually Guided Human Figure Control Using IK-based Motion Synthesis
* Registration of Technical Drawings and Calibrated Images for Industrial Augmented Reality
* Removal of Interfering Strokes in Double-Sided Document Images
* Restoration of Multiple Images with Motion Blur in Different Directions
* Robust Background Subtraction Method for Changing Background, A
* Robust Fingerprint Authentication Using Local Structural Similarity
* Scheme for Road Extraction in Rural Areas and its Evaluation, A
* Synthesized Virtual View-Based EigenSpace for Face Recognition
* Tracking and Pose Estimation for Computer Assisted Localization in Industrial Environment
* Urban Street Grid Description and Verification
* View Cube: An Efficient Method of View Planning for 3d Modelling from Range Data, The
37 for WACV00

* 3D model based gesture acquisition using a single camera
* Active facial tracking for fatigue detection
* Activity maps for location-aware computing
* Adaptive aperture control for image acquisition
* Appearance-based eye gaze estimation
* Arm gesture detection in a classroom environment
* Attentive billboards: towards to video based customer behavior understanding
* augmented-reality interface for telerobotic applications, An
* Automatic detection of signs with affine transformation
* Automatic pose estimation of complex 3D building models
* Boosting image orientation detection with indoor vs. outdoor classification
* CV-SDF - a model for real-time computer vision applications
* Dense disparity maps in real-time with an application to augmented reality
* Development and analysis of a real-time human motion tracking system
* Does colorspace transformation make any difference on skin detection?
* droplet virtual brush for Chinese calligraphic character modeling, The
* Dynamical road modeling and matching for direct visual navigation
* Evaluation of tracking methods for human-computer interaction
* experimental evaluation of linear and kernel-based methods for face recognition, An
* Eye typing using Markov and active appearance models
* Face model adaptation using robust matching and active appearance models
* Fast and robust planar registration by early consensus with applications to document stitching
* FASU: A full automatic segmenting system for ultrasound images
* Fingerprint verification using genetic algorithms
* Foreground object detection in changing background based on color co-occurrence statistics
* Genetic feature subset selection for gender classification: A comparison study
* Group behavior recognition with multiple cameras
* kernel logic approach for face and non-face classification, A
* Kinematic-based human motion analysis in infrared sequences
* Knowledge-based registration & segmentation of the left ventricle: A level set approach
* Mean-shift segmentation with wavelet-based bandwidth selection
* model-driven method of estimating the state of clothes for manipulating it, A
* Monocular, vision based, autonomous refueling system
* Mosaic generation for under vehicle inspection
* Multi-sensor super-resolution
* Multi-view face detection with FloatBoost
* Multimodal human-computer interaction for crisis management systems
* Object detection method based on local kernels and automatic kernel selection by Kullback-Leibler divergence
* Optimal motion estimation from visual and inertial measurements
* Pattern alignment method based on consistency among local registration candidates for LSI wafer pattern inspection
* PDA-based face recognition system, A
* Pose estimation and integration for complete 3D model reconstruction
* Range synthesis for 3D environment modeling
* real-time precrash vehicle detection system, A
* Retrieving faces by the PIFS fractal code
* Robust and efficient detection of non-lint material in cotton fiber samples
* Segmentation of complex buildings from aerial images and 3D surface reconstruction
* Segmentation of myocardium using velocity field constrained front propagation
* Single-frame super-resolution by a cortex based mechanism using high level visual features in natural images
* Skin-color extraction in images with complex background and varying illumination
* Towards automatic analysis of DNA microarrays
* unified prediction method for heterogeneous weather radar patterns, A
* Using attribute trees to analyse auroral appearance over Canada
* Video de-abstraction or how to save money on your wedding video
* Viewing enhancement in video-endoscopy
56 for WACV02

* 3D Pose Estimation of Cactus Leaves using an Active Shape Model
* 3D Recognition and Segmentation of Objects in Cluttered Scenes
* Acquiring Multi-Scale Images by Pan-Tilt-Zoom Control and Automatic Multi-Camera Calibration
* Activity Recognition using Visual Tracking and RFID
* Automated Extraction of Microtubules and Their Plus-Ends
* Automated Microaneurysm Segmentation and Detection using Generalized Eigenvectors
* Automatic 2D Hand Tracking in Video Sequences
* Automatic Augmentation and Meshing of Sparse 3D Scene Structure
* Automatic In Situ Identification of Plankton
* Automatic Segmentation of Abdominal Fat from CT Data
* Class-Specific Color Camera Calibration with Application to Object Recognition
* Comparison of Machine Learned Image Interpretation Systems in the Domain of Forestry
* Contour Matching for 3D Ear Recognition
* Creating Realistic Shadows of Composited Objects
* Deformation Analysis for 3D Face Matching
* Dental Biometrics: Alignment and Matching of Dental Radiographs
* Detecting Motion Patterns via Direction Maps with Application to Surveillance
* Dynamic Hidden Markov Random Field Model for Foreground and Shadow Segmentation, A
* Ensemble Methods in the Clustering of String Patterns
* Epiflow Quadruplet Matching: Enforcing Epipolar Geometry for Spatio-Temporal Stereo Correspondences
* Evaluation of Matching Metrics for Trajectory-Based Indexing and Retrieval of Video Clips
* Evolutionary Feature Synthesis for Image Databases
* Extracting the Optic Disk Endpoints in Optical Coherence Tomography Data
* Extraction of Rhythmical Factors on Dance Actions Through Motion Analysis
* Facial Expression Hallucination
* Fast Multi-Modal Approach to Facial Feature Detection, A
* Fingerprint Deformation Models Using Minutiae Locations and Orientations
* Gait Verification Using Probabilistic Methods
* Generating Verbal Descriptions of Colored Objects: Towards Grounding Language in Perception
* Hierarchical Approach to Sign Recognition, A
* High-Resolution Video Synthesis from Mixed-Resolution Video Based on the Estimate-and-Correct Method
* Hybrid IMM/SVM Approach for Wavelet-Domain Probabilistic Model Based Texture Classification
* Image Registration with Uncalibrated Cameras in Hybrid Vision Systems
* Image Segmentation by Unsupervised Sparse Clustering
* In Vivo Quantitative Evaluation of Skin Ageing by Capacitance Image Analysis
* Incorporating Background Invariance into Feature-Based Object Recognition
* Integrating Range and Texture Information for 3D Face Recognition
* Learning the Behavior of Users in a Public Space through Video Tracking
* Learning to Detect Small Impact Craters
* Mapping the Physical Properties of Cosmic Hot Gas with Hyper-Spectral Imaging
* Measures of Similarity
* Model-Based Interactive Object Segmentation Procedure, A
* Motion Layer Based Object Removal in Videos
* Multi-Matching Process Based on Wavelet Transform for Iris Recognition
* Multi-Modal Human Identification System
* Multi-Target Tracking Using Hybrid Particle Filtering
* Multi-View Face Tracking with Factorial and Switching HMM
* Multilevel Spectral Partitioning for Efficient Image Segmentation and Tracking
* Novel Approaches for Minutiae Verification in Fingerprint Images
* Patchlets: Representing Stereo Vision Data with Surface Elements
* Pre-Attentive Face Detection for Foveated Wide-Field Surveillance
* Predictive and Probabilistic Tracking to Detect Stopped Vehicles
* Real-Time Rodent Tracking System for Both Light and Dark Cycle Behavior Analysis, A
* Real-Time System for Monitoring Pedestrians, A
* Realtime Road Detection by Learning from One Example
* Reliable Automatic Calibration of a Marker-Based Position Tracking System
* Requirements for Camera Calibration: Must Accuracy Come with a High Price?
* Robust and Real-Time Image Stabilization and Rectification
* Robust Fingerprint Matching Method, A
* Robust Framework For Eigenspace Image Reconstruction, A
* Robust Metric and Alignment for Profile-Based Face Recognition: An Experimental Comparison
* Semi-Supervised Self-Training of Object Detection Models
* Shadow Detection by Combined Photometric Invariants for Improved Foreground Segmentation
* Shared Features for Scalable Appearance-Based Object Recognition
* Smart Bookshelf: A Study of Camera Projector Scene Augmentation of an Everyday Environment, The
* Stereo-Based Tree Traversability Analysis for Autonomous Off-Road Navigation
* Surface Reconstruction for Computer Vision-Based Craniofacial Surgery
* Texture Defect Detection Using Support Vector Machines with Adaptive Gabor Wavelet Features
* Two-Stage Template Approach to Person Detection in Thermal Imagery, A
* Using Co-Occurrence and Segmentation to Learn Feature-Based Object Models from Video
* Using Continuous Face Verification to Improve Desktop Security
* Vehicle Speed Detection and Identification from a Single Motion Blurred Image
* Virtual Forces for Camera Planning in Smart Vision Systems
* Visual Hull Construction Using Adaptive Sampling
* Visual Integration from Multiple Cameras
* Wavelet-Based Approach for Skeleton Extraction
77 for WACV05

* 2DCCA: A Novel Method for Small Sample Size Face Recognition
* 3D Finite Element Modeling of Nonrigid Breast Deformation for Feature Registration in X-ray and MR Images
* 3D Hand Model Fitting for Virtual Keyboard System
* Accurate Mutual Information-based Registration of Digitized, An
* Automated Insect Identification through Concatenated Histograms of Local Appearance Features
* Automatic Deformation Detection for Visual Post Inspection
* Automatic Extraction of Femur Contours from Calibrated Fluoroscopic Images
* Building Adaptive Camera Models for Video Surveillance
* Camera-Driven Interactive Table, The
* Chart Image Classification Using Multiple-Instance Learning
* Clustering Billions of Images with Large Scale Nearest Neighbor Search
* Dense Surface from Infrared Stereo
* Drifting-proof Framework for Tracking and Online Appearance Learning, A
* Egomotion Estimation in Monocular Infra-red Image Sequence for Night Vision Applications
* Extraction of Person Silhouettes from Surveillance Imagery using MRFs
* Facial Range Image Matching Using the Complex Wavelet Structural Similarity Metric
* Facial Strain Pattern as a Soft Forensic Evidence
* Fast and Accurate Tensor-based Optical Flow Algorithm Implemented in FPGA, A
* Fast Multi-scale Template Matching Using Binary Features
* Feature-based Part Retrieval for Interactive 3D Reassembly
* Geodesic Active Contour Based Fusion of Visible and Infrared Video for Persistent Object Tracking
* Geometric and Timing Calibration for Unsynchronized Cameras Using Trajectories of a Moving Marker
* Hairline Fracture Detection using MRF and Gibbs Sampling
* hMouse: Head Tracking Driven Virtual Computer Mouse
* Homography-based Analysis of People and Vehicle Activities in Crowded Scenes
* Human Pose Inference from Stereo Cameras
* Identity Verification Via the 3DID Face Alignment System
* Image Segmentation of Overlapping Particles in Automatic Size Analysis Using Multi-Flash Imaging
* Interactive Image Repair with Assisted Structure and Texture Completion
* Local Graph Matching for Face Recognition
* Localization and Mapping for Autonomous Navigation in Outdoor Terrains: A Stereo Vision Approach
* Maneuvering Aid for Large Vehicle using Omnidirectional Cameras
* Map-Enhanced UAV Image Sequence Registration
* Motion Estimation Using a General Purpose Neural Network Simulator for Visual Attention
* New Affine Registration Algorithm for Matching 2D Point Sets, A
* Nonlinear Mean Shift for Robust Pose Estimation
* Novel Algorithm for Estimating Vehicle Speed from Two Consecutive Images, A
* Object Categorization Robust to Surface Markings using Entropy-guided Codebook
* On Channel Reliability Measure Training for Multi-Camera Face Recognition
* Performance Evaluation of Vision-Based Navigation and Landing on a Rotorcraft Unmanned Aerial Vehicle
* Pose Estimation Based on Two Images from Different Views
* Probabilistic Hierarchical Face Model for Feature Localization
* Real-time Detection of Semi-transparent Watermarks in Decompressed Video
* Recovery of 3D Pose of Bones in Single 2D X-ray Images
* Robust Dissolve Detection Using Local Feature Tracking
* Segmenting Biological Particles in Multispectral Microscopy Images
* Stereo Matching and 3D Visualization for Gamma-Ray Cargo Inspection
* Threshold-based 3D Tumor Segmentation using Level Set (TSL)
* Tracking 3D Human Motion in Compact Base Space
* Two-stage Algorithm for Shoreline Detection, A
* Using Image Flow to Detect Eye Blinks in Color Videos
* Vector field characterization in ERS-1 imagery of sea ice
* Video-based Metrology of Water Droplet Spreading on Nanostructured Surfaces
* VIM: Vision for Interactive Music
* Vision System for Monitoring Intermodal Freight Trains, A
* Warped Document Image Restoration Using Shape-from-Shading and Physically-Based Modeling
57 for WACV07

* Adaptive Fusion of Gait and Face for Human Identification in Video
* Background Subtraction for Temporally Irregular Dynamic Textures
* Cascading Trilinear Tensors for Face Authentication
* Cata-Fisheye Camera for Panoramic Imaging
* Channel Segmentation using Confidence and Curvature-Guided Level Sets on Noisy Seismic Images
* Color Enhancement in Image Fusion
* Color Photometric Stereo for Albedo and Shape Reconstruction
* Computer Vision based Whiteboard Capture System, A
* Distributed Visual Processing for a Home Visual Sensor Network
* EAVA: A 3D Emotive Audio-Visual Avatar
* Efficient Active Camera Model for Video Surveillance, An
* Event-Driven Visual Sensor Networks: Issues in Reliability
* Explanation-Based Object Recognition
* Fast and Fully Automatic Ear Detection Using Cascaded AdaBoost
* Flexible Edge Arrangement Templates for Object Detection
* Hierarchical Scheme for Rapid Video Copy Detection, A
* Human Pose Estimation with Rotated Geometric Blur
* Image Rendering Based on a Spatial Extension of the CIECAM02
* Innovative Model of Tempo and Its Application in Action Scene Detection for Movie Analysis, An
* INSPEC2T: Inexpensive Spectrometer Color Camera Technology
* Interactive Portrait Art
* Intraoperative Visualization of Anatomical Targets in Retinal Surgery
* Iris Extraction Based on Intensity Gradient and Texture Difference
* Learning Optimal Compact Codebook for Efficient Object Categorization
* Likelihood Map Fusion for Visual Object Tracking
* Localization and Segmentation of A 2D High Capacity Color Barcode
* Locally Adjusted Robust Regression for Human Age Estimation
* Location-based Services using Image Search
* Mosaicfaces: a discrete representation for face recognition
* Multi-Pose Face Detection with Asymmetric Haar Features
* non parametric approach for modeling interferometric SAR imagery and applications, A
* Novel Method for Power Saving in a Surveillance Environment, A
* Novel Thresholding Approach to Background Subtraction, A
* Object Categorization Based on Kernel Principal Component Analysis of Visual Words
* Online Character Recognition using Regression Techniques
* Online/Realtime Structure and Motion for General Camera Models
* Qualitative Assessment of Video Stabilization and Mosaicking Systems
* Real-time Robust Mapping for an Autonomous Surface Vehicle using an Omnidirectional Camera
* Recognition of Human Actions using an Optimal Control Based Motor Model
* Recovering Social Networks From Massive Track Datasets
* Robust 6DOF Motion Estimation for Non-Overlapping, Multi-Camera Systems
* Robust Human Pose Recognition Using Unlabelled Markers
* Segmentation of Salient Regions in Outdoor Scenes Using Imagery and 3-D Data
* Self Calibrating Visual Sensor Networks
* Single View Metrology: A Practical Example
* Toward Fully Automatic Geo-Location and Geo-Orientation of Static Outdoor Cameras
* Tracking and Segmentation of Highway Vehicles in Cluttered and Crowded Scenes
* Tracking Down Under: Following the Satin Bowerbird
* Urban building recognition during significant temporal variations
* Variational Transform Invariant Mixture of Probabilistic PCA
* Vision-Based System For Automatic Detection and Extraction Of Road Networks, A
52 for WACV08

* 3-D model based vehicle recognition
* 3D segmentation of soft organs by flipping-free mesh deformation
* Action recognition via multi-feature fusion and Gaussian process classification
* Adaptive, real-time visual simultaneous localization and mapping
* Age categorization via ECOC with fused gabor and LBP features
* algorithm enabling blind users to find and read barcodes, An
* Analyzing human interactions with a network of dynamic probabilistic models
* Angle vertex and bisector geometric model for triangular road sign detection
* Applying Bayes Markov chains for the detection of ATM related scenarios
* Applying robust structure from motion to markerless augmented reality
* Attribute-based people search in surveillance environments
* Automated thickness measuring system for brake shoe of rolling stock
* Automatic analysis of fluorescence labeled neurites in microscope images
* Automatic camera placement for large scale surveillance networks
* Automatic marker detection for blob images
* Automatically detecting the small group structure of a crowd
* Automating multi-camera self-calibration
* Ball joints for Marker-less human Motion Capture
* Bimodal information analysis for emotion recognition
* Biometric cryptographic key generation based on city block distance
* Building extraction and change detection in multitemporal remotely sensed images with multiple birth and death dynamics
* Camera calibration from orthogonally projected coordinates with noisy-RANSAC
* camera flash based projector system for true scale metric reconstruction, A
* Color photometric stereo for directional diffuse object
* Colorization of natural images via L1 optimization
* Combining multiple kernels for efficient image classification
* Comparison between EOG and high frame rate camera for drowsiness detection
* comparison of 3d model-based tracking approaches for human motion capture in uncontrolled environments, A
* Context-aware search using cooperative agents in a smart environment
* cyclostationarity analysis applied to image forensics, A
* Efficient SIFT matching from keypoint descriptor properties
* Elevation-based MRF stereo implemented in real-time on a GPU
* Enhancing images in scattering media utilizing stereovision and polarization
* Evaluation based combining of classifiers for monitoring honeybees
* Extensive articulated human detection by voting Cluster Boosted Tree
* Facial expression recognition using histogram variances faces
* Fast lead star detection in entertainment videos
* fast multi-channel edge detection algorithm for vision-based autonomous spacecraft docking, A
* fast multi-model approach for object duplicate extraction, A
* Fingerprint recognition system performance in the maritime environment
* Fused visible and infrared video for use in Wilderness Search and Rescue
* Fusing face recognition from multiple cameras
* general framework for reconciling multiple weak segmentations of an image, A
* Haarlet-based hand gesture recognition for 3D interaction
* hands-on approach to high-dynamic-range and superresolution fusion, A
* Hash functions for near duplicate image retrieval
* Hierarchical belief propagation to reduce search space using CUDA for stereo and motion estimation
* Human action recognition using Recursive Self Organizing map and longest common subsequence matching
* Improving edge detection in highly noised sheet-metal images
* Intelligent frame selection for anatomic reconstruction from endoscopic video
* Interactive degraded document binarization: An example (and case) for interactive computer vision
* interactive graph cut method for brain tumor segmentation, An
* Interference reflection separation from a single image
* Iterative self-dual reconstruction on radar image recovery
* level set model without initial contour, A
* Localization of an unmanned ground vehicle using 3D registration of laser range data and DSM
* MAP algorithm for AVO seismic inversion based on the mixed (L2, non-L2) norms to separate primary and multiple signals in slowness space, A
* Medical volume image summarization
* ML-fusion based multi-model human detection and tracking for robust human-robot interfaces
* Multi-instance learning with relational information of instances
* Multi-view traffic sign detection, recognition, and 3D localisation
* multiview, multimodal fusion framework for classifying small marine animals with an opto-acoustic imaging system, A
* new method for the detection of singular points in fingerprint images, A
* Non-rigid registration of 3D facial surfaces with robust outlier detection
* Optimization of video coding for telepresence applications
* Precise 2.5D facial landmarking via an analysis by synthesis approach
* Reading challenging barcodes with cameras
* Real-time 3D registration of stereo-vision based range images using GPU
* Real-time human detection using histograms of oriented gradients on a GPU
* Real-time inference of 3D human poses by assembling local patches
* Recognition and volume estimation of food intake using a mobile device
* Recovering the full pose from a single keyframe
* Robust bee tracking with adaptive appearance template and geometry-constrained resampling
* Robust multi-view car detection using unsupervised sub-categorization
* Robust segmentation of freight containers in train monitoring videos
* Sign language spotting based on semi-Markov Conditional Random Field
* Simultaneous 3D face pose and person-specific shape estimation from a single image using a holistic approach
* Stable text line detection
* Temporal integration for on-board stereo-based pedestrian detection
* Towards macro- and micro-expression spotting in video using strain patterns
* Tracking colliding cells
* TRECVid 2008 Event Detection evaluation, The
* Uniform image and camera access
* Video object detection speedup using staggered sampling
* Viewpoint invariant features from single images using 3D geometry
* Virtual Mouse interface based on Two-layered Bayesian Network, A
* vision-based 2D-3D registration system, A
* Wavelet-based compressive Super-Resolution
* Wheelchair recognition by using stereo vision and histogram of oriented gradients (HOG) in real environments
91 for WACV09

* 2D Barcode localization and motion deblurring using a flutter shutter camera
* 3D Object recognition using a voting algorithm in a real-world environment
* 4D Photogeometric face recognition with time-of-flight sensors
* Action recognition: A region based approach
* Active stereo vision for improving long range hearing using a Laser Doppler Vibrometer
* AirTouch: Interacting with computer systems at a distance
* Aligning surfaces without aligning surfaces
* Analysis and retargeting of ball sports video
* Analysis of eye gaze pattern of infants at risk of autism spectrum disorder using Markov models
* analysis of facial shape and texture for recognition: A large scale evaluation on FRGC ver2.0, An
* assisted photography method for street scenes, An
* Augmented distinctive features for efficient image matching
* Augmented transit maps
* Bayesian 3D model based human detection in crowded scenes using efficient optimization
* Car-Rec: A real time car recognition system
* Cell image analysis: Algorithms, system and applications
* Classification of image registration problems using support vector machines
* Classification of traffic video based on a spatiotemporal orientation analysis
* Combining RGB and ToF cameras for real-time 3D hand gesture interaction
* Comparing state-of-the-art visual features on invariant object recognition tasks
* Computationally efficient retrieval-based tracking system and augmented reality for large-scale areas
* Dense point-to-point correspondences between 3D faces using parametric remeshing for constructing 3D Morphable Models
* Detecting people carrying objects based on an optical flow motion model
* Detecting questionable observers using face track clustering
* Detection of static objects for the task of video surveillance
* evaluation of bags-of-words and spatio-temporal shapes for action recognition, An
* Evolving improved transforms for reconstruction of quantized ultrasound images
* Experimental evidence of a template aging effect in iris biometrics
* Exploratory analysis of time-lapse imagery with fast subset PCA
* Face recognition across large pose variations via Boosted Tied Factor Analysis
* Feature fusion for vehicle detection and tracking with low-angle cameras
* Generalized autofocus
* GPU accelerated one-pass algorithm for computing minimal rectangles of connected components
* Groupwise pose normalization for craniofacial applications
* Historical comparison of vehicles using scanned x-ray images
* Human gait estimation using a wearable camera
* Illumination change compensation techniques to improve kinematic tracking
* Image matching with distinctive visual vocabulary
* Indexing in large scale image collections: Scaling properties and benchmark
* Information fusion in low-resolution iris videos using Principal Components Transform
* Large-scale vehicle detection in challenging urban surveillance environments
* Localized support vector machines using Parzen window for incomplete sets of categories
* Localizing blurry and low-resolution text in natural images
* Moving object detection with background model based on spatio-temporal texture
* Multi-modal summarization of key events and top players in sports tournament videos
* Multi-modal visual concept classification of images via Markov random walk over tags
* Multi-view human action recognition system employing 2DPCA
* Multiple ant tracking with global foreground maximization and variable target proposal distribution
* Multisensory embedded pose estimation
* Object matching using feature aggregation over a frame sequence
* On the reliability of eye color as a soft biometric trait
* On the use of multispectral conjunctival vasculature as a soft biometric
* overview of automatic event detection in soccer matches, An
* parallel region based object recognition system, A
* performance study of an intelligent headlight control system, A
* Personalized video summarization with human in the loop
* PIRF-Nav 2: Speeded-up online and incremental appearance-based SLAM in an indoor environment
* Quadtree decomposition based extended vector space model for image retrieval
* random center surround bottom up visual attention model useful for salient region detection, A
* Real-time detection and reading of LED/LCD displays for visually impaired persons
* Realistic stereo error models and finite optimal stereo baselines
* recursive Otsu thresholding method for scanned document binarization, A
* Restoration for weakly blurred and strongly noisy images
* Robust alignment of wide baseline terrestrial laser scans via 3D viewpoint normalization
* Robust multi-view camera calibration for wide-baseline camera networks
* Robust realtime feature detection in raw 3D face images
* Room-structure estimation in Manhattan-like environments from dense 2½D range data using minimum entropy and histograms
* Saliency detection based on proto-objects and topic model
* Saliency retargeting: An approach to enhance image aesthetics
* Segmenting color images into surface patches by exploiting sparse depth data
* Soft margin keyframe comparison: Enhancing precision of fraud detection in retail surveillance
* Stacked spatial-pyramid kernel: An object-class recognition method to combine scores from random trees
* study on recognizing non-artistic face sketches, A
* Supervised particle filter for tracking 2D human pose in monocular video
* Texture classification using multimodal Invariant Local Binary Pattern
* Tracking and reconstruction of vehicles for accurate position estimation
* Tracking gaze direction from far-field surveillance cameras
* TranslatAR: A mobile augmented reality translator
* User-driven saliency maps for evaluating Region-of-Interest detection
* Using visibility cameras to estimate atmospheric light extinction
* Vehicle detection from low quality aerial LIDAR data
* View context based 2D sketch-3D model alignment
* Visual item verification for fraud prevention in retail self-checkout
* Webcam geo-localization using aggregate light levels
* Window detection from mobile LiDAR data
86 for WACV11

* Accurate efficient mosaicking for Wide Area Aerial Surveillance
* Apparel silhouette attributes recognition
* Appearance-based face recognition using a supervised manifold learning framework
* Automatic identification of Frankfurt plane and mid-sagittal plane of skull
* Batch mode active learning for multi-label image classification with informative label correlation mining
* blob representation for tracking robust to merging and fragmentation, A
* Classification of plant structures from uncalibrated image sequences
* Color balancing for change detection in multitemporal images
* CompactKdt: Compact signatures for accurate large scale object recognition
* complementary local feature descriptor for face identification, A
* Depth-supported real-time video segmentation with the Kinect
* Dynamic and invisible messaging for visual MIMO
* Efficient tracking of ants in long video with GPU and interaction
* Enhanced rail component detection and consolidation for rail track inspection
* Estimating the spatial extents of geospatial objects using hierarchical models
* Face typing: Vision-based perceptual interface for hands-free text entry with a scrollable virtual keyboard
* Fast graph cuts using shrink-expand reparameterization
* Fast planar object detection and tracking via edgel templates
* For your eyes only
* Group context learning for event recognition
* Illumination-free photometric metric for range image registration
* Image alignment for multiple camera high dynamic range microscopy
* Image congealing via efficient feature selection
* Implementing high resolution structured light by exploiting projector blur
* Improving Evolution-Constructed features using speciation for general object detection
* Improving realism of 3D texture using component based modeling
* Indian Classical Dance classification by learning dance pose bases
* Kernel analysis over Riemannian manifolds for visual recognition of actions, pedestrians and textures
* Learning and recognizing complex multi-agent activities with applications to American football plays
* Learning reconfigurable scene representation by tangram model
* LOST: Longterm Observation of Scenes (with Tracks)
* Measuring face familiarity and its application to face recognition
* modular system architecture for online parallel vision pipelines, A
* Multimodal ranking for non-compliance detection in retail surveillance
* Multiple-Instance learning from multiple perspectives: Combining models for Multiple-Instance learning
* Mutual occlusion between real and virtual elements in Augmented Reality based on fiducial markers
* New hope for recognizing twins by using facial motion
* new upsampling method for mobile LiDAR data, A
* Non-parametric motion-priors for flow understanding
* Non-rigid surface detection for gestural interaction with applicable surfaces
* nonintrusive system for behavioral analysis of children using multiple RGB+depth sensors, A
* Online discriminative object tracking with local sparse representation
* Point-less calibration: Camera parameters from gradient-based alignment to edge images
* Predicting good, bad and ugly match Pairs
* Predicting human gaze using quaternion DCT image signature saliency and face detection
* PTZ camera network calibration from moving people in sports broadcasts
* Real time moving vehicle detection and reconstruction for improving classification
* Real-time 3-D face tracking and modeling from a webcam
* Real-time stereo and flow-based video segmentation with superpixels
* Reconfigurable templates for robust vehicle detection and classification
* Robust detection, classification and positioning of traffic signs from street-level panoramic images for inventory purposes
* Robust tracking for interactive social video
* Secure remote matching with privacy: Scrambled support vector vaulted verification (S2V3)
* Simultaneous inference of activity, pose and object
* sparse representation approach to face matching across plastic surgery, A
* Street view goes indoors: Automatic pose estimation from uncalibrated unordered spherical panoramas
* systems level approach to perimeter protection, A
* Tools for richer crowd source image annotations
* Unconstrained periocular biometric acquisition and recognition using COTS PTZ camera for uncooperative and non-cooperative subjects
60 for WACV12

* 3D free form object recognition using rotational projection statistics
* Accurate motion deblurring using camera motion tracking and scene depth
* Animal recognition in the Mojave Desert: Vision tools for field biologists
* Are you using the right approximate nearest neighbor algorithm?
* Automatic content-based temporal alignment of image sequences with varying spatio-temporal resolution
* Automatic curve selection for lens distortion correction using Hough transform energy
* Automatic region-of-interest detection and prioritisation for visually optimised coding of low bit rate videos
* Automatic scallop detection in benthic environments
* Bayesian non-parametric viewpoint to visual tracking, A
* Berkeley MHAD: A comprehensive Multimodal Human Action Database
* Boosting object detection performance in crowded surveillance videos
* Classification of Human Epithelial type 2 cell indirect immunofluoresence images via codebook based descriptors
* Clustering of video-patches on Grassmannian manifold for facial expression recognition from 3D videos
* Depth SEEDS: Recovering incomplete depth data using superpixels
* DIRSAC: A directed sampling and consensus approach to quasi-degenerate data fitting
* Domain adaptive object detection
* Estimation of camera pose with respect to terrestrial LiDAR data
* Expanding gait identification methods from straight to curved trajectories
* experimental study of pupil constriction for liveness detection, An
* full-spherical device for simultaneous geometry and reflectance acquisition, A
* Fusing appearance and geometric constraints for estimating the epipolar geometry
* Gender classification using 2-D ear images and sparse representation
* Geometric calibration for a multi-camera-projector system
* Gixel array descriptor (GAD) for multimodal image matching, The
* graph-based algorithm for multi-target tracking with occlusion, A
* Handwritten text segmentation using average longest path algorithm
* Heteroscedastic probabilistic linear discriminant analysis for manifold learning in video-based face recognition
* high resolution 3D tire and footprint impression acquisition for forensics applications, A
* HotSpotter: Patterned species instance recognition
* Human behavior segmentation and recognition using Continuous Linear Dynamic System
* Illumination invariant Mean-shift tracking
* Image quality quantification for fingerprints using quality-impairment assessment
* Image segmentation for large-scale subcategory flower recognition
* Image to LIDAR matching for geotagging in urban environments
* Improving pollen classification with less training effort
* Laparoscopic instrument localization using a 3-D Time-of-Flight/RGB endoscope
* Large-scale web video event classification by use of Fisher Vectors
* lip extraction algorithm using region-based ACM with automatic contour initialization, A
* low cost 3D markerless system for the reconstruction of athletic techniques, A
* Mass anomaly depth estimation from Full Tensor Gradient gravity data
* Multi-pose multi-target tracking for activity understanding
* Nonuniform image patch exemplars for low level vision
* OpenVL: A task-based abstraction for developer-friendly computer vision
* Periocular biometric recognition using image sets
* Person re-identification using semantic color names and RankBoost
* Real-time tracking of low-resolution vehicles for wide-area persistent surveillance
* Reconstructing a fragmented face from a cryptographic identification protocol
* RegionCut: Interactive multi-label segmentation utilizing cellular automaton
* Relational divergence based classification on Riemannian manifolds
* relational kernel-based approach to scene classification, A
* Relative ranking of facial attractiveness
* Ridge Regression based classifiers for large scale class imbalanced datasets
* Robust autocalibration for a surveillance camera network
* Robust classification system with reliability prediction for semi-automatic traffic-sign inventory systems
* Robust rank-4 affine factorization for structure from motion
* SAGE: An approach and implementation empowering quick and reliable quantitative analysis of segmentation quality
* Scene image categorization and video event detection using Naive Bayes Nearest Neighbor
* Semantic tie points
* Shape and image retrieval by organizing instances using population cues
* Single view pose estimation of mobile devices in urban environments
* Spatial-temporal structural and dynamics features for Video Fire Detection
* Spatio-temporal covariance descriptors for action and gesture recognition
* Statistical angular error-based triangulation for efficient and accurate multi-view scene reconstruction
* Towards a practical PTZ face detection and tracking system
* Tracking multiple ants in a colony
* Unwrapping the eye for visible-spectrum gaze tracking on wearable devices
* Using Kinect for face recognition under varying poses, expressions, illumination and disguise
* Video event recognition using concept attributes
* weakly supervised approach for object detection based on Soft-Label Boosting, A
* Webcam2Satellite: Estimating cloud maps from webcam imagery
* Whale blow detection in infrared video using fractal analysis as tool for representing dynamic shape variation
* What is the space of spectral sensitivity functions for digital color cameras?
* Wildfire smoke detection using spatiotemporal bag-of-features of smoke
74 for WACV13

* 3D Metric Rectification using Angle Regularity
* 3D pose estimation of bats in the wild
* Accelerating arrays of linear classifiers using approximate range queries
* Action in chains: A chains model for action localization and classification
* Active Clustering with Ensembles for Social structure extraction
* Adaptive representations for video-based face recognition across pose
* Age group classification via structured fusion of uncertainty-driven shape features and selected surface features
* Ant tracking with occlusion tunnels
* Attribute-based vehicle recognition using viewpoint-aware multiple instance SVMs
* AutoCaption: Automatic caption generation for personal photos
* Automatic 3D change detection for glaucoma diagnosis
* Automatic identification of window regions on indoor point clouds using LiDAR and cameras
* Automatic tracker selection w.r.t object detection performance
* Bayesian Optimization with an Empirical Hardness Model for approximate Nearest Neighbour Search
* Benchmarking large-scale Fine-Grained Categorization
* Beyond PASCAL: A benchmark for 3D object detection in the wild
* Brownian descriptor: A rich meta-feature for appearance matching
* Car make and model recognition using 3D curve alignment
* Color and flow based superpixels for 3D geometry respecting meshing
* COLOR CHILD: A robust and computationally efficient Color Image Local Descriptor
* combination of generative and discriminative models for fast unsupervised activity recognition from traffic scene videos, A
* Combining semantic scene priors and haze removal for single image depth estimation
* Comparison of face detection and image classification for detecting front seat passengers in vehicles
* Composite Discriminant Factor analysis
* Consensus-based matching and tracking of keypoints for object tracking
* Coupling video segmentation and action recognition
* CRF approach to fitting a generalized hand skeleton model, A
* Data association based ant tracking with interactive error correction
* Data-driven exemplar model selection
* Data-driven road detection
* Depth-based patch scaling for content-aware stereo image completion
* Detecting 3D geometric boundaries of indoor scenes under varying lighting
* Determining underwater vehicle movement from sonar data in relatively featureless seafloor tracking missions
* Discovering discriminative cell attributes for HEp-2 specimen image classification
* discriminative parts based model approach for fiducial points free and shape constrained head pose normalisation in the wild, A
* effectiveness of face detection algorithms in unconstrained crowd scenes, The
* Efficient dense subspace clustering
* Elastic reflection symmetry based shape descriptors
* Estimating cloudmaps from outdoor image sequences
* Exemplar codes for facial attributes and tattoo recognition
* Exploring context information for inter-camera multiple target tracking
* Exploring the geo-dependence of human face appearance
* Extending explicit shape regression with mixed feature channels and pose priors
* Fast dense 3D reconstruction using an adaptive multiscale discrete-continuous variational method
* Feature combination with Multi-Kernel Learning for fine-grained visual classification
* Fine-grained object recognition with Gnostic Fields
* Finger-knuckle-print verification based on vector consistency of corresponding interest points
* Fully automatic 3D facial expression recognition using local depth features
* fully implicit alternating direction method of multipliers for the minimization of convex problems with an application to motion segmentation, A
* Furniture-geek: Understanding fine-grained furniture attributes from freely associated text and tags
* Generalized feature learning and indexing for object localization and recognition
* GMM improves the reject option in hierarchical classification for fish recognition
* GPU-accelerated and efficient multi-view triangulation for scene reconstruction
* Gradient based efficient feature selection
* Hierarchical representation of videos with spatio-temporal fibers
* Im2depth: Scalable exemplar based depth transfer
* Image parsing with graph grammars and Markov Random Fields applied to facade analysis
* Image segmentation of mesenchymal stem cells in diverse culturing conditions
* Important stuff, everywhere! Activity recognition with salient proto-objects as context
* Improving background subtraction using Local Binary Similarity Patterns
* Improving Multiview Face Detection with Multi-Task Deep Convolutional Neural Networks
* Improving streaming video segmentation with early and mid-level visual processing
* Information theoretic sensor management for multi-target tracking with a single pan-tilt-zoom camera
* Integrating visual words as bunch of n-grams for effective biomedical image classification
* Interactive video segmentation using occlusion boundaries and temporally coherent superpixels
* Interactively test driving an object detector: Estimating performance on unlabeled data
* Introspective semantic segmentation
* Iris crypts: Multi-scale detection and shape-based matching
* Is my new tracker really better than yours?
* Joint hierarchical learning for efficient multi-class object detection
* joint perspective towards image super-resolution: Unifying external- and self-examples, A
* Joint semantic and geometric segmentation of videos with a stage model
* Large-scale semantic co-labeling of image sets
* Learning local image descriptors using binary decision trees
* Learning mid-level features from object hierarchy for image classification
* Linear Local Distance coding for classification of HEp-2 staining patterns
* Linear regression motion analysis for unsupervised temporal segmentation of human actions
* Local inter-session variability modelling for object classification
* lp-norm MTMKL framework for simultaneous detection of multiple facial action units, A
* Matching image sets via adaptive multi convex hull
* Materials discovery: Fine-grained classification of X-ray scattering images
* Max residual classifier
* Mining discriminative 3D Poselet for cross-view action recognition
* Model-based anthropometry: Predicting measurements from 3D human scans in multiple poses
* Multi class boosted random ferns for adapting a generic object detector to a specific video
* Multi-leaf alignment from fluorescence plant images
* Multi-view action recognition one camera at a time
* Multimodal fusion using dynamic hybrid models
* Multiple foreground recognition and cosegmentation: An object-oriented CRF model with robust higher-order potentials
* novel method for post-surgery face recognition using sum of facial parts recognition, A
* NRSfM using local rigidity
* Object co-labeling in multiple images
* Object tracking via non-Euclidean geometry: A Grassmann approach
* Offline learning of prototypical negatives for efficient online Exemplar SVM
* Online algorithms for factorization-based structure from motion
* Online discriminative dictionary learning for visual tracking
* Optical filter selection for automatic visual inspection
* Pedestrian detection in low resolution videos
* Physical querying with multi-modal sensing
* Plant classification system for crop/weed discrimination without segmentation
* Play type recognition in real-world football video
* Point pattern matching based on line graph spectral context and descriptor embedding
* Predicting movie ratings from audience behaviors
* Random projections on manifolds of Symmetric Positive Definite matrices for image classification
* Real time action recognition using histograms of depth gradients and random decision forests
* Real-time 3-D face tracking and modeling framework for mid-res cam
* Real-time 3D page tracking and book status recognition for high-speed book digitization based on adaptive capturing
* Real-time multi-target tracking at 210 megapixels/second in Wide Area Motion Imagery
* Real-time video decolorization using bilateral filtering
* Recognition of 3D package shapes for single camera metrology
* Recognizing locations with Google Glass: A case study
* Relative facial action unit detection
* Repeated constrained sparse coding with partial dictionaries for hyperspectral unmixing
* Robust optical flow estimation for continuous blurred scenes using RGB-motion imaging and directional filtering
* Robust tracking and mapping with a handheld RGB-D camera
* Robust tracking of articulated human movements through Component-Based Multiple Instance Learning with particle filtering
* Rotation estimation from cloud tracking
* Scale-invariant line descriptors for wide baseline matching
* Scale-Space SIFT flow
* Scene recognition by jointly modeling latent topics
* Segmentation and matching: Towards a robust object detection system
* Segmentation and tracking of partial planar templates
* Selection of universal features for image classification
* Selectively guiding visual concept discovery
* Simultaneous recognition of facial expression and identity via sparse representation
* Small Hand-Held Object Recognition Test (SHORT)
* Smart surveillance framework: A versatile tool for video analysis
* Spatial inference for coherent geophysical fluids by appearance and geometry
* spatial-color layout feature for representing galaxy images, A
* Structure-aware keypoint tracking for partial occlusion handling
* Summarisation of short-term and long-term videos using texture and colour
* System for semi-automated surveying of street-lighting poles from street-level panoramic images
* Task-based control of articulated human pose detection for OpenVL
* Towards cautious collective inference for object verification
* Transfer learning via attributes for improved on-the-fly classification
* Understanding and analyzing a large collection of archived swimming videos
* Understanding the 3D layout of a cluttered room from multiple images
* Unsupervised dictionary learning with double-layer sparse representation
* Unsupervised domain adaptation using parallel transport on Grassmann manifold
* Unsupervised iterative manifold alignment via local feature histograms
* Unsupervised Non-parametric Geospatial Modeling from Ground Imagery
* Urban Tracker: Multiple object tracking in urban mixed traffic
* Video alignment to a common reference
* Video segmentation and feature co-occurrences for activity classification
* Video segmentation with joint object and trajectory labeling
* Video text detection and recognition: Dataset and benchmark
* Viewpoint-independent book spine segmentation
* Vision for road inspection
* Volumetric reconstruction applied to perceptual studies of size and weight
* Writer identification and verification using GMM supervectors
153 for WACV14

* 3-D Mediated Detection and Tracking in Wide Area Aerial Surveillance
* 3D Pictorial Structures for Human Pose Estimation with Supervoxels
* 3D Reconstruction from Hyperspectral Images
* Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns
* Action Recognition Using Discriminative Structured Trajectory Groups
* Adaptive Deformation Handling for Pedestrian Detection
* Adaptive Keyframe Selection for Video Summarization
* Adaptive Local Movement Modelling for Object Tracking
* Analyzing Tracklets for the Detection of Abnormal Crowd Behavior
* Anomaly Localization in Topic-Based Analysis of Surveillance Videos
* AR-Weapon: Live Augmented Reality Based First-Person Shooting System
* Automated Axon Segmentation from Highly Noisy Microscopic Videos
* Automatic 4D Facial Expression Recognition Using DCT Features
* Autonomous Driving Simulation for Unmanned Vehicles
* Bank of Quantization Models: A Data-Specific Approach to Learning Binary Codes for Large-Scale Retrieval Applications
* Bayesian Multi-object Tracking Using Motion Context from Multiple Objects
* Beyond Pedestrians: A Hybrid Approach of Tracking Multiple Articulating Humans
* Bikers Are Like Tobacco Shops, Formal Dressers Are Like Suits: Recognizing Urban Tribes with Caffe
* Camera Network Tracking (CamNeT) Dataset and Performance Baseline, A
* Category Attentional Search for Fast Object Detection by Mimicking Human Visual Perception
* Change Detection in Laser-Scanned Data of Industrial Sites
* Characterizing Feature Matching Performance over Long Time Periods
* Choosing Basic-Level Concept Names Using Visual and Language Context
* Circular Hough Transform and Local Circularity Measure for Weight Estimation of a Graph-Cut Based Wood Stack Measurement
* City Scale Image Geolocalization via Dense Scene Alignment
* Classification of 3D Multicellular Organization in Phase Microscopy for High Throughput Screening of Therapeutic Targets
* Clauselets: Leveraging Temporally Related Actions for Video Event Analysis
* Co-operative Pedestrians Group Tracking in Crowded Scenes Using an MST Approach
* Composition Context Photography
* Convergence of Iteratively Re-weighted Least Squares to Robust M-Estimators
* De-correlating CNN Features for Generative Classification
* Deeply-Learned Feature for Age Estimation
* Dense and Deformable Motion Extraction in Dynamic Scenes Based on Hierarchical MRF Optimization in RGB-D Images
* Detecting Building-Level Changes of a City Using Street Images and a 2D City Map
* Detection of Arrows in On-Line Sketched Diagrams Using Relative Stroke Positioning
* Distance Transform Based Active Contour Approach for Document Image Rectification
* Document Retrieval with Unlimited Vocabulary
* Efficient Facade Segmentation Using Auto-context
* Efficient Model Evaluation with Bilinear Separation Model
* Efficient Training of Multiple Ant Tracking
* Efficiently Constructing Mosaics from Video Collections
* Egocentric Field-of-View Localization Using First-Person Point-of-View Devices
* Enhancing Linear Programming with Motion Modeling for Multi-target Tracking
* Ensemble Color Model for Human Re-identification, An
* Ensembles of Correlation Filters for Object Detection
* Entropy-Based Similarity Evaluation and Visualization of Cartographic Symbol Sets
* Error Factor Analysis for Wild Scene Image-Labelling
* Estimating Drivable Collision-Free Space from Monocular Video
* Evaluation of Features for Leaf Classification in Challenging Conditions
* Extending Digital Image Correlation to Reconstruct Displacement and Strain Fields around Discontinuities in Geomechanical Structures under Deformation
* Extending the Performance of Human Classifiers Using a Viewpoint Specific Approach
* Extracting Image Regions by Structured Edge Prediction
* Face Alignment Refinement
* Family Member Identification from Photo Collections
* Fast Approximate Matching of Videos from Hand-Held Cameras for Robust Background Subtraction
* Feature Fusion by Similarity Regression for Logo Retrieval
* Finding Temporally Consistent Occlusion Boundaries in Videos Using Geometric Context
* Fingerprint Orientation Modeling Using Symmetric Filters
* Fixing WTFs: Detecting Image Matches Caused by Watermarks, Timestamps, and Frames in Internet Photos
* Flexible Trajectory Indexing for 3D Motion Recognition
* Forecasting Human Pose and Motion with Multibody Dynamic Model
* Gait-Based Person Identification Method Using Shadow Biometrics for Robustness to Changes in the Walking Direction
* General Framework for Fast 3D Object Detection and Localization Using an Uncalibrated Camera, A
* Generalized Sum of Gaussians for Real-Time Human Pose Tracking from a Single Depth Sensor
* Genre and Style Based Painting Classification
* Geometry-Aware Feature Matching for Structure from Motion Applications
* Global-to-Local Framework for Infrared and Visible Image Sequence Registration, A
* Gradient Boundary Histograms for Action Recognition
* Heterogeneous Multi-column ConvNets with a Fusion Framework for Object Recognition
* Hierarchical Spherical Hashing for Compressing High Dimensional Vectors
* High Breakdown Bundle Adjustment
* How to Collect Segmentations for Biomedical Images? A Benchmark Evaluating the Performance of Experts, Crowdsourced Non-experts, and Algorithms
* How to Transfer? Zero-Shot Object Recognition via Hierarchical Transfer of Semantic Attributes
* Image Classification Using Generative Neuro Evolution for Deep Learning
* Improved Model for Segmentation and Recognition of Fine-Grained Activities with Application to Surgical Training Tasks, An
* Improving Vision-Based Self-Positioning in Intelligent Transportation Systems via Integrated Lane and Vehicle Detection
* Inertial Optical Flow for Throw-and-Go Micro Air Vehicles
* Information in Temporal Histograms, The
* Interleaved Regression Tree Field Cascades for Blind Image Deconvolution
* Joint Detection and Tracking of Moving Objects Using Spatio-temporal Marked Point Processes
* Key-Pose Prediction in Cyclic Human Motion
* Learned Collaborative Representations for Image Classification
* Learning an Aesthetic Photo Cropping Cascade
* Learning Localized Perceptual Similarity Metrics for Interactive Categorization
* Learning to Select and Order Vacation Photographs
* Leveraging Context to Support Automated Food Recognition in Restaurants
* Lie-Struck: Affine Tracking on Lie Groups Using Structured SVM
* Linear Chain Markov Model for Detection and Localization of Cells in Early Stage Embryo Development, A
* Local Novelty Detection in Multi-class Recognition Problems
* Low-Noise Fluttering Shutter Camera Handling Accelerated Motion, A
* Material Classification on Symmetric Positive Definite Manifolds
* Menu-Match: Restaurant-Specific Food Logging from Images
* Mimicking Human Camera Operators
* Motion Blur Resilient Fiducial for Quadcopter Imaging, A
* Motion Segmentation of Truncated Signed Distance Function Based Volumetric Surfaces
* Mountain Habitats Segmentation and Change Detection Dataset, The
* Multi-class Semantic Video Segmentation with Exemplar-Based Object Reasoning
* Multi-modal 2D + 3D Face Recognition Method with a Novel Local Feature Descriptor, A
* Multi-modal Graphical Model for Scene Analysis, A
* Multi-Modal Sparse Coding Classifier Using Dictionaries with Different Number of Atoms, A
* Multi-person Tracking Based on Body Parts and Online Random Ferns Learning of Thermal Images
* Multi-shot Re-identification with Random-Projection-Based Random Forests
* Multimodal Registration of Multiple Retinal Images Based on Line Structures
* Multiple Insect Tracking with Occlusion Sub-tunnels
* Multiscale Superpixels and Supervoxels Based on Hierarchical Edge-Weighted Centroidal Voronoi Tessellation
* Near Duplicate Image Discovery on One Billion Images
* Non-negative Sparse Coding with Regularizer for Image Classification
* Non-rigid Articulated Point Set Registration for Human Pose Estimation
* Norm-Induced Entropies for Decision Forests
* Online Visual Tracking Using Temporally Coherent Part Cluster
* Optimization of Plane Fits to Image Segments in Multi-view Stereo
* Part-Based Tracking via Salient Collaborating Features
* Person Re-identification Using the Silhouette Shape Described by a Point Distribution Model
* Photometric Stereo in the Wild
* Pose Estimation of Object Categories in Videos Using Linear Programming
* Predicting Geo-informative Attributes in Large-Scale Image Collections Using Convolutional Neural Networks
* Progressive 3D Model Acquisition with a Commodity Hand-Held Camera
* Qualitative Tracking Performance Evaluation without Ground-Truth
* Quality-Aware Estimation of Facial Landmarks in Video Sequences
* Re-ranking by Multi-feature Fusion with Diffusion for Image Retrieval
* Real Time Multi-vehicle Tracking and Counting at Intersections from a Fisheye Camera
* Real-Time Barcode Detection in the Wild
* Real-Time Facial Expression Recognition on Smartphones
* Real-Time Multi-scale Action Detection from 3D Skeleton Data
* Retrieval of Images with Objects of Specific Size, Location, and Spatial Configuration
* Robot-centric Activity Recognition from First-Person RGB-D Videos
* Robust Adaptive Classifier for Detector Adaptation in a Video, A
* Robust Fastener Detection for Autonomous Visual Railway Track Inspection
* Robust Nonrigid Point Set Registration Using Graph-Laplacian Regularization
* Runway to Realway: Visual Analysis of Fashion
* Scalable Similarity Learning Using Large Margin Neighborhood Embedding
* Selective Pooling Vector for Fine-Grained Recognition
* Self-Adjusting Approach to Change Detection Based on Background Word Consensus, A
* Semantic Instance Labeling Leveraging Hierarchical Segmentation
* Semantic Multi-body Motion Segmentation
* Sequential Boosting for Learning a Random Forest Classifier
* Sequential Online 3D Reconstruction System Using Dense Stereo Matching, A
* Shot Boundary Detection with Graph Theory Using Keypoint Features and Color Histograms
* Sparse Flow: Sparse Matching for Small to Large Displacement Optical Flow
* Spatially Stratified Correspondence Sampling for Real-Time Point Cloud Tracking
* Stereovision Bias Removal by Autocorrelation
* Structured Hough Voting for Vision-Based Highway Border Detection
* Topology-Preserving Multi-label Image Segmentation
* Touch Gesture-Based Active User Authentication Using Dictionaries
* Towards Convenient Calibration for Cross-Ratio Based Gaze Estimation
* Tracking People by Evolving Social Groups: An Approach with Social Network Perspective
* Training a Scene-Specific Pedestrian Detector Using Tracklets
* Tree-Based Locally Linear Regression for Image Denoising
* Unsupervised Feature Extraction Inspired by Latent Low-Rank Representation
* Unsupervised Generation of Context-Relevant Training-Sets for Visual Object Recognition Employing Multilinguality
* Vision-Based Offline-Online Perception Paradigm for Autonomous Driving
* Visual Gyroscope for Accurate Orientation Estimation
* Visual Object Clustering via Mixed-Norm Regularization
* Visual Recognition to Access and Analyze People Density and Flow Patterns in Indoor Environments
* Visual Saliency Models Based on Spectrum Processing
156 for WACV15

* 3D shape retrieval using a single depth image from low-cost sensors
* 6DOF point cloud alignment using geometric algebra-based adaptive filtering
* Abstraction hierarchy and self annotation update for fine grained activity recognition
* Accurate 3D bone segmentation in challenging CT images: Bottom-up parsing and contextualized optimization
* Accurate and efficient pulse measurement from facial videos on smartphones
* Accurate eye center localization using Snakuscule
* Active contours for selective object segmentation
* Activity recognition and prediction with pose based discriminative patch model
* Adapting attributes by selecting features similar across domains
* analysis of 1-to-first matching in iris recognition, An
* Analyzing human appearance as a cue for dating images
* Architectural decomposition for 3D landmark building understanding
* Arrays of single pixel time-of-flight sensors for privacy preserving tracking and coarse pose estimation
* Assessing tracking performance in complex scenarios using mean time between failures
* Atomic scenes for scalable traffic scene recognition in monocular videos
* Automatic 3D reconstruction of manifold meshes via delaunay triangulation and mesh sweeping
* Automatic and quantitative evaluation of attribute discovery methods
* Automatic detection and analysis of photovoltaic modules in aerial infrared imagery
* Automatic video editing for sensor-rich videos
* Binary patterns for shape description in RGB-D object registration
* Can we still avoid automatic face detection?
* Capturing facial videos with Kinect 2.0: A multithreaded open source tool and database
* Categorizing cubes: Revisiting pose normalization
* coarse-to-fine deep learning for person re-identification, A
* Color multi-fusion Fisher vector feature for fine art painting categorization and influence analysis
* Combining multiple sources of knowledge in deep CNNs for action recognition
* Compact CNN for indexing egocentric videos
* Constructing image mosaics using focus based depth analysis
* CoRBS: Comprehensive RGB-D benchmark for SLAM using Kinect v2
* Correlation filter cascade for facial landmark localization
* Coupled depth learning
* Crowd density estimation based on rich features and random projection forest
* crowdsourced approach to student engagement recognition in e-learning environments, A
* Customized expression recognition for performance-driven cutout character animation
* Cutting through the clutter: Task-relevant features for image matching
* Dealing with small data and training blind spots in the Manhattan world
* Decomposing time series with application to temporal segmentation
* deep convolutional neural network trained on representative samples for circulating tumor cell detection, A
* Deep learning architectures for domain adaptation in HOV/HOT lane enforcement
* Deep learning the dynamic appearance and shape of facial action units
* Deep recursive and hierarchical conditional random fields for human action recognition
* Deep tree-structured face: A unified representation for multi-task facial biometrics
* Detecting temporally consistent objects in videos through object class label propagation
* Detection of cracks in nuclear power plant using spatial-temporal grouping of local patches
* Detection of quadrilateral document regions from digital photographs
* Direct 3D pose estimation of a planar target
* Direct face detection and video reconstruction from event cameras
* Discovering picturesque highlights from egocentric vacation videos
* Discovering useful parts for pose estimation in sparsely annotated datasets
* Discovery of facial motions using deep machine perception
* Discriminative FaceTopics for face recognition via latent Dirichlet allocation
* Discriminative training of CRF models with probably submodular constraints
* driver fatigue detection method based on multi-sensor signals, A
* Dynamic belief fusion for object detection
* Effect of illicit drug abuse on face recognition
* Efficient joint stereo estimation and land usage classification for multiview satellite data
* Efficient transductive semantic segmentation
* Efficient unsupervised abnormal crowd activity detection based on a spatiotemporal saliency detector
* Efficient video-based retrieval of human motion with flexible alignment
* elastic functional data analysis framework for preoperative evaluation of patients with Rheumatoid Arthritis, An
* end-to-end generative framework for video segmentation and recognition, An
* Energy-efficient ConvNets through approximate computing
* enhanced deep feature representation for person re-identification, An
* Exploring bounding box context for multi-object tracker fusion
* Extended coherent point drift algorithm with correspondence priors and optimal subsampling
* Eye-CU: Sleep pose classification for healthcare using multimodal multiview data
* Face fiducial detection by consensus of exemplars
* Face recognition using deep multi-pose representations
* Fashion apparel detection: The role of deep convolutional neural network and pose-dependent priors
* fast and robust text spotter, A
* fast method for estimating transient scene attributes, A
* Fine-grained classification via mixture of deep convolutional neural networks
* Fine-Grained Sketch-Based Image Retrieval: The Role of Part-Aware Attributes
* Fine-tuning human pose estimations in videos
* Fixation prediction with a combined model of bottom-up saliency and vanishing point
* Forget the checkerboard: Practical self-calibration using a planar scene
* FPGA-accelerated partial duplicate image retrieval engine for a document search system, An
* Frontal to profile face verification in the wild
* Furthering fingerprint-based authentication: Introducing the true-neighbor template
* generalized relative pose and scale problem: View-graph fusion via 2D-2D registration, The
* Generating reliable video annotations by exploiting the crowd
* Geometric calibration for mobile, stereo, autofocus cameras
* geometry of a scene: On deep semantics for visual perception driven cognitive film studies, The
* Going deeper in facial expression recognition using deep neural networks
* Graph matching with low-rank regularization
* Half hypersphere confinement for piecewise linear regression
* Heat propagation contours for 3D non-rigid shape analysis
* HeHOP: Highly efficient head orientation and position estimation
* Hide and seek: Uncovering facial occlusion with variable-threshold robust PCA
* High accuracy model-based object pose estimation for autonomous recharging applications
* High performance moves recognition and sequence segmentation based on key poses filtering
* Higher-order class-specific priors for semantic segmentation of 3D outdoor scenes
* Human and sheep facial landmarks localisation by triplet interpolated features
* Image set classification by symmetric positive semi-definite matrices
* IPDC: Iterative part-based dense correspondence between point clouds
* Is Alice chasing or being chased?: Determining subject and object of activities in videos
* Is image super-resolution helpful for other vision tasks?
* Joint geometric graph embedding for partial shape matching in images
* Joint Object Recognition and Pose Estimation Using a Nonlinear View-Invariant Latent Generative Model
* Joint point and line segment matching on wide-baseline stereo images
* Kernel auto-encoder for semi-supervised hashing
* KrishnaCam: Using a longitudinal, single-person, egocentric dataset for scene understanding tasks
* LATCH: Learned arrangements of three patch codes
* Learning a structured dictionary for video-based face recognition
* Learning deep-sea substrate types with visual topic models
* Learning patch-dependent kernel forest for person re-identification
* Leveraging single for multi-target tracking using a novel trajectory overlap affinity measure
* Lifting GIS maps into strong geometric context for scene understanding
* Line segment matching: A benchmark
* Linear-time online action detection from 3D skeletal data using bags of gesturelets
* Measuring and modeling apple trees using time-of-flight data for automation of dormant pruning applications
* mid-level representation of visual structures for video compression, A
* MIDI-assisted egocentric optical music recognition
* Mobile phone and cloud: A dream team for 3D reconstruction
* Mode-shape interpretation: Re-thinking modal space for recovering deformable shapes
* Mono camera multi-view diminished reality
* Monocular obstacle avoidance for blind people using probabilistic focus of expansion estimation
* Mosaicing scenes with a quadcopter
* multi-modal feature fusion framework for Kinect-based facial expression recognition using Dual Kernel Discriminant Analysis (DKDA), A
* Multi-view dynamic texture learning
* Multimodal emotion recognition using deep learning architectures
* Multiscale fully convolutional network with application to industrial inspection
* Naming TV characters by watching and analyzing dialogs
* new computer vision-based system to help clinicians objectively assess visual pursuit with the moving mirror stimulus for the diagnosis of minimally conscious state, A
* novel inheritable color space with application to kinship verification, A
* Object detection in 20 questions
* Occlusion-aware video registration for highly non-rigid objects
* OCPAD: Occluded checkerboard pattern detector
* Omnidirectional image capture on mobile devices for fast automatic generation of 2.5D indoor maps
* On the importance of normalisation layers in deep learning with piecewise linear activation units
* One-to-many face recognition with bilinear CNNs
* Online inspection of 3D parts via a locally overlapping camera network
* Online tracking using saliency
* Open source structure-from-motion for aerial video
* OpenFace: An open source facial behavior analysis toolkit
* Optimal feature learning and discriminative framework for polarimetric thermal to visible face recognition
* Optimal radiometric calibration for camera-display communication
* People detection in crowded scenes by context-driven label propagation
* Persistent 3D stabilization for aerial imagery
* Person re-identification using deformable patch metric learning
* Person re-identification using multiple first-person-views on wearable devices
* Person-following UAVs
* Pose tracking by efficiently exploiting global features
* practical approach to real-time neutral feature subtraction for facial expression recognition, A
* Precise deterministic change detection for smooth surfaces
* Predicting wide receiver trajectories in American football
* Procrustean decomposition for orthogonal cascade detection
* Real-Time Eye Pupil Localization Using Hough Regression Forest
* Real-time road traffic density estimation using block variance
* real-time visual card reader for mobile devices, A
* Recognition of ongoing complex activities by sequence prediction over a hierarchical label space
* Region graph based method for multi-object detection and tracking using depth cameras
* Resolution enhancement in single depth map and aligned image
* revisit to human action recognition from depth sequences: Guided SVM-sampling for joint selection, A
* Robust visual tracking using template anchors
* Score reliability based weighting technique for score-level fusion in multi-biometric systems
* Self-taught object localization with deep networks
* Semantic segmentation of modular furniture
* Simultaneous semantic segmentation of a set of partially labeled images
* Sky segmentation in the wild: An empirical study
* Static action recognition by efficient greedy inference
* structured approach to predicting image enhancement parameters, A
* Support vector machines with time series distance kernels for action classification
* Survey on Moving Object Detection for Wide Area Motion Imagery, A
* Tag-based video retrieval by embedding semantic content in a continuous word space
* Task-driven progressive part localization for fine-grained recognition
* Text detection in stores using a repetition prior
* Texture classification for rail surface condition evaluation
* Texture instance similarity via dense correspondences
* Think big, solve small: Scaling up robust PCA with coupled dictionaries
* Tooth guard: A vision system for detecting missing tooth in rope mine shovel
* Toward correlating and solving abstract tasks using convolutional neural networks
* Transition Hough forest for trajectory-based action recognition
* two-sample test for statistical comparisons of shape populations, A
* two-stage detector for hand detection in ego-centric videos, A
* ULg multimodality drowsiness database (called DROZY) and examples of use, The
* Unconstrained face verification using deep CNN features
* Underwater 3D capture using a low-cost commercial depth camera
* Unifying diffuse and specular reflections for the photometric stereo problem
* Unsupervised categorical shape reconstruction through manifolds
* Unsupervised network pretraining via encoding human design
* Unsupervised saliency estimation based on robust hypotheses
* Variational multi-phase segmentation using high-dimensional local features
* Video summarization for remote invigilation of online exams
* Vision-based counting of pedestrians and cyclists
* Visual recognition of paper analytical device images for detection of falsified pharmaceuticals
* Voting-based 3D object cuboid detection robust to partial occlusion from RGB-D images
* Weighted atlas auto-context with application to multiple organ segmentation
* Where is that pixel in the oblique-view video?
190 for WACV16

* 2-Line Exhaustive Searching for Real-Time Vanishing Point Estimation in Manhattan World
* 3D Semantic Segmentation of Modular Furniture Using rjMCMC
* 3D-Brain Segmentation Using Deep Neural Network and Gaussian Mixture Model
* Accurate 3D Reconstruction of Dynamic Scenes from Monocular Image Sequences with Severe Occlusions
* Active Online Anomaly Detection Using Dirichlet Process Mixture Model and Gaussian Process Classification
* Artistic Movement Recognition by Boosted Fusion of Color Structure and Topographic Description
* Assessment of Peanut Pod Maturity
* Automatic Calibration of a Multiple-Projector Spherical Fish Tank VR Display
* Automatic Defect Recognition in X-Ray Testing Using Computer Vision
* Bandwidth Limited Object Recognition in High Resolution Imagery
* Beacon-Guided Structure from Motion for Smartphone-Based Navigation
* Beyond Spatial Auto-Regressive Models: Predicting Housing Prices with Satellite Imagery
* Boosted Convolutional Neural Networks (BCNN) for Pedestrian Detection
* Box Refinement: Object Proposal Enhancement and Pruning
* Breathing Rate Monitoring during Sleep from a Depth Camera under Real-Life Conditions
* Calibration Technique for Underwater Active Oneshot Scanning System with Static Pattern Projector and Multiple Cameras
* Can Affordances Guide Object Decomposition into Semantically Meaningful Parts?
* Center-Focusing Multi-task CNN with Injected Features for Classification of Glioma Nuclear Images
* Complex Event Recognition from Images with Few Training Examples
* Computing Egomotion with Local Loop Closures for Egocentric Videos
* ContlensNet: Robust Iris Contact Lens Detection Using Deep Convolutional Neural Networks
* Convolutional Sparse and Low-Rank Coding-Based Rain Streak Removal
* Crack Segmentation by Leveraging Multiple Frames of Varying Illumination
* Cyclical Learning Rates for Training Neural Networks
* Deep Context Modeling for Semantic Segmentation
* Deep Feature Consistent Variational Autoencoder
* Deep Heterogeneous Feature Fusion for Template-Based Face Recognition
* Deep Image Set Hashing
* Deep Learning Frame-Work for Recognizing Developmental Disorders, A
* Deep Learning Logo Detection with Data Expansion by Synthesising Context
* Deep Learning Paradigm for Detection of Harmful Algal Blooms, A
* Deep Moving Poselets for Video Based Action Recognition
* Deep Multi-modal Vehicle Detection in Aerial ISR Imagery
* Deep Object Ranking for Template Matching
* Deep Salient Object Detection by Integrating Multi-level Cues
* Deep Spatio-Temporal Features for Multimodal Emotion Recognition
* Dense Batch Non-Rigid Structure from Motion in a Second
* Densification of Semi-Dense Reconstructions for Novel View Generation of Live Scenes
* Describing Unseen Classes by Exemplars: Zero-Shot Learning Using Grouped Simile Ensemble
* Detecting Sexually Provocative Images
* Detecting Social Insects in Videos Using Spatiotemporal Regularization
* Dictionary Alignment for Low-Resolution and Heterogeneous Face Recognition
* Distance Penalization and Fusion for Person Re-identification
* Efficient Action Detection in Untrimmed Videos via Multi-task Learning
* Egocentric Height Estimation
* Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition
* Exploring Local Context for Multi-target Tracking in Wide Area Aerial Surveillance
* Fast and Robust Eyelid Outline and Aperture Detection in Real-World Scenarios
* Fast Deep Vehicle Detection in Aerial Images
* Fast Pedestrian Detection via Random Projection Features with Shape Prior
* Fast Semi Dense Epipolar Flow Estimation
* Fast, Accurate, Small-Scale 3D Scene Capture Using a Low-Cost Depth Sensor
* First-Person Action Decomposition and Zero-Shot Learning
* Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry
* From Affine Rank Minimization Solution to Sparse Modeling
* Fused DNN: A Deep Neural Network Fusion Approach to Fast and Robust Pedestrian Detection
* Gaussian Mixture Models for Temporal Depth Fusion
* Gender-from-Iris or Gender-from-Mascara?
* Global Consistency Priors for Joint Part-Based Object Tracking and Image Segmentation
* Global Model with Local Interpretation for Dynamic Shape Reconstruction
* GPU-Accelerated Real-Time Stixel Computation
* Guaranteed Parameter Estimation for Discrete Energy Minimization
* Hardware-Centric Vision Processing for Mobile IoT Environment Exploiting Approximate Graph Cut in Resistor Grid
* High-Level Concepts for Affective Understanding of Images
* Higher-Order Pooling of CNN Features via Kernel Linearization for Action Recognition
* Human Pose Estimation Using Deep Structure Guided Learning
* Image Set Classification Using Sparse Bayesian Regression
* Improved Deep Learning of Object Category Using Pose Information
* Integrated Global-Local Metric Learning for Person Re-identification
* Joint Epipolar Tracking (JET): Simultaneous Optimization of Epipolar Geometry and Feature Correspondences
* Joint Regression and Ranking for Image Enhancement
* Learn How to Choose: Independent Detectors Versus Composite Visual Phrases
* Learning Attributes from Human Gaze
* Learning Discriminative Features via Label Consistent Neural Network
* Learning Effective Binary Descriptors via Cross Entropy
* Learning Spatial Transforms for Refining Object Segment Proposals
* Learning to Recognize Objects by Retaining Other Factors of Variation
* Material Classification under Natural Illumination Using Reflectance Maps
* Melanoma Detection Based on Mahalanobis Distance Learning and Constrained Graph Regularized Nonnegative Matrix Factorization
* Model-Driven Simulations for Computer Vision
* Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking
* Multi-planar Fitting in an Indoor Manhattan World
* Multi-shot Person Re-Identification Using Part Appearance Mixture
* Multi-task Curriculum Transfer Deep Learning of Clothing Attributes
* Multi-view RGB-D Approach for Human Pose Estimation in Operating Rooms, A
* Occlusions are Fleeting - Texture is Forever: Moving Past Brightness Constancy
* On Crater Verification Using Mislocalized Crater Regions
* On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks
* Open-Source Platform for Underwater Image and Video Analytics, An
* Optimal Threshold and LoG Based Feature Identification and Tracking of Bat Flapping Flight
* Ordered Pooling of Optical Flow Sequences for Action Recognition
* Pano2CAD: Room Layout from a Single Panorama Image
* PASCAL Boundaries: A Semantic Boundary Dataset with a Deep Semantic Boundary Detector
* Patchwork Stereo: Scalable, Structure-Aware 3D Reconstruction in Man-Made Environments
* PCA Based Computation of Illumination-Invariant Space for Road Detection
* Personalized Image Aesthetic Quality Assessment by Joint Regression and Ranking
* Pose-Robust Face Verification by Exploiting Competing Tasks
* Predicting the Perceptual Demands of Urban Driving with Video Regression
* Probabilistic Surface Inference for Industrial Inspection Planning
* Providing Video Annotations in Multimedia Containers for Visualization and Research
* Quantitative Analysis of Automatic Image Cropping Algorithms: A Dataset and Comparative Study
* Real Estate Image Classification
* Real-Time Online Action Detection Forests Using Spatio-Temporal Contexts
* Recognition of Group Activities in Videos Based on Single- and Two-Person Descriptors
* Recurrent Fully Convolutional Networks for Video Segmentation
* Repeated Pattern Detection Using CNN Activations
* Robust 3D Patch-Based Face Hallucination
* Robust Hand Gestural Interaction for Smartphone Based AR/VR Applications
* Robust Road Marking Detection and Recognition Using Density-Based Grouping and Machine Learning Techniques
* SAMP: Shape and Motion Priors for 4D Vehicle Reconstruction
* Semantic Text Summarization of Long Videos
* Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions
* Size and Texture-Based Classification of Lung Tumors with 3D CNNs
* Solving Occlusion Problem in Pedestrian Detection by Constructing Discriminative Part Layers
* Solving Robust Regularization Problems Using Iteratively Re-weighted Least Squares
* Sparse Dictionary Learning for Identifying Grasp Locations
* Spatial-Temporal Motion Field Analysis for Pixelwise Crack Detection on Concrete Surfaces
* Spatio-Temporal Anomaly Detection for Industrial Robots through Prediction in Unsupervised Feature Space
* Statistical Approach to Continuous Self-Calibrating Eye Gaze Tracking for Head-Mounted Virtual Reality Systems, A
* StuffNet: Using Stuff to Improve Object Detection
* Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection
* Switching Linear Inverse-Regression Model for Tracking Head Pose
* T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-Less Objects
* Telecom Inventory Management via Object Recognition and Localisation on Google Street View Images
* Temporal Robust Features for Violence Detection
* Temporally Coded Illumination for Rolling Shutter Motion De-blurring
* Text-Edge-Box: An Object Proposal Approach for Scene Texts Localization
* Texture Attribute Synthesis and Transfer Using Feed-Forward CNNs
* Towards Fine-Grained Open Zero-Shot Learning: Inferring Unseen Visual Features from Attributes
* Transfer Learning and Deep Feature Extraction for Planktonic Image Data Sets
* Two Stream LSTM: A Deep Fusion Framework for Human Action Recognition
* Ultrasound Tracking Using ProbeSight: Camera Pose Estimation Relative to External Anatomy by Inverse Rendering of a Prior High-Resolution 3D Surface Map
* Unifying Registration Based Tracking: A Case Study with Structural Similarity
* Universal Skin Detection Without Color Information
* Unsupervised Joint Mining of Deep Features and Image Labels for Large-Scale Radiology Image Categorization and Scene Recognition
* When Was That Made?
* Who Moved My Cheese? Automatic Annotation of Rodent Behaviors with Convolutional Neural Networks
* Writer Identification in Noisy Handwritten Documents
* X-Ray PoseNet: 6 DoF Pose Estimation for Mobile X-Ray Devices
* X-Ray Scattering Image Classification Using Deep Learning
141 for WACV17

* 3D Head Pose Estimation Enhanced Through SURF-Based Key-Frames
* 3D Shape Processing by Convolutional Denoising Autoencoders on Local Patches
* ActionFlowNet: Learning Motion Representation for Action Recognition
* Activity-Conditioned Continuous Human Pose Estimation for Performance Analysis of Athletes Using the Example of Swimming
* Adversarial Training of Variational Auto-Encoders for High Fidelity Image Generation
* Aesthetic Inference for Smart Mobile Devices
* Analysis of Human-Centered Geolocation, An
* Animal Detection Pipeline for Identification, An
* Anomaly Explanation Using Metadata
* Automated Action Units Vs. Expert Raters: Face off
* Automated Top View Registration of Broadcast Football Videos
* Automatic Analysis of Sewer Pipes Based on Unrolled Monocular Fisheye Images
* Balancing Content and Style with Two-Stream FCNs for Style Transfer
* BranchConnect: Image Categorization with Learned Branch Connections
* ByLabel: A Boundary Based Semi-Automatic Image Annotation Tool
* Camera Selection for Broadcasting Soccer Games
* Chainlets: A New Descriptor for Detection and Recognition
* Channel-Recurrent Autoencoding for Image Modeling
* Chess Piece Recognition Using Oriented Chamfer Matching with a Comparison to CNN
* Classification of Crop Lodging with Gray Level Co-occurrence Matrix
* Compact Convolutional Neural Network for Textured Surface Anomaly Detection, A
* Confidence Prediction for Lexicon-Free OCR
* Context-Aware Single-Shot Detector
* Contextually Customized Video Summaries Via Natural Language
* Crowd Counting via Scale-Adaptive Convolutional Neural Network
* Crowd Counting with Minimal Data Using Generative Adversarial Networks for Multiple Target Regression
* CSVideoNet: A Real-Time End-to-End Learning Framework for High-Frame-Rate Video Compressive Sensing
* CT-SRCNN: Cascade Trained and Trimmed Deep Convolutional Neural Networks for Image Super Resolution
* C^2MSNet: A Novel Approach for Single Image Haze Removal
* DARN: A Deep Adversarial Residual Network for Intrinsic Image Decomposition
* Decoupled Learning for Conditional Adversarial Networks
* Deep Cosine Metric Learning for Person Re-identification
* Deep Four-Stream Siamese Convolutional Neural Network with Joint Verification and Identification Loss for Person Re-Detection, A
* Deep Radio-Visual Localization
* DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification
* DeepSolarEye: Power Loss Prediction and Weakly Supervised Soiling Localization via Fully Convolutional Networks for Solar Panels
* DeepWheat: Estimating Phenotypic Traits from Crop Images with Deep Learning
* DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image
* Delay Compensation for Actuated Stereoscopic 360 Degree Telepresence Systems with Probabilistic Head Motion Prediction
* Depth Map Completion by Jointly Exploiting Blurry Color Images and Sparse Depth Maps
* Detect-SLAM: Making Object Detection and SLAM Mutually Beneficial
* DGSAC: Density Guided Sampling and Consensus
* Discriminative Cross-View Binary Representation Learning
* Disjoint Multi-task Learning Between Heterogeneous Human-Centric Tasks
* Distributed Active Learning for Image Recognition
* Distribution-Aware Binarization of Neural Networks for Sketch Recognition
* Driving Scene Perception Network: Real-Time Joint Detection, Depth Estimation and Semantic Segmentation
* Dynamic Visual Sequence Prediction with Motion Flow Networks
* ECLIPSE: Ensembles of Centroids Leveraging Iteratively Processed Spatial Eclipse Clustering
* Effective Combination of Vertical and Horizontal Stereo Vision
* Effective Use of Dilated Convolutions for Segmenting Small Object Instances in Remote Sensing Imagery
* Efficient Map Compression for Collaborative Visual SLAM
* Efficient Multi-attribute Similarity Learning Towards Attribute-Based Fashion Search
* Efficient Training for Automatic Defect Classification by Image Augmentation
* Emotion Analysis Using Audio/Video, EMG and EEG: A Dataset and Comparison Study
* End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding
* EnKCF: Ensemble of Kernelized Correlation Filters for High-Speed Object Tracking
* Ensemble Knowledge Transfer for Semantic Segmentation
* Epipolar Line from a Single Pixel, An
* Face and Body Association for Video-Based Face Recognition
* Face Liveness Detection Based on Perceptual Image Quality Assessment Features with Multi-scale Analysis
* Face Sketch Synthesis with Style Transfer Using Pyramid Column Feature
* Face-MagNet: Magnifying Feature Maps to Detect Small Faces
* Factorized Convolutional Networks: Unsupervised Fine-Tuning for Image Clustering
* Fading Affect Bias: Improving the Trade-Off Between Accuracy and Efficiency in Feature Clustering
* FARSA: Fully Automated Roadway Safety Assessment
* Fast and Robust Curve Skeletonization for Real-World Elongated Objects
* Fast Self-Attentive Multimodal Retrieval
* Fine-Grained and Semantic-Guided Visual Attention for Image Captioning
* Foot Contact Timings and Step Length for Sprint Training
* From Pixels to Actions: Learning to Drive a Car with Deep Neural Networks
* Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
* Fusion of Infrared and Visible-Light Videos Using Motion-Compensated Temporal Sub-Band Decompositions
* Fusion of Keypoint Tracking and Facial Landmark Detection for Real-Time Head Pose Estimation
* Gabor Convolutional Networks
* General-Purpose Deep Point Cloud Feature Extractor
* Generating Handwritten Chinese Characters Using CycleGAN
* Generative and Discriminative Sparse Coding for Image Classification Applications
* Generative Approach to Zero-Shot and Few-Shot Action Recognition, A
* Generic Tubelet Proposals for Action Localization
* Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks
* Graph-Based Correlated Topic Model for Trajectory Clustering in Crowded Videos
* Greedy Part Assignment Algorithm for Real-Time Multi-person 2D Pose Estimation, A
* Guided Filtering of Hyperspectral Images
* HoloFace: Augmenting Human-to-Human Interactions on HoloLens
* How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?
* Human Shape Capture and Tracking at Home
* Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory
* Hybrid Method for 3D Pose Estimation of Personalized Human Body Models, A
* Identity-Preserving Face Recovery from Portraits
* Illumination-Invariant Robust Multiview 3D Human Motion Capture
* Image Copy-Move Forgery Detection via an End-to-End Deep Neural Network
* Image Segmentation Using Sparse Subset Selection
* Image2GIF: Generating Cinemagraphs Using Recurrent Deep Q-Networks
* Improvement of Extrinsic Parameters from a Single Stereo Pair
* Improving Object Classification Performance via Confusing Categories Study
* Improving Text-Based Person Search by Spatial Matching and Adaptive Threshold
* Incremental Structural Modeling Based on Geometric and Statistical Analyses
* Instance-Aware Detailed Action Labeling in Videos
* Iris Presentation Attack via Textured Contact Lens in Unconstrained Environment
* Iterative Cross Learning on Noisy Labels
* Joint 3D-2D Based Method for Free Space Detection on Roads, A
* Large Scale Novel Object Discovery in 3D
* LBP Channels for Pedestrian Detection
* Learning Confidence Measures by Multi-modal Convolutional Neural Networks
* Learning Disentangled Multimodal Representations for the Fashion Domain
* Learning Generative Models of Tissue Organization with Supervised GANs
* Learning Hierarchical Models of Complex Daily Activities from Annotated Videos
* Learning Higher Order Potentials for MRFs
* Learning Image Representations by Completing Damaged Jigsaw Puzzles
* Learning Long-Term Invariant Features for Vision-Based Localization
* Learning Semantic Segmentation with Diverse Supervision
* Learning to Detect Human-Object Interactions
* Learning to Detect Multiple Photographic Defects
* Learning to Generate 3D Stylized Character Expressions from Humans
* Learning to Prune Filters in Convolutional Neural Networks
* Learning to See Through Turbulent Water
* Learning to Segment Breast Biopsy Whole Slide Images
* Learning Video-Story Composition via Recurrent Neural Network
* Light-Field Surface Color Segmentation with an Application to Intrinsic Decomposition
* Long-Term Person Re-identification Using True Motion from Videos
* Look-Up Table Unit Activation Function for Deep Convolutional Neural Networks
* Method for Segmentation, Matching and Alignment of Dead Sea Scrolls, A
* Micro-Expression Spotting Using the Riesz Pyramid
* Minimal Non-Linear Camera Pose Estimation Method Using Lines for SLAM Application
* Minimal Solvers for Monocular Rolling Shutter Compensation Under Ackermann Motion
* Modeling Temporal Structure with LSTM for Online Action Detection
* More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement, The
* Multi Feature Deconvolutional Faster R-CNN for Precise Vehicle Detection in Aerial Imagery
* Multi-modal Learning from Unpaired Images: Application to Multi-organ Segmentation in CT and MRI
* Multi-pattern Embedded Phase Shifting Using a High-Speed Projector for Fast and Accurate Dynamic 3D Measurement
* Multi-task Spatiotemporal Neural Networks for Structured Surface Reconstruction
* Multi-view Stereo 3D Edge Reconstruction
* Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking
* Multilinear Autoencoder for 3D Face Model Learning
* Multiple Anthropological Fisher Kernel Framework and Its Application to Kinship Verification
* NCC-Net: Normalized Cross Correlation Based Deep Matcher with Robustness to Illumination Variations
* Neural Algebra of Classifiers
* Object Detection in Real-Time Systems: Going Beyond Precision
* Object Referring in Visual Scene with Spoken Language
* Object-Based Reasoning in VQA
* Object-Centric Photometric Bundle Adjustment with Deep Shape Prior
* Order Preserving Bilinear Model for Person Detection in Multi-Modal Data, An
* Path Reducing Watershed for the GPU
* Person Authentication Using Head Images
* PIVO: Probabilistic Inertial-Visual Odometry for Occlusion-Robust Navigation
* Plug-and-Play CNN for Crowd Motion Analysis: An Application in Abnormal Event Detection
* Predicting Facial Attributes in Video Using Temporal Coherence and Motion-Attention
* QRkit: Sparse, Composable QR Decompositions for Efficient and Stable Solutions to Problems in Computer Vision
* Real-Time Simultaneous 3D Reconstruction and Optical Flow Estimation
* Real-Time Variational Range Image Fusion and Visualization for Large-Scale Scenes Using GPU Hash Tables
* Recognition of Pollen-Bearing Bees from Video Using Convolutional Neural Network
* Recognizing Visual Signatures of Spontaneous Head Gestures
* Recommending Outfits from Personal Closet
* Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
* Recurrent Autoregressive Networks for Online Multi-object Tracking
* ReHAR: Robust and Efficient Human Activity Recognition
* Reliability Map Estimation for CNN-Based Camera Model Attribution
* ReMotENet: Efficient Relevant Motion Event Detection for Large-Scale Home Surveillance Videos
* Retweet Wars: Tweet Popularity Prediction via Dynamic Multimodal Regression
* RGBD Camera Based Material Recognition via Surface Roughness Estimation
* Robust Adaptive Heart-Rate Monitoring Using Face Videos
* Robust and Accurate Text Stroke Segmentation
* Robust and User Friendly 3D Re-Construction of Neutron Tomographic Images
* Robust Subspace Clustering by Bi-Sparsity Pursuit: Guarantees and Sequential Algorithm
* Rotation Adaptive Visual Object Tracking with Motion Consistency
* Rotational Rectification Network: Enabling Pedestrian Detection for Mobile Vision
* Rotationally-Invariant Convolution Module by Feature Map Back-Rotation, A
* Saliency Driven Image Manipulation
* Saliency Prediction for Mobile User Interfaces
* Salient Region-Based Online Object Tracking
* SatTel: A Framework for Commercial Satellite Imagery Exploitation
* Scaling Human-Object Interaction Recognition Through Zero-Shot Learning
* ScanNet: A Fast and Dense Scanning Framework for Metastatic Breast Cancer Detection from Whole-Slide Image
* SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences
* Seeing is Believing: Pedestrian Trajectory Forecasting Using Visual Frustum of Attention
* Segmentation and Shape Extraction from Convolutional Neural Networks
* Segmenting Root Systems in X-Ray Computed Tomography Images Using Level Sets
* Semantic Labeling Based Vehicle Detection in Aerial Imagery
* Semantically Guided Visual Question Answering
* Semi-Supervised Two-Stage Approach to Learning from Noisy Labels, A
* SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
* Simple yet Effective Model for Zero-Shot Learning, A
* SmartPartNet: Part-Informed Person Detection for Body-Worn Smartphones
* Soft-Cascade Learning with Explicit Computation Time Considerations
* SS-LSTM: A Hierarchical LSTM Model for Pedestrian Trajectory Prediction
* Stabilizing First Person 360 Degree Videos
* StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
* Structural Recurrent Neural Network (SRNN) for Group Activity Analysis
* Structured GANs
* Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering
* Super-Resolution for Overhead Imagery Using DenseNets and Adversarial Learning
* Supervised Deep-Autoencoder for Depth Image-Based 3D Model Retrieval
* Synthetic to Real Adaptation with Generative Correlation Alignment Networks
* Task Specific Visual Saliency Prediction with Memory Augmented Conditional Generative Adversarial Networks
* Temporal Difference Networks for Video Action Recognition
* Temporal Sequence Learning for Action Recognition and Prediction, A
* Thermal to Visible Synthesis of Face Images Using Multiple Regions
* To Frontalize or Not to Frontalize: Do We Really Need Elaborate Pre-processing to Improve Face Recognition?
* Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks
* Towards Automated Transcription of Label Text from Pinned Insect Collections
* Towards Robust Deep Neural Networks with BANG
* Towards Structured Analysis of Broadcast Badminton Videos
* Towards the Success Rate of One: Real-Time Unconstrained Salient Object Detection
* Tracking an RGB-D Camera on Mobile Devices Using an Improved Frame-to-Frame Pose Estimation Method
* Tracking by Prediction: A Deep Generative Model for Multi-person Localisation and Tracking
* Two-Point Method for PTZ Camera Calibration in Sports, A
* UG^2: a Video Benchmark for Assessing the Impact of Image Restoration and Enhancement on Automatic Visual Recognition
* Understanding Convolution for Semantic Segmentation
* Unsupervised Clustering Guided Semantic Segmentation
* Unsupervised Object Discovery for Instance Recognition
* Using a Single RGB Frame for Real Time 3D Hand Pose Estimation in the Wild
* Vector Graph Representation for Deformation Transfer Using Poisson Interpolation
* Vehicle Re-Identification by Adversarial Bi-Directional LSTM Network
* Video Inpainting for Arbitrary Foreground Object Removal
* Visual Weather Temperature Prediction
* Weakly Supervised Facial Attribute Manipulation via Deep Adversarial Network
* Where and Who? Automatic Semantic-Aware Person Composition
* Wide-Slice Residual Networks for Food Recognition
* Will People Like Your Image? Learning the Aesthetic Space
* Word Spotting in Silent Lip Videos
222 for WACV18

* 2D-3D Object Detection System for Updating Building Information Models with Mobile Robots, A
* 3D Human Pose Estimation With 2D Marginal Heatmaps
* 3D Reconstruction and Texture Optimization Using a Sparse Set of RGB-D Cameras
* 3DCapsule: Extending the Capsule Architecture to Classify 3D Point Clouds
* Action Quality Assessment Across Multiple Actions
* Action-Agnostic Human Pose Forecasting
* Active Learning with n-ary Queries for Image Recognition
* ADA: Adversarial Data Augmentation for Object Detection
* AddressNet: Shift-Based Primitives for Efficient Convolutional Neural Networks
* Advanced Super-Resolution Using Lossless Pooling Convolutional Networks
* Aligned to the Object, Not to the Image: A Unified Pose-Aligned Representation for Fine-Grained Recognition
* Alignment by Composition
* Analyzing Modern Camera Response Functions
* Ancient Painting to Natural Image: A New Solution for Painting Processing
* AsiANet: Autoencoders in Autoencoder for Unsupervised Monocular Depth Estimation
* Attention Based Natural Language Grounding by Navigating Virtual Environment
* Attention Mechanisms for Object Recognition With Event-Based Cameras
* Attentive and Adversarial Learning for Video Summarization
* Attentive Conditional Channel-Recurrent Autoencoding for Attribute-Conditioned Face Synthesis
* Automatic Detection and Segmentation of Lentil Crop Breeding Plots From Multi-Spectral Images Captured by UAV-Mounted Camera
* Autonomous Curiosity for Real-Time Training Onboard Robotic Agents
* Beyond Pixels: Image Provenance Analysis Leveraging Metadata
* Binary Constrained Deep Hashing Network for Image Retrieval Without Manual Annotation
* BRDF Estimation of Complex Materials with Nested Learning
* Bringing Vision to the Blind: From Coarse to Fine, One Dollar at a Time
* C4Synth: Cross-Caption Cycle-Consistent Text-to-Image Synthesis
* Cascade Attention Machine for Occluded Landmark Detection in 2D X-Ray Angiography
* CDNet: Single Image De-Hazing Using Unpaired Adversarial Training
* Classification and Re-Identification of Fruit Fly Individuals Across Days With Convolutional Neural Networks
* CNN Based Dense Underwater 3D Scene Reconstruction by Transfer Learning Using Bubble Database
* CNN-Based Semantic Segmentation Using Level Set Loss
* Colorful Trees: Visualizing Random Forests for Analysis and Interpretation
* Comparative Analysis of Visual-Inertial SLAM for Assisted Wayfinding of the Visually Impaired, A
* Conditional Deep Generative Model of People in Natural Images, A
* Conditional Generative Adversarial Refinement Networks for Unbalanced Medical Image Semantic Segmentation
* Coupled Generative Adversarial Network for Continuous Fine-Grained Action Segmentation
* Cross Domain Residual Transfer Learning for Person Re-Identification
* Crowd Counting Using Scale-Aware Attention Networks
* DAC: Data-Free Automatic Acceleration of Convolutional Networks
* DAFE-FD: Density Aware Feature Enrichment for Face Detection
* Data Augmentation Using Part Analysis for Shape Classification
* Data-Efficient Graph Embedding Learning for PCB Component Detection
* Deep Learning Approach to Solar-Irradiance Forecasting in Sky-Videos, A
* Deep Micro-Dictionary Learning and Coding Network
* Deep Neural Networks in Fully Connected CRF for Image Labeling with Social Network Metadata
* Deep Representation Learning Characterized by Inter-Class Separation for Image Clustering
* Deep Semantic Instance Segmentation of Tree-Like Structures Using Synthetic Data
* Deep-Hurricane-Tracker: Tracking and Forecasting Extreme Climate Events
* Defocus Magnification Using Conditional Adversarial Networks
* Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance
* Dense 3D Point Cloud Reconstruction Using a Deep Pyramid Network
* Deployment Conscious Automatic Surface Crack Detection
* Detecting Abnormal Events in Video Using Narrowed Normality Clusters
* DGC-Net: Dense Geometric Correspondence Network
* Digging Deeper Into Egocentric Gaze Prediction
* DIMAL: Deep Isometric Manifold Learning Using Sparse Geodesic Sampling
* Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation, A
* Domain Randomization for Scene-Specific Car Detection and Pose Estimation
* Domain-Specific Human-Inspired Binarized Statistical Image Features for Iris Recognition
* EGO-SLAM: A Robust Monocular SLAM for Egocentric Videos
* End-to-End Video Captioning With Multitask Reinforcement Learning
* Euclidean Invariant Recognition of 2D Shapes Using Histograms of Magnitudes of Local Fourier-Mellin Descriptors
* Exploring Classification of Histological Disease Biomarkers From Renal Biopsy Images
* Eyemotion: Classifying Facial Expressions in VR Using Eye-Tracking Cameras
* Fashion Attributes-to-Image Synthesis Using Attention-Based Generative Adversarial Network
* Fashion Is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources
* Fast Face Image Synthesis With Minimal Training
* Fast Geometrically-Perturbed Adversarial Faces
* FgGAN: A Cascaded Unpaired Learning for Background Estimation and Foreground Segmentation
* Framework Towards Domain Specific Video Summarization, A
* FreeLabel: A Publicly Available Annotation Tool Based on Freehand Traces
* Fusion Approach for Multi-Frame Optical Flow Estimation, A
* FuturePose - Mixed Reality Martial Arts Training Using Real-Time 3D Human Pose Forecasting With a RGB Camera
* GAN-Based Pose-Aware Regulation for Video-Based Person Re-Identification
* Gated Context Aggregation Network for Image Dehazing and Deraining
* Generalization in Metric Learning: Should the Embedding Layer Be Embedding Layer?
* Generative Model of Worldwide Facial Appearance, A
* Give Me a Hint! Navigating Image Databases Using Human-in-the-Loop Feedback
* Going Deeper With Semantics: Video Activity Interpretation Using Semantic Contextualization
* Good Choices for Deep Convolutional Feature Encoding
* Good Similar Patches for Image Denoising
* Guided Image Inpainting: Replacing an Image Region by Pulling Content From Another Image
* Gyroscope-Aided Motion Deblurring with Deep Networks
* HiBsteR: Hierarchical Boosted Deep Metric Learning for Image Retrieval
* Hidden States Exploration for 3D Skeleton-Based Gesture Recognition
* Hierarchical Grocery Store Image Dataset With Visual and Semantic Labels, A
* Hierarchy-Based Image Embeddings for Semantic Image Retrieval
* High Fidelity Semantic Shape Completion for Point Clouds Using Latent Optimization
* High-Speed Video from Asynchronous Camera Array
* Human-Centric Light Sensing and Estimation From RGBD Images: The Invisible Light Switch
* IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments
* IEGAN: Multi-Purpose Perceptual Quality Image Enhancement Using Generative Adversarial Network
* Illumination-Invariant Face Recognition With Deep Relit Face Images
* Improved Mixed-Example Data Augmentation
* Improving 3D Human Pose Estimation Via 3D Part Affinity Fields
* Improving Diversity of Image Captioning Through Variational Autoencoders and Adversarial Learning
* Improving Image Captioning by Leveraging Knowledge Graphs
* Improving Robustness of Random Forest Under Label Noise
* Instance-Based Deep Transfer Learning
* Interpretable Visual Question Answering by Visual Grounding From Attention Supervision Mining
* Iris Presentation Attack Detection Based on Photometric Stereo Features
* Iris Recognition: Comparing Visible-Light Lateral and Frontal Illumination to NIR Frontal Illumination
* Is Pose Really Solved? A Frontalization Study On Off-Angle Face Matching
* Joint Event Detection and Description in Continuous Video Streams
* Keep Me In, Coach!: A Computer Vision Perspective on Assessing ACL Injury Risk in Female Athletes
* Keep Your Eye on the Puck: Automatic Hockey Videography
* Latent Fingerprint Enhancement Using Generative Adversarial Networks
* Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision
* Learning Generator Networks for Dynamic Patterns
* Learning On-Road Visual Control for Self-Driving Vehicles With Auxiliary Tasks
* Learning Privacy Preserving Encodings Through Adversarial Training
* Learning Receptive Field Size by Learning Filter Size
* Learning Sports Camera Selection From Internet Videos
* Learning to See the Invisible: End-to-End Trainable Amodal Instance Segmentation
* Learning to Segment With Image-Level Supervision
* Lidar Cloud Detection With Fully Convolutional Networks
* Local Color Mapping Combined with Color Transfer for Underwater Image Enhancement
* Local Gradients Smoothing: Defense Against Localized Adversarial Attacks
* Location-Velocity Attention for Pedestrian Trajectory Prediction
* Low- and Semantic-Level Cues for Forensic Splice Detection
* Low-Shot Learning From Imaginary 3D Model
* LUCFER: A Large-Scale Context-Sensitive Image Dataset for Deep Learning of Visual Emotions
* MAC: Mining Activity Concepts for Language-Based Temporal Localization
* Mask R-CNN With Pyramid Attention Network for Scene Text Detection
* Matching Disparate Image Pairs Using Shape-Aware ConvNets
* Memory Warps for Long-Term Online Video Representations and Anticipation
* Model-Free Tracking With Deep Appearance and Motion Features Integration
* Multi-Component Image Translation for Deep Domain Generalization
* Multi-Layer Pruning Framework for Compressing Single Shot MultiBox Detector
* Multi-Modal Detection Fusion on a Mobile UGV for Wide-Area, Long-Range Surveillance
* Multi-Modality Empowered Network for Facial Action Unit Detection
* Multi-Scale Aggregation Network for Direct Face Alignment
* Multi-Scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets
* Multi-Scale Dense Networks for Deep High Dynamic Range Imaging
* MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving
* Multispectral Direct-Global Separation of Dynamic Scenes
* MURAUER: Mapping Unlabeled Real Data for Label AUstERity
* No-Reference Image Quality Assessment: An Attention Driven Approach
* Observing Pianist Accuracy and Form with Computer Vision
* Omnidirectional Pedestrian Detection by Rotation Invariant Training
* On Measuring the Iconicity of a Face
* On the Importance of Feature Aggregation for Face Reconstruction
* Online and Batch Supervised Background Estimation Via L1 Regression
* Online Multi-Object Tracking With Instance-Aware Tracker and Dynamic Model Refreshment
* Online Video Summarization: Predicting Future to Better Summarize Present
* Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy
* Ordinal Regression Using Noisy Pairwise Comparisons for Body Mass Index Range Estimation
* Performance of Humans in Iris Recognition: The Impact of Iris Condition and Annotation-Driven Verification
* Photo-Realistic Facial Texture Transfer
* Photo-Sketching: Inferring Contour Drawings From Images
* Pixel-Wise Attentional Gating for Scene Parsing
* Power Normalizing Second-Order Similarity Network for Few-Shot Learning
* Predicting Gender From Iris Texture May Be Harder Than It Seems
* Progressively-Trained Scale-Invariant and Boundary-Aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions, A
* Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos, A
* Prototypicality Effects in Global Semantic Description of Objects
* Rapid Technique to Eliminate Moving Shadows for Accurate Vehicle Detection
* Real-Time Progressive 3D Semantic Segmentation for Indoor Scenes
* Recovering Faces From Portraits with Auxiliary Facial Attributes
* Recurrent Flow-Guided Semantic Forecasting
* Recurrent Iterative Gating Networks for Semantic Segmentation
* Resultant Based Incremental Recovery of Camera Pose From Pairwise Matches
* Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps With Accurate Object Boundaries
* RGBD2lux: Dense Light Intensity Estimation With an RGBD Sensor
* SAF-BAGE: Salient Approach for Facial Soft-Biometric Classification - Age, Gender, and Facial Expression
* Satellite Imagery Multiscale Rapid Detection with Windowed Networks
* Scalable Logo Recognition Using Proxies
* Scale Pyramid Network for Crowd Counting
* Scene Parsing Via Dense Recurrent Neural Networks With Attentional Selection
* Segmenting Sky Pixels in Images: Analysis and Comparison
* Self-Paced Adversarial Training for Multimodal Few-Shot Learning
* Self-Supervised Bootstrap Method for Single-Image 3D Face Reconstruction, A
* Sem-GAN: Semantically-Consistent Image-to-Image Translation
* Semantic Correspondence in the Wild
* Semantic Instance Meets Salient Object: Study on Video Semantic Salient Instance Segmentation
* Semantic Matching by Weakly Supervised 2D Point Set Registration
* Semantic Stereo for Incidental Satellite Images
* Semi-Dense Stereo Matching Using Dual CNNs
* Semi-Supervised 3D Abdominal Multi-Organ Segmentation Via Deep Multi-Planar Co-Training
* Semi-Supervised Convolutional Neural Networks for In-Situ Video Monitoring of Selective Laser Melting
* Sensor Adaptation for Improved Semantic Segmentation of Overhead Imagery
* SfMLearner++: Learning Monocular Depth Ego-Motion Using Meaningful Geometric Constraints
* Shadow Patching: Guided Image Completion for Shadow Removal
* Single Image Deblurring and Camera Motion Estimation With Depth Map
* Single-Shot Analysis of Refractive Shape Using Convolutional Neural Networks
* Skeleton-Based Action Recognition of People Handling Objects
* Skip Residual Pairwise Networks With Learnable Comparative Functions for Few-Shot Learning
* Soft Transfer Learning via Gradient Diagnosis for Visual Relationship Detection
* Space-Time Event Clouds for Gesture Recognition: From RGB Cameras to Event Cameras
* SPaSe - Multi-Label Page Segmentation for Presentation Slides
* Spatial Focal Loss for Pedestrian Detection in Fisheye Imagery
* Spatial Knowledge Distillation to Aid Visual Reasoning
* Stability Based Filter Pruning for Accelerating Deep CNNs
* Starts Better and Ends Better: A Target Adaptive Image Signature Tracker
* Still Image Action Recognition by Predicting Spatial-Temporal Pixel Evolution
* Stochastic Gradient Descent With Hyperbolic-Tangent Decay on Classification
* Student Attendance System in Crowded Classrooms Using a Smartphone Camera
* Style and Content Disentanglement in Generative Adversarial Networks
* TAN: Temporal Aggregation Network for Dense Multi-Label Action Recognition
* Task Relation Networks
* Taylor Convolutional Networks for Image Classification
* TextCaps: Handwritten Character Recognition With Very Small Datasets
* TextContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade
* ThunderNet: A Turbo Unified Network for Real-Time Semantic Segmentation
* Toward Computer Vision Systems That Understand Real-World Assembly Processes
* Training Adversarial Discriminators for Cross-Channel Abnormal Event Detection in Crowds
* Tukey-Inspired Video Object Segmentation
* Unbounded Sparse Census Transform Using Genetic Algorithm
* Understanding Image Quality and Trust in Peer-to-Peer Marketplaces
* Understanding Kernel Size in Blind Deconvolution
* Universal Image Attractiveness Ranking Framework, An
* Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors From Images
* Unsupervised Feature Learning of Human Actions As Trajectories in Pose Embedding Manifold
* VeGAN: Using GANs for Augmentation in Latent Space to Improve the Semantic Segmentation of Vehicles in Images From an Aerial Perspective
* VelocityGAN: Subsurface Velocity Image Estimation Using Conditional Adversarial Networks
* Ventral-Dorsal Neural Networks: Object Detection Via Selective Attention
* Video Action Recognition With an Additional End-to-End Trained Temporal Stream
* Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition
* Video Summarization Via Actionness Ranking
* Video-Based Face Alignment With Local Motion Modeling
* Video-Rate Video Inpainting
* Visualizing Deep Similarity Networks
* Warping-Based Stereoscopic 3D Video Retargeting With Depth Remapping
* Weakly-Supervised Spatial Context Networks
* Where to Focus on for Human Action Recognition?
* Which Body Is Mine?
* Zero Shot License Plate Re-Identification
* Zero-Shot Learning Via Recurrent Knowledge Transfer
229 for WACV19

* 2-MAP: Aligned Visualizations for Comparison of High-Dimensional Point Sets
* 360 Panorama Synthesis from a Sparse Set of Images with Unknown Field of View
* 360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images
* 3D Hand Pose Estimation with Disentangled Cross-Modal Latent Space
* 3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training
* Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
* Accuracy Booster: Performance Boosting using Feature Map Re-calibration
* Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks
* Action Segmentation with Mixed Temporal Domain Adaptation
* Active Adversarial Domain Adaptation
* Active Learning for Imbalanced Datasets
* Actor Conditioned Attention Maps for Video Action Detection
* Adapting Grad-CAM for Embedding Networks
* Adapting Style and Content for Attended Text Sequence Recognition
* Adaptive Aggregation of Arbitrary Online Trackers with a Regret Bound
* Adaptive Neural Connections for Sparsity Learning
* ADNet: Adaptively Dense Convolutional Neural Networks
* Adversarial Defense based on Structure-to-Signal Autoencoders
* Adversarial Discriminative Attention for Robust Anomaly Detection
* Adversarial Domain Adaptation Network For Cross-Domain Fine-Grained Recognition, An
* Adversarial Examples for Edge Detection: They Exist, and They Transfer
* Adversarial Sampling for Active Learning
* AlignNet: A Unifying Approach to Audio-Visual Alignment
* Analysis and a Solution of Momentarily Missed Detection for Anchor-based Object Detectors
* Anchor Box Optimization for Object Detection
* Animal Detection in Man-made Environments
* Animating Face using Disentangled Audio Representations
* Answering Questions about Data Visualizations using Efficient Bimodal Fusion
* Appearance and Shape from Water Reflection
* Architecture Search of Dynamic Cells for Semantic Video Segmentation
* Attention Flow: End-to-End Joint Attention Estimation
* Attention-based Fusion for Multi-source Human Image Generation
* Audio-Visual Model Distillation Using Acoustic Images
* AutoToon: Automatic Geometric Warping for Face Cartoon Generation
* BERT Representations for Video Question Answering
* Best Frame Selection in a Short Video
* BIRDSAI: A Dataset for Detection and Tracking in Aerial Thermal Infrared Videos
* Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes
* Body Pose Sonification for a View-Independent Auditory Aid to Blind Rock Climbers
* Boosting Deep Face Recognition via Disentangling Appearance and Geometry
* Boosting Standard Classification Architectures Through a Ranking Regularizer
* BRDF-Reconstruction in Photogrammetry Studio Setups
* Bridged Variational Autoencoders for Joint Modeling of Images and Attributes
* BSUV-Net: A Fully-Convolutional Neural Network for Background Subtraction of Unseen Videos
* Calibrated Domain-Invariant Learning for Highly Generalizable Large Scale Re-Identification
* Can a CNN Automatically Learn the Significance of Minutiae Points for Fingerprint Matching?
* Can I teach a robot to replicate a line art
* CANZSL: Cycle-Consistent Adversarial Networks for Zero-Shot Learning from Natural Language
* Casting Geometric Constraints in Semantic Segmentation as Semi-Supervised Learning
* Characteristic Regularisation for Super-Resolving Face Images
* Charting the Right Manifold: Manifold Mixup for Few-shot Learning
* ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution
* City-Scale Road Extraction from Satellite Imagery v2: Road Speeds and Travel Times
* Class-Discriminative Feature Embedding For Meta-Learning based Few-Shot Classification
* Class-incremental Learning via Deep Model Consolidation
* Classifying All Interacting Pairs in a Single Shot
* Cloud Removal in Satellite Images Using Spatiotemporal Generative Networks
* CoachGAN
* Color Composition Similarity and Its Application in Fine-grained Similarity
* Combinational Class Activation Maps for Weakly Supervised Object Localization
* Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion
* Component Attention Guided Face Super-Resolution Network: CAGFace
* Composition-Aware Image Aesthetics Assessment
* CompressNet: Generative Compression at Extremely Low Bitrates
* CookGAN: Meal Image Synthesis from Ingredients
* Cooperative Initialization based Deep Neural Network Training
* Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zero-shot Classification and Retrieval of Videos
* Cross-Conditioned Recurrent Networks for Long-Term Synthesis of Inter-Person Human Motion Interactions
* Cross-Domain Face Synthesis using a Controllable GAN
* Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval
* Cross-Time and Orientation-Invariant Overhead Image Geolocalization Using Deep Local Features
* Cross-View Contextual Relation Transferred Network for Unsupervised Vehicle Tracking in Drone Videos
* CrossNet: Latent Cross-Consistency for Unpaired Image Translation
* Crowded Human Detection via an Anchor-pair Network
* D3D: Distilled 3D Networks for Video Action Recognition
* DATNet: Dense Auxiliary Tasks for Object Detection
* DAVID: Dual-Attentional Video Deblurring
* DCIL: Deep Contextual Internal Learning for Image Restoration and Image Retargeting
* Deep Adaptive Wavelet Network
* Deep Bayesian Network for Visual Question Generation
* Deep Image Blending
* Deep Learning on Small Datasets without Pre-Training using Cosine Loss
* Deep Position-Aware Hashing for Semantic Continuous Image Retrieval
* Deep Remote Sensing Methods for Methane Detection in Overhead Hyperspectral Imagery
* DeepErase: Weakly Supervised Ink Artifact Removal in Document Text Images
* DeepFuse: An IMU-Aware Network for Real-Time 3D Human Pose Estimation from Multi-View Image
* DeepPTZ: Deep Self-Calibration for PTZ Cameras
* DeFraudNet: End2End Fingerprint Spoof Detection using Patch Level Attention
* Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection
* DeOccNet: Learning to See Through Foreground Occlusions in Light Fields
* Depth Completion via Deep Basis Fitting
* Detecting Face2Face Facial Reenactment in Videos
* Detecting Morphed Face Attacks Using Residual Noise from Deep Multi-scale Context Aggregation Network
* Detecting the Starting Frame of Actions in Video
* Devon: Deformable Volume Network for Learning Optical Flow
* DGGAN: Depth-image Guided Generative Adversarial Networks for Disentangling RGB and Depth Images in 3D Hand Pose Estimation
* Differentiable Scene Graphs
* DIPNet: Dynamic Identity Propagation Network for Video Object Segmentation
* Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision
* Distributed Iterative Gating Networks for Semantic Segmentation
* Do As I Do: Transferring Human Motion and Appearance between Monocular Videos with Spatial and Temporal Constraints
* Does Face Recognition Accuracy Get Better With Age? Deep Face Matchers Say No
* Domain Bridge for Unpaired Image-to-Image Translation and Unsupervised Domain Adaptation
* Domain-Specific Semantics Guided Approach to Video Captioning
* Dual-Mode Training with Style Control and Quality Enhancement for Road Image Domain Adaptation
* Dynamic Motion Representation for Human Action Recognition
* EDGE20: A Cross Spectral Evaluation Dataset for Multiple Surveillance Problems
* Efficient Object Detection in Large Images Using Deep Reinforcement Learning
* Efficient Video Semantic Segmentation with Labels Propagation and Refinement
* ELoPE: Fine-Grained Visual Classification with Efficient Localization, Pooling and Embedding
* End to End Lip Synchronization with a Temporal AutoEncoder
* End-To-End Trainable Video Super-Resolution Based on a New Mechanism for Implicit Motion Estimation and Compensation
* Enhanced generative adversarial network for 3D brain MRI super-resolution
* EpO-Net: Exploiting Geometric Constraints on Dense Trajectories for Motion Saliency
* Erasing Scene Text with Weak Supervision
* Estimating 3D Camera Pose from 2D Pedestrian Trajectories
* Evaluation of Image Inpainting for Classification and Retrieval
* Event-based Star Tracking via Multiresolution Progressive Hough Transforms
* Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability
* Exploring Hate Speech Detection in Multimodal Publications
* Extended Exposure Fusion and its Application to Single Image Contrast Enhancement, An
* Extracting identifying contours for African elephants and humpback whales using a learned appearance model
* Eye Contact Correction using Deep Neural Networks
* EyeGAN: Gaze-Preserving, Mask-Mediated Eye Image Synthesis
* Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures
* Fast Image Reconstruction with an Event Camera
* Fast Postprocessing for Difficult Discrete Energy Minimization Problems
* Fast Video Multi-Style Transfer
* Few-Shot Learning of Video Action Recognition Only Based on Video Contents
* Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning
* Figure Captioning with Relation Maps for Reasoning
* Filter Distillation for Network Compression
* Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
* Fine-Grained Motion Representation For Template-Free Visual Tracking
* Flexible Selection Scheme for Minimum-Effort Transfer Learning, A
* FlowNet3D++: Geometric Losses For Deep Scene Flow Estimation
* Focusing Visual Relation Detection on Relevant Relations with Prior Potentials
* Fourier Based Pre-Processing For Seeing Through Water
* From Image to Video Face Inpainting: Spatial-Temporal Nested GAN (STN-GAN) for Usability Recovery
* Frustum VoxNet for 3D object detection from RGB-D or Depth images
* Fungi Recognition: A Practical Use Case
* FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data
* Fusing Semantics and Motion State Detection for Robust Visual SLAM
* FX-GAN: Self-Supervised GAN Learning via Feature Exchange
* GAN-based Tunable Image Compression System, A
* GAR: Graph Assisted Reasoning for Object Detection
* Gaze Estimation for Assisted Living Environments
* Generating Positive Bounding Boxes for Balanced Training of Object Detectors
* Generative Framework for Zero-Shot Learning with Adversarial Domain Adaptation, A
* Generative Model with Semantic Embedding and Integrated Classifier for Generalized Zero-Shot Learning
* Generative Pseudo-label Refinement for Unsupervised Domain Adaptation
* Geometric Image Correspondence Verification by Dense Pixel Matching
* Global Co-occurrence Feature Learning and Active Coordinate System Conversion for Skeleton-based Action Recognition
* Global Context Reasoning for Semantic Segmentation of 3D Point Clouds
* Going Beyond the Regression Paradigm with Accurate Dot Prediction for Dense Crowds
* Going Much Wider with Deep Networks for Image Super-Resolution
* GradMix: Multi-source Transfer across Domains and Tasks
* Graph Networks for Multiple Object Tracking
* Graph Neural Networks for Image Understanding Based on Multiple Cues: Group Emotion Recognition and Event Recognition as Use Cases
* Hand-Priming in Object Localization for Assistive Egocentric Vision
* High Accuracy Face Geometry Capture using a Smartphone Video
* High-Frequency Refinement for Sharper Video Super-Resolution
* HistoNet: Predicting size histograms of object instances
* How Much Deep Learning does Neural Style Transfer Really Need? An Ablation Study
* I-MOVE: Independent Moving Objects for Velocity Estimation
* ICface: Interpretable and Controllable Face Reenactment Using GANs
* Identifying Recurring Patterns with Deep Neural Networks for Natural Image Denoising
* Image denoising via K-SVD with primal-dual active set algorithm
* Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN)
* Image Hashing via Linear Discriminant Learning
* Image identification of Protea species with attributes and subgenus scaling
* Image to Video Domain Adaptation Using Web Supervision
* ImaGINator: Conditional Spatio-Temporal GAN for Video Generation
* Improved Embeddings with Easy Positive Triplet Mining
* Improving Object Detection with Inverted Attention
* Improving Style Transfer with Calibrated Metrics
* Inferring Super-Resolution Depth from a Moving Light-Source Enhanced RGB-D Sensor: A Variational Approach
* Instance Segmentation for the Quantification of Microplastic Fiber Images
* Instance Segmentation of Benthic Scale Worms at a Hydrothermal Site
* Intelligent Image Collection: Building the Optimal Dataset
* Internet of Things (IoT) Discovery Using Deep Neural Networks
* Inverse Rectification for Efficient Procam Pattern Correspondence
* Is Pruning Compression?: Investigating Pruning Via Network Layer Similarity
* It's All About The Scale: Efficient Text Detection Using Adaptive Scaling
* Iterative and Adaptive Sampling with Spatial Attention for Black-Box Model Explanations
* IterNet: Retinal Image Segmentation Utilizing Structural Redundancy in Vessel Networks
* Jointly Trained Image and Video Generation using Residual Vectors
* Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
* L*ReLU: Piece-wise Linear Activation Functions for Deep Fine-grained Visual Categorization
* Lane detection using lane boundary marker network with road geometry constraints
* LEAF-QA: Locate, Encode & Attend for Figure Question Answering
* Learn a Global Appearance Semi-Supervisedly for Synthesizing Person Images
* Learning a distance function with a Siamese network to localize anomalies in videos
* Learning Discriminative and Generalizable Representations by Spatial-Channel Partition for Person Re-Identification
* Learning from Noisy Labels via Discrepant Collaborative Training
* Learning from THEODORE: A Synthetic Omnidirectional Top-View Indoor Dataset for Deep Transfer Learning
* Learning Multimodal Representations for Unseen Activities
* Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild
* Leveraging Filter Correlations for Deep Model Compression
* Leveraging Pretrained Image Classifiers for Language-Based Segmentation
* Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning
* Little Fog for a Large Turn, A
* Local Binary Pattern Networks
* Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios
* Long-Short Graph Memory Network for Skeleton-Based Action Recognition
* Looking Ahead: Anticipating Pedestrians Crossing with Future Frames Prediction
* Looking deeper into Time for Activities of Daily Living Recognition
* Low Cost, High Performance Automatic Motorcycle Helmet Violation Detection
* Main-Secondary Network for Defect Segmentation of Textured Surface Images
* MaskPlus: Improving Mask Generation for Instance Segmentation
* Measuring the Utilization of Public Open Spaces by Deep Learning: A Benchmark Study at the Detroit Riverfront
* MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding
* microbatchGAN: Stimulating Diversity with Multi-Adversarial Discrimination
* MLSL: Multi-Level Self-Supervised Learning for Domain Adaptation with Spatially Independent and Semantically Consistent Labeling
* MoBiNet: A Mobile Binary Network for Image Classification
* Model-Agnostic Metric for Zero-Shot Learning
* MonoLayout: Amodal scene layout from a single image
* MotionRec: A Unified Deep Framework for Moving Object Recognition
* Multi Receptive Field Network for Semantic Segmentation
* Multi-class Novelty Detection Using Mix-up Technique
* Multi-Label Visual Feature Learning with Attentional Aggregation
* Multi-Level Representation Learning for Deep Subspace Clustering
* Multi-Modal Association based Grouping for Form Structure Extraction
* Multi-Scale Adversarial Cross-Domain Detection with Robust Discriminative Learning
* Multi-Scale Guided Cascade Hourglass Network for Depth Completion, A
* Multi-Space Approach to Zero-Shot Object Detection, A
* Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection
* Multi-way Encoding for Robustness
* Multimodal Image Outpainting with Regularized Normalized Diversification
* Multiparty Visual Co-Occurrences for Estimating Personality Traits in Group Meetings
* Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments
* Multiview Co-segmentation for Wide Baseline Images using Cross-view Supervision
* Multiview Supervision By Registration
* Munich to Dubai: How far is it for Semantic Segmentation?
* Network Pruning Network Approach to Deep Model Compression, A
* Neural Puppet: Generative Layered Cartoon Characters
* Neural Sign Language Synthesis: Words Are Our Glosses
* NeurReg: Neural Registration and Its Application to Image Segmentation
* Non-Rigid Structure from Motion: Prior-Free Factorization Method Revisited
* Nonparametric Structure Regularization Machine for 2D Hand Pose Estimation
* Novel Inspection System For Variable Data Printing Using Deep Learning, A
* Novel Self-Supervised Re-labeling Approach for Training with Noisy Labels, A
* NRMVS: Non-Rigid Multi-View Stereo
* Offset Calibration for Appearance-Based Gaze Estimation via Gaze Decomposition
* On Hallucinating Context and Background Pixels from a Face Mask using Multi-scale GANs
* On Scene Flow Computation of Gas Structures with Optical Gas Imaging Cameras
* One-and-Half Stage Pedestrian Detector, A
* One-to-one Mapping for Unpaired Image-to-image Translation
* Online Lens Motion Smoothing for Video Autofocus
* Optimizing Through Learned Errors for Accurate Sports Field Registration
* Overlap Sampler for Region-Based Object Detection
* Overlooked Elephant of Object Detection: Open Set, The
* Partially Zero-shot Domain Adaptation from Incomplete Target Data with Missing Classes
* Periphery-Fovea Multi-Resolution Driving Model Guided by Human Attention
* Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
* PlotQA: Reasoning over Scientific Plots
* Plug-and-Play Rescaling Based Crowd Counting in Static Images
* Plugin Networks for Inference under Partial Evidence
* PointGrow: Autoregressively Learned Point Cloud Generation with Self-Attention
* PointPoseNet: Point Pose Network for Robust 6D Object Pose Estimation
* Pose Guided Gated Fusion for Person Re-identification
* Post-Mortem Iris Recognition Resistant to Biological Eye Decay Processes
* Predicting the Physical Dynamics of Unseen 3D Objects
* Preference-Based Image Generation
* Print Defect Mapping with Semantic Segmentation
* Probabilistic Object Detection: Definition and Evaluation
* Progressive Domain Adaptation for Object Detection
* Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention
* Propose-and-Attend Single Shot Detector
* PSNet: A Style Transfer Network for Point Cloud Stylization on Geometry and Color
* QR-code Reconstruction from Event Data via Optimization in Code Subspace
* Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions
* QUICKSAL: A small and sparse visual saliency model for efficient inference in resource constrained hardware
* Real-Time Multi-Person Pose Tracking using Data Assimilation
* Real-time vehicle distance estimation using single view geometry
* Real-time Visual Object Tracking with Natural Language Description
* Reconstructing Road Network Graphs from both Aerial Lidar and Images
* Reducing Footskate in Human Motion Reconstruction with Ground Contact Constraints
* Reference Grid-assisted Network for 3D Point Signature Learning from Point Clouds
* Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition
* Regularize, Expand and Compress: NonExpansive Continual Learning
* Relativistic Discriminator: A One-Class Classifier for Generalized Iris Presentation Attack Detection
* Representing Objects in Video as Space-Time Volumes by Combining Top-Down and Bottom-Up Processes
* Resisting Large Data Variations via Introspective Transformation Network
* ReStGAN: A step towards visually guided shopper experience via text-to-image synthesis
* Reverse Variational Autoencoder for Visual Attribute Manipulation and Anomaly Detection
* Robust estimation of local affine maps and its applications to image matching
* Robust Explanations for Visual Question Answering
* Robust Face Detection via Learning Small Faces on Hard Images
* Robust Facial Landmark Detection via Aggregation on Geometrically Manipulated Faces
* Robust Feature Tracking in DVS Event Stream using Bézier Mapping
* Robust Template-Based Non-Rigid Motion Tracking Using Local Coordinate Regularization
* ROSS: Robust Learning of One-Shot 3D Shape Segmentation
* Rotation-invariant Mixed Graphical Model Network for 2D Hand Pose Estimation
* RPM-Net: Robust Pixel-Level Matching Networks for Self-Supervised Video Object Segmentation
* s-SBIR: Style Augmented Sketch based Image Retrieval
* ScaIL: Classifier Weights Scaling for Class Incremental Learning
* Scalable Detection of Offensive and Non-compliant Content / Logo in Product Images
* Scale Match for Tiny Person Detection
* Scale-aware Conditional Generative Adversarial Network for Image Dehazing
* See the Sound, Hear the Pixels
* Self-Attention Network for Skeleton-based Human Action Recognition
* Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer
* Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction
* Self-Guided Novel View Synthesis via Elastic Displacement Network
* Self-Orthogonality Module: A Network Architecture Plug-in for Learning Orthogonal Filters
* Semantic Consistency and Identity Mapping Multi-Component Generative Adversarial Network for Person Re-Identification
* Shape Constrained Network for Eye Segmentation in the Wild
* SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On
* Silhouette Guided Point Cloud Reconstruction beyond Occlusion
* Simultaneous Detection and Removal of Dynamic Objects in Multi-view Images
* SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder
* Single Satellite Optical Imagery Dehazing using SAR Image Prior Based on conditional Generative Adversarial Networks
* SketchTransfer: A Challenging New Task for Exploring Detail-Invariance and the Abstractions Learned by Deep Networks
* Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation
* SmartOverlays: A Visual Saliency Driven Label Placement for Intelligent Human-Computer Interfaces
* SmoothFool: An Efficient Framework for Computing Smooth Adversarial Perturbations
* Spatial-Content Image Search in Complex Scenes
* Spatio-Temporal Pyramid Graph Convolutions for Human Action Recognition and Postural Assessment
* Spatio-Temporal Ranked-Attention Networks for Video Captioning
* Stable Intrinsic Auto-Calibration from Fundamental Matrices of Devices with Uncorrelated Camera Parameters
* Stacked Adversarial Network for Zero-Shot Sketch based Image Retrieval
* Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation
* Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy
* Stochastic Dynamics for Video Infilling
* Street Scene: A new dataset and evaluation protocol for video anomaly detection
* Structured Compression of Deep Neural Networks with Debiased Elastic Group LASSO
* Style Transfer for Light Field Photography
* Super-resolved Chromatic Mapping of Snapshot Mosaic Image Sensors via a Texture Sensitive Residual Network
* Supervised and Unsupervised Learning of Parameterized Color Enhancement
* SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark
* SymGAN: Orientation Estimation without Annotation for Symmetric Objects
* Synthesizing human-like sketches from natural images using a conditional convolutional decoder
* Synthetic Examples Improve Generalization for Rare Classes
* Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation, The
* TailorGAN: Making User-Defined Fashion Designs
* Template-Based Automatic Search of Compact Semantic Segmentation Architectures
* Temporal Aggregation with Clip-level Attention for Video-based Person Re-identification
* Temporal Contrastive Pretraining for Video Action Recognition
* Temporal Similarity Analysis of Remote Photoplethysmography for Fast 3D Mask Face Presentation Attack Detection
* Text-based Person Search via Attribute-aided Matching
* TKD: Temporal Knowledge Distillation for Active Perception
* Toward Explainable Fashion Recommendation
* Toward Interactive Self-Annotation For Video Object Bounding Box: Recurrent Self-Learning And Hierarchical Annotation Based Framework
* Towards a Unified Framework for Visual Compatibility Prediction
* Towards Good Practice for CNN-Based Monocular Depth Estimation
* Towards Learning Affine-Invariant Representations via Data-Efficient CNNs
* Towards Photographic Image Manipulation with Balanced Growing of Generative Autoencoders
* Towards Preserving the Ephemeral: Texture-Based Background Modelling for Capturing Back-of-the-Napkin Notes
* Training with Noise Adversarial Network: A Generalization Method for Object Detection on Sonar Image
* Transductive Zero-Shot Learning for 3D Point Cloud Classification
* Triple-SGM: Stereo Processing using Semi-Global Matching with Cost Fusion
* Two-Grid Preconditioned Solver for Bundle Adjustment
* TwoStreamVAN: Improving Motion Modeling in Video Generation
* ULSAM: Ultra-Lightweight Subspace Attention Module for Compact Convolutional Neural Networks
* Uncertainty in Model-Agnostic Meta-Learning using Variational Inference
* Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving
* UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking
* Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition
* Unsupervised and Semi-Supervised Domain Adaptation for Action Recognition from Drones
* Unsupervised Cross-Dataset Adaptation via Probabilistic Amodal 3D Human Pose Completion
* Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis
* Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks
* Unsupervised Learning of Camera Pose with Compositional Re-estimation
* Variational Image Deraining
* Very Power Efficient Neural Time-of-Flight
* Video Object Segmentation-based Visual Servo Control and Object Depth Estimation on a Mobile Robot
* Video Person Re-Identification using Learned Clip Similarity Aggregation
* ViP: Virtual Pooling for Accelerating CNN-based Image Classification and Object Detection
* Visual Question Answering on 360° Images
* VRT-Net: Real-Time Scene Parsing via Variable Resolution Transform
* Watch to Listen Clearly: Visual Speech Enhancement Driven Multi-modality Speech Recognition
* Weakly Supervised Gaussian Networks for Action Detection
* Weakly Supervised Graph Convolutional Neural Network for Human Action Localization
* Weakly Supervised Temporal Action Localization Using Deep Metric Learning
* Weakly-Supervised Multi-Person Action Recognition in 360° Videos
* Wide Hidden Expansion Layer for Deep Convolutional Neural Networks
* Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison
* 2D to 3D Medical Image Colorization
* 3D Dense Geometry-Guided Facial Expression Synthesis by Adversarial Learning
* 3D Human Pose and Shape Estimation Through Collaborative Learning and Multi-view Model-fitting
* 3DPoseLite: A Compact 3D Pose Estimation Using Node Embeddings
* Accelerated WGAN update strategy with loss change rate balancing
* Action Duration Prediction for Segment-Level Alignment of Weakly-Labeled Videos
* Active Latent Space Shape Model: A Bayesian Treatment of Shape Model Adaptation with an Application to Psoriatic Arthritis Radiographs
* Active Learning for Bayesian 3D Hand Pose Estimation
* ADA-AT/DT: An Adversarial Approach for Cross-Domain and Cross-Task Knowledge Transfer
* Adaptiope: A Modern Benchmark for Unsupervised Domain Adaptation
* Adaptive Multiplane Image Generation from a Single Internet Picture
* Adaptive Privacy Preserving Deep Learning Algorithms for Medical Data
* Adaptive Streaming of 360-Degree Videos with Reinforcement Learning
* Adaptive-Attentive Geolocalization from Few Queries: A Hybrid Approach
* AdarGCN: Adaptive Aggregation GCN for Few-Shot Learning
* Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples
* Adversarial Dual Distinct Classifiers for Unsupervised Domain Adaptation
* Adversarial Reinforcement Learning for Unsupervised Domain Adaptation
* AI on the Bog: Monitoring and Evaluating Cranberry Crop Risk
* Alleviating Over-segmentation Errors by Detecting Action Boundaries
* Alternative of LiDAR in Nighttime: Unsupervised Depth Estimation Based on Single Thermal Image, An
* Analyzing Deep Neural Network's Transferability via Fréchet Distance
* Appending Adversarial Frames for Universal Video Attack
* Are These from the Same Place? Seeing the Unseen in Cross-View Image Geo-Localization
* Assessing Image and Text Generation with Topological Analysis and Fuzzy Logic
* Asymmetric Contextual Modulation for Infrared Small Target Detection
* ATM: Attentional Text Matting
* Attention-Based Spatial Guidance for Image-to-Image Translation
* Attentional Feature Fusion
* Audio- and Gaze-driven Facial Animation of Codec Avatars
* Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
* Auto-Navigator: Decoupled Neural Architecture Search for Visual Navigation
* Automatic Calibration of the Fisheye Camera for Egocentric 3D Human Pose Estimation from a Single Image
* Automatic Object Recoloring Using Adversarial Learning
* Automatic Open-World Reliability Assessment
* Automatic Quantification of Plant Disease from Field Image Data Using Deep Learning
* Autonomous Tracking For Volumetric Video Sequences
* AutoRetouch: Automatic Professional Face Retouching
* Auxiliary Tasks for Efficient Learning of Point-Goal Navigation
* AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings
* Benchmark for Evaluating Pedestrian Action Prediction
* Benefiting from Bicubically Down-Sampled Images for Learning Real-World Image Super-Resolution
* Boosting Monocular Depth with Panoptic Segmentation Maps
* Breaking Shortcuts by Masking for Robust Visual Reasoning
* Can Selfless Learning improve accuracy of a single classification task?
* CAP: Context-Aware Pruning for Semantic Segmentation
* CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-Ethnicity Face Anti-spoofing
* CAT-Net: Compression Artifact Tracing Network for Detection and Localization of Image Splicing
* CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection
* ChartOCR: Data Extraction from Charts Images via a Deep Hybrid Framework
* Cinematic-L1 Video Stabilization with a Log-Homography Model
* CIT-GAN: Cyclic Image Translation Generative Adversarial Network With Application in Iris Presentation Attack Detection
* Class Anchor Clustering: A Loss for Distance-based Open Set Recognition
* Class-agnostic Few-shot Object Counting
* Class-agnostic Object Detection
* Class-wise Metric Scaling for Improved Few-Shot Classification
* ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning
* Coarse Temporal Attention Network (CTA-Net) for Driver's Activity Recognition
* Coarse- and Fine-grained Attention Network with Background-aware Loss for Crowd Density Map Estimation
* Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance
* CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences
* Compositional Embeddings for Multi-Label One-Shot Learning
* Compositional Learning of Image-Text Query for Image Retrieval
* Conditional Link Prediction of Category-Implicit Keypoint Detection
* Confidence-Driven Hierarchical Classification of Cultivated Plant Stresses
* Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks
* Constrained Weight Optimization for Learning without Activation Normalization
* Context-Aware Domain Adaptation in Semantic Segmentation
* Continual Representation Learning for Biometric Identification
* Continuous Geodesic Convolutions for Learning on 3D Shapes
* Controllable and Progressive Image Extrapolation
* Covariance-free Partial Least Squares: An Incremental Dimensionality Reduction Method
* CPM R-CNN: Calibrating Point-guided Misalignment in Object Detection
* Cross-Domain Latent Modulation for Variational Transfer Learning
* Cross-Modality 3D Object Detection
* DACS: Domain Adaptation via Cross-domain Mixed Sampling
* DANCE: A Deep Attentive Contour Model for Efficient Instance Segmentation
* Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions
* Data-free Knowledge Distillation for Object Detection
* DB-GAN: Boosting Object Recognition Under Strong Lighting Conditions
* De-biasing Neural Networks with Estimated Offset for Class Imbalanced Learning
* Deep Active Learning for Joint Classification Segmentation with Weak Annotator
* Deep Image Compositing
* Deep Interactive Thin Object Selection
* Deep Poisoning: Towards Robust Image Data Sharing against Visual Disclosure
* Deep Preset: Blending and Retouching Photos with Color Style Transfer
* Deep Template-based Object Instance Detection
* Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions, A
* Deep Unsupervised Anomaly Detection
* DeepCFL: Deep Contextual Features Learning from a Single Image
* DeepCSR: A 3D Deep Learning Approach for Cortical Surface Reconstruction
* DeepMark++: Real-time Clothing Detection at the Edge
* DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation
* Defect-GAN: High-Fidelity Defect Synthesis for Automated Defect Inspection
* Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for Perturbation Difficulty
* Deflation based Fast and Robust Preconditioner for Bundle Adjustment, A
* Deformable Gabor Feature Networks for Biomedical Image Classification
* Dense 3D-Reconstruction from Monocular Image Sequences for Computationally Constrained UAS
* Dense-Resolution Network for Point Cloud Classification and Segmentation
* Detecting Human-Object Interaction with Mixed Supervision
* Devil is in the Boundary: Exploiting Boundary Representation for Basis-based Instance Segmentation, The
* Disentangled Contour Learning for Quadrilateral Text Detection
* Distillation Multiple Choice Learning for Multimodal Action Recognition
* Do not Forget to Attend to Uncertainty while Mitigating Catastrophic Forgetting
* Do We Really Need Gold Samples for Sample Weighting under Label Noise?
* DocVQA: A Dataset for VQA on Document Images
* Domain Impression: A Source Data Free Domain Adaptation Method
* Domain-Adaptive Few-Shot Learning
* Domain-Aware Unsupervised Hyperspectral Reconstruction for Aerial Image Dehazing
* DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video
* Driver Anomaly Detection: A Dataset and Contrastive Learning Approach
* Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning
* Dual-Stream Fusion Network for Spatiotemporal Video Super-Resolution
* DualSANet: Dual Spatial Attention Network for Iris Recognition
* DualSR: Zero-Shot Dual Learning for Real-World Super-Resolution
* Dynamic Plane Convolutional Occupancy Networks
* Dynamic Routing Networks
* DynaVSR: Dynamic Adaptive Blind Video Super-Resolution
* EAGLE-Eye: Extreme-pose Action Grader using detaiL bird's-Eye view
* EDEN: Multimodal Synthetic Dataset of Enclosed GarDEN Scenes
* Effective Fusion Factor in FPN for Tiny Object Detection
* Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation
* Efficient 3D Video Engine Using Frame Redundancy
* Efficient Attention: Attention with Linear Complexities
* Efficient Real-Time Radial Distortion Correction for UAVs
* Efficient video annotation with visual interpolation and frame selection guidance
* Ellipse Detection and Localization with Applications to Knots in Sawn Lumber Images
* Embedded Dense Camera Trajectories in Multi-Video Image Mosaics by Geodesic Interpolation-based Reintegration
* End-to-End Chinese Landscape Painting Creation Using Generative Adversarial Networks
* End-to-end Lane Shape Prediction with Transformers
* End-to-end Learning Improves Static Object Geo-Localization from Video
* Enhancing Diversity in Teacher-Student Networks via Asymmetric branches for Unsupervised Person Re-identification
* Ensembling Low Precision Models for Binary Biomedical Image Segmentation
* EVET: Enhancing Visual Explanations of Deep Neural Networks Using Image Transformations
* EvidentialMix: Learning with Combined Open-set and Closed-set Noisy Labels
* ExMaps: Long-Term Localization in Dynamic Scenes using Exponential Decay
* Exploiting Spatial Relation for Reducing Distortion in Style Transfer
* Exploiting the Redundancy in Convolutional Filters for Parameter Reduction
* Exploration of Spatial and Temporal Modeling Alternatives for HOI
* FACEGAN: Facial Attribute Controllable rEenactment GAN
* Faces à la Carte: Text-to-Face Generation via Attribute Disentanglement
* Facial Emotion Recognition with Noisy Multi-task Annotations
* Facial Expression Recognition in the Wild via Deep Attentive Center Loss
* Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization
* FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation
* Fast Fourier Intrinsic Network
* Fast Kernelized Correlation Filter without Boundary Effect
* Fast Pose Graph Optimization via Krylov-Schur and Cholesky Factorization
* Few-shot Font Style Transfer between Different Languages
* Few-Shot Learning via Feature Hallucination with Variational Inference
* Fine-grained Foreground Retrieval via Teacher-Student Learning
* FlowCaps: Optical Flow Estimation with Capsule Networks For Action Recognition
* Focus and retain: Complement the Broken Pose in Human Image Synthesis
* Foreground color prediction through inverse compositing
* Foreground-aware Semantic Representations for Image Harmonization
* From generalized zero-shot learning to long-tail with class descriptors
* Fusion Learning using Semantics and Graph Convolutional Network for Visual Food Recognition
* Future Moment Assessment for Action Query
* G2D: Generate to Detect Anomaly
* Generalized Object Detection on Fisheye Cameras for Autonomous Driving: Dataset, Representations and Baseline
* Generating Physically Sound Training Data for Image Recognition of Additively Manufactured Parts
* Generative Patch Priors for Practical Compressive Image Recovery
* Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context
* GlocalNet: Class-aware Long-term Human Motion Synthesis
* Goal-driven Long-Term Trajectory Prediction
* GraphTCN: Spatio-Temporal Interaction Modeling for Human Trajectory Prediction
* Group Softmax Loss with Discriminative Feature Grouping
* Guided Attentive Feature Fusion for Multispectral Pedestrian Detection
* H2O-Net: Self-Supervised Flood Segmentation via Adversarial Domain Adaptation and Label Refinement
* Hand Pose Guided 3D Pooling for Word-level Sign Language Recognition
* Handwritten Chinese Font Generation with Collaborative Stroke Refinement
* Have Fun Storming the Castle(s)!
* HealTech - A System for Predicting Patient Hospitalization Risk and Wound Progression in Old Patients
* Hierarchical Generative Adversarial Networks for Single Image Super-Resolution
* High-quality Frame Interpolation via Tridirectional Inference
* Holistic Filter Pruning for Efficient Deep Neural Networks
* How to Make a BLT Sandwich? Learning VQA towards Understanding Web Instructional Videos
* HyperCon: Image-To-Video Model Transfer for Video-To-Video Translation Tasks
* Hyperrealistic Image Inpainting with Hypergraphs
* Identity Unbiased Deception Detection by 2D-to-3D Face Reconstruction
* IGSSTRCF: Importance Guided Sparse Spatio-Temporal Regularized Correlation Filters For Tracking
* IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose, The
* Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
* Improve CAM with Auto-Adapted Segmentation and Co-Supervised Augmentation
* Improved Techniques for Training Single-Image GANs
* Improved Training of Generative Adversarial Networks Using Decision Forests
* Improving Few-Shot Learning using Composite Rotation based Auxiliary Task
* Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection
* Improving Robustness and Uncertainty Modelling in Neural Ordinary Differential Equations
* Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding
* IncreACO: Incrementally Learned Automatic Check-out with Photorealistic Exemplar Augmentation
* InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning
* Integrating Human Gaze into Attention for Egocentric Activity Recognition
* Interpretable and Trustworthy Deepfake Detection via Dynamic Prototypes
* Intra-class Part Swapping for Fine-Grained Image Classification
* Intro and Recap Detection for Movies and TV Series
* Joint Visual-Temporal Embedding for Unsupervised Learning of Actions in Untrimmed Sequences
* JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition
* Kernel Self-Attention for Weakly-supervised Image Classification using Deep Multiple Instance Learning
* Keypoint-Aligned Embeddings for Image Retrieval and Re-identification
* Large image datasets: A pyrrhic win for computer vision?
* Large-Scale, Time-Synchronized Visible and Thermal Face Dataset, A
* Laughing Machine: Predicting Humor in Video, The
* Learn like a Pathologist: Curriculum Learning by Annotator Agreement for Histopathology Image Classification
* Learned Dual-View Reflection Removal
* Learning Data Augmentation with Online Bilevel Optimization for Image Classification
* Learning Fast Converging, Effective Conditional Generative Adversarial Networks with a Mirrored Auxiliary Classifier
* Learning of low-level feature keypoints for accurate and robust detection
* Learning Shape Representations for Person Re-Identification under Clothing Change
* Learning to Distill Convolutional Features into Compact Local Descriptors
* Learning to Generate Dense Point Clouds with Textures on Multiple Categories
* Learning-Based Approach to Parametric Rotoscoping of Multi-Shape Systems, A
* Legacy Photo Editing with Learned Noise Prior
* Let's Get Dirty: GAN Based Data Augmentation for Camera Lens Soiling Detection in Autonomous Driving
* Line Art Correlation Matching Feature Transfer Network for Automatic Animation Colorization
* Lip-reading with Densely Connected Temporal Convolutional Networks
* Local to Global: Efficient Visual Localization for a Monocular Camera
* LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
* Long-range Attention Network for Multi-View Stereo
* LT-GAN: Self-Supervised GAN with Latent Transformation Detection
* Making DensePose fast and light
* MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking
* Mask Selection and Propagation for Unsupervised Video Object Segmentation
* maskedFaceNet: A Progressive Semi-Supervised Masked Face Detector
* MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain, The
* MeliusNet: An Improved Network Architecture for Binary Neural Networks
* Meta Module Network for Compositional Visual Reasoning
* MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection
* Minimal Solvers for Single-View Lens-Distorted Camera Auto-Calibration
* MinkLoc3D: Point Cloud Based Large-Scale Place Recognition
* Misclassification Risk and Uncertainty Quantification in Deep Classifiers
* MoRe: A Large-Scale Motorcycle Re-Identification Dataset
* Motion Adaptive Deblurring with Single-Photon Cameras
* MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution
* MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos
* Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds
* Multi-Class Hinge Loss for Conditional GANs, A
* Multi-frame Recurrent Adversarial Network for Moving Object Segmentation
* Multi-Level Generative Chaotic Recurrent Network for Image Inpainting
* Multi-Loss Weighting with Coefficient of Variations
* Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
* Multi-Modal Trajectory Prediction of NBA Players
* Multi-path Neural Networks for On-device Multi-domain Visual Classification
* Multi-Task Knowledge Distillation for Eye Disease Prediction
* Multi-Task Learning Approach for Human Activity Segmentation and Ergonomics Risk Assessment, A
* Multimodal Humor Dataset: Predicting Laughter tracks for Sitcoms
* Multimodal Prototypical Networks for Few-shot Learning
* Multimodal Trajectory Predictions for Autonomous Driving without a Detailed Prior Map
* MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent Unsupervised Learning Using Mutual Information Maximization
* Mutual Information Maximization on Disentangled Representations for Differential Morph Detection
* MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand Pose Estimation
* Neural Contrast Enhancement of CT Image
* Neuron matching in C. elegans with robust approximate linear regression without correspondence
* Noise as a Resource for Learning in Knowledge Distillation
* Noisy Concurrent Training for Efficient Learning under Label Noise
* Novel View Synthesis via Depth-guided Skip Connections
* Object Recognition with Continual Open Set Domain Adaptation for Home Robot
* On the generalization of learning-based 3D reconstruction
* On the Texture Bias for Few-Shot CNN Segmentation
* One-Shot Image Recognition Using Prototypical Encoders with Reduced Hubness
* Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
* Ontology-driven Event Type Classification in Images
* Optimistic Agent: Accurate Graph-Based Value Estimation for More Successful Visual Navigation
* Oriented Object Detection in Aerial Images with Box Boundary-Aware Vectors
* Overcomplete Deep Subspace Clustering Networks
* OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network
* Painting Outside as Inside: Edge Guided Image Outpainting via Bidirectional Rearrangement with Progressive Step Learning
* Part Segmentation of Unseen Objects using Keypoint Guidance
* PDAN: Pyramid Dilated Attention Network for Action Detection
* Person-in-Context Synthesis with Compositional Structural Space
* Phase-wise Parameter Aggregation For Improving SGD Optimization
* PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation
* PNPDet: Efficient Few-shot Detection without Forgetting via Plug-and-Play Sub-networks
* Pose Proposal and Refinement Network for Better 6D Object Pose Estimation, A
* Pretraining boosts out-of-domain robustness for pose estimation
* Proposal Learning for Semi-Supervised Object Detection
* QuadroNet: Multi-Task Learning for Real-Time Semantic Depth Aware Instance Segmentation
* R-MNet: A Perceptual Adversarial Network for Image Inpainting
* RarePlanes: Synthetic Data Takes Flight
* Real-Time Gait-Based Age Estimation and Gender Classification from a Single Image
* Real-time Localized Photorealistic Video Style Transfer
* Real-time RGBD-based Extended Body Pose Estimation
* Real-Time Uncertainty Estimation in Computer Vision via Uncertainty-Aware Distribution Distillation
* Receptive Field Size Optimization with Continuous Time Pooling
* Recovering Trajectories of Unmarked Joints in 3D Human Actions Using Latent Space Optimization
* Red Carpet to Fight Club: Partially-supervised Domain Transfer for Face Recognition in Violent Videos
* Reducing the Annotation Effort for Video Object Segmentation Datasets
* RefineLoc: Iterative Refinement for Weakly-Supervised Action Localization
* Regional Attention Networks with Context-aware Fusion for Group Emotion Recognition
* Relighting Images in the Wild with a Self-Supervised Siamese Auto-Encoder
* Representation learning from videos in-the-wild: An object-centric approach
* Representation Learning Through Latent Canonicalizations
* Representation Learning with Statistical Independence to Mitigate Bias
* ResNet or DenseNet? Introducing Dense Shortcuts to ResNet
* Revisiting Adaptive Convolutions for Video Frame Interpolation
* Revisiting Batch Normalization for Improving Corruption Robustness
* Revisiting Street-to-Aerial View Image Geo-localization and Orientation Estimation
* RGPNet: A Real-Time General Purpose Semantic Segmentation
* RNNP: A Robust Few-Shot Learning Approach
* Robust and Efficient Framework for Sports-Field Registration, A
* Robust Lensless Image Reconstruction via PSF Estimation
* RODNet: Radar Object Detection using Cross-Modal Supervision
* Rotate to Attend: Convolutional Triplet Attention Module
* S-VVAD: Visual Voice Activity Detection by Motion Segmentation
* S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation
* SALAD: Self-Assessment Learning for Action Detection
* Saliency Driven Perceptual Image Compression
* Saliency Prediction with External Knowledge
* Same Same But DifferNet: Semi-Supervised Defect Detection with Normalizing Flows
* Scale Aware Adaptation for Land-Cover Classification in Remote Sensing Imagery
* Scale Equivariance Improves Siamese Tracking
* Scaling digital screen reading with one-shot learning and re-identification
* SChISM: Semantic Clustering via Image Sequence Merging for Images of Human-Decomposition
* Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor
* Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos
* Self Supervision for Attention Networks
* Self-Distillation for Few-Shot Image Captioning
* Self-supervised 4D Spatio-temporal Feature Learning via Order Prediction of Sequential Point Cloud Clips
* Self-Supervised Learning for Domain Adaptation on Point Clouds
* Self-Supervised Poisson-Gaussian Denoising
* Self-supervised training for blind multi-frame video denoising
* Self-supervised Visual-LiDAR Odometry with Flip Consistency
* Separable Four Points Fundamental Matrix
* Set Augmented Triplet Loss for Video Person Re-Identification
* SHAD3S: A model to Sketch, Shade and Shadow
* Shape from Caustics: Reconstruction of 3D-Printed Glass from Simulated Caustic Images
* Shape from semantic segmentation via the geometric Rényi divergence
* SinGAN-GIF: Learning a Generative Video Model from a Single GIF
* Single Image Human Proxemics Estimation for Visual Social Distancing
* Single Image Reflection Removal with Edge Guidance, Reflection Classifier, and Recurrent Decomposition
* Size-invariant Detection of Marine Vessels From Visual Time Series
* SLAM in the Field: An Evaluation of Monocular Mapping and Localization on Challenging Dynamic Agricultural Environment
* SliceNets: A Scalable Approach for Object Detection in 3D CT Scans
* SMPLpix: Neural Avatars from 3D Human Models
* SoFA: Source-data-free Feature Alignment for Unsupervised Domain Adaptation
* Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation
* Spatially Aware Metadata for Raw Reconstruction
* Spike-Thrift: Towards Energy-Efficient Deep Spiking Neural Networks by Limiting Spiking Activity via Attention-Guided Compression
* Splatty- A Unified Image Demosaicing and Rectification Method
* SSGP: Sparse Spatial Guided Propagation for Robust and Generic Interpolation
* StacMR: Scene-Text Aware Cross-Modal Retrieval
* StressNet: Detecting Stress in Thermal Videos
* Structured Visual Search via Composition-aware Learning
* Style Consistent Image Generation for Nuclei Instance Segmentation
* Style Transfer by Rigid Alignment in Neural Net Feature Space
* SubICap: Towards Subword-informed Image Captioning
* Subject Guided Eye Image Synthesis with Application to Gaze Redirection
* Subsurface Pipes Detection Using DNN-based Back Projection on GPR Data
* SuPEr-SAM: Using the Supervision Signal from a Pose Estimator to Train a Spatial Attention Module for Personal Protective Equipment Recognition
* Supervoxel Attention Graphs for Long-Range Video Modeling
* SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs
* SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
* Synthetic Expressions are Better Than Real for Learning to Detect Facial Actions
* Task-Assisted Domain Adaptation with Anchor Tasks
* TB-Net: A Three-Stream Boundary-Aware Network for Fine-Grained Pavement Disease Segmentation
* Temporal Context Aggregation for Video Retrieval with Contrastive Learning
* Temporal Shift GAN for Large Scale Video Generation
* Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition
* Temporal-Aware Self-Supervised Learning for 3D Hand Pose and Mesh Estimation in Videos
* Temporally Consistent 3D Human Pose Estimation Using Dual 360° Cameras
* Text-to-Image Generation Grounded by Fine-Grained User Attention
* This Face Does Not Exist... But It Might Be Yours! Identity Leakage in Generative Models
* Towards Contextual Learning in Few-shot Object Classification
* Towards Enhancing Fine-grained Details for Image Matting
* Towards Fair Cross-Domain Adaptation via Generative Learning
* Towards Precise Intra-camera Supervised Person Re-Identification
* Towards Resolving the Challenge of Long-tail Distribution in UAV Images for Object Detection
* Towards Visually Explaining Video Understanding Networks with Perturbation
* Towards Zero-Shot Learning with Fewer Seen Class Examples
* TracKlinic: Diagnosis of Challenge Factors in Visual Tracking
* Transductive Visual Verb Sense Disambiguation
* Transductive Zero-Shot Learning by Decoupled Feature Generation
* TranstextNet: Transducing Text for Recognizing Unseen Visual Relationships
* TResNet: High Performance GPU-Dedicated Architecture
* Triangle-Net: Towards Robustness in Point Cloud Learning
* TrustMAE: A Noise-Resilient Defect Classification Framework using Memory-Augmented Auto-Encoders with Trust Regions
* Two-hand Global 3D Pose Estimation using Monocular RGB
* Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-shot Learning
* Understanding the impact of mistakes on background regions in crowd counting
* Unified Framework for Compressive Video Recovery from Coded Exposure Techniques, A
* Unsupervised Attention Based Instance Discriminative Learning for Person Re-Identification
* Unsupervised Domain Adaptation in Semantic Segmentation via Orthogonal and Clustered Embeddings
* Unsupervised Meta-Domain Adaptation for Fashion Retrieval
* Unsupervised Multi-Target Domain Adaptation Through Knowledge Distillation
* Unsupervised Multimodal Video-to-Video Translation via Self-Supervised Learning
* Unsupervised Video Representation Learning by Bidirectional Feature Prediction
* Utilizing Every Image Object for Semi-supervised Phrase Grounding
* Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition, A
* Variational Prototype Inference for Few-Shot Semantic Segmentation
* Vector-based Representation to Enhance Head Pose Estimation, A
* Vid2Int: Detecting Implicit Intention from Long Dialog Videos
* Video Captioning of Future Frames
* VideoSSL: Semi-Supervised Learning for Video Classification
* Viewpoint-agnostic Image Rendering
* Visual Speech Enhancement Without A Real Visual Stream
* Visual tracking of deepwater animals using machine learning-controlled robotic underwater vehicles
* WDNet: Watermark-Decomposition Network for Visible Watermark Removal
* We don't Need Thousand Proposals: Single Shot Actor-Action Detection in Videos
* Weakly Supervised Consistency-based Learning Method for COVID-19 Segmentation in CT Images, A
* Weakly Supervised Deep Reinforcement Learning for Video Summarization With Semantically Meaningful Reward
* Weakly Supervised Instance Segmentation by Deep Community Learning
* Weakly-Supervised Object Representation Learning for Few-Shot Semantic Segmentation
* Where to Look?: Mining Complementary Image Regions for Weakly Supervised Object Localization
* Whose hand is this? Person Identification from Egocentric Hand Gestures
* Zero-Pair Image to Image Translation using Domain Conditional Normalization
* Zero-Shot Recognition via Optimal Transport
* 3D Modeling Beneath Ground: Plant Root Detection and Reconstruction Based on Ground-Penetrating Radar
* 3DFaceFill: An Analysis-By-Synthesis Approach to Face Completion
* 3DRefTransformer: Fine-Grained Object Identification in Real-World Scenes Using Natural Language
* Action anticipation using latent goal learning
* Active Learning for Improved Semi-Supervised Semantic Segmentation in Satellite Images
* ADC: Adversarial attacks against object Detection that evade Context consistency checks
* Addressing out-of-distribution label noise in webly-labelled data
* Adversarial Branch Architecture Search for Unsupervised Domain Adaptation
* Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis
* Adversarial Robustness of Deep Sensor Fusion Models
* Adversarial Semantic Hallucination for Domain Generalized Semantic Segmentation
* AE-StyleGAN: Improved Training of Style-Based Auto-Encoders
* AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
* Agree to Disagree: When Deep Learning Models With Identical Architectures Produce Distinct Explanations
* AirCamRTM: Enhancing Vehicle Detection for Efficient Aerial Camera-based Road Traffic Monitoring
* All the attention you need: Global-local, spatial-channel attention for image retrieval
* Approximate Neural Architecture Search via Operation Distribution Learning
* Attack Agnostic Detection of Adversarial Examples via Random Subspace Analysis
* Attribute-Based Deep Periocular Recognition: Leveraging Soft Biometrics to Improve Periocular Recognition
* AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
* Auditing saliency cropping algorithms
* Auto White-Balance Correction for Mixed-Illuminant Scenes
* Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
* Automated Defect Inspection in Reverse Engineering of Integrated Circuits
* AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation
* Batch Normalization Tells You Which Filter is Important
* Bayesian Uncertainty and Expected Gradient Length - Regression: Two Sides Of The Same Coin?
* Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention
* BiHPF: Bilateral High-Pass Filters for Robust Deepfake Detection
* Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations
* Biomass Prediction with 3D Point Clouds from LiDAR
* Boosting Contrastive Self-Supervised Learning with False Negative Cancellation
* Busy-Quiet Video Disentangling for Video Classification
* C-VTON: Context-Driven Image-Based Virtual Try-On Network
* Calibrating CNNs for Few-Shot Meta Learning
* CeyMo: See More on Roads - A Novel Benchmark Dataset for Road Marking Detection
* CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows
* Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark
* Channel Pruning via Lookahead Search Guided Reinforcement Learning
* CharacterGAN: Few-Shot Keypoint Character Animation and Reposing
* Class-Balanced Active Learning for Image Classification
* Cleaning Noisy Labels by Negative Ensemble Learning for Source-Free Unsupervised Domain Adaptation
* Co-Net: A Collaborative Region-Contour-Driven Network for Fine-to-Finer Medical Image Segmentation
* Co-Segmentation Aided Two-Stream Architecture for Video Captioning
* COCOA: Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains
* Compensation Tracker: Reprocessing Lost Object for Multi-Object Tracking
* Complete Face Recovery GAN: Unsupervised Joint Face Rotation and De-Occlusion from a Single-View Image
* Compressed Sensing MRI Reconstruction with Co-VeGAN: Complex-Valued Generative Adversarial Network
* Consistent Cell Tracking in Multi-frames with Spatio-Temporal Context by Object-Level Warping Loss
* Context-enriched Satellite Imagery Dataset and an Approach for Parking Lot Detection, A
* Contextual Gradient Scaling for Few-Shot Learning
* Contextual Proposal Network for Action Localization
* Contrast to Divide: Self-Supervised Pre-Training for Learning with Noisy Labels
* Controlled GAN-Based Creature Synthesis via a Challenging Game Art Dataset: Addressing the Noise-Latent Trade-Off
* CoordiNet: uncertainty-aware pose regressor for reliable vehicle localization
* Coupled Training for Multi-Source Domain Adaptation
* Creating and Reenacting Controllable 3D Humans with Differentiable Rendering
* Cross-modal Adversarial Reprogramming
* CrossLocate: Cross-modal Large-scale Visual Geo-Localization in Natural Environments using Rendered Modalities
* D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos
* DAD: Data-free Adversarial Defense at Test Time
* Danish Fungi 2020: Not Just Another Image Recognition Dataset
* DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks
* Data Augmented 3D Semantic Scene Completion with 2D Segmentation Priors
* Data InStance Prior (DISP) in Generative Adversarial Networks
* Dataset Knowledge Transfer for Class-Incremental Learning without Memory
* Deep Feature Prior Guided Face Deblurring
* Deep Insight into Measuring Face Image Utility with General and Face-specific Image Quality Metrics, A
* Deep Online Fused Video Stabilization
* Deep Optimization Prior for THz Model Parameter Estimation
* Deep Photo Scan: Semi-Supervised Learning for dealing with the real-world degradation in Smartphone Photo Scanning
* Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation
* DeepPatent: Large scale patent drawing recognition and retrieval
* Densely-packed Object Detection via Hard Negative-Aware Anchor Attention
* Detail Preserving Residual Feature Pyramid Modules for Optical Flow
* Detecting Tear Gas Canisters With Limited Training Data
* Detection and Localization of Facial Expression Manipulations
* DG-Labeler and DGL-MOTS Dataset: Boost the Autonomous Driving Perception
* Digital and Physical-World Attacks on Remote Pulse Detection
* Discovering Underground Maps from Fashion
* Discrete neural representations for explainable anomaly detection
* Disentangled Representation with Dual-stage Feature Learning for Face Anti-spoofing
* Distance-based Hyperspherical Classification for Multi-source Open-Set Domain Adaptation
* Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias
* Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition
* Dual-Head Contrastive Domain Adaptation for Video Action Recognition
* Dynamic CNNs using uncertainty to overcome domain generalization for surgical instrument localization
* Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation
* edge-SR: Super-Resolution For The Masses
* EdgeConv with Attention Module for Monocular Depth Estimation
* Efficient Counterfactual Debiasing for Visual Question Answering
* EllipsoidNet: Ellipsoid Representation for Point Cloud Classification and Segmentation
* Enhanced Correlation Matching based Video Frame Interpolation
* Enhancing Few-Shot Image Classification with Unlabelled Examples
* Equine Pain Behavior Classification via Self-Supervised Disentangled Pose Representation
* Estimating Image Depth in the Comics Domain
* Evaluating and Mitigating Bias in Image Classifiers: A Causal Perspective Using Counterfactuals
* Evaluating the Robustness of Semantic Segmentation for Autonomous Driving against Real-World Adversarial Patch Attacks
* Evaluation of Correctness in Unsupervised Many-to-Many Image Translation
* Event-Based Kilohertz Eye Tracking using Coded Differential Lighting
* experimental comparison of multi-view stereo approaches on satellite images, An
* Extracting Vignetting and Grain Filter Effects from Photos
* Extraction of Positional Player Data from Broadcast Soccer Videos
* Extractive Knowledge Distillation
* EZCrop: Energy-Zoned Channels for Robust Output Pruning
* F-CAM: Full Resolution Class Activation Maps via Guided Parametric Upscaling
* Face Verification with Challenging Imposters and Diversified Demographics
* Facial Attribute Transformers for Precise and Robust Makeup Transfer
* Fair and accurate age prediction using distribution aware data curation and augmentation
* Fair Visual Recognition in Limited Data Regime using Self-Supervision and Self-Distillation
* FalCon: Fine-grained Feature Map Sparsity Computing with Decomposed Convolutions for Inference Optimization
* FASSST: Fast Attention Based Single-Stage Segmentation Net for Real-Time Instance Segmentation
* Fast and Efficient Restoration of Extremely Dark Light Fields
* Fast and Explicit Neural View Synthesis
* Fast Nonlinear Image Unblending
* Fast Partial Video Copy Detection Using KNN and Global Feature Database, A
* Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection
* FastAno: Fast Anomaly Detection via Spatio-temporal Patch Transformation
* Federated Multi-Target Domain Adaptation
* Few-Shot Object Detection by Attending to Per-Sample-Prototype
* Few-Shot Open-Set Recognition of Hyperspectral Images with Outlier Calibration Network
* Few-shot Weakly-Supervised Object Detection via Directional Statistics
* FLUID: Few-Shot Self-Supervised Image Deraining
* ForeSI: Success-Aware Visual Navigation Agent
* Forgery Detection by Internal Positional Learning of Demosaicing Traces
* From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection
* FT-DeepNets: Fault-Tolerant Convolutional Neural Networks with Kernel-based Duplication
* Fully Convolutional Cross-Scale-Flows for Image-based Defect Detection
* Fusion Point Pruning for Optimized 2D Object Detection with Radar-Camera Fusion
* GANs Spatial Control via Inference-Time Adaptive Normalization
* Generalized Clustering and Multi-Manifold Learning with Geometric Structure Preservation
* Generalized Facial Manipulation Detection with Edge Region Feature Extraction
* Generating and Controlling Diversity in Image Search
* Generative Adversarial Attack on Ensemble Clustering
* Generative Adversarial Graph Convolutional Networks for Human Action Synthesis
* Geometrically Adaptive Dictionary Attack on Face Recognition
* Geometry-Aware Hierarchical Bayesian Learning on Manifolds
* Geometry-Inspired Top-k Adversarial Perturbations
* Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization
* GraDual: Graph-based Dual-modal Representation for Image-Text Matching
* GraN-GAN: Piecewise Gradient Normalization for Generative Adversarial Networks
* HERS Superpixels: Deep Affinity Learning for Hierarchical Entropy Rate Segmentation
* Hessian-Aware Pruning and Optimal Neural Implant
* HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty
* Hierarchical Modeling for Task Recognition and Action Segmentation in Weakly-Labeled Instructional Videos
* Hierarchical Proxy-based Loss for Deep Metric Learning
* Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning
* HierMatch: Leveraging Label Hierarchies for Improving Semi-Supervised Learning
* High Dynamic Range Imaging of Dynamic Scenes with Saturation Compensation but without Explicit Motion Compensation
* Hitchhiker's Guide to Prior-Shift Adaptation, The
* Hole-robust Wireframe Detection
* How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
* How Good is your Explanation? Algorithmic Stability Measures to Assess the Quality of Explanations for Deep Neural Networks
* Human-Aided Saliency Maps Improve Generalization of Deep Learning
* HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry
* Hyper-Convolution Networks for Biomedical Image Segmentation
* Hyperspectral Image Super-Resolution with RGB Image Super-Resolution as an Auxiliary Task
* Identifying Wrongly Predicted Samples: A Method for Active Learning
* Image Restoration by Deep Projected GSURE
* Image-Adaptive Hint Generation via Vision Transformer for Outpainting
* Improve Image Captioning by Estimating the Gazing Patterns from the Caption
* Improving Fractal Pre-training
* Improving Model Generalization by Agreement of Learned Representations from Data Augmentation
* Improving Object Detection by Label Assignment Distillation
* Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning
* ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
* In-Field Phenotyping Based on Crop Leaf and Plant Instance Segmentation
* Inferring the Class Conditional Response Map for Weakly Supervised Semantic Segmentation
* InfographicVQA
* Information Bottlenecked Variational Autoencoder for Disentangled 3D Facial Expression Modelling
* Inpaint2Learn: A Self-Supervised Framework for Affordance Learning
* Intelligent Camera Selection Decisions for Target Tracking in a Camera Network
* Interpretable Semantic Photo Geolocation
* Investigation of Critical Issues in Bias Mitigation Techniques, An
* Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
* Joint Classification and Trajectory Regression of Online Handwriting using a Multi-Task Learning Approach
* Knowledge Capture and Replay for Continual Learning
* Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop
* Lane-Level Street Map Extraction from Aerial Imagery
* Late-resizing: A Simple but Effective Sketch Extraction Strategy for Improving Generalization of Line-art Colorization
* Latent reweighting, an almost free improvement for GANs
* Latent to Latent: A Learned Mapper for Identity Preserving Editing of Multiple Face Attributes in StyleGAN-generated Images
* LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity
* Leaky Gated Cross-Attention for Weakly Supervised Multi-Modal Temporal Action Localization
* Learnable Adaptive Cosine Estimator (LACE) for Image Classification
* Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection
* Learned Event-based Visual Perception for Improved Space Object Detection
* Learning Color Representations for Low-Light Image Enhancement
* Learning Foreground-Background Segmentation from Improved Layered GANs
* Learning from the CNN-based Compressed Domain
* Learning Maritime Obstacle Detection from Weak Annotations by Scaffolding
* Learning Temporal Video Procedure Segmentation from an Automatically Collected Large Dataset
* Learning to Generate the Unknowns as a Remedy to the Open-Set Domain Shift
* Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image
* Learning to Weight Filter Groups for Robust Classification
* Learning with Label Noise for Image Retrieval by Selecting Interactions
* Less Can Be More: Sound Source Localization With a Classification Model
* Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
* Leveraging Test-Time Consensus Prediction for Robustness against Unseen Noise
* Lightweight Monocular Depth with a Novel Neural Architecture Search Method
* Low-cost Multispectral Scene Analysis with Modality Distillation
* LwPosr: Lightweight Efficient Fine Grained Head Pose Estimation
* M3DETR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
* MAPS: Multimodal Attention for Product Similarity
* Masking Modalities for Cross-modal Video Retrieval
* MaskSplit: Self-supervised Meta-learning for Few-shot Semantic Segmentation
* Matching and Recovering 3D People from Multiple Views
* Maximizing Cosine Similarity Between Spatial Features for Unsupervised Domain Adaptation in Semantic Segmentation
* Measuring Hidden Bias within Face Recognition via Racial Phenotypes
* Measuring Representation of Race, Gender, and Age in Children's Books: Face Detection and Feature Classification in Illustrated Images
* MEGAN: Memory Enhanced Graph Attention Network for Space-Time Video Super-Resolution
* Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild
* Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes
* Meta Approach to Data Augmentation Optimization
* Meta-Learning for Multi-Label Few-Shot Classification
* Meta-Meta Classification for One-Shot Learning
* Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning
* METGAN: Generative Tumour Inpainting and Modality Synthesis in Light Sheet Microscopy
* MisConv: Convolutional Neural Networks for Missing Data
* Mixed-Dual-Head Meets Box Priors: A Robust Framework for Semi-supervised Segmentation
* MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
* Mobile based Human Identification using Forehead Creases: Application and Assessment under COVID-19 Masked Face Scenarios
* MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching
* Model Compression Using Optimal Transport
* Modeling Aleatoric Uncertainty for Camouflaged Object Detection
* Modeling dynamic target deformation in camera calibration
* Modular and Unified Framework for Detecting and Localizing Video Anomalies, A
* MoESR: Blind Super-Resolution using Kernel-Aware Mixture of Experts
* Monocular Depth Estimation with Adaptive Geometric Attention
* MovingFashion: a Benchmark for the Video-to-Shop Challenge
* MTGLS: Multi-Task Gaze Estimation with Limited Supervision
* mToFNet: Object Anti-Spoofing with Mobile Time-of-Flight Data
* MUGL: Large Scale Multi Person Conditional Action Generation with Locomotion
* Multi-branch Neural Networks for Video Anomaly Detection in Adverse Lighting and Weather Conditions
* Multi-Dimensional Dynamic Model Compression for Efficient Image Super-Resolution
* Multi-Domain Incremental Learning for Semantic Segmentation
* Multi-domain semantic segmentation with overlapping labels
* Multi-Head Deep Metric Learning Using Global and Local Representations
* Multi-level Attentive Adversarial Learning with Temporal Dilation for Unsupervised Video Domain Adaptation
* Multi-motion and Appearance Self-Supervised Moving Object Detection
* Multi-Scale Patch-Based Representation Learning for Image Anomaly Detection and Segmentation
* Multi-stream dynamic video Summarization
* Multi-Task Classification of Sewer Pipe Defects and Properties using a Cross-Task Graph Neural Network Decoder
* Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving
* Multimodal Learning using Optimal Transport for Sarcasm and Humor Detection
* Mutual Learning of Joint and Separate Domain Alignments for Multi-Source Domain Adaptation
* Natural Language Video Moment Localization Through Query-Controlled Temporal Convolution
* Network Generalization Prediction for Safety Critical Tasks in Novel Operating Domains
* Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo
* Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo
* No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency
* Non-Blind Deblurring for Fluorescence: A Deformable Latent Space Approach with Kernel Parameterization
* Non-local Attention Improves Description Generation for Retinal Images
* Non-Semantic Evaluation of Image Forensics Tools: Methodology and Database
* Nonnegative Low-Rank Tensor Completion via Dual Formulation with Applications to Image and Video Completion
* Normalizing Flow as a Flexible Fidelity Objective for Photo-Realistic Super-resolution
* Novel Ensemble Diversification Methods for Open-Set Scenarios
* Novel-View Synthesis of Human Tourist Photos
* NUTA: Non-uniform Temporal Aggregation for Action Recognition
* Occlusion Resistant Network for 3D Face Reconstruction
* Occlusion-Robust Object Pose Estimation with Holistic Representation
* On Black-Box Explanation for Face Verification
* On the Effectiveness of Small Input Noise for Defending Against Query-based Black-Box Attacks
* On the Maximum Radius of Polynomial Lens Distortion
* One-Class Learned Encoder-Decoder Network with Adversarial Context Masking for Novelty Detection
* One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition
* Online Continual Learning Via Candidates Voting
* Online Knowledge Distillation by Temporal-Spatial Boosting
* Ortho-Shot: Low Displacement Rank Regularization with Data Augmentation for Few-Shot Learning
* Parsing Line Chart Images Using Linear Programming
* Perceptual Consistency in Video Segmentation
* PERF-Net: Pose Empowered RGB-Flow Net
* PhotoWCT2: Compact Autoencoder for Photorealistic Style Transfer Resulting from Blockwise Training and Skip Connections of High-Frequency Residuals
* Physical Adversarial Attacks on an Aerial Imagery Object Detector
* PICA: Point-Wise Instance and Centroid Alignment Based Few-Shot Domain Adaptive Object Detection with Loose Annotations
* Pixel-by-Pixel Cross-Domain Alignment for Few-Shot Semantic Segmentation
* Pixel-Level Bijective Matching for Video Object Segmentation
* Pixel-Level Meta-Learner for Weakly Supervised Few-Shot Semantic Segmentation, A
* Plugging Self-Supervised Monocular Depth into Unsupervised Domain Adaptation for Semantic Segmentation
* PoP-Net: Pose over Parts Network for Multi-Person 3D Pose Estimation from a Depth Image
* Pose and Joint-Aware Action Recognition
* Pose-guided Generative Adversarial Net for Novel View Action Synthesis
* Post-OCR Paragraph Recognition by Graph Convolutional Networks
* PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression
* PRECODE - A Generic Model Extension to Prevent Deep Gradient Leakage
* Predicting Levels of Household Electricity Consumption in Low-Access Settings
* PredStereo: An Accurate Real-time Stereo Vision System
* Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data
* Pro-CCaps: Progressively Teaching Colourisation to Capsules
* Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search
* PROVES: Establishing Image Provenance using Semantic Signatures
* QUALIFIER: Question-Guided Self-Attentive Multimodal Fusion Network for Audio Visual Scene-Aware Dialog
* Quantified Facial Expressiveness for Affective Behavior Analytics
* Re-Compose the Image by Evaluating the Crop on More Than Just a Score
* Reconstructing Training Data from Diverse ML Models by Ensemble Inversion
* Recursive Contour-Saliency Blending Network for Accurate Salient Object Detection
* REFICS: A Step Towards Linking Vision with Hardware Assurance
* Registration of Human Point Set using Automatic Key Point Detection and Region-aware Features
* REGroup: Rank-aggregating Ensemble of Generative Classifiers for Robust Predictions
* Resolution-robust Large Mask Inpainting with Fourier Convolutions
* Resource-efficient Hybrid X-formers for Vision
* Rethinking Video Anomaly Detection - A Continual Learning Approach
* Revealing Disocclusions in Temporal View Synthesis through Infilling Vector Prediction
* RGL-NET: A Recurrent Graph Learning framework for Progressive Part Assembly
* Riemannian Framework for Analysis of Human Body Surface, A
* RLSS: A Deep Reinforcement Learning Algorithm for Sequential Scene Generation
* Robust 3D Garment Digitization from Monocular 2D Images for 3D Virtual Try-On Systems
* Robust High-Resolution Video Matting with Temporal Guidance
* Robust Lane Detection via Expanded Self Attention
* Robustly Recognizing Irregular Scene Text by Rectifying Principle Irregularities
* S2-MLP: Spatial-Shift MLP Architecture for Vision
* S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation
* SAC: Semantic Attention Composition for Text-Conditioned Image Retrieval
* Sandwich Batch Normalization: A Drop-In Replacement for Feature Distribution Heterogeneity
* SBEVNet: End-to-End Deep Stereo Layout Estimation
* SC-UDA: Style and Content Gaps aware Unsupervised Domain Adaptation for Object Detection
* SeaDronesSee: A Maritime Benchmark for Detecting Humans in Open Water
* Seeing Implicit Neural Representations as Fourier Series
* SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning
* SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning
* Self-Guidance: Improve Deep Neural Network Generalization via Knowledge Distillation
* Self-Supervised Domain Adaptation for Visual Navigation with Global Map Consistency
* Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation
* Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks
* Self-Supervised Learning of Domain Invariant Features for Depth Estimation
* Self-Supervised Pretraining Improves Self-Supervised Pretraining
* Self-Supervised Shape Alignment for Sports Field Registration
* Self-Supervised Test-Time Adaptation on Video Data
* Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
* Semantically Stealthy Adversarial Attacks against Segmentation Models
* Semi-supervised Domain Adaptation via Sample-to-Sample Self-Distillation
* Semi-supervised Generalized VAE Framework for Abnormality Detection using One-Class Classification, A
* Semi-supervised Multi-task Learning for Semantics and Depth
* Semi-Supervised Semantic Segmentation of Vessel Images using Leaking Perturbations
* Shadow Art Revisited: A Differentiable Rendering Based Approach
* Shallow Features Guide Unsupervised Domain Adaptation for Semantic Segmentation at Class Boundaries
* Shape-coded ArUco: Fiducial Marker for Bridging 2D and 3D Modalities
* Sharing Decoders: Network Fission for Multi-task Pixel Prediction
* Short-term Solar Irradiance Prediction from Sky Images with a Clear Sky Model
* Siamese Transformer Pyramid Networks for Real-Time UAV Tracking
* SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation
* Sign Language Translation with Hierarchical Spatio-Temporal Graph Neural Network
* SIGNAV: Semantically-Informed GPS-Denied Navigation and Mapping in Visually-Degraded Environments
* Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM
* Single Image Object Counting and Localizing using Active-Learning
* Single Source One Shot Reenactment using Weighted Motion from Paired Feature Points
* Single-Photon Camera Guided Extreme Dynamic Range Imaging
* Single-shot dense active stereo with pixel-wise phase estimation based on grid-structure using CNN and correspondence estimation using GCN
* Single-shot Path Integrated Panoptic Segmentation
* Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition
* Spatial-Temporal Transformer for 3D Point Cloud Sequences
* Spatiotemporal Initialization for 3D CNNs with Generated Motion Patterns
* SpectraNet: Learned Recognition of Artificial Satellites from High Contrast Spectroscopic Imagery
* SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement
* SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation
* StickyLocalization: Robust End-To-End Relocalization on Point Clouds using Graph Neural Networks
* Structure-Aware Method for Direct Pose Estimation, A
* Strumming to the Beat: Audio-Conditioned Contrastive Video Textures
* Style Agnostic 3D Reconstruction via Adversarial Style Transfer
* StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation
* Stylizing 3D Scene via Implicit Representation and HyperNetwork
* Supervised Compression for Resource-Constrained Edge Computing Systems
* Surrogate Model-Based Explainability Methods for Point Cloud NNs
* SWAG-V: Explanations for Video using Superpixels Weighted by Average Gradients
* Symmetric-light Photometric Stereo
* T-Net: A Resource-Constrained Tiny Convolutional Neural Network for Medical Image Segmentation
* TA-Net: Topology-Aware Network for Gland Segmentation
* Tailor Me: An Editing Network for Fashion Attribute Shape Manipulation
* Temporally stable video segmentation without video annotations
* Tensor feature hallucination for few-shot learning
* Tensor-Based Non-Rigid Structure from Motion
* Time-Space Transformers for Video Panoptic Segmentation
* To miss-attend is to misalign! Residual Self-Attentive Feature Alignment for Adapting Object Detectors
* Towards a Robust Differentiable Architecture Search under Label Noise
* Towards Active Vision for Action Localization with Reactive Control and Predictive Learning
* Towards Class-Oriented Poisoning Attacks Against Neural Networks
* Towards Durability Estimation of Bioprosthetic Heart Valves Via Motion Symmetry Analysis
* Trading-off Information Modalities in Zero-shot Classification
* Training a Task-Specific Image Reconstruction Loss
* Transductive Weakly-Supervised Player Detection using Soccer Broadcast Videos
* Transfer Learning for Pose Estimation of Illustrated Characters
* Transferable 3D Adversarial Textures using End-to-end Optimization
* TricubeNet: 2D Kernel-Based Object Representation for Weakly-Occluded Oriented Object Detection
* TypeNet: Towards Camera Enabled Touch Typing on Flat Surfaces through Self-Refinement
* Uncertainty Learning towards Unsupervised Deformable Medical Image Registration
* UNETR: Transformers for 3D Medical Image Segmentation
* Unsupervised Learning for Human Sensing Using Radio Signals
* Unsupervised Robust Domain Adaptation without Source Data
* Unsupervised Sounding Object Localization with Bottom-Up and Top-Down Attention
* Untapped Potential of Off-the-Shelf Convolutional Neural Networks, The
* Unveiling Real-Life Effects of Online Photo Sharing
* V-SlowFast Network for Efficient Visual Sound Separation
* Variational Stacked Local Attention Networks for Diverse Video Captioning
* VCSeg: Virtual Camera Adaptation for Road Segmentation
* Video and Text Matching with Conditioned Embeddings
* Video Salient Object Detection via Contrastive Features and Attention Modules
* Visual Understanding of Complex Table Structures from Document Images
* Visualizing Paired Image Similarity in Transformer Networks
* Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations
* Weakly supervised Branch Network with Template Mask for Classifying Masses in 3D Automated Breast Ultrasound
* Weakly Supervised Learning for Joint Image Denoising and Protein Localization in Cryo-Electron Microscopy
* Weakly-Supervised Convolutional Neural Networks for Vessel Segmentation in Cerebral Angiography
* WEPDTOF: A Dataset and Benchmark Algorithms for In-the-Wild People Detection and Tracking from Overhead Fisheye Cameras
* What Makes for Effective Few-shot Point Cloud Classification?
* X-MIR: EXplainable Medical Image Retrieval
* YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs

WACV23

* Unsupervised Multi-Object Segmentation Using Attention and Soft-Argmax
* 360MVSNet: Deep Multi-view Stereo Network with 360° Images for Indoor Scene Reconstruction
* 3D Change Localization and Captioning from Dynamic Scans of Indoor Scenes
* 3D GAN Inversion with Pose Optimization
* 3D Neural Sculpting (3DNS): Editing Neural Signed Distance Functions
* 3D-SpLineNet: 3D Traffic Line Detection using Parametric Spline Representations
* 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling
* Accelerating Self-Supervised Learning via Efficient Training Strategies
* Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets
* Action-aware Masking Network with Group-based Attention for Temporal Action Localization
* AdaNorm: Adaptive Gradient Norm Correction based Optimizer for CNNs
* Adaptive Feature Fusion for Cooperative Perception using LiDAR Point Clouds
* Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition
* Adaptive Sample Selection for Robust Learning under Label Noise
* Adaptively-Realistic Image Generation from Stroke and Sketch with Diffusion Model
* Addressing Feature Suppression in Unsupervised Visual Representations
* Adversarial local distribution regularization for knowledge distillation
* Adversarial robustness in discontinuous spaces via alternating sampling and descent
* AdvisIL - A Class-Incremental Learning Advisor
* Aerial Image Dehazing with Attentive Deformable Transformers
* AFPSNet: Multi-Class Part Parsing based on Scaled Attention and Feature Fusion
* Aggregating Bilateral Attention for Few-Shot Instance Localization
* ALPINE: Improving Remote Heart Rate Estimation using Contrastive Learning
* Analysis of Master Vein Attacks on Finger Vein Recognition Systems
* Ancestor Search: Generalized Open Set Recognition via Hyperbolic Side Information Learning
* Anisotropic Multi-Scale Graph Convolutional Network for Dense Shape Correspondence
* AnoLeaf: Unsupervised Leaf Disease Segmentation via Structurally Robust Generative Inpainting
* Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types
* Anomaly Detection in 3D Point Clouds using Deep Geometric Descriptors
* Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
* Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
* Are Straight-Through gradients and Soft-Thresholding all you need for Sparse Training?
* ARUBA: An Architecture-Agnostic Balanced Loss for Aerial Object Detection
* Asymmetric Student-Teacher Networks for Industrial Anomaly Detection
* AT-DDPM: Restoring Faces Degraded by Atmospheric Turbulence Using Denoising Diffusion Probabilistic Models
* ATCON: Attention Consistency for Vision Models
* Attend Who is Weak: Pruning-assisted Medical Image Localization under Sophisticated and Implicit Imbalances
* Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention
* Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation
* AttTrack: Online Deep Attention Transfer for Multi-object Tracking
* Audio-Visual Efficient Conformer for Robust Speech Recognition
* Audio-Visual Face Reenactment
* AudioViewer: Learning to Visualize Sounds
* Augmentation by Counterfactual Explanation: Fixing an Overconfident Classifier
* Autoencoder-based background reconstruction and foreground segmentation with background noise estimation
* Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification
* Automated Line Labelling: Dataset for Contour Detection and 3D Reconstruction
* Automatically Annotating Indoor Images with CAD Models via RGB-D Scans
* Auxiliary Task-Guided CycleGAN for Black-Box Model Domain Adaptation
* AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization
* Avoiding Lingering in Learning Active Recognition by Adversarial Disturbance
* Back to MLP: A Simple Baseline for Human Motion Prediction
* Backprop Induced Feature Weighting for Adversarial Domain Adaptation with Iterative Label Distribution Alignment
* Barlow constrained optimization for Visual Question Answering
* Benchmarking Visual Localization for Autonomous Navigation
* Bent and Broken Bicycles: Leveraging synthetic data for damaged object re-identification
* BEVSegFormer: Bird's Eye View Semantic Segmentation From Arbitrary Camera Rigs
* Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
* Bi-directional Frame Interpolation for Unsupervised Video Anomaly Detection
* BirdSoundsDenoising: Deep Visual Audio Denoising for Bird Sounds
* Body Part-Based Representation Learning for Occluded Person Re-Identification
* Boosting neural video codecs by exploiting hierarchical redundancy
* Boosting vision transformers for image retrieval
* Bootstrapping the Relationship Between Images and Their Clean and Noisy Labels
* Box Size Confidence Bias Harms Your Object Detector, The
* BoxMask: Revisiting Bounding Box Supervision for Video Object Detection
* BrightFlow: Brightness-Change-Aware Unsupervised Learning of Optical Flow
* Burst Reflection Removal using Reflection Motion Aggregation Cues
* Burst Vision Using Single-Photon Cameras
* BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video
* Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning
* Camera Alignment and Weighted Contrastive Learning for Domain Adaptation in Video Person ReID
* CameraPose: Weakly-Supervised Monocular 3D Human Pose Estimation by Leveraging In-the-wild 2D Annotations
* Can Shadows Reveal Biometric Information?
* CAST: Conditional Attribute Subsampling Toolkit for Fine-grained Evaluation
* CellTranspose: Few-shot Domain Adaptation for Cellular Instance Segmentation
* Center-aware Adversarial Augmentation for Single Domain Generalization
* Centroid Distance Keypoint Detector for Colored Point Clouds
* Certified Defense for Content Based Image Retrieval
* CFL-Net: Image Forgery Localization Using Contrastive Learning
* CG-NeRF: Conditional Generative Neural Radiance Fields for 3D-aware Image Synthesis
* Change You Want to See, The
* Class-Level Confidence Based 3D Semi-Supervised Learning
* Closer Look at the Transferability of Adversarial Examples: How They Fool Different Models Differently
* CNN2Graph: Building Graphs for Image Classification
* CoKe: Contrastive Learning for Robust Keypoint Detection
* Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks
* Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation
* Compact and Optimal Deep Learning with Recurrent Parameter Generators
* Complementary Bi-directional Feature Compression for Indoor 360° Semantic Segmentation with Self-distillation
* Complementary Cues from Audio Help Combat Noise in Weakly-Supervised Object Detection
* Composite Learning for Robust and Effective Dense Predictions
* Composite Relationship Fields with Transformers for Scene Graph Generation
* Compressing Explicit Voxel Grid Representations: fast NeRFs become also small
* Computer Vision for International Border Legibility
* Computer Vision for Ocean Eddy Detection in Infrared Imagery
* Computer Vision to the Rescue: Infant Postural Symmetry Estimation from Incongruent Annotations
* Concept Correlation and Its Effects on Concept-Based Models
* ConfMix: Unsupervised Domain Adaptation for Object Detection via Confidence-based Mixing
* CoNMix for Source-free Single and Multi-target Domain Adaptation
* Content-Based Music-Image Retrieval Using Self- and Cross-Modal Feature Embedding Memory
* Context-empowered Visual Attention Prediction in Pedestrian Scenarios
* Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials, A
* Continual Learning with Dependency Preserving Hypernetworks
* Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification
* Contrastive Learning of Semantic Concepts for Open-set Cross-domain Retrieval
* Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization
* Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation
* Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance
* Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation
* COPE: End-to-end trainable Constant Runtime Object Pose Estimation
* CORL: Compositional Representation Learning for Few-Shot Classification
* CountNet3D: A 3D Computer Vision Approach to Infer Counts of Occluded Objects
* Couplformer: Rethinking Vision Transformer with Coupling Attention
* Creating a Forensic Database of Shoeprints from Online Shoe-Tread Photos
* CropAndWeed Dataset: a Multi-Modal Learning Approach for Efficient Crop and Weed Manipulation, The
* Cross-Domain Video Anomaly Detection without Target Domain Adaptation
* Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis
* Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
* Cross-Modality Feature Fusion Network for Few-Shot 3D Point Cloud Classification
* Cross-Resolution Flow Propagation for Foveated Video Super-Resolution
* Cross-task Attention Mechanism for Dense Multi-task Learning
* Cross-View Image Sequence Geo-localization
* CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers
* CTrGAN: Cycle Transformers GAN for Gait Transfer
* CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection
* Cut-Paste Consistency Learning for Semi-Supervised Lesion Segmentation
* CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning-Based Synthetic Face Detection
* D-Extract: Extracting Dimensional Attributes From Product Images
* D2F2WOD: Learning Object Proposals for Weakly-Supervised Object Detection via Progressive Domain Adaptation
* Dance Style Transfer with Cross-modal Transformer
* Dataset Condensation with Distribution Matching
* DBCE: A Saliency Method for Medical Deep Learning Through Anatomically-Consistent Free-Form Deformations
* DCVNet: Dilated Cost Volume Networks for Fast Optical Flow
* DDNeRF: Depth Distribution Neural Radiance Fields
* DE-CROP: Data-efficient Certified Robustness for Pretrained Classifiers
* Deep Learning Methodology for Early Detection and Outbreak Prediction of Invasive Species Growth
* Deep Model-Based Super-Resolution with Non-uniform Blur
* Deep Neural Framework to Detect Individual Advertisement (Ad) from Videos, A
* DeepPrivacy2: Towards Realistic Full-Body Anonymization
* DeformIrisNet: An Identity-Preserving Model of Iris Texture Deformation
* DELS-MVS: Deep Epipolar Line Search for Multi-View Stereo
* Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
* Dense but Efficient VideoQA for Intricate Compositional Reasoning
* Dense Prediction with Attentive Feature Aggregation
* Dense Voxel Fusion for 3D Object Detection
* Detection Recovery in Online Multi-Object Tracking with Sparse Graph Tracker
* Diffeomorphic Image Registration with Neural Velocity Field
* Difficulty-Net: Learning to Predict Difficulty for Long-Tailed Recognition
* DigiFace-1M: 1 Million Digital Face Images for Face Recognition
* Discrete Cosin TransFormer: Image Modeling From Frequency Domain
* Dissecting Deep Metric Learning Losses for Image-Text Retrieval
* Do Adaptive Active Attacks Pose Greater Risk Than Static Attacks?
* Do Pre-trained Models Benefit Equally in Continual Learning?
* Domain Adaptation using Self-Training with Mixup for One-Stage Object Detection
* Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather
* Domain Adaptive Video Semantic Segmentation via Cross-Domain Moving Object Mixing
* Domain Invariant Vision Transformer Learning for Face Anti-spoofing
* DRAMA: Joint Risk Localization and Captioning in Driving
* DSAG: A Scalable Deep Framework for Action-Conditioned Multi-Actor Full Body Motion Synthesis
* DSFormer: A Dual-domain Self-supervised Transformer for Accelerated Multi-contrast MRI Reconstruction
* DSTrans: Dual-Stream Transformer for Hyperspectral Image Restoration
* DyAnNet: A Scene Dynamicity Guided Self-Trained Video Anomaly Detection Network
* Dynamic Mixture of Counter Network for Location-Agnostic Crowd Counting
* Dynamic Neural Portraits
* Dynamic Re-weighting for Long-tailed Semi-supervised Learning
* DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editings
* Effective Invertible Arbitrary Image Rescaling
* Efficient few-shot learning for pixel-precise handwritten document layout analysis
* Efficient Flow-Guided Multi-frame De-fencing
* Efficient Reference-based Video Super-Resolution (ERVSR): Single Reference Image Is All You Need
* Efficient Skeleton-Based Action Recognition via Joint-Mapping strategies
* Efficient Visual Tracking with Exemplar Transformers
* EfficientPhys: Enabling Simple, Fast and Accurate Camera-Based Cardiac Measurement
* Ego-Vehicle Action Recognition based on Semi-Supervised Contrastive Learning
* Elimination of Non-Novel Segments at Multi-Scale for Few-Shot Segmentation
* ElliPose: Stereoscopic 3D Human Pose Estimation by Fitting Ellipsoids
* Embedding-Dynamic Approach to Self-Supervised Learning, An
* EmbryosFormer: Deformable Transformer and Collaborative Encoding-Decoding for Embryos Stage Development Classification
* Empirical Generalization Study: Unsupervised Domain Adaptation vs. Domain Generalization Methods for Semantic Segmentation in the Wild
* Enabling ISPless Low-Power Computer Vision
* Encouraging Disentangled and Convex Representation with Controllable Interpolation Regularization
* End-to-End Single-Frame Image Signal Processing for High Dynamic Range Scenes
* Enhanced Bi-directional Motion Estimation for Video Frame Interpolation
* Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution
* ETR: An Efficient Transformer for Re-ranking in Visual Place Recognition
* Ev-NeRF: Event Based Neural Radiance Field
* Evaluating generative networks using Gaussian mixtures of image features
* Event-based RGB sensing with structured light
* Event-Specific Audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding
* EventPoint: Self-Supervised Interest Point Detection and Description for Event-based Camera
* Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction
* Expansion of Visual Hints for Improved Generalization in Stereo Matching
* Expert-defined Keywords Improve Interpretability of Retinal Image Captioning
* Explainability-Aware One Point Attack for Point Cloud Neural Networks
* Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection
* Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs
* Exploiting Visual Context Semantics for Sound Source Localization
* FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping
* FaceOff: A Video-to-Video Face Swapping System
* FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection
* Fantastic Style Channels and Where to Find Them: A Submodular Framework for Discovering Diverse Directions in GANs
* Far3Det: Towards Far-Field 3D Detection
* Fashion Image Retrieval with Text Feedback by Additive Attention Compositional Learning
* Fast and Accurate: Video Enhancement Using Sparse Depth
* Fast Differentiable Transient Rendering for Non-Line-of-Sight Reconstruction
* Fast Online Video Super-Resolution with Deformable Attention Pyramid
* FastSwap: A Lightweight One-Stage Framework for Real-Time Face Swapping
* Feature Disentanglement Learning with Switching and Aggregation for Video-based Person Re-Identification
* Federated Domain Generalization for Image Recognition via Cross-Client Style Transfer
* Federated Learning for Commercial Image Sources
* FeTrIL: Feature Translation for Exemplar-Free Class-Incremental Learning
* Few-Shot Learning of Compact Models via Task-Specific Meta Distillation
* Few-shot Medical Image Segmentation with Cycle-resemblance Attention
* Few-shot Object Counting with Similarity-Aware Feature Enhancement
* Few-shot Object Detection via Improved Classification Features
* FFM: Injecting Out-of-Domain Knowledge via Factorized Frequency Modification
* Fine Gaze Redirection Learning with Gaze Hardness-aware Transformation
* Fine-Context Shadow Detection using Shadow Removal
* Fine-grained Activities of People Worldwide
* Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos
* FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
* FLOAT: Fast Learnable Once-for-All Adversarial Training for Tunable Trade-off between Accuracy and Robustness
* Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers
* Fractual Projection Forest: Fast and Explainable Point Cloud Classifier
* Frame Interpolation for Dynamic Scenes with Implicit Flow Encoding
* FreeREA: Training-Free Evolution-based Architecture Search
* Frequency-Aware Self-Supervised Monocular Depth Estimation
* From Forks to Forceps: A New Framework for Instance Segmentation of Surgical Instruments
* Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation
* Fully Convolutional Transformer for Medical Image Segmentation, The
* FUSSL: Fuzzy Uncertain Self Supervised Learning
* GAF-Net: Improving the Performance of Remote Sensing Image Fusion using Novel Global Self and Cross Attention Learning
* GAFNet: A Global Fourier Self Attention Based Novel Network for multi-modal downstream tasks
* GaIA: Graphical Information Gain based Attention Network for Weakly Supervised Point Cloud Semantic Segmentation
* Gait Recognition Using 3-D Human Body Shape Inference
* Gallery Filter Network for Person Search
* GarSim: Particle Based Neural Garment Simulator
* GEMS: Generating Efficient Meta-Subnets
* GEMS: Scene Expansion using Generative Models of Graphs
* Generative Alignment of Posterior Probabilities for Source-free Domain Adaptation
* Generative Colorization of Structured Mobile Web Pages
* Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data
* GeoFill: Reference-Based Image Inpainting with Better Geometric Understanding
* GLAD: A Global-to-Local Anomaly Detector
* GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
* Global-Local Self-Distillation for Visual Representation Learning
* GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates
* Gradient-Based Quantification of Epistemic Uncertainty for Deep Object Detectors
* Graph-Based Self-Learning for Robust Person Re-identification
* Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing
* Guiding Users to Where to Give Color Hints for Efficient Interactive Sketch Colorization via Unsupervised Region Prioritization
* Guiding Visual Question Answering with Attention Priors
* HandGCNFormer: A Novel Topology-Aware Transformer Network for 3D Hand Pose Estimation
* Handling Image and Label Resolution Mismatch in Remote Sensing
* Hard to Track Objects with Irregular Motions and Similar Appearances? Make It Easier by Buffering the Matching Space
* Harnessing Unrecognizable Faces for Improving Face Recognition
* Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization
* Heatmap-based Out-of-Distribution Detection
* Heightfields for Efficient Scene Reconstruction for AR
* HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation
* High-Quality RGB-D Reconstruction via Multi-View Uncalibrated Photometric Stereo and Gradient-SDF
* High-Resolution Depth Estimation for 360° Panoramas through Perspective and Panoramic Depth Images Registration
* HIME: Efficient Headshot Image Super-Resolution with Multiple Exemplars
* HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks
* Holistic Interaction Transformer Network for Action Detection
* HOOT: Heavy Occlusions in Object Tracking Benchmark
* How to Practice VQA on a Resource-limited Target Domain
* Human-in-the-Loop Video Semantic Segmentation Auto-Annotation
* HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar
* HyperBlock Floating Point: Generalised Quantization Scheme for Gradient and Inference Computation
* Hyperdimensional Feature Fusion for Out-of-Distribution Detection
* HyperPosePDF: Hypernetworks Predicting the Probability Distribution on SO(3)
* HyperShot: Few-Shot Learning by Kernel HyperNetworks
* Hyperspherical Quantization: Toward Smaller and More Accurate Models
* I See-Through You: A Framework for Removing Foreground Occlusion in Both Sparse and Dense Light Field Images
* iColoriT: Towards Propagating Local Hints to the Right Region in Interactive Colorization by Leveraging Vision Transformer
* IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes
* IFQA: Interpretable Face Quality Assessment
* Image Completion with Heterogeneously Filtered Spectral Hints
* Image Segmentation-based Unsupervised Multiple Objects Discovery
* Image-Consistent Detection of Road Anomalies as Unpredictable Patches
* Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation
* Image-Text Pre-Training for Logo Recognition
* ImpDet: Exploring Implicit Fields for 3D Object Detection
* ImPosing: Implicit Pose Encoding for Efficient Visual Localization
* Improving Deep Facial Phenotyping for Ultra-rare Disorder Verification Using Model Ensembles
* Improving Diversity with Adversarially Learned Transformations for Domain Generalization
* Improving Multi-fidelity Optimization with a Recurring Learning Rate for Hyperparameter Tuning
* Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information
* Improving Predicate Representation in Scene Graph Generation by Self-Supervised Learning
* Improving saliency models' predictions of the next fixation with humans' intrinsic cost of gaze shifts
* Improving the Pair Selection and the Model Fusion Steps of Satellite Multi-View Stereo Pipelines
* Improving the Robustness of Point Convolution on k-Nearest Neighbor Neighborhoods with a Viewpoint-Invariant Coordinate Transform
* Indirect Adversarial Losses via an Intermediate Distribution for Training GANs
* InDiReCT: Language-Guided Zero-Shot Deep Metric Learning for Images
* Inducing Data Amplification Using Auxiliary Datasets in Adversarial Training
* Instance-Dependent Noisy Label Learning via Graphical Modelling
* Intention-Conditioned Long-Term Human Egocentric Action Anticipation
* Interacting Hand-Object Pose Estimation via Dense Mutual Attention
* Interactive Image Manipulation with Complex Text Instructions
* Interpolated SelectionConv for Spherical Images and Surfaces
* Interpreting Disparate Privacy-Utility Tradeoff in Adversarial Learning via Attribute Correlation
* Intra-Batch Supervision for Panoptic Segmentation on High-Resolution Images
* Intra-Source Style Augmentation for Improved Domain Generalization
* Is Bigger Always Better? An Empirical Study on Efficient Architectures for Style Transfer and Beyond
* Is your noise correction noisy? PLS: Robustness to label noise with two stage detection
* Joint Point Interaction-Dimension Search for 3D Point Cloud
* Joint Video Rolling Shutter Correction and Super-Resolution
* Jointly Learning Band Selection and Filter Array Design for Hyperspectral Imaging
* K-VQG: Knowledge-aware Visual Question Generation for Common-sense Acquisition
* Kernel-Aware Burst Blind Super-Resolution
* Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand
* Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos
* Knowing What to Label for Few Shot Microscopy Image Cell Segmentation
* LAB: Learnable Activation Binarizer for Binary Neural Networks
* Language-free Training for Zero-shot Video Grounding
* Large-Scale Open-Set Classification Protocols for ImageNet
* Large-to-small Image Resolution Asymmetry in Deep Metric Learning
* LAVA: Label-efficient Visual Learning and Adaptation
* LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents
* LCS: Learning Compressible Subspaces for Efficient, Adaptive, Real-Time Network Compression at Inference Time
* Learnable Human Mesh Triangulation for 3D Human Pose and Shape Estimation
* Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats
* Learning Across Domains and Devices: Style-Driven Source-Free Domain Adaptation in Clustered Federated Learning
* Learning Attention Propagation for Compositional Zero-Shot Learning
* Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision
* Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation
* Learning Few-shot Segmentation from Bounding Box Annotations
* Learning Graph Variational Autoencoders with Constraints and Structured Priors for Conditional Indoor 3D Scene Generation
* Learning How to MIMIC: Using Model Explanations to Guide Deep Learning Training
* Learning incoherent light emission steering from metasurfaces using generative models
* Learning Latent Structural Relations with Message Passing Prior
* Learning Lightweight Neural Networks via Channel-Split Recurrent Convolution
* Learning Style Subspaces for Controllable Unpaired Domain Translation
* Learning to Detect 3D Lanes by Shape Matching and Embedding
* Leveraging Local Patch Differences in Multi-Object Scenes for Generative Adversarial Attacks
* Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation
* Li3DeTr: A LiDAR based 3D Detection Transformer
* Lightweight Network For Video Motion Magnification
* Lightweight Video Denoising using Aggregated Shifted Window Attention
* Line Search-Based Feature Transformation for Fast, Stable, and Tunable Content-Style Control in Photorealistic Style Transfer
* LineEX: Data Extraction from Scientific Line Charts
* LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
* LoopDA: Constructing Self-loops to Adapt Nighttime Semantic Segmentation
* Lossy Image Compression with Quantized Hierarchical VAEs
* LRA&LDRA: Rethinking Residual Predictions for Efficient Shadow Detection and Removal
* M-FUSE: Multi-frame Fusion for Scene Flow Estimation
* Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images
* Mapping DNN Embedding Manifolds for Network Generalization Prediction
* Marker-removal Networks to Collect Precise 3D Hand Data for RGB-based Estimation and its Application in Piano
* Masked Image Modeling Advances 3D Medical Image Analysis
* MASTAF: A Model-Agnostic Spatio-Temporal Attention Fusion Network for Few-shot Video Classification
* Match Cutting: Finding Cuts with Smooth Visual Transitions
* Medical Image Segmentation via Cascaded Attention Decoding
* Mesh-Tension Driven Expression-Based Wrinkles for Synthetic Faces
* Meta-Auxiliary Learning for Future Depth Prediction in Videos
* Meta-Learning for Adaptation of Deep Optical Flow Networks
* Meta-OLE: Meta-learned Orthogonal Low-Rank Embedding
* MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification
* MFCFlow: A Motion Feature Compensated Multi-Frame Recurrent Network for Optical Flow Estimation
* MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection
* Mixture Outlier Exposure: Towards Out-of-Distribution Detection in Fine-grained Environments
* MixVPR: Feature Mixing for Visual Place Recognition
* ML-Decoder: Scalable and Versatile Classification Head
* MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark
* Mobile Robot Manipulation using Pure Object Detection
* Modality Mixer for Multi-modal Action Recognition
* Modeling Stroke Mask for End-to-End Text Erasing
* Modeling the Lighting in Scenes as Style for Auto White-Balance Correction
* MonoDVPS: A Self-Supervised Monocular Depth Estimation Approach to Depth-aware Video Panoptic Segmentation
* MonoEdge: Monocular 3D Object Detection Using Local Perspectives
* More Control for Free! Image Synthesis with Semantic Diffusion Guidance
* More Knowledge, Less Bias: Unbiasing Scene Graph Generation with Explicit Ontological Adjustment
* More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching
* MORGAN: Meta-Learning-based Few-Shot Open-Set Recognition via Generative Adversarial Network
* Morphology Focused Diffusion Probabilistic Model for Synthesis of Histopathology Images, A
* Motif Mining: Finding and Summarizing Remixed Image Content
* Motion Aware Self-Supervision for Generic Event Boundary Detection
* MovieCLIP: Visual Scene Recognition in Movies
* MRI Imputation based on Fused Index- and Intensity-Registration
* MT-DETR: Robust End-to-end Multimodal Detection with Confidence Fusion
* Multi-Frame Attention with Feature-Level Warping for Drone Crowd Tracking
* Multi-level Contrastive Learning for Self-Supervised Vision Transformers
* Multi-scale Cell-based Layout Representation for Document Understanding
* Multi-scale Contrastive Learning for Complex Scene Generation
* Multi-View Action Recognition using Contrastive Learning
* Multi-View Photometric Stereo Revisited
* Multi-view Tracking Using Weakly Supervised Human Motion Prediction
* Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution
* Multimodal Vision Transformers with Forced Attention for Behavior Analysis
* Multivariate Probabilistic Monocular 3D Object Detection
* Mutual Learning for Long-Tailed Recognition
* My Face My Choice: Privacy Enhancing Deepfakes for Social Media Anonymization
* NAPReg: Nouns As Proxies Regularization for Semantically Aware Cross-Modal Embeddings
* Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis
* Nested Deformable Multi-head Attention for Facial Image Inpainting
* Neural Distributed Image Compression with Cross-Attention Feature Alignment
* Neural Implicit Representations for Physical Parameter Inference from a Single Video
* neural video codec with spatial rate-distortion control, A
* Neural Weight Search for Scalable Task Incremental Learning
* NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds
* nLMVS-Net: Deep Non-Lambertian Multi-View Stereo
* No Reference Opinion Unaware Quality Assessment of Authentically Distorted Images
* No Shifted Augmentations (NSA): Compact distributions for robust self-supervised Anomaly Detection
* Normality Guided Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
* OCR-VQGAN: Taming Text-within-Image Generation
* On Quantizing Implicit Neural Representations
* On the Importance of Denoising when Learning to Compress Images
* One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text
* One-Shot Synthesis of Images and Segmentation Masks
* Online Adaptive Temporal Memory with Certainty Estimation for Human Trajectory Prediction
* Online Knowledge Distillation for Multi-task Learning
* OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping
* Orthogonal Transforms For Learning Invariant Representations In Equivariant Neural Networks
* Out-of-distribution Detection via Frequency-regularized Generative Models
* Out-of-Distribution Detection with Reconstruction Error and Typicality-based Penalty
* OutfitTransformer: Learning Outfit Representations for Fashion Recommendation
* Overlap-guided Gaussian Mixture Models for Point Cloud Registration
* Panoptic-aware Image-to-Image Translation
* Partially calibrated semi-generalized pose from hybrid point correspondences
* Patch-based Privacy Preserving Neural Network for Vision Tasks
* Patch-level Gaze Distribution Prediction for Gaze Following
* PatchDropout: Economizing Vision Transformers Using Patch Dropout
* PatchZero: Defending against Adversarial Patch Attacks by Detecting and Zeroing the Patch
* PERCEIVER-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
* Perceptual Image Enhancement for Smartphone Real-Time Applications
* Performance comparison of DVS data spatial downscaling methods using Spiking Neural Networks
* Performer: A Novel PPG-to-ECG Reconstruction Transformer for a Digital Biomarker of Cardiovascular Disease Detection
* Phantom Sponges: Exploiting Non-Maximum Suppression to Attack Deep Object Detectors
* Physically Plausible Animation of Human Upper Body from a Single Image
* Pik-Fix: Restoring and Colorizing Old Photos
* PINER: Prior-informed Implicit Neural Representation Learning for Test-time Adaptation in Sparse-view CT Reconstruction
* Pixel-Wise Prediction based Visual Odometry via Uncertainty Estimation
* Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes
* Planar Object Tracking via Weighted Optical Flow
* PointInverter: Point Cloud Reconstruction and Editing via a Generative Model with Shape Priors
* PointNeuron: 3D Neuron Reconstruction via Geometry and Topology Learning of Point Clouds
* PP4AV: A benchmarking Dataset for Privacy-preserving Autonomous Driving
* PreViTS: Contrastive Pretraining with Video Tracking Supervision
* Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues, A
* PRN: Panoptic Refinement Network
* Proactive Deepfake Defence via Identity Watermarking
* Probabilistic Integration of Object Level Annotations in Chest X-ray Classification
* Probabilistic Volumetric Fusion for Dense Monocular SLAM
* Progressive Video Summarization via Multimodal Self-supervised Learning
* Protocol for Evaluating Model Interpretation Methods from Visual Explanations, A
* ProtoSeg: Interpretable Semantic Segmentation with Prototypical Parts
* Pruning-Guided Curriculum Learning for Semi-Supervised Semantic Segmentation
* PSENet: Progressive Self-Enhancement Network for Unsupervised Extreme-Light Image Enhancement
* Pushing the Efficiency Limit Using Structured Sparse Convolutions
* QMagFace: Simple and Accurate Quality-Aware Face Recognition
* Quality Aware Sample-to-Sample Comparison for Face Recognition, A
* RADIANT: Better rPPG estimation using signal embeddings and Transformer
* RANCER: Non-Axis Aligned Anisotropic Certification with Randomized Smoothing
* Randomness is the Root of All Evil: More Reliable Evaluation of Deep Active Learning
* RAST: Restorable Arbitrary Style Transfer via Multi-restoration
* Real-time Concealed Weapon Detection on 3D Radar Images for Walk-through Screening System
* Real-Time Restoration of Dark Stereo Images
* Realistic Full-Body Anonymization with Surface-Guided GANs
* Rebalancing gradient to improve self-supervised co-training of depth, odometry and optical flow predictions
* Recipe2Video: Synthesizing Personalized Videos from Recipe Texts
* Reconstructing Humpty Dumpty: Multi-feature Graph Autoencoder for Open Set Action Recognition
* Recovering Fine Details for Neural Implicit Surface Reconstruction
* Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition
* Reducing Annotation Effort by Identifying and Labeling Contextually Diverse Classes for Semantic Segmentation Under Domain Shift
* ReEnFP: Detail-Preserving Face Reconstruction by Encoding Facial Priors
* Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions
* Relation Preserving Triplet Mining for Stabilising the Triplet Loss in Re-identification Systems
* Relaxing Contrastiveness in Multimodal Representation Learning
* Representation Disentanglement in Generative Models with Contrastive Learning
* Representation Recovering for Self-Supervised Pre-training on Medical Images
* Resolving Class Imbalance for LiDAR-based Object Detector by Dynamic Weight Average and Contextual Ground Truth Sampling
* Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation
* Rethinking the Data Annotation Process for Multiview 3D Pose Estimation with Active Learning and Self-Training
* Revisiting Training-free NAS Metrics: An Efficient Training-based Method
* RIFT: Disentangled Unsupervised Image Translation via Restricted Information Flow
* RNAS-MER: A Refined Neural Architecture Search with Hybrid Spatiotemporal Operations for Micro-Expression Recognition
* Robust and Efficient Alignment of Calcium Imaging Data through Simultaneous Low Rank and Sparse Decomposition
* Robust Real-world Image Enhancement Based on Multi-Exposure LDR Images
* Robustness of Trajectory Prediction Models Under Map-Based Attacks
* ROMA: Run-Time Object Detection To Maximize Real-Time Accuracy
* RSF: Optimizing Rigid Scene Flow From 3D Point Clouds Without Labels
* SAILOR: Scaling Anchors via Insights into Latent Object Representation
* SALAD: Source-free Active Label-Agnostic Domain Adaptation for Classification, Segmentation and Detection
* Saliency Guided Experience Packing for Replay in Continual Learning
* SAT: Scale-Augmented Transformer for Person Search
* Scaling Neural Face Synthesis to High FPS and Low Latency by Neural Caching
* Scaling Novel Object Detection with Weakly Supervised Detection Transformers
* ScanNeRF: a Scalable Benchmark for Neural Radiance Fields
* ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification
* SCTS: Instance Segmentation of Single Cells Using a Transformer-Based Semantic-Aware Model and Space-Filling Augmentation
* SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution
* SD-Pose: Structural Discrepancy Aware Category-Level 6D Object Pose Estimation
* Searching Efficient Neural Architecture with Multi-resolution Fusion Transformer for Appearance-based Gaze Estimation
* Searching for Robust Binary Neural Networks via Bimodal Parameter Perturbation
* SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance
* Seg&Struct: The Interplay Between Part Segmentation and Structure Inference for 3D Shape Parsing
* Segmentation-free Direct Iris Localization Networks
* Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation
* Self Supervised Low Dose Computed Tomography Image Denoising Using Invertible Network Exploiting Inter Slice Congruence
* Self-Attention Message Passing for Contrastive Few-Shot Learning
* Self-Attentive Pooling for Efficient Deep Learning
* Self-Distillation for Unsupervised 3D Domain Adaptation
* Self-Distilled Self-supervised Representation Learning
* Self-improving Multiplane-to-layer Images for Novel View Synthesis
* Self-Pair: Synthesizing Changes from Single Source for Object Change Detection in Remote Sensing Imagery
* Self-Supervised 2D/3D Registration for X-Ray to CT Image Fusion
* Self-Supervised Clustering based on Manifold Learning and Graph Convolutional Networks
* Self-supervised Correspondence Estimation via Multiview Registration
* Self-Supervised Distilled Learning for Multi-modal Misinformation Identification
* Self-supervised Learning with Local Contrastive Loss for Detection and Semantic Segmentation
* Self-Supervised Learning with Masked Image Modeling for Teeth Numbering, Detection of Dental Restorations, and Instance Segmentation in Dental Panoramic Radiographs
* Self-supervised Monocular Depth Estimation from Thermal Images via Adversarial Multi-spectral Adaptation
* Self-Supervised Monocular Depth Estimation: Solving the Edge-Fattening Problem
* Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond
* Self-supervised Relative Pose with Homography Model-fitting in the Loop
* Semantic Guided Latent Parts Embedding for Few-Shot Learning
* Semantic Segmentation in Aerial Imagery Using Multi-level Contrastive Learning with Local Consistency
* Semantic Segmentation of Degraded Images Using Layer-Wise Feature Adjustor
* Semantic Segmentation with Active Semi-Supervised Learning
* Semantics Guided Contrastive Learning of Transformers for Zero-shot Temporal Activity Detection
* Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth
* Semi-Supervised Domain Adaptation with Auto-Encoder via Simultaneous Learning
* Semi-Supervised Learning for Low-light Image Restoration through Quality Assisted Pseudo-Labeling
* Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing
* Separating Partially-Polarized Diffuse and Specular Reflection Components under Unpolarized Light Sources
* Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition
* Serf: Towards better training of deep neural networks using log-Softplus ERror activation Function
* SGPCR: Spherical Gaussian Point Cloud Representation and its Application to Object Registration and Retrieval
* SHARDS: Efficient SHAdow Removal using Dual Stage Network for High-Resolution Images
* Sim2real Transfer Learning for Point Cloud Segmentation: An Industrial Application Case on Autonomous Disassembly
* Sim2RealVS: A New Benchmark for Video Stabilization with a Strong Baseline
* SimGlim: Simplifying glimpse based active visual reconstruction
* Similarity Contrastive Estimation for Self-Supervised Soft Contrastive Learning
* Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector, A
* Simple and Powerful Global Optimization for Unsupervised Video Object Segmentation, A
* Simultaneous Acquisition of High Quality RGB Image and Polarization Information using a Sparse Polarization Sensor
* Single Image Super-Resolution via a Dual Interactive Implicit Neural Network
* Single Stage Weakly Supervised Semantic Segmentation of Complex Scenes
* Single-Image HDR Reconstruction by Multi-Exposure Generation
* SIRA: Relightable Avatars from a Single Image
* SIUNet: Sparsity Invariant U-Net for Edge-Aware Depth Completion
* SketchInverter: Multi-Class Sketch-Based Image Generation via GAN Inversion
* Skew-Robust Human-Object Interactions in Videos
* SLI-pSp: Injecting Multi-Scale Spatial Layout in pSp
* SONGs: Self-Organizing Neural Graphs
* Sparsity Agnostic Depth Completion
* Spatial Consistency Loss for Training Multi-Label Classifiers from Single-Label Annotations
* Spatially Multi-Conditional Image Generation
* Spatio-Temporal Action Detection Under Large Motion
* Spike-Based Anytime Perception
* SPIQ: Data-Free Per-Channel Static Input Quantization
* Splatting-based Synthesis for Video Frame Interpolation
* Split to Learn: Gradient Split for Multi-Task Human Image Analysis
* SSFE-Net: Self-Supervised Feature Enhancement for Ultra-Fine-Grained Few-Shot Class Incremental Learning
* SSSD: Self-Supervised Self Distillation
* STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition
* Stop or Forward: Dynamic Layer Skipping for Efficient Action Recognition
* Structure-Encoding Auxiliary Tasks for Improved Visual Representation in Vision-and-Language Navigation
* Style-Guided Inference of Transformer for High-resolution Image Synthesis
* Surface normal estimation from optimized and distributed light sources using DNN-based photometric stereo
* Suspect Identification Framework using Contrastive Relevance Feedback, A
* SVD-NAS: Coupling Low-Rank Approximation and Neural Architecture Search
* Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
* Synthetic Latent Fingerprint Generator
* Task Agnostic and Post-hoc Unseen Distribution Detection
* TCAM: Temporal Class Activation Maps for Object Localization in Weakly-Labeled Unconstrained Videos
* Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization
* Temporally Consistent Online Depth Estimation in Dynamic Scenes
* TeST: Test-time Self-Training under Distribution Shift
* Text and Image Guided 3D Avatar Generation and Manipulation
* Text-Guided Object Detector for Multi-modal Video Question Answering
* THOR-Net: End-to-end Graformer-based Realistic Two Hands and Object Reconstruction with Self-supervision
* TI2Net: Temporal Identity Inconsistency Network for Deepfake Detection
* TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation
* Token Pooling in Vision Transformers for Image Classification
* Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search
* Towards A Framework for Privacy-Preserving Pedestrian Analysis
* Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors
* Towards Disturbance-Free Visual Mobile Manipulation
* Towards Equivariant Optical Flow Estimation with Deep Learning
* Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient?
* Towards Generating Ultra-High Resolution Talking-Face Videos with Lip synchronization
* Towards Interpretable Video Anomaly Detection
* Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale
* Towards Online Domain Adaptive Object Detection
* Tracking Growth and Decay of Plant Roots in Minirhizotron Images
* Training Auxiliary Prototypical Classifiers for Explainable Anomaly Detection in Medical Image Segmentation
* Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers
* Transformers For Recognition In Overhead Imagery: A Reality Check
* TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking
* TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection
* TransVLAD: Multi-Scale Attention-Based Global Descriptors for Visual Geo-Localization
* Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation
* Treatment Learning Causal Transformer for Noisy Image Classification
* TTTFlow: Unsupervised Test-Time Training with Normalizing Flow
* TVCalib: Camera Calibration for Sports Field Registration in Soccer
* TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation
* Two-level Data Augmentation for Calibrated Multi-view Detection
* Uncertainty-Aware Interactive LiDAR Sampling for Deep Depth Completion
* Uncertainty-aware Label Distribution Learning for Facial Expression Recognition
* Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study
* Unified Framework for Language Guided Image Completion, An
* Unifying Distribution Alignment as a Loss for Imbalanced Semi-supervised Learning
* Unifying Margin-Based Softmax Losses in Face Recognition
* Universal Deep Image Compression via Content-Adaptive Optimization with Adapters
* Unsupervised 4D LiDAR Moving Object Segmentation in Stationary Settings with Multivariate Occupancy Time Series
* Unsupervised Audio-Visual Lecture Segmentation
* Unsupervised Video Object Segmentation via Prototype Memory Network
* UPAR: Unified Pedestrian Attribute Recognition and Person Retrieval
* Uplift and Upsample: Efficient 3D Human Pose Estimation with Uplifting Transformers
* Urban Scene Semantic Segmentation with Low-Cost Coarse Annotation
* UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation
* Video joint denoising and demosaicing with recurrent CNNs
* Video Object Matting via Hierarchical Space-Time Semantic Guidance
* ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints
* VirtualHome Action Genome: A Simulated Spatio-Temporal Scene Graph Dataset with Consistent Relationship Labels
* Vis2Rec: A Large-Scale Visual Dataset for Visit Recommendation
* Vision Transformer for NeRF-Based View Synthesis from a Single Input Image
* Visualizing Global Explanations of Point Cloud DNNs
* Visually explaining 3D-CNN predictions for video classification with an adaptive occlusion sensitivity analysis
* VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
* VSGD-Net: Virtual Staining Guided Melanocyte Detection on Histopathological Images
* Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion
* Watching the News: Towards VideoQA Models that can Read
* Wavelength-aware 2D Convolutions for Hyperspectral Imaging
* Weakly Supervised Cell-Instance Segmentation with Two Types of Weak Labels by Single Instance Pasting
* Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss
* Weakly-Supervised Optical Flow Estimation for Time-of-Flight
* Weakly-supervised Point Cloud Instance Segmentation with Geometric Priors
* What can we Learn by Predicting Accuracy?
* WHFL: Wavelet-Domain High Frequency Loss for Sketch-to-Image Translation
* Wiener Guided DIP for Unsupervised Blind Image Deconvolution
* WSNet: Towards An Effective Method for Wound Image Segmentation
* X-Align: Cross-Modal Cross-View Alignment for Bird's-Eye-View Segmentation
* X-NeRF: Explicit Neural Radiance Field for Multi-Scene 360° Insufficient RGB-D Views
* Zero-shot versus Many-shot: Unsupervised Texture Anomaly Detection
* 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation
* 360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View
* 3D Face Style Transfer with a Hybrid Solution of NeRF and Mesh Rasterization
* 3D Human Pose Estimation with Two-step Mixed-Training Strategy
* 3D Reconstruction of Interacting Multi-Person in Clothing from a Single Image
* 3D Super-Resolution Model for Vehicle Flow Field Enrichment
* 3D-Aware Talking-Head Video Motion Transfer
* 3SD: Self-Supervised Saliency Detection With No Labels
* 4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters
* A*: Atrous Spatial Temporal Action Recognition for Real Time Applications
* Active Batch Sampling for Multi-label Classification with Binary User Feedback
* Active Learning for Single-Stage Object Detection in UAV Images
* Active Learning with Task Consistency and Diversity in Multi-Task Networks
* Active Transfer Learning for Efficient Video-Specific Human Pose Estimation
* Activity-based Early Autism Diagnosis Using A Multi-Dataset Supervised Contrastive Learning Approach
* Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning
* Adaptive Deep Neural Network Inference Optimization with EENet
* Adaptive Latent Diffusion Model for 3D Medical Image to Image Translation: Multi-modal Magnetic Resonance Imaging Study
* Adaptive manifold for imbalanced transductive few-shot learning
* Adversarial Likelihood Estimation With One-Way Flows
* AFTer-SAM: Adapting SAM with Axial Fusion Transformer for Medical Imaging Segmentation
* Aligning Non-Causal Factors for Transformer-Based Source-Free Domain Adaptation
* Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection
* AMEND: Adaptive Margin and Expanded Neighborhood for Efficient Generalized Category Discovery
* Amodal Intra-class Instance Segmentation: Synthetic Datasets and Benchmark
* Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning, An
* Analyzing the Domain Shift Immunity of Deep Homography Estimation
* Annotation-free Audio-Visual Segmentation
* AnyStar: Domain randomized universal star-convex 3D instance segmentation
* Appearance-Based Curriculum for Semi-Supervised Learning with Multi-Angle Unlabeled Data
* Approximating Intersections and Differences Between Linear Statistical Shape Models Using Markov Chain Monte Carlo
* Arbitrary-Resolution and Arbitrary-Scale Face Super-Resolution with Implicit Representation Networks
* ArcAid: Analysis of Archaeological Artifacts using Drawings
* ArcGeo: Localizing Limited Field-of-View Images using Cross-view Matching
* Are Natural Domain Foundation Models Useful for Medical Image Classification?
* Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection
* ARNIQA: Learning Distortion Manifold for Image Quality Assessment
* ArtQuest: Countering Hidden Language Biases in ArtVQA
* AssemblyNet: A Point Cloud Dataset and Benchmark for Predicting Part Directions in an Exploded Layout
* Assessing Neural Network Robustness via Adversarial Pivotal Tuning
* Assist Is Just as Important as the Goal: Image Resurfacing to Aid Model's Robust Prediction
* Asymmetric Image Retrieval with Cross Model Compatible Ensembles
* ATS: Adaptive Temperature Scaling for Enhancing Out-of-Distribution Detection Methods
* Attention Modules Improve Image-Level Anomaly Detection for Industrial Inspection: A DifferNet Case Study
* Attention-Guided Prototype Mixing: Diversifying Minority Context on Imbalanced Whole Slide Images Classification Learning
* Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection
* AU-Aware Dynamic 3D Face Reconstruction from Videos with Transformer
* Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models
* Auto-BPA: An Enhanced Ball-Pivoting Algorithm with Adaptive Radius using Contextual Bandits
* Automated Camera Calibration via Homography Estimation with GNNs
* Automated Monitoring of Ear Biting in Pigs by Tracking Individuals and Events
* Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition
* AvatarOne: Monocular 3D Human Animation
* Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation
* Background Also Matters: Background-Aware Motion-Guided Objects Discovery, The
* Bag of Tricks for Fully Test-Time Adaptation
* BALF: Simple and Efficient Blur Aware Local Feature Detector
* Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness
* Benchmarking Out-of-Distribution Detection in Visual Question Answering
* Best of Both Worlds: Learning Arbitrary-scale Blind Super-Resolution via Dual Degradation Representations and Cycle-Consistency
* BEVMap: Map-Aware BEV Modeling for 3D Perception
* Beyond Active Learning: Leveraging the Full Potential of Human Interaction via Auto-Labeling, Human Correction, and Human Verification
* Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection
* Beyond Document Page Classification: Design, Datasets, and Challenges
* Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection
* Beyond RGB: A Real World Dataset for Multispectral Imaging in Mobile Devices
* Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation
* Beyond SOT: Tracking Multiple Generic Objects at Once
* Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
* Bias and Diversity in Synthetic-based Face Recognition
* BigSmall: Efficient Multi-Task Learning for Disparate Spatial and Temporal Physiological Measurements
* Bipartite Graph Diffusion Model for Human Interaction Generation
* BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping
* Blurry Video Compression: A Trade-off between Visual Enhancement and Data Compression
* Booster-SHOT: Boosting Stacked Homography Transformations for Multiview Pedestrian Detection with Attention
* Boosting Weakly Supervised Object Detection using Fusion and Priors from Hallucinated Depth
* BoostRad: Enhancing Object Detection by Boosting Radar Reflections
* BPKD: Boundary Privileged Knowledge Distillation For Semantic Segmentation
* Brainomaly: Unsupervised Neurologic Disease Detection Utilizing Unannotated T1-weighted Brain MR Images
* Bridging Generalization Gaps in High Content Imaging Through Online Self-Supervised Domain Adaptation
* Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion
* BSRAW: Improving Blind RAW Image Super-Resolution
* C-CLIP: Contrastive Image-Text Encoders to Close the Descriptive-Commentative Gap
* C2AIR: Consolidated Compact Aerial Image Haze Removal
* CAD: Contextual Multi-modal Alignment for Dynamic AVQA
* CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning
* Camera-Independent Single Image Depth Estimation from Defocus Blur
* CamoFocus: Enhancing Camouflage Object Detection with Split-Feature Focal Modulation and Context Refinement
* CAMOT: Camera Angle-aware Multi-Object Tracking
* Can CLIP Help Sound Source Localization?
* Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning
* Can you even tell left from right? Presenting a new challenge for VQA
* CARE: Counterfactual-based Algorithmic Recourse for Explainable Pose Correction
* CATS: Combined Activation and Temporal Suppression for Efficient Network Inference
* Causal Analysis for Robust Interpretability of Neural Networks
* Causal Feature Alignment: Learning to Ignore Spurious Background Features
* CCMR: High Resolution Optical Flow Estimation via Coarse-to-Fine Context-Guided Motion Reasoning
* CGAPoseNet+GCAN: A Geometric Clifford Algebra Network for Geometry-aware Camera Pose Regression
* CHAI: Craters in Historical Aerial Images
* Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation
* CL-MAE: Curriculum-Learned Masked Autoencoders
* Classifying Cable Tendency with Semantic Segmentation by Utilizing Real and Simulated RGB Data
* CLID: Controlled-Length Image Descriptions with Limited Data
* CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free
* CLIPAG: Towards Generator-Free Text-to-Image Generation
* Closer Look at Robustness of Vision Transformers to Backdoor Attacks, A
* CLRerNet: Improving Confidence of Lane Detection with LaneIoU
* ClusterFix: A Cluster-Based Debiasing Approach without Protected-Group Supervision
* Co-Speech Gesture Detection through Multi-Phase Sequence Labeling
* Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection, A
* CoD: Coherent Detection of Entities from Images with Multiple Modalities
* Collage Diffusion
* Common Diffusion Noise Schedules and Sample Steps are Flawed
* Complementary-Contradictory Feature Regularization against Multimodal Overfitting
* Complex Organ Mask Guided Radiology Report Generation
* Composite Diffusion: whole >= Sigma-parts
* Computer Vision on the Edge: Individual Cattle Identification in Real-time with ReadMyCow System
* Concept-Centric Transformers: Enhancing Model Interpretability through Object-Centric Concept Learning within a Shared Global Workspace
* Concurrent Band Selection and Traversability Estimation from Long-Wave Hyperspectral Imagery in Off-Road Settings
* Conditional Velocity Score Estimation for Image Restoration
* ConeQuest: A Benchmark for Cone Segmentation on Mars
* ConfTrack: Kalman Filter-based Multi-Person Tracking by Utilizing Confidence Score of Detection Box
* Consistent Multimodal Generation via A Unified GAN Framework
* Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction
* Content-Aware Image Color Editing with Auxiliary Color Restoration Tasks
* Context in Human Action through Motion Complementarity
* Context-based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion Forecasting
* Contextual Affinity Distillation for Image Anomaly Detection
* Continual atlas-based segmentation of prostate MRI
* Continual Learning of Unsupervised Monocular Depth from Videos
* Continual Test-time Domain Adaptation via Dynamic Sample Selection
* Continuous Adaptation for Interactive Segmentation Using Teacher-Student Architecture
* Contrastive Learning for Multi-Object Tracking with Transformers
* Contrastive Viewpoint-aware Shape Learning for Long-term Person Re-Identification
* Controllable Image Synthesis of Industrial Data using Stable Diffusion
* Controllable Text-to-Image Synthesis for Multi-Modality MR Images
* Controlling Character Motions without Observable Driving Source
* Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model
* Controlling Virtual Try-on Pipeline Through Rendering Policies
* Convolutional Masked Image Modeling for Dense Prediction Tasks on Pathology Images
* Correlation-aware active learning for surgery video segmentation
* CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
* CrashCar101: Procedural Generation for Damage Assessment
* Critical Gap Between Generalization Error and Empirical Error in Active Learning
* Cross-Attention Between Satellite and Ground Views for Enhanced Fine-Grained Robot Geo-Localization
* Cross-Domain Few-Shot Incremental Learning for Point-Cloud Recognition
* Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data
* CryoRL: Reinforcement Learning Enables Efficient Cryo-EM Data Collection
* CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation
* Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models
* CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer
* CXR-IRGen: An Integrated Vision and Language Model for the Generation of Clinically Accurate Chest X-Ray Image-Report Pairs
* CycleCL: Self-supervised Learning for Periodic Videos
* D3GU: Multi-target Active Domain Adaptation via Enhancing Domain Alignment
* D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles
* dacl10k: Benchmark for Semantic Bridge Damage Segmentation
* Data Augmentation for Object Detection via Controllable Diffusion Models
* Data-Centric Debugging: Mitigating Model Failures Via Targeted Image Retrieval
* DDAM-PS: Diligent Domain Adaptive Mixer for Person Search
* Debiasing, calibrating, and improving Semi-supervised Learning performance via simple Ensemble Projector
* Deblur-NSFF: Neural Scene Flow Fields for Blurry Dynamic Scenes
* DECDM: Document Enhancement using Cycle-Consistent Diffusion Models
* Deep Image Fingerprint: Towards Low Budget Synthetic Image Detection and Model Lineage Analysis
* Deep Metric Learning with Chance Constraints
* Deep Optics for Optomechanical Control Policy Design
* Deep Plug-and-play Nighttime Non-blind Deblurring with Saturated Pixel Handling Schemes
* Deep Subdomain Alignment for Cross-domain Image Classification
* Deep Visual-Genetic Biometrics for Taxonomic Classification of Rare Species
* Defending Object Detection Models against Image Distortions
* Defense against Adversarial Cloud Attack on Remote Sensing Salient Object Detection
* Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation
* Density-Based Flow Mask Integration via Deformable Convolution for Video People Flux Estimation
* Depth from Asymmetric Frame-Event Stereo: A Divide-and-Conquer Approach
* Describe Images in a Boring Way: Towards Cross-Modal Sarcasm Generation
* Design Choices for Enhancing Noisy Student Self-Training
* Designing a Hybrid Neural System to Learn Real-world Crack Segmentation from Fractal-based Simulation
* Detecting Content Segments from Online Sports Streaming Events: Challenges and Solutions
* Detection Defenses: An Empty Promise against Adversarial Patch Attacks on Optical Flow
* DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation
* Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization
* DiffBody: Diffusion-based Pose and Shape Editing of Human Images
* DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification
* Differentiable JPEG: The Devil is in the Details
* Differentially Private Video Activity Recognition
* Diffuse and Restore: A Region-Adaptive Diffusion Model for Identity-Preserving Blind Face Restoration
* Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
* Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition
* Diffusion models meet image counter-forensics
* Diffusion-based generation of Histopathological Whole Slide Images at a Gigapixel scale
* DISCO: Distributed Inference with Sparse Communications
* Discovering and Mitigating Biases in CLIP-based Image Editing
* Discriminator-free Unsupervised Domain Adaptation for Multi-label Image Classification
* Disentangled Pre-training for Image Matting
* Distortion-Disentangled Contrastive Learning
* Diverse Imagenet Models Transfer Better
* Do VSR Models Generalize Beyond LRS3?
* Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration
* DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction
* Domain Adaptive 3D Shape Retrieval from Monocular Images
* Domain Aligned CLIP for Few-shot Classification
* Domain Generalisation via Risk Distribution Matching
* Domain Generalization by Rejecting Extreme Augmentations
* Domain Generalization with Correlated Style Uncertainty
* Domain-Aware Knowledge Distillation for Continual Model Generalization
* DPPMask: Masked Image Modeling with Determinantal Point Processes
* DR10K: Transfer Learning Using Weak Labels for Grading Diabetic Retinopathy on DR10K Dataset
* DR2: Disentangled Recurrent Representation Learning for Data-efficient Speech Video Synthesis
* DREAM: Visual Decoding from REversing HumAn Visual SysteM
* Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving
* DTrOCR: Decoder-only Transformer for Optical Character Recognition
* Dual Domain Diffusion Guidance for 3D CBCT Metal Artifact Reduction
* Dynamic Multimodal Information Bottleneck for Multimodality Classification
* Dynamic Token-Pass Transformers for Semantic Segmentation
* EASUM: Enhancing Affective State Understanding through Joint Sentiment and Emotion Modeling for Multimodal Tasks
* ECSIC: Epipolar Cross Attention for Stereo Image Compression
* Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
* Effective Restoration of Source Knowledge in Continual Test Time Adaptation
* Effects of Markers in Training Datasets on the Accuracy of 6D Pose Estimation
* Efficient Expansion and Gradient Based Task Inference for Replay Free Incremental Learning
* Efficient Explainable Face Verification based on Similarity Score Argument Backpropagation
* Efficient Feature Distillation for Zero-shot Annotation Object Detection
* Efficient Layout-Guided Image Inpainting for Mobile Use
* Efficient MAE towards Large-Scale Vision Transformers
* Efficient Semantic Matching with Hypercolumn Correlation
* Efficient Transferability Assessment for Selection of Pre-trained Detectors
* EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies
* Ego2HandsPose: A Dataset for Egocentric Two-hand 3D Global Pose Estimation
* Egocentric Action Recognition by Capturing Hand-Object Contact and Object State
* Elusive Images: Beyond Coarse Analysis for Fine-Grained Recognition
* Embedding Task Structure for Action Detection
* Embodied Human Activity Recognition
* EmoStyle: One-Shot Facial Expression Editing Using Continuous Emotion Parameters
* Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification, An
* Empowering Unsupervised Domain Adaptation with Large-scale Pre-trained Vision-Language Models
* Enforcing Sparsity on Latent Space for Robust and Explainable Representations
* Enhancing Diverse Intra-identity Representation for Visible-Infrared Person Re-Identification
* Enhancing Multi-view Pedestrian Detection Through Generalized 3D Feature Pulling
* Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
* ENIGMA-51: Towards a Fine-Grained Understanding of Human Behavior in Industrial Scenarios
* ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration
* EResFD: Rediscovery of the Effectiveness of Standard Convolution for Lightweight Face Detection
* Estimating Blood Alcohol Level Through Facial Features for Driver Impairment Assessment
* Estimating Fog Parameters from an Image Sequence using Non-linear Optimisation
* Evaluation of Video Masked Autoencoders' Performance and Uncertainty Estimations for Driver Action and Intention Recognition
* EvDNeRF: Reconstructing Event Data with Dynamic Neural Radiance Fields
* Evidential Uncertainty Quantification: A Variance-Based Perspective
* Evolve: Enhancing Unsupervised Continual Learning with Multiple Experts
* Expanding Expressiveness of Diffusion Models with Limited Data via Self-Distillation based Fine-Tuning
* Expanding Hyperspherical Space for Few-Shot Class-Incremental Learning
* Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
* Exploiting the Signal-Leak Bias in Diffusion Models
* Exploring Adversarial Robustness of Vision Transformers in the Spectral Perspective
* Exploring the Impact of Rendering Method and Motion Quality on Model Performance when Using Multi-view Synthetic Data for Action Recognition
* FacadeNet: Conditional Facade Synthesis via Selective Editing
* Face Identity-Aware Disentanglement in StyleGAN
* Face Presentation Attack Detection by Excavating Causal Clues and Adapting Embedding Statistics
* FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation
* FarSight: A Physics-Driven Whole-Body Biometric System at Large Distance and Altitude
* Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers
* Fast Diffusion EM: a diffusion model for blind inverse problems with application to deconvolution
* Fast Sun-aligned Outdoor Scene Relighting based on TensoRF
* FastCLIPstyler: Optimisation-free Text-based Image Style Transfer Using Style Representations
* FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline
* FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data
* Favoring One Among Equals - Not a Good Idea: Many-to-one Matching for Robust Transformer based Pedestrian Detection
* Feed-Forward Latent Domain Adaptation
* FELGA: Unsupervised Fragment Embedding for Fine-Grained Cross-Modal Association
* Few-Shot Event Classification in Images using Knowledge Graphs for Prompting
* Few-shot generative model for skeleton-based human action synthesis using cross-domain adversarial learning
* Few-shot Shape Recognition by Learning Deep Shape-aware Features
* FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features
* FinderNet: A Data Augmentation Free Canonicalization aided Loop Detection and Closure technique for Point clouds in 6-DOF separation
* Fine-Grained Alignment for Cross-Modal Recipe Retrieval
* Fingervein Verification using Convolutional Multi-Head Attention Network
* FIRe: Fast Inverse Rendering using Directional and Signed Distance Functions
* FIRE: Food Image to REcipe generation
* FishTrack23: An Ensemble Underwater Dataset for Multi-Object Tracking
* Fixed Pattern Noise Removal for Multi-View Single-Sensor Infrared Camera
* Fixing Overconfidence in Dynamic Neural Networks
* FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
* FocusTune: Tuning Visual Localization through Focus-Guided Sampling
* FOSSIL: Free Open-Vocabulary Semantic Segmentation through Synthetic References Retrieval
* FOUND: Foot Optimization with Uncertain Normals for Surface Deformation Using Synthetic Data
* Foundation Model Assisted Weakly Supervised Semantic Segmentation
* FPGAN-Control: A Controllable Fingerprint Generator for Training with Synthetic Data
* Framework-agnostic Semantically-aware Global Reasoning for Segmentation
* FreMIM: Fourier Transform Meets Masked Image Modeling for Medical Image Segmentation
* Frequency Attention for Knowledge Distillation
* FRoG-MOT: Fast and Robust Generic Multiple-Object Tracking by IoU and Motion-State Associations
* From Chaos to Calibration: A Geometric Mutual Information Approach to Target-Free Camera LiDAR Extrinsic Calibration
* From Denoising Training to Test-Time Adaptation: Enhancing Domain Generalization for Medical Image Segmentation
* Fully-Automatic Reflection Removal for 360-Degree Images
* FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
* G-CASCADE: Efficient Cascaded Graph Convolutional Decoding for 2D Medical Image Segmentation
* GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification
* GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo
* GC-VTON: Predicting Globally Consistent and Occlusion Aware Local Flows with Neighborhood Integrity Preservation for Virtual Try-on
* Generalization by Adaptation: Diffusion-Based Domain Extension for Domain-Generalized Semantic Segmentation
* Generalizing to Unseen Domains in Diabetic Retinopathy Classification
* Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models
* Generation of Upright Panoramic Image from Non-upright Panoramic Image
* Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping, A
* generic and flexible regularization framework for NeRFs, A
* Geometry Loss Combination for 3D Human Pose Estimation, A
* GIPCOL: Graph-Injected Soft Prompting for Compositional Zero-Shot Learning
* GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
* Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting
* Global Occlusion-Aware Transformer for Robust Stereo Matching
* Gradient Coreset for Federated Learning
* Gradient-Guided Knowledge Distillation for Object Detectors
* Gradual Source Domain Expansion for Unsupervised Domain Adaptation
* Grafting Vision Transformers
* Graph Neural Networks for End-to-End Information Extraction from Handwritten Documents
* Graph(Graph): A Nested Graph-Based Framework for Early Accident Anticipation
* GraphFill: Deep Image Inpainting using Graphs
* GRIT: GAN Residuals for Paired Image-to-Image Translation
* Group-wise Contrastive Bottleneck for Weakly-Supervised Visual Representation Learning
* Growing Strawberries Dataset: Tracking Multiple Objects with Biological Development over an Extended Period, The
* GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
* Guided Cluster Aggregation: A Hierarchical Approach to Generalized Category Discovery
* Guided Distillation for Semi-Supervised Instance Segmentation
* HaGRID: HAnd Gesture Recognition Image Dataset
* HalluciDet: Hallucinating RGB Modality for Person Detection Through Privileged Information
* HALSIE: Hybrid Approach to Learning Segmentation by Simultaneously Exploiting Image and Event Modalities
* HAMMER: Learning Entropy Maps to Create Accurate 3D Models in Multi-View Stereo
* Handformer2T: A Lightweight Regression-based Model for Interacting Hands Pose Estimation from A Single RGB Image
* Hard Sample-aware Consistency for Low-resolution Facial Expression Recognition
* Hard-label based Small Query Black-box Adversarial Attack
* Hardware Aware Evolutionary Neural Architecture Search using Representation Similarity Metric
* Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
* HashReID: Dynamic Network with Binary Codes for Efficient Person Re-identification
* Have We Ever Encountered This Before? Retrieving Out-of-Distribution Road Obstacles from Driving Scenes
* HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation
* HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration
* HELA-VFA: A Hellinger Distance-Attention-based Feature Aggregation Network for Few-Shot Classification
* Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation
* Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis
* High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation
* High-Fidelity Zero-Shot Texture Anomaly Localization Using Feature Correspondence Analysis
* HMP: Hand Motion Priors for Pose and Shape Estimation from Video
* Holistic Representation Learning for Multitask Trajectory Anomaly Detection
* How Do Deepfakes Move? Motion Magnification for Deepfake Source Detection
* Human Motion Aware Text-to-Video Generation with Explicit Camera Control
* Hyb-NeRF: A Multiresolution Hybrid Encoding for Neural Radiance Fields
* Hybrid Graph Network for Complex Activity Detection in Video, A
* Hybrid Neural Diffeomorphic Flow for Shape Representation and Generation via Triplane
* Hybrid Sample Synthesis-based Debiasing of Classifier in Limited Data Setting
* Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin
* HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings
* I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses
* iBARLE: imBalance-Aware Room Layout Estimation
* ICF-SRSR: Invertible scale-Conditional Function for Self-Supervised Real-world Single Image Super-Resolution
* IDD-AW: A Benchmark for Safe and Robust Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather
* Identifying Label Errors in Object Detection Datasets by Loss Inspection
* IKEA Ego 3D Dataset: Understanding furniture assembly actions from ego-view 3D Point Clouds
* Image Denoising and the Generative Accumulation of Photons
* Image Labels Are All You Need for Coarse Seagrass Segmentation
* Implicit Neural Image Stitching With Enhanced and Blended Feature Reconstruction
* Implicit neural representation for change detection
* Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
* Improved Topological Preservation in 3D Axon Segmentation and Centerline Detection using Geometric Assessment-driven Topological Smoothing (GATS)
* Improving Fairness in Deepfake Detection
* Improving Fairness using Vision-Language Driven Image Augmentation
* Improving Graph Networks through Selection-based Convolution
* Improving Normalization with the James-Stein Estimator
* Improving Open-Set Semi-Supervised Learning with Self-Supervision
* Improving the Effectiveness of Deep Generative Data
* Improving the Fairness of the Min-Max Game in GANs Training
* Improving the Leaking of Augmentations in Data-Efficient GANs via Adaptive Negative Data Augmentation
* Improving Vision-and-Language Reasoning via Spatial Relations Modeling
* INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings
* Incorporating Physics Principles for Precise Human Motion Prediction
* Increasing biases can be more efficient than increasing weights
* Indoor Visual Localization using Point and Line Correspondences in dense colored point cloud
* IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting
* InfraParis: A multi-modal and multi-task autonomous driving dataset
* Instruct Me More! Random Prompting for Visual In-Context Learning
* Interaction Region Visual Transformer for Egocentric Action Anticipation
* Interactive Network Perturbation between Teacher and Students for Semi-Supervised Semantic Segmentation
* Interactive Segmentation for Diverse Gesture Types Without Context
* Interpretable Object Recognition by Semantic Prototype Analysis
* Intrinsic Hand Avatar: Illumination-aware Hand Appearance and Shape Reconstruction from Monocular RGB Video
* Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection
* IR-FRestormer: Iterative Refinement with Fourier-Based Restormer for Accelerated MRI Reconstruction
* ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification
* Iterative Multi-granular Image Editing using Diffusion Models
* JOADAA: joint online action detection and action anticipation
* Joint 3D Shape and Motion Estimation from Rolling Shutter Light-Field Images
* Joint Depth Prediction and Semantic Segmentation with Multi-View SAM
* Kaizen: Practical self-supervised continual learning with continual fine-tuning
* Label Augmentation as Inter-class Data Augmentation for Conditional Image Synthesis with Imbalanced Data
* Label Shift Estimation for Class-Imbalance Problem: A Bayesian Approach
* Label-Free Synthetic Pretraining of Object Detectors
* Late to the party? On-demand unlabeled personalized federated learning
* Latent Feature-Guided Diffusion Models for Shadow Removal
* Latent-Guided Exemplar-Based Image Re-Colorization
* LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration
* LatentPaint: Image Inpainting in Latent Space with Diffusion Models
* LaughTalk: Expressive 3D Talking Head Generation with Laughter
* LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
* Layer-wise Auto-Weighting for Non-Stationary Test-Time Adaptation
* Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection
* Learnable Cube-based Video Encryption for Privacy-Preserving Action Recognition
* Learning Better Keypoints for Multi-Object 6DoF Pose Estimation
* Learning Class and Domain Augmentations for Single-Source Open-Domain Generalization
* Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment
* Learning Intra-class Multimodal Distributions with Orthonormal Matrices
* Learning Low-Rank Latent Spaces with Simple Deterministic Autoencoder: Theoretical and Empirical Insights
* Learning Quality Labels for Robust Image Classification
* Learning Residual Elastic Warps for Image Stitching under Dirichlet Boundary Condition
* Learning Robust Deep Visual Representations from EEG Brain Recordings
* Learning Saliency From Fixations
* Learning the What and How of Annotation in Video Object Segmentation
* Learning to Adapt CLIP for Few-Shot Monocular Depth Estimation
* Learning to Compose SuperWeights for Neural Parameter Allocation Search
* Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation
* Learning to generate training datasets for robust semantic segmentation
* Learning to Read Analog Gauges from Synthetic Data
* Learning to Recognize Occluded and Small Objects with Partial Inputs
* Learning Transferable Representations for Image Anomaly Localization Using Dense Pretraining
* Learning Visual Body-shape-Aware Embeddings for Fashion Compatibility
* Learning-based Spotlight Position Optimization for Non-Line-of-Sight Human Localization and Posture Classification
* LensNeRF: Rethinking Volume Rendering based on Thin-Lens Camera Model
* Let the Beat Follow You - Creating Interactive Drum Sounds From Body Rhythm
* Let's Observe Them Over Time: An Improved Pedestrian Attribute Recognition Approach
* Letting 3D Guide the Way: 3D Guided 2D Few-Shot Image Classification
* Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement
* Leveraging Next-Active Objects for Context-Aware Anticipation in Egocentric Videos
* Leveraging Synthetic Data to Learn Video Stabilization Under Adverse Conditions
* Leveraging Task-Specific Pre-Training to Reason across Images and Videos
* Leveraging the Power of Data Augmentation for Transformer-based Tracking
* LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis
* LidarCLIP or: How I Learned to Talk to Point Clouds
* Lightweight Delivery Detection on Doorbell Cameras
* Lightweight Portrait Matting via Regional Attention and Refinement
* Lightweight Thermal Super-Resolution and Object Detection for Robust Perception in Adverse Weather Conditions
* Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
* Link Prediction for Flow-Driven Spatial Networks
* Linking convolutional kernel size to generalization bias in face analysis CNNs
* LInKs "Lifting Independent Keypoints" - Partial Pose Lifting for Occlusion Handling with Improved Accuracy in 2D-3D Human Pose Estimation
* LipAT: Beyond Style Transfer for Controllable Neural Simulation of Lipstick using Cosmetic Attributes
* LIVENet: A novel network for real-world low-light image denoising and enhancement
* Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation
* Location-Aware Self-Supervised Transformers for Semantic Segmentation
* LongFormer: Longitudinal Transformer for Alzheimer's Disease Classification with Structural MRIs
* Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval
* LP-OVOD: Open-Vocabulary Object Detection by Linear Probing
* M33D: Learning 3D priors using Multi-Modal Masked Autoencoders for 2D image and video understanding
* MACP: Efficient Model Adaptation for Cooperative Perception
* MAdVerse: A Hierarchical Dataset of Multi-Lingual Ads from Diverse Sources and Categories
* MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds
* MagneticPillars: Efficient Point Cloud Registration through Hierarchized Birds-Eye-View Cell Correspondence Refinement
* MarsLS-Net: Martian Landslides Segmentation Network and Benchmark Dataset
* MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
* Masked Collaborative Contrast for Weakly Supervised Semantic Segmentation
* Masked Event Modeling: Self-Supervised Pretraining for Event Cameras
* Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where
* Maximum Knowledge Orthogonality Reconstruction with Gradients in Federated Learning
* Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation
* MEGANet: Multi-Scale Edge-Guided Attention Network for Weak Boundary Polyp Segmentation
* Membership Inference Attack Using Self Influence Functions
* Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning
* Meta-Learned Kernel For Blind Super-Resolution Kernel Estimation
* MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
* MetaVers: Meta-Learned Versatile Representations for Personalized Federated Learning
* MFT: Long-Term Tracking of Every Pixel
* MGM-AE: Self-Supervised Learning on 3D Shape Using Mesh Graph Masked Autoencoders
* MICS: Midpoint Interpolation to Learn Compact and Separated Representations for Few-Shot Class-Incremental Learning
* MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition
* Mini but Mighty: Finetuning ViTs with Mini Adapters
* Minimizing Layerwise Activation Norm Improves Generalization in Federated Learning
* Mining and Unifying Heterogeneous Contrastive Relations for Weakly-Supervised Actor-Action Segmentation
* Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation
* MIST: Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder
* MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition
* Mitigate Domain Shift by Primary-Auxiliary Objectives Association for Generalizing Person ReID
* MIVC: Multiple Instance Visual Component for Visual-Language Models
* Mixing Gradients in Neural Networks as a Strategy to Enhance Privacy in Federated Learning
* MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
* MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device
* Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval
* Monocular 3D Object Detection with LiDAR Guided Semi Supervised Active Learning
* MonoProb: Self-Supervised Monocular Depth Estimation with Interpretable Uncertainty
* MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning
* MOPA: Modular Object Navigation with PointGoal Agents
* MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video
* Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement
* MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network
* MotionGPT: Human Motion Synthesis with Improved Diversity and Realism via GPT-3 Prompting
* Movie Genre Classification by Language Augmentation and Shot Sampling
* MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction
* MS-EVS: Multispectral event-based vision for deep learning based face detection
* MSCC: Multi-Scale Transformers for Camera Calibration
* Multi-Class Segmentation from Aerial Views using Recursive Noise Diffusion
* Multi-level Attention Aggregation for Aesthetic Face Relighting
* Multi-Modal Gaze Following in Conversational Scenarios
* Multi-Source Domain Adaptation for Object Detection with Prototype-based Mean Teacher
* Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior
* Multi-view Classification Using Hybrid Fusion and Mutual Distillation
* Multimodal Benchmark and Improved Architecture for Zero Shot Learning, A
* Multimodal Channel-Mixing: Channel and Spatial Masked AutoEncoder on Facial Action Unit Detection
* Multimodal Deep Learning for Remote Stress Estimation Using CCT-LSTM
* Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion
* Multispectral Imaging for Differential Face Morphing Attack Detection: A Preliminary Study
* Multitask Vision-Language Prompt Tuning
* MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis
* Natural Light Can Also be Dangerous: Traffic Sign Misinterpretation Under Adversarial Natural Light Attacks
* NCIS: Neural Contextual Iterative Smoothing for Purifying Adversarial Perturbations
* NeRFEditor: Differentiable Style Decomposition for 3D Scene Editing
* Nested Diffusion Processes for Anytime Image Generation
* Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields
* Neural Height-Map Approach for the Binocular Photometric Stereo Problem, A
* Neural Image Compression Using Masked Sparse Visual Representation
* Neural Style Protection: Counteracting Unauthorized Neural Style Transfer
* Neural Textured Deformable Meshes for Robust Analysis-by-Synthesis
* NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction
* NOMAD: A Natural, Occluded, Multi-scale Aerial Dataset, for Emergency Response Scenarios
* NVAutoNet: Fast and Accurate 360° 3D Visual Perception For Self Driving
* Object Aware Contrastive Prior for Interactive Image Segmentation
* Object Re-Identification from Point Clouds
* Object-centric Video Representation for Long-term Action Anticipation
* Occlusion Sensitivity Analysis with Augmentation Subspace Perturbation in Deep Feature Space
* OE-CTST: Outlier-Embedded Cross Temporal Scale Transformer for Weakly-supervised Video Anomaly Detection
* Offline-to-Online Knowledge Distillation for Video Instance Segmentation
* OmniVec: Learning robust representations with cross modal sharing
* On Manipulating Scene Text in the Wild with Diffusion Models
* On the Fly Neural Style Smoothing for Risk-Averse Domain Generalization
* On the Importance of Large Objects in CNN Based Object Detection Algorithms
* On the Quantification of Image Reconstruction Uncertainty without Training Data
* One Style is All You Need to Generate a Video
* One-Shot Learning Approach to Document Layout Segmentation of Ancient Arabic Manuscripts, A
* Online Class-Incremental Learning For Real-World Food Image Classification
* OOD Aware Supervised Contrastive Learning
* Open-NeRF: Towards Open Vocabulary NeRF Decomposition
* Open-Set Object Detection By Aligning Known Class Representations
* Opinion Unaware Image Quality Assessment via Adversarial Convolutional Variational Autoencoder
* OptFlow: Fast Optimization-based Scene Flow Estimation without Supervision
* Optical Flow Domain Adaptation via Target Style Transfer
* Optimizing Long-Term Robot Tracking with Multi-Platform Sensor Fusion
* Ordinal Classification with Distance Regularization for Robust Brain Age Prediction
* OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation
* Out-of-Distribution Detection with Logical Reasoning
* OVeNet: Offset Vector Network for Semantic Segmentation
* Overcoming Catastrophic Forgetting for Multi-Label Class-Incremental Learning
* P-Age: Pexels Dataset for Robust Spatio-Temporal Apparent Age Classification
* P2D: Plug and Play Discriminator for accelerating GAN frameworks
* Painterly Image Harmonization via Adversarial Residual Learning
* PAIR: Perception Aided Image Restoration for Natural Driving Conditions
* Paleographer's Eye ex machina: Using Computer Vision to Assist Humanists in Scribal Hand Identification, The
* Panelformer: Sewing Pattern Reconstruction from 2D Garment Images
* Partial Binarization of Neural Networks for Budget-Aware Efficient Learning
* ParticleNeRF: A Particle-Based Encoding for Online Neural Radiance Fields
* Patch-based Selection and Refinement for Early Object Detection
* PatchRefineNet: Improving Binary Segmentation by Incorporating Signals from Optimal Patch-wise Binarization
* PathLDM: Text conditioned Latent Diffusion Model for Histopathology
* PATROL: Privacy-Oriented Pruning for Collaborative Inference Against Model Inversion Attacks
* PDA-RWSR: Pixel-Wise Degradation Adaptive Real-World Super-Resolution
* PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment
* Permutation-Aware Activity Segmentation via Unsupervised Frame-to-Segment Alignment
* Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention
* PETIT-GAN: Physically Enhanced Thermal Image-Translating Generative Adversarial Network
* PGVT: Pose-Guided Video Transformer for Fine-Grained Action Recognition
* PHG-Net: Persistent Homology Guided Medical Image Classification
* PhISH-Net: Physics Inspired System for High Resolution Underwater Image Enhancement
* Physical-space Multi-body Mesh Detection Achieved by Local Alignment and Global Dense Learning
* PIDiffu: Pixel-aligned Diffusion Model for High-Fidelity Clothed Human Reconstruction
* Pixel Matching Network for Cross-Domain Few-Shot Segmentation
* Pixel-Grounded Prototypical Part Networks
* PlantPlotGAN: A Physics-Informed Generative Adversarial Network for Plant Disease Prediction
* Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning
* PMI Sampler: Patch Similarity Guided Frame Selection For Aerial Action Recognition
* PMVC: Promoting Multi-View Consistency for 3D Scene Reconstruction
* Point-DynRF: Point-based Dynamic Radiance Fields from a Monocular Video
* PointCT: Point Central Transformer Network for Weakly-supervised Point Cloud Semantic Segmentation
* POISE: Pose Guided Human Silhouette Extraction under Occlusions
* Polarimetric PatchMatch Multi-View Stereo
* PolyMaX: General Dense Prediction with Mask Transformer
* POP-VQA: Privacy preserving, On-device, Personalized Visual Question Answering
* PoseDiff: Pose-conditioned Multimodal Diffusion Model for Unbounded Scene Synthesis from Sparse Inputs
* PreciseDebias: An Automatic Prompt Engineering Approach for Generative AI to Mitigate Image Demographic Biases
* Preserving Image Properties Through Initializations in Diffusion Models
* PressureVision++: Estimating Fingertip Pressure from Diverse RGB Images
* Privacy-Enhancing Person Re-identification Framework: A Dual-Stage Approach
* PrivObfNet: A Weakly Supervised Semantic Segmentation Model for Data Protection
* ProcSim: Proxy-based Confidence for Robust Similarity Learning
* Progressive Hypothesis Transformer for 3D Human Mesh Recovery
* PromptAD: Zero-shot Anomaly Detection using Text Prompts
* Prompting classes: Exploring the Power of Prompt Class Learning in Weakly Supervised Semantic Segmentation
* PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
* ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation
* Prototype Learning for Explainable Brain Age Prediction
* Prototypical Contrastive Network for Imbalanced Aerial Image Segmentation
* ProxEdit: Improving Tuning-Free Real Image Editing with Proximal Guidance
* Pruning from Scratch via Shared Pruning Module and Nuclear norm-based Regularization
* pSTarC: Pseudo Source Guided Target Clustering for Fully Test-Time Adaptation
* PsyMo: A Dataset for Estimating Self-Reported Psychological Traits from Gait
* Query-guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch
* RADIO: Reference-Agnostic Dubbing Video Synthesis
* Random Walks for Temporal Action Segmentation with Timestamp Supervision
* Randomized Adversarial Style Perturbations for Domain Generalization
* Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning
* RankDVQA: Deep VQA based on Ranking-inspired Hybrid Training
* Ray Deformation Networks for Novel View Synthesis of Refractive Objects
* Re-Evaluating LiDAR Scene Flow
* Re-VoxelDet: Rethinking Neck and Head Architectures for High-Performance Voxel-based 3D Detection
* Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings
* Real-time 6-DoF Pose Estimation by an Event-based Camera using Active LED Markers
* Real-Time Polyp Detection in Colonoscopy using Lightweight Transformer
* Real-Time User-guided Adaptive Colorization with Vision Transformer
* Real-Time Weakly Supervised Video Anomaly Detection
* REALM: Robust Entropy Adaptive Loss Minimization for Improved Single-Sample Test-Time Adaptation
* ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation
* Recognition of Unseen Bird Species by Learning from Field Guides
* ReConPatch: Contrastive Patch Representation Learning for Industrial Anomaly Detection
* RecycleNet: Latent Feature Recycling Leads to Iterative Decision Refinement
* Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks
* Reference-based Restoration of Digitized Analog Videotapes
* Refine and Redistribute: Multi-Domain Fusion and Dynamic Label Assignment for Unbiased Scene Graph Generation
* Registered and Segmented Deformable Object Reconstruction from a Single View Point Cloud
* Removing the Quality Tax in Controllable Face Generation
* Repetitive Action Counting with Motion Feature Learning
* Residual Graph Convolutional Network for Bird's-Eye-View Semantic Segmentation
* Restoring Degraded Old Films with Recursive Recurrent Transformer Networks
* Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing
* Rethinking Knowledge Distillation with Raw Features for Semantic Segmentation
* Rethinking Multimodal Content Moderation from an Asymmetric Angle with Mixed-modality
* Rethinking Visibility in Human Pose Estimation: Occluded Pose Reasoning via Transformers
* Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data
* Revisiting Latent Space of GAN Inversion for Robust Real Image Editing
* Revisiting Pixel-Level Contrastive Pre-Training on Scene Images
* Revisiting Token Pruning for Object Detection and Instance Segmentation
* Revolutionize the Oceanic Drone RGB Imagery with Pioneering Sun Glint Detection and Removal Techniques
* RGB-D Mapping and Tracking in a Plenoxel Radiance Field
* RGB-X Object Detection via Scene-Specific Fusion Modules
* RGBT-Dog: A Parametric Model and Pose Prior For Canine Body Analysis Data Creation
* RIMeshGNN: A Rotation-Invariant Graph Neural Network for Mesh Classification
* RMFER: Semi-supervised Contrastive Learning for Facial Expression Recognition with Reaction Mashup Video
* Robust Category-Level 3D Pose Estimation from Diffusion-Enhanced Synthetic Data
* Robust Diffusion Modeling Framework for Radar Camera 3D Object Detection, A
* Robust Eye Blink Detection Using Dual Embedding Video Vision Transformer
* Robust Feature Learning and Global Variance-Driven Classifier Alignment for Long-Tail Class Incremental Learning
* Robust Learning via Conditional Prevalence Adjustment
* Robust Object Detection in Challenging Weather Conditions
* Robust Source-Free Domain Adaptation for Fundus Image Segmentation
* Robust TRISO-fueled Pebble Identification by Digit Recognition
* Robust Unsupervised Domain Adaptation through Negative-View Regularization
* RobustCLEVR: A Benchmark and Framework for Evaluating Robustness in Object-centric Learning
* Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-based Gaze Estimation
* RPCANet: Deep Unfolding RPCA Based Infrared Small Target Detection
* RS2G: Data-Driven Scene-Graph Extraction and Embedding for Robust Autonomous Perception and Scenario Understanding
* RSMPNet: Relationship Guided Semantic Map Prediction
* S3AD: Semi-supervised Small Apple Detection in Orchard Environments
* Salient Object Detection for Images Taken by People With Vision Impairments
* SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images
* SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers
* SC-MIL: Supervised Contrastive Multiple Instance Learning for Imbalanced Classification in Pathology
* Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution
* ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
* Scene Text Image Super-resolution based on Text-conditional Diffusion Models
* SciOL and MuLMS-Img: Introducing A Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain
* SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data
* SCUNet++: Swin-UNet and CNN Bottleneck Hybrid Architecture with Multi-Fusion Dense Skip Connection for Pulmonary Embolism CT Image Segmentation
* SDNet: An Extremely Efficient Portrait Matting Model via Self-Distillation
* SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification
* Second-Order Graph ODEs for Multi-Agent Trajectory Forecasting
* Seeing Stars: Learned Star Localization for Narrow-Field Astrometry
* Segment anything, from space?
* Self-Annotated 3D Geometric Learning for Smeared Points Removal
* Self-Sampling Meta SAM: Enhancing Few-shot Medical Image Segmentation with Meta-Learning
* Self-Supervised Denoising Transformer with Gaussian Process
* Self-Supervised Edge Detection Reconstruction for Topology-Informed 3D Axon Segmentation and Centerline Detection
* Self-Supervised Learning for Place Representation Generalization across Appearance Changes
* Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction
* Self-supervised Learning of Semantic Correspondence Using Web Videos
* Self-Supervised Learning with Masked Autoencoders for Teeth Segmentation from Intra-oral 3D Scans
* Self-Supervised Relation Alignment for Scene Graph Generation
* Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features
* SEMA: Semantic Attention for Capturing Long-Range Dependencies in Egocentric Lifelogs
* Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval
* Semantic Generative Augmentations for Few-Shot Counting
* Semantic Labels-Aware Transformer Model for Searching over a Large Collection of Lecture-Slides
* Semantic Transfer from Head to Tail: Enlarging Tail Margin for Long-Tailed Visual Recognition
* Semantic-aware Video Representation for Few-shot Action Recognition
* Semi-Supervised Scene Change Detection by Distillation from Feature-metric Alignment
* Semi-Supervised Semantic Depth Estimation using Symbiotic Transformer and NearFarMix Augmentation
* SemST: Semantically Consistent Multi-Scale Image Translation via Structure-Texture Alignment
* Separable Self and Mixed Attention Transformers for Efficient Object Tracking
* SequenceMatch: Revisiting the design of weak-strong augmentations for Semi-supervised learning
* Sequential Learning-based Approach for Monocular Human Performance Capture, A
* Sequential Transformer for End-to-End Video Text Detection
* SGRec3D: Self-Supervised 3D Scene Graph Learning via Object-Level Scene Reconstruction
* ShadowSense: Unsupervised Domain Adaptation and Feature Fusion for Shadow-Agnostic Tree Crown Detection from RGB-Thermal Drone Imagery
* Shape from Shading for Robotic Manipulation
* Shape-biased CNNs are Not Always Superior in Out-of-Distribution Robustness
* Shape-Guided Diffusion with Inside-Outside Attention
* ShARc: Shape and Appearance Recognition for Person Identification In-the-wild
* Sharp-NeRF: Grid-based Fast Deblurring Neural Radiance Fields using Sharpness Prior
* Show Your Face: Restoring Complete Facial Images from Partial Observations for VR Meeting
* SICKLE: A Multi-Sensor Satellite Imagery Dataset Annotated with Multiple Key Cropping Parameters
* SigmML: Metric meta-learning for Writer Independent Offline Signature Verification in the Space of SPD Matrices
* Sign Language Production with Latent Motion Transformer
* SimA: Simple Softmax-free Attention for Vision Transformers
* Simple Post-Training Robustness using Test Time Augmentations and Random Forest
* Simple Token-Level Confidence Improves Caption Correctness
* SimpliMix: A Simplified Manifold Mixup for Few-shot Point Cloud Classification
* Single Domain Generalization via Normalised Cross-correlation Based Convolutions
* Single Frame Semantic Segmentation Using Multi-Modal Spherical Images
* Single-Image Deblurring, Trajectory and Shape Recovery of Fast Moving Objects with Denoising Diffusion Probabilistic Models
* Sketch-based Video Object Localization
* Slice and Conquer: A Planar-to-3D Framework for Efficient Interactive Segmentation of Volumetric Images
* SLoSH: Set Locality Sensitive Hashing via Sliced-Wasserstein Embeddings
* Small Objects Matters in Weakly-supervised Semantic Segmentation
* So you think you can track?
* SOAP: Cross-sensor Domain Adaptation for 3D Object Detection Using Stationary Object Aggregation Pseudo-labelling
* Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data
* Solving the Plane-Sphere Ambiguity in Top-Down Structure-from-Motion
* Sound3DVDet: 3D Sound Source Detection using Multiview Microphone Array and RGB Images
* Source-Guided Similarity Preservation for Online Person Re-Identification
* Sparse Convolutional Networks for Surface Reconstruction from Noisy Point Clouds
* Spatio-temporal Filter Analysis Improves 3D-CNN For Action Classification
* SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective
* Spectroformer: Multi-Domain Query Cascaded Transformer Network For Underwater Image Enhancement
* Specular Object Reconstruction Behind Frosted Glass by Differentiable Rendering
* SphereCraft: A Dataset for Spherical Keypoint Detection, Matching and Camera Pose Estimation
* Spiking Denoising Diffusion Probabilistic Models
* Spiking Neural Networks for Active Time-Resolved SPAD Imaging
* SSP: Semi-signed prioritized neural fitting for surface reconstruction from unoriented point clouds
* SSVOD: Semi-Supervised Video Object Detection with Sparse Annotations
* Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning
* STEP - Towards Structured Scene-Text Spotting
* Stereo Conversion with Disparity-Aware Warping, Compositing and Inpainting
* Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality
* Stochastic Binary Network for Universal Domain Adaptation
* StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction
* StyleAvatar: Stylizing Animatable Head Avatars
* StyleGAN-Fusion: Diffusion Guided Domain Adaptation of Image Generators
* StyleGenes: Discrete and Efficient Latent Distributions for GANs
* StyLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based Domain Generalization
* SupeRVol: Super-Resolution Shape and Reflectance Estimation in Inverse Volume Rendering
* Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation
* SynergyNet: Bridging the Gap between Discrete and Continuous Representations for Precise Medical Image Segmentation
* Synthesizing Anyone, Anywhere, in Any Pose
* Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
* SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and Building Change Detection
* SynthProv: Interpretable Framework for Profiling Identity Leakage
* Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering
* Taming Normalizing Flows
* TAMPAR: Visual Tampering Detection for Parcel Logistics in Postal Supply Chains
* Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations
* TCP: Triplet Contrastive-relationship Preserving for Class-Incremental Learning
* TEGLO: High Fidelity Canonical Texture Mapping from Single-View Images
* Temporal Context Enhanced Referring Video Object Segmentation
* Temporally-Consistent Video Semantic Segmentation with Bidirectional Occlusion-guided Feature Propagation
* Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning
* Text-to-image Editing by Image Information Removal
* Text-to-Image Models for Counterfactual Explanations: A Black-Box Approach
* Textron: Weakly Supervised Multilingual Text Detection through Data Programming
* Textual Alchemy: CoFormer for Scene Text Understanding
* THInImg: Cross-modal Steganography for Presenting Talking Heads in Images
* Think before You Simulate: Symbolic Reasoning to Orchestrate Neural Computation for Counterfactual Question Answering
* TIAM - A Metric for Evaluating Alignment in Text-to-Image Generation
* Time to Shine: Fine-Tuning Object Detection Models with Synthetic Adverse Weather Images
* Token Fusion: Bridging the Gap between Token Pruning and Token Merging
* Top-Down Beats Bottom-Up in 3D Instance Segmentation
* Torque based Structured Pruning for Deep Neural Network
* Toward Planet-Wide Traffic Camera Calibration
* Towards a Dynamic Vision Sensor-based Insect Camera Trap
* Towards Accurate Disease Segmentation in Plant Images: A Comprehensive Dataset Creation and Network Evaluation
* Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
* Towards Better Structured Pruning Saliency by Reorganizing Convolution
* Towards Diverse and Consistent Typography Generation
* Towards More Realistic Membership Inference Attacks on Large Diffusion Models
* Towards Realistic Generative 3D Face Models
* Towards Visual Saliency Explanations of Face Verification
* TPSeNCE: Towards Artifact-Free Realistic Rain Generation for Deraining and Object Detection in Rain
* Tracking Skiers from the Top to the Bottom
* Tracking Tiny Insects in Cluttered Natural Environments using Refinable Recurrent Neural Networks
* Training Ensembles with Inliers and Outliers for Semi-supervised Active Learning
* Training-Based Model Refinement and Representation Disagreement for Semi-Supervised Object Detection
* Training-free Content Injection using h-space in Diffusion Models
* Training-Free Layout Control with Cross-Attention Guidance
* Training-free Object Counting with Prompts
* TransFed: A way to epitomize Focal Modulation using Transformer-based Federated Learning
* TransRadar: Adaptive-Directional Transformer for Real-Time Multi-View Radar Semantic Segmentation
* TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval
* TriPlaneNet: An Encoder for EG3D Inversion
* Triplet Attention Transformer for Spatiotemporal Predictive Learning
* TSA2: Temporal Segment Adaptation and Aggregation for Video Harmonization
* TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding
* Tunable Hybrid Proposal Networks for the Open World
* U3DS3: Unsupervised 3D Semantic Scene Segmentation
* UGPNet: Universal Generative Prior for Image Restoration
* Uncertainty Estimation in Instance Segmentation with Star-convex Shapes
* Uncertainty-weighted Loss Functions for Improved Adversarial Attacks on Semantic Segmentation
* Understanding Dark Scenes by Contrasting Multi-Modal Observations
* Understanding Hyperbolic Metric Learning through Hard Negative Sampling
* Unified Concept Editing in Diffusion Models
* United We Stand, Divided We Fall: UnityGraph for Unsupervised Procedure Learning from Videos
* Universal Semi-supervised Model Adaptation via Collaborative Consistency Training
* Universal Test-time Adaptation through Weight Ensembling, Diversity Weighting, and Prior Correction
* UNSPAT: Uncertainty-Guided SpatioTemporal Transformer for 3D Human Pose and Shape Estimation on Videos
* Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling
* Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics
* Unsupervised Co-Generation of Foreground-Background Segmentation from Text-to-Image Synthesis
* Unsupervised Domain Adaptation for Semantic Segmentation with Pseudo Label Self-Refinement
* Unsupervised Domain Adaptation of MRI Skull-stripping Trained on Adult Data to Newborns
* Unsupervised Event-Based Video Reconstruction
* Unsupervised Exemplar-Based Image-to-Image Translation and Cascaded Vision Transformers for Tagged and Untagged Cardiac Cine MRI Registration
* Unsupervised Graphic Layout Grouping with Transformers
* Unsupervised Model-based Learning for Simultaneous Video Deflickering and Deblotching
* UOW-Vessel: A Benchmark Dataset of High-Resolution Optical Satellite Images for Vessel Detection and Segmentation
* USDN: A Unified Sample-wise Dynamic Network with Mixed-Precision and Early-Exit
* Using Early Readouts to Mediate Featural Bias in Distillation
* VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data
* VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs
* VEATIC: Video-based Emotion and Affect Tracking in Context Dataset
* Video Instance Matting
* Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation
* VideoFACT: Detecting Video Forgeries Using Attention, Scene Context, and Forensic Traces
* Vikriti-ID: A Novel Approach For Real Looking Fingerprint Data-set Generation
* Vision Transformer for Multispectral Satellite Imagery: Advancing Landcover Classification
* Visual Active Search Framework for Geospatial Exploration, A
* Visual Narratives: Large-scale Hierarchical Classification of Art-historical Images
* Visually Guided Audio Source Separation with Meta Consistency Learning
* VMFormer: End-to-End Video Matting with Transformer
* Volumetric Disentanglement for 3D Scene Manipulation
* Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos
* WalkFormer: Point Cloud Completion via Guided Walks
* Watch Where You Head: A View-biased Domain Gap in Gait Recognition and Unsupervised Adaptation
* WATCH: Wide-Area Terrestrial Change Hypercube
* WaveMixSR: Resource-efficient Neural Network for Image Super-resolution
* Weakly-supervised deepfake localization in diffusion-generated images
* Weakly-Supervised Representation Learning for Video Alignment and Analysis
* What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion
* What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection
* What's Outside the Intersection? Fine-grained Error Analysis for Semantic Segmentation Beyond IoU
* When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation with Weak-and-Noisy Supervision
* WildlifeDatasets: An open-source toolkit for animal re-identification
* Wino Vidi Vici: Conquering Numerical Instability of 8-bit Winograd Convolution for Accurate Inference Acceleration on Edge
* You Can Run but not Hide: Improving Gait Recognition with Intrinsic Occlusion Type Awareness
* ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection
* Zero-shot Building Attribute Extraction from Large-Scale Vision and Language Models
* Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
* ZIGNeRF: Zero-shot 3D Scene Representation with Invertible Generative Neural Radiance Fields
* ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding
* Adaptive Control Techniques for Dynamic Visual Repositioning of Hand-Eye Robotic Systems
* Algorithms for a Fast Confocal Optical Inspection System
* Automated direct patterned wafer inspection
* Autonomous Landing Of Airplanes By Dynamic Machine Vision
* Cartrack: Computer Vision-Based Car Following
* Curve Recognition Using B-Spline Representation
* Fiber identification in microscopy by ridge detection and grouping
* Interactive map conversion: combining machine vision and human input
* Interactive Road Finding For Aerial Images
* Interpolation of cinematic sequences
* Liquid Crystal Polarization Camera
* Multiple Object Tracking System with Three Level Continuous Processes
* Multiscale Analysis Model Applied to Natural Surfaces, A
* new methodology for isolating and diagnosing inconsistencies in image matching, as applied to the analysis of 2-D electrophoretic gels, A
* New Visual Invariants for Obstacle Detection Using Optical Flow Induced from General Motion
* Performance Assessment of Model-Based Tracking
* Point Target Detection in Spatially Varying Clutter
* Projectile impact detection and performance evaluation using machine vision
* PROMAP: A System for Analysis of Topographic Maps
* Real time color purity and convergence measurement algorithms for automatic ITC adjustment system
* Recovering Building Structures from Stereo
* Registar Machine: From Conception to Installation, The
* Restoration of scanning probe microscope images
* Scale-space clustering and classification of SAR images with numerous attributes and classes
* segmentation method for multi-connected particle delineation, A
* segmentation-free approach to OCR, A
* Shadow Handler in a Video-Based Real-Time Traffic Monitoring System, A
* Shape Analysis Model with Applications to a Character Recognition System, A
* Shape recovery methods for visual inspection
* System for Obstacle Detection During Rotorcraft Low-Altitude Flight, A
* System-level design of specialized VLSI hardware for computing relative orientation
* Target Tracking and Range Estimation Using an Image Sequence
* vision system for inspection of ball bonds in integrated circuits, A
* Visual Processing for Autonomous Driving
* Visually Guided Mobile Robot Acting in Indoor Environments, A
* Voice-Bandwidth Visual Communication Through Logmaps: The Telecortex
* Acquisition of 3D Structure of Selectable Quality from Image Streams
* Anatomy of a hand-filled form reader
* Application constraints in the design of an automatic reading device for analog display instruments
* Application of optical flow for automated overtaking control
* Application of the controlled active vision framework to robotic and transportation problems
* automated stereoscopic coal profiling system: CCLPS, An
* Automatic classification of wooden cabinet doors
* Binocular gaze holding of a moving object with the active stereo vision system
* Compilation of Mosaics from Separately Scanned Line Drawings
* Frameless registration of MR and CT 3D volumetric data sets
* Genetic labeling and its application to depalletizing robot vision
* Image Mosaicing for Tele-Reality Applications
* Knowledge-based interpretation of thyroid scintigrams
* Leukocyte classifications by size functions
* Method for Recognition and Localization of Generic Objects for Indoor Navigation, A
* Methodology for Evaluating Range Image Segmentation Techniques, A
* Model Supported Exploitation: Quick Look, Detection and Counting, and Change Detection
* Model Validation for Change Detection
* Model-based path finding using adjacent area shape
* Modelling Issues in Vision Based Aircraft Navigation During Landing
* Morphological Model-Driven Approach to Real-Time Road Boundary Detection for Vision-Based Automotive Systems, A
* Parameterisation of a Stochastic Model for Human Face Identification
* Practical Obstacle Detection and Avoidance System, A
* Precise visual inspection for LSI wafer patterns using subpixel image alignment
* Real-time Scene Stabilization and Mosaic Construction
* Real-Time Traffic Monitoring
* Recognizing a facial image from a police sketch
* Recursive Identification of Gesture Inputs Using Hidden Markov Models
* Robust Cognitive Approach to Traffic Scene Analysis, A
* System for Aircraft Recognition in Perspective Aerial Images, A
* system for automated iris recognition, A
* Task Driven Perceptual Organization for Extraction of Rooftop Polygons
* Unified Recognition and Stereo Vision System for Size Assessment of Fish, A
* Using modeling and fuzzy logic to detect and track microvessels in conjunctiva images
* Visual Servoing using Correlation Filters
* 3-D Real-Time Gesture Recognition Using Proximity Spaces
* 3L Fitting of Higher Degree Implicit Polynomials
* Adaptive Quantization of Color Space for Recognition of Finished Wooden Components
* Analysis of Moire Patterns in Non-Uniformly Sampled Halftones
* Automated Solder Joint Inspection System Using Optical 3D Image Detector
* Automated Visual Inspection of Solder Joints Using 2D and 3D Features, An
* Automatic Measurement of Vertebral Shape Using Active Shape Models
* Cartographic Indexing into a Database of Remotely Sensed Images
* Cartographic Matching onto Millimetre Radar Images
* Choreographed Scope Maneuvering in Robotically-Assisted Laparoscopy with Active Vision Guidance
* Color and Texture Fusion: Application to Aerial Image Segmentation and GIS Updating
* Computer Vision System to Detect 3-D Rectangular Solids, A
* Control of Scene Reconstruction Using Explicit Knowledge
* Detecting Suspicious Background Changes in Video Surveillance of Busy Scenes
* Detection of Obstacles on Runway Using Ego-Motion Compensation and Tracking of Significant Features
* Document Layout Structure Extraction Using Bounding Boxes of Different Entities
* Efficient Image Warping and Super-Resolution
* Evolutive OCR System Based on Continuous Learning, An
* Fast Range Image Segmentation Using High-Level Segmentation Primitives
* Fingerprint Enhancement
* Handwritten Numeral Recognition Using Personal Handwriting Characteristics Based on Clustering Method
* Histogram Refinement for Content-Based Image Retrieval
* Identifying Nude Pictures
* Image Processing for Computer-Aided Diagnosis of Lung Cancer by CT (LSCT)
* Machine Vision System for the Automated Classification and Counting of Neurons in 3-D Brain Tissue Samples, A
* Mosaic Image Generation on a Flattened Gaussian Sphere
* Mosaicing of Paintings on Curved Surfaces
* Nonparametric Correction of Distortion
* Novel Approach to Computer-Aided Diagnosis of Mammographic Images, A
* Object Recognition Using Multiple View Invariance Based on Complex Features
* Passive Navigation Using Focus of Expansion
* Performance Evaluation of People Tracking Systems
* Placing Observers to Cover a Polyhedral Terrain in Polynomial Time
* Pose Estimation of Artificial Knee Implants in Fluoroscopy Images Using a Template Matching Technique
* Position Estimation from Outdoor Visual Landmarks for Teleoperation of Lunar Rovers
* Real time tracking of borescope tip pose
* Real-Time Face Tracker, A
* Real-Time Recognition of Activity Using Temporal Templates
* Real-Time Vision System for Automatic Traffic Monitoring Based on 2D Spatio-Temporal Images, A
* Real-Time Vision-Based 3D Motion Estimation System for Positioning and Trajectory Following, A
* Robust Automatic Target Recognition in Second Generation FLIR Images
* System for Detection of Internal Log Defects by Computer Analysis of Axial CT Images, A
* VeggieVision: A Produce Recognition System
* Video Indexing Through Integration of Syntactic and Semantic Features
* 3-D Cardiac Volume Analysis Using Magnetic Resonance Imaging
* 3D Reconstruction of Environments for Virtual Collaboration
* Analysis of the Tongue Surface Movement Using a Spatiotemporally Coherent Deformable Model
* Applications of Omnidirectional Imaging: Multi-body Tracking and Remote Reality
* Applications of the Geometry of Digital Spaces to Medical Imaging
* Applying Super-Resolution to Panoramic Mosaics
* Automatic Contour Detection by Encoding Knowledge into Active Contour Models
* Automatic Interpretation of Contour Lines by Using External Data
* Bimodal System for Interactive Indexing and Retrieval of Pathology Images
* Building and Using Hypervideos
* Building Pixel Classifiers Using the Interactive Teacher/Learner (ITL) System
* Can a Computer see the Beating Heart from snow-storm Images?
* Catadioptric Video Sensors
* Comparative Studies of 3-D Textural Features and Their Reliability in Terrain Classification
* Computer Vision System for Lumber Production Planning, A
* Cylicon: Software Package for 3D Reconstruction of Industrial Pipelines
* Dynamic 3D Stabilization for Video CG Composite
* Efficient Computation of the Most Probable Motion from Fuzzy Correspondences
* Environment for Interactive Video Organization, An
* Face Recognition Using a DCT-HMM Approach
* Generalizing Over Aspect and Location for Rooftop Detection
* Indexing Flowers by Color Names Using Domain Knowledge-Driven Segmentation
* Influence of Global Constraints and Lens Distortion on Pose and Appearance Recovery from a Purely Rotating Camera
* Integrated Approaches to Non-rigid Registration in Medical Images
* Interactive 3D Modeling from Multiple Images Using Scene Regularities
* Interventional 3D-Angiography: Calibration, Reconstruction and Visualization System
* Jacobian Images of Super-resolved Texture Maps for Model Based Motion Estimation and Tracking
* Learning a Similarity Based Distance Measure for Image Database Organization from Human Partitionings of an Image Set
* Letter Level Shape Description by Skeletonization in Faded Documents
* MIRACLE: Multimedia Information Retrieval by Analysing Content and Learning from Examples
* Moving Target Classification and Tracking from Real Time Video
* Multimedia Applications of Computer Vision
* Omni-rig Sensors: What Can be Done with a Non-rigid Vision Platform
* On Computing Global Similarity in Images
* Parking Lot Analysis/Visualization Using Multiple Aerial Images
* Polarization of Light Based System, Designed for Real Time Applications in Computer Vision, Making Use of Highlights in a Metallic Environment, A
* Qualitative Approach to Classifying Head and Eye Pose, A
* Real Time Face and Object Tracking as a Component of a Perceptual User Interface
* Real Time Human Motion Analysis by Image Skeletonization
* Real-time Estimation of Head Motion Using Weak Perspective Epipolar Geometry
* Real-time Fixation, Mosaic Construction, and Moving Object Detection from a Moving Camera
* Real-Time Object Tracking from a Moving Video Camera: A Software Approach on a PC
* Real-time Stereo Processing, Obstacle Detection and Terrain Reconstruction from a Vehicle-mounted Moving Stereo Pair of Cameras
* Real-time Tracking of Face Features and Gaze Direction Determination
* Recognizing Human Actions in a Static Room
* Registration, Calibration and Blending in Creating High Quality Panoramas
* Relevance Feedback in Surfimage
* Robust Automatic Target Detection/recognition System in Second Generation FLIR Imagery
* Rotation and Zooming in Image Mosaicing
* Scanning a Document with a Small Camera Attached to a Mouse
* Sensar: Secure(tm) Iris Identification System
* Shape Similarity Measure for Image Database of Occluding Contours
* Structural Lines for Triangulation of Terrain
* Surface Reconstruction from Sparse Fringe Contours
* Video Occupant Detection for Airbag Deployment
* VideoBrush: Experiences with Consumer Video Mosaicing
* VideoQ: An Automated Content Based Video Search System Using Visual Cues
* View Synthesis by Edge Matching and Transfer
* Vision System for Real-time Positioning, Navigation and Video Mosaicing of Sea Floor Imagery in the Application of ROVs/AUVs, A
* Volumetric Description of Dip Solder Joints from Range Data
* 2020 Sequestered Data Evaluation for Known Activities in Extended Video: Summary and Results
* Automatic Virtual 3D City Generation for Synthetic Data Collection
* Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
* Domain Adaptive Knowledge Distillation for Driving Scene Semantic Segmentation
* DriveGuard: Robustification of Automated Driving Systems with Deep Spatio-Temporal Convolutional Autoencoder
* Explainable Attention-Guided Iris Presentation Attack Detector, An
* Explainable Fingerprint ROI Segmentation Using Monte Carlo Dropout
* Facial Expression Neutralization With StoicNet
* Focused LRP: Explainable AI for Face Morphing Attack Detection
* Geeks and guests: Estimating player's level of experience from board game behaviors
* Interpretable security analysis of cancellable biometrics using constrained-optimized similarity-based attack
* Log-likelihood Regularized KL Divergence for Video Prediction With a 3D Convolutional Variational Recurrent Network, A
* Multi-Scale Voxel Class Balanced ASPP for LIDAR Pointcloud Semantic Segmentation
* Neural vision-based semantic 3D world modeling
* Per-frame mAP Prediction for Continuous Performance Monitoring of Object Detection During Deployment
* PeR-ViS: Person Retrieval in Video Surveillance using Semantic Description
* Person Perception Biases Exposed: Revisiting the First Impressions Dataset
* Pose-based Sign Language Recognition using GCN and BERT
* Reliability of GAN Generated Data to Train and Validate Perception Systems for Autonomous Vehicles
* ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on
* Symbolic AI for XAI: Evaluating LFIT Inductive Programming for Fair and Explainable Automatic Recruitment
* Using Semantic Information to Improve Generalization of Reinforcement Learning Policies for Autonomous Driving
* Weakly Supervised Multi-Object Tracking and Segmentation
* Activity Detection in Untrimmed Videos Using Chunk-based Classifiers
* Adaptive Feature Aggregation for Video Object Detection
* Analysis of Gender Inequality In Face Recognition Accuracy
* Argus: Efficient Activity Detection System for Extended Video Analysis
* Boosted Kernelized Correlation Filters for Event-based Face Detection
* Bumblebee Re-Identification Dataset
* CAFM: A 3D Morphable Model for Animals
* Context Sensitivity of Spatio-Temporal Activity Detection using Hierarchical Deep Neural Networks in Extended Videos
* Disrupting Image-Translation-Based DeepFake Algorithms with Adversarial Attacks
* Exploring Techniques to Improve Activity Recognition using Human Pose Skeletons
* Fusing Animal Biometrics with Autonomous Robotics: Drone-based Search and Individual ID of Friesian Cattle (Extended Abstract)
* IMD2020: A Large-Scale Annotated Dataset Tailored for Detecting Manipulated Images
* Impact of ImageNet Model Selection on Domain Adaptation
* Learning Landmark Guided Embeddings for Animal Re-identification
* Memory-Efficient Models for Scene Text Recognition via Neural Architecture Search
* Mitigating Algorithmic Bias: Evolving an Augmentation Policy that is Non-Biasing
* Re-Identification of Zebrafish using Metric Learning
* Real-Time Activity Detection of Human Movement in Videos via Smartphone Based on Synthetic Training Data
* Real-time Detection of Activities in Untrimmed Videos
* Reducing Geographic Performance Differentials for Face Recognition
* Siamese Network Based Pelage Pattern Matching for Ringed Seal Re-identification
* Similarity Learning Networks for Animal Individual Re-Identification: Beyond the Capabilities of a Human Observer
* Summary of the 2019 Activity Detection in Extended Videos Prize Challenge
* Syn2Real: Forgery Classification via Unsupervised Domain Adaptation
