Journals starting with cvpr

CVPR00 * *CVPR
* 3-D Model Construction using Range and Image Data
* 360 x 360 Mosaics
* Active Character Recognition using A*-like Algorithm
* Adaptive Bayesian Recognition in Tracking Rigid Objects
* Adaptive Metric Nearest Neighbor Classification
* Agent-based Moving Object Correspondence using Differential Discriminative Diagnosis
* Algorithms for Plane-based Pose Estimation
* Alpha Estimation in Natural Images
* Appearance Representation for Multiple Reflection Components, An
* Arbitrary View Position and Direction Rendering for Large-Scale Scenes
* Are Multifractal Multipermuted Multinomial Measures Good Enough for Unsupervised Image Segmentation?
* Articulated Body Motion Capture by Annealed Particle Filtering
* Articulated-Pose Estimation using Brightness and Depth-Constancy Constraints
* Assignment Problem in Edge Detection Performance Evaluation
* Augmented Reality System for Surgical Navigation using Robust Target Vision
* Automatic Recovery of Relative Camera Rotations for Urban Scenes
* Batch/Recursive Algorithm for 3D Scene Reconstruction, A
* Bayesian Framework for Radar Shape-from-Shading, A
* Bayesian Super-Resolved Surface Reconstruction from Images
* Blind Recovery of Transparent and Semireflected Scenes
* Boosting Image Retrieval
* Boundary Extraction Method based on Dual-T-Snakes and Dynamic Programming, A
* Calibration of Light Sources
* Camera Model Selection based on Geometric AIC
* Cameras for Stereo Panoramic Imaging
* Catadioptric Self-Calibration
* Catadioptric Sensors that Approximate Wide-Angle Perspective Projections
* Categorical Representation and Recognition of Oscillatory Motion Patterns
* Chromatic Framework for Vision in Bad Weather
* Codimension-Two Geodesic Active Contours for the Segmentation of Tubular Structures
* Color Channels Decorrelation by ICA Transformation in the Wavelet Domain for Color Texture Analysis and Synthesis
* Color Tracking by Transductive Learning
* Combined Feature-Texture Similarity Measure for Face Alignment under Varying Pose, A
* Comparative Technique and Performance Results on Novel Learned Snakes in Two Dissimilar Medical Domains, A
* Computational Model for Repeated Pattern Perception using Frieze and Wallpaper Groups, A
* Computing 3D Object Parts from Similarities among Object Views
* Computing Optical Flow with Physical Models of Brightness Variation
* Conditioning Analysis of Missing Data Estimation for Large Sensor Arrays
* Corner Guided Curve Matching and its Application to Scene Reconstruction
* CueVideo: a system for cross-modal search and browse of video databases
* Curve and Surface Reconstruction from Regular and Non-Regular Point Sets
* Curve Evolution Approach to Smoothing and Segmentation Using the Mumford-Shah Functional, A
* Detecting and Tracking Eyes by using their Physiological Properties, Dynamics, and Appearance
* Detecting and Tracking Human Face and Eye using Space-Varying Sensor and an Active Vision Head
* Detecting Binocular Half-Occlusions: Empirical Comparisons of Four Approaches
* Detecting Changes in 3-D Shape using Self-Consistency
* Detecting Dynamic Behavior in Compressed Fingerprint Videos: Distortion
* Detecting People in Cluttered Indoor Scenes
* Detection of Obstacles in the Flight Path of an Aircraft
* Discriminant-EM Algorithm with Application to Image Retrieval
* Distortion-Invariant Recognition via Jittered Queries
* Dynamic Layer Representation with Applications to Tracking
* Dynamic Memory: Architecture for Real Time Integration of Visual Perception, Camera Action, and Network Communication
* Edge-Constrained Joint View Triangulation for Image Interpolation
* Effective Approach to Detect Lesions in Color Retinal Images, An
* Efficient Matching of Pictorial Structures
* Efficient Query Modification for Image Retrieval
* Energy-based Framework for Dense 3D Registration of Volumetric Brain Images, An
* Error Analysis of Background Adaption
* Estimating Anthropometry and Pose from a Single Image
* Estimation and Prediction of Evolving Color Distributions for Skin Segmentation under Varying Illumination
* Exact Voxel Occupancy with Graph Cuts
* Face Recognition by Distribution Specific Feature Extraction
* Fast and Robust Approach to Recovering Structure and Motion from Live Video Frames, A
* Fast Face Detection using Subspace Discriminant Wavelet Features
* Fast Multiscale Image Segmentation
* Feature based Visualization of Geophysical Data
* Feature-based Technique for Joint, Linear Estimation of High-Order Image-to-Mosaic Transformations: Application to Mosaicing the Curved Human Retina, A
* Filling in Scenes by Propagating Probabilities through Layers and into Appearance Models
* Fixed Topology Skeletons
* Fluid Structure and Motion Analysis from Multi-spectrum 2D Cloud Image Sequences
* Flying a Toy Plane
* Formal Classification of 3D Medial Axis Points and their Local Geometry, A
* Fusion of Color, Shading and Boundary Information for Factory Pipe Segmentation
* General Method for Errors-in-Variables Problems in Computer Vision, A
* General Scheme for Training and Optimization of the Grenander Deformable Template Model, A
* Generalized Optical Flow Constraint and its Physical Interpretation, A
* Geodesic Distance Evolution of Surfaces: A New Method for Matching Surfaces
* Geometric Approach to Blind Deconvolution with Application to Shape from Defocus, A
* Geometric Approach to Train Support Vector Machines, A
* Gesture, Speech, and Gaze Cues for Discourse Segmentation
* Handwritten Digit Recognition with a Novel Vision Model that Extracts Linearly Separable Features
* Hierarchical Structure and Nonrigid Motion Recovery from 2D Monocular Views
* High Dynamic Range Imaging: Spatially Varying Pixel Exposures
* Histogram Preserving Image Transformations
* Histogram-based Object Recognition using Shape-from-Shading
* Ill-Posed Problems in Surface and Surface Shape Recovery
* Illuminant Direction Determination for Multiple Light Sources
* Illumination-Insensitive Face Recognition using Symmetric Shape-from-Shading
* Image retrieval and segmentation based on color invariants
* Image Segmentation by Nested Cuts
* Image-based Bayesian Framework for Face Detection, An
* Image-based Re-Rendering of Faces for Continuous Pose and Illumination Directions
* Image-Consistent Surface Triangulation
* Impact of Dynamic Model Learning on Classification of Human Motion
* Improved Motion Stereo Matching based on a Modified Dynamic Programming
* Improving Correlation-based DEMs by Image Warping and Facade Correlation
* Improving Visual Matching
* In Search of Illumination Invariants
* Inconsistencies in Edge Detector Evaluation
* Indexing for Topics in Videos using Foils
* Inferring Body Pose without Tracking Body Parts
* Integrated 3D Scene Flow and Structure Recovery from Multiview Image Sequences
* Integrating Bottom-Up/Top-Down for Object Recognition by Data Driven Markov Chain Monte Carlo
* Integrating Color, Texture, and Geometry for Image Retrieval
* Intel's Computer Vision Library: applications in calibration, stereo segmentation, tracking, gesture, face and object recognition
* Intelligent selection tools
* Invariant web defect detection and classification system
* Iterative Projective Reconstruction from Multiple Views
* Layer Extraction from Multiple Images Containing Reflections and Transparency
* Learning from One Example through Shared Densities on Transforms
* Learning in Gibbsian Fields: How Accurate and How Fast Can It Be?
* Learning Patterns from Images by Combining Soft Decisions and Hard Decisions
* Learning to Recognize Objects
* Likelihood Functions and Confidence Bounds for Total-Least-Squares Problems
* Limits on Super-Resolution and How to Break Them
* Line Net Global Vectorization: An Algorithm and its Performance Evaluation
* Linear Algorithm for Camera Self-Calibration, Motion and Structure Recovery for Multi-Planar Scenes from Two Perspective Images, A
* Lines in One Orthographic and Two Perspective Views
* Local Appearance for Robust Object Recognition
* Maintaining Valid Topology with Active Contours: Theory and Application
* Matching Images with Different Resolutions
* Maximum-Likelihood Template Matching
* Measurement of Color Invariants
* Mixture Models and the Segmentation of Multimodal Textures
* Motion Characterization by Temporal Slices Analysis
* Multi-modality model-based registration in the cardiac domain
* Multi-view 3D analysis with applications for augmented reality and enhanced video visualization
* Multidimensional Motion Segmentation and Identification
* Multifeature Object Tracking using a Model-Free Approach
* Multimodal Speaker Detection using Error Feedback Dynamic Bayesian Networks
* Multiscale Combination of Physically-based Registration and Deformation Modeling
* Multisensor Integration for Building Modeling
* Nearest Neighbor Search using Additive Binary Tree
* Neural Optimization Framework for Zoom Lens Camera Calibration, A
* New Algorithm for Non-Rigid Point Matching, A
* New Regularized Approach for Contour Morphing, A
* Novel Algorithm for Rotated Human Face Detection, A
* Novel Approach to Depth Ordering in Monocular Image Sequences, A
* Object Recognition for an Intelligent Room
* On Measuring Low-Level Saliency in Photographic Images
* On the Number of Samples Needed in Light Field Rendering with Constant-Depth Assumption
* On the Space Requirements of Indexing 3D Models from 2D Perspective Images
* On the Synthesis of Dynamic Scenes from Reference Views
* On-Line Handwriting Recognition System using Fisher Segmental Matching and Hypotheses Propagation Network, An
* Optimizing Learning in Image Retrieval
* Order Parameters for Minimax Entropy Distributions: When does High Level Knowledge Help?
* Parallel Projections for Stereo Reconstruction
* Parameterization of Closed Surfaces for Parametric Surface Description
* Parametric Template Method and its Application to Robust Matching, A
* Perceptual Grouping and Segmentation by Stochastic Clustering
* Perspective Pose from Spectral Voting
* Physical Panoramic Pyramid and Noise Sensitivity in Pyramids
* Point Pattern Matching with Robust Spectral Correspondence
* Pose Estimation, Model Refinement, and Enhanced Visualization using Video
* Probabilistic Architecture for Content-based Image Retrieval, A
* Probabilistic vs. Geometric Similarity Measures for Image Retrieval
* Provably Fast Algorithms for Contour Tracking
* Real Time System for Robust 3D Voxel Reconstruction of Human Motions, A
* Real-time 3D motion and structure of point features: a front-end system for vision-based control and interaction
* Real-Time System for Epipolar Geometry and Ego-Motion Estimation, A
* Real-Time Tracking of Non-Rigid Objects using Mean Shift
* Recognition of Partially Occluded and/or Imprecisely Localized Faces using a Probabilistic Approach
* Recognizing Upper Face Action Units for Facial Expression Analysis
* Reconstruction from Six-Point Sequences
* Reconstruction of a Scene with Multiple Linearly Moving Objects
* Reconstruction of Articulated Objects from Point Correspondences in a Single Uncalibrated Image
* Reconstruction of Scene Models from Sparse 3D Structure
* Recovering Non-Rigid 3D Shape from Image Streams
* Recovering Projection Geometry: How a Cheap Camera Can Outperform an Expensive Stereo System
* Rectified Catadioptric Stereo Sensors
* Rectified Mosaicing: Mosaics without the Curl
* Recursive Estimation of Motion and Planar Structure
* Region Correspondence by Global Configuration Matching and Progressive Delaunay Triangulation
* Region Extraction Method using Multiple Active Contour Models, A
* Reliable Feature Matching across Widely Separated Views
* Representation and Optimal Recognition of Human Activities
* Representation and Recognition of Complex Human Motion
* Retinal thickness measurements from optical coherence tomography using a Markov boundary model
* Robust and Efficient Motion Segmentation based on Orthogonal Projection Matrix of Shape Space, A
* Robust and Efficient Skeletal Graphs
* Robust Periodic Motion and Motion Symmetry Detection
* Robust Snake Model
* Robust Stereo Ego-Motion for Long Distance Navigation
* Robust Visual Recognition of Color Images
* ROR: Rejection of Outliers by Rotations in Stereo Matching
* Scene Constraints-Aided Tracking of Human Body
* Scene Modeling for Wide Area Surveillance and Image Synthesis
* Segmentation with Invisible Keying Signal
* Segmenting Visual Actions based on Spatio-Temporal Motion Patterns
* Semi-Automatic Method for Resolving Occlusion in Augmented Reality, A
* Sensor networked mobile robotics
* Shape and Motion Carving in 6D
* Shape Descriptors for Non-Rigid Shapes with a Single Closed Contour
* Shape-based 3D Surface Correspondence using Geodesics and Local Geometry
* Simultaneous Tracking and Verification via Sequential Posterior Estimation
* Smooth Region Structure: Folds, Domes, Bowls, Ridges, Valleys, and Slopes
* Solving the Rotation-Estimation Problem by using the Perspective Three-Point Algorithm
* Spatio-Temporal Analysis of Omni Image
* Statistical Cues for Domain Specific Image Segmentation with Performance Analysis
* Statistical Method for 3D Object Detection Applied to Faces and Cars, A
* Statistical Modeling and Performance Characterization of a Real-Time Dual Camera Surveillance System
* Statistical Shape Influence in Geodesic Active Contours
* Statistics of Range Images
* Step Towards Sequence-to-Sequence Alignment, A
* Stereo based Gesture Recognition Invariant to 3D Pose and Lighting
* Stereo without Depth Search and Metric Calibration
* Stereoscopic Shading: Integrating Multi-Frame Shape Cues in a Variational Framework
* Structure from Motion using Points, Lines, and Intensities
* Structure from Motion without Correspondence
* Surface Growing from Stereo Images
* Surface Landmark Selection and Matching in Natural Terrain
* Template Deformation Constrained by Shape Priors
* Tomographic Reconstruction using Curve Evolution
* Towards Automatic Discovery of Object Categories
* Towards Detection of Human Motion
* Towards Reliable Fusion of Segmented Surface Descriptions
* Tracking of Elongated Structures using Statistical Snakes
* Tracking Segmented Objects using Tensor Voting
* Transformed Hidden Markov Models: Estimating Mixture Models of Images and Inferring Spatial Transformations in Video Sequences
* Two-Stage Robust Optical Flow Estimation
* Using Lexical Similarity in Handwritten Word Recognition
* Variable Albedo Surface Reconstruction from Stereo and Shape from Shading
* Video Summarization using Singular Value Decomposition
* View-Independent Recognition of Hand Postures
* Visual Servoing for Automatic and Uncalibrated Needle Placement for Percutaneous Procedures
* Visual Tunnel Analysis for Visibility Prediction and Camera Planning
* Visual Venture: investigations with images and videos for middle school education
* Visual websearching using iconic queries
* Wide Area Camera Calibration using Virtual Calibration Objects
* Zebra-Crossing Detection for the Partially Sighted
232 for CVPR00

CVPR01 * *CVPR
* 2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators
* 3-D Interpretation of Single Line Drawings Based on Entropy Minimization Principle
* 3D Biplanar Reconstruction of Scoliotic Vertebrae Using Statistical Models
* 3D Head Tracking Using Motion Adaptive Texture-Mapping
* 3D Line Motion Matrix and Alignment of Line Reconstructions, The
* 3D Model Generation for Cities Using Aerial Photographs and Ground Level Laser Scans
* 3D Object Recognition from Range Images using Local Feature Histograms
* 3D Reconstruction from 360 x 360 Mosaics
* 3D Simultaneous Localisation and Map-Building Using Active Vision for a Robot Moving on Undulating Terrain
* 3D Video Capture, Reconstruction, and Viewing Systems
* 3D-Orientation Signatures with Conic Kernel Filtering for Multiple Motion Analysis
* Adaptive Algorithm for Text Detection from Natural Scenes, An
* Adaptive Binning and Dissimilarity Measure for Image Retrieval and Classification
* Adaptive Quasiconformal Kernel Metric for Image Retrieval
* Affine Arithmetic Based Estimation of Cue Distributions in Deformable Model Tracking
* Alpha Channel Estimation in High Resolution Images and Image Sequences
* Analysis and Detection of Shadows in Video Streams: A Comparative Evaluation
* Analysis and Synthesis of Human Faces with Pose Variations by a Parametric Piecewise Linear Subspace Method
* Anti-Sequences: Event Detection by Frame Stacking
* Appearance-Based Object Recognition Using Multiple Views
* Articulated Body Posture Estimation from Multi-Camera Voxel Data
* Automated Cartridge Identification for Firearm Authentication
* Automatic Description of Buildings with Complex Rooftops from Multiple Images
* Automatic Determination of Camera Positions and Poses in a Sequence of Snapshots
* Automatic Estimation of the Projected Light Source Direction
* Automatic Partitioning of High Dimensional Search Spaces Associated with Articulated Body Motion Capture
* Bagging Is a Small-Data-Set Phenomenon
* Bayesian Approach to Digital Matting, A
* Bayesian Color Constancy for Outdoor Object Recognition
* Bayesian Learning of Sparse Classifiers
* Bayesian Tracking with Optical Flow Initialization and Re-initialization
* Bending Invariant Representations for Surfaces
* Better Proposal Distributions: Object Tracking Using Unscented Particle Filter
* Calibrated, Registered Images of an Extended Urban Area
* Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results
* Clustering Art
* Color Constant Ratio Gradients for Image Segmentation and Similarity of Textured Objects
* Combining Two-view Constraints for Motion Estimation
* Communication via Eye Blinks: Detection and Duration Analysis in Real Time
* Compact Representation of Bidirectional Texture Functions
* Comparison of Local Plane Fitting Methods for Range Data
* Component-based Face Detection
* Computer Vision System for Home Monitoring, A
* Computer Vision System for On-Screen Item Selection by Finger Pointing, A
* Computing Depth Maps from Descent Imagery
* Confidence Measure for Boundary Detection and Object Selection, A
* Constrained Minimum Cut for Classification Using Labeled and Unlabeled Data
* Constructing Facial Identity Surfaces in a Nonlinear Discriminating Space
* Constructing Models for Content-Based Image Retrieval
* Contextual Classification by Entropy-Based Polygonization
* Contour Grouping with Strong Prior Models
* Contracting Curve Density Algorithm and its Application to Model-based Image Segmentation, The
* Convex and Non-convex Illuminant Constraints for Dichromatic Colour Constancy
* Covariance Scaled Sampling for Monocular 3D Body Tracking
* Critical Configurations for N-view Projective Reconstruction
* Decomposed Eigenface for Face Recognition under Various Lighting Conditions
* Dense Image Matching with Global and Local Statistical Criteria: A Variational Approach
* Depth Layers from Occlusions
* Detection and Tracking of Shopping Groups in Stores
* Differential Methods for Nonmetric Calibration of Camera Lens Distortion
* Diffusion Tensor Regularization with Constraints Preservation
* Dimension Recognition and Geometry Reconstruction in Vectorization of Engineering Drawings
* Direct Appearance Models
* Discrimination of Motion Based on Traces in the Space of Probability Functions over Feature Relations
* Dynamic Bayesian Framework for Extracting Temporal Structure in Video
* Dynamic Coupled Component Analysis
* Dynamic Depth Recovery from Multiple Synchronized Video Streams
* Dynamic Shadow Elimination for Multi-Projector Displays
* Dynamic Texture Recognition
* Efficient Computation of Adaptive Threshold Surfaces for Image Binarization
* Efficient Evaluation of Classification and Recognition Systems
* Efficient Grouping under Perspective Skew
* Efficient Non-parametric Adaptive Color Modeling Using Fast Gauss Transform
* Efficient Spatiotemporal Grouping Using the Nyström Method
* Eliminating Ghosting and Exposure Artifacts in Image Mosaics
* Enforcing Integrability for Surface Reconstruction Algorithms Using Belief Propagation in Graphical Models
* Epipolar Geometry Estimation for Non-Static Scenes by 4D Tensor Voting
* Equivalence and Efficiency of Image Alignment Algorithms
* Estimating 3D Body Pose using Uncalibrated Cameras
* Event Detection and Summarization in Sports Video
* Evolving Image Segmentations for the Analysis of Video Sequences
* Extended Performance Graphs for Cluster Retrieval
* Extraction of Illusory Linear Clues in Perspectively Skewed Documents
* Face Detection in a Video Sequence: A Temporal Approach
* Face Verification Using Error Correcting Output Codes
* Fast and Reliable Color Region Merging Inspired by Decision Tree Pruning
* Fast Focal Length Solution in Partial Panoramic Image Stitching
* Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images
* Finding Folds: On the Appearance and Identification of Occlusion
* First Order Tensor Voting and Application to 3-D Scale Analysis
* Flexible Flow for 3D Nonrigid Tracking and Shape Recovery
* Frame-Rate Spatial Referencing Based on Invariant Indexing and Alignment with Application to Laser Retinal Surgery
* Framework for Multiple Snakes, A
* Framework for Sensor Planning and Control with Applications to Vision Guided Multi-Robot Systems, A
* Fully Automatic Panoramic Image Registration
* Gait Recognition from Time-normalized Joint-angle Trajectories in the Walking Plane
* Gait Recognition Using Static, Activity-Specific Parameters
* Gauge Fixing for Accurate 3D Estimation
* Generalized Dynamic Programming Approaches for Object Detection: Detecting Spine Boundaries and Vertebra Endplates
* Generic Model Abstraction from Examples
* Geocoded Terrestrial Mosaics Using Pose Sensors and Video Registration
* Geometric Blur and Template Matching
* Geometric Distributions for Catadioptric Sensor Design
* Graph-Spectral Method for Surface Height Recovery from Needle-Maps, A
* Grouping Connected Components Using Minimal Path Techniques: Application to Reconstruction of Vessels in 2D and 3D Images
* Handling Occlusions in Dense Multi-view Stereo
* Hierarchical Scheme for Representing Curves without Self-Intersections, A
* High-Resolution Panoramic Camera, A
* Houghing the Hough: Peak Collection for Detection of Corners, Junctions and Line Intersections
* Illumination Invariant Face Recognition Using Thermal Infrared Imagery
* Illumination Subspace for Multibody Motion Segmentation
* Image Indexing with Mixture Hierarchies
* Image Magnification Using Level-Set Reconstruction
* Image Registration by Aligning Entropies
* Image-based Modeling and Rendering of Surfaces with Arbitrary BRDFs
* Image-Based Surface Detail Transfer
* Improving the Scope of Deformable Model Shape and Motion Estimation
* Increasing the Discrimination Power of Spectral Spatial Gradients
* Instant Dehazing of Images Using Polarization
* Integrated Face and Gait Recognition from Multiple Views
* Investigation of Measures for Grouping by Graph Partitioning
* Issues on the Geometry of Central Catadioptric Image Formation
* JPDAF Based HMM for Real-Time Contour Tracking
* LDA/SVM Driven Nearest Neighbor Classification
* Learned Templates for Feature Extraction in Fingerprint Images
* Learning Flexible Sprites in Video Layers
* Learning Generative Models of Scene Features
* Learning Models for Object Recognition
* Learning Probabilistic Distribution Model for Multi-View Face Detection
* Learning Probabilistic Structure for Human Motion Detection
* Learning Representative Local Features for Face Detection
* Learning Similarity Measure for Natural Image Retrieval with Relevance Feedback
* Learning Spatially Localized, Parts-Based Representation
* Learning-Based Building Outline Detection from Multiple Aerial Images
* Light Field Rendering for Large-Scale Scenes
* Linear Image Coding for Regression and Classification using the Tensor-rank Principle
* Linear Iterative Method for Auto-Calibration using the DAC Equation, A
* Linear Subspaces for Illumination Robust Face Recognition
* Local Analysis for 3D Reconstruction of Specular Surfaces
* Local Feature View Clustering for 3D Object Recognition
* Local Gradient, Global Matching, Piecewise-Smooth Optical Flow
* Local Mode Filtering
* Matching of Double-Sided Document Images to Remove Interference
* Matching, Reconstructing and Grouping 3D Lines From Multiple Views Using Uncertain Projective Geometry
* Metric Self Calibration from Screw-Transform Manifolds
* Minimally Supervised Acquisition of 3D Recognition Models from Cluttered Images
* Mixtures of Trees for Object Recognition
* Model-Based 3D Tracking of an Articulated Hand
* Model-Based Curve Evolution Technique for Image Segmentation
* Model-Based Road Sign Identification System, A
* Model-Free Optimal Trajectories in the Image Space: Application to Robot Vision Control
* Morphable 3D Models from Video
* Morphological Color Quantization
* MRF-based Approach for Real-Time Subway Monitoring, A
* Multi-Object Tracking Using Dynamical Graph Matching
* Multibody Grouping via Orthogonal Subspace Decomposition
* Multispectral Skin Color Modeling
* Multiview Texture Models
* Navier-Stokes, Fluid Dynamics, and Image and Video Inpainting
* Necessary Conditions to Attain Performance Bounds on Structure and Motion Estimates of Rigid Objects
* New 3-D Pattern Recognition Technique With Application to Computer Aided Colonoscopy, A
* New Analysis Framework for Relevance Feedback-Driven Similarity Measure Refinement in Content-Based Image Retrieval, A
* New Panorama, The
* New Signature-Based Method for Efficient 3-D Object Recognition, A
* Nine Points of Light: Acquiring Subspaces for Face Recognition under Variable Lighting
* Non-Metric Image-Based Rendering for Video Stabilization
* Non-Parametric Motion Recognition Using Temporal Multiscale Gibbs Models
* Non-Rigid Object Tracking using Performance Evaluation Measures as Feedback
* Nonparametric Statistical Comparison of Principal Component and Linear Discriminant Subspaces for Face Recognition, A
* Object Based Segmentation of Video Using Color, Motion and Spatial Information
* Object Recognition using Boosted Discriminants
* On 3D Scene Flow and Structure Estimation
* On Computing Exact Visual Hulls of Solids Bounded by Smooth Surfaces
* On Focal Length Calibration from Two Views
* On Photometric Aspects of Catadioptric Cameras
* On Representing Edge Structure for Model Matching
* On Solving 2D and 3D Puzzles Using Curve Matching
* On the Fundamental Limits of Reconstruction-Based Super-resolution Algorithms
* On the Individuality of Fingerprints
* On the Perceptual Organization of Texture and Shading Flows: From a Geometrical Model to Coherence Computation
* Open Problem in Matching Sets of 3D Lines, An
* Optimal Adaptive Learning for Image Retrieval
* Optimal Texture Map Reconstruction from Multiple Views
* Outlier Detection in Video Sequences under Affine Projection
* PACS Based Interface for 3D Anatomical Structures Visualization and Surgical Planning, A
* Pairwise Coupling for Machine Recognition of Hand-Printed Japanese Characters
* Parallel Processing 360-Degree Rotation Invariant Face Detection
* Parametric Representations for Nonlinear Modeling of Visual Data
* PDE Approach for Measuring Tissue Thickness, A
* Photometric Stereo with General, Unknown Lighting
* Piecewise Planar Segmentation for Automatic Scene Modeling
* Poseidon Technologies: The world's first and only computer-aided drowning detection system
* Precise Omnidirectional Camera Calibration
* Probabilistic Framework for Graph Clustering, A
* Probabilistic Framework for Surface Reconstruction from Multiple Images, A
* Probabilistic Tracking of Motion Boundaries with Spatiotemporal Predictions
* Production of Video Images by Computer Controlled Camera and Its Application to TV Conference System
* Provably-Convergent Iterative Methods for Projective Structure from Motion
* Pure Learning Approach to Background-Invariant Object Recognition Using Pedagogical Support Vector Learning, A
* Quantification, Visualization, and Motion Correction in Dynamic Processes
* Quantigraphic Imaging: Estimating the Camera Response and Exposures from Differently Exposed Images
* Radical Approach to Handwritten Chinese Character Recognition Using Active Handwriting Models, A
* Range-based Foreground Detection Using Visibility Constraints
* Rapid Object Detection using a Boosted Cascade of Simple Features
* Real Time 3D Template Matching
* Real-Time Affine Region Tracking and Coplanar Grouping
* Real-Time Multi-View Face Detection, Pose Estimation, Tracking, Alignment, and Recognition
* Real-Time Video Georegistration
* Recognition of Human Gaits
* Reconstruction of 3D Figure Motion from 2D Correspondences
* Reconstruction of Specular Surfaces using Polarization Imaging
* Recovering the Geometry of Single Axis Motions by Conic Fitting
* Rectifying Transformations That Minimize Resampling Effects
* Region-based MRF Model for Unsupervised Segmentation of Moving Objects in Image Sequences, A
* Registration Via Direct Methods: A Statistical Approach
* Relationship Between Spline-Based Deformable Models and Weighted Graphs in Non-rigid Matching, A
* Relief Mosaics by Joint View Triangulation
* Relighting with the Reflected Irradiance Field: Representation, Sampling and Reconstruction
* Removing Weather Effects from Monochrome Images
* Representation for Image Structure and Its Application to Object Selection Using Freehand Sketches, A
* Rethinking Classical Internal Forces for Active Contour Models
* Robot Homing Based on Corner Tracking in a Sequence of Panoramic Images
* Robust and Adaptive Integration of Multiple Range Images with Photometric Attributes
* Robust Change Detection by Fusing Intensity and Texture Differences
* Robust Crease Detection and Curvature Estimation of Piecewise Smooth Surfaces from Triangle Mesh Approximations Using Normal Voting
* Robust Online Appearance Models for Visual Tracking
* Robust Point Feature Matching in Projective Space
* Robust Regression for Data with Multiple Structures
* Robust Super-Resolution
* Role of Domain Knowledge in the Detection of Retinal Hard Exudates, The
* Salient Points for Content Based Retrieval
* Scalable, Absolute Position Recovery for Omni-Directional Image Networks
* Scene Text Extraction and Translation for Handheld Devices
* Scene-Consistent Detection of Feature Points in Video Sequences
* Segmentation and Boundary Detection Using Multiscale Intensity Measurements
* Segmentation and Tracking of Multiple Humans in Complex Situations
* Segmentation for Robust Tracking in the Presence of Severe Occlusion
* Self Correcting Projector, A
* Semi-Dense Stereo Correspondence with Dense Features
* Separation of Diffuse and Specular Reflection from Color Images
* Sequential Knowledge-Driven Scene Recognition Model
* Shadow Elimination and Occluder Light Suppression for Multi-Projector Displays
* Shape Contexts Enable Efficient Retrieval of Similar Shapes
* Similarity Templates for Detection and Recognition
* Simple Stereo Algorithm to Recover Precise Object Boundaries and Smooth Surfaces, A
* Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion
* Single View Modeling of Free-Form Scenes
* Skewed Symmetry Groups
* Small Sample Learning during Multimedia Retrieval Using BiasMap
* Spatial Information in Multiresolution Histograms
* Spatial Lesion Indexing for Medical Image Databases Using Force Histograms
* Spherical Eye from Multiple Cameras (Makes Better Models of the World), A
* Spin Discriminant Analysis (SDA): Using A One-Dimensional Classifier for High Dimensional Classification Problems
* Stability Issues in Recovering Illumination Distribution from Brightness in Shadows
* Statistics of Real-World Illumination
* Stereo Head Calibration from a Planar Object
* Structure and Motion Estimation with Expectation Maximization and Extended Kalman Smoother for Continuous Image Sequences
* Structure and Motion from Uncalibrated Catadioptric Views
* Subspace Approach to Layer Extraction, A
* Super-Resolution from Multiple Views Using Learnt Image Models
* Support Vector Tracking
* Systematic Design and Analysis Cycle of a Vision System: A Case Study in Video Surveillance, The
* Temporal Integration of Multiple Silhouette-based Body-part Hypotheses
* Text Identification in Complex Background Using SVM
* Texture Replacement in Real Images
* Three-Dimensional Medial Shape Representation Incorporating Object Variability
* Time-constrained Dynamic Semantic Compression for Video Indexing and Interactive Searching
* Time-varying Shape Tensors for Scenes with Multiply Moving Points
* Tool for Decomposing 3D Discrete Objects, A
* Topology Preserving Deformable Model Using Level Sets, A
* Tracking and Modeling Non-Rigid Objects with Rank Constraints
* Tracking Multiple Moving Objects with a Mobile Robot
* Tracking of Object with SVM Regression
* Tracking the Optic Nervehead in OCT Video Using Dual Eigenspaces and an Adaptive Vascular Distribution Model
* Two-body Segmentation from Two Perspective Views
* Two-Step Approach to Hallucinating Faces: Global Parametric Model and Local Nonparametric Model, A
* Understanding Popout through Repulsion
* Undoing Paper Curl Distortion Using Applicable Surfaces
* Unsupervised Face Recognition from Image Sequences Based on Clustering with Attraction and Repulsion
* Using an ICA Representation of High Dimensional Data for Object Recognition and Classification
* Using Occlusions to Aid Position Estimation for Visual Motion Capture
* Using Robust Methods for Automatic Extraction of Buildings
* Video Analysis using the Acadia I(TM) Single-Chip Vision System
* Video to Reference Image Alignment in the Presence of Sparse Features and Appearance Change
* View-Based Human Activity Recognition by Indexing and Sequencing
* View-Invariance in Action Recognition
* Virtual Sample Generation for Template-Based Shape Matching
* Vision System for Fast 3D Model Reconstruction, A
* Vitrionic sensors: Computer vision for an intelligent touchless water faucet and intelligent plumbing systems
* Weighted Non-negative Matrix Factorization for Local Representations, A
292 for CVPR01

CVPR03 * *CVPR
* 2D moving grid geometric deformable model, A
* 3D model retrieval with morphing-based geometric and topological feature maps
* 3D Modeling Using a Statistical Sensor Model and Stochastic Search
* 3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints
* 3D shape from anisotropic diffusion
* 3D surface modeling from range curves
* Active Unsupervised Texture Segmentation on a Diffusion Based Feature Space
* Activity recognition using the dynamics of the configuration of interacting objects
* Adaptation for multiple cue integration
* Adaptive pattern discovery for interactive multimedia retrieval
* Adaptive view-based appearance models
* Advanced Gaussian MRF rotation-invariant texture features for classification of remote sensing imagery
* Analyzing appearance and contour based methods for object categorization
* Appearance management and cue fusion for 3D model-based tracking
* Automated feature-based range registration of urban scenes of large scale
* Automated multi-camera planar tracking correspondence modeling
* Automatic relighting of overlapping textures of a 3D model
* Background subtraction based on cooccurrence of image variations
* Bayesian approach to image-based visual hull reconstruction, A
* Bayesian framework for fusing multiple word knowledge models in videotext recognition, A
* Bayesian human segmentation in crowded situations
* Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters Via Bayesian Inference
* Bootstrapping SVM active learning by incorporating unlabelled images for image retrieval
* Classification based on symmetric maximized minimal distance in subspace (SMMS)
* Clustering appearances of objects under varying illumination conditions
* Computation of the shock scaffold for unorganized point clouds in 3D
* Computing layered surface representations: an algorithm for detecting and separating transparent overlays
* Constrained subspace modelling
* Constraint on five points in two images
* Constructing 3D city models by merging ground-based and airborne views
* Continuous tracking within and across camera streams
* Critical Configuration for Reconstruction from Rectilinear Motion, A
* Curvature correction of the Hamilton-Jacobi skeleton
* Deformable object tracking using the boundary element method
* Deformable pedal curves with application to face contour extraction
* Direct 3D-rotation estimation from spherical images via a generalized shift theorem
* Directional histogram model for three-dimensional shape similarity
* Discovering clusters in motion time-series data
* Document image enhancement using directional wavelet
* Dynamic depth recovery from unsynchronized video streams
* Editable dynamic textures
* effects of segmentation and feature choice in a translation model of object recognition, The
* efficient approach to learning inhomogeneous Gibbs model, An
* Efficient Solution to the Five-Point Relative Pose Problem, An
* Enhancing DPF for near-replica image recognition
* Enhancing Image and Video Retrieval: Learning via Equivalence Constraints
* Establishment shot detection using qualitative motion
* Estimating 3D hand pose from a cluttered image
* Estimating surface characteristics using physical reflectance models
* Estimating the photorealism of images: distinguishing paintings from photographs
* Estimation of omnidirectional camera model from epipolar geometry
* Evaluation of local models of dynamic backgrounds
* Evolvable visual commercial detector
* Example-based style synthesis
* Expectation Grammars: Leveraging High-Level Expectations for Activity Recognition
* Extracting dense features for visual correspondence with graph cuts
* Eye gaze tracking using an active stereo head
* Face alignment using statistical models and wavelet features
* Face authentication using the trace transform
* Face recognition in hyperspectral images
* Face recognition under variable lighting using harmonic image exemplars
* Face relighting with radiance environment maps
* Fast variable window for stereo correspondence using integral images
* Feature selection by maximum marginal diversity: Optimality and Implications for Visual Recognition
* Feature selection for reliable tracking using template matching
* Finding and tracking people from the bottom up
* Flux driven fly throughs
* Flux invariants for shape
* Fusing online and offline information for stable 3D tracking in real-time
* General C-Means Clustering Model and Its Application
* Generalized Principal Component Analysis (GPCA)
* High-accuracy stereo depth maps using structured light
* Homotopy-based computation of defocus blur and affine transform
* Hue fields and color curvatures: A Perceptual Organization Approach to Color Image Denoising
* hybrid approach for computing visual hulls of complex objects, A
* Hybrid evolutionary ridge regression approach for high-accurate corner extraction
* Illumination chromaticity estimation using inverse-intensity chromaticity space
* Illumination Normalization with Time-Dependent Intrinsic Images for Video Surveillance
* Image hallucination with primal sketch priors
* Image repairing: robust image synthesis by adaptive ND tensor voting
* Implicit meshes for modeling and reconstruction
* Implicit similarity: a new approach to multi-sensor image registration
* Improving continuous gesture recognition with spoken prosody
* Independent component analysis in a facial local residue space
* Joint 3D-reconstruction and background separation in multiple views using graph cuts
* Joint manifold distance: a new approach to appearance based clustering
* Kernel principal angles for classification machines with applications to image sequence interpretation
* Kinematic jump processes for monocular 3D human tracking
* Kullback-Leibler boosting
* Learning a discriminative classifier using shape context distances
* Learning Affinity Functions for Image Segmentation: Combining Patch-Based and Gradient-Based Approaches
* Learning appearance and transparency manifolds of occluded objects in layers
* Learning Bayesian network classifiers for facial expression recognition using both labeled and unlabeled data
* Learning dynamics for exemplar-based gesture recognition
* Learning epipolar geometry from image sequences
* Learning object intrinsic structure for robust visual tracking
* Line reconstruction from many perspective images by factorization
* Linear auto-calibration for ground plane motion
* Local appearance-based models using high-order statistics of image features
* Low-dimensional representations of shaded surfaces under varying illumination
* Man-made structure detection in natural images using a causal multiscale random field
* Many-to-many graph matching via metric embedding
* Markerless kinematic model and motion capture from volume sequences
* Mean-shift blob tracking through scale space
* Methods and geometry for plane-based self-calibration
* Motion Deblurring Using Hybrid Imaging
* Motion from 3D Line Correspondences: Linear and Non-Linear Solutions
* Motion segmentation with accurate boundaries: A tensor voting approach
* Multi-modal image registration by minimizing Kullback-Leibler distance between expected and observed joint class histograms
* Multi-resolution real-time stereo on commodity graphics hardware
* Multi-scale phase-based local features
* Multi-view stereo beyond Lambert
* Multilinear subspace analysis of image ensembles
* Nearest neighbor search for relevance feedback
* new graph-theoretic approach to clustering and segmentation, A
* new semi-supervised EM algorithm for image retrieval, A
* Nonparametric belief propagation
* Nonparametric information fusion for motion estimation
* novel convergence scheme for active appearance models, A
* novel model for orientation field of fingerprints, A
* novel support vector classifier with better rejection performance, A
* Object class recognition by unsupervised scale-invariant learning
* Object recognition based on photometric alignment using RANSAC
* Object removal by exemplar-based inpainting
* Object segmentation in videos from moving camera with MRFs on color and motion features
* Object segmentation using graph cuts based active contours
* Object-specific figure-ground segregation
* Occupant classification system for automotive airbag suppression
* On region merging: the statistical soundness of fast sorting, with applications
* On the fundamental performance for fingerprint matching
* optimal distance measure for object detection, The
* Optimal linear representations of images for object recognition
* Optimal segmentation of dynamic scenes from two perspective views
* PAMPAS: Real-Valued Graphical Models for Computer Vision
* Perception-based 3D triangle mesh segmentation using fast marching watersheds
* Performance Evaluation of Local Descriptors, A
* perspective on distortions, A
* Polydioptric camera design and 3D motion estimation
* Pose Reconstruction with an Uncalibrated Computed Tomography Imaging Device
* Practical non-parametric density estimation on a transformation group for vision
* Practical super-resolution from dynamic video sequences
* Probabilistic spatial context models for scene content understanding
* Probabilistic tracking in joint feature-spatial spaces
* Properties and Applications of Shape Recipes
* Qualitative image based localization in indoors environments
* Range image segmentation by surface extraction using an improved robust estimator
* Recognising and monitoring high-level behaviours in complex spatial environments
* Recognizing expression variant faces from a single sample image per class
* Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA
* Representation and Detection of Deformable Shapes
* Resolution vs. tracking error: Zoom as a gain controller
* road sign recognition system based on dynamic visual model, A
* Robust crease detection in fingerprint images
* Robust data association for online applications
* Robust data clustering
* ROD-TV: reconstruction on demand by tensor voting
* S-AdaBoost and Pattern Detection in Complex Environment
* Scene detection in Hollywood movies and TV shows
* Seeing beyond occlusions (and other marvels of a finite lens aperture)
* Shadow Elimination and Occluder Light Suppression for Multi-Projector Displays
* Shape and materials by example: a photometric stereo approach
* Shape context and chamfer matching in cluttered scenes
* Shape-Based Recognition of Wiry Objects
* Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture
* Shape-Time Photography
* Shedding light on the weather
* Simultaneous estimation of left ventricular motion and material properties with maximum a posteriori strategy
* Simultaneous feature selection and classifier training via linear programming: a case study for face expression recognition
* Simultaneous pose and correspondence determination using line features
* Simultaneous smoothing and estimation of the tensor field from diffusion tensor MRI
* Simultaneous structure and texture image inpainting
* Spacetime Stereo: A Unifying Framework for Depth from Triangulation
* Spacetime stereo: shape recovery for dynamic scenes
* sparse texture representation using affine-invariant regions, A
* Statistics of shape via principal geodesic analysis on Lie groups
* Stereo matching with reflections and translucency
* Structure from motion for scenes without features
* Subset selection for efficient SVM tracking
* Surface reconstruction via Helmholtz reciprocity with a single image pair
* Surfaces with Occlusions from Layered Stereo
* Switching observation models for contour tracking in clutter
* Tensor-based brain surface modeling and analysis
* Texture classification: are filter banks necessary?
* Toward a statistically optimal method for estimating geometric relations from noisy data: cases of linear relations
* Toward a stratification of Helmholtz stereopsis
* Tracking appearances with occlusions
* Transforming camera geometry to a virtual downward-looking camera: robust ego-motion estimation and ground-layer detection
* Using many cameras as one
* Using multiple cues for hand tracking and model refinement
* variational framework for image segmentation combining motion estimation and shape regularization, A
* Variational inference for visual tracking
* Vector-Valued Image Regularization with PDEs: A Common Framework for Different Applications
* Video content annotation using visual analysis and a large semantic knowledgebase
* Video segmentation based on graphical models
* Video-Based Face Recognition Using Adaptive Hidden Markov Models
* Video-based face recognition using probabilistic appearance manifolds
* Video-rate stereo depth measurement on programmable hardware
* View invariants for human action recognition
* viewing graph, The
* Visual hull alignment and refinement across time: a 3D reconstruction algorithm combining shape-from-silhouette with stereo
* Visual landmarks detection and recognition for mobile robot navigation
* What is the space of camera response functions?
* What went where
* Wide-baseline multiple-view correspondences
* Word image matching using dynamic time warping
206 for CVPR03

CVPR04 * *CVPR
* 2D-Shape Analysis Using Conformal Mapping
* 3D facial tracking from corrupted movie sequences
* 3D head tracking based on recognition and interpolation using a time-of-flight depth sensor
* 3D human pose from silhouettes by relevance vector regression
* 3D models coding and morphing for efficient video compression
* Accurate face models from uncalibrated and ill-lit video sequences
* Affine image registration using a new information metric
* affine invariant tensor dissimilarity measure and its applications to tensor-valued image segmentation, An
* Algebraic solution for the visual hull
* algorithm for multiple object trajectory tracking, An
* Alignment of Continuous Video onto 3D Point Clouds
* Approximation of canonical sets and their applications to 2D view simplification
* Articulated models from video
* Asymmetrically boosted HMM for speech reading
* Atlanta world: an expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments
* Audio-visual based emotion recognition-a new approach
* Augmenting images of non-rigid scenes using point and curve correspondences
* Autocalibration & 3D reconstruction with non-central catadioptric cameras
* Automatic cascade training with perturbation bias
* Automatic method for correlating horizons across faults in 3D seismic data
* Automatic view recognition in echocardiogram videos using parts-based representation
* Bayesian assembly of 3D axially symmetric shapes from fragments
* Bayesian face recognition using support vector machine and face clustering
* Bayesian fusion of camera metadata cues in semantic scene classification
* Bayesian video matting using learnt image priors
* Biventricular myocardial kinematics based on tagged MRI from anatomical NURBS models
* BoostMap: a method for efficient approximate similarity rankings
* Bridging the gaps between cameras
* Brightness Perception, Dynamic Range and Noise: A Unifying Model for Adaptive Image Sensors
* Calibrating an air-ground control system from motion correspondences
* Camera calibration from a single night sky image
* Camera network calibration from dynamic silhouettes
* Capturing Image Structure with Probabilistic Index Maps
* Clear underwater vision
* cognitive vision system for action recognition in office environments, A
* Collaborative tracking of multiple targets
* Color alignment in texture mapping of images under point light source and general lighting condition
* Color Lines: Image Specific Color Representation
* Computing depth under ambient illumination using multi-shuttered light
* Corefaces-robust shift invariant PCA based correlation filter for illumination tolerant face recognition
* correlation-based model prior for stereo, A
* Covariance-driven mosaic formation from sparsely-overlapping image sets with application to retinal image mosaicing
* Cue integration through discriminative accumulation
* Cyclic articulated human motion tracking by sequential ancestral simulation
* Detecting and reading text in natural scenes
* Detecting unusual activity in video
* Detection and removal of rain from videos
* Detection and tracking of objects in underwater video
* Diffeomorphic matching of distributions: a new approach for unlabelled point-sets and sub-manifolds matching
* Difference sphere: An approach to near light source estimation
* Direct super-resolution and registration using raw CFA images
* discriminative feature space for detecting and recognizing faces, A
* Discriminative Learning Framework with Pairwise Constraints for Video Object Classification, A
* Distortion estimation techniques in solving visual CAPTCHAs
* Dual-space linear discriminant analysis for face recognition
* Dynamic geodesic snakes for visual tracking
* Effect of colorspace transformation, the illuminance component, and color modeling on skin detection
* Efficient Belief Propagation for Early Vision
* Efficient graphical models for processing images
* Efficient model-based linear head motion recovery from movies
* Efficient search of faces from complex line drawings
* Efficient Tracking with the Bounded Hough Transform
* Elastic-string models for representation and analysis of planar shapes
* EM-like algorithm for color-histogram-based object tracking, An
* Error analysis for a navigation algorithm based on optical-flow and a digital terrain map
* Estimating illumination direction from textured images
* Estimation of blood flow speed and vessel location from thermal video
* Estimation, smoothing, and characterization of apparent diffusion coefficient profiles from high angular resolution DWI
* Extracting semantic information through illumination classification
* Extraction and integration of window in a 3D building model from ground view images
* Extraction and recognition of periodically deforming objects by continuous, spatio-temporal shape description
* Eye typing off the shelf
* Face localization via hierarchical Condensation with Fisher boosting feature selection
* Facial event classification with task oriented dynamic Bayesian network
* Fast Contour Matching Using Approximate Earth Mover's Distance
* fast multigrid implicit algorithm for the evolution of geodesic active contours, A
* Fast wide baseline matching for visual navigation
* Fast, integrated person tracking and activity recognition with plan-view templates from a single stereo camera
* Faster graph-theoretic image processing via small-world and quadtree topologies
* Feature selection for classifying high-dimensional numerical data
* Feature-centric evaluation for efficient cascaded object detection
* flexible projector-camera system for multi-planar displays, A
* Flexible spatial models for grouping local image features
* Frame synchronization and multi-level subspace analysis for video based face recognition
* From facial expression to level of interest: a spatio-temporal approach
* From Fragments to Salient Closed Boundaries: An In-Depth Study
* Generalized quotient image
* Geometric and shading correction for images of printed materials: a unified approach using boundary
* Gibbs likelihoods for Bayesian tracking
* Globally optimal segmentation of interacting surfaces with geometric constraints
* GMM parts based face representation for improved verification through relevance adaptation, A
* graphical model framework for coupling MRFs and deformable models, A
* Graphical models for graph matching
* Grouping dominant orientations for ill-structured road following
* Grouping with bias revisited
* Hidden semantic concept discovery in region based image retrieval
* Hierarchical decision making scheme for sports video categorisation with temporal post-processing
* High resolution video mosaicing with global alignment
* High-speed videography using a dense camera array
* High-zoom video hallucination by exploiting spatio-temporal regularities
* How features of the human face affect recognition: a statistical comparison of three face recognition algorithms
* Hybrid textons: modeling surfaces with reflectance and geometry
* Hyperspectral texture classification using generalized Markov fields
* Improving object classification in far-field video
* Incremental density approximation and kernel-based Bayesian filtering for object tracking
* Inference of multiple subspaces from high-dimensional data and application to multibody grouping
* Inferring 3D body pose from silhouettes using activity manifold learning
* Integrating and employing multiple levels of zoom for activity recognition
* Integrating multiple model views for object recognition
* Invariant operators, small samples, and the bias-variance dilemma
* invariant, closed-form solution for matching sets of 3D lines, An
* Is bottom-up attention useful for object recognition?
* Jitter camera: high resolution video from a low resolution detector
* Joint feature-basis subset selection
* Joint prior models of neighboring objects for 3D image segmentation
* Learning a restricted Bayesian network for object detection
* Learning classifiers from imbalanced data based on biased minimax probability machine
* Learning distance functions for image retrieval
* Learning methods for generic object recognition with invariance to pose and lighting
* Learning Object Detection from a Small Number of Examples: The Importance of Good Features
* Learning to segment images using region-based perceptual features
* Lie-algebraic averaging for globally consistent motion estimation
* Linear model hashing and batch RANSAC for rapid and accurate object recognition
* Linear projection methods in face recognition under unconstrained illuminations: a comparative study
* Linear sequence-to-sequence alignment
* Local facial asymmetry for expression classification
* Local smoothing for manifold learning
* L∞ minimization in geometric reconstruction problems
* Making one object look like another: Controlling appearance using a projector-camera system
* Metamorphs: deformable shape and texture models
* method of vector fields for catadioptric sensor design with applications to panoramic imaging, The
* Minimal Solution to the Generalised 3-Point Pose Problem, A
* Minimum Effective Dimension for Mixtures of Subspaces: A Robust GPCA Algorithm and its Applications
* model for dynamic shape and its applications, A
* Model-based motion clustering using boosted mixture modeling
* Modeling complex motion by tracking and editing hidden Markov graphs
* Modelling the effects of walking speed on appearance-based gait recognition
* Models of large population recognition performance
* Motion Estimation by Swendsen-Wang Cuts
* Motion Layer Extraction in the Presence of Occlusion Using Graph Cuts
* Motion segmentation with missing data using powerfactorization and GPCA
* Motion without correspondence from tomographic projections by Bayesian inversion theory
* Motion-based background subtraction using adaptive kernel density estimation
* Multi-classifier framework for atlas-based image segmentation
* Multi-scale visual tracking by sequential belief propagation
* Multibody factorization with uncertainty and missing data using the EM algorithm
* Multibody motion segmentation based on simulated annealing
* Multibody Trifocal Tensor: Motion Segmentation from 3 Perspective Views, The
* Multigrid and Multi-Level Swendsen-Wang Cuts for Hierarchic Graph Partition
* Multiobjective data clustering
* Multiple Bernoulli relevance models for image and video annotation
* Multiple kernel tracking with SSD
* Multiscale conditional random fields for image labeling
* Multiview occlusion analysis for tracking densely populated objects based on 2-D visual angles
* Names and faces in the news
* new GPCA algorithm for clustering subspaces by fitting, differentiating and dividing polynomials, A
* non-parametric approach for independent component analysis using kernel density estimation, A
* Non-rigid shape and motion recovery: degenerate deformations
* Novel region-based modeling for human detection within highly dynamic aquatic environment
* Object-based image retrieval using the statistical structure of images
* On the Distribution of Saliency
* Optimizing motion estimation with linear programming and detail-preserving variational method
* Orthogonal complement component analysis for positive samples in SVM based relevance feedback image retrieval
* Parts-based 3D object classification
* PCA-SIFT: a more distinctive representation for local image descriptors
* Perceptual organization of radial symmetries
* Perspective shape-from-shading by fast marching
* Point matching as a classification problem for fast and robust object pose estimation
* Pointwise motion tracking in echocardiographic images
* Probabilistic data association methods in visual tracking of groups
* Probabilistic expression analysis on manifolds
* probabilistic framework for combining tracking algorithms, A
* Probabilistic identity characterization for face recognition
* Probabilistic parameter-free motion detection
* Probability models for high dynamic range imaging
* Programmable imaging using a digital micromirror array
* Propagation networks for recognition of partially ordered sequential action
* Proposal maps driven MCMC for estimating human body pose in static images
* Radiometric alignment of image sequences
* Radiometric calibration from a single image
* Radiometric calibration of a Helmholtz stereo rig
* Random sampling based SVM for relevance feedback image retrieval
* Random sampling LDA for face recognition
* Rao-Blackwellized particle filter for eigentracking, A
* Real-time combined 2D+3D active appearance models
* Reconstructing 3D independent motions using non-accidentalness
* Reconstructing open surfaces from unorganized data points
* Recovering Human Body Configurations: Combining Segmentation and Recognition
* Recovering shape and irradiance maps from rich dense texton fields
* Recovering shape and reflectance model of non-Lambertian objects from multiple views
* Region-based progressive stereo matching
* Registration of diffusion tensor images
* Representation and matching of articulated shapes
* Restoration of curved document images through 3D shape modeling
* Robust color object detection using spatial-color joint probability functions
* Robust subspace clustering by combined use of kNND metric and SVD algorithm
* Role of shape and kinematics in human movement analysis
* Scalable discriminant feature selection for image retrieval and recognition
* Scale Selection for Anisotropic Scale-Space: Application to Volumetric Tumor Characterization
* Scale-invariant shape features for recognition of object categories
* Searching the web with mobile images for location recognition
* Segment-based stereo matching using graph cuts
* Segmentation using multiscale cues
* segmentation-free approach for skeletonization of gray-scale images via anisotropic vector diffusion, A
* Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects
* Selecting ghosts and queues from a car tracker's output using a spatio-temporal query language
* Self shadowing and local illumination of randomly rough surfaces
* Self-normalized linear tests
* Separating reflections from a single image using local features
* Separating style and content on a nonlinear manifold
* Shape constrained image segmentation by parametric distributional clustering
* Shape correspondence through landmark sliding
* Shape Representation and Classification Using the Poisson Equation
* Shaping receptive fields for affine invariance
* Sharing Features: Efficient Boosting Procedures for Multiclass Object Detection
* Shedding light on stereoscopic segmentation
* Similarity measure and learning with gray level aura matrices (GLAM) for texture image retrieval
* Simultaneous calibration and tracking with a network of non-overlapping sensors
* Space-Time Isosurface Evolution for Temporally Coherent 3D Reconstruction
* Space-time video completion
* Spatially coherent clustering using graph cuts
* Spherical harmonics vs. Haar wavelets: Basis for Recovering Illumination from Cast Shadows
* SPS algorithm: patching figural continuity and transparency by split-patch search, The
* Statistical feature fusion for gait-based human recognition
* Stereo Correspondence with Slanted Surfaces: Critical Implications of Horizontal Slant
* Studies on silhouette quality and gait recognition
* Super-resolution through neighbor embedding
* Synchronizing video sequences
* Thermal face recognition in an operational scenario
* Tomographic reconstruction of piecewise smooth images
* Towards robust structure-based enhancement and horizon picking in 3-D seismic data
* Tracking loose-limbed people
* Tracking multiple humans in crowded environment
* Uncalibrated and unsynchronized human motion capture: A Stereo Factorization Approach
* Uncontrolled Modulation Imaging
* unified framework for uncertainty propagation in automatic shape tracking, A
* unified spatio-temporal articulated model for tracking, A
* Unsupervised Learning of Image Manifolds by Semidefinite Programming
* unsupervised, online learning framework for moving object detection, An
* Using plane + parallax for calibrating dense camera arrays
* Using skew Gabor filter in source signal separation and local spectral multi-orientation analysis
* Value directed learning of gestures and facial displays
* Variational Approach to Problems in Calibration of Multiple Cameras, A
* variational approach to scene reconstruction and image segmentation from motion-blur cues, A
* Variational mixture smoothing for non-linear dynamical systems
* Video data mining using configurations of viewpoint invariant regions
* Video Repairing: Inference of Foreground and Background Under Severe Occlusion
* Video stabilization as a variational problem and numerical solution with the Viterbi method
* View independent human body pose estimation from a single perspective image
* Visual odometry
* Visual odometry and map correlation
* Visual tracking using learned linear subspaces
* Wavelet-based hierarchical surface approximation from height fields
* What image information is important in silhouette-based gait recognition?
* Wide baseline feature matching using the cross-epipolar ordering constraint
* Wide-baseline stereo from multiple views: A probabilistic account
* Window-based, discontinuity preserving stereo
* world in an eye, The
259 for CVPR04

CVPR05 * *CVPR
* 2D Statistical Models of Facial Expressions for Realistic 3D Avatar Animation
* 3D Articulated Motion Estimation from Images
* 3D Geometric and Optical Modeling of Warped Document Images from Scanners
* 3D Reconstruction by Fitting Low-Rank Matrices with Missing Data
* 8D-THERMO CAM: Combination of Geometry with Physiological Information for Face Recognition
* Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information
* Accurate Motion Layer Segmentation and Matting
* Actions Sketch: A Novel Action Representation
* Active Contours Using a Constraint-Based Implicit Representation
* Active Polyhedron: Surface Evolution Theory Applied to Deformable Meshes
* Activity Recognition and Abnormality Detection with the Switching Hidden Semi-Markov Model
* Addressing Radiometric Nonidealities: A Unified Framework
* Affine Object Tracking with Kernel-Based Spatial-Color Representation
* Algebraically Accurate Volume Registration Using Euler's Theorem and the 3-D Pseudo-Polar FFT
* ALIP: The Automatic Linguistic Indexing of Pictures System
* Analytically Solving Radial Distortion Parameters
* Appearance Modeling for Tracking in Multiple Non-Overlapping Cameras
* Appearance-Guided Particle Filtering for Articulated Hand Tracking
* Applying Neighborhood Consistency for Fast Clustering and Kernel Density Estimation
* ARTag: a Fiducial Marker System Using Digital Techniques
* Articulated Structure from Motion by Factorization
* Asymmetrical Occlusion Handling Using Graph Cut for Multi-View Stereo
* Audio-Visual Affect Recognition through Multi-Stream Fused HMM for HCI
* Automatic 3D to 2D Registration for the Photorealistic Rendering of Urban Scenes
* Automatic Face Recognition for Film Character Retrieval in Feature-Length Films
* Automatic Thermal Monitoring System (ATHEMOS) for Deception Detection
* Axiomatic Approach to Corner Detection, An
* Background Recognition in Dynamic Scenes with Motion Constraints
* Band-Weighted Landuse Classification Method for Multispectral Images, A
* Bayesian 3D Modeling from Images Using Multiple Depth Maps
* Bayesian Approach to Unsupervised Feature Selection and Density Estimation Using Expectation Propagation, A
* Bayesian Hierarchical Model for Learning Natural Scene Categories, A
* Bayesian Image Segmentation Using Wavelet-Based Priors
* Bayesian Mixture Model for Multi-View Face Alignment, A
* Bayesian Object Detection in Dynamic Scenes
* Bayesian Super-Resolution of Text in Video with a Text-Specific Bimodal Prior
* Beyond Lambert: Reconstructing Specular Surfaces Using Color
* Beyond Pairwise Clustering
* Bi-Layer Segmentation of Binocular Stereo Video
* Blob Segmentation Using Joint Space-Intensity Likelihood Ratio Test: Application to 3D Tumor Segmentation
* Boosting Saliency in Color Image Features
* Calibration of Pan-Tilt-Zoom (PTZ) Cameras and Omni-Directional Cameras
* Camera Calibration and Light Source Estimation from Images with Shadows
* Camera Matchmoving in Unprepared, Unknown Environments
* Caratheodory-Fejer Approach to Dynamic Appearance Modeling, A
* Cerebral Vascular Atlas Generation for Anatomical Knowledge Modeling and Segmentation Purpose
* Classification of Contour Shapes Using Class Segment Sets
* Closed Form Solution to Direct Motion Segmentation, A
* Cloth Representation by Shape from Shading with Shading Primitives
* Coherent Regions for Concise and Stable Image Description
* Colour Constancy Using the Chromagenic Constraint
* Combined Physical and Statistical Approach to Colour Constancy, A
* Combining Object and Feature Dynamics in Probabilistic Tracking
* Combining Variable Selection with Dimensionality Reduction
* Complex 3D Shape Recovery Using a Dual-Space Approach
* Computer Vision for Music Identification
* Computer Vision for Music Identification: Video Demonstration
* Concurrent Subspaces Analysis
* Conformal Deskewing of Non-Planar Documents
* Contrast Enhancement of Multi-Displays Using Human Contrast Sensitivity
* Corrected Laplacians: Closer Cuts and Segmentation with Shape Priors
* Correspondence Expansion for Wide Baseline Stereo
* Coupled Kernel-Based Subspace Learning
* Coupled PDEs for Non-Rigid Registration and Segmentation
* Creating Invariance to Nuisance Parameters in Face Recognition
* Cross-Generalization: Learning Novel Classes from a Single Example by Feature Replacement
* Cross-Validatory Statistical Approach to Scale Selection for Image Denoising by Nonlinear Diffusion, A
* Damped Newton Algorithms for Matrix Factorization with Missing Data
* Database-Guided Segmentation of Anatomical Structures with Complex Appearance
* Decentralized Multiple Target Tracking Using Netted Collaborative Autonomous Trackers
* Dense Photometric Stereo Using a Mirror Sphere and Graph Cut
* Dense Photometric Stereo Using Tensorial Belief Propagation
* Dense Stereo Matching Using Two-Pass Dynamic Programming with Generalized Ground Control Points, A
* Detecting Doctored Images Using Camera Response Normality and Consistency
* Detecting, Localizing and Recovering Kinematics of Textured Animals
* Detection and Explanation of Anomalous Activities: Representing Activities as Bags of Event n-Grams
* Determining the Radiometric Response Function from a Single Grayscale Image
* Diagram Structure Recognition by Bayesian Conditional Random Fields
* Digital Tapestry
* Direct Method for 3D Factorization of Nonrigid Motion Observed in 2D, A
* Direct Method for Modeling Non-Rigid Motion with Thin Plate Spline, A
* Discriminant Analysis with Tensor Representation
* Discriminative Density Propagation for 3D Human Motion Estimation
* Discriminative Framework for Modelling Object Classes, A
* Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data
* Discriminative Training for Object Recognition Using Image Patches
* Distinctiveness, Detectability, and Robustness of Local Image Features, The
* Driver State Monitor from DELPHI
* Dynamic Conditional Random Field Model for Object Segmentation in Image Sequences, A
* Dynamic Environment Exploration Using a Virtual White Cane
* Dynamosaics: Video Mosaics with Non-Chronological Time
* Efficient Image Matching with Distributions of Local Invariant Features
* Efficient Mean-Shift Tracking via a New Similarity Measure
* Efficient Multiclass Object Detection by a Hierarchy of Classifiers
* Efficient Nearest Neighbor Classification Using a Cascade of Approximate Similarity Measures
* Efficient Real-Time Algorithms for Eye State and Head Pose Tracking in Advanced Driver Support Systems
* Energy Minimization via Graph Cuts: Settling What is Possible
* Ensemble Tracking
* Estimating 3D Shape and Texture Using Pixel Intensity, Edges, Specular Highlights, Texture Constraints and a Prior
* Estimating Disparity and Occlusions in Stereo Video Sequences
* Evaluating Image Retrieval
* Eye Gaze Tracking under Natural Head Movements
* Face Modeling and Analysis in Stony Brook University
* Face Recognition Based on Frontal Views Generated from Non-Frontal Images
* Face Recognition with Image Sets Using Manifold Density Divergence
* Face Synthesis and Recognition from a Single Image under Arbitrary Unknown Lighting Using a Spherical Harmonic Basis Morphable Model
* Face Verification Across Age Progression
* Facial Muscle Activations from Motion Capture
* Factorization-Based Approach to Articulated Motion Recovery, A
* Factorization-Based Approach to Articulated Motion Recovery, A
* Fast Illumination-Invariant Background Subtraction Using Two Views: Error Analysis, Sensor Placement and Applications
* Fast Spatial Pattern Discovery Integrating Boosting with Constellations of Contextual Descriptors
* Feature Kernel Functions: Improving SVMs Using High-Level Knowledge
* Feature Uncertainty Arising from Covariant Image Noise
* Feature-Level Fusion in Personal Identification
* Fields of Experts: A Framework for Learning Image Priors
* Finding Glass
* Fisher+Kernel Criterion for Discriminant Analysis
* Flattening Curved Documents in Images
* Flow-Based Approach to Vehicle Detection and Background Mosaicking in Airborne Video, A
* Formulating Semantic Image Annotation as a Supervised Learning Problem
* Framework of 2D Fisher Discriminant Analysis: Application to Face Recognition with Small Number of Training Samples, A
* Full Body Tracking from Multiple Views Using Stochastic Sampling
* Full-Frame Video Stabilization
* Generative Model of Human Hair for Hair Sketching, A
* Generative versus Discriminative Methods for Object Recognition
* Geo-Consistency for Wide Multi-Camera Stereo
* Geodesic Computation for Adaptive Remeshing
* Graph Embedding: A General Framework for Dimensionality Reduction
* Guided Sampling via Weak Motion Models and Outlier Sample Generation for Epipolar Geometry Estimation
* Hallucinating Faces: TensorPatch Super-Resolution and Coupled Residue Compensation
* Hand Tracking with Flocks of Features
* Hierarchical Part-Based Visual Object Categorization
* High Resolution Grammatical Model for Face Representation and Sketching, A
* Higher Order Whitening of Natural Images
* Higher-Order Image Statistics for Unsupervised, Information-Theoretic, Adaptive, Image Filtering
* Histograms of Oriented Gradients for Human Detection
* Hybrid Graphical Model for Robust Feature Extraction from Video, A
* Hybrid Joint-Separable Multibody Tracking
* Hybrid Models for Human Motion Recognition
* Identifying Semantically Equivalent Object Fragments
* Illumination Normalization for Face Recognition and Uneven Background Correction Using Total Variation Based Image Models
* Illumination-Invariant Tracking via Graph Cuts
* Image Denoising Using Non-Negative Sparse Coding Shrinkage Algorithm
* Imaging the Cardiovascular Pulse
* Implicit Surfaces Make for Better Silhouettes
* Indexing with Unknown Illumination and Pose
* Infomax Boosting
* Integral Histogram: A Fast Way To Extract Histograms in Cartesian Spaces
* Integrated Learning of Saliency, Complex Features, and Object Detectors from Cluttered Scenes
* Integration of Motion Fields through Shape
* Interactive Graph Cut Based Segmentation with Shape Priors
* Interactive Montages of Sprites for Indexing and Summarizing Security Video
* Interactive Pinpoint Image Object Removal
* Interactive Shape from Shading
* Inverse Polarization Raytracing: Estimating Surface Shapes of Transparent Objects
* Isophote Properties as Features for Object Detection
* Jensen-Shannon Boosting Learning for Object Recognition
* Joint Nonparametric Alignment for Analyzing Spatial Gene Expression Patterns in Drosophila Imaginal Discs
* Kernel-Based Bayesian Filtering for Object Tracking
* Learn Discriminant Features for Multi-View Face and Eye Detection
* Learning a Multi-Size Patch-Based Hybrid Kernel Machine Ensemble for Abnormal Region Detection in Colonoscopic Images
* Learning a Similarity Metric Discriminatively, with Application to Face Verification
* Learning and Detecting Activities from Movement Trajectories Using the Hierarchical Hidden Markov Models
* Learning Appearance Manifolds from Video
* Learning Feature Distance Measures for Image Correspondences
* Learning Spatiotemporal T-Junctions for Occlusion Detection
* Learning the Semantics of Images by Using Unlabeled Samples
* Learning to Estimate Human Pose with Data Driven Belief Propagation
* Learning to Track: Conceptual Manifold Map for Closed-Form Tracking
* Learning with Constrained and Unlabelled Data
* Level Set Active Contours on Unstructured Point Cloud
* Level Set Based Shape Prior Segmentation
* Level Set Evolution without Re-Initialization: A New Variational Formulation
* Linear Combination Representation for Outlier Detection in Motion Tracking
* Local Color Transfer via Probabilistic Segmentation by Expectation-Maximization
* Local Discriminant Embedding and Its Variants
* Localization in Urban Environments: Monocular Vision Compared to a Differential GPS Sensor
* Locally Adaptive Support-Weight Approach for Visual Correspondence Search
* Machine Learning for Clinical Diagnosis from Functional Magnetic Resonance Imaging
* Mapping Low-Level Features to High-Level Semantic Concepts in Region-Based Image Retrieval
* Markov Random Field Approach for Dense Photometric Stereo, A
* Matching with PROSAC: Progressive Sample Consensus
* Measure of Deformability of Shapes, with Applications to Human Motion Analysis, A
* MER-DIMES: A Planetary Landing Application of Computer Vision
* Mercer Kernels for Object Recognition with Local Features
* Minimal Solution for Relative Pose with Unknown Focal Length, A
* Mixture Trees for Modeling and Fast Conditional Sampling with Applications in Vision and Graphics
* Modeling and Learning Contact Dynamics in Human Motion
* Modeling Dynamic Scenes with Active Appearance
* Modelling Dynamic Scenes by Registering Multi-View Image Sequences
* Modelling Reflections via Multiperspective Imaging
* Modified pbM-Estimator Method and a Runtime Analysis Technique for the RANSAC Family, The
* Monocular 3-D Tracking of the Golf Swing
* Monocular 3-D Tracking of the Golf Swing
* MosaicShape: Stochastic Region Grouping with Shape Prior
* Motion Segmentation of Multiple Translating Objects Using Line Correspondences
* Moving Cast Shadow Detection from a Gaussian Mixture Shadow Model
* MRF Augmented Particle Filter Tracker
* Multi-Image Matching Using Multi-Scale Oriented Patches
* Multi-Output Regularized Projection
* Multi-View Geometry for General Camera Models
* Multi-View Stereo via Volumetric Graph-Cuts
* Multilabel Random Walker Image Segmentation Using Prior Models
* Multilinear Independent Components Analysis
* Multimodal Face Recognition: Combination of Geometry with Physiological Information
* Multiple Collaborative Kernel Tracking
* Multiple Object Tracking with Kernel Particle Filter
* Multiscale Segmentation by Combining Motion and Intensity Cues
* Multitarget Tracking with Split and Merged Measurements
* Near Real-Time Reliable Stereo Matching Using Programmable Graphics Hardware
* New Active Contour Method Based on Elastic Interaction, A
* Non-Local Algorithm for Image Denoising, A
* Nonlinear Approach for Face Sketch Synthesis and Recognition, A
* Nonlinear Face Recognition Based on Maximum Average Margin Criterion
* Nonparametric Subspace Analysis for Face Recognition
* Obj Cut
* Object Class Recognition by Boosting a Part-Based Model
* Object Class Recognition Using Multiple Layer Boosting with Heterogeneous Features
* Object Detection Using 2D Spatial Ordering Constraints
* Object Detection Using 2D Spatial Ordering Constraints
* Object Recognition with Features Inspired by Visual Cortex
* On Modelling Nonlinear Shape-and-Texture Appearance Manifolds
* On the Absolute Quadratic Complex and Its Application to Autocalibration
* On the Localization of Straight Lines in 3D Space from Single 2D Images
* On the Small Sample Performance of Boosted Classifiers
* On-Line Learning Mechanism for Unsupervised Classification and Topology Representation, An
* Online Detection and Classification of Moving Objects Using Progressively Improving Detectors
* Online Learning of Probabilistic Appearance Manifolds for Video-Based Recognition and Tracking
* Online Modeling and Tracking of Pose-Varying Faces in Video
* Online Selecting Discriminative Tracking Features Using Particle Filter
* Optical Flow Estimation and Segmentation of Multiple Moving Dynamic Textures
* Optimal Point Correspondence through the Use of Rank Constraints
* Optimal Sub-Shape Models by Minimum Description Length
* Optimization Design of Cascaded Classifiers
* Ordinal Palmprint Representation for Personal Identification
* Overview of the Face Recognition Grand Challenge
* Parameter Estimation for MRF Stereo
* Part-Based Statistical Models for Object Classification and Detection
* Particle Filtering for Geometric Active Contours with Application to Tracking Moving and Deforming Objects
* Pedestrian Detection in Crowded Scenes
* Personal Identification Utilizing Finger Surface Features
* Pixels that Sound
* Polarization Multiplexing for Bidirectional Imaging
* Pose-Robust Face Recognition Using Geometry Assisted Probabilistic Modeling
* Principled Approach to Detecting Surprising Events in Video, A
* Probabilistic Fusion of Stereo with Color and Contrast for Bi-Layer Segmentation
* Probabilistic Kernels for the Classification of Auto-Regressive Visual Processes
* Probabilistic Modeling-Based Vessel Enhancement in Thoracic CT Scans
* Projector-Camera System with Real-Time Photometric Adaptation for Dynamic Environments, A
* Projector-Camera System with Real-Time Photometric Adaptation for Dynamic Environments, A
* Pruning Training Sets for Learning of Object Categories
* Quantitative Evaluation of a Novel Image Segmentation Algorithm
* Radial Trifocal Tensor: A Tool for Calibrating the Radial Distortion of Wide-Angle Cameras, The
* Radon-Based Structure from Motion without Correspondences
* Random Subspaces and Subsampling for 2-D Face Recognition
* Random Subwindows for Robust Image Classification
* Randomized Trees for Real-Time Keypoint Recognition
* Range Data Registration Using Photometric Features
* Rank-R Approximation of Tensors: Using Image-as-Matrix Representation
* Rational Function Lens Distortion Model for General Cameras, A
* Real-Time Connectivity Constrained Depth Map Computation Using Programmable Graphics Hardware
* Real-Time Multiple Objects Tracking with Occlusion Handling in Dynamic Scenes
* Real-Time Non-Rigid Surface Detection
* Real-Time Tracking Using Level Sets
* Real-Time Tracking with Multiple Cues by Set Theoretic Random Search
* Real-Time Wide Area Multi-Camera Stereo Tracking
* Recognizing Facial Expression: Machine Learning and Application to Spontaneous Behavior
* Reflection Components Decomposition of Textured Surfaces Using Linear Basis Functions
* Reflections on the Generalized Bas-Relief Ambiguity
* Region Competition via Local Watershed Operators
* Representational Oriented Component Analysis (ROCA) for Face Recognition with One Sample Image per Training Class
* Restoration and Recognition in a Loop
* RGB-Z: Mapping a Sparse Depth Map to a High Resolution RGB Camera Image
* Robust and Efficient Foreground Analysis for Real-Time Video Surveillance
* Robust Boosting for Learning from Few Examples
* Robust Centerline Extraction Framework Using Level Sets
* Robust Face Detection with Multi-Class Boosting
* Robust Instantaneous Rigid Motion Estimation
* Robust L-1 Norm Factorization in the Presence of Outliers and Missing Data by Alternative Convex Programming
* Robust Object Detection via Soft Cascade
* Scene-Adapted Structured Light
* Segmentation Induced by Scale Invariance
* Segmentation of a Piece-Wise Planar Scene from Perspective Images
* Segmentation of Edge Preserving Gradient Vector Flow: An Approach Toward Automatically Initializing and Splitting of Snakes
* Selection and Fusion of Color Models for Feature Detection
* Semi-Supervised Active Learning Framework for Image Retrieval, A
* Semi-Supervised Adapted HMMs for Unusual Event Detection
* Semi-Supervised Cross Feature Learning for Semantic Concept Detection in Videos
* Semi-Supervised Learning Based Object Detection in Aerial Imagery
* Shape from Shading: A Well-Posed Problem?
* Shape Matching and Object Recognition Using Low Distortion Correspondences
* Shape Regularized Active Contour Using Iterative Global Search and Local Optimization
* Shock Filters Based on Implicit Cluster Separation
* SIFT Descriptor with Global Context, A
* Simultaneous Estimation of Segmentation and Shape
* Simultaneous Modeling and Tracking (SMAT) of Feature Sets
* Single Image Phase-Based MRI Fat Suppression Expectation Maximization Algorithm
* Skeletal Parameter Estimation from Optical Motion Capture Data
* Skeletal Parameter Estimation from Optical Motion Capture Data
* Slit Scanning Depth of Route Panorama from Stationary Blur, A
* Space-Time Behavior Based Correlation
* Sparse Object Category Model for Efficient Learning and Exhaustive Recognition, A
* Sparse Support Vector Machine Approach to Region-Based Image Categorization, A
* Spatial Priors for Part-Based Recognition Using Statistical Models
* Spatiograms versus Histograms for Region-Based Tracking
* Speckle-Constrained Filtering of Ultrasound Images
* Spectral Segmentation with Multiscale Graph Decomposition
* Statistical Cue Integration for Foveated Wide-Field Surveillance
* Statistical Field Model for Pedestrian Detection, A
* Stereo Correspondence by Dynamic Programming on a Tree
* Strike a Pose: Tracking People by Finding Stylized Poses
* Subspace Analysis Using Random Mixture Models
* Symmetric Stereo Matching for Occlusion Handling
* Synchronization and Calibration of a Camera Network for 3D Event Reconstruction from Live Video
* Tangent-Corrected Embedding
* Tensor Decomposition for Geometric Grouping and Segmentation, A
* Theoretical Analysis on Reconstruction-Based Super-Resolution for an Arbitrary PSF
* Theory for Variational Area-Based Segmentation Using Non-Quadratic Penalty Functions
* Tone Reproduction: A Perspective from Luminance-Driven Perceptual Grouping
* Top-Down and Bottom-Up Strategies in Lesion Detection of Background Diabetic Retinopathy
* Towards Complete Generic Camera Calibration
* Tracking Multiple Colored Blobs with a Moving Camera
* Tracking Multiple Mouse Contours (without Too Many Samples)
* Tracking Multiple Objects through Occlusions
* Tracking Multiple Objects through Occlusions
* Tracking Non-Stationary Appearances and Dynamic Feature Selection
* Tracking People and Recognizing Their Activities
* Two Level Approach for Scene Recognition, A
* Two-Stage Level Set Evolution Scheme for Man-Made Objects Detection in Aerial Images, A
* Two-View Geometry Estimation Unaffected by a Dominant Plane
* Two-View Multibody Structure-and-Motion with Outliers
* Unified Framework for Tracking through Occlusions and across Sensor Gaps, A
* Unified Optimization Based Learning Method for Image Retrieval, A
* Unstructured Point Cloud Matching within Graph-Theoretic and Thermodynamic Frameworks
* Unsupervised Learning in Radiology Using Novel Latent Variable Models
* Unsupervised Learning of Discriminative Edge Measures for Vehicle Matching between Non-Overlapping Cameras
* Unsupervised Learning of Object Features from Video Sequences
* Using Coupled Subspace Models for Recovery of Reflectance Spectra from Airborne Images
* Using Particles to Track Varying Numbers of Interacting People
* Using the Inner-Distance for Classification of Articulated Shapes
* Using the KL-Center for Efficient and Accurate Retrieval of Distributions Arising from Texture Images
* Vehicle Fingerprinting for Reacquisition and Tracking in Videos
* Vehicle Segmentation and Tracking from a Low-Angle Off-Axis Camera
* Video Epitomes
* Videoshop: A New Framework for Spatio-Temporal Video Editing in Gradient Domain
* Visibility Constrained Surface Evolution
* Visual Concepts for News Story Tracking: Analyzing and Exploiting the NIST TRECVID Video Annotation Experiment
* Visual Tracking in the Presence of Motion Blur
* WaldBoost: Learning for Time Constrained Sequential Detection
* Weighted Nearest Mean Classifier for Sparse Subspaces, A
* Why I Want a Gradient Camera
* Wide-Baseline Stereo Matching with Line Segments
354 for CVPR05

CVPR06 * *CVPR
* 3-D Shape Reconstruction of Retinal Fundus
* 3D Alignment of Face in a Single Image
* 3D Building Detection and Modeling from Aerial LIDAR Data
* 3D Face Recognition Using 3D Alignment for PCA
* 3D Facial Expression Recognition Based on Primitive Surface Feature Distribution
* 3D People Tracking with Gaussian Process Dynamical Models
* 3D Reconstruction of Background and Objects Moving on Ground Plane Viewed from a Moving Camera
* 3D Surface Matching and Recognition Using Conformal Geometry
* Accelerated Kernel Feature Analysis
* Acceleration Strategies for Gaussian Mean-Shift Image Segmentation
* Accurate Face Alignment using Shape Constrained Markov Network
* Accurate Tracking of Monotonically Advancing Fronts
* Active Graph Cuts
* Activity Analysis in Microtubule Videos by Mixture of Hidden Markov Models
* AdaBoost.MRF: Boosted Markov Random Forests and Application to Multilevel Activity Recognition
* Adaptive Appearance Model Approach for Model-based Articulated Object Tracking, An
* Affine Invariance Revisited
* Aligning ASL for Statistical Translation Using a Discriminative Word Model
* Animals on the Web
* AnnoSearch: Image Auto-Annotation by Search
* Applying Ensembles of Multilinear Classifiers in the Frequency Domain
* Are two rotational flows sufficient to calibrate a smooth non-parametric sensor?
* Augmenting Shape with Appearance in Vehicle Category Recognition
* Automatic Cast Listing in Feature-Length Films with Anisotropic Manifold Space
* Automatic Discovery of Action Taxonomies from Multiple Views
* Automatic Kinematic Chain Building from Feature Trajectories of Articulated Objects
* Automatic Landmark Tracking and its Application to the Optimization of Brain Conformal Mapping
* Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
* Bilayer Segmentation of Live Video
* Binocular Stereo Dense Matching in the Presence of Specular Reflections
* Blind Haze Separation
* Body Localization in Still Images Using Hierarchical Models and Hybrid Search
* BoostMotion: Boosting a Discriminative Similarity Function for Motion Estimation
* Bottleneck Geodesic: Computing Pixel Affinity, The
* Bottom-Up and Top-down Object Detection using Primal Sketch Features and Graphical Models
* Classifying Human Dynamics Without Contact Forces
* Closed-Form Solution to Natural Image Matting, A
* Clustering Appearance for Scene Analysis
* Color Subspaces as Photometric Invariants
* Combined Depth and Outlier Estimation in Multi-View Stereo
* Combining Cues: Shape from Shading and Texture
* Comparing Belief Propagation and Graph Cuts for Novelty Detection
* Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, A
* Composite Templates for Cloth Modeling and Sketching
* Computing Exact Discrete Minimal Surfaces: Extending and Solving the Shortest Path Problem in 3D with Application to Segmentation
* Conditional Random People: Tracking Humans with CRFs and Grid Filters
* Conic Section Classifier and its Application to Image Datasets, A
* Context and Hierarchy in a Probabilistic Image Model
* Continuous Super-Resolution for Recovery of 1-D Image Features: Algorithm and Performance Modeling
* Contour-Based Structure from Reflection
* Control Theory and Fast Marching Techniques for Brain Connectivity Mapping
* Correlated Label Propagation with Application to Multi-label Learning
* Cosegmentation of Image Pairs by Histogram Matching: Incorporating a Global Constraint into MRFs
* Counting Crowded Moving Objects
* Coupled Bayesian Framework for Dual Energy Image Registration
* Covariance Tracking using Model Update Based on Lie Algebra
* CSIFT: A SIFT Descriptor with Color Invariant Characteristics
* Deformation Modeling for Robust 3D Face Matching
* Depth from Familiar Objects: A Hierarchical Model for 3D Scenes
* Design of High-Level Features for Photo Quality Assessment, The
* Design Principle for Coarse-to-Fine Classification, A
* Detection Technique for Degraded Face Images, A
* Differential Tracking based on Spatial-Appearance Model (SAM)
* Diffusion Distance for Histogram Comparison
* Dimensionality Reduction by Learning an Invariant Mapping
* Discriminative Learning of Mixture of Bayesian Network Classifiers for Sequence Classification
* Discriminative Object Class Models of Appearance and Shape by Correlatons
* Distributed Cost Boosting and Bounds on Mis-classification Cost
* Dynamic Appearance Modeling for Human Tracking
* Dynamic Bayesian Network Model for Autonomous 3D Reconstruction from a Single Indoor Image, A
* Dynamics Based Robust Motion Segmentation
* Edge Suppression by Gradient Field Transformation Using Cross-Projection Tensors
* Efficiency Criterion for 2D Shape Model Selection, An
* Efficient Maximally Stable Extremal Region (MSER) Tracking
* Efficient Nonparametric Belief Propagation with Application to Articulated Body Tracking
* Efficient Optimal Kernel Placement for Reliable Visual Tracking
* Element-Free Elastic Models for Volume Fitting and Capture
* Epipolar Geometry of Central Projection Systems Using Veronese Maps
* Equivalence of Non-Iterative Algorithms for Simultaneous Low Rank Approximations of Matrices
* Escaping local minima through hierarchical model selection: Automatic object discovery, segmentation, and tracking in video
* Estimating Intrinsic Component Images using Non-Linear Regression
* Euclidean Structure from Confocal Conics: Theory and Application to Camera Calibration
* Extracting Subimages of an Unknown Category from a Set of Images
* Face Recognition using 2.5D Shape Information
* Fast Compact City Modeling for Navigation Pre-Visualization
* Fast Human Detection Using a Cascade of Histograms of Oriented Gradients
* Fast Variational Segmentation using Partial Extremal Initialization
* Feature Selection for Evaluating Fluorescence Microscopy Images in Genome-Wide Cell Screens
* Framework for Feature Selection for Background Subtraction, A
* Fully Automatic Registration of 3D Point Clouds
* Function Space of an Activity, The
* Fusion of Detection and Matching Based Approaches for Laser Based Multiple People Tracking
* Fusion of Summation Invariants in 3D Human Face Recognition
* General Framework and New Alignment Criterion for Dense Optical Flow, A
* Generalized EM Approach for 3D Model Based Face Recognition under Occlusions, A
* Generative-Discriminative Hybrid Method for Multi-View Object Detection, A
* Geodesic Active Contour Framework for Finding Glass, A
* Geometric Hashing with Local Affine Frames
* Gesture Recognition using Hidden Markov Models from Fragmented Observations
* Globally Optimal Grouping for Symmetric Boundaries
* Graph Based Approach for Naming Faces in News Photos, A
* Graph Laplacian Kernels for Object Classification from a Single Example
* Graph Partitioning by Spectral Rounding: Applications in Image Segmentation and Clustering
* Grouping with Asymmetric Affinities: A Game-Theoretic Perspective
* Groupwise point pattern registration using a novel CDF-based Jensen-Shannon Divergence
* Hidden Conditional Random Fields for Gesture Recognition
* Hierarchical Procrustes Matching for Shape Retrieval
* Hierarchical Statistical Learning of Generic Parts of Object Structure
* Hierarchical Volumetric Multi-view Stereo Reconstruction of Manifold Surfaces based on Dual Graph Embedding
* Homography from Coplanar Ellipses with Application to Forensic Blood Splatter Reconstruction
* How Do Movie Viewers Perceive Scene Structure from Dynamic Cues?
* How many planar viewing surfaces are there in noncentral catadioptric cameras? Towards single-image localization of space lines
* Human Carrying Status in Visual Surveillance
* Identifying Color in Motion in Video Sensors
* Image Comparison by Compound Disjoint Information
* Image Completion Using Global Optimization
* Image Denoising Via Learned Dictionaries and Sparse representation
* Image Denoising with Shrinkage and Redundant Representations
* Image Matching Using Photometric Information
* Image Pre-Conditioning for Out-of-Focus Projector Blur
* Image-Based Multiclass Boosting and Echocardiographic View Classification
* Image-Segmentation Evaluation From the Perspective of Salient Object Extraction
* Impact of Dynamics on Subspace Embedding and Tracking of Sequences
* Improved watershed segmentation using water diffusion and local shape priors
* Improving Border Localization of Multi-Baseline Stereo Using Border-Cut
* Improving Recognition of Novel Input with Similarity
* Incorporating the Boltzmann Prior in Object Detection Using SVM
* Incremental learning of object detectors using a visual shape alphabet
* Inferring Facial Action Units with Causal Relations
* Instant 3Descatter
* Integrated Model of Top-Down and Bottom-Up Attention for Optimizing Detection Speed, An
* Integrated Segmentation and Classification Approach Applied to Multiple Sclerosis Analysis, An
* Integration of Top-down and Bottom-up Information for Image Labeling
* Intelligent Collaborative Tracking by Mining Auxiliary Objects
* Intensity-augmented Ordinal Measure for Visual Correspondence, An
* Interactive Feature Tracking using K-D Trees and Dynamic Programming
* Joint Boosting Feature Selection for Robust Face Recognition
* Joint Illumination and Shape Model for Visual Tracking, A
* Joint Recognition of Complex Events and Track Matching
* Kernel Uncorrelated and Orthogonal Discriminant Analysis: A Unified Approach
* Kernel-based Template Alignment
* Landmark-Based Geodesic Computation for Heuristically Driven Path Planning
* Large-scale Learning with SVM and Convolutional Nets for Generic Object Categorization
* Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects, The
* Learning a manifold-constrained map between image sets: Applications to matching and pose estimation
* Learning Boosted Asymmetric Classifiers for Object Detection
* Learning Distance Metrics with Contextual Constraints for Image Retrieval
* Learning Exemplar-Based Categorization for the Detection of Multi-View Multi-Pose Objects
* Learning Joint Top-Down and Bottom-up Processes for 3D Visual Inference
* Learning Non-Metric Partial Similarity Based on Maximal Margin Criterion
* Learning Object Shape: From Drawings to Images
* Learning Patch Dependencies for Improved Pose Mismatched Face Verification
* Learning Semantic Patterns with Discriminant Localized Binary Projections
* Learning Temporal Sequence Model from Partially Labeled Data
* Lensless Imaging with a Controllable Aperture
* Local Features, All Grown Up
* Local Steerable Phase (LSP) Feature for Face Representation and Recognition
* Locally Linear Models on Face Appearance Manifolds with Application to Dual-Subspace Based Classification
* Making a Long Video Short: Dynamic Video Synopsis
* Mean Field EM-algorithm for Coherent Occlusion Handling in MAP-Estimation Prob, A
* Measure Locally, Reason Globally: Occlusion-sensitive Articulated Pose Estimation
* Measurement integration under inconsistency for robust tracking
* Mesostructure from Specularity
* Meta-Evaluation of Image Segmentation Using Machine Learning
* Model Order Selection and Cue Combination for Image Segmentation
* Modeling Age Progression in Young Faces
* Modeling and Classifying Breast Tissue Density in Mammograms
* Modeling Correspondences for Multi-Camera Tracking Using Nonlinear Manifold Learning and Target Dynamics
* Modular Approach to the Analysis and Evaluation of Particle Filters for Figure Tracking, A
* Motion Estimation from Spheres
* Motion Patterns: High-Level Representation of Natural Video Sequences
* MRFs for MRIs: Bayesian Reconstruction of MR Images via Graph Cuts
* Multi-Aspect Detection of Articulated Objects
* Multi-Camera Scene Flow by Tracking 3-D Points and Surfels
* Multi-Channel Algorithm for Edge Detection Under Varying Lighting, A
* Multi-Object Tracking Through Simultaneous Long Occlusions and Split-Merge Conditions
* Multi-Resolution Patch Tensor for Facial Expression Hallucination
* Multi-Resolution Spin-Images
* Multi-Target Tracking: Linking Identities using Bayesian Network Inference
* Multi-View Stereo Revisited
* Multiclass Object Recognition with Sparse, Localized Features
* Multiple Face Model of Hybrid Fourier Feature for Large Face Image Set
* Multiple Object Class Detection with a Generative Model
* Multiscale Nonlinear Diffusion and Shock Filter for Ultrasound Image Enhancement
* Multiview Geometry for Texture Mapping 2D Images Onto 3D Range Data
* Neighborhood Aided Implicit Active Contours
* New Deformable Model for Boundary Tracking in Cardiac MRI and Its Application to the Detection of Intra-Ventricular Dyssynchrony, A
* New Formulation for Shape from Shading for Non-Lambertian Surfaces, A
* New Method of Probability Density Estimation with Application to Mutual Information Based Image Registration
* Noise Estimation from a Single Image
* Non-rigid Image Registration Using Geometric Features and Local Salient Region Features
* Non-Rigid Metric Shape and Motion Recovery from Uncalibrated Images Using Priors
* Nonlinear Mean Shift for Clustering over Analytic Manifolds
* Nonparametric Priors on the Space of Joint Intensity Distributions for Non-Rigid Multi-Modal Image Registration
* Novel Data Association Algorithm for Object Tracking in Clutter with Application to Tennis Video Analysis, A
* Object Pose Detection in Range Scan Data
* Off-road Path Following using Region Classification and Geometric Projection Constraints
* On Manifold Structure of Cardiac MRI Data: Application to Segmentation
* On-line Boosting and Vision
* Optimal Pose for Face Recognition
* Panoramic 3D Reconstruction Using Rotational Stereo Camera with Simple Epipolar Constraints
* Panum Proxy Algorithm for Dense Stereo Matching over a Volume of Interest, The
* Parameterized Duration Modeling for Switching Linear Dynamic Systems
* Particle Video: Long-Range Motion Estimation Using Point Trajectories
* Perception Strategies in Hierarchical Vision Systems
* Perceptually-Inspired and Edge-Directed Color Image Super-Resolution
* Performance Modeling and Prediction of Face Recognition Systems
* Person Reidentification Using Spatiotemporal Appearance
* Picture Collage
* Piecewise Image Registration in the Presence of Multiple Large Motions
* Planar Light Probe, A
* Polarization-based Surface Reconstruction via Patch Matching
* Principled Hybrids of Generative and Discriminative Models
* Probabilistic 3D Polyp Detection in CT Images: The Role of Sample Alignment
* Projective Invariant for Textures, A
* Pursuing Informative Projection on Grassmann Manifold
* Putting Objects in Perspective
* Quantitative Evaluation of Near Regular Texture Synthesis Algorithms
* Rank-One Projections With Adaptive Margins for Face Recognition
* RANSAC for (Quasi-)Degenerate data (QDEGSAC)
* Real Time Localization and 3D Reconstruction
* Real-time Hand Pose Recognition Using Low-Resolution Depth Images
* Real-time Image-Based Guidance Method for Lung-Cancer Assessment
* Real-Time Semi-Automatic Segmentation Using a Bayesian Network
* Real-Time Visual SLAM with Resilience to Erratic Motion
* Reciprocal Image Features for Uncalibrated Helmholtz Stereopsis
* Recognition of Composite Human Activities through Context-Free Grammar Based Representation
* Recognize High Resolution Faces: From Macrocosm to Microcosm
* Reconstructing Occluded Surfaces Using Synthetic Apertures: Stereo, Focus and Robust Measures
* Reconstruction in the Round Using Photometric Normals and Silhouettes
* Reconstruction with Interval Constraints Propagation
* Recovering Camera Motion Using L-inf Minimization
* Recursive estimation of generative models of video
* Recursive Recovery of Position and Orientation from Stereo Image Sequences without Three-Dimensional Structures
* Refractive Camera for Acquiring Stereo and Super-resolution Images, A
* Region-based Image Annotation using Asymmetrical Support Vector Machine-based Multiple-Instance Learning
* Region-Tree Based Stereo Using Dynamic Programming Optimization
* Registration Problem Revisited: Optimal Solutions From Points, Lines and Planes, The
* Regression-based Hand Pose Estimation from Multiple Cameras
* Removing Outliers Using The L-inf Norm
* Robust AAM Fitting by Fusion of Images and Disparity Data
* Robust Fragments-based Tracking using the Integral Histogram
* Robust multi-target tracking using spatio-temporal context
* Robust People Tracking with Global Trajectory Optimization
* Robust Real-Time Face Pose and Facial Expression Recovery
* Robust Tracking and Stereo Matching under Variable Illumination
* Robust Visual Tracking Using Case-Based Reasoning with Confidence
* Satellite Features for the Classification of Visually Similar Classes
* Scalable Monocular SLAM
* Scalable Recognition with a Vocabulary Tree
* Scale Variant Image Pyramids
* Scale-Driven Iterative Optimization for Brain Extraction and Registration
* SDG Cut: 3D Reconstruction of Non-lambertian Objects Using Graph Cuts on Surface Distance Grid
* Seamless Image Stitching of Scenes with Large Motions and Exposure Differences
* Searching Off-line Arabic Documents
* Segmentation by Level Sets and Symmetry
* Selecting Principal Components in a Two-Stage LDA Algorithm
* Semi-Supervised Classification Using Linear Neighborhood Propagation
* Separation of Highlight Reflections on Textured Surfaces
* Shape from Dynamic Texture for Planes
* Shape from Shading: Recognizing the Mountains through a Global View
* Shape Guided Object Segmentation
* Shape Representation based on Integral Kernels: Application to Image Matching and Segmentation
* Shape Representation for Planar Curves by Shape Signature Harmonic Embedding, A
* Shape Topics: A Compact Representation and New Algorithms for 3D Partial Shape Retrieval
* Shape-Based Approach to Robust Image Segmentation using Kernel PCA
* Simple Bayesian Framework for Content-Based Image Retrieval, A
* Simultaneous Registration and Modeling of Deformable Shapes
* Single View Reconstruction of Curved Surfaces
* Single-image vignetting correction using radial gradient symmetry
* Solving Markov Random Fields using Second Order Cone Programming Relaxations
* Space-Time Video Montage
* Sparse and Semi-supervised Visual Mapping with the S^3GP
* Spatial Divide and Conquer with Motion Cues for Tracking through Clutter
* Spatial Reflectance Recovery under Complex Illumination from Sparse Images
* Spatial Weighting for Bag-of-Features
* Spectral Methods for Automatic Multiscale Data Clustering
* Specular Flow and the Recovery of Surface Structure
* Statistical Analysis of Local 3D Structure in 2D Images
* Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation and Occlusion Handling
* Stereo Matching with Symmetric Cost Functions
* Stereo Vision in Structured Environments by Consistent Semi-Global Matching
* Structure and View Estimation for Tomographic Reconstruction: A Bayesian Approach
* Structure from Motion with Known Camera Positions
* Successive Convex Matching for Action Detection
* Supervised Learning of Edges and Object Boundaries
* Surface Geometric Constraints for Stereo in Belief Propagation
* Surface Reconstruction of Bone from X-ray Images and Point Distribution Model Incorporating a Novel Method for 2D-3D Correspondence
* SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition
* Three-Dimensional Volume Reconstruction Based on Trajectory Fusion from Confocal Laser Scanning Microscope Images
* Toward Optimal Kernel-based Tracking
* Toward Robust Distance Metric Analysis for Similarity Estimation
* Towards Multi-View Object Class Detection
* Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection
* Tracking of the Articulated Upper Body on Multi-View Stereo Image Sequences
* Tracking With Sobolev Active Contours
* Training Deformable Models for Localization
* Transformation invariant component analysis for binary images
* Tunable Kernels for Tracking
* Ultrasound-Specific Segmentation via Decorrelation and Statistical Region-Based Active Contours
* Uncertainty Models in Quasiconvex Optimization for Geometric Reconstruction
* Unsupervised Bayesian Detection of Independent Motion in Crowds
* Unsupervised Discovery of Action Classes
* Unsupervised Learning of Categories from Sets of Partially Matching Image Features
* Using Bilinear Models for View-invariant Action and Identity Recognition
* Using Dependent Regions for Object Categorization in a Generative Framework
* Using Language to Drive the Perceptual Grouping of Local Image Features
* Using Multiple Segmentations to Discover Objects and their Extent in Image Collections
* Using Stationary-Dynamic Camera Assemblies for Wide-area Video Surveillance and Selective Attention
* Vertical Parallax from Moving Shadows
* Vessel Crawlers: 3D Physically-based Deformable Organisms for Vasculature Segmentation and Analysis
* Video Completion by Motion Field Transfer
* Visible Surface Reconstruction from Normals with Discontinuity Consideration
* Visual Vocabulary for Flower Classification, A
* Weakly Supervised Top-down Image Segmentation
* When Fisher meets Fukunaga-Koontz: A New Look at Linear Discriminants
* Wire Structure Pattern Extraction and Tracking From X-Ray Images of Composite Mechanisms
318 for CVPR06

CVPR07 * *CVPR
* 3D Face Recognition Founded on the Structural Diversity of Human Faces
* 3D Face Recognition in the Presence of Expression: A Guidance-based Constraint Deformation Approach
* 3D Layout CRF for Multi-View Object Class Recognition and Segmentation
* 3D Occlusion Inference from Silhouette Cues
* 3D Probabilistic Feature Point Model for Object Detection and Recognition
* Accurate Object Detection with Deformable Shape Models Learnt from Images
* Accurate Object Localization with Shape Masks
* Accurate, Dense, and Robust Multi-View Stereopsis
* Accurately measuring human movement using articulated ICP with soft-joint constraints and a repository of articulated models
* Active Aperture Control and Sensor Modulation for Flexible Imaging
* Adaptive Distance Metric Learning for Clustering
* Algorithms for Batch Matrix Factorization with Application to Structure-from-Motion
* Approximate Nearest Subspace Search with Applications to Pattern Recognition
* Artificial Complex Cells via the Tropical Semiring
* Autocalibration and Uncalibrated Reconstruction of Shape from Defocus
* Autocalibration via Rank-Constrained Estimation of the Absolute Quadric
* Automatic Face Recognition from Skeletal Remains
* Automatic Removal of Chromatic Aberration from a Single Image
* Belief Propagation in a 3D Spatio-temporal MRF for Moving Object Detection
* Benchmark for the Comparison of 3-D Motion Segmentation Algorithms, A
* Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention
* Beyond Local Appearance: Category Recognition from Pairwise Interactions of Simple Features
* Biased Manifold Embedding: A Framework for Person-Independent Head Pose Estimation
* Bilattice-based Logical Reasoning for Human Detection
* Binning Scheme for Fast Hard Drive Based Image Search, A
* Blind Source Separation Perspective on Image Restoration, A
* Boosting Coded Dynamic Features for Facial Action Units and Facial Expression Recognition
* boosting regression approach to medical anatomy detection, A
* Bottom-Up Recognition and Parsing of the Human Body
* Bridging the Gap between Detection and Tracking for 3D Monocular Video-Based Motion Capture
* Canonical Face Depth Map: A Robust 3D Representation for Face Verification
* Capturing long-range correlations with patch models
* Change Detection in a 3-d World
* City-Scale Location Recognition
* Classifying Video with Kernel Dynamic Textures
* Closed-form Solution to 3D Reconstruction of Piecewise Planar Objects from Single Images, A
* Closed-Loop Tracking and Change Detection in Multi-Activity Sequences
* Color Constancy using Natural Image Statistics
* Combining local and global motion models for feature point tracking
* Combining Region and Edge Cues for Image Segmentation in a Probabilistic Gaussian Mixture Framework
* Combining Static Classifiers and Class Syntax Models for Logical Entity Recognition in Scanned Historical Documents
* Composite Models of Objects and Scenes for Category Recognition
* Compositional Boosting for Computing Hierarchical Image Structures
* Concurrent Multiple Instance Learning for Image Categorization
* Connecting the Out-of-Sample and Pre-Image Problems in Kernel Methods
* Consistent Temporal Variations in Many Outdoor Scenes
* Content-Based Image Annotation Refinement
* contextual dissimilarity measure for accurate and efficient image search, A
* Contextual Identity Recognition in Personal Photo Albums
* CRF-driven Implicit Deformable Model
* Crisp Weighted Support Vector Regression for robust single model estimation: application to object tracking in image sequences
* Deformable Motion Tracking of Cardiac Structures (DEMOTRACS) for Improved MR Imaging
* Deformable Surface Tracking Ambiguities
* Delaunay Deformable Models: Topology-Adaptive Meshes Based on the Restricted Delaunay Triangulation
* Detailed Human Shape and Pose from Images
* Detecting Object Boundaries Using Low-, Mid-, and High-level Information
* Detecting Pedestrians by Learning Shapelet Features
* Detecting Specular Surfaces on Natural Images
* Detection and segmentation of moving objects in highly dynamic scenes
* Detector Ensemble
* Differential Camera Tracking through Linearizing the Local Appearance Manifold
* Digital Topology on Adaptive Octree Grids
* Direct and Efficient Method for Piecewise-Planar Surface Reconstruction from Stereo Images, A
* Discontinuity Preserving Filtering over Analytic Manifolds
* Discovery of Collocation Patterns: from Visual Words to Visual Phrases
* Discriminant Additive Tangent Spaces for Object Recognition
* Discriminant Interest Points are Stable
* Discriminant Mutual Subspace Learning for Indoor and Outdoor Face Recognition
* Discriminative Cluster Refinement: Improving Object Category Recognition Given Limited Training Data
* Discriminative Learning of Dynamical Systems for Motion Tracking
* Dynamic 3D Scene Analysis from a Moving Vehicle
* Efficient Belief Propagation for Vision Using Linear Constraint Nodes
* Efficient Indexing For Articulation Invariant Shape Matching And Retrieval
* Efficient Minimal Solution for Infinitesimal Camera Motion, An
* Efficient MRF Deformation Model for Non-Rigid Image Matching
* Efficient new-view synthesis using pairwise dictionary priors
* Efficiently Determining Silhouette Consistency
* Eigenboosting: Combining Discriminative and Generative Information
* Element Rearrangement for Tensor-Based Subspace Learning
* Enhanced Level Building Algorithm for the Movement Epenthesis Problem in Sign Language Recognition
* Epitomic Representation of Human Activities
* Estimating Scale of a Scene from a Single Image Based on Defocus Blur and Scene Geometry
* Evaluation of Cost Functions for Stereo Matching
* Evaluation of Epipole Estimation Methods with/without Rank-2 Constraint across Algebraic/Geometric Error Functions
* Exemplar Model for Learning Object Classes, An
* Face Annotation Framework with Partial Clustering and Interactive Labeling, A
* Face Re-Lighting from a Single Image under Harsh Lighting Conditions
* Face Recognition using Discriminatively Trained Orthogonal Rank One Tensor Projections
* Face Recognition Using Kernel Ridge Regression
* Fast 3D Correspondence Method for Statistical Shape Modeling, A
* Fast 3D Scanning with Automatic Motion Compensation
* Fast Human Pose Estimation using Appearance and Motion via Multi-Dimensional Boosting Regression
* Fast Keypoint Recognition in Ten Lines of Code
* Fast Terrain Classification Using Variable-Length Representation for Autonomous Navigation
* Fast, Approximately Optimal Solutions for Single and Dynamic MRFs
* Feature Extraction by Maximizing the Average Neighborhood Margin
* Feature Mining for Image Classification
* Fiber Tract Clustering on Manifolds With Dual Rooted-Graphs
* Filtered Component Analysis to Increase Robustness to Local Minima in Appearance Models
* Fisher Kernels on Visual Vocabularies for Image Categorization
* Flash Cut: Foreground Extraction with Flash and No-flash Image Pairs
* Flexible Object Models for Category-Level 3D Object Recognition
* Free-Form Nonrigid Image Registration Using Generalized Elastic Nets
* From Videos to Verbs: Mining Videos for Activities using a Cascade of Dynamical Systems
* Generalized Thin-Plate Spline Warps
* Generic Face Alignment using Boosted Appearance Model
* Global Optimization for Shape Fitting
* Graph Cut Based Optimization for MRFs with Truncated Convex Priors
* Graph Reduction Method for 2D Snake Problems, A
* Graphical Model Approach to Iris Matching Under Deformation and Occlusion
* Groupwise Shape Registration on Raw Edge Sequence via A Spatio-Temporal Generative Model
* Handwritten Carbon Form Preprocessing Based on Markov Random Field
* Harmony in Motion
* Hierarchical Learning of Curves: Application to Guidewire Localization in Fluoroscopy
* Hierarchical Matching of Deformable Shapes
* Hierarchical Model of Shape and Appearance for Human Action Classification, A
* Hierarchical Structuring of Data on Manifolds
* High Distortion and Non-Structural Image Matching via Feature Co-occurrence
* High-dimensional statistical distance for region-of-interest tracking: Application to combining a soft geometric constraint with radiometry
* Human Detection via Classification on Riemannian Manifolds
* Hybrid learning of large jigsaws
* Hyperbolic Geometry of Illumination-Induced Chromaticity Changes, The
* Illumination Multiplexing within Fundamental Limits
* Image Classification with Segmentation Graph Kernels
* Image Hallucination Using Neighbor Embedding over Visual Primitive Manifolds
* Image Matching via Saliency Region Correspondences
* Image representations beyond histograms of gradients: The role of Gestalt descriptors
* Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration
* Imaging the Finger Force Direction
* Implicit Active Contours Driven by Local Binary Fitting Energy
* Improved Video Registration using Non-Distinctive Local Image Features
* Improving Part based Object Detection by Unsupervised, Online Boosting
* In Situ Evaluation of Tracking Algorithms Using Time Reversed Chains
* Incorporating On-demand Stereo for Real Time Recognition
* Incremental Linear Discriminant Analysis Using Sufficient Spanning Set Approximations
* Inferring 3D Volumetric Shape of Both Moving Objects and Static Background Observed by a Moving Camera
* Inferring Grammar-based Structure Models from 3D Microscopy Data
* Inferring Temporal Order of Images From 3D Structure
* Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction
* Isotropy, Reciprocity and the Generalized Bas-Relief Ambiguity
* Iterative MAP and ML Estimations for Image Segmentation
* Joint Object Segmentation and Behavior Classification in Image Sequences
* Joint Optimization of Cascaded Classifiers for Computer Aided Detection
* Joint Real-time Object Detection and Pose Estimation Using Probabilistic Boosting Network
* Kernel-based Tracking from a Probabilistic Viewpoint
* Kinematics from Lines in a Single Rolling Shutter Image
* Lagrangian Particle Dynamics Approach for Crowd Flow Segmentation and Stability Analysis, A
* Large scale vision-based navigation without an accurate global reconstruction
* Latent-Dynamic Discriminative Models for Continuous Gesture Recognition
* Layered Depth Panoramas
* Layered Graph Match with Graph Editing
* Learning a Spatially Smooth Subspace for Face Recognition
* Learning and Matching Line Aspects for Articulated Objects
* Learning Color Names from Real-World Images
* Learning Conditional Random Fields for Stereo
* Learning Dynamic Event Descriptions in Image Sequences
* Learning Features for Tracking
* Learning Gaussian Conditional Random Fields for Low-Level Vision
* Learning Generative Models via Discriminative Approaches
* Learning GMRF Structures for Spatial Priors
* Learning Kernel Expansions for Image Classification
* Learning Local Image Descriptors
* Learning Motion Categories using both Semantic and Structural Information
* Learning the Compositional Nature of Visual Objects
* Learning to Detect A Salient Object
* Learning Visual Representations using Images with Captions
* Learning Visual Similarity Measures for Comparing Never Seen Objects
* Leveraging temporal, contextual and ordering constraints for recognizing complex activities in video
* Light Fall-off Stereo
* Linear and Quadratic Subsets for Template-Based Tracking
* Linear Laplacian Discrimination for Feature Extraction
* Linear Programming Approach for Multiple Object Tracking, A
* Local and Weighted Maximum Margin Discriminant Analysis
* Local Ensemble Kernel Learning for Object Category Recognition
* Local Structure Detection with Orientation-invariant Radial Configuration
* Mapping Natural Image Patches by Explicit and Implicit Manifolds
* Marker-less Deformable Mesh Tracking for Human Shape and Motion Capture
* Matching Local Self-Similarities across Images and Videos
* Matrix-Structural Learning (MSL) of Cascaded Classifier from Enormous Training Set
* Maximally Stable Colour Regions for Recognition and Matching
* Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing
* minimal solution to the autocalibration of radial distortion, A
* Minimal Solutions for Panoramic Stitching
* Minutiae-based Fingerprint Individuality Model, A
* Model-Guided Segmentation of 3D Neuroradiological Image Using Statistical Surface Wavelet Model
* Modeling Appearances with Low-Rank SVM
* Monocular and Stereo Methods for AAM Learning from Video
* Motion and Appearance Contexts for Tracking and Re-Acquiring Targets in Aerial Videos
* Moving Forward in Structure From Motion
* Multi-class object tracking algorithm that handles fragmentation and grouping
* Multi-label image segmentation via max-sum solver
* Multi-modal Clustering for Multimedia Collections
* Multi-Resolution Dynamic Model for Face Aging Simulation, A
* Multi-scale Features for Detection and Segmentation of Rocks in Mars Images
* Multi-scale Structural Saliency for Signature Detection
* Multi-Scale Tikhonov Regularization Scheme for Implicit Surface Modelling, A
* Multi-View Document Rectification using Boundary
* Multiple Class Segmentation Using A Unified Framework over Mean-Shift Patches
* Multiple Instance Learning of Pulmonary Embolism Detection with Geodesic Distance along Vascular Structure
* Multiple Target Tracking Using Spatio-Temporal Markov Chain Monte Carlo Data Association
* Multiple View Image Reconstruction: A Harmonic Approach
* Mumford-Shah Meets Stereo: Integration of Weak Depth Hypotheses
* Nearest First Traversing Graph for Simultaneous Object Tracking and Recognition
* New Performance Evaluation Method for Face Identification: Regression Analysis of Misidentification Risk, A
* Nine-point Algorithm for Estimating Para-Catadioptric Fundamental Matrices, A
* Nonlinear Dynamical Shape Priors for Level Set Segmentation
* Nonparametric Treatment for Location/Segmentation Based Visual Tracking, A
* Novel Representation for Riemannian Analysis of Elastic Curves in Rn, A
* Object retrieval with large vocabularies and fast spatial matching
* Object Tracking by Asymmetric Kernel Mean Shift with Automatic Scale and Orientation Selection
* Objects in Action: An Approach for Combining Action Understanding and Object Perception
* Offline Signature Verification Using Online Handwriting Registration
* On Constant Focal Length Self-Calibration From Multiple Views
* On Constructing Facial Similarity Maps
* On Stabilisation of Parametric Active Contours
* On the Blind Classification of Time Series
* On the Direct Estimation of the Fundamental Matrix
* On the Performance Prediction and Validation for Multisensor Fusion
* On the Spacetime Geometry of Galilean Cameras
* On-the-fly Object Modeling while Tracking
* One-class Machine Learning for Brain Activation Detection
* Online Learning Asymmetric Boosted Classifiers for Object Detection
* Optimal Reduced Representation of a MoG with Applications to Medical Image Database Classification, An
* Optimal Step Nonrigid ICP Algorithms for Surface Registration
* Optimized Color Sampling for Robust Matting
* Optimizing Binary MRFs via Extended Roof Duality
* Optimizing Distribution-based Matching by Random Subsampling
* OPTIMOL: automatic Online Picture collecTion via Incremental MOdel Learning
* P3 and Beyond: Solving Energies with Higher Order Cliques
* Parameter Sensitive Detectors
* Partially Occluded Object-Specific Segmentation in View-Based Recognition
* PEET: Prototype Embedding and Embedding Transition for Matching Vehicles over Disparate Viewpoints
* Physics-Based Person Tracking Using Simplified Lower-Body Dynamics
* Polarization and Phase-Shifting for 3D Scanning of Translucent Objects
* practical algorithm for L-inf triangulation with outliers, A
* Precise Registration of 3D Models To Images by Swarming Particles
* Principal Curvature-Based Region Detector for Object Recognition
* Probabilistic Intensity Similarity Measure based on Noise Distributions, A
* Probabilistic Model for Object Recognition, Segmentation, and Non-Rigid Correspondence, A
* Probabilistic Reverse Annotation for Large Scale Image Retrieval
* Probabilistic visibility for multi-view stereo
* Progressive Finite Newton Approach To Real-time Nonrigid Surface Detection
* Projective Factorization of Multiple Rigid-Body Motions
* Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences
* Quality-Driven Face Occlusion Detection and Recovery
* Quantifying Facial Expression Abnormality in Schizophrenia by Combining 2D and 3D Features
* Quasi-Dense Wide Baseline Matching Using Match Propagation
* Radiometric Calibration from Noise Distributions
* Real-time Gesture Recognition with Minimal Training Requirements and On-line Learning
* Real-Time Plane-Sweeping Stereo with Multiple Sweeping Directions
* Real-time Visual Tracking under Arbitrary Illumination Changes
* Recognizing Human Activities from Silhouettes: Motion Subspace and Factorial Discriminative Graphical Model
* Recognizing objects by piecing together the Segmentation Puzzle
* Reducing correspondence ambiguity in loosely labeled training data
* Region Classification with Markov Field Aspect Models
* Regression tracking with data relevance determination
* Removal of Image Artifacts Due to Sensor Dust
* Resolving Objects at Higher Resolution from a Single Motion-blurred Image
* Resolving the Generalized Bas-Relief Ambiguity by Entropy Minimization
* Revocable Fingerprint Biotokens: Accuracy and Security Analysis
* Riemannian Analysis of Probability Density Functions with Applications in Vision
* Robust 3D Face Recognition Using Learned Visual Codebook
* Robust Estimation of Texture Flow via Dense Feature Sampling
* Robust Metric Reconstruction from Challenging Video Sequences
* Robust Real-Time Visual SLAM Using Scale Prediction and Exemplar Based Feature Description
* Robust Rotation and Translation Estimation in Multiview Reconstruction
* ROI-SEG: Unsupervised Color Segmentation by Combining Differently Focused Sub Results
* Saliency Detection: A Spectral Residual Approach
* Scaled Motion Dynamics for Markerless Motion Capture
* Seamless Mosaicing of Image-Based Texture Maps
* Searching Video for Complex Activities with Finite State Models
* Segmenting Images on the Tensor Manifold
* Segmenting Motions of Different Types by Unsupervised Manifold Clustering
* Semantic Hierarchies for Recognizing Objects and Parts
* Semantic Hierarchies for Visual Object Recognition
* Semi-supervised Hierarchical Models for 3D Human Pose Reconstruction
* Sensor noise modeling using the Skellam distribution: Application to the color edge detection
* ShadowCuts: Photometric Stereo with Shadows
* Shape from Planar Curves: A Linear Escape from Flatland
* Shape from Shading Based on Lax-Friedrichs Fast Sweeping and Regularization Techniques With Applications to Document Image Restoration
* Shape from Shading Under Various Imaging Conditions
* Shape Representation and Registration using Vector Distance Functions
* Shape Statistics for Image Segmentation with Prior
* Shape Variation-Based Frieze Pattern for Robust Gait Recognition
* Simultaneous Covariance Driven Correspondence (CDC) and Transformation Estimation in the Expectation Maximization Framework
* Simultaneous Depth Reconstruction and Restoration of Noisy Stereo Images using Non-local Pixel Distribution
* Simultaneous Matting and Compositing
* Simultaneous Object Detection and Segmentation by Boosting Local Shape Feature based Classifier
* Simultaneous Optimization of Structure and Motion in Dynamic Scenes Using Unsynchronized Stereo Cameras
* Single Image Motion Deblurring Using Transparency
* Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching
* Skin Detail Analysis for Face Recognition
* Soft Edge Smoothness Prior for Alpha Channel Super Resolution
* Solving Large Scale Binary Quadratic Problems: Spectral Methods vs. Semidefinite Programming
* Spatial selection for attentional visual tracking
* Spatial-Depth Super Resolution for Range Images
* Spatio-Temporal Markov Random Field for Video Denoising
* Speckle Tracking in 3D Echocardiography with Motion Coherence
* Spectral Matting
* Statistical Shape Analysis of Multi-Object Complexes
* Statistics of Infrared Images
* Stereo Matching on Objects with Fractional Boundary
* Stereo Matching via Disparity Estimation and Surface Modeling
* Surface-Growing Approach to Multi-View Stereo Reconstruction, A
* Surveillance in Virtual Reality: System Design and Multi-Camera Control
* Symmetric Objects are Hardly Ambiguous
* Tensor Canonical Correlation Analysis for Action Classification
* Texture-Preserving Shadow Removal in Color Images Containing Curved Surfaces
* Topic-Motion Model for Unsupervised Video Object Discovery, A
* Topological Approach to Hierarchical Segmentation using Mean Shift, A
* Topology matching for 3D video compression
* Topology Preserving Log-Unbiased Nonlinear Image Registration: Theory and Implementation
* Toward Flexible 3D Modeling using a Catadioptric Camera
* Towards Automatic Photometric Correction of Casually Illuminated Documents
* Towards Fog-Free In-Vehicle Vision Systems through Contrast Restoration
* Towards Robust Pedestrian Detection in Crowded Image Sequences
* Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts
* Trace Ratio vs. Ratio Trace for Dimensionality Reduction
* Tracking as Repeated Figure/Ground Segmentation
* Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans
* Tracking Large Variable Numbers of Objects in Clutter
* Tracking-as-Recognition for Articulated Full-Body Human Motion Analysis
* Trajectory Association across Non-overlapping Moving Cameras in Planar Scenes
* Trajectory Series Analysis based Event Rule Induction for Visual Surveillance
* Transfer Learning in Sign language
* Tree-based Classifiers for Bilayer Video Segmentation
* Two-View Motion Segmentation from Linear Programming Relaxation
* Unified Probabilistic Framework for Facial Activity Modeling and Understanding, A
* Unsupervised Activity Perception by Hierarchical Bayesian Models
* Unsupervised Clustering using Multi-Resolution Perceptual Grouping
* Unsupervised Learning of Image Transformations
* Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition
* Unsupervised Segmentation of Objects using Efficient Learning
* Using Galois Theory to Prove Structure from Motion Algorithms are Optimal
* Using Geometry Invariants for Camera Response Function Estimation
* Using Segmentation to Verify Object Hypotheses
* Using Stereo Matching for 2-D Face Recognition Across Pose
* Utilizing Variational Optimization to Learn Markov Random Fields
* Variable Bandwidth Image Denoising Using Image-based Noise Models
* Variational Approach to the Evolution of Radial Basis Functions for Image Segmentation, A
* Variational Bayes Based Approach to Robust Subspace Learning
* Variational Bayesian Approach for Classification with Corrupted Inputs, A
* Variational Distance-Dependent Image Restoration
* Viewpoint-Coded Structured Light
* Virtual Training for Multi-View Object Class Recognition
* Visual Curvature
* Visual Event Recognition in News Video using Kernel Methods with Multi-Level Temporal Alignment
* Visual Odometry System Using Multiple Stereo Cameras and Inertial Measurement Unit
* Weighted Substructure Mining for Image Analysis
* What makes a good model of natural images?
* Wide-Area Egomotion Estimation from Known 3D Structure
352 for CVPR07

CVPR08 * *CVPR
* 3-D Tracking of shoes for Virtual Mirror applications
* 3D face tracking and expression inference from a 2D sequence using manifold learning
* 3D model matching with Viewpoint-Invariant Patches (VIP)
* 3D occlusion recovery using few cameras
* 3D pose refinement from reflections
* 3D shape reconstruction of Mooney faces
* 3D surface models by geometric constraints propagation
* 3D ultrasound tracking of the left ventricle using one-step forward prediction and data fusion of collaborative trackers
* 3D-2D spatiotemporal registration for sports motion analysis
* Accurate and robust registration for in-hand modeling
* Accurate Camera Calibration from Multi-View Stereo and Bundle Adjustment
* Accurate eye center location and tracking using isophote curvature
* Accurate multi-view reconstruction using robust binocular stereo and surface meshing
* Accurate polyp segmentation for 3D CT colonography using multi-staged probabilistic binary learning and compositional model
* Action MACH: a spatio-temporal Maximum Average Correlation Height filter for action recognition
* Action recognition by learning mid-level motion features
* Action recognition using ballistic dynamics
* Action recognition using exemplar-based embedding
* Action recognition with motion-appearance vocabulary forest
* Action snippets: How many frames does human action recognition require?
* Active microscopic cellular image annotation by superposable graph transduction with imbalanced labels
* Adaptive and compact shape descriptor by progressive feature combination and selection with boosting
* Adaptive and constrained algorithms for inverse compositional Active Appearance Model fitting
* adaptive learning method for target tracking across multiple cameras, An
* Adaptive parametrization of multivariate B-splines for image registration
* Adaptive region intensity based rigid ultrasound and CT image registration
* Annotating collections of photos using hierarchical event and scene models
* ANSIG: An analytic signature for permutation-invariant two-dimensional shape representation
* Application and evaluation of spatiotemporal enhancement of live aerial video using temporally local mosaics
* Approximate earth mover's distance in linear time
* Articulated shape matching using Laplacian eigenfunctions and unsupervised point registration
* Auto-context and its application to high-level vision tasks
* Automatic calibration of a single-projector catadioptric display system
* Automatic face naming with caption-based supervision
* Automatic non-rigid registration of 3D dynamic data for facial expression synthesis and transfer
* Automatic registration of aerial imagery with untextured 3D LiDAR models
* Automatic symmetry plane estimation of bilateral objects in point clouds
* Background subtraction in highly dynamic scenes
* Bayesian Approach for Image Segmentation with Shape Priors, A
* Bayesian color constancy revisited
* Bayesian tactile face
* Beyond sliding windows: Object localization by efficient subwindow search
* Beyond the Lambertian assumption: A generative model for Apparent BRDF fields of faces using anti-symmetric tensor splines
* bi-illuminant dichromatic reflection model for understanding images, A
* Blindly separating mixtures of multiple layers with spatial shifts
* Boosted deformable model for human body alignment
* Boosting adaptive linear weak classifiers for online learning and tracking
* Boosting ordinal features for accurate and fast iris recognition
* Boundary snapping for robust image cutouts
* (BP)2: Beyond pairwise Belief Propagation labeling by approximating Kikuchi free energies
* Branch-and-bound hypothesis selection for two-view multiple structure and motion segmentation
* Building reconstruction from a single DEM
* Building segmentation for densely built urban regions using aerial LIDAR data
* Calibration and rectification for reflection stereo
* Calibration of an Articulated Camera System
* Cell motion analysis without explicit tracking
* Characterizing the shadow space of camera-light pairs
* Classifiability-based Optimal Discriminatory Projection Pursuit
* Classification and evaluation of cost aggregation methods for stereo correspondence
* Classification using intersection kernel support vector machines is efficient
* Classification via semi-Riemannian spaces
* Closing the loop in scene interpretation
* Clothing cosegmentation for recognizing people
* Clustering and dimensionality reduction on Riemannian manifolds
* Coarse-to-fine low-rank structure-from-motion
* Coherent image annotation by learning semantic distance
* Coherent Laplacian 3-D protrusion segmentation
* Color constancy beyond bags of pixels
* Combining appearance models and Markov Random Fields for category level object segmentation
* Combining brain computer interfaces with vision for object categorization
* Computing minimal deformations: application to construction of statistical shape models
* Conditional density learning via regression with application to deformable shape segmentation
* conditional random field for automatic photo editing, A
* Conjugate rotation: Parameterization and estimation from an affine feature correspondence
* Connected Segmentation Tree: A joint representation of region layout and hierarchy
* Consistent image analogies using semi-supervised learning
* Constant time O(1) bilateral filtering
* Constrained image segmentation from hierarchical boundaries
* Constrained spectral clustering through affinity propagation
* Context and observation driven latent variable model for human pose estimation
* Context-aware clustering
* Context-dependent kernel design for object matching and recognition
* Correlational spectral clustering
* Correspondence-free multi-camera activity analysis and scene modeling
* Correspondences between parts of shapes with particle filters
* Cost-sensitive face recognition
* Decomposition, discovery and detection of visual categories using topic models
* deformable local image descriptor, A
* Demosaicing by smoothing along 1D features
* Demosaicking recognition with applications in digital photo authentication based on a quadratic pixel correlation model
* Dense 3D motion capture from synchronized video streams
* Dense 3D reconstruction from specularity consistency
* Dense correspondence finding for parametrization-free animation reconstruction from video
* Dense specular shape from multiple specular flows
* Detecting and matching repeated patterns for automatic geo-tagging in urban environments
* Detection and matching of rectilinear structures
* Detection with multi-exit asymmetric boosting
* Dimensionality reduction by unsupervised regression
* Dimensionality reduction using covariance operator inverse regression
* Directional independent component analysis with tensor representation
* Directions of egomotion from antipodal points
* Discovering class specific composite features through discriminative sampling with Swendsen-Wang Cut
* Discriminative human action segmentation and recognition using semi-Markov model
* Discriminative learned dictionaries for local image analysis
* Discriminative learning of visual words for 3D human pose estimation
* Discriminative local binary patterns for human detection in personal album
* Discriminative modeling by Boosting on Multilevel Aggregates
* discriminatively trained, multiscale, deformable part model, A
* Distributed data association and filtering for multiple target tracking
* Drift-free tracking of rigid and articulated objects
* Dynamic scene shape reconstruction using a single structured light pattern
* Dynamic visual category learning
* Edge descriptors for robust wide-baseline correspondence
* Edge preserving spatially varying mixtures for image segmentation
* efficient algorithm for compressed MR imaging using total variation and wavelets, An
* Efficient mean shift belief propagation for vision tracking
* Efficient object shape recovery via slicing planes
* Efficient photometric stereo on glossy surfaces with wide specular lobes
* Efficient Sequential Correspondence Selection by Cosegmentation
* Efficient subdivision-based image and volume warping
* Enforcing convexity for improved alignment with constrained local models
* Enforcing non-positive weights for stable support vector tracking
* Enforcing stochastic inverse consistency in non-rigid image registration and matching
* Enhanced biologically inspired model
* Enhancing photographs with Near Infra-Red images
* Epitomic Location Recognition
* Estimating age, gender, and identity using first name priors
* Estimating camera response functions using probabilistic intensity similarity
* Evaluation of color descriptors for object and scene recognition
* Evaluation of constructable match cost measures for stereo correspondence using cluster ranking
* Exact inference in multi-label CRFs with higher order cliques
* experimental study of employing visual appearance as a phenotype, An
* Exploiting side information in locality preserving projection
* Extracting a fluid dynamic texture and the background from video
* Extracting smooth and transparent layers from a single image
* Extrinsic and depth calibration of ToF-cameras
* Face alignment via boosted ranking model
* Face illumination normalization on large and small scale features
* Face shape recovery from a single image using CCA mapping between tensor spaces
* Face tracking and recognition with visual constraints in real-world videos
* Facial expression recognition using encoded dynamic features
* factorization approach to structure from motion with shape priors, A
* Fast algorithms for L-inf problems in multiview geometry
* Fast algorithms for large scale conditional 3D prediction
* Fast and robust numerical solutions to minimal problems for cameras with radial distortion
* Fast approximate Random Walker segmentation using eigenvector precomputation
* Fast image search for learned metrics
* Fast kernel learning for spatial pyramid matching
* fast local descriptor for dense matching, A
* Fast texture segmentation model based on the shape operator and active contour
* Fast track matching and event detection
* Filtering Internet image search results towards keyword based category recognition
* Finding people in archive films through tracking
* Finding trails
* Flat Refractive Geometry
* framework for reducing ink-bleed in old documents, A
* From appearance to context-based recognition: Dense labeling in small images
* From skeletons to bone graphs: Medial abstraction for object recognition
* Fully automatic feature localization for medical images using a global vector concentration approach
* Fusion of time-of-flight depth and stereo for high accuracy depth maps
* FusionFlow: Discrete-continuous optimization for optical flow estimation
* Fuzzy chamfer distance and its probabilistic formulation for visual tracking
* FuzzyMatte: A computationally efficient scheme for interactive matting
* General constraints for batch Multiple-Target Tracking applied to large-scale videomicroscopy
* general solution to the P4P problem for camera with unknown focal length, A
* Generalised blurring mean-shift algorithms for nonparametric clustering
* Geo-located image analysis using latent representations
* Geo-spatial aerial video processing for scene understanding and object tracking
* Global data association for multi-object tracking using network flows
* Global image registration based on learning the prior appearance model
* Global pose estimation using non-tree models
* Global Stereo Reconstruction under Second-Order Smoothness Priors
* Globally optimal bilinear programming for computer vision applications
* Globally optimal shape-based tracking in real-time
* Globally optimal surface segmentation using regional properties of segmented objects
* Granularity and elasticity adaptation in visual tracking
* Graph commute times for image representation
* Graph cut based image segmentation with connectivity priors
* Graph cut with ordering constraints on labels and its applications
* Graph-shifts: Natural image labeling by dynamic hierarchical computing
* Hallucinating 3D facial shapes
* hierarchical and contextual model for aerial image understanding, A
* Hierarchical, learning-based automatic liver segmentation
* High quality mesostructure acquisition using specularities
* High resolution matting via interactive trimap segmentation
* High resolution motion layer decomposition using dual-space graph cuts
* High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF)
* Histogram-based search: A comparative study
* Homography based multiple camera detection and tracking of people in a dense crowd
* Human action recognition using Local Spatio-Temporal Discriminant Embedding
* Human-assisted motion annotation
* Hybrid body representation for integrated pose recognition, localization and segmentation
* hybrid camera for motion deblurring and depth map super-resolution, A
* Illumination and camera invariant stereo matching
* IM2GPS: estimating geographic information from a single image
* Image based rendering for motion compensation in angiographic roadmapping
* Image de-fencing
* Image decomposition into structure and texture subcomponents with multifrequency modulation constraints
* Image partial blur detection and classification
* Image segmentation via convolution of a level-set function with a Rigaut Kernel
* Image segmentation with a parametric deformable model using shape and appearance priors
* Image selection for improved Multi-View Stereo
* Image super-resolution as sparse representation of raw image patches
* Image super-resolution using gradient profile prior
* Image/video deblurring using a hybrid camera
* importance sampling approach to learning structural representations of shape, An
* Improved building detection by Gaussian processes classification via feature space rescale and spectral kernel selection
* Improving local learning for object categorization by exploring the effects of ranking
* In defense of Nearest-Neighbor based image classification
* Increasing the density of Active Appearance Models
* Incremental learning of nonparametric Bayesian mixture models
* Information-theoretic active scene exploration
* integrated background model for video surveillance based on primal sketch and 3D scene geometry, An
* Integrated feature selection and higher-order spatial feature extraction for object categorization
* Intensity statistics-based HSI diffusion for color photo denoising
* Interactive image matting for multiple layers
* Interactive image segmentation via minimization of quadratic energies on directed graphs
* Intrinsic image decomposition with non-local texture cues
* Inverse-polar ray projection for recovering projective transformations
* joint appearance-spatial distance for kernel-based image categorization, A
* Joint Conditional Random Field of multiple views with online learning for image-based rendering
* Joint data alignment up to (lossy) transformations
* Joint learning and dictionary construction for pattern recognition
* Joint multi-label multi-instance learning for image classification
* Joint tracking of features and edges
* Kernel integral images: A framework for fast non-uniform filtering
* Kernel-based learning of cast shadows from a physical model of light sources and surfaces for low-level segmentation
* Keywords to visual categories: Multiple-instance learning for weakly supervised object categorization
* Kneed Walker for human pose tracking, The
* L1 regularized projection pursuit for additive model learning
* Large margin pursuit for a Conic Section classifier
* Large-scale manifold learning
* Latent topic random fields: Learning using a taxonomy of labels
* Layered Graphical Models for Tracking Partially Occluded Objects
* Learning 4D action feature models for arbitrary view action recognition
* Learning a geometry integrated image appearance manifold from a small training set
* Learning and using taxonomies for fast visual categorization
* Learning based coarse-to-fine image registration
* Learning Bayesian Networks with qualitative constraints
* Learning class-specific affinities for image labelling
* Learning coupled conditional random field for image decomposition with application on object categorization
* Learning for stereo vision using the structured support vector machine
* Learning human actions via information maximization
* Learning human motion models from unsegmented videos
* Learning object motion patterns for anomaly detection and improved object detection
* Learning on Lie groups for invariant detection and tracking
* Learning patch correspondences for improved viewpoint invariant face recognition
* Learning realistic human actions from movies
* Learning stick-figure models using nonparametric Bayesian priors over trees
* Learning subcategory relevances for category recognition
* Learning the viewpoint manifold for action recognition
* Learning-based face hallucination in DCT domain
* learning-based hybrid tagging and browsing approach for efficient manual image annotation, A
* Least squares congealing for unsupervised alignment of images
* Least squares surface reconstruction from measured gradient fields
* LED-only BRDF measurement device, An
* Light-invariant fitting of active appearance models
* linear approach to motion estimation using generalized camera models, A
* Linear motion estimation for systems of articulated planes
* Local deformation models for monocular 3D shape recovery
* Local grouping for optical flow
* Local minima free Parameterized Appearance Models
* Local tensor descriptor from micro-deformation analysis
* Localization accuracy of region detectors
* Locally adaptive learning for translation-variant MRF image priors
* Locally Assembled Binary (LAB) feature with feature-centric cascade for fast and accurate face detection
* Logistic Random Field: A convenient graphical model for learning parameters for MRF-based labeling, The
* Looking around the backyard helps to recognize faces and digits
* Loopy Belief Propagation approach for robust background estimation, A
* Loose shape model for discriminative learning of object categories
* Lost in quantization: Improving particular object retrieval in large scale image databases
* Macro-cuboïd based probabilistic matching for lip-reading digits
* Manifold learning using robust Graph Laplacian for interactive image search
* Manifold-Manifold Distance with application to face recognition based on image set
* Margin-based discriminant dimensionality reduction for visual recognition
* Markerless motion capture of man-machine interaction
* Matching images under unstable segmentations
* Matching non-rigidly deformable shapes across images: A globally optimal solution
* Matching vehicles under large pose transformations using approximate 3D models and piecewise MRF model
* Max Margin AND/OR Graph learning for parsing the human body
* Measuring camera translation by the dominant apical angle
* Meshless deformable models for LV motion analysis
* Minimal local reconstruction error measure based discriminant feature extraction and classification
* Minimal solutions for generic imaging models
* Mining compositional features for boosting
* Misalignment-robust face recognition
* mixed generative-discriminative framework for pedestrian classification, A
* mobile vision system for robust multi-person tracking, A
* Model-based hand tracking with texture, shading and self-occlusions
* Modeling and generating complex motion blur for real-time tracking
* Modeling complex luminance variations for target tracking
* Modeling the structure of multivariate manifolds: Shape maps
* Modulated phase-shifting for 3D scanning
* Motion blur identification from image gradients
* Motion estimation for multi-camera systems using global optimization
* Motion estimation method based on physical properties of waves
* Motion from blur
* Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories
* Moving shape dynamics: A signal processing perspective
* multi-compartment segmentation framework with homeomorphic level sets, A
* Multi-label image segmentation via point-wise repetition
* Multi-object shape estimation and tracking from silhouette cues
* Multiple-instance ranking: Learning to rank images for image retrieval
* Multiplicative kernels: Object detection, segmentation and pose estimation
* Near duplicate image identification with Spatially Aligned Pyramid Matching
* Non-negative graph embedding
* Non-refractive modulators for encoding and capturing scene appearance and depth
* Nonlinear image representation using divisive normalization
* Normalized tree partitioning for image segmentation
* NURBS-based spectral reflectance descriptor with applications in computer vision and pattern recognition, A
* Object categorization using co-occurrence, location and appearance
* Object image retrieval by exploiting online knowledge resources
* Object recognition and segmentation by non-rigid quasi-dense matching
* Object tracking and detection after occlusion via numerical hybrid local and global mode-seeking
* Observe-and-explain: A new approach for multiple hypotheses tracking of humans and objects
* Off-axis aperture camera: 3D shape reconstruction and image restoration
* On benchmarking camera calibration and multi-view stereo for high resolution imagery
* On controlling light transport in poor visibility environments
* On errors-in-variables regression with arbitrary covariance and its application to optical flow estimation
* On handling uncertainty in the fundamental matrix for scene and motion adaptive pose recovery
* On the use of independent tasks for face recognition
* One step beyond histograms: Image representation using Markov stationary features
* Online learning of patch perspective rectification for efficient object detection
* Optical flow estimation using Fourier Mellin Transform
* Optical flow estimation with uncertainties through dynamic MRFs
* Optimised KD-trees for fast image descriptor matching
* Optimizing discrimination-efficiency tradeoff in integrating heterogeneous local features for object detection
* Order consistent change detection via fast statistical significance testing
* Overcoming visual reverberations
* Pair-activity classification by bi-trajectories analysis
* Pan, zoom, scan: Time-coherent, trained automatic video cropping
* Parallel Decomposition Solver for SVM: Distributed dual ascend using Fenchel Duality, A
* Parameterized Kernel Principal Component Analysis: Theory and applications to supervised and unsupervised image alignment
* Particle filtering for registration of 2D and 3D point sets with stochastic dynamics
* Partitioning of image datasets using discriminative context information
* patch transform and its applications to image editing, The
* Pattern discovery in motion time series via structure-based spectral clustering
* People-tracking-by-detection and people-detection-by-tracking
* Performance evaluation of state-of-the-art discrete symmetry detection algorithms
* Photogeometric structured light: A self-calibrating and multi-viewpoint framework for accurate 3D modeling
* Photometric stereo with coherent outlier handling and confidence estimation
* Photometric stereo with non-parametric and spatially-varying reflectance
* Physical simulation for probabilistic motion tracking
* polynomial-time bound for matching and registration with outliers, A
* Pose primitive based human action recognition in videos or still images
* Practical camera auto-calibration based on object appearance and motion for traffic scene visual surveillance
* Precise detailed detection of faces and facial features
* Principled fusion of high-level model and low-level cues for motion segmentation
* Privacy preserving crowd monitoring: Counting people without people models or tracking
* Private Content Based Image Retrieval
* Probabilistic graph and hypergraph matching
* Probabilistic image registration and anomaly detection by nonlinear warping
* Probabilistic multi-tensor estimation using the Tensor Distribution Function
* probabilistic segmentation method for the identification of luminal borders in intravascular ultrasound images, A
* Progressive search space reduction for human pose estimation
* PSF estimation using sharp edge prediction
* Quasi-perspective projection with applications to 3D factorization from uncalibrated image sequences
* quasi-random sampling approach to image retrieval, A
* Radiometric calibration using temporal irradiance mixtures
* Radiometric calibration with illumination change for outdoor scene analysis
* Randomized trees for human pose detection
* rank constrained continuous formulation of multi-frame multi-target tracking problem, A
* Rank-based distance metric learning: An application to image retrieval
* Re-thinking non-rigid structure from motion
* Re-weighting Linear Discrimination Analysis under ranking loss
* Real time object tracking based on dynamic feature grouping with background subtraction
* Real-time 3D segmentation of the left ventricle using deformable subdivision surfaces
* Real-Time Face Pose Estimation from Single Range Images
* Real-time global localization with a pre-built visual landmark database
* Real-time pose estimation of articulated objects using low-level motion
* Recognising faces in unseen modes: A tensor based approach
* Recognition by association via learning per-exemplar distances
* Recognizing human actions using multiple features
* Recognizing primitive interactions by exploring actor-object states
* Reconstructing non-stationary articulated objects in monocular video using silhouette information
* Recovering consistent video depth maps via bundle optimization
* Recovering shape characteristics on near-flat specular surfaces
* Recovery of relative depth from a single observation using an uncalibrated (real-aperture) camera
* recursive filter for linear systems on Riemannian manifolds, A
* Recursive photometric stereo when multiple shadows and highlights are present
* Reduce, reuse & recycle: Efficiently solving multi-label MRFs
* region based stereo matching algorithm using cooperative optimization, A
* Regression from patch-kernel
* Regularizing 3D medial axis using medial scaffold transforms
* Relaxed matching kernels for robust image comparison
* Retinal image registration from 2D to 3D
* Robust 3D face recognition in uncontrolled environments
* robust descriptor based on Weber's Law, A
* Robust dual motion deblurring
* Robust estimation of Gaussian mixtures from noisy input data
* Robust fusion of dynamic shape and normal capture for high-quality reconstruction of time-varying geometry
* Robust Higher Order Potentials for Enforcing Label Consistency
* robust identification approach to gait recognition, A
* Robust learning of discriminative projection for multicategory classification on the Stiefel manifold
* Robust motion estimation and structure recovery from endoscopic image sequences with an Adaptive Scale Kernel Consensus estimator
* Robust null space representation and sampling for view-invariant motion trajectory analysis
* Robust statistics on Riemannian manifolds via the geometric median
* Robust tensor factorization using R1 norm
* Robust unambiguous parametrization of the essential manifold
* Rotation symmetry group detection via frequency analysis of frieze-expansions
* Scalable graph-cut algorithm for N-D grids, A
* Scale invariance without scale selection
* scale of a texture and its application to segmentation, The
* Scene classification with low-dimensional semantic spaces and weak supervision
* Scene understanding with discriminative structured prediction
* Segmentation by transduction
* Segmentation of left ventricle from 3D cardiac MR image sequences using a subject-specific dynamical model
* Segmentation of multiple, partially occluded objects by grouping, merging, assigning part detection responses
* Selective hidden random fields: Exploiting domain-specific saliency for event classification
* Semantic texton forests for image categorization and segmentation
* Semantic-based indexing of fetal anatomies from 3-D ultrasound data using global/semi-local context and sequential sampling
* Semi-supervised boosting using visual similarity learning
* Semi-Supervised Discriminant Analysis using robust path-based similarity
* Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval
* Semi-supervised learning of multi-factor models for face de-identification
* Semi-supervised SVM batch mode active learning for image retrieval
* Sensing increased image resolution using aperture masks
* Sensor planning for automated and persistent object tracking with multiple cameras
* Sequential particle swarm optimization for visual tracking
* Sequential sparsification for change detection
* Shading models for illumination and reflectance invariant shape detectors
* Shape L'Âne rouge: Sliding wavelets for indexing and retrieval
* Shape prior segmentation of multiple objects with graph cuts
* Shape priors in variational image segmentation: Convexity, Lipschitz continuity and globally optimal solutions
* Silhouette-based camera calibration from sparse views under circular motion
* similarity measure between unordered vector sets with application to image categorization, A
* Similarity-based cross-layered hierarchical representation for object categorization
* Simple calibration of non-overlapping cameras with a mirror
* Simultaneous clustering and tracking unknown number of objects
* Simultaneous data volume reconstruction and pose estimation from slice samples
* Simultaneous image transformation and sparse representation recovery
* Simultaneous learning of a discriminative projection and prototypes for Nearest-Neighbor classification
* Simultaneous super-resolution and 3D video using graph-cuts
* Simultaneous super-resolution and feature extraction for recognition of low-resolution faces
* Single-image vignetting correction using radial gradient symmetry
* Skeletal graphs for efficient structure from motion
* Sketching in the air: A vision-based system for 3D object design
* Small codes and large image databases for recognition
* Smoothing-based Optimization
* SMRFI: Shape matching via registration of vector-valued feature images
* Sparse probabilistic regression for activity-independent human pose inference
* Sparsity, redundancy and optimal image support towards knowledge-based segmentation
* Spatio-temporal Saliency detection using phase spectrum of quaternion Fourier transform
* Spectral methods for semi-supervised manifold learning
* Spectrally optimal factorization of incomplete matrices
* Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision
* statistical deformation prior for non-rigid image and shape registration, A
* statistical modelling of fingerprint minutiae distribution with implications for fingerprint individuality studies, The
* Stereo reconstruction with mixed pixels using adaptive over-segmentation
* Stereoscopic inpainting: Joint color and depth completion from stereo images
* Structure learning in random fields for heart motion abnormality detection
* Structure-perceptron learning of a hierarchical log-linear model
* Subspace segmentation with outliers: A Grassmannian approach to the maximum consensus subspace
* Summarizing visual data using bidirectional similarity
* Super-resolution from image sequence under influence of hot-air optical turbulence
* Superpixel lattices
* Symmetric multi-view stereo reconstruction from planar camera arrays
* Taylor expansion based classifier adaptation: Application to person detection
* Tensor reduction error analysis: Applications to video compression and classification
* Texture classification with a dictionary of basic image features
* theoretical analysis of linear and multi-linear models of image appearance, A
* theory of defocus via Fourier analysis, A
* three-point minimal solution for panoramic stitching with lens distortion, A
* Toward automatic 3D modeling of scenes using a generic camera model
* Towards unsupervised whole-object segmentation: Combining automated matting with boundary detection
* Tracking distributions with an overlap prior
* Tracking rotating fluids in realtime using snapshots
* Trajectory analysis and semantic region modeling using a nonparametric Bayesian model
* Transductive object cutout
* Transfer learning for image classification with sparse prototype representations
* Two-Dimensional Active Learning for image classification
* two-frame theory of motion, lighting and shape, A
* unified framework for generalized Linear Discriminant Analysis, A
* Unified Principal Component Analysis with generalized Covariance Matrix for face recognition
* Unifying discriminative visual codebook generation with classifier training for object category recognition
* Unsupervised discovery of visual object class hierarchies
* Unsupervised estimation of segmentation quality using nonnegative factorization
* Unsupervised feature selection via distributed coding for multi-view object recognition
* Unsupervised learning of finite mixtures using entropy regularization and its application to image segmentation
* Unsupervised learning of human perspective context using ME-DT for efficient human detection in surveillance
* Unsupervised learning of probabilistic object models (POMs) for object classification, segmentation and recognition
* Unsupervised learning of visual taxonomies
* Unsupervised modeling of object categories using link analysis techniques
* Using circular statistics for trajectory shape analysis
* Using contours to detect and localize junctions in natural images
* Utilizing semantic word similarity measures for video retrieval
* Variable baseline/resolution stereo
* Verifying global minima for L2 minimization problems
* Video falsifying by motion interpolation and inpainting
* Video segmentation: Propagation, validation and aggregation of a preceding graph
* View and scale invariant action recognition using multiview shape-flow models
* View-invariant action recognition using fundamental ratios
* View-invariant recognition of body pose from space-time templates
* Viewpoint-independent object class detection using 3D Feature Maps
* Visibility in bad weather from a single image
* Visual quasi-periodicity
* Visual Synset: Towards a higher-level visual representation
* Visual tracking via incremental Log-Euclidean Riemannian subspace learning
* Visual tracking with histograms and articulating blocks
* Vital sign estimation from passive thermal video
* Volumetric reconstruction from multi-energy single-view radiography
* What are the high-level concepts with small semantic gaps?
* What can missing correspondences tell us about 3D structure and motion?
* What do color changes reveal about an outdoor scene?
* Where am I: Place instance and category recognition using spatial PACT
* Who killed the directed model?
507 for CVPR08

CVPR09 * *CVPR
* 3D morphable face models revisited
* 3D pose estimation and segmentation using specular cues
* 3D reconstruction of curved objects from single 2D line drawings
* 3D reconstruction pipeline for digital preservation, A
* Abnormal crowd behavior detection using social force model
* Abnormal events detection based on spatio-temporal co-occurrences
* Actions in context
* Active learning for large multi-class problems
* Active stereo tracking of multiple free-moving targets
* Active volume models for 3D medical image segmentation
* Adaptive Contour Features in oriented granular space for human detection and segmentation
* Adaptive image and video retargeting technique based on Fourier analysis
* Alphabet SOUP: A framework for approximate energy minimization
* Angular embedding: From jarring intensity differences to perceived luminance
* Anomaly detection in extremely crowded scenes using spatio-temporal motion pattern models
* Appearance-based keypoint clustering
* Automated extraction of signs from continuous sign language sentences using Iterated Conditional Modes
* Automated feature extraction for early detection of diabetic retinopathy in fundus images
* Automatic facial landmark labeling with minimal supervision
* Automatic fetal face detection from ultrasound volumes via learning 3D and 2D information
* Automatic reconstruction of cities from remote sensor data
* Automatic registration of LIDAR and optical images of urban scenes
* Average of Synthetic Exact Filters
* Beyond pairwise energies: Efficient optimization for higher-order MRFs
* Beyond the graphs: Semi-parametric semi-supervised discriminant analysis
* Bias reduction for stereo based motion estimation with applications to large scale visual odometry
* Blind motion deblurring from a single image using sparse approximation
* Blind separation of superimposed images with unknown motions
* Boosted multi-task learning for face verification with applications to web image and video search
* Building a database of 3D scenes from user annotations
* Building text features for object image classification
* Bundling features for large scale partial-duplicate web image search
* Cancelable iris biometrics and using Error Correcting Codes to reduce variability in biometric data
* Capturing 3D stretchable surfaces from single images in closed form
* Capturing multiple illumination conditions using time and color multiplexing
* Catadioptric projectors
* CHoG: Compressed histogram of gradients, a low bit-rate feature descriptor
* Class-specific Hough forests for object detection
* Classification of tensors and fiber tracts using Mercer-kernels encoding soft probabilistic spatial and diffusion information
* Classifier grids for robust adaptive object detection
* Co-training with noisy perceptual observations
* Coded exposure deblurring: Optimized codes for PSF estimation and invertibility
* collaborative benchmark for region of interest detection algorithms, A
* Color estimation from a single surface color
* Combining powerful local and global statistics for texture description
* Compensation of motion artifacts in MRI via graph-based optimization
* compressive sensing approach for expression-invariant face recognition, A
* Constrained clustering via spectral regularization
* Constrained marginal space learning for efficient 3D anatomical structure detection in medical images
* Contextual classification with functional Max-Margin Markov Networks
* Contextual decomposition of multi-label images
* Contextual flow
* Contextual restoration of severely degraded document images
* Contextualizing histogram
* Continuous depth estimation for multi-view stereo
* Continuous maximal flows and Wulff shapes: Application to MRFs
* Continuous ratio optimization via convex relaxation with applications to multiview 3D reconstruction
* convex relaxation approach for computing minimal partitions, A
* Convexity and Bayesian constrained local models
* Cooperative mapping of multiple PTZ cameras in automated surveillance systems
* Coupled Spectral Regression for matching heterogeneous faces
* Curvature and singularity driven diffusion for oriented pattern enhancement with singular points
* Curved Glide-Reflection Symmetry Detection
* D-Clutter: Building object model library from unsupervised segmentation of cluttered scenes
* (De)focusing on global light transport for active scene recovery
* Dense 3D motion capture for human faces
* Dense saliency-based spatiotemporal feature points for action recognition
* Depth from sliding projections
* Describing objects by their attributes
* Dictionary-free categorization of very similar objects via stacked evidence trees
* Digital face makeup by example
* Dimension-free affine shape matching through subspace invariance
* Directed assistance for ink-bleed reduction in old documents
* Disambiguating the recognition of 3D objects
* Discrete tracking of parametrized curves
* Discriminative structure learning of hierarchical representations for object detection
* Discriminative subvolume search for efficient action detection
* Discriminatively trained particle filters for complex multi-object tracking
* Distance transform templates for object detection and pose estimation
* Distributed multi-target tracking in a self-configuring camera network
* Distributed volumetric scene geometry reconstruction with a network of distributed smart cameras
* distribution-based approach to tracking points in velocity vector fields, A
* Domain Transfer SVM for video concept detection
* Dual distributions of multilinear geometric entities
* Early spatiotemporal grouping with a distributed oriented energy representation
* Echocardiogram view classification using edge filtered scale-invariant motion features
* Efficient algorithms for subwindow search in object detection and localization
* Efficient image alignment using linear appearance models
* Efficient Kernels for identifying unbounded-order spatial features
* Efficient multi-label classification with hypergraph regularization
* Efficient planar graph cuts with applications in Computer Vision
* Efficient reduction of L-infinity geometry problems
* Efficient representation of local geometry for large scale object retrieval
* Efficient scale space auto-context for image segmentation and labeling
* efficient stochastic approach to groupwise non-rigid image registration, An
* Efficiently training a better visual detector with sparse eigenvectors
* empirical Bayes approach to contextual region classification, An
* empirical study of context in object detection, An
* Enforcing integrability by error correction using L1-minimization
* Enhanced Pictorial Structures for precise eye localization under uncontrolled conditions
* Ensemble manifold regularization
* Epitomized priors for multi-labeling problems
* Error propagations for local bundle adjustment
* Expression-insensitive 3D face recognition using sparse representation
* Extraction of tubular structures over an orientation domain
* Face verification and identification using Facial Trait Code
* Facial deblur inference to improve recognition of blurred Faces
* Factorization for non-rigid and articulated structure using metric projections
* family of contextual measures of similarity between distributions with application to image retrieval, A
* Fast car detection using image strip features
* Fast concurrent object localization and recognition
* Fast human detection in crowded scenes by contour integration and local shape estimation
* Fast Mean Shift by compact density representation
* Fast multiple shape correspondence by pre-organizing shape instances
* Fast normalized cut with linear constraints
* Flow mosaicking: Real-time pedestrian counting without scene-specific learning
* Fourier analysis and Gabor filtering for texture analysis and local reconstruction of general shapes
* Frequency-tuned salient region detection
* From contours to regions: An empirical evaluation
* From structure-from-motion point clouds to fast location recognition
* Fuzzy-Cuts: A knowledge-driven graph-based method for medical image segmentation
* Geometric and probabilistic image dissimilarity measures for common field of view detection
* Geometric min-Hashing: Finding a (thick) needle in a haystack
* Geometric reasoning for single image structure recovery
* geometry of 2D image signals, The
* Global active contour-based image segmentation via probability alignment
* Global connectivity potentials for random field models
* Global optimization for alignment of generalized shapes
* Granularity-tunable gradients partition (GGP) descriptors for human detection
* graph-based approach to skin mole matching incorporating template-normalized coordinates, A
* Half-integrality based algorithms for cosegmentation of images
* Hardware-efficient belief propagation
* Harris corners in the real world: A principled selection criterion for interest points based on ecological statistics
* Hierarchical spatio-temporal context modeling for action recognition
* High dynamic range image reconstruction from hand-held cameras
* High-quality curvelet-based motion deblurring from an image pair
* Higher-order clique reduction in binary graph cut
* Histogram-based interest point detectors
* Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions
* Holistic context modeling using semantic co-occurrences
* HOP: Hierarchical object parsing
* How far can you get with a modern face recognition test set using only simple features?
* Human age estimation using bio-inspired features
* Human motion synthesis from 3D video
* Illumination and spatially varying specular reflectance from a single view
* Image categorization by learning with context and consistency
* Image categorization with spatial mismatch kernels
* Image deblurring and denoising using color priors
* Image deblurring for less intrusive iris capture
* Image hallucination with feature enhancement
* Image registration by minimization of residual complexity
* ImageNet: A large-scale hierarchical image database
* Imbalanced RankBoost for efficiently ranking large-scale image/video collections
* Implicit elastic matching with random projections for pose-variant face recognition
* implicit Markov random field model for the multi-scale oriented representations of natural images, An
* In defense of orthonormality constraints for nonrigid structure from motion
* Increased discrimination in level set methods with embedded conditional random fields
* instance selection approach to Multiple instance Learning, An
* Interval HSV: Extracting ink annotations
* Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds
* Isometric registration of ambiguous and partial data
* Joint and implicit registration for face recognition
* Joint depth and alpha matte optimization via fusion of stereo and time-of-flight sensor
* Keypoint induced distance profiles for visual recognition
* Label diagnosis through self tuning for web image search
* Large displacement optical flow
* Layered graph matching by composite cluster sampling with collaborative and competitive interactions
* Learning a distance metric from multi-instance multi-label data
* Learning based automatic face annotation for arbitrary poses and expressions from frontal images only
* Learning color and locality cues for moving object detection and segmentation
* Learning from ambiguously labeled images
* Learning general optical flow subspaces for egomotion estimation and detection of motion anomalies
* Learning IMED via shift-invariant transformation
* Learning invariant features through topographic filter maps
* Learning mappings for face synthesis from near infrared to visual light images
* Learning mixed templates for object recognition
* Learning multi-modal densities on Discriminative Temporal Interaction Manifold for group activity recognition
* Learning optimized MAP estimates in continuously-valued MRF models
* Learning partially-observed hidden conditional random fields for facial expression recognition
* Learning photometric invariance from diversified color model ensembles
* Learning query-dependent prefilters for scalable image retrieval
* Learning real-time MRF inference for image denoising
* Learning rotational features for filament detection
* Learning semantic scene models by object classification and trajectory clustering
* Learning semantic visual vocabularies using diffusion distance
* Learning shape prior models for object matching
* Learning sign language by watching TV (using weakly aligned subtitles)
* Learning signs from subtitles: A weakly supervised approach to sign language recognition
* Learning similarity measure for multi-modal 3D image registration
* Learning to associate: HybridBoosted multi-target tracker for crowded scene
* Learning to detect unseen object classes by between-class attribute transfer
* Learning to track with multiple observers
* Learning trajectory patterns by clustering: Experimental studies and comparative evaluation
* Learning visual flows: A Lie algebraic approach
* Let the kernel figure it out; Principled learning of pre-processing for kernel classifiers
* LidarBoost: Depth superresolution for ToF 3D shape scanning
* Linear embeddings in non-rigid structure from motion
* Linear solution to scale and rotation invariant object matching
* Linear spatial pyramid matching using sparse coding for image classification
* Linear stratified approach for 3D modelling and calibration using full geometric constraints
* Localized content-based image retrieval through evidence region identification
* Locally constrained diffusion process on locally densified distance spaces with applications to shape retrieval
* Locally time-invariant models of human activities using trajectories on the Grassmannian
* Manhattan-world stereo
* Manifold Discriminant Analysis
* Marked point processes for crowd counting
* Markerless Motion Capture with unsynchronized moving cameras
* Markov Chain Monte Carlo combined with deterministic methods for Markov random field optimization
* Material classification using BRDF slices
* Max-margin hidden conditional random fields for human action recognition
* Maximizing intra-individual correlations for face recognition across pose differences
* Memory-based Particle Filter for face pose tracking robust under complex dynamics
* min-max framework of cascaded classifier with multiple instance learning for computer aided diagnosis, A
* minimal parameterization of the trifocal tensor, A
* Minimizing sparse higher order energy functions of discrete variables
* Modeling images as mixtures of reference images
* Monitoring, recognizing and discovering social networks
* Motion capture using joint skeleton tracking and surface estimation
* Motion pattern interpretation and detection for tracking moving vehicles in airborne video
* Moving cast shadow detection using physics-based features
* Multi-camera activity correlation analysis
* Multi-class active learning for image classification
* Multi-cue onboard pedestrian detection
* Multi-label sparse coding for automatic image annotation
* Multi-object tracking through occlusions by local tracklets filtering and global tracklets association with detection responses
* Multi-view 3D human pose estimation combining single-frame recovery, temporal integration and model adaptation
* multi-view probabilistic model for 3D object classes, A
* Multiphase geometric couplings for the segmentation of neural processes
* Multiple instance feature for robust part-based object detection
* Multiple view image denoising
* Multiplicative nonnegative graph embedding
* multiscale hybrid model exploiting heterogeneous contextual relationships for image segmentation, A
* Mutual information-based stereo matching combined with SIFT descriptor in log-chromaticity color space
* New appearance models for natural image matting
* Non-rigid 2D-3D pose estimation and 2D image segmentation
* Noninvasive volumetric imaging of cardiac electrophysiology
* Nonlinear Nonnegative Component Analysis
* Nonnegative Matrix Factorization with Earth Mover's Distance metric
* Nonparametric discriminant HMM and application to facial expression recognition
* nonparametric Riemannian framework for processing high angular resolution diffusion images (HARDI), A
* Nonparametric scene parsing: Label transfer via dense scene alignment
* Nonrigid registration combining global and local statistics
* Nonrigid shape recovery by Gaussian process regression
* novel feature descriptor invariant to complex brightness changes, A
* Object detection using a max-margin Hough transform
* Observable subspaces for 3D human motion recovery
* Observe locally, infer globally: A space-time MRF for detecting abnormal activities with incremental updates
* On bias correction for geometric parameter estimation in computer vision
* On compositional Image Alignment, with an application to Active Appearance Models
* On edge detection on surfaces
* On the burstiness of visual elements
* On the set of images modulo viewpoint and contrast changes
* Optimal scanning for faster object detection
* Optimal single image capture for motion deblurring
* Optimization of landmark selection for cortical surface registration
* P-brush: Continuous valued MRFs with normed pairwise distributions for image segmentation
* Pedestrian detection: A benchmark
* perceptually motivated online benchmark for image matting, A
* Photometric stereo and weather estimation using internet images
* Physics-based edge evaluation for improved color constancy
* Physiological face recognition is coming of age
* Picking the best DAISY
* Pictorial structures revisited: People detection and articulated pose estimation
* Piecewise planar city 3D modeling from street view panoramic sequences
* Planar orientation from blur gradients in a single image
* Polarization: Beneficial for visibility enhancement?
* Pose estimation for category specific multiview object localization
* Pose estimation with radial distortion and unknown focal length
* Pose search: Retrieving people using their pose
* Predicting high resolution image edges with a generic, adaptive, 3-D vehicle model
* projective framework for radiometric image analysis, A
* Projective least-squares: Global solutions with local optimization
* projector-based movable hand-held display system, A
* projector-camera setup for geometry-invariant frequency demultiplexing, A
* Random walks on graphs to model saliency in images
* Randomized structure from motion based on atomic 3D models from camera triplets
* Rank Priors for Continuous Non-Linear Dimensionality Reduction
* Real-time learning of accurate patch rectification
* Real-time O(1) bilateral filtering
* Real-time vehicle detection for highway driving
* Recognising action as clouds of space-time interest points
* Recognition of repetitive sequential human activity
* Recognition using regions
* Recognizing human group activities with localized causalities
* Recognizing indoor scenes
* Recognizing linked events: Searching the space of feasible explanations
* Recognizing realistic actions from videos in the wild
* Reconstructing sharply folding surfaces: A convex formulation
* Recovering specular surfaces using curved line images
* Reducing JointBoost-based multiclass classification to proximity search
* Regularized multi-class semi-supervised boosting
* Relighting objects from image collections
* Removing partial blur in a single image
* Resolution-Invariant Image Representation and its applications
* Retrographic sensing for the measurement of surface texture and shape
* revisit of Generative Model for Automatic Image Annotation using Markov Random Fields, A
* robust approach for automatic registration of aerial images with untextured aerial LiDAR data, A
* Robust guidewire tracking in fluoroscopy
* Robust multi-class transductive learning with graphs
* Robust object detection using marginal space learning and ranking-based multi-detector aggregation: Application to left ventricle detection in 2D MRI images
* Robust Parametric Method for Bias Field Estimation and Segmentation of MR Images, A
* Robust shadow and illumination estimation using a mixture model
* robust shape model for multi-view car alignment, A
* Robust unsupervised segmentation of degraded document images with topic models
* Robustifying eye center localization by head pose cues
* Saliency-based discriminant tracking
* Shape analysis with conformal invariants for multiply connected domains and its application to analyzing brain morphology
* Shape band: A deformable object detection approach
* Shape classification through structured learning of matching measures
* Shape comparison using perturbing shape registration
* Shape constrained figure-ground segmentation and tracking
* Shape discovery from unlabeled image collections
* Shape evolution for rigid and nonrigid shape registration and recovery
* Shape of Gaussians as feature descriptors
* Shape priors and discrete MRFs for knowledge-based segmentation
* Shape-based object recognition in videos using 3D synthetic object models
* Shared Kernel Information Embedding for Discriminative Inference
* SIFT-Rank: Ordinal description for invariant feature correspondence
* Sigma Set: A small second order statistical region descriptor
* similarity measure between vector sequences with application to handwritten word image retrieval, A
* Similarity metrics and efficient optimization for simultaneous registration
* Simultaneous image classification and annotation
* Single Image Haze Removal Using Dark Channel Prior
* Single-image optical center estimation from vignetting and tangential gradient symmetry
* Sparse subspace clustering
* Spatiotemporal stereo via spatiotemporal quadric element (stequel) matching
* Stacks of convolutional Restricted Boltzmann Machines for shift-invariant feature learning
* StaRSaC: Stable random sample consensus for parameter estimation
* Stel component analysis: Modeling spatial correlations in image class structure
* stereo approach that handles the matting problem via image warping, A
* Stereo matching in the presence of sub-pixel calibration errors
* Stereo matching with nonparametric smoothness priors in feature space
* Stereographic rectification of omnidirectional stereo pairs
* Stochastic gradient kernel density mode-seeking
* streaming framework for seamless building reconstruction from large-scale aerial LiDAR data, A
* Structured output-associative regression
* Super-resolution via recapture and Bayesian effect modeling
* Support Vector Machines in face recognition with occlusions
* Surface feature detection and description with applications to mesh matching
* SURFTrac: Efficient tracking and continuous object recognition using local feature descriptors
* Switching Gaussian Process Dynamic Models for simultaneous composite motion tracking and recognition
* Symmetric two dimensional linear discriminant analysis (2DLDA)
* Symmetry integrated region-based image segmentation
* Tensor-Based Algorithm for High-Order Graph Matching, A
* Textural Hausdorff Distance for wider-range tolerance to pose variation and misalignment in 2D face recognition
* Topology dictionary with Markov model for 3D video content-based skimming and description
* Tour the world: Building a web-scale landmark recognition engine
* Towards a practical face recognition system: Robust registration and illumination by sparse representation
* Towards geographical referencing of monocular SLAM reconstruction using 3D city models: Application to real-time accurate vision-based localization
* Towards high-resolution large-scale multi-view stereo
* Towards total scene understanding: Classification, annotation and segmentation in an automatic framework
* Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive Basin Hopping Monte Carlo sampling
* Trajectory parsing by cluster sampling in spatio-temporal graph
* Trajectory reconstruction for affine structure-from-motion by global and local constraints
* Tubular anisotropy for 2D vessel segmentation
* Uncalibrated synthetic aperture for defocus control
* Understanding and evaluating blind deconvolution algorithms
* Understanding images of groups of people
* Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos
* unified active and semi-supervised learning framework for image compression, A
* unified model of specular and diffuse reflectance for rough, glossy surfaces, A
* Unsupervised feature optimization (UFO): Simultaneous selection of multiple features with their detection parameters
* Unsupervised learning for graph matching
* Unsupervised learning of hierarchical spatial structures in images
* Unsupervised Maximum Margin Feature Selection with manifold regularization
* Vanishing point detection for road detection
* Vanishing points estimation by self-similarity
* Variational layered dynamic textures
* Video object segmentation by hypergraph cut
* VideoTrek: A vision system for a tag-along robot
* View-invariant dynamic texture recognition using a bag of dynamical systems
* Visibility constraints on features of 3D objects
* Visual loop closing using multi-resolution SIFT grids in metric-topological SLAM
* Visual tracking via geometric particle filtering on the affine group with optimal importance functions
* Visual tracking with online Multiple Instance Learning
* Vocabulary hierarchy optimization for effective and transferable retrieval
* Volterrafaces: Discriminant analysis using Volterra kernels
* Wavelet energy map: A robust support for multi-modal registration of medical images
* What is a camera?
* What is the spatial extent of an object?
* What's it going to cost you?: Predicting effort vs. informativeness for multi-label image annotations
* Who are you? - Learning person specific classifiers from video
383 for CVPR09

CVPR10 * *CVPR
* 3D curve sketch: Flexible curve-based stereo reconstruction and calibration
* 3D model based vehicle classification in aerial imagery
* 3D morphable model construction for robust ear and face recognition
* 3D reconstruction of glossy surfaces using stereo cameras and projector-display
* 3D Scene priors for road detection
* 3D Shape correspondence by isometry-driven greedy optimization
* 3D shape scanning with a time-of-flight camera
* AAM based face tracking with temporal matching and face segmentation
* Abrupt motion tracking via adaptive stochastic approximation Monte Carlo sampling
* ABSORB: Atlas building by Self-Organized Registration and Bundling
* Accurate 3D face reconstruction from weakly calibrated wide baseline images with profile contours
* Action classification on product manifolds
* Action unit detection with segment-based SVMs
* Adaptive generic learning for face recognition from a single sample per person
* Adaptive linear predictors for real-time tracking
* Adaptive pose priors for pictorial structures
* Admissible linear map models of linear cameras
* Aggregating local descriptors into a compact image representation
* Analysis of light transport in scattering media
* Analyzing spatially-varying blur
* Anatomical parts-based regression using non-negative matrix factorization
* Anomaly detection in crowded scenes
* approach to vectorial total variation based on geometric measure theory, An
* ARISTA: Image search to annotation on billions of web photos
* Asymmetric region-to-image matching for comparing images with generic object categories
* Attribute-centric recognition for cross-category generalization
* Authority-shift clustering: Hierarchical clustering by authority seeking on graphs
* Automatic attribution of ancient Roman imperial coins
* automatic design of feature spaces for local image descriptors using an ensemble of non-linear feature extractors, The
* Automatic discovery of meaningful object parts with latent CRFs
* Automatic image annotation using group sparsity
* Automatic point-based facial trait judgments evaluation
* automatic unsupervised classification of MR images in Alzheimer's disease, An
* Axial light field for curved mirrors: Reflect your perspective, widen your view
* Bayes optimal kernel discriminant analysis
* Beyond active noun tagging: Modeling contextual interactions for multi-class active learning
* Beyond trees: MRF inference via outer-planar decomposition
* Bidirectional relighting for 3D-aided 2D face recognition
* Bimodal gender recognition from face and fingerprint
* Boosting for transfer learning with multiple sources
* Boundary Learning by Optimization with Topological Constraints
* Breaking the interactive bottleneck in multi-class classification with active selection and binary feedback
* Building and using a semantivisual image hierarchy
* Building reconstruction using Manhattan-world grammars
* Bundled depth-map merging for multi-view stereo
* Calibration-free gaze sensing using saliency maps
* Cascade object detection with deformable part models
* Cascaded L1-norm Minimization Learning (CLML) classifier for human detection
* Cascaded pose regression
* chains model for detecting parts by their context, The
* Chaotic invariants of Lagrangian particle trajectories for anomaly detection in crowded scenes
* Classification and clustering via dictionary learning with structured incoherence and shared features
* Clustering dynamic textures with the hierarchical EM algorithm
* Co-clustering of image segments using convex optimization applied to EM neuronal reconstruction
* Coded exposure imaging for projective motion deblurring
* Collect-cut: Segmentation with top-down cues discovered in multi-object images
* Combining discriminative and generative methods for 3D deformable surface and articulated pose reconstruction
* Common visual pattern discovery via spatially coherent correspondences
* Compact projection: Simple and efficient near neighbor search with practical memory requirements
* Comparative object similarity for improved recognition with few or no examples
* Compression of surface registrations using Beltrami coefficients
* Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora
* Consensus photometric stereo
* constant-space belief propagation algorithm for stereo matching, A
* Constrained parametric min-cuts for automatic object segmentation
* content-aware image prior, A
* Content-aware Ranking for visual search
* Context-Aware Saliency Detection
* Context-constrained hallucination for image super-resolution
* Contour people: A parameterized model of 2D articulated human shape
* Convex shape decomposition
* Correcting over-exposure in photographs
* Cost-sensitive subspace learning for face recognition
* Covering trees and lower-bounds on quadratic assignment
* CRAM: Compact representation of actions in movies
* Cross-dataset action detection
* Curious snakes: A minimum latency solution to the cluttered background problem in active contours
* DARTs: Efficient scale-space extraction of DAISY keypoints
* Data driven mean-shift belief propagation for non-Gaussian MRFs
* Data fusion through cross-modality metric learning using similarity-sensitive hashing
* Deconvolutional networks
* Delineating trees in noisy 2D images and 3D image-stacks
* Denoising vs. deblurring: HDR imaging techniques using moving cameras
* Dense interest points
* Dense non-rigid surface registration using high-order graph matching
* Depth from Diffusion
* Detecting and parsing architecture at city scale from range data
* Detecting and sketching the common
* Detecting text in natural scenes with stroke width transform
* Diffeomorphic sulcal shape analysis for cortical surface registration
* diffusion approach to seeded image segmentation, A
* Diffusion filtering without parameter tuning: Models and inference tools
* Direct image alignment of projector-camera systems with planar surfaces
* Disambiguating visual relations using loop constraints
* Discontinuous seam-carving for video retargeting
* Discovering scene categories by information projection and cluster sampling
* Discrete minimum ratio curves and surfaces
* Discriminative clustering for image co-segmentation
* Discriminative K-SVD for dictionary learning in face recognition
* Dominant orientation templates for real-time detection of texture-less objects
* Dynamic and scalable large scale image reconstruction
* Dynamic surface matching by geodesic mapping for 3D animation transfer
* Dynamic texture recognition based on distributions of spacetime oriented structure
* Dynamical binary latent variable models for 3D human pose tracking
* Efficient action spotting based on a spacetime oriented structure representation
* Efficient Additive Kernels via Explicit Feature Maps
* Efficient computation of robust low-rank matrix approximations in the presence of missing data using the L1 norm
* efficient divide-and-conquer cascade for nonlinear object detection, An
* Efficient extraction of human motion volumes by tracking
* Efficient filter flow for space-variant multiframe blind deconvolution
* Efficient hierarchical graph-based video segmentation
* Efficient histogram-based sliding window
* Efficient joint 2D and 3D palmprint matching with alignment refinement
* Efficient piecewise learning for conditional random fields
* Efficient rotation invariant object detection using boosted Random Ferns
* Efficiently selecting regions for scene understanding
* Egomotion using assorted features
* Energy minimization for linear envelope MRFs
* Estimating camera pose from a single urban ground-view omnidirectional image and a 2D building outline map
* Estimating demosaicing algorithms using image noise variance
* Estimating optical properties of layered surfaces using the spider model
* Estimating satellite attitude from pushbroom sensors
* Estimation of image bias field with sparsity constraints
* Evaluation of stereo confidence indoors and outdoors
* Exploiting global connectivity constraints for reconstruction of 3D line segments from images
* Exploiting hierarchical context on a large database of object categories
* Exploiting Monge structures in optimum subwindow search
* Exploiting simple hierarchies for unsupervised human behavior analysis
* Exploring facial expressions with compositional features
* Exploring features in a Bayesian framework for material recognition
* extension of multifactor analysis for face recognition based on submanifold learning, An
* eye for an eye: A single camera gaze-replacement method, An
* Face recognition based on image sets
* Face recognition with learning-based descriptor
* Facial point detection using boosted regression and graph models
* Factorization towards a classifier
* Far-sighted active learning on a budget for image and video recognition
* Fast and robust object segmentation with the Integral Linear Classifier
* Fast Approximate Energy Minimization with Label Costs
* Fast directional chamfer matching
* Fast global optimization of curvature
* Fast globally optimal 2D human detection with loopy graph models
* Fast image alignment in the Fourier domain
* Fast matting using large kernel matting Laplacian matrices
* Fast pattern matching using orthogonal Haar transform
* Fast polygonal integration and its application in extending Haar-like features to improve object detection
* Fast sparse representation with prototypes
* Figure-ground segmentation improves handled object recognition in egocentric video
* Finding dots: Segmentation as popping out regions from boundaries
* Finding image distributions on active curves
* Finding meaning on YouTube: Tag recommendation and category discovery
* Finding Nemo: Deformable object class modelling using curve matching
* Food recognition using statistics of pairwise local features
* framework for ultra high resolution 3D imaging, A
* Free-form mesh tracking: A patch-based approach
* Free-shape subwindow search for object localization
* game-theoretic approach to fine surface registration without initial motion estimation, A
* Generalized simultaneous registration and segmentation
* Generating sharp panoramas from motion-blurred videos
* generative perspective on MRFs in low-level vision, A
* Geo-location estimation from two shadow trajectories
* Geodesic graph cut for interactive image segmentation
* Geodesic star convexity for interactive image segmentation
* Geometric properties of multiple reflections in catadioptric camera with two planar mirrors
* Gesture recognition by learning local motion signatures
* Global and efficient self-similarity for object classification and detection
* Global and local isometry-invariant descriptor for 3D shape comparison and partial matching
* Global Gaussian approach for scene categorization using information geometry
* Global optimization for estimating a BRDF with multiple specular lobes
* globally optimal data-driven approach for image distortion estimation, A
* Globally optimal pixel labeling algorithms for tree metrics
* GPCA with denoising: A moments-based convex approach
* Gradient-directed composition of multi-exposure images
* Graph cut segmentation with a global constraint: Recovering region distribution via a bound of the Bhattacharyya measure
* Group motion segmentation using a Spatio-Temporal Driving Force Model
* Group MRF for fMRI activation detection
* Grouplet: A structured image representation for recognizing human and object interactions
* Growing semantically meaningful models for visual SLAM
* Harmony potentials for joint classification and segmentation
* Harvesting large-scale weakly-tagged image databases from the web
* Heterogeneous Conditional Random Field: Realizing joint detection and segmentation of cell regions in microscopic images
* High performance object detection by collaborative learning of Joint Ranking of Granules features
* High-Resolution Modeling of Moving and Deforming Objects Using Sparse Geometric and Dense Photometric Measurements
* Highly accurate boundary detection and grouping
* Hough transform-based voting framework for action recognition, A
* Human identity recognition in aerial images
* Hybrid multi-view reconstruction by Jump-Diffusion
* Hybrid shift map for video retargeting
* iCoseg: Interactive co-segmentation with intelligent scribble guidance
* Illumination compensation based change detection using order consistency
* Image atlas construction via intrinsic averaging on the manifold of images
* Image restoration and disparity estimation from an uncalibrated multi-layered image
* Image retrieval via probabilistic hypergraph ranking
* Image webs: Computing and exploiting connectivity in image collections
* Implicit hierarchical boosting for multi-view object detection
* Improving state-of-the-art OCR through high-precision document-specific modeling
* Improving the efficiency of hierarchical structure-and-motion
* Improving web image search results using query-relative classifiers
* Increasing depth resolution of electron microscopy of neural circuits using sparse tomographic reconstruction
* Ink-bleed reduction using functional minimization
* Integrated pedestrian classification and orientation estimation
* Interest seam image
* Isoperimetric cut on a directed graph
* Label propagation in video sequences
* Large-scale image categorization with explicit data embedding
* Large-scale image retrieval with compressed Fisher vectors
* Latent hierarchical structural learning for object detection
* Lattice Cut: Constructing superpixels using layer constraints
* Layered object detection for multi-class segmentation
* Learning 3D action models from a few 2D videos for view invariant action recognition
* Learning 3D shape from a single facial image via non-linear manifold embedding and alignment
* Learning a hierarchy of discriminative space-time neighborhood features for human action recognition
* Learning a probabilistic model mixing 3D and 2D primitives for view invariant object recognition
* Learning appearance in virtual scenarios for pedestrian detection
* Learning from interpolated images using neural networks for digital forensics
* Learning Full Pairwise Affinities for Spectral Segmentation
* Learning kernels for variants of normalized cuts: Convex relaxations and applications
* Learning mid-level features for recognition
* Learning shift-invariant sparse representation of actions
* Learning to recognize shadows in monochromatic natural images
* Learning weights for codebook in image classification and retrieval
* Line matching leveraged by point correspondences
* Linear view synthesis using a dimensionality gap light field prior
* Linked edges as stable region boundaries
* Live dense reconstruction with a single moving camera
* Local features are not lonely: Laplacian sparse coding for image classification
* Locality-constrained Linear Coding for image classification
* Localizing non-overlapping surveillance cameras under the L-Infinity norm
* LP norm multiple kernel Fisher discriminant analysis for object and image categorisation
* Lymph node detection in 3-D chest CT using a spatial prior probability
* Making specific features less discriminative to improve point-based 3D object recognition
* Manifold blurring mean shift algorithms for manifold denoising
* Many-to-one contour matching for describing and discriminating object shape
* Masked FFT registration
* Measuring visual saliency by Site Entropy Rate
* Metric-induced optimal embedding for intrinsic 3D shape analysis
* Minimum length in the tangent bundle as a model for curve completion
* Model evolution: An incremental approach to non-rigid structure from motion
* Model globally, match locally: Efficient and robust 3D object recognition
* Model-based respiratory motion compensation for image-guided cardiac interventions
* Modeling and estimating persistent motion with geometric flows
* Modeling mutual context of object and human pose in human-object Interaction activities
* Modeling pixel means and covariances using factorized third-order Boltzmann machines
* Modeling pixel process with scale invariant local patterns for background subtraction in complex scenes
* Monocular 3D pose estimation and tracking by detection
* Monocular SLAM with locally planar landmarks via geometric Rao-Blackwellized particle filtering on Lie groups
* Morphable Reflectance Fields for enhancing face recognition
* Morphological snakes
* Motion Detail Preserving Optical Flow Estimation
* Motion estimation with non-local total variation regularization
* Motion fields to predict play evolution in dynamic sport scenes
* Moving vistas: Exploiting motion for describing scenes
* Multi-class object localization by combining local contextual interactions
* Multi-cue pedestrian classification with partial occlusion handling
* Multi-domain, higher order level set scheme for 3D image segmentation on the GPU
* Multi-structure model selection via kernel optimisation
* Multi-target tracking by on-line learned discriminative appearance models
* Multi-target tracking of time-varying spatial patterns
* Multi-task warped Gaussian process for personalized age estimation
* Multi-view object class detection with a 3D geometric model
* Multi-view Scene Flow Estimation: A View Centered Variational Approach
* Multi-view structure computation without explicitly estimating motion
* Multilinear feature extraction and classification of multi-focal images, with applications in nematode taxonomy
* Multilinear pose and body shape estimation of dressed subjects from image sets
* Multimodal semi-supervised learning for image classification
* Multiple dynamic models for tracking the left ventricle of the heart from ultrasound data using particle filters and deep learning architectures
* Multiple object detection by sequential Monte Carlo and Hierarchical Detection Network
* multiscale competitive code via sparse representation for palmprint verification, The
* Multisensor-fusion for 3D full-body human motion capture
* Multiview constraints in frequency space and camera calibration from unsynchronized images
* Natural gradients for deformable registration
* Neuron geometry extraction by perceptual grouping in ssTEM images
* New features and insights for pedestrian detection
* new texture descriptor using multifractal analysis in multi-orientation wavelet pyramid, A
* Noise-optimal capture for high dynamic range photography
* Non-rigid structure from locally-rigid motion
* Non-uniform Deblurring for Shaken Images
* Nonparametric higher-order learning for interactive segmentation
* Nonparametric Label-to-Region by search
* novel Markov random field based deformable model for face recognition, A
* Novel observation model for probabilistic object tracking
* novel Riemannian framework for shape analysis of 3D objects, A
* Object cut: Complex 3D object reconstruction through line drawing separation
* Object detection via boundary structure segmentation
* Object matching with a locally affine-invariant constraint
* Object recognition as ranking holistic figure-ground hypotheses
* Object recognition by discriminative combinations of line segments and ellipses
* Object separation in x-ray image sets
* object-dependent hand pose prior from sparse training data, An
* Object-graphs for context-aware category discovery
* Object-to-object color transfer: Optimal flows and SMSP transformations
* On Detection of Multiple Object Instances Using Hough Transforms
* On growth and formlets: Sparse multi-scale coding of planar shape
* On the design of robust classifiers for computer vision
* On-line semi-supervised multiple-instance boosting
* One-shot multi-set non-rigid feature-spatial matching
* online approach: Learning-Semantic-Scene-by-Tracking and Tracking-by-Learning-Semantic-Scene, An
* Online multi-class LPBoost
* Online multiple instance learning with no regret
* Online visual vocabulary pruning using pairwise constraints
* Online-batch strongly convex Multi Kernel Learning
* Optical flow estimation with adaptive convolution kernel prior on discrete framework
* Optimal coded sampling for temporal super-resolution
* Optimal HDR reconstruction with linear digital cameras
* Optimizing kd-trees for scalable visual descriptor indexing
* Optimizing one-shot recognition with micro-set learning
* Outlier removal using duality
* P-N learning: Bootstrapping binary classifiers by structural constraints
* Parallel and distributed graph cuts by dual decomposition
* Parallel graph-cuts by adaptive bottom-up merging
* Parametric dimensionality reduction by unsupervised regression
* Pareto discriminant analysis
* Pareto-optimal dictionaries for signatures
* Part and appearance sharing: Recursive Compositional Models for multi-view, Multi-Object Detection
* Performance evaluation of color correction approaches for automatic multi-view image and video stitching
* Person re-identification by symmetry-driven accumulation of local features
* Personalization of image enhancement
* phase only transform for unsupervised surface defect detection, The
* Piecewise planar and non-planar stereo for urban scene reconstruction
* Player localization using multiple static cameras for sports visualization
* Point-based non-rigid surface registration with accuracy estimation
* Polynomial shape from shading
* Pose-robust albedo estimation from a single image
* Posture invariant surface description and feature extraction
* Probabilistic 3D occupancy flow with latent silhouette cues
* probabilistic framework for joint segmentation and tracking, A
* probabilistic image jigsaw puzzle solver, A
* Probabilistic models for supervised dictionary learning
* Probabilistic temporal inference on reconstructed 3D scenes
* PROST: Parallel robust online simple tracking
* Proximate sensing: Inferring what-is-where from georeferenced photo collections
* Pushing the Envelope of Modern Methods for Bundle Adjustment
* Putting local features on a manifold
* Quasi-dense 3D reconstruction using tensor-based multiview stereo
* Randomized hybrid linear modeling by local best-fit flats
* Rapid and accurate developmental stage recognition of C. elegans from high-throughput image data
* Rapid face recognition using hashing
* Rapid selection of reliable templates for visual tracking
* RASL: Robust Alignment by Sparse and Low-Rank Decomposition for Linearly Correlated Images
* Ray Markov Random Fields for image-based 3D modeling: Model and efficient inference
* Reading between the Lines: Object Localization Using Implicit Cues from Image Tags
* Real time motion capture using a single time-of-flight camera
* Real-time tracking of multiple occluding objects using level sets
* Real-time vehicle global localisation with a single camera in dense urban areas: Exploitation of coarse 3D city models
* Recognizing human actions from still images with latent poses
* Reconstruction of display and eyes from a single image
* Recovering fluid-type motions using Navier-Stokes potential flow
* Recovering thin structures via nonlocal-means regularization with application to depth from defocus
* Rectification of figures and photos in document images using bounding box interface
* Rectifying rolling shutter video from hand-held devices
* Rectilinear parsing of architecture in urban environment
* Refinement of digital elevation models from shadowing cues
* Regenerative morphing
* Region moments: Fast invariant descriptors for detecting small image structures
* Relaxing the 3L algorithm for an accurate implicit polynomial fitting
* Removal of 3D facial expressions: A learning-based approach
* Removing rolling shutter wobble
* Robust classification of objects, faces, and flowers using natural image statistics
* Robust flash deblurring
* Robust order-based methods for feature description
* Robust piecewise-planar 3D reconstruction and completion from large-scale unstructured point data
* Robust RVM regression using sparse outlier model
* Robust video denoising using low rank matrix completion
* role of features, algorithms and data in visual recognition, The
* Safety in numbers: Learning categories from few examples with multi model knowledge transfer
* Scalable active matching
* Scalable Face Image Retrieval with Identity-Based Quantization and Multireference Reranking
* Scale-hierarchical 3D object recognition in cluttered scenes
* Scale-invariant heat kernel signatures for non-rigid shape recognition
* Scene understanding by statistical modeling of motion patterns
* Search strategies for multiple landmark detection by submodular maximization
* Secrets of optical flow estimation and their principles
* Segmentation of building facades using procedural shape priors
* Segmenting video into classes of algorithm-suitability
* Self-calibrating photometric stereo
* Semantic context modeling with maximal margin Conditional Random Fields for automatic image annotation
* Semi-supervised hashing for scalable image retrieval
* Sensor saturation in Fourier multiplexed imaging
* Shape and refractive index recovery from single-view polarisation images
* Shape-based similarity retrieval of Doppler images for clinical decision support
* shape-driven MRF model for the segmentation of organs in medical images, A
* Sign ambiguity resolution for phase demodulation in interferometry with application to prelens tear film analysis
* Silhouette transformation based on walking speed for gait identification
* Simultaneous foreground, background, and alpha estimation for image matting
* Simultaneous point matching and 3D deformable surface reconstruction
* Simultaneous pose, correspondence and non-rigid shape
* Simultaneous searching of globally optimal interacting surfaces with shape priors
* Simultaneous surveillance camera calibration and foot-head homology estimation from human detections
* Single image depth estimation from predicted semantic labels
* Sparse representation using nonnegative curds and whey
* Sparsity model for robust optical flow estimation at motion discontinuities
* Spatial-bag-of-features
* Spatialized epitome and its applications
* spatially varying PSF-based prior for alpha matting, A
* SPEC hashing: Similarity preserving algorithm for entropy-based coding
* Specular surface reconstruction from sparse reflection correspondences
* Spherical embeddings for non-Euclidean dissimilarities
* Spike train driven dynamical models for human actions
* square-root sampling approach to fast histogram-based search, A
* Steiner tree approach to efficient object detection, A
* Stratified learning of local anatomical context for lung nodules in CT images
* study on continuous max-flow and min-cut approaches, A
* Sufficient dimension reduction for visual sequence classification
* SUN database: Large-scale scene recognition from abbey to zoo
* Super resolution using edge prior and single image detail synthesis
* Supervised translation-invariant sparse coding
* Support vector regression for multi-view gait recognition based on local motion feature selection
* Surface color estimation based on inter- and intra-pixel relationships in outdoor scenes
* Surface extraction from binary volumes with higher-order smoothness
* Surface stereo with soft segmentation
* SVM for edge-preserving filtering
* Tag-based web photo retrieval improved by batch mode re-tagging
* Talking pictures: Temporal grouping and dialog-supervised person recognition
* Taxonomic classification for web-based videos
* Temporal causality for the analysis of visual events
* theory of phase-sensitive rotation invariance with spherical harmonic and moment-based representations, A
* theory of plenoptic multiplexing, A
* Tiered scene labeling with dynamic programming
* Topic regression multi-modal Latent Dirichlet Allocation for image annotation
* Total Bregman divergence and its applications to shape retrieval
* Toward coherent object detection and scene layout understanding
* Towards general motion-based face recognition
* Towards Internet-scale multi-view stereo
* Towards semantic embedding in visual vocabulary
* Towards weakly supervised semantic segmentation by means of multiple instance and multitask learning
* Tracking people interacting with objects
* Tracking the invisible: Learning where the object might be
* Tracking with local spatio-temporal motion patterns in extremely crowded scenes
* Trajectory matching from unsynchronized videos
* Transductive segmentation of live video with non-stationary background
* Transform coding for fast approximate nearest neighbor search in high dimensions
* Triangulation made easy
* Two perceptually motivated strategies for shape classification
* Unified graph matching in Euclidean spaces
* Unified Real-Time Tracking and Recognition with Rotation-Invariant Fast Features
* Unsupervised detection and segmentation of identical objects
* Unsupervised discovery of co-occurrence in sparse high dimensional data
* Unsupervised discovery of facial events
* Unsupervised learning of invariant features using video
* Upsampling range data in dynamic environments
* Use bin-ratio information for category and scene classification
* Using cloud shadows to infer scene structure and camera calibration
* Using optical defocus to denoise
* Variational segmentation of elongated volumetric structures
* Vehicle detection and tracking in wide field-of-view aerial video
* Vessel scale-selection using MRF optimization
* Visual Classification With Multitask Joint Sparse Representation
* Visual Event Recognition in Videos by Learning from Web Data
* Visual object tracking using adaptive correlation filters
* Visual recognition and detection under bounded computational resources
* Visual recognition using mappings that replicate margins
* Visual tracking decomposition
* Visual tracking via incremental self-tuning particle filtering on the affine group
* Visual tracking via weakly supervised learning from multiple imperfect oracles
* Warp propagation for video resizing
* Warping background subtraction
* Weakly-supervised hashing in kernel space
* What helps where - and why? Semantic relatedness for knowledge transfer
* What is an object?
* What's going on? Discovering spatio-temporal dependencies in dynamic scenes
* YouTubeCat: Learning to categorize wild web videos
462 for CVPR10

CVPR11 * *CVPR
* 2.5D building modeling with topology control
* 2D nonrigid partial shape matching using MCMC and contour subdivision
* 3-D marked point process model for multi-view people detection, A
* 3D motion reconstruction for real-world camera motion
* Abnormal detection using interaction energy potentials
* Accelerated low-rank visual recovery by random projection
* Action recognition by dense trajectories
* Action recognition from a distributed representation of pose and appearance
* Action recognition using context and appearance distribution features
* Action recognition with multiscale spatio-temporal contexts
* Active learning for piecewise planar 3D reconstruction
* Activity recognition using dynamic subspace angles
* Actom sequence models for efficient action detection
* AdaBoost on low-rank PSD matrices for metric learning
* Adapted Gaussian models for image classification
* Adapting an object detector by considering the worst case: A conservative approach
* Adaptive and discriminative metric differential tracking
* Adaptive random forest: How many experts to ask before making a decision?
* Adequate reconstruction of transparent objects on a shoestring budget
* Aesthetic quality classification of photographs based on color harmony
* Affine-invariant diffusion geometry for the analysis of deformable 3D shapes
* Affinity learning on a tensor product graph with applications to shape and image retrieval
* Aggregating gradient distributions into intensity orders: A novel local image descriptor
* analysis of using high-frequency sinusoidal illumination to measure the 3D shape of translucent objects, An
* Are sparse representations really relevant for image classification?
* Articulated pose estimation with flexible mixtures-of-parts
* associate-predict model for face recognition, An
* Asymmetric distances for binary embeddings
* Auto-directed video stabilization with robust L1 optimal camera paths
* Automatic adaptation of a generic pedestrian detector to a specific traffic scene
* Automatic photo-to-terrain alignment for the annotation of mountain pictures
* BabyTalk: Understanding and Generating Simple Image Descriptions
* Bayesian approach to adaptive video super resolution, A
* Bayesian deblurring with integrated noise estimation
* Beyond Alhazen's problem: Analytical projection model for non-central catadioptric cameras with quadric mirrors
* Biased normalized cuts
* Blind deconvolution using a normalized sparsity measure
* Blur kernel estimation using the radon transform
* Boosted local structured HOG-LBP for object localization
* Boundary Preserving Dense Local Regions
* branch and contract algorithm for globally optimal fundamental matrix estimation, A
* Branch and track
* brute-force algorithm for reconstructing a scene from two projections, A
* Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch
* Camera calibration with lens distortion from low-rank textures
* Capturing Time-of-Flight data with confidence
* City-scale landmark identification on mobile devices
* Classification with scattering operators
* closed form solution to robust subspace estimation and clustering, A
* Clues from the beaten path: Location estimation with bursty sequences of tourist photos
* coarse-to-fine approach for fast deformable object detection, A
* Collaborative personalization of image enhancement
* Combining attributes and Fisher vectors for efficient image retrieval
* Combining randomization and discrimination for fine-grained image categorization
* Compact hashing with joint optimization of search accuracy and time
* Comparing data-dependent and data-independent embeddings for classification and ranking of Internet images
* complete statistical inverse ray tracing approach to multi-view stereo, A
* Connecting non-quadratic variational models and MRFs
* Constructing image panoramas using dual-homography warping
* Context tracker: Exploring supporters and distracters in unconstrained environments
* Contextualizing object detection and classification
* Continuously tracking and see-through occlusion based on a new hybrid synthetic aperture imaging model
* Contour cut: Identifying salient contours in images by solving a Hermitian eigenvalue problem
* Contour-based joint clustering of multiple segmentations
* Correspondence driven adaptation for human profile recognition
* Coupled information-theoretic encoding for face photo-sketch recognition
* Cross-view action recognition via view knowledge transfer
* CrossTrack: Robust 3D tracking from two cross-sectional views
* Deformation and illumination invariant feature point descriptor
* deformation and lighting insensitive metric for face recognition based on dense correspondences, A
* Detection free tracking: Exploiting motion and topology for segmenting and tracking under entanglement
* Detection of mitosis within a stem cell population of high cell confluence in phase-contrast microscopy images
* Deterministically maximizing feasible subsystem for robust model fitting with unit norm constraint
* direct formulation for totally-corrective multi-class boosting, A
* Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications
* Discrete-continuous optimization for large-scale structure from motion
* Discriminative affine sparse codes for image classification
* Discriminative image warping with attribute flow
* Discriminative spatial pyramid
* Discriminative tag learning on YouTube videos with latent sub-tags
* Distributed computer vision algorithms through distributed averaging
* Distributed message passing for large scale graphical models
* Dynamic batch mode active learning
* Earth mover's prototypes: A convex learning approach for discovering activity patterns in dynamic scenes
* Edgel index for large-scale sketch-based image search
* Effective 3D object detection and regression using probabilistic segmentation features in CT images
* effective document image deblurring algorithm, An
* Efficient Euclidean distance transform using perpendicular bisector segmentation
* Efficient groupwise non-rigid registration of textured surfaces
* Efficient marginal likelihood optimization in blind deconvolution
* Efficient MCMC sampling with implicit shape representations
* Efficient multi-camera detection, tracking, and identification using a shared set of haar-features
* Efficient region search for object detection
* Efficient subwindow search with submodular score functions
* Efficient track linking methods for track graphs using network-flow and set-cover techniques
* Efficient training for pairwise or higher order CRFs via dual decomposition
* Energy based multiple model fitting for non-rigid structure from motion
* Enforcing similarity constraints with integer programming for better scene text recognition
* Enforcing topological constraints in random field image segmentation
* Enhancing by saliency-guided decolorization
* Entropy rate superpixel segmentation
* Estimating motion and size of moving non-line-of-sight objects in cluttered environments
* Evaluating combinational color constancy methods on real-world images
* Evaluating knowledge transfer and zero-shot learning in a large-scale setting
* Evaluation of background subtraction techniques for video surveillance
* Exhaustive family of energies minimizable exactly by a graph cut
* Exploiting phonological constraints for handshape inference in ASL video
* Exploring aligned complementary image pair for blind motion deblurring
* Exploring relations of visual codes for image classification
* Extracting and locating temporal motifs in video scenes using a hierarchical non parametric Bayesian model
* Extracting vanishing points across multiple views
* Face illumination transfer through edge-preserving filters
* Face image retrieval by shape manipulation
* Face recognition in unconstrained videos with matched background similarity
* Fast and high-performance template matching method
* Fast Cost-Volume Filtering for Visual Correspondence and Beyond
* Fast unsupervised ego-action learning for first-person sports videos
* Feature context for image classification and object detection
* Feature guided motion artifact reduction with structure-awareness in 4D CT images
* Feature- and depth-supported modified total variation optical flow for 3D motion field estimation in real scenes
* Finding the weakest link in person detectors
* FlowBoost: Appearance learning from sparsely annotated video
* Foreground segmentation of live videos using locally competing 1SVMs
* Foreground-background segmentation using iterated distribution matching
* From 3D scene geometry to human workspace
* From active contours to active surfaces
* From co-saliency to co-segmentation: An efficient and fully unsupervised energy minimization model
* From partial shape matching through local deformation to robust global shape similarity for object detection
* From region similarity to category discovery
* fully automated greedy square jigsaw puzzle solver, A
* Functional categorization of objects using real-time markerless motion capture
* Fusion of GPS and Structure-from-Motion Using Constrained Bundle Adjustments
* Gated classifiers: Boosting under high intra-class variation
* general method for the point of regard estimation in 3D space, A
* Generalized Gaussian process models
* Generalized group sparse classifiers with application in fMRI brain decoding
* Generalized Probabilistic Framework for Compact Codebook Creation, A
* Generalized projection based M-estimator: Theory and applications
* generative model for 3D urban scene understanding from movable platforms, A
* generative statistical model for tracking multiple smooth trajectories, A
* Geometric LP-norm feature pooling for image classification
* Glare encoding of high dynamic range images
* Global contrast based salient region detection
* global optimization approach to robust multi-model fitting, A
* Global optimization for optimal generalized procrustes analysis
* global sampling method for alpha matting, A
* Global stereo matching leveraged by sparse ground control points
* Global temporal registration of multiple non-rigid surface sequences
* Globally-optimal greedy algorithms for tracking a variable number of objects
* Graph connectivity in sparse subspace clustering
* Graph embedding discriminant analysis on Grassmannian manifolds for improved image set matching
* Graph matching through entropic manifold alignment
* GraphTrack: Fast and globally optimal tracking in videos
* Heat-mapping: A robust approach toward perceptually consistent mesh segmentation
* Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors
* Heterogeneous image feature integration via multi-modal spectral clustering
* Hierarchical anatomical brain networks for MCI prediction by partial least square analysis
* hierarchical conditional random field model for labeling and segmenting images of street scenes, A
* Hierarchical semantic indexing for large scale image retrieval
* High level describable attributes for predicting aesthetics and interestingness
* High resolution multispectral video capture with a hybrid camera system
* High-dimensional signature compression for large-scale image classification
* High-frequency shape and albedo from shading using natural image statistics
* High-precision localization using visual landmarks fused with range data
* High-quality shape from multi-view stereo and shading under general illumination
* High-resolution hyperspectral imaging via matrix factorization
* How does person identity recognition help multi-person tracking?
* Human brain labeling using image similarities
* Hybrid generative-discriminative classification using posterior divergence
* Hyper-graph matching via reweighted random walks
* Identifying players in broadcast sports videos using conditional random fields
* Illumination estimation and cast shadow detection through a higher-order graphical model
* Illumination invariant feature extraction based on natural images statistics: Taking face images as an example
* Image analysis by counting on a grid
* Image annotation using bi-relational graph of images and semantic labels
* Image classification by non-negative sparse coding, low-rank and sparse decomposition
* Image ranking and retrieval based on multi-attribute queries
* Image retrieval with geometry-preserving visual phrases
* Image saliency: From intrinsic to extrinsic context
* Importance filtering for image retargeting
* importance of intermediate representations for the modeling of 2D shape detection: Endstopping and curvature tuned computations, The
* Improving classifiers with unlabeled weakly-related videos
* Inertial sensor-aligned visual feature descriptors
* Inference for order reduction in Markov random fields
* Instantly telling what happens in a video sequence using simple features
* Interactively building a discriminative vocabulary of nameable attributes
* Internal statistics of a single natural image
* Interreflection removal for photometric stereo by using spectrum-dependent albedo
* Intrinsic dense 3D surface tracking
* Intrinsic images decomposition using a local and global sparse representation of reflectance
* Intrinsic images using optimization
* Is face recognition really a Compressive Sensing problem?
* Iterative quantization: A procrustean approach to learning binary codes
* Joint face alignment with a generic deformable face model
* Joint segmentation and classification of human actions in video
* Kernelized structural SVM learning for supervised object segmentation
* L1 rotation averaging using the Weiszfeld algorithm
* L1-based variational model for Retinex theory and its application to medical images, An
* Landmark/image-based deformable registration of gene expression data
* large-scale benchmark dataset for event recognition in surveillance video, A
* Large-scale image classification: Fast feature extraction and SVM training
* Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds
* Learning a blind measure of perceptual image quality
* Learning a discriminative dictionary for sparse coding via label consistent K-SVD
* Learning affinities and dependencies for multi-target tracking using a CRF model
* Learning and matching multiscale template descriptors for real-time detection, localization and tracking
* Learning better image representations using flobject analysis
* Learning context for collective activity recognition
* Learning effective human pose estimation from inaccurate annotation
* Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis
* Learning hierarchical poselets for human parsing
* Learning image representations from the pixel level via hierarchical sparse coding
* Learning image Vicept description via mixed-norm regularization for large scale semantic image search
* Learning invariance through imitation
* Learning message-passing inference machines for structured prediction
* Learning non-local range Markov Random field for image restoration
* Learning object color models from multi-view constraints
* Learning people detection models from few training samples
* Learning photographic global tonal adjustment with a database of input-output image pairs
* Learning structured prediction models for interactive image labeling
* Learning temporally consistent rigidities
* Learning the easy things first: Self-paced visual category discovery
* Learning to find occlusion regions
* Learning to recognize objects in egocentric activities
* Learning to share visual appearance for multiclass object detection
* Learning-based hypothesis fusion for robust catheter tracking in 2D X-ray fluoroscopy
* Least squares surface reconstruction from gradients: Direct algebraic methods with spectral, Tikhonov, and constrained regularization
* light-path less traveled, The
* Line-based relative pose estimation
* Linearity of each channel pixel values from a surface in and out of shadows and its applications
* Local isomorphism to solve the pre-image problem in kernel methods
* Locality-sensitive support vector machine by exploring local correlation and global regularization
* Localizing Parts of Faces Using a Consensus of Exemplars
* magic sigma, The
* Majorization-minimization mixture model determination in image segmentation
* Making the right moves: Guiding alpha-expansion using local primal-dual gaps
* Markerless motion capture of interacting characters using multi-view image segmentation
* Matching 2D image lines to 3D models: Two improvements and a new algorithm
* Max-margin clustering: Detecting margins from projections of points on lines
* Minimum error bounded efficient L1 tracker with occlusion detection
* Mining discriminative co-occurrence patterns for visual recognition
* MKPM: A multiclass extension to the kernel projection machine
* Modeling human activities as speech
* Modeling the joint density of two images under a variety of transformations
* Modelling composite shapes by Gibbs random fields
* Monocular 3D scene understanding with explicit occlusion reasoning
* Motion denoising with application to time-lapse photography
* Multi-agent event recognition in structured scenarios
* Multi-label learning with incomplete class assignments
* Multi-layer group sparse coding: For concurrent image classification and annotation
* Multi-level inference by relaxed dual decomposition for human pose segmentation
* Multi-spectral SIFT for scene category recognition
* Multi-target tracking by continuous energy minimization
* Multi-view reconstruction preserving weakly-supported surfaces
* Multichannel Edge-Weighted Centroidal Voronoi Tessellation algorithm for 3D super-alloy image segmentation, A
* Multicore bundle adjustment
* Multifactor analysis based on factor-dependent geometry
* Multiobject tracking as maximum weight independent set
* Multiscale geometric and spectral analysis of plane arrangements
* Multiview registration via graph diffusion of dual quaternions
* Multiview specular stereo reconstruction of large mirror surfaces
* Natural image denoising: Optimality and inherent bounds
* Noise resistant graph ranking for improved web image search
* Noise suppression in low-light images through joint denoising and demosaicing
* non-convex relaxation approach to sparse dictionary learning, A
* Non-negative local coordinate factorization for image representation
* Non-negative matrix factorization as a feature selection tool for maximum margin classifiers
* Non-rigid structure from motion with complementary rank-3 spaces
* Nonlinear refinement of structure from motion reconstruction by taking advantage of a partial knowledge of the environment
* Nonlinear shape manifolds as shape priors in level set segmentation and tracking
* Nonlocal matting
* Nonnegative sparse coding for discriminative semi-supervised learning
* Nonparametric density estimation on a graph: Learning framework, fast approximation and application in image segmentation
* Novel 4-D Open-Curve Active Contour and curve completion approach for automated tree structure extraction
* novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation, A
* novel supervised level set method for non-rigid object tracking, A
* Novelty detection from an ego-centric perspective
* O(N) implicit subspace embedding for unsupervised multi-scale image segmentation
* Object association across PTZ cameras using logistic MIL
* Object cosegmentation
* Object recognition with hierarchical kernel descriptors
* Object segmentation by alignment of poselet activations to image contours
* Object stereo: Joint stereo matching and object segmentation
* Occlusion boundary detection and figure/ground assignment from optical flow
* On analyzing video with very small motions
* On deep generative models with applications to recognition
* On dynamic scene geometry for view-invariant action matching
* Online detection of unusual events in videos via dynamic sparse coding
* Online domain adaptation of a pre-trained cascade of classifiers
* Online environment mapping
* Online group-structured dictionary learning
* Optimal similarity registration of volumetric images
* Optimal spatio-temporal path discovery for video event detection
* Ordinal hyperplanes ranker with cost sensitivities for age estimation
* P2C2: Programmable pixel compressive camera for high speed imaging
* Parameter learning with truncated message-passing
* Parsing human motion with stretchable models
* Partial similarity based nonparametric scene parsing in certain environment
* Particle filter with state permutations for solving image jigsaw puzzles
* pattern framework driven by the Hamming distance for structured light-based reconstruction with a single image, A
* PClines: Line detection using parallel coordinates
* Person re-identification by probabilistic relative distance comparison
* Piecing together the segmentation jigsaw using context
* polar representation of motion and implications for optical flow, A
* Pose-Robust Recognition of Low-Resolution Face Images
* Predicting image matching using affine distortion models
* Principal regression analysis
* Probabilistic event logic for interval-based event recognition
* Probabilistic gaze estimation without active personal calibration
* probabilistic model for recursive factorized image features, A
* probabilistic representation for efficient large scale visual recognition tasks, A
* Probabilistic simultaneous pose and non-rigid shape recovery
* Projective alignment of range and parallax data
* Proposal generation for object detection using cascaded ranking SVMs
* Query-specific visual semantic spaces for web image re-ranking
* Radiometric calibration by transform invariant low-rank structure
* Random field topic model for semantic region analysis in crowded scenes from tracklets
* Random maximum margin hashing
* rank-order distance based clustering algorithm for face tagging, A
* Rank-SIFT: Learning to rank repeatable local interest points
* Real Time Head Pose Estimation with Random Regression Forests
* Real-Time Human Pose Recognition in Parts from Single Depth Images
* Real-time visual tracking using compressive sensing
* Recognition using visual phrases
* Recognizing human actions by attributes
* Reconstructing an image from its local descriptors
* Reconstruction of relief objects from line drawings
* Recovering shape from a single image of a mirrored surface from curvature constraints
* Recovery of corrupted low-rank matrices via half-quadratic based nonconvex minimization
* Rectification and 3D reconstruction of curved document images
* Reduced epipolar cost for accelerated incremental SfM
* Reflection detection in image sequences
* Registration for 3D surfaces with large deformations using quasi-conformal curvature flow
* Registration of camera captured documents under non-rigid deformation
* Regression-based label fusion for multi-atlas segmentation
* Relative pose problem for non-overlapping surveillance cameras with known gravity vector
* Repetition-based dense single-view reconstruction
* Robust and efficient regularized boosting using total Bregman divergence
* Robust classification using structured sparse representation
* Robust discriminative wire structure modeling with application to stent enhancement in fluoroscopy
* robust method for vector field learning with application to mismatch removing, A
* Robust point set registration using EM-ICP with information-theoretically optimal outlier handling
* Robust sparse coding for face recognition
* Robust tracking using local sparse appearance model and K-selection
* RUNE-Tag: A high accuracy fiducial marker with strong occlusion resilience
* Saliency estimation using a non-parametric low-level vision model
* Salient coding for image classification
* Sampling bedrooms
* scalable dual approach to semidefinite metric learning, A
* Scalable multi-class object detection
* Scale and Rotation Invariant Matching Using Linearly Augmented Trees
* Scale invariant cosegmentation for image groups
* Scenario-based video event recognition by constraint flow
* Scene flow estimation by growing correspondence seeds
* Scene shape from texture of objects
* Segment an image by looking into an image corpus
* segmentation-aware object detection model with occlusion handling, A
* Semantic structure from motion
* Semi-Supervised Video Segmentation Using Tree Structured Graphical Models
* Separating reflective and fluorescent components of an image
* Shape estimation in natural illumination
* Shape from specular flow: Is one flow enough?
* Shape grammar parsing via Reinforcement Learning
* Shape-based pedestrian parsing
* Shared parts for deformable part-based models
* Sharing features between objects and their attributes
* Simulating human saccadic scanpaths on natural images
* Simultaneous dimensionality reduction and human age estimation via kernel partial least squares regression
* Single Image Super-Resolution Using Gaussian Process Regression
* Single-image shadow detection and removal using paired regions
* sLLE: Spherical locally linear embedding with applications to tomography
* Smoothly varying affine stitching
* Sobolev-type metric for polar active contours, A
* Space-time super-resolution from a single video
* Sparse approximated nearest points for image set classification
* Sparse concept coding for visual analysis
* Sparse image representation with epitomes
* Sparse reconstruction cost for abnormal event detection
* Sparse shape composition: A new framework for shape prior modeling
* Sparsity-based image denoising via dictionary learning and structural clustering
* Spatial-DiscLDA for visual recognition
* Stable multi-target tracking in real-time surveillance video
* Statistics of real-world hyperspectral images
* Structure from motion blur in low light
* Structure from motion for scenes with large duplicate structures
* Structure-from-motion based hand-eye calibration using L-inf minimization
* Structured light 3D scanning in the presence of global illumination
* study of Nesterov's scheme for Lagrangian decomposition and MAP labeling, A
* Style transfer matrix learning for writer adaptation
* Submodular decomposition framework for inference in associative Markov networks with global constraints
* Submodularity beyond submodular energies: Coupling edges in graph cuts
* Supervised hierarchical Pitman-Yor process for natural scene segmentation
* Supervised hypergraph labeling
* Supervised local subspace learning for continuous head pose estimation
* Support tucker machines
* Symmetric piecewise planar object reconstruction from a single image
* Tag localization with spatial correlations and joint group sparsity
* TaylorBoost: First and second-order boosting algorithms with explicit margin control
* theory of differential photometric stereo for unknown isotropic BRDFs, A
* theory of multi-perspective defocusing, A
* Three-dimensional kaleidoscopic imaging
* Time and space efficient spectral clustering via column sampling
* Topologically-robust 3D shape matching based on diffusion geometry and seed growing
* Topology-adaptive multi-view photometric stereo
* Total recall II: Query expansion revisited
* Total variation for cyclic structures: Convex relaxation and efficient minimization
* Towards a practical lipreading system
* Towards cross-category knowledge propagation for learning visual concepts
* Tracking 3D human pose with large root node uncertainty
* Tracking low resolution objects by metric preservation
* Translation symmetry detection in a fronto-parallel view
* TVParser: An automatic TV video parsing method
* two-stage reconstruction approach for seeing through water, A
* Unbiased look at dataset bias
* Uncovering vein patterns from color skin images for forensic analysis
* unified framework for locating and recognizing human actions, A
* Unsupervised auxiliary visual words discovery for large-scale image object retrieval
* Unsupervised local color correction for coarsely registered images
* Unsupervised random forest indexing for fast action search
* Using 3D scene structure to improve tracking
* Using global bag of features models in random fields for joint categorization and segmentation of objects
* Using Ripley's K-function to improve graph-based clustering techniques
* Using specular highlights as pose invariant features for 2D-3D pose estimation
* Variable grouping for energy minimization
* Vehicle tracking across nonoverlapping cameras using joint kinematic and appearance features
* Visual and semantic similarity in ImageNet
* Visual saliency detection by spatially weighted dissimilarity
* Visual textures as realizations of multivariate log-Gaussian Cox processes
* Wavelet belief propagation for large scale inference problems
* What makes a chair a chair?
* What makes an image memorable?
* What you saw is not what you get: Domain adaptation using asymmetric kernel transforms
* Where's Waldo: Matching people in images of crowds
* Which parts of the face give out your identity?
* Who are you with and where are you going?
* Wide-angle micro sensors for vision on a tight budget
* Wide-baseline stereo for face recognition with large pose variation
439 for CVPR11

CVPR12 * *CVPR
* 2.5D building modeling by discovering global regularities
* 2D/3D rotation-invariant detection using equivariant filters and kernel weighted mapping
* 3D Constrained Local Model for rigid and non-rigid facial tracking
* 3D extension to cortex like mechanisms for 3D object class recognition, A
* 3D visual phrases for landmark recognition
* A-Optimal Non-negative Projection for image representation
* Accidental pinhole and pinspeck cameras: Revealing the scene outside the picture
* Action bank: A high-level representation of activity in video
* Action recognition by exploring data distribution and feature correlation
* Actionable saliency detection: Independent motion detection without independent motion estimation
* Active attentional sampling for speed-up of background subtraction
* Active image clustering: Seeking constraints from humans to complement algorithms
* Active learning for semantic segmentation with expected change
* Adaptive figure-ground classification
* Affine-invariant, elastic shape analysis of planar contours
* Affinity aggregation for spectral clustering
* Affinity learning via self-diffusion for image segmentation and clustering
* Aligning images in the wild
* analysis of color demosaicing in plenoptic cameras, An
* Angular domain reconstruction of dynamic 3D fluid surfaces
* Application of the mean field methods to MRF optimization in computer vision
* Are we ready for autonomous driving? The KITTI vision benchmark suite
* Articulated people detection and pose estimation: Reshaping the future
* Articulated pose estimation with parts connectivity using discriminative local oriented contours
* Augmenting deformable part models with irregular-shaped object patches
* Auto face re-ranking by mining the web and video archives
* Automated annotation of coral reef survey images
* Automated quantitative description of spiral galaxy arm-segment structure
* Automated reconstruction of tree structures using path classifiers and Mixed Integer Programming
* Automatic discovery of groups of objects for scene understanding
* Automatic mitral leaflet tracking in echocardiography by outlier detection in the low-rank representation
* Automatic upright adjustment of photographs
* Autonomous cleaning of corrupted scanned documents: A generative modeling approach
* AVA: A large-scale database for aesthetic visual analysis
* Background modeling using adaptive pixelwise kernel variances in a hybrid feature space
* Bag of textons for image segmentation via soft clustering and convex shift
* Batch mode Adaptive Multiple Instance Learning for computer vision tasks
* Bayesian geometric modeling of indoor scenes
* Beyond spatial pyramids: Receptive field learning for pooled image features
* Bilevel sparse coding for coupled feature spaces
* biquadratic reflectance model for radiometric image analysis, A
* Bispectral photometric stereo based on fluorescence
* Boosting algorithms for simultaneous feature extraction and selection
* Boosting bottom-up and top-down visual features for saliency estimation
* branch-and-bound algorithm for globally optimal hand-eye calibration, A
* Branch-and-price global optimization for multi-view multi-target tracking
* Bridging the past, present and future: Modeling scene activities from event relationships and global rules
* Building a dictionary of image fragments
* bundle approach to efficient MAP-inference by Lagrangian relaxation, A
* Cache-efficient graph cuts on structured grids
* Camera spectral sensitivity estimation from a single image under unknown illumination by using fluorescence
* Cats and dogs
* Center-Shift: An approach towards automatic robust mesh segmentation (ARMS)
* Chebyshev approximations to the histogram χ^2 kernel
* City scale geo-spatial trajectory estimation of a moving camera
* Classifying covert photographs
* closed-form solution to uncalibrated photometric stereo via diffuse maxima, A
* codebook-free and annotation-free approach for fine-grained image categorization, A
* Collection flow
* Color attributes for object detection
* Color constancy using faces
* combined pose, object, and feature model for action understanding, A
* Complex loss optimization via dual decomposition
* Compressive depth map acquisition using a single photon-counting detector: Parametric signal processing meets sparsity
* Computer vision aided target linked radiation imaging
* Computing nearest-neighbor fields via Propagation-Assisted KD-Trees
* Conditional regression forests for human pose estimation
* Connected contours: A new contour completion model that respects the closure effect
* Connecting the dots in multi-class classification: From nearest subspace to collaborative representation
* Consistent depth maps recovery from a trinocular video sequence
* constrained latent variable model, A
* Context aware topic model for scene recognition
* Contextual boost for pedestrian detection
* contextual maximum likelihood framework for modeling image registration, A
* Contour-based recognition
* Convex reduction of high-dimensional kernels for visual classification
* convex representation for the vectorial Mumford-Shah functional, A
* Coupling detection and data association for multiple object tracking
* Covariance discriminative learning: A natural and efficient approach to image set classification
* Cross-based local multipoint filtering
* Cross-view activity recognition using Hankelets
* Curvature-based regularization for surface approximation
* Customizing biometric authentication systems via discriminative score calibration
* D-Nets: Beyond patch-based image descriptors
* data driven method for feature transformation, A
* data-driven approach for facial expression synthesis in video, A
* database for fine grained activity detection of cooking activities, A
* DCMSVM: Distributed parallel training for single-machine multiclass classifiers
* Decentralized particle filter for joint individual-group tracking
* Decomposing and regularizing sparse/non-sparse components for motion field estimation
* Decomposing Global Light Transport Using Time of Flight Imaging
* Dense Lagrangian motion estimation with occlusions
* Dense reconstruction on-the-fly
* Depth from optical turbulence
* Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation
* Detecting activities of daily living in first-person camera views
* Detecting regions of interest in dynamic scenes with camera motions
* Detecting texts of arbitrary orientations in natural images
* Detection by detections: Non-parametric detector adaptation for a video
* Discovering and exploiting 3D symmetries in structure from motion
* Discovering discriminative action parts from mid-level video representations
* Discovering important people and objects for egocentric video summarization
* Discovering localized attributes for fine-grained recognition
* Discrete texture traces: Topological representation of geometric context
* Discrete-continuous optimization for multi-target tracking
* Discriminant Image Filter Learning for Face Recognition with Local Binary Pattern Like Representation
* Discriminately decreasing discriminability with learned image filters
* discriminative deep model for pedestrian detection with occlusion handling, A
* Discriminative feature fusion for image classification
* Discriminative Illumination: Per-Pixel Classification of Raw Materials Based on Optimal Projections of Spectral BRDF
* Discriminative spatial saliency for image classification
* Discriminative virtual views for cross-view action recognition
* Distribution fields for tracking
* Dynamic scene understanding: The role of orientation features in space and time in scene classification
* Edge-preserving photometric stereo via depth fusion
* Efficient activity detection with max-subgraph search
* Efficient automatic 3D-reconstruction of branching neurons from EM data
* efficient branch-and-bound algorithm for optimal human pose estimation, An
* Efficient discriminative learning of parametric nearest neighbor classifiers
* Efficient inference for fully-connected CRFs with stationarity
* Efficient object detection using cascades of nearest convex model classifiers
* Efficient online structured output learning for keypoint-based object tracking
* Efficient structure detection via random consensus graph
* Efficient structured prediction for 3D indoor scene understanding
* Enhancing underwater images and videos by fusion
* Estimating the aspect layout of object categories
* Evaluation of low-level features and their combinations for complex event detection in open source videos
* Evaluation of super-voxel methods for early video processing
* Example-based 3D object reconstruction from line drawings
* Example-based cross-modal denoising
* Exemplar-based human action pose correction and tagging
* Exploiting local and global patch rarities for saliency detection
* Exploiting nonlocal spatiotemporal structure for video segmentation
* Exploiting web images for event recognition in consumer videos: A multiple source domain adaptation approach
* Face Alignment by Explicit Shape Regression
* Face detection, pose estimation, and landmark localization in the wild
* Facial expression editing in video using a temporally-smooth factorization
* Factorized Graph Matching
* Factorizing appearance using epitomic flobject analysis
* Fan Shape Model for object detection
* Fast algorithms for structured robust principal component analysis
* Fast and globally optimal single view reconstruction of curved objects
* Fast approximate k-means via cluster closures
* Fast axis estimation from a segment of rotationally symmetric object
* Fast computation of min-Hash signatures for image collections
* Fast dynamic programming for labeling problems with ordering constraints
* fast nearest neighbor search algorithm by nonlinear embedding, A
* Fast radial symmetry detection under affine transformations
* Fast recursive ensemble convolution of Haar-like features
* Fast search in Hamming space with multi-index hashing
* Feature-domain super-resolution framework for Gabor-based face and iris recognition
* Figure-ground segmentation by transferring window masks
* Finite Element based sequential Bayesian Non-Rigid Structure from Motion
* Fixed-rank representation for unsupervised visual learning
* flow model for joint action recognition and identity maintenance, A
* Foreground detection using spatiotemporal projection kernels
* FREAK: Fast Retina Keypoint
* From label fusion to correspondence fusion: A new approach to unbiased groupwise registration
* From Pictorial Structures to deformable structures
* From pixels to physics: Probabilistic color de-rendering
* game-theoretic approach to deformable shape matching, A
* General and nested Wiberg minimization
* General trajectory prior for Non-Rigid reconstruction
* Generalized Multiview Analysis: A discriminative latent space
* Generalized time warping for multi-modal alignment of human motion
* Generalizing Wiberg algorithm for rigid and nonrigid factorizations with missing components and metric constraints
* Geodesic flow kernel for unsupervised domain adaptation
* Geometric understanding of point clouds using Laplace-Beltrami operator
* Geometry Constrained Sparse Coding for Single Image Super-Resolution
* Globally consistent depth labeling of 4D light fields
* Globally optimal hand-eye calibration
* Globally optimal line clustering and vanishing point estimation in Manhattan world
* Graph cuts optimization for multi-limb human segmentation in depth maps
* Graph-based detection, segmentation and characterization of brain tumors
* Graph-guided sparse reconstruction for region tagging
* Group action induced distances for averaging and clustering Linear Dynamical Systems with applications to the analysis of dynamic scenes
* Growing a bag of systems tree for fast and accurate classification
* Hand tracking by binary quadratic programming and its application to retail activity recognition
* Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition
* Hierarchical face parsing via deep learning
* hierarchical image clustering cosegmentation framework, A
* Hierarchical matching with side information for image classification
* Higher level segmentation: Detecting and grouping of invariant repetitive patterns
* Higher order motion models and spectral clustering
* Icon scanning: Towards next generation QR codes
* Identigram/watermark removal using cross-channel correlation
* Image categorization using Fisher kernels of non-iid image models
* Image collection summarization via dictionary learning for sparse representation
* Image denoising: Can plain neural networks compete with BM3D?
* Image description with a goal: Building efficient discriminating expressions for images
* Image matching using local symmetry features
* Image search results refinement via outlier detection using deep contexts
* Image sets alignment for Video-Based Face Recognition
* image torque operator: A new tool for mid-level vision, The
* Improved facial expression recognition via uni-hyperplane classification
* Improved subspace clustering via exploitation of spatial constraints
* Improving multi-target tracking via social grouping
* Incremental gradient on the Grassmannian for online foreground and background separation in subsampled video
* Interactive object detection
* Intrinsic shape context descriptors for deformable shapes
* Inverted Multi-Index, The
* Irregular lattices for complex shape grammar facade parsing
* Isogeometric finite-elements methods and variational reconstruction tasks in vision: A perfect match
* Iterative Nearest Neighbors for classification and dimensionality reduction
* Jigsaw puzzles with pieces of unknown orientation
* Joint 2D-3D temporally consistent semantic segmentation of street scenes
* Joint motion estimation and segmentation of complex scenes with label costs and occlusion modeling
* KNN Matting
* Knock! Knock! Who is it? probabilistic person identification in TV-series
* l2,1 Regularized correntropy for robust feature selection
* Large scale metric learning from equivalence constraints
* Large-scale image classification with trace-norm regularization
* Large-scale knowledge transfer for object localization in ImageNet
* Laser speckle photography for surface tampering detection
* Layered segmentation and optical flow estimation over time
* Learning 3D object templates by hierarchical quantization of geometry and appearance spaces
* Learning active facial patches for expression analysis
* Learning an object class representation on a continuous viewsphere
* Learning attention map from images
* learning based deformable template matching method for automatic rib centerline extraction and labeling in CT images, A
* Learning contour-fragment-based shape model with And-Or tree representation
* Learning hierarchical representations for face verification with convolutional deep belief networks
* Learning hierarchical similarity metrics
* Learning image-specific parameters for interactive segmentation
* Learning inter-related visual dictionary for object recognition
* Learning latent temporal structure for complex event detection
* Learning object class detectors from weakly annotated video
* Learning object relationships via graph-based context model
* Learning ordinal discriminative features for age estimation
* Learning rotation-aware features: From invariant priors to equivariant descriptors
* Learning shared body plans
* Learning sparse covariance patterns for natural scenes
* Learning structural element patch models with hierarchical palettes
* Learning the right model: Efficient max-margin learning in Laplacian CRFs
* Learning to localize detected objects
* Learning to segment dense cell nuclei with shape prior
* learning-based framework for depth ordering, A
* Leveraging category-level labels for instance-level image retrieval
* Leveraging stereopsis for saliency analysis
* line-structure-preserving approach to image resizing, A
* Linear discriminative image processing operator analysis
* Linear solution to scale invariant global figure ground separation
* Local Naive Bayes Nearest Neighbor for image classification
* Locality-constrained and spatially regularized coding for scene categorization
* Locally Orderless Tracking
* Low level vision via switchable Markov random fields
* Low-rank matrix recovery with structural incoherence for robust face recognition
* Making minimal solvers fast
* Manifold guided composite of Markov random fields for image modeling
* MAP-MRF inference based on extended junction tree representation
* Markov Weight Fields for face sketch synthesis
* Matrix completion by Truncated Nuclear Norm Regularization
* Max-Margin Early Event Detectors
* Maximum weight cliques with mutex constraints for video object segmentation
* Memory constrained face recognition
* Meta-class features for large-scale object categorization on a budget
* Metric learning with two-dimensional smoothness for visual analysis
* Micro Phase Shifting
* Mining actionlet ensemble for action recognition with depth cameras
* Mobile object detection through client-server based vote transfer
* Mobile product search with Bag of Hash Bits and boundary reranking
* Mode-seeking on graphs via random walks
* Model recommendation for action recognition
* Modeling and correction of multipath interference in time of flight cameras
* Modulation transfer function of patch-based stereo systems
* Monotonicity and Error Type Differentiability in Performance Measures for Target Detection and Tracking in Video
* Motion-aware noise filtering for deblurring of noisy and blurry images
* Multi view registration for novelty/background separation
* Multi-attribute spaces: Calibration for attribute fusion and similarity search
* Multi-class cosegmentation
* Multi-column deep neural networks for image classification
* Multi-feature metric learning with knowledge transfer among semantics and social tagging
* Multi-label ReliefF and F-statistic feature selections for image annotation
* Multi-Output Laplacian Dynamic Ordinal Regression for Facial Expression Recognition and Intensity Estimation
* Multi-pedestrian detection in crowded scenes: A global view
* Multi-scale dictionary for single image super-resolution
* Multi-target tracking by online learning of non-linear motion patterns and robust appearance models
* Multi-view hair capture using orientation fields
* Multi-view latent variable discriminative models for action recognition
* Multiclass pixel labeling with non-local matching constraints
* Multimodal feature fusion for robust event detection in web videos
* Multiple Clustered Instance Learning for Histopathology Cancer Image Classification, Segmentation and Clustering
* Multitarget data association with higher-order motion models
* Names and shades of color for intrinsic image estimation
* Neighborhood Repulsed Metric Learning for Kinship Verification
* new convexity measurement for 3D meshes, A
* new mirror-based extrinsic camera calibration using an orthogonality constraint, A
* non-local cost aggregation method for stereo matching, A
* Non-Negative Low Rank and Sparse Graph for Semi-Supervised Learning
* Non-sparse linear representations for visual tracking with online reservoir metric learning
* Nonlinear camera response functions and image deblurring
* Nonparametric image parsing using adaptive neighbor sets
* Nonparametric kernel estimators for image classification
* Nonparametric learning for layered segmentation of natural images
* Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking
* Occlusion Reasoning for Object Detection Under Arbitrary Viewpoint
* Omni-range spatial contexts for visual classification
* On multiple foreground cosegmentation
* On partial least squares in head pose estimation: How to simultaneously deal with misalignment
* On SIFTs and their scales
* On template-based reconstruction from a single view: Analytical solutions and proofs of well-posedness for developable, isometric and conformal surfaces
* On the dimensionality of video bricks under varying illumination
* On the regularization of image semantics by modal expansion
* Online content-aware video condensation
* Online continuous stereo extrinsic parameter estimation
* Online incremental attribute-based zero-shot learning
* online learned CRF model for multi-target tracking, An
* Online robust image alignment via iterative convex optimization
* Optical flow in the presence of spatially-varying motion blur
* Optimal integration of photometric and geometric surface measurements using inaccurate reflectance/illumination knowledge
* optimized DBN-based mode-focussing particle filter, An
* Order determination and sparsity-regularized metric learning adaptive visual tracking
* Parameter-free/Pareto-driven procedural 3D reconstruction of buildings from ground-level sequences
* Parsing clothing in fashion photographs
* Parsing Façade with rank-one approximation
* Part-based multiple-person tracking with partial occlusion handling
* PCCA: A new approach for distance learning from sparse pairwise constraints
* Pedestrian detection at 100 frames per second
* Per-pixel translational symmetry detection, optimization, and segmentation
* Photometric stereo for outdoor webcams
* physically-based approach to reflection separation, A
* Pose pooling kernels for sub-category recognition
* Power mean SVM for large scale visual classification
* Power SVM: Generalization with exemplar classification uncertainty
* Practical low-rank matrix approximation under robust L1-norm
* Probabilistic learning of task-specific visual attention
* Progressive graph matching: Making a move of graphs via probabilistic voting
* Progressive shape models
* QsRank: Query-sensitive hash code ranking for efficient ε-neighbor search
* RALF: A reinforced active learning formulation for object class recognition
* Random Cluster Model for Robust Geometric Fitting, The
* Random walks based multi-image segmentation: Quasiconvexity results and GPU-based solutions
* Randomized visual phrases for object search
* Real time robust L1 tracker using accelerated proximal gradient approach
* Real-time 6D stereo Visual Odometry with non-overlapping fields of view
* Real-time facial feature detection using conditional regression forests
* Real-time image-based 6-DOF localization in large-scale environments
* Real-time scene text localization and recognition
* Recognizing proxemics in personal photos
* Recognizing scene viewpoint using panoramic place representation
* Reconfigurable models for scene recognition
* Reconstruction of super-resolution lung 4D-CT using patch-based sparse representation
* Recovering free space of indoor scenes from a single image
* Refractive height fields from single and multiple images
* Regression Tree Fields: An efficient, non-parametric approach to image labeling problems
* regularized spectral algorithm for Hidden Markov Models with applications in computer vision, A
* Relaxed collaborative representation for pattern classification
* Revisiting uncertainty in graph cut solutions
* RGB-(D) scene labeling: Features and algorithms
* Riemannian approach for estimating orientation distribution function (ODF) images from high-angular resolution diffusion imaging (HARDI), A
* Robust and discriminative distance for Multi-Instance Learning
* Robust Boltzmann Machines for recognition and denoising
* Robust camera self-calibration from monocular images of Manhattan worlds
* Robust late fusion with rank minimization
* Robust Maximum Likelihood estimation by sparse bundle adjustment using the L1 norm
* Robust Non-negative Graph Embedding: Towards noisy data, unreliable graphs, and noisy labels
* Robust non-rigid registration of 2D and 3D graphs
* Robust nonrigid ICP using outlier-sparsity regularization
* Robust object tracking via sparsity-based collaborative model
* Robust photometric stereo using sparse regression
* Robust plane-based structure from motion
* Robust stereo with flash and no-flash image pairs
* Robust tracking via weakly supervised ranking SVM
* Robust visual domain adaptation with low-rank reconstruction
* Robust visual tracking using autoregressive hidden Markov Model
* Robust Visual Tracking Via Multi-Task Sparse Learning
* role of image understanding in contour detection, The
* Rolling shutter bundle adjustment
* Saliency filters: Contrast based filtering for salient region detection
* Saliency-guided integration of multiple scans
* Salient object detection for searched web images via global saliency
* Sasaki metrics for analysis of longitudinal data on manifolds
* Scalable action recognition with a subspace forest
* Scalable k-NN graph construction for visual descriptors
* scale of edges, The
* Scale resilient, rotation invariant articulated object matching
* Scene warping: Layer-based stereoscopic image resizing
* Schematic surface reconstruction
* Schrödinger distance transform (SDT) for point-sets and curves, The
* See all by looking at a few: Sparse modeling for finding representative objects
* Seeded watershed cut uncertainty estimators for guided interactive segmentation
* Seeing double without confusion: Structure-from-motion in highly ambiguous scenes
* Seeing through the blur
* Segmentation using superpixels: A bipartite graph partitioning approach
* Semantic segmentation using regions and parts
* Semantic structure from motion with points, regions, and objects
* Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis
* Shape Boltzmann Machine: A Strong Model of Object Shape, The
* Shape, albedo, and illumination from a single image of an unknown object
* Sharing features in multi-class boosting via group sparsity
* Shrink boost for selecting multi-LBP histogram features in object detection
* Sign Language Recognition using Sequential Pattern Trees
* Simple Prior-Free Method for Non-rigid Structure-from-Motion Factorization, A
* Single image 3D human pose estimation from noisy observations
* Single image multimaterial estimation
* Small sample scene categorization from perceptual relations
* Social behavior recognition in continuous video
* Social interactions: A first-person perspective
* Social roles in hierarchical models for human activity recognition
* Sparse Bayesian multi-task learning for predicting cognitive outcomes from neuroimaging measures in Alzheimer's disease
* Sparse kernel approximations for efficient classification and detection
* Sparse representation for blind image quality assessment
* Sparse representation for face recognition based on discriminative low-rank dictionary learning
* Spatial bias in multi-atlas based segmentation
* Spatio-temporal motion tracking with unsynchronized cameras
* Spherical hashing
* Steerable part models
* Stream-based joint exploration-exploitation active learning
* Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set
* Structure and motion from scene registration
* Structured Local Predictors for image labelling
* study on human age estimation under facial expression changes, A
* Submodular dictionary learning for sparse coding
* Substructure and boundary modeling for continuous action recognition
* Sum-product networks for modeling activities with stochastic structure
* SUN attribute database: Discovering, annotating, and recognizing scene attributes
* Superedge grouping for object localization by combining appearance and shape information
* Supervised hashing with kernels
* Surface Regions of Interest for Viewpoint Selection
* SURFing the point clouds: Selective 3D spatial pyramids for category-level object recognition
* Synthesizing oil painting surface geometry from a single photograph
* Teaching 3D geometry to deformable part models
* theory of multi-layer flat refractive geometry, A
* Three things everyone should know to improve object retrieval
* tiered move-making algorithm for general pairwise MRFs, A
* Top-down and bottom-up cues for scene text recognition
* Top-Down Visual Saliency via Joint CRF and Dictionary Learning
* Towards compact topical descriptors
* Towards good practice in large-scale learning for image classification
* Tracking the articulated motion of two strongly interacting hands
* Transfer re-identification: From person to set-based verification
* Transferring a generic pedestrian detector towards specific scenes
* Twisted window search for efficient shape localization
* two-stage approach to blind spatially-varying motion deblurring, A
* Understanding and predicting importance in images
* Understanding collective crowd behaviors: Learning a Mixture model of Dynamic pedestrian-Agents
* unified approach to salient object detection via low rank matrix recovery, A
* unified framework for event summarization and rare event detection, A
* unifying resolution-independent formulation for early vision, A
* (Unseen) event recognition via semantic compositionality
* Unsupervised co-segmentation through region matching
* Unsupervised feature learning framework for no-reference image quality assessment
* Unsupervised incremental learning for improved object detection in a video
* Unsupervised learning of translation invariant occlusive components
* Unsupervised metric fusion by cross diffusion
* Unsupervised Object Class Discovery via Saliency-Guided Multiple Class Learning
* use of on-line co-training to reduce the training set size in pattern recognition methods: Application to left ventricle segmentation in ultrasound, The
* Vector array based Multi-View Face Detection with compound exemplars
* Video anomaly detection based on local statistical aggregates
* Video from nearly still: An application to low frame-rate gait recognition
* Video segmentation by tracing discontinuities in a trajectory embedding
* Video stabilization with a depth camera
* Visibility Based Preconditioning for bundle adjustment
* Visual stem mapping and Geometric Tense coding for Augmented Visual Vocabulary
* Visual Tracking Via Adaptive Structural Local Sparse Appearance Model
* Vitruvian manifold: Inferring dense correspondences for one-shot human pose estimation, The
* We are not contortionists: Coupled adaptive learning for head and body orientation estimation in surveillance video
* Weak attributes for large-scale image retrieval
* Weakly supervised sparse coding with geometric consistency pooling
* Weakly supervised structured output learning for semantic segmentation
* Weighted Color and Texture Sample Selection for Image Matting
* What are good parts for hair shape modeling?
* What are we looking for: Towards statistical modeling of saccadic eye movements and visual saliency
* What has my classifier learned? Visualizing the classification rules of bag-of-feature model by support region detection
* What is optimized in tight convex relaxations for multi-label problems?
* WhittleSearch: Image search with relative attribute feedback
467 for CVPR12

CVPR13 * *CVPR
* 3D Pictorial Structures for Multiple View Articulated Pose Estimation
* 3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
* 3D-Based Reasoning with Blocks, Support, and Stability
* 3D-R Transform on Spatio-temporal Interest Points for Action Recognition
* Accurate and Robust Registration of Nonrigid Surface Using Hierarchical Statistical Shape Model
* Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
* Action Recognition by Hierarchical Sequence Summarization
* Active Contours with Group Similarity
* Adaptive Active Learning for Image Classification
* Adaptive Compressed Tomography Sensing
* Adding Unlabeled Samples to Categories by Learned Attributes
* Adherent Raindrop Detection and Removal in Video
* All About VLAD
* Alternating Decision Forests
* Analytic Bilinear Appearance Subspace Construction for Modeling Image Irradiance under Natural Illumination and Non-Lambertian Reflectance
* Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
* Approach to Pose-Based Action Recognition, An
* Area Preserving Brain Mapping
* Articulated and Restricted Motion Subspaces and Their Signatures
* Articulated Pose Estimation Using Discriminative Armlet Classifiers
* As-Projective-As-Possible Image Stitching with Moving DLT
* Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
* Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition
* Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling
* Auxiliary Cuts for General Classes of Higher Order Functionals
* Axially Symmetric 3D Pots Configuration System Using Axis of Symmetry and Break Curve
* Background Modeling Based on Bidirectional Analysis
* Bayesian Approach to Multimodal Visual Dictionary Learning, A
* Bayesian Depth-from-Defocus with Shading Constraints
* Bayesian Grammar Learning for Inverse Procedural Modeling
* Beta Process Joint Dictionary Learning for Coupled Feature Spaces with Application to Single Image Super-Resolution
* Better Exploiting Motion for Better Action Recognition
* Beyond Physical Connections: Tree Models in Human Pose Estimation
* Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
* BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification
* Bilinear Programming for Human Activity Recognition with Unknown MRF Graphs
* Binary Code Ranking with Weighted Hamming Distance
* Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification
* Blind Deconvolution of Widefield Fluorescence Microscopic Data by Regularization of the Optical Transfer Function (OTF)
* Block and Group Regularized Sparse Modeling for Dictionary Learning
* Blocks That Shout: Distinctive Parts for Scene Classification
* Blur Processing Using Double Discrete Wavelet Transform
* Boosting Binary Keypoint Descriptors
* Bottom-Up Segmentation for Top-Down Detection
* Boundary Cues for 3D Object Shape Recovery
* Boundary Detection Benchmarking: Beyond F-Measures
* BRDF Slices: Accurate Adaptive Anisotropic Appearance Acquisition
* Bringing Semantics into Focus Using Visual Abstraction
* Calibrating Photometric Stereo by Holistic Reflectance Symmetry Analysis
* Can a Fully Unconstrained Imaging Model Be Applied Effectively to Central Cameras?
* Capturing Complex Spatio-temporal Relations among Facial Muscles for Facial Expression Recognition
* Capturing Layers in Image Collections with Componential Models: From the Layered Epitome to the Componential Counting Grid
* Cartesian K-Means
* Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models
* City-Scale Change Detection in Cadastral 3D Models Using Images
* CLAM: Coupled Localization and Mapping with Efficient Outlier Handling
* Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
* Classification of Tumor Histology via Morphometric Context
* Cloud Motion as a Calibration Cue
* Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems, A
* Complex Event Detection via Multi-source Video Attributes
* Composite Statistical Inference for Semantic Segmentation
* Compressed Hashing
* Compressible Motion Fields
* Computationally Efficient Regression on a Dependency Graph for Human Pose Estimation
* Computing Diffeomorphic Paths for Large Motion Interpolation
* Consensus of k-NNs for Robust Neighborhood Selection on Graph-Based Manifolds
* Constrained Clustering and Its Application to Face Clustering in Videos
* Constraints as Features
* Context-Aware Modeling and Recognition of Activities in Video
* Continuous Inference in Graphical Models with Polynomial Energies
* Convex Regularizer for Reducing Color Artifact in Color Image Recovery, A
* Correlation Filters for Object Alignment
* Correspondence-Less Non-rigid Registration of Triangular Surface Meshes
* Cross-View Action Recognition via a Continuous Virtual Path
* Cross-View Image Geolocalization
* Crossing the Line: Crowd Counting by Integer Programming with Local Features
* Cumulative Attribute Space for Age and Crowd Density Estimation
* Decoding Children's Social Behavior
* Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras
* Deep Convolutional Network Cascade for Facial Point Detection
* Deep Learning Shape Priors for Object Segmentation
* Deformable Graph Matching
* Deformable Spatial Pyramid Matching for Fast Dense Correspondences
* Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
* Dense Non-rigid Point-Matching Using Random Projections
* Dense Object Reconstruction with Semantic Priors
* Dense Reconstruction Using 3D Object Shape Priors
* Dense Segmentation-Aware Descriptors
* Dense Variational Reconstruction of Non-rigid Surfaces from Monocular Video
* Depth Acquisition from Density Modulated Binary Patterns
* Depth Super Resolution by Rigid Body Self-Similarity in 3D
* Designing Category-Level Attributes for Discriminative Visual Recognition
* Detecting and Aligning Faces by Image Retrieval
* Detecting and Naming Actors in Movies Using Generative Appearance Models
* Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
* Detecting Pulse from Head Motions in Video
* Detection Evolution with Multi-order Contextual Co-occurrence
* Detection of Manipulation Action Consequences (MAC)
* Detection- and Trajectory-Level Exclusion in Multiple Object Tracking
* Determining Motion Directly from Normal Flows Upon the Use of a Spherical Eye Platform
* Dictionary Learning from Ambiguously Labeled Data
* Diffusion Processes for Retrieval Revisited
* Discovering the Structure of a Planar Mirror System from Multiple Observations of a Single Point
* Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space
* Discriminative Brain Effective Connectivity Analysis for Alzheimer's Disease: A Kernel Learning Approach upon Sparse Gaussian Bayesian Network
* Discriminative Color Descriptors
* Discriminative Non-blind Deblurring
* Discriminative Re-ranking of Diverse Segmentations
* Discriminative Segment Annotation in Weakly Labeled Video
* Discriminative Sub-categorization
* Discriminative Subspace Clustering
* Discriminatively Trained And-Or Tree Models for Object Detection
* Divide-and-Conquer Method for Scalable Low-Rank Latent Matrix Pursuit, A
* Dynamic Scene Classification: Learning Motion Descriptors with Slow Features Analysis
* Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition
* Efficient 3D Endfiring TRUS Prostate Segmentation with Globally Optimized Rotational Symmetry
* Efficient Color Boundary Detection with Color-Opponent Mechanisms
* Efficient Computation of Shortest Path-Concavity for 3D Meshes
* Efficient Detector Adaptation for Object Detection in a Video
* Efficient Large-Scale Structured Learning
* Efficient Maximum Appearance Search for Large-Scale Object Detection
* Efficient Object Detection and Segmentation for Fine-Grained Recognition
* Enriching Texture Analysis with Semantic Data
* Ensemble Learning for Confidence Measures in Stereo Vision
* Ensemble Video Object Cut in Highly Dynamic Scenes
* Episolar Constraint: Monocular Shape from Shadow Correspondence, The
* Evaluation of Color STIPs for Human Action Recognition
* Event Recognition in Videos by Learning from Heterogeneous Web Sources
* Event Retrieval in Large Video Collections with Circulant Temporal Encoding
* Exemplar-Based Face Parsing
* Expanded Parts Model for Human Attribute and Action Recognition in Still Images
* Explicit Occlusion Modeling for 3D Object Class Representations
* Exploiting the Power of Stereo Confidences
* Exploring Compositional High Order Pattern Potentials for Structured Output Learning
* Exploring Implicit Image Statistics for Visual Representativeness Modeling
* Exploring Weak Stabilization for Motion Feature Extraction
* Expressive Visual Text-to-Speech Using Active Appearance Models
* Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification
* Facial Feature Tracking Under Varying Facial Expressions and Face Poses Based on Restricted Boltzmann Machines
* Fast Approximate AIB Algorithm for Distributional Word Clustering, A
* Fast Convolutional Sparse Coding
* Fast Energy Minimization Using Learned State Filters
* Fast Image Super-Resolution Based on In-Place Example Regression
* Fast Multiple-Part Based Object Detection Using KD-Ferns
* Fast Object Detection with Entropy-Driven Evaluation
* Fast Patch-Based Denoising Using Approximated Patch Geodesic Paths
* Fast Rigid Motion Segmentation via Incrementally-Complex Local Models
* Fast Semidefinite Approach to Solving Binary Quadratic Problems, A
* Fast Trust Region for Segmentation
* Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
* FasT-Match: Fast Affine Template Matching
* Finding Group Interactions in Social Clutter
* Finding Things: Image Parsing with Regions and Per-Exemplar Detectors
* Fine-Grained Crowdsourcing for Fine-Grained Recognition
* First-Person Activity Recognition: What Are They Doing to Me?
* Five Shades of Grey for Fast and Reliable Camera Pose Estimation
* FrameBreak: Dramatic Image Extrapolation by Guided Shift-Maps
* From Local Similarity to Global Coding: An Application to Image Classification
* From N to N+1: Multiclass Transfer Incremental Learning
* Fully-Connected CRFs with Non-Parametric Pairwise Potential
* Fully-Connected Layered Model of Foreground and Background Flow, A
* Fusing Depth from Defocus and Stereo with Coded Apertures
* Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild
* Gauging Association Patterns of Chromosome Territories via Chromatic Median
* Generalized Domain-Adaptive Dictionaries
* Generalized Laplacian Distance and Its Applications for Visual Matching, The
* Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles, A
* GeoF: Geodesic Forests for Learning Coupled Predictors
* Geometric Context from Videos
* Global Approach for the Detection of Vanishing Points and Mutually Orthogonal Vanishing Directions, A
* Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
* Graph Matching with Anchor Nodes: A Learning Approach
* Graph Transduction Learning with Connectivity Constraints with Application to Multiple Foreground Cosegmentation
* Graph-Based Discriminative Learning for Location Recognition
* Graph-Based Optimization with Tubularity Markov Tree for 3D Vessel Segmentation
* Graph-Laplacian PCA: Closed-Form Solution and Robustness
* GRASP Recurring Patterns from a Single View
* Groupwise Registration via Graph Shrinkage on the Image Manifold
* Hallucinated Humans as the Hidden Context for Labeling 3D Scenes
* Handling Noise in Single Image Deblurring Using Directional Filters
* Harry Potter's Marauder's Map: Localizing and Tracking Multiple Persons-of-Interest by Nonnegative Discretization
* Harvesting Mid-level Visual Concepts from Large-Scale Internet Images
* Hash Bit Selection: A Unified Solution for Selection Problems in Hashing
* HDR Deghosting: How to Deal with Saturation?
* Heterogeneous Visual Features Fusion via Sparse Multimodal Machine
* Hierarchical Saliency Detection
* Hierarchical Video Representation with Trajectory Binary Partition Tree
* Higher-Order CRF Model for Road Network Extraction, A
* Histograms of Sparse Codes for Object Detection
* Hollywood 3D: Recognizing Actions in 3D Natural Scenes
* HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
* Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation
* Human Pose Estimation Using Body Parts Dependent Joint Regressors
* Hyperbolic Harmonic Mapping for Constrained Brain Surface Registration
* Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
* Illumination Estimation Based on Bilayer Sparse Coding
* Image Matting with Local and Nonlocal Smooth Priors
* Image Segmentation by Cascaded Region Agglomeration
* Image Tag Completion via Image-Specific and Tag-Specific Linear Sparse Reconstructions
* Image Understanding from Experts' Eyes by Modeling Perceptual Skill of Diagnostic Reasoning Processes
* Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces
* Improving an Object Detector and Extracting Regions Using Superpixels
* Improving Image Matting Using Comprehensive Sampling Sets
* Improving the Visual Comprehension of Point Sets
* In Defense of 3D-Label Stereo
* In Defense of Sparsity Based Face Recognition
* Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
* Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus
* Inductive Hashing on Manifolds
* Information Consensus for Distributed Multi-target Tracking
* Integrating Grammar and Segmentation for Human Pose Estimation
* Intrinsic Characterization of Dynamic Surfaces
* Intrinsic Scene Properties from a Single RGB-D Image
* Is There a Procedural Logic to Architecture?
* It's Not Polite to Point: Describing People with Uncertain Attributes
* Iterated L1 Algorithm for Non-smooth Non-convex Optimization in Computer Vision, An
* Joint 3D Scene Reconstruction and Class Segmentation
* Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment
* Joint Geodesic Upsampling of Depth Images
* Joint Model for 2D and 3D Pose Estimation from a Single Image, A
* Joint Sparsity-Based Representation and Analysis of Unconstrained Activities
* Joint Spectral Correspondence for Disparate Image Matching
* Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines
* K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes
* Kernel Learning for Extrinsic Classification of Manifold Features
* Kernel Methods on the Riemannian Manifold of Symmetric Positive Definite Matrices
* Kernel Null Space Methods for Novelty Detection
* Keypoints from Symmetries by Wave Propagation
* Label Propagation from ImageNet to 3D Point Clouds
* Label-Embedding for Attribute-Based Classification
* Large Displacement Optical Flow from Nearest Neighbor Fields
* Large-Scale Video Summarization Using Web-Image Priors
* Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
* Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration, A
* Learning a Manifold as an Atlas
* Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
* Learning by Associating Ambiguously Labeled Images
* Learning Class-to-Image Distance with Object Matchings
* Learning Collections of Part Models for Object Recognition
* Learning Compact Binary Codes for Visual Tracking
* Learning Cross-Domain Information Transfer for Location Recognition and Clustering
* Learning Discriminative Illumination and Filters for Raw Material Classification with Optimal Projections of Bidirectional Texture Functions
* Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets
* Learning Locally-Adaptive Decision Functions for Person Verification
* Learning Multiple Non-linear Sub-spaces Using K-RBMs
* Learning Separable Filters
* Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
* Learning Structured Low-Rank Representations for Image Classification
* Learning SURF Cascade for Fast and Accurate Object Detection
* Learning the Change for Automatic Image Cropping
* Learning to Detect Partially Overlapping Instances
* Learning to Estimate and Remove Non-uniform Image Blur
* Learning Video Saliency from Human Gaze Using Candidate Selection
* Learning without Human Scores for Blind Image Quality Assessment
* Least Soft-Threshold Squares Tracking
* Leveraging Structure from Motion to Learn Discriminative Codebooks for Scalable Landmark Classification
* Light Field Distortion Feature for Transparent Object Recognition
* Linear Approach to Matching Cuboids in RGBD Images, A
* Local Fisher Discriminant Analysis for Pedestrian Re-identification
* Locally Aligned Feature Transforms across Views
* Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
* Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection
* Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization
* LP-Norm IDF for Large Scale Image Search
* Machine Learning Approach for Non-blind Image Deconvolution, A
* Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes
* Manhattan Scene Understanding via XSlit Imaging
* Max-Margin Riffled Independence Model for Image Tag Ranking, A
* Maximum Cohesive Grid of Superpixels for Fast Object Localization
* Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation
* Measuring Crowd Collectiveness
* Megastereo: Constructing High-Resolution Stereo Panoramas
* Mesh Based Semantic Modelling for Indoor and Outdoor Scenes
* Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments, A
* Minimum Uncertainty Gap for Robust Visual Tracking
* Mirror Surface Reconstruction from a Single Image
* MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification
* MODEC: Multimodal Decomposable Models for Human Pose Estimation
* Modeling Actions through State Changes
* Modeling Mutual Visibility Relationship in Pedestrian Detection
* Monocular Template-Based 3D Reconstruction of Extensible Surfaces with Local Linear Elasticity
* Motion Estimation for Self-Driving Cars with a Generalized Camera
* Motionlets: Mid-level 3D Parts for Human Motion Recognition
* Multi-agent Event Detection: Localization and Role Assignment
* Multi-attribute Queries: To Merge or Not to Merge?
* Multi-class Video Co-segmentation with a Generative Multi-video Model
* Multi-image Blind Deblurring Using a Coupled Adaptive Sparse Prior
* Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
* Multi-resolution Shape Analysis via Non-Euclidean Wavelets: Applications to Mesh Segmentation and Surface Alignment Problems
* Multi-scale Curve Detection on Surfaces
* Multi-source Multi-scale Counting in Extremely Dense Crowd Images
* Multi-target Tracking by Lagrangian Relaxation to Min-cost Network Flow
* Multi-target Tracking by Rank-1 Tensor Approximation
* Multi-task Sparse Learning with Beta Process Prior for Action Recognition
* Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
* Multipath Sparse Coding Using Hierarchical Matching Pursuit
* New Model and Simple Algorithms for Multi-label Mumford-Shah Problems, A
* New Perspective on Uncalibrated Photometric Stereo, A
* Non-parametric Filtering for Geometric Detail Extraction and Material Representation
* Non-parametric Framework for Document Bleed-through Removal, A
* Non-rigid Structure from Motion with Diffusion Maps Prior
* Non-uniform Motion Deblurring for Bilayer Scenes
* Nonlinearly Constrained MRFs: Exploring the Intrinsic Dimensions of Higher-Order Cliques
* Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
* Object-Centric Anomaly Detection by Attribute-Based Reasoning
* Occlusion Patterns for Object Class Detection
* On a Link Between Kernel Mean Maps and Fraunhofer Diffraction, with an Application to Super-Resolution Beyond the Diffraction Limit
* Online Dominant and Anomalous Behavior Detection in Videos
* Online Object Tracking: A Benchmark
* Online Robust Dictionary Learning
* Optical Flow Estimation Using Laplacian Mesh Energy
* Optimal Geometric Fitting under the Truncated L2-Norm
* Optimized Pedestrian Detection for Multiple and Occluded People
* Optimized Product Quantization for Approximate Nearest Neighbor Search
* Optimizing 1-Nearest Prototype Classifiers
* Part Discovery from Partial Correspondence
* Part-Based Visual Tracking with Online Latent Structural Learning
* Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation
* Pattern-Driven Colorization of 3D Surfaces
* PDM-ENLOR: Learning Ensemble of Local PDM-Based Regressions
* Pedestrian Detection with Unsupervised Multi-stage Feature Learning
* Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
* Photometric Ambient Occlusion
* Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
* PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors
* Pixel-Level Hand Detection in Ego-centric Videos
* Plane-Based Content Preserving Warps for Video Stabilization
* POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation
* Pose from Flow and Flow from Pose
* Poselet Conditioned Pictorial Structures
* Poselet Key-Framing: A Model for Human Activity Recognition
* Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation, A
* Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display
* Principled Deep Random Field Model for Image Segmentation, A
* Probabilistic Elastic Matching for Pose Variant Face Verification
* Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation
* Probabilistic Label Trees for Efficient Large Scale Image Classification
* Procrustean Normal Distribution for Non-Rigid Structure from Motion
* Query Adaptive Similarity for Large Scale Object Retrieval
* Radial Distortion Self-Calibration
* Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues
* Real-Time No-Reference Image Quality Assessment Based on Filter Learning
* Recognize Human Activities from Partially Observed Videos
* Recognizing Activities via Bag of Words for Attribute Dynamics
* Reconstructing Gas Flows Using Light-Path Approximation
* Reconstructing Loopy Curvilinear Structures Using Integer Programming
* Recovering Line-Networks in Images by Junction-Point Processes
* Recovering Stereo Pairs from Anaglyphs
* Relative Hidden Markov Models for Evaluating Motion Skill
* Relative Volume Constraints for Single View 3D Reconstruction
* Representing and Discovering Adversarial Team Behaviors Using Player Roles
* Representing Videos Using Mid-level Discriminative Patches
* Revisiting Depth Layers from Occlusions
* Robust Canonical Time Warping for the Alignment of Grossly Corrupted Sequences
* Robust Discriminative Response Map Fitting with Constrained Local Models
* Robust Estimation of Nonrigid Transformation for Point Set Registration
* Robust Feature Matching with Alternate Hough and Inverted Hough Transforms
* Robust Monocular Epipolar Flow Estimation
* Robust Multi-resolution Pedestrian Detection in Traffic Scenes
* Robust Object Co-detection
* Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
* Robust Region Grouping via Internal Patch Statistics
* Rolling Riemannian Manifolds to Solve the Multi-class Classification Problem
* Rolling Shutter Camera Calibration
* Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination
* Saliency Aggregation: A Data-Driven Approach
* Saliency Detection via Graph-Based Manifold Ranking
* Salient Object Detection: A Discriminative Regional Feature Integration Approach
* Sample-Specific Late Fusion for Visual Category Recognition
* Sampling Strategies for Real-Time Action Recognition
* Scalable Sparse Subspace Clustering
* SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
* SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
* Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
* Scene Parsing by Integrating Function, Geometry and Appearance Models
* Scene Text Recognition Using Part-Based Tree-Structured Character Detection
* Seeking the Strongest Rigid Detector
* Segment-Tree Based Cost Aggregation for Stereo Matching
* Selective Transfer Machine for Personalized Facial Action Unit Detection
* Self-Paced Learning for Long-Term Tracking
* Semi-supervised Domain Adaptation with Instance Constraints
* Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
* Semi-supervised Learning with Constraints for Person Identification in Multimedia Data
* Semi-supervised Node Splitting for Random Forest Construction
* Sensing and Recognizing Surface Textures Using a GelSight Sensor
* Sentence Is Worth a Thousand Pixels, A
* Separable Dictionary Learning
* Separating Signal from Noise Using Patch Recurrence across Scales
* Shading-Based Shape Refinement of RGB-D Images
* Shape from Silhouette Probability Maps: Reconstruction of Thin Objects in the Presence of Silhouette Extraction and Calibration Error
* Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback
* Simultaneous Super-Resolution of Depth and Images Using a Single Camera
* Single Image Calibration of Multi-axial Imaging Systems
* Single-Pedestrian Detection Aided by Multi-pedestrian Detection
* Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
* Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
* SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
* Social Role Discovery in Human Events
* Sparse Output Coding for Large-Scale Visual Recognition
* Sparse Quantization for Patch Description
* Sparse Subspace Denoising for Image Manifolds
* Spatial Inference Machines
* Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera
* Spatiotemporal Deformable Part Models for Action Detection
* Spectral Modeling and Relighting of Reflective-Fluorescent Scenes
* Specular Reflection Separation Using Dark Channel Prior
* Statistical Model for Recreational Trails in Aerial Images, A
* Statistical Textural Distinctiveness for Salient Region Detection in Natural Images
* Stochastic Deconvolution
* Story-Driven Summarization for Egocentric Video
* Structure Preserving Object Tracking
* Structured Face Hallucination
* Studying Relationships between Human Gaze, Description, and Computer Vision
* Subcategory-Aware Object Classification
* Submodular Salient Region Detection
* Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
* Supervised Descent Method and Its Applications to Face Alignment
* Supervised Kernel Descriptors for Visual Recognition
* Supervised Semantic Gradient Extraction Using Linear-Time Optimization
* SVM-Minus Similarity Score for Video Face Recognition, The
* SWIGS: A Swift Guided Sampling Method
* Tag Taxonomy Aware Dictionary Learning for Region Tagging
* Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration
* Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure
* Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation
* Tensor-Based Human Body Modeling
* Texture Enhanced Image Denoising via Gradient Histogram Preservation
* Theory of Refractive Photo-Light-Path Triangulation, A
* Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching, A
* Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain
* Top-Down Segmentation of Non-rigid Visual Objects Using Derivative-Based Search on Sparse Manifolds
* Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior
* Towards Contactless, Low-Cost and Accurate 3D Fingerprint Identification
* Towards Efficient and Exact MAP-Inference for Large Scale Discrete Computer Vision Problems via Combinatorial Optimization
* Towards Fast and Accurate Segmentation
* Towards Pose Robust Face Recognition
* Tracking Human Pose by Tracking Symmetric Parts
* Tracking People and Their Objects
* Tracking Sports Players with Context-Conditioned Motion Models
* Transfer Sparse Coding for Robust Image Representation
* Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
* Unconstrained Monocular 3D Human Pose Estimation by Action Detection and Cross-Modality Regression Forest
* Understanding Bayesian Rooms Using Composite 3D Object Models
* Understanding Indoor Scenes Using 3D Geometric Phrases
* Underwater Camera Calibration Using Wavelength Triangulation
* Universality of the Local Marginal Polytope
* Unnatural L0 Sparse Representation for Natural Image Deblurring
* Unsupervised Joint Object Discovery and Segmentation in Internet Images
* Unsupervised Salience Learning for Person Re-identification
* Vantage Feature Frames for Fine-Grained Categorization
* Variational Structure of Disparity and Regularization of 4D Light Fields, The
* Video Editing with Temporal, Spatial and Appearance Consistency
* Video Enhancement of People Wearing Polarized Glasses: Darkening Reversal and Reflection Reduction
* Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions
* Video Representation Using Temporal Superpixels, A
* Visual Place Recognition with Repetitive Structures
* Visual Tracking via Locality Sensitive Histograms
* Voxel Cloud Connectivity Segmentation: Supervoxels for Point Clouds
* Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots
* Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
* Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines
* Weakly-Supervised Dual Clustering for Image Semantic Segmentation
* What Makes a Patch Distinct?
* What Object Motion Reveals about Shape with Unknown BRDF and Lighting
* What's in a Name? First Names as Facial Attributes
* Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow
* Wide-Baseline Hair Capture Using Strand-Based Refinement
* Winding Number for Region-Boundary Consistent Salient Contour Extraction
470 for CVPR13

CVPR14 * *CVPR
* 100+ Times Faster Weighted Median Filter (WMF)
* 2D Human Pose Estimation: New Benchmark and State of the Art Analysis
* 3D Modeling from Wide Baseline Range Scans Using Contour Coherence
* 3D Pictorial Structures for Multiple Human Pose Estimation
* 3D Pose from Motion for Cross-View Action Recognition via Non-linear Circulant Temporal Encoding
* 3D Reconstruction from Accidental Motion
* 3D Shape and Indirect Appearance by Structured Light Transport
* 3D-Aided Face Recognition Robust to Expression and Pose Variations
* 6 Seconds of Sound and Vision: Creativity in Micro-videos
* Accurate Localization and Pose Estimation for Large 3D Models
* Accurate Object Detection with Joint Classification-Regression Random Forests
* Action Localization with Tubelets from Motion
* Actionness Ranking with Lattice Conditional Ordinal Random Fields
* Active Annotation Translation
* Active Flattening of Curved Document Images via Two Structured Beams
* Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation
* Active Sampling for Subjective Image Quality Assessment
* Adaptive Color Attributes for Real-Time Visual Tracking
* Adaptive Object Retrieval with Kernel Reconstructive Hashing
* Adaptive Partial Differential Equation Learning for Visual Saliency Detection
* Additive Quantization for Extreme Vector Compression
* Aerial Reconstructions via Probabilistic Data Fusion
* Aliasing Detection and Reduction in Plenoptic Imaging
* Analysis by Synthesis: 3D Object Recognition by Object Reconstruction
* Anytime Recognition of Objects and Scenes
* Are Cars Just 3D Boxes? Jointly Estimating the 3D Shape of Multiple Objects
* Ask the Image: Supervised Pooling to Preserve Feature Locality
* Associative Embeddings for Large-Scale Knowledge Transfer with Self-Assessment
* Asymmetric Sparse Kernel Approximations for Large-Scale Visual Search
* Asymmetrical Gauss Mixture Models for Point Sets Matching
* Attributed Graph Mining and Matching: An Attempt to Define and Extract Soft Attributed Patterns
* Automated Estimator of Image Visual Realism Based on Human Cognition, An
* Automatic Construction of Deformable Models In-the-Wild
* Automatic Face Reenactment
* Automatic Feature Learning for Robust Shadow Detection
* Backscatter Compensated Photometric Stereo with 3 Sources
* Bags of Spacetime Energies for Dynamic Scene Recognition
* Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval
* Bayesian Active Appearance Models
* Bayesian Active Contours with Affine-Invariant, Elastic Shape Prior
* Bayesian Framework for the Local Configuration of Retinal Junctions, A
* Bayesian View Synthesis and Image-Based Rendering Principles
* Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision
* Beta Process Multiple Kernel Learning
* Better Feature Tracking through Subspace Constraints
* Better Shading for Better Shape Recovery
* Beyond Comparing Image Pairs: Setwise Active Learning for Relative Attributes
* Beyond Human Opinion Scores: Blind Image Quality Assessment Based on Synthetic Scores
* Bi-label Propagation for Generic Multiple Object Tracking
* BING: Binarized Normed Gradients for Objectness Estimation at 300fps
* Birdsnap: Large-Scale Fine-Grained Visual Categorization of Birds
* Blind Image Quality Assessment Using Semi-supervised Rectifier Networks
* Bregman Divergences for Infinite Dimensional Covariance Matrices
* Calibrating a Non-isotropic Near Point Light Source Using a Plane
* Camouflaging an Object from Many Viewpoints
* Capturing Long-Tail Distributions of Object Subcategories
* Cause and Effect Analysis of Motion Trajectories for Modeling Actions, A
* CID: Combined Image Denoising in Spatial and Frequency Domains Using Web Images
* Class Specific 3D Object Shape Priors Using Surface Normals
* Classification of Histology Sections via Multispectral Convolutional Sparse Coding
* Clothing Co-parsing by Joint Image Segmentation and Labeling
* Co-localization in Real-World Images
* Co-segmentation of Textured 3D Shapes with Sparse Annotations
* Collaborative Hashing
* Collective Matrix Factorization Hashing for Multimodal Data
* Color Transfer Using Probabilistic Moving Least Squares
* Compact and Discriminative Face Track Descriptor, A
* Compact Representation for Image Classification: To Choose or to Compress?
* Complex Activity Recognition Using Granger Constrained DBN (GCDBN) in Sports and Surveillance Video
* Complex Non-rigid Motion 3D Reconstruction by Union of Subspaces
* Compositional Model for Low-Dimensional Image Set Representation, A
* Confidence-Rated Multiple Instance Boosting for Object Detection
* Congruency-Based Reranking
* Constructing Robust Affinity Graphs for Spectral Clustering
* Context Driven Scene Parsing with Attention to Rare Classes
* Continuous Manifold Based Adaptation for Evolving Visual Domains
* Convex Relaxation of the Ambrosio-Tortorelli Elliptic Functionals for the Mumford-Shah Functional, A
* Convolutional Neural Networks for No-Reference Image Quality Assessment
* COSTA: Co-Occurrence Statistics for Zero-Shot Classification
* Covariance Descriptors for 3D Shape Matching and Retrieval
* Covariance Trees for 2D and 3D Processing
* Critical Configurations for Radial Distortion Self-Calibration
* Cross-Scale Cost Aggregation for Stereo Matching
* Cross-View Action Modeling, Learning, and Recognition
* Curvilinear Structure Tracking by Low Rank Tensor Approximation with Model Propagation
* Cut, Glue, & Cut: A Fast, Approximate Solver for Multicut Partitioning
* DAISY Filter Flow: A Generalized Discrete Approach to Dense Correspondences
* Data-Driven Flower Petal Modeling with Botany Priors
* Deblurring Low-Light Images with Light Streaks
* Deblurring Text Images via L0-Regularized Intensity and Gradient Prior
* Decomposable Nonlocal Tensor Dictionary Learning for Multispectral Image Denoising
* Decorrelated Vectorial Total Variation
* Decorrelating Semantic Visual Attributes by Resisting the Urge to Share
* Deep Fisher Kernels: End to End Learning of the Fisher Kernel GMM Parameters
* Deep Learning Face Representation from Predicting 10,000 Classes
* DeepFace: Closing the Gap to Human-Level Performance in Face Verification
* DeepPose: Human Pose Estimation via Deep Neural Networks
* DeepReID: Deep Filter Pairing Neural Network for Person Re-identification
* Deformable Object Matching via Deformation Decomposition Based 2D Label MRF
* Deformable Registration of Feature-Endowed Point Sets Based on Tensor Fields
* Dense Non-rigid Shape Correspondence Using Random Forests
* Dense Semantic Image Segmentation with Objects and Attributes
* Depth and Skeleton Associated Action Recognition without Online Accessible RGB-D Cameras
* Depth Enhancement via Low-Rank Matrix Completion
* Describing Textures in the Wild
* Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts
* Detecting Objects Using Deformation Dictionaries
* Diffuse Mirrors: 3D Reconstruction from Diffuse Indirect Illumination Using Inexpensive Time-of-Flight Sensors
* Dirichlet-Based Histogram Feature Transform for Image Classification
* DISCOVER: Discovering Important Segments for Classification of Video Events and Recounting
* Discrete-Continuous Depth Estimation from a Single Image
* Discrete-Continuous Gradient Orientation Estimation for Faster Image Segmentation
* Discriminative Blur Detection Features
* Discriminative Deep Metric Learning for Face Verification in the Wild
* Discriminative Feature-to-Point Matching in Image-Based Localization
* Discriminative Ferns Ensemble for Hand Pose Recognition
* Discriminative Hierarchical Modeling of Spatio-temporally Composable Human Activities
* Discriminative Sparse Inverse Covariance Matrix: Application in Brain Functional Network Classification
* Distance Encoded Product Quantization
* Diversity-Enhanced Condensation Algorithm and Its Application for Robust and Accurate Endoscope Three-Dimensional Motion Tracking
* DL-SFA: Deeply-Learned Slow Feature Analysis for Action Recognition
* Domain Adaptation on the Statistical Manifold
* Dual Linear Regression Based Classification for Face Cluster Recognition
* Dual-Space Decomposition of 2D Complex Shapes
* Edge-Aware Gradient Domain Optimization Framework for Image Filtering by Local Propagation
* Efficient Action Localization with Approximately Normalized Fisher Vectors
* Efficient Boosted Exemplar-Based Face Detection
* Efficient Computation of Relative Pose for Multi-camera Systems
* Efficient Feature Extraction, Encoding, and Classification for Action Recognition
* Efficient Hierarchical Graph-Based Segmentation of RGBD Videos
* Efficient High-Resolution Stereo Matching Using Local Plane Sweeps
* Efficient Nonlinear Markov Models for Human Motion
* Efficient Pruning LMI Conditions for Branch-and-Prune Rank and Chirality-Constrained Estimation of the Dual Absolute Quadric
* Efficient Squared Curvature
* Efficient Structured Parsing of Facades Using Dynamic Programming
* Empirical Minimum Bayes Risk Prediction: How to Extract an Extra Few % Performance from Vision Models with Just Three More Parameters
* Energy Based Multi-model Fitting & Matching for 3D Reconstruction
* Enriching Visual Knowledge Bases via Object Discovery and Segmentation
* Error-Tolerant Scribbles Based Interactive Image Segmentation
* Evaluation of Scan-Line Optimization for 3D Medical Image Registration
* Event Detection Using Multi-level Relevance Labels and Multiple Features
* Evolutionary Quasi-Random Search for Hand Articulations Tracking
* Exemplar-Based CRF for Multi-instance Object Segmentation, An
* Exploiting Shading Cues in Kinect IR Images for Geometry Refinement
* Face Alignment at 3000 FPS via Regressing Local Binary Features
* Facial Expression Recognition via a Boosted Deep Belief Network
* Fantope Regularization in Metric Learning
* Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction
* Fast and Exact: ADMM-Based Discriminative Shape Segmentation with Loopy Part Models
* Fast and Reliable Two-View Translation Estimation
* Fast and Robust Algorithm to Count Topologically Persistent Holes in Noisy Clouds, A
* Fast and Robust Archetypal Analysis for Representation Learning
* Fast Approximate Inference in Higher Order MRF-MAP Labeling Problems
* Fast Edge-Preserving PatchMatch for Large Displacement Optical Flow
* FAST LABEL: Easy and Efficient Solution of Joint Multi-label and Estimation Problems
* Fast MRF Optimization with Application to Depth Reconstruction
* Fast Rotation Search with Stereographic Projections for 3D Registration
* Fast Supervised Hashing with Decision Trees for High-Dimensional Data
* Fast, Approximate Piecewise-Planar Modeling Based on Sparse Structure-from-Motion and Superpixels
* Fastest Deformable Part Model for Object Detection, The
* FAUST: Dataset and Evaluation for 3D Mesh Registration
* Feature-Independent Action Spotting without Human Localization, Segmentation, or Frame-wise Tracking
* Filter Forests for Learning Data-Dependent Convolutional Kernels
* Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers
* Finding the Subspace Mean or Median to Fit Your Need
* Finding Vanishing Points via Point Alignments in Image Primal and Dual Domains
* Fine-Grained Visual Comparisons with Local Learning
* Fisher and VLAD with FLAIR
* Fourier Analysis on Transient Imaging with a Multifrequency Time-of-Flight Camera
* From Categories to Individuals in Real Time: A Unified Boosting Approach
* From Stochastic Grammar to Bayes Network: Probabilistic Parsing of Complex Activity
* Full-Angle Quaternions for Robustly Matching Vectors of 3D Rotations
* Fully Automated Non-rigid Segmentation with Distance Regularized Level Set Evolution Initialized and Constrained by Deep-Structured Inference
* Gait Recognition under Speed Transition
* Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild
* General and Simple Method for Camera Pose and Focal Length Determination, A
* Generalized Max Pooling
* Generalized Nonconvex Nonsmooth Low-Rank Minimization
* Generalized Pupil-centric Imaging and Analytical Calibration for a Non-frontal Camera
* Generating Object Segmentation Proposals Using Global and Local Search
* Geometric Generative Gaze Estimation (G3E) for Remote RGB-D Cameras
* Geometric Urban Geo-localization
* Gesture Recognition Portfolios for Personalization
* Good Vibrations: A Modal Analysis Approach for Sequential Non-rigid Structure from Motion
* GPS-Tag Refinement Using Random Walks with an Adaptive Damping Factor
* Graph Cut Based Continuous Stereo Matching Using Locally Shared Labels
* Grassmann Averages for Scalable Robust PCA
* Ground Plane Estimation Using a Hidden Markov Model
* Gyro-Based Multi-image Deconvolution for Removing Handshake Blur
* Hash-SVM: Scalable Kernel Machines for Large-Scale Visual Classification
* Head Pose Estimation Based on Multivariate Label Distribution
* Hierarchical Context Model for Event Recognition in Surveillance Video, A
* Hierarchical Feature Hashing for Fast Dimensionality Reduction
* Hierarchical Probabilistic Model for Facial Feature Detection, A
* Hierarchical Subquery Evaluation for Active Learning on a Graph
* High Quality Photometric Reconstruction Using a Depth Camera
* High Resolution 3D Shape Texture from Multiple Videos
* Higher-Order Clique Reduction without Auxiliary Variables
* Histograms of Pattern Sets for Image Classification and Object Recognition
* How to Evaluate Foreground Maps
* Human Action Recognition across Datasets by Foreground-Weighted Histogram Decomposition
* Human Action Recognition Based on Context-Dependent Graph Kernels
* Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group
* Human Body Shape Estimation Using a Multi-resolution Manifold Forest
* Human Shape and Pose Tracking Using Keyframes
* Human vs. Computer in Scene and Object Recognition
* Illumination-Aware Age Progression
* Image Fusion with Local Spectral Consistency and Dynamic Gradient Sparsity
* Image Pre-compensation: Balancing Contrast and Ringing
* Image Reconstruction from Bag-of-Visual-Words
* Image-Based Synthesis and Re-synthesis of Viewpoints Guided by 3D Models
* Immediate, Scalable Object Category Detection
* Improving Semantic Concept Detection through the Dictionary of Visually-Distinct Elements
* In Search of Inliers: 3D Correspondence by Local and Global Voting
* Incorporating Scene Context and Object Layout into Appearance Modeling
* Incremental Activity Modeling and Recognition in Streaming Videos
* Incremental Face Alignment in the Wild
* Incremental Learning of NCM Forests for Large-Scale Image Classification
* Inferring Analogous Attributes
* Inferring Unseen Views of People
* Informed Haar-Like Features Improve Pedestrian Detection
* Instance-Weighted Transfer Learning of Active Appearance Models
* Interval Tracker: Tracking by Interval Analysis
* Investigating Haze-Relevant Features in a Learning Framework for Image Dehazing
* Is Rotation a Nuisance in Shape Recognition?
* Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation
* Iterative Multilevel MRF Leveraging Context and Voxel Information for Brain Tumour Segmentation in MRI
* Joint Coupled-Feature Representation and Coupled Boosting for AD Diagnosis
* Joint Depth Estimation and Camera Shake Removal from Single Blurry Image
* Joint Motion Segmentation and Background Estimation in Dynamic Scenes
* Joint Summarization of Large-Scale Collections of Web Images and Videos for Storyline Reconstruction
* Kernel-PCA Analysis of Surface Normals for Shape-from-Shading
* L0 Norm Based Dictionary Learning by Proximal Methods with Global Convergence
* Lacunarity Analysis on Image Patterns for Texture Classification
* Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities, The
* Laplacian Coordinates for Seeded Image Segmentation
* Large Scale Multi-view Stereopsis Evaluation
* Large-Scale Optimization of Hierarchical Features for Saliency Prediction in Natural Images
* Large-Scale Video Classification with Convolutional Neural Networks
* Large-Scale Visual Font Recognition
* Latent Dictionary Learning for Sparse Representation Based Classification
* Latent Regression Forest: Structured Estimation of 3D Articulated Hand Posture
* Learning an Image-Based Motion Context for Multiple People Tracking
* Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
* Learning Euclidean-to-Riemannian Metric for Point-to-Set Classification
* Learning Everything about Anything: Webly-Supervised Visual Concept Learning
* Learning Expressionlets on Spatio-temporal Manifold for Dynamic Facial Expression Recognition
* Learning Fine-Grained Image Similarity with Deep Ranking
* Learning Important Spatial Pooling Regions for Scene Classification
* Learning Inhomogeneous FRAME Models for Object Patterns
* Learning Mid-level Filters for Person Re-identification
* Learning Non-linear Reconstruction Models for Image Set Classification
* Learning Optimal Seeds for Diffusion-Based Salient Object Detection
* Learning Receptive Fields for Pooling from Tensors of Feature Response
* Learning Scalable Discriminative Dictionary with Sample Relatedness
* Learning to Detect Ground Control Points for Improving the Accuracy of Stereo Matching
* Learning to Group Objects
* Learning to Learn, from Transfer Learning to Domain Adaptation: A Unifying Perspective
* Learning-Based Atlas Selection for Multiple-Atlas Segmentation
* Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation
* Learning-to-Rank Approach for Image Color Enhancement, A
* Leveraging Hierarchical Parametric Networks for Skeletal Joints Based Action Segmentation and Recognition
* Light Field Stereo Matching Using Bilateral Statistics of Surface Cameras
* Linear Ranking Analysis
* Local Layering for Joint Motion Estimation and Occlusion Detection
* Local Readjustment for High-Resolution 3D Reconstruction
* Local Regularity-Driven City-Scale Facade Detection from Aerial Images
* Locality in Generic Instance Search from One Example
* Locally Linear Hashing for Extracting Non-linear Manifolds
* Locally Optimized Product Quantization for Approximate Nearest Neighbor Search
* Look at the Driver, Look at the Road: No Distraction! No Accident!
* Looking Beyond the Visible Scene
* Low-Cost Compressive Sensing for Color Video and Depth
* L_0 Regularized Stationary Time Estimation for Crowd Group Analysis
* Manifold Based Dynamic Texture Synthesis from Extremely Few Samples
* MAP Visibility Estimation for Large-Scale Dynamic 3D Reconstruction
* Matrix-Similarity Based Loss Function and Feature Selection for Alzheimer's Disease Diagnosis
* Max-Margin Boltzmann Machines for Object Segmentation
* Maximum Persistency in Energy Minimization
* Measuring Distance between Unordered Sets of Different Sizes
* Merging SVMs with Linear Discriminant Analysis: A Combined Model
* MILCut: A Sweeping Line Multiple Instance Learning Paradigm for Interactive Image Segmentation
* Minimal Scene Descriptions from Structure from Motion Models
* Minimal Solution to the Generalized Pose-and-Scale Problem, A
* Minimal Solvers for Relative Pose with a Single Unknown Radial Distortion
* Mirror Symmetry Histograms for Capturing Geometric Properties in Images
* Mixing Body-Part Sequences for Human Pose Estimation
* Mixture of Manhattan Frames: Beyond the Manhattan World, A
* Model Transport: Towards Scalable Transfer Learning on Manifolds
* Modeling Image Patches with a Generic Dictionary of Mini-epitomes
* Motion-Depth: RGB-D Depth Map Enhancement with Motion and Depth in Complement
* Multi-cue Visual Tracking Using Robust Feature-Level Fusion Based on Joint Sparse Representation
* Multi-feature Spectral Clustering with Minimax Optimization
* Multi-fold MIL Training for Weakly Supervised Object Localization
* Multi-forest Tracker: A Chameleon in Tracking
* Multi-label Generic Cuts: Optimal Inference in Multi-label Multi-clique MRF-MAP Problems
* Multi-object Tracking via Constrained Sequential Labeling
* Multi-output Learning for Camera Relocalization
* Multi-shot Imaging: Joint Alignment, Deblurring, and Resolution-Enhancement
* Multi-source Deep Learning for Human Pose Estimation
* Multi-target Tracking with Motion Context in Tensor Power Iteration
* Multi-view Super Vector for Action Recognition
* Multigraph Representation for Improved Unsupervised/Semi-supervised Learning of Human Actions, A
* Multilabel Ranking with Inconsistent Rankers
* Multimodal Learning in Loosely-Organized Web Images
* Multiple Granularity Analysis for Fine-Grained Action Detection
* Multiple Structured-Instance Learning for Semantic Segmentation with Uncertain Training Data
* Multiple Target Tracking Based on Undirected Hierarchical Relation Hypergraph
* Multipoint Filtering with Local Polynomial Approximation and Range Guidance
* Multiscale Centerline Detection by Learning a Scale-Space Distance Transform
* Multiscale Combinatorial Grouping
* Multivariate General Linear Models (MGLM) on Riemannian Manifolds with Applications to Statistical Analysis of Diffusion Weighted Images
* Multiview Shape and Reflectance from Natural Illumination
* Neural Decision Forests for Semantic Image Labelling
* New Perspective on Material Classification and Ink Identification, A
* Newton Greedy Pursuit: A Quadratic Approximation Method for Sparsity-Constrained Optimization
* NMF-KNN: Image Annotation Using Weighted Multi-view Non-negative Matrix Factorization
* Noising versus Smoothing for Vertex Identification in Unknown Shapes
* Non-parametric Bayesian Constrained Local Models
* Non-rigid Segmentation Using Sparse Low Dimensional Manifolds and Deep Belief Networks
* Nonparametric Context Modeling of Local Appearance for Pose- and Expression-Robust Facial Landmark Localization
* Nonparametric Part Transfer for Fine-Grained Recognition
* Novel Chamfer Template Matching Method Using Variational Mean Field, A
* Novel Methods for Multilinear Data Completion and De-noising Based on Tensor-SVD
* Object Classification with Adaptable Regions
* Object Partitioning Using Local Convexity
* Object-Based Multiple Foreground Video Co-segmentation
* Occluding Contours for Multi-view Stereo
* Occlusion Coherence: Localizing Occluded Faces with a Hierarchical Deformable Part Model
* Occlusion Geodesics for Online Multi-object Tracking
* On Projective Reconstruction in Arbitrary Dimensions
* On the Quotient Representation for the Essential Manifold
* One Millisecond Face Alignment with an Ensemble of Regression Trees
* Online Learned Elementary Grouping Model for Multi-target Tracking, An
* Online Object Tracking, Learning and Parsing with And-Or Graphs
* Optimal Decisions from Probabilistic Models: The Intersection-over-Union Case
* Optimizing Average Precision Using Weakly Supervised Data
* Optimizing over Radial Kernels on Compact Manifolds
* Orientation Robust Text Line Detection in Natural Images
* Orientational Pyramid Matching for Recognizing Indoor Scenes
* Packing and Padding: Coupled Multi-index for Accurate Image Retrieval
* PANDA: Pose Aligned Networks for Deep Attribute Modeling
* Parallax-Tolerant Image Stitching
* Parsing Occluded People
* Parsing Videos of Actions with Segmental Grammars
* Parsing World's Skylines Using Shape-Constrained MRFs
* Partial Occlusion Handling for Visual Tracking via Robust Part Matching
* Partial Optimality by Pruning for MAP-Inference with General Graphical Models
* Partial Symmetry in Polynomial Systems and Its Applications in Computer Vision
* Patch to the Future: Unsupervised Visual Prediction
* Patch-Based Evaluation of Image Segmentation
* PatchMatch Based Joint View Selection and Depthmap Estimation
* Pedestrian Detection in Low-Resolution Imagery by Learning Multi-scale Intrinsic Motion Structures (MIMS)
* Persistence-Based Structural Recognition
* Persistent Tracking for Wide Area Aerial Surveillance
* Photometric Bundle Adjustment for Dense Multi-view 3D Modeling
* Photometric Stereo Using Constrained Bivariate Regression for General Isotropic Surfaces
* Photometry of Intrinsic Images, The
* Piecewise Planar and Compact Floorplan Reconstruction from Images
* Point Matching in the Presence of Outliers in Both Point Sets: A Concave Optimization Approach
* Posebits for Monocular Human Pose Estimation
* Preconditioning for Accelerated Iteratively Reweighted Least Squares in Structured Sparsity Reconstruction
* Predicting Failures of Vision Systems
* Predicting Matchability
* Predicting Multiple Attributes via Relative Multi-task Learning
* Predicting Object Dynamics in Scenes
* Predicting User Annoyance Using Visual Attributes
* Primal-Dual Algorithm for Higher-Order Multilabel Markov Random Fields, A
* Principled Approach for Coarse-to-Fine MAP Inference, A
* Probabilistic Framework for Multitarget Tracking with Mutual Occlusions, A
* Probabilistic Labeling Cost for High-Accuracy Multi-view Reconstruction
* Procrustean Markov Process for Non-rigid Structure Recovery, A
* Product Sparse Coding
* Pseudoconvex Proximal Splitting for L-infinity Problems in Multiview Geometry
* Pulling Things out of Perspective
* Pyramid-Based Visual Tracking Using Sparsity Represented Mean Transform
* Quality Assessment for Comparing Image Enhancement Algorithms
* Quality Dynamic Human Body Modeling Using a Single Low-Cost Depth Camera
* Quality-Based Multimodal Classification Using Tree-Structured Sparsity
* Quasi Real-Time Summarization for Consumer Videos
* Random Laplace Feature Maps for Semigroup Kernels on Histograms
* Randomized Max-Margin Compositions for Visual Recognition
* Range-Sample Depth Feature for Action Recognition
* RAPS: Robust and Efficient Automatic Construction of Person-Specific Deformable Models
* Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition
* Raw-to-Raw: Mapping between Image Sensor Color Responses
* Real-Time Model-Based Articulated Object Pose Detection and Tracking with Variable Rigidity Constraints
* Real-Time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera
* Realtime and Robust Hand Tracking from Depth
* Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts
* Recognizing RGB Images by Learning from RGB-D Data
* Reconstructing Evolving Tree Structures in Time Lapse Sequences
* Reconstructing PASCAL VOC
* Reconstructing Storyline Graphs for Image Recommendation from Web Community Photos
* Recovering Surface Details under General Unknown Illumination Using Shading and Coarse Multi-view Stereo
* Rectification, and Segmentation of Coplanar Repeated Patterns
* Reflectance and Fluorescent Spectra Recovery Based on Fluorescent Chromaticity Invariance under Varying Illumination
* Region-Based Discriminative Feature Pooling for Scene Text Recognition
* Region-Based Particle Filter for Video Object Segmentation
* Relative Parts: Distinctive Parts for Learning Relative Attributes
* Relative Pose Estimation for a Multi-camera System with Known Vertical Direction
* Remote Heart Rate Measurement from Face Videos under Realistic Situations
* Reverse Hierarchy Model for Predicting Eye Fixations, A
* Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
* Riemannian Framework for Matching Point Clouds Represented by the Schrödinger Distance Transform, A
* Rigid Motion Segmentation Using Randomized Voting
* RIGOR: Reusing Inference in Graph Cuts for Generating Object Regions
* Robust 3D Features for Matching between Distorted Range Scans Captured by Moving Systems
* Robust 3D Tracking with Descriptor Fields
* Robust Estimation of 3D Human Poses from a Single Image
* Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning
* Robust Orthonormal Subspace Learning: Efficient Recovery of Corrupted Low-Rank Matrices
* Robust Scale Estimation in Real-Time Monocular SFM for Autonomous Driving
* Robust Separation of Reflection from Multiple Images
* Robust Subspace Segmentation with Block-Diagonal Prior
* Robust Surface Reconstruction via Triple Sparsity
* Role of Context for Object Detection and Semantic Segmentation in the Wild, The
* Saliency Detection on Light Field
* Saliency Optimization from Robust Background Detection
* Salient Region Detection via High-Dimensional Color Transform
* Scalable 3D Tracking of Multiple Interacting Objects
* Scalable Multitask Representation Learning for Scene Classification
* Scalable Object Detection Using Deep Neural Networks
* Scale-Space Processing Using Polynomial Representations
* SCAMS: Simultaneous Clustering and Model Selection
* Scanline Sampler without Detailed Balance: An Efficient MCMC for MRF Optimization
* Scattering Parameters and Surface Normals from Homogeneous Translucent Materials Using Photometric Stereo
* Scene Labeling Using Beam Search under Mutex Constraints
* Scene Parsing with Object Instances and Occlusion Ordering
* Scene-Independent Group Profiling in Crowd
* SeamSeg: Video Object Segmentation Using Patch Seams
* Second-Order Shape Optimization for Geometric Inverse Problems in Vision
* Secrets of Salient Object Segmentation, The
* Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models
* Seeing the Arrow of Time
* Seeing What You're Told: Sentence-Guided Activity Recognition in Video
* Segmentation-Aware Deformable Part Models
* Segmentation-Free Dynamic Scene Deblurring
* Semantic Object Selection
* Semi-supervised Coupled Dictionary Learning for Person Re-identification
* Semi-supervised Relational Topic Model for Weakly Annotated Image Recognition in Social Media
* Semi-supervised Spectral Clustering for Image Set Classification
* Separable Kernel for Image Deblurring
* Separation of Line Drawings Based on Split Faces for 3D Object Reconstruction
* Sequential Convex Relaxation for Mutual Information-Based Unsupervised Figure-Ground Segmentation
* Shadow Removal from Single RGB-D Images
* Shape-Preserving Half-Projective Warps for Image Stitching
* Shape-Time Random Field for Semantic Video Labeling, The
* Shrinkage Fields for Effective Image Restoration
* Sign Spotting Using Hierarchical Sequential Patterns with Temporal Intervals
* Similarity Comparisons for Interactive Fine-Grained Categorization
* Similarity-Aware Patchwork Assembly for Depth Image Super-resolution
* Simplex-Based 3D Spatio-temporal Feature Description for Action Recognition
* Simultaneous Localization and Calibration: Self-Calibration of Consumer Depth Cameras
* Simultaneous Twin Kernel Learning Using Polynomial Transformations for Structured Prediction
* Single Image Layer Separation Using Relative Smoothness
* Single Image Super-resolution Using Deformable Patches
* Single-View 3D Scene Parsing by Attributed Grammar
* Smooth Representation Clustering
* Socially-Aware Large-Scale Crowd Forecasting
* Sparse Dictionary Learning for Edit Propagation of High-Resolution Images
* Spectral Clustering with Jensen-Type Kernels and Their Multi-point Extensions
* Spectral Graph Reduction for Efficient Image and Streaming Video Segmentation
* Speeding Up Tracking by Ignoring Features
* SphereFlow: 6 DoF Scene Flow from RGB-D Pairs
* Stable and Informative Spectral Signatures for Graph Matching
* Stable Learning in Coding Space for Multi-class Decoding and Its Extension for Multi-class Hypothesis Transfer Learning
* Stable Template-Based Isometric 3D Reconstruction in All Imaging Conditions by Linear Least-Squares
* Stacked Progressive Auto-Encoders (SPAE) for Face Recognition Across Poses
* SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization
* Stereo under Sequential Optimal Sampling: A Statistical Analysis Framework for Search Space Reduction
* StoryGraphs: Visualizing Character Interactions as a Timeline
* Strokelets: A Learned Multi-scale Representation for Scene Text Recognition
* Study on Cross-Population Age Estimation, A
* Submodular Object Recognition
* Submodularization for Binary Pairwise Energies
* Subspace Clustering for Sequential Data
* Subspace Tracking under Dynamic Dimensionality for Online Background Subtraction
* Super Normal Vector for Activity Recognition Using Depth Sequences
* Super-resolving Noisy Images
* Surface Registration by Optimization in Constrained Diffeomorphism Space
* Surface-from-Gradients: An Approach Based on Discrete Geometry Processing
* Switchable Deep Network for Pedestrian Detection
* Symmetry-Aware Nonrigid Matching of Incomplete 3D Surfaces
* Synthesizability of Texture Examples, The
* T-Linkage: A Continuous Relaxation of J-Linkage for Multi-model Fitting
* Talking Heads: Detecting Humans and Recognizing Their Interactions
* Tell Me What You See and I Will Show You Where It Is
* Temporal Segmentation of Egocentric Videos
* Temporal Sequence Modeling for Video Event Detection
* Three Guidelines of Online Learning for Large-Scale Visual Recognition
* Time-Mapping Using Space-Time Saliency
* Timing-Based Local Descriptor for Dynamic Surfaces
* Topic Modeling of Multimodal Data: An Autoregressive Approach
* Total Variation Blind Deconvolution: The Devil Is in the Details
* Total-Variation Minimization on Unstructured Volumetric Mesh: Biophysical Applications on Reconstruction of 3D Ischemic Myocardium
* Towards Good Practices for Action Video Encoding
* Towards Multi-view and Partially-Occluded Face Alignment
* Towards Unified Human Parsing and Pose Estimation
* Tracking Indistinguishable Translucent Objects over Time Using Weakly Supervised Structured Learning
* Tracking on the Product Manifold of Shape and Orientation for Tractography from Diffusion MRI
* Tracklet Association with Online Target-Specific Metric Learning
* Transfer Joint Matching for Unsupervised Domain Adaptation
* Transformation Pursuit for Image Classification
* Transitive Distance Clustering with K-Means Duality
* Transparent Object Reconstruction via Coded Transport of Intensity
* Triangulation Embedding and Democratic Aggregation for Image Search
* Trinocular Geometry Revisited
* Turning Mobile Phones into 3D Scanners
* Two-Class Weather Classification
* Two-View Camera Housing Parameters Calibration for Multi-layer Flat Refractive Interface
* Understanding Objects in Detail with Fine-Grained Attributes
* Unified Face Analysis by Iterative Multi-output Random Forests
* Unifying Spatial and Attribute Selection for Distracter-Resilient Tracking
* Unsupervised Learning of Dictionaries of Hierarchical Compositional Models
* Unsupervised Multi-class Joint Image Segmentation
* Unsupervised One-Class Learning for Automatic Outlier Removal
* Unsupervised Spectral Dual Assignment Clustering of Human Actions in Context
* Unsupervised Trajectory Modelling Using Temporal Information via Minimal Paths
* User-Specific Hand Modeling from Monocular Depth Sequences
* Using a Deformation Field Model for Localizing Faces and Facial Points under Weak Supervision
* Using k-Poselets for Detecting People and Localizing Their Keypoints
* Using Projection Kurtosis Concentration of Natural Images for Blind Noise Covariance Matrix Estimation
* Very Fast Solution to the PnP Problem with Algebraic Outlier Rejection
* Video Classification Using Semantic Concept Co-occurrences
* Video Event Detection by Inferring Temporal Instance Labels
* Video Motion Segmentation Using New Adaptive Manifold Denoising Model
* Visual Persuasion: Inferring Communicative Intents of Images
* Visual Semantic Search: Retrieving Videos via Complex Textual Queries
* Visual Tracking Using Pertinent Patch Selection and Masking
* Visual Tracking via Probability Continuous Outlier Model
* Weakly Supervised Multiclass Video Segmentation
* Weighted Nuclear Norm Minimization with Application to Image Denoising
* What Are You Talking About? Text-to-Image Coreference
* What Camera Motion Reveals about Shape with Unknown BRDF
* When 3D Reconstruction Meets Ubiquitous RGB-D Images
* Who Do I Look Like? Determining Parent-Offspring Resemblance via Gated Autoencoders
* Word Channel Based Multiscale Pedestrian Detection without Image Resizing and Using Only One Classifier
* Zero-Shot Event Detection Using Multi-modal Fusion of Weakly Supervised Concepts
540 for CVPR14

CVPR15 * *CVPR
* 24/7 Place Recognition by View Synthesis
* 3D all the way: Semantic segmentation of urban scenes from start to end in 3D
* 3D deep shape descriptor
* 3D model-based continuous emotion recognition
* 3D Reconstruction in the presence of glasses by acoustic and stereo fusion
* 3D scanning deformable objects with a single RGBD sensor
* 3D shape estimation from 2D landmarks: A convex relaxation approach
* 3D ShapeNets: A deep representation for volumetric shapes
* Absolute pose for cameras under flat refractive interfaces
* Accurate depth map estimation from a lenslet light field camera
* Action recognition with trajectory-pooled deep-convolutional descriptors
* Active learning and discovery of object categories in the presence of unnameable instances
* Active learning for structured probabilistic models with histogram approximation
* Active Pictorial Structures
* Active sample selection and correction propagation on a gradually-augmented graph
* active search strategy for efficient object class detection, An
* ActivityNet: A large-scale video benchmark for human activity understanding
* Adaptive as-natural-as-possible image stitching
* Adaptive eye-camera calibration for head-worn devices
* Adaptive region pooling for object detection
* Adopting an unconstrained ray model in light-field cameras for 3D shape reconstruction
* Aligning 3D models to RGB-D images of cluttered scenes
* Ambient occlusion via compressive visibility estimation
* aperture problem for refractive motion, The
* Appearance-based gaze estimation in the wild
* application of two-level attention models in deep convolutional neural network for fine-grained image classification, The
* Approximate nearest neighbor fields in video
* approximate shading model for object relighting, An
* Articulated motion discovery using pairs of trajectories
* Associating neural word embeddings with deep image representations using Fisher Vectors
* Attributes and categories for generic instance search from one example
* Automatic construction of robust spherical harmonic subspaces
* Automatically discovering local visual material attributes
* Background Subtraction via generalized fused LASSO foreground modeling
* Basis mapping based boosting for object detection
* Bayesian adaptive matrix factorization with automatic model selection
* Bayesian Inference for Neighborhood Filters With Application in Denoising
* Bayesian sparse representation for hyperspectral image super resolution
* Becoming the expert: Interactive multi-class machine teaching
* Best of both worlds: Human-machine collaboration for object annotation
* Best-Buddies Similarity for robust template matching
* Beyond frontal faces: Improving Person Recognition using multiple cues
* Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition
* Beyond Mahalanobis metric: Cayley-Klein metric learning
* Beyond Principal Components: Deep Boltzmann Machines for face modeling
* Beyond short snippets: Deep networks for video classification
* Beyond spatial pooling: Fine-grained representation learning in multiple domains
* Beyond the shortest path: Unsupervised domain adaptation by Sampling Subspaces along the Spline Flow
* Bilinear heterogeneous information machine for RGB-D action recognition
* Bilinear random projections for locality-sensitive binary codes
* Blind optical aberration correction by exploring geometric and visual priors
* Blur kernel estimation using normalized color-line priors
* BOLD: Binary online learned descriptor for efficient image matching
* Book2Movie: Aligning video scenes with book chapters
* Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection
* Building proteins in a day: Efficient 3D molecular reconstruction
* Burst Deblurring: Removing Camera Shake Through Fourier Burst Accumulation
* Camera intrinsic blur kernel estimation: A reliable framework
* Can humans fly? Action understanding with multiple classes of actors
* Cascaded hand pose regression
* Casual stereoscopic panorama stitching
* Category-specific object reconstruction from a single image
* Causal video object segmentation from persistence of occlusions
* CIDEr: Consensus-based image description evaluation
* Class consistent multi-modal fusion with binary features
* Classifier adaptation at prediction time
* Classifier based graph construction for video segmentation
* Classifier learning with hidden information
* Clique-graph matching by preserving global & local structure
* Clustering of static-adaptive correspondences for deformable object tracking
* Co-saliency detection via looking deep and wide
* coarse-to-fine model for 3D pose estimation and sub-category recognition, A
* Coarse-to-fine region selection and matching
* Collaborative feature learning from social media
* Combination features and models for human detection
* Combining local appearance and holistic view: Dual-Source Deep Neural Networks for human pose estimation
* common self-polar triangle of concentric circles and its application to camera calibration, The
* Completing 3D object shape from one depth image
* Complexity-adaptive distance metric for object proposals generation
* Computationally bounded retrieval
* Computing similarity transformations from only image correspondences
* Computing the stereo matching cost with a convolutional neural network
* ConceptLearner: Discovering visual concepts from weakly labeled image collections
* Constrained planar cuts: Object partitioning for point clouds
* Continuous Visibility Feature
* convex optimization approach to robust fundamental matrix estimation, A
* Convolutional feature masking for joint object and stuff segmentation
* convolutional neural network cascade for face detection, A
* Convolutional neural networks at constrained time cost
* Correlation filters with limited boundaries
* Cross-age face verification by coordinating with cross-face age verification
* Cross-scene crowd counting via deep convolutional neural networks
* Curriculum learning of multiple tasks
* DASC: Dense adaptive self-correlation descriptor for multi-modal and multi-spectral correspondence
* Data-driven 3D Voxel Patterns for object category recognition
* Data-driven depth map refinement via multi-scale sparse representation
* Data-driven sparsity-based restoration of JPEG-compressed images in dual transform-pixel domain
* Dataset fingerprints: Exploring image collections through data mining
* dataset for Movie Description, A
* Deep convolutional neural fields for depth estimation from a single image
* Deep correlation for matching images and text
* Deep domain adaptation for describing people based on fine-grained clothing attributes
* Deep filter banks for texture recognition and segmentation
* Deep hashing for compact binary codes learning
* Deep hierarchical parsing for semantic segmentation
* Deep LAC: Deep localization, alignment and classification for fine-grained recognition
* Deep multiple instance learning for image classification and auto-annotation
* Deep networks for saliency detection via local estimation and global search
* Deep neural networks are easily fooled: High confidence predictions for unrecognizable images
* Deep roto-translation scattering for object classification
* Deep Semantic Ranking Based Hashing for Multi-Label Image Retrieval
* Deep sparse representation for robust image registration
* Deep transfer metric learning
* Deep Visual-Semantic Alignments for Generating Image Descriptions
* DEEP-CARVING: Discovering visual attributes by carving deep neural nets
* DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection
* DeepEdge: A multi-scale bifurcated deep network for top-down contour detection
* DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
* Deeply learned attributes for crowded scene understanding
* Deeply learned face representations are sparse, selective, and robust
* Deepshape: Deep learned shape descriptor for 3D shape matching and retrieval
* Defocus deblurring and superresolution for time-of-flight depth cameras
* Deformable part models are convolutional neural networks
* Delving into egocentric actions
* Dense, accurate optical flow estimation with piecewise parametric model
* Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs
* Depth camera tracking with contour cues
* Depth from focus with your mobile phone
* Depth from shading, defocus, and correspondence using light-field angular coherence
* Depth image enhancement using local tangent plane approximations
* Descriptor free visual indoor localization with line segments
* Designing deep networks for surface normal estimation
* Detector discovery in the wild: Joint multiple instance and representation learning
* DevNet: A Deep Event Network for multimedia event detection and evidence recounting
* Direct structure estimation for 3D reconstruction
* Direction matters: Depth estimation with a surface normal classifier
* Discovering states and transformations in image collections
* Discrete hyper-graph matching
* Discrete optimization of ray potentials for semantic 3D reconstruction
* Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition With Image Sets
* Discriminative and consistent similarities in instance-level Multiple Instance Learning
* discriminative CNN video representation for event detection, A
* Discriminative learning of iteration-wise priors for blind deconvolution
* Discriminative shape from shading in uncalibrated illumination
* Displets: Resolving stereo ambiguities using object knowledge
* Diversity-induced Multi-view Subspace Clustering
* Domain-size pooling in local descriptors: DSP-SIFT
* Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks
* Dual domain filters based texture and structure preserving image non-blind deconvolution
* Dynamic Convolutional Layer for short range weather prediction, A
* dynamic programming approach for fast and robust object pose recognition from range images, A
* Dynamically encoded actions based on spacetime saliency
* DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time
* Early burst detection for memory-efficient image retrieval
* Effective face frontalization in unconstrained images
* Effective learning-based illuminant estimation using simple features
* Efficient and accurate approximations of nonlinear convolutional networks
* Efficient ConvNet-based marker-less motion capture in general scenes with a low number of cameras
* Efficient Globally Optimal Consensus Maximisation with Tree Search
* Efficient illuminant estimation for color constancy using grey pixels
* Efficient label collection for unlabeled image datasets
* Efficient minimal-surface regularization of perspective depth maps in variational stereo
* Efficient object localization using Convolutional Networks
* Efficient parallel optimization for Potts energy with hierarchical fusion
* Efficient SDP inference for fully-connected CRFs based on low-rank decomposition
* Efficient sparse-to-dense optical flow estimation using a learned basis and layers
* efficient volumetric framework for shape tracking, An
* Ego-surfing first person videos
* EgoSampling: Fast-forward and stereo for egocentric videos
* Elastic functional coding of human actions: From vector-fields to latent variables
* Elastic-net regularization of singular values for robust subspace learning
* Embedded phase shifting: Robust phase shifting with embedded signals
* Encoding based saliency detection for videos and images
* End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression
* Enriching object detection with 2D-3D registration and continuous viewpoint estimation
* EpicFlow: Edge-preserving interpolation of correspondences for optical flow
* Evaluation of output embeddings for fine-grained image classification
* Event-driven stereo matching for real-time 3D panoramic vision
* Exact bias correction and covariance estimation for stereo vision
* Exemplar SVMs as visual feature encoders
* Expanding object detector's Horizon: Incremental learning framework for object detection in videos
* Exploiting uncertainty in regression forests for accurate camera relocalization
* Eye tracking assisted extraction of attentionally important objects from videos
* Face alignment by coarse-to-fine shape searching
* Face alignment using cascade Gaussian process regression trees
* Face video retrieval with image query via hashing across Euclidean space and Riemannian manifold
* FaceNet: A unified embedding for face recognition and clustering
* FAemb: A function approximation-based embedding method for image retrieval
* FaLRR: A fast low rank representation solver
* Fast 2D border ownership assignment
* Fast action proposals for human action detection and search
* fast algorithm for elastic shape distances between closed planar curves, A
* Fast and accurate image upscaling with super-resolution forests
* Fast and flexible convolutional sparse coding
* Fast and robust hand tracking using detection-guided optimization
* Fast bilateral-space stereo for synthetic defocus
* Fast randomized Singular Value Thresholding for Nuclear Norm Minimization
* Feature-independent context estimation for automatic image annotation
* Feedforward semantic segmentation with zoom-out features
* Filtered channel features for pedestrian detection
* Finding action tubes
* Finding distractors in images
* Fine-grained classification of pedestrians in video: Benchmark and state of the art
* Fine-grained histopathological image analysis via robust segmentation and large-scale retrieval
* Fine-grained recognition without part annotations
* Fine-grained visual categorization via multi-stage metric learning
* First-person pose recognition using egocentric workspaces
* Fisher vectors meet Neural Networks: A hybrid classification architecture
* Fixation bank: Learning to reweight fixation candidates
* fixed viewpoint approach for dense reconstruction of transparent objects, A
* flexible tensor block coordinate ascent scheme for hypergraph matching, A
* FlowWeb: Joint image set alignment by weaving consistent, pixel-wise correspondences
* Flying objects detection from a single moving camera
* FPA-CS: Focal plane array-based compressive imaging in short-wave infrared
* From captions to visual concepts and back
* From categories to subcategories: Large-scale image classification with partial class label refinement
* From dictionary of visual words to subspaces: Locality-constrained affine subspace coding
* From image-level to pixel-level labeling with Convolutional Networks
* From single image query to detailed 3D reconstruction
* Fully Convolutional Networks for Semantic Segmentation
* Functional correspondence by matrix completion
* Fusing subcategory probabilities for texture classification
* Fusion moves for correlation clustering
* Gaze-enabled egocentric video summarization via constrained submodular maximization
* Generalized Deformable Spatial Pyramid: Geometry-preserving dense correspondence estimation
* Generalized Tensor Total Variation minimization for visual data recovery
* Generalized video deblurring for dynamic scenes
* Geo-semantic segmentation
* Geodesic exponential kernels: When curvature and linearity conflict
* geodesic-preserving method for image warping, A
* Global refinement of random forest
* Global supervised descent method
* GMMCP tracker: Globally optimal Generalized Maximum Multi Clique problem for multiple object tracking
* Going deeper with convolutions
* Good features to track for visual SLAM
* Graph-based simplex method for pairwise energy minimization with binary variables
* graphical model approach for matching partial signatures, A
* Grasp type revisited: A modern perspective on a classical feature for vision
* GRSA: Generalized range swap algorithm for the efficient optimization of MRFs
* Handling motion blur in multi-frame super-resolution
* Hardware compliant approximate image codes
* Hashing with binary autoencoders
* HC-search for structured prediction in computer vision
* Heat diffusion over weighted manifolds: A new descriptor for textured 3D non-rigid shapes
* Heteroscedastic max-min distance analysis
* Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition
* Hierarchical sparse coding with geometric prior for visual geo-location
* Hierarchical-PEP model for real-world face recognition
* Hierarchically-constrained optical flow
* High-fidelity Pose and Expression Normalization for face recognition in the wild
* High-speed hyperspectral video acquisition with a dual-camera architecture
* Holistic 3D scene understanding from a single geo-tagged image
* How do we use our hands? Discovering a diverse set of common grasps
* How many bits does it take for a stimulus to be salient?
* Human action segmentation with hierarchical supervoxel consistency
* Hyper-class augmented and regularized deep learning for fine-grained image classification
* Hypercolumns for object segmentation and fine-grained localization
* Illumination and reflectance spectra separation of a hyperspectral image meets low-rank matrix factorization
* Image denoising via adaptive soft-thresholding based on non-local samples
* Image parsing with a wide range of classes and scene-level context
* Image partitioning into convex polygons
* Image retrieval using scene graphs
* Image segmentation in Twenty Questions
* Image specificity
* improved deep learning architecture for person re-identification, An
* Improving object detection with deep convolutional networks via Bayesian optimization and structured prediction
* Improving object proposals with multi-thresholding straddling expansion
* In defense of color-based model-free tracking
* Indoor scene structure analysis for single image depth estimation
* Inferring 3D layout of building facades from a single image
* Integrating parametric and non-parametric models for scene labeling
* Interaction part mining: A mid-level approach for fine-grained action recognition
* Interleaved text/image Deep Mining on a large-scale radiology database
* Intra-frame deblurring by leveraging inter-frame camera motion
* Inverting RANSAC: Global model detection via inlier rate estimation
* Is object localization for free? - Weakly-supervised learning with convolutional neural networks
* Iteratively reweighted graph cut for multi-label MRFs with non-convex priors
* Joint action recognition and pose estimation from video
* Joint calibration of Ensemble of Exemplar SVMs
* Joint inference of groups, events and human roles in aerial videos
* Joint multi-feature spatial context for scene recognition in the semantic manifold
* Joint patch and multi-label learning for facial action unit detection
* Joint photo stream and blog post summarization and exploration
* Joint SFM and detection cues for monocular 3D localization in road scenes
* Joint tracking and segmentation of multiple targets
* Joint vanishing point extraction and tracking
* Jointly Learning Heterogeneous Features for RGB-D Activity Recognition
* JOTS: Joint Online Tracking and Segmentation
* Just noticeable defocus blur detection and estimation
* k-support norm and convex envelopes of cardinality and rank, The
* Kernel fusion for better image deblurring
* KL divergence based agglomerative clustering for automated Vitiligo grading
* Label Consistent Quadratic Surrogate model for visual saliency prediction
* Landmarks-based kernelized subspace alignment for unsupervised domain adaptation
* Large-scale and drift-free surface reconstruction using online subvolume registration
* large-scale car dataset for fine-grained categorization and verification, A
* Large-scale damage detection using satellite imagery
* Latent trees for estimating intensity of Facial Action Units
* Layered RGBD scene flow estimation
* Learning a convolutional neural network for non-uniform motion blur removal
* Learning a non-linear knowledge transfer model for cross-view action recognition
* Learning a sequential search for landmarks
* Learning an efficient model of hand shape variation from depth images
* Learning coarse-to-fine sparselets for efficient object detection and scene classification
* Learning deep representations for ground-to-aerial geolocalization
* Learning descriptors for object recognition and 3D pose estimation
* Learning from massive noisy labeled data for image classification
* Learning Graph Structure for Multi-Label Image Classification Via Clique Generation
* Learning Hypergraph-regularized Attribute Predictors
* Learning lightness from human judgement on relative reflectance
* Learning multiple visual tasks while discovering their structure
* Learning scene-specific pedestrian detectors without real data
* Learning semantic relationships for better action retrieval in images
* Learning similarity metrics for dynamic scene segmentation
* Learning to compare image patches via convolutional neural networks
* Learning to detect Motion Boundaries
* Learning to generate chairs with convolutional neural networks
* Learning to look up: Realtime monocular gaze correction using machine learning
* Learning to propose objects
* Learning to rank in person re-identification with metric ensembles
* Learning to segment moving objects in videos
* Learning to segment under various forms of weak supervision
* Learning with dataset bias in latent subcategory models
* Leveraging stereo matching with learning-based confidence measures
* Light field from micro-baseline image pair
* Light field layer matting
* light transport model for mitigating multipath interference in Time-of-flight sensors, A
* Line drawing interpretation in a multi-view context
* Line-based Multi-Label Energy Optimization for fisheye image rectification and calibration
* Line-sweep: Cross-ratio for wide-baseline matching and 3D reconstruction
* linear least-squares solution to elastic Shape-from-Template, A
* LMI-based 2D-3D registration: From uncalibrated images to Euclidean scene
* Local high-order regularization on data manifolds
* Long-term correlation tracking
* Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
* low-dimensional step pattern analysis algorithm with application to multimodal retinal image registration, A
* Low-level vision by consensus in a spatial hierarchy of regions
* L_0TV: A new method for image restoration in the presence of impulse noise
* Making better use of edges via perceptual grouping
* Mapping visual features to semantic profiles for retrieval in medical imaging
* Matching bags of regions in RGBD images
* Matching-CNN meets KNN: Quasi-parametric human parsing
* MatchNet: Unifying feature and metric learning for patch-based matching
* Material classification with thermal imagery
* Material recognition in the wild with the Materials in Context Database
* Matrix completion for resolving label ambiguity
* maximum entropy feature descriptor for age invariant face recognition, A
* Maximum persistency via iterative relaxed inference with graphical models
* Membership representation for detecting block-diagonal structure in low-rank or sparse subspace clustering
* Metric imitation by manifold transfer for efficient vision applications
* metric parametrization for trifocal tensors with non-colinear pinholes, A
* Mid-level deep pattern mining
* Mind's eye: A recurrent visual representation for image caption generation
* Mining semantic affordances of visual object categories
* Mirror, mirror on the wall, tell me, is the error small?
* mixed bag of emotions: Model, predict, and transfer emotion distributions, A
* Model recommendation: Generating object detectors from few samples
* Modeling deformable gradient compositions for single-image super-resolution
* Modeling local and global deformations in Deep Learning: Epitomic convolution, Multiple Instance Learning, and sliding window detection
* Modeling object appearance using Context-Conditioned Component Analysis
* Modeling video evolution for action recognition
* More about VLAD: A leap from Euclidean to Riemannian manifolds
* Motion Part Regularization: Improving action recognition via trajectory group selection
* MRF optimization by graph approximation
* MRF shape prior for facade parsing with occlusions, A
* Multi-feature max-margin hierarchical Bayesian model for action recognition
* Multi-instance object segmentation with occlusion handling
* Multi-manifold deep metric learning for image set classification
* Multi-objective convolutional learning for face labeling
* multi-plane block-coordinate Frank-Wolfe algorithm for training structural SVMs with a costly max-oracle, A
* MUlti-Store Tracker (MUSTer): A cognitive psychology inspired approach to object tracking
* Multi-task deep visual-semantic embedding for video thumbnail selection
* Multi-view feature engineering and learning
* Multiclass semantic video segmentation with object-level active inference
* Multihypothesis trajectory analysis for robust visual tracking
* Multiple instance learning for soft bags via top instances
* Multiple random walkers and their application to image cosegmentation
* Multispectral pedestrian detection: Benchmark dataset and baseline
* Nested motion descriptors
* Neuroaesthetics in fashion: Modeling the perception of fashionability
* New insights into Laplacian similarity search
* new retraction for accelerating the Riemannian three-factor low-rank matrix completion algorithm, A
* Non-rigid registration of images with geometric and photometric deformation by using local affine Fourier-moment matching
* novel locally linear KNN model for visual recognition, A
* Object detection by labeling superpixels
* Object proposal by multi-branch hierarchical segmentation
* Object scene flow for autonomous vehicles
* Object-based RGBD image co-segmentation with mutex constraint
* On learning optimized reaction diffusion processes for effective image restoration
* On pairwise costs for network flow multi-object tracking
* On the appearance of translucent edges
* On the minimal problems of low-rank matrix factorization
* On the relationship between visual attributes and convolutional networks
* One-day outdoor photometric stereo via skylight estimation
* Online sketching hashing
* Ontological supervision for fine grained classification of Street View storefronts
* Optimal graph learning with partial tags and multiple features for image and video annotation
* Oriented edge forests for boundary detection
* P3.5P: Pose estimation with unknown focal length
* PAIGE: PAirwise Image Geometry Encoding for improved efficiency in Structure-from-Motion
* Pairwise geometric matching for large-scale object retrieval
* Parsing occluded people by flexible compositions
* Part-based modelling of compound scenes from images
* PatchCut: Data-driven object segmentation via local shape transfer
* Pedestrian Detection Aided by Deep Learning Semantic Tasks
* Person count localization in videos from noisy foreground and detections
* Person re-identification by Local Maximal Occurrence representation and metric learning
* Phase-based frame interpolation for video
* Photometric refinement of depth maps for multi-albedo objects
* Photometric stereo with near point lighting: A solution by mesh deformation
* Picture: A probabilistic programming language for scene perception
* Pooled motion features for first-person videos
* Pose-conditioned joint angle limits for 3D human pose reconstruction
* Practical robust two-view translation estimation
* Predicting eye fixations using convolutional neural networks
* Predicting the future behavior of a time-varying probability distribution
* Prediction of search targets from fixations in open-world settings
* Privacy preserving optics for miniature vision sensors
* Probability occupancy maps for occluded depth images
* Project-Out Cascaded Regression with an application to face alignment
* Projection Metric Learning on Grassmann Manifold with Application to Video based Face Recognition
* Propagated image filtering
* Protecting against screenshots: An image processing approach
* Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A
* Query-adaptive late fusion for image search and person re-identification
* R6P: Rolling shutter absolute pose problem
* Radial distortion homography
* Random tree walk toward instantaneous 3D human pose estimation
* Ranking and retrieval of image sequences from multiple paragraph queries
* Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features
* Real-time coarse-to-fine topologically preserving segmentation
* Real-time joint estimation of camera orientation and vanishing points
* Real-time part-based visual tracking via adaptive correlation filters
* Real-time visual analysis of microvascular blood flow for critical care
* Recognize complex events from static images by fusing deep channels
* Reconstructing the world* in six days
* Recovering inner slices of translucent objects by multi-frequency illumination
* Recurrent convolutional neural network for object recognition
* Reflectance hashing for material recognition
* Reflection removal for in-vehicle black box videos
* Reflection removal using ghosting cues
* Region-based temporally consistent video post-processing
* Regularizing max-margin exemplars by reconstruction and generative models
* Reliable Patch Trackers: Robust visual tracking by exploiting reliable patches
* Rent3D: Floor-plan priors for monocular layout estimation
* Representing 3D texture on mesh manifolds for retrieval and recognition applications
* Revisiting kernelized locality-sensitive hashing for improved large-scale image retrieval
* Reweighted Laplace prior based hyperspectral compressive sensing for unknown sparsity
* RGBD-fusion: Real-time high precision depth recovery
* Riemannian coding and dictionary learning: Kernels to the rescue
* Robust camera location estimation by convex programming
* Robust image alignment with multiple feature descriptors and matching-guided neighborhoods
* Robust image filtering using joint static and dynamic guidance
* Robust large scale monocular visual SLAM
* Robust Manhattan Frame estimation from a single RGB-D image
* Robust multi-image based blind face hallucination
* Robust multiple homography estimation: An ill-solved problem
* Robust reconstruction of indoor scenes
* Robust regression on image manifolds for ordered label denoising
* Robust saliency detection via regularized random walks ranking
* Robust video segment proposals with painless occlusion handling
* Rolling shutter motion deblurring
* Rotating your face using multi-task deep neural network
* S-HOCK dataset: Analyzing crowds at the stadium, The
* SALICON: Saliency in Context
* Saliency detection by multi-context deep learning
* Saliency detection via Cellular Automata
* Saliency propagation from simple to difficult
* Saliency-aware geodesic video object segmentation
* Salient object detection via bootstrap learning
* Salient Object Subitizing
* Saturation-preserving specular reflection separation
* Scalable object detection by filter compression with regularized sparse coding
* Scalable structure from motion for densely sampled videos
* Scene classification with semantic Fisher vectors
* Scene labeling with LSTM recurrent neural networks
* Second-order constrained parametric proposals and sequential search-based structured prediction for semantic segmentation in RGB-D images
* segDeepM: Exploiting segmentation and context in deep neural networks for object detection
* Segment based 3D object shape priors
* Self Scaled Regularized Robust Regression
* Semantic alignment of LiDAR data at city scale
* Semantic object segmentation via detection in weakly labeled video
* Semantic part segmentation using compositional model combining shape and appearance
* Semantics-preserving hashing for cross-view retrieval
* Semi-supervised Domain Adaptation with Subspace Learning for visual recognition
* Semi-supervised learning with explicit relationship regularization
* Semi-Supervised Low-Rank Mapping Learning for Multi-Label Classification
* Sense discovery via co-clustering on images and text
* Separating objects and clutter in indoor scenes
* Shadow optimization from structured deep edge detection
* Shape and light directions from shading and polarization
* Shape driven kernel adaptation in Convolutional Neural Network for robust facial trait recognition
* Shape-based automatic detection of a large number of 3D facial landmarks
* Shape-from-Template in Flatland
* Shape-tailored local descriptors and their application to segmentation and tracking
* Show and tell: A neural image caption generator
* Similarity learning on an explicit polynomial kernel feature map for person re-identification
* Simplified mirror-based camera pose computation via rotation averaging
* Simulating makeup through physics-based manipulation of intrinsic image layers
* Simultaneous feature learning and hash coding with deep neural networks
* Simultaneous pose and non-rigid shape with particle dynamics
* Simultaneous Time-of-Flight sensing and photometric stereo with a single ToF sensor
* Simultaneous video defogging and stereo reconstruction
* Single image super-resolution from transformed self-exemplars
* Single target tracking using adaptive clustered decision trees and dynamic multi-level appearance models
* Single-image estimation of the camera response function in near-lighting
* Situational object boundary detection
* Sketch-based 3D shape retrieval using Convolutional Neural Networks
* Small instance detection by integer programming on object density maps
* Small-variance nonparametric clustering on the hypersphere
* Social saliency prediction
* SOLD: Sub-optimal low-rank decomposition for efficient video segmentation
* solution for multi-alignment by transformation synchronisation, A
* Solving multiple square jigsaw puzzles with missing pieces
* SOM: Semantic obviousness metric for image quality assessment
* Space-time tree ensemble for action recognition
* Sparse composite quantization
* Sparse Convolutional Neural Networks
* Sparse depth super resolution
* Sparse projections for high-dimensional binary codes
* Sparse representation classification with manifold constraints transfer
* Spherical embedding of inlier silhouette dissimilarities
* stable multi-scale kernel for topological machine learning, A
* Statistical inference models for image datasets with systematic variations
* statistical model of Riemannian metric variation for deformable shape analysis, A
* stitched puppet: A graphical model of 3D human shape and pose, The
* Structural Sparse Tracking
* Structured Sparse Subspace Clustering: A unified optimization framework
* Subgraph decomposition for multi-target tracking
* Subgraph matching using compactness prior for robust feature correspondence
* Subspace clustering by Mixture of Gaussian Regression
* SUN RGB-D: A RGB-D scene understanding benchmark suite
* Super-Resolution Person Re-Identification with Semi-Coupled Low-Rank Discriminant Dictionary Learning
* Superdifferential cuts for binary energies
* Superpixel meshes for fast edge-preserving surface reconstruction
* Superpixel segmentation using Linear Spectral Clustering
* Superpixel-based video object segmentation using perceptual organization and location prior
* Supervised descriptor learning for multi-output regression
* Supervised Discrete Hashing
* Supervised mid-level features for word image representation
* SWIFT: Sparse Withdrawal of Inliers in a First Trial
* Symmetry-based text line detection in natural scenes
* Taking a deeper look at pedestrians
* Target Identity-aware Network Flow for online multiple target tracking
* Temporally coherent interpretations for long videos using pattern theory
* Texture representations for image and video synthesis
* Three viewpoints toward exemplar SVM
* TILDE: A Temporally Invariant Learned DEtector
* Time-to-contact from image intensity
* Total variation regularization of shape signals
* Toward user-specific tracking by detection of human shapes in multi-cameras
* Towards 3D object detection with bimodal deep Boltzmann machines over RGBD imagery
* Towards force sensing from vision: Observing hand-object interactions to infer manipulation forces
* Towards Open World Recognition
* Towards unified depth and semantic prediction from a single image
* Traditional saliency reloaded: A good old model in new shape
* Transferring a semantic representation for person re-identification and search
* Transformation of Markov Random Fields for marginal distribution estimation
* Transformation-Invariant Convolutional Jungles
* Transport-based single frame super resolution of very low resolution face images
* treasure beneath convolutional layers: Cross-convolutional-layer pooling for image classification, The
* Tree quantization for large-scale similarity search and classification
* TVSum: Summarizing web videos using titles
* Uncalibrated photometric stereo based on elevation angle recovery from BRDF symmetry of isotropic materials
* Unconstrained 3D face reconstruction
* Unconstrained realtime facial performance capture
* Understanding classifier errors by examining influential neighbors
* Understanding deep image representations by inverting them
* Understanding Image Representations by Measuring Their Equivariance and Equivalence
* Understanding image structure via hierarchical shape parsing
* Understanding image virality
* Understanding pedestrian behaviors from stationary crowd groups
* Understanding tools: Task-oriented object modeling, learning and recognition
* Unifying holistic and Parts-Based Deformable Model fitting
* UniHIST: A unified framework for image restoration with marginal histogram constraints
* Unsupervised learning of complex articulated kinematic structures combining motion and skeleton information
* Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals
* Unsupervised Simultaneous Orthogonal basis Clustering Feature Selection
* Unsupervised visual alignment with similarity graphs
* Video anomaly detection and localization using hierarchical feature representation and Gaussian process regression
* Video co-summarization: Video summarization by visual co-occurrence
* Video event recognition with deep hierarchical context model
* Video magnification in presence of large motions
* Video summarization by learning submodular mixtures of objectives
* Viewpoints and keypoints
* VIP: Finding important people in images
* Virtual view networks for object reconstruction
* VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases
* Visual recognition by counting instances: A multi-instance cardinality potential kernel
* Visual recognition by learning from web data: A weakly supervised domain generalization approach
* Visual saliency based on multiscale deep features
* Visual Vibrometry: Estimating Material Properties from Small Motions in Video
* Watch and learn: Semi-supervised learning of object detectors from videos
* Watch-n-patch: Unsupervised understanding of actions and relations
* Weakly supervised localization of novel objects using appearance transfer
* Weakly supervised object detection with convex clustering
* Weakly supervised semantic segmentation for social images
* Web scale photo hash clustering on a single machine
* Web-scale training for face identification
* weighted sparse coding framework for saliency detection, A
* What do 15,000 object categories tell us about classifying and localizing actions?
* Zero-shot object recognition by semantic manifold distance
603 for CVPR15

CVPR16 * *CVPR
* 3D Action Recognition from Novel Viewpoints
* 3D Morphable Model Learnt from 10,000 Faces, A
* 3D Part-Based Sparse Tracker with Automatic Synchronization and Registration
* 3D Reconstruction of Transparent Objects with Position-Normal Consistency
* 3D Semantic Parsing of Large-Scale Indoor Spaces
* 3D Shape Attributes
* 6D Dynamic Camera Relocalization from Single Reference Image
* Accelerated Generative Models for 3D Point Cloud Data
* Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales
* Accurate Image Super-Resolution Using Very Deep Convolutional Networks
* Action Recognition in Video Using Sparse Coding and Relative Features
* Actionness Estimation Using Hybrid Fully Convolutional Networks
* Actions ~ Transformations
* Active Image Segmentation Propagation
* Active Learning for Delineation of Curvilinear Structures
* Actor-Action Semantic Segmentation with Grouping Process Models
* Adaptive 3D Face Reconstruction from Unconstrained Photo Collections
* Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
* Adaptive Object Detection Using Adjacency and Zoom Prediction
* Affinity CNN: Learning Pixel-Centric Pairwise Relations for Figure/Ground Embedding
* Aggregating Image and Text Quantized Correlated Components
* Ambiguity Helps: Classification with Disagreements in Crowdsourced Annotations
* Amplitude Modulated Video Camera: Light Separation in Dynamic Scenes
* Analyzing Classifiers: Fisher Vectors and Deep Neural Networks
* Answer-Type Prediction for Visual Question Answering
* Anticipating Visual Representations from Unlabeled Video
* Approximate Log-Hilbert-Schmidt Distances between Covariance Operators for Image Classification
* Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources
* ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks Using Angle Sensitive Pixels
* Attention to Scale: Scale-Aware Semantic Image Segmentation
* Augmented Blendshapes for Real-Time Simultaneous 3D Head Modeling and Facial Motion Capture
* Automated 3D Face Reconstruction from Multiple Images Using Quality Measures
* Automatic Content-Aware Color and Tone Stylization
* Automatic Fence Segmentation in Videos of Dynamic Scenes
* Automatic Image Cropping: A Computational Complexity Study
* Automating Carotid Intima-Media Thickness Video Interpretation with Convolutional Neural Networks
* Backtracking ScSPM Image Classifier for Weakly Supervised Top-Down Saliency
* Benchmark Dataset and Evaluation for Non-Lambertian and Uncalibrated Photometric Stereo, A
* Benchmark Dataset and Evaluation Methodology for Video Object Segmentation, A
* Beyond F-Formations: Determining Social Involvement in Free Standing Conversing Groups from Static Images
* Beyond Local Search: Tracking Objects Everywhere with Instance-Specific Proposals
* Bilateral Space Video Segmentation
* Blind Image Deblurring Using Dark Channel Prior
* Blind Image Deconvolution by Automatic Gradient Activation
* Blockout: Dynamic Model Selection for Hierarchical Deep Networks
* BORDER: An Oriented Rectangles Approach to Texture-Less Object Recognition
* Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians
* BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition
* Camera Calibration from Dynamic Silhouettes Using Motion Barcodes
* Camera Calibration from Periodic Motion of a Pedestrian
* Canny Text Detector: Fast and Robust Scene Text Localization Algorithm
* Cascaded Interactional Targeting Network for Egocentric Video Analysis
* Cataloging Public Objects Using Aerial and Street-Level Images: Urban Trees
* Cityscapes Dataset for Semantic Urban Scene Understanding, The
* Closed-Form Training of Mahalanobis Distance for Supervised Clustering
* CNN-N-Gram for Handwriting Word Recognition
* CNN-RNN: A Unified Framework for Multi-label Image Classification
* Coherent Parametric Contours for Interactive Video Object Segmentation
* Collaborative Quantization for Cross-Modal Similarity Search
* CoMaL: Good Features to Match on Object Boundaries
* Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
* Compact Bilinear Pooling
* Comparative Deep Learning of Hybrid Representations for Image Recommendations
* Comparative Study for Single Image Blind Deblurring, A
* Composition-Preserving Deep Photo Aesthetics Assessment
* Computational Imaging for VLBI Image Reconstruction
* Conditional Graphical Lasso for Multi-label Image Classification
* Conformal Surface Alignment with Optimal Möbius Search
* Consensus of Non-rigid Reconstructions
* Consensus-Based Framework for Distributed Bundle Adjustment, A
* Consistency of Silhouettes and Their Duals
* Constrained Deep Transfer Feature Learning and Its Applications
* Constrained Joint Cascade Regression Framework for Simultaneous Facial Action Unit Recognition and Facial Landmark Detection
* Constructing Canonical Regions for Fast and Effective View Selection
* Context Encoders: Feature Learning by Inpainting
* Context-Aware Gaussian Fields for Non-rigid Point Set Registration
* Continuous Occlusion Model for Road Scene Understanding, A
* Contour Detection in Unstructured 3D Point Clouds
* Convexity Shape Constraints for Image Segmentation
* Convolutional Networks for Shape from Light Field
* Convolutional Pose Machines
* Convolutional Two-Stream Network Fusion for Video Action Recognition
* Coordinating Multiple Disparity Proposals for Stereo Computation
* Copula Ordinal Regression for Joint Estimation of Facial Action Unit Intensity
* Coupled Harmonic Bases for Longitudinal Characterization of Brain Networks
* CP-mtML: Coupled Projection Multi-Task Metric Learning for Large Scale Face Retrieval
* CRAFT Objects from Images
* Cross Modal Distillation for Supervision Transfer
* Cross-Stitch Networks for Multi-task Learning
* D3: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images
* DAG-Recurrent Neural Networks for Scene Labeling
* DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation
* Deep Canonical Time Warping
* Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
* Deep Contrast Learning for Salient Object Detection
* Deep Decision Network for Multi-class Image Classification
* Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views
* Deep Gaussian Conditional Random Field Network: A Model-Based Deep Network for Discriminative Denoising
* Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data is Continuous and Weakly Labelled
* Deep Interactive Object Selection
* Deep Metric Learning via Lifted Structured Feature Embedding
* Deep Reflectance Maps
* Deep Region and Multi-label Learning for Facial Action Unit Detection
* Deep Relative Distance Learning: Tell the Difference between Similar Vehicles
* Deep Residual Learning for Image Recognition
* Deep Saliency with Encoded Low Level Distance Map and High Level Features
* Deep SimNets
* Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
* Deep Stereo: Learning to Predict New Views from the World's Imagery
* Deep Structured Scene Parsing by Learning with Image Descriptions
* Deep Supervised Hashing for Fast Image Retrieval
* DeepCAMP: Deep Convolutional Action Attribute Mid-Level Patterns
* DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation
* Deeper Look at Saliency: Feature Contrast, Semantics, and Beyond, A
* DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations
* DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks
* DeepHand: Robust Hand Pose Estimation by Completing a Matrix Imputed with Deep Features
* Deeply-Recursive Convolutional Network for Image Super-Resolution
* DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes
* Dense Human Body Correspondences Using Convolutional Networks
* Dense Monocular Depth Estimation in Complex Dynamic Scenes
* DenseCap: Fully Convolutional Localization Networks for Dense Captioning
* Depth from Semi-Calibrated Stereo and Defocus
* Detecting Events and Key Actors in Multi-person Videos
* Detecting Migrating Birds at Night
* Detecting Repeating Objects Using Patch Correlation Analysis
* Detecting Vanishing Points Using Global Image Context in a Non-Manhattan World
* Detection and Accurate Localization of Circular Fiducials under Highly Challenging Conditions
* Determining Occlusions from Space and Time Image Reconstructions
* DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection
* Dictionary Pair Classifier Driven Convolutional Neural Networks for Object Detection
* Direct Least-Squares Solution to the PnP Problem with Unknown Focal Length, A
* Direct Prediction of 3D Body Poses from Motion Compensated Sequences
* Discovering the Physical Parts of an Articulated Object Class from Multiple Videos
* Discriminative Hierarchical Rank Pooling for Activity Recognition
* Discriminative Invariant Kernel Features: A Bells-and-Whistles-Free Approach to Unsupervised Face Recognition and Pose Estimation
* Discriminative Multi-modal Feature Fusion for RGBD Indoor Scene Recognition
* Discriminatively Embedded K-Means for Multi-view Clustering
* DisturbLabel: Regularizing CNN on the Loss Layer
* Do Computational Models Differ Systematically from Human Object Perception?
* Do It Yourself Hyperspectral Imaging with Everyday Digital Cameras
* Dual-Source Approach for 3D Pose Estimation from a Single Image, A
* Dynamic Image Networks for Action Recognition
* Efficient 3D Room Shape Recovery from a Single Panorama
* Efficient and Robust Color Consistency for Community Photo Collections
* Efficient Coarse-to-Fine Patch Match for Large Displacement Optical Flow
* Efficient Deep Learning for Stereo Matching
* Efficient Exact-PGA Algorithm for Constant Curvature Manifolds, An
* Efficient Globally Optimal 2D-to-3D Deformable Shape Matching
* Efficient Indexing of Billion-Scale Datasets of Deep Descriptors
* Efficient Intersection of Three Quadrics and Applications in Computer Vision
* Efficient Large-Scale Approximate Nearest Neighbor Search on the GPU
* Efficient Large-Scale Similarity Search Using Matrix Factorization
* Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation
* Efficient Point Process Inference for Large-Scale Object Detection
* Efficient Temporal Sequence Comparison and Classification Using Gram Matrix Embeddings on a Riemannian Manifold
* Efficient Training of Very Deep Neural Networks for Supervised Hashing
* Efficiently Creating 3D Training Data for Fine Hand Pose Estimation
* Egocentric Future Localization
* Egocentric Look at Video Photographer Identity, An
* Embedding Label Structures for Fine-Grained Feature Representation
* EmotioNet: An Accurate, Real-Time Algorithm for the Automatic Annotation of a Million Facial Expressions in the Wild
* Empirical Evaluation of Current Convolutional Architectures: Ability to Manage Nuisance Location and Scale Variability, An
* End-to-End Learning of Action Detection from Frame Glimpses in Videos
* End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation
* End-to-End People Detection in Crowded Scenes
* End-to-End Saliency Mapping via Probability Distribution Prediction
* Equiangular Kernel Dictionary Learning with Applications to Dynamic Texture Analysis
* Estimating Correspondences of Deformable Objects In-the-Wild
* Estimating Sparse Signals with Smooth Support via Convex Programming and Block Sparsity
* Event-Specific Image Importance
* Exemplar-Driven Top-Down Saliency Detection via Deep Association
* Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers
* Exploit Bounding Box Annotations for Multi-Label Object Recognition
* Exploiting Spectral-Spatial Correlation for Coded Hyperspectral Image Restoration
* Eye Tracking for Everyone
* Face Alignment Across Large Poses: A 3D Solution
* Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
* Facial Expression Intensity Estimation Using Ordinal Information
* Factors in Finetuning Deep Model for Object Detection with Long-Tail Distribution
* FANNG: Fast Approximate Nearest Neighbour Graphs
* Fashion Style in 128 Floats: Joint Ranking and Classification Using Weak Data for Feature Extraction
* Fast Algorithms for Convolutional Neural Networks
* Fast Algorithms for Linear and Kernel SVM+
* Fast ConvNets Using Group-Wise Brain Damage
* Fast Detection of Curved Edges at Low SNR
* Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos
* Fast Training of Triplet-Based Deep Binary Embedding Networks
* Fast Zero-Shot Image Tagging
* Feature Space Optimization for Semantic Video Segmentation
* Field Model for Repairing 3D Shapes, A
* Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop
* Fine-Grained Image Classification by Exploring Bipartite-Graph Labels
* FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters
* First Person Action Recognition Using Deep Learned Descriptors
* Fits Like a Glove: Rapid and Reliable Hand Shape Personalization
* Force from Motion: Decoding Physical Sensation in a First Person Video
* ForgetMeNot: Memory-Aware Forensic Facial Sketch Matching
* From Bows to Arrows: Rolling Shutter Rectification of Urban Scenes
* From Dusk Till Dawn: Modeling in the Dark
* From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection
* From Noise Modeling to Blind Image Denoising
* Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids
* Functional Faces: Groupwise Dense Correspondence Using Functional Maps
* G-CNN: An Iterative Grid Based Object Detector
* Gaussian Conditional Random Field Network for Semantic Segmentation
* Generation and Comprehension of Unambiguous Object Descriptions
* Geometry-Informed Material Recognition
* Geospatial Correspondences for Multimodal Registration
* GIFT: A Real-Time and Scalable 3D Shape Search Engine
* Global Patch Collider, The
* Globally Optimal Manhattan Frame Estimation in Real-Time
* Globally Optimal Rigid Intensity Based Registration: A Fast Fourier Domain Approach
* GOGMA: Globally-Optimal Gaussian Mixture Alignment
* Going Deeper into First-Person Activity Recognition
* GraB: Visual Saliency via Novel Graph Model and Background Priors
* Gradient-Domain Image Reconstruction Framework with Intensity-Range and Base-Structure Constraints
* Gradual DropIn of Layers to Train Very Deep Neural Networks
* Gravitational Approach for Point Set Registration
* Group MAD Competition - A New Methodology to Compare Objective Image Quality Models
* Groupwise Tracking of Crowded Similar-Appearance Targets from Low-Continuity Image Sequences
* Guaranteed Outlier Removal with Mixed Integer Linear Programs
* Harnessing Object and Scene Semantics for Large-Scale Video Understanding
* HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images
* Hedged Deep Tracking
* Hedgehog Shape Priors for Multi-Object Segmentation
* Heterogeneous Light Fields
* Hierarchical Deep Temporal Model for Group Activity Recognition, A
* Hierarchical Gaussian Descriptor for Person Re-identification
* Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets, A
* Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
* Hierarchically Gated Deep Networks for Semantic Segmentation
* High-Quality Depth from Uncalibrated Small Motion Clip
* Highlight Detection with Pairwise Deep Ranking for First-Person Video Summarization
* Highway Vehicle Counting in Compressed Domain
* Hole Filling Approach Based on Background Reconstruction for View Synthesis in 3D Video, A
* Holistic Approach to Cross-Channel Image Noise Modeling and Its Application to Image Denoising, A
* Homography Estimation from the Common Self-Polar Triangle of Separate Ellipses
* How Far are We from Solving Pedestrian Detection?
* How Hard Can It Be? Estimating the Difficulty of Visual Search in an Image
* Human Pose Estimation with Iterative Error Feedback
* HyperDepth: Learning Depth from Structured Light without Matching
* HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection
* Identifying Good Training Data for Self-Supervised Free Space Estimation
* iLab-20M: A Large-Scale Controlled Object Dataset to Investigate Deep Learning
* Image Captioning with Semantic Attention
* Image Deblurring Using Smartphone Inertial Sensors
* Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction
* Image Style Transfer Using Convolutional Neural Networks
* Improved Hamming Distance Search Using Variable Length Hashing
* Improving Human Action Recognition by Non-action Classification
* Improving Person Re-identification via Pose-Aware Multi-shot Matching
* Improving the Robustness of Deep Neural Networks via Stability Training
* In Defense of Sparse Tracking: Circulant Sparse Tracker
* In the Shadows, Shape Priors Shine: Using Occlusion to Improve Multi-region Segmentation
* Incremental Object Discovery in Time-Varying Image Collections
* Inextensible Non-Rigid Shape-from-Motion by Second-Order Cone Programming
* Inferring Forces and Learning Human Utilities from Videos
* Information Bottleneck Learning Using Privileged Information for Visual Recognition
* Information-Driven Adaptive Structured-Light Scanners
* Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
* Instance-Aware Semantic Segmentation via Multi-task Network Cascades
* Instance-Level Segmentation for Autonomous Driving with Deep Densely Connected MRFs
* Instance-Level Video Segmentation from Object Tracks
* Interactive Segmentation on RGBD Images via Cue Selection
* InterActive: Inter-Layer Activeness Propagation
* Inverting Visual Representations with Convolutional Networks
* Isometric Non-rigid Shape-from-Motion in Linear Time
* Iterative Instance Segmentation
* Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification
* Joint Multiview Segmentation and Localization of RGB-D Images Using Depth-Induced Silhouette Consistency
* Joint Probabilistic Matching Using m-Best Solutions
* Joint Recovery of Dense Correspondence and Cosegmentation in Two Images
* Joint Training of Cascaded CNN for Face Detection
* Joint Unsupervised Deformable Spatio-Temporal Alignment of Sequences
* Joint Unsupervised Learning of Deep Representations and Image Clusters
* Jointly Modeling Embedding and Translation to Bridge Video and Language
* Just Look at the Image: Viewpoint-Specific Surface Normal Prediction for Improved Multi-View Reconstruction
* Kernel Approximation via Empirical Orthogonal Decomposition for Unsupervised Feature Learning
* Kernel Sparse Subspace Clustering on Symmetric Positive Definite Manifolds
* Key Volume Mining Deep Framework for Action Recognition, A
* Kinematic Structure Correspondences via Hypergraph Matching
* Laplacian Patch-Based Image Synthesis
* Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation, A
* Large Scale Hard Sample Mining with Monte Carlo Tree Search
* Large Scale Semi-Supervised Object Detection Using Visual and Semantic Knowledge Transfer
* Large-Pose Face Alignment via CNN-Based Dense 3D Model Fitting
* Large-Scale Location Recognition and the Geometric Burstiness Problem
* Large-Scale Semantic 3D Reconstruction: An Adaptive Multi-resolution Model for Multi-class Volumetric Labeling
* Latent Embeddings for Zero-Shot Classification
* Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition
* Latent Variable Graphical Model Selection Using Harmonic Analysis: Applications to the Human Connectome Project (HCP)
* Layered Scene Decomposition via the Occlusion-CRF
* Learned Binary Spectral Shape Descriptor for 3D Shape Correspondence
* Learning a Discriminative Null Space for Person Re-identification
* Learning Action Maps of Large Environments via First-Person Vision
* Learning Activity Progression in LSTMs for Activity Detection and Early Detection
* Learning Aligned Cross-Modal Representations from Weakly Aligned Data
* Learning Attributes Equals Multi-Source Domain Generalization
* Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks
* Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation
* Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification
* Learning Deep Features for Discriminative Localization
* Learning Deep Representation for Imbalanced Classification
* Learning Deep Representations of Fine-Grained Visual Descriptions
* Learning Deep Structure-Preserving Image-Text Embeddings
* Learning Dense Correspondence via 3D-Guided Cycle Consistency
* Learning from the Mistakes of Others: Matching Errors in Cross-Dataset Learning
* Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions
* Learning Multi-domain Convolutional Neural Networks for Visual Tracking
* Learning Online Smooth Predictors for Realtime Camera Planning Using Recurrent Decision Trees
* Learning Reconstruction-Based Remote Gaze Estimation
* Learning Relaxed Deep Supervision for Better Edge Detection
* Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
* Learning Structured Inference Neural Networks with Label Relations
* Learning Temporal Regularity in Video Sequences
* Learning to Assign Orientations to Feature Points
* Learning to Co-Generate Object Proposals with a Deep Structured Network
* Learning to Localize Little Landmarks
* Learning to Match Aerial Images with Deep Attentive Architectures
* Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation
* Learning to Select Pre-Trained Deep Representations with Bayesian Evidence Framework
* Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network
* Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
* Learning with Side Information through Modality Hallucination
* Learnt Quasi-Transitive Similarity for Retrieval from Large Collections of Faces
* Less is More: Zero-Shot Learning from Online Textual Documents with Noise Suppression
* Linear Shape Deformation Models with Local Support Using Graph-Based Structured Matrix Factorisation
* Local Background Enclosure for RGB-D Salient Object Detection
* LocNet: Improving Localization Accuracy for Object Detection
* Logistic Boosting Regression for Label Distribution Learning
* LOMo: Latent Ordinal Model for Facial Analysis in Videos
* Longitudinal Face Modeling via Temporal Deep Restricted Boltzmann Machines
* Loss Functions for Top-k Error: Analysis and Insights
* Macroscopic Interferometry: Rethinking Depth Estimation with Frequency-Domain Time-of-Flight
* Manifold SLIC: A Fast Method to Compute Content-Sensitive Superpixels
* Marr Revisited: 2D-3D Alignment via Surface Normal Prediction
* Material Classification Using Raw Time-of-Flight Measurements
* MCMC Shape Sampling for Image Segmentation with Nonparametric Shape Priors
* MDL-CW: A Multimodal Deep Learning Framework with CrossWeights
* MegaFace Benchmark: 1 Million Faces for Recognition at Scale, The
* Memory Efficient Max Flow for Multi-Label Submodular MRFs
* Metric Learning as Convex Combinations of Local Models with Generalization Guarantees
* Min Norm Point Algorithm for Higher Order MRF-MAP Inference
* Minimizing the Maximal Rank
* Mining 3D Key-Pose-Motifs for Action Recognition
* Mining Discriminative Triplets of Patches for Fine-Grained Classification
* Mirror Surface Reconstruction under an Uncalibrated Camera
* Mixture of Bilateral-Projection Two-Dimensional Probabilistic Principal Component Analysis
* Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment
* Modality and Component Aware Feature Fusion for RGB-D Scene Classification
* Monocular 3D Object Detection for Autonomous Driving
* Monocular Depth Estimation Using Neural Regression Forest
* Moral Lineage Tracing
* Motion from Structure (MfS): Searching for 3D Objects in Cluttered Point Trajectories
* MovieQA: Understanding Stories in Movies through Question-Answering
* MSR-VTT: A Large Video Description Dataset for Bridging Video and Language
* Multi-cue Zero-Shot Learning with Strong Supervision
* Multi-label Ranking from Positive and Unlabeled Data
* Multi-level Contextual Model for Person Recognition in Photo Albums, A
* Multi-oriented Text Detection with Fully Convolutional Networks
* Multi-scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation
* Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection, A
* Multi-view Deep Network for Cross-View Classification
* Multi-view People Tracking via Hierarchical Trajectory Composition
* Multicamera Calibration from Visible and Mirrored Epipoles
* Multilinear Hyperplane Hashing
* Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis
* Multiple Models Fitting as a Set Coverage Problem
* Multispectral Images Denoising by Intrinsic Tensor Sparsity Regularization
* Multivariate Regression on the Grassmannian for Predicting Novel Domains
* Multiverse Loss for Robust Transfer Learning, The
* Multiview Image Completion with Space Structure Propagation
* Natural Language Object Retrieval
* Needle-Match: Reliable Patch Matching under High Uncertainty
* NetVLAD: CNN Architecture for Weakly Supervised Place Recognition
* Neural Module Networks
* New Finsler Minimal Path Model with Curvature Penalization for Image Segmentation and Closed Contour Detection, A
* Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images
* Next Best Underwater View, The
* Noisy Label Recovery for Shadow Detection in Unfamiliar Domains
* Non-local Image Dehazing
* Nonlinear Regression Technique for Manifold Valued Data with Applications to Medical Image Analysis, A
* NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
* Object Co-segmentation via Graph Optimized-Flexible Manifold Ranking
* Object Contour Detection with a Fully Convolutional Encoder-Decoder Network
* Object Detection from Video Tubelets with Convolutional Neural Networks
* Object Skeleton Extraction in Natural Images by Fusing Scale-Associated Deep Side Outputs
* Object Tracking via Dual Linear Structured SVM and Explicit Feature Map
* Object-Proposal Evaluation Protocol is Gameable
* Occlusion Boundary Detection via Deep Exploration of Context
* Occlusion-Free Face Alignment: Deep Regression Networks Coupled with De-Corrupt AutoEncoders
* On Benefits of Selection Diversity via Bilevel Exclusive Sparsity
* One-Shot Learning of Scene Locations via Feature Trajectory Transfer
* Online Collaborative Learning for Open-Vocabulary Visual Classifiers
* Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks
* Online Learning with Bayesian Classification Trees
* Online Multi-object Tracking via Structural Constraint Event Aggregation
* Online Reconstruction of Indoor Scenes from RGB-D Streams
* Optical Flow with Semantic Segmentation and Localized Layers
* Optimal Relative Pose with Unknown Correspondences
* Oracle Based Active Set Algorithm for Scalable Elastic Net Subspace Clustering
* Ordinal Regression with Multiple Output CNN for Age Estimation
* Pairwise Decomposition of Image Sequences for Active Multi-view Recognition
* Pairwise Linear Regression Classification for Image Set Retrieval
* Pairwise Matching through Max-Weight Bipartite Belief Propagation
* Panoramic Stereo Videos with a Single Camera
* Paradigm for Building Generalized Models of Human Image Perception through Data Fusion, A
* Parametric Object Motion from Blur
* Part-Stacked CNN for Fine-Grained Visual Categorization
* Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification
* PatchBatch: A Batch Augmented Loss for Optical Flow
* Patches, Planes and Probabilities: A Non-Local Prior for Volumetric 3D Reconstruction
* Pedestrian Detection Inspired by Appearance Constancy and Shape Symmetry
* Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function
* Personalizing Human Video Pose Estimation
* Picking Deep Filter Responses for Fine-Grained Image Recognition
* Piecewise-Planar 3D Approximation from Wide-Baseline Stereo
* POD: Discovering Primary Objects in Videos Based on Evolutionary Refinement of Object Recurrence, Background, and Primary Object Models
* Pose-Aware Face Recognition in the Wild
* PPP: Joint Pointwise and Pairwise Image Label Prediction
* Predicting Motivations of Actions by Leveraging Text
* Predicting the Where and What of Actors and Actions through Online Action Localization
* Predicting When Saliency Maps are Accurate and Eye Fixations Consistent
* Primary Object Segmentation in Videos via Alternate Convex Optimization of Foreground and Background Distributions
* Principled Parallel Mean-Field Inference for Discrete Random Fields
* Prior-Less Compressible Structure from Motion
* Probabilistic Collaborative Representation Based Approach for Pattern Classification, A
* Probabilistic Framework for Color-Based Point Set Registration, A
* Progressive Feature Matching with Alternate Descriptor Selection and Correspondence Enrichment
* Progressive Prioritized Multi-view Stereo
* Progressively Parsing Interactional Objects for Fine Grained Action Detection
* ProNet: Learning to Propose Object-Specific Boxes for Cascaded Neural Networks
* Proposal Flow
* Proximal Riemannian Pursuit for Large-Scale Trace-Norm Minimization
* PSyCo: Manifold Span Reduction for Super Resolution
* Pull the Plug? Predicting If Computers or Humans Should Segment Images
* Quantized Convolutional Neural Networks for Mobile Devices
* RAID-G: Robust Estimation of Approximate Infinite Dimensional Gaussian with Application to Material Recognition
* Rain Streak Removal Using Layer Priors
* Random Features for Sparse Signal Classification
* RAW Image Reconstruction Using a Self-Contained sRGB-JPEG Image with Only 64 KB Overhead
* Real-Time Action Recognition with Enhanced Motion Vector CNNs
* Real-Time Depth Refinement for Specular Objects
* Real-Time Salient Object Detection with a Minimum Spanning Tree
* Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
* Recognizing Activities of Daily Living with a Wrist-Mounted Camera
* Recognizing Car Fluents from Video
* Recognizing Emotions from Abstract Paintings Using Non-Linear Matrix Completion
* Recognizing Micro-Actions and Reactions from Paired Egocentric Videos
* Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation
* ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Measurements
* Reconstructing Shapes and Appearances of Thin Film Objects Using RGB Images
* Recovering 6D Object Pose and Predicting Next-Best-View in the Crowd
* Recovering the Missing Link: Predicting Class-Attribute Associations for Unsupervised Zero-Shot Learning
* Recovering Transparent Shape from Time-of-Flight Distortion
* Recurrent Attention Models for Depth-Based Person Identification
* Recurrent Attentional Networks for Saliency Detection
* Recurrent Convolutional Network for Video-Based Person Re-identification
* Recurrent Face Aging
* Recurrently Target-Attending Tracking
* Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
* ReD-SFA: Relation Discovery Based Slow Feature Analysis for Trajectory Clustering
* Refining Architectures of Deep Convolutional Neural Networks
* Region Ranking SVM for Image Classification
* Regularity-Driven Building Facade Matching between Aerial and Street Views
* Regularizing Long Short Term Memory with 3D Human-Skeleton Sequences for Action Recognition
* Reinforcement Learning for Visual Object Detection
* Relaxation-Based Preprocessing Techniques for Markov Random Field Inference
* Removing Clouds and Recovering Ground Observations in Satellite Image Sequences via Temporally Contiguous Robust Matrix Completion
* Rethinking the Inception Architecture for Computer Vision
* Reversible Recursive Instance-Level Object Segmentation
* RIFD-CNN: Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection
* Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs
* Robust Kernel Estimation with Outliers Handling for Image Deblurring
* Robust Light Field Depth Estimation for Noisy Scene with Occlusion
* Robust Multi-Body Feature Tracker: A Segmentation-Free Approach
* Robust Multilinear Model Learning Framework for 3D Faces, A
* Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection
* Robust Scene Text Recognition with Automatic Rectification
* Robust Tensor Factorization with Unknown Noise
* Robust Visual Place Recognition with Graph Kernels
* Robust, Real-Time 3D Tracking of Multiple Objects with Similar Appearances
* Rolling Rotations for Recognizing Human Actions from 3D Skeletal Data
* Rolling Shutter Absolute Pose Problem with Known Vertical Direction
* Rolling Shutter Camera Relative Pose: Generalized Epipolar Geometry
* Rotational Crossed-Slit Light Fields
* Saliency Guided Dictionary Learning for Weakly-Supervised Image Parsing
* Saliency Unified: A Deep Architecture for Simultaneous Eye Fixation Prediction and Salient Object Segmentation
* Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering
* Sample-Specific SVM Learning for Person Re-identification
* Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit
* Scale-Aware Alignment of Hierarchical Image Segmentation
* Scene Labeling Using Sparse Precision Matrix
* Scene Recognition with CNNs: Objects, Scales and Dataset Bias
* ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation
* Seeing Behind the Camera: Identifying the Authorship of a Photograph
* Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels
* Self-Adaptive Matrix Completion for Heart Rate Estimation from Face Videos under Realistic Conditions
* Semantic 3D Reconstruction with Continuous Regularization and Ray Potentials Using a Visibility Consistency Constraint
* Semantic Channels for Fast Pedestrian Detection
* Semantic Filtering
* Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform
* Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer
* Semantic Object Parsing with Local-Global Long Short-Term Memory
* Semantic Segmentation with Boundary Neural Fields
* Semi-supervised Vocabulary-Informed Learning
* SemiContour: A Semi-Supervised Learning Approach for Contour Detection
* Seven Ways to Improve Example-Based Single Image Super Resolution
* Shallow and Deep Convolutional Networks for Saliency Prediction
* Shape Analysis with Hyperbolic Wasserstein Distance
* Shortlist Selection with Residual-Aware Distance Estimator for K-Nearest Neighbor Search
* Siamese Instance Search for Tracking
* Similarity Learning with Spatial Constraints for Person Re-identification
* Similarity Metric for Curved Shapes in Euclidean Space
* Simultaneous Clustering and Model Selection for Tensor Affinities
* Simultaneous Estimation of Near IR BRDF and Fine-Scale Surface Geometry
* Simultaneous Optical Flow and Intensity Estimation from an Event Camera
* Single Image Camera Calibration with Lenticular Arrays for Augmented Reality
* Single Image Object Modeling Based on BRDF and r-Surfaces Learning
* Single-Image Crowd Counting via Multi-Column Convolutional Neural Network
* Situation Recognition: Visual Semantic Role Labeling for Image Understanding
* Sketch Me That Shoe
* SketchNet: Sketch Classification with Web Images
* Sliced Wasserstein Kernels for Probability Distributions
* Slicing Convolutional Neural Network for Crowd Video Understanding
* Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video
* Social LSTM: Human Trajectory Prediction in Crowded Spaces
* Soft-Segmentation Guided Object Motion Deblurring
* Solution Path Algorithm for Identity-Aware Multi-object Tracking, The
* Solving Small-Piece Jigsaw Puzzles by Growing Consensus
* Solving Temporal Puzzles
* Some Like It Hot: Visual Guidance for Preference Prediction
* Sparse Coding and Dictionary Learning with Linear Dynamical Systems
* Sparse Coding for Classification via Discrimination Ensemble
* Sparse Coding for Third-Order Super-Symmetric Tensor Descriptors with Application to Texture Recognition
* Sparse to Dense 3D Reconstruction from Rolling Shutter Images
* Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video
* Sparsifying Neural Network Connections for Face Recognition
* Spatially Binned ROC: A Comprehensive Saliency Metric
* Spatiotemporal Bundle Adjustment for Dynamic 3D Reconstruction
* SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition
* Split and Match: Example-Based Adaptive Patch Sampling for Unsupervised Style Transfer
* Stacked Attention Networks for Image Question Answering
* Staple: Complementary Learners for Real-Time Tracking
* STCT: Sequentially Training Convolutional Networks for Visual Tracking
* Stereo Matching with Color and Monochrome Cameras in Low-Light Conditions
* Structural Correlation Filter for Robust Visual Tracking
* Structural-RNN: Deep Learning on Spatio-Temporal Graphs
* Structure from Motion with Objects
* Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition
* Structure-from-Motion Revisited
* Structured Feature Learning for Pose Estimation
* Structured Feature Similarity with Explicit Feature Map
* Structured Prediction of Unobserved Voxels from a Single Depth Image
* Structured Receptive Fields in CNNs
* Structured Regression Gradient Boosting
* Studying Very Low Resolution Recognition Using Deep Networks
* Sublabel-Accurate Relaxation of Nonconvex Energies
* Subspace Clustering with Priors via Sparse Quadratically Constrained Quadratic Programming
* Summary Transfer: Exemplar-Based Subset Selection for Video Summarization
* Supervised Quantization for Similarity Search
* SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras
* Symmetry reCAPTCHA
* Synthesized Classifiers for Zero-Shot Learning
* Synthetic Data for Text Localisation in Natural Images
* SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes, The
* Task-Oriented Approach for Cost-Sensitive Recognition, A
* Temporal Action Detection Using a Statistical Language Model
* Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs
* Temporal Action Localization with Pyramid of Score Distribution Features
* Temporal Epipolar Regions
* Temporal Multimodal Learning in Audiovisual Speech Recognition
* Temporally Coherent 4D Reconstruction of Complex Dynamic Scenes
* Tensor Power Iteration for Multi-graph Matching
* Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization
* TenSR: Multi-dimensional Tensor Sparse Representation
* Text Detection System for Natural Scenes with Convolutional Feature Learning and Cascaded Classification, A
* TGIF: A New Dataset and Benchmark on Animated GIF Description
* Theory and Practice of Structure-From-Motion Using Affine Correspondences
* They are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers
* Thin-Slicing for Pose: Learning to Understand Pose without Explicit Pose Estimation
* Three-Dimensional Object Detection and Layout Prediction Using Clouds of Oriented Gradients
* TI-POOLING: Transformation-Invariant Pooling for Feature Learning in Convolutional Neural Networks
* Top-Push Video-Based Person Re-identification
* Towards Open Set Deep Networks
* Trace Quotient Meets Sparsity: A Method for Learning Low Dimensional Image Representations
* Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals
* Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
* Traffic-Sign Detection and Classification in the Wild
* Training Region-Based Object Detectors with Online Hard Example Mining
* Trust No One: Low Rank Matrix Factorization Using Hierarchical RANSAC
* Two Illuminant Estimation and User Correction Preference
* UAV Sensor Fusion with Latent-Dynamic Conditional Random Fields in Coronal Plane Estimation
* Unbiased Photometric Stereo for Colored Surfaces: A Variational Approach
* Uncalibrated Photometric Stereo by Stepwise Optimization Using Principal Components of Isotropic BRDFs
* Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image
* Unconstrained Face Alignment via Cascaded Compositional Learning
* Unconstrained Salient Object Detection via Proposal Subset Optimization
* Understanding Real World Indoor Scenes with Synthetic Data
* Unsupervised Cross-Dataset Transfer Learning for Person Re-identification
* Unsupervised Learning from Narrated Instruction Videos
* Unsupervised Learning of Discriminative Attributes and Visual Representations
* Unsupervised Learning of Edges
* Using Self-Contradiction to Learn Confidence Measures in Stereo Vision
* Using Spatial Order to Boost the Elimination of Incorrect Feature Matches
* Variable Aperture Light Field Photography: Overcoming the Diffraction-Limited Spatio-Angular Resolution Tradeoff
* Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
* Video Segmentation via Object Flow
* Video-Story Composition via Plot Analysis
* Video2GIF: Automatic Generation of Animated GIFs from Video
* Virtual Worlds as Proxy for Multi-object Tracking Analysis
* Visual Path Prediction in Complex Scenes with Crowded Moving Objects
* Visual Tracking Using Attention-Modulated Disintegration and Integration
* Visual7W: Grounded Question Answering in Images
* Visualizing and Understanding Deep Texture Representations
* Visually Indicated Sounds
* VisualWord2Vec (Vis-W2V): Learning Visually Grounded Word Embeddings Using Abstract Scenes
* VLAD3: Encoding Dynamics of Deep Features for Action Recognition
* Volumetric 3D Tracking by Detection
* Volumetric and Multi-view CNNs for Object Classification on 3D Data
* Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data
* WarpNet: Weakly Supervised Matching for Single-View Reconstruction
* We are Humor Beings: Understanding and Predicting Visual Humor
* We Don't Need No Bounding-Boxes: Training Object Class Detectors Using Only Human Verification
* Weakly Supervised Deep Detection Networks
* Weakly Supervised Object Boundaries
* Weakly Supervised Object Localization with Progressive Domain Adaptation
* Weighted Variational Model for Simultaneous Reflectance and Illumination Estimation, A
* WELDON: Weakly Supervised Learning of Deep Convolutional Neural Networks
* What If We Do Not Have Multiple Videos of the Same Action? Video Action Localization Using Web Images
* What Players Do with the Ball: A Physically Constrained Interaction Modeling
* What Sparse Light Field Coding Reveals about Scene Structure
* What Value Do Explicit High Level Concepts Have in Vision to Language Problems?
* What's Wrong with That Object? Identifying Images of Unusual Objects by Modelling the Detection Score Distribution
* When Naive Bayes Nearest Neighbors Meet Convolutional Neural Networks
* When VLAD Met Hilbert
* Where to Look: Focus Regions for Visual Question Answering
* WIDER FACE: A Face Detection Benchmark
* Yin and Yang: Balancing and Answering Binary Visual Questions
* You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images
* You Only Look Once: Unified, Real-Time Object Detection
* Zero-Shot Learning via Joint Latent Similarity Embedding
644 for CVPR16

CVPR17 * *CVPR
* 3D Bounding Box Estimation Using Deep Learning and Geometry
* 3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images
* 3D Face Morphable Models In-the-Wild
* 3D Human Pose Estimation = 2D Pose Estimation + Matching
* 3D Human Pose Estimation from a Single Image via Distance Matrix Regression
* 3D Menagerie: Modeling the 3D Shape and Pose of Animals
* 3D Point Cloud Registration for Localization Using a Deep Neural Network Auto-Encoder
* 3D Shape Segmentation with Projective Convolutional Networks
* 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions
* 4D Light Field Superpixel and Segmentation
* A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
* A-Lamp: Adaptive Layout-Aware Multi-patch Deep Convolutional Neural Network for Photo Aesthetic Assessment
* Accurate Depth and Normal Maps from Occlusion-Aware Focal Stack Symmetry
* Accurate Optical Flow via Direct Cost Volume Processing
* Accurate Single Stage Detector Using Recurrent Rolling Convolution
* Acquiring Axially-Symmetric Transparent Objects Using Single-View Transmission Imaging
* Action Unit Detection with Region Adaptation, Multi-labeling Learning and Optimal Temporal Fusing
* Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning
* ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification
* Active Convolution: Learning the Shape of Convolution for Image Classification
* Adaptive and Move Making Auxiliary Cuts for Binary Pairwise Energies
* Adaptive Class Preserving Representation for Image Classification
* AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos
* Additive Component Analysis
* Adversarial Discriminative Domain Adaptation
* Adversarially Tuned Scene Generation
* AGA: Attribute-Guided Augmentation
* Age Progression/Regression by Conditional Adversarial Autoencoder
* Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization
* Aggregated Residual Transformations for Deep Neural Networks
* All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation
* Alternating Direction Graph Matching
* Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives, The
* AMC: Attention Guided Multi-modal Correlation Learning for Image Search
* Amodal Detection of 3D Objects: Inferring 3D Bounding Boxes from 2D Ones in RGB-Depth Images
* AMVH: Asymmetric Multi-Valued Hashing
* Analyzing Computer Vision Data: The Good, the Bad and the Ugly
* AnchorNet: A Weakly Supervised Network to Learn Geometry-Sensitive Features for Semantic Matching
* Annotating Object Instances with a Polygon-RNN
* Anti-Glare: Tightly Constrained Optimization for Eyeglass Reflection Removal
* Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization?
* Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
* ArtTrack: Articulated Multi-Person Tracking in the Wild
* Asymmetric Feature Maps with Application to Sketch Based Retrieval
* Asynchronous Temporal Fields for Action Recognition
* Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning from Web Data
* Attend to You: Personalized Image Captioning with Context Sequence Memory Networks
* Attention-Aware Face Hallucination via Deep Reinforcement Learning
* Attentional Correlation Filter Network for Adaptive Visual Tracking
* Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes
* Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories
* Automatic Understanding of Image and Video Advertisements
* Awesome Typography: Statistics-Based Text Effects Transfer
* Bayesian Supervised Hashing
* Benchmarking Denoising Algorithms with Real Photographs
* Beyond Instance-Level Image Retrieval: Leveraging Captions to Learn a Global Visual Representation for Semantic Retrieval
* Beyond Triplet Loss: A Deep Quadruplet Network for Person Re-identification
* Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
* Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
* BigHand2.2M Benchmark: Hand Pose Dataset and State of the Art Analysis
* Binarized Mode Seeking for Scalable Visual Pattern Discovery
* Binary Coding for Partial Action Analysis with Limited Observation Ratios
* Binary Constraint Preserving Graph Matching
* BIND: Binary Integrated Net Descriptors for Texture-Less Object Recognition
* Binge Watching: Scaling Affordance Learning from Sitcoms
* Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-Tuning
* Boundary-Aware Instance Segmentation
* BranchOut: Regularization for Online Ensemble Tracking with Convolutional Neural Networks
* BRISKS: Binary Features for Spherical Images on a Geodesic Grid
* Budget-Aware Deep Semantic Video Segmentation
* Building a Regular Decision Boundary with Deep Networks
* Can Walking and Measuring Along Chord Bunches Better Describe Leaf Shapes?
* Captioning Images with Diverse Objects
* CASENet: Deep Category-Aware Semantic Edge Detection
* CATS: A Color and Thermal Stereo Benchmark
* CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos
* CERN: Confidence-Energy Recurrent Network for Group Activity Recognition
* ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
* CityPersons: A Diverse Dataset for Pedestrian Detection
* Clever Elimination Strategy for Efficient Minimal Solvers, A
* CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
* CLKN: Cascaded Lucas-Kanade Networks for Image Alignment
* CNN-Based Patch Matching for Optical Flow with Thresholded Hinge Embedding Loss
* CNN-SLAM: Real-Time Dense Monocular SLAM with Learned Depth Prediction
* Co-occurrence Filter
* Coarse-to-Fine Segmentation with Shape-Tailored Continuum Scale Spaces
* Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose
* Cognitive Mapping and Planning for Visual Navigation
* Collaborative Deep Reinforcement Learning for Joint Object Search
* Collaborative Summarization of Topic-Related Videos
* Colorization as a Proxy Task for Visual Understanding
* Combinatorial Solution to Non-Rigid 3D Shape-to-Image Matching, A
* Combining Bottom-Up, Top-Down, and Smoothness Cues for Weakly Supervised Image Segmentation
* Commonly Uncommon: Semantic Sparsity in Situation Recognition
* Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation, A
* Compact Matrix Factorization with Dependent Subspaces
* Comparative Evaluation of Hand-Crafted and Learned Local Features
* Comprehension-Guided Referring Expressions
* Computational Imaging on the Electric Grid
* Conditional Similarity Networks
* Connecting Look and Feel: Associating the Visual and Tactile Properties of Physical Materials
* Consensus Maximization with Linear Matrix Inequality Constraints
* Consistent-Aware Deep Learning for Person Re-identification in a Camera Network
* Context-Aware Captions from Context-Agnostic Supervision
* Context-Aware Correlation Filter Tracking
* Contour-Constrained Superpixels for Image and Video Processing
* Controlling Perceptual Factors in Neural Style Transfer
* Convex Global 3D Registration with Lagrangian Duality
* Convolutional Neural Network Architecture for Geometric Matching
* Convolutional Random Walk Networks for Semantic Image Segmentation
* Correlational Gaussian Processes for Cross-Domain Visual Recognition
* Counting Everyday Objects in Everyday Scenes
* Creativity: Generating Diverse Questions Using Variational Autoencoders
* Cross-Modality Binary Code Learning via Fusion Similarity Hashing
* Cross-View Image Matching for Geo-Localization in Urban Environments
* Crossing Nets: Combining GANs and VAEs with a Shared Latent Space for Hand Pose Estimation
* Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering, A
* Dataset for Benchmarking Image-Based Localization, A
* Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos
* Deep Affordance-Grounded Sensorimotor Object Recognition
* Deep Co-occurrence Feature Learning for Visual Object Recognition
* Deep Crisp Boundaries
* Deep Cross-Modal Hashing
* Deep Feature Flow for Video Recognition
* Deep Feature Interpolation for Image Content Changes
* Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks
* Deep Hashing Network for Unsupervised Domain Adaptation
* Deep Image Harmonization
* Deep Image Matting
* Deep Joint Rain Detection and Removal from a Single Image
* Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution
* Deep Learning Human Mind for Automated Visual Classification
* Deep Learning of Human Visual Sensitivity in Image Quality Assessment Framework
* Deep Learning on Lie Groups for Skeleton-Based Action Recognition
* Deep Learning with Low Precision by Half-Wave Gaussian Quantization
* Deep Level Sets for Salient Object Detection
* Deep MANTA: A Coarse-to-Fine Many-Task Network for Joint 2D and 3D Vehicle Analysis from Monocular Image
* Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection
* Deep Metric Learning via Facility Location
* Deep Mixture of Linear Inverse Regressions Applied to Head-Pose Estimation
* Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring
* Deep Multimodal Representation Learning from Temporal Data
* Deep Multitask Architecture for Integrated 2D and 3D Human Sensing
* Deep Network Flow for Multi-object Tracking
* Deep Outdoor Illumination Estimation
* Deep Photo Style Transfer
* Deep Pyramidal Residual Networks
* Deep Quantization: Encoding Convolutional Activations with Deep Generative Model
* Deep Regression Architecture with Two-Stage Re-initialization for High Performance Facial Landmark Detection, A
* Deep Reinforcement Learning-Based Image Captioning with Embedding Reward
* Deep Representation Learning for Human Motion Prediction and Classification
* Deep Roots: Improving CNN Efficiency with Hierarchical Filter Groups
* Deep Self-Taught Learning for Weakly Supervised Object Localization
* Deep Semantic Feature Matching
* Deep Sequential Context Networks for Action Prediction
* Deep Sketch Hashing: Fast Free-Hand Sketch-Based Image Retrieval
* Deep Structured Learning for Facial Action Unit Intensity Estimation
* Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing
* Deep Temporal Linear Encoding Networks
* Deep TEN: Texture Encoding Network
* Deep Unsupervised Similarity Learning Using Partially Ordered Sets
* Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection
* Deep Video Deblurring for Hand-Held Cameras
* Deep View Morphing
* Deep Visual-Semantic Quantization for Efficient Image Retrieval
* Deep Watershed Transform for Instance Segmentation
* Deeply Aggregated Alternating Minimization for Image Restoration
* DeepNav: Learning to Navigate Large Cities
* DeepPermNet: Visual Permutation Learning
* DeLiGAN: Generative Adversarial Networks for Diverse and Limited Data
* DeMoN: Depth and Motion Network for Learning Monocular Stereo
* Dense Captioning with Joint Inference and Visual Context
* Densely Connected Convolutional Networks
* DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild
* Depth from Defocus in the Wild
* DeshadowNet: A Multi-context Embedding Deep Network for Shadow Removal
* Designing Effective Inter-Pixel Information Flow for Natural Image Matting
* Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning
* Designing Illuminant Spectral Power Distributions for Surface Classification
* DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents
* Detailed, Accurate, Human Shape Estimation from Clothed 3D Scan Sequences
* Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly
* Detect, Replace, Refine: Deep Structured Prediction for Pixel Wise Labeling
* Detecting Masked Faces in the Wild with LLE-CNNs
* Detecting Oriented Text in Natural Images by Linking Segments
* Detecting Visual Relationships with Deep Relational Networks
* Differential Angular Imaging for Material Recognition
* Dilated Residual Networks
* Direct Photometric Alignment by Mesh Deformation
* Discover and Learn New Objects from Documentaries
* Discovering Causal Signals in Images
* Discretely Coding Semantic Rank Orders for Supervised Image Hashing
* Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
* Discriminative Correlation Filter with Channel and Spatial Reliability
* Discriminative Covariance Oriented Representation Learning for Face Recognition with Image Sets
* Discriminative Optimization: Theory and Applications to Point Cloud Registration
* Disentangled Representation Learning GAN for Pose-Invariant Face Recognition
* Distinguishing the Indistinguishable: Exploring Structural Ambiguities via Geodesic Context
* Diverse Image Annotation
* Diversified Texture Synthesis with Feed-Forward Networks
* Domain Adaptation by Mixture of Alignments of Second- or Higher-Order Scatter Tensors
* Domain Based Approach to Social Relation Recognition, A
* DOPE: Distributed Optimization for Pairwise Energies
* DSAC: Differentiable RANSAC for Camera Localization
* Dual Ascent Framework for Lagrangean Decomposition of Combinatorial Problems, A
* Dual Attention Networks for Multimodal Reasoning and Matching
* DUST: Dual Union of Spatio-Temporal Subspaces for Monocular Multiple Object 3D Reconstruction
* Dynamic Attention-Controlled Cascaded Shape Regression Exploiting Training Data Augmentation and Fuzzy-Set Sample Weighting
* Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs
* Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network
* Dynamic FAUST: Registering Human Bodies in Motion
* Dynamic Time-of-Flight
* EAST: An Efficient and Accurate Scene Text Detector
* ECO: Efficient Convolution Operators for Tracking
* Efficient Algebraic Solution to the Perspective-Three-Point Problem, An
* Efficient Background Term for 3D Reconstruction and Tracking with Smooth Surface Models, An
* Efficient Diffusion on Region Manifolds: Recovering Small Objects with Compact CNN Representations
* Efficient Global Point Cloud Alignment Using Bayesian Nonparametric Mixtures
* Efficient Linear Programming for Dense CRFs
* Efficient Multiple Instance Metric Learning Using Weakly Supervised Data
* Efficient Optimization for Hierarchically-Structured Interacting Segments (HINTS)
* Efficient Solvers for Minimal Problems by Syzygy-Based Reduction
* Elastic Shape-from-Template with Spatially Sparse Deforming Forces
* Emotion Recognition in Context
* Empirical Evaluation of Visual Question Answering for Novel Objects, An
* End-to-End 3D Face Reconstruction with Deep Neural Networks
* End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering
* End-to-End Instance Segmentation with Recurrent Attention
* End-to-End Learning of Driving Models from Large-Scale Video Datasets
* End-to-End Representation Learning for Correlation Filter Based Tracking
* End-to-End Training of Hybrid CNN-CRF Models for Stereo
* Enhancing Video Summarization via Vision-Language Embedding
* Episodic CAMN: Contextual Attention-Based Memory Networks with Iterative Feedback for Scene Labeling
* ER3: A Unified Framework for Event Retrieval, Recognition and Recounting
* Event-Based Visual Inertial Odometry
* Exact Penalty Method for Locally Convergent Maximum Consensus, An
* Exclusivity-Consistency Regularized Multi-view Subspace Clustering
* Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters
* Expert Gate: Lifelong Learning with a Network of Experts
* Exploiting 2D Floorplan for Building-Scale Panorama RGBD Alignment
* Exploiting Saliency for Object Segmentation from Image Level Labels
* Exploiting Symmetry and/or Manhattan Properties for 3D Object Structure Estimation from Single and Multiple Images
* Face Normals In-the-Wild Using Fully Convolutional Networks
* Factorized Variational Autoencoders for Modeling Audience Reactions to Movies
* FASON: First and Second Order Information Fusion Network for Texture Recognition
* Fast 3D Reconstruction of Faces with Glasses
* Fast Boosting Based Detection Using Scale Invariant Multimodal Multiresolution Filtered Features
* Fast Fourier Color Constancy
* Fast Haze Removal for Nighttime Image Using Maximum Reflectance Prior
* Fast Multi-frame Stereo Scene Flow with Motion Segmentation
* Fast Person Re-identification via Cross-Camera Semantic Binary Transformation
* Fast Video Classification via Adaptive Cascading of Deep Models
* Fast-At: Fast Automatic Thumbnail Generation Using Deep Neural Networks
* FastMask: Segment Multi-scale Object Candidates in One Shot
* FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence
* FC^4: Fully Convolutional Color Constancy with Confidence-Weighted Pooling
* Feature Pyramid Networks for Object Detection
* Feedback Networks
* Few-Shot Object Recognition from Machine-Labeled Web Images
* FFTLasso: Large-Scale LASSO in the Fourier Domain
* Filter Flow Made Practical: Massively Parallel and Lock-Free
* Finding Tiny Faces
* Fine-Grained Image Classification via Combining Vision and Language
* Fine-Grained Recognition as HSnet Search for Informative Image Parts
* Fine-Grained Recognition of Thousands of Object Categories with Single-Example Training
* Fine-to-Coarse Global Registration of RGB-D Scans
* Fine-Tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally
* Fixed-Point Factorized Networks
* Flexible Spatio-Temporal Networks for Video Prediction
* Flight Dynamics-Based Recovery of a UAV Trajectory Using Ground Cameras
* FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
* Forecasting Human Dynamics from Static Images
* Forecasting Interactive Dynamics of Pedestrians with Fictitious Play
* Fractal Dimension Invariant Filtering and Its CNN-Based Implementation
* Fried Binary Embedding for High-Dimensional Visual Features
* From Local to Global: Edge Profiles to Camera Motion in Blurred Images
* From Motion Blur to Motion Flow: A Deep Learning Solution for Removing Heterogeneous Motion Blur
* From Red Wine to Red Tomato: Composition with Context
* From Zero-Shot Learning to Conventional Supervised Classification: Unseen Visual Data Synthesis
* Full Resolution Image Compression with Recurrent Neural Networks
* Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
* Fully Convolutional Instance-Aware Semantic Segmentation
* Fully-Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification
* FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Videos
* G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition
* Gated Feedback Refinement Network for Dense Image Labeling
* Gaze Embeddings for Zero-Shot Image Classification
* General Framework for Curve and Surface Comparison and Registration with Oriented Varifolds, A
* General Models for Rational Cameras and the Case of Two-Slit Projections
* Generalized Deep Image to Image Regression
* Generalized Rank Pooling for Activity Recognition
* Generalized Semantic Preserving Hashing for N-Label Cross-Modal Retrieval
* Generating Descriptions with Grounded and Co-referenced People
* Generating Holistic 3D Scene Abstractions for Text-Based Image Retrieval
* Generating the Future with Adversarial Transformers
* Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks
* Generative Face Completion
* Generative Hierarchical Learning of Sparse FRAME Models
* Generative Model for Depth-Based Robust 3D Facial Pose Tracking, A
* Geodesic Distance Descriptors
* Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs
* Geometric Loss Functions for Camera Pose Regression with Deep Learning
* Geometry of First-Returning Photons for Non-Line-of-Sight Imaging, The
* Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, A
* Global Context-Aware Attention LSTM Networks for 3D Action Recognition
* Global Hypothesis Generation for 6D Object Pose Estimation
* Global Optimality in Neural Network Training
* GMS: Grid-Based Motion Statistics for Fast, Ultra-Robust Feature Correspondence
* Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning, A
* Graph-Structured Representations for Visual Question Answering
* Grassmannian Manifold Optimization Assisted Sparse Spectral Clustering
* Group-Wise Point-Set Registration Based on Renyi's Second Order Entropy
* Growing a Brain: Fine-Tuning by Increasing Model Capacity
* GuessWhat?! Visual Object Discovery through Multi-modal Dialogue
* H-Patches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors
* Hallucinating Very Low-Resolution Unaligned and Noisy Face Images by Transformative Discriminative Autoencoders
* Hand Keypoint Detection in Single Images Using Multiview Bootstrapping
* Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
* Hardware-Efficient Guided Image Filtering for Multi-label Problem
* Harmonic Networks: Deep Translation and Rotation Equivariance
* Harvesting Multiple Views for Marker-Less 3D Human Pose Annotations
* Hidden Layers in Perceptual Learning
* Hierarchical Approach for Generating Descriptive Image Paragraphs, A
* Hierarchical Boundary-Aware Neural Encoder for Video Captioning
* Hierarchical Multimodal Metric Learning for Multimodal Classification
* High-Resolution Image Inpainting Using Multi-scale Neural Patch Synthesis
* HOPE: Hierarchical Object Prototype Encoding for Efficient Object Instance Search in Videos
* HSfM: Hybrid Structure-from-Motion
* Human Shape from Silhouettes Using Generative HKS Descriptors and Cross-Modal Neural Networks
* Hyper-Laplacian Regularized Unidirectional Low-Rank Tensor Recovery for Multispectral Image Denoising
* Hyperspectral Image Super-Resolution via Non-local Sparse Tensor Factorization
* iCaRL: Incremental Classifier and Representation Learning
* Identifying First-Person Camera Wearers in Third-Person Videos
* Illuminant-Camera Communication to Observe Moving Objects under Strong External Light by Spread Spectrum Modulation
* IM2CAD
* Image Deblurring via Extreme Channels Prior
* Image Splicing Detection via Camera Response Function Analysis
* Image Super-Resolution via Deep Recursive Residual Network
* Image-to-Image Translation with Conditional Adversarial Networks
* Impact of Typicality for Informative Representative Selection, The
* Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning
* Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis
* Improving Facial Attribute Prediction Using Semantic Segmentation
* Improving Interpretability of Deep Neural Networks with Semantic Information
* Improving Pairwise Ranking for Multi-label Image Classification
* Improving RANSAC-Based Segmentation through CNN Encapsulation
* Improving Training of Deep Neural Networks via Singular Value Bounding
* Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects
* Incremental Kernel Null Space Discriminant Analysis for Novelty Detection
* Incremental Multiresolution Matrix Factorization Algorithm, The
* Indoor Scene Parsing with Instance Segmentation, Semantic Labeling and Support Relationship Inference
* Infinite Variational Autoencoder for Semi-Supervised Learning
* Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM
* Instance-Level Salient Object Segmentation
* InstanceCut: From Edges to Instances with MultiCut
* InterpoNet, a Brain Inspired Neural Network for Optical Flow Dense Interpolation
* Interpretable Structure-Evolving LSTM
* Interspecies Knowledge Transfer for Facial Keypoint Detection
* Intrinsic Grassmann Averages for Online Linear and Robust Subspace Learning
* Inverse Compositional Spatial Transformer Networks
* IRINA: Iris Recognition (Even) in Inaccurately Segmented Data
* Joint Detection and Identification Feature Learning for Person Search
* Joint Discriminative Bayesian Dictionary and Classifier Learning
* Joint Gap Detection and Inpainting of Line Drawings
* Joint Geometrical and Statistical Alignment for Visual Domain Adaptation
* Joint Graph Decomposition & Node Labeling: Problem, Algorithms, Applications
* Joint Intensity and Spatial Metric Learning for Robust Gait Recognition
* Joint Multi-person Pose Estimation and Semantic Part Segmentation
* Joint Registration and Representation Learning for Unconstrained Face Identification
* Joint Sequence Learning and Cross-Modality Convolution for 3D Biomedical Segmentation
* Joint Speaker-Listener-Reinforcer Model for Referring Expressions, A
* Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals
* Kernel Pooling for Convolutional Neural Networks
* Kernel Square-Loss Exemplar Machines for Image Retrieval
* KillingFusion: Non-rigid 3D Reconstruction without Correspondences
* Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning
* Knowledge Acquisition for Visual Question Answering via Iterative Querying
* L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space
* Large Kernel Matters: Improve Semantic Segmentation by Global Convolutional Network
* Large Margin Object Tracking with Circulant Feature Maps
* Latent Multi-view Subspace Clustering
* LCNN: Lookup-Based Convolutional Neural Network
* LCR-Net: Localization-Classification-Regression for Human Pose
* Lean Crowdsourcing: Combining Humans and Machines in an Online System
* Learned Contextual Feature Reweighting for Image Geo-Localization
* Learning a Deep Embedding Model for Zero-Shot Learning
* Learning Adaptive Receptive Fields for Deep Image Parsing Network
* Learning an Invariant Hilbert Space for Domain Adaptation
* Learning and Refining of Privileged Information-Based RNNs for Action Recognition from Depth Sequences
* Learning Barycentric Representations of 3D Shapes for Sketch-Based 3D Shape Retrieval
* Learning by Association: A Versatile Semi-Supervised Training Method for Neural Networks
* Learning Category-Specific 3D Shape Models from Weakly Labeled 2D Images
* Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
* Learning Cross-Modal Embeddings for Cooking Recipes and Food Images
* Learning Deep Binary Descriptor with Multi-Quantization
* Learning Deep CNN Denoiser Prior for Image Restoration
* Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification
* Learning Deep Match Kernels for Image-Set Classification
* Learning Detailed Face Reconstruction from a Single Image
* Learning Detection with Diverse Proposals
* Learning Discriminative and Transformation Covariant Local Feature Detectors
* Learning Diverse Image Colorization
* Learning Dynamic Guidance for Depth Image Enhancement
* Learning Features by Watching Objects Move
* Learning from Noisy Large-Scale Datasets with Minimal Supervision
* Learning from Simulated and Unsupervised Images through Adversarial Training
* Learning from Synthetic Humans
* Learning Fully Convolutional Networks for Iterative Non-blind Deconvolution
* Learning Motion Patterns in Videos
* Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks
* Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories
* Learning Non-maximum Suppression
* Learning Object Interactions and Descriptions for Semantic Image Segmentation
* Learning Random-Walk Label Propagation for Weakly-Supervised Semantic Segmentation
* Learning Residual Images for Face Attribute Manipulation
* Learning Shape Abstractions by Assembling Volumetric Primitives
* Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification
* Learning the Multilinear Structure of Visual Data
* Learning to Align Semantic Segmentation and 2.5D Maps for Geolocalization
* Learning to Detect Salient Objects with Image-Level Supervision
* Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks
* Learning to Learn from Noisy Web Videos
* Learning to Predict Stereo Reliability Enforcing Local Consistency of Confidence Maps
* Learning to Rank Retargeted Images
* Learning Video Object Segmentation from Static Images
* Level Playing Field for Million Scale Face Recognition
* Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image
* Light Field Blind Motion Deblurring
* Light Field Reconstruction Using Deep Convolutional Network on EPI
* Linear Extrinsic Calibration of Kaleidoscopic Imaging System from Single 3D Point, A
* Link the Head to the Beak: Zero Shot Learning from Noisy Text Description at Part Precision
* Linking Image and Text with 2-Way Nets
* Lip Reading Sentences in the Wild
* Local Binary Convolutional Neural Networks
* Locality-Sensitive Deconvolution Networks with Gated Fusion for RGB-D Indoor Semantic Segmentation
* Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition
* Look into Person: Self-Supervised Structure-Sensitive Learning and a New Benchmark for Human Parsing
* Loss Max-Pooling for Semantic Image Segmentation
* Low Power, Fully Event-Based Gesture Recognition System, A
* Low-Rank Bilinear Pooling for Fine-Grained Classification
* Low-Rank Embedded Ensemble Semantic Dictionary for Zero-Shot Learning
* Low-Rank-Sparse Subspace Representation for Robust Regression
* LSTM Self-Supervision for Detailed Behavior Analysis
* Making 360° Video Watchable in 2D: Learning Videography for Click Free Viewing
* Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach
* Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
* Material Classification Using Frequency- and Depth-Dependent Time-of-Flight Distortion
* Matrix Splitting Method for Composite Function Minimization, A
* Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning
* Matting and Depth Recovery of Thin Structures Using a Focal Stack
* MCMLSD: A Dynamic Programming Approach to Line Segment Detection
* MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
* Memory-Augmented Attribute Manipulation Networks for Interactive Fashion Search
* Message Passing Algorithm for the Minimum Cost Multicut Problem, A
* Mimicking Very Efficient Network for Object Detection
* MIML-FCN+: Multi-Instance Multi-Label Learning via Fully Convolutional Networks with Privileged Information
* Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation
* Minimal Solution for Two-View Focal-Length Estimation Using Two Affine Correspondences, A
* Minimum Delay Moving Object Detection
* Mining Object Parts from CNNs via Active Question-Answering
* Missing Modalities Imputation via Cascaded Residual Autoencoder
* Misty Three Point Algorithm for Relative Pose, The
* Model-Based Iterative Restoration for Binary Document Image Compression with Dictionary Learning
* Modeling Relationships in Referential Expressions with Compositional Modular Networks
* Modeling Sub-Event Dynamics in First-Person Action Recognition
* Modeling Temporal Dynamics and Spatial Configurations of Actions Using Two-Stream Recurrent Neural Networks
* More is Less: A More Complicated Network with Less Inference Complexity
* More You Know: Using Knowledge Graphs for Image Classification, The
* MuCaLe-Net: Multi Categorical-Level Networks to Generate More Discriminating Features
* Multi-attention Network for One Shot Learning
* Multi-context Attention for Human Pose Estimation
* Multi-level Attention Networks for Visual Question Answering
* Multi-modal Mean-Fields via Cardinality-Based Clamping
* Multi-object Tracking with Quadruplet Convolutional Neural Networks
* Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation
* Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild
* Multi-task Clustering of Human Actions by Sharing Information
* Multi-task Correlation Particle Filter for Robust Object Tracking
* Multi-view 3D Object Detection Network for Autonomous Driving
* Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos, A
* Multi-View Supervision for Single-View Reconstruction via Differentiable Ray Consistency
* Multi-way Multi-level Kernel Modeling for Neuroimaging Classification
* Multigrid Neural Architectures
* Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer
* Multiple Instance Detection Network with Online Instance Classifier Refinement
* Multiple People Tracking by Lifted Multicut and Person Re-identification
* Multiple-Scattering Microphysics Tomography
* Network Dissection: Quantifying Interpretability of Deep Visual Representations
* Network Sketching: Exploiting Binary Structure in Deep CNNs
* Neural Aggregation Network for Video Face Recognition
* Neural Face Editing with Intrinsic Image Disentangling
* Neural Scene De-rendering
* New Rank Constraint on Multi-view Fundamental Matrices, and Its Application to Camera Location Recovery, A
* New Representation of Skeleton Sequences for 3D Action Recognition, A
* Newton-Type Methods for Inference in Higher-Order Markov Random Fields
* NID-SLAM: Robust Monocular SLAM Using Normalised Information Distance
* Noise Robust Depth from Focus Using a Ring Difference Filter
* Noise-Blind Image Deblurring
* Noisy Softmax: Improving the Generalization Ability of DCNN via Postponing the Early Softmax Saturation
* Non-contact Full Field Vibration Measurement Based on Phase-Shifting
* Non-convex Variational Approach to Photometric Stereo under Inaccurate Lighting, A
* Non-local Color Image Denoising with Convolutional Neural Networks
* Non-local Deep Features for Salient Object Detection
* Non-local Low-Rank Framework for Ultrasound Speckle Reduction, A
* Non-uniform Subset Selection for Active Learning in Structured Data
* Nonnegative Matrix Underapproximation for Robust Multiple Model Fitting
* Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding
* Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade
* Novel Tensor-Based Video Rain Streaks Removal Approach via Utilizing Discriminatively Intrinsic Priors, A
* Object Co-skeletonization with Co-segmentation
* Object Detection in Videos with Tubelet Proposal Networks
* Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach
* Object-Aware Dense Semantic Correspondence
* OctNet: Learning Deep 3D Representations at High Resolutions
* On Compressing Deep Models by Low Rank and Sparse Decomposition
* On Human Motion Prediction Using Recurrent Neural Networks
* On the Effectiveness of Visible Watermarks
* On the Global Geometry of Sphere-Constrained Sparse Blind Deconvolution
* On the Two-View Geometry of Unsynchronized Cameras
* On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation
* One-Shot Hyperspectral Imaging Using Faced Reflectors
* One-Shot Metric Learning for Person Re-identification
* One-Shot Video Object Segmentation
* One-To-Many Network for Visually Pleasing Compression Artifacts Reduction
* Online Asymmetric Similarity Learning for Cross-Modal Retrieval
* Online Graph Completion: Multivariate Signal Recovery in Computer Vision
* Online Summarization via Submodular and Convex Optimization
* Online Video Object Segmentation via Convolutional Trident Network
* Optical Flow Estimation Using a Spatial Pyramid Network
* Optical Flow in Mostly Rigid Scenes
* Optical Flow Requires Multiple Strategies (but Only One Network)
* Order-Preserving Wasserstein Distance for Sequence Matching
* Oriented Response Networks
* Outlier-Robust Tensor PCA
* Parametric T-Spline Face Morphable Model for Detailed Fitting in Shape Subspace
* Parsing Images of Overlapping Organisms with Deep Singling-Out Networks
* Perceptual Generative Adversarial Networks for Small Object Detection
* Person Re-identification in the Wild
* Person Search with Natural Language Description
* Personalizing Gesture Recognition Using Hierarchical Bayesian Neural Networks
* Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
* Photorealistic Facial Texture Inference Using Deep Neural Networks
* Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks
* Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation
* Pixelwise Instance Segmentation with a Dynamically Instantiated Network
* Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
* Point Set Generation Network for 3D Object Reconstruction from a Single Image, A
* Point to Set Similarity Based Deep Feature Learning for Person Re-Identification
* PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
* Polarimetric Multi-view Stereo
* Polyhedral Conic Classifiers for Visual Object Detection and Classification
* PolyNet: A Pursuit of Structural Diversity in Very Deep Networks
* Pose-Aware Person Recognition
* PoseAgent: Budget-Constrained 6D Object Pose Estimation via Reinforcement Learning
* POSEidon: Face-from-Depth for Driver Pose Estimation
* PoseTrack: Joint Multi-person Pose Estimation and Tracking
* Position Tracking for Virtual Reality Using Commodity WiFi
* Practical Method for Fully Automatic Intrinsic Camera Calibration Using Directionally Encoded Light, A
* Predicting Behaviors of Basketball Players from First Person Videos
* Predicting Ground-Level Scene Layout from Aerial Imagery
* Predicting Salient Face in Multiple-Face Videos
* Predictive-Corrective Networks for Action Detection
* Primary Object Segmentation in Videos Based on Region Augmentation and Reduction
* Probabilistic Temporal Subspace Clustering
* Procedural Generation of Videos to Train Deep Action Recognition Networks
* Product Manifold Filter: Non-rigid Shape Correspondence via Kernel Density Estimation in the Product Space
* Product Split Trees
* Provable Self-Representation Based Outlier Detection in a Union of Subspaces
* Pyramid Scene Parsing Network
* Quad-Networks: Unsupervised Learning to Rank for Interest Point Detection
* Quality Aware Network for Set to Set Recognition
* Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach
* Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
* Radiometric Calibration for Internet Photo Collections
* Radiometric Calibration from Faces in Images
* Re-ranking Person Re-identification with k-Reciprocal Encoding
* Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs
* Real-Time 3D Model Tracking in Color and Depth on a Single CPU Core
* Real-Time Neural Style Transfer for Videos
* Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation
* Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields
* Reconstructing Transient Images from Single-Photon Sensors
* Recurrent 3D Pose Sequence Machines
* Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization
* Recurrent Modeling of Interaction Context for Collective Activity Recognition
* RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation
* Reflectance Adaptive Filtering Improves Intrinsic Image Estimation
* Reflection Removal Using Low-Rank Matrix Completion
* Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network
* Reinforcement Learning Approach to the View Planning Problem, A
* Relationship Proposal Networks
* Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild
* Removing Rain from Single Images via a Deep Detail Network
* Residual Attention Network for Image Classification
* Residual Expansion Algorithm: Fast and Effective Optimization for Nonconvex Least Squares Problems
* Revisiting Metric Learning for SPD Matrix Based Visual Representation
* Revisiting the Variable Projection Method for Separable Nonlinear Least Squares Problems
* Richer Convolutional Features for Edge Detection
* Riemannian Nonlinear Mixed Effects Models: Analyzing Longitudinal Deformations in Neuroimaging
* ROAM: A Rich Object Appearance Model with Application to Rotoscoping
* Robust Energy Minimization for BRDF-Invariant Shape from Light Fields
* Robust Interpolation of Correspondences for Large Displacement Optical Flow
* Robust Joint and Individual Variance Explained
* Robust Visual Tracking Using Oblique Random Forests
* RON: Reverse Connection with Objectness Prior Networks for Object Detection
* S2F: Slow-to-Fast Interpolator Flow
* S3Pool: Pooling with Stochastic Spatial Sampling
* Saliency Revisited: Analysis of Mouse Movements Versus Fixations
* Salient Object Detection with Pyramid Attention and Salient Edges
* SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning
* Scalable Person Re-identification on Supervised Smoothed Manifold
* Scalable Surface Reconstruction from Point Clouds with Extreme Scale and Density Diversity
* Scale-Aware Face Detection
* ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
* SCC: Semantic Context Cascade for Efficient Action Detection
* Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition with Convolutional Neural Networks
* Scene Graph Generation by Iterative Message Passing
* Scene Parsing through ADE20K Dataset
* Scribbler: Controlling Deep Image Synthesis with Sketch and Color
* See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-identification
* Seeing into Darkness: Scotopic Visual Recognition
* Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video
* Seeing What is Not There: Learning Context to Determine Where Objects are Missing
* Self-Calibration-Based Approach to Critical Motion Sequences of Rolling-Shutter Structure from Motion
* Self-Critical Sequence Training for Image Captioning
* Self-Learning Scene-Specific Pedestrian Detectors Using a Progressive Latent Model
* Self-Supervised Learning of Visual Features through Embedding Images into Text Topic Spaces
* Self-Supervised Video Representation Learning with Odd-One-Out Networks
* Semantic Amodal Segmentation
* Semantic Autoencoder for Zero-Shot Learning
* Semantic Compositional Networks for Visual Captioning
* Semantic Image Inpainting with Deep Generative Models
* Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels
* Semantic Regularisation for Recurrent Image Annotation
* Semantic Scene Completion from a Single Depth Image
* Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF
* Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes
* Semantically Consistent Regularization for Zero-Shot Recognition
* Semi-Calibrated Near Field Photometric Stereo
* Semi-Supervised Deep Learning for Monocular Depth Map Prediction
* Sequential Person Recognition in Photo Albums with a Recurrent Network
* SGM-Nets: Semi-Global Matching with Neural Networks
* Shading Annotations in the Wild
* Shape Completion Using 3D-Encoder-Predictor CNNs and Shape Synthesis
* ShapeOdds: Variational Bayesian Learning of Generative Shape Models
* Simple Does It: Weakly Supervised Instance and Semantic Segmentation
* Simultaneous Facial Landmark Detection, Pose and Deformation Estimation Under Facial Occlusion
* Simultaneous Feature Aggregating and Hashing for Large-Scale Image Search
* Simultaneous Geometric and Radiometric Calibration of a Projector-Camera Pair
* Simultaneous Stereo Video Deblurring and Scene Flow Estimation
* Simultaneous Super-Resolution and Cross-Modality Synthesis of 3D Medical Images Using Weakly-Supervised Joint Convolutional Sparse Coding
* Simultaneous Visual Data Completion and Denoising Based on Tensor Rank and Total Variation Minimization and Its Primal-Dual Splitting Algorithm
* Single Image Reflection Suppression
* Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
* Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data
* Snapshot Hyperspectral Light Field Imaging
* Social Scene Understanding: End-to-End Multi-person Action Localization and Collective Activity Recognition
* Soft-Margin Mixture of Regressions
* Spatial-Semantic Image Search by Visual Feature Synthesis
* Spatially Adaptive Computation Time for Residual Networks
* Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes
* Spatio-Temporal Alignment of Non-overlapping Sequences from Independently Panning Cameras
* Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition
* Spatio-Temporal Self-Organizing Map Deep Network for Dynamic Object Detection from Videos
* Spatio-Temporal Vector of Locally Max Pooled Features for Action Recognition in Videos
* Spatiotemporal Multiplier Networks for Video Action Recognition
* Spatiotemporal Pyramid Network for Video Action Recognition
* Specular Highlight Removal in Facial Images
* Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors
* SPFTN: A Self-Paced Fine-Tuning Network for Segmenting Objects in Weakly Labelled Videos
* SphereFace: Deep Hypersphere Embedding for Face Recognition
* Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion
* Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction
* Sports Field Localization via Deep Structured Models
* SRN: Side-Output Residual Network for Object Symmetry Detection in the Wild
* SST: Single-Stream Temporal Action Proposals
* Stacked Generative Adversarial Networks
* STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling
* Stereo-Based 3D Reconstruction of Dynamic Fluid Surfaces by Global Optimization
* Straight to Shapes: Real-Time Detection of Encoded Shapes
* Study of Lagrangean Decompositions and Dual Ascent Solvers for Graph Matching, A
* StyleBank: An Explicit Representation for Neural Image Style Transfer
* StyleNet: Generating Attractive Visual Captions with Styles
* Subspace Clustering via Variance Regularized Ridge Regression
* Superpixel-Based Tracking-by-Segmentation Using Markov Chains
* Superpixels and Polygons Using Simple Non-iterative Clustering
* Supervising Neural Attention Models for Video Captioning by Human Gaze Data
* Surface Motion Capture Transfer with Gaussian Process Regression
* Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning, The
* SurfNet: Generating 3D Shape Surfaces Using Deep Residual Networks
* Surveillance Video Parsing with Single Frame Supervision
* Switching Convolutional Neural Network for Crowd Counting
* SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation
* Synthesizing 3D Shapes via Modeling Multi-view Depth Maps and Silhouettes with Deep Generative Networks
* Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet
* Synthesizing Normalized Faces from Facial Identity Features
* Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description
* Teaching Compositionality to CNNs
* Template Matching with Deformable Diversity Similarity
* Template-Based Monocular 3D Recovery of Elastic Shapes Using Lagrangian Multipliers
* Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos
* Temporal Action Localization by Structured Maximal Sums
* Temporal Attention-Gated Model for Robust Sequence Classification
* Temporal Convolutional Networks for Action Segmentation and Detection
* Temporal Residual Networks for Dynamic Scene Recognition
* TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
* Thin-Slicing Network: A Deep Structured Model for Pose Estimation in Videos
* Top-Down Visual Saliency Guided by Captions
* Toroidal Constraints for Two-Point Localization Under High Outlier Ratios
* Towards a Quality Metric for Dense Light Fields
* Towards Accurate Multi-person Pose Estimation in the Wild
* Tracking by Natural Language Specification
* Training Object Class Detectors with Click Supervision
* Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
* Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection
* Truncated Max-of-Convex Models
* Turning an Urban Scene Video into a Cinemagraph
* UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory
* UltraStereo: Efficient Learning-Based Matching for Active Stereo Systems
* Unambiguous Text Localization and Retrieval for Cluttered Scenes
* Understanding Traffic Density from Large-Scale Web Camera Data
* Unified Approach of Multi-scale Deep and Hand-Crafted Features for Defocus Estimation, A
* Unified Embedding and Metric Learning for Zero-Exemplar Event Detection
* Unite the People: Closing the Loop Between 3D and 2D Human Representations
* Universal Adversarial Perturbations
* Unrolling the Shutter: CNN to Correct Motion Distortions
* Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks
* Unsupervised Learning of Depth and Ego-Motion from Video
* Unsupervised Learning of Long-Term Motion Dynamics for Videos
* Unsupervised Monocular Depth Estimation with Left-Right Consistency
* Unsupervised Part Learning for Visual Recognition
* Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks
* Unsupervised Semantic Scene Labeling for Streaming Data
* Unsupervised Vanishing Point Detection and Camera Calibration from a Single Manhattan Image with Radial Distortion
* Unsupervised Video Summarization with Adversarial LSTM Networks
* Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
* UntrimmedNets for Weakly Supervised Action Recognition and Detection
* Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image
* Using Ranking-CNN for Age Estimation
* Variational Autoencoded Regression: High Dimensional Regression of Visual Data on Complex Manifold
* Variational Bayesian Multiple Instance Learning with Gaussian Processes
* Video Acceleration Magnification
* Video Captioning with Transferred Semantic Attributes
* Video Desnowing and Deraining Based on Matrix Decomposition
* Video Frame Interpolation via Adaptive Convolution
* Video Propagation Networks
* Video Segmentation via Multiple Granularity Analysis
* Video2Shop: Exact Matching Clothes in Videos to Online Shopping Images
* VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization
* ViP-CNN: Visual Phrase Guided Convolutional Neural Network
* Viraliency: Pooling Local Virality
* Visual Dialog
* Visual Translation Embedding Network for Visual Relation Detection
* Visual-Inertial-Semantic Scene Representation for 3D Object Detection
* VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions, The
* Weakly Supervised Action Learning with RNN Based Fine-to-Coarse Modeling
* Weakly Supervised Actor-Action Segmentation via Robust Multi-task Ranking
* Weakly Supervised Affordance Detection
* Weakly Supervised Cascaded Convolutional Networks
* Weakly Supervised Dense Video Captioning
* Weakly Supervised Semantic Segmentation Using Web-Crawled Videos
* Weakly-Supervised Visual Grounding of Phrases with Linguistic Structures
* Webly Supervised Semantic Segmentation
* Weighted-Entropy-Based Quantization for Deep Neural Networks
* Wetness and Color from a Single Multispectral Image
* What Can Help Pedestrian Detection?
* What is and What is Not a Salient Object? Learning Salient Object Detector by Ensembling Linear Exemplar Regressors
* What is the Space of Attenuation Coefficients in Underwater Computer Vision?
* What's in a Question: Using Visual Questions as a Form of Supervision
* Why You Should Forget Luminance Conversion and Do Something Better
* Wide-Field-of-View Monocentric Light Field Camera, A
* WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation
* World of Fast Moving Objects, The
* WSISA: Making Survival Prediction from Whole Slide Histopathological Images
* Xception: Deep Learning with Depthwise Separable Convolutions
* YOLO9000: Better, Faster, Stronger
* YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
* Zero Shot Learning via Multi-scale Manifold Regularization
* Zero-Shot Action Recognition with Error-Correcting Output Codes
* Zero-Shot Classification with Discriminative Semantic Representation Learning
* Zero-Shot Learning: The Good, the Bad and the Ugly
* Zero-Shot Recognition Using Dual Visual-Semantic Mapping Paths
783 for CVPR17

CVPR18 * *CVPR
* 2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning
* 3D Human Pose Estimation in the Wild by Adversarial Learning
* 3D Human Sensing, Action and Emotion Recognition in Robot Assisted Therapy of Children with Autism
* 3D Object Detection with Latent Support Surfaces
* 3D Pose Estimation and 3D Model Retrieval for Objects in the Wild
* 3D Registration of Curves and Surfaces Using Local Differential Information
* 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
* 3D Semantic Trajectory Reconstruction from 3D Pixel Continuum
* 3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare
* 4D Human Body Correspondences from Panoramic Depth Maps
* 4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications
* A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping
* Accurate and Diverse Sampling of Sequences Based on a Best of Many Sample Objective
* Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints
* Active Fixation Control to Predict Saccade Sequences
* Actor and Action Video Segmentation from a Sentence
* Actor and Observer: Joint Modeling of First and Third-Person Videos
* AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
* Adversarial Complementary Learning for Weakly Supervised Object Localization
* Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data
* Adversarial Feature Augmentation for Unsupervised Domain Adaptation
* Adversarially Learned One-Class Classifier for Novelty Detection
* Adversarially Occluded Samples for Person Re-identification
* Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation
* Alive Caricature from 2D to 3D
* Alternating-Stereo VINS: Observability Analysis and Performance Evaluation
* AMNet: Memorability Estimation with Attention
* Analysis of Hand Segmentation in the Wild
* Analysis of Scale Invariance in Object Detection - SNIP, An
* Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input
* Analytical Modeling of Vanishing Points and Curves in Catadioptric Cameras
* Analyzing Filters Toward Efficient ConvNet
* Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation
* Anticipating Traffic Accidents with Adaptive Loss and Large-Scale Incident DB
* AON: Towards Arbitrarily-Oriented Text Recognition
* Aperture Supervision for Monocular Depth Estimation
* Appearance-and-Relation Networks for Video Classification
* Arbitrary Style Transfer with Deep Feature Reshuffle
* Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning
* Art of Singular Vectors and Universal Adversarial Perturbations
* Attend and Interact: Higher-Order Object Interactions for Video Understanding
* Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
* Attention-Aware Compositional Network for Person Re-identification
* Attentional ShapeContextNet for Point Cloud Recognition
* Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
* Attentive Generative Adversarial Network for Raindrop Removal from a Single Image
* AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
* Audio to Body Dynamics
* Augmented Skeleton Space Transfer for Depth-Based Hand Pose Estimation
* Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections
* Automatic 3D Indoor Scene Modeling from Single Panorama
* AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions
* Avatar-Net: Multi-scale Zero-Shot Style Transfer by Feature Decoration
* Baseline Desensitizing in Translation Averaging
* Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions
* Best of Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion Segmentation, The
* Between-Class Learning for Image Classification
* Beyond Gröbner Bases: Basis Selection for Minimal Solvers
* Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
* Beyond the Pixel-Wise Loss for Topology-Aware Delineation
* Beyond Trade-Off: Accelerate FCN-Based Face Detector with Higher Accuracy
* Bi-Directional Message Passing Model for Salient Object Detection, A
* Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning
* Bidirectional Retrieval Made Simple
* Bilateral Ordinal Relevance Multi-instance Regression for Facial Action Unit Intensity Estimation
* Biresolution Spectral Framework for Product Quantization, A
* Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
* Blind Predicting Similar Quality Map for Image Quality Assessment
* BlockDrop: Dynamic Inference Paths in Residual Networks
* Boosting Adversarial Attacks with Momentum
* Boosting Domain Adaptation by Discovering Latent Domains
* Boosting Self-Supervised Learning via Knowledge Transfer
* Bootstrapping the Performance of Webly Supervised Semantic Segmentation
* Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
* Boundary Flow: A Siamese Network that Predicts Boundary Motion Without Training on Motion
* BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
* Burst Denoising with Kernel Prediction Networks
* Camera Pose Estimation with Unknown Principal Point
* Camera Style Adaptation for Person Re-identification
* Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
* CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles
* CartoonGAN: Generative Adversarial Networks for Photo Cartoonization
* Cascade R-CNN: Delving Into High Quality Object Detection
* Cascaded Pyramid Network for Multi-person Pose Estimation
* Categorizing Concepts with Basic Level for Vision-to-Language
* Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects, A
* CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation
* Certifiably Globally Optimal Solution to the Non-minimal Relative Pose Problem, A
* Classification-Driven Dynamic Image Enhancement
* Classifier Learning with Prior Probabilities for Facial Action Unit Recognition
* clcNet: Improving the Efficiency of Convolutional Neural Network Using Channel Local Convolutions
* CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
* CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition
* Clinical Skin Lesion Diagnosis Using Representations Inspired by Dermatologist Criteria
* CLIP-Q: Deep Network Compression Learning by In-parallel Pruning-Quantization
* Closer Look at Spatiotemporal Convolutions for Action Recognition, A
* ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information
* CNN Based Learning Using Reflection and Retinex Models for Intrinsic Image Decomposition
* CNN Driven Sparse Multi-level B-Spline Image Registration
* CNN in MRF: Video Object Segmentation via Inference in a CNN-Based Higher-Order Spatio-Temporal MRF
* COCO-Stuff: Thing and Stuff Classes in Context
* CodeSLAM: Learning a Compact, Optimisable Representation for Dense Visual SLAM
* Coding Kendall's Shape Trajectories for 3D Action Recognition
* Collaborative and Adversarial Network for Unsupervised Domain Adaptation
* Common Framework for Interactive Texture Transfer, A
* Compare and Contrast: Learning Prominent Visual Differences
* Compassionately Conservative Balanced Cuts for Image Segmentation
* Compressed Video Action Recognition
* CondenseNet: An Efficient DenseNet Using Learned Group Convolutions
* Conditional Generative Adversarial Network for Structured Domain Adaptation
* Conditional Image-to-Image Translation
* Conditional Probability Models for Deep Image Compression
* Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images
* Consensus Maximization for Semantic Region Correspondences
* Constrained Deep Neural Network for Ordinal Regression, A
* Content-Sensitive Supervoxels via Uniform Tessellations on Video Manifolds
* Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation
* Context Embedding Networks
* Context Encoding for Semantic Segmentation
* Context-Aware Deep Feature Compression for High-Speed Visual Tracking
* Context-Aware Synthesis for Video Frame Interpolation
* Continuous Relaxation of MAP Inference: A Nonconvex Perspective
* Controllable Video Generation with Sparse Trajectories
* Convolutional Image Captioning
* Convolutional Neural Networks with Alternately Updated Clique
* Convolutional Sequence to Sequence Model for Human Dynamics
* Correlation Tracking via Joint Discrimination and Reliability Learning
* CosFace: Large Margin Cosine Loss for Deep Face Recognition
* Coupled End-to-End Transfer Learning with Generalized Fisher Information
* Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning
* Creating Capsule Wardrobes from Fashion Images
* Cross-Dataset Adaptation for Visual Question Answering
* Cross-Domain Self-Supervised Multi-task Feature Learning Using Synthetic Imagery
* Cross-Domain Weakly-Supervised Object Detection Through Progressive Domain Adaptation
* Cross-Modal Deep Variational Hand Pose Estimation
* Cross-View Image Synthesis Using Conditional GANs
* Crowd Counting via Adversarial Cross-Scale Consistency Pursuit
* Crowd Counting with Deep Negative Correlation Learning
* CRRN: Multi-scale Guided Concurrent Reflection Removal Network
* CSGNet: Neural Shape Parser for Constructive Solid Geometry
* CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
* Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
* Curve Reconstruction via the Global Statistics of Natural Curves
* Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
* CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization
* DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks
* Data Distillation: Towards Omni-Supervised Learning
* DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
* DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation
* Decorrelated Batch Normalization
* Decoupled Networks
* Deep Adversarial Metric Learning
* Deep Adversarial Subspace Clustering
* Deep Back-Projection Networks for Super-Resolution
* Deep Cauchy Hashing for Hamming Space Retrieval
* Deep CockTail Network: Multi-source Unsupervised Domain Adaptation with Category Shift
* Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation
* Deep Cross-Media Knowledge Transfer
* Deep Density Clustering of Unconstrained Faces
* Deep Depth Completion of a Single RGB-D Image
* Deep Diffeomorphic Transformer Networks
* Deep End-to-End Time-of-Flight Imaging
* Deep Extreme Cut: From Extreme Points to Object Segmentation
* Deep Face Detector Adaptation Without Negative Transfer or Catastrophic Forgetting
* Deep Group-Shuffling Random Walk for Person Re-identification
* Deep Hashing via Discrepancy Minimization
* Deep Image Prior
* Deep Layer Aggregation
* Deep Learning of Graph Matching
* Deep Learning Under Privileged Information Using Heteroscedastic Dropout
* Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-Scale Lesion Database
* Deep Marching Cubes: Learning Explicit Surface Representations
* Deep Material-Aware Cross-Spectral Stereo Matching
* Deep Mutual Learning
* Deep Ordinal Regression Network for Monocular Depth Estimation
* Deep Parametric Continuous Convolutional Neural Networks
* Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs
* Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition
* Deep Regression Forests for Age Estimation
* Deep Reinforcement Learning of Region Proposal Networks for Object Detection
* Deep Semantic Face Deblurring
* Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons
* Deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-free Approach
* Deep Spatio-Temporal Random Fields for Efficient Video Segmentation
* Deep Texture Manifold for Ground Terrain Recognition
* Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective
* Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation
* Deeper Look at Power Normalizations, A
* Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
* DeepMVS: Learning Multi-view Stereopsis
* DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection Under Partial Occlusion
* Defense Against Adversarial Attacks Using High-Level Representation Guided Denoiser
* Defense Against Universal Adversarial Perturbations
* Deflecting Adversarial Attacks with Pixel Deflection
* Defocus Blur Detection via Multi-stream Bottom-Top-Bottom Fully Convolutional Network
* Deformable GANs for Pose-Based Human Image Generation
* Deformable Shape Completion with Graph Convolutional Autoencoders
* Deformation Aware Image Compression
* DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map
* Demo2Vec: Reasoning Object Affordances from Online Videos
* Dense 3D Regression for Hand Pose Estimation
* Dense Decoder Shortcut Connections for Single-Pass Semantic Segmentation
* DenseASPP for Semantic Segmentation in Street Scenes
* Densely Connected Pyramid Dehazing Network
* DensePose: Dense Human Pose Estimation in the Wild
* Density Adaptive Point Set Registration
* Density-Aware Single Image De-raining Using a Multi-stream Dense Network
* Depth and Transient Imaging with Compressive SPAD Array Cameras
* Depth-Aware Stereo Video Retargeting
* Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals
* Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
* Detail-Preserving Pooling in Deep Networks
* Detect Globally, Refine Locally: A Novel Approach to Saliency Detection
* Detect-and-Track: Efficient Pose Estimation in Videos
* Detecting and Recognizing Human-Object Interactions
* Differential Attention for Visual Question Answering
* Dimensionality's Blessing: Clustering Images by Underlying Distribution
* Direct Shape Regression Networks for End-to-End Face Alignment
* Direction-Aware Spatial Context Features for Shadow Detection
* Discovering Point Lights with Intensity Distance Fields
* Discrete-Continuous ADMM for Transductive Inference in Higher-Order MRFs
* Discriminability Objective for Training Descriptive Captions
* Discriminative Learning of Latent Features for Zero-Shot Recognition
* Disentangled Person Image Generation
* Disentangling 3D Pose in a Dendritic CNN for Unconstrained 2D Face Alignment
* Disentangling Factors of Variation by Mixing Them
* Disentangling Features in 3D Face Shapes for Joint Face Reconstruction and Recognition
* Disentangling Structure and Aesthetics for Style-Aware Image Completion
* Distort-and-Recover: Color Enhancement Using Deep Reinforcement Learning
* Distributable Consistent Multi-object Matching
* DiverseNet: When One Right Answer is not Enough
* Diversity Regularized Spatiotemporal Attention for Video-Based Person Re-identification
* Divide and Conquer for Full-Resolution Light Field Deblurring
* Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN
* Document Enhancement Using Visibility Detection
* DocUNet: Document Image Unwarping via a Stacked U-Net
* Domain Adaptive Faster R-CNN for Object Detection in the Wild
* Domain Generalization with Adversarial Feature Learning
* Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
* DOTA: A Large-Scale Dataset for Object Detection in Aerial Images
* DoubleFusion: Real-Time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor
* DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems
* Dual Attention Matching Network for Context-Aware Feature Sequence Based Person Re-identification
* Dual Skipping Networks
* Duplex Generative Adversarial Network for Unsupervised Domain Adaptation
* DVQA: Understanding Data Visualizations via Question Answering
* Dynamic Feature Learning for Partial Face Recognition
* Dynamic Few-Shot Visual Learning Without Forgetting
* Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
* Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
* Dynamic Video Segmentation Network
* Dynamic Zoom-in Network for Fast Object Detection in Large Images
* Dynamic-Structured Semantic Propagation Network
* Easy Identification from Better Constraints: Multi-shot Person Re-identification from Reference Constraints
* Edit Probability for Scene Text Recognition
* Efficient and Deep Person Re-identification Using Multi-level Similarity
* Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption, An
* Efficient Diverse Ensemble for Discriminative Co-tracking
* Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
* Efficient Large-Scale Approximate Nearest Neighbor Search on OpenCL FPGA
* Efficient Optimization for Rank-Based Loss Functions
* Efficient Parametrization of Multi-domain Deep Neural Networks
* Efficient Subpixel Refinement with Symbolic Linear Predictors
* Efficient Video Object Segmentation via Network Modulation
* Efficient, Sparse Representation of Manifold Distance Matrices for Classical Scaling
* Egocentric Activity Recognition on a Budget
* Egocentric Basketball Motion Planning from a Single First-Person Image
* Eliminating Background-bias for Robust Person Re-identification
* Embodied Question Answering
* Emotional Attention: A Study of Image Sentiment and Visual Attention
* Empirical Study of the Topology and Geometry of Deep Networks
* Encoding Crowd Interaction with Deep Neural Network for Pedestrian Trajectory Prediction
* End-to-End Convolutional Semantic Embeddings
* End-to-End Deep Kronecker-Product Matching for Person Re-identification
* End-to-End Dense Video Captioning with Masked Transformer
* End-to-End Flow Correlation Tracking with Spatial-Temporal Attention
* End-to-End Learning of Keypoint Detector and Descriptor for Pose Invariant 3D Matching
* End-to-End Learning of Motion Representation for Video Understanding
* End-to-End Recovery of Human Shape and Pose
* End-to-End TextSpotter with Explicit Alignment and Attention, An
* End-to-End Weakly-Supervised Semantic Alignment
* Enhancing the Spatial Resolution of Stereo Images Using a Parallax Prior
* Environment Upgrade Reinforcement Learning for Non-differentiable Multi-stage Pipelines
* EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images
* Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos
* Estimation of Camera Locations in Highly Corrupted Scenarios: All About that Base, No Shape Trouble
* Event-Based Vision Meets Deep Learning on Steering Prediction for Self-Driving Cars
* Every Smile is Unique: Landmark-Guided Diverse Smile Generation
* Excitation Backprop for RNNs
* Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks
* Exploit the Unknown Gradually: One-Shot Video-Based Person Re-identification by Stepwise Learning
* Exploiting Transitivity for Learning Person Re-identification Models on a Budget
* Exploring Disentangled Feature Representation Beyond Face Identification
* Extreme 3D Face Reconstruction: Seeing Through Occlusions
* Eye In-painting with Exemplar Generative Adversarial Networks
* Face Aging with Identity-Preserved Conditional Generative Adversarial Networks
* Face-to-Face Neural Conversation Model, A
* FaceID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis
* Facelet-Bank for Fast Portrait Manipulation
* Facial Expression Recognition by De-expression Residue Learning
* Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
* Fast and Accurate Online Video Object Segmentation via Tracking Parts
* Fast and Accurate Single Image Super-Resolution via Information Distillation Network
* Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
* Fast and Robust Estimation for Unit-Norm Constrained Linear Fitting Problems
* Fast End-to-End Trainable Guided Filter
* Fast Monte-Carlo Localization on Aerial Vehicles Using Approximate Continuous Belief Representations
* Fast Resection-Intersection Method for the Known Rotation Problem, A
* Fast Spectral Ranking for Similarity Search
* Fast Video Object Segmentation by Reference-Guided Mask Propagation
* FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis
* Feature Generating Networks for Zero-Shot Learning
* Feature Mapping for Learning Fast and Accurate 3D Pose Inference from Synthetic Images
* Feature Quantization for Defending Against Distortion of Images
* Feature Selective Networks for Object Detection
* Feature Space Transfer for Data Augmentation
* Feature Super-Resolution: Make Machine See More Clearly
* Features for Multi-target Multi-camera Tracking and Re-identification
* Feedback-Prop: Convolutional Neural Network Inference Under Partial Evidence
* Few-Shot Image Recognition by Predicting Parameters from Activations
* FFNet: Video Fast-Forwarding via Reinforcement Learning
* Fight Ill-Posedness with Ill-Posedness: Single-shot Variational Depth Super-Resolution from Shading
* Finding Beans in Burgers: Deep Semantic-Visual Embedding with Localization
* Finding It: Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos
* Finding Tiny Faces in the Wild with Generative Adversarial Network
* Fine-Grained Video Captioning for Sports Narrative
* First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations
* Five-Point Fundamental Matrix Estimation for Uncalibrated Cameras
* FLIPDIAL: A Generative Model for Two-Way Visual Dialogue
* Flow Guided Recurrent Neural Encoder for Video Salient Object Detection
* Focal Visual-Text Attention for Visual Question Answering
* Focus Manipulation Detection via Photometric Histogram Analysis
* FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation
* Fooling Vision and Language Models Despite Localization and Attention Mechanism
* FOTS: Fast Oriented Text Spotting with a Unified Network
* Frame-Recurrent Video Super-Resolution
* Free Supervision from Video Games
* From Lifestyle VLOGs to Everyday Interactions
* From Source to Target and Back: Symmetric Bi-Directional Adaptive GAN
* Frustum PointNets for 3D Object Detection from RGB-D Data
* FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
* Fully Convolutional Adaptation Networks for Semantic Segmentation
* Functional Map of the World
* Fusing Crowd Density Maps and Visual Object Trackers for People Tracking in Crowd Scenes
* Future Frame Prediction for Anomaly Detection - A New Baseline
* Future Person Localization in First-Person Videos
* GAGAN: Geometry-Aware Generative Adversarial Networks
* GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB
* Gated Fusion Network for Single Image Dehazing
* Gaze Prediction in Dynamic 360° Immersive Videos
* Generalized Zero-Shot Learning via Synthesized Examples
* Generate to Adapt: Aligning Domains Using Generative Adversarial Networks
* Generating a Fusion Image: One's Identity and Another's Shape
* Generating Synthetic X-Ray Images of a Person from the Surface Geometry
* Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts, A
* Generative Adversarial Image Synthesis with Decision Tree Latent Controller
* Generative Adversarial Learning Towards Fast Weakly Supervised Detection
* Generative Adversarial Perturbations
* Generative Image Inpainting with Contextual Attention
* Generative Modeling Using the Sliced Wasserstein Distance
* Geometric Multi-model Fitting with a Convex Relaxation Algorithm
* Geometric Robustness of Deep Networks: Analysis and Improvement
* Geometry Aware Constrained Optimization Techniques for Deep Learning
* Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning
* Geometry-Aware Deep Network for Single-Image Novel View Synthesis
* Geometry-Aware Learning of Maps for Camera Localization
* Geometry-Aware Network for Non-rigid Shape Prediction from a Single View
* Geometry-Aware Scene Text Detection with Instance Transformation Network
* GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation
* GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
* Gesture Recognition: Focus on the Hands
* Gibson Env: Real-World Perception for Embodied Agents
* Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
* Global Versus Localized Generative Adversarial Nets
* Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation
* Going from Image to Video Saliency: Augmenting Image Salience with Dynamic Attentional Push
* Good View Hunting: Learning Photo Composition from Dense View Pairs
* Graph-Cut RANSAC
* GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning
* Grounding Referring Expressions in Images by Variational Context
* Group Consistent Similarity Learning via Deep CRF for Person Re-identification
* GroupCap: Group-Based Image Captioning with Structured Relevance and Diversity Constraints
* Guide Me: Interacting with Deep Networks
* Guided Proofreading of Automatic Segmentations for Connectomics
* GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition
* Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning
* Hand PointNet: 3D Hand Pose Estimation Using Point Sets
* Harmonious Attention Network for Person Re-identification
* HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN
* Hashing as Tie-Aware Learning to Rank
* HATS: Histograms of Averaged Time Surfaces for Robust Event-Based Object Classification
* Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation, A
* Hierarchical Novelty Detection for Visual Object Recognition
* Hierarchical Recurrent Attention Networks for Structured Online Maps
* High Performance Visual Tracking with Siamese Region Proposal Network
* High-Order Tensor Regularization with Application to Attribute Ranking
* High-Quality Denoising Dataset for Smartphone Cameras, A
* High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
* High-Speed Tracking with Multi-kernel Correlation Filters
* HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization
* Human Appearance Transfer
* Human Pose Estimation with Parsing Induced Learner
* Human Semantic Parsing for Person Re-identification
* Human-Centric Indoor Scene Synthesis Using Stochastic Grammar
* Hybrid Camera Pose Estimation
* Hybrid L1-L0 Layer Decomposition Model for Tone Mapping, A
* HydraNets: Specialized Dynamic Architectures for Efficient Inference
* Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning
* ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM
* Illuminant Spectra-Based Source Separation Using Flash Photography
* Im2Flow: Motion Hallucination from Static Images for Action Recognition
* Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View
* Im2Struct: Recovering 3D Shape Structure from a Single RGB Image
* Image Blind Denoising with Generative Adversarial Network Based Noise Modeling
* Image Collection Pop-up: 3D Reconstruction and Clustering of Rigid and Non-rigid Categories
* Image Correction via Deep Reciprocating HDR Transformation
* Image Generation from Scene Graphs
* Image Restoration by Estimating Frequency Distribution of Local Patches
* Image Super-Resolution via Dual-State Recurrent Networks
* Image to Image Translation for Domain Adaptation
* Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification
* Importance Weighted Adversarial Nets for Partial Domain Adaptation
* Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering
* Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
* Improvements to Context Based Self-Supervised Learning
* Improving Color Reproduction Accuracy on Cameras
* Improving Landmark Localization with Semi-Supervised Learning
* Improving Object Localization with Fitness NMS and Bounded IoU Loss
* Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors
* In-place Activated BatchNorm for Memory-Optimized Training of DNNs
* iNaturalist Species Classification and Detection Dataset, The
* Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
* Indoor RGB-D Compass from a Single Line and Plane
* Inference in Higher Order MRF-MAP Problems with Small and Large Cliques
* Inferring Light Fields from Shadows
* Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
* Inferring Shared Attention in Social Scene Videos
* InLoc: Indoor Visual Localization with Dense Matching and View Synthesis
* Instance Embedding Transfer to Unsupervised Video Object Segmentation
* Interactive Image Segmentation with Latent Diversity
* Interleaved Structured Sparse Convolutional Neural Networks
* Interpret Neural Networks by Identifying Critical Data Routing Paths
* Interpretable Convolutional Neural Networks
* Interpretable Video Captioning via Trajectory Structured Localization
* Intrinsic Image Transformation via Scale Space Decomposition
* Inverse Composition Discriminative Optimization for Point Cloud Registration
* InverseFaceNet: Deep Monocular Inverse Face Rendering
* IQA: Visual Question Answering in Interactive Environments
* ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing
* Iterative Learning with Open-set Noisy Labels
* Iterative Visual Reasoning Beyond Convolutions
* iVQA: Inverse Visual Question Answering
* Jerk-Aware Video Acceleration Magnification
* Joint Cuts and Matching of Partitions in One Graph
* Joint Optimization Framework for Learning with Noisy Labels
* Joint Pose and Expression Modeling for Facial Expression Recognition
* Jointly Localizing and Describing Events for Dense Video Captioning
* Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation
* Kernelized Subspace Pooling for Deep Local Descriptors
* KIPPI: KInetic Polygonal Partitioning of Images
* Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
* Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Faces
* LAMV: Learning to Align and Match Videos with Kernelized Temporal Layers
* Language-Based Image Editing with Recurrent Attentive Models
* Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
* Large-Scale Distance Metric Learning with Uncertainty
* Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs
* Latent RANSAC
* LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
* LDMNet: Low Dimensional Manifold Regularized Neural Networks
* Lean Multiclass Crowdsourcing
* Learned Shape-Tailored Descriptors for Segmentation
* Learning 3D Shape Completion from Laser Scan Data with Weak Supervision
* Learning a Complete Image Indexing Pipeline
* Learning a Discriminative Feature Network for Semantic Segmentation
* Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition
* Learning a Discriminative Prior for Blind Image Deblurring
* Learning a Single Convolutional Super-Resolution Network for Multiple Degradations
* Learning and Using the Arrow of Time
* Learning Answer Embeddings for Visual Question Answering
* Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking
* Learning Attribute Representations with Localization for Flexible Fashion Search
* Learning by Asking Questions
* Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
* Learning Compositional Visual Concepts with Mutual Consistency
* Learning Compressible 360° Video Isomers
* Learning Convolutional Networks for Content-Weighted Image Compression
* Learning Deep Descriptors with Scale-Aware Triplet Networks
* Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision
* Learning Deep Sketch Abstraction
* Learning Deep Structured Active Contours End-to-End
* Learning Depth from Monocular Videos Using Direct Methods
* Learning Descriptor Networks for 3D Shape Synthesis and Analysis
* Learning Distributions of Shape Trajectories from Longitudinal Datasets: A Hierarchical Model on a Manifold of Diffeomorphisms
* Learning Dual Convolutional Neural Networks for Low-Level Vision
* Learning Face Age Progression: A Pyramid Architecture of GANs
* Learning Facial Action Units from Web Images with Scalable Weakly Supervised Clustering
* Learning for Disparity Estimation Through Feature Constancy
* Learning from Millions of 3D Scans for Large-Scale 3D Face Recognition
* Learning from Noisy Web Data with Category-Level Supervision
* Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation
* Learning Generative ConvNets via Multi-grid Modeling and Sampling
* Learning Globally Optimized Object Detector via Policy Gradient
* Learning Intelligent Dialogs for Bounding Box Annotation
* Learning Intrinsic Image Decomposition from Watching the World
* Learning Latent Super-Events to Detect Multiple Activities in Videos
* Learning Less is More: 6D Camera Localization via 3D Surface Regression
* Learning Markov Clustering Networks for Scene Text Detection
* Learning Monocular 3D Human Pose Estimation from Multi-view Images
* Learning Multi-instance Enriched Image Representations via Non-greedy Ratio Maximization of the L1-Norm Distances
* Learning Patch Reconstructability for Accelerating Multi-view Stereo
* Learning Pixel-Level Semantic Affinity with Image-Level Supervision for Weakly Supervised Semantic Segmentation
* Learning Pose Specific Representations by Predicting Different Views
* Learning Rich Features for Image Manipulation Detection
* Learning Semantic Concepts and Order for Image and Sentence Matching
* Learning Spatial-Aware Regressions for Visual Tracking
* Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking
* Learning Steerable Filters for Rotation Equivariant CNNs
* Learning Strict Identity Mappings in Deep Residual Networks
* Learning Structure and Strength of CNN Filters for Small Sample Size Training
* Learning Superpixels with Segmentation-Aware Affinity Loss
* Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
* Learning to Act Properly: Predicting and Explaining Affordances from Images
* Learning to Adapt Structured Output Space for Semantic Segmentation
* Learning to Compare: Relation Network for Few-Shot Learning
* Learning to Detect Features in Texture Images
* Learning to Estimate 3D Human Pose and Shape from a Single Color Image
* Learning to Evaluate Image Captioning
* Learning to Extract a Video Sequence from a Single Motion-Blurred Image
* Learning to Find Good Correspondences
* Learning to Generate Time-Lapse Videos Using Multi-stage Dynamic Generative Adversarial Networks
* Learning to Localize Sound Source in Visual Scenes
* Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
* Learning to Parse Wireframes in Images of Man-Made Environments
* Learning to Promote Saliency Detectors
* Learning to See in the Dark
* Learning to Segment Every Thing
* Learning to Sketch with Shortcut Cycle Consistency
* Learning to Understand Image Blur
* Learning Transferable Architectures for Scalable Image Recognition
* Learning Visual Knowledge Memory Networks for Visual Question Answering
* Learning-Compression Algorithms for Neural Net Pruning
* Left-Right Comparative Recurrent Model for Stereo Matching
* LEGO: Learning Edge with Geometry all at Once by Watching Videos
* Leveraging Unlabeled Data for Crowd Counting by Learning to Rank
* LiDAR-Video Driving Dataset: Learning Driving Policies Effectively
* Light Field Intrinsics with a Deep Encoder-Decoder Network
* Lightweight Probabilistic Deep Networks
* LIME: Live Intrinsic Material Estimation
* Link and Code: Fast Indexing with Graphs and Compact Regression Codes
* Lions and Tigers and Bears: Capturing Non-rigid, 3D, Articulated Shape from Images
* LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation
* Local and Global Optimization Techniques in Graph-Based Clustering
* Local Descriptors Optimized for Average Precision
* Logo Synthesis and Manipulation with Clustered Generative Adversarial Networks
* Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty
* Look at Boundary: A Boundary-Aware Face Alignment Algorithm
* Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
* Lose the Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion
* Lovasz-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks, The
* Low Power, High Throughput, Fully Event-Based Stereo System, A
* Low-Latency Video Semantic Segmentation
* Low-Shot Learning from Imaginary Data
* Low-Shot Learning with Imprinted Weights
* Low-Shot Learning with Large-Scale Diffusion
* LSTM Pose Machines
* M3: Multimodal Memory Modelling for Video Captioning
* Making Convolutional Networks Recurrent for Visual Sequence Learning
* Manifold Learning in Quotient Spaces
* MapNet: An Allocentric Spatial Memory for Mapping Environments
* Mask-Guided Contrastive Attention Model for Person Re-identification
* MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
* Matching Adversarial Networks
* Matching Pixels Using Co-occurrence Statistics
* Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers
* MAttNet: Modular Attention Network for Referring Expression Comprehension
* Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
* Mean-Variance Loss for Deep Age Estimation from a Face
* MegaDepth: Learning Single-View Depth Prediction from Internet Photos
* MegDet: A Large Mini-Batch Object Detector
* Memory Based Online Learning of Deep Representations from Video Streams
* Memory Matching Networks for One-Shot Image Recognition
* Memory Network Approach for Story-Based Temporal Summarization of 360° Videos, A
* Mesoscopic Facial Geometry Inference Using Deep Neural Networks
* MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition
* Min-Entropy Latent Model for Weakly Supervised Object Detection
* Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds, A
* Mining on Manifolds: Metric Learning Without Labels
* Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
* Missing Slice Recovery for Tensors Using a Low-Rank Model in Embedded Space
* Mix and Match Networks: Encoder-Decoder Alignment for Zero-Pair Image Translation
* Mobile Video Object Detection with Temporally-Aware Feature Maps
* MobileNetV2: Inverted Residuals and Linear Bottlenecks
* MoCoGAN: Decomposing Motion and Content for Video Generation
* Modeling Facial Geometry Using Compositional VAEs
* Modifying Non-local Variations Across Multiple Views
* Modulated Convolutional Networks
* MoNet: Deep Motion Exploitation for Video Object Segmentation
* MoNet: Moments Embedding Network
* Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints
* Monocular Relative Depth Perception with Web Stereo Data Supervision
* MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
* Motion Segmentation by Exploiting Complementary Geometric Models
* Motion-Appearance Co-memory Networks for Video Question Answering
* Motion-Guided Cascaded Refinement Network for Video Object Segmentation
* MovieGraphs: Towards Understanding Human-Centric Situations from Videos
* Multi-agent Diverse Generative Adversarial Networks
* Multi-cell Detection and Classification Using a Generative Convolutional Model
* Multi-content GAN for Few-Shot Font Style Transfer
* Multi-cue Correlation Filters for Robust Visual Tracking
* Multi-evidence Filtering and Fusion for Multi-label Classification, Object Detection and Semantic Segmentation Based on Weakly Supervised Learning
* Multi-frame Quality Enhancement for Compressed Video
* Multi-image Semantic Matching by Mining Consistent Features
* Multi-label Zero-Shot Learning with Structured Knowledge Graphs
* Multi-level Factorisation Net for Person Re-identification
* Multi-Level Fusion Based 3D Object Detection from Monocular Images
* Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation
* Multi-scale Location-Aware Kernel Representation for Object Detection
* Multi-scale Weighted Nuclear Norm Image Restoration
* Multi-shot Pedestrian Re-identification via Sequential Decision Making
* Multi-task Adversarial Network for Disentangled Feature Learning
* Multi-task Learning by Maximizing Statistical Dependence
* Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
* Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction
* Multi-view Harmonized Bilinear Network for 3D Object Recognition
* Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
* Multimodal Visual Concept Learning with Weakly Supervised Techniques
* Multiple Granularity Group Interaction Prediction
* Multispectral Image Intrinsic Decomposition via Subspace Constraint
* Multistage Adversarial Losses for Pose-Based Human Image Synthesis
* MX-LSTM: Mixing Tracklets and Vislets to Jointly Forecast Trajectories and Head Poses
* NAG: Network for Adversary Generation
* Natural and Effective Obfuscation by Head Inpainting
* NestedNet: Learning Nested Sparse Structures in Deep Neural Networks
* Net2Vec: Quantifying and Explaining How Concepts are Encoded by Filters in Deep Neural Networks
* Network Architecture for Point Cloud Classification via Automatic Depth Images Generation, A
* Neural 3D Mesh Renderer
* Neural Baby Talk
* Neural Kinematic Networks for Unsupervised Motion Retargetting
* Neural Motifs: Scene Graph Parsing with Global Context
* Neural Multi-sequence Alignment TeCHnique (NeuMATCH), A
* Neural Sign Language Translation
* Neural Style Transfer via Meta Networks
* NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning
* NISP: Pruning Networks Using Neuron Importance Score Propagation
* Non-blind Deblurring: Handling Kernel Uncertainty with CNNs
* Non-linear Temporal Subspace Representations for Activity Recognition
* Non-local Neural Networks
* Nonlinear 3D Face Morphable Model
* Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration
* Normalized Cut Loss for Weakly-Supervised CNN Segmentation
* Now You Shake Me: Towards Automatic 4D Cinema
* OATM: Occlusion Aware Template Matching by Consensus Set Maximization
* Object Referring in Videos with Language and Human Gaze
* Objects as Context for Detecting Their Semantic Parts
* Occluded Pedestrian Detection Through Guided Attention in CNNs
* Occlusion Aware Unsupervised Learning of Optical Flow
* Occlusion-Aware Rolling Shutter Rectification of 3D Scenes
* OLE: Orthogonal Low-rank Embedding, A Plug and Play Geometric Loss for Deep Learning
* On the Convergence of PatchMatch and Its Variants
* On the Duality Between Retinex and Image Dehazing
* On the Importance of Label Quality for Semantic Segmentation
* On the Robustness of Semantic Segmentation Models to Adversarial Attacks
* One-Shot Action Localization by Learning Sequence Matching Network
* Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
* Optimal Structured Light a la Carte
* Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition
* Optimizing Video Object Detection via a Scale-Time Lattice
* Ordinal Depth Supervision for 3D Human Pose Estimation
* PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
* PAD-Net: Multi-tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing
* PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup
* Papier-Mache Approach to Learning 3D Surface Generation, A
* Parallel Attention: A Unified Framework for Visual Object Discovery Through Dialogs and Queries
* Partial Transfer Learning with Selective Adversarial Networks
* Partially Shared Multi-task Convolutional Neural Network with Local Constraint for Face Attribute Learning
* Path Aggregation Network for Instance Segmentation
* People, Penguins and Petri Dishes: Adapting Object Counting Models to New Visual Domains and Object Types Without Forgetting
* Perception-Distortion Tradeoff, The
* Perceptual Measure for Deep Single Image Camera Calibration, A
* Person Re-identification with Cascaded Pairwise Convolutions
* Person Transfer GAN to Bridge Domain Gap for Person Re-identification
* Perturbative Neural Networks
* PhaseNet for Video Frame Interpolation
* Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network
* Photometric Stereo in Participating Media Considering Shape-Dependent Forward Scatter
* PiCANet: Learning Pixel-Wise Contextual Attention for Saliency Detection
* PID Controller Approach for Stochastic Optimization of Deep Networks, A
* PieAPP: Perceptual Image-Error Assessment Through Pairwise Preference
* Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
* Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction
* PIXOR: Real-time 3D Object Detection from Point Clouds
* Planar Shape Detection at Structural Scales
* PlaneNet: Piece-Wise Planar Reconstruction from a Single RGB Image
* PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
* PointGrid: A Deep Network for 3D Shape Understanding
* PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition
* Pointwise Convolutional Neural Networks
* Polarimetric Dense Monocular SLAM
* Pose Transferrable Person Re-identification
* Pose-Guided Photorealistic Face Rotation
* Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
* Pose-Sensitive Embedding for Person Re-identification with Expanded Cross Neighborhood Re-ranking, A
* pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
* PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in Videos
* PoseTrack: A Benchmark for Human Pose Estimation and Tracking
* PoTion: Pose MoTion Representation for Action Recognition
* Power of Ensembles for Active Learning in Image Classification, The
* PPFNet: Global Context Aware Local Features for Robust 3D Point Matching
* Practical Block-Wise Neural Network Architecture Generation
* Preserving Semantic Relations for Zero-Shot Learning
* Prior-Less Method for Multi-face Tracking in Unconstrained Videos, A
* Probabilistic Joint Face-Skull Modelling for Facial Reconstruction
* Probabilistic Plant Modeling via Multi-view Image-to-Image Translation
* Progressive Attention Guided Recurrent Network for Salient Object Detection
* Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection
* Pseudo Mask Augmented Object Detection
* PU-Net: Point Cloud Upsampling Network
* Pulling Actions out of Context: Explicit Separation for Effective Combination
* PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
* Pyramid Stereo Matching Network
* Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
* Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation
* R-FCN-3000 at 30fps: Decoupling Detection and Classification
* Radially-Distorted Conjugate Translations
* RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials
* Re-Weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation
* Real-Time Monocular Depth Estimation Using Synthetic Data with Domain Adaptation via Image Style Transfer
* Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
* Real-Time Seamless Single Shot 6D Object Pose Prediction
* Real-World Anomaly Detection in Surveillance Videos
* Real-World Repetition Estimation by Div, Grad and Curl
* Recognize Actions by Disentangling Components of Dynamics
* Recognizing Human Actions as the Evolution of Pose Estimation Maps
* Reconstructing Thin Structures of Manifold Surfaces by Integrating Spatial Curves
* Reconstruction Network for Video Captioning
* Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform
* Recurrent Pixel Embedding for Instance Grouping
* Recurrent Residual Module for Fast Inference in Videos
* Recurrent Saliency Transformation Network: Incorporating Multi-stage Visual Cues for Small Organ Segmentation
* Recurrent Scene Parsing with Perspective Understanding in the Loop
* Recurrent Slice Networks for 3D Segmentation of Point Clouds
* Referring Image Segmentation via Recurrent Refinement Networks
* Referring Relationships
* Reflection Removal for Large-Scale 3D Point Clouds
* Regularizing Deep Networks by Modeling and Predicting Label Structure
* Regularizing RNNs for Caption Generation by Reconstructing the Past with the Present
* Reinforcement Cutting-Agent Learning for Video Object Segmentation
* Relation Networks for Object Detection
* Representing and Learning High Dimensional Data with the Optimal Transport Map from a Probabilistic Viewpoint
* Repulsion Loss: Detecting Pedestrians in a Crowd
* Residual Dense Network for Image Super-Resolution
* Residual Parameter Transfer for Deep Domain Adaptation
* Resource Aware Person Re-identification Across Multiple Resolutions
* Rethinking Feature Distribution for Loss Functions in Image Classification
* Rethinking the Faster R-CNN Architecture for Temporal Action Localization
* Revised Underwater Image Formation Model, A
* Revisiting Deep Intrinsic Image Decompositions
* Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation
* Revisiting Knowledge Transfer for Training Object Class Detectors
* Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
* Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects
* Revisiting Video Saliency: A Large-Scale Benchmark and a New Model
* Reward Learning from Narrated Demonstrations
* Ring Loss: Convex Feature Normalization for Face Recognition
* ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
* RoadTracer: Automatic Extraction of Road Networks from Aerial Images
* Robust Classification with Convolutional Prototype Learning
* Robust Depth Estimation from Auto Bracketed Images
* Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network
* Robust Hough Transform Based 3D Reconstruction from Circular Light Fields
* Robust Method for Strong Rolling Shutter Effects Correction Using Lines with Automatic Feature Selection, A
* Robust Physical-World Attacks on Deep Learning Visual Classification
* Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
* Rolling Shutter and Radial Distortion are Features for High Frame Rate Multi-camera Tracking
* Rotation Averaging and Strong Duality
* Rotation-Sensitive Regression for Oriented Scene Text Detection
* RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints
* Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display
* Salient Object Detection Driven by Fixation Prediction
* SBNet: Sparse Blocks Network for Fast Inference
* Scalable and Effective Deep CCA via Soft Decorrelation
* Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective
* Scale-Recurrent Network for Deep Image Deblurring
* Scale-Transferrable Object Detection
* ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
* SeedNet: Automatic Seed Generation with Deep Reinforcement Learning for Robust Interactive Segmentation
* Seeing Small Faces from Robust Anchor's Perspective
* Seeing Temporal Modulation of Lights from Standard Cameras
* Seeing Voices and Hearing Faces: Cross-Modal Biometric Matching
* SeGAN: Segmenting and Generating the Invisible
* Self-Calibrating Polarising Radiometric Calibration
* Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
* Self-Supervised Feature Learning by Learning to Spot Artifacts
* Self-Supervised Learning of Geometrically Stable Features Through Probabilistic Introspection
* Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz
* Semantic Video Segmentation by Gated Recurrent Flow Propagation
* Semantic Visual Localization
* Semi-Parametric Image Synthesis
* SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text
* Separating Self-Expression and Visual Content in Hashtag Supervision
* Separating Style and Content for Generalized Style Transfer
* SfSNet: Learning Shape, Reflectance and Illuminance of Faces in the Wild
* SGAN: An Alternative Training of Generative Adversarial Networks
* SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation
* Shape from Shading Through Shape Evolution
* Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
* Show Me a Story: Towards Coherent Neural Story Illustration
* ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
* Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control
* Single Image Dehazing via Conditional Generative Adversarial Network
* Single Image Reflection Separation with Perceptual Losses
* Single View Stereo Matching
* Single-Image Depth Estimation Based on Fourier Domain Analysis
* Single-Shot Object Detection with Enriched Semantics
* Single-Shot Refinement Neural Network for Object Detection
* SINT++: Robust Visual Tracking via Adversarial Positive Instance Generation
* Sketch-a-Classifier: Sketch-Based Photo Classifier Generation
* SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval
* SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
* Sliced Wasserstein Distance for Learning Gaussian Mixture Models
* Smooth Neighbors on Teacher Graphs for Semi-Supervised Learning
* SO-Net: Self-Organizing Network for Point Cloud Analysis
* SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion
* Soccer on Your Tabletop
* Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks
* Solving the Perspective-2-Point Problem for Flying-Camera Photo Composition
* SoS-RSC: A Sum-of-Squares Polynomial Approach to Robustifying Subspace Clustering Algorithms
* Sparse Photometric 3D Face Reconstruction Guided by Morphable Models
* Sparse, Smart Contours to Represent and Edit Images
* Spatially-Adaptive Filter Units for Deep Neural Networks
* SPLATNet: Sparse Lattice Networks for Point Cloud Processing
* Spline Error Weighting for Robust Visual-Inertial Fusion
* SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
* Squeeze-and-Excitation Networks
* SSNet: Scale Selection Network for Online 3D Action Prediction
* ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing
* Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
* Stacked Latent Attention for Multimodal Reasoning
* StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation
* Statistical Tomography of Microscopic Life
* Stereoscopic Neural Style Transfer
* Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
* Stochastic Variational Inference with Gradient Linearization
* Structure from Recurrent Motion: From Rigidity to Recurrency
* Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships
* Structure Preserving Video Prediction
* Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
* Structured Set Matching Networks for One-Shot Part Labeling
* Structured Uncertainty Prediction Networks
* Style Aggregated Network for Facial Landmark Detection
* Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
* Super-FAN: Integrated Facial Landmark Localization and Super-Resolution of Real-World Low Resolution Faces in Arbitrary Poses with GANs
* Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes
* Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors
* Surface Networks
* SurfConv: Bridging 3D and 2D Convolution for RGBD Images
* Synthesizing Images of Humans in Unseen Poses
* SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
* Tagging Like Humans: Diverse and Distinct Image Annotation
* Tags2Parts: Discovering Semantic Regions from Shape Tags
* Tangent Convolutions for Dense Prediction in 3D
* Taskonomy: Disentangling Task Transfer Learning
* Teaching Categories to Human Learners with Visual Explanations
* Tell Me Where to Look: Guided Attention Inference Network
* Temporal Deformable Residual Networks for Action Segmentation in Videos
* Temporal Hallucinating for Action Recognition with Few Still Images
* Tensorize, Factorize and Regularize: Robust Visual Relationship Learning
* Textbook Question Answering Under Instructor Guidance with Memory Networks
* Texture Mapping for 3D Reconstruction with RGB-D Sensor
* TextureGAN: Controlling Deep Image Synthesis with Texture Patches
* Thoracic Disease Identification and Localization with Limited Supervision
* Through-Wall Human Pose Estimation Using Radio Signals
* TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays
* Time-Resolved Light Transport Decomposition for Thermal Photometric Stereo
* Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
* TOM-Net: Learning Transparent Object Matting from a Single Image
* Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
* Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
* Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks
* Towards Dense Object Tracking in a 2D Honeybee Hive
* Towards Effective Low-Bitwidth Convolutional Neural Networks
* Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
* Towards High Performance Video Object Detection
* Towards Human-Machine Cooperation: Self-Supervised Sample Mining for Object Detection
* Towards Open-Set Identity Preserving Face Synthesis
* Towards Pose Invariant Face Recognition in the Wild
* Towards Universal Representation for Unseen Action Recognition
* Tracking Multiple Objects Outside the Line of Sight Using Speckle Imaging
* Transductive Unbiased Embedding for Zero-Shot Learning
* Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-identification
* Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network
* Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning
* Trapping Light for Time of Flight
* Triplet-Center Loss for Multi-view 3D Object Retrieval
* Trust Your Model: Light Field Depth Estimation with Inline Occlusion Handling
* Two Can Play This Game: Visual Dialog with Discriminative Question Generation and Answering
* Two-Step Disentanglement Method, A
* Two-Step Quantization for Low-Bit Neural Networks
* Two-Stream Convolutional Networks for Dynamic Texture Synthesis
* Twofold Siamese Network for Real-Time Object Tracking, A
* Uncalibrated Photometric Stereo Under Natural Illumination
* Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation, A
* Unifying Identification and Context Learning for Person Recognition
* Universal Denoising Networks: A Novel CNN Architecture for Image Denoising
* Unreasonable Effectiveness of Deep Features as a Perceptual Metric, The
* Unsupervised Correlation Analysis
* Unsupervised Cross-Dataset Person Re-identification by Transfer Learning of Spatial-Temporal Patterns
* Unsupervised Deep Generative Adversarial Hashing Network
* Unsupervised Discovery of Object Landmarks as Structural Representations
* Unsupervised Domain Adaptation with Similarity Learning
* Unsupervised Feature Learning via Non-parametric Instance Discrimination
* Unsupervised Learning and Segmentation of Complex Activities from Video
* Unsupervised Learning Model for Deformable Medical Image Registration, An
* Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
* Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
* Unsupervised Person Image Synthesis in Arbitrary Poses
* Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution
* Unsupervised Textual Grounding: Linking Words to Image Concepts
* Unsupervised Training for 3D Morphable Model Regression
* UV-GAN: Adversarial Facial UV Map Completion for Pose-Invariant Face Recognition
* V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
* Variational Autoencoders for Deforming 3D Mesh Models
* Variational U-Net for Conditional Appearance and Shape Generation, A
* Very Large-Scale Global SfM by Distributed Motion Averaging
* Video Based Reconstruction of 3D People Models
* Video Captioning via Hierarchical Reinforcement Learning
* Video Person Re-identification with Competitive Snippet-Similarity Aggregation and Co-attentive Snippet Embedding
* Video Rain Streak Removal by Multiscale Convolutional Sparse Coding
* Video Representation Learning Using Discriminative Pooling
* View Extrapolation of Human Body from a Single Image
* Viewpoint-Aware Attentive Multi-view Inference for Vehicle Re-identification
* Viewpoint-Aware Video Summarization
* VirtualHome: Simulating Household Activities Via Programs
* Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments
* Visual Feature Attribution Using Wasserstein GANs
* Visual Grounding Via Accumulated Attention
* Visual Question Answering with Memory-Augmented Networks
* Visual Question Generation as Dual Task of Visual Question Answering
* Visual Question Reasoning on General Dependency Tree
* Visual to Sound: Generating Natural Sound for Videos in the Wild
* VITAL: VIsual Tracking via Adversarial Learning
* VITON: An Image-Based Virtual Try-on Network
* VizWiz Grand Challenge: Answering Visual Questions from Blind People
* VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
* W2F: A Weakly-Supervised to Fully-Supervised Framework for Object Detection
* Wasserstein Introspective Neural Networks
* Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
* Weakly Supervised Action Localization by Sparse Temporal Pooling Network
* Weakly Supervised Coupled Networks for Visual Sentiment Analysis
* Weakly Supervised Facial Action Unit Recognition Through Adversarial Training
* Weakly Supervised Instance Segmentation Using Class Peak Response
* Weakly Supervised Learning of Single-Cell Feature Embeddings
* Weakly Supervised Phrase Localization with Multi-scale Anchored Transformer Network
* Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment
* Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation
* Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features
* Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing
* Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-Grained Classification
* Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos, A
* What do Deep Networks Like to See?
* What have We Learned from Deep Representations for Action Recognition?
* What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
* When will you do what? - Anticipating Temporal Occurrences of Activities
* Where and Why are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks
* Who Let the Dogs Out? Modeling Dog Behavior from Visual Data
* Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination
* Wide Compression: Tensor Ring Nets
* WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection
* Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
* Wrapped Gaussian Process Regression on Riemannian Manifolds
* xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
* Zero-Shot Kernel Learning
* Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs
* Zero-Shot Sketch-Image Hashing
* Zero-Shot Super-Resolution Using Deep Internal Learning
* Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks
* Zigzag Learning for Weakly Supervised Object Detection
* Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
980 for CVPR18

CVPR19 * *CVPR
* 2.5D Visual Sound
* 3D Appearance Super-Resolution With Deep Learning
* 3D Guided Fine-Grained Face Manipulation
* 3D Hand Shape and Pose Estimation From a Single RGB Image
* 3D Hand Shape and Pose From Images in the Wild
* 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training
* 3D Local Features for Direct Pairwise Registration
* 3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis
* 3D Point Capsule Networks
* 3D Shape Reconstruction From Images in the Frequency Domain
* 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans
* 3DN: 3D Deformation Network
* 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
* A-CNN: Annularly Convolutional Neural Networks on Point Clouds
* AANet: Attribute Attention Network for Person Re-Identifications
* ABC: A Big CAD Model Dataset for Geometric Deep Learning
* Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video
* Accelerating Convolutional Neural Networks via Activation Map Compression
* Acoustic Non-Line-Of-Sight Imaging
* Action Recognition From Single Timestamp Supervision in Untrimmed Videos
* Action4D: Online Action Recognition in the Crowd and Clutter
* Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition
* Actively Seeking and Learning From Live Data
* Activity Driven Weakly Supervised Object Detection
* Actor-Critic Instance Segmentation
* AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations
* AdaFrame: Adaptive Frame Selection for Fast Video Recognition
* AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs
* Adapting Object Detectors via Selective Cross-Domain Alignment
* Adaptive Confidence Smoothing for Generalized Zero-Shot Learning
* Adaptive NMS: Refining Pedestrian Detection in a Crowd
* Adaptive Pyramid Context Network for Semantic Segmentation
* Adaptive Transfer Network for Cross-Domain Person Re-Identification
* Adaptive Weighting Multi-Field-Of-View CNN for Semantic Segmentation in Pathology
* AdaptiveFace: Adaptive Margin and Sampling for Face Recognition
* Adaptively Connected Neural Networks
* ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding
* Additive Adversarial Learning for Unbiased Authentication
* ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation
* Adversarial Attacks Beyond the Image Space
* Adversarial Defense by Stratified Convolutional Sparse Coding
* Adversarial Defense Through Network Profiling Based Path Extraction
* Adversarial Inference for Multi-Sentence Video Description
* Adversarial Semantic Alignment for Improved Image Captions
* Adversarial Structure Matching for Structured Prediction Tasks
* AE2-Nets: Autoencoder in Autoencoder Networks
* AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations Rather Than Data
* Aggregation Cross-Entropy for Sequence Recognition
* AIRD: Adversarial Learning Framework for Image Repurposing Detection
* Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation, The
* All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation
* All You Need Is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
* All-Weather Deep Outdoor Lighting Estimation
* Alternative Deep Feature Approach to Line Level Keyword Spotting, An
* Amodal Instance Segmentation With KINS Dataset
* Analysis of Feature Visibility in Non-Line-Of-Sight Measurements
* Animating Arbitrary Objects via Deep Motion Transfer
* Answer Them All! Toward Universal Visual Question Answering Models
* AOGNets: Compositional Grammatical Architectures for Deep Learning
* APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs
* ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
* Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation
* Arbitrary Style Transfer With Style-Attentional Networks
* ArcFace: Additive Angular Margin Loss for Deep Face Recognition
* Argoverse: 3D Tracking and Forecasting With Rich Maps
* Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation
* Assessing Personally Perceived Image Quality via Image Features and Collaborative Filtering
* Assessment of Faster R-CNN in Man-Machine Collaborative Search
* Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors
* Associatively Segmenting Instances and Semantics in Point Clouds
* Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning
* ATOM: Accurate Tracking by Overlap Maximization
* Attending to Discriminative Certainty for Domain Adaptation
* Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model
* Attention Branch Network: Learning of Attention Mechanism for Visual Explanation
* Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition, An
* Attention-Aware Multi-Stroke Style Transfer
* Attention-Based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions
* Attention-Based Dropout Layer for Weakly Supervised Object Localization
* Attention-Guided Network for Ghost-Free High Dynamic Range Imaging
* Attention-Guided Unified Network for Panoptic Segmentation
* Attentive Feedback Network for Boundary-Aware Salient Object Detection
* Attentive Region Embedding Network for Zero-Shot Learning
* Attentive Relational Networks for Mapping Images to Scene Graphs
* Attentive Single-Tasking of Multiple Tasks
* Attribute-Aware Face Aging With Wavelet-Based Generative Adversarial Networks
* Attribute-Driven Feature Disentangling and Temporal Aggregation for Video Person Re-Identification
* Audio Visual Scene-Aware Dialog
* Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
* Auto-Encoding Scene Graphs for Image Captioning
* AutoAugment: Learning Augmentation Strategies From Data
* Automatic Adaptation of Object Detectors to New Domains Using Self-Training
* Automatic Face Aging in Videos via Deep Reinforcement Learning
* BAD SLAM: Bundle Adjusted Direct RGB-D SLAM
* Bag of Tricks for Image Classification with Convolutional Neural Networks
* Balanced Self-Paced Learning for Generative Adversarial Clustering Network
* Barrage of Random Transforms for Adversarially Robust Defense
* BASNet: Boundary-Aware Salient Object Detection
* Bayesian Hierarchical Dynamic Model for Human Action Recognition
* Bayesian Perspective on the Deep Image Prior, A
* BeautyGlow: On-Demand Makeup Transfer Framework With Reversible Generative Network
* Beyond Gradient Descent for Regularized Segmentation Losses
* Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry
* Beyond Volumetric Albedo: A Surface Optimization Framework for Non-Line-Of-Sight Imaging
* Bi-Directional Cascade Network for Perceptual Edge Detection
* Bidirectional Learning for Domain Adaptation of Semantic Segmentation
* Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction
* Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
* Biologically-Constrained Graphs for Global Connectomics Reconstruction
* Blending-Target Domain Adaptation by Adversarial Meta-Adaptation Networks
* Blind Geometric Distortion Correction on Images Through Deep Learning
* Blind Image Deblurring With Local Maximum Gradient Prior
* Blind Super-Resolution With Iterative Kernel Correction
* Blind Visual Motif Removal From a Single Image
* Boosting Local Shape Matching for Dense 3D Face Correspondence
* Bottom-Up Object Detection by Grouping Extreme and Center Points
* Bounding Box Regression With Uncertainty for Accurate Object Detection
* Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation
* BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation
* Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence
* Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera
* Bringing Alive Blurred Moments
* BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames
* Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling
* Building Efficient Deep Neural Networks With Unitary Group Convolutions
* C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection
* C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition
* C3AE: Exploring the Limits of Compact Model for Age Estimation
* CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth
* Camera Lens Super-Resolution
* CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning
* CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection
* Capture, Learning, and Synthesis of 3D Speaking Styles
* Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms
* Cascaded Partial Decoder for Fast and Accurate Salient Object Detection
* Cascaded Projection: End-To-End Network Compression and Acceleration
* Catastrophic Child's Play: Easy to Perform, Hard to Defend Adversarial Attacks
* Causes and Corrections for Bimodal Multi-Path Scanning With Structured Light
* Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure
* ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation
* Character Region Awareness for Text Detection
* Characterizing and Avoiding Negative Transfer
* Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation
* CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification
* Class-Balanced Loss Based on Effective Number of Samples
* Classification-Reconstruction Learning for Open-Set Recognition
* CLEVR-Ref+: Diagnosing Visual Reasoning With Referring Expressions
* ClusterNet: Deep Hierarchical Cluster Network With Rigorously Rotation-Invariant Representation for Point Cloud Analysis
* Co-Occurrence Neural Network
* Co-Occurrent Features in Semantic Segmentation
* Co-Saliency Detection via Mask-Guided Fully Convolutional Networks With Multi-Scale Label Smoothing
* COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis
* Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images
* Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images
* Collaborative Spatiotemporal Feature Learning for Video Action Recognition
* CollaGAN: Collaborative GAN for Missing Image Data Imputation
* Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks
* Combinatorial Persistency Criteria for Multicut and Max-Cut
* Combining 3D Morphable Models: A Large Scale Face-And-Head Model
* ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples
* Compact Embedding for Facial Expression Similarity, A
* Compact Feature Learning for Multi-Domain Image Classification
* Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
* Complete the Look: Scene-Based Complementary Product Recommendation
* Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization
* Composing Text and Image for Image Retrieval - an Empirical Odyssey
* Compressing Convolutional Neural Networks via Factorized Convolutional Filters
* Compressing Unknown Images With Product Quantizer for Efficient Zero-Shot Classification
* Conditional Adversarial Generative Flow for Controllable Image Synthesis
* Conditional Single-View Shape Generation for Multi-View Stereo Reconstruction
* Connecting the Dots: Learning Representations for Active Monocular Depth Estimation
* Connecting Touch and Vision via Cross-Modal Prediction
* Constrained Generative Adversarial Networks for Interactive Image Generation
* ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging
* Content Authentication for Neural Imaging Pipelines: End-To-End Optimization of Photo Provenance in Complex Distribution Channels
* Content Transformation Block for Image Style Transfer, A
* Content-Aware Multi-Level Guidance for Interactive Instance Segmentation
* Context and Attribute Grounded Dense Captioning
* Context-Aware Crowd Counting
* Context-Aware Spatio-Recurrent Curvilinear Structure Segmentation
* Context-Aware Visual Compatibility Prediction
* Context-Reinforced Semantic Segmentation
* ContextDesc: Local Descriptor Augmentation With Cross-Modality Context
* Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection
* Contrastive Adaptation Network for Unsupervised Domain Adaptation
* Convex Relaxation for Multi-Graph Matching, A
* Convolutional Mesh Regression for Single-Image Human Shape Reconstruction
* Convolutional Neural Networks Can Be Deceived by Visual Illusions
* Convolutional Recurrent Network for Road Boundary Extraction
* Convolutional Relational Machine for Group Activity Recognition
* Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation
* Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry
* CRAVES: Controlling Robotic Arm With a Vision-Based Economic System
* CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency
* Creative Flow+ Dataset
* Cross Domain Model Compression by Structurally Weight Sharing
* Cross-Atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface
* Cross-Classification Clustering: An Efficient Multi-Object Tracking Technique for 3-D Instance Segmentation in Connectomics
* Cross-Modal Relationship Inference for Grounding Referring Expressions
* Cross-Modal Self-Attention Network for Referring Image Segmentation
* Cross-Modality Personalization for Retrieval
* Cross-Season Correspondence Dataset for Robust Semantic Segmentation, A
* Cross-Task Weakly Supervised Learning From Instructional Videos
* CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation
* Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks
* CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark
* Curls and Whey: Boosting Black-Box Adversarial Attacks
* Customizable Architecture Search for Semantic Segmentation
* Cycle-Consistency for Robust Visual Question Answering
* Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation
* d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding
* D2-Net: A Trainable CNN for Joint Description and Detection of Local Features
* D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation
* Dance With Flow: Two-In-One Stream Action Detection
* DARNet: Deep Active Ray Network for Building Segmentation
* Data Augmentation Using Learned Transformations for One-Shot Medical Image Segmentation
* Data Representation and Learning With Graph Diffusion-Embedding Networks
* Data-Driven Neuron Allocation for Scale Aggregation Networks
* Dataset and Benchmark for Large-Scale Multi-Modal Face Anti-Spoofing, A
* DAVANet: Stereo Deblurring With View Aggregation
* DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition
* Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation
* Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem, A
* Decorrelated Adversarial Learning for Age-Invariant Face Recognition
* Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
* Deep Asymmetric Metric Learning via Rich Relationship Mining
* Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence
* Deep ChArUco: Dark ChArUco Marker Pose Estimation
* Deep Defocus Map Estimation Using Domain Adaptation
* Deep Dual Relation Modeling for Egocentric Interaction Recognition
* Deep Embedding Learning With Discriminative Sampling Policy
* Deep Exemplar-Based Video Colorization
* Deep Fitting Degree Scoring Network for Monocular 3D Object Detection
* Deep Flow-Guided Video Inpainting
* Deep Geometric Prior for Surface Reconstruction
* Deep Global Generalized Gaussian Networks
* Deep High-Resolution Representation Learning for Human Pose Estimation
* Deep Incremental Hashing Network for Efficient Image Retrieval
* Deep Metric Learning Beyond Binary Supervision
* Deep Metric Learning to Rank
* Deep Modular Co-Attention Networks for Visual Question Answering
* Deep Multimodal Clustering for Unsupervised Audiovisual Learning
* Deep Network Interpolation for Continuous Imagery Effect Transition
* Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels
* Deep Reinforcement Learning of Volume-Guided Progressive View Inpainting for 3D Point Scene Completion From a Single Depth Image
* Deep Rigid Instance Scene Flow
* Deep RNN Framework for Visual Sequential Applications
* Deep Robust Subjective Visual Property Prediction in Crowdsourcing
* Deep Single Image Camera Calibration With Radial Distortion
* Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing
* Deep Sky Modeling for Single Image Outdoor Lighting Estimation
* Deep Spectral Clustering Using Dual Autoencoder Network
* Deep Spherical Quantization for Image Search
* Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring
* Deep Supervised Cross-Modal Retrieval
* Deep Surface Normal Estimation With Hierarchical RGB-D Fusion
* Deep Transfer Learning for Multiple Class Novelty Detection
* Deep Tree Learning for Zero-Shot Face Anti-Spoofing
* Deep Video Inpainting
* Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
* DeepCaps: Going Deeper With Capsule Networks
* DeepCO3: Deep Instance Co-Segmentation by Co-Peak Search and Co-Saliency Detection
* Deeper and Wider Siamese Networks for Real-Time Visual Tracking
* DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images
* DeepFlux for Skeletons in the Wild
* DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image
* DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality
* Deeply-Supervised Knowledge Synergy
* DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds
* DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation
* DeepView: View Synthesis With Learned Gradient Descent
* DeepVoxels: Learning Persistent 3D Feature Embeddings
* Defending Against Adversarial Attacks by Randomized Diversification
* Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search
* Deformable ConvNets V2: More Deformable, Better Results
* DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-Scale Deep Features
* Dense 3D Face Decoding Over 2500FPS: Joint Texture and Shape Convolutional Mesh Decoders
* Dense Classification and Implanting for Few-Shot Learning
* Dense Depth Posterior (DDP) From Single Image and Sparse Range
* Dense Intrinsic Appearance Flow for Human Pose Transfer
* Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
* DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
* Densely Semantically Aligned Person Re-Identification
* Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization
* Depth Coefficients for Depth Completion
* Depth From a Polarisation + RGB Stereo Pair
* Depth-Attentional Features for Single-Image Rain Removal
* Depth-Aware Video Frame Interpolation
* Describing Like Humans: On Diversity in Image Captioning
* Destruction and Construction Learning for Fine-Grained Image Recognition
* Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation
* Detect-To-Retrieve: Efficient Regional Aggregation for Image Search
* Detecting Overfitting of Deep Generative Networks via Latent Recovery
* Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View
* Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations
* DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
* Dichromatic Model Based Temporal Color Constancy for AC Light Sources
* Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates
* Direct Object Recognition Without Line-Of-Sight Using Optical Coherence
* Discovering Fair Representations in the Data Domain
* Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning
* Disentangled Representation Learning for 3D Face Shape
* Disentangling Adversarial Robustness and Generalization
* Disentangling Latent Hands for Image Synthesis and Pose Estimation
* Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions
* Dissecting Person Re-Identification From the Viewpoint of Viewpoint
* Dissimilarity Coefficient Based Weakly Supervised Object Detection
* Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation
* Distilled Person Re-Identification: Towards a More Scalable System
* DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs
* Distilling Object Detectors With Fine-Grained Feature Imitation
* Distraction-Aware Shadow Detection
* Divergence Prior and Vessel-Tree Reconstruction
* Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model
* Diverse Generation for Multi-Agent Sports Games
* Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection
* Divide and Conquer the Embedding Space for Metric Learning
* DLOW: Domain Flow for Adaptation and Generalization
* DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis
* DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition
* Do Better ImageNet Models Transfer Better?
* Does Learning Specific Features for Related Parts Help Human Pose Estimation?
* Domain Generalization by Solving Jigsaw Puzzles
* Domain Transform Solver, The
* Domain-Specific Batch Normalization for Unsupervised Domain Adaptation
* Domain-Symmetric Networks for Adversarial Domain Adaptation
* Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval
* Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering
* Double-DIP: Unsupervised Image Decomposition via Coupled Deep-Image-Priors
* Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution
* DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios
* DSFD: Dual Shot Face Detector
* Dual Attention Network for Scene Segmentation
* Dual Encoding for Zero-Example Video Retrieval
* Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration
* DuDoNet: Dual Domain Network for CT Metal Artifact Reduction
* DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama
* DVC: An End-To-End Deep Video Compression Framework
* Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering
* Dynamic Recursive Neural Network
* Dynamic Scene Deblurring With Parameter Selective Sharing and Nested Skip Connections
* Dynamics Are Important for the Recognition of Equine Pain in Video
* DynTypo: Example-Based Dynamic Text Effects Transfer
* ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model
* Edge-Labeling Graph Neural Network for Few-Shot Learning
* Effective Aesthetics Prediction With Multi-Level Spatially Pooled Features
* Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition
* Efficient Featurized Image Pyramid Network for Single Shot Detector
* Efficient Multi-Domain Learning by Covariance Normalization
* Efficient Neural Network Compression
* Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields
* Efficient Parameter-Free Clustering Using First Neighbor Relations
* Efficient Schmidt-EKF for 3D Visual-Inertial SLAM, An
* Efficient Video Classification Using Fewer Frames
* EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching From Scratch
* Elastic Boundary Projection for 3D Medical Image Segmentation
* ELASTIC: Improving CNNs With Dynamic Scaling Policies
* Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking
* Embedding Complementary Deep Networks for Image Classification
* Embodied Question Answering in Photorealistic Environments With Point Cloud Perception
* Emotion-Aware Human Attention Prediction
* End-To-End Efficient Representation Learning via Cascading Combinatorial Optimization
* End-To-End Interpretable Neural Motion Planner
* End-To-End Learned Random Walker for Seeded Image Segmentation
* End-To-End Multi-Task Learning With Attention
* End-To-End Network for Generating Social Relationship Graphs, An
* End-To-End Network for Panoptic Segmentation, An
* End-To-End Projector Photometric Compensation
* End-To-End Supervised Product Quantization for Image Search and Retrieval
* End-To-End Time-Lapse Video Synthesis From a Single Outdoor Image
* Engaging Image Captioning via Personality
* Enhanced Bayesian Compression via Deep Reinforcement Learning
* Enhanced Pix2pix Dehazing Network
* Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network
* Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification
* Ensemble Deep Manifold Similarity Learning Using Hard Proxies
* ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification
* ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
* Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video
* EV-Gait: Event-Based Robust Gait Recognition Using Dynamic Vision Sensors
* Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks
* Event Cameras, Contrast Maximization and Reward Functions: An Analysis
* Event-Based High Dynamic Range Image and Very High Frame Rate Video Generation Using Conditional Generative Adversarial Networks
* EventNet: Asynchronous Recursive Event Processing
* Events-To-Video: Bringing Modern Computer Vision to Event Cameras
* Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables
* Example-Guided Style-Consistent Image Synthesis From Semantic Labeling
* Explainability Methods for Graph Convolutional Neural Networks
* Explainable and Explicit Visual Reasoning Over Scene Graphs
* Explicit Bias Discovery in Visual Question Answering Models
* Explicit Spatial Encoding for Deep Local Descriptors
* Exploiting Edge Features for Graph Neural Networks
* Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
* Exploiting Temporal Context for 3D Human Pose Estimation in the Wild
* Explore-Exploit Graph Traversal for Image Retrieval
* Exploring Context and Visual Pattern of Relationship for Scene Graph Generation
* Exploring Object Relation in Mean Teacher for Cross-Domain Detection
* Exploring the Bounds of the Utility of Context for Object Detection
* Expressive Body Capture: 3D Hands, Face, and Body From a Single Image
* Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion
* F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning
* FA-RPN: Floating Region Proposals for Face Detection
* Face Anti-Spoofing: Model Matters, so Does Data
* Face Parsing With RoI Tanh-Warping
* Face-Focused Cross-Stream Network for Deception Detection in Videos
* Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally
* Factor Graph Attention
* Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models
* Fast and Robust Multi-Person 3D Pose Estimation From Multiple Views
* Fast Human Pose Estimation
* Fast Interactive Object Annotation With Curve-GCN
* Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
* Fast Object Class Labelling via Speech
* Fast Online Object Tracking and Segmentation: A Unifying Approach
* Fast Single Image Reflection Suppression via Convex Optimization
* Fast Spatially-Varying Indoor Lighting Estimation
* Fast Spatio-Temporal Residual Network for Video Super-Resolution
* Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks
* Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech
* FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network
* FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
* Feature Denoising for Improving Adversarial Robustness
* Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
* Feature Selective Anchor-Free Module for Single-Shot Object Detection
* Feature Space Perturbations Yield More Transferable Adversarial Examples
* Feature Transfer Learning for Face Recognition With Under-Represented Data
* Feature-Level Frankenstein: Eliminating Variations for Discriminative Recognition
* Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks
* Feedback Network for Image Super-Resolution
* FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation
* Few-Shot Adaptive Faster R-CNN
* Few-Shot Learning via Saliency-Guided Hallucination of Samples
* Few-Shot Learning With Localization in Realistic Settings
* FickleNet: Weakly and Semi-Supervised Semantic Image Segmentation Using Stochastic Inference
* Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration
* FilterReg: Robust and Efficient Probabilistic Point-Set Registration Using Gaussian Filter and Twist Parameterization
* Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
* FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery
* Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage
* Flexible Convolutional Solver for Fast Style Transfers, A
* FlowNet3D: Learning Scene Flow in 3D Point Clouds
* FML: Face Model Learning From Videos
* FOCNet: A Fractional Optimal Control Network for Image Denoising
* Focus Is All You Need: Loss Functions for Event-Based Vision
* Foreground-Aware Image Inpainting
* Frame-Consistent Recurrent Video Deraining With Dual-Level Flow
* From Coarse to Fine: Robust Hierarchical Localization at Large Scale
* From Recognition to Cognition: Visual Commonsense Reasoning
* FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image
* Fully Automatic Video Colorization With Self-Regularization and Diversity
* Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
* Fully Quantized Network for Object Detection
* GA-Net: Guided Aggregation Net for End-To-End Stereo Matching
* Gait Recognition via Disentangled Representation Learning
* GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction
* Gaussian Temporal Awareness Networks for Action Localization
* GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation
* General and Adaptive Robust Loss Function, A
* Generalising Fine-Grained Sketch-Based Image Retrieval
* Generalizable Person Re-Identification by Domain-Invariant Mapping Network
* Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression
* Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders
* Generalized Zero-Shot Recognition Based on Visually Semantic Embedding
* Generalizing Eye Tracking With Bayesian Adversarial Learning
* Generating 3D Adversarial Point Clouds
* Generating Classification Weights With GNN Denoising Autoencoders for Few-Shot Learning
* Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network
* Generative Adversarial Density Estimator, A
* Generative Appearance Model for End-To-End Video Object Segmentation, A
* Generative Dual Adversarial Network for Generalized Zero-Shot Learning
* Geometry-Aware Distillation for Indoor Semantic Segmentation
* Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation
* Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping
* GeoNet: Deep Geodesic Networks for Point Cloud Analysis
* GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching
* GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images
* Global Second-Order Pooling Convolutional Networks
* Good News, Everyone! Context Driven Entity-Aware Captioning for News Images
* Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild
* GPSfM: Global Projective SFM Using Algebraic Constraints on Multi-View Fundamental Matrices
* GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
* Gradient Matching Generative Networks for Zero-Shot Learning
* Graph Attention Convolution for Point Cloud Semantic Segmentation
* Graph Convolutional Label Noise Cleaner: Train a Plug-And-Play Action Classifier for Anomaly Detection
* Graph Convolutional Tracking
* Graph-Based Global Reasoning Networks
* Graphical Contrastive Losses for Scene Graph Parsing
* Graphonomy: Universal Human Parsing via Graph Transfer Learning
* Greedy Structure Learning of Hierarchical Compositional Models
* Grid R-CNN
* Grounded Video Description
* Grounding Human-To-Vehicle Advice for Self-Driving Vehicles
* Group Sampling for Scale Invariant Face Detection
* Group-Wise Correlation Stereo Network
* GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving
* GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud
* Guaranteed Matrix Completion Under Multiple Linear Transformations
* Guided Stereo Matching
* H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions
* Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning
* HAQ: Hardware-Aware Automated Quantization With Mixed Precision
* Hardness-Aware Deep Metric Learning
* Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning
* HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs
* Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
* Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss
* Hierarchical Deep Stereo Matching on High-Resolution Images
* Hierarchical Discrete Distribution Decomposition for Match Density Estimation
* Hierarchical Disentanglement of Discriminative Latent Features for Zero-Shot Learning
* Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction
* High Flux Passive Imaging With Single-Photon Sensors
* High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection
* High-Quality Face Capture Using Anatomical Muscles
* Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology
* HoloPose: Holistic 3D Human Reconstruction In-The-Wild
* Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation
* HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data Augmentation
* How to Make a Pizza: Learning a Compositional Layer-Based GAN Model
* HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds
* Hybrid Scene Compression for Visual Localization
* Hybrid Task Cascade for Instance Segmentation
* Hybrid-Attention Based Decoupled Metric Learning for Zero-Shot Image Retrieval
* Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior
* Hyperspectral Image Super-Resolution With Optimized RGB Guidance
* Hyperspectral Imaging With Random Printed Mask
* IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction
* IM-Net for High Resolution Video Frame Interpolation
* Im2Pencil: Controllable Pencil Illustration From Photographs
* Image Deformation Meta-Networks for One-Shot Learning
* Image Generation From Layout
* Image Super-Resolution by Neural Texture Transfer
* Image-Question-Answer Synergistic Network for Visual Dialog
* Image-To-Image Translation via Group-Wise Deep Whitening-And-Coloring Transformation
* Importance Estimation for Neural Network Pruning
* Improved Road Connectivity by Joint Learning of Orientation and Segmentation
* Improving Action Localization by Progressive Cross-Stream Cooperation
* Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis
* Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing
* Improving Semantic Segmentation via Video Propagation and Label Relaxation
* Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training
* Improving Transferability of Adversarial Examples With Input Diversity
* In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images
* In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations
* Incremental Object Learning From Contiguous Views
* Information Maximizing Visual Question Generation
* Informative Object Annotations: Tell Me Something I Don't Know
* Inserting Videos Into Videos
* Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
* Instance-Level Meta Normalization
* Intention Oriented Image Captions With Guiding Objects
* Interaction-And-Aggregation Network for Person Re-Identification
* Interactive Full Image Segmentation by Considering All Regions Jointly
* Interactive Image Segmentation via Backpropagating Refinement Scheme
* Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks
* Interpreting CNNs via Decision Trees
* Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-Identification
* Inverse Cooking: Recipe Generation From Food Images
* Inverse Discriminative Networks for Handwritten Signature Verification
* Inverse Path Tracing for Joint Material and Lighting Estimation
* Inverse Procedural Modeling of Knitwear
* InverseRenderNet: Learning Single Image Inverse Rendering
* IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition
* IRLAS: Inverse Reinforcement Learning for Architecture Search
* Isospectralization, or How to Hear Shape, Style, and Correspondence
* It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning
* Iterative Alignment Network for Continuous Sign Language Recognition
* Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection, An
* Iterative Normalization: Beyond Standardization Towards Efficient Whitening
* Iterative Projection and Matching: Finding Structure-Preserving Representatives and Its Application to Computer Vision
* Iterative Reorganization With Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning
* Iterative Residual CNNs for Burst Photography Applications
* Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation
* Joint Discriminative and Generative Learning for Person Re-Identification
* Joint Face Detection and Facial Motion Retargeting for Multiple Faces
* Joint Manifold Diffusion for Combining Predictions on Decoupled Observations
* Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation
* Joint Representative Selection and Feature Learning: A Semi-Supervised Approach
* JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds With Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields
* Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure From Motion
* K-Nearest Neighbors Hashing
* KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing
* Kernel Transformer Networks for Compact Spherical Convolution
* Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations, A
* Kervolutional Neural Networks
* Knockoff Nets: Stealing Functionality of Black-Box Models
* Knowing When to Stop: Evaluation and Verification of Conformity to Output-Size Specifications
* Knowledge Adaptation for Efficient Semantic Segmentation
* Knowledge Distillation via Instance Relationship Graph
* Knowledge-Embedded Routing Network for Scene Graph Generation
* L3-Net: Towards Learning Based LiDAR Localization for Autonomous Driving
* Label Efficient Semi-Supervised Learning via Graph Filtering
* Label Propagation for Deep Semi-Supervised Learning
* Label-Noise Robust Generative Adversarial Networks
* LAEO-Net: Revisiting People Looking at Each Other in Videos
* LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation
* Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model
* Large Scale High-Resolution Land Cover Mapping With Multi-Resolution Data
* Large Scale Incremental Learning
* Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks
* Large-Scale Few-Shot Learning: Knowledge Transfer With Class Hierarchy
* Large-Scale Interactive Object Segmentation With Human Annotators
* Large-Scale Long-Tailed Recognition in an Open World
* Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition
* Large-Scale, Metric Structure From Motion for Unordered Light Fields
* LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving
* LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning
* LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking
* Late Fusion CNN for Digital Matting, A
* Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation
* Latent Space Autoregression for Novelty Detection
* Layout-Graph Reasoning for Fashion Landmark Detection
* LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds
* Learning 3D Human Dynamics From Video
* Learning a Deep ConvNet for Multi-Label Classification With Partial Labels
* Learning a Unified Classifier Incrementally via Rebalancing
* Learning Active Contour Models for Medical Image Segmentation
* Learning Actor Relation Graphs for Group Activity Recognition
* Learning Attraction Field Representation for Robust Line Segment Detection
* Learning Binary Code for Personalized Fashion Recommendation
* Learning Channel-Wise Interactions for Binary Convolutional Neural Networks
* Learning Context Graph for Person Search
* Learning Correspondence From the Cycle-Consistency of Time
* Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images
* Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences
* Learning From Noisy Labels by Regularized Estimation of Annotator Confusion
* Learning From Synthetic Data for Crowd Counting in the Wild
* Learning Image and Video Compression Through Spatial-Temporal Energy Compaction
* Learning Implicit Fields for Generative Shape Modeling
* Learning Independent Object Motion From Unlabelled Stereoscopic Videos
* Learning Individual Styles of Conversational Gesture
* Learning Instance Activation Maps for Weakly Supervised Instance Segmentation
* Learning Joint Gait Representation via Quintuplet Loss Minimization
* Learning Joint Reconstruction of Hands and Manipulated Objects
* Learning Linear Transformations for Fast Image and Video Style Transfer
* Learning Loss for Active Learning
* Learning Metrics From Teachers: Compact Networks for Image Embedding
* Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge
* Learning Multi-Class Segmentations From Single-Class Datasets
* Learning Non-Volumetric Depth Fusion Using Successive Reprojections
* Learning Not to Learn: Training Deep Neural Networks With Biased Data
* Learning Parallax Attention for Stereo Image Super-Resolution
* Learning Personalized Modular Network Guided by Structured Knowledge
* Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting
* Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos
* Learning RoI Transformer for Oriented Object Detection in Aerial Images
* Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach
* Learning Shape-Aware Embedding for Scene Text Detection
* Learning Single-Image Depth From Videos Using Quality Assessment Networks
* Learning Spatial Common Sense With Geometry-Aware Recurrent Networks
* Learning Spatio-Temporal Representation With Local and Global Diffusion
* Learning Structure-And-Motion-Aware Rolling Shutter Correction
* Learning the Depths of Moving People by Watching Frozen People
* Learning to Adapt for Stereo
* Learning to Calibrate Straight Lines for Fisheye Image Rectification
* Learning to Cluster Faces on an Affinity Graph
* Learning to Compose Dynamic Tree Structures for Visual Contexts
* Learning to Detect Human-Object Interactions With Knowledge
* Learning to Explain With Complemental Examples
* Learning to Explore Intrinsic Saliency for Stereoscopic Video
* Learning to Extract Flawless Slow Motion From Blurry Videos
* Learning to Film From Professional Human Motion Videos
* Learning to Generate Synthetic Data via Compositing
* Learning to Learn From Noisy Labeled Data
* Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
* Learning to Learn Image Classifiers With Visual Analogy
* Learning to Learn Relation for Important People Detection in Still Images
* Learning to Localize Through Compressed Binary Maps
* Learning to Minify Photometric Stereo
* Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss
* Learning to Reconstruct People in Clothing From a Single RGB Camera
* Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification
* Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision
* Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning
* Learning to Sample
* Learning to Separate Multiple Illuminants in a Single Image
* Learning to Synthesize Motion Blur
* Learning to Transfer Examples for Partial Domain Adaptation
* Learning Transformation Synchronization
* Learning Unsupervised Video Object Segmentation Through Visual Attention
* Learning Video Representations From Correspondence Proposals
* Learning View Priors for Single-View 3D Reconstruction
* Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition
* Learning Without Memorizing
* Learning Words by Drawing Images
* Learning-Based Sampling for Natural Image Matting
* Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces
* Lending Orientation to Neural Networks for Cross-View Geo-Localization
* Less Is More: Learning Highlight Detection From Video Duration
* Leveraging Crowdsourced GPS Data for Road Extraction From Aerial Imagery
* Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting
* Leveraging Shape Completion for 3D Siamese Tracking
* Leveraging the Invariant Side of Generative Zero-Shot Learning
* Libra R-CNN: Towards Balanced Learning for Object Detection
* LiFF: Light Field Features in Scale and Depth
* Lifting Vectorial Variational Problems: A Natural Formulation Based on Geometric Measure Theory and Discrete Exterior Calculus
* Light Field Messaging With Deep Photographic Steganography
* Linkage Based Face Clustering via Graph Convolution Network
* Listen to the Image
* LiveSketch: Query Perturbations for Guided Sketch-Based Visual Search
* LO-Net: Deep Real-Time Lidar Odometry
* Local Block Coordinate Descent Algorithm for the CSC Model, A
* Local Detection of Stereo Occlusion Boundaries
* Local Features and Visual Words Emerge in Activations
* Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection
* Local Temporal Bilinear Pooling for Fine-Grained Action Parsing
* Local to Global Learning: Gradually Adding Classes for Training Deep Neural Networks
* Locating Objects Without Bounding Boxes
* Long-Term Feature Banks for Detailed Video Understanding
* Look Back and Predict Forward in Image Captioning
* Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes
* Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition
* Low-Rank Laplacian-Uniform Mixed Model for Robust Face Recognition
* Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms
* LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks
* LSTA: Long Short-Term Attention for Egocentric Action Recognition
* LVIS: A Dataset for Large Vocabulary Instance Segmentation
* Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds
* MAGSAC: Marginalizing Sample Consensus
* Main/Subsidiary Network Framework for Simplifying Binary Neural Networks, A
* MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment
* ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features
* MAP Inference via Block-Coordinate Frank-Wolfe Algorithm
* Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map
* Marginalized Latent Semantic Encoder for Zero-Shot Learning
* MARS: Motion-Augmented RGB Stream for Action Recognition
* Mask Scoring R-CNN
* Mask-Guided Portrait Editing With Conditional GANs
* Max-Sliced Wasserstein Distance and Its Use for GANs
* MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors
* MBS: Macroblock Scaling for CNN Model Reduction
* Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics
* Memory-Attended Recurrent Network for Video Captioning
* MeshAdv: Adversarial Meshes for Visual Recognition
* Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset
* Meta-Learning With Differentiable Convex Optimization
* Meta-SR: A Magnification-Arbitrary Network for Super-Resolution
* Meta-Transfer Learning for Few-Shot Learning
* MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition
* Metric Learning for Image Registration
* MFAS: Multimodal Fusion Architecture Search
* MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation
* Min-Max Statistical Alignment for Transfer Learning
* Mind Your Neighbours: Image Annotation With Metadata Neighbourhood Graph Co-Attention Networks
* Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment
* MirrorGAN: Learning Text-To-Image Generation by Redescription
* Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach
* Mixed Effects Neural Networks (MeNets) With Applications to Gaze Estimation
* Mixture Density Generative Adversarial Networks
* MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction
* MnasNet: Platform-Aware Neural Architecture Search for Mobile
* Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis
* Model-Blind Video Denoising via Frame-To-Frame Training
* Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN
* Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling
* Modularized Textual Grounding for Counterfactual Resilience
* Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers
* Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction
* Monocular Depth Estimation Using Relative Depth Maps
* Monocular Total Capture: Posing Face, Body, and Hands in the Wild
* Motion Estimation of Non-Holonomic Ground Vehicles From a Single Feature Correspondence Measured Over N Views
* MOTS: Multi-Object Tracking and Segmentation
* Moving Object Detection Under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition
* MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation
* MSCap: Multi-Style Image Captioning With Unpaired Stylized Text
* Multi-Adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection
* Multi-Agent Tensor Fusion for Contextual Trajectory Prediction
* Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation
* Multi-Granularity Generator for Temporal Action Proposal
* Multi-Label Image Recognition With Graph Convolutional Networks
* Multi-Level Context Ultra-Aggregation for Stereo Matching
* Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding
* Multi-Person Articulated Tracking With Spatial and Temporal Embeddings
* Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information
* Multi-Scale Geometric Consistency Guided Multi-View Stereo
* Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning
* Multi-Source Weak Supervision for Saliency Detection
* Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks
* Multi-Target Embodied Question Answering
* Multi-Task Learning of Hierarchical Vision-Language Representation
* Multi-Task Multi-Sensor Fusion for 3D Object Detection
* Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations
* Multimodal Explanations by Predicting Counterfactuality in Videos
* Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net
* Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds
* Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation
* MUREL: Multimodal Relational Reasoning for Visual Question Answering
* Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision, A
* Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification
* MVF-Net: Multi-View 3D Face Morphable Model Regression
* MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
* NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
* Natural and Realistic Single Image Super-Resolution With Explicit Natural Manifold Discrimination
* NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction
* Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks
* Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks
* NetTailor: Tuning the Architecture, Not Just the Weights
* Networks for Joint Affine and Non-Parametric Image Registration
* Neural Illumination: Lighting Prediction for Indoor Environments
* Neural Network Based on SPD Manifold Learning for Skeleton-Based Hand Gesture Recognition, A
* Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization
* Neural Rerendering in the Wild
* Neural RGB→D Sensing: Depth and Uncertainty From a Video Camera
* Neural Scene Decomposition for Multi-Person Motion Capture
* Neural Sequential Phrase Grounding (SeqGROUND)
* Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration
* Neural Temporal Model for Human Motion Prediction, A
* Neuro-Inspired Eye Tracking With Eye Movement Dynamics
* Neurobiological Evaluation Metric for Neural Network Model Search, A
* NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences
* Noise-Aware Unsupervised Deep Lidar-Stereo Fusion
* Noise-Tolerant Paradigm for Training Face Recognition CNNs
* Noise2Void - Learning Denoising From Single Noisy Images
* Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors
* Non-Local Meets Global: An Integrated Paradigm for Hyperspectral Denoising
* Normalized Diversification
* Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation
* Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection
* Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses
* Not Using the Car to See the Sidewalk -- Quantifying and Controlling the Effects of Context in Classification and Segmentation
* Object Counting and Instance Segmentation With Image-Level Supervision
* Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering
* Object Discovery in Videos as Foreground Motion Clustering
* Object Instance Annotation With Deep Extreme Level Set Evolution
* Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters
* Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning
* Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video
* Object-Driven Text-To-Image Synthesis via Adversarial Training
* Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks
* Occupancy Networks: Learning 3D Reconstruction in Function Space
* OCGAN: One-Class Novelty Detection Using GANs With Constrained Latent Representations
* Octree Guided CNN With Spherical Kernels for 3D Point Clouds
* ODE-Inspired Network Design for Single Image Super-Resolution
* OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks
* OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
* On Exploring Undetermined Relationships for Visual Relationship Detection
* On Finding Gray Pixels
* On Implicit Filter Level Sparsity in Convolutional Neural Networks
* On Learning Density Aware Embeddings
* On Stabilizing Generative Adversarial Training With Noise
* On the Continuity of Rotation Representations in Neural Networks
* On the Intrinsic Dimensionality of Image Representations
* On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions
* On Zero-Shot Recognition of Generic Objects
* Online High Rank Matrix Completion
* Orthogonal Decomposition Network for Pixel-Wise Binary Classification
* Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition
* Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction
* P2SGrad: Refined Gradients for Optimizing Deep Face Models
* P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification
* PA3D: Pose-Action 3D Machine for Video Recognition
* Panoptic Feature Pyramid Networks
* Panoptic Segmentation
* Parallel Optimal Transport GAN
* Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack
* Parametric Top-View Representation of Complex Road Scenes, A
* Parsing R-CNN for Instance-Level Human Analysis
* Part-Regularized Near-Duplicate Vehicle Re-Identification
* Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search
* PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding
* PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation
* Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification
* Patch-Based Progressive 3D Point Set Upsampling
* Path-Invariant Map Networks
* Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation
* Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention
* PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval
* PDE Acceleration for Active Contours
* Pedestrian Detection With Autoregressive Network Phases
* Peeking Into the Future: Predicting Future Person Activities and Locations in Videos
* PEPSI: Fast Image Inpainting With Parallel Decoding Network
* Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification
* Perceptual Prediction Framework for Self Supervised Event Segmentation, A
* Perfect Match: 3D Point Cloud Matching With Smoothed Densities, The
* Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras
* Phase-Only Image Based Kernel Estimation for Single Image Blind Deblurring
* Photo Wake-Up: 3D Character Animation From a Single Photo
* Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction
* Photon-Flooded Single-Photon 3D Cameras
* PIEs: Pose Invariant Embeddings
* PifPaf: Composite Fields for Human Pose Estimation
* Pixel-Adaptive Convolutional Neural Networks
* PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image
* Pluralistic Image Completion
* PMS-Net: Robust Haze Removal Based on Patch Map for Single Images
* Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning
* Point In, Box Out: Beyond Counting Persons in Crowds
* Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer
* PointConv: Deep Convolutional Networks on 3D Point Clouds
* PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds
* Pointing Novel Objects in Image Captioning
* PointNetLK: Robust and Efficient Point Cloud Registration Using PointNet
* PointPillars: Fast Encoders for Object Detection From Point Clouds
* PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud
* PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing
* Poisson-Gaussian Denoising Dataset With Real Fluorescence Microscopy Images, A
* Polarimetric Camera Calibration Using an LCD Monitor
* Polynomial Representation for Persistence Diagram
* Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
* Pose2Seg: Detection Free Human Instance Segmentation
* PoseFix: Model-Agnostic General Human Pose Refinement Network
* PPGNet: Learning Point-Pair Graph for Line Segment Detection
* Practical Coding Function Design for Time-Of-Flight Imaging
* Practical Full Resolution Learned Lossless Image Compression
* Precise Detection in Densely Packed Scenes
* Predicting Future Frames Using Retrospective Cycle GAN
* Predicting Visible Image Differences Under Varying Display Brightness and Viewing Distance
* Privacy Preserving Image-Based Localization
* Privacy Protection in Street-View Panoramas Using Depth and Multi-View Imagery
* Probabilistic End-To-End Noise Correction for Learning With Noisy Labels
* Probabilistic Permutation Synchronization Using the Riemannian Structure of the Birkhoff Polytope
* Progressive Attention Memory Network for Movie Story Question Answering
* Progressive Ensemble Networks for Zero-Shot Recognition
* Progressive Feature Alignment for Unsupervised Domain Adaptation
* Progressive Image Deraining Networks: A Better and Simpler Baseline
* Progressive Pose Attention Transfer for Person Image Generation
* Progressive Teacher-Student Learning for Early Action Prediction
* Propagation Mechanism for Deep and Wide Neural Networks
* Pros and Cons: Rank-Aware Temporal Attention for Skill Determination in Long Videos, The
* Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
* Pushing the Boundaries of View Extrapolation With Multiplane Images
* Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering
* Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments
* PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation
* Pyramid Feature Attention Network for Saliency Detection
* Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training
* QATM: Quality-Aware Template Matching for Deep Learning
* Quantization Networks
* Quasi-Unsupervised Color Constancy
* Query-Guided End-To-End Person Search
* R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network
* R3 Adversarial Network for Cross Model Face Recognition
* Radial Distortion Triangulation
* Ranked List Loss for Deep Metric Learning
* Rare Event Detection Using Disentangled Representation Learning
* RAVEN: A Dataset for Relational and Analogical Visual REasoNing
* Ray-Space Projection Model for Light Field Camera
* Re-Identification Supervised Texture Generation
* Re-Identification With Consistent Attentive Siamese Networks
* Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification
* Real-Time Self-Adaptive Deep Stereo
* Reasoning Visual Dialogs With Structural and Partial Observations
* Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection
* Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization
* Recurrent Back-Projection Network for Video Super-Resolution
* Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference
* Recurrent Neural Network for (Un-)Supervised Learning of Monocular Video Visual Odometry and Depth
* Recurrent Neural Networks With Intra-Frame Iterations for Video Deblurring
* Recursive Visual Attention in Visual Dialog
* Reducing Uncertainty in Undersampled MRI Reconstruction With Active Acquisition
* Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation
* Reflection Removal Using a Dual-Pixel Sensor
* Reflective and Fluorescent Separation Under Narrow-Band Illumination
* Region Proposal by Guided Anchoring
* Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation, The
* RegularFace: Deep Face Recognition via Exclusive Regularization
* Regularizing Activation Distribution for Training Binarized Deep Networks
* Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
* Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes, A
* Relation-Shape Convolutional Neural Network for Point Cloud Analysis
* Relational Action Forecasting
* Relational Knowledge Distillation
* Reliable and Efficient Image Cropping: A Grid Anchor Based Approach
* RENAS: Reinforced Evolutionary Neural Architecture Search
* REPAIR: Removing Representation Bias by Dataset Resampling
* RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection
* RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation
* RePr: Improved Training of Convolutional Filters
* Representation Flow for Action Recognition
* Representation Similarity Analysis for Efficient Task Taxonomy and Transfer Learning
* RES-PCA: A Scalable Approach to Recovering Low-Rank Matrices
* Residual Networks for Light Field Image Super-Resolution
* Residual Regression With Semantic Prior for Crowd Counting
* Rethinking Knowledge Graph Propagation for Zero-Shot Learning
* Rethinking the Evaluation of Video Summaries
* Retrieval-Augmented Convolutional Neural Networks Against Adversarial Examples
* Revealing Scenes by Inverting Structure From Motion Reconstructions
* Reversible GANs for Memory-Efficient Image-To-Image Translation
* Revisiting Local Descriptor Based Image-To-Class Measure for Few-Shot Learning
* Revisiting Perspective Information for Efficient Crowd Counting
* Revisiting Self-Supervised Visual Representation Learning
* RF-Net: An End-To-End Image Matching Network Based on Receptive Field
* RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion
* RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion
* Rob-GAN: Generator, Discriminator, and Adversarial Attacker
* Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks
* Robust Histopathology Image Analysis: To Label or to Synthesize?
* Robust Local Spectral Descriptor for Matching Non-Rigid Shapes With Incompatible Shape Structures, A
* Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes
* Robust Subspace Clustering With Independent and Piecewise Identically Distributed Noise Modeling
* Robust Video Stabilization by Optimization in CNN Weight Space
* Robustness of 3D Deep Learning in an Adversarial Setting
* Robustness Verification of Classification Deep Neural Networks via Linear Programming
* Robustness via Curvature Regularization, and Vice Versa
* ROI Pooled Correlation Filters for Visual Tracking
* ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape
* Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions
* RVOS: End-To-End Recurrent Network for Video Object Segmentation
* S4Net: Single Stage Salient-Instance Segmentation
* SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation: A Synthetic Dataset and Baselines
* Salient Object Detection With Pyramid Attention and Salient Edges
* Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects
* Scalable Convolutional Neural Network for Image Compressed Sensing
* Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation
* Scan2CAD: Learning CAD Model Alignment in RGB-D Scans
* Scan2Mesh: From Unstructured Range Scans to 3D Meshes
* Scene Categorization From Contours: Medial Axis Based Salience Measures
* Scene Graph Generation With External Knowledge and Image Reconstruction
* Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks
* Scene Parsing via Integrated Classification Model and Variance-Based Regularization
* SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations
* SCOPS: Self-Supervised Co-Part Segmentation
* ScratchDet: Training Single-Shot Object Detectors From Scratch
* SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks
* SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences
* Sea-Thru: A Method for Removing Water From Underwater Images
* Seamless Scene Segmentation
* Searching for a Robust Neural Architecture in Four GPU Hours
* Second-Order Attention Network for Single Image Super-Resolution
* See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks
* SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization
* Segmentation-Driven 6D Object Pose Estimation
* Selective Kernel Networks
* Selective Sensor Fusion for Neural Visual-Inertial Odometry
* Self-Calibrating Deep Photometric Stereo Networks
* Self-Critical N-Step Training for Image Captioning
* Self-Supervised 3D Hand Pose Estimation Through Training by Fitting
* Self-Supervised Adaptation of High-Fidelity Face Models for Monocular Performance Tracking
* Self-Supervised Convolutional Subspace Clustering Network
* Self-Supervised GANs via Auxiliary Rotation Loss
* Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry
* Self-Supervised Learning via Conditional Motion Propagation
* Self-Supervised Representation Learning by Rotation Feature Decoupling
* Self-Supervised Representation Learning From Videos for Facial Action Unit Detection
* Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
* Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction
* SelFlow: Self-Supervised Learning of Optical Flow
* Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection
* Semantic Attribute Matching Networks
* Semantic Component Decomposition for Face Attribute Manipulation
* Semantic Correlation Promoted Shape-Variant Context for Segmentation
* Semantic Graph Convolutional Networks for 3D Human Pose Regression
* Semantic Image Synthesis With Spatially-Adaptive Normalization
* Semantic Projection Network for Zero- and Few-Label Semantic Segmentation
* Semantically Aligned Bias Reducing Zero Shot Learning
* Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval
* Semantics Disentangling for Text-To-Image Generation
* Semi-Supervised Learning With Graph Learning-Convolutional Networks
* Semi-Supervised Transfer Learning for Image Rain Removal
* Sensitive-Sample Fingerprinting of Deep Neural Networks
* Separate to Adapt: Open Set Domain Adaptation via Progressive Separation
* Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition
* SFNet: Learning Object-Aware Semantic Correspondence
* Shape Robust Text Detection With Progressive Scale Expansion Network
* Shape Unicode: A Unified Shape Representation
* Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes
* Shapes and Context: In-The-Wild Image Synthesis and Manipulation
* ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness
* Shifting More Attention to Video Salient Object Detection
* Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
* Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking
* SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks
* SiCloPe: Silhouette-Based Clothed People
* Side Window Filtering
* Signal-To-Noise Ratio: A Robust Distance Metric for Deep Metric Learning
* SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception
* Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation
* Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks
* Simple Baseline for Audio-Visual Scene-Aware Dialog, A
* Simple Pooling-Based Design for Real-Time Salient Object Detection, A
* SimulCap: Single-View Human Performance Capture With Cloth Simulation
* Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation
* Single Image Depth Estimation Trained via Depth From Defocus Cues
* Single Image Deraining: A Comprehensive Benchmark Analysis
* Single Image Reflection Removal Beyond Linearity
* Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements
* Single-Frame Regularization for Temporally Stable CNNs
* Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding
* SIXray: A Large-Scale Security Inspection X-Ray Benchmark for Prohibited Item Discovery in Overlapping Images
* Skeleton-Based Action Recognition With Directed Graph Neural Networks
* Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images, A
* SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network
* Skin-Based Identification From Multispectral Image Data Using CNNs
* Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
* Sliced Wasserstein Generative Models
* Slim DensePose: Thrifty Learning From Sparse Annotations and Motion Cues
* Snapshot Distillation: Teacher-Student Optimization in One Generation
* Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning
* Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence
* SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates
* Soft Labels for Ordinal Regression
* SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints
* SOSNet: Second Order Similarity Regularization for Local Descriptor Learning
* SparseFool: A Few Pixels Make a Big Difference
* Spatial Attentive Single-Image Deraining With a High Quality Real Rain Dataset
* Spatial Fusion GAN for Image Synthesis
* Spatial-Aware Graph Relation Network for Large-Scale Object Detection
* Spatially Variant Linear Representation Models for Joint Filtering
* Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning
* Spatio-Temporal Video Re-Localization by Warp LSTM
* Spatiotemporal CNN for Video Object Segmentation
* Spectral Metric for Dataset Complexity Assessment
* Spectral Reconstruction From Dispersive Blur: A Novel Light Efficient Spectral Imager
* Speech2Face: Learning the Face Behind a Voice
* Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras
* Sphere Generative Adversarial Network Based on Geometric Moment Matching
* SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360° Images
* Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition
* Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres
* SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking
* Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification
* SpotTune: Transfer Learning Through Adaptive Fine-Tuning
* SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction
* SSN: Learning Sparse Switchable Normalization via SparsestMax
* Steady-State Non-Line-Of-Sight Imaging
* STEP: Spatio-Temporal Progressive Learning for Video Action Detection
* Stereo R-CNN Based 3D Object Detection for Autonomous Driving
* StereoDRNet: Dilated Residual StereoNet
* STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing
* Stochastic Class-Based Hard Example Mining for Deep Metric Learning
* StoryGAN: A Sequential Conditional GAN for Story Visualization
* Strand-Accurate Multi-View Hair Capture
* Streamlined Dense Video Captioning
* Strike (With) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects
* Striking the Right Balance With Uncertainty
* Strong-Weak Distribution Alignment for Adaptive Object Detection
* Structural Relational Reasoning of Point Clouds
* Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching
* Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation
* Structured Knowledge Distillation for Semantic Segmentation
* Structured Model for Action Detection, A
* Structured Pruning of Neural Networks With Budget-Aware Regularization
* Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More
* Style Transfer by Relaxed Optimal Transport and Self-Similarity
* Style-Based Generator Architecture for Generative Adversarial Networks, A
* Sufficient Condition for Convergences of Adam and RMSProp, A
* Superquadrics Revisited: Learning 3D Shape Parsing Beyond Cuboids
* Supervised Fitting of Geometric Primitives to 3D Point Clouds
* Surface Reconstruction From Normals: A Robust DGP-Based Discontinuity Preservation Approach
* Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks
* Synthesizing Environment-Aware Activities via Activity Sketches
* T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor
* TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
* Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation
* TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
* Taking a Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation
* Taking a Deeper Look at the Inverse Compositional Algorithm
* Tangent-Normal Adversarial Regularization for Semi-Supervised Learning
* Target-Aware Deep Tracking
* Task Agnostic Meta-Learning for Few-Shot Learning
* Task-Free Continual Learning
* Tell Me Where I Am: Object-Level Scene Context Prediction
* Temporal Cycle-Consistency Learning
* Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping
* Text Guided Person Image Synthesis
* Text2Scene: Generating Compositional Scenes From Textual Descriptions
* Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture
* Textured Neural Avatars
* TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes
* Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning, A
* Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction, A
* Thinking Outside the Pool: Active Training Image Creation for Relative Attributes
* Tightness-Aware Evaluation Protocol for Scene Text Detection
* Time-Conditioned Action Anticipation in One Shot
* Timeception for Complex Action Recognition
* ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images
* TopNet: Structural Point Cloud Decoder
* Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering
* TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
* Toward Convolutional Blind Denoising of Real Photographs
* Toward Realistic Image Compositing With Adversarial Learning
* Towards Accurate One-Stage Object Detection With AP-Loss
* Towards High-Fidelity Nonlinear 3D Face Morphable Model
* Towards Instance-Level Image-To-Image Translation
* Towards Natural and Accurate Future Motion Prediction of Humans and Animals
* Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
* Towards Real Scene Super-Resolution With Raw Images
* Towards Rich Feature Discovery With Class Activation Maps Augmentation for Person Re-Identification
* Towards Robust Curve Text Detection With Conditional Spatial Expansion
* Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation
* Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in a Triadic Interaction
* Towards Universal Object Detection by Domain Attention
* Towards Visual Feature Translation
* Towards VQA Models That Can Read
* Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers
* Training Deep Learning Based Image Denoisers From Undersampled Measurements Without Ground Truth and Without Image Prior
* Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
* Transferable AutoML by Model Sharing Over Grouped Datasets
* Transferable Interactiveness Knowledge for Human-Object Interaction Detection
* Transferrable Prototypical Networks for Unsupervised Domain Adaptation
* TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation
* Translate-to-Recognize Networks for RGB-D Scene Recognition
* TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions
* TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning
* Triangulation Learning Network: From Monocular to Stereo 3D Object Detection
* Triply Supervised Decoder Networks for Joint Detection and Segmentation
* Trust Region Based Adversarial Attack on Neural Networks
* Turn a Silicon Camera Into an InGaAs Camera
* Two Body Problem: Collaborative Visual Task Completion
* Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition
* Typography With Decor: Intelligent Text Style Transfer
* Uncertainty Guided Multi-Scale Residual Learning-Using a Cycle Spinning CNN for Single Image De-Raining
* Underexposed Photo Enhancement Using Deep Illumination Estimation
* Understanding and Visualizing Deep Visual Saliency Models
* Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift
* Understanding the Limitations of CNN-Based Absolute Camera Pose Regression
* Unequal-Training for Deep Face Recognition With Long-Tailed Noisy Data
* Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations
* UniformFace: Learning Deep Equidistributed Representation for Face Recognition
* Unifying Heterogeneous Classifiers With Distillation
* Universal Domain Adaptation
* UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos
* Unprocessing Images for Learned Raw Denoising
* Unsupervised 3D Pose Estimation With Geometric Self-Supervision
* Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
* Unsupervised Deep Tracking
* Unsupervised Disentangling of Appearance and Geometry by Deformable Generator Network
* Unsupervised Domain Adaptation for ToF Data Denoising With Adversarial Learning
* Unsupervised Domain Adaptation Using Feature-Whitening and Consensus Loss
* Unsupervised Domain-Specific Deblurring via Disentangled Representations
* Unsupervised Embedding Learning via Invariant and Spreading Instance Feature
* Unsupervised Event-Based Learning of Optical Flow, Depth, and Egomotion
* Unsupervised Face Normalization With Extreme Pose and Expression in the Wild
* Unsupervised Image Captioning
* Unsupervised Image Matching and Object Discovery as Optimization
* Unsupervised Learning of Action Classes With Continuous Temporal Embedding
* Unsupervised Learning of Consensus Maximization for 3D Vision Problems
* Unsupervised Learning of Dense Shape Correspondence
* Unsupervised Moving Object Detection via Contextual Information Separation
* Unsupervised Multi-Modal Neural Machine Translation
* Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization
* Unsupervised Part-Based Disentangling of Object Shape and Appearance
* Unsupervised Person Image Generation With Semantic Parsing Transformation
* Unsupervised Person Re-Identification by Soft Multilabel Learning
* Unsupervised Primitive Discovery for Improved 3D Generative Modeling
* Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach
* UPSNet: A Unified Panoptic Segmentation Network
* Using Unknown Occluders to Recover Hidden Scenes
* Variational Auto-Encoder Model for Stochastic Point Processes, A
* Variational Autoencoders Pursue PCA Directions (by Accident)
* Variational Bayesian Dropout With a Hierarchical Prior
* Variational Convolutional Neural Network Pruning
* Variational EM Framework With Adaptive Edge Selection for Blind Motion Deblurring, A
* Variational Information Distillation for Knowledge Transfer
* Variational Pan-Sharpening With Local Gradient Constraints, A
* Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images
* VERI-Wild: A Large Dataset and a New Method for Vehicle Re-Identification in the Wild
* Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach
* Versatile Multiple Choice Learning and Its Application to Vision Computing
* Video Action Transformer Network
* Video Generation From Single Semantic Label Map
* Video Magnification in the Wild Using Fractional Anisotropy in Temporal Distribution
* Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph
* Video Summarization by Learning From Unpaired Data
* Viewport Proposal CNN for 360° Video Quality Assessment
* Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention
* Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification
* Visual Centrifuge: Model-Free Layered Video Representations, The
* Visual Localization by Learning Objects-Of-Interest Dense Match Regression
* Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
* Visual Question Answering as Reading Comprehension
* Visual Tracking via Adaptive Spatially-Regularized Correlation Filters
* VITAMIN-E: VIsual Tracking and MappINg With Extremely Dense Feature Points
* VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People
* Volumetric Capture of Humans With a Single RGBD Camera via Semi-Parametric Learning
* VRSTC: Occlusion-Free Video Person Re-Identification
* WarpGAN: Automatic Caricature Generation
* Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up
* Weakly Supervised Deep Image Hashing Through Tag Embeddings
* Weakly Supervised Image Classification Through Noise Regularization
* Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations
* Weakly Supervised Open-Set Domain Adaptation by Dual-Domain Collaboration
* Weakly Supervised Person Re-Identification
* Weakly Supervised Video Moment Retrieval From Text Queries
* Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation
* What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment
* What Correspondences Reveal About Unknown Camera and Motion Models?
* What Do Single-View 3D Reconstruction Networks Learn?
* What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks?
* What Object Should I Use? - Task Driven Object Detection
* What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions
* When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images
* Where's Wally Now? Deep Generative and Discriminative Embeddings for Novelty Detection
* Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes
* Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem
* Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs
* Wide-Context Semantic Image Extrapolation
* World From Blur
* X2CT-GAN: Reconstructing CT From Biplanar X-Rays With Generative Adversarial Networks
* You Look Twice: GaterNet for Dynamic Filter Selection in CNNs
* You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection
* Zero-Shot Task Transfer
* ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation
* Zoom to Learn, Learn to Zoom
* Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination
1295 for CVPR19

CVPR20 * *CVPR
* 12-in-1: Multi-Task Vision and Language Representation Learning
* 15 Keypoints Is All You Need
* 3D Human Mesh Regression With Dense Correspondence
* 3D Packing for Self-Supervised Monocular Depth Estimation
* 3D Part Guided Image Editing for Fine-Grained Object Understanding
* 3D Photography Using Context-Aware Layered Depth Inpainting
* 3D Sketch-Aware Semantic Scene Completion via Semi-Supervised Structure Prior
* 3D-MPA: Multi-Proposal Aggregation for 3D Semantic Instance Segmentation
* 3D-ZeF: A 3D Zebrafish Tracking Benchmark Dataset
* 3DRegNet: A Deep Neural Network for 3D Point Registration
* 3DSSD: Point-Based 3D Single Stage Object Detector
* 3DV: 3D Dynamic Voxel for Action Recognition in Depth Video
* 3FabRec: Fast Few-Shot Face Alignment by Reconstruction
* 4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras
* 4D Visualization of Dynamic Events From Unconstrained Multi-View Videos
* A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection
* AANet: Adaptive Aggregation Network for Efficient Stereo Matching
* ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network
* Accurate Estimation of Body Height From a Single Depth Image via a Four-Stage Developing Network
* Achieving Robustness in the Wild via Adversarial Mixing With Disentangled Representations
* ACNe: Attentive Context Normalization for Robust Permutation-Equivariant Learning
* ActBERT: Learning Global-Local Video-Text Representations
* Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs
* Action Modifiers: Learning From Adverbs in Instructional Videos
* Action Segmentation With Joint Self-Supervised Temporal Domain Adaptation
* ActionBytes: Learning From Trimmed Videos to Localize Actions
* Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration
* Active Speakers in Context
* Active Vision for Early Recognition of Human Actions
* ActiveMoCap: Optimized Viewpoint Selection for Active Human Motion Capture
* Actor-Transformers for Group Activity Recognition
* AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-Identification
* AdaBits: Neural Network Quantization With Adaptive Bit-Widths
* AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation
* AdaCoSeg: Adaptive Shape Co-Segmentation With Group Consistency Loss
* Adaptive Dilated Network With Self-Correction Supervision for Counting
* Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment
* Adaptive Graph Convolutional Network With Attention Graph Clustering for Co-Saliency Detection
* Adaptive Hierarchical Down-Sampling for Point Cloud Classification
* Adaptive Interaction Modeling via Graph Operations Search
* Adaptive Loss-Aware Quantization for Multi-Bit Networks
* Adaptive Neural Network for Unsupervised Mosaic Consistency Analysis in Image Forensics, An
* Adaptive Subspaces for Few-Shot Learning
* AdderNet: Do We Really Need Multiplications in Deep Learning?
* ADINet: Attribute Driven Incremental Network for Retinal Image Classification
* Advancing High Fidelity Identity Swapping for Forgery Detection
* Adversarial Camouflage: Hiding Physical-World Attacks With Natural Styles
* Adversarial Examples Improve Image Recognition
* Adversarial Feature Hallucination Networks for Few-Shot Learning
* Adversarial Latent Autoencoders
* Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
* Adversarial Texture Optimization From RGB-D Scans
* Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization
* AdversarialNAS: Adversarial Neural Architecture Search for GANs
* Advisable Learning for Self-Driving Vehicles by Internalizing Observation-to-Action Rules
* Affinity Graph Supervision for Visual Recognition
* Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis
* ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
* All in One Bad Weather Removal Using Architectural Search
* Alleviation of Gradient Exploding in GANs: Fake Can Be Real
* Analyzing and Improving the Image Quality of StyleGAN
* AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces
* Anisotropic Convolutional Networks for 3D Semantic Scene Completion
* AOWS: Adaptive and Optimal Network Width Search With Latency Constraints
* Appearance Shock Grammar for Fast Medial Axis Extraction From Real Images
* Approximating Shapes in Images With Low-Complexity Polygons
* APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
* ARCH: Animatable Reconstruction of Clothed Humans
* ARShadowGAN: Shadow Generative Adversarial Network for Augmented Reality in Single Light Scenes
* Articulation-Aware Canonical Surface Mapping
* ASLFeat: Learning Local Features of Accurate Shape and Localization
* Assessing Eye Aesthetics for Automatic Multi-Reference Eye In-Painting
* Assessing Image Quality Issues for Real-World Problems
* Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
* Attack to Explain Deep Representation
* Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization
* Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction
* Attention Scaling for Crowd Counting
* Attention-Aware Multi-View Stereo
* Attention-Based Context Aware Reasoning for Situation Recognition
* Attention-Driven Cropping for Very High Resolution Facial Landmark Detection
* Attention-Guided Hierarchical Structure Aggregation for Image Matting
* Attentive Normalization for Conditional Image Generation
* Attentive Weights Generation for Few Shot Learning via Information Maximization
* Attribution in Scale and Space
* AugFPN: Improving Multi-Scale Feature Learning for Object Detection
* Augment Your Batch: Improving Generalization Through Instance Repetition
* Augmenting Colonoscopy Using Extended and Directional CycleGAN for Lossy Image Translation
* Auto-Encoding Twin-Bottleneck Hashing
* Auto-Tuning Structured Light by Optical Stochastic Gradient Descent
* Autolabeling 3D Objects With Differentiable Rendering of SDF Shape Priors
* Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach
* AutoTrack: Towards High-Performance Visual Tracking for UAV With Automatic Spatio-Temporal Regularization
* Auxiliary Training: Towards Accurate and Robust Models
* AvatarMe: Realistically Renderable 3D Facial Reconstruction In-the-Wild
* Averaging Essential and Fundamental Matrices in Collinear Camera Settings
* BachGAN: High-Resolution Image Synthesis From Salient Object Layout
* Background Data Resampling for Outlier-Aware Classification
* Background Matting: The World Is Your Green Screen
* BANet: Bidirectional Aggregation Network With Occlusion Handling for Panoptic Segmentation
* Barycenters of Natural Images: Constrained Wasserstein Barycenters for Image Morphing
* Basis Prediction Networks for Effective Burst Denoising With Large Kernels
* Bayesian Adversarial Human Motion Synthesis
* BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition
* BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning
* BEDSR-Net: A Deep Shadow Removal Network From a Single Document Image
* Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems
* Benchmarking Adversarial Robustness on Image Classification
* Benchmarking the Robustness of Semantic Segmentation Models
* Better Captioning With Sequence-Level Exploration
* Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context
* BFBox: Searching Face-Appropriate Backbone and Feature Pyramid Network for Face Detector
* Bi-Directional Interaction Network for Person Search
* Bi-Directional Relationship Inferring Network for Referring Image Segmentation
* Bi3D: Stereo Depth Estimation via Binary Classifications
* BiDet: An Efficient Binarized Object Detector
* Bidirectional Graph Reasoning Network for Panoptic Segmentation
* BidNet: Binocular Image Dehazing Without Explicit Disparity Estimation
* BiFuse: Monocular 360 Depth Estimation via Bi-Projection Fusion
* Binarizing MobileNet via Evolution-Based Searching
* BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks
* BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
* Blindly Assess Image Quality in the Wild Guided by a Self-Adaptive Hyper Network
* Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation
* Blur Aware Calibration of Multi-Focus Plenoptic Camera
* Blurry Video Frame Interpolation
* Bodies at Rest: 3D Human Pose and Shape Estimation From a Pressure Image Using Synthetic Data
* Boosting Few-Shot Learning With Adaptive Margin Loss
* Boosting Semantic Human Matting With Coarse Annotations
* Boosting the Transferability of Adversarial Samples via Attention
* Boundary-Aware 3D Building Reconstruction From a Single Overhead Image
* Breaking the Cycle: Colleagues Are All You Need
* Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection
* Bringing Old Photos Back to Life
* BSP-Net: Generating Compact Meshes via Binary Space Partitioning
* Bundle Adjustment on a Graph Processor
* Bundle Pooling for Polygonal Architecture Segmentation Problem
* Butterfly Transform: An Efficient FFT Based Neural Architecture Design
* C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
* C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation
* Camera On-Boarding for Person Re-Identification Using Hypothesis Transfer Learning
* Camera Trace Erasing
* Camouflaged Object Detection
* Can Deep Learning Recognize Subtle Human Activities?
* Can Facial Pose and Expression Be Separated With Weak Perspective Camera?
* Can We Learn Heuristics for Graphical Model Inference Using Reinforcement Learning?
* Can Weight Sharing Outperform Random Architecture Search? An Investigation With TuNAS
* CARP: Compression Through Adaptive Recursive Partitioning for Multi-Dimensional Images
* Cars Can't Fly Up in the Sky: Improving Urban-Scene Segmentation via Height-Driven Attention Networks
* CARS: Continuous Evolution for Efficient Neural Architecture Search
* Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
* Cascade EF-GAN: Progressive Facial Expression Editing With Local Focuses
* Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training Data
* Cascaded Deep Video Deblurring Using Temporal Sharpness Prior
* Cascaded Human-Object Interaction Recognition
* Cascaded Refinement Network for Point Cloud Completion
* CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement
* Category-Level Articulated Object Pose Estimation
* Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics
* CenterMask: Real-Time Anchor-Free Instance Segmentation
* CenterMask: Single Shot Instance Segmentation With Point Representation
* Central Similarity Quantization for Efficient Image and Video Retrieval
* CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection
* Certifiably Globally Optimal Solution to Generalized Essential Matrix Estimation, A
* Channel Attention Based Iterative Residual Learning for Depth Map Super-Resolution
* Characteristic Function Approach to Deep Implicit Generative Modeling, A
* CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks
* Circle Loss: A Unified Perspective of Pair Similarity Optimization
* Classifying, Segmenting, and Tracking Object Instances in Video With Mask Propagation
* Clean-Label Backdoor Attacks on Video Recognition Models
* Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution
* Cloth in the Wind: A Case Study of Physical Measurement Through Simulation
* ClusterFit: Improving Generalization of Visual Representations
* ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings
* CNN-Generated Images Are Surprisingly Easy to Spot… for Now
* COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification
* Cogradient Descent for Bilinear Optimization
* Coherent Reconstruction of Multiple Humans From a Single Image
* Collaborative Distillation for Ultra-Resolution Universal Style Transfer
* Collaborative Motion Prediction via Neural Motion Message Passing
* ColorFool: Semantic Adversarial Colorization
* Combating Noisy Labels by Agreement: A Joint Training Method With Co-Regularization
* Combining Detection and Tracking for Human Pose Estimation in Videos
* Composed Query Image Retrieval Using Locally Bounded Features
* Composing Good Shots by Exploiting Mutual Relations
* Compositional Convolutional Neural Networks: A Deep Architecture With Innate Robustness to Partial Occlusion
* Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation
* Computing the Testing Error Without a Testing Set
* Computing Valid P-Values for Image Segmentation by Selective Inference
* Conditional Channel Gated Networks for Task-Aware Continual Learning
* Conditional Gaussian Distribution Learning for Open Set Recognition
* Connect-and-Slice: An Hybrid Approach for Reconstructing 3D Objects
* CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus
* Context Aware Graph Convolution for Skeleton-Based Action Recognition
* Context Prior for Scene Segmentation
* Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection
* Context-Aware and Scale-Insensitive Temporal Repetition Counting
* Context-Aware Attention Network for Image-Text Retrieval
* Context-Aware Group Captioning via Self-Attention and Contrastive Features
* Context-Aware Human Motion Prediction
* Context-Aware Loss Function for Action Spotting in Soccer Videos, A
* Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
* Continual Learning With Extended Kronecker-Factored Approximate Curvature
* ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection
* Controllable Orthogonalization in Training DNNs
* Controllable Person Image Synthesis With Attribute-Decomposed GAN
* Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction
* Convolution in the Cloud: Learning Deformable Kernels in 3D Graph Convolution Networks for Point Cloud Analysis
* CookGAN: Causality Based Text-to-Image Synthesis
* Cooling-Shrinking Attack: Blinding the Tracker With Imperceptible Noises
* Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension
* Copy and Paste GAN: Face Hallucination From Shaded Thumbnails
* Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers
* Correlating Edge, Pose With Parsing
* Correlation-Guided Attention for Corner Detection Based Visual Tracking
* Correspondence Networks With Adaptive Neighbourhood Consensus
* Correspondence-Free Material Reconstruction Using Sparse Surface Constraints
* Cost Volume Pyramid Based Depth Inference for Multi-View Stereo
* Counterfactual Samples Synthesizing for Robust Visual Question Answering
* Counterfactual Vision and Language Learning
* Counting Out Time: Class Agnostic Video Repetition Counting in the Wild
* CoverNet: Multimodal Behavior Prediction Using Trajectory Sets
* CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries
* Creating Something From Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing
* CRNet: Cross-Reference Networks for Few-Shot Segmentation
* Cross-Batch Memory for Embedding Learning
* Cross-Domain Correspondence Learning for Exemplar-Based Image Translation
* Cross-Domain Detection via Graph-Induced Prototype Alignment
* Cross-Domain Document Object Detection: Benchmark Suite and Method
* Cross-Domain Face Presentation Attack Detection via Multi-Domain Disentangled Representation Learning
* Cross-Domain Object Detection Through Coarse-to-Fine Feature Adaptation
* Cross-Domain Semantic Segmentation via Domain-Invariant Interactive Relation Transfer
* Cross-Modal Cross-Domain Moment Alignment Network for Person Search
* Cross-Modal Deep Face Normals With Deactivable Skip Connections
* Cross-Modal Pattern-Propagation for RGB-T Tracking
* Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
* Cross-Spectral Face Hallucination via Disentangling Independent Factors
* Cross-View Correspondence Reasoning Based on Bipartite Graph Convolutional Network for Mammogram Mass Detection
* Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100 FPS
* CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition
* CvxNet: Learnable Convex Decomposition
* CycleISP: Real Image Restoration via Improved Data Synthesis
* Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation
* D2Det: Towards High Quality Object Detection and Instance Segmentation
* D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
* D3S: A Discriminative Single Shot Segmentation Tracker
* D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
* DaST: Data-Free Substitute Training for Adversarial Attacks
* Data Uncertainty Learning in Face Recognition
* Data-Efficient Semi-Supervised Learning by Reliable Edge Mining
* Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN
* Dataless Model Selection With the Deep Frame Potential
* DAVD-Net: Deep Audio-Aided Video Decompression of Talking Heads
* Deblurring by Realistic Blurring
* Deblurring Using Analysis-Synthesis Networks Pair
* Decoupled Representation Learning for Skeleton-Based Gesture Recognition
* Deep 3D Capture: Geometry and Reflectance From Sparse Multi-View Images
* Deep 3D Portrait From a Single Image
* Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
* Deep Adversarial Decomposition: A Unified Framework for Separating Superimposed Images
* Deep Degradation Prior for Low-Quality Image Classification
* Deep Distance Transform for Tubular Structure Segmentation in CT Scans
* Deep Face Super-Resolution With Iterative Collaboration Between Attentive Recovery and Landmark Estimation
* Deep Facial Non-Rigid Multi-View Stereo
* Deep Fair Clustering for Visual Learning
* Deep Generative Model for Robust Imbalance Classification
* Deep Geometric Functional Maps: Robust Feature Learning for Shape Correspondence
* Deep Global Registration
* Deep Grouping Model for Unified Perceptual Parsing
* Deep Homography Estimation for Dynamic Scenes
* Deep Image Spatial Transformation for Person Image Generation
* Deep Implicit Volume Compression
* Deep Iterative Surface Normal Estimation
* Deep Kinematics Analysis for Monocular 3D Human Pose Estimation
* Deep Learning for Handling Kernel/Model Uncertainty in Image Deconvolution
* Deep Metric Learning via Adaptive Learnable Assessment
* Deep Non-Line-of-Sight Reconstruction
* Deep Optics for Single-Shot High-Dynamic-Range Imaging
* Deep Parametric Shape Predictions Using Distance Fields
* Deep Polarization Cues for Transparent Object Segmentation
* Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
* Deep Representation Learning on Long-Tailed Data: A Learnable Embedding Augmentation Perspective
* Deep Residual Flow for Out of Distribution Detection
* Deep Semantic Clustering by Partition Confidence Maximisation
* Deep Shutter Unrolling Network
* Deep Snake for Real-Time Instance Segmentation
* Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing
* Deep Stereo Using Adaptive Thin Volume Representation With Uncertainty Awareness
* Deep Structure-Revealed Network for Texture Recognition
* Deep Unfolding Network for Image Super-Resolution
* Deep White-Balance Editing
* DeepCap: Monocular Human Performance Capture Using Weak Supervision
* DeepDeform: Learning Non-Rigid RGB-D Reconstruction With Semi-Supervised Data
* DeepEMD: Few-Shot Image Classification With Differentiable Earth Mover's Distance and Structured Classifiers
* DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection
* DeepFaceFlow: In-the-Wild Dense 3D Facial Motion Estimation
* DeepFLASH: An Efficient Network for Learning-Based Medical Image Registration
* DeepLPF: Deep Local Parametric Filters for Image Enhancement
* Deepstrip: High-Resolution Boundary Refinement
* DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised Representation Learning
* Defending Against Model Stealing Attacks With Adaptive Misinformation
* Defending Against Universal Attacks Through Selective Feature Regeneration
* Defending and Harnessing the Bit-Flip Based Adversarial Weight Attack
* Deformable Siamese Attention Networks for Visual Object Tracking
* Deformation-Aware Unpaired Image Translation for Pose Estimation on Laboratory Animals
* Dense Regression Network for Video Grounding
* Densely Connected Search Space for More Flexible Neural Architecture Search
* Density-Aware Feature Embedding for Face Clustering
* Density-Aware Graph for Deep Semi-Supervised Visual Recognition
* Density-Based Clustering for 3D Object Detection in Point Clouds
* DEPARA: Deep Attribution Graph for Deep Knowledge Transferability
* Depth Sensing Beyond LiDAR Range
* Designing Network Design Spaces
* Detail-Recovery Image Deraining via Context Aggregation Networks
* Detailed 2D-3D Joint Representation for Human-Object Interaction
* Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors
* Detecting Attended Visual Targets in Video
* Detection in Crowded Scenes: One Proposal, Multiple Predictions
* Determinant Regularization for Gradient-Efficient Graph Matching
* Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation, The
* Differentiable Adaptive Computation Time for Visual Reasoning
* Differentiable Volumetric Rendering: Learning Implicit 3D Representations Without 3D Supervision
* Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
* Discovering Human Interactions With Novel Objects via Zero-Shot Learning
* Discovering Synchronized Subsets of Sequences: A Large Scale Solution
* Discrete Model Compression With Resource Constraint for Deep Neural Networks
* Discriminative Multi-Modality Speech Recognition
* Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning
* Disentangled Image Generation Through Structured Noise Injection
* Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
* Disentangling Invertible Interpretation Network for Explaining Latent Representations, A
* Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction
* Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation
* Disparity-Aware Domain Adaptation in Stereo Image Restoration
* DIST: Rendering Deep Implicit Signed Distance Function With Differentiable Sphere Tracing
* Distilled Semantics for Comprehensive Scene Understanding From Videos
* Distilling Cross-Task Knowledge via Relationship Matching
* Distilling Effective Supervision From Severe Label Noise
* Distilling Image Dehazing With Heterogeneous Task Imitation
* Distilling Knowledge From Graph Convolutional Networks
* Distortion Agnostic Deep Watermarking
* Distribution-Aware Coordinate Representation for Human Pose Estimation
* Distribution-Induced Bidirectional Generative Adversarial Network for Graph Representation Learning
* Diverse Image Generation via Self-Conditioned GANs
* Diversified Arbitrary Style Transfer via Deep Feature Perturbation
* DLWL: Improving Detection for Lowshot Classes With Weakly Labelled Data
* DMCP: Differentiable Markov Channel Pruning for Neural Networks
* DNU: Deep Non-Local Unrolling for Computational Spectral Imaging
* DOA-GAN: Dual-Order Attentive Generative Adversarial Network for Image Copy-Move Forgery Detection and Localization
* Domain Adaptation for Image Dehazing
* Domain Adaptive Image-to-Image Translation
* Domain Balancing: Face Recognition on Long-Tailed Domains
* Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation
* Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning
* Don't Even Look Once: Synthesizing Features for Zero-Shot Detection
* Don't Hit Me! Glass Detection in Real-World Scenes
* Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias
* DOPS: Learning to Detect 3D Objects and Predict Their 3D Shapes
* DoveNet: Deep Image Harmonization via Domain Verification
* DPGN: Distribution Propagation Graph Network for Few-Shot Learning
* DR Loss: Improving Object Detection by Distributional Ranking
* Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion
* DSGN: Deep Stereo Geometry Network for 3D Object Detection
* DSNAS: Direct Neural Architecture Search Without Parameter Retraining
* Dual Super-Resolution Learning for Semantic Segmentation
* DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes
* DualSDF: Semantic Shape Manipulation Using a Two-Level Representation
* DuDoRNet: Learning a Dual-Domain Recurrent Network for Fast MRI Reconstruction With Deep T1 Prior
* DUNIT: Detection-Based Unsupervised Image-to-Image Translation
* Dynamic Convolution: Attention Over Convolution Kernels
* Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference
* Dynamic Face Video Segmentation via Reinforcement Learning
* Dynamic Fluid Surface Reconstruction Using Deep Neural Network
* Dynamic Graph Message Passing Networks
* Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives
* Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction
* Dynamic Neural Relational Inference
* Dynamic Refinement Network for Oriented and Densely Packed Object Detection
* Dynamic Traffic Modeling From Overhead Imagery
* ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
* EcoNAS: Finding Proxies for Economical Neural Architecture Search
* Edge of Depth: Explicit Constraints Between Segmentation and Depth, The
* Editing in Style: Uncovering the Local Semantics of GANs
* Effectively Unbiased FID and Inception Score and Where to Find Them
* Efficient Adversarial Training With Transferable Adversarial Examples
* Efficient and Robust Shape Correspondence via Sparsity-Enforced Quadratic Assignment
* Efficient Derivative Computation for Cumulative B-Splines on Lie Groups
* Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network With Optical Flow Guided Training
* Efficient Neural Vision Systems Based on Convolutional Image Acquisition
* Efficient PointLSTM for Point Clouds Based Gesture Recognition, An
* EfficientDet: Scalable and Efficient Object Detection
* Ego-Topo: Environment Affordances From Egocentric Video
* Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning
* Embodied Language Grounding With 3D Visual Feature Representations
* EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege's Principle
* End-to-End 3D Point Cloud Instance Segmentation Without Detection
* End-to-End Adversarial-Attention Network for Multi-Modal Clustering
* End-to-End Camera Calibration for Broadcast Videos
* End-to-End Edge Aggregation Network for Moving Object Segmentation, An
* End-to-End Illuminant Estimation Based on Deep Metric Learning
* End-to-End Learnable Geometric Vision by Backpropagating PnP Optimization
* End-to-End Learning Local Multi-View Descriptors for 3D Point Clouds
* End-to-End Learning of Visual Representations From Uncurated Instructional Videos
* End-to-End Model-Free Reinforcement Learning for Urban Driving Using Implicit Affordances
* End-to-End Optimization of Scene Layout
* End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
* Enhanced Blind Face Restoration With Multi-Exemplar Images and Adaptive Spatial Feature Fusion
* Enhanced Transport Distance for Unsupervised Domain Adaptation
* Enhancing Cross-Task Black-Box Transferability of Adversarial Examples With Dispersion Reduction
* Enhancing Generic Segmentation With Learned Region Representations
* Enhancing Intrinsic Adversarial Robustness via Feature Pyramid Decoder
* ENSEI: Efficient Secure Inference via Frequency-Domain Homomorphic Convolution for Privacy-Preserving Visual Recognition
* Ensemble Generative Cleaning With Feedback Loops for Defending Adversarial Attacks
* Epipolar Transformers
* Episode-Based Prototype Generating Network for Zero-Shot Learning
* EPOS: Estimating 6D Pose of Objects With Symmetries
* Equalization Loss for Long-Tailed Object Recognition
* Erasing Integrated Learning: A Simple Yet Effective Approach for Weakly Supervised Object Localization
* Estimating Low-Rank Region Likelihood Maps
* Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
* Evade Deep Image Retrieval by Stashing Private Images in the Hash Space
* Evaluating Weakly Supervised Object Localization Methods Right
* Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras
* EventCap: Monocular 3D Capture of High-Speed Human Motions Using an Event Camera
* EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning
* Evolving Losses for Unsupervised Video Representation Learning
* Exemplar Normalization for Learning Deep Representation
* Explainable Object-Induced Action Decision for Autonomous Vehicles
* Explaining Knowledge Distillation by Quantifying the Knowledge
* Exploit Clues From Views: Self-Supervised and Regularized Learning for Multiview Object Recognition
* Exploiting Joint Robustness to Adversarial Perturbations
* Explorable Super Resolution
* Exploring Bottom-Up and Top-Down Cues With Attentive Learning for Webly Supervised Object Detection
* Exploring Categorical Regularization for Domain Adaptive Object Detection
* Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation
* Exploring Data Aggregation in Policy Learning for Vision-Based Urban Autonomous Driving
* Exploring Self-Attention for Image Recognition
* Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction
* Exploring Unlabeled Faces for Novel Attribute Discovery
* Extreme Relative Pose Network Under Hybrid Representations
* Extremely Dense Point Correspondences Using a Learned Feature Descriptor
* F-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation
* Face X-Ray for More General Face Forgery Detection
* FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
* Factorized Higher-Order CNNs With an Application to Spatio-Temporal Emotion Estimation
* FALCON: A Fourier Transform Based Approach for Fast and Secure Convolutional Neural Network Predictions
* Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention
* Fashion Editing With Adversarial Parsing Learning
* Fashion Outfit Complementary Item Retrieval
* Fast MSER
* Fast Soft Color Segmentation
* Fast Sparse ConvNets
* Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks
* Fast Template Matching and Update for Video Object Tracking and Segmentation
* Fast Texture Synthesis via Pseudo Optimizer
* Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching
* Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning
* Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement
* FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation
* FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions
* FDA: Fourier Domain Adaptation for Semantic Segmentation
* Feature-Metric Registration: A Fast Semi-Supervised Approach for Robust Point Cloud Registration Without Correspondences
* FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation
* Few Sample Knowledge Distillation for Efficient Network Compression
* Few-Shot Class-Incremental Learning
* Few-Shot Learning of Part-Specific Probability Space for 3D Shape Segmentation
* Few-Shot Learning via Embedding Adaptation With Set-to-Set Functions
* Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector
* Few-Shot Open-Set Recognition Using Meta-Learning
* Few-Shot Pill Recognition
* Few-Shot Video Classification via Temporal Alignment
* FGN: Fully Guided Network for Few-Shot Instance Segmentation
* Filter Grafting for Deep Neural Networks
* Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks
* Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention
* Fine-Grained Image-to-Image Transformation Towards Visual Recognition
* Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning
* FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding
* Fixed-Point Back-Propagation Training
* Flow Contrastive Estimation of Energy-Based Models
* Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching
* FM2u-Net: Face Morphological Multi-Branch Network for Makeup-Invariant Face Verification
* FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation
* FocalMix: Semi-Supervised Learning for 3D Medical Image Detection
* Focus on Defocus: Bridging the Synthetic to Real Domain Gap for Depth Estimation
* Footprints and Free Space From a Single Color Image
* Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery
* Forward and Backward Information Retention for Accurate Binary Neural Networks
* FPConv: Learning Local Flattening for Point Convolution
* FReeNet: Multi-Identity Face Reenactment
* Frequency Domain Compact 3D Convolutional Neural Networks
* FroDO: From Detections to 3D Objects
* From Depth What Can You See? Depth Completion via Auxiliary Image Reconstruction
* From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement
* From Image Collections to Point Clouds With Self-Supervised Shape and Pose Networks
* From Paris to Berlin: Discovering Fashion Style Influences Around the World
* From Patches to Pictures (PaQ-2-PiQ): Mapping the Perceptual Space of Picture Quality
* From Two Rolling Shutters to One Global Shutter
* Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction
* FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation
* Fusing Wearable IMUs With Multi-View Images for Human Pose Estimation: A Geometric Approach
* Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
* Future Video Synthesis With Object Motion Prediction
* G-TAD: Sub-Graph Localization for Temporal Action Detection
* G2L-Net: Global to Local Network for Real-Time 6D Pose Estimation With Embedding Vector Features
* G3AN: Disentangling Appearance and Motion for Video Generation
* Gait Recognition via Semi-supervised Disentangled Representation Learning to Identity and Covariate Features
* GaitPart: Temporal Part-Based Model for Gait Recognition
* GAMIN: Generative Adversarial Multiple Imputation Network for Highly Missing Data
* GAN Compression: Efficient Architectures for Interactive Conditional GANs
* GAN That Warped: Semantic Attribute Editing With Unpaired Data, The
* GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes
* Garden of Forking Paths: Towards Multi-Future Trajectory Prediction, The
* Gate-Shift Networks for Video Action Recognition
* Gated Channel Transformation for Visual Recognition
* gDLS*: Generalized Pose-and-Scale Estimation Given Scale and Gravity Priors
* Generalized ODIN: Detecting Out-of-Distribution Image Without Learning From Out-of-Distribution Data
* Generalized Product Quantization Network for Semi-Supervised Image Retrieval
* Generalized Zero-Shot Learning via Over-Complete Distribution
* Generalizing Hand Segmentation in Egocentric Videos With Uncertainty-Guided Model Adaptation
* Generating 3D People in Scenes Without People
* Generating Accurate Pseudo-Labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations
* Generating and Exploiting Probabilistic Monocular Depth Estimates
* Generative Hybrid Representations for Activity Forecasting With No-Regret Learning
* Generative-Discriminative Feature Representations for Open-Set Recognition
* GeoDA: A Geometric Framework for Black-Box Adversarial Attacks
* Geometric Structure Based and Regularized Depth Estimation From 360 Indoor Imagery
* Geometrically Principled Connections in Graph Neural Networks
* Geometry and Learning Co-Supported Normal Estimation for Unstructured Point Cloud
* Geometry-Aware Satellite-to-Ground Image Synthesis for Urban Areas
* GhostNet: More Features From Cheap Operations
* GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models
* GIFnets: Differentiable GIF Encoding Framework
* Global Optimality for Point Set Registration Using Semidefinite Programming
* Global Texture Enhancement for Fake Face Detection in the Wild
* Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
* Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition
* Globally Optimal Contrast Maximisation for Event-Based Motion Estimation
* GLU-Net: Global-Local Universal Network for Dense Flow and Correspondences
* GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking With 2D-3D Multi-Feature Learning
* Going Deeper With Lean Point Networks
* Gold Seeker: Information Gain From Policy Distributions for Goal-Oriented Vision-and-Language Reasoning
* Google Landmarks Dataset v2: A Large-Scale Benchmark for Instance-Level Recognition and Retrieval
* GP-NAS: Gaussian Process Based Neural Architecture Search
* GPS-Net: Graph Property Sensing Network for Scene Graph Generation
* Gradually Vanishing Bridge for Adversarial Domain Adaptation
* Graduated Filter Method for Large Scale Robust Estimation, A
* Graph Embedded Pose Clustering for Anomaly Detection
* Graph Structured Network for Image-Text Matching
* Graph-Guided Architecture Search for Real-Time Semantic Segmentation
* Graph-Structured Referring Expression Reasoning in the Wild
* GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-Wise Transformations
* GrappaNet: Combining Parallel Imaging With Deep Learning for Multi-Coil MRI Reconstruction
* GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping
* GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet
* Grid-GCN for Fast and Scalable Point Cloud Learning
* Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
* GroupFace: Learning Latent Groups and Constructing Group-Based Representations for Face Recognition
* Guided Variational Autoencoder for Disentanglement Learning
* Gum-Net: Unsupervised Geometric Matching for Fast and Accurate 3D Subtomogram Image Alignment and Averaging
* HAMBox: Delving Into Mining High-Quality Anchors on Face Detection
* HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation From a Single Depth Map
* Hardware-in-the-Loop End-to-End Optimization of Camera Image Processing Pipelines
* Harmonizing Transferability and Discriminability for Adapting Object Detectors
* HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting
* Height and Uprightness Invariance for 3D Prediction From a Single View
* Heterogeneous Knowledge Distillation Using Information Flow Modeling
* Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification
* Hierarchical Clustering With Hard-Batch Triplet Loss for Person Re-Identification
* Hierarchical Conditional Relation Networks for Video Question Answering
* Hierarchical Feature Embedding for Attribute Recognition
* Hierarchical Graph Attention Network for Visual Relationship Detection
* Hierarchical Graph Network for 3D Object Detection on Point Clouds, A
* Hierarchical Human Parsing With Typed Part-Relation Reasoning
* Hierarchical Pyramid Diverse Attention Networks for Face Recognition
* Hierarchical Scene Coordinate Classification and Regression for Visual Localization
* Hierarchically Robust Representation Learning
* High-Dimensional Convolutional Networks for Geometric Pattern Recognition
* High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
* High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
* High-Performance Long-Term Tracking With Meta-Updater
* High-Resolution Daytime Translation Without Domain Labels
* HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
* Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
* Holistically-Attracted Wireframe Parsing
* HOnnotate: A Method for 3D Annotation of Hand and Object Poses
* HOPE-Net: A Graph-Based Model for Hand-Object Pose Estimation
* How Does Noise Help Robustness? Explanation and Exploration under the Neural SDE Framework
* How Much Time Do You Have? Modeling Multi-Duration Saliency
* How to Train Your Deep Multi-Object Tracker
* How Useful Is Self-Supervised Pretraining for Visual Tasks?
* HRank: Filter Pruning Using High-Rank Feature Map
* HUMBI: A Large Multiview Dataset of Human Body Expressions
* HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
* HybridPose: 6D Object Pose Estimation Under Hybrid Representations
* Hyperbolic Image Embeddings
* Hyperbolic Visual Embedding Learning for Zero-Shot Recognition
* Hypergraph Attention Networks for Multimodal Learning
* HyperSTAR: Task-Aware Hyperparameters for Deep Networks
* IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving
* ILFO: Adversarial Attack on Adaptive Neural Networks
* Image Based Virtual Try-On Network From Unpaired Data
* Image Demoireing with Learnable Bandpass Filters
* Image Processing Using Multi-Code GAN Prior
* Image Search With Text Feedback by Visiolinguistic Attention Learning
* Image Super-Resolution With Cross-Scale Non-Local Attention and Exhaustive Self-Exemplars Mining
* Image2StyleGAN++: How to Edit the Embedded Images?
* Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation
* Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
* Improved Few-Shot Visual Classification
* Improving Action Segmentation via Graph-Based Temporal Reasoning
* Improving Confidence Estimates for Unfamiliar Examples
* Improving Convolutional Networks With Self-Calibrated Convolutions
* Improving One-Shot NAS by Suppressing the Posterior Fading
* Improving the Robustness of Capsule Networks to Image Affine Transformations
* IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
* ImVoteNet: Boosting 3D Object Detection in Point Clouds With Image Votes
* In Defense of Grid Features for Visual Question Answering
* In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction From 2D Landmarks
* Incremental Few-Shot Object Detection
* Incremental Learning in Online Scenario
* Inducing Hierarchical Compositional Model by Sparsifying Generator Network
* Inferring Attention Shift Ranks of Objects for Image Saliency
* Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition
* Information-Driven Direct RGB-D Odometry
* Instance Credibility Inference for Few-Shot Learning
* Instance Guided Proposal Network for Person Search
* Instance Segmentation of Biological Images Using Harmonic Embeddings
* Instance Shadow Detection
* Instance-Aware Image Colorization
* Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection
* Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only
* Inter-Region Affinity Distillation for Road Marking Segmentation
* Inter-Task Association Critic for Cross-Resolution Person Re-Identification
* Interactive Image Segmentation With First Click Attention
* Interactive Multi-Label CNN Learning With Partial Labels
* Interactive Object Segmentation With Inside-Outside Guidance
* Interactive Two-Stream Decoder for Accurate and Fast Saliency Detection
* Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs, An
* Interpretable and Accurate Fine-grained Recognition via Region Grouping
* Interpreting the Latent Space of GANs for Semantic Face Editing
* Intra- and Inter-Action Understanding via Temporal Action Parsing
* IntrA: 3D Intracranial Aneurysm Dataset for Deep Learning
* Intuitive, Interactive Beard and Hair Synthesis With Generative Models
* Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image
* Investigation Into the Stochasticity of Batch Whitening, An
* iTAML: An Incremental Task-Agnostic Meta-learning Approach
* Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA
* Iterative Context-Aware Graph Inference for Visual Dialog
* Iteratively-Refined Interactive 3D Medical Image Segmentation with Multi-Agent Reinforcement Learning
* JA-POLS: A Moving-Camera Background Model via Joint Alignment and Partially-Overlapping Local Subspaces
* JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection
* Joint 3D Instance Segmentation and Object Detection for Autonomous Driving
* Joint Demosaicing and Denoising With Self Guidance
* Joint Filtering of Intensity Images and Neuromorphic Events for High-Resolution Noise-Robust Imaging
* Joint Graph-Based Depth Refinement and Normal Estimation
* Joint Semantic Segmentation and Boundary Detection Using Iterative Pyramid Contexts
* Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
* Joint Texture and Geometry Optimization for RGB-D Reconstruction
* Joint Training of Variational Auto-Encoder and Latent Energy-Based Model
* Just Go With the Flow: Self-Supervised Scene Flow Estimation
* KeypointNet: A Large-Scale 3D Keypoint Dataset Aggregated From Numerous Human Annotations
* KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects
* KFNet: Learning Temporal Camera Relocalization Using Kalman Filtering
* Knowledge As Priors: Cross-Modal Knowledge Generalization for Datasets Without Superior Knowledge
* Knowledge Within: Methods for Data-Free Model Compression, The
* L2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks
* Label Decoupling Framework for Salient Object Detection
* Label Distribution Learning on Auxiliary Label Space Graphs for Facial Expression Recognition
* Large Scale Video Representation Learning via Relational Graph Clustering
* Large-Scale Object Detection in the Wild From Imbalanced Multi-Labels
* LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation
* Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
* Learn2Perturb: An End-to-End Feature Perturbation Learning to Improve Adversarial Robustness
* Learned Image Compression With Discretized Gaussian Mixture Likelihoods and Attention Modules
* Learning 3D Semantic Scene Graphs From 3D Indoor Reconstructions
* Learning a Dynamic Map of Visual Appearance
* Learning a Neural 3D Texture Space From 2D Exemplars
* Learning a Neural Solver for Multiple Object Tracking
* Learning a Reinforced Agent for Flexible Exposure Bracketing Selection
* Learning a Unified Sample Weighting Network for Object Detection
* Learning a Weakly-Supervised Video Actor-Action Segmentation Model With a Wise Selection
* Learning Augmentation Network via Influence Functions
* Learning Better Lossless Compression Using Lossy Compression
* Learning by Analogy: Reliable Supervision From Transformations for Unsupervised Optical Flow Estimation
* Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation
* Learning Combinatorial Solver for Graph Matching
* Learning Deep Network for Detecting 3D Object Keypoints and 6D Poses
* Learning Depth-Guided Convolutions for Monocular 3D Object Detection
* Learning Dynamic Relationships for 3D Human Motion Prediction
* Learning Dynamic Routing for Semantic Segmentation
* Learning Event-Based Motion Deblurring
* Learning Fast and Robust Target Models for Video Object Segmentation
* Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
* Learning for Video Compression With Hierarchical Quality and Recurrent Enhancement
* Learning Formation of Physically-Based Face Attributes
* Learning From Noisy Anchors for One-Stage Object Detection
* Learning From Synthetic Animals
* Learning From Web Data With Self-Organizing Memory Module
* Learning Fused Pixel and Feature-Based View Reconstructions for Light Fields
* Learning Generative Models of Shape Handles
* Learning Geocentric Object Pose in Oblique Monocular Images
* Learning Human-Object Interaction Detection Using Interaction Points
* Learning Identity-Invariant Motion Representations for Cross-ID Face Reenactment
* Learning in the Frequency Domain
* Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
* Learning Instance Occlusion for Panoptic Segmentation
* Learning Integral Objects With Intra-Class Discriminator for Weakly-Supervised Semantic Segmentation
* Learning Interactions and Relationships Between Movie Characters
* Learning Invariant Representation for Unsupervised Image Restoration
* Learning Longterm Representations for Person Re-Identification Using Radio Signals
* Learning Memory-Guided Normality for Anomaly Detection
* Learning Meta Face Recognition in Unseen Domains
* Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification
* Learning Multi-Object Tracking and Segmentation From Automatic Annotations
* Learning Multi-View Camera Relocalization With Graph Neural Networks
* Learning Multiview 3D Point Cloud Registration
* Learning Nanoscale Motion Patterns of Vesicles in Living Cells
* Learning Oracle Attention for High-Fidelity Face Completion
* Learning Physics-Guided Face Relighting Under Directional Light
* Learning Rank-1 Diffractive Optics for Single-Shot High Dynamic Range Imaging
* Learning Representations by Predicting Bags of Visual Words
* Learning Saliency Propagation for Semi-Supervised Instance Segmentation
* Learning Selective Self-Mutual Attention for RGB-D Saliency Detection
* Learning Situational Driving
* Learning Temporal Co-Attention Models for Unsupervised Video Action Localization
* Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation
* Learning Texture Transformer Network for Image Super-Resolution
* Learning the Redundancy-Free Features for Generalized Zero-Shot Object Recognition
* Learning to Autofocus
* Learning to Cartoonize Using White-Box Cartoon Representations
* Learning to Cluster Faces via Confidence and Connectivity Estimation
* Learning to Detect Important People in Unlabelled Images for Semi-Supervised Important People Detection
* Learning to Discriminate Information for Online Action Detection
* Learning to Dress 3D People in Generative Clothing
* Learning to Evaluate Perception Models Using Planner-Centric Metrics
* Learning to Forget for Meta-Learning
* Learning to Generate 3D Training Data Through Hybrid Gradient
* Learning to Have an Ear for Face Super-Resolution
* Learning to Learn Cropping Models for Different Aspect Ratio Requirements
* Learning to Learn Single Domain Generalization
* Learning to Manipulate Individual Objects in an Image
* Learning to Measure the Static Friction Coefficient in Cloth Contact
* Learning to Observe: Approximating Human Perceptual Thresholds for Detection of Suprathreshold Image Transformations
* Learning to Optimize Non-Rigid Tracking
* Learning to Optimize on SPD Manifolds
* Learning to Restore Low-Light Images via Decomposition-and-Enhancement
* Learning to See Through Obstructions
* Learning to Segment 3D Point Clouds in 2D Image Space
* Learning to Segment the Tail
* Learning to Select Base Classes for Few-Shot Classification
* Learning to Shadow Hand-Drawn Sketches
* Learning to Simulate Dynamic Environments With GameGAN
* Learning to Structure an Image With Few Colors
* Learning to Super Resolve Intensity Images From Events
* Learning to Transfer Texture From Clothing Images to 3D Humans
* Learning Unseen Concepts via Hierarchical Decomposition and Composition
* Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a Single RGB Image
* Learning User Representations for Open Vocabulary Image Hashtag Prediction
* Learning Video Object Segmentation From Unlabeled Videos
* Learning Video Stabilization Using Optical Flow
* Learning Visual Emotion Representations From Web Data
* Learning Visual Motion Segmentation Using Event Surfaces
* Learning Weighted Submanifolds With Variational Autoencoders and Riemannian Variational Autoencoders
* Learning When and Where to Zoom With Deep Reinforcement Learning
* Leveraging 2D Data to Learn Textured 3D Mesh Generation
* Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction
* LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks
* LiDAR-Based Online 3D Video Object Detection With Graph-Based Message Passing and Spatiotemporal Transformer Attention
* LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World
* Light Field Spatial Super-Resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization
* Light-weight Calibrator: A Separable Component for Unsupervised Domain Adaptation
* Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination
* Lighting-Invariant Point Processor for Shading, A
* Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
* Lightweight Photometric Stereo for Facial Details Recovery
* Listen to Look: Action Recognition by Previewing Audio
* Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
* Local Context Normalization: Revisiting Local Normalization
* Local Deep Implicit Functions for 3D Shape
* Local Implicit Grid Representations for 3D Scenes
* Local Non-Rigid Structure-From-Motion From Diffeomorphic Mappings
* Local-Global Video-Text Interactions for Temporal Grounding
* Local-to-Global Approach to Multi-Modal Movie Scene Segmentation, A
* Look-Into-Object: Self-Supervised Structure Modeling for Object Recognition
* Looking at the Right Stuff: Guided Semantic-Gaze for Autonomous Driving
* Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer
* LSM: Learning Subspace Minimization for Low-Level Vision
* LT-Net: Label Transfer by Learning Reversible Voxel-Wise Correspondence for One-Shot Medical Image Segmentation
* LUVLi Face Alignment: Estimating Landmarks' Location, Uncertainty, and Visibility Likelihood
* M-LVC: Multiple Frames Prediction for Learned Video Compression
* M2m: Imbalanced Classification via Major-to-Minor Translation
* MAGSAC++, a Fast, Reliable and Accurate Robust Estimator
* Maintaining Discrimination and Fairness in Class Incremental Learning
* Making Better Mistakes: Leveraging Class Hierarchies With Deep Networks
* ManiGAN: Text-Guided Image Manipulation
* MANTRA: Memory Augmented Networks for Multiple Trajectory Prediction
* Mapillary Street-Level Sequences: A Dataset for Lifelong Place Recognition
* MARMVS: Matching Ambiguity Reduced Multiple View Stereo for Efficient Large Scale Scene Reconstruction
* Mask Encoding for Single Shot Instance Segmentation
* MaskFlownet: Asymmetric Feature Matching With Learnable Occlusion Mask
* MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
* MAST: A Memory-Augmented Self-Supervised Tracker
* MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model
* McFlow: Monte Carlo Flow Models for Data Imputation
* MEBOW: Monocular Estimation of Body Orientation in the Wild
* MemNAS: Memory-Efficient Neural Architecture Search With Grow-Trim Learning
* Memory Aggregation Networks for Efficient Interactive Video Object Segmentation
* Memory Enhanced Global-Local Aggregation for Video Object Detection
* Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising
* Mesh-Guided Multi-View Stereo With Pyramid Architecture
* Meshed-Memory Transformer for Image Captioning
* Meshlet Priors for 3D Mesh Reconstruction
* Meta-Learning of Neural Architectures for Few-Shot Learning
* Meta-Transfer Learning for Zero-Shot Super-Resolution
* MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation
* MetaIQA: Deep Meta-Learning for No-Reference Image Quality Assessment
* METAL: Minimum Effort Temporal Activity Localization in Untrimmed Videos
* MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation
* MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment
* MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images
* Minimal Solutions for Relative Pose With a Single Affine Correspondence
* Minimal Solutions to Relative Pose Estimation From Two Views Sharing a Common Direction With Unknown Focal Length
* Minimal Solvers for 3D Scan Alignment With Pairs of Intersecting Lines
* Minimizing Discrete Total Curvature for Image Processing
* MISC: Multi-Condition Injection and Spatially-Adaptive Compositing for Conditional Person Image Synthesis
* Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning
* Mixture Dense Regression for Object Detection and Human Pose Estimation
* MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
* MMTM: Multimodal Transfer Module for CNN Fusion
* MnasFPN: Learning Latency-Aware Pyramid Architecture for Object Detection on Mobile Devices
* Mnemonics Training: Multi-Class Incremental Learning Without Forgetting
* Modality Shifting Attention Network for Multi-Modal Video Question Answering
* Model Adaptation: Unsupervised Domain Adaptation Without Source Data
* Model-Driven Deep Neural Network for Single Image Rain Removal, A
* Modeling Biological Immunity to Adversarial Examples
* Modeling the Background for Incremental Learning in Semantic Segmentation
* Momentum Contrast for Unsupervised Visual Representation Learning
* Monocular Real-Time Hand Shape and Motion Capture Using Multi-Modal Data
* MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
* More Grounded Image Captioning by Distilling Image-Text Matching Model
* MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion
* Morphable Face Albedo Model, A
* MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps
* Moving in the Right Direction: A Regularization for Deep Metric Learning
* MPM: Joint Representation of Motion and Position Map for Cell Tracking
* MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation
* MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks
* MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning
* Multi-Dimensional Pruning: A Unified Framework for Model Compression
* Multi-Domain Learning for Accurate and Few-Shot Color Constancy
* Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-Based Person Re-Identification
* Multi-Hypothesis Approach to Color Constancy, A
* Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
* Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
* Multi-Modality Cross Attention Network for Image and Sentence Matching
* Multi-Mutual Consistency Induced Transfer Subspace Learning for Human Motion Segmentation
* Multi-Path Learning for Object Pose Estimation Across Domains
* Multi-Path Region Mining for Weakly Supervised 3D Semantic Segmentation on Point Clouds
* Multi-Scale Boosted Dehazing Network with Dense Feature Fusion
* Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification with Unannotated Histopathological Images
* Multi-Scale Fusion Subspace Clustering Using Similarity Constraint
* Multi-Scale Interactive Network for Salient Object Detection
* Multi-Scale Progressive Fusion Network for Single Image Deraining
* Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
* Multi-Task Mean Teacher for Semi-Supervised Shadow Detection, A
* Multi-View Neural Human Rendering
* Multigrid Method for Efficiently Training Video Models, A
* Multimodal Categorization of Crisis Events in Social Media
* Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior
* Multiple Anchor Learning for Visual Object Detection
* Multiview-Consistent Semi-Supervised Learning for 3D Human Pose Estimation
* Music Gesture for Visual Sound Separation
* MUXConv: Information Multiplexing in Convolutional Neural Networks
* NAS-FCOS: Fast Neural Architecture Search for Object Detection
* Nested Scale-Editing for Conditional Image Synthesis
* NestedVAE: Isolating Common Factors via Weak Supervision
* NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection
* Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio
* Neural Architecture Search for Lightweight Non-Local Networks
* Neural Blind Deconvolution Using Deep Priors
* Neural Cages for Detail-Preserving 3D Deformations
* Neural Contours: Learning to Draw Lines From 3D Shapes
* Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data
* Neural Head Reenactment with Latent Pose Descriptors
* Neural Implicit Embedding for Point Cloud Analysis
* Neural Network Pruning With Residual-Connections and Limited-Data
* Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model
* Neural Point Cloud Rendering via Multi-Plane Projection
* Neural Pose Transfer by Spatially Adaptive Instance Normalization
* Neural Rendering Framework for Free-Viewpoint Relighting, A
* Neural Topological SLAM for Visual Navigation
* Neural Voxel Renderer: Learning an Accurate and Controllable Rendering Tool
* NeuralScale: Efficient Scaling of Neurons for Resource-Constrained Deep Neural Networks
* Neuromorphic Camera Guided High Dynamic Range Imaging
* NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing
* Noise Modeling, Synthesis and Classification for Generic Object Anti-Spoofing
* Noise Robust Generative Adversarial Networks
* Noise-Aware Fully Webly Supervised Object Detection
* Noisier2Noise: Learning to Denoise From Unpaired Noisy Data
* Non-Adversarial Video Synthesis with Learned Priors
* Non-Line-of-Sight Surface Reconstruction Using the Directional Light-Cone Transform
* Non-Local Neural Networks With Grouped Bilinear Attentional Transforms
* Nonparametric Object and Parts Modeling With Lie Group Dynamics
* Norm-Aware Embedding for Efficient Person Search
* Normal Assisted Stereo Depth Estimation
* Normalized and Geometry-Aware Self-Attention Network for Image Captioning
* Normalizing Flows With Multi-Scale Autoregressive Priors
* Novel Object Viewpoint Estimation Through Reconstruction Alignment
* Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction From an Open Aerial Dataset, A
* Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera
* nuScenes: A Multimodal Dataset for Autonomous Driving
* OASIS: A Large-Scale Dataset for Single Image 3D in the Wild
* Object Relational Graph With Teacher-Recommended Learning for Video Captioning
* Object-Occluded Human Shape and Pose Estimation From a Single Color Image
* OccuSeg: Occupancy-Aware 3D Instance Segmentation
* OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression
* Offset Bin Classification Network for Accurate Object Detection
* Old Is Gold: Redefining the Adversarially Learned One-Class Classifier Training Paradigm
* On Isometry Robustness of Deep 3D Point Cloud Models Under Adversarial Attacks
* On Joint Estimation of Pose, Geometry and svBRDF From a Handheld Scanner
* On Positive-Unlabeled Classification in GAN
* On the Acceleration of Deep Learning Model Parallelism With Staleness
* On the Detection of Digital Face Manipulation
* On the Distribution of Minima in Intrinsic-Metric Rotation Averaging
* On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
* On the Regularization Properties of Structured Dropout
* On the Uncertainty of Self-Supervised Monocular Depth Estimation
* On Translation Invariance in CNNs: Convolutional Layers Can Exploit Absolute Spatial Location
* On Vocabulary Reliance in Scene Text Recognition
* One Man's Trash Is Another Man's Treasure: Resisting Adversarial Examples by Adversarial Examples
* One-Shot Adversarial Attacks on Visual Tracking With Dual Attention
* One-Shot Domain Adaptation for Face Generation
* Online Deep Clustering for Unsupervised Representation Learning
* Online Depth Learning Against Forgetting in Monocular Videos
* Online Joint Multi-Metric Adaptation From Frequent Sharing-Subset Mining for Person Re-Identification
* Online Knowledge Distillation via Collaborative Learning
* Oops! Predicting Unintentional Action in Video
* Open Compound Domain Adaptation
* Optical Flow in Dense Foggy Scenes Using Semi-Supervised Learning
* Optical Flow in the Dark
* Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation
* Optimal Least-Squares Solution to the Hand-Eye Calibration Problem
* Optimizing Rank-Based Metrics With Blackbox Differentiation
* Orderless Recurrent Models for Multi-Label Classification
* Organ at Risk Segmentation for Head and Neck Cancer Using Stratified Learning and Neural Architecture Search
* OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by Learning to Unfold
* Orthogonal Convolutional Neural Networks
* Overcoming Classifier Imbalance for Long-Tail Object Detection With Balanced Group Softmax
* Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization
* P-nets: Deep Polynomial Neural Networks
* P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds
* PADS: Policy-Adapted Sampling for Visual Similarity Learning
* Painting Many Pasts: Synthesizing Time Lapse Videos of Paintings
* PANDA: A Gigapixel-Level Human-Centric Video Dataset
* PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
* Panoptic-Based Image Synthesis
* Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation
* Parsing-Based View-Aware Embedding Network for Vehicle Re-Identification
* Part-Aware Context Network for Human Parsing
* Partial Weight Adaptation for Robust DNN Inference
* PaStaNet: Toward Human Activity Knowledge Engine
* PatchVAE: Learning Local Latent Codes for Recognition
* Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation
* Pattern-Structure Diffusion for Multi-Task Learning
* Peek-a-Boo: Occlusion Reasoning in Indoor Scenes With Plane Representations
* Perceptual Quality Assessment of Smartphone Photography
* Perspective Plane Program Induction From a Single Image
* PF-Net: Point Fractal Network for 3D Point Cloud Completion
* PFCNN: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames
* PFRL: Pose-Free Reinforcement Learning for 6D Pose Estimation
* Phase Consistent Ecological Domain Adaptation
* Photometric Stereo via Discrete Hypothesis-and-Test Search
* PhraseCut: Language-Based Image Segmentation in the Wild
* PhysGAN: Generating Physical-World-Resilient Adversarial Examples for Autonomous Driving
* Physically Realizable Adversarial Examples for LiDAR Object Detection
* Physics-Based Noise Formation Model for Extreme Low-Light Raw Denoising, A
* PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
* Pixel Consensus Voting for Panoptic Segmentation
* Plug-and-Play Algorithms for Large-Scale Snapshot Compressive Imaging
* PnPNet: End-to-End Perception and Prediction With Tracking in the Loop
* Point Cloud Completion by Skip-Attention Network With Hierarchical Folding
* Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud
* PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling
* PointAugment: An Auto-Augmentation Framework for Point Cloud Classification
* PointGMM: A Neural GMM Network for Point Clouds
* PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
* PointPainting: Sequential Fusion for 3D Object Detection
* PointRend: Image Segmentation As Rendering
* Polarized Non-Line-of-Sight Imaging
* Polarized Reflection Removal With Perfect Alignment in the Wild
* PolarMask: Single Shot Instance Segmentation With Polar Representation
* PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
* Polishing Decision-Based Adversarial Noise With a Customized Sampling
* PolyTransform: Deep Polygon Transformer for Instance Segmentation
* Pose-Guided Visible Part Matching for Occluded Person ReID
* PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection
* PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
* PREDICT & CLUSTER: Unsupervised Skeleton Based Action Recognition
* Predicting Cognitive Declines Using Longitudinally Enriched Representations for Imaging Biomarkers
* Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning
* Predicting Lymph Node Metastasis Using Histopathological Images Based on Multiple Instance Learning With Deep Graph Convolution
* Predicting Semantic Map Representations From Images Using Pyramid Occupancy Networks
* Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
* Prime Sample Attention in Object Detection
* Prior Guided GAN Based Semantic Inpainting
* Private-kNN: Practical Differential Privacy for Computer Vision
* ProAlignNet: Unsupervised Learning for Progressively Aligning Noisy Contours
* Probabilistic Pixel-Adaptive Refinement Networks
* Probabilistic Regression for Visual Tracking
* Probabilistic Structural Latent Representation for Unsupervised Embedding
* Probabilistic Video Prediction From Noisy Data With a Posterior Confidence
* Probability Weighted Compact Feature for Domain Adaptive Retrieval
* Programmatic and Semantic Approach to Explaining and Debugging Neural Network Based Object Detectors, A
* Progressive Adversarial Networks for Fine-Grained Domain Adaptation
* Progressive Mirror Detection
* Progressive Relation Learning for Group Activity Recognition
* Projection Probability-Driven Black-Box Attack
* PropagationNet: Propagate Points to Curve to Learn Structure Information
* Proxy Anchor Loss for Deep Metric Learning
* PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
* PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
* PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation
* Putting Visual Object Recognition in Context
* PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
* PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation
* QEBA: Query-Efficient Boundary-Based Blackbox Attack
* Quantum Computational Approach to Correspondence Problems on Point Sets, A
* Quasi-Newton Solver for Robust Non-Rigid Registration
* Quaternion Product Units for Deep Learning on 3D Rotation Groups
* RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
* RankMI: A Mutual Information Maximizing Ranking Loss
* RDCFace: Radial Distortion Correction for Face Recognition
* Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension, A
* Real-Time Panoptic Segmentation From Dense Detections
* Real-World Person Re-Identification via Degradation Invariance Learning
* Reciprocal Learning Networks for Human Trajectory Prediction
* Recognizing Objects From Any View With Object and Viewer-Centered Representations
* Reconstruct Locally, Localize Globally: A Model Free Method for Object Pose Estimation
* Recurrent Feature Reasoning for Image Inpainting
* Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking
* Recursive Social Behavior Graph for Trajectory Prediction
* ReDA: Reinforced Differentiable Attribute for 3D Face Reconstruction
* Reference-Based Sketch Image Colorization Using Augmented-Self Reference and Dense Semantic Correspondence
* Referring Image Segmentation via Cross-Modal Progressive Comprehension
* Reflection Scene Separation From a Single Image
* Regularization on Spatio-Temporally Smoothed Feature for Action Recognition
* Regularizing Class-Wise Predictions via Self-Knowledge Distillation
* Regularizing CNN Transfer Learning With Randomised Regression
* Regularizing Discriminative Capability of CGANs for Semi-Supervised Generative Learning
* Regularizing Neural Networks via Minimizing Hyperspherical Energy
* Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task
* Relation-Aware Global Attention for Person Re-Identification
* Relative Interior Rule in Block-Coordinate Descent
* Reliable Weighted Optimal Transport for Unsupervised Domain Adaptation
* Residual Feature Aggregation Network for Image Super-Resolution
* Resolution Adaptive Networks for Efficient Inference
* ReSprop: Reuse Sparsified Backpropagation
* Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective
* Rethinking Classification and Localization for Object Detection
* Rethinking Computer-Aided Tuberculosis Diagnosis
* Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy
* Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets
* Rethinking Differentiable Search for Mixed-Precision Neural Networks
* Rethinking Performance Estimation in Neural Architecture Search
* Rethinking the Route Towards Weakly Supervised Object Localization
* Rethinking Zero-Shot Video Classification: End-to-End Training for Realistic Applications
* Retina-Like Visual Image Reconstruction via Spiking Neural Model
* RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild
* RetinaTrack: Online Single Stage Joint Detection and Tracking
* Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation
* RevealNet: Seeing Behind Objects in RGB-D Scans
* REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
* Reverse Perspective Network for Perspective-Aware Object Counting
* Revisiting Knowledge Distillation via Label Smoothing Regularization
* Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition
* Revisiting Saliency Metrics: Farthest-Neighbor Area Under Curve
* Revisiting the Sibling Head in Object Detector
* RGBD-Dog: Predicting Canine Pose from RGBD Sensors
* RiFeGAN: Rich Feature Generation for Text-to-Image Synthesis From Prior Knowledge
* RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real
* RMP-SNN: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Network
* ROAM: Recurrently Optimizing Tracking Model
* RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
* Robust 3D Self-Portraits in Seconds
* Robust Design of Deep Neural Networks Against Adversarial Attacks Based on Lyapunov Theory
* Robust Homography Estimation via Dual Principal Component Pursuit
* Robust Learning Through Cross-Task Consistency
* Robust Object Detection Under Occlusion With Context-Aware CompositionalNets
* Robust Partial Matching for Person Search in the Wild
* Robust Reference-Based Super-Resolution With Similarity-Aware Deformable Convolution
* Robust Superpixel-Guided Attentional Adversarial Attack
* Robustness Guarantees for Deep Neural Networks on Videos
* Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images
* Rotation Consistent Margin Loss for Efficient Low-Bit Face Recognition
* Rotation Equivariant Graph Convolutional Network for Spherical Image Classification
* RoutedFusion: Learning Real-Time Depth Map Fusion
* RPM-Net: Robust Point Matching Using Learned Features
* S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation
* SaccadeNet: A Fast and Accurate Object Detector
* SAINT: Spatially Aware Interpolation NeTwork for Medical Slice Synthesis
* SAL: Sign Agnostic Learning of Shapes From Raw Data
* Salience-Guided Cascaded Suppression Network for Person Re-Identification
* SAM: The Sensitivity of Attribution Methods to Hyperparameters
* Same Features, Different Day: Weakly Supervised Feature Learning for Seasonal Invariance
* SampleNet: Differentiable Point Cloud Sampling
* SAPIEN: A SimulAted Part-Based Interactive ENvironment
* Satellite Image Time Series Classification With Pixel-Set Encoders and Temporal Self-Attention
* Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs
* Scalability in Perception for Autonomous Driving: Waymo Open Dataset
* Scalable Uncertainty for Computer Vision With Functional Variational Inference
* Scale-Equalizing Pyramid Convolution for Object Detection
* Scale-Space Flow for End-to-End Optimized Video Compression
* SCATTER: Selective Context Attentional Scene Text Recognizer
* Scene Recomposition by Learning-Based ICP
* Scene-Adaptive Video Frame Interpolation via Meta-Learning
* ScopeFlow: Dynamic Scene Scoping for Optical Flow
* SCOUT: Self-Aware Discriminant Counterfactual Explanations
* ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation
* Screencast Tutorial Video Understanding
* SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation
* SDC-Depth: Semantic Divide-and-Conquer Network for Monocular Depth Estimation
* SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
* SEAN: Image Synthesis With Semantic Region-Adaptive Normalization
* Search to Distill: Pearls Are Everywhere but Not the Eyes
* Searching Central Difference Convolutional Networks for Face Anti-Spoofing
* Searching for Actions on the Hyperbole
* Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks, The
* SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
* Seeing Around Street Corners: Non-Line-of-Sight Detection and Tracking In-the-Wild Using Doppler Radar
* Seeing the World in a Bag of Chips
* Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather
* Seeing without Looking: Contextual Rescoring of Object Detections for AP Maximization
* SegGCN: Efficient 3D Point Cloud Segmentation With Fuzzy Spherical Kernel
* Select to Better Learn: Fast and Accurate Deep Learning Using Data Selection From Nonlinear Manifolds
* Select, Supplement and Focus for RGB-D Saliency Detection
* Selective Transfer With Reinforced Transfer Network for Partial Domain Adaptation
* Self-Learning Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence
* Self-Learning With Rectification Strategy for Human Parsing
* Self-Robust 3D Point Recognition via Gather-Vector Guidance
* Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis
* Self-supervised Approach for Adversarial Robustness, A
* Self-Supervised Deep Visual Odometry With Online Adaptation
* Self-Supervised Domain-Aware Generative Network for Generalized Zero-Shot Learning
* Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
* Self-Supervised Human Depth Estimation From Monocular Videos
* Self-Supervised Learning of Interpretable Keypoints From Unlabelled Videos
* Self-Supervised Learning of Pretext-Invariant Representations
* Self-Supervised Learning of Video-Induced Visual Invariances
* Self-Supervised Monocular Scene Flow Estimation
* Self-Supervised Monocular Trained Depth Estimation Using Self-Attention and Discrete Disparity Volume
* Self-Supervised Scene De-Occlusion
* Self-Supervised Viewpoint Learning From Image Collections
* Self-Trained Deep Ordinal Regression for End-to-End Video Anomaly Detection
* Self-Training With Noisy Student Improves ImageNet Classification
* Self2Self With Dropout: Learning Self-Supervised Denoising From Single Image
* Semantic Correspondence as an Optimal Transport Problem
* Semantic Drift Compensation for Class-Incremental Learning
* Semantic Image Manipulation Using Scene Graphs
* Semantic Pyramid for Image Generation
* Semantically Multi-Modal Image Synthesis
* Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition
* Semi-Supervised Assessor of Neural Architectures, A
* Semi-Supervised Learning for Few-Shot Image-to-Image Translation
* Semi-Supervised Semantic Image Segmentation With Self-Correcting Networks
* Semi-Supervised Semantic Segmentation With Cross-Consistency Training
* Separating Particulate Matter From a Single Microscopic Image
* Sequential 3D Human Pose and Shape Estimation From Point Clouds
* Sequential Mastery of Multiple Visual Tasks: Networks Naturally Learn to Learn and Forget to Forget
* Sequential Motif Profiles and Topological Plots for Offline Signature Verification
* SER-FIQ: Unsupervised Estimation of Face Image Quality Based on Stochastic Embedding Robustness
* SESS: Self-Ensembling Semi-Supervised 3D Object Detection
* Set-Constrained Viterbi for Set-Supervised Action Segmentation
* Severity-Aware Semantic Segmentation With Reinforced Wasserstein Training
* SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
* SGAS: Sequential Greedy Architecture Search
* Shape Correspondence Using Anisotropic Chebyshev Spectral CNNs
* Shape Reconstruction by Learning Differentiable Surface Representations
* Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning, A
* SharinGAN: Combining Synthetic and Real Data for Unsupervised Geometry Estimation
* Shoestring: Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data
* Show, Edit and Tell: A Framework for Editing Image Captions
* Siam R-CNN: Visual Tracking by Re-Detection
* SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking
* Siamese Box Adaptive Network for Visual Tracking
* Sideways: Depth-Parallel Training of Video Models
* Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation
* Single Image Optical Flow Estimation With an Event Camera
* Single Image Reflection Removal Through Cascaded Refinement
* Single Image Reflection Removal With Physically-Based Training Images
* Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
* Single-Shot Monocular RGB-D Imaging Using Uneven Double Refraction
* Single-Side Domain Generalization for Face Anti-Spoofing
* Single-Stage 6D Object Pose Estimation
* Single-Stage Semantic Segmentation From Image Labels
* Single-Step Adversarial Training With Dropout Scheduling
* Single-View View Synthesis With Multiplane Images
* Skeleton-Based Action Recognition With Shift Graph Convolutional Network
* Sketch Less for More: On-the-Fly Fine-Grained Sketch-Based Image Retrieval
* Sketch-BERT: Learning Sketch Bidirectional Encoder Representation From Transformers by Self-Supervised Learning of Sketch Gestalt
* Sketchformer: Transformer-Based Representation for Sketched Structure
* SketchyCOCO: Image Generation From Freehand Scene Sketches
* SLV: Spatial Likelihood Voting for Weakly Supervised Object Detection
* SmallBigNet: Integrating Core and Contextual Views for Video Classification
* Smooth Shells: Multi-Scale Shape Registration With Functional Maps
* Smoothing Adversarial Domain Attack and P-Memory Reconsolidation for Cross-Domain Person Re-Identification
* Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
* Softmax Splatting for Video Frame Interpolation
* Solving Jigsaw Puzzles With Eroded Boundaries
* Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval
* Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks
* SOS: Selective Objective Switch for Rapid Immunofluorescence Whole Slide Image Classification
* SP-NAS: Serial-to-Parallel Backbone Search for Object Detection
* Space-Time-Aware Multi-Resolution Video Enhancement
* SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings
* Sparse Layered Graphs for Multi-Object Segmentation
* Sparse Resultant Based Method for Efficient Minimal Solvers, A
* Spatial Pyramid Based Graph Reasoning for Semantic Segmentation
* Spatial RNN Codec for End-to-End Image Compression, A
* Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification
* Spatially Attentive Output Layer for Image Classification
* Spatially-Attentive Patch-Hierarchical Network for Adaptive Motion Deblurring
* Spatio-Temporal Graph for Video Captioning With Knowledge Distillation
* Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
* Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image, A
* Speech2Action: Cross-Modal Supervision for Action Recognition
* SpeedNet: Learning the Speediness in Videos
* Spherical Space Domain Adaptation With Robust Pseudo-Label Loss
* SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
* SpSequenceNet: Semantic Segmentation Network on 4D Point Clouds
* SQE: A Self Quality Evaluation Metric for Parameters Optimization in Multi-Object Tracking
* Squeeze-and-Attention Networks for Semantic Segmentation
* SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions
* SSRNet: Scalable 3D Surface Reconstruction Network
* StarGAN v2: Diverse Image Synthesis for Multiple Domains
* State-Aware Tracker for Real-Time Video Object Segmentation
* State-Relabeling Adversarial Active Learning
* STAViS: Spatio-Temporal AudioVisual Saliency Network
* Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics
* STEFANN: Scene Text Editor Using Font Adaptive Neural Network
* StegaStamp: Invisible Hyperlinks in Physical Photographs
* StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching
* Stereoscopic Flash and No-Flash Photography for Shape and Albedo Recovery
* STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction
* Stochastic Classifiers for Unsupervised Domain Adaptation
* Stochastic Conditioning Scheme for Diverse Human Motion Prediction, A
* Stochastic Sparse Subspace Clustering
* Straight to the Point: Fast-Forwarding Videos via Reinforcement Learning Using Textual Data
* Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
* StructEdit: Learning Structural Shape Variations
* Structure Aware Single-Stage 3D Object Detection From Point Cloud
* Structure Boundary Preserving Segmentation for Medical Image With Ambiguous Boundary
* Structure Preserving Generative Cross-Domain Learning
* Structure-Guided Ranking Loss for Single Image Depth Prediction
* Structure-Preserving Super Resolution With Gradient Guidance
* Structured Compression by Weight Encryption for Unstructured Pruning and Quantization
* Structured Multi-Hashing for Model Compression
* Style Normalization and Restitution for Generalizable Person Re-Identification
* StyleRig: Rigging StyleGAN for 3D Control Over Portrait Images
* Stylization-Based Architecture for Fast Deep Exemplar Colorization
* Sub-Frame Appearance and 6D Pose Estimation of Fast Moving Objects
* Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation
* SuperGlue: Learning Feature Matching With Graph Neural Networks
* Superpixel Segmentation With Fully Convolutional Networks
* Supervised Raw Video Denoising With a Benchmark Dataset on Dynamic Scenes
* Suppressing Uncertainties for Large-Scale Facial Expression Recognition
* SurfelGAN: Synthesizing Realistic Sensor Data for Autonomous Driving
* SwapText: Image Based Texts Transfer in Scenes
* Symmetry and Group in Attribute-Object Compositions
* Syn2Real Transfer Learning for Image Deraining Using Gaussian Processes
* Synchronizing Probability Measures on Rotations via Optimal Transport
* SynSin: End-to-End View Synthesis From a Single Image
* Syntax-Aware Action Targeting for Video Captioning
* Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data
* TA-Student VQA: Multi-Agents Training by Self-Questioning
* TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style
* Taking a Deeper Look at Co-Salient Object Detection
* Tangent Images for Mitigating Spherical Distortion
* Task Agnostic Robust Learning on Corrupt Outputs by Correlation-Guided Mixture Density Networks
* TBT: Targeted Neural Network Attack With Bit Trojan
* TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution
* TEA: Temporal Excitation and Aggregation for Action Recognition
* Telling Left From Right: Learning Spatial Correspondence of Sight and Sound
* Temporal Pyramid Network for Action Recognition
* Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
* Temporally Distributed Networks for Fast Video Semantic Segmentation
* TESA: Tensor Element Self-Attention via Matricization
* TetraTSDF: 3D Human Reconstruction From a Single Image With a Tetrahedral Outer Shell
* Texture and Shape Biased Two-Stream Networks for Clothing Classification and Attribute Recognition
* TextureFusion: High-Quality Texture Acquisition for Real-Time RGB-D Scanning
* There and Back Again: Revisiting Backpropagation Saliency Methods
* Three-Dimensional Reconstruction of Human Interactions
* Through Fog High-Resolution Imaging Using Millimeter Wave Radar
* Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes
* Time Flies: Animating a Still Image With Time-Lapse Video As Reference
* TITAN: Future Forecast Using Action Priors
* TomoFluid: Reconstructing Dynamic Fluid From Sparse View Videos
* Total Deep Variation for Linear Inverse Problems
* Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes From a Single Image
* Toward a Universal Model for Shape From Texture
* Towards Accurate Scene Text Recognition With Semantic Reasoning Networks
* Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes
* Towards Backward-Compatible Representation Learning
* Towards Better Generalization: Joint Depth-Pose Learning Without PoseNet
* Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
* Towards Discriminability and Diversity: Batch Nuclear-Norm Maximization Under Label Insufficient Situations
* Towards Efficient Model Compression via Learned Global Ranking
* Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation
* Towards Global Explanations of Convolutional Neural Networks With Concept Attribution
* Towards High-Fidelity 3D Face Reconstruction From In-the-Wild Images Using Graph Convolutional Networks
* Towards Inheritable Models for Open-Set Domain Adaptation
* Towards Large Yet Imperceptible Adversarial Image Perturbations With Perceptual Color Distance
* Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training
* Towards Learning Structure via Consensus for Face Segmentation and Parsing
* Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
* Towards Robust Image Classification Using Sequential Attention Models
* Towards Transferable Targeted Attack
* Towards Unified INT8 Training for Convolutional Neural Network
* Towards Universal Representation Learning for Deep Face Recognition
* Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis
* Towards Verifying Robustness of Neural Networks Against A Family of Semantic Perturbations
* Towards Visually Explaining Variational Autoencoders
* TPNet: Trajectory Proposal Network for Motion Prediction
* Tracking by Instance Detection: A Meta-Learning Approach
* Train in Germany, Test in the USA: Making 3D Object Detectors Generalize
* Training a Steerable CNN for Guidewire Detection
* Training Noise-Robust Deep Neural Networks via Meta-Learning
* Training Quantized Neural Networks With a Full-Precision Auxiliary Module
* Transductive Approach for Video Object Segmentation, A
* Transfer Learning From Synthetic to Real-Noise Denoising With Adaptive Instance Normalization
* Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking
* Transferring and Regularizing Prediction for Semantic Segmentation
* Transferring Cross-Domain Knowledge for Video Sign Language Recognition
* Transferring Dense Pose to Proximal Animal Classes
* Transform and Tell: Entity-Aware News Image Captioning
* Transformation GAN for Unsupervised Image Synthesis and Representation Learning
* TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning
* TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting
* TRPLP: Trifocal Relative Pose From Lines at Points
* TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model
* Two Causal Principles for Improving Visual Dialog
* Two-Shot Spatially-Varying BRDF and Shape Estimation
* Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer
* U-Net Based Discriminator for Generative Adversarial Networks, A
* UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders
* UCTGAN: Diverse Image Inpainting Based on Unsupervised Cross-Space Translation
* UNAS: Differentiable Architecture Search Meets Reinforcement Learning
* Unbiased Scene Graph Generation from Biased Training
* Uncertainty Based Camera Model Selection
* Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End
* Uncertainty-Aware Mesh Decoder for High Fidelity 3D Face Reconstruction
* Uncertainty-Aware Score Distribution Learning for Action Quality Assessment
* Understanding Adversarial Examples From the Mutual Influence of Images and Perturbations
* Understanding Human Hands in Contact at Internet Scale
* Understanding Road Layout From Videos as a Whole
* Unified Dynamic Convolutional Network for Super-Resolution With Variational Degradations
* Unified Object Motion and Affinity Model for Online Multi-Object Tracking, A
* Unified Optimization Framework for Low-Rank Inducing Penalties, A
* Unifying Training and Inference for Panoptic Segmentation
* Uninformed Students: Student-Teacher Anomaly Detection With Discriminative Latent Embeddings
* UniPose: Unified Human Pose Estimation in Single Images and Videos
* Unity Style Transfer for Person Re-Identification
* Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs
* Universal Physical Camouflage Attacks on Object Detectors
* Universal Source-Free Domain Adaptation
* Universal Weighting Metric Learning for Cross-Modal Matching
* Unpaired Image Super-Resolution Using Pseudo-Supervision
* Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping
* Unsupervised Adaptation Learning for Hyperspectral Imagery Super-Resolution
* Unsupervised Deep Shape Descriptor With Point Distribution Learning
* Unsupervised Domain Adaptation via Structurally Regularized Deep Clustering
* Unsupervised Domain Adaptation With Hierarchical Gradient Synchronization
* Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-Weighting
* Unsupervised Intra-Domain Adaptation for Semantic Segmentation Through Self-Supervision
* Unsupervised Learning for Intrinsic Image Decomposition from a Single Image
* Unsupervised Learning From Video With Deep Neural Embeddings
* Unsupervised Learning of Intrinsic Structural Representation Points
* Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild
* Unsupervised Magnification of Posture Deviations Across Subjects
* Unsupervised Model Personalization While Preserving Privacy and Scalability: An Open Problem
* Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
* Unsupervised Person Re-Identification via Multi-Label Classification
* Unsupervised Person Re-Identification via Softened Similarity Learning
* Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
* Unsupervised Representation Learning for Gaze Estimation
* Upgrading Optical Flow to 3D Scene Flow Through Optical Expansion
* Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
* Variational Context-Deformable ConvNets for Indoor Scene Parsing
* Variational-EM-Based Deep Learning for Noise-Blind Image Deblurring
* Varicolored Image De-Hazing
* Vec2Face: Unveil Human Faces From Their Blackbox Features in Face Recognition
* VecRoad: Point-Based Iterative Graph Exploration for Road Graphs Extraction
* VectorNet: Encoding HD Maps and Agent Dynamics From Vectorized Representation
* ViBE: Dressing for Diverse Body Shapes
* VIBE: Video Inference for Human Body Pose and Shape Estimation
* Video Instance Segmentation Tracking With a Modified VAE Architecture
* Video Modeling With Correlation Networks
* Video Object Grounding Using Semantic Roles in Language Description
* Video Panoptic Segmentation
* Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning
* Video Super-Resolution With Temporal Group Attention
* Video to Events: Recycling Video Datasets for Event Cameras
* View-GCN: View-Based Graph Convolutional Network for 3D Shape Analysis
* ViewAL: Active Learning With Viewpoint Entropy for Semantic Segmentation
* Violin: A Large-Scale Dataset for Video-and-Language Inference
* Vision-Dialog Navigation by Exploring Cross-Modal Memory
* Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
* Visual Chirality
* Visual Commonsense R-CNN
* Visual Grounding in Video for Unsupervised Word Translation
* Visual Reaction: Learning to Play Catch With Your Drone
* Visual-Semantic Matching by Exploring High-Order Attention and Distraction
* Visual-Textual Capsule Routing for Text-Based Video Segmentation
* Visually Imbalanced Stereo Matching
* VOLDOR: Visual Odometry From Log-Logistic Dense Optical Flow Residuals
* VPLNet: Deep Single View Normal Estimation With Vanishing Points and Lines
* VQA With No Questions-Answers Training
* VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
* Warp to the Future: Joint Forecasting of Features and Feature Motion
* Warping Residual Based Image Stitching for Large Parallax
* Watch Your Up-Convolution: CNN Based Generative Deep Neural Networks Are Failing to Reproduce Spectral Distributions
* Wavelet Integrated CNNs for Noise-Robust Image Classification
* Wavelet Synthesis Net for Disparity Estimation to Synthesize DSLR Calibre Bokeh Effect on Smartphones
* WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching
* WCP: Worst-Case Perturbations for Semi-Supervised Deep Learning
* Weakly Supervised Discriminative Feature Learning With State Information for Person Identification
* Weakly Supervised Fine-Grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning
* Weakly Supervised Semantic Point Cloud Segmentation: Towards 10x Fewer Labels
* Weakly Supervised Visual Semantic Parsing
* Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild
* Weakly-Supervised Action Localization by Generative Attention Modeling
* Weakly-Supervised Domain Adaptation via GAN and Mesh Model for Estimating 3D Hand Poses Interacting Objects
* Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
* Weakly-Supervised Salient Object Detection via Scribble Annotations
* Weakly-Supervised Semantic Segmentation via Sub-Category Exploration
* Webly Supervised Knowledge Embedding Model for Visual Reasoning
* What Can Be Transferred: Unsupervised Domain Adaptation for Endoscopic Lesions Segmentation
* What Deep CNNs Benefit From Global Covariance Pooling: An Optimization Perspective
* What Does Plate Glass Reveal About Camera Calibration?
* What It Thinks Is Important Is Important: Robustness Transfers Through Input Gradients
* What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images
* What Makes Training Multi-Modal Classification Networks Hard?
* What You See is What You Get: Exploiting Visibility for 3D Object Detection
* What's Hidden in a Randomly Weighted Neural Network?
* When NAS Meets Robustness: In Search of Robust Architectures Against Adversarial Attacks
* When to Use Convolutional Neural Networks for Inverse Problems
* When2com: Multi-Agent Perception via Communication Graph Grouping
* Where Am I Looking At? Joint Location and Orientation Estimation by Cross-View Matching
* Where Does It End?: Reasoning About Hidden Surfaces by Object Intersection Constraints
* Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
* Which Is Plagiarism: Fashion Image Retrieval Based on Regional Representation for Design Protection
* Why Having 10,000 Parameters in Your Camera Model Is Better Than Twelve
* Wish You Were Here: Context-Aware Human Generation
* X-Linear Attention Networks for Image Captioning
* X3D: Expanding Architectures for Efficient Video Recognition
* xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
* You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions
* Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
* Zero-Assignment Constraint for Graph Matching With Outliers
* Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement
* ZeroQ: A Novel Zero Shot Quantization Framework
* Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
* ZSTAD: Zero-Shot Temporal Activity Detection
1464 for CVPR20

CVPR21 * *CVPR
* 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
* 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
* 3D CNNs with Adaptive Temporal Feature Resolutions
* 3D GAN for Improved Large-pose Facial Recognition, A
* 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management
* 3D Human Action Representation Learning via Cross-View Consistency Pursuit
* 3D Object Detection with Pointformer
* 3D Shape Generation with Grid-based Implicit Functions
* 3D Spatial Recognition without Spatially Labeled 3D
* 3D Video Stabilization with Depth Estimation by CNN-based Optimization
* 3D-MAN: 3D Multi-frame Attention Network for Object Detection
* 3D-to-2D Distillation for Indoor Scene Parsing
* 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction
* 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
* 4D Hyperspectral Photoacoustic Data Restoration with Reliability Analysis
* 4D Panoptic LiDAR Segmentation
* A2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation
* ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation
* Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
* Accurate Few-shot Object Detection with Support-Query Mutual Guidance and Hybrid Loss
* Achieving robustness in classification using optimal transport with hinge regularization
* ACRE: Abstract Causal REasoning Beyond Covariation
* Action Shuffle Alternating Learning for Unsupervised Action Segmentation
* Action Unit Memory Network for Weakly Supervised Temporal Action Localization
* ACTION-Net: Multipath Excitation for Action Recognition
* Activate or Not: Learning Customized Activation
* Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
* AdaBins: Depth Estimation Using Adaptive Bins
* Adaptive Aggregation Networks for Class-Incremental Learning
* Adaptive Class Suppression Loss for Long-Tail Object Detection
* Adaptive Consistency Prior based Deep Network for Image Denoising
* Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
* Adaptive Convolutions for Structure-Aware Style Transfer
* Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
* Adaptive Image Transformer for One-Shot Object Detection
* Adaptive Methods for Real-World Domain Generalization
* Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
* Adaptive Rank Estimate in Robust Principal Component Analysis
* Adaptive Weighted Discriminator for Training Generative Adversarial Networks
* AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching
* AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries
* AdderSR: Towards Energy Efficient Image Super-Resolution
* Adversarial Generation of Continuous Images
* Adversarial Imaging Pipelines
* Adversarial Invariant Learning
* Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink
* Adversarial Robustness Across Representation Spaces
* Adversarial Robustness under Long-Tailed Distribution
* Adversarially Adaptive Normalization for Single Domain Generalization
* AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles
* (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
* Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality
* Affective Growth of Computer Vision, The
* Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition
* Affordance Transfer Learning for Human-Object Interaction Detection
* AGORA: Avatars in Geography Optimized for Regression Analysis
* AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
* AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training
* All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training
* Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation
* AlphaMatch: Improving Consistency for Semi-supervised Learning with Alpha-divergence
* Alternative Probabilistic Interpretation of the Huber Loss, An
* Amalgamating Knowledge from Heterogeneous Graph Neural Networks
* Anchor-Constrained Viterbi for Set-Supervised Action Segmentation
* Anchor-Free Person Search
* Animating Pictures with Eulerian Motion Fields
* Anomaly Detection in Video via Self-Supervised and Multi-Task Learning
* ANR: Articulated Neural Rendering for Virtual Avatars
* Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation
* Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation
* Anticipating human actions by correlating past with the future with Jaccard similarity measures
* Anycost GANs for Interactive Image Synthesis and Editing
* AQD: Towards Accurate Quantized Object Detection
* Architectural Adversarial Robustness: The Case for Deep Pursuit
* Are Labels Always Necessary for Classifier Accuracy Evaluation?
* ArtCoder: An End-to-end Method for Generating Scanning-robust Stylized QR Codes
* ArtEmis: Affective Language for Visual Art
* ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows
* ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring
* Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation
* Asymmetric metric learning for knowledge transfer
* ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation
* Attention-guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton
* AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
* Audio-Driven Emotional Video Portraits
* Audio-Visual Instance Discrimination with Cross-Modal Agreement
* Augmentation Strategies for Learning with Noisy Labels
* Auto-Exposure Fusion for Single-Image Shadow Removal
* AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation
* AutoFlow: Learning a Better Training Set for Optical Flow
* AutoInt: Automatic Integration for Fast Neural Volume Rendering
* Automated Log-Scale Quantization for Low-Cost Deep Neural Networks
* Automatic Correction of Internal Units in Generative Neural Networks
* Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization
* Autoregressive Stylized Motion Synthesis with Generative Flow
* BABEL: Bodies, Action and Behavior with English Labels
* Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy
* Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
* Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds
* Backdoor Attacks Against Deep Learning Systems in the Physical World
* Background Splitting: Finding Rare Classes in a Sea of Background
* Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation
* BASAR: Black-box Attack on Skeletal Action Recognition
* BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
* Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression
* BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
* BCNet: Searching for Network Width with Bilaterally Coupled Network
* Behavior-Driven Synthesis of Human Dynamics
* Benchmarking Representation Learning for Natural World Image Collections
* Beyond Bounding-Box: Convex-hull Feature Adaptation for Oriented and Densely Packed Object Detection
* Beyond Image to Depth: Improving Depth Prediction using Echoes
* Beyond Max-Margin: Class Margin Equilibrium for Few-shot Object Detection
* Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
* Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video
* Bi-GCN: Binary Graph Convolutional Network
* BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification
* Bidirectional Projection Network for Cross Dimension Scene Understanding
* Bilateral Grid Learning for Stereo Matching Networks
* Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
* Bilinear Parameterization for Non-Separable Singular Value Penalties
* Binary Graph Neural Networks
* Binary TTC: A Temporal Geofence for Autonomous Navigation
* Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation
* Birds of a Feather: Capturing Avian Shape Models from Images
* Black-box Explanation of Object Detectors via Saliency Maps
* Blessings of Unlabeled Background in Untrimmed Videos, The
* Blind Deblurring for Saturated Images
* Blocks-World Cameras
* Blur, Noise, and Compression Robust Generative Adversarial Networks
* Body Meshes as Points
* Body2Hands: Learning to Infer 3D Hands from Conversational Gesture Body Dynamics
* Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics
* Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging
* Boosting Video Representation Learning with Multi-Faceted Integration
* Bottleneck Transformers for Visual Recognition
* Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression
* Bottom-Up Shift and Reasoning for Referring Image Segmentation
* Boundary IoU: Improving Object-Centric Image Segmentation Evaluation
* BoxInst: High-Performance Instance Segmentation with Box Annotations
* Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net
* BRepNet: A topological message passing system for solid models
* Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
* Bridging the Visual Gap: Wide-Range Image Blending
* Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation
* Calibrated RGB-D Salient Object Detection
* Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias
* Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration
* CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency
* Camouflaged Object Segmentation with Distraction Mining
* Can audio-visual integration strengthen robustness under multimodal attacks?
* Can We Characterize Tasks Without Labels or Features?
* CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild
* Capsule Network is Not More Robust than Convolutional Network
* CapsuleRRT: Relationships-aware Regression Tracking via Capsules
* Capturing Omni-Range Context for Omnidirectional Segmentation
* Cascaded Prediction Network via Segment Tree for Temporal Video Grounding
* CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
* Categorical Depth Distribution Network for Monocular 3D Object Detection
* Causal Attention for Vision-Language Tasks
* Causal Hidden Markov Model for Time Series Disease Forecasting
* CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models
* CDFI: Compression-Driven Network Design for Frame Interpolation
* Center-based 3D Object Detection and Tracking
* CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
* CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation
* ChallenCap: Monocular 3D Capture of Challenging Human Performances using Multi-Modal References
* Checkerboard Context Model for Efficient Learned Image Compression
* Circular-Structured Representation for Visual Emotion Distribution Learning, A
* Class-Aware Robust Adversarial Training for Object Detection
* ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
* CLCC: Contrastive Learning for Color Constancy
* clDice: A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation
* Closed-Form Factorization of Latent Semantics in GANs
* Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection, A
* Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation
* Cloud2Curve: Generation and Vectorization of Parametric Sketches
* Clusformer: A Transformer based Clustering Approach to Unsupervised Large-scale Face and Visual Landmark Recognition
* Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation
* Cluster-wise Hierarchical Generative Model for Deep Amortized Clustering
* Co-Attention for Conditioned Image Matching
* Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos
* Coarse-Fine Networks for Temporal Activity Detection in Videos
* Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization
* Coarse-to-Fine Person Re-Identification with Auxiliary-Domain Classification and Second-Order Information Bottleneck
* CoCoNets: Continuous Contrastive 3D Scene Representations
* CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
* CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo
* CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning
* Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
* ColorRL: Reinforced Coloring for End-to-End Instance Segmentation
* Combinatorial Learning of Graph Edit Distance via Dynamic Embedding
* Combined Depth Space based Architecture Search For Person Re-identification
* Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings
* Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
* Communication Efficient SGD via Gradient Sampling with Bayes Prior
* CoMoGAN: continuous model-guided image-to-image translation
* Compatibility-aware Heterogeneous Visual Search
* Complementary Relation Contrastive Distillation
* Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds
* COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction
* Composing Photos Like a Photographer
* CompositeTasking: Understanding Images by Spatial Composition of Tasks
* Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
* CondenseNet V2: Sparse Feature Reactivation for Deep Networks
* Conditional Bures Metric for Domain Adaptation
* Confluent Vessel Trees with Accurate Bifurcations
* Connecting What to Say With Where to Look by Modeling Human Attention Traces
* Consensus Maximisation Using Influences of Monotone Boolean Functions
* Consistent Instance False Positive Improves Fairness in Face Recognition
* ContactOpt: Optimizing Contact to Improve Grasps
* Content-Aware GAN Compression
* Context Modeling in 3D Human Pose Estimation: A Unified Perspective
* Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
* Context-Aware Layout to Image Generation with Enhanced Object Appearance
* Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
* Continual Learning via Bit-Level Information Preserving
* Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations
* Continuous Face Aging via Self-estimated Residual Age Embedding
* Contrastive Embedding for Generalized Zero-Shot Learning
* Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification
* Contrastive Learning for Compact Single Image Dehazing
* Contrastive Neural Architecture Search with Neural Architecture Comparators
* Controllable Image Restoration for Under-Display Camera in Smartphones
* Controlling the Rain: from Removal to Rendering
* Convolutional Dynamic Alignment Networks for Interpretable Classifications
* Convolutional Hough Matching Networks
* Convolutional Neural Network Pruning with Structural Redundancy Reduction
* Coordinate Attention for Efficient Mobile Network Design
* Correlated Input-Dependent Label Noise in Large-Scale Image Classification
* CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds
* CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback
* Counterfactual VQA: A Cause-Effect Look at Language Bias
* Counterfactual Zero-Shot and Open-Set Visual Recognition
* CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
* CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement
* Cross Modal Focal Loss for RGBD Face Anti-Spoofing
* Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation
* Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation
* Cross-Domain Similarity Learning for Face Recognition in Unseen Domains
* Cross-Iteration Batch Normalization
* Cross-Modal Center Loss for 3D Cross-Modal Retrieval
* Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
* Cross-Modal Contrastive Learning for Text-to-Image Generation
* Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images
* Cross-View Cross-Scene Multi-View Crowd Counting
* Cross-View Gait Recognition with Deep Universal Linear Embeddings
* Cross-View Regularization for Domain Adaptive Panoptic Segmentation
* Crossing cuts polygonal puzzles: Models and Solvers
* CT-Net: Complementary Transfering Network for Garment Transfer with Arbitrary Geometric Changes
* Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images
* Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation
* CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
* Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
* Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
* Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
* D-NeRF: Neural Radiance Fields for Dynamic Scenes
* D2IM-Net: Learning Detail Disentangled Implicit Fields from Single Images
* DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation
* DAP: Detection-Aware Pre-training with Weak Supervision
* DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images
* DAT: Training Deep Networks Robust to Label-Noise by Matching the Feature Distributions
* Data-Free Knowledge Distillation For Image Super-Resolution
* Data-Free Model Extraction
* Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection
* DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort
* DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation
* DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation
* De-rendering the World's Revolutionary Artefacts
* Debiased Subjective Assessment of Real-World Image Enhancement
* Decomposition Model for Stereo Matching, A
* DECOR-GAN: 3D Shape Detailization by Conditional Refinement
* Decoupled Dynamic Filter Networks
* Deep Active Surface Models
* Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
* Deep Animation Video Interpolation in the Wild
* Deep Burst Super-Resolution
* Deep Compositional Metric Learning
* Deep Convolutional Dictionary Learning for Image Denoising
* Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments
* Deep Dual Consecutive Network for Human Pose Estimation
* Deep Emulator for Secondary Motion of 3D Characters, A
* Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging
* Deep Gradient Projection Networks for Pan-sharpening
* Deep Graph Matching under Quadratic Constraint
* Deep Homography for Efficient Stereo Image Compression
* Deep Implicit Moving Least-Squares Functions for 3D Reconstruction
* Deep Implicit Templates for 3D Shape Representation
* Deep Learning in Latent Space for Video Prediction and Compression
* Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
* Deep Lucas-Kanade Homography for Multimodal Image Alignment
* Deep Multi-Task Learning for Joint Localization, Perception, and Prediction
* Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
* Deep Optimized Priors for 3D Shape Modeling and Reconstruction
* Deep Perceptual Preprocessing for Video Coding
* Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition
* Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion
* Deep Stable Learning for Out-Of-Distribution Generalization
* Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity
* Deep Two-View Structure-from-Motion Revisited
* Deep Video Matting via Spatio-Temporal Alignment and Aggregation
* DeepACG: Co-Saliency Detection via Semantic-aware Contrast Gromov-Wasserstein Distance
* DeepI2P: Image-to-Point Cloud Registration via Deep Classification
* DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition
* Deeply Shape-guided Cascade for Instance Segmentation
* DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates
* DeepSurfels: Learning Online Appearance Fusion
* DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
* DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion
* Defending Multimodal Fusion Models against Single-Source Adversaries
* DeFLOCNet: Deep Image Editing via Flexible Low-level Controls
* DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows
* DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
* Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
* Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation
* Delving into Data: Effectively Substitute Training for Black-box Attack
* Delving into Localization Errors for Monocular 3D Object Detection
* Denoise and Contrast for Category Agnostic Shape Completion
* Dense Contrastive Learning for Self-Supervised Visual Pre-Training
* Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
* Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection
* Densely connected multidilated convolutional networks for dense prediction tasks
* Depth Completion using Plane-Residual Representation
* Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries
* Depth from Camera Motion and Object Detection
* Depth-Aware Mirror Segmentation
* Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection
* DER: Dynamically Expandable Representation for Class Incremental Learning
* DeRF: Decomposed Radiance Fields
* Detecting Human-Object Interaction via Fabricated Compositional Learning
* Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark
* DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
* DexYCB: A Benchmark for Capturing Hand Grasping of Objects
* DG-Font: Deformable Generative Networks for Unsupervised Font Generation
* DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
* Dictionary-guided Scene Text Recognition
* Differentiable Diffusion for Dense Depth Estimation from Multi-view Images
* Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
* Differentiable Patch Selection for Image Recognition
* Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation
* Diffusion Probabilistic Models for 3D Point Cloud Generation
* Digital Gimbal: End-to-end Deep Image Stabilization with Learnable Exposure Times
* DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation
* DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for deep neural networks
* Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification
* Discovering Hidden Physics Behind Transport Dynamics
* Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes
* Discovering Relationships between Object Categories via Universal Canonical Maps
* Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching
* Discrimination-Aware Mechanism for Fine-grained Representation Learning
* Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking
* Disentangled Cycle Consistency for Highly-realistic Virtual Try-On
* Disentangling Label Distribution for Long-tailed Visual Recognition
* Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
* Distilling Causal Effect of Data in Class-Incremental Learning
* Distilling Knowledge via Knowledge Review
* Distilling Object Detectors via Decoupled Features
* Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy
* Distribution Alignment: A Unified Framework for Long-tail Visual Recognition
* Distribution-aware Adaptive Multi-bit Quantization
* DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
* Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition
* Divergence Optimization for Noisy Universal Domain Adaptation
* Diverse Branch Block: Building a Convolution as an Inception-like Unit
* Diverse Part Discovery: Occluded Person Re-identification with Part-Aware Transformer
* Diverse Semantic Image Synthesis via Probability Distribution Modeling
* Diversifying Sample Generation for Accurate Data-Free Quantization
* Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction
* DoDNet: Learning to Segment Multi-Organ and Tumors from Multiple Partially Labeled Datasets
* Dogfight: Detecting Drones from Drones Videos
* Domain Adaptation with Auxiliary Target Domain-Oriented Classifier
* Domain Consensus Clustering for Universal Domain Adaptation
* Domain-Independent Dominance of Adaptive Methods
* Domain-robust VQA with diverse datasets and methods but no target labels
* Domain-Specific Suppression for Adaptive Object Detection
* DOTS: Decoupling Operation and Topology in Differentiable Architecture Search
* Double low-rank representation with projection distance penalty for clustering
* Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
* DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation
* DriveGAN: Towards a Controllable High-Quality Neural Simulation
* DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency
* DSRNA: Differentiable Search of Robust Neural Architectures
* Dual Attention Guided Gaze Target Detection in the Wild
* Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World
* Dual Contradistinctive Generative Autoencoder
* Dual Iterative Refinement Method for Non-rigid Shape Matching, A
* Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration
* Dual-GAN: Joint BVP and Noise Modeling for Remote Physiological Measurement
* Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning
* DualAST: Dual Style-Learning Networks for Artistic Style Transfer
* DualGraph: A graph-based method for reasoning about label noise
* DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
* DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate Multi-Camera Multiple Object Tracking
* Dynamic Class Queue for Large Scale Face Recognition In the Wild
* Dynamic Domain Adaptation for Efficient Inference
* Dynamic Head: Unifying Object Detection Heads with Attentions
* Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales
* Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
* Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation
* Dynamic Region-Aware Convolution
* Dynamic Slimmable Network
* Dynamic Transfer for Multi-Source Domain Adaptation
* Dynamic Weighted Learning for Unsupervised Domain Adaptation
* DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping
* ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning
* EDNet: Efficient Disparity Estimation with Cost Volume Combination and Attention-based Spatial Residual
* Effective Snapshot Compressive-spectral Imaging via Deep Denoising and Total Variation Priors
* Effective Sparsification of Neural Networks with Global Sparsity Constraint
* Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
* Efficient deformable shape correspondence via multiscale spectral manifold wavelets preservation
* Efficient Feature Transformations for Discriminative and Generative Continual Learning
* Efficient Initial Pose-graph Generation for Global SfM
* Efficient Multi-Stage Video Denoising with Recurrent Spatio-Temporal Fusion
* Efficient Object Embedding for Spliced Image Retrieval
* Efficient Regional Memory Network for Video Object Segmentation
* EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation
* Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos
* Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
* Embedding Transfer with Label Relaxation for Improved Metric Learning
* Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding
* Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation
* Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
* End-to-end High Dynamic Range Camera Pipeline Optimization
* End-to-End Human Object Interaction Detection with HOI Transformer
* End-to-End Human Pose and Mesh Reconstruction with Transformers
* End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution
* End-to-End Object Detection with Fully Convolutional Network
* End-to-End Rotation Averaging with Multi-Source Propagation
* End-to-End Video Instance Segmentation with Transformers
* EnD: Entangling and Disentangling deep representations for bias correction
* Energy-Based Learning for Scene Graph Generation
* Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods
* Enhancing the Transferability of Adversarial Attacks through Variance Tuning
* Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings
* Ensembling with Deep Generative Views
* Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection
* Equivariant Point Network for 3D Point Cloud Analysis
* Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers
* EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation
* Event-based Bispectral Photometry using Temporally Modulated Illumination
* Event-based Synthetic Aperture Imaging with a Hybrid Network
* EventZoom: Learning to Denoise and Super Resolve Neuromorphic Events
* Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation
* Exemplar-Based Open-Set Panoptic Segmentation Network
* Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball
* Explicit Knowledge Incorporation for Visual Reasoning
* Exploit Visual Dependency Relations for Semantic Segmentation
* Exploiting & Refining Depth Distributions with Triangulation Light Curtains
* Exploiting Aliasing for Manga Restoration
* Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis
* Exploiting Semantic Embedding and Visual Feature for Facial Action Unit Detection
* Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
* Explore Image Deblurring via Encoded Blur Kernel Space
* Exploring Adversarial Fake Images on Face Manifold
* Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation
* Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
* Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
* Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
* Exploring Intermediate Representation for Monocular Vehicle Pose Estimation
* Exploring Simple Siamese Representation Learning
* Exploring Sparsity in Image Super-Resolution for Efficient Inference
* Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning
* Extreme Low-Light Environment-Driven Image Denoising over Permanently Shadowed Lunar Regions with a Physical Noise Model
* Extreme Rotation Estimation using Dense Correlation Volumes
* Face Forensics in the Wild
* Face Forgery Detection by 3D Decomposition
* FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains
* FACESEC: A Fine-grained Robustness Evaluation Framework for Face Recognition Systems
* Facial Action Unit Detection With Transformers
* FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation
* Fair Attribute Classification through Latent Space De-biasing
* Fair Feature Distillation for Visual Recognition
* FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter
* Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification
* Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
* Fast and Accurate Model Scaling
* Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network
* Fast end-to-end learning on protein surfaces
* Fast Sinkhorn Filters: Using Matrix Scaling for Non-Rigid Shape Correspondence with Functional Maps
* Faster Meta Update Strategy for Noise-Robust Deep Learning
* FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise
* FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining
* FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions
* Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition
* Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion
* FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
* FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
* Few-shot 3D Point Cloud Semantic Segmentation
* Few-Shot Classification with Feature Map Reconstruction Networks
* Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling
* Few-shot Image Generation via Cross-domain Correspondence
* Few-Shot Incremental Learning with Continually Evolved Classifiers
* Few-Shot Object Detection via Classification Refinement and Distractor Retreatment
* Few-shot Open-set Recognition by Transformation Consistency
* Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
* Few-Shot Transformation of Common Actions into Time and Space
* FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
* Fine-grained Angular Contrastive Learning with Coarse Labels
* Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification
* Fingerspelling Detection in American Sign Language
* FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation
* Flow Guided Transformable Bottleneck Networks for Motion Retargeting
* Flow-based Kernel Prior with Application to Blind Super-Resolution
* Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset
* FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
* Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
* Forecasting Irreversible Disease via Progression Learning
* ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
* Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
* Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules
* Fourier Contour Embedding for Arbitrary-Shaped Text Detection
* Fourier-based Framework for Domain Generalization, A
* FP-NAS: Fast Probabilistic Neural Architecture Search
* FrameExit: Conditional Early Exiting for Efficient Video Recognition
* Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection
* From Points to Multi-Object 3D Reconstruction
* From Rain Generation to Rain Removal
* From Semantic Categories to Fixations: A Novel Weakly-supervised Visual-auditory Saliency Detection Approach
* From Shadow Generation to Shadow Removal
* From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation
* FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism
* FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
* FSDR: Frequency Space Domain Randomization for Domain Generalization
* Fully Convolutional Networks for Panoptic Segmentation
* Fully Convolutional Scene Graph Generation
* Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
* Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
* functional approach to rotation equivariant non-linearities for Tensor Field Networks, A
* Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty
* FVC: An End-to-End Framework Towards Deep Video Compression in Feature Space
* GAIA: A Transfer Learning System of Object Detection that Fits Your Needs
* GAN Prior Embedded Network for Blind Face Restoration in the Wild
* GANmut: Learning Interpretable Conditional Space for Gamut of Emotions
* Gated Spatio-Temporal Attention-Guided Video Deblurring
* GATSBI: Generative Agent-centric Spatio-temporal Object Interaction
* Gaussian Context Transformer
* GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
* General Instance Distillation for Object Detection
* General Multi-label Image Classification with Transformers
* Generalizable Pedestrian Detection: The Elephant In The Room
* Generalizable Person Re-identification with Relevance-aware Mixture of Experts
* Generalization on Unseen Domains via Inference-time Label-Preserving Target Projections
* Generalized Domain Adaptation
* Generalized Few-Shot Object Detection without Forgetting
* Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
* Generalized Loss Function for Crowd Counting and Localization, A
* Generalizing Face Forgery Detection with High-frequency Features
* Generalizing to the Open World: Deep Visual Odometry with Online Adaptation
* Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE
* Generating Manga from Illustrations via Mimicking Manga Creation Workflow
* Generative Classifiers as a Basis for Trustworthy Image Classification
* Generative Hierarchical Features from Synthesizing Images
* Generative Interventions for Causal Learning
* Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
* Generic Perceptual Loss for Modeling Structured Output Dependencies
* Geo-FARM: Geodesic Factor Regression Model for Misaligned Pre-shape Responses in Statistical Shape Analysis
* GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving
* GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
* Glance and Gaze: Inferring Action-aware Points for One-Stage Human-Object Interaction Detection
* Glancing at the Patch: Anomaly Localization with Global and Local Feature Comparison
* GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition
* GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
* Global Transport for Fluid Reconstruction with Learned Self-Supervision
* Global2Local: Efficient Structure Search for Video Action Segmentation
* Globally Optimal Relative Pose Estimation with Gravity Prior
* GMOT-40: A Benchmark for Generic Multiple Object Tracking
* Goal-Oriented Gaze Estimation for Zero-Shot Learning
* Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
* Gradient-based Algorithms for Machine Teaching
* Graph Attention Tracking
* Graph Stacked Hourglass Networks for 3D Human Pose Estimation
* Graph-based High-Order Relation Discovery for Fine-grained Recognition
* Graph-based High-order Relation Modeling for Long-term Action Recognition
* Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction
* GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection
* Group Collaborative Learning for Co-Salient Object Detection
* Group Whitening: Balancing Learning Efficiency and Representational Capacity
* Group-aware Label Transfer for Domain Adaptive Person Re-identification
* Guided Integrated Gradients: An Adaptive Path Method for Removing Noise
* Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps
* Hallucination Improves Few-Shot Object Detection
* Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning
* Harmonious Semantic Line Detection via Maximal Weight Clique Selection
* HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
* HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps
* HDR Environment Map Estimation for Real-Time Augmented Reality
* Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures, The
* Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation
* Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
* Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment
* Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation
* Hierarchical Motion Understanding via Motion Programs
* Hierarchical Video Prediction using Relational Layouts for Human-Object Interactions
* High-Fidelity and Arbitrary Face Editing
* High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation
* High-Fidelity Neural Human Motion Transfer from Monocular Video
* High-Quality Stereo Image Restoration from Double Refraction
* High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
* High-speed Image Reconstruction through Short-term Plasticity for Spiking Cameras
* Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
* Hilbert Sinkhorn Divergence for Optimal Transport
* HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms
* HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
* HLA-Face: Joint High-Low Adaptation for Low Light Face Detection
* HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
* Holistic 3D Human and Scene Mesh Estimation from Single View Images
* Holistic 3D Scene Understanding from a Single Image with Implicit Representation
* Home Action Genome: Cooperative Compositional Action Understanding
* HOTR: End-to-End Human-Object Interaction Detection with Transformers
* HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
* House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects
* How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?
* How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines
* How Robust are Randomized Smoothing based Defenses to Data Poisoning?
* How to Exploit the Transferability of Learned Image Compression to Conventional Codecs
* How Transferable are Reasoning Patterns in VQA?
* How Well Do Self-Supervised Models Transfer?
* How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language
* HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
* Human De-occlusion: Invisible Perception and Recovery for Humans
* Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors
* Human-like Controllable Image Captioning with Verb-specific Semantic Roles
* HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
* Humble Teachers Teach Better Students for Semi-Supervised Object Detection
* HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
* Hybrid Message Passing with Performance-Driven Structures for Facial Action Unit Detection
* Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach
* HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation
* Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation
* Hyperbolic-to-Hyperbolic Graph Convolutional Network, A
* Hyperdimensional computing as a framework for systematic aggregation of image descriptors
* HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation
* i3DMM: Deep Implicit 3D Morphable Model of Human Heads
* I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors
* IBRNet: Learning Multi-View Image-Based Rendering
* ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
* IIRC: Incremental Implicitly-Refined Classification
* Im2Vec: Synthesizing Vector Graphics without Vector Supervision
* Image Change Captioning by Learning from an Auxiliary Task
* Image Generators with Conditionally-Independent Pixel Synthesis
* Image Inpainting Guided by Coherence Priors of Semantics and Textures
* Image Inpainting with External-internal Learning and Monochromic Bottleneck
* Image Restoration for Under-Display Camera
* Image Super-Resolution with Non-Local Sparse Attention
* Image-to-image Translation via Hierarchical Style Disentanglement
* IMAGINE: Image Synthesis by Image-Guided Model Inversion
* img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
* iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
* IMODAL: creating learnable user-defined deformation models
* Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter
* Improved Handling of Motion Blur in Online Object Detection
* Improved Image Matting via Real-time User Clicks and Uncertainty Estimation
* Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution
* Improving Calibration for Long-Tailed Recognition
* Improving Multiple Object Tracking with Single Object Tracking
* Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling
* Improving OCR-based Image Captioning by Incorporating Geometrical Relationship
* Improving Panoptic Segmentation at All Scales
* Improving Sign Language Translation with Monolingual Data by Sign Back-Translation
* Improving the Efficiency and Robustness of Deepfakes Detection through Precise Geometric Features
* Improving the Transferability of Adversarial Samples with Adversarial Transformations
* Improving Transferability of Adversarial Patches on Face Recognition with Generative Models
* Improving Unsupervised Image Clustering With Robust Learning
* Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation
* In the light of feature distributions: moment matching for Neural Style Transfer
* Inception Convolution with Efficient Dilation Search
* Incremental Few-Shot Instance Segmentation
* Incremental Learning via Rate Reduction
* Indoor Lighting Estimation using an Event Camera
* Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
* Inferring CAD Modeling Sequences Using Zone Graphs
* Information Bottleneck Disentanglement for Identity Swapping
* Information-Theoretic Segmentation by Inpainting Error Maximization
* Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
* Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation
* Instance Localization for Self-supervised Detection Pretraining
* Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
* Intelligent Carpet: Inferring 3D Human Pose from Tactile Signals
* Intentonomy: a Dataset and Study towards Human Intent Understanding
* Interactive Self-Training with Mean Teachers for Semi-supervised Object Detection
* Interpolation-based Semi-supervised Learning for Object Detection
* Interpretable Social Anchors for Human Trajectory Forecasting in Crowds
* Interpreting Super-Resolution Networks with Local Attribution Maps
* Interventional Video Grounding with Dual Contrastive Learning
* Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
* Intrinsic Image Harmonization
* Introvert: Human Trajectory Prediction via Conditional 3D Attention
* Inverse Simulation: Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control
* InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
* Invertible Denoising Network: A Light Solution for Real Noise Removal
* Invertible Image Signal Processing
* Inverting Generative Adversarial Renderer for Face Reconstruction
* Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect
* Involution: Inverting the Inherence of Convolution for Visual Recognition
* IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking
* IQDet: Instance-wise Quality Distribution Sampling for Object Detection
* IronMask: Modular Architecture for Protecting Deep Face Template
* Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations
* Isometric Multi-Shape Matching
* Iterative Filter Adaptive Network for Single Image Defocus Deblurring
* Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
* iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression
* Jigsaw Clustering for Unsupervised Visual Representation Learning
* Jo-SRC: A Contrastive Approach for Combating Noisy Labels
* Joint Deep Model-based MR Image and Coil Sensitivity Reconstruction Network (Joint-ICNet) for Fast MRI
* Joint Generative and Contrastive Learning for Unsupervised Person Re-identification
* Joint Learning of 3D Shape Retrieval and Deformation
* Joint Negative and Positive Learning for Noisy Labels
* Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification
* Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation
* Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
* Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
* KeepAugment: A Simple Information-Preserving Data Augmentation Approach
* Keypoint-graph-driven learning framework for object pose estimation
* KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
* Knowledge Evolution in Neural Networks
* KOALAnet: Blind Super-Resolution using Kernel-Oriented Adaptive Local Adjustment
* KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
* KSM: Fast Multiple Task Adaption via Kernel-wise Soft Mask Learning
* L2M-GAN: Learning to Manipulate Latent Space Semantics for Facial Attribute Editing
* Labeled from Unlabeled: Exploiting Unlabeled Data for Few-shot Deep HDR Deghosting
* LAFEAT: Piercing Through Adversarial Defenses with Latent Features
* Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search
* LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents
* Large-capacity Image Steganography Based on Invertible Neural Networks
* Large-scale Localization Datasets in Crowded Indoor Spaces
* Large-Scale Study on Unsupervised Spatiotemporal Representation Learning, A
* LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution
* Layer-wise Searching for 1-bit Detectors
* Layerwise Optimization by Gradient Decomposition for Continual Learning
* Layout-Guided Novel View Synthesis from a Single Indoor Panorama
* LayoutGMN: Neural Graph Matching for Structural Layout Similarity
* LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity
* LEAP: Learning Articulated Occupancy of People
* Learnable Companding Quantization for Accurate Low-bit Neural Networks
* Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking
* Learnable Motion Coherence for Correspondence Pruning
* Learned Initializations for Optimizing Coordinate-Based Neural Representations
* Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification
* Learning a Facial Expression Embedding Disentangled from Identity
* Learning a Non-blind Deblurring Network for Night Blurry Images
* Learning a Proposal Classifier for Multiple Object Tracking
* Learning a Self-Expressive Network for Subspace Clustering
* Learning Accurate Dense Correspondences and When to Trust Them
* Learning Affinity-Aware Upsampling for Deep Image Matting
* Learning An Explicit Weighting Scheme for Adapting Complex HSI Noise
* Learning Asynchronous and Sparse Human-Object Interaction in Videos
* Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation
* Learning by Aligning Videos in Time
* Learning by Planning: Language-Guided Global Image Editing
* Learning by Watching
* Learning Calibrated Medical Image Segmentation via Multi-rater Agreement Modeling
* Learning Camera Localization via Dense Scene Matching
* Learning Complete 3D Morphable Face Models from Images and Videos
* Learning Compositional Radiance Fields of Dynamic Human Heads
* Learning Compositional Representation for 4D Captures with Neural ODE
* Learning Continuous Image Representation with Local Implicit Image Function
* Learning Cross-Modal Retrieval with Noisy Labels
* Learning Decision Trees Recurrently Through Communication
* Learning Deep Classifiers Consistent with Fine-Grained Novelty Detection
* Learning Deep Latent Variable Models by Short-Run MCMC Inference with Optimal Transport Correction
* Learning Delaunay Surface Elements for Mesh Reconstruction
* Learning Discriminative Prototypes with Dynamic Time Warping
* Learning Dynamic Alignment via Meta-filter for Few-shot Learning
* Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation
* Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking
* Learning Feature Aggregation for Deep 3D Morphable Models
* Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
* Learning from the Master: Distilling Cross-modal Advanced Knowledge for Lip Reading
* Learning Goals from Failure
* Learning Graph Embeddings for Compositional Zero-shot Learning
* Learning Graphs for Knowledge Transfer with Limited Labels
* Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos
* Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation
* Learning monocular 3D reconstruction of articulated categories from motion
* Learning Multi-Scale Photo Exposure Correction
* Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis
* Learning Normal Dynamics in Videos with Meta Prototype Network
* Learning Optical Flow from a Few Matches
* Learning optical flow from still images
* Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction
* Learning Placeholders for Open-Set Recognition
* Learning Position and Target Consistency for Memory-based Video Object Segmentation
* Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression
* Learning Progressive Point Embeddings for 3D Point Cloud Generation
* Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
* Learning Scalable ℓ∞-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression
* Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution
* Learning Semantic Person Image Generation by Region-Adaptive Normalization
* Learning Semantic-Aware Dynamics for Video Prediction
* Learning Spatial-Semantic Relationship for Facial Attribute Recognition with Limited Labeled Data
* Learning Spatially-Variant MAP Models for Non-blind Image Deblurring
* Learning Statistical Texture for Semantic Segmentation
* Learning Student Networks in the Wild
* Learning Temporal Consistency for Low Light Video Enhancement from Single Images
* Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction
* Learning the Best Pooling Strategy for Visual Semantic Embedding
* Learning the Non-differentiable Optimization for Blind Super-Resolution
* Learning the Predictability of the Future
* Learning the Superpixel in a Non-iterative and Lifelong Manner
* Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection
* Learning to Associate Every Segment for Video Panoptic Segmentation
* Learning To Count Everything
* Learning to Filter: Siamese Relation Network for Robust Tracking
* Learning to Fuse Asymmetric Feature Maps in Siamese Trackers
* Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification
* Learning to Identify Correct 2D-2D Line Correspondences on Sphere
* Learning to Predict Visual Attributes in the Wild
* Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild
* Learning to Reconstruct High Speed and High Dynamic Range Videos from Events
* Learning to Recover 3D Scene Shape from a Single Image
* Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation
* Learning to Restore Hazy Video: A New Real-World Dataset and A New Method
* Learning to Segment Actions from Visual and Language Instructions via Differentiable Weak Sequence Alignment
* Learning to Segment Rigid Motions from Two Frames
* Learning to Track Instances without Video Annotations
* Learning to Warp for Style Transfer
* Learning Triadic Belief Dynamics in Nonverbal Communication from Videos
* Learning View Selection for 3D Scenes
* Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
* Learning-based Image Registration with Meta-Regularization
* LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering
* Lesion-Aware Transformers for Diabetic Retinopathy Grading
* Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling
* Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms
* Leveraging Line-point Consistence to Preserve Structures for Wide Parallax Image Stitching
* Leveraging the Availability of Two Cameras for Illuminant Estimation
* LiBRe: A Practical Bayesian Approach to Adversarial Detection
* LiDAR R-CNN: An Efficient and Universal 3D Object Detector
* LiDAR-Aug: A General Rendering-based Augmentation Framework for 3D Object Detection
* LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
* Lifelong Person Re-Identification via Adaptive Knowledge Accumulation
* Lifting 2D StyleGAN for 3D-Aware Face Generation
* Light Field Super-Resolution with Zero-Shot Learning
* LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search
* Limitations of Post-Hoc Feature Alignment for Robustness
* Line Segment Detection Using Transformers without Edges
* Linear Semantics in Generative Adversarial Networks
* Linguistic Structures as Weak Supervision for Visual Scene Graph Generation
* Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection
* Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer
* LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization
* Lite-HRNet: A Lightweight High-Resolution Network
* Localizing Visual Sounds the Hard Way
* Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration
* Locate then Segment: A Strong Pipeline for Referring Image Segmentation
* LoFTR: Detector-Free Local Feature Matching with Transformers
* LOHO: Latent Optimization of Hairstyles via Orthogonalization
* Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-balanced Samplings
* Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding
* Look Before you Speak: Visually Contextualized Utterances
* Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
* Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation
* Lottery Ticket Hypothesis for Object Recognition, The
* Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models, The
* LPSNet: A lightweight solution for fast panoptic segmentation
* LQF: Linear Quadratic Fine-Tuning
* M3DSSD: Monocular 3D Single Stage Object Detector
* M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training
* MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes
* MagFace: A Universal Representation for Face Recognition and Quality Assessment
* Magic Layouts: Structural Prior for Component Detection in User Interface Designs
* Manifold Regularized Dynamic Network Pruning
* ManipulaTHOR: A Framework for Visual Object Manipulation
* MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution
* Mask Guided Matting via Progressive Refinement Network
* Mask-Embedded Discriminator with Region-based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis
* Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
* Masksembles for Uncertainty Estimation
* MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
* MaxUp: Lightweight Adversarial Training with Data Augmentation Improves Neural Network Training
* MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation
* MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking
* MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
* Memory Oriented Transfer Learning for Semi-Supervised Image Deraining
* Memory-Efficient Network for Large-scale Video Compressive Sensing
* Memory-guided Unsupervised Image-to-image Translation
* Mesh Saliency: An Independent Perceptual Measure or A Derivative of Image Saliency?
* Mesoscopic photogrammetry with an unstabilized phone camera
* Meta Batch-Instance Normalization for Generalizable Person Re-Identification
* Meta Pseudo Labels
* Meta-Mining Discriminative Samples for Kinship Verification
* MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation
* MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation
* Metadata Normalization
* MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition
* MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition
* MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing
* MetaSets: Meta-Learning on Point Sets for Generalizable Representations
* MetricOpt: Learning to Optimize Black-Box Evaluation Metrics
* Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner
* Mining Better Samples for Contrastive Learning of Temporal Correspondence
* Mirror3D: Depth Refinement for Mirror Surfaces
* MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
* MIST: Multiple Instance Spatial Transformer
* Mitigating Face Recognition Bias via Group Adaptive Classifier
* Mixed-Privacy Forgetting in Deep Networks
* MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
* Model-Aware Gesture-to-Gesture Translation
* Model-based 3D Hand Reconstruction via Self-Supervised Learning
* Model-Contrastive Federated Learning
* Modeling Multi-Label Action Dependencies for Temporal Action Localization
* Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
* Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis
* MongeNet: Efficient Sampler for Geometric Deep Learning
* Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks
* Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
* Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model
* Monocular Real-time Full Body Capture with Inter-part Correlations
* Monocular Reconstruction of Neural Face Reflectance Fields
* MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
* MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
* Monte Carlo Scene Search for 3D Scene Understanding
* MOOD: Multi-level Out-of-distribution Detection
* More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
* MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space
* MOST: A Multi-Oriented Scene Text Detector with Localization Refinement
* Motion Representations for Articulated Animation
* MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions
* MoViNets: Mobile Video Networks for Efficient Video Recognition
* MP3: A Unified Model to Map, Perceive, Predict and Plan
* MR Image Super-Resolution with Squeeze and Excitation Reasoning Attention Network
* Multi-attentional Deepfake Detection
* Multi-Decoding Deraining Network and Quasi-Sparsity Based Training
* Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
* Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
* Multi-Label Learning from Single Positive Labels
* Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
* Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval
* Multi-Objective Interpolation Training for Robustness to Label Noise
* Multi-person Implicit Reconstruction from a Single Image
* Multi-Perspective LSTM for Joint Visual Representation Learning
* Multi-Scale Aligned Distillation for Low-Resolution Detection
* Multi-shot Temporal Event Localization: a Benchmark
* Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation
* Multi-stage Aggregated Transformer Network for Temporal Language Localization in Videos
* Multi-Stage Progressive Image Restoration
* Multi-Target Domain Adaptation with Collaborative Consistency Learning
* Multi-Task Network for Joint Specular Highlight Detection and Removal, A
* Multi-Temporal Urban Development SpaceNet Dataset, The
* Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance
* Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks
* Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo
* MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
* MultiLink: Multi-class Structure Recovery via Agglomerative Clustering and Model Selection
* Multimodal Contrastive Training for Visual Representation Learning
* Multimodal Motion Prediction with Stacked Transformers
* Multiple Instance Active Learning for Object Detection
* Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles
* Multiple Object Tracking with Correlation Learning
* Multiplexed Network for End-to-End, Multilingual OCR, A
* Multiresolution Knowledge Distillation for Anomaly Detection
* Multispectral Photometric Stereo for Spatially-Varying Spectral Reflectances: A well posed problem?
* MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation
* Mutual CRF-GNN for Few-shot Learning
* Mutual Graph Learning for Camouflaged Object Detection
* Natural Adversarial Examples
* Navigating the GAN Parameter Space for Semantic Image Editing
* NBNet: Noise Basis Learning for Image Denoising with Subspace Projection
* Nearest Neighbor Matching for Deep Clustering
* Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images
* Neighborhood Contrastive Learning for Novel Class Discovery
* Neighborhood Normalization for Robust Geometric Feature Learning
* NeRD: Neural 3D Reflection Symmetry Detector
* NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections
* NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis
* NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization
* Network Pruning via Performance Maximization
* Network Quantization with Element-wise Gradient Scaling
* Neural Architecture Search with Random Labels
* Neural Auto-Exposure for High-Dynamic Range Object Detection
* Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
* Neural Camera Simulators
* Neural Cellular Automata Manifold
* Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction
* Neural Descent for Visual 3D Human Pose and Shape
* Neural Feature Search for RGB-Infrared Person Re-Identification
* Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes
* Neural Lumigraph Rendering
* Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
* Neural Prototype Trees for Interpretable Fine-grained Image Recognition
* Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation
* Neural Response Interpretation through the Lens of Critical Pathways
* Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes
* Neural Scene Graphs for Dynamic Scenes
* Neural Side-By-Side: Predicting Human Preferences for No-Reference Super-Resolution Evaluation
* Neural Splines: Fitting 3D Surfaces with Infinitely-Wide Neural Networks
* Neural Surface Maps
* Neural Tangent Link Between CNN Denoisers and Non-Local Filters, The
* NeuralFusion: Online Depth Fusion in Latent Space
* NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
* NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video
* NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go
* NeuTex: Neural Texture Mapping for Volumetric Neural Rendering
* NewtonianVAE: Proportional Control and Goal Identification from Pixels via Physical Latent Spaces
* NeX: Real-time View Synthesis with Neural Basis Expansion
* NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
* Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects
* No frame left behind: Full Video Action Recognition
* No Shadow Left Behind: Removing Objects and their Shadows using Approximate Lighting and Geometry
* Noise-resistant Deep Metric Learning with Ranking-based Instance Selection
* Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation
* Normal Integration via Inverse Plane Fitting with Minimum Point-to-Plane Distance
* NormalFusion: Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning
* Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement
* Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction
* NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration
* Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
* Object classification from randomized EEG trials
* Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations
* Objects are Different: Flexible Monocular 3D Object Detection
* OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning
* OCONet: Image Extrapolation by Object Completion
* Offboard 3D Object Detection from Point Cloud Sequences
* Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning
* Omnimatte: Associating Objects and Their Effects in Video
* On Feature Normalization and Data Augmentation
* On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
* On Learning the Geodesic Path for Incremental Learning
* On Robustness and Transferability of Convolutional Neural Networks
* On Self-Contact and Human Pose
* On Semantic Similarity in Video Retrieval
* On the Difficulty of Membership Inference Attacks
* One Shot Face Swapping on Megapixels
* One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation
* One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing
* One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking
* Online Learning of a Probabilistic and Adaptive Scene Representation
* Online Multiple Object Tracking with Cross-Task Synergy
* OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
* Open Domain Generalization with Domain-Augmented Meta-Learning
* Open World Compositional Zero-Shot Learning
* Open-book Video Captioning with Retrieve-Copy-Generate Network
* Open-Vocabulary Object Detection Using Captions
* OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World
* OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets
* Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs
* Optimal Quantization using Scaled Codebook
* ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning
* Orthogonal Over-Parameterized Training
* OSTeC: One-Shot Texture Completion
* OTA: Optimal Transport Assignment for Object Detection
* OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations
* Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces
* Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
* PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds
* PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation
* Panoptic Segmentation Forecasting
* Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation
* Panoramic Image Reflection Removal
* Pareidolia Face Reenactment
* Pareto Self-Supervised Training for Few-Shot Learning
* Parser-Free Virtual Try-on via Distilling Appearance Flows
* Part-aware Panoptic Segmentation
* Partial Feature Selection and Alignment for Multi-Source Domain Adaptation
* Partial Person Re-identification with Part-Part Correspondence Learning
* Partially View-aligned Representation Learning with Noise-robust Contrastive Loss
* Partition-Guided GANs
* Passive Inter-Photon Imaging
* Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
* Patch-VQ: Patching Up the Video Quality Problem
* Patch2Pix: Epipolar-Guided Pixel-Level Correspondences
* PatchMatch-Based Neighborhood Consensus for Semantic Correspondence
* PatchmatchNet: Learned Multi-View Patchmatch Stereo
* Patchwise Generative ConvNet: Training Energy-Based Models from a Single Natural Image for Internal Learning
* PAUL: Procrustean Autoencoder for Unsupervised Lifting
* PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
* PD-GAN: Probabilistic Diverse GAN for Image Inpainting
* Pedestrian and Ego-vehicle Trajectory Prediction from Monocular Camera
* Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts, A
* Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing
* Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation with Manipulable Semantics
* Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks
* Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification
* Person Re-identification using Heterogeneous Local Graph Attention Networks
* Person30K: A Dual-Meta Generalization Network for Person Re-Identification
* Personalized Outfit Recommendation with Learnable Anchors
* PGT: A Progressive Method for Training Models on Long Videos
* PhD Learning: Learning with Pompeiu-hausdorff Distances for Video-based Vehicle Re-Identification
* PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and Relighting
* Physically-aware Generative Network for 3D Shape Modeling
* Physics-based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging
* pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
* Picasso: A CUDA-based Library for Deep Learning over 3D Meshes
* PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering
* PISE: Person Image Synthesis and Editing with Decoupled GAN
* Pixel Codec Avatars
* Pixel-aligned Volumetric Avatars
* Pixel-wise Anomaly Detection in Complex Driving Scenes
* pixelNeRF: Neural Radiance Fields from One or Few Images
* PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training
* PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss
* Plan2Scene: Converting Floorplans to 3D Scenes
* Playable Video Generation
* PLOP: Learning without Forgetting for Continual Semantic Segmentation
* PlückerNet: Learn to Register 3D Line Reconstructions
* PML: Progressive Margin Loss for Long-tailed Age Classification
* PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
* Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos
* Point Cloud Instance Segmentation using Probabilistic Embeddings
* Point Cloud Upsampling via Disentangled Refinement
* Point2Skeleton: Learning Skeletal Representations from Point Clouds
* PointAugmenting: Cross-Modal Augmentation for 3D Object Detection
* PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency
* PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation
* PointGuard: Provably Robust 3D Point Cloud Classification
* PointNetLK Revisited
* Points as Queries: Weakly Semi-supervised Object Detection by Points
* Polarimetric Normal Stereo
* Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo
* Polygonal Building Extraction by Frame Field Learning
* Polygonal Point Set Tracking
* Populating 3D Scenes by Learning Human-Scene Interaction
* Pose Recognition with Cascade Transformers
* Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
* Pose-Guided Human Animation from a Single Image in the Wild
* PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
* POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture
* Positional Encoding as Spatial Inductive Bias in GANs
* Positive Sample Propagation along the Audio-Visual Event Line
* Positive-Congruent Training: Towards Regression-Free Model Updates
* Positive-Unlabeled Data Purification in the Wild for Object Detection
* Post-hoc Uncertainty Calibration for Domain Drift Scenarios
* Posterior Promoted GAN with Distribution Discriminator for Unsupervised Image Synthesis
* PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency
* PQA: Perceptual Question Answering
* Practical Single-Image Super-Resolution Using Look-Up Table
* Practical Wide-Angle Portraits Correction with Deep Structured Models
* Pre-Trained Image Processing Transformer
* PREDATOR: Registration of 3D Point Clouds with Low Overlap
* Predicting Human Scanpaths in Visual Question Answering
* Primitive Representation Learning for Scene Text Recognition
* Prior Based Human Completion
* Prioritized Architecture Sampling with Monto-Carlo Tree Search
* Privacy Preserving Localization and Mapping from Uncalibrated Cameras
* Privacy-preserving Collaborative Learning with Automatic Transformation Search
* Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings
* Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
* Probabilistic Embeddings for Cross-Modal Retrieval
* Probabilistic Model Distillation for Semantic Correspondence
* Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
* Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services
* Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
* Progressive Contour Regression for Arbitrary-Shape Scene Text Detection
* Progressive Domain Expansion Network for Single Domain Generalization
* Progressive Modality Reinforcement for Human Multimodal Emotion Recognition from Unaligned Multimodal Sequences
* Progressive Semantic Segmentation
* Progressive Semantic-Aware Style Transformation for Blind Face Restoration
* Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement
* Progressive Temporal Feature Alignment Network for Video Inpainting
* Progressive Unsupervised Learning for Visual Object Tracking
* Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow
* Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation
* Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
* ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
* Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attacks
* Prototype Augmentation and Self-Supervision for Incremental Learning
* Prototype Completion with Primitive Knowledge for Few-Shot Learning
* Prototype-Guided Saliency Feature Learning for Person Search
* Prototype-supervised Adversarial Network for Targeted Attack of Deep Hashing
* Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation
* Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation
* PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors
* Pseudo 3D Auto-Correlation Network for Real Image Denoising
* Pseudo Facial Generation with Extreme Poses for Face Recognition
* PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery
* PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks
* Pulsar: Efficient Sphere-based Neural Rendering
* Pushing it out of the Way: Interactive Visual Navigation
* PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
* PVGNet: A Bottom-Up One-Stage 3D Object Detector with Integrated Multi-Level Features
* PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization
* QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
* QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information
* QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
* Quality-Agnostic Image Recognition via Invertible Decoder
* Quantifying Explainers of Graph Neural Networks in Computational Pathology
* Quantum Permutation Synchronization
* Quasi-Dense Similarity Learning for Multiple Object Tracking
* Quasiconvex Formulation for Radial Cameras, A
* Radar-Camera Pixel Depth Association for Depth Completion
* RAFT-3D: Scene Flow using Rigid-Motion Embeddings
* Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation
* Rainbow Memory: Continual Learning with a Memory of Diverse Samples
* RangeIoUDet: Range Image based Real-Time 3D Object Detector Optimized by Intersection over Union
* Rank-One Prior: Toward Real-Time Scene Recovery
* RankDetNet: Delving into Ranking Constraints for Object Detection
* Ranking Neural Checkpoints
* RaScaNet: Learning Tiny Models by Raster-Scanning Images
* Re-labeling ImageNet: From Single to Multi-Labels, from Global to Localized Labels
* Read and Attend: Temporal Localisation in Sign Language Videos
* Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
* ReAgent: Point Cloud Registration using Imitation and Reinforcement Learning
* Real-Time High-Resolution Background Matting
* Real-Time Selfie Video Stabilization
* Real-Time Sphere Sweeping Stereo from Multiview Fisheye Images
* Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification, A
* Reciprocal Landmark Detection and Tracking with Extremely Few Annotations
* Reciprocal Transformations for Unsupervised Video Object Segmentation
* Recognizing Actions in Videos from Unseen Viewpoints
* Reconsidering Representation Alignment for Multi-view Clustering
* Reconstructing 3D Human Pose by Watching Humans in the Mirror
* Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising
* Rectification-based Knowledge Retention for Continual Learning
* Recurrent Multi-view Alignment Network for Unsupervised Surface Registration
* ReDet: A Rotation-equivariant Detector for Aerial Object Detection
* Reducing Domain Gap by Reducing Style Bias
* Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
* Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
* RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features
* Refining Pseudo Labels with Clustering Consensus over Generations for Unsupervised Object Re-identification
* Reformulating HOI Detection as Adaptive Set Prediction
* Region-aware Adaptive Instance Normalization for Image Harmonization
* Regressive Domain Adaptation for Unsupervised Keypoint Detection
* Regularization Strategy for Point Cloud via Rigidly Mixed Sample
* Regularizing Generative Adversarial Networks under Limited Data
* Regularizing Neural Networks via Adversarial Model Perturbation
* Reinforced Attention for Few-Shot Learning and Beyond
* Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
* Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning
* Relevance-CAM: Your Model Already Knows Where to Look
* ReMix: Towards Image-to-Image Translation with Limited Data
* Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network
* Removing Raindrops and Rain Streaks in One Go
* Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
* ReNAS: Relativistic Evaluation of Neural Architecture Search
* Repetitive Activity Counting by Sight and Sound
* Repopulating Street Scenes
* Representation Learning via Global Temporal Alignment and Cycle-Consistency
* Representative Batch Normalization with Feature Calibration
* Representative Forgery Mining for Fake Face Detection
* Representing Videos as Discriminative Sub-graphs for Action Recognition
* Repurposing GANs for One-Shot Semantic Part Segmentation
* RepVGG: Making VGG-style ConvNets Great Again
* Residential floor plan recognition and reconstruction
* Restore from Restored: Video Restoration with Pseudo Clean Video
* Restoring Extremely Dark Images in Real Time
* Rethinking and Improving the Robustness of Image Style Transfer
* Rethinking BiSeNet For Real-time Semantic Segmentation
* Rethinking Channel Dimensions for Efficient Model Design
* Rethinking Class Relations: Absolute-relative Supervised and Unsupervised Few-shot Learning
* Rethinking Graph Neural Architecture Search from Message-passing
* Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
* Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes
* Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
* Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation
* Retinex-inspired Unrolling with Cooperative Prior Architecture Search for Low-light Image Enhancement
* Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
* Revisiting Knowledge Distillation: An Inheritance and Exploration Framework
* Revisiting Superpixels for Active Learning in Semantic Segmentation with Realistic Annotation Costs
* RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
* RGB-D Local Implicit Function for Depth Completion of Transparent Objects
* Rich Context Aggregation with Reflection Prior for Glass Surface Detection
* Rich features for perceptual quality assessment of UGC videos
* Riggable 3D Face Reconstruction via In-Network Optimization
* Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
* Robust and Accurate Object Detection via Adversarial Learning
* Robust Audio-Visual Instance Discrimination
* Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization
* Robust Consistent Video Depth Estimation
* Robust Instance Segmentation through Reasoning about Multi-Object Occlusion
* Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals
* Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
* Robust Point Cloud Registration Framework Based on Deep Graph Matching
* Robust Reference-based Super-Resolution via C2-Matching
* Robust Reflection Removal with Reflection-free Flash-only Cues
* Robust Representation Learning with Feedback for Single Image Deraining
* RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
* Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses
* Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression
* Roses are Red, Violets are Blue… But Should VQA expect Them To?
* Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging
* Rotation Equivariant Siamese Networks for Tracking
* Rotation-Only Bundle Adjustment
* RPN Prototype Alignment For Domain Adaptive Object Detector
* RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2D-Tree Representation
* RSG: A Simple but Effective Module for Learning Imbalanced Datasets
* RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
* RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words
* S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration
* S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation
* S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation
* S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
* Safe Local Motion Planning with Self-Supervised Freespace Forecasting
* SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data
* Saliency-Guided Image Translation
* Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification?
* Scalable Differential Privacy with Sparse Network Finetuning
* Scale-aware Automatic Augmentation for Object Detection
* Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation
* Scale-Localized Abstract Reasoning
* SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements
* Scaled-YOLOv4: Scaling Cross Stage Partial Network
* Scaling Local Self-Attention for Parameter Efficient Visual Backbones
* Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
* SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
* Scene Essence
* Scene Text Retrieval via Joint Text Detection and Similarity Learning
* Scene Text Telescope: Text-Focused Scene Image Super-Resolution
* Scene-aware Generative Network for Human Motion Synthesis
* Scene-Intuitive Agent for Remote Embodied Visual Grounding
* SceneGen: Learning to Generate Realistic Traffic Scenes
* SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences
* SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation
* SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance
* SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud
* Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator
* Searching for Fast Model Families on Datacenter Accelerators
* Second-Order Approach to Learning with Instance-Dependent Label Noise, A
* See through Gradients: Image Batch Recovery via GradInversion
* Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
* Seeing in Extra Darkness Using a Deep-Red Flash
* Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
* Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
* Seesaw Loss for Long-Tailed Instance Segmentation
* Self-Aligned Video Deraining with Transmission-Depth Consistency
* Self-attention based Text Knowledge Mining for Text Detection
* Self-boosting Framework for Automated Radiographic Report Generation, A
* Self-generated Defocus Blur Detection via Dual Adversarial Discriminators
* Self-Guided and Cross-Guided Learning for Few-Shot Segmentation
* Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
* Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning
* Self-SAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network
* Self-Supervised 3D Mesh Reconstruction from Single Images
* Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
* Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On
* Self-supervised Geometric Perception
* Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
* Self-supervised Learning of Depth Inference for Multi-view Stereo
* Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models
* Self-supervised Motion Learning from Static Images
* Self-Supervised Multi-Frame Monocular Scene Flow
* Self-Supervised Pillar Motion Learning for Autonomous Driving
* Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map
* Self-Supervised Video GANs: Learning for Appearance Consistency and Motion Coherency
* Self-supervised Video Hashing via Bidirectional Transformers
* Self-Supervised Video Representation Learning by Context and Motion Decoupling
* Self-Supervised Visibility Learning for Novel View Synthesis
* Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification
* SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning
* SelfDoc: Self-Supervised Document Representation Learning
* Semantic Audio-Visual Navigation
* Semantic Image Matting
* Semantic Palette: Guiding Scene Generation with Class Proportions
* Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
* Semantic Scene Completion via Integrating Instances and Scene in-the-Loop
* Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
* Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
* Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning
* Semantic-Aware Video Text Detection
* Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
* Semi-Supervised Action Recognition with Temporal Contrastive Learning
* Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation
* Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
* Semi-supervised Semantic Segmentation with Directional Context-aware Consistency
* Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans
* Semi-Supervised Video Deraining with Dynamical Rain Generator
* Separating Skills and Concepts for Novel Visual Question Answering
* Sequence-to-Sequence Contrastive Learning for Text Recognition
* Sequential Graph Convolutional Network for Active Learning
* SetVAE: Learning Hierarchical Composition for Generative Modeling of Set-Structured Data
* Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark
* SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation
* SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction
* Shallow Feature Matters for Weakly Supervised Object Localization
* Shape and Material Capture at Home
* Shape from Sky: Polarimetric Normal Recovery Under The Sky
* Shared Cross-Modal Trajectory Prediction for Autonomous Driving
* Shelf-Supervised Mesh Prediction in the Wild
* Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
* Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers
* SiamMOT: Siamese Multi-Object Tracking
* Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds
* Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
* SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification
* Simpler Certified Radius Maximization by Propagating Covariances
* SimPoE: Simulated Character Control for 3D Human Pose Estimation
* Simulating Unknown Target Models for Query-Efficient Black-box Attacks
* Simultaneously Localize, Segment and Rank the Camouflaged Objects
* Single Image Depth Prediction with Wavelet Decomposition
* Single Image Reflection Removal with Absorption Effect
* Single Pair Cross-Modality Super Resolution
* Single-Shot Freestyle Dance Reenactment
* Single-Stage Instance Shadow Detection with Bidirectional Relation Learning
* Single-View 3D Object Reconstruction from Shape Priors in Memory
* Single-view robot pose and joint angle estimation via render & compare
* SIPSA-Net: Shift-Invariant Pan Sharpening with Moving Object Alignment for Satellite Imagery
* Skeleton Merger: an Unsupervised Aligned Keypoint Detector
* Sketch, Ground, and Refine: Top-Down Dense Video Captioning
* Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches
* SKFAC: Training Neural Networks with Faster Kronecker-Factored Approximate Curvature
* Skip-Convolutions for Efficient Video Processing
* SLADE: A Self-Training Framework For Distance Metric Learning
* Sliced Wasserstein Loss for Neural Texture Synthesis, A
* SliceNet: deep dense depth estimation from a single indoor panorama using a slice-based representation
* Slimmable Compressive Autoencoders for Practical Neural Image Compression
* SMD-Nets: Stereo Mixture Density Networks
* Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
* SMPLicit: Topology-aware Generative Model for Clothed People
* SMURF: Self-Teaching Multi-Frame Unsupervised RAFT with Full-Image Warping
* SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition
* Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder
* SOLD2: Self-supervised Occlusion-aware Line Description and Detection
* SOON: Scenario Oriented Object Navigation with Graph-based Exploration
* Soteria: Provable Defense against Privacy Leakage in Federated Learning from Representation Perspective
* Source-Free Domain Adaptation for Semantic Segmentation
* Space-Time Distillation for Video Super-Resolution
* Space-time Neural Irradiance Fields for Free-Viewpoint Video
* Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion
* Sparse Multi-Path Corrections in Fringe Projection Profilometry
* Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
* Spatial Assembly Networks for Image Representation Learning
* Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation
* Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain
* Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos
* Spatially Consistent Representation Learning
* Spatially-Adaptive Pixelwise Networks for Fast Image Translation
* Spatially-Correlative Loss for Various Image Translation Tasks, The
* Spatially-invariant Style-codes Controlled Makeup Transfer
* Spatially-Varying Outdoor Lighting Estimation from Intrinsics
* Spatio-temporal Contrastive Domain Adaptation for Action Recognition
* Spatiotemporal Contrastive Video Representation Learning
* Spatiotemporal Registration for Event-based Visual Odometry
* Spherical Confidence Learning for Face Recognition
* SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
* Spk2ImgNet: Learning to Reconstruct Dynamic Scene from Continuous Spike Stream
* Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
* SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
* Square Root Bundle Adjustment for Large-Scale Reconstruction
* SRDAN: Scale-aware and Range-aware Domain Adaptation Network for Cross-dataset 3D Object Detection
* SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation
* SSAN: Separable Self-Attention Network for Video Representation Learning
* SSLayout360: Semi-Supervised Indoor Layout Estimation from 360° Panorama
* SSN: Soft Shadow Network for Image Compositing
* SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
* ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
* Stable View Synthesis
* StablePose: Learning 6D Object Poses from Geometrically Stable Patches
* STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering
* Stay Positive: Non-Negative Image Synthesis for Augmented Reality
* StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis
* Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
* StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision
* StickyPillars: Robust and Efficient Feature Matching on Point Clouds using Graph Neural Networks
* STMTrack: Template-free Visual Tracking with Space-time Memory Networks
* Stochastic Image-to-Video Synthesis using cINNs
* Stochastic Whitening Batch Normalization
* Strengthen Learning Tolerance for Weakly Supervised Object Localization
* Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes
* Structured Multi-Level Interaction Network for Video Moment Localization via Language Query
* Structured Scene Memory for Vision-Language Navigation
* StruMonoNet: Structure-Aware Monocular 3D Prediction
* Student-Teacher Learning from Clean Inputs to Noisy Inputs
* Style-Aware Normalized Loss for Improving Arbitrary Style Transfer
* Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
* StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval
* StyleMix: Separating Content and Style for Enhanced Data Augmentation
* StylePeople: A Generative Model of Fullbody Human Avatars
* StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
* Stylized Neural Painting
* SuperMix: Supervising the Mixing Data Augmentation
* SurFree: a fast surrogate-free black-box attack
* Surrogate Gradient Field for Latent Space Manipulation
* SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
* SwiftNet: Real-time Video Object Segmentation
* Synthesize-It-Classifier: Learning a Generative Classifier through Recurrent Self-analysis
* Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes
* t-vMF Similarity For Regularizing Intra-Class Feature Distribution
* T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
* Tackling the Ill-Posedness of Super-Resolution through Adaptive Target Generation
* Taming Transformers for High-Resolution Image Synthesis
* Tangent Space Backpropagation for 3D Transformation Groups
* TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
* Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation
* Task Programming: Learning Data Efficient Behavior Representations
* Task-Aware Variational Adversarial Active Learning
* Taskology: Utilizing Task Relations at Scale
* TDN: Temporal Difference Networks for Efficient Action Recognition
* Teachers Do More Than Teach: Compressing Image-to-Image Models
* TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations
* TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
* Temporal Action Segmentation from Timestamp Supervision
* Temporal Context Aggregation Network for Temporal Action Proposal Refinement
* Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
* Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth, The
* Temporal Query Networks for Fine-grained Video Understanding
* Temporal-Relational CrossTransformers for Few-Shot Action Recognition
* Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
* TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking
* Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning
* TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text
* There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
* Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers
* Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations
* Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation
* Time Adaptive Recurrent Neural Network
* Time Lens: Event-based Video Frame Interpolation
* To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
* Topological Planning with Transformers for Vision-and-Language Navigation
* Toward Accurate and Realistic Outfits Visualization with Attention to Details
* Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation
* Towards Accurate 3D Human Motion Prediction from Incomplete Observations
* Towards Accurate Text-based Image Captioning with Content Diversity Exploration
* Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning
* Towards Compact CNNs via Collaborative Compression
* Towards Diverse Paragraph Captioning for Untrimmed Videos
* Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
* Towards Evaluating and Training Verifiably Robust Neural Networks
* Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure
* Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline
* Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets
* Towards High Fidelity Face Relighting with Realistic Shadows
* Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search
* Towards Long-Form Video Understanding
* Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark
* Towards Open World Object Detection
* Towards Part-Based Understanding of RGB-D Scans
* Towards Real-World Blind Face Restoration with Generative Facial Prior
* Towards Robust Classification Model by Counterfactual and Invariant Data Generation
* Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes
* Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
* Towards Unified Surgical Skill Assessment
* TPCN: Temporal Point Cloud Networks for Motion Forecasting
* Track to Detect and Segment: An Online Multi-Object Tracker
* Track, Check, Repeat: An EM Approach to Unsupervised Tracking
* Tracking Pedestrian Heads in Dense Crowd
* TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors
* Training Generative Adversarial Networks in One Stage
* Training Networks in Null Space of Feature Covariance for Continual Learning
* Trajectory Prediction with Latent Belief Energy-Based Model
* Transferable Query Selection for Active Domain Adaptation
* Transferable Semantic Augmentation for Domain Adaptation
* TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations
* Transformation Driven Visual Reasoning
* Transformation Invariant Few-Shot Object Detection
* Transformer Interpretability Beyond Attention Visualization
* Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
* Transformer Tracking
* Transitional Adaptation of Pretrained Models for Visual Storytelling
* Translucent Patch: A Physical and Universal Attack on Object Detectors, The
* TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search
* Tree-like Decision Distillation
* Triple-cooperative Video Shadow Detection
* Troubleshooting Blind Image Quality Models in the Wild
* Truly shift-invariant convolutional neural networks
* TSGCNet: Discriminative Geometric Feature Learning with Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation
* Tuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB
* Turning Frequency to Resolution: Video Super-resolution via Event Cameras
* UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
* UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
* Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning
* Unbalanced Feature Transport for Exemplar-based Image Translation
* Unbiased Mean Teacher for Cross-domain Object Detection
* Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces
* Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection
* Uncertainty Reduction for Model Adaptation in Semantic Segmentation
* Uncertainty-Aware Camera Pose Estimation from Points and Lines
* Uncertainty-aware Joint Salient Object and Camouflaged Object Detection
* Uncertainty-guided Model Generalization to Unseen Domains
* Understanding and Simplifying Perceptual Distances
* Understanding Failures of Deep Networks via Robust Feature Extraction
* Understanding Object Dynamics for Interactive Image-to-Video Synthesis
* Understanding the Behaviour of Contrastive Loss
* Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack
* UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
* Universal Spectral Adversarial Attacks for Deformable Shapes
* Unpaired Image-to-Image Translation via Latent Energy Transport
* UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification
* Unsupervised 3D Shape Completion through GAN Inversion
* Unsupervised Degradation Representation Learning for Blind Super-Resolution
* Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision
* Unsupervised Disentanglement of Linear-Encoded Facial Semantics
* Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination
* Unsupervised Human Pose Estimation through Transforming Shape Templates
* Unsupervised Hyperbolic Metric Learning
* Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders
* Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach
* Unsupervised Learning of 3D Object Categories from Videos in the Wild
* Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks
* Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
* Unsupervised Multi-source Domain Adaptation Without Access to Source Data
* Unsupervised Object Detection with LiDAR Clues
* Unsupervised Part Segmentation through Disentangling Appearance and Shape
* Unsupervised Pre-training for Person Re-identification
* Unsupervised Real-world Image Super Resolution via Domain-distance Aware Training
* Unsupervised Visual Attention and Invariance for Reinforcement Learning
* Unsupervised Visual Representation Learning by Tracking Patches in Video
* UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering
* Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization
* UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
* UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning
* User-Guided Line Art Flat Filling with Split Filling Mechanism
* Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias
* UV-Net: Learning from Boundary Representations
* VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
* Variational Pedestrian Detection
* Variational Prototype Learning for Deep Face Recognition
* Variational Relational Point Completion Network
* Variational Transformer Networks for Layout Generation
* VarifocalNet: An IoU-aware Dense Object Detector
* VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts
* Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
* Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing
* Video Object Segmentation Using Global and Instance Embedding Learning
* Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning
* Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling
* VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples
* View Generalization for Single Image Textured 3D Models
* View-Guided Point Cloud Completion
* VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval
* VinVL: Revisiting Visual Representations in Vision-Language Models
* ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
* ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
* VirFace: Enhancing Face Recognition via Unlabeled Shallow Data
* VirTex: Learning Visual Representations from Textual Annotations
* Virtual Fully-Connected Layer: Training a Large-Scale Face Recognition Dataset with Limited Computational Resources
* Visual Navigation with Spatial Attention
* Visual Room Rearrangement
* Visual Semantic Role Labeling for Video Understanding
* Visualizing Adapted Knowledge in Domain Transfer
* Visually Informed Binaural Audio Generation without Binaural Audios
* VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
* VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization
* VLN_BERT: A Recurrent Vision-and-Language BERT for Navigation
* VoxelContext-Net: An Octree based Framework for Point Cloud Compression
* VS-Net: Voting with Segmentation for Visual Localization
* VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild
* VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
* Wasserstein Barycenter for Multi-Source Domain Adaptation
* Wasserstein Contrastive Representation Distillation
* Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification
* We are More than Our Joints: Predicting how 3D Bodies Move
* Weakly Supervised Action Selection Learning in Video
* Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency
* Weakly Supervised Learning of Rigid 3D Scene Flow
* Weakly Supervised Video Salient Object Detection
* Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images
* Weakly-Supervised Physically Unconstrained Gaze Estimation
* WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
* What Can Style Transfer and Paintings Do For Model Robustness?
* What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
* What's in the Image? Explorable Decoding of Compressed Images
* When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
* When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks
* Where and What? Examining Interpretable Disentangled Representations
* Wide-Baseline Multi-Camera Calibration using Person Re-Identification
* Wide-Baseline Relative Camera Pose Estimation with Directional Learning
* Wide-Depth-Range 6D Object Pose Estimation in Space
* WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
* XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations
* You Only Look One-level Feature
* You See What I Want You to See: Exploring Targeted Black-Box Transferability Attack for Hash-based Image Retrieval Systems
* Your Flamingo is My Bird: Fine-Grained, or Not
* Zero-shot Adversarial Quantization
* Zero-Shot Instance Segmentation
* Zero-shot Single Image Restoration through Controlled Perturbation of Koschmieder's Model
* ZeroScatter: Domain Transfer for Long Distance Imaging and Vision through Scattering Media
* Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts
1658 for CVPR21

CVPR22 * *CVPR
* 360-Attack: Distortion-Aware Perturbations from Perspective-Views
* 360MonoDepth: High-Resolution 360° Monocular Depth Estimation
* 3D Common Corruptions and Data Augmentation
* 3D Human Tongue Reconstruction from Single In-the-Wild Images
* 3D Moments from Near-Duplicate Photos
* 3D Photo Stylization: Learning to Generate Stylized Novel Views from a Single Image
* 3D Scene Painting via Semantic Image Synthesis
* 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow
* 3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces
* 3D-aware Image Synthesis via Learning Structural and Textural Representations
* 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
* 3D-VField: Adversarial Augmentation of Point Clouds for Domain Generalization in 3D Object Detection
* 3DAC: Learning Attribute Compression for Point Clouds
* 3DeformRS: Certifying Spatial Deformations on Point Clouds
* 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
* 3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos
* 3PSDF: Three-Pole Signed Distance Function for Learning Surfaces with Arbitrary Topologies
* A-ViT: Adaptive Tokens for Efficient Vision Transformer
* Abandoning the Bayer-Filter to See in the Dark
* ABO: Dataset and Benchmarks for Real-World 3D Object Understanding
* ABPN: Adaptive Blend Pyramid Network for Real-Time Local Retouching of Ultra High-Resolution Photo
* Accelerating DETR Convergence via Semantic-Aligned Matching
* Accelerating Video Object Segmentation with Compressed Video
* Accurate 3D Body Shape Regression using Metric and Semantic Attributes
* ACPL: Anti-curriculum Pseudo-labelling for Semi-supervised Medical Image Classification
* Acquiring a Dynamic Light Field through a Single-Shot Coded Image
* Active Learning by Feature Mixing
* Active Learning for Open-set Annotation
* Active Teacher for Semi-Supervised Object Detection
* ActiveZero: Mixed Domain Learning for Active Stereovision with Zero Annotation
* AdaFace: Quality Adaptive Margin for Face Recognition
* AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
* AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement
* AdaMixer: A Fast-Converging Query-Based Object Detector
* ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
* Adaptive Early-Learning Correction for Segmentation from Noisy Annotations
* Adaptive Gating for Single-Photon 3D Imaging
* Adaptive Hierarchical Representation Learning for Long-Tailed Object Detection
* Adaptive Trajectory Prediction via Transferable GNN
* AdaptPose: Cross-Dataset Adaptation for 3D Human Pose Estimation by Learnable Motion Generation
* ADAS: A Direct Adaptation Strategy for Multi-Target Domain Adaptive Semantic Segmentation
* AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks
* AdaViT: Adaptive Vision Transformers for Efficient Image Recognition
* ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation
* Adiabatic Quantum Computing for Multi Object Tracking
* Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
* Adversarial Eigen Attack on BlackBox Models
* Adversarial Parametric Pose Prior
* Adversarial Texture for Fooling Person Detectors in the Physical World
* AEGNN: Asynchronous Event-based Graph Neural Networks
* Aesthetic Text Logo Synthesis via Content-aware Layout Inferring
* Affine Medical Image Registration with Coarse-to-Fine Vision Transformer
* AIM: An Auto-Augmenter for Images and Meshes
* AirObject: A Temporally Evolving Graph Embedding for Object Identification
* AKB-48: A Real-World Articulated Object Knowledge Base
* Aladdin: Joint Atlas Building and Diffeomorphic Registration Learning with Pairwise Alignment
* Align and Prompt: Video-and-Language Pre-training with Entity Prompts
* Align Representations with Base: A New Approach to Self-Supervised Learning
* Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification
* AlignMixup: Improving Representations By Interpolating Aligned Features
* AlignQ: Alignment Quantization with ADMM-based Correlation Preservation
* All-In-One Image Restoration for Unknown Corruption
* All-photon Polarimetric Time-of-Flight Imaging
* Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint
* AME: Attention and Memory Enhancement in Hyper-Parameter Optimization
* Amodal Panoptic Segmentation
* Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model
* Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
* Anomaly Detection via Reverse Distillation from One-Class Embedding
* AnyFace: Free-style Text-to-Face Synthesis and Manipulation
* AP-BSN: Self-Supervised Denoising for Real-World Images via Asymmetric PD and Blind-Spot Network
* APES: Articulated Part Extraction from Sprite Sheets
* Appearance and Structure Aware Robust Deep Visual Graph Matching: Attack, Defense and Beyond
* APRIL: Finding the Achilles' Heel on Privacy for Vision Transformers
* AR-NeRF: Unsupervised Learning of Depth and Defocus Effects from Natural Images with Aperture Rendering Neural Radiance Fields
* Arbitrary-Scale Image Synthesis
* Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search
* ARCS: Accurate Rotation and Correspondence Search
* Are Multimodal Transformers Robust to Missing Modality?
* ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation
* ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis
* Artistic Style Discovery with Independent Components
* ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization
* Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities
* ATPFL: Automatic Trajectory Prediction Model Design under Federated Learning Framework
* Attention Concatenation Volume for Accurate and Efficient Stereo Matching
* Attentive Fine-Grained Structured Sparsity for Image Restoration
* Attributable Visual Similarity Learning
* Attribute Group Editing for Reliable Few-shot Image Generation
* Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning
* Audio-Adaptive Activity Recognition Across Video Domains
* Audio-driven Neural Gesture Reenactment with Video Motion Graphs
* Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
* Audiovisual Generalised Zero-shot Learning with Cross-modal Attention and Language
* Auditing Privacy Defenses in Federated Learning via Generative Gradient Leakage
* Aug-NeRF: Training Stronger Neural Radiance Fields with Triple-Level Physically-Grounded Augmentations
* Augmented Geometric Distillation for Data-Free Incremental Person ReID
* Auto Arborist Dataset: A Large-Scale Benchmark for Multiview Urban Forest Monitoring Under Domain Shift, The
* Autofocus for Event Cameras
* AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation
* AutoLoss-GMS: Searching Generalized Margin-based Softmax Loss Function for Person Re-identification
* AutoLoss-Zero: Searching Loss Functions from Scratch for Generic Tasks
* Automated Progressive Learning for Efficient Training of Vision Transformers
* Automatic Color Image Stitching Using Quaternion Rank-1 Alignment
* Automatic Relation-aware Graph Network Proliferation
* Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis
* AutoMine: An Unmanned Mine Dataset
* Autoregressive Image Generation using Residual Quantization
* AutoRF: Learning 3D Object Radiance Fields from Single View Observations
* AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
* AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis
* AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
* AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception
* B-cos Networks: Alignment is All We Need for Interpretability
* Back to Reality: Weakly-supervised 3D Object Detection with Shape-Guided Label Enhancement
* Backdoor Attacks on Self-Supervised Learning
* Background Activation Suppression for Weakly Supervised Object Localization
* Bacon: Band-Limited Coordinate Networks for Multiscale Scene Representation
* Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory
* Balanced and Hierarchical Relation Learning for One-shot Object Detection
* Balanced Contrastive Learning for Long-Tailed Visual Recognition
* Balanced MSE for Imbalanced Visual Regression
* Balanced Multimodal Learning via On-the-fly Gradient Modulation
* BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule
* Bandits for Structure Perturbation-based Black-box Attacks to Graph Neural Networks with Theoretical Guarantees
* BANMo: Building Animatable 3D Neural Models from Many Casual Videos
* BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information
* BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
* BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning
* Bayesian Invariant Risk Minimization
* Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection
* BCOT: A Markerless High-Precision 3D Object Tracking Benchmark
* BE-STI: Spatial-Temporal Integrated Network for Class-agnostic Motion Prediction with Bidirectional Enhancement
* BEHAVE: Dataset and Method for Tracking Human Object Interactions
* Bending Graphs: Hierarchical Shape Matching using Gated Optimal Transport
* Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
* beta-DARTS: Beta-Decay Regularization for Differentiable Architecture Search
* Better Trigger Inversion Optimization in Backdoor Scanning
* BEVT: BERT Pretraining of Video Transformers
* Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds
* Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
* Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image
* Beyond Fixation: Dynamic Window Visual Transformer
* Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-Refinement
* Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning
* Bi-Directional Object-Context Prioritization Learning for Saliency Ranking
* Bi-level Alignment for Cross-Domain Crowd Counting
* Bi-level Doubly Variational Learning for Energy-based Latent Variable Models
* BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations
* BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster
* Bijective Mapping Network for Shadow Removal
* Bilateral Video Magnification Filter
* Blended Diffusion for Text-driven Editing of Natural Images
* Blind Face Restoration via Integrating Face Shape and Generative Priors
* Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel
* Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots
* Block-NeRF: Scalable Large Scene Neural View Synthesis
* BNUDC: A Two-Branched Deep Neural Network for Restoring Images from Under-Display Cameras
* BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion
* BodyGAN: General-purpose Controllable Neural Human Body Generation
* BodyMap: Learning Full-Body Dense Correspondence Map
* BokehMe: When Neural Rendering Meets Classical Rendering
* Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
* BoosterNet: Improving Domain Generalization of Deep Neural Nets using Culpability-Ranked Features
* Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
* Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution
* Boosting Crowd Counting via Multifaceted Attention
* Boosting Robustness of Image Matting with Context Assembling and Strong Data Augmentation
* Boosting View Synthesis with Residual Transfer
* BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
* Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training
* Both Style and Fog Matter: Cumulative Domain Adaptation for Semantic Foggy Scene Understanding
* Bounded Adversarial Attack on Deep Content Features
* BoxeR: Box-Attention for 2D and 3D Transformers
* BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning
* Brain-inspired Multilayer Perceptron with Spiking Neurons
* Brain-Supervised Image Editing
* Brand New Dance Partner: Music-Conditioned Pluralistic Dancing Controlled by Multiple Dance Genres, A
* Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
* Bridged Transformer for Vision and Point Cloud 3D Object Detection
* Bridging Global Context Interactions for High-Fidelity Image Completion
* Bridging the Gap between Classification and Localization for Weakly Supervised Object Localization
* Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
* Bridging Video-Text Retrieval with Multiple Choice Questions
* Bring Evanescent Representations to Life in Lifelong Class Incremental Learning
* Bringing Old Films Back to Life
* BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
* Burst Image Restoration and Enhancement
* C-CAM: Causal CAM for Weakly Supervised Semantic Segmentation on Medical Image
* C2AM: Contrastive Learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation
* C2AM Loss: Chasing a Better Decision Boundary for Long-Tail Object Detection
* C2SLR: Consistency-enhanced Continuous Sign Language Recognition
* CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification
* CaDeX: Learning Canonical Deformation Coordinate Space for Dynamic Surface Representation via Neural Homeomorphism
* CADTransformer: Panoptic Symbol Spotting Transformer for CAD Drawings
* CAFE: Learning to Condense Dataset by Aligning Features
* Calibrating Deep Neural Networks by Pairwise Constraints
* Camera Pose Estimation using Implicit Distortion Models
* Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification
* CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation
* Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective
* Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection
* Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in Videos
* Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes
* CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly
* Capturing and Inferring Dense Full-Body Human-Scene Contact
* Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video
* Cascade Transformers for End-to-End Person Search
* CAT-Det: Contrastively Augmented Transformer for Multimodal 3D Object Detection
* Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection
* Category Contrast for Unsupervised Domain Adaptation in Visual Tasks
* Category-Aware Transformer Network for Better Human-Object Interaction Detection
* Causal Transportability for Visual Recognition
* Causality Inspired Representation Learning for Domain Generalization
* CD2-pFed: Cyclic Distillation-guided Channel Decoupling for Model Personalization in Federated Learning
* CDGNet: Class Distribution Guided Network for Human Parsing
* CellTypeGraph: A New Geometric Computer Vision Benchmark
* Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing
* Certified Patch Robustness via Smoothed Vision Transformers
* Channel Balancing for Accurate Quantization of Winograd Convolutions
* CHEX: CHannel EXploration for CNN Model Compression
* ChiTransformer: Towards Reliable Stereo from Cues
* Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
* Class Similarity Weighted Knowledge Distillation for Continual Semantic Segmentation
* Class-Aware Contrastive Semi-Supervised Learning
* Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation
* Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
* Class-Incremental Learning with Strong Pre-trained Models
* Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs
* Clean Implicit 3D Structure from Noisy 2D STEM Images
* CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation
* CLIP-Event: Connecting Text and Images with Event Structures
* CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation
* CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
* Clipped Hyperbolic Classifiers Are Super-Hyperbolic Classifiers
* CLIPstyler: Image Style Transfer with a Single Text Condition
* Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification
* Closer Look at Few-shot Image Generation, A
* Closing the Generalization Gap of Cross-Silo Federated Medical Image Segmentation
* Cloth-Changing Person Re-identification from A Single Image with Gait Prediction and Regularization
* Clothes-Changing Person Re-identification with RGB Modality Only
* ClothFormer: Taming Video Virtual Try-on in All Module
* CLRNet: Cross Layer Refinement Network for Lane Detection
* Cluster-guided Image Synthesis with Unconditional Models
* ClusterGNN: Cluster-based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching
* Clustering Plotted Data by Image Segmentation
* CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
* CMT: Convolutional Neural Networks Meet Vision Transformers
* CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
* Co-advise: Cross Inductive Bias Distillation
* Co-domain Symmetry for Complex-Valued Deep Learning
* CO-SNE: Dimensionality Reduction and Visualization for Hyperbolic Data
* COAP: Compositional Articulated Occupancy of People
* Coarse-To-Fine Deep Video Coding with Hyperprior-Guided Mode Prediction
* Coarse-to-Fine Feature Mining for Video Semantic Segmentation
* Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation
* CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance
* Coherent Point Drift Revisited for Non-rigid Shape Matching and Registration
* Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
* Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution
* Collaborative Transformers for Grounded Situation Recognition
* Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction
* Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data
* Comparing Correspondences: Video Prediction with Correspondence-wise Losses
* Complex Backdoor Detection by Symmetric Feature Differencing
* Complex Video Action Reasoning via Learnable Markov Logic Network
* Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
* Compound Domain Generalization via Meta-Knowledge Encoding
* Comprehending and Ordering Semantics for Image Captioning
* Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes, A
* Compressing Models with Few Samples: Mimicking then Replacing
* Compressive Single-Photon 3D Cameras
* Computing Wasserstein-p Distance Between Images with Linear Cost
* Condensing CNNs with Partial Differential Equations
* Conditional Prompt Learning for Vision-Language Models
* ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes
* CoNeRF: Controllable Neural Radiance Fields
* Confidence Propagation Cluster: Unleash Full Potential of Object Detectors
* Connecting the Complementary-view Videos: Joint Camera Identification and Subject Association
* Conservative Approach for Unbiased Learning on Unknown Biases, A
* Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
* Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection
* Consistent Explanations by Contrastive Learning
* Constrained Few-shot Class-incremental Learning
* Context-Aware Sequence Alignment using 4D Skeletal Augmentation
* Context-Aware Video Reconstruction for Rolling Shutter Cameras
* Contextual Debiasing for Visual Recognition with Causal Mechanisms
* Contextual Instance Decoupling for Robust Multi-Person Pose Estimation
* Contextual Outpainting with Object-Level Contrastive Learning
* Contextual Similarity Distillation for Asymmetric Image Retrieval
* Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision
* ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics
* Continual Learning for Visual Search with Backward Consistent Feature Embedding
* Continual Learning with Lifelong Vision Transformer
* Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism
* Continual Predictive Learning from Videos
* Continual Stereo Matching of Continuous Driving Scenes with Growing Architecture
* Continual Test-Time Domain Adaptation
* Continuous Scene Representations for Embodied AI
* Contour-Hugging Heatmaps for Landmark Detection
* Contrastive Boundary Learning for Point Cloud Segmentation
* Contrastive Conditional Neural Processes
* Contrastive Dual Gating: Learning Sparse Features With Contrastive Learning
* Contrastive Learning for Space-time Correspondence via Self-cycle Consistency
* Contrastive Learning for Unsupervised Video Highlight Detection
* Contrastive Regression for Domain Adaptation on Gaze Estimation
* Contrastive Test-Time Adaptation
* ContrastMask: Contrastive Learning to Segment Every Thing
* Controllable Animation of Fluid Elements in Still Images
* Controllable Dynamic Multi-Task Architectures
* ConvNet for the 2020s, A
* Convolution of Convolution: Let Kernels Spatially Collaborate
* Convolutions for Spatial Interaction Modeling
* Coopernaut: End-to-End Driving with Cooperative Perception for Networked Vehicles
* CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs
* Correlation Verification for Image Retrieval
* Correlation-Aware Deep Tracking
* CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning
* COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
* Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation
* Coupled Iterative Refinement for 6D Multi-Object Pose Estimation
* Coupling Vision and Proprioception for Navigation of Legged Robots
* CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild
* CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow
* Crafting Better Contrastive Views for Siamese Representation Learning
* CREAM: Weakly Supervised Object Localization via Class RE-Activation Mapping
* CRIS: CLIP-Driven Referring Image Segmentation
* Critical Regularizations for Neural Surface Reconstruction in the Wild
* CroMo: Cross-Modal Learning for Monocular Depth Estimation
* Cross Domain Object Detection by Target-Perceived Dual Branch Distillation
* Cross Modal Retrieval with Querybank Normalisation
* Cross-Architecture Self-supervised Video Representation Learning
* Cross-Domain Adaptive Teacher for Object Detection
* Cross-Domain Correlation Distillation for Unsupervised Domain Adaptation in Nighttime Semantic Segmentation
* Cross-domain Few-shot Learning with Task-specific Adapters
* Cross-Image Relational Knowledge Distillation for Semantic Segmentation
* Cross-modal Background Suppression for Audio-Visual Event Localization
* Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
* Cross-modal Map Learning for Vision and Language Navigation
* Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
* Cross-Modal Transferable Adversarial Attacks from Images to Videos
* Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition
* Cross-patch Dense Contrastive Learning for Semi-supervised Segmentation of Cellular Nuclei in Histopathologic Images
* Cross-view Transformers for Real-time Map-view Semantic Segmentation
* CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data
* Crossmodal Representation Learning for Zero-shot Action Recognition
* CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
* Crowd Counting in the Frequency Domain
* CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
* CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image
* CVNet: Contour Vibration Network for Building Extraction
* Cycle-Consistent Counterfactuals by Latent Transformations
* CycleMix: A Holistic Strategy for Medical Image Segmentation from Scribble Supervision
* D-Grasp: Physically Plausible Dynamic Grasp Synthesis for Hand-Object Interactions
* DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image
* DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
* DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection
* DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion
* Dancing under the Stars: Video Denoising in Starlight
* DArch: Dental Arch Prior-assisted 3D Tooth Instance Segmentation with Weak Annotations
* DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning
* Data-Free Network Compression via Parametric Non-uniform Mixed Precision Quantization
* DATA: Domain-Aware and Task-Aware Self-supervised Learning
* Dataset Distillation by Matching Training Trajectories
* Day-to-Night Image Synthesis for Training Nighttime Neural ISPs
* DC-SSL: Addressing Mismatched Class Distribution in Semi-Supervised Learning
* De-rendering 3D Objects in the Wild
* DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
* Debiased Learning from Naturally Imbalanced Pseudo-Labels
* Deblur-NeRF: Neural Radiance Fields from Blurry Images
* Deblurring via Stochastic Refinement
* DECORE: Deep Compression with Reinforcement Learning
* Decoupled Knowledge Distillation
* Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing
* Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
* Decoupling Makes Weakly Supervised Local Feature Better
* Decoupling Zero-Shot Semantic Segmentation
* DeeCap: Dynamic Early Exiting for Efficient Image Captioning
* Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings
* Deep Anomaly Discovery from Unlabeled Videos via Normality Advantage and Self-Paced Refinement
* Deep Color Consistent Network for Low-Light Image Enhancement
* Deep Constrained Least Squares for Blind Image Super-Resolution
* Deep Decomposition for Stochastic Normal-Abnormal Transport
* Deep Depth from Focus with Differential Focus Volume
* Deep Equilibrium Optical Flow Estimation
* Deep Generalized Unfolding Networks for Image Restoration
* Deep Hierarchical Semantic Segmentation
* Deep Hybrid Models for Out-of-Distribution Detection
* Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection
* Deep Image-based Illumination Harmonization
* Deep orientation-aware functional maps: Tackling symmetry issues in Shape Matching
* Deep Rectangling for Image Stitching: A Learning Baseline
* Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase
* Deep Saliency Prior for Reducing Visual Distraction
* Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
* Deep Stereo Image Compression via Bi-directional Coding
* Deep Unlearning via Randomized Conditionally Independent Hessians
* Deep vanishing point detection: Geometric priors make dataset variations vanish
* Deep Visual Geo-localization Benchmark
* DeepCurrents: Learning Implicit Representations of Shapes with Boundaries
* DeepDPM: Deep Clustering With an Unknown Number of Clusters
* Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information, A
* DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance Improves Out-Of-Distribution Face Identification
* DeepFake Disrupter: The Detector of DeepFake Is My Friend
* DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
* DeepLIIF: An Online Platform for Quantification of Clinical Pathology Slides
* DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints
* Defensive Patches for Robust Recognition in the Physical World
* Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes
* Deformable Sprites for Unsupervised Video Decomposition
* Deformable Video Transformer
* Deformation and Correspondence Aware Unsupervised Synthetic-to-Real Scene Flow Estimation for Point Clouds
* Degradation-agnostic Correspondence from Resolution-asymmetric Stereo
* Degree-of-linear-polarization-based Color Constancy
* DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos
* Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
* Delving into the Estimation Shift of Batch Normalization in a Network
* Democracy Does Matter: Comprehensive Feature Mining for Co-Salient Object Detection
* Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?
* Dense Depth Priors for Neural Radiance Fields from Sparse Input Views
* Dense Learning based Semi-Supervised Object Detection
* DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
* Density-preserving Deep Point Cloud Compression
* Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light
* Depth-Aware Generative Adversarial Network for Talking Head Video Generation
* Depth-Guided Sparse Structure-from-Motion for Movies and TV Shows
* Depth-supervised NeRF: Fewer Views and Faster Training for Free
* DESTR: Object Detection with Split Transformer
* Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution
* Detecting Camouflaged Object in Frequency Domain
* Detecting Deepfakes with Self-Blended Images
* Detector-Free Weakly Supervised Group Activity Recognition
* DetectorDetective: Investigating the Effects of Adversarial Examples on Object Detectors
* Deterministic Point Cloud Registration via Novel Transformation Decomposition
* DETReg: Unsupervised Pretraining with Region Priors for Object Detection
* DEVIL is in the Details: A Diagnostic Evaluation Benchmark for Video Inpainting, The
* Devil Is in the Details: Window-Based Attention for Image Compression, The
* Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation, The
* Devil is in the Margin: Margin-based Label Smoothing for Network Calibration, The
* Devil is in the Pose: Ambiguity-free 3D Rotation-invariant Learning via Pose-aware Convolution, The
* DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis
* DGECN: A Depth-Guided Edge Convolutional Network for End-to-End 6D Pose Estimation
* Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
* Differentiable Stereopsis: Meshes from multiple views using differentiable rendering
* Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift, A
* Differentially Private Federated Learning with Local Regularization and Sparsification
* DiffPoseNet: Direct Differentiable Camera Pose Estimation
* Diffusion Autoencoders: Toward a Meaningful and Decodable Representation
* DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
* DIFNet: Boosting Visual Information Flow for Image Captioning
* DiGS: Divergence guided shape implicit neural representation for unoriented point clouds
* DiLiGenT102: A Photometric Stereo Benchmark Dataset with Controlled Shape and Material Variation
* Dimension Embeddings for Monocular 3D Object Detection
* DINE: Domain Adaptation from Single and Multiple Black-box Predictors
* DIP: Deep Inverse Patchmatch for High-Resolution Optical Flow
* DiRA: Discriminative, Restorative, and Adversarial Learning for Self-supervised Medical Image Analysis
* DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
* Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction
* Directional Self-supervised Learning for Heavy Image Augmentations
* DisARM: Displacement Aware Relation Module for 3D Detection
* Discovering Objects that Can Move
* Discrete Cosine Transform Network for Guided Depth Map Super-Resolution
* Discrete time convolution for fast event-based stereo
* Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images
* Disentangling visual and written concepts in CLIP
* Disentangling Visual Embeddings for Attributes and Objects
* DiSparse: Disentangled Sparsification for Multitask Model Compression
* Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective
* Distillation Using Oracle Queries for Transformer-based Human-Object Interaction Detection
* Distinguishing Unseen from Seen for Generalized Zero-shot Learning
* Distribution Consistent Neural Architecture Search
* Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation
* Ditto: Building Digital Twins of Articulated Objects from Interaction
* DIVeR: Real-time and Accurate Neural Radiance Fields with Deterministic Integration for Volume Rendering
* Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation
* Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection
* Divide and Conquer: Compositional Experts for Generalized Novel Class Discovery
* DLFormer: Discrete Latent Transformer for Video Inpainting
* DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
* Do Explanations Explain? Model Knows Best
* Do learned representations respect causal relationships?
* DO-GAN: A Double Oracle Framework for Generative Adversarial Networks
* Does Robustness on ImageNet Transfer to Downstream Tasks?
* Does text attract attention on e-commerce images: A novel saliency prediction dataset and method
* Domain Adaptation on Point Clouds via Geometry-Aware Implicits
* Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing
* Domain-Agnostic Prior for Transfer Semantic Segmentation
* Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches
* DoubleField: Bridging the Neural Surface and Radiance Fields for High-fidelity Human Reconstruction and Rendering
* DPGEN: Differentially Private Generative Energy-Guided Network for Natural Image Synthesis
* DPICT: Deep Progressive Image Compression Using Trit-Planes
* DR.VIC: Decomposition and Reasoning for Video Individual Counting
* Dreaming to Prune Image Deraining Networks
* Dressing in the Wild by Watching Dance Videos
* Drop the GAN: In Defense of Patches Nearest Neighbors as Single Image Generative Models
* DST: Dynamic Substitute Training for Data-free Black-box Attack
* DTA: Physical Camouflage Attacks using Differentiable Transformation Network
* DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification
* Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution
* Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification
* Dual Task Learning by Leveraging Both Dense Correspondence and Mis-Correspondence for Robust Change Detection With Imperfect Matches
* Dual Temperature Helps Contrastive Learning Without Many Negative Samples: Towards Understanding and Simplifying MoCo
* Dual Weighting Label Assignment Scheme for Object Detection, A
* Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition
* Dual-Generator Face Reenactment
* Dual-Key Multimodal Backdoors for Visual Question Answering
* Dual-path Image Inpainting with Auxiliary GAN Inversion
* Dual-Shutter Optical Vibration Sensing
* Dynamic 3D Gaze from Afar: Deep Gaze Estimation from Temporal Eye-Head-Body Coordination
* Dynamic Dual-Output Diffusion Models
* Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning
* Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information
* Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation
* Dynamic Scene Graph Generation via Anticipatory Pre-training
* Dynamic Sparse R-CNN
* DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation
* DyRep: Bootstrapping Training with Dynamic Re-parameterization
* DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
* E-CIR: Event-Enhanced Continuous Intensity Recovery
* E2(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition
* E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation
* EASE: Unsupervised Discriminant Subspace Learning for Transductive Few-Shot Learning
* EDTER: Edge Detection with Transformer
* Effective conditioned and composed image retrieval combining CLIP-based features
* Efficient Classification of Very Large Images with Tiny Objects
* Efficient Deep Embedded Subspace Clustering
* Efficient Geometry-aware 3D Generative Adversarial Networks
* Efficient Large-scale Localization by Global Instance Recognition
* Efficient Maximal Coding Rate Reduction by Variational Forms
* Efficient Multi-view Stereo by Iterative Dynamic Cost Volume
* Efficient Training Approach for Very Large Scale Face Recognition, An
* Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer
* Efficient Video Instance Segmentation via Tracklet Query and Proposal
* EfficientNeRF: Efficient Neural Radiance Fields
* Ego4D: Around the World in 3,000 Hours of Egocentric Video
* Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
* Egocentric Prediction of Action Target in 3D
* Egocentric Scene Understanding via Multimodal Spatial Rectifier
* EI-CLIP: Entity-aware Interventional Contrastive Learning for E-commerce Cross-modal Retrieval
* Eigencontours: Novel Contour Descriptors Based on Low-Rank Approximation
* Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes
* ElePose: Unsupervised 3D Human Pose Estimation by Predicting Camera Elevation and Learning Normalizing Flows on 2D Poses
* ELIC: Efficient Learned Image Compression with Unevenly Grouped Space-Channel Contextual Adaptive Coding
* ELSR: Efficient Line Segment Reconstruction with Planes and Points Guidance
* Embracing Single Stride 3D Object Detector with Sparse Transformer
* EMOCA: Emotion Driven Monocular Face Capture and Animation
* Empirical Study of End-to-End Temporal Action Detection, An
* Empirical Study of Training End-to-End Vision-and-Language Transformers, An
* EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
* En-Compactness: Self-Distillation Embedding & Contrastive Generation for Generalized Zero-Shot Learning
* Enabling Equivariance for Arbitrary Lie Groups
* End-to-End Compressed Video Representation Learning for Generic Event Boundary Detection
* End-to-end Generative Pretraining for Multimodal Video Captioning
* End-to-End Human-Gaze-Target Detection with Transformers
* End-to-End Multi-Person Pose Estimation with Transformers
* End-to-End Reconstruction-Classification Learning for Face Forgery Detection
* End-to-End Referring Video Object Segmentation with Multimodal Transformers
* End-to-End Semi-Supervised Learning for Video Action Detection
* End-to-End Trajectory Distribution Prediction Based on Occupancy Grid Maps
* Energy-based Latent Aligner for Incremental Learning
* Enhancing Adversarial Robustness for Deep Metric Learning
* Enhancing Adversarial Training with Second-Order Statistics of Weights
* Enhancing Classifier Conservativeness and Robustness by Polynomiality
* Enhancing Face Recognition with Self-Supervised 3D Reconstruction
* Ensembling Off-the-shelf Models for GAN Training
* Entropy-based Active Learning for Object Detection with Progressive Diversity Constraint
* Envedit: Environment Editing for Vision-and-Language Navigation
* Episodic Memory Question Answering
* EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
* Equalized Focal Loss for Dense Long-Tailed Object Detection
* Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets
* Equivariant Point Cloud Analysis via Learning Orientations for Message Passing
* ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework
* Escaping Data Scarcity for High-Resolution Heterogeneous Face Hallucination
* ESCNet: Gaze Target Detection with the Understanding of 3D Scenes
* Estimating Egocentric 3D Human Pose in the Wild with External Weak Supervision
* Estimating Example Difficulty using Variance of Gradients
* Estimating Fine-Grained Noise Model via Contrastive Learning
* Estimating Structural Disparities for Face Models
* ETHSeg: An Amodel Instance Segmentation Network and a Real-world Dataset for X-Ray Waste Inspection
* Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition
* Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization
* Evaluation-oriented Knowledge Distillation for Deep Face Recognition
* Event-aided Direct Sparse Odometry
* Event-based Video Reconstruction via Potential-assisted Spiking Neural Network
* Everything at Once - Multi-modal Fusion Transformer for Video Retrieval
* EvUnroll: Neuromorphic Events based Rolling Shutter Image Correction
* Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization
* Exemplar-based Pattern Synthesis with Implicit Periodic Field Network
* Expanding Large Pre-trained Unimodal Models with Multimodal Information Injection for Image-Text Multimodal Classification
* Expanding Low-Density Latent Regions for Open-Set Object Detection
* Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention
* Exploiting Explainable Metrics for Augmented SGD
* Exploiting Pseudo Labels in a Self-Supervised Learning Framework for Improved Monocular Depth Estimation
* Exploiting Rigidity Constraints for LiDAR Scene Flow Estimation
* Exploiting Temporal Relations on Radar Perception for Autonomous Driving
* Explore Spatio-Temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline
* Exploring and Evaluating Image Restoration Potential in Dynamic Scenes
* Exploring Denoised Cross-video Contrast for Weakly-supervised Temporal Action Localization
* Exploring Domain-Invariant Parameters for Source Free Domain Adaptation
* Exploring Dual-task Correlation for Pose Guided Person Image Generation
* Exploring Effective Data for Surrogate Training Towards Black-box Attack
* Exploring Endogenous Shift for Cross-domain Detection: A Large-scale Benchmark and Perturbation Suppression Network
* Exploring Frequency Adversarial Attacks for Face Forgery Detection
* Exploring Geometric Consistency for Monocular 3D Object Detection
* Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks
* Exploring Set Similarity for Dense Self-supervised Representation Learning
* Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection
* Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework
* Exposure Normalization and Compensation for Multiple-Exposure Correction
* Expressive Talking Head Generation with Granular Audio-Visual Control
* Extracting Triangular 3D Models, Materials, and Lighting From Images
* EyePAD++: A Distillation-based approach for joint Eye Authentication and Presentation Attack Detection using Periocular Images
* Face Relighting with Geometrically Consistent Shadows
* Face2Exp: Combating Data Biases for Facial Expression Recognition
* FaceFormer: Speech-Driven 3D Facial Animation with Transformers
* FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset
* Failure Modes of Domain Generalization Algorithms
* Fair Contrastive Learning for Facial Attribute Classification
* Fairness-aware Adversarial Perturbation Towards Bias Mitigation for Deployed Deep Models
* Faithful Extreme Rescaling via Generative Prior Reciprocated Invertible Representations
* FAM: Visual Explanations for the Feature Representations from Deep Convolutional Networks
* FashionVLP: Vision Language Transformer for Fashion Retrieval with Feedback
* Fast Algorithm for Low-rank Tensor Completion in Delay-embedded Space
* Fast and Unsupervised Action Boundary Detection for Action Segmentation
* Fast Light-Weight Near-Field Photometric Stereo
* Fast Point Transformer
* Fast, Accurate and Memory-Efficient Partial Permutation Synchronization
* FastDOG: Fast Discrete Optimization on GPU
* Feature Erasing and Diffusion Network for Occluded Person Re-Identification
* Feature Statistics Mixing Regularization for Generative Adversarial Networks
* FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning
* FedCorr: Multi-Stage Federated Learning for Label Noise Correction
* FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction
* Federated Class-Incremental Learning
* Federated Learning with Position-Aware Neurons
* FENeRF: Face Editing in Neural Radiance Fields
* FERV39k: A Large-Scale Multi-Scene Dataset for Facial Expression Recognition in Videos
* Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection
* Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment
* Few-shot Backdoor Defense Using Shapley Estimation
* Few-Shot Font Generation by Learning Fine-Grained Local Styles
* Few-Shot Head Swapping in the Wild
* Few-Shot Incremental Learning for Label-to-Image Translation
* Few-shot Keypoint Detection with Uncertainty Learning for Unseen Species
* Few-shot Learning with Noisy Labels
* Few-Shot Object Detection with Fully Cross-Transformer
* FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis
* FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation
* Finding Badly Drawn Bunnies
* Finding Fallen Objects Via Asynchronous Audio-Visual Integration
* Finding Good Configurations of Planar Primitives in Unorganized Point Clouds
* Fine-Grained Object Classification via Self-Supervised Pose Alignment
* Fine-Grained Predicates Learning for Scene Graph Generation
* Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization
* Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning
* Fine-tuning Image Transformers using Learnable Memory
* FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
* Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations
* Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction
* Fisher Information Guidance for Learned Time-of-Flight Imaging
* FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering
* Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
* Flag Median and FlagIRLS, The
* FLAG: Flow-based 3D Avatar Generation from Sparse Observations
* FLAVA: A Foundational Language And Vision Alignment Model
* FlexIT: Towards Flexible Semantic Image Translation
* FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
* FMCNet: Feature-Level Modality Compensation for Visible-Infrared Person Re-Identification
* Focal and Global Knowledge Distillation for Detectors
* Focal Length and Object Pose Estimation via Render and Compare
* Focal Sparse Convolutional Networks for 3D Object Detection
* FocalClick: Towards Practical Interactive Image Segmentation
* FocusCut: Diving into a Focus View in Interactive Segmentation
* FoggyStereo: Stereo Matching with Fog Volume Representation
* Forecasting Characteristic 3D Poses of Human Actions
* Forecasting from LiDAR via Future Object Detection
* Forward Compatible Few-Shot Class-Incremental Learning
* Forward Compatible Training for Large-Scale Embedding Retrieval Systems
* Forward Propagation, Backward Regression, and Pose Association for Hand Tracking in the Wild
* Fourier Document Restoration for Robust Document Dewarping and Recognition
* Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time
* Frame Averaging for Equivariant Shape Space Learning
* Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning
* Framework for Learning Ante-hoc Explainable Models via Concepts, A
* FreeSOLO: Learning to Segment Objects without Annotations
* Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity
* From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering
* FS6D: Few-Shot 6D Pose Estimation of Novel Objects
* Full-Range Virtual Try-On with Recurrent Tri-Level Transform
* Future Transformer for Long-term Action Anticipation
* FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction
* FWD: Real-time Novel View Synthesis with Forward Warping and Depth
* Gait Recognition in the Wild with Dense 3D Representations and A Benchmark
* GAN-Supervised Dense Visual Alignment
* GANORCON: Are Generative Models Useful for Few-shot Segmentation?
* GANSeg: Learning to Segment by Unsupervised Hierarchical Image Generation
* GASP, a generalized framework for agglomerative clustering of signed graphs and its application to Instance Segmentation
* GAT-CADNet: Graph Attention Network for Panoptic Symbol Spotting in CAD Drawings
* GaTector: A Unified Framework for Gaze Object Prediction
* Gated2Gated: Self-Supervised Depth Estimation from Gated Images
* GateHUB: Gated History Unit with Background Suppression for Online Action Detection
* Gaussian Process Modeling of Approximate Inference Errors for Variational Autoencoders
* GazeOnce: Real-Time Multi-Person Gaze Estimation
* GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors
* GCR: Gradient Coreset based Replay Buffer Selection for Continual Learning
* gDNA: Towards Generative Detailed Neural Avatars
* GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
* GenDR: A Generalized Differentiable Renderer
* General Facial Representation Learning in a Visual-Linguistic Manner
* General Incremental Learning with Domain-aware Categorical Representations
* Generalizable Cross-modality Medical Image Segmentation via Style Augmentation and Dual Normalization
* Generalizable Human Pose Triangulation
* Generalized Binary Search Network for Highly-Efficient Multi-View Stereo
* Generalized Category Discovery
* Generalized Few-shot Semantic Segmentation
* Generalizing Gaze Estimation with Rotation Consistency
* Generalizing Interactive Backpropagating Refinement for Dense Prediction Networks
* Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
* Generating Diverse 3D Reconstructions from a Single Occluded Face Image
* Generating Diverse and Natural 3D Human Motions from Text
* Generating High Fidelity Data from Low-density Regions using Diffusion Models
* Generating Representative Samples for Few-Shot Classification
* Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior
* Generative Cooperative Learning for Unsupervised Video Anomaly Detection
* Generative Flows with Invertible Attentions
* GeoEngine: A Platform for Production-Ready Geospatial Research
* Geometric Anchor Correspondence Mining with Uncertainty Modeling for Universal Domain Adaptation
* Geometric and Textural Augmentation for Domain Gap Reduction
* Geometric Structure Preserving Warp for Natural Image Stitching
* Geometric Transformer for Fast and Robust Point Cloud Registration
* Geometry-Aware Guided Loss for Deep Crack Recognition
* GeoNeRF: Generalizing NeRF with Geometry Priors
* GIFS: Neural Implicit Function for General Shape Representation
* GIQE: Generic Image Quality Enhancement via Nth Order Iterative Degradation
* GIRAFFE HD: A High-Resolution 3D-aware Generative Model
* Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness
* GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras
* Glass Segmentation using Intensity and Spectral Polarization Cues
* GLASS: Geometric Latent Augmentation for Shape Spaces
* GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction
* Global Context with Discrete Diffusion in Vector Quantised Modelling for Image Generation
* Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning
* Global Matching with Overlapping Attention for Optical Flow Estimation
* Global Sensing and Measurements Reuse for Image Compressed Sensing
* Global Tracking Transformers
* Global Tracking via Ensemble of Local Trackers
* Global-Aware Registration of Less-Overlap RGB-D Scans
* Globetrotter: Connecting Languages by Connecting Images
* GMFlow: Learning Optical Flow via Global Matching
* GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping
* GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision
* GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting
* Gradient-SDF: A Semi-Implicit Surface Representation for 3D Reconstruction
* GradViT: Gradient Inversion of Vision Transformers
* GraFormer: Graph-oriented Transformer for 3D Pose Estimation
* GraftNet: Towards Domain Generalized Stereo Matching with a Broad-Spectrum and Task-Oriented Feature
* GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains
* GRAM: Generative Radiance Manifolds for 3D-Aware Image Generation
* Graph Sampling Based Deep Metric Learning for Generalizable Person Re-Identification
* Graph-based Spatial Transformer with Memory Replay for Multi-future Pedestrian Trajectory Prediction
* Graph-context Attention Networks for Size-varied Deep Graph Matching
* Gravitationally Lensed Black Hole Emission Tomography
* GreedyNASv2: Greedier Search with a Greedy Path Filter
* GridShift: A Faster Mode-seeking Algorithm for Image Segmentation and Object Tracking
* Grounded Language-Image Pre-training
* Grounding Answers for Visual Questions Asked by Visually Impaired People
* Group Contextualization for Video Recognition
* Group R-CNN for Weakly Semi-supervised Object Detection with Points
* GroupNet: Multiscale Hypergraph Neural Networks for Trajectory Prediction with Relational Reasoning
* GroupViT: Semantic Segmentation Emerges from Text Supervision
* GuideFormer: Transformers for Image Guided Depth Completion
* H2FA R-CNN: Holistic and Hierarchical Feature Alignment for Cross-domain Weakly Supervised Object Detection
* H4D: Human 4D Modeling by Learning Neural Compositional Representation
* Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale
* HairCLIP: Design Your Hair by Text and Reference Image
* HairMapper: Removing Hair from Portraits Using GANs
* Hallucinated Neural Radiance Fields in the Wild
* HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network
* HARA: A Hierarchical Approach for Robust Rotation Averaging
* Harmony: A Generic Unsupervised Approach for Disentangling Semantic Content from Parameterized Transformations
* HCSC: Hierarchical Contrastive Selective Coding
* HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging
* HDR-NeRF: High Dynamic Range Neural Radiance Fields
* HeadNeRF: A Realtime NeRF-based Parametric Head Model
* HEAT: Holistic Edge Attention Transformer for Structured Reconstruction
* HerosNet: Hyperspectral Explicable Reconstruction and Optimal Sampling Deep Network for Snapshot Compressive Imaging
* Hierarchical Modular Network for Video Captioning
* Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction
* Hierarchical Self-supervised Representation Learning for Movie Understanding
* High Quality Segmentation for Ultra High-Resolution Images
* High-Fidelity GAN Inversion for Image Attribute Editing
* High-Fidelity Human Avatars from a Single RGB Camera
* High-resolution Face Swapping via Latent Semantics Disentanglement
* High-Resolution Image Harmonization via Collaborative Dual Transformations
* High-Resolution Image Synthesis with Latent Diffusion Models
* Highly-efficient Incomplete Largescale Multiview Clustering with Consensus Bipartite Graph
* HINT: Hierarchical Neuron Concept Explainer
* Hire-MLP: Vision MLP via Hierarchical Rearrangement
* HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction
* HL-Net: Heterophily Learning Network for Scene Graph Generation
* HLRTF: Hierarchical Low-Rank Tensor Factorization for Inverse Problems in Multi-Dimensional Imaging
* HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks
* HODOR: High-level Object Descriptors for Object Re-Segmentation in Video Learned from Static Images
* HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction
* Holocurtains: Programming Light Curtains via Binary Holography
* Homography Loss for Monocular 3D Object Detection
* HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation
* How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
* How Good Is Aesthetic Ability of a Fashion Model?
* How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting
* How much does input data type impact final face model accuracy?
* How Much More Data Do I Need? Estimating Requirements for Downstream Tasks
* How Well Do Sparse ImageNet Models Transfer?
* HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network
* HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR
* Human Hands as Probes for Interactive Object Understanding
* Human Instance Matting via Mutual Guidance and Multi-Instance Refinement
* Human Mesh Recovery from Multiple Shots
* Human Trajectory Prediction with Momentary Observation
* Human-Aware Object Placement for Visual Environment Reconstruction
* Human-Object Interaction Detection via Disentangled Transformer
* HumanNeRF: Efficiently Generated Human Radiance Field from Sparse Inputs
* HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video
* HVH: Learning a Hybrid Neural Volumetric Representation for Dynamic Hair Performance Capture
* Hybrid Egocentric Activity Anticipation Framework via Memory-Augmented Recurrent and One-shot Representation Forecasting, A
* Hybrid Quantum-Classical Algorithm for Robust Fitting, A
* Hybrid Relation Guided Set Matching for Few-shot Action Recognition
* HybridCR: Weakly-Supervised 3D Point Cloud Semantic Segmentation via Hybrid Contrastive Regularization
* Hyperbolic Image Segmentation
* Hyperbolic Vision Transformers: Combining Improvements in Metric Learning
* HyperDet3D: Learning a Scene-conditioned 3D Object Detector
* Hypergraph-Induced Semantic Tuplet Loss for Deep Metric Learning
* HyperInverter: Improving StyleGAN Inversion via Hypernetwork
* HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D Medical Image Segmentation using HyperNet
* Hyperspherical Consistency Regularization
* HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
* HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
* I M Avatar: Implicit Morphable Head Avatars from Videos
* ICON: Implicit Clothed humans Obtained from Normals
* Id-Free Person Similarity Learning
* IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment
* Identifying Ambiguous Similarity Conditions via Semantic Matching
* IDR: Self-Supervised Image Denoising via Iterative Data Refinement
* IFOR: Iterative Flow Minimization for Robotic Object Rearrangement
* IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation
* iFS-RCNN: An Incremental Few-shot Instance Segmenter
* Image Animation with Perturbed Masks
* Image Based Reconstruction of Liquids from 2D Surface Detections
* Image Dehazing Transformer with Transmission-Aware 3D Position Embedding
* Image Disentanglement Autoencoder for Steganography without Embedding
* Image Patch is a Wave: Phase-Aware Vision MLP, An
* Image Segmentation Using Text and Image Prompts
* Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data
* ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
* Implicit Feature Decoupling with Depthwise Quantization
* Implicit Motion Handling for Video Camouflaged Object Detection
* Implicit Sample Extension for Unsupervised Person Re-Identification
* Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement, The
* ImplicitAtlas: Learning Deformable Shape Templates in Medical Imaging
* Imposing Consistency for Optical Flow Estimation
* Improving Adversarial Transferability via Neuron Attribution-based Attacks
* Improving Adversarially Robust Few-shot Image Classification with Generalizable Representations
* Improving GAN Equilibrium by Raising Spatial Awareness
* Improving neural implicit surfaces geometry with patch warping
* Improving Robustness Against Stealthy Weight Bit-Flip Attacks by Output Code Matching
* Improving Segmentation of the Inferior Alveolar Nerve through Deep Label Propagation
* Improving Subgraph Recognition with Variational Graph Information Bottleneck
* Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input
* Improving Video Model Transfer with Dynamic Representation Learning
* Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
* Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment
* Incremental Cross-view Mutual Distillation for Self-supervised Medical CT Synthesis
* Incremental Learning in Semantic Segmentation from Image Labels
* Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding
* Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation
* Inertia-Guided Flow Completion and Style Fusion for Video Inpainting
* InfoGCN: Representation Learning for Human Skeleton-based Action Recognition
* InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering
* Infrared Invisible Clothing: Hiding from Infrared Detectors at Multiple Angles in Real World
* Injecting Semantic Concepts into End-to-End Image Captioning
* InOut: Diverse Image Outpainting via GAN Inversion
* Input-level Inductive Biases for 3D Reconstruction
* INS-Conv: Incremental Sparse Convolution for Online 3D Segmentation
* InsetGAN for Full-Body Image Generation
* InstaFormer: Instance-Aware Image-to-Image Translation with Transformer
* Instance Segmentation with Mask-supervised Polygonal Boundary Transformers
* Instance-Aware Dynamic Neural Network Quantization
* Instance-Dependent Label-Noise Learning with Manifold-Regularized Transition Matrix Estimation
* Instance-wise Occlusion and Depth Orders in Natural Scenes
* Integrating Language Guidance into Vision-based Deep Metric Learning
* Integrative Few-Shot Learning for Classification and Segmentation
* IntentVizor: Towards Generic Query Guided Interactive Video Summarization
* Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition
* Interacting Attention Graph for Single Image Two-Hand Reconstruction
* Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations
* Interactive Image Synthesis with Panoptic Layout Generation
* Interactive Multi-Class Tiny-Object Detection
* Interactive Segmentation and Visualization for Tiny Objects in Multi-megapixel Images
* Interactiveness Field in Human-Object Interactions
* Interactron: Embodied Adaptive Object Detection
* Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks
* Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs
* IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
* Invariant Grounding for Video Question Answering
* Investigating the Impact of Multi-LiDAR Placement on Object Detection for Autonomous Driving
* Investigating Top-k White-Box and Transferable Black-box Attack
* Investigating Tradeoffs in Real-World Video Super-Resolution
* iPLAN: Interactive and Procedural Layout Planning
* IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes
* IRON: Inverse Rendering by Optimizing Neural SDFs and Materials from Photometric Images
* Is Mapping Necessary for Realistic PointGoal Navigation?
* ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation
* ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
* ISNet: Shape Matters for Infrared Small Target Detection
* It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection
* It's About Time: Analog Clock Reading in the Wild
* It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher
* It's Time for Artistic Correspondence in Music and Video
* Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects
* Iterative Deep Homography Estimation
* Iterative Quantum Approach for Transformation Estimation from Point Sets, An
* IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo
* Ithaca365: Dataset and Driving Perception under Repeated and Challenging Weather Conditions
* ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
* JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human Reconstruction
* JoinABLe: Learning Bottom-up Assembly of Parametric CAD Joints
* Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification
* Joint Forecasting of Panoptic Segmentations with Difference Attention
* Joint Global and Local Hierarchical Priors for Learned Image Compression
* Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
* Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer
* JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection
* Kernelized Few-shot Object Detection with Efficient Integral Aggregation
* Keypoint Transformer: Solving Joint Identification in Challenging Hands and Object Interactions for Accurate 3D Pose Estimation
* Keypoint-based Global Association Network for Lane Detection, A
* KeyTr: Keypoint Transporter for 3D Reconstruction of Deformable Objects in Videos
* KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning
* Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC
* KNN Local Attention for Image Restoration
* Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability
* Knowledge Distillation via the Target-aware Transformer
* Knowledge Distillation with the Reused Teacher Classifier
* Knowledge distillation: A good teacher is patient and consistent
* Knowledge Mining with Scene Text for Fine-Grained Recognition
* Knowledge-Driven Self-Supervised Representation Learning for Facial Action Unit Recognition
* Kubric: A scalable dataset generator
* L-Verse: Bidirectional Generation Between Image and Text
* L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation
* Label Matching Semi-Supervised Object Detection
* Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
* Label, Verify, Correct: A Simple Few Shot Object Detection Method
* Label-Only Model Inversion Attacks via Boundary Repulsion
* Lagrange Motion Analysis and View Embeddings for Improved Gait Recognition
* LAKe-Net: Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints
* Language as Queries for Referring Video Object Segmentation
* Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
* LAR-SR: A Local Autoregressive Model for Image Super-Resolution
* Large Loss Matters in Weakly Supervised Multi-Label Classification
* Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection, A
* Large-Scale Pre-training for Person Re-identification with Noisy Labels
* Large-scale Video Panoptic Segmentation in the Wild: A Benchmark
* LARGE: Latent-Based Regression through GAN Semantics
* LAS-AT: Adversarial Training with Learnable Attack Strategy
* LASER: LAtent SpacE Rendering for 2D Visual Localization
* LaTr: Layout-Aware Transformer for Scene-Text VQA
* LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
* Layer-wised Model Aggregation for Personalized Federated Learning
* Layered Depth Refinement with Mask Guidance
* LC-FDNet: Learned Lossless Image Compression with Frequency Decomposition Network
* LD-ConGR: A Large RGB-D Video Dataset for Long-Distance Continuous Gesture Recognition
* Learn from Others and Be Yourself in Heterogeneous Federated Learning
* Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
* Learnable Lookup Table for Neural Network Quantization
* Learned Queries for Efficient Local Attention
* Learning 3D Object Shape and Layout without 3D Supervision
* Learning a Structured Latent Space for Unsupervised Point Cloud Completion
* Learning ABCs: Approximate Bijective Correspondence for isolating factors of variation with weak supervision
* Learning Adaptive Warping for Real-World Rolling Shutter Correction
* Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers
* Learning Affordance Grounding from Exocentric Images
* Learning based Multi-modality Image and Video Compression
* Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning
* Learning Canonical F-Correlation Projection for Compact Multiview Representation
* Learning Deep Implicit Functions for 3D Shapes with Dynamic Code Clouds
* Learning Distinctive Margin toward Active Domain Adaptation
* Learning Fair Classifiers with Partially Annotated Group Labels
* Learning from All Vehicles
* Learning from Pixel-Level Noisy Label: A New Perspective for Light Field Saliency Detection
* Learning from Temporal Gradient for Semi-supervised Action Recognition
* Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
* Learning Graph Regularisation for Guided Super-Resolution
* Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
* Learning Invisible Markers for Hidden Codes in Offline-to-online Photography
* Learning Local Displacements for Point Cloud Completion
* Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation
* Learning Memory-Augmented Unidirectional Metrics for Cross-modality Person Re-identification
* Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-Identification
* Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera
* Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation
* Learning Multiple Adverse Weather Removal via Two-stage Knowledge Learning and Multi-contrastive Regularization: Toward a Unified Model
* Learning Multiple Dense Prediction Tasks from Partially Annotated Data
* Learning Neural Light Fields with Ray-Space Embedding
* Learning Non-target Knowledge for Few-shot Semantic Segmentation
* Learning Object Context for Novel-view Scene Layout Generation
* Learning of Global Objective for Network Flow in Multi-Object Tracking
* Learning Optical Flow with Kernel Patch Attention
* Learning Optimal K-space Acquisition and Reconstruction using Physics-Informed Neural Networks
* Learning Part Segmentation through Unsupervised Domain Adaptation from Synthetic Vehicles
* Learning Pixel Trajectories with Multiscale Contrastive Random Walks
* Learning Pixel-Level Distinctions for Video Highlight Detection
* Learning Program Representations for Food Images and Cooking Recipes
* Learning Robust Image-Based Rendering on Sparse Scene Geometry via Depth Completion
* Learning Second Order Local Anomaly for General Face Forgery Detection
* Learning Semantic Associations for Mirror Detection
* Learning Soft Estimator of Keypoint Scale and Orientation with Probabilistic Covariant Loss
* Learning sRGB-to-Raw-RGB De-rendering with Content-Aware Metadata
* Learning Structured Gaussians to Approximate Deep Ensembles
* Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification
* Learning to Align Sequential Actions in the Wild
* Learning to Answer Questions in Dynamic Audio-Visual Scenarios
* Learning to Anticipate Future with Dynamic Context Removal
* Learning to Collaborate in Decentralized Learning of Personalized Models
* Learning to Deblur using Light Field Generated and Real Defocus Images
* Learning to Detect Mobile Objects from LiDAR Scans Without Labels
* Learning to Detect Scene Landmarks for Camera Localization
* Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes
* Learning to Find Good Models in RANSAC
* Learning to generate line drawings that convey geometry and semantics
* Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data
* Learning to Learn across Diverse Data Biases in Deep Face Recognition
* Learning to Learn and Remember Super Long Multi-Domain Task Sequence
* Learning to Learn by Jointly Optimizing Neural Architecture and Weights
* Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion
* Learning to Memorize Feature Hallucination for One-Shot Image Generation
* Learning to Prompt for Continual Learning
* Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
* Learning To Recognize Procedural Activities with Distant Supervision
* Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization
* Learning to Restore 3D Face from In-the-Wild Degraded Images
* Learning to Solve Hard Minimal Problems
* Learning to Zoom Inside Camera Imaging Pipeline
* Learning Trajectory-Aware Transformer for Video Super-Resolution
* Learning Transferable Human-Object Interaction Detector with Natural Language Supervision
* Learning Video Representations of Human Motion from Synthetic Data
* Learning What Not to Segment: A New Perspective on Few-Shot Segmentation
* Learning Where to Learn in Cross-View Self-Supervised Learning
* Learning with Neighbor Consistency for Noisy Labels
* Learning with Twin Noisy Labels for Visible-Infrared Person Re-Identification
* Lepard: Learning partial point cloud matching in rigid and deformable scenes
* Less is More: Generating Grounded Navigation Instructions from Landmarks
* Leveling Down in Computer Vision: Pareto Inefficiencies in Fair Deep Classifiers
* Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy
* Leveraging Adversarial Examples to Quantify Membership Information Leakage
* Leveraging Equivariant Features for Absolute Pose Regression
* Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection
* Leveraging Self-Supervision for Cross-Domain Crowd Counting
* LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
* LiDAR Snowfall Simulation for Robust 3D Object Detection
* LiDARCap: Long-range Markerless 3D Human Motion Capture with LiDAR Point Clouds
* Lifelong Graph Learning
* Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation
* LIFT: Learning 4D LiDAR Image Fusion Transformer for 3D Object Detection
* Light Field Neural Rendering
* Likert Scoring with Grade Decoupling for Long-term Action Assessment
* LISA: Learning Implicit Shape and Appearance of Hands
* LiT: Zero-Shot Transfer with Locked-Image text Tuning
* Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
* Lite Vision Transformer with Enhanced Self-Attention
* Lite-MDETR: A Lightweight Multi-Modal Detector
* LMGP: Lifted Multicut Meets Geometry Projections for Multi-Camera Multi-Object Tracking
* Local Attention Pyramid for Scene Image Generation
* Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning
* Local Texture Estimator for Implicit Representation Function
* Local-Adaptive Face Recognition via Graph-based Meta-Clustering and Regularized Adaptation
* Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning
* Localization Distillation for Dense Object Detection
* Localized Adversarial Domain Generalization
* Location-Free Human Pose Estimation
* LOLNeRF: Learn from One Look
* Long-Short Temporal Contrastive Learning of Video Transformers
* Long-tail Recognition via Compositional Knowledge Transfer
* Long-Tailed Recognition via Weight Balancing
* Long-tailed Visual Recognition via Gaussian Clouded Logit Adjustment
* Long-term Video Frame Interpolation via Feature Propagation
* Long-term Visual Map Sparsification with Heterogeneous GNN
* Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling
* Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator
* Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
* Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image
* Low-cost & Realtime Motion Capture System, A
* Low-Resource Adaptation for Personalized Co-Speech Gesture Generation
* LSVC: A Learning-based Stereo Video Compression Framework
* LTP: Lane-based Trajectory Prediction for Autonomous Driving
* M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction
* M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers
* M3T: three-dimensional Medical image classifier using Multi-plane and Multi-slice Transformer
* M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
* MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
* Maintaining Reasoning Consistency in Compositional Visual Question Answering
* Majority Can Help the Minority: Context-rich Minority Oversampling for Long-tailed Classification, The
* Make It Move: Controllable Image-to-Video Generation with Text Descriptions
* Manifold Learning Benefits GANs
* ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation
* Many-to-many Splatting for Efficient Video Frame Interpolation
* Marginal Contrastive Correspondence for Guided Image Generation
* Mask Transfiner for High-Quality Instance Segmentation
* Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction
* Masked Autoencoders Are Scalable Vision Learners
* Masked Feature Prediction for Self-Supervised Visual Pre-Training
* Masked-attention Mask Transformer for Universal Image Segmentation
* MaskGIT: Masked Generative Image Transformer
* Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network
* MAT: Mask-Aware Transformer for Large Hole Image Inpainting
* Matching Feature Sets for Few-Shot Image Classification
* MatteFormer: Transformer-Based Image Matting via Prior-Tokens
* MAXIM: Multi-Axis MLP for Image Processing
* Maximum Consensus by Weighted Influences of Monotone Boolean Functions
* Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation
* MDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis
* Measuring Compositional Consistency for Video Question Answering
* Medial Spectral Coordinates for 3D Shape Analysis
* Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs
* Memory-augmented Deep Conditional Unfolding Network for Pansharpening
* Memory-Augmented Non-Local Attention for Video Super-Resolution
* MeMOT: Multi-Object Tracking with Memory
* MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
* MERLOT RESERVE: Neural Script Knowledge through Vision and Language and Sound
* Merry Go Round: Rotate a Frame and Fool a DNN
* Meta Agent Teaming Active Learning for Pose Estimation
* Meta Convolutional Neural Networks for Single Domain Generalization
* Meta Distribution Alignment for Generalizable Person Re-Identification
* Meta-attention for ViT-backed Continual Learning
* MetaFormer is Actually What You Need for Vision
* MetaFSCIL: A Meta-Learning Approach for Few-Shot Class Incremental Learning
* MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision
* MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation
* MIL-Derived Transformer for Weakly Supervised Point Cloud Segmentation, An
* Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning
* Mining Multi-View Information: A Strong Self-Supervised Framework for Depth-based 3D Hand Pose and Mesh Estimation
* MiniViT: Compressing Vision Transformers with Weight Multiplexing
* Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
* MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting
* Mix and Localize: Localizing Sound Sources in Mixtures
* Mixed Differential Privacy in Computer Vision
* MixFormer: End-to-End Tracking with Iterative Mixed Attention
* MixFormer: Mixing Features across Windows and Dimensions
* MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video
* MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing
* MLSLT: Towards Multilingual Sign Language Translation
* MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation
* MNSRNet: Multimodal Transformer Network for 3D Surface Super-Resolution
* Mobile-Former: Bridging MobileNet and Transformer
* MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image
* Modality-Agnostic Learning for Radar-Lidar Fusion in Vehicle Detection
* Modeling 3D Layout For Group Re-Identification
* Modeling Image Composition for Complex Scene Generation
* Modeling Indirect Illumination for Inverse Rendering
* Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
* Modeling sRGB Camera Noise with Normalizing Flows
* Modular Action Concept Grounding in Semantic Video Prediction
* Modulated Contrast for Versatile Image Synthesis
* MogFace: Towards a Deeper Appreciation on Face Detection
* MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer
* MonoGround: Detecting Monocular 3D Objects from the Ground
* MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection
* MonoScene: Monocular 3D Semantic Scene Completion
* More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech
* Motion-Adjustable Neural Implicit Video Representation
* Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
* Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos
* Motion-modulated Temporal Fragment Alignment Network For Few-Shot Action Recognition
* MotionAug: Augmentation with Physical Correction for Human Motion Prediction
* Motron: Multimodal Probabilistic Human Motion Forecasting
* Moving Window Regression: A Novel Approach to Ordinal Regression
* MPC: Multi-view Probabilistic Clustering
* MPViT: Multi-Path Vision Transformer for Dense Prediction
* Mr.BiQ: Post-Training Non-Uniform Quantization based on Minimizing the Reconstruction Error
* MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
* MS2DG-Net: Progressive Correspondence Learning via Multiple Sparse Semantics Dynamic Graph
* MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning
* MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
* MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection
* MulT: An End-to-End Multitask Learning Transformer
* MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
* Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
* Multi-Dimensional, Nuanced and Subjective - Measuring the Perception of Facial Expressions
* Multi-Frame Self-Supervised Depth with Transformers
* Multi-grained Spatio-Temporal Features Perceived Network for Event-based Lip-Reading
* Multi-Granularity Alignment Domain Adaptation for Object Detection
* Multi-instance Point Cloud Registration by Efficient Correspondence Clustering
* Multi-label Classification with Partial Annotations using Class-aware Selective Loss
* Multi-label Iterated Learning for Image Classification with Label Ambiguity
* Multi-level Feature Learning for Contrastive Multi-view Clustering
* Multi-Level Representation Learning with Semantic Alignment for Referring Video Object Segmentation
* Multi-marginal Contrastive Learning for Multilabel Subcellular Protein Localization
* Multi-modal Alignment using Representation Codebook
* Multi-Modal Dynamic Graph Transformer for Visual Grounding
* Multi-modal Extreme Classification
* Multi-Object Tracking Meets Moving UAV
* Multi-Objective Diverse Human Motion Prediction with Knowledge Distillation
* Multi-Person Extreme Motion Prediction
* Multi-Robot Active Mapping via Neural Bipartite Graph Matching
* Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
* Multi-Scale Memory-Based Video Deblurring
* Multi-Source Uncertainty Mining for Deep Unsupervised Saliency Detection
* Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
* Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry
* Multi-View Mesh Reconstruction with Neural Deferred Shading
* Multi-View Transformer for 3D Visual Grounding
* Multidimensional Belief Quantification for Label-Efficient Meta-Learning
* Multimodal Colored Point Cloud to Image Alignment
* Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification
* Multimodal Material Segmentation
* Multimodal Token Fusion for Vision Transformers
* Multiview Transformers for Video Recognition
* MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection
* MUSE-VAE: Multi-Scale VAE for Environment-Aware Long Term Trajectory Prediction
* Mutual Information-driven Pan-sharpening
* Mutual Quantization for Cross-Modal Search with Noisy Labels
* MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
* MVS2D: Efficient Multiview Stereo via Attention-Driven 2D Convolutions
* NAN: Noise-Aware NeRFs for Burst-Denoising
* Negative-Aware Attention Framework for Image-Text Matching
* NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images
* NeRF-Editing: Geometry Editing of Neural Radiance Fields
* NeRFReN: Neural Radiance Fields with Reflections
* NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction
* Nested Collaborative Learning for Long-Tailed Visual Recognition
* Nested Hyperbolic Spaces for Dimensionality Reduction and Hyperbolic NN Design
* Neural 3D Scene Reconstruction with the Manhattan-world Assumption
* Neural 3D Video Synthesis from Multi-view Video
* Neural Architecture Search with Representation Mutual Information
* Neural Collaborative Graph Machines for Table Structure Recognition
* Neural Compression-Based Feature Learning for Video Restoration
* Neural Convolutional Surfaces
* Neural Data-Dependent Transform for Learned Image Compression
* Neural Emotion Director: Speech-preserving semantic control of facial expressions in in-the-wild videos
* Neural Face Identification in a 2D Wireframe Projection of a Manifold Object
* Neural Fields as Learnable Kernels for 3D Reconstruction
* Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature
* Neural Head Avatars from Monocular RGB Videos
* Neural Inertial Localization
* Neural Mean Discrepancy for Efficient Out-of-Distribution Detection
* Neural Mesh Simplification
* Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture
* Neural Point Light Fields
* Neural Points: Point Cloud Representation with Neural Fields for Arbitrary Upsampling
* Neural Prior for Trajectory Estimation
* Neural Rays for Occlusion-aware Image-based Rendering
* Neural Recognition of Dashed Curves with Gestalt Law of Continuity
* Neural Reflectance for Shape Recovery with Shadow Handling
* Neural RGB-D Surface Reconstruction
* Neural Shape Mating: Self-Supervised Object Assembly with Adversarial Shape Priors
* Neural Template: Topology-aware Reconstruction and Disentangled Generation of 3D Meshes
* Neural Texture Extraction and Distribution for Controllable Person Image Synthesis
* Neural Volumetric Object Selection
* Neural Window Fully-connected CRFs for Monocular Depth Estimation
* NeuralHDHair: Automatic High-fidelity Hair Modeling from a Single Image Using Implicit Neural Representations
* NeuralHOFusion: Neural Volumetric Rendering under Human-object Interactions
* Neurally-Guided Shape Parser: Grammar-based Labeling of 3D Shape Regions with Approximate Inference, The
* NeurMiPs: Neural Mixture of Planar Experts for View Synthesis
* NFormer: Robust Person Re-identification with Neighbor Transformer
* NICE-SLAM: Neural Implicit Scalable Encoding for SLAM
* NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models
* NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night
* NinjaDesc: Content-Concealing Visual Descriptors via Adversarial Learning
* NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks
* No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces
* No-Reference Point Cloud Quality Assessment via Domain Adaptation
* NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge
* Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization
* Node-aligned Graph Convolutional Network for Whole-slide Image Representation and Classification
* NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration
* Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Distribution and Score Matching
* Noise Is Also Useful: Negative Correlation-Steered Latent Contrastive Learning
* Noise2NoiseFlow: Realistic Camera Noise Modeling without Clean Images
* Noisy Boundaries: Lemon or Lemonade for Semi-supervised Instance Segmentation?
* NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition
* Non-generative Generalized Zero-shot Learning via Task-correlated Disentanglement and Controllable Samples Synthesis
* Non-isotropy Regularization for Proxy-based Deep Metric Learning
* Non-Iterative Recovery from Nonlinear Observations using Generative Models
* Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo
* Non-Probability Sampling Network for Stochastic Human Trajectory Prediction
* Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation
* Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization, The
* Not All Labels Are Equal: Rationalizing The Labeling Costs for Training Object Detection
* Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds
* Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation
* Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
* Not Just Selection, but Exploration: Online Class-Incremental Continual Learning via Dual View Consistency
* Notice of Retraction: E2V-SDE: From Asynchronous Events to Fast and Continuous Video Reconstruction via Neural Stochastic Differential Equations
* Novel Class Discovery in Semantic Segmentation
* NPBG++: Accelerating Neural Point-Based Graphics
* OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction
* Object Localization under Single Coarse Point Supervision
* Object-aware Video-language Pre-training for Retrieval
* Object-Region Video Transformers
* Object-Relation Reasoning Graph for Action Recognition
* ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
* ObjectFormer for Image Manipulation Detection and Localization
* OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data
* Occluded Human Mesh Recovery
* Occlusion-Aware Cost Constructor for Light Field Depth Estimation
* Occlusion-robust Face Alignment using A Viewpoint-invariant Hierarchical Network Architecture
* OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction
* OCSampler: Compressing Videos to One Clip with Single-step Sampling
* Omni-DETR: Omni-Supervised Object Detection with Transformers
* OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion
* Omnivore: A Single Model for Many Visual Modalities
* On Adversarial Robustness of Trajectory Prediction for Autonomous Vehicles
* On Aliased Resizing and Surprising Subtleties in GAN Evaluation
* On Generalizing Beyond Domains in Cross-Domain Continual Learning
* On Guiding Visual Attention with Language Specification
* On Learning Contrastive Representations for Learning with Noisy Labels
* On the Importance of Asymmetry for Siamese Representation Learning
* On the Instability of Relative Pose Estimation and RANSAC's Role
* On the Integration of Self-Attention and Convolution
* On the Road to Online Adaptation for Semantic Image Segmentation
* ONCE-3DLanes: Building Monocular 3D Lane Detection
* One Loss for Quantization: Deep Hashing with Discrete Wasserstein Distributional Matching
* One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones
* One-bit Active Query with Contrastive Pairs
* OnePose: One-Shot Object Pose Estimation without CAD Models
* Online Continual Learning on a Contaminated Data Stream with Blurry Task Boundaries
* Online Convolutional Reparameterization
* Online Learning of Reusable Abstract Models for Object Goal Navigation
* OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization
* Open Challenges in Deep Stereo: the Booster Dataset
* Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources
* Open-Set Text Recognition via Character-Context Decoupling
* Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling
* Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation
* Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity
* Opening up Open World Tracking
* OpenTAL: Towards Open Set Temporal Action Localization
* Optical Flow Estimation for Spiking Camera
* Optimal Correction Cost for Object Detection Evaluation
* Optimal LED Spectral Multiplexing for NIR2RGB Translation
* Optimizing Elimination Templates by Greedy Parameter Search
* Optimizing Video Prediction via Video Frame Interpolation
* Oriented RepPoints for Aerial Object Detection
* OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks
* OSKDet: Orientation-sensitive Keypoint Localization for Rotated Object Detection
* OSOP: A Multi-Stage One Shot Object Pose Estimation Framework
* OSSGAN: Open-Set Semi-Supervised Image Generation
* OSSO: Obtaining Skeletal Shape from Outside
* Out-of-distribution Generalization with Causal Invariant Transformations
* OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation
* Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation
* OW-DETR: Open-world Detection Transformer
* P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior
* P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision
* Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
* Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers
* Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance Panoptic Segmentation
* Panoptic-PHNet: Towards Real-Time and High-Precision LiDAR Panoptic Segmentation via Clustering Pseudo Heatmap
* PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation
* Parameter-free Online Test-time Adaptation
* Parametric Scattering Networks
* Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention
* Part-based Pseudo Label Refinement for Unsupervised Person Re-identification
* PartGlot: Learning Shape Part Segmentation from Language Reference Games
* Partial Class Activation Attention for Semantic Segmentation
* Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
* Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
* Patch Slimming for Efficient Vision Transformers
* Patch-level Representation Learning for Self-supervised Vision Transformers
* PatchFormer: An Efficient Point Transformer with Patch Attention
* PatchNet: A Simple Face Anti-Spoofing Framework via Fine-Grained Patch Recognition
* PCA-Based Knowledge Distillation Towards Lightweight and Content-Style Balanced Photorealistic Style Transfer Models
* PCL: Proxy-based Contrastive Learning for Domain Generalization
* Pedestrian next to the Lamppost Adaptive Object Graphs for Better Instantaneous Mapping, The
* Per-Clip Video Object Segmentation
* Perception Prioritized Training of Diffusion Models
* Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search
* Personalized Image Aesthetics Assessment with Rich Attributes
* Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation
* phi-SfT: Shape-from-Template with a Physics-Based Deformation Model
* PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects
* Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing
* PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes
* PhyIR: Physics-based Inverse Rendering for Panoramic Indoor Images
* PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
* Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors
* Physical Simulation Layer for Accurate 3D Modeling
* Physically Disentangled Intra- and Inter-domain Adaptation for Varicolored Haze Removal
* Physically-guided Disentangled Implicit Rendering for 3D Face Modeling
* PIE-Net: Photometric Invariant Edge Guided Network for Intrinsic Image Decomposition
* PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework
* Pin the Memory: Learning to Generalize Semantic Segmentation
* PINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence
* Pix2NeRF: Unsupervised Conditional pi-GAN for Single Image to Neural Radiance Fields Translation
* Pixel screening based intermediate correction for blind deblurring
* PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
* PLAD: Learning to Infer Shape Programs with Pseudo-Labels and Approximate Distributions
* PlanarRecon: Realtime 3D Plane Detection and Reconstruction from Posed Monocular Videos
* PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo
* Playable Environments: Video Manipulation in Space and Time
* Plenoxels: Radiance Fields without Neural Networks
* PNP: Robust Learning from Noisy Labels by Probabilistic Noise Prediction
* POCO: Point Convolution for Surface Reconstruction
* Point Cloud Color Constancy
* Point Cloud Pre-training with Natural 3D Structures
* Point Density-Aware Voxels for LiDAR 3D Object Detection
* Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
* Point-Level Region Contrast for Object Detection Pre-Training
* Point-NeRF: Point-based Neural Radiance Fields
* Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
* Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders
* Point2Seq: Detecting 3D Objects as Sequences
* PointCLIP: Point Cloud Understanding by CLIP
* Pointly-Supervised Instance Segmentation
* PokeBNN: A Binary Pursuit of Lightweight Accuracy
* Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values
* Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps
* PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images
* PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning
* Pooling Revisited: Your Receptive Field is Suboptimal
* Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian
* Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic Data
* PoseKernelLifter: Metric Lifting of 3D Human Pose using Sound
* PoseTrack21: A Dataset for Person Search, Multi-Object Tracking and Multi-Person Pose Tracking
* PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision
* PPDL: Predicate Probability Distribution based Loss for Unbiased Scene Graph Generation
* Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack
* Practical Learned Lossless JPEG Recompression with Multi-Level Cross-Channel Entropy Model in the DCT Domain
* Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation
* Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model
* Pretrain, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
* Primitive3D: 3D Object Dataset Synthesis from Randomly Assembled Primitives
* Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy, The
* Privacy Preserving Partial Localization
* Privacy-preserving Online AutoML for Domain-Specific Face Detection
* Proactive Image Manipulation Detection
* Probabilistic Graphical Model Based on Neural-symbolic Reasoning for Visual Relationship Detection, A
* Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions, The
* Probabilistic Representations for Video Contrastive Learning
* Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences
* Probing Representation Forgetting in Supervised and Unsupervised Continual Learning
* Programmatic Concept Learning for Human Motion Description and Synthesis
* Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
* Progressive End-to-End Object Detection in Crowded Scenes
* Progressive Minimal Path Method with Embedded CNN
* Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction
* Projective Manifold Gradient Layer for Deep Rotation Regression
* Prompt Distribution Learning
* Propagation Regularizer for Semi-supervised Learning with Extremely Scarce Labeled Samples
* Proper Reuse of Image Classification Features Improves Object Detection
* Proposal-based Paradigm for Self-supervised Sound Source Localization in Videos, A
* ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via Exploiting CLIP Cues
* Protecting Celebrities from DeepFake with Identity Consistency Transformer
* Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer
* Proto2Proto: Can you recognize the car, the way I do?
* Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
* Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving
* PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation
* PSTR: End-to-End One-Step Person Search With Transformers
* PTTR: Relational 3D Point Cloud Object Tracking with Transformer
* PubTables-1M: Towards comprehensive table extraction from unstructured documents
* PUMP: Pyramidal and Uniqueness Matching Priors for Unsupervised Learning of Local Descriptors
* Pushing the Envelope of Gradient Boosting Forests via Globally-Optimized Oblique Trees
* Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference
* Pushing the Performance Limit of Scene Text Recognizer without Human Annotation
* Putting People in their Place: Monocular Regression of 3D People in Depth
* PyMiceTracking: An Open-Source Toolbox For Real-Time Behavioral Neuroscience Experiments
* Pyramid Adversarial Training Improves ViT Performance
* Pyramid Architecture for Multi-Scale Processing in Point Cloud Segmentation
* Pyramid Grafting Network for One-Stage High Resolution Saliency Detection
* QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation
* Quantifying Societal Bias Amplification in Image Captioning
* Quantization-aware Deep Optics for Diffractive Snapshot Hyperspectral Imaging
* Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free
* Query and Attention Augmentation for Knowledge-Based Explainable Reasoning
* QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection
* R(Det)2: Randomized Decision Routing for Object Detection
* RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising
* RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging
* RAMA: A Rapid Multicut Algorithm on GPU
* Ranking Distance Calibration for Cross-Domain Few-Shot Learning
* Ranking-Based Siamese Visual Tracking
* Raw High-Definition Radar for Multi-Task Learning
* Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation
* Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization
* RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo
* RBGNet: Ray-based Grouping for 3D Object Detection
* RCL: Recurrent Continuous Localization for Temporal Action Detection
* RCP: Recurrent Closest Point for Point Cloud
* Re-Balancing Strategy for Class-Imbalanced Classification Based on Instance Difficulty, A
* Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
* Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders
* Real-time Object Detection for Streaming Perception
* Real-Time, Accurate, and Consistent Video Semantic Segmentation via Unsupervised Adaptation and Cross-Unit Deployment on Mobile Device
* Recall@k Surrogate Loss with Large Batches and Similarity Mixup
* RecDis-SNN: Rectifying Membrane Potential Distribution for Directly Training Spiking Neural Networks
* Reconstructing Surfaces for Sparse Point Clouds with On-Surface Priors
* Recurrent Dynamic Embedding for Video Object Segmentation
* Recurrent Glimpse-based Decoder for Detection with Transformer
* Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction
* Recurring the Transformer for Video Action Recognition
* Reduce Information Loss in Transformers for Pluralistic Image Inpainting
* Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields
* Reference-Based Video Super-Resolution Using Multi-Camera Video Triplets
* Reflash Dropout in Image Super-Resolution
* Reflection and Rotation Symmetry Detection via Equivariant Learning
* Region-Aware Face Swapping
* Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation
* RegionCLIP: Region-based Language-Image Pretraining
* Registering Explicit to Implicit: Towards High-Fidelity Garment Mesh Reconstruction from Single Images
* RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs
* REGTR: End-to-end Point Cloud Correspondences with Transformers
* Reinforced Structured State-Evolution for Vision-Language Navigation
* Relative Pose from a Calibrated and an Uncalibrated Smartphone Image
* Relieving Long-tailed Instance Segmentation via Pairwise Class Balance
* RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition
* Remember Intentions: Retrospective-Memory-based Trajectory Prediction
* RendNet: Unified 2D/3D Recognizer with Latent Space Rendering
* RePaint: Inpainting using Denoising Diffusion Probabilistic Models
* Replacing Labeled Real-image Datasets with Auto-generated Contours
* RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
* RepNet: Efficient On-Device Learning via Feature Reprogramming
* Represent, Compare, and Learn: A Similarity-Aware Framework for Class-Agnostic Counting
* Representation Compensation Networks for Continual Semantic Segmentation
* Representing 3D Shapes with Probabilistic Directed Distance Fields
* ResSFL: A Resistance Transfer Framework for Defending Model Inversion Attack in Split Federated Learning
* RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs
* Restormer: Efficient Transformer for High-Resolution Image Restoration
* ReSTR: Convolution-free Referring Image Segmentation Using Transformers
* Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
* Rethinking Bayesian Deep Learning Methods for Semi-Supervised Volumetric Medical Image Segmentation
* Rethinking Controllable Variational Autoencoders
* Rethinking Deep Face Restoration
* Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation
* Rethinking Efficient Lane Detection via Curve Modeling
* Rethinking Image Cropping: Exploring Diverse Compositions from Global Views
* Rethinking Minimal Sufficient Representation in Contrastive Learning
* Rethinking Reconstruction Autoencoder-Based Out-of-Distribution Detection
* Rethinking Semantic Segmentation: A Prototype View
* Rethinking Spatial Invariance of Convolutional Networks for Object Counting
* Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views
* Rethinking Visual Geo-localization for Large-Scale Applications
* Retrieval Augmented Classification for Long-Tail Visual Recognition
* Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis
* Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
* Revealing Occlusions with 4D Neural Fields
* Reversible Vision Transformers
* Revisiting AP Loss for Dense Object Detection: Adaptive Ranking Pair Selection
* Revisiting Document Image Dewarping by Grid Regularization
* Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective
* Revisiting Learnable Affines for Batch Norm in Few-Shot Transfer Learning
* Revisiting Near/Remote Sensing with Geospatial Attention
* Revisiting Random Channel Pruning for Neural Network Compression
* Revisiting Skeleton-based Action Recognition
* Revisiting Temporal Alignment for Video Restoration
* Revisiting the Transferability of Supervised Pretraining: An MLP Perspective
* Revisiting Weakly Supervised Pre-Training of Visual Perception Models
* REX: Reasoning-aware and Grounded Explanation
* RFNet: Unsupervised Network for Mutually Reinforcing Multi-modal Image Registration and Fusion
* RGB-Depth Fusion GAN for Indoor Depth Completion
* RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation
* RIDDLE: Lidar Data Compression with Range Image Deep Delta Encoding
* RigidFlow: Self-Supervised Scene Flow Learning on Point Clouds by Local Rigidity Prior
* RigNeRF: Fully Controllable Neural 3D Portraits
* RIM-Net: Recursive Implicit Fields for Unsupervised Learning of Hierarchical Shape Structures
* RIO: Rotation-equivariance supervised learning of robust inertial odometry
* RM-Depth: Unsupervised Learning of Recurrent Monocular Depth in Dynamic Scenes
* RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization
* Robust and Accurate Superquadric Recovery: a Probabilistic Approach
* Robust Combination of Distributed Gradients Under Adversarial Perturbations
* Robust Contrastive Learning against Noisy Views
* Robust Cross-Modal Representation Learning with Progressive Self-Distillation
* Robust Egocentric Photo-realistic Facial Expression Transfer for Virtual Reality
* Robust Equivariant Imaging: A fully unsupervised framework for learning to image from noisy and partial measurements
* Robust Federated Learning with Noisy and Heterogeneous Clients
* Robust fine-tuning of zero-shot models
* Robust Image Forgery Detection over Online Social Network Shared Images
* Robust Invertible Image Steganography
* Robust Optimization as Data Augmentation for Large-scale Graphs
* Robust outlier detection by de-biasing VAE likelihoods
* Robust Region Feature Synthesizer for Zero-Shot Object Detection
* Robust Structured Declarative Classifiers for 3D Point Clouds: Defending Adversarial Attacks with Implicit Gradients
* ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
* Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task
* Rotationally Equivariant 3D Object Detection
* RSCFed: Random Sampling Consensus Federated Semi-supervised Learning
* RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution
* RU-Net: Regularized Unrolling Network for Scene Graph Generation
* Safe Self-Refinement for Transformer-based Domain Adaptation
* Safe-Student for Safe Deep Semi-Supervised Learning with Unseen-Class Unlabeled Data
* Salient-to-Broad Transition for Video Person Re-identification
* Salvage of Supervision in Weakly Supervised Object Detection
* sampling-based approach for efficient clustering in large datasets, A
* SAR-Net: Shape Alignment and Recovery Network for Category-level 6D Object Pose and Size Estimation
* SASIC: Stereo Image Compression with Latent Shifts and Stereo Attention
* SC2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration
* Scalable Combinatorial Solver for Elastic Geometrically Consistent 3D Shape Matching, A
* Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels
* Scale-Equivalent Distillation for Semi-Supervised Object Detection
* ScaleNet: A Shallow Architecture for Scale Estimation
* Scaling Up Vision-Language Pretraining for Image Captioning
* Scaling Up Your Kernels to 31X31: Revisiting Large Kernel Design in CNNs
* Scaling Vision Transformers
* Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
* Scanline Homographies for Rolling-Shutter Plane Absolute Pose
* ScanQA: 3D Question Answering for Spatial Scene Understanding
* Scene Consistency Representation Learning for Video Scene Segmentation
* Scene Graph Expansion for Semantics-Guided Image Outpainting
* Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations
* SceneSqueezer: Learning to Compress Scene for Camera Relocalization
* SCENIC: A JAX Library for Computer Vision Research and Beyond
* ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning
* Scribble-Supervised LiDAR Semantic Segmentation
* SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization
* Searching the Deployable Convolution Neural Networks for GPUs
* SEEG: Semantic Energized Co-speech Gesture Generation
* SeeThroughNet: Resurrection of Auxiliary Loss by Preserving Class Probability Information
* Segment and Complete: Defending Object Detectors against Adversarial Patch Attacks with Robust Patch Detection
* Segment, Magnify and Reiterate: Detecting Camouflaged Objects the Hard Way
* Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation
* Selective-Supervised Contrastive Learning with Noisy Labels
* Self-augmented Unpaired Image Dehazing via Density and Depth Decomposition
* Self-Distillation from the Last Mini-Batch for Consistency Regularization
* Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation
* Self-Supervised Bulk Motion Artifact Removal in Optical Coherence Tomography Angiography
* Self-supervised Correlation Mining Network for Person Image Generation
* Self-supervised Deep Image Restoration via Adaptive Stochastic Gradient Langevin Dynamics
* Self-Supervised Dense Consistency Regularization for Image-to-Image Translation
* Self-Supervised Descriptor for Image Copy Detection, A
* Self-Supervised Equivariant Learning for Oriented Keypoint Detection
* Self-Supervised Global-Local Structure Modeling for Point Cloud Domain Adaptation with Reliable Voted Pseudo Labels
* Self-Supervised Image Representation Learning with Geometric Set Consistency
* Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation
* Self-Supervised Keypoint Discovery in Behavioral Videos
* Self-supervised Learning of Adversarial Example: Towards Good Generalizations for Deepfake Detection
* Self-Supervised Learning of Object Parts for Semantic Segmentation
* Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks
* Self-Supervised Models are Continual Learners
* Self-supervised Neural Articulated Shape and Appearance Models
* Self-supervised object detection from audio-visual correspondence
* Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis
* Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection
* Self-supervised Spatial Reasoning on Multi-View Line Drawings
* Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites
* Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut
* Self-supervised Video Transformer
* Self-Sustaining Representation Expansion for Non-Exemplar Class-Incremental Learning
* Self-Taught Metric Learning without Labels
* SelfD: Self-Learning Large-Scale Driving Policies From the Web
* SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video
* SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
* Semantic Segmentation by Early Region Proxy
* Semantic-aligned Fusion Transformer for One-shot Object Detection
* Semantic-Aware Auto-Encoders for Self-supervised Representation Learning
* Semantic-Aware Domain Generalized Segmentation
* Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis
* SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing
* Semi-Supervised Few-shot Learning via Multi-Factor Clustering
* Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels
* Semi-Supervised Object Detection via Multi-instance Alignment with Global Class Prototypes
* Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels
* Semi-supervised Semantic Segmentation with Error Localization Network
* Semi-supervised Video Paragraph Grounding with Contrastive Encoder
* Semi-Supervised Video Semantic Segmentation with Inter-Frame Feature Reconstruction
* Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer
* Semi-Weakly-Supervised Learning of Complex Actions from Instructional Task Videos
* Semiconductor Defect Detection by Hybrid Classical-Quantum Deep Learning
* Sequential Voting with Relational Box Fields for Active Object Detection
* Set-Supervised Action Learning in Procedural Task Videos via Pairwise Order Consistency
* SGTR: End-to-end Scene Graph Generation with Transformer
* Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon
* Shape from Polarization for Complex Scenes in the Wild
* Shape from Thermal Radiation: Passive Ranging Using Multi-spectral LWIR Measurements
* Shape-invariant 3D Adversarial Point Clouds
* ShapeFormer: Transformer-based Shape Completion via Sparse Representation
* Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search
* SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation
* SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation
* Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding
* Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
* Show, Deconfound and Tell: Image Captioning with Causal Inference
* Shunted Self-Attention via Multi-Scale Token Aggregation
* Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning
* SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection
* Sign Language Video Retrieval with Free-Form Textual Queries
* Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production
* SimVQA: Exploring Simulated Environments for Visual Question Answering
* SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization
* SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision Tasks
* SimMatch: Semi-supervised Learning with Similarity Matching
* SimMIM: a Simple Framework for Masked Image Modeling
* Simple but Effective: CLIP Embeddings for Embodied AI
* Simple Data Mixing Prior for Improving Self-Supervised Learning, A
* Simple Episodic Linear Probe Improves Visual Recognition in the Wild, A
* Simple Multi-dataset Detection
* Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation, A
* SimT: Handling Open-set Noise for Domain Adaptive Semantic Segmentation
* Simulated Adversarial Testing of Face Recognition Models
* SimVP: Simpler yet Better Video Prediction
* Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation
* Single-Photon Structured Light
* Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data
* Single-Stage is Enough: Multi-Person Absolute 3D Pose Estimation
* SIOD: Single Instance Annotated Per Category Per Image for Object Detection
* Sketch3T: Test-Time Training for Zero-Shot SBIR
* SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches
* Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval
* SkinningNet: Two-Stream Graph Convolutional Neural Network for Skinning Prediction of Synthetic Characters
* SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos
* Slimmable Domain Adaptation
* Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation
* Smartadapt: Multi-branch Object Detection Framework for Videos on Mobiles
* SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis
* Smooth Maximum Unit: Smooth Activation Function for Deep Networks using Smoothing Maximum Technique
* Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness
* SMPL-A: Modeling Person-Specific Deformable Anatomy
* SNR-Aware Low-light Image Enhancement
* SNUG: Self-Supervised Neural Dynamic Garments
* SoftCollage: A Differentiable Probabilistic Tree Generator for Image Collage
* SoftGroup for 3D Instance Segmentation on Point Clouds
* SOMSI: Spherical Novel View Synthesis with Soft Occlusion Multi-Sphere Images
* Sound and Visual Representation Learning with Multiple Pretraining Tasks
* Sound-Guided Semantic Image Manipulation
* Source-Free Domain Adaptation via Distribution Estimation
* Source-Free Object Detection by Learning to Overlook Domain Style
* SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Color Editing
* SPAct: Self-supervised Privacy Preservation for Action Recognition
* SPAMs: Structured Implicit Parametric Models
* Sparse and Complete Latent Organization for Geospatial Semantic Segmentation
* Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion
* Sparse Instance Activation for Real-Time Instance Segmentation
* Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning
* Sparse Non-local CRF
* Sparse Object-level Supervision for Instance Segmentation with Pixel Embeddings
* Sparse to Dense Dynamic 3D Facial Expression Generation
* Spatial Commonsense Graph for Object Localisation in Partial Scenes
* Spatial-Temporal Parallel Transformer for Arm-Hand Dynamic Estimation
* Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning
* Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing
* Spatio-Temporal Gating-Adjacency GCN for Human Motion Prediction
* Spatio-temporal Relation Modeling for Few-shot Action Recognition
* Spectral Unsupervised Domain Adaptation for Visual Recognition
* Speech Driven Tongue Animation
* Speed up Object Detection on Gigapixel-level Images with Patch Arrangement
* SphereSR: 360° Image Super-Resolution with Arbitrary Projection via Continuous Spherical Image Representation
* SphericGAN: Semi-Supervised Hyper-Spherical Generative Adversarial Networks for Fine-Grained Image Synthesis
* Spiking Transformers for Event-based Single Object Tracking
* Splicing ViT Features for Semantic Appearance Transfer
* Split Hierarchical Variational Compression
* SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
* SS3D: Sparsely-Supervised 3D Object Detection from Point Cloud
* ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation
* ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation
* Stability-driven Contact Reconstruction From Monocular Color Images
* Stable Long-Term Recurrent Video Super-Resolution
* Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation
* Stand-Alone Inter-Frame Attention in Video Models
* STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes
* Stereo Depth from Events Cameras: Concentrate and Focus on the Future
* Stereo Magnification with Multi-Layer Images
* Stereoscopic Universal Perturbations across Different Architectures and Datasets
* Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration, A
* Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
* Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion
* Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability
* Stratified Transformer for 3D Point Cloud Segmentation
* STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction
* Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation
* Structure-Aware Flow Generation for Human Body Reshaping
* Structure-Aware Motion Transfer with Deformable Anchor Model
* Structured Dictionary Perspective on Implicit Neural Representations, A
* Structured Local Radiance Fields for Human Avatar Modeling
* Structured Sparse R-CNN for Direct Scene Graph Generation
* study on the distribution of social biases in self-supervised learning visual models, A
* Style Neophile: Constantly Seeking Novel Styles for Domain Generalization
* Style Transformer for Image Inversion and Editing
* Style-aware Discriminator for Controllable Image Translation, A
* Style-Based Global Appearance Flow for Virtual Try-On
* Style-ERD: Responsive and Coherent Online Motion Style Transfer
* Style-Structure Disentangled Features and Normalizing Flows for Diverse Icon Colorization
* Styleformer: Transformer based Generative Adversarial Networks with Style Vector
* StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
* StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions
* StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation
* StyleSwin: Transformer-based GAN for High-resolution Image Generation
* StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis
* StylizedNeRF: Consistent 3D Scene Stylization as Stylized NeRF via 2D-3D Mutual Learning
* StyTr2: Image Style Transfer with Transformers
* Sub-word Level Lip Reading With Visual Attention
* Subspace Adversarial Training
* Super-Fibonacci Spirals: Fast, Low-Discrepancy Sampling of SO(3)
* Surface Reconstruction from Point Clouds by Learning Predictive Context Priors
* Surface Representation for Point Clouds
* Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis
* SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings
* Surpassing the Human Accuracy: Detecting Gallbladder Cancer from USG Images with Curriculum Learning
* SVIP: Sequence VerIfication for Procedures in Videos
* SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering
* SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization
* Swin Transformer V2: Scaling Up Capacity and Resolution
* SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning
* SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition
* Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection
* Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation
* Symmetry-aware Neural Architecture for Embodied Visual Exploration
* Syntax-Aware Network for Handwritten Mathematical Expression Recognition
* Synthetic Aperture Imaging with Events and Frames
* Synthetic Generation of Face Videos with Plethysmograph Physiology
* TableFormer: Table Structure Understanding with Transformers
* Talking Face Generation with Multilingual TTS
* Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection
* Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection
* Targeted Supervised Contrastive Learning for Long-Tailed Recognition
* Task Adaptive Parameter Sharing for Multi-Task Learning
* Task Decoupled Framework for Reference-based Super-Resolution
* Task Discrepancy Maximization for Fine-grained Few-Shot Classification
* Task-Adaptive Negative Envision for Few-Shot Open-Set Recognition
* Task-specific Inconsistency Alignment for Domain Adaptive Object Detection
* Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data
* TCTrack: Temporal Contexts for Aerial Tracking
* TeachAugment: Data Augmentation Optimization Using Teacher Knowledge
* Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions
* Temporal Alignment Networks for Long-term Video
* Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification
* Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations
* Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation
* Temporally Efficient Vision Transformer for Video Instance Segmentation
* TemporalUV: Capturing Loose Clothing with Temporally Coherent UV Coordinates
* Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity Evaluation
* Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution, A
* Text Spotting Transformers
* Text to Image Generation with Semantic-Spatial Aware GAN
* Text-to-Image Synthesis based on Object-Guided Joint-Decoding Transformer
* Text2Mesh: Text-Driven Neural Stylization for Meshes
* Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
* Texture-based Error Analysis for Image Super-Resolution
* Thin-Plate Spline Motion Model for Image Animation
* Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
* Think Twice Before Detecting GAN-generated Fake Images from their Spectral Domain Imprints
* Threshold Matters in WSSS: Manipulating the Activation for the Robust and Accurate Segmentation Model Against Thresholds
* Time Lens++: Event-based Frame Interpolation with Parametric Nonlinear Flow and Multi-scale Fusion
* Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving
* TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation
* TO-FLOW: Efficient Continuous Normalizing Flows with Temporal Optimization adjoint with Moving Speed
* TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
* Topologically-Aware Deformation Fields for Single-View 3D Reconstruction
* Topology Preserving Local Road Network Estimation from Single Onboard Camera Image
* Topology-Preserving Shape Reconstruction and Registration via Neural Diffeomorphic Flow
* Total Variation Optimization Layers for Computer Vision
* Toward Fast, Flexible, and Robust Low-Light Image Enhancement
* Toward Practical Monocular Indoor Depth Estimation
* Towards Accurate Facial Landmark Detection via Cascaded Transformers
* Towards An End-to-End Framework for Flow-Guided Video Inpainting
* Towards Better Plasticity-Stability Trade-off in Incremental Learning: A Simple Linear Connector
* Towards Better Understanding Attribution Methods
* Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence
* Towards Data-Free Model Stealing in a Hard Label Setting
* Towards Discovering the Effectiveness of Moderately Confident Samples for Semi-Supervised Learning
* Towards Discriminative Representation: Multi-view Trajectory Contrastive Learning for Online Multi-object Tracking
* Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis
* Towards Driving-Oriented Metric for Lane Detection Models
* Towards Efficient and Scalable Sharpness-Aware Minimization
* Towards Efficient Data Free Blackbox Adversarial Attack
* Towards End-to-End Unified Scene Text Detection and Layout Analysis
* Towards Fewer Annotations: Active Learning via Region Impurity and Prediction Uncertainty for Domain Adaptive Semantic Segmentation
* Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture
* Towards Implicit Text-Guided 3D Shape Generation
* Towards Language-Free Training for Text-to-Image Generation
* Towards Layer-wise Image Vectorization
* Towards Low-Cost and Efficient Malaria Detection
* Towards Multi-domain Single Image Dehazing via Test-time Training
* Towards Multimodal Depth Estimation from Light Fields
* Towards Noiseless Object Contours for Weakly Supervised Semantic Segmentation
* Towards Practical Certifiable Patch Defense with Vision Transformer
* Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks
* Towards Principled Disentanglement for Domain Generalization
* Towards real-world navigation with deep differentiable planners
* Towards Robust Adaptive Object Detection under Noisy Annotations
* Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective
* Towards Robust and Reproducible Active Learning using Neural Networks
* Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond
* Towards Robust Vision Transformer
* Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence Margin
* Towards Total Recall in Industrial Anomaly Detection
* Towards Understanding Adversarial Robustness of Optical Flow Networks
* Towards Unsupervised Domain Generalization
* Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
* TrackFormer: Multi-Object Tracking with Transformers
* Tracking People by Predicting 3D Appearance, Location and Pose
* Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation
* Training Object Detectors from Scratch: An Empirical Study in the Era of Vision Transformer
* Training Quantised Neural Networks with STE Variants: The Additive Noise Annealing Algorithm
* Training-free Transformer Architecture Search
* Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video
* TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing
* Transferability Estimation using Bhattacharyya Class Separability
* Transferability Metrics for Selecting Source Model Ensembles
* Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering
* TransforMatcher: Match-to-Match Attention for Semantic Correspondence
* Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World
* Transformer Tracking with Cyclic Shifting Window Attention
* Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution
* Transforming Model Prediction for Tracking
* TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers
* TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
* TransMix: Attend to Mix for Vision Transformers
* TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers
* TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
* TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
* TransVPR: Transformer-Based Place Recognition with Multi-Level Attention Aggregation
* TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions
* Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation
* Trustworthy Long-Tailed Classification
* TubeDETR: Spatio-Temporal Video Grounding with Transformers
* TubeFormer-DeepLab: Video Mask Transformer
* TubeR: Tubelet Transformer for Video Action Detection
* TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing
* TWIST: Two-Way Inter-label Self-Training for Semi-supervised 3D Instance Segmentation
* Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart
* Two Dimensions of Worst-case Training and Their Integrated Effect for Out-of-domain Generalization, The
* UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection
* UBoCo: Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection
* UCC: Uncertainty guided Cross-head Cotraining for Semi-Supervised Semantic Segmentation
* UDA-COPE: Unsupervised Domain Adaptation for Category-level Object Pose Estimation
* Uformer: A General U-Shaped Transformer for Image Restoration
* UKPGAN: A General Self-Supervised Keypoint Detector
* UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
* Unbiased Subclass Regularization for Semi-Supervised Semantic Segmentation
* Unbiased Teacher v2: Semi-supervised Object Detection for Anchor-free and Anchor-based Detectors
* Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation
* Uncertainty-Aware Deep Multi-View Photometric Stereo
* Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition
* Understanding 3D Object Articulation in Internet Videos
* Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training
* Understanding Uncertainty Maps in Vision with Statistical Testing
* Undoing the Damage of Label Shift for Cross-domain Semantic Segmentation
* Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks
* Uni6D: A Unified CNN Framework without Projection Breakdown for 6D Pose Estimation
* UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning
* UniCoRN: A Unified Conditional Image Repainting Network
* Unified Contrastive Learning in Image-Text-Label Space
* Unified Framework for Implicit Sinkhorn Differentiation, A
* Unified Model for Line Projections in Catadioptric Cameras with Rotationally Symmetric Mirrors, A
* Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression
* Unified Query-based Paradigm for Point Cloud Understanding, A
* Unified Transformer Tracker for Object Tracking
* Uniform Subdivision of Omnidirectional Camera Space for Efficient Spherical Stereo Matching
* Unifying Motion Deblurring and Frame Interpolation with Events
* Unifying Panoptic Segmentation for Autonomous Driving
* Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression
* UNIST: Unpaired Neural Implicit Shape Translation Network
* Universal Photometric Stereo Network using Global Lighting Contexts
* UniVIP: A Unified Framework for Self-Supervised Visual Pre-training
* Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild
* Unleashing Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification
* Unpaired Cartoon Image Synthesis via Gated Cycle Mapping
* Unpaired Deep Image Deraining Using Dual Contrastive Learning
* Unseen Classes at a Later Time? No Problem
* Unsupervised Action Segmentation by Joint Representation Learning and Online Clustering
* Unsupervised Deraining: Where Contrastive Learning Meets Self-similarity
* Unsupervised Domain Adaptation for Nighttime Aerial Tracking
* Unsupervised Domain Generalization by Learning a Bridge Across Domains
* Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers
* Unsupervised Homography Estimation with Coplanarity-Aware GAN
* Unsupervised Image-to-Image Translation with Generative Prior
* Unsupervised Learning of Accurate Siamese Tracking
* Unsupervised Learning of Debiased Representations with Pseudo-Attributes
* Unsupervised Pre-training for Temporal Action Localization Tasks
* Unsupervised Representation Learning for Binary Networks by Joint Classifier Learning
* Unsupervised Vision-and-Language Pretraining via Retrieval-based Multi-Granular Alignment
* Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language Structures via Dependency Relationships
* Unsupervised Visual Representation Learning by Online Constrained K-Means
* UnweaveNet: Unweaving Activity Stories
* Upright-Net: Learning Upright Orientation for 3D Point Cloud
* Urban Radiance Fields
* URetinex-Net: Retinex-based Deep Unfolding Network for Low-light Image Enhancement
* Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework
* Using 3D Topological Connectivity for Ghost Particle Reduction in Flow Reconstruction
* UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog
* V-Doc: Visual questions answers with Documents
* V2C: Visual Voice Cloning
* variational Bayesian method for similarity learning in non-rigid image registration, A
* vCLIMB: A Novel Video Class Incremental Learning Benchmark
* Vector Quantized Diffusion Model for Text-to-Image Synthesis
* Vehicle trajectory prediction works, but not everywhere
* Versatile Multi-Modal Pre-Training for Human-Centric Perception
* Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation, A
* VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning
* Video Demoiréing with Relation-Based Temporal Consistency
* Video Frame Interpolation Transformer
* Video Frame Interpolation with Transformer
* Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
* Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training
* Video Swin Transformer
* Video-Text Representation Learning via Differentiable Weak Temporal Alignment
* VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution
* ViM: Out-Of-Distribution with Virtual-Logit Matching
* Virtual Correspondence: Humans as a Cue for Extreme-View Geometry
* Virtual Elastic Objects
* VisCUIT: Visual Auditor for Bias in CNN Image Classifier
* Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline
* Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space
* Vision Transformer with Deformable Attention
* Vision-Language Pre-Training for Boosting Scene Text Detectors
* Vision-Language Pre-Training with Triple Contrastive Learning
* VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation
* VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
* ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval
* Visual Abductive Reasoning
* Visual Acoustic Matching
* Visual Vibration Tomography: Estimating Interior Material Properties from Monocular Video
* VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
* VisualHow: Multimodal Problem Solving
* VL-ADAPTER: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
* VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
* Volumetric Bundle Adjustment for Online Photorealistic Scene Capture
* Vox2Cortex: Fast Explicit Reconstruction of Cortical Surfaces from 3D MRI Scans with Geometric Deep Neural Networks
* Voxel Field Fusion for 3D Object Detection
* Voxel Graph CNN for Object Classification with Event Cameras, A
* Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds
* VRDFormer: End-to-End Video Visual Relation Detection with Transformers
* WALT: Watch And Learn 2D amodal representation from Time-lapse imagery
* Wanderings of Odysseus in 3D Scenes, The
* WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation
* Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects
* Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
* Weakly But Deeply Supervised Occlusion-Reasoned Parametric Road Layouts
* Weakly Paired Associative Learning for Sound and Image Representations via Bimodal Associative Memory
* Weakly Supervised High-Fidelity Clothing Model Generation
* Weakly Supervised Object Localization as Domain Adaption
* Weakly Supervised Rotation-Invariant Aerial Object Detection Network
* Weakly Supervised Segmentation on Outdoor 4D point clouds with Temporal Matching and Spatial Graph Propagation
* Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast
* Weakly Supervised Semantic Segmentation using Out-of-Distribution Data
* Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation
* Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning
* Weakly-supervised Action Transition Learning for Stochastic Human Motion Prediction
* Weakly-Supervised Generation and Grounding of Visual Descriptions with Conditional Generative Models
* Weakly-supervised Metric Learning with Cross-Module Communications for the Classification of Anterior Chamber Angle Images
* Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos
* WebQA: Multihop and Multimodal QA
* What do navigation agents learn about their environment?
* What Makes Transfer Learning Work for Medical Images: Feature Reuse & Other Factors
* What Matters For Meta-Learning Vision Regression Tasks?
* What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions
* What's in your hands? 3D Reconstruction of Generic Objects in Hands
* When Does Contrastive Visual Representation Learning Work?
* When to Prune? A Policy towards Early Structural Pruning
* Which images to label for few-shot medical landmark detection?
* Which Model to Transfer? Finding the Needle in the Growing Haystack
* Whose Hands are These? Hand Detection and Hand-Body Association in the Wild
* Whose Track Is It Anyway? Improving Robustness to Tracking Errors with Affinity-based Trajectory Prediction
* Why Discard if You can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis
* WildNet: Learning Domain Generalized Semantic Segmentation from the Wild
* Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
* Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
* X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
* X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
* XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation
* XYDeblur: Divide and Conquer for Single Image Deblurring
* XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding
* YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
* ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation
* Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation
* Zero-Query Transfer Attacks on Context-Aware Object Detectors
* Zero-Shot Text-Guided Object Generation with Dream Fields
* ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
* ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes
* Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection
* ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds
2070 for CVPR22

CVPR23 * $R^2$Former: Unified Retrieval and Reranking Transformer for Place Recognition
* *CVPR
* 1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions
* 1000 FPS HDR Video with a Spike-RGB Hybrid Camera
* 2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection
* 3D Cinemagraphy from a Single Image
* 3D Concept Learning and Reasoning from Multi-View Images
* 3D GAN Inversion with Facial Symmetry Prior
* 3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions
* 3D Human Keypoints Estimation from Point Clouds in the Wild without Human Labels
* 3D Human Mesh Estimation from Virtual Markers
* 3D Human Pose Estimation via Intuitive Physics
* 3D Human Pose Estimation with Spatio-Temporal Criss-Cross Attention
* 3D Line Mapping Revisited
* 3D Neural Field Generation Using Triplane Diffusion
* 3D Registration with Maximal Cliques
* 3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds
* 3D Shape Reconstruction of Semi-Transparent Worms
* 3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud
* 3D Video Loops from Asynchronous Input
* 3D Video Object Detection with Learnable Object-Centric Global Optimization
* 3D-aware Conditional Image Synthesis
* 3D-Aware Face Swapping
* 3D-aware Facial Landmark Detection via Multi-view Consistent Training on Synthetic Data
* 3D-Aware Multi-Class Image-to-Image Translation with NeRFs
* 3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
* 3D-POP: An Automated Annotation Approach to Facilitate Markerless 2D-3D Tracking of Freely Moving Birds with Marker-Based Motion Capture
* 3DAvatarGAN: Bridging Domains for Personalized Editable Avatars
* 3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
* CREPE: Can Vision-Language Foundation Models Reason Compositionally?
* A-CAP: Anticipation Captioning with Commonsense Knowledge
* A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
* ABCD: Arbitrary Bitwise Coefficient for De-Quantization
* ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field
* Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices
* Accelerated Coordinate Encoding: Learning to Relocalize in Minutes Using RGB and Poses
* Accelerating Dataset Distillation via Model Augmentation
* Accelerating Vision-Language Pretraining with Free Language Modeling
* AccelIR: Task-aware Image Compression for Accelerating Neural Restoration
* Accidental Light Probes
* Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning
* ACL-SPC: Adaptive Closed-Loop System for Self-Supervised Point Cloud Completion
* ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
* ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation
* Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition
* Activating More Pixels in Image Super-Resolution Transformer
* Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition
* Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
* ActMAD: Activation Matching to Align Distributions for Test-Time-Training
* Actor-centric Causality Graph for Asynchronous Temporal Inference in Group Activity, An
* AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
* AdamsFormer for Spatial Action Localization in the Future
* Adapting Shortcut with Normalizing Flow: An Efficient Tuning Framework for Visual Recognition
* Adaptive Annealing for Robust Geometric Estimation
* Adaptive Assignment for Geometry Aware Local Feature Matching
* Adaptive Channel Sparsity for Federated Learning under System Heterogeneity
* Adaptive Data-Free Quantization
* Adaptive Global Decay Process for Event Cameras
* Adaptive Graph Convolutional Subspace Clustering
* Adaptive Human Matting for Dynamic Videos
* Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo
* Adaptive Plasticity Improvement for Continual Learning
* Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
* Adaptive Sparse Pairwise Loss for Object Re-Identification
* Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
* Adaptive Zone-aware Hierarchical Planner for Vision-Language Navigation
* AdaptiveMix: Improving GAN Training via Feature Space Shrinkage
* Adjustment and Alignment for Unbiased Open Set Domain Adaptation
* Advancing Visual Grounding with Scene Knowledge: Benchmark and Method
* Adversarial Counterfactual Visual Explanations
* Adversarial Normalization: I Can visualize Everything (ICE)
* Adversarial Robustness via Random Projection Filters
* Adversarially Masking Synthetic to Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation
* Adversarially Robust Neural Architecture Search for Graph Neural Networks
* AeDet: Azimuth-Invariant Multi-View 3D Object Detection
* Affection: Learning Affective Explanations for Real-World Visual Data
* Affordance Diffusion: Synthesizing Hand-Object Interactions
* Affordance Grounding from Demonstration Video to Target Image
* Affordances from Human Videos as a Versatile Representation for Robotics
* AGAIN: Adversarial Training with Attribution Span Enlargement and Hybrid Feature Fusion
* Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
* Align and Attend: Multimodal Summarization with Dual Contrastive Losses
* Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
* AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training
* Aligning Bag of Regions for Open-Vocabulary Object Detection
* Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
* All are Worth Words: A ViT Backbone for Diffusion Models
* All in One: Exploring Unified Video-Language Pre-Training
* All-in-Focus Imaging from Event Focal Stack
* All-in-One Image Restoration for Unknown Degradations Using Adaptive Discriminative Filters for Specific Degradations
* ALOFT: A Lightweight MLP-Like Architecture with Dynamic Low-Frequency Transform for Domain Generalization
* ALSO: Automotive Lidar Self-Supervision by Occupancy Estimation
* AltFreezing for More General Video Face Forgery Detection
* ALTO: Alternating Latent Topologies for Implicit 3D Reconstruction
* Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
* Ambiguous Medical Image Segmentation Using Diffusion Models
* AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
* Analyzing and Diagnosing Pose Estimation with Attributions
* Analyzing Physical Impacts Using Transient Surface Wave Imaging
* Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection
* AnchorFormer: Point Cloud Completion from Discriminative Nodes
* ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos
* Angelic Patches for Improving Third-Party Object Detector Performance
* Annealing-based Label-Transfer Learning for Open World Object Detection
* AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation
* Architectural Backdoors in Neural Networks
* Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning
* ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation
* Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
* Are Data-Driven Explanations Robust Against Out-of-Distribution Data?
* Are Deep Neural Networks SMARTer Than Second Graders?
* Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark
* ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data
* ARO-Net: Learning Implicit Fields from Anchored Radial Observations
* AShapeFormer: Semantics-Guided Object-Level Active Shape Encoding for 3D Object Detection via Transformers
* ASPnet: Action Segmentation with Shared-Private Representation of Multiple Data Sources
* AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
* AstroNet: When Astrocyte Meets Artificial Neural Network
* AsyFOD: An Asymmetric Adaptation Paradigm for Few-Shot Domain Adaptive Object Detection
* Asymmetric Feature Fusion for Image Retrieval
* Attention-Based Point Cloud Edge Sampling
* AttentionShift: Iteratively Estimated Part-Based Attention Map for Pointly Supervised Instance Segmentation
* Attribute-Preserving Face Dataset Anonymization via Latent Code Optimization
* AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning
* Audio-Visual Grouping Network for Sound Localization from Mixtures
* Augmentation Matters: A Simple-Yet-Effective Approach to Semi-Supervised Semantic Segmentation
* AUNet: Learning Relations Between Action Units for Face Forgery Detection
* Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence
* AutoAD: Movie Description in Context
* AutoFocusFormer: Image Segmentation off the Grid
* AutoLabel: CLIP-based framework for Open-Set Video Domain Adaptation
* Automatic High Resolution Wire Segmentation and Removal
* Autonomous Manipulation Learning for Similar Deformable Objects via Only One Demonstration
* AutoRecon: Automated 3D Object Discovery and Reconstruction
* Autoregressive Visual Tracking
* Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
* AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
* AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
* Azimuth Super-Resolution for FMCW Radar in Autonomous Driving
* B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution
* BAAM: Monocular 3D Pose and Shape Reconstruction with Bi-Contextual Attention Module and Attention-Guided Modeling
* Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption
* Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
* Backdoor Cleansing with Unlabeled Data
* Backdoor Defense via Adaptively Splitting Poisoned Dataset
* Backdoor Defense via Deconfounded Representation Learning
* BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields
* BAEFormer: Bi-Directional and Early Interaction Transformers for Bird's Eye View Semantic Segmentation
* Bag-of-Prototypes Representation for Dataset-Level Applications, A
* Balanced Energy Regularization Loss for Out-of-distribution Detection
* Balanced Product of Calibrated Experts for Long-Tailed Recognition
* Balanced Spherical Grid for Egocentric View Synthesis
* Balancing Logit Variation for Long-Tailed Semantic Segmentation
* BASiS: Batch Aligned Spectral Embedding Space
* Batch Model Consolidation: A Multi-Task Model Consolidation Framework
* Bayesian Posterior Approximation With Stochastic Ensembles
* BBDM: Image-to-Image Translation with Brownian Bridge Diffusion Models
* BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion
* Behavioral Analysis of Vision-and-Language Navigation Agents
* Behind the Scenes: Density Fields for Single View Reconstruction
* Being Comes from Not-Being: Open-Vocabulary Text-to-Motion Generation with Wordless Training
* Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving
* Benchmarking Self-Supervised Learning on Diverse Pathology Datasets
* Best Defense is a Good Offense: Adversarial Augmentation Against Adversarial Attacks, The
* Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
* Better CMOS Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution
* BEV-Guided Multi-Modality Fusion for Driving Perception
* BEV-LaneDet: An Efficient 3D Lane Detection Based on Virtual Camera via Key-Points
* BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks
* BEV@DC: Bird's-Eye View Assisted Training for Depth Completion
* BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision
* BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection
* Beyond Appearance: A Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
* Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers
* Beyond mAP: Towards Better Evaluation of Instance Segmentation
* Bi-Directional Distribution Alignment for Transductive Zero-Shot Learning
* Bi-directional Feature Fusion Generative Adversarial Network for Ultra-high Resolution Pathological Image Virtual Re-Staining
* Bi-Level Meta-Learning for Few-Shot Domain Generalization
* Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection
* Bi3D: Bi-Domain Active Learning for Cross-Domain 3D Object Detection
* Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
* Bias Mimicking: A Simple Sampling Approach for Bias Mitigation
* Bias-Eliminating Augmentation Learning for Debiased Federated Learning
* BiasAdv: Bias-Adversarial Augmentation for Model Debiasing
* BiasBed: Rigorous Texture Bias Evaluation
* BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
* Bidirectional Copy-Paste for Semi-Supervised Medical Image Segmentation
* Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
* BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation
* BiFormer: Vision Transformer with Bi-Level Routing Attention
* Bilateral Memory Consolidation for Continual Learning
* Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis
* Binary Latent Diffusion
* Biomechanics-Guided Facial Action Unit Detection Through Force Modeling
* BioNet: A Biologically-Inspired Network for Face Recognition
* Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
* BITE: Beyond Priors for Improved Three-D Dog Pose Estimation
* Bitstream-Corrupted JPEG Images are Restorable: Two-stage Compensation and Alignment Framework for Image Restoration
* BKinD-3D: Self-Supervised 3D Keypoint Discovery from Multi-View Videos
* Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation
* BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
* Blemish-aware and Progressive Face Retouching with Limited Paired Data
* BlendFields: Few-Shot Example-Driven Facial Modeling
* Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
* Blind Video Deflickering by Neural Filtering with a Flawed Atlas
* Block Selection Method for Using Feature Norm in Out-of-Distribution Detection
* Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images
* Blur Interpolation Transformer for Real-World Motion from Blur
* Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
* Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation
* Boosting Detection in Crowd Analysis via Underutilized Output Features
* Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt
* Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data
* Boosting Transductive Few-Shot Fine-tuning with Margin-based Uncertainty Weighting and Probability Regularization
* Boosting Verified Training for Robust Image Classifications via Abstraction
* Boosting Video Object Segmentation via Space-Time Correspondence Learning
* Boosting Weakly-Supervised Temporal Action Localization with Text Information
* Bootstrap Your Own Prior: Towards Distribution-Agnostic Novel Class Discovery
* Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping
* Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation
* Boundary Unlearning: Rapid Forgetting of Deep Networks via Shifting the Decision Boundary
* Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval
* Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation
* Box-Level Active Detection
* BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation
* Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack
* Breaking the Object in Video Object Segmentation
* Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection
* Bridging Search Region Interaction with Template for RGB-T Tracking
* Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification
* Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
* BUFFER: Balancing Accuracy, Efficiency, and Generalizability in Point Cloud Registration
* Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds
* BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
* BUOL: A Bottom-Up Framework with Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image
* Burstormer: Burst Image Restoration and Enhancement Transformer
* C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation
* CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input
* CafeBoost: Causal Feature Boost to Eliminate Task-Induced Bias for Class Incremental Learning
* Camouflaged Instance Segmentation via Explicit De-Camouflaging
* Camouflaged Object Detection with Feature Decomposition and Edge Reconstruction
* CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis
* Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
* Canonical Fields: Self-Supervised Learning of Pose-Canonicalized Neural Fields
* CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
* Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
* CAP: Robust Point Cloud Classification via Semantic and Structural Modeling
* CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
* CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
* CaPriDe Learning: Confidential and Private Decentralized Learning Based on Encryption-Friendly Distillation Loss
* CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects
* Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization
* Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution
* CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
* Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
* CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
* Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder
* Category Query Learning for Human-Object Interaction Classification
* Causally-Aware Intraoperative Imputation for Overall Survival Time Prediction
* CCuantuMM: Cycle-Consistent Quantum-Hybrid Matching of Multiple Shapes
* CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion
* CelebV-Text: A Large-Scale Facial Text-Video Dataset
* Center Focusing Network for Real-Time LiDAR Panoptic Segmentation
* CF-Font: Content Fusion for Few-Shot Font Generation
* CFA: Class-Wise Calibrated Fair Adversarial Training
* Change-Aware Sampling and Contrastive Learning for Satellite Images
* Characteristic Function-Based Method for Bottom-Up Human Pose Estimation, A
* Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
* CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning
* CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution
* CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
* CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection
* CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions
* CIRCLE: Capture In Rich Contextual Environments
* CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose
* Class Adaptive Network Calibration
* Class Attention Transfer Based Knowledge Distillation
* Class Balanced Adaptive Pseudo Labeling for Federated Semi-Supervised Learning
* Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
* Class Relationship Embedded Learning for Source-Free Unsupervised Domain Adaptation
* Class-Balancing Diffusion Models
* Class-Conditional Sharpness-Aware Minimization for Deep Long-Tailed Recognition
* Class-Incremental Exemplar Compression for Class-Incremental Learning
* CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not
* CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation
* CLIP the Gap: A Single Domain Generalization Approach for Object Detection
* CLIP-S4: Language-Guided Self-Supervised Semantic Segmentation
* CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language
* CLIP2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data
* CLIP2Protect: Protecting Facial Privacy Using Text-Guided Makeup via Adversarial Latent Search
* CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
* CLIPPING: Distilling CLIP-Based Models with a Student Base for Video-Language Retrieval
* CLIPPO: Image-and-Language Understanding from Pixels Only
* CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition
* CLOTH4D: A Dataset for Clothed Human Reconstruction
* Clothed Human Performance Capture with a Double-layer Neural Radiance Fields
* Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-World
* Clover: Towards A Unified Video-Language Alignment and Fusion Model
* CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset
* Co-Salient Object Detection with Uncertainty-Aware Group Exchange-Masking
* Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
* Co-speech Gesture Synthesis by Reinforcement Learning with Contrastive Pretrained Rewards
* Co-training 2^L Submodels for Visual Recognition
* Coaching a Teachable Student
* CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning
* CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
* Collaboration Helps Camera Overtake LiDAR in 3D Detection
* Collaborative Diffusion for Multi-Modal Face Generation and Editing
* Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
* Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding
* Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
* Color Backdoor: A Robust Poisoning Attack in Color Space
* Combining Implicit-Explicit View Correlation for Light Field Semantic Segmentation
* CoMFormer: Continual Learning in Semantic and Panoptic Segmentation
* Command-driven Articulated Object Understanding and Manipulation
* Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories
* Compacting Binary Neural Networks by Sparse Kernel Selection
* Complementary Intrinsics from Neural Radiance Fields and CNNs for Outdoor Scene Relighting
* Complete 3D Human Reconstruction from a Single Incomplete Image
* Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning
* CompletionFormer: Depth Completion with Convolutions and Vision Transformers
* Complexity-guided Slimmable Decoder for Efficient Deep Video Compression
* Compositor: Bottom-Up Clustering and Compositing for Robust Part and Object Segmentation
* Comprehensive and Delicate: An Efficient Transformer for Image Restoration
* Compressing Volumetric Radiance Fields to 1 MB
* Compression-Aware Video Super-Resolution
* Computational Flash Photography through Intrinsics
* Computationally Budgeted Continual Learning: What Does Matter?
* Conditional Generation of Audio from Video via Foley Analogies
* Conditional Image-to-Video Generation with Latent Flow Diffusion Models
* Conditional Text Image Generation with Diffusion Models
* Confidence-Aware Personalized Federated Learning via Variational Expectation Maximization
* Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation
* Conjugate Product Graphs for Globally Optimal 2D-3D Shape Matching
* Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries
* Connecting Vision and Language with Video Localized Narratives
* ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
* Consistent Direct Time-of-Flight Video Depth Super-Resolution
* Consistent View Synthesis with Pose-Guided Diffusion Models
* Consistent-Teacher: Towards Reducing Inconsistent Pseudo-Targets in Semi-Supervised Object Detection
* Constrained Evolutionary Diffusion Filter for Monocular Endoscope Tracking
* ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
* Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation
* Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
* Context De-Confounded Emotion Recognition
* Context-aware Alignment and Mutual Masking for 3D-Language Pre-training
* Context-Aware Pretraining for Efficient Blind Image Decomposition
* Context-Aware Relative Object Queries to Unify Video Instance and Panoptic Segmentation
* Context-Based Trit-Plane Coding for Progressive Image Compression
* Continual Detection Transformer for Incremental Object Detection
* Continual Semantic Segmentation with Automatic Memory Sample Selection
* Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation
* Continuous Landmark Detection with 3D Queries
* Continuous Pseudo-Label Rectified Domain Adaptive Semantic Segmentation with Implicit Neural Representations
* Continuous Sign Language Recognition with Correlation Network
* ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning
* Contrastive Grouping with Transformer for Referring Image Segmentation
* Contrastive Mean Teacher for Domain Adaptive Object Detectors
* Contrastive Semi-Supervised Learning for Underwater Image Restoration via Reliable Bank
* Controllable Light Diffusion for Portraits
* Controllable Mesh Generation Through Sparse Latent Point Diffusion Models
* ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
* ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
* Cooperation or Competition: Avoiding Player Domination for Multi-Target Robustness via Adaptive Budgets
* CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
* CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
* Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning
* Correlational Image Modeling for Self-Supervised Visual Pre-Training
* Correspondence Transformers with Asymmetric Feature Learning and Matching Flow Super-Resolution
* COT: Unsupervised Domain Adaptation with Clustering and Optimal Transport
* CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
* CP3: Channel Pruning Plug-in for Point-Based Networks
* CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability
* CRAFT: Concept Recursive Activation FacTorization for Explainability
* Critical Learning Periods for Multisensory Integration in Deep Networks
* CrOC: Cross-View Online Clustering for Dense Visual Representation Learning
* Cross-Domain 3D Hand Pose Estimation with Dual Modalities
* Cross-Domain Image Captioning with Discriminative Finetuning
* Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences Between Pretrained Generative Models
* Cross-Guided Optimization of Radiance Fields with Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis
* Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning
* Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval
* Crossing the Gap: Domain Generalization for Image Captioning
* Crowd3D: Towards Hundreds of People Reconstruction from a Single Image
* CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model
* CUDA: Convolution-Based Unlearnable Datasets
* CUF: Continuous Upsampling Filters
* Curricular Contrastive Regularization for Physics-Aware Single Image Dehazing
* Curricular Object Manipulation in LiDAR-Based Object Detection
* Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification
* Cut and Learn for Unsupervised Object Detection and Instance Segmentation
* CutMIB: Boosting Light Field Super-Resolution via Multi-View Image Blending
* CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
* CXTrack: Improving 3D Point Cloud Tracking with Contextual Information
* D2Former: Jointly Learning Hierarchical Detectors and Contextual Descriptors via Agent-Based Transformers
* DA Wand: Distortion-Aware Selection Using Neural Mesh Parameterization
* DA-DETR: Domain Adaptive Detection Transformer with Information Fusion
* DAA: A Delta Age AdaIN Operation for Age Estimation via Binary Code Transformer
* DaFKD: Domain-aware Federated Knowledge Distillation
* DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering
* DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices
* Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection, The
* DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
* DartBlur: Privacy Preservation with Detection Artifact Suppression
* Data-Based Perspective on Transfer Learning, A
* Data-Driven Feature Tracking for Event Cameras
* Data-Efficient Large Scale Place Recognition with Graded Similarity Supervision
* Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint
* Data-Free Sketch-Based Image Retrieval
* DATE: Domain Adaptive Product Seeker for E-Commerce
* DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
* DBARF: Deep Bundle-Adjusting Generalizable Neural Radiance Fields
* DC2: Dual-Camera Defocus Control by Learning to Refocus
* DCFace: Synthetic Face Generation with Dual Condition Diffusion Model
* Dealing with Cross-Task Class Discrimination in Online Continual Learning
* DeAR: Debiasing Vision-Language Models with Additive Residuals
* Decentralized Learning with Multi-Headed Distillation
* DeCo: Decomposition and Reconstruction for Compositional Temporal Grounding via Coarse-to-Fine Contrastive Ranking
* Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction
* Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization
* Decomposed Cross-Modal Distillation for RGB-based Temporal Action Detection
* Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning
* Decoupled Multimodal Distilling for Emotion Recognition
* Decoupled Semantic Prototypes Enable Learning from Diverse Annotation Types for Semi-Weakly Segmentation in Expert-Driven Domains
* Decoupling Human and Camera Motion from Videos in the Wild
* Decoupling Learning and Remembering: a Bilevel Memory Framework with Knowledge Projection for Task-Incremental Learning
* Decoupling MaxLogit for Out-of-Distribution Detection
* Decoupling-and-Aggregating for Image Exposure Correction
* Deep Arbitrary-Scale Image Super-Resolution via Scale-Equivariance Pursuit
* Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model
* Deep Depth Estimation from Thermal Image
* Deep Deterministic Uncertainty: A New Simple Baseline
* Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring
* Deep Dive into Gradients: Better Optimization for 3D Object Detection with Gradient-Corrected IoU Supervision
* Deep Factorized Metric Learning
* Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric
* Deep Frequency Filtering for Domain Generalization
* Deep Graph Reprogramming
* Deep Graph-based Spatial Consistency for Robust Non-rigid Point Cloud Registration
* Deep Hashing with Minimal-Distance-Separated Hash Centers
* Deep Incomplete Multi-View Clustering with Cross-View Partial Sample and Prototype Alignment
* Deep Learning of Partial Graph Matching via Differentiable Top-K
* Deep Polarization Reconstruction with PDAVIS Events
* Deep Random Projector: Accelerated Deep Image Prior
* Deep Semi-Supervised Metric Learning with Mixed Label Propagation
* Deep Stereo Video Inpainting
* DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
* DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
* DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
* DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
* DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
* DeFeeNet: Consecutive 3D Human Motion Prediction with Deviation Feedback
* Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning
* Defining and Quantifying the Emergence of Sparse Concepts in DNNs
* Deformable Mesh Transformer for 3D Human Mesh Recovery
* DegAE: A New Pretraining Paradigm for Low-Level Vision
* DeGPR: Deep Guided Posterior Regularization for Multi-Class Cell Detection and Counting
* DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
* Delivering Arbitrary-Modal Semantic Segmentation
* DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
* Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
* Delving into Shape-aware Zero-shot Semantic Segmentation
* Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
* Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression
* Dense Distinct Query for End-to-End Object Detection
* Dense Network Expansion for Class Incremental Learning
* Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
* Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection
* DepGraph: Towards Any Structural Pruning
* Depth Estimation from Camera Image and mmWave Radar Point Cloud
* Depth Estimation from Indoor Panoramas with Neural Scene Representation
* DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection
* DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
* Detecting and Grounding Multi-Modal Media Manipulation
* Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
* Detecting Backdoors in Pre-trained Encoders
* Detecting Everything in the Open World: Towards Universal Object Detection
* Detecting Human-Object Contact in Images
* Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
* Detection of Out-of-Distribution Samples Using Binary Neuron Activation Patterns
* DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection
* DETRs with Hybrid Matching
* Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation, The
* Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization
* Devil's on the Edges: Selective Quad Attention for Scene Graph Generation
* DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects
* DF-Platter: Multi-Face Heterogeneous Deepfake Dataset
* Dialog Must Go On: Improving Visual Dialog via Generative Self-Training, The
* DiffCollage: Parallel Generation of Large Content with Diffusion Models
* Differentiable Architecture Search with Random Features
* Differentiable Lens: Compound Lens Search over Glass Surfaces and Materials for Object Detection, The
* Differentiable Shadow Mapping for Efficient Inverse Graphics
* Difficulty-Based Sampling for Debiased Contrastive Representation Learning
* DiffPose: Toward More Reliable 3D Pose Estimation
* DiffRF: Rendering-Guided 3D Radiance Field Diffusion
* DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion
* DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
* Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models
* Diffusion Probabilistic Model Made Slim
* Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
* Diffusion-based Generation, Optimization, and Planning in 3D Scenes
* Diffusion-Based Signed Distance Fields for 3D Shape Generation
* Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
* DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models
* DiffusionRig: Learning Personalized Priors for Facial Appearance Editing
* DIFu: Depth-Guided Implicit Function for Clothed Human Reconstruction
* DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation
* DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection
* Dimensionality-Varying Diffusion Process
* DINER: Depth-aware Image-based NEural Radiance fields
* DINER: Disorder-Invariant Implicit Neural Representation
* DINN360: Deformable Invertible Neural Network for Latitude-aware 360° Image Rescaling
* Dionysus: Recovering Scene Structures by Dividing into Semantic Pieces
* DIP: Dual Incongruity Perceiving Network for Sarcasm Detection
* Directional Connectivity-based Segmentation of Medical Images
* DISC: Learning from Noisy Labels via Dynamic Instance-Specific Selection and Correction
* DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
* DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis
* Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering
* Discrete Point-Wise Attack is Not Enough: Generalized Manifold Adversarial Attack for Face Recognition
* Discriminating Known from Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder
* Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection
* Discriminator-Cooperated Feature Map Distillation for GAN Compression
* Disentangled Representation Learning for Unsupervised Neural Quantization
* Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
* Disentangling Writer and Character Styles for Handwriting Generation
* Distilling Cross-Temporal Contexts for Continuous Sign Language Recognition
* Distilling Focal Knowledge from Imperfect Expert for 3D Object Detection
* Distilling Neural Fields for Real-Time Articulated Shape Reconstruction
* Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification and Segmentation
* Distilling Vision-Language Pre-Training to Collaborate with Weakly-Supervised Temporal Action Localization
* DistilPose: Tokenized Pose Regression with Heatmap Distillation
* DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling
* Distribution Shift Inversion for Out-of-Distribution Prediction
* DisWOT: Student Architecture Search for Distillation WithOut Training
* DivClust: Controlling Diversity in Deep Clustering
* Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
* Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification
* Diversity-Aware Meta Visual Prompting
* Diversity-Measurable Anomaly Detection
* Divide and Adapt: Active Domain Adaptation via Customized Learning
* Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
* DKM: Dense Kernelized Feature Matching for Geometry Estimation
* DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning
* DLBD: A Self-Supervised Direct-Learned Binary Descriptor
* DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos
* DNF: Decouple and Feedback Network for Seeing in the Dark
* Document Image Shadow Removal Guided by Color-Aware Background
* Domain Expansion of Image Generators
* Domain Generalized Stereo Matching via Hierarchical Visual Transformation
* Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
* DoNet: Deep De-Overlapping Network for Cytology Instance Segmentation
* Doubly Right Object Recognition: A Why Prompt for Visual Rationales
* DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors
* DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
* DPF: Learning Dense Prediction Fields with Weak Supervision
* DR2: Diffusion-Based Robust Degradation Remover for Blind Face Restoration
* DrapeNet: Garment Generation and Self-Supervised Draping
* Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
* DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
* DropKey for Vision Transformer
* DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks
* DSFNet: Dual Space Fusion Network for Occlusion-Robust 3D Dense Face Alignment
* DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
* Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval
* Dual-bridging with Adversarial Noise Generation for Domain Adaptive rPPG Estimation
* Dual-Path Adaptation from Image to Video Transformers
* DualRefine: Self-Supervised Depth and Pose Estimation Through Iterative Epipolar Sampling and Refinement Toward Equilibrium
* DualRel: Semi-Supervised Mitochondria Segmentation from A Prototype Perspective
* DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation
* DyLiN: Making Light Field Networks Dynamic
* DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics
* DynaMask: Dynamic Mask Selection for Instance Segmentation
* Dynamic Aggregated Network for Gait Recognition
* Dynamic Coarse-to-Fine Learning for Oriented Tiny Object Detection
* Dynamic Conceptional Contrastive Learning for Generalized Category Discovery
* Dynamic Focus-aware Positional Queries for Semantic Segmentation
* Dynamic Generative Targeted Attacks with Pattern Injection
* Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation
* Dynamic Graph Learning with Content-guided Spatial-Frequency Relation Reasoning for Deepfake Detection
* Dynamic Inference with Grounding Based Vision and Language Models
* Dynamic Multi-Scale Voxel Flow Network for Video Prediction, A
* Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies
* Dynamically Instance-Guided Adaptation: A Backward-free Approach for Test-Time Domain Adaptive Semantic Segmentation
* DynamicDet: A Unified Dynamic Architecture for Object Detection
* DynamicStereo: Consistent Dynamic Depth from Stereo Videos
* DyNCA: Real-Time Dynamic Texture Synthesis Using Neural Cellular Automata
* DynIBaR: Neural Dynamic Image-Based Rendering
* E2PN: Efficient SE(3)-Equivariant Point Network
* EC2: Emergent Communication for Embodied Control
* ECON: Explicit Clothed humans Optimized via Normal integration
* EcoTTA: Memory-Efficient Continual Test-Time Adaptation via Self-Distilled Regularization
* EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
* Edge-aware Regional Message Passing Controller for Image Forgery Localization
* EDGE: Editable Dance Generation From Music
* Edges to Shapes to Concepts: Adversarial Augmentation for Robust Vision
* EDICT: Exact Diffusion Inversion via Coupled Transformations
* EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
* EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
* Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution
* Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
* Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
* Efficient Hierarchical Entropy Model for Learned Point Cloud Compression
* Efficient Loss Function by Minimizing the Detrimental Effect of Floating-Point Errors on Gradient-Based Attacks
* Efficient Map Sparsification Based on 2D and 3D Discretized Grids
* Efficient Mask Correction for Click-Based Interactive Image Segmentation
* Efficient Movie Scene Detection using State-Space Transformers
* Efficient Multimodal Fusion via Interactive Prompting
* Efficient On-Device Training via Gradient Filtering
* Efficient RGB-T Tracking via Cross-Modality Distillation
* Efficient Robust Principal Component Analysis via Block Krylov Iteration and CUR Decomposition
* Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
* Efficient Second-Order Plane Adjustment
* Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
* Efficient Verification of Neural Networks Against LVM-Based Specifications
* Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations
* EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging
* EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
* Ego-Body Pose Estimation via Ego-Head Pose Estimation
* Egocentric Audio-Visual Object Localization
* Egocentric Auditory Attention Localization in Conversations
* Egocentric Video Task Translation
* Elastic Aggregation for Federated Optimization
* Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling, An
* EMT-NAS: Transferring architectural knowledge between tasks from different datasets
* End-to-End 3D Dense Captioning with Vote2Cap-DETR
* End-to-End Vectorized HD-map Construction with Piecewise Bézier Curve
* End-to-end Video Matting with Trimap Propagation
* Endpoints Weight Fusion for Class Incremental Semantic Segmentation
* Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training, The
* Energy-Efficient Adaptive 3D Sensing
* Enhanced Multimodal Representation Learning with Cross-modal KD
* Enhanced Stable View Synthesis
* Enhanced Training of Query-Based Object Detection via Selective Query Recollection
* Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints
* Enhancing Multiple Reliability Measures via Nuisance-Extended Information Bottleneck
* Enhancing the Self-Universality for Transferable Targeted Attacks
* Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition
* Ensemble-based Blackbox Attacks on Dense Prediction
* EqMotion: Equivariant Multi-Agent Motion Prediction with Invariant Interaction Reasoning
* Equiangular Basis Vectors
* Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution
* ERM-KTP: Knowledge-Level Machine Unlearning via Knowledge Transfer
* ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts
* Erudite Fine-Grained Visual Classification Model, An
* ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields
* EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
* Evading DeepFake Detectors via Adversarial Statistical Consistency
* Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces
* EVAL: Explainable Video Anomaly Localization
* Event-based Blurry Frame Interpolation under Blind Exposure
* Event-Based Frame Interpolation with Ad-hoc Deblurring
* Event-Based Shape from Polarization
* Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields
* Event-Guided Person Re-Identification via Sparse-Dense Complementary Learning
* EventNeRF: Neural Radiance Fields from a Single Colour Event Camera
* Evolved Part Masking for Self-Supervised Learning
* EvShutter: Transforming Events for Unconstrained Rolling Shutter Correction
* Exact-NeRF: An Exploration of a Precise Volumetric Parameterization for Neural Radiance Fields
* EXCALIBUR: Encouraging and Evaluating Embodied Exploration
* Executing your Commands via Motion Diffusion in Latent Space
* Exemplar-FreeSOLO: Enhancing Unsupervised Instance Segmentation with Exemplars
* EXIF as Language: Learning Cross-Modal Associations between Images and Camera Metadata
* Explaining Image Classifiers with Multiscale Directional Image Representation
* Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection
* Explicit Visual Prompting for Low-Level Structure Segmentations
* Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
* Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
* Exploring and Exploiting Uncertainty for Incomplete Multi-View Classification
* Exploring and Utilizing Pattern Imbalance
* Exploring Data Geometry for Continual Learning
* Exploring Discontinuity for Video Frame Interpolation
* Exploring Incompatible Knowledge Transfer in Few-shot Image Generation
* Exploring Intra-class Variation Factors with Learnable Cluster Prompts for Semi-supervised Image Synthesis
* Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation
* Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels
* Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language
* Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization
* expOSE: Accurate Initialization-Free Projective Factorization using Exponential Regularization
* Extracting Class Activation Maps from Non-Discriminative Features as well
* Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
* F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories
* FAC: 3D Representation Learning via Foreground Aware Feature Contrast
* FaceLit: Neural 3D Relightable Faces
* Fair Federated Medical Image Segmentation via Client Contribution Estimation
* Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training
* Fake it Till You Make it: Learning Transferable Representations from Synthetic ImageNet Clones
* FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
* Fantastic Breaks: A Dataset of Paired 3D Scans of Real-World Broken Objects and Their Complete Counterparts
* FashionSAP: Symbols and Attributes Prompt for Fine-Grained Fashion Vision-Language Pre-Training
* Fast Contextual Scene Graph Generation with Unbiased Context Augmentation
* Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
* Fast Point Cloud Generation with Straight Flows
* FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
* FCC: Feature Clusters Compression for Long-Tailed Visual Recognition
* FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
* Feature Aggregated Queries for Transformer-Based Video Object Detectors
* Feature Alignment and Uniformity for Test Time Adaptation
* Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition
* Feature Separation and Recalibration for Adversarial Robustness
* Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
* FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural Network
* FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning
* Federated Domain Generalization with Generalization Adjustment
* Federated Incremental Semantic Segmentation
* Federated Learning with Data-Agnostic Distribution Fusion
* FedSeg: Class-Heterogeneous Federated Learning for Semantic Segmentation
* FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework for Long-Tail Trajectory Prediction
* Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation
* Few-Shot Geometry-Aware Keypoint Localization
* Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment
* Few-Shot Non-Line-of-Sight Imaging with Signal-Surface Collaborative Regularization
* Few-Shot Referring Relationships in Videos
* Few-shot Semantic Image Synthesis with Class Affinity Transfer
* FFCV: Accelerating Training by Removing Data Bottlenecks
* FFF: Fragment-Guided Flexible Fitting for Building Complete Protein Structures
* FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction
* FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits
* Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
* Finding Geometric Models by Clustering in the Consensus Space
* Fine-grained Audible Video Description
* Fine-Grained Classification with Noisy Labels
* Fine-Grained Face Swapping Via Regional GAN Inversion
* Fine-grained Image-Text Matching by Cross-modal Hard Aligning Network
* Fine-tuned CLIP Models are Efficient Video Learners
* Finetune like you pretrain: Improved finetuning of zero-shot vision models
* FitMe: Deep Photorealistic 3D Morphable Model Avatars
* Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
* FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs
* FLAG3D: A 3D Fitness Activity Dataset with Language Instruction
* FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
* FLEX: Full-Body Grasping Without Full-Body Grasps
* Flexible-Cm GAN: Towards Precise 3D Dose Prediction in Radiotherapy
* FlexiViT: One Model for All Patch Sizes
* FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
* Flow Supervision for Deformable NeRF
* FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation
* FlowGrad: Controlling the Output of Generative ODEs with Gradients
* Focus On Details: Online Multi-Object Tracking with Diverse Fine-Grained Representation
* Focused and Collaborative Feedback Integration for Interactive Image Segmentation
* Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
* Four-view Geometry with Unknown Radial Distortion
* Frame Flexible Network
* Frame Interpolation Transformer and Uncertainty Guidance
* Frame-Event Alignment and Fusion Network for High Frame Rate Tracking
* FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding
* FreeNeRF: Improving Few-Shot Neural Rendering with Free Frequency Regularization
* FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
* Freestyle Layout-to-Image Synthesis
* Frequency-Modulated Point Cloud Rendering with Easy Editing
* Fresnel Microfacet BRDF: Unification of Polari-Radiometric Surface-Body Reflection
* From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models
* From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm
* Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
* FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection
* Full or Weak Annotations? An Adaptive Strategy for Budget-Constrained Annotation Campaigns
* Fully Self-Supervised Depth Estimation from Defocus Clue
* Fusing Pre-Trained Language Models with Multimodal Prompts through Reinforcement Learning
* Fuzzy Positive Learning for Semi-Supervised Semantic Segmentation
* G-MSM: Unsupervised Multi-Shape Matching with Graph-Based Affinity Priors
* GaitGCI: Generative Counterfactual Intervention for Gait Recognition
* Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second
* GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
* GamutMLP: A Lightweight MLP for Color Loss Recovery
* GANHead: Towards Generative Animatable Neural Head Avatars
* GANmouflage: 3D Object Nondetection with Texture Fields
* GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
* GarmentTracking: Category-Level Garment Pose Tracking
* Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement
* Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues
* Gaussian Label Distribution Learning for Spherical Image Object Detection
* Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention
* GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields
* GCFAgg: Global and Cross-View Feature Aggregation for Multi-View Clustering
* GD-MAE: Generative Decoder for MAE Pre-Training on LiDAR Point Clouds
* GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection
* GeneCIS: A Benchmark for General Conditional Image Similarity
* General Regret Bound of Preconditioned Gradient Method for DNN Training, A
* Generalist: Decoupling Natural and Robust Generalization
* Generalizable Implicit Neural Representations via Instance Pattern Composers
* Generalizable Local Feature Pre-training for Deformable Shape Analysis
* Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
* Generalized Decoding for Pixel, Image, and Language
* Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
* Generalized Framework for Video Instance Segmentation, A
* Generalized Relation Modeling for Transformer Tracking
* Generalized UAV Object Detection via Frequency Domain Disentanglement
* Generalizing Dataset Distillation via Deep Generative Prior
* Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera
* Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping
* Generating Features with Increased Crop-Related Diversity for Few-Shot Object Detection
* Generating Holistic 3D Human Motion from Speech
* Generating Human Motion from Textual Descriptions with Discrete Representations
* Generating Part-Aware Editable 3D Shapes without 3D Supervision
* Generative Bias for Robust Visual Question Answering
* Generative Diffusion Prior for Unified Image Restoration and Enhancement
* Generative Semantic Segmentation
* Generic-to-Specific Distillation of Masked Autoencoders
* Genie: Show Me the Data for Quantization
* GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
* GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training
* Geometric Visual Similarity Learning in 3D Medical Image Self-Supervised Pre-training
* Geometry and Uncertainty-Aware 3D Point Cloud Class-Incremental Semantic Segmentation
* GeoMVSNet: Learning Multi-View Stereo with Geometry Perception
* GeoNet: Benchmarking Unsupervised Adaptation across Geographies
* GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation
* GFIE: A Dataset and Baseline for Gaze-Following from 2D to 3D in Indoor Environments
* GFPose: Learning 3D Human Pose Prior with Gradient Fields
* GINA-3D: Learning to Generate Implicit Neural Assets in the Wild
* GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods
* GKEAL: Gaussian Kernel Embedded Analytic Learning for Few-Shot Class Incremental Task
* GlassesGAN: Eyewear Personalization Using Synthetic Appearance Discovery and Targeted Subspace Modeling
* GLeaD: Improving GANs with A Generator-Leading Task
* GLIGEN: Open-Set Grounded Text-to-Image Generation
* Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions
* Global Vision Transformer Pruning with Hessian-Aware Saliency
* Global-to-Local Modeling for Video-Based 3D Human Pose and Shape Estimation
* Glocal Energy-based Learning for Few-Shot Open-Set Recognition
* Gloss Attention for Gloss-free Sign Language Translation
* GM-NeRF: Learning Generalizable Model-Based Neural Radiance Fields from Multi-View Images
* Good is Bad: Causality Inspired Cloth-debiasing for Cloth-changing Person Re-identification
* GP-VTON: Towards General Purpose Virtual Try-On via Collaborative Local-Flow Global-Parsing Learning
* Grad-PU: Arbitrary-Scale Point Cloud Upsampling via Gradient Descent with Learned Distance Functions
* GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency
* Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
* Gradient-based Uncertainty Attribution for Explainable Bayesian Deep Learning
* GradMA: A Gradient-Memory-based Accelerated Federated Learning with Alleviated Catastrophic Forgetting
* Graph Representation for Order-aware Visual Transformation
* Graph Transformer GANs for Graph-Constrained House Generation
* Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
* GraVoS: Voxel Selection for 3D Point-Cloud Detection
* GRES: Generalized Referring Expression Segmentation
* Grid-guided Neural Radiance Fields for Large Urban Scenes
* Ground-Truth Free Meta-Learning for Deep Compressive Sampling
* Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space
* GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds
* gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction
* Guided Depth Super-Resolution by Deep Anisotropic Diffusion
* Guided Recommendation for Model Fine-Tuning
* Guiding Pseudo-labels with Uncertainty Estimation for Source-free Unsupervised Domain Adaptation
* H2ONet: Hand-Occlusion-and-Orientation-Aware Network for Real-Time 3D Hand Mesh Reconstruction
* HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
* Habitat-Matterport 3D Semantics Dataset
* HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
* HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions
* Ham2Pose: Animating Sign Language Notation into Pose Sequences
* Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video
* HandNeRF: Neural Radiance Fields for Animatable Interacting Hands
* HandsOff: Labeled Dataset Generation With No Additional Human Annotations
* Handwritten Text Generation from Visual Archetypes
* Handy: Towards a High Fidelity 3D Hand Shape and Appearance Model
* Hard Patches Mining for Masked Image Modeling
* Hard Sample Matters a Lot in Zero-Shot Quantization
* Harmonious Feature Learning for Interactive Hand-Object Pose Estimation
* Harmonious Teacher for Cross-Domain Object Detection
* HARP: Personalized Hand Reconstruction from a Monocular RGB Video
* HDR Imaging with Spatially Varying Signal-to-Noise Ratios
* Heat Diffusion Based Multi-Scale and Geometric Structure-Aware Transformer for Mesh Segmentation
* HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization
* Heterogeneous Continual Learning
* HexPlane: A Fast Representation for Dynamic Scenes
* HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation
* HGNet: Learning Hierarchical Geometry from Points, Edges, and Surfaces
* Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble
* Hi4D: 4D Instance Segmentation of Close Human Interaction
* Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
* HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization
* Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding
* Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
* Hierarchical Discriminative Learning Improves Visual Representations of Biomedical Microscopy
* Hierarchical Fine-Grained Image Forgery Detection and Localization
* Hierarchical Neural Memory Network for Low Latency Event Processing
* Hierarchical Prompt Learning for Multi-Task Learning
* Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images, A
* Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
* Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding
* Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection
* Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
* Hierarchical Video-Moment Retrieval and Step-Captioning
* HierVL: Learning Hierarchical Video-Language Embeddings
* High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition
* High-fidelity 3D Face Generation from Natural Language Descriptions
* High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization
* High-fidelity 3D Human Digitization from Single 2K Resolution Images
* High-Fidelity and Freely Controllable Talking Head Video Generation
* High-Fidelity Clothed Avatar Reconstruction from a Single Image
* High-fidelity Event-Radiance Recovery via Transient Event Frequency
* High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors
* High-Fidelity Generalized Emotional Talking Face Generation with Multi-Modal Emotion Space Learning
* High-Fidelity Guided Image Synthesis with Latent Diffusion Models
* High-Frequency Stereo Matching Network
* High-Res Facial Appearance Capture from Polarized Smartphone Images
* High-Resolution Image Reconstruction with Latent Diffusion Models from Human Brain Activity
* Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-view Clustering
* Hint-Aug: Drawing Hints from Foundation Vision Transformers towards Boosted Few-shot Parameter-Efficient Tuning
* Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning
* HNeRV: A Hybrid Neural Representation for Videos
* HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
* HOLODIFFUSION: Training a 3D Diffusion Model Using 2D Images
* HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics
* HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search
* HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
* How Can Objects Help Action Recognition?
* How to Backdoor Diffusion Models?
* How to Prevent the Continuous Damage of Noises to Model Training?
* How to Prevent the Poor Performance Clients for Personalized Federated Learning?
* How You Feelin'? Learning Emotions and Mental States in Movie Scenes
* HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
* HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
* Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-Shot Learning with Hyperspherical Embeddings
* Human Body Shape Completion with Implicit Shape and Flow Learning
* Human Guided Ground-Truth Generation for Realistic Image Super-Resolution
* Human Pose as Compositional Tokens
* Human Pose Estimation in Extremely Low-Light Conditions
* Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes
* HumanBench: Towards General Human-Centric Perception with Projector Assisted Pretraining
* HumanGen: Generating Human Radiance Fields with Explicit Priors
* HuManiFlow: Ancestor-Conditioned Normalising Flows on SO(3) Manifolds for Human Pose and Shape Distribution Estimation
* Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection
* Hunting Sparsity: Density-Guided Contrastive Learning for Semi-Supervised Semantic Segmentation
* Hybrid Active Learning via Deep Clustering for Video Action Detection
* Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur
* Hyperbolic Contrastive Learning for Visual Representations beyond Objects
* HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering
* HyperMatch: Noise-Tolerant Semi-Supervised Learning via Relaxed Contrastive Constraint
* HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling
* Hyperspherical Embedding for Point Cloud Completion
* HypLiLoc: Towards Effective LiDAR Pose Regression with Hyperbolic Fusion
* I2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs
* I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
* iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-training for Visual Recognition
* Identity-Preserving Talking Face Generation with Landmark and Appearance Priors
* IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients
* iDisc: Internal Discretization for Monocular Depth Estimation
* IFSeg: Image-free Semantic Segmentation via Vision-Language Model
* Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
* Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks
* Image Cropping with Spatial-aware Feature and Rank Consistency
* Image Quality Assessment Dataset for Portraits, An
* Image Quality-aware Diagnosis via Meta-knowledge Co-embedding
* Image Super-Resolution Using T-Tetromino Pixels
* ImageBind: One Embedding Space to Bind Them All
* Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
* ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing
* Images Speak in Images: A Generalist Painter for In-Context Visual Learning
* Imagic: Text-Based Real Image Editing with Diffusion Models
* Imitation Learning as State Matching via Differentiable Physics
* IMP: Iterative Matching and Pose Estimation with Adaptive Pooling
* Implicit 3D Human Mesh Recovery using Consistency with Pose and Shape from Unseen-view
* Implicit Diffusion Models for Continuous Super-Resolution
* Implicit Identity Driven Deepfake Face Swapping Detection
* Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization
* Implicit Neural Head Synthesis via Controllable Local Deformation Fields
* Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving
* Implicit Surface Contrastive Clustering for LiDAR Point Clouds
* Implicit View-Time Interpolation of Stereo Videos Using Multi-Plane Disparities and Non-Uniform Coordinates
* Improved Distribution Matching for Dataset Condensation
* Improved Test-Time Adaptation for Domain Generalization
* Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
* Improving Cross-Modal Retrieval with Set of Diverse Embeddings
* Improving Fairness in Facial Albedo Estimation via Visual-Textual Cues
* Improving Generalization of Meta-Learning with Inverted Regularization at Inner-Level
* Improving Generalization with Domain Convex Game
* Improving Graph Representation for Point Cloud Segmentation via Attentive Filtering
* Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
* Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
* Improving Robustness of Semantic Segmentation to Motion-Blur Using Class-Centric Augmentation
* Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions
* Improving Selective Visual Question Answering by Learning from Your Peers
* Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
* Improving the Transferability of Adversarial Samples by Path-Augmented Method
* Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
* Improving Visual Grounding by Encouraging Consistent Gradient-Based Explanations
* Improving Visual Representation Learning Through Perceptual Understanding
* Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
* Improving Zero-shot Generalization and Robustness of Multi-Modal Models
* In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions, An
* In-Hand 3D Object Scanning from an RGB Sequence
* Incremental 3D Semantic Scene Graph Prediction from RGB Sequences
* Incrementer: Transformer for Class-Incremental Semantic Segmentation with Knowledge Distillation Focusing on Old Class
* Independent Component Alignment for Multi-Task Learning
* Indescribable Multi-Modal Spatial Evaluator
* Indiscernible Object Counting in Underwater Scenes
* Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis
* Infinite Photorealistic Worlds Using Procedural Generation
* Ingredient-oriented Multi-Degradation Learning for Image Restoration
* Initialization Noise in Image Gradients and Saliency Maps
* Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection
* Instance-Aware Domain Generalization for Face Anti-Spoofing
* Instance-Specific and Model-Adaptive Supervision for Semi-Supervised Semantic Segmentation
* Instant Domain Augmentation for LiDAR Semantic Segmentation
* Instant Multi-View Head Capture through Learnable Registration
* Instant Volumetric Head Avatars
* Instant-NVR: Instant Neural Volumetric Rendering for Human-object Interactions from Monocular RGBD Stream
* InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds
* InstMove: Instance Motion for Object-centric Video Segmentation
* InstructPix2Pix: Learning to Follow Image Editing Instructions
* Integral Neural Networks
* Integrally Pre-Trained Transformer Pyramid Networks
* Interactive and Explainable Region-guided Radiology Report Generation
* Interactive Cartoonization with Controllable Perceptual Factors
* Interactive Segmentation as Gaussian Process Classification
* Interactive Segmentation of Radiance Fields
* InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
* Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
* Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
* Introducing Competition to Boost the Transferability of Targeted Adversarial Examples Through Clean Feature Mixup
* Inverse Rendering of Translucent Objects using Physical and Neural Renderers
* Inversion-based Style Transfer with Diffusion Models
* Invertible Neural Skinning
* Inverting the Imaging Process by Learning an Implicit Camera Model
* IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction
* iQuery: Instruments as Queries for Audio-Visual Sound Separation
* Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
* IS-GGT: Iterative Scene Graph Generation with Generative Transformers
* ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution
* Iterative Geometry Encoding Volume for Stereo Matching
* Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections
* Iterative Proposal Refinement for Weakly-Supervised Video Grounding
* Iterative Vision-and-Language Navigation
* IterativePFN: True Iterative Point Cloud Filtering
* itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection
* JacobiNeRF: NeRF Shaping with Mutual Information Gradients
* JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields
* Jedi: Entropy-Based Localization and Removal of Adversarial Patches
* Joint Appearance and Motion Learning for Efficient Rolling Shutter Correction
* Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset
* Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers
* Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
* Joint Visual Grounding and Tracking with Natural Language Specification
* JRDB-Pose: A Large-Scale Dataset for Multi-Person Pose Estimation and Tracking
* K-Planes: Explicit Radiance Fields in Space, Time, and Appearance
* K3DN: Disparity-Aware Kernel Estimation for Dual-Pixel Defocus Deblurring
* KD-DLGAN: Data Limited Image Generation via Knowledge Distillation
* KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
* Kernel Aware Resampler
* KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
* Knowledge Combination to Learn Rotated Detection without Rotated Annotation
* Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions
* L-CoIns: Language-based Colorization With Instance Awareness
* Label Information Bottleneck for Label Enhancement
* Label-Free Liver Tumor Segmentation
* LANA: A Language-Capable Navigator for Instruction Following and Generation
* Language Adaptive Weight Generation for Multi-Task Visual Grounding
* Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
* Language-Guided Audio-Visual Source Separation via Trimodal Consistency
* Language-Guided Music Recommendation for Video via Prompt Analogies
* LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
* Large-Capacity and Flexible Video Steganography via Invertible Neural Network
* Large-Scale Homography Benchmark, A
* Large-Scale Robustness Analysis of Video Action Recognition Models, A
* Large-scale Training Data Search for Object Re-identification
* LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
* LaserMix for Semi-Supervised LiDAR Semantic Segmentation
* LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision and Language Models
* Latency Matters: Real-Time Action Forecasting Transformer
* Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
* LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
* Layout-based Causal Inference for Object Navigation
* LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation
* LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
* LayoutDM: Transformer-based Diffusion Model for Layout Generation
* LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction
* Leapfrog Diffusion Model for Stochastic Trajectory Prediction
* Learnable Skeleton-Aware 3D Point Cloud Sampling
* Learned Image Compression with Mixed Transformer-CNN Architectures
* Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection
* Learning 3D Representations from 2D Pre-Trained Models via Image-to-Point Masked Autoencoders
* Learning 3D Scene Priors with 2D Supervision
* Learning 3D-Aware Image Synthesis with Unknown Pose Distribution
* Learning a 3D Morphable Face Reflectance Model from Low-Cost Data
* Learning a Deep Color Difference Metric for Photographic Images
* Learning a Depth Covariance Function
* Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
* Learning a Simple Low-Light Image Enhancer from Paired Low-Light Instances
* Learning A Sparse Transformer Network for Effective Image Deraining
* Learning Accurate 3D Shape Based on Stereo Polarimetric Imaging
* Learning Action Changes by Measuring Verb-Adverb Textual Relationships
* Learning Adaptive Dense Event Stereo from the Image Domain
* Learning Analytical Posterior Probability for Human Mesh Recovery
* Learning Anchor Transformations for 3D Garment Animation
* Learning and Aggregating Lane Graphs for Urban Automated Driving
* Learning Articulated Shape with Keypoint Pseudo-Labels from Web Images
* Learning Attention as Disentangler for Compositional Zero-Shot Learning
* Learning Attribute and Class-Specific Representation Duet for Fine-Grained Fashion Analysis
* Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
* Learning Bottleneck Concepts in Image Classification
* Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
* Learning Compact Representations for LiDAR Completion and Generation
* Learning Conditional Attributes for Compositional Zero-Shot Learning
* Learning Correspondence Uncertainty via Differentiable Nonlinear Least Squares
* Learning Customized Visual Models with Retrieval-Augmented Knowledge
* Learning Debiased Representations via Conditional Attribute Interpolation
* Learning Decorrelated Representations Efficiently Using Fast Fourier Transform
* Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
* Learning Discriminative Representations for Skeleton Based Action Recognition
* Learning Distortion Invariant Representation for Image Restoration from a Causality Perspective
* Learning Dynamic Style Kernels for Artistic Style Transfer
* Learning Emotion Representations from Verbal and Nonverbal Communication
* Learning Event Guided High Dynamic Range Video Reconstruction
* Learning Expressive Prompting With Residuals for Vision Transformers
* Learning Federated Visual Prompt in Null Space for MRI Reconstruction
* Learning from Noisy Labels with Decoupled Meta Label Purifier
* Learning from Unique Perspectives: User-aware Saliency Modeling
* Learning Generative Structure Prior for Blind Text Image Super-resolution
* Learning Geometric-Aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs
* Learning Geometry-aware Representations by Sketching
* Learning Human Mesh Recovery in 3D Scenes
* Learning Human-to-Robot Handovers from Point Clouds
* Learning Imbalanced Data with Vision Transformers
* Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce
* Learning Joint Latent Space EBM Prior Model for Multi-layer Generator
* Learning Locally Editable Virtual Humans
* Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization
* Learning Neural Duplex Radiance Fields for Real-Time View Synthesis
* Learning Neural Parametric Head Models
* Learning Neural Proto-Face Field for Disentangled 3D Face Modeling in the Wild
* Learning Neural Volumetric Representations of Dynamic Humans in Minutes
* Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection
* Learning Open-Vocabulary Semantic Segmentation Models From Natural Language Supervision
* Learning Optical Expansion from Scale Matching
* Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation
* Learning Partial Correlation based Deep Visual Representation for Image Classification
* Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
* Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
* Learning Rotation-Equivariant Features for Visual Correspondence
* Learning Sample Relationship for Exposure Correction
* Learning Semantic Relationship among Instances for Image-Text Matching
* Learning Semantic-Aware Disentangled Representation for Flexible 3D Human Body Editing
* Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement
* Learning Situation Hyper-Graphs for Video Question Answering
* Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution
* Learning Steerable Function for Efficient Image Resampling
* Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation
* Learning to Detect and Segment for Open Vocabulary Object Detection
* Learning to Detect Mirrors from Videos via Dual Correspondences
* Learning to Dub Movies via Hierarchical Prosody Models
* Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
* Learning to Exploit the Sequence-Specific Prior Knowledge for Image Processing Pipelines Optimization
* Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes
* Learning to Generate Image Embeddings with User-Level Differential Privacy
* Learning to Generate Language-Supervised and Open-Vocabulary Scene Graph Using Pre-Trained Visual-Semantic Space
* Learning to Generate Text-Grounded Mask for Open-World Semantic Segmentation from Only Image-Text Pairs
* Learning to Measure the Point Cloud Reconstruction Loss in a Representation Space
* Learning to Name Classes for Vision and Language Models
* Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data
* Learning to Render Novel Views from Wide-Baseline Stereo Pairs
* Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
* Learning to Segment Every Referring Object Point by Point
* Learning to Zoom and Unzoom
* Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
* Learning Transformation-Predictive Representations for Detection and Description of Local Features
* Learning Transformations to Reduce the Geometric Shift in Object Detection
* Learning Video Representations from Large Language Models
* Learning Visibility Field for Detailed 3D Human Reconstruction and Relighting
* Learning Visual Representations via Language-Guided Sampling
* Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions
* Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning
* Learning with Noisy labels via Self-supervised Adversarial Noisy Masking
* LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
* LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
* Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
* Level-S2fM: Structure from Motion on Neural Level Set of Implicit Surfaces
* Leverage Interactive Affinity for Affordance Learning
* Leveraging Hidden Positives for Unsupervised Semantic Segmentation
* Leveraging Inter-Rater Agreement for Classification in the Presence of Noisy Labels
* Leveraging per Image-Token Consistency for Vision-Language Pre-Training
* Leveraging Temporal Context in Low Representational Power Regimes
* LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising
* LiDAR-in-the-Loop Hyperparameter Optimization
* LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
* LidarGait: Benchmarking 3D Gait Recognition with Point Clouds
* Lift3D: Synthesize 3D Training Data by Lifting 2D GAN to 3D Generative Radiance Field
* Light Source Separation and Intrinsic Image Decomposition under AC Illumination
* Light Touch Approach to Teaching Transformers Multi-view Geometry, A
* Light Weight Model for Active Speaker Detection, A
* LightedDepth: Video Depth Estimation in Light of Limited Inference View Angles
* LightPainter: Interactive Portrait Relighting with Freehand Scribble
* LINe: Out-of-Distribution Detection by Leveraging Important Neurons
* LinK: Linear Kernel for LiDAR-based 3D Perception
* Linking Garment with Person via Semantically Associated Landmarks for Virtual Try-On
* LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook
* Listening Human Behavior: 3D Human Pose Estimation with Acoustic Signals
* Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
* Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
* Local 3D Editing via 3D Distillation of CLIP Knowledge
* Local Connectivity-Based Density Estimation for Face Clustering
* Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution
* Local Implicit Ray Function for Generalizable Radiance Field Representation
* Local-Guided Global: Paired Similarity Representation for Visual Reinforcement Learning
* Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
* Localized Semantic Feature Mixers for Efficient Pedestrian Detection in Autonomous Driving
* LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
* Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning
* Logical Implications for Visual Question Answering Consistency
* LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
* LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
* Long Range Pooling for 3D Large-Scale Scene Understanding
* Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation
* Long-Term Visual Localization with Mobile Sensors
* Look Around for Anomalies: Weakly-Supervised Anomaly Detection via Context-Motion Relational Learning
* Look Before You Match: Instance Understanding Matters in Video Object Segmentation
* Look, Radiate, and Learn: Self-Supervised Localisation via Radio-Visual Correspondence
* Lookahead Diffusion Probabilistic Models for Refining Mean Estimation
* Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections
* Loopback Network for Explainable Microvascular Invasion Classification, A
* Low-Light Image Enhancement via Structure Modeling and Guidance
* LP-DIF: Learning Local Pattern-Specific Deep Implicit Function for 3D Objects and Scenes
* LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection
* LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression
* M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis
* MACARONS: Mapping and Coverage Anticipation with RGB Online Self-Supervision
* MAESTER: Masked Autoencoder Guided Segmentation at Pixel Resolution for Accurate, Self-Supervised Subcellular Structure Recognition
* MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
* Magic3D: High-Resolution Text-to-3D Content Creation
* MagicNet: Semi-Supervised Multi-Organ Segmentation via Magic-Cube Partition and Recovery
* MagicPony: Learning Articulated 3D Animals in the Wild
* MAGVIT: Masked Generative Video Transformer
* MAGVLT: Masked Generative Vision-and-Language Transformer
* MAIR: Multi-View Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation
* Make Landscape Flatter in Differentially Private Federated Learning
* Make-A-Story: Visual Memory Conditioned Consistent Story Generation
* Making Vision Transformers Efficient from A Token Sparsification View
* MaLP: Manipulation Localization Using a Proactive Scheme
* MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding
* Manipulating Transfer Learning for Property Inference
* MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
* MaPLe: Multi-modal Prompt Learning
* Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision
* Marching-Primitives: Shape Abstraction from Signed Distance Function
* MarginMatch: Improving Semi-Supervised Learning with Pseudo-Margins
* Markerless Camera-to-Robot Pose Estimation via Self-Supervised Sim-to-Real Transfer
* MARLIN: Masked Autoencoder for facial video Representation LearnINg
* MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds
* Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
* Mask-Free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
* Mask-Free Video Instance Segmentation
* Mask-Guided Matting in the Wild
* Mask3D: Pretraining 2D Vision Transformers by Learning Masked 3D Priors
* MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
* MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
* Masked and Adaptive Transformer for Exemplar Based Image Translation
* Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond
* Masked Autoencoders Enable Efficient Knowledge Distillers
* Masked Autoencoding Does Not Help Natural Language Supervision at Scale
* Masked Image Modeling with Local Multi-Scale Reconstruction
* Masked Image Training for Generalizable Deep Image Denoising
* Masked Images Are Counterfactual Samples for Robust Fine-Tuning
* Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
* Masked Motion Encoding for Self-Supervised Video Representation Learning
* Masked Representation Learning for Domain Generalized Stereo Matching
* Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
* Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
* Masked Wavelet Representation for Compact Neural Radiance Fields
* MaskSketch: Unpaired Structure-guided Masked Image Generation
* Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
* Matching Is Not Enough: A Two-Stage Framework for Category-Agnostic Pose Estimation
* MCF: Mutual Correction Framework for Semi-Supervised Medical Image Segmentation
* MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
* MDL-NAS: A Joint Multi-domain Learning Framework for Vision Transformer
* MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos
* MED-VT: Multiscale Encoder-Decoder Video Transformer with Application to Object Segmentation
* MEDIC: Remove Model Backdoors via Importance Driven Cloning
* Megahertz Light Steering Without Moving Parts
* MEGANE: Morphable Eyeglass and Avatar Network
* MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models
* MeMaHand: Exploiting Mesh-Mano Interaction for Single Image Two-Hand Reconstruction
* Memory-Friendly Scalable Super-Resolution via Rewinding Lottery Ticket Hypothesis
* Meta Architecture for Point Cloud Analysis
* Meta Compositional Referring Expression Segmentation
* Meta Omnium: A Benchmark for General-Purpose Learning-to-Learn
* Meta-Causal Learning for Single Domain Generalization
* Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
* Meta-Learning Approach to Predicting Performance and Data Requirements, A
* Meta-Learning with a Geometry-Adaptive Preconditioner
* Meta-Personalizing Vision-Language Models to Find Named Instances in Video
* Meta-Tuning Loss Functions and Data Augmentation for Few-Shot Object Detection
* MetaCLUE: Towards Comprehensive Visual Metaphors Research
* Metadata-Based RAW Reconstruction via Implicit Neural Functions
* MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding from Object Detection
* MetaMix: Towards Corruption-Robust Continual Learning with Temporally Self-Adaptive Data Transformation
* MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
* MetaViewer: Towards A Unified Multi-View Representation
* MethaneMapper: Spectral Absorption Aware Hyperspectral Transformer for Methane Detection
* METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
* MHPL: Minimum Happy Points Learning for Active Source Free Domain Adaptation
* MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation
* MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
* Micron-BERT: BERT-Based Facial Micro-Expression Recognition
* MIME: Human-Aware 3D Scene Generation
* Mind the Label Shift of Augmentation-based Graph OOD Generalization
* Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks
* Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation
* MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence
* MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering
* Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives
* Mixed Autoencoder for Self-Supervised Visual Representation Learning
* MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
* MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs
* MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
* MIXSIM: A Hierarchical Framework for Mixed Reality Traffic Simulation
* MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection
* (ML)2P-Encoder: On Exploration of Channel-Class Correlation for Multi-Label Zero-Shot Learning
* MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency
* MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
* MMANet: Margin-Aware Distillation and Modality-Aware Regularization for Incomplete Multimodal Learning
* MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition
* MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding
* Mobile User Interface Element Detection Via Adaptively Prompt Tuning
* MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices
* MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
* MobileOne: An Improved One millisecond Mobile Backbone
* MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation
* Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
* Modality-Agnostic Debiasing for Single Domain Generalization
* Modality-invariant Visual Odometry for Embodied Vision
* MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences
* Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection
* Model-Agnostic Gender Debiased Image Captioning
* Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
* Modeling Inter-Class and Intra-Class Constraints in Novel Class Discovery
* Modeling the Distributional Uncertainty for Salient Object Detection Models
* Modeling Video as Stochastic Processes for Fine-Grained Video Representation Learning
* Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
* MoDi: Unconditional Motion Synthesis from Diverse Data
* Modular Memorability: Tiered Representations for Video Memorability Prediction
* MoFusion: A Framework for Denoising-Diffusion-Based Motion Synthesis
* MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Action Recognition
* MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
* MonoHuman: Animatable Human Neural Field from Monocular Video
* MOSO: Decomposing MOtion, Scene and Object for Video Prediction
* MoStGAN-V: Video Generation with Temporal Motion Styles
* MOT: Masked Optimal Transport for Partial Domain Adaptation
* Motion Information Propagation for Neural Video Compression
* MotionDiffuser: Controllable Multi-Agent Motion Prediction Using Diffusion
* MotionTrack: Learning Robust Short-Term and Long-Term Motions for Multi-Object Tracking
* MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
* MOVES: Manipulated Objects in Video Enable Segmentation
* Movies2Scenes: Using Movie Metadata to Learn Scene Representation
* MP-Former: Mask-Piloted Transformer for Image Segmentation
* MSeg3D: Multi-Modal 3D Semantic Segmentation for Autonomous Driving
* MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences
* MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
* MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
* Multi Domain Learning for Motion Magnification
* Multi-Agent Automated Machine Learning
* Multi-Centroid Task Descriptor for Dynamic Class Incremental Inference
* Multi-Concept Customization of Text-to-Image Diffusion
* Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph
* Multi-Label Compound Expression Recognition: C-EXPR Database and Network
* Multi-Level Logit Distillation
* Multi-modal Gait Recognition via Effective Spatial-Temporal Feature Fusion
* Multi-Modal Learning with Missing Modality via Shared-Specific Feature Modelling
* Multi-Modal Representation Learning with Text-Driven Soft Masks
* Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
* Multi-Object Manipulation via Object-Centric Neural Scattering Functions
* Multi-Realism Image Compression with a Conditional Generator
* Multi-Sensor Large-Scale Dataset for Multi-View 3D Reconstruction
* Multi-Space Neural Radiance Fields
* Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains
* Multi-View Azimuth Stereo via Tangent Space Consistency
* Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes
* Multi-View Reconstruction Using Signed Ray Distance Functions (SRDF)
* Multi-View Stereo Representation Revisit: Region-Aware MVSNet
* Multiclass Confidence and Localization Calibration for Object Detection
* Multilateral Semantic Relations Modeling for Image Text Retrieval
* Multimodal Industrial Anomaly Detection via Hybrid Fusion
* Multimodal Prompting with Missing Modalities for Visual Recognition
* Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning
* Multiplicative Fourier Level of Detail
* Multiscale Tensor Decomposition and Rendering Equation Encoding for View Synthesis
* Multispectral Video Semantic Segmentation: A Benchmark Dataset and Baseline
* Multivariate, Multi-Frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation
* Multiview Compressive Coding for 3D Reconstruction
* Music-Driven Group Choreography
* Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video
* MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
* MVImgNet: A Large-scale Dataset of Multi-view Images
* N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
* NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
* NAR-Former: Neural Architecture Representation Learning Towards Holistic Attributes Prediction
* Natural Language-Assisted Sign Language Recognition
* NeAT: Learning Neural Implicit Surfaces with Arbitrary Topologies from Multi-View Images
* NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-View Images
* NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination
* Neighborhood Attention Transformer
* NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action
* NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
* NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via Novel-View Synthesis
* NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects
* NeRF-RPN: A general framework for object detection in NeRFs
* NeRF-Supervised Deep Stereo
* NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-Shot Real Image Animation
* Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
* NeRFLight: Fast and Light Neural Radiance Fields using a Shared Feature Grid
* NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
* NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds
* NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud
* Network Expansion For Practical Training Acceleration
* Network-Free, Unsupervised Semantic Segmentation with Synthetic Images
* NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
* NeUDF: Learning Neural Unsigned Distance Fields with Volume Rendering
* NeuFace: Realistic 3D Neural Face Rendering from Multi-View Images
* Neumann Network with Recursive Kernels for Single Image Defocus Deblurring
* NeuMap: Neural Coordinate Mapping by Auto-Transdecoder for Camera Localization
* Neural Congealing: Aligning Images to a Joint Semantic Atlas
* Neural Dependencies Emerging from Learning Massive Categories
* Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes
* Neural Fourier Filter Bank
* Neural Intrinsic Embedding for Non-Rigid Point Cloud Matching
* Neural Kaleidoscopic Space Sculpting
* Neural Kernel Surface Reconstruction
* Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition
* Neural Lens Modeling
* Neural Map Prior for Autonomous Driving
* Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans
* Neural Pixel Composition for 3D-4D View Synthesis from Multi-Views
* Neural Preset for Color Style Transfer
* Neural Rate Estimator and Unsupervised Learning for Efficient Distributed Image Analytics in Split-DNN models
* Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos
* Neural Scene Chronology
* Neural Texture Synthesis with Guided Correspondence
* Neural Transformation Fields for Arbitrary-Styled Font Generation
* Neural Vector Fields: Implicit Representation by Explicit Learning
* Neural Video Compression with Diverse Contexts
* Neural Volumetric Memory for Visual Locomotion Control
* Neural Voting Field for Camera-Space 3D Hand Pose Estimation
* Neuralangelo: High-Fidelity Neural Surface Reconstruction
* NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions
* NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds
* NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
* Neuralizer: General Neuroimage Analysis without Re-Training
* NeuralLift-360: Lifting an in-the-Wild 2D Photo to A 3D Object with 360° Views
* NeuralPCI: Spatio-Temporal Neural Field for 3D Point Cloud Multi-Frame Non-Linear Interpolation
* NeuralUDF: Learning Unsigned Distance Fields for Multi-View Reconstruction of Surfaces with Arbitrary Topologies
* Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
* NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization
* Neuron Structure Modeling for Generalizable Remote Physiological Measurement
* NeuWigs: A Neural Dynamic Model for Volumetric Hair Capture and Animation
* New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation, A
* New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation, A
* New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories, A
* New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning, A
* NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation
* Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
* NICO++: Towards Better Benchmarking for Domain Generalization
* NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging
* Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior
* NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation
* NIPQ: Noise proxy-based Integrated Pseudo-Quantization
* NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-Wise Modeling
* NLOST: Non-Line-of-Sight Imaging with Transformer
* No One Left Behind: Improving the Worst Categories in Long-Tailed Learning
* Noisy Correspondence Learning with Meta Similarity Correction
* NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
* NoisyTwins: Class-Consistent and Diverse Image Generation Through StyleGANs
* Non-Contrastive Learning Meets Language-Image Pre-Training
* Non-Contrastive Unsupervised Learning of Physiological Signals from Video
* Non-Line-of-Sight Imaging with Signal Superresolution Network
* NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
* Normal-guided Garment UV Prediction for Human Re-texturing
* Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection
* Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
* Novel Class Discovery for 3D Point Cloud Semantic Segmentation
* Novel-View Acoustic Synthesis
* NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
* Null-text Inversion for Editing Real Images using Guided Diffusion Models
* NVTC: Nonlinear Vector Transform Coding
* NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN
* Objaverse: A Universe of Annotated 3D Objects
* Object Detection with Self-Supervised Scene Adaptation
* Object Discovery from Motion-Guided Tokens
* Object Folder Benchmark: Multisensory Learning with Neural and Real Objects, The
* Object pop-up: Can we infer 3D objects and their poses from human interactions alone?
* Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation
* Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
* Object-Goal Visual Navigation via Effective Exploration of Relations Among Historical Navigation States
* ObjectMatch: Robust Registration using Canonical Object Correspondences
* ObjectStitch: Object Compositing with Diffusion Model
* Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
* Occlusion-Free Scene Recovery via Neural Radiance Fields
* OCELOT: Overlapped Cell on Tissue Dataset for Histopathology
* OCTET: Object-aware Counterfactual Explanations
* OcTr: Octree-Based Transformer for 3D Object Detection
* Octree Guided Unoriented Surface Reconstruction
* Omni Aggregation Networks for Lightweight Image Super-Resolution
* Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
* OmniAL: A Unified CNN Framework for Unsupervised Anomaly Localization
* OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis
* OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images
* OmniMAE: Single Model Masked Pretraining on Images and Videos
* Omnimatte3D: Associating Objects and Their Effects in Unconstrained Monocular Video
* OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
* OmniVidar: Omnidirectional Depth Estimation from Multi-Fisheye Images
* On Calibrating Semantic Segmentation Models: Analyses and An Algorithm
* On Data Scaling in Masked Image Modeling
* On Distillation of Guided Diffusion Models
* On the Benefits of 3D Pose and Tracking for Human Action Recognition
* On the Convergence of IRLS and Its Variants in Outlier-Robust Estimation
* On the Difficulty of Unpaired Infrared-to-Visible Video Translation: Fine-Grained Content-Rich Patches Transfer
* On the Effectiveness of Partial Variance Reduction in Federated Learning with Heterogeneous Data
* On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering
* On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks
* On the Pitfall of Mixup for Uncertainty Calibration
* On the Stability-Plasticity Dilemma of Class-Incremental Learning
* On-the-Fly Category Discovery
* One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field
* One-Shot Model for Mixed-Precision Quantization
* One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
* One-to-Few Label Assignment for End-to-End Dense Detection
* OneFormer: One Transformer to Rule Universal Image Segmentation
* OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution
* Open Set Action Recognition via Multi-Label Evidential Learning
* Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
* Open-Category Human-Object Interaction Pre-training via Language Modeling Framework
* Open-Set Fine-Grained Retrieval via Prompting Vision-Language Evaluator
* Open-Set Likelihood Maximization for Few-Shot Learning
* Open-Set Representation Learning through Combinatorial Embedding
* Open-set Semantic Segmentation for Point Clouds via Adversarial Prototype Framework
* Open-vocabulary Attribute Detection
* Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
* Open-Vocabulary Point-Cloud Object Detection without 3D Annotation
* Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
* Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
* OpenGait: Revisiting Gait Recognition Toward Better Practicality
* OpenMix: Exploring Outlier Samples for Misclassification Detection
* OpenScene: 3D Scene Understanding with Open Vocabularies
* Optimal Proposal Learning for Deployable End-to-End Pedestrian Detection
* Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting
* Optimization-Inspired Cross-Attention Transformer for Compressive Sensing
* ORCa: Glossy Objects as Radiance-Field Cameras
* OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields
* OrienterNet: Visual Localization in 2D Public Maps with Neural Matching
* Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation
* OSAN: A One-Stage Alignment Network to Unify Multimodal Alignment and Unsupervised Domain Adaptation
* OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer
* OT-Filter: An Optimal Transport Filter for Learning with Noisy Labels
* OTAvatar: One-Shot Talking Face Avatar with Controllable Tri-Plane Rendering
* Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
* Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning
* OvarNet: Towards Open-Vocabulary Object Attribute Recognition
* Overcoming the TradeOff between Accuracy and Plausibility in 3D Hand Shape Reconstruction
* Overlooked Factors in Concept-Based Explanations: Dataset Choice, Concept Learnability, and Human Capability
* OVTrack: Open-Vocabulary Multiple Object Tracking
* PA&DA: Jointly Sampling PAth and DAta for Consistent NAS
* PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
* PACO: Parts and Attributes of Common Objects
* Paint by Example: Exemplar-based Image Editing with Diffusion Models
* Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask
* Paired-Point Lifting for Enhanced Privacy-Preserving Visual Localization
* PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields
* PanelNet: Understanding 360 Indoor Environment via Panel Representation
* PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
* PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
* Panoptic Compositional Feature Field for Editable Scene Rendering with Network-Inferred Labels via Metric Learning
* Panoptic Lifting for 3D Scene Understanding with Neural Fields
* Panoptic Video Scene Graph Generation
* PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
* Parallel Diffusion Models of Operator and Image for Blind Inverse Problems
* Parameter Efficient Local Implicit Image Function Network for Face Segmentation
* Parametric Implicit Face Representation for Audio-Driven Facial Reenactment
* PartDistillation: Learning Parts from Instance Segmentation
* Partial Network Cloning
* PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations
* PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-Identification
* Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching Between Parts and Words
* PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models
* Passive Micron-Scale Time-of-Flight with Sunlight Interferometry
* Patch-Based 3D Natural Scene Generation from a Single Example
* Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective
* PatchCraft Self-Supervised Training for Correlated Image Denoising
* PATS: Patch Area Transportation with Subdivision for Local Feature Matching
* PC2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction
* pCON: Polarimetric Coordinate Networks for Neural Scene Representations
* PCR: Proxy-Based Contrastive Replay for Online Class-Incremental Continual Learning
* PCT-Net: Full Resolution Image Harmonization Using Pixel-Wise Color Transformations
* PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
* PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
* PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation
* PEAL: Prior-embedded Explicit Attention Learning for Low-overlap Point Cloud Registration
* PEFAT: Boosting Semi-Supervised Medical Image Classification via Pseudo-Loss Estimation and Feature Adversarial Training
* Perception and Semantic Aware Regularization for Sequential Confidence Calibration
* Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation
* PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces Using Permutohedral Lattices
* Persistent Nature: A Generative Model of Unbounded 3D Worlds
* Person Image Synthesis via Denoising Diffusion Model
* PersonNeRF: Personalized Reconstruction from Photo Collections
* Perspective Fields for Single Image Camera Calibration
* PET-NeuS: Positional Encoding Tri-Planes for Neural Surfaces
* PHA: Patch-Wise High-Frequency Augmentation for Transformer-Based Person Re-Identification
* Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection
* Phone2Proc: Bringing Robust Robots into Our Chaotic World
* Photo Pre-Training, But for Sketch
* Physical-World Optical Adversarial Attacks on 3D Face Recognition
* Physically Adversarial Infrared Patches with Learnable Shapes and Locations
* Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling
* Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
* Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography
* Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
* Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
* PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers
* PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds
* PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
* PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification
* PIRLNav: Pretraining with Imitation and RL Finetuning for OBJECTNAV
* PIVOT: Prompting for Video Continual Learning
* PivoTAL: Prior-Driven Supervision for Weakly-Supervised Temporal Action Localization
* Pix2Map: Cross-Modal Retrieval for Inferring Street Maps from Images
* Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection
* PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing
* PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
* PlaneDepth: Self-Supervised Depth Estimation via Orthogonal Planes
* Planning-oriented Autonomous Driving
* Plateau-Reduced Differentiable Path Tracing
* Plen-VDB: Memory Efficient VDB-Based Radiance Fields for Fast Training and Rendering
* PLIKS: A Pseudo-Linear Inverse Kinematic Solver for 3D Human Body Estimation
* Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
* PMatch: Paired Masked Image Modeling for Dense Geometric Matching
* PMR: Prototypical Modal Rebalance for Multimodal Learning
* POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo
* Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting
* Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields
* PointAvatar: Deformable Point-Based Head Avatars from Videos
* PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees
* PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering
* PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos
* PointConvFormer: Revenge of the Point-based Convolution
* PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection
* Pointersect: Neural Rendering with Cloud-Ray Intersection
* PointListNet: Deep Learning on 3D Point Lists
* PointVector: A Vector Representation In Point Cloud Analysis
* Polarimetric iToF: Measuring High-Fidelity Depth Through Scattering Media
* Polarized Color Image Denoising
* Policy Adaptation from Foundation Model Feedback
* Poly-PC: A Polyhedral Network for Multiple Point Cloud Tasks at Once
* PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
* Polynomial Implicit Neural Representations for Large Diverse Datasets
* Pose Synchronization under Multiple Pair-wise Relative Poses
* Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
* PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
* PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation
* Position-Guided Text Prompt for Vision-Language Pre-Training
* Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
* Post-Processing Temporal Action Detection
* Post-Training Quantization on Diffusion Models
* PosterLayout: A New Benchmark and Approach for Content-Aware Visual-Textual Presentation Layout
* POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery
* Power Bundle Adjustment for Large-Scale 3D Reconstruction
* Practical Network Acceleration with Tiny Sets
* Practical Stereo Depth System for Smart Glasses, A
* Practical Upper Bound for the Worst-Case Attribution Deviations, A
* Prefix Conditioning Unifies Language and Label Supervision
* PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image
* Preserving Linear Separability in Continual Learning by Backward Feature Projection
* Primitive Generation and Semantic-Related Alignment for Universal Zero-Shot Segmentation
* Principles of Forgetting in Domain-Incremental Semantic Segmentation in Adverse Weather Conditions
* PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
* Privacy-preserving Adversarial Facial Features
* Privacy-Preserving Representations are not Enough: Recovering Scene Content from Camera Poses
* Private Image Generation with Dual-Purpose Auxiliary Classifier
* PROB: Probabilistic Objectness for Open World Object Detection
* Probabilistic Attention Model with Occlusion-aware Texture Regression for 3D Hand Reconstruction from a Single RGB Image, A
* Probabilistic Debiasing of Scene Graphs
* Probabilistic Framework for Lifelong Test-Time Adaptation, A
* Probabilistic Knowledge Distillation of Face Ensembles
* Probabilistic Prompt Learning for Dense Prediction
* Probability-based Global Cross-modal Upsampling for Pansharpening
* Probing Neural Representations of Scene Perception in a Hippocampally Dependent Task Using Artificial Neural Networks
* Probing Sentiment-Oriented PreTraining Inspired by Human Sentiment Perception Mechanism
* Procedure-Aware Pretraining for Instructional Video Understanding
* ProD: Prompting-to-disentangle Domain Knowledge for Cross-domain Few-shot Image Classification
* Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks
* Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis
* Progressive Neighbor Consistency Mining for Correspondence Pruning
* Progressive Open Space Expansion for Open-Set Model Attribution
* Progressive Random Convolutions for Single Domain Generalization
* Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
* Progressive Spatio-temporal Alignment for Efficient Event-based Motion Estimation
* Progressive Transformation Learning for Leveraging Virtual Images in Training
* Progressively Optimized Local Radiance Fields for Robust View Synthesis
* Promoting Semantic Connectivity: Dual Nearest Neighbors Contrastive Learning for Unsupervised Domain Generalization
* Prompt, Generate, Then Cache: Cascade of Foundation Models Makes Strong Few-Shot Learners
* Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features
* PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery
* Prompting Large Language Models with Answer Heuristics for Knowledge-Based Visual Question Answering
* Propagate and Calibrate: Real-Time Passive Non-Line-of-Sight Tracking
* ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
* Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization
* Protocon: Pseudo-Label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-Supervised Learning
* Prototype-Based Embedding Network for Scene Graph Generation
* Prototypical Residual Networks for Anomaly Detection and Localization
* ProTéGé: Untrimmed Pretraining for Video Temporal Grounding by Video Temporal Grounding
* Proximal Splitting Adversarial Attack for Semantic Segmentation
* ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
* Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge
* Pseudo-Label Guided Contrastive Learning for Semi-Supervised Medical Image Segmentation
* PSVT: End-to-End Multi-Person 3D Pose and Shape Estimation with Progressive Video Transformers
* Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
* PVO: Panoptic Visual Odometry
* PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer
* PyPose: A Library for Robot Learning with Physics-based Optimization
* PyramidFlow: High-Resolution Defect Contrastive Localization Using Pyramid Normalizing Flow
* Q-DETR: An Efficient Low-Bit Quantized Detection Transformer
* Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
* QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation
* Quality-aware Pretrained Models for Blind Image Quality Assessment
* QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity
* Quantitative Manipulation of Custom Attributes on 3D-Aware Image Synthesis
* Quantum Multi-Model Fitting
* Quantum-Inspired Spectral-Spatial Pyramid Network for Hyperspectral Image Classification
* Query-Centric Trajectory Prediction
* Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
* RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-Training
* RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-Consistent Dataset
* Randomized Adversarial Training via Taylor Expansion
* Range-nullspace Video Frame Interpolation with Focalized Motion Estimation
* RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous Driving
* Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate
* RankMix: Data Augmentation for Weakly Supervised Learning of Classifying Whole Slide Images with Diverse Sizes and Imbalanced Categories
* Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks
* Raw Image Reconstruction with Learned Compact Metadata
* Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments
* Re-basin via implicit Sinkhorn differentiation
* Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration
* Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild
* Re-Thinking Federated Active Learning Based on Inter-Class Diversity
* Re-Thinking Model Inversion Attacks Against Deep Neural Networks
* Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
* Real-time 6K Image Rescaling with Rate-distortion Optimization
* Real-Time Controllable Denoising for Image and Video
* Real-Time Evaluation in Online Continual Learning: A New Hope
* Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
* Real-Time Neural Light Field on Mobile Devices
* RealFusion: 360° Reconstruction of Any Object from a Single Image
* REALIMPACT: A Dataset of Impact Sound Fields for Real Objects
* Realistic Saliency Guided Image Enhancement
* ReasonNet: End-to-End Driving with Temporal and Global Reasoning
* Rebalancing Batch Normalization for Exemplar-Based Class-Incremental Learning
* REC-MV: REconstructing 3D Dynamic Cloth from Monocular Videos
* ReCo: Region-Controlled Text-to-Image Generation
* Recognizability Embedding Enhancement for Very Low-Resolution Face Recognition and Quality Estimation
* Recognizing Rigid Patterns of Unlabeled Point Clouds by Complete and Continuous Isometry Invariants with no False Negatives and no False Positives
* Reconstructing Animatable Categories from Videos
* Reconstructing Signing Avatars from Video Using Linguistic Priors
* Recovering 3D Hand Mesh Sequence from a Single Blurry Image: A New Dataset and Temporal Unfolding
* Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models
* Recurrent Homography Estimation Using Homography-Guided Image Warping and Focus Transformer
* Recurrent Vision Transformers for Object Detection with Event Cameras
* ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection
* Reducing the Label Bias for Timestamp Supervised Temporal Action Segmentation
* Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields for Controllable Scene Stylization
* RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension
* Referring Image Matting
* Referring Multi-Object Tracking
* RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis
* RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension
* Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
* Regularization of polynomial networks for image recognition
* Regularize implicit neural representation by itself
* Regularized Vector Quantization for Tokenized Image Synthesis
* Regularizing Second-Order Influences for Continual Learning
* Reinforcement Learning-Based Black-Box Model Inversion Attacks
* Relational Context Learning for Human-Object Interaction Detection
* Relational Space-Time Query in Long-Form Videos
* Reliability in Semantic Segmentation: Are we on the Right Track?
* Reliable and Interpretable Personalized Federated Learning
* ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects
* Relightable Neural Human Assets from Multi-view Gradient Illuminations
* RelightableHands: Efficient Neural Relighting of Articulated Hand Models
* Removing Objects From Neural Radiance Fields
* Renderable Neural Radiance Map for Visual Navigation
* RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
* RepMode: Learning to Re-Parameterize Diverse Experts for Subcellular Structure Prediction
* Representation Learning for Visual Object Tracking by Masked Appearance Transfer
* Representing Volumetric Videos as Dynamic MLP Maps
* Reproducible Scaling Laws for Contrastive Language-Image Learning
* ResFormer: Scaling ViTs with Multi-Resolution Training
* Residual Degradation Learning Unfolding Framework with Mixing Priors Across Spectral and Spatial for Compressive Spectral Imaging
* Resource Problem of Using Linear Layer Leakage Attack in Federated Learning, The
* Resource-Efficient RGBD Aerial Tracking
* Restoration of Hand-Drawn Architectural Drawings using Latent Space Mapping with Degradation Generator
* Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment
* Rethinking Feature-based Knowledge Distillation for Face Recognition
* Rethinking Federated Learning with Domain Shift: A Prototype View
* Rethinking Few-Shot Medical Segmentation: A Vector Quantization View
* Rethinking Gradient Projection Continual Learning: Stability/Plasticity Feature Space Decoupling
* Rethinking Image Super Resolution from Long-Tailed Distribution Learning Perspective
* Rethinking Optical Flow from Geometric Matching Consistent Perspective
* Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need
* Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation
* Rethinking the Correlation in Few-Shot Segmentation: A Buoys View
* Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition
* Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
* Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
* Revealing the Dark Secrets of Masked Image Modeling
* ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration
* Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
* Revisiting Prototypical Network for Cross Domain Few-Shot Learning
* Revisiting Residual Networks for Adversarial Robustness
* Revisiting Reverse Distillation for Anomaly Detection
* Revisiting Rolling Shutter Bundle Adjustment: Toward Accurate and Fast Solution
* Revisiting Rotation Averaging: Uncertainties and Robust Losses
* Revisiting Self-Similarity: Structural Embedding for Image Retrieval
* Revisiting Temporal Modeling for CLIP-Based Image-to-Video Knowledge Transferring
* Revisiting the P3P Problem
* Revisiting the Stack-Based Inverse Tone Mapping
* Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation
* RGB No More: Minimally-Decoded JPEG Vision Transformers
* RGBD2: Generative Scene Synthesis via Incremental View Inpainting Using RGBD Diffusion Models
* RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation with Natural Prompts
* RIAV-MVS: Recurrent-Indexing an Asymmetric Volume for Multi-View Stereo
* RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
* RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer
* Rigidity-Aware Detection for 6D Object Pose Estimation
* RILS: Masked Visual Reconstruction in Language Semantic Space
* RMLVQA: A Margin Loss Approach For Visual Question Answering with Language Biases
* Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence
* Robust 3D Shape Classification via Non-local Graph Attention Network
* Robust and Scalable Gaussian Process Regression and Its Applications
* Robust Dynamic Radiance Fields
* Robust Generalization Against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
* Robust Mean Teacher for Continual and Gradual Test-Time Adaptation
* Robust Model-based Face Reconstruction through Weakly-Supervised Outlier Segmentation
* Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting
* Robust Outlier Rejection for 3D Registration with Variational Bayes
* Robust Single Image Reflection Removal Against Adversarial Attacks
* Robust Test-Time Adaptation in Dynamic Scenarios
* Robust Unsupervised StyleGAN Image Restoration
* RobustNeRF: Ignoring Distractors with Robust Losses
* RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
* Role of Transients in Two-Bounce Non-Line-of-Sight Imaging
* RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval
* Rotation-Invariant Transformer for Point Cloud Matching
* Rotation-Translation-Decoupled Solution for Robust and Efficient Visual-Inertial Initialization, A
* Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
* RUST: Latent Neural Scene Representations from Unposed Imagery
* RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-ray Security Image Synthesis
* S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning
* SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
* Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
* Sample-level Multi-view Graph Clustering
* Samples with Low Loss Curvature Improve Data Efficiency
* Sampling is Matter: Point-Guided 3D Human Mesh Reconstruction
* SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
* SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates
* Scalable, Detailed and Mask-Free Universal Photometric Stereo
* ScaleDet: A Scalable Multi-Dataset Object Detector
* ScaleFL: Resource-Adaptive Federated Learning with Heterogeneous Clients
* ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector
* Scaling Language-Image Pre-Training via Masking
* Scaling up GANs for Text-to-Image Synthesis
* ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images
* ScarceNet: Animal Pose Estimation with Scarce Annotations
* SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy
* Scene-Aware Egocentric 3D Human Pose Estimation
* SceneComposer: Any-Level Semantic Image Synthesis
* SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
* SCoDA: Domain Adaptive Shape Completion for Real Scans
* SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow
* Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
* SCOTCH and SODA: A Transformer Video Shadow Detection Framework
* SCPNet: Semantic Scene Completion on Point Cloud
* SDC-UDA: Volumetric Unsupervised Domain Adaptation Framework for Slice-Direction Continuous Cross-Modality Medical Image Segmentation
* SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
* SE-ORNet: Self-Ensembling Orientation-Aware Network for Unsupervised Point Cloud Shape Correspondence
* Search-Map-Search: A Frame Selection Paradigm for Action Recognition
* Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
* SeaThru-NeRF: Neural Radiance Fields in Scattering Media
* SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations
* Seeing a Rose in Five Thousand Ways
* Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding
* Seeing Electric Network Frequency from Events
* Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container
* Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
* Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
* Seeing With Sound: Long-Range Acoustic Beamforming for Multimodal Scene Understanding
* SegLoc: Learning Segmentation-Based Representations for Privacy-Preserving Visual Localization
* Selective Structured State-Spaces for Long-Form Video Understanding
* Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation
* Self-Guided Diffusion Models
* Self-Positioning Point-Based Transformer for Point Cloud Understanding
* Self-Supervised 3D Scene Flow Estimation Guided by Superpoints
* Self-supervised AutoFlow
* Self-Supervised Blind Motion Deblurring with Deep Expectation Maximization
* Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion
* Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
* Self-Supervised Implicit Glyph Attention for Text Recognition
* Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching
* Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
* Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring
* Self-Supervised Pre-Training with Masked Shape Prediction for 3D Scene Understanding
* Self-Supervised Representation Learning for CAD
* Self-Supervised Super-Plane for Neural 3D Reconstruction
* Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
* SelfME: Self-Supervised Motion Learning for Micro-Expression Recognition
* Semantic Human Parsing via Scalable Semantic Transfer Over Multiple Label Domains
* Semantic Prompt for Few-Shot Image Recognition
* Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention
* Semantic Scene Completion with Cleaner Self
* Semantic-Conditional Diffusion Networks for Image Captioning
* Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation
* Semi-DETR: Semi-Supervised Object Detection with Detection Transformers
* Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module
* Semi-Supervised Domain Adaptation with Source Label Adaptation
* Semi-Supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination
* Semi-supervised learning made simple with self-supervised clustering
* Semi-Supervised Parametric Real-World Image Harmonization
* Semi-Supervised Stereo-Based 3D Object Detection via Cross-View Consensus
* Semi-Supervised Video Inpainting with Cycle Consistency Constraints
* Semi-Weakly Supervised Object Kinematic Motion Prediction
* SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation
* Semidefinite Relaxations for Robust Multiview Triangulation
* SeqTrack: Sequence to Sequence Learning for Visual Object Tracking
* Sequential Training of GANs Against GAN-Classifiers Reveals Correlated Knowledge Gaps Present Among Independently Trained GAN Instances
* SeSDF: Self-Evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction
* SFD2: Semantic-Guided Feature Detection and Description
* SfM-TTR: Using Structure from Motion for Test-Time Refinement of Single-View Depth Networks
* SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization
* ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
* ShadowNeuS: Neural SDF Reconstruction by Shadow Ray Supervision
* Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
* Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion
* Shape-Aware Text-Driven Layered Video Editing
* Shape-Constraint Recurrent Flow for 6D Object Pose Estimation
* Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification
* ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-Based Consistency
* ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations
* Sharpness-Aware Gradient Matching for Domain Generalization
* Shepherding Slots to Objects: Towards Stable and Robust Object-Centric Learning
* Shifted Diffusion for Text-to-image Generation
* Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations
* SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds
* Siamese DETR
* Siamese Image Modeling for Self-Supervised Vision Representation Learning
* Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition
* Side Adapter Network for Open-Vocabulary Semantic Segmentation
* SIEDOB: Semantic Image Editing by Disentangling Object and Background
* SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation
* Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding
* Similarity Metric Learning For RGB-Infrared Group Re-Identification
* Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift, A
* Simple Cues Lead to a Strong Multi-Object Tracker
* Simple Framework for Text-Supervised Semantic Segmentation, A
* SimpleNet: A Simple Network for Image Anomaly Detection and Localization
* SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network
* Simulated Annealing in Early Layers Leads to Better Generalization
* Simultaneously Short- and Long-Term Temporal Modeling for Semi-Supervised Video Semantic Segmentation
* SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field
* SINE: SINgle Image Editing with Text-to-Image Diffusion Models
* Single Domain Generalization for LiDAR Semantic Segmentation
* Single Image Backdoor Inversion via Robust Smoothed Classifiers
* Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
* Single View Scene Scale Estimation using Scale Field
* SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene
* Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings
* SketchXAI: A First Look at Explainability for Human Sketches
* Skinned Motion Retargeting with Residual Perception of Motion Semantics and Geometry
* SkyEye: Self-Supervised Bird's-Eye-View Semantic Mapping Using Monocular Frontal View Images
* SLACK: Stable Learning of Augmentations with Cold-Start and KL Regularization
* Sliced Optimal Partial Transport
* SliceMatch: Geometry-Guided Aggregation for Cross-View Pose Estimation
* Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
* Slimmable Dataset Condensation
* SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments
* SlowLiDAR: Increasing the Latency of LiDAR-Based Detection Using Adversarial Examples
* SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders
* Smallcap: Lightweight Image Captioning Prompted with Retrieval Augmentation
* SmartAssign: Learning A Smart Knowledge Assignment Strategy for Deraining and Desnowing
* SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
* SMOC-Net: Leveraging Camera Pose for Self-Supervised Monocular Object Pose Estimation
* SMPConv: Self-Moving Point Representations for Continuous Convolution
* Soft Augmentation for Image Classification
* Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks
* Solving 3D Inverse Problems Using Pre-Trained 2D Diffusion Models
* Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
* Solving Relaxations of MAP-MRF Problems: Combinatorial in-Face Frank-Wolfe Directions
* Soma Segmentation Benchmark in Full Adult Fly Brain, A
* SOOD: Towards Semi-Supervised Oriented Object Detection
* Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
* Source-Free Adaptive Gaze Estimation by Uncertainty Reduction
* Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning
* SPARF: Neural Radiance Fields from Sparse and Noisy Poses
* Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images
* SparseFusion: Distilling View-Conditioned Diffusion for 3D Reconstruction
* Sparsely Annotated Semantic Segmentation with Adaptive Gaussian Mixtures
* SparsePose: Sparse-View Camera Pose Regression and Refinement
* SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
* Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
* SpaText: Spatio-Textual Representation for Controllable Image Generation
* Spatial-Frequency Mutual Learning for Face Super-Resolution
* Spatial-temporal Concept based Explanation of 3D ConvNets
* Spatial-then-Temporal Self-Supervised Learning for Video Correspondence
* Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising
* Spatio-Focal Bidirectional Disparity Estimation from a Dual-Pixel Image
* Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation
* Spatiotemporal Self-Supervised Learning for Point Clouds in the Wild
* Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models to Learn Any Unseen Style
* Spectral Bayesian Uncertainty for Image Super-Resolution
* Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising
* Sphere-Guided Training of Neural Implicit Surfaces
* Spherical Transformer for LiDAR-Based 3D Recognition
* Spider GAN: Leveraging Friendly Neighbors to Accelerate GAN Training
* SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
* SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
* Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo
* SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection
* sRGB Real Noise Synthesizing with Neighboring Correlation-Aware Noise Model
* Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking
* STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
* StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
* Stare at What You See: Masked Image Modeling without Reconstruction
* Starting from Non-Parametric Networks for 3D Point Cloud Analysis
* STDLens: Model Hijacking-Resilient Federated Learning for Object Detection
* SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory
* StepFormer: Self-Supervised Step Discovery and Localization in Instructional Videos
* Stimulus Verification is a Universal and Effective Sampler in Multi-modal Human Trajectory Prediction
* Stitchable Neural Networks
* STMixer: A One-Stage Sparse Action Detector
* STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
* Streaming Video Model
* Strong Baseline for Generalized Few-Shot Semantic Segmentation, A
* Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
* Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising
* Structured 3D Features for Reconstructing Controllable Avatars
* Structured Kernel Estimation for Photon-Limited Deconvolution
* Structured Sparsity Learning for Efficient Video Super-Resolution
* StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
* Style Projected Clustering for Domain Generalized Semantic Segmentation
* StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
* StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
* StyleGene: Crossover and Mutation of Region-level Facial Genes for Kinship Face Synthesis
* StyleIPSB: Identity-Preserving Semantic Basis of StyleGAN for High Fidelity Face Swapping
* StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN
* StyleRF: Zero-Shot 3D Style Transfer of Neural Radiance Fields
* StyLess: Boosting the Transferability of Adversarial Examples
* StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator
* SUDS: Scalable Urban Dynamic Scenes
* SunStage: Portrait Reconstruction and Relighting Using the Sun as a Light Stage
* Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
* Super-Resolution Neural Operator
* Superclass Learning with Representation Enhancement
* SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail
* Supervised Masked Knowledge Distillation for Few-Shot Transformers
* SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
* SVFormer: Semi-supervised Video Transformer for Action Recognition
* SVGformer: Representation Learning for Continuous Vector Graphics using Transformers
* SViTT: Temporal Learning of Sparse Video-Text Transformers
* Swept-Angle Synthetic Wavelength Interferometry
* Switchable Representation Learning Framework with Self-Compatibility
* Symmetric Shape-Preserving Autoencoder for Unsupervised Real Scene Point Cloud Completion
* Synthesizing Photorealistic Virtual Humans Through Cross-Modal Disentanglement
* SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
* System-Status-Aware Adaptive Network for Online Streaming Video Understanding
* T-SEA: Transfer-Based Self-Ensemble Attack on Object Detection
* Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
* Tangentially Elongated Gaussian Belief Propagation for Event-Based Incremental Optical Flow Estimation
* TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
* Target-referenced Reactive Grasping for Dynamic Objects
* TarViS: A Unified Approach for Target-Based Video Segmentation
* Task Difficulty Aware Parameter Allocation and Regularization for Lifelong Learning
* Task Residual for Tuning Vision-Language Models
* Task-Specific Fine-Tuning via Variational Information Bottleneck for Weakly-Supervised Pathology Whole Slide Image Classification
* TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving
* Teacher-generated spatial-attention labels boost robustness and accuracy of contrastive models
* Teaching Matters: Investigating the Role of Supervision in Vision Transformers
* Teaching Structured Vision and Language Concepts to Vision and Language Models
* Teleidoscopic Imaging System for Microscale 3D Shape Reconstruction
* Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
* Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
* Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving
* Temporal Interpolation is all You Need for Dynamic Neural Radiance Fields
* Temporally Consistent Online Depth Estimation Using Point-Based Fusion
* TempSAL - Uncovering Temporal Information for Deep Saliency Prediction
* TensoIR: Tensorial Inverse Rendering
* Tensor4D: Efficient Neural 4D Decomposition for High-Fidelity Dynamic Reconstruction and Rendering
* TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation
* Test of Time: Instilling Video-Language Models with a Sense of Time
* Test Time Adaptation with Regularized Loss for Weakly Supervised Salient Object Detection
* TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation
* Text with Knowledge Graph Augmented Transformer for Video Captioning
* Text-Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation
* Text-Visual Prompting for Efficient 2D Temporal Video Grounding
* Text2Scene: Text-driven Indoor Scene Stylization with Part-Aware Details
* Texts as Images in Prompt Tuning for Multi-Label Image Recognition
* Texture-Guided Saliency Distilling for Unsupervised Salient Object Detection
* Therbligs in Action: Video Understanding through Motion Primitives
* Thermal Spread Functions (TSF): Physics-Guided Material Classification
* Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
* Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning
* TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition
* TINC: Tree-Structured Implicit Neural Compression
* TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
* TIPI: Test Time Adaptation with Transformation Invariance
* TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering
* Token Boosting for Robust Self-Supervised Visual Transformer Pre-training
* Token Contrast for Weakly-Supervised Semantic Segmentation
* Token Turing Machines
* TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers
* Top-Down Visual Attention from Analysis by Synthesis
* TopDiG: Class-Agnostic Topological Directional Graph Extraction from Remote Sensing Images
* TOPLight: Lightweight Neural Networks with Task-Oriented Pretraining for Visible-Infrared Recognition
* TopNet: Transformer-Based Object Placement Network for Image Compositing
* Topology-Guided Multi-Class Cell Context Generation for Digital Pathology
* ToThePoint: Efficient Contrastive Learning of 3D Point Clouds via Recycling
* Toward Accurate Post-Training Quantization for Image Super Resolution
* Toward RAW Object Detection: A New Benchmark and A New Model
* Toward Stable, Interpretable, and Lightweight Hyperspectral Super-Resolution
* Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
* Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval
* Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization
* Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information
* Towards Artistic Image Aesthetics Assessment: A Large-scale Dataset and a New Method
* Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks
* Towards Better Decision Forests: Forest Alternating Optimization
* Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment
* Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation
* Towards Bridging the Performance Gaps of Joint Energy-Based Models
* Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration
* Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
* Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
* Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition
* Towards Effective Visual Representations for Partial-Label Learning
* Towards Efficient Use of Multi-Scale Features in Transformer-Based Object Detectors
* Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
* Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
* Towards Flexible Multi-modal Document Models
* Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
* Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
* Towards Modality-Agnostic Person Re-identification with Descriptive Query
* Towards Open-World Segmentation of Parts
* Towards Practical Plug-and-Play Diffusion Models
* Towards Professional Level Crowd Annotation of Expert Domain Data
* Towards Realistic Long-Tailed Semi-Supervised Learning: Consistency is All You Need
* Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution
* Towards Scalable Neural Representation for Diverse Videos
* Towards Stable Human Pose Estimation via Cross-View Fusion and Foot Stabilization
* Towards Transferable Targeted Adversarial Examples
* Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision
* Towards Unbiased Volume Rendering of Neural Implicit Surfaces with Geometry Priors
* Towards Unified Scene Text Spotting Based on Sequence Generation
* Towards Universal Fake Image Detectors that Generalize Across Generative Models
* Towards Unsupervised Object Detection from LiDAR Point Clouds
* Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
* TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
* Tracking Multiple Deformable Objects in Egocentric Videos
* Tracking Through Containers and Occluders in the Wild
* Trade-off between Robustness and Accuracy of Vision Transformers
* Train-Once-for-All Personalization
* Train/Test-Time Adaptation with Retrieval
* Trainable Projected Gradient Method for Robust Fine-Tuning
* Training Debiased Subnetworks with Contrastive Weight Pruning
* Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
* Transductive Few-Shot Learning with Prototype-Based Label Propagation by Iterative Graph Refinement
* Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution
* Transfer4D: A Framework for Frugal Motion Capture and Deformation Transfer
* Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization
* TransFlow: Transformer as Flow Learner
* Transformer Scale Gate for Semantic Segmentation
* Transformer-Based Learned Optimization
* Transformer-based Unified Recognition of Two Hands Manipulating Objects
* Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization
* TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification
* Trap Attention: Monocular Depth Estimation with Manual Traps
* Treasure Beneath Multiple Annotations: An Uncertainty-Aware Edge Detector, The
* Tree Instance Segmentation with Temporal Contour Graph
* Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
* TriDet: Temporal Action Detection with Relative Boundary Modeling
* TriVol: Point Cloud Rendering via Triple Volumes
* TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
* TrojViT: Trojan Insertion in Vision Transformers
* TruFor: Leveraging All-Round Clues for Trustworthy Image Forgery Detection and Localization
* TryOnDiffusion: A Tale of Two UNets
* TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation
* Tunable Convolutions with Parametric Multi-Loss Optimization
* Turning a CLIP Model into a Scene Text Detector
* Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks
* Twin Contrastive Learning with Noisy Labels
* TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization
* Two-shot Video Object Segmentation
* Two-stage Co-segmentation Network Based on Discriminative Representation for Recovering Human Mesh from Videos
* Two-Stream Networks for Weakly-Supervised Temporal Action Localization with Semantic-Aware Mechanisms
* Two-View Geometry Scoring Without Correspondences
* Two-Way Multi-Label Loss
* UDE: A Unified Driving Engine for Human Motion Generation
* ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
* Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark
* Ultrahigh Resolution Image/Video Matting with Spatio-Temporal Sparsity
* UMat: Uncertainty-Aware Single Image High Resolution Material Capture
* Unbalanced Optimal Transport: A Unified Framework for Object Detection
* Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
* Unbiased Scene Graph Generation in Videos
* Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection
* Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior
* Uncertainty-Aware Vision-Based Metric Cross-View Geolocalization
* Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
* Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction
* Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
* Understanding and Constructing Latent Modality Structures in Multi-Modal Representation Learning
* Understanding and Improving Features Learned in Deep Functional Maps
* Understanding and Improving Visual Prompting: A Label-Mapping Perspective
* Understanding Deep Generative Models with Generalized Empirical Likelihoods
* Understanding Imbalanced Semantic Segmentation Through Neural Collapse
* Understanding Masked Autoencoders via Hierarchical Latent Variable Models
* Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
* Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving
* Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
* Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection
* Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge
* UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration
* UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
* UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View
* Unified HDR Imaging Method with Pixel and Patch Level, A
* Unified Keypoint-Based Action Recognition Framework via Structured Keypoint Pooling
* Unified Knowledge Distillation Framework for Deep Directed Graphical Models, A
* Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
* Unified Pose Sequence Modeling
* Unified Pyramid Recurrent Network for Video Frame Interpolation, A
* Unified Spatial-Angular Structured Light for Single-View Acquisition of Shape and Reflectance, A
* Unifying Layout Generation with a Decoupled Diffusion Model
* Unifying Short and Long-Term Tracking with Graph Hierarchies
* Unifying Vision, Text, and Layout for Universal Document Processing
* UniHCP: A Unified Model for Human-Centric Perceptions
* UniSim: A Neural Closed-Loop Sensor Simulator
* Unite and Conquer: Plug and Play Multi-Modal Synthesis Using Diffusion Models
* Universal Instance Perception as Object Discovery and Retrieval
* Unknown Sniffer for Object Detection: Don't Turn a Blind Eye to Unknown Objects
* Unlearnable Clusters: Towards Label-Agnostic Unlearnable Examples
* Unpaired Image-to-Image Translation with Shortest Path Regularization
* Unsupervised 3D Point Cloud Representation Learning by Triangle Constrained Contrast for Autonomous Driving
* Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
* Unsupervised Continual Semantic Adaptation Through Neural Rendering
* Unsupervised Contour Tracking of Live Cells by Mechanical and Cycle Consistency Losses
* Unsupervised Cumulative Domain Adaptation for Foggy Scene Optical Flow
* Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity
* Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
* Unsupervised Domain Adaption with Pixel-Level Discriminator for Image-Aware Layout Generation
* Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors
* Unsupervised Intrinsic Image Decomposition with LiDAR Intensity
* Unsupervised Object Localization: Observing the Background to Discover Objects
* Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction
* Unsupervised Space-Time Network for Temporally-Consistent Segmentation of Multiple Motions
* Unsupervised Visible-Infrared Person Re-Identification via Progressive Graph Matching and Alternate Learning
* Unsupervised Volumetric Animation
* Upcycling Models Under Domain and Category Shift
* Use Your Head: Improving Long-Tail Video Recognition
* UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement
* UV Volumes for Real-time Rendering of Editable Free-view Human Performance
* V2V4Real: A Real-World Large-Scale Dataset for Vehicle-to-Vehicle Cooperative Perception
* V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting
* Variational Distribution Learning for Unsupervised Text-to-Image Generation
* VDN-NeRF: Resolving Shape-Radiance Ambiguity via View-Dependence Normalization
* VecFontSDF: Learning to Reconstruct and Synthesize High-Quality Vector Fonts via Signed Distance Functions
* Vector Quantization with Self-Attention for Quality-Independent Representation Learning
* VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation
* VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
* VGFlow: Visibility guided Flow Network for Human Reposing
* Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition
* Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
* Video Compression with Entropy-Constrained Neural Representations
* Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
* Video Event Restoration Based on Keyframes for Video Anomaly Detection
* Video Probabilistic Diffusion Models in Projected Latent Space
* Video Test-Time Adaptation for Action Recognition
* Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
* VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
* VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
* VideoTrack: Learning to Track Objects via Video Transformer
* ViewNet: A Novel Projection-Based Backbone with View Pooling for Few-shot Point Cloud Classification
* Viewpoint Equivariance for Multi-View 3D Object Detection
* VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining
* ViLEM: Visual-Language Error Modeling for Image-Text Retrieval
* VindLU: A Recipe for Effective Video-and-Language Pretraining
* ViP3D: End-to-End Visual Trajectory Prediction via 3D Agent Queries
* ViPLO: Vision Transformer Based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection
* Virtual Occlusions Through Implicit Depth
* Virtual Sparse Convolution for Multimodal 3D Object Detection
* VisFusion: Visibility-Aware Online 3D Scene Reconstruction from Videos
* Visibility Aware Human-Object Interaction Tracking from Single RGB Camera
* Visibility Constrained Wide-Band Illumination Spectrum Design for Seeing-in-the-Dark
* Vision Transformers are Good Mask Auto-Labelers
* Vision Transformers are Parameter-Efficient Audio-Visual Learners
* Visual Atoms: Pre-Training Vision Transformers with Sinusoidal Waves
* Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
* Visual DNA: Representing and Comparing Images Using Distributions of Neuron Activations
* Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
* Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
* Visual Localization using Imperfect 3D Models from the Internet
* Visual Programming: Compositional visual reasoning without training
* Visual Prompt Multi-Modal Tracking
* Visual Prompt Tuning for Generative Transfer Learning
* Visual Query Tuning: Towards Effective Usage of Intermediate Representations for Parameter and Memory Efficient Transfer Learning
* Visual Recognition by Request
* Visual Recognition-Driven Image Restoration for Multiple Degradation with Intrinsic Semantics Recovery
* Visual-Language Prompt Tuning with Knowledge-Guided Context Optimization
* Visual-Tactile Sensing for In-Hand Object Reconstruction
* Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
* ViTs for SITS: Vision Transformers for Satellite Image Time Series
* VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs
* VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
* VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision
* vMAP: Vectorised Object Mapping for Neural Field SLAM
* VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution
* VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
* VoP: Text-Video Co-Operative Prompt Tuning for Cross-Modal Retrieval
* VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
* VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion
* VQACL: A Novel Visual Question Answering Continual Learning Setting
* Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
* Wavelet Diffusion Models are fast and scalable Image Generators
* Weak-shot Object Detection through Mutual Knowledge Transfer
* Weakly Supervised Class-agnostic Motion Prediction for Autonomous Driving
* Weakly Supervised Monocular 3D Object Detection Using Multi-View Projection and Direction Consistency
* Weakly Supervised Posture Mining for Fine-Grained Classification
* Weakly Supervised Segmentation with Point Annotations for Histopathology Images via Contrast-Based Variational Model
* Weakly Supervised Semantic Segmentation via Adversarial Learning of Classifier and Reconstructor
* Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training
* Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network
* Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos
* Weakly-Supervised Domain Adaptive Semantic Segmentation with Prototypical Contrastive Learning
* Weakly-supervised Single-view Image Relighting
* WeatherStream: Light Transport Automation of Single Image Deweathering
* Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others, A
* What Can Human Sketches Do for Object Detection?
* What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging
* What You Can Reconstruct from a Shadow
* Where is My Spot? Few-shot Image Generation via Latent Subspace Optimization
* Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization
* Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
* Why is the Winner the Best?
* Wide-Angle Rectification via Content-Aware Conformal Mapping
* WildLight: In-the-wild Inverse Rendering with a Flashlight
* WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
* WINNER: Weakly-supervised hIerarchical decompositioN and aligNment for spatio-tEmporal video gRounding
* WIRE: Wavelet Implicit Neural Representations
* Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction, The
* X-Avatar: Expressive Human Avatars
* X-Pruner: eXplainable Pruning for Vision Transformers
* X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection
* YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors
* You Are Catching My Attention: Are Vision Transformers Bad Learners under Backdoor Attacks?
* You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
* You Do Not Need Additional Priors or Regularizers in Retinex-Based Low-Light Image Enhancement
* You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
* You Only Segment Once: Towards Real-Time Panoptic Segmentation
* ZBS: Zero-Shot Background Subtraction via Instance-Level Background Modeling and Foreground Selection
* ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
* Zero-Shot Dual-Lens Super-Resolution
* Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style
* Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning
* Zero-Shot Model Diagnosis
* Zero-Shot Noise2Noise: Efficient Image Denoising without any Data
* Zero-Shot Object Counting
* Zero-shot Pose Transfer for Unrigged Stylized 3D Characters
* Zero-shot Referring Image Segmentation with Global-Local Context Features
* Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
* À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting
2356 for CVPR23

CVPR4HB08 * *CVPR for Human Communicative Behavior Analysis
* American Sign Language Lexicon Video Dataset, The
* Associating audio-visual activity cues in a dominance estimation framework
* Automatic facial expression recognition for intelligent tutoring systems
* B-spline polynomial descriptors for human activity recognition
* Distributed segmentation and classification of human actions using a wearable motion sensor network
* HO2: A new feature for multi-agent event detection and recognition
* Multimodal real-time focus of attention estimation in SmartRooms
* Principal appearance and motion from boosted spatiotemporal descriptors
* Remote and head-motion-free gaze tracking for real environments with automated head-eye model calibrations
* Speaker detection using the timing structure of lip motion and sound
* Technique for automatic emotion recognition by body gesture analysis
* Towards fast, view-invariant human action recognition
13 for CVPR4HB08

CVPR4HB09 * *CVPR for Human Communicative Behavior Analysis
* Action recognition via local descriptors and holistic features
* alignment based similarity measure for hand detection in cluttered sign language video, An
* Audiovisual event detection towards scene understanding
* Auditory dialog analysis and understanding by generative modelling of interactional dynamics
* Automatic recognition of fingerspelled words in British Sign Language
* Automatically detecting action units from faces of pain: Comparing shape and appearance features
* Dominance detection in face-to-face conversations
* framework for automated measurement of the intensity of non-posed Facial Action Units, A
* Fusion by optimal dynamic mixtures of proposal distributions
* Head pose estimation using Spectral Regression Discriminant Analysis
* implicit spatiotemporal shape model for human activity localization and recognition, An
* Modeling and exploiting the spatio-temporal facial action dependencies for robust spontaneous facial expression recognition
* Multi-modal laughter recognition in video conversations
* Physiological modelling for improved reliability in silhouette-driven gradient-based hand tracking
* Robust facial action recognition from real-time 3D streams
* Social Signal Processing: Understanding social interactions through nonverbal behavior analysis
* Use of Active Appearance Models for analysis and synthesis of naturally occurring behavior
18 for CVPR4HB09

CVPR4HB10 * Action recognition based on a bag of 3D points
* Annotation and taxonomy of gestures in lecture videos
* Attention estimation by simultaneous observation of viewer and view
* Automatic segmentation of video to aid the study of faucet usability for older adults
* Capturing appearance variation in active appearance models
* Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression, The
* Facial action unit detection: 3D versus 2D modality
* Facial Expression Invariant Head Pose Normalization Using Gaussian Process Regression
* Facial expression recognition using Gabor motion energy filters
* Facial expressions as feedback cue in human-robot interaction: A comparison between human and automatic recognition performances
* Hierarchical preference learning for light control from user feedback
* Human activity recognition in video using a hierarchical probabilistic latent model
* Learning spatial weighting via quadratic programming for facial expression analysis
* novel approach to American Sign Language (ASL) phrase verification using reversed signing, A
14 for CVPR4HB10

CVPR4HB11 * *CVPR for Human Communicative Behavior Analysis
* Automatic visual mimicry expression analysis in interpersonal interaction
* common framework for real-time emotion recognition and facial action unit detection, A
* Facial behaviometrics: The case of facial deformation in spontaneous smile/laughter
* Learning human behaviour patterns in work environments
* Localizing actions through sequential 2D video projections
* Recognizing expressions from face and body gesture by temporal normalized motion and appearance features
* Sparse representations of image gradient orientations for visual recognition and tracking
* Towards an Optimal Affect-Sensitive Instructional System of cognitive skills
9 for CVPR4HB11

CVPR83 * *CVPR
* Absorption Edge Detector, An
* Automatic 3D Reconstruction from 2D Geometric Part Descriptions
* Automatic 3D Reconstruction from 2D geometric Part Descriptions
* Classification of Boundaries on the Plane Using Stochastic Models
* Corner Detection Using the Facet Model
* Determination of Rotational and Translational Components of a Flow Field using a Content Addressable Parallel Processor
* Determining 3-D Motion and Structure of a Rigid Body Over 3 Frames Using Straight Line Correspondence
* Distortion Invariant Recognition Using a Moment Feature Space
* Edge Detection by Estimation of Multiple-Order Derivatives
* Effects of Distortions on the Recognition Rate of a Structural OCR System
* Estimating 3-D Motion Parameters and Object Surface Structures from the Image Motion of Curved Edges
* Extracting Elliptical Figures from an Edge Vector Field
* Fractal-Based Description of Natural Scenes
* Gaussian Blur And The Heat Equation: Forward And Inverse Solutions
* Handprinted Kanji Recognition by Feature Matching Methods and Its Application to Personal OCR
* Image Motion Analysis Using the Concept of Weak Solutions to Distributed Parameter Systems
* Image Restoration by Transformation of Signal-Dependent Noise to Signal-Independent Noise
* Image Segmentation and Nucleus Classification for Automated Tissue Section Analysis
* Inferring Motion of Cylindrical Object from Shading
* Intensity Guided Range Sensing Recognition of Three-Dimensional Objects
* Linear Delineation
* Matching Regions in Aerial Images
* Measures of Homogeneity in Texture
* Measuring the Shape of 3-D Objects
* Method to Restore Multichannel Imagery, A
* Minimal Error Region Merging Technique for Segmentation, A
* Model Based Vision System to Identify and Locate Partially Visible Industrial Parts, A
* Moment Based Two-Dimensional Edge Operator, A
* Moment Techniques in Picture Analysis
* Multi-Resolution Flow-Through Motion Analysis
* Multi-Resolution Representation for Shape, A
* New Product Graph Based Algorithm for Subgraph Isomorphism, A
* Nonparametric Edge Detection With an Assumption on Minimum Edge Height
* Normalized Quadtree Representation, A
* Obtaining 3-D from Shadows in Aerial Images
* Octree Representations of Moving Objects
* Optimal Color Quantization for Color Display
* Optimal Perceptual Inference
* Perceptual Organization and Curve Partitioning
* Perceptual Segmentation of Nonhomogeneous Dot Patterns
* Quaternions in Computer Vision and Robotics
* Recovering Motion Parameters in Scenes Containing Multiple Moving Objects
* Rectangular Coding for Binary Images
* Relationship between Image Irradiance and Surface Orientation, The
* Representation and Segmentation of Document Images
* Representation of Images Using Voronoi Tessellation
* Representation of Regions of Map Data for Efficient Comparison and Retrieval
* Rotation Invariant Texture Classification Using Circular Random Field Models
* Scene Matching by Hierarchical Correlation
* Segmentation of Images Using a Relaxation Technique
* Segmentation of Range Data into Planar and Quadratic Patches
* Semi-Consistency: A Solution to the No-Label Problem
* Shape Analysis of Three Dimensional Objects Using the Method of Moments
* Solving Three-Dimensional Small-Rotation Motion Equations
* Step-Wise Optimization for Hierarchical Picture Segmentation
* Three-Dimensional Feature Extraction
* Using Quadtrees to Represent Polygonal Maps
* Using Symbolic Differences to Organize Relational Models
* Virtual Quadtrees
* Visual Inspection Using Linear Features
61 for CVPR83

CVPR85 * *CVPR
* 2-D Object Recognition Using Hierarchical Boundary Segments
* 3-D Model Building using CAGD Techniques
* 3-D Surface Discrimination from Local Curvature Measures
* 3D-Profile Method for Object Recognition, The
* Acquisition of Randomly Oriented Workpieces Through Structure Mating
* Alignment and Connection of Fragmented Linear Features in Aerial Imagery
* Analysis and Performance of Two Middle-Level Vision Tasks on a Fine-Grained SIMD Tree Machine, The
* Analyzing Orthographic Projection of Multiple 3-D Velocity Vector Fields in Optical Flow
* Approach to Smart Document Reader System
* Architecture for Automatic Lipreading to Enhance Speech Recognition, An
* Automatic Comparison of 2-D Electrophoretic Gels
* Boundary Detection by Minimizing Functionals
* Capturing the Local Structure of Image Discontinuities in Two Dimensions
* Centroid Tracking Scheme in a Weighted Coordinate System, A
* Comparison of Several Estimators of Edge Point in Noisy Digital Data Across a Step Edge, A
* Computational Approach to Remote Sensing, A
* Computational Theory of Stereo Vision, A
* Computer Vision System That Analyses CT-Scans of Sawlogs, A
* Computing Geometrical Features of Digital Objects in General Purpose Image Processing Pipeline Architectures
* Construction of Surface Representation from 3-D Volumetric Scene Description
* CONTAM: An Edge-Based Approach to Segmenting Images with Irregular Objects
* Cross-Angle Transform for Viewer-Independent Recognition of 3-D Objects
* CSG-EESI Scheme for Representing Solids and a Conversion Expert System, The
* Depth from Stereo
* Design of an Expert System for Object Classification Through an Application to the Classification of Galaxies
* Displacement Vector Field Computation by Temporal Covariance Model
* Edge Detection from Local Negative Maximum of Second Derivative
* Edge Detection with Subpixel Precision
* Empirical Results with a Model of Color Vision
* Enhanced Mesh Connected VLSI Architecture for Parallel Image Processing, An
* Equivalent Descriptions of Generalized Cylinders
* Estimating and Recognizing Parameterized 3-D Objects Using a Moving Camera
* Estimation of Object Motion Parameters from Noisy Images
* Estimation of Position and Displacement in Space from Two Images
* Experiments in Estimation of 3-D Motion Parameters from a Sequence of Image Frames
* Extraction of the Fair Document from Mixed Mode Manuscript
* Fast Stereo Matching of Edge Segments Using Prediction and Verification of Hypotheses
* Filtering Closed Curves
* Finding Range From Stereo Images
* Gaussian Blur And The Heat Equation: Forward And Inverse Solutions
* Hey Robot... Looking for Cones?
* Histogram Analysis Using A Scale Space Approach
* Iconic and Symbolic Processing Using a Content Addressable Array Parallel Processor
* Image Flow: Fundamentals and Future Research
* Improved Fast Parallel Thinning Algorithm for Digital Patterns, An
* Incrementally Constructing a Spatial Representation Using a Moving Camera
* Inherent Ambiguities in Recovering 3-D Motion and Structure from a Noisy Flow Field
* Intrinsic and Extrinsic Surface Characteristics
* Investigation of Adaptable Vision System for Factory Automation, An
* Learning Structural Descriptions of Shape
* Machine Perception of Partially Obscured Planar Shapes
* Machine Vision Techniques for Finding Sugarcane Seedeyes
* Matching Three-Dimensional Symbolic Descriptions Obtained from Multiple Views of a Scene
* Method for Calculating Velocity in Time Dependent Images Based on the Continuity Equation, A
* Model for Describing Surface Shape, A
* Model-based Shape from Contour and Point Patterns
* Modified Hough Transform for Lines, A
* Motion Problem: A Decomposition-Based Solution, The
* Multi-channel Filtering Methods for Segmentation of Color Images
* New Approach to the Continuous Labeling Problem, A
* New Precise Recognition Method for Handprinted Kanji, A
* New Vision System and the Fourier Descriptor Method by Group Representation Theory, A
* Octree Representation of Objects in Arbitrary Motion: Representation and Efficiency
* On the Use of Computer Vision Techniques for Automatic Speech Recognition
* Parallel and Hierarchical Algorithm for Region Growing, A
* Parallel Image Thinning and Vectorization on PASM
* Parallel Polyhedral Shape Recognition
* Parallel Processing of Iconic to Symbolic Transformation of Images
* Piecing Together the 3D Shape of Moving Objects: An Overview
* Pipelined Processor for Low-Level Vision, A
* Practical Basis Set for Chinese Character Recognition, A
* Range Image Understanding
* Recognition and Knowledge Synthesis of 3-D Object Image Based on Attributed Hypergraph
* Recognition of Conics in Contours Using Their Geometrical Properties
* Relaxation for Point-Pattern Matching: What It Really Computes
* Robust Algorithms for Motion Estimation Based on Two Sequential Stereo Image Pairs
* Second Directional Derivative Zero Crossing Detector Using the Cubic Facet Model
* Segment Quadtree: A Linear Quadtree-Based Representation for Linear Features, The
* Segmentation of Images Based on Intensity Gradient Information
* Sequential Edge Detection in Correlated Random Fields
* Shape Analysis of Three Dimensional Objects Using Range Information
* Shape From Rotation
* Solving Random-Dot Stereograms Using the Heat Equation
* Stereo Reconstruction of Scene Depth
* Structure from Motion: An Alternative Approach
* Surface Reconstruction from Planar Cross Sections
* Systolic Convolver for Parallel, Multiresolution Edge Detection, A
* Two View Motion Analysis
* Two-Dimensional Object Recognition Using Multiresolution Models
* Use of Compound Predicates in Split-and-Merge Segmentation
* Use of the Delaunay Triangulation for the Identification and the Localization of Objects
92 for CVPR85

CVPR86 * *CVPR
* 3-D Model Construction from Multiple Views Using Range and Intensity Data
* 3-D Objects Recognition by State Space Search: Optimal Geometric Matching
* 3D Model Building using a Fast Range Finder
* Algorithm for the Segmentation of Bilevel Images, An
* Automatic Detection of Address Blocks on Irregular Mail Pieces
* Automatic Road Network Detection on Aerial Photographs
* Calculating Object Size from Stereoscopic Motion
* Calibration Problem for Stereo, The
* Camera Rotation Invariance of Image Characteristics
* Characterization of Some Drilled Pit Images from Contour and Ridge Points Detection
* Classification of Partial 2-D Shapes Using Fourier Descriptors
* Coarse-to-Fine Control Strategy for Stereo and Motion on a Mesh-connected Computer, A
* Computation of Volume/Surface Octrees from Contours and Silhouettes of Multiple Views
* Computational Approach to Visual Word Recognition: Hypothesis Generation and Testing, A
* Depth from Three Camera Stereo
* Description of Surfaces from Range Data using Curvature Properties
* Detecting Half-Edges and Vertices in Images
* Edge Detection Using the Directional Derivatives of a Space Varying Correlated Random Field Model
* Efficient and Accurate Camera Calibration Technique for 3-D Machine Vision, An
* Efficient Octree Generation from Silhouettes
* Egomotion Using Active Vision
* Fast Polygonal Approximation Method for Real-Time Shape Recognition, A
* Feature Identification for Hybrid Structural/Statistical Pattern Classification
* Generalized Cone Descriptions from Sparse 3-D Data
* Geometric Grouping Applied to Straight Lines
* Grayscale Morphology
* Implementation of Conditional Processing and Pyramids with a General Purpose Pipelined Pixel Processor
* Improved Approach to Connected Component Labeling of Images, An
* Information Fusion Problem and Rule Based Hypothesis Applied to Complex Aggregations of Image Events, The
* Invariant Surface Reconstruction Using Weak Continuity Constraints
* Kinematics of a Rigid Object from a Sequence of Noisy Images: A Batch Approach
* Labeling Connected Components on a Massively Parallel Tree Machine
* Local Rotational Symmetries
* Machine Printed Chinese and Japanese Character Recognition Method and Experiments for Reading Japanese Pocket Books
* Method of Image Guided Vehicle Using White Line Recognition, A
* Midline Model Based Segmentation
* Model Based Analysis of Industrial Scenes
* Motion Estimation from 3-D Point Sets with and without Correspondences
* Motion Estimation: The Proper Formulation for when 3 or 4 Frames Are Available
* Motion Stereo Using Ego-Motion Complex Logarithmic Mapping
* Multiple-Scale Segmentation and Representation of Solid Plane Shapes
* Note on Two-Dimensional Landmark-Based Object Recognition, A
* Object Recognition Using the Connection Machine's Router
* On Optimum Edge Recognition Using Matched Filters
* Optimal Cell Size for Efficient Retrieval of Sparse Data by Approximate 2D Position Using a Coarse Spatial Array
* Optimal Likelihood Generators for Edge Detection under Gaussian Additive Noise
* Optimal Linear Operator for Edge Detection, An
* Parallel Algorithms for Low level Vision on the Homogeneous Multiprocessor
* Perception of Structure from Motion: I: Optic Flow vs. Discrete Displacements, II: Lower Bound Results
* Perspective Angle Transform and Its Application to 3-D Configuration Recovery
* Pyramid Machine Simulator for the Symbolics 3600, A
* Range and Shape Measurement Using Three-View Stereo Analysis
* Recognizing Shapes via Random Chord Samplings
* Representations Based on Zero-Crossings in Scale-Space
* Road Following Using Vanishing Points
* Rule Based System for Pattern Recognition that Exploits Topological Constraints, A
* Segmentation and Classification of Range Images
* Segmentation of Range Images into Planar Surfaces by Split and Merge
* Segmentation Through Symbolic Surface Descriptions
* Selection and Use of Image Features for Segmentation of Boundary Images
* Sensing 3-D Surface Patches Using a Projected Grid
* Shape and 3-D Motion from Contour without Point to Point Correspondences: General Principles
* Shape from Light Stripe Texture
* Smoothing the Optic Flow Field Under Perspective Projection
* Spherical Dual Images: A 3D Representation Method for Solid Objects that Combines Dual Space and Gaussian Spheres
* Split-Merge-merge: An Enhanced Segmentation Capability
* Structure from Motion, Acceleration and Taylor Series
* Syntactic Omni-Font Character Recognition System, A
* Three-Dimensional Vision System for Bin-Picking, A
* Tracing Neurons Through Serial Sections: Using Knowledge of Shape to Improve Performance
* Very Fast Convolution with Laplacian-of-Gaussian Masks
* Visual Surface Reconstruction Using Sparse Depth Data
73 for CVPR86

CVPR88 * *CVPR
* 3-D Motion Estimation Using a Sequence of Noisy Stereo Images: Models, Estimation, and Uniqueness Results
* Acceleration-Based Structure-from-Motion
* Accessibility: A New Approach to Path Planning Among Moving Obstacles
* Acquiring Simple Patterns for Surface Inspection
* Algorithms for Shape from Shading and Occluding Boundaries
* Analysis of Feature Detectability from Curvature Estimation, An
* Analysis of Two New Stereo Algorithms
* Automated Fast Recognition and Location of Arbitrarily Shaped Objects by Image Morphology
* Automatic Generation of Simple Morphological Algorithms
* binary consistency checking scheme and its applications to seismic Horizon Detection, The
* Binary Morphology: Working in the Sampled Domain
* Bit Level Concurrency in Real-Time Geometric Feature Extractions
* Building an Accurate Range Finder with off the Shelf Components
* CAD Based Planning and Execution of Inspection
* Calculating Geometric Properties of Objects Represented by Fourier Coefficients
* Calculation of Surface Position and Orientation Using the Photometric Stereo Method
* Calibrating a Cartesian Robot with Eye-Hand Configuration Independent of Eye-to-Hand Relationship
* Calibration of a Stereo System with Small Relative Angles
* Character Recognition Using Attributed Grammar
* Closed-Form Representation of Convolution, Dilation, and Erosion in the Context of Image Algebra
* Closed-Form Solution + Maximum Likelihood: A Robust Approach to Motion and Structure Estimation
* Collision-Free Navigation Scheme in the Presence of Moving Obstacles, A
* Color Constancy Computation in Near-Mondrian Scenes Using a Finite Dimensional Linear Model
* Color Metric for Computer Vision, A
* Comparative Study of Two Useful Discrete-Valued Random Fields for the Statistical Modeling of Images, A
* Complex Shadow-Boundary Segmentation Using the Entry-Exit Method
* Computationally Efficient Algorithm for Shape Decomposition, A
* Computing Quadtree Medial Axis Transform by a Multi-Layered Pyramid of Lisp-Processor Arrays
* Computing the Aspect Graph for Line Drawings of Polyhedral Objects
* Connected Component Labeling on Polymorphic Torus Architecture
* Context Dependent Edge Detection
* Convected Activation Profiles and the Measurement of Visual Motion
* Convexity Algorithms for Parallel Machines
* Cooperative Methods for Road Tracking in Aerial Imagery
* Curvature-Based Approach to Terrain Recognition, A
* Decision Theoretic Approach for 3-D Vision, A
* Depth Recovery from Blurred Edges
* Deriving Coarse 3-D Models of Objects
* Descriptive Pattern Recognition System Applied to Pictorial Patterns Where the Discriminating Information Is Carried in the Object Shape, A
* Design of Fast Connected Components Hardware
* Detecting Motion in Out-of-Register Pictures
* Determination of Camera Location from 2-D to 3-D Line and Point Correspondences
* Dynamic Scene Understanding for Autonomous Mobile Robotics
* Edge Detection Through Residual Analysis
* Effective Method for Determining the Robot Position, An
* Estimating Motion/Structure from Line Correspondences: A Robust Linear Algorithm and Uniqueness Theorems
* Evaluation of Quantization Error in Computer Vision
* Experiments and Evaluations of Rule Based Methods in Image Analysis
* Expert Vision System for Autonomous Land Vehicle Road Following, An
* Finding Objects in Aerial Photographs: A Rule-Based Low Level System
* Finding Point Correspondences and Determining Motion of a Rigid Object from Two Weak Perspective Views
* From Depth and Optical Flow to Rigid Body Motion
* Generalizing Epipolar-Plane Image Analysis on the Spatiotemporal Surface
* Generation of Volume/Surface Octree from Range Data
* Graph-Based Vectorization Method for Line Patterns
* Gray-Levels Can Improve the Performance of Binary Image Digitizers
* How To Tell Right From Left
* Image Computations on Reconfigurable VLSI Arrays
* Image Interpretation by Distributed Cooperative Processes
* Image Representation Using Voronoi Tessellation: Adaptive and Secure
* Image Sequence Enhancement Using Sub-Pixel Displacements
* Improving Visible-Surface Reconstruction
* Incremental Estimation of Dense Depth Maps from Image Sequences
* Integrated Approach to Feature Based Dynamic Vision, An
* Integrating Moving Edge Information along a 2-D Trajectory in Densely Sampled Imagery
* Integrating Region Growing and Edge Detection
* Integration of Information for Stereo and Multiple Shape-from-Texture Cues, The
* Interactive Complexity Control and High-Speed Stereo Matching
* Invariants of Three-Dimensional Contours
* Inverse Perspective Problem from a Single View for Polyhedra Location, The
* IU Parallel Processing Benchmark
* Karhunen-Loeve Analysis of Dynamic Sequences of Thermographic Images for Early Breast Cancer Detection
* Knowledge-Based System for Recognizing Man-Made Objects in Aerial Images, A
* Large Hierarchical Object Recognition Using Libraries of Parameterized Model Sub-Parts
* Learning Rules for 3-D Object Recognition
* Line-Drawing Interpretation: A Mathematical Framework
* Local Constraint Integration in a Connectionist Model of Stereo Vision
* Localization of Objects from Range Data
* Moment Images, Polynomial Fit Filters, and the Problem of Surface Interpolation
* Morphology-Based Symbolic Image Modeling, Multi-Scale Nonlinear Smoothing, and Pattern Spectrum
* Motion Analysis of Nonrigid Surfaces
* Motion-vision architectures
* Multi-Resolution Feature Reduction Technique for Image Segmentation with Multiple Components, A
* Multi-Scale Description of Space Curves and Three-Dimensional Objects
* New Algorithms for Reconstruction of a 3-D Depth Map from One or More Images
* new approach to machine-based perception of monocular images, A
* New Approach to Robot Orientation by Orthogonal Lines, A
* new representation for a line, A
* New Strategy for Octree Representation of Three-Dimensional Objects, A
* Novel Approach to Real-Time Motion Detection, A
* Object Recognition by Affine Invariant Matching
* Object Recognition Using the Connection Machine
* On Image Analysis by the Methods of Moments
* On Improving the Accuracy of The Hough Transform: Theory, Simulations, and Experiments
* On the Number of Digital Straight Lines on an NxN Grid
* On use of predictive probabilistic estimates for selecting best decision rules in the course of a search
* Operational Perception System for Cross-Country Navigation, An
* Optimal Algorithm for the Derivation of Shape from Shadows, An
* Optimal Geometric Algorithms on Fixed-Size Linear Arrays and Scan Line Arrays
* Parallel bit-level pipelined VLSI processing unit for the histogramming operation
* Pattern Classification Using Relative Constraints
* Pattern Recognition Algorithm Based on the Rapid Transform, A
* Point Symmetry of Convex Digital Regions
* Polynomial Shift-Invariant Operators for Texture Segmentation
* Projective Invariants of Shape
* Pyramid Based Depth from Focus
* Qualitative Depth from Vertical and Horizontal Binocular Disparities, in Agreement with Psychophysical Evidence
* Quantization Error in Stereo Imaging
* Range from translational motion blurring
* Rapid Search for Spherical Objects in Aerial Photographs
* Recognition of Handwritten Words: First and Second Order Hidden Markov Model Based Approach
* Recursive Clustering Technique for Color Picture Segmentation, A
* Refining Edges Detected by a LoG Operator
* Region Grouping from a Range Image
* Relaxational Extracting Method for Character Recognition in Scene Images, A
* Renormalized Curvature Scale Space and the Evolution Properties of Planar Curves, The
* Road Finding for Road-Network Extraction
* Robust Image Processing Language in the Context of Image Algebra, A
* Robust Parallel Implementation of 2D Model-Based Recognition, A
* Robust Recovery of Motion: Effects of Surface Orientation and Field of View
* Rule-Based Inspection of Leadframes
* Scale-Independent Dominant Point Detection Algorithm, A
* Searching parameter spaces with noisy linear constraints
* Shape from Texture Using the Wigner Distribution
* Simulating Essential Pyramids
* Smooth Interpolation of Rotational Motions
* Soft-Linked Quadtree: A Cascaded Ring Structure Using Flexible Linkage Concept
* Solving the Depth Interpolation Problem on a Parallel Architecture with a Multigrid Approach
* Spline-Based Surface Fitting on Range Images for CAD Applications
* Straight Homogeneous Generalized Cylinders: Differential Geometry and Uniqueness Results
* Straight Line Fitting in a Noisy Image
* Structural Pyramids for Representing and Locating Moving Obstacles in Visual Guidance of Navigation
* Surface Classification: Hypothesis Testing and Parameter Estimation
* Surface Orientation from Projective Foreshortening of Isotropic Texture Autocorrelation
* Texture Segmentation Using Voronoi Polygons
* Tree Search Algorithm for Target Detection in Image Sequences, A
* Useful Geometric Properties of the Generalized Cone
* Using Perceptual Grouping to Recognize and Locate Partially Occluded Objects
* Vertex Space Analysis and Its Application to Model Based Object Recognition
* Weak Structural Texture Analysis Technique to Wave Heights for Ocean Waves Image, A
* Wide Base-Line Dynamic Stereo: Approximation and Refinement
142 for CVPR88

CVPR89 * *CVPR
* 1-Subcycle Parallel Thinning Algorithm for Producing Perfect 8-Curves and Obtaining Isotropic Skeleton of an L-Shape Pattern, A
* 3D Edge Detection Using Recursive Filtering: Application to Scanner Images
* 3D Object Recognition Via Simulated Particles Diffusion
* Accurate Measurement of Orientation from Stereo Using Line Correspondence
* Adaptive Smoothing: A General Tool for Early Vision
* Adding Scale to the Primal Sketch
* Analytic Solution for the Perspective 4-Point Problem, An
* Application of robust sequential edge detection and linking to boundaries of low contrast lesions in medical images
* Applying Uncertainty Reasoning to Model Based Object Recognition
* Automated Inspection of Solder Bumps Using Visual Signatures of Specular Image-Highlights
* Automatic Thresholding Algorithm Based on an Illumination-Independent Contrast Measure, An
* Camera Calibration Using Geometric Constraints
* Color Image Segmentation Using Markov Random Fields
* Computation of Normal Velocity from Local Phase Information
* Computational Model of Texture Segmentation, A
* Computing Oriented Texture Fields
* Contour-based Recovery of Image Flow: Iterative Method, A
* Cost Minimization Approach to Edge Detection Using Simulated Annealing, A
* Data Set for Quantitative Motion Analysis, A
* Depth from Dynamic Stereo Images
* Depth Recovery Algorithm Using Defocus Information, A
* Determining Linear Shape Change: Towards Automatic Generation of Object Recognition Programs
* Development of hand-eye system with 3-D vision and microgripper and its application to assembling flexible wires
* Discontinuity Preserving Surface Reconstruction
* Dual Representation of Gray-Scale Morphological Filters, The
* Edge Detection, Classification, and Measurement
* Edge Recognition in Dynamic Vision
* Edge Reinforcement Using Parameterized Relaxation Labeling
* Estimation of Motion Parameters for a Deformable Object from Range Data
* Experiments in Fitting Discrete Markov Random Fields to Textures
* Fast Computation of Disparity from Phase Differences, The
* Fast Surface Interpolation Using Hierarchical Basis Functions
* Feature Extraction from Faces Using Deformable Templates
* Feature selection with stochastic complexity
* Fine Grain Image Computations on Electro-Optical Arrays
* Fingerprint Theorems for Curvature and Torsion Zero-Crossings
* From 3D Line Segments to Objects and Spaces
* Generalized Neighborhoods: A New Approach to Complex Parameter Feature Extraction
* Generalized smoothing networks in early vision
* Hierarchical Approach to Line Extraction, A
* Hierarchical clustering on SIMD machines with alignment network
* Hierarchical Region Based Stereo Matching
* Image Coding via Morphological Transformations: A General Theory
* Integrated Modelling of Thermal and Visual Image Generation
* Inverse Perspective Transform from Zero-Curvature Curve Points: Application to the Localization of Some Generalized Cylinders
* Knowledge-Based Approach to the Detection, Tracking and Classification of Target Formations in Infrared Image Sequences, A
* Knowledge-guided left ventricular boundary detection
* Likely local shape
* Locating Human Faces in Newspaper Photographs
* Marked Grid Labeling
* Markov Random Field Model-Based Approach to Image Interpretation, A
* Measurement in Three Dimensions by Motion Stereo and Spherical Mapping
* Method for Shape-from-Shading Using Multiple Images Acquired under Different Viewing and Lighting Conditions, A
* Methodology for Experimental Computer Vision
* Model Learning and Recognition of Nonrigid Objects
* Monocular Vision Using Inverse Perspective Projection Geometry: Analytic Relations
* Multiresolution Shape from Shading
* Mutual Illumination
* Need for Accuracy Verification of Machine Vision Algorithms and Systems, The
* New Method for Computing Intrinsic Surface Properties, A
* Non-Linear Optimization Algorithm for the Estimation of Structure and Motion Parameters, A
* Object Recognition Using a Neural Network and Invariant Zernike Features
* Object Wings: 2-1/2-D Primitives for 3-D Recognition
* On Computing a Fuzzy Focus of Expansion for Autonomous Navigation
* On Reliable Curvature Estimation
* Optimal Implementation of Morphological Operations on Neighborhood Connected Array Processors, The
* Optimal Motion and Structure Estimation
* Outdoor Vehicle Navigation Using Passive 3D Vision
* Parallel Memory Systems for Image Processing
* Parametrically Deformable Contour Models
* Path Planning Using a Potential Field Representation
* Polarization/Radiometric Based Material Classification
* Processing of Line Drawings in a Hierarchical Environment
* Quantization Error in Spatial Sampling: Comparison between Square and Hexagonal Pixels
* Real-Time Vergence Control for Binocular Robots
* Recognition of hand-lettered characters in the GTX 5000 drawing processor
* Recognition of the coronary blood vessels on angiograms using hierarchical model-based iconic search
* Region-Based Optical Flow Estimation
* Registration of Multiple Overlapping Range Images: Scenes without Distinctive Features
* Representation and Recognition of 3-D Curves
* Representation and Segmentation of a Cluttered Scene Using Fused Edge and Surface Data
* Representation and transformation of uncertainty in an evidence theory framework
* Representation Space: An Approach to the Integration of Visual Information
* Representing and Comparing Shapes Using Shape Polynomials
* Robust Edge Detection
* Scanning Electron Microscope-Based Stereo Analysis
* Segmentation and Description Based on Perceptual Organization
* Segmentation of Textured Images
* Shape Understanding from Lambertian Photometric Flow Fields
* Simple, Real-Time Range Camera, A
* Structured Edge Map of Curved Objects in a Range Image
* Techniques for Real-Time Generation of Range Images
* Theory of Photometric Stereo for a General Class of Reflectance Maps, A
* Trajectory Primal Sketch: A Multi-Scale Scheme for Representing Motion Characteristics, The
* Using Polarization to Separate Reflection Components
* Visual Recognition Using Concurrent and Layered Parameter Networks
* Visual Terrain Matching for a Mars Rover
98 for CVPR89

CVPR91 * *CVPR
* 3D from an Image Sequence: Occlusions and Perspective
* Active Camera Controlling for Manipulation
* Adaptive Estimation of Hysteresis Thresholds
* Algorithmic Characterization of Vehicle Trajectories from Image Sequences by Motion Verbs
* Analysis and Solutions of the Three Point Perspective Pose Estimation Problem
* Analysis of the Probability of Disparity Changes in Stereo Matching and a New Algorithm Based on the Analysis, An
* Application of a Hybrid Tracking Algorithm to Motion Analysis, The
* Articulated Object Recognition, or: How to Generalize the Generalized Hough Transform
* Boundary element methods for solving Poisson equations in computer vision problems
* Camera Models Determination Using Multiple Frames
* Camera Stability Problem and Dynamic Stereo Vision, The
* Classification of Facial Features for Recognition
* Closed-Form Solutions for Physically Based Shape Modeling and Recognition
* Closed-Loop Adaptive Image Segmentation
* Computational Approach to Boundary Detection, A
* computational framework and SIMD algorithms for low-level support of intermediate level vision processing, A
* Computer vision hardware using the Radon transform
* Computing a Stable, Connected Skeleton from Discrete Data
* Computing Viewpoints that Satisfy Optical Constraints
* consensus structure inference algorithm, A
* Constraining Deformable Superquadrics and Nonrigid Motion Tracking
* Curve-Based Stereo: Figural Continuity and Curvature
* Deformable Kernels for Early Vision
* Deformable Models: Canonical Parameters for Surface Representation and Multiple View Integration
* Determining 3-D Object Pose Using the Complex Extended Gaussian Image
* Determining a Maximum Value Yield of a Log Using an Optical Log Scanner
* Direct Computation of Height from Shading, The
* Discontinuity Detection and Thresholding: A Stochastic Approach
* discontinuity detector based on the pervasive noise in surface property data, A
* Dual networks and their pattern classification properties
* Dynamic Stereo in Visual Navigation
* Dynamic System for Object Description and Correspondence, A
* Early Jump-Out Corner Detectors
* Edge Detection Using Refined Regularization
* Efficiently Using Invariant Theory for Model-Based Matching
* Establishing Motion Correspondence
* Estimation of Discontinuous Displacement Vector Fields with the Minimum Description Length Criterion
* Estimation of Illuminant Direction, Albedo, and Shape from Shading
* Estimation of Motion and Structure of Planar Surfaces from a Sequence of Monocular Images
* Exact Euclidean Distance Function by Chain Propagation
* Extracting Surfaces of Revolution by Perceptual Grouping of Ellipses
* Face Recognition Using Eigenfaces
* Fast Affine Point Matching: An Output-Sensitive Method
* Fast Segmentation of Range Images into Planar Regions
* Feature Matching in 360° Waveforms for Robot Navigation
* Finding Convex Edge Groupings in an Image
* Finding Junctions Using the Image Gradient
* Finding the aspect-ratio of an imaging system
* Fractal probability functions-an application to image analysis
* From Voxel to Curvature
* Fuzzy Algorithms to Find Linear and Planar Clusters and Their Application
* Generic Recognition Through Qualitative Reasoning about 3-D Shape and Object Function
* Gripping Information for a Robot from Silhouettes
* How Accurately Can Direct Motion Vision Determine Depth
* Human motion analysis based on a robot arm model
* Identification and 3D Description of Shallow Environmental Structure in a Sequence of Images
* Identification of Interreflection in Color Images Using a Physics-Based Reflection Model
* image pyramid with morphological operators, An
* Integration and Interpretation of Incomplete Stereo Scene Data
* Integration of Vision Modules: A Game-Theoretic Framework
* Introducing New Deformable Surfaces to Segment 3D Images
* Linear Algorithm for Computing the Phase Portraits of Oriented Textures, A
* linear generalized Hough transform and its parallel implementation, A
* Long-Range Spatiotemporal Motion Understanding Using Spatiotemporal Flow Curves
* MAP Model Matching
* MAP Representations and Coding-Based Priors for Segmentation
* Markov/Gibbs Texture Modeling: Aura Matrices and Temperature Effects
* MARVEL: A System That Recognizes World Locations with Stereo Vision
* Matrix Based Method for Determining Depth from Focus, A
* Measurement of Non-Rigid Motion Using Contour Shape Descriptors
* Model Based Recognition Using Pruned Correspondence Search
* Model-Group Indexing for Recognition
* Modeling Generic Polyhedral Objects with Constraints
* Modelling Solids of Revolution by Monocular Vision
* Motion Analysis and Epicardial Deformation Estimation from Angiography Data
* Multi-Channel Filtering Approach to Texture Segmentation, A
* Multi-Dimensional Robust Edge Detection
* Multidimensional Indexing for Recognizing Visual Shapes
* Multiframe-Based Identification of Mobile Components of a Scene with a Moving Camera
* Multiple-Baseline Stereo, A
* Multiscale Approach for Recognizing Complex Annotations in Engineering Documents, A
* Neural-Network Approach to CSG-Based 3-D Object Recognition, A
* New Shape Segmentation Approach for Active Vision Systems, A
* Object Detection Using Contrast Based Scale-Space
* Offline Tracing and Representation of Signatures
* On an Analysis of Static Occlusion in Stereo Vision
* On contour texture
* On Corner and Vertex Detection
* On the Error Analysis of Geometric Hashing
* On the representation of occluded shapes
* Optimal Contour Approximation by Deformable Piecewise Cubic Splines
* Optimal Matching of Planar Models in 3D Scenes
* P-Field: A Computational Model for Binocular Motion Processing, The
* Parallel algorithms and architectures for discrete relaxation technique
* Partial Implementation of the Fixation Method on Real Images: Direct Recovery of Motion and Shape in the General Case
* Pattern recognition with new class discovery
* Physically-based edge labeling
* Planar Shape Classification Using Hidden Markov Model
* Pose Clustering on Constraints for Object Recognition
* Pose Estimation of Jointed Structures
* Positional Estimation of a Mobile Robot Using Edge Visibility Regions
* Probability Distributions of Optical Flow
* Qualitative Detection of Motion by a Moving Observer
* Qualitative Motion Analysis Using a Spatio-Temporal Approach
* Quantitative Approach to Camera Fixation, A
* Rapid Euclidean Distance Transform Using Grayscale Morphology Decomposition
* Real-Time Generation of Environmental Map and Obstacle Avoidance Using Omnidirectional Image Sensor with Conic Mirror
* Recognition and Semi-Differential Invariants
* Recovering Shape from Contour for Constant Cross Section Generalized Cylinders
* Recovery of Nonrigid Motion and Structure
* Region Based Stereo Matching Oriented Image Processing
* Relative Positioning from Geometric Invariants
* Remote-sensing issues for intelligent underwater systems
* Robust Dynamic Motion Estimation Over Time
* Robust vectorization using graph-based thinning and reliability-based line approximation
* Sampling and Reconstruction with Adaptive Meshes
* Screw Motion Approach to Uniqueness Analysis of Head-Eye Geometry, A
* Segmentation and Grouping of Object Boundaries Using Energy Minimization
* Segmentation by Nonlinear Diffusion
* Sequences, Structure, and Active Vision
* Shape Adaptation for Modelling of 3D Objects in Natural Scenes
* Shape From Rotation
* Shape from Shading as a Partially Well-Constrained Problem
* Shape Representation and Image Segmentation Using Deformable Surfaces
* Shape Representation and Recognition from Curvature
* Shared Memory Multiprocessor Implementation and Evaluation of Hough Transform Algorithm
* Small autonomous mobile robots: sensing and action
* Sources from Shading
* Stereopsis and Image Registration from Extended Edge Features in the Absence of Camera Pose Information
* stereoscopic camera employing a single main lens, A
* Structural Hashing: Efficient Three Dimensional Object Recognition
* Surface and Motion Estimation from Sparse Range Data
* Surface Approximation Using Weighted Splines
* SYMAN: A SYMmetry ANalyzer
* Syntactic pattern classification by branch and bound search
* Teleological computer graphics modeling
* Temporal Slice Analysis of Image Sequences
* Temporal Surface Reconstruction
* Topological Segmentation of Discrete Surfaces
* Trajectories and Events
* Two Plane Camera Calibration: A Unified Model
* Uncertainty Update and Dynamic Search Window for Model-Based Object Recognition
* Unified Computational Theory for Motion Transparency and Motion Boundaries Based on Eigenenergy Analysis, A
* Use of Monocular Groupings and Occlusion Analysis in a Hierarchical Stereo System
* Using Collinear Points to Compute Egomotion and Detect Nonrigidity
* Using Stereomotion to Track Binocular Targets
147 for CVPR91

CVPR92 * *CVPR
* 3-D Landmark Recognition from Range Images
* 3-D Recognition and Shape Estimation from Image Contours Using Invariant 3-D Object Curve Models
* 3D Model Acquisition from Monocular Image Sequences
* 3D Shape and Light Source Location from Depth and Reflectance
* Absolute Orientation from Uncertain Point Data: A Unified Approach
* Accuracy assessment on camera calibration method not considering lens distortion
* Accurate calibration of CCD-cameras
* Active Object Recognition
* Active Photometric Stereo
* Adaptive Meshes and Shells: Irregular Triangulation, Discontinuities, and Hierarchical Subdivisions
* Adaptive-Size Physically-Based Models for Nonrigid Motion Analysis
* Affine Trackability Aids Obstacle Detection
* Alignment of Objects with Smooth Surfaces: Error Analysis of the Curvature Method, The
* Analysis of the Least Median of Squares Estimator for Computer Vision Applications
* Anatomy of a Color Histogram
* Autonomous Fixation
* Bayesian Treatment of the Stereo Correspondence Problem Using Half-Occluded Regions, A
* CCD camera calibration and noise estimation
* Classification trees with neural network feature extraction
* Comparing Images Using the Hausdorff Distance under Translation
* Computational ground and airborne localization over rough terrain
* Computing Curvilinear Structure by Token-Based Grouping
* Computing Occlusion-Free Viewpoints
* Computing stereo correspondences in the presence of narrow occluding objects
* Computing the View Orientations of Random Projections of Asymmetric Objects
* Constructing perceptual categories
* Contour Matching Using Local Affine Transformations
* Controlling Illumination Color to Enhance Object Discriminability
* Correcting Chromatic Aberrations Using Image Warping
* Curved Contours and Surface Reconstruction
* Deformable Models for 3D Medical Images Using Finite Elements and Balloons
* Deformable Region Model Using Stochastic Processes Applied to Echocardiographic Images, A
* Depth from Defocus and Rapid Autofocusing: A Practical Approach
* Detecting parameterized curve segments using MDL and the Hough transform
* Determination of the Apparent Boundary of an Object
* Diffuse Reflection
* Direct Method for Reconstructing Shape from Shading
* Direct Motion Stereo for Passive Navigation
* Edge detection in range images through morphological residue analysis
* Efficient Model Library Access by Projectively Invariant Indexing Functions
* Exploratory Active Vision: Theory
* Extracting the Shape and Roughness of Specular Lobe Objects Using Four Light Photometric Stereo
* Face Recognition Based on Depth and Curvature Features
* Fast Linear Shape from Shading, A
* Fast Recognition Using Adaptive Subdivisions of Transformation Space
* Feature Based Approach to Face Recognition, A
* From Accurate Range Imaging Sensor Calibration to Accurate Model-Based 3-D Object Localization
* From Partial Derivatives of 3D Density Images to Ridge Lines
* Generating Connected Skeletons for Exact and Approximate Reconstruction
* geometric approach to machine-printed character recognition, A
* Geometric image primitives by complex moments in Gabor space and the application to texture segmentation
* Geometric Primitive Extraction Using A Genetic Algorithm
* geometry of visual interception, The
* Handprinted Digit Recognition Using Spatiotemporal Connectionist Models
* Handwritten Numeral Recognition Based on Hierarchically Self-Organizing Learning Networks with Spatio-Temporal Pattern Recognition
* heterogeneous M-SIMD architecture for Kalman filter controlled processing of image sequences, An
* Hierarchical Decomposition and Axial Representation of Shape
* Hierarchical Waveform Matching: A New Feature-Based Stereo Technique
* Hybrid Weak-Perspective and Full-Perspective Matching
* Image segmentation via edge contour finding: a graph theoretic approach
* Image Sequence Enhancement Using Multiple Motions Analysis
* Image Understanding Environments Program, The
* Indexing Function-Based Categories for Generic Recognition
* information theoretic robust sequential procedure for surface model order selection in noisy range data, An
* Iterative TIN Generation from Digital Elevation Models
* Kinematic calibration of an active camera system
* Labeling of Curvilinear Structure Across Scales by Token Grouping
* Local Reproducible Smoothing without Shrinkage
* Local Shape Approximation from Shading
* Low Resolution Cues for Guiding Saccadic Eye Movements
* Matching Complex Images to Multiple 3D Objects Using View Description Networks
* measure of symmetry based on shape similarity, A
* Model Based Region Segmentation Using Cooccurrence Matrices
* Model Indexing: The Graph-Hashing Approach
* Morphological Decomposition of Restricted Domains: A Vector Space Solution
* Morphological Grayscale Reconstruction: Definition, Efficient Algorithm and Applications in Image Analysis
* Morphological structuring function decomposition
* Motion Trajectories
* MRF Approach to Optical Flow Estimation, A
* Multi-Primitive Hierarchical (MPH) Stereo System
* Multi-Resolution Surface Modeling from Multiple Range Views
* Multifractals, texture, and image analysis
* Multiple Motions from Instantaneous Frequency
* Multiresolution Shape Description by Corners
* Neural network models for illusory contour perception
* New 3-D Surface Measurement System Using a Structured Light, A
* Noise Resistant Projective and Affine Invariants
* Non-Rigid Heart Wall Motion Using MR Tagging
* Nonlinear multiscale filtering using mathematical morphology
* Object segmentation and binding within a biologically-based neural network model of depth-from-occlusion
* Object-Oriented Approach to Template Guided Inspection, An
* Off-line handwritten word recognition (HWR) using a single contextual hidden Markov model
* On Finding the Ends of SHGCs in an Edge Image
* On Poisson Solvers and Semidirect Methods for Computing Area Based Optical-Flow
* On texture in document images
* On the Derivation of Geometric Constraints in Stereo
* Optimal nonlinear pattern restoration from noisy binary figures
* Parameter estimation in MRF line process models
* Parameterizing and Fitting Bounded Algebraic Curves and Surfaces
* Perceptual Organization Using Bayesian Networks
* Performance of Optical Flow Techniques
* Point Correspondence Recovery in Non-Rigid Motion
* Predicting expected gray level statistics of opened signals
* Properties of Energy Edge Detectors
* Qualitative shape from active shading
* Random perturbation models and performance characterization in computer vision
* Range image segmentation and fitting by residual consensus
* Real-Time Smooth Pursuit Tracking for a Moving Binocular Robot
* Recognition of Motion from Temporal Texture
* Recognizing 3D Objects from 2D Images: An Error Analysis
* Recognizing Assembly Tasks using Face-Contact Relations
* Recognizing human action in time-sequential images using hidden Markov model
* Recovering LSHGCs and SHGCs from Stereo
* Recovering Shape by Purposive Viewpoint Adjustment
* Recovering the Scaling Function of a SHGC from a Single Perspective View
* Recovery of 3-D Objects with Multiple Curved Surfaces from 2-D Contours
* Recovery of Hierarchical Part Structure of 3D Shape from Range Image
* Recovery of Temporal Information from Static Images of Handwriting
* Recursive opening transform
* Refinement of Disparity Estimates Through the Fusion of Monocular Image Segmentations
* Refinement of Noisy Correspondence Using Feedback from 3-D Motion
* Right Straight Homogeneous Generalized Cylinders with Symmetric Cross-Sections: Recovery of Pose and Shape from Image Contours
* Robust Consensus Based Edge-Detection
* Robust Focus Ranging
* Robust object recognition based on implicit algebraic curves and surfaces
* Robust statistics in shape fitting
* Saliencies and Symmetries: Toward 3D Object Recognition from Large Model Databases
* Scale Space Aspect Graph, The
* Segmentation by Nonlinear Diffusion, II
* Sequential Detection Framework for Feature Tracking within Computational Constraints, A
* Shadow Identification
* Shape from Focus System
* Shape from Periodic Texture Using the Spectrogram
* Shape from Texture Using Markov Random Field Models and Stereo-Windows
* Shape Reconstruction from Photometric Stereo
* Shape-from-Texture by Wavelet-Based Measurement of Local Spectral Moments
* Simple Algorithm for Shape from Shading, A
* Simple Direct Computation of the FOE with Confidence Measures
* Single Plane Model Extensions Using Projective Transformations
* Smoothed Local Generalized Cones: An Axial Representation of 3D Shapes
* Some Invariant Linear Methods in Photogrammetry and Model-Matching
* Space Efficient 3D Model Indexing
* Spatial reasoning based on multivariate belief functions
* Spatial-Quefrency Approach to Optical Echo Analysis
* Stereo from Uncalibrated Cameras
* Surface Reconstruction Using Neural Networks
* Surface segmentation from stereo
* Task-Specific Utility in a General Bayes net Vision System
* Toward Object-Based Heuristics
* Toward Stochastic Modeling of Obstacle Detectability in Passive Stereo Range Imagery
* Towards a General Framework for Feature Extraction
* Uncertain Views
* Vector Field Analysis For Oriented Patterns
* Verifying and Combining Different Visual Cues into a 3D Model
* Vision-Based Range Estimation Using Helicopter Flight Data
* Visual Inspection of Machined Parts
* Visual Motion Analysis under Interceptive Behavior
* Voronoi Skeletons: Theory and Applications
* Weak Lambertian Assumption for Determining Cylindrical Shape and Pose from Shading and Contour
160 for CVPR92

CVPR93 * *CVPR
* 2-D Images of 3-D Oriented Points
* 3-D Model-Data Correspondence and Nonrigid Deformation
* 3-D Motion Estimation and Object Tracking Using B-Spline Curve Modeling
* 3D Parts Decomposition from Sparse Range Data Using Information Criterion
* Accurate Stereo Correspondence Method for Textured Scenes Using Improved Power Cepstrum Techniques, An
* Active Binocular Stereo
* Active Calibration: Alternative Strategy and Analysis
* active-vision system for recognition of pre-marked objects in robotic assembly workcells, An
* Active/Dynamic Stereo: A General Framework
* Adaptive Active Contour Algorithms for Extracting and Mapping Thick Curves
* Adaptive Mesh Generation for Surface Reconstruction: Parallel Hierarchical Triangulation without Discontinuities
* Adaptive segmentation of images of objects with smooth surfaces
* Agglomerative Clustering on Range Data with a Unified Probabilistic Merging Function and Termination Criterion
* Automatic Finding of Main Roads in Aerial Images by Using Geometric-Stochastic Models and Estimation
* Automatic Tube Inspection System That Finds Cylinders in Range Data, An
* Axiomatization of shape analysis and application to texture hyperdiscrimination
* Bayesian Region Merging Probability for Parametric Image Models
* Bayesian View Class Determination
* Binocular Motion Stereo Using MAP Estimation
* Centering Behavior Using Peripheral Vision
* clustering filter for scale-space filtering and image restoration, A
* Color Recovery from Biased Illumination: Color Constancy
* Comparative Study of Stereo, Vergence, and Focus as Depth Cues for Active Vision, A
* Comparison between Asymptotic Bayesian Approach and Kalman Filter-Based Technique for 3D Reconstruction Using an Image Sequence
* Comparison of Weighted LS Methods with LS Methods in 3-D Motion Estimation from Stereo Image Sequences, A
* Completion of Occluded Shapes Using Symmetry
* Computing Correspondence Based on Regions and Invariants without Feature Extraction and Segmentation
* Computing Matched-Epipolar Projections
* Conformal Transplantation of Lightness to Varying Resolution Sensors
* Constrained contouring in polar coordinates
* Coupling of Rotation and Translation in Motion Estimation of Planar Surfaces, The
* Depth from Defocus by Changing Camera Aperture: A Spatial Domain Approach
* Depth from Focusing and Defocusing
* Detecting Activities
* Determining the Fundamental Matrix with Planes: Instability and New Algorithms
* Determining the Shape of Multi-Colored Dichromatic Surfaces Using Color Photometric Stereo
* Differential Method for Computing Local Shape-from-Texture for Planar and Curved Surfaces, A
* Diffuse Reflectance from Rough Surfaces
* Direct Representation and Detection of Multi-Scale, Multi-Orientation Fields Using Local Differentiation Filters
* Direction of Heading from Image Deformations
* Distance Metric between 3D Models and 2D Images for Recognition and Classification
* Distributed Bayesian Object Recognition
* Divergent Stereo for Robot Navigation: Learning from Bees
* Dynamic Camera Self-Calibration from Controlled Motion Sequences
* Dynamic Retina: Contrast and Motion Detection for Active Vision, The
* Dynamical encoding of cursive handwriting
* Early Vision Processing Using a Multi-Stage Diffusion Process
* Edge detection and feature extraction by non-orthogonal image expansion for optimal discriminative SNR
* Efficient and Robust Methods of Accurate Camera Calibration
* Efficient Edge Detection Using Two Scales
* Efficient Recognition of Rotationally Symmetric Surface and Straight Homogeneous Generalized Cylinders
* Efficient Serial Associative Memory
* Estimation of curvature from sampled noisy data
* Estimation of the Position and Orientation of a Planar Surface Using Multiple Beams
* Exploiting the Temporal Coherence of Motion for Linking Partial Spatiotemporal Trajectories
* Extracting Affine Deformations from Image Patches - I: Finding Scale and Rotation
* Fast Alignment Using Probabilistic Indexing
* Feature-Based Monocular Motion Analysis System Guided by Feedback Information, A
* Finding Waldo, or Focus of Attention Using Local Color Information
* FLASH: a fast look-up algorithm for string homology
* Focal length and compression of space
* Fractal surface reconstruction for modeling natural terrain
* From Global to Local, a Continuum of Shape Models with Fractal Priors
* Gaussian Error Models for Object Recognition
* Generalized Tube Model: Recognizing 3D Elongated Objects from 2D Intensity Images
* Global 3-D Motion Estimation
* Ground state texture patterns for the second-order Ising model
* Higher-Order Statistics in Object Recognition
* Human face detection in a scene
* Image enhancement using non-linear diffusion
* Incremental Recognition of Pedestrians from Image Sequences
* Inferring Global Perceptual Contours from Local Features
* Inferring the Shape of the Real Object from the Object Reconstructed by Volume Intersection
* Information Theoretic Clustering of Large Structural Modelbases
* Integration of Shape from X Modules: Combining Stereo and Shading
* Interpreting the Views in Engineering Drawings
* Iterative Pose Estimation Using Coplanar Points
* Knowledge-Based Image Understanding Using Incomplete and Generic Models
* Labeling of Human Face Components from Range Data
* Layered Representation for Motion Analysis
* Local, Global, and Multilevel Stereo Matching
* Maintaining Stereo Calibration by Tracking Image Points
* Mapping a lifelike 2.5D human face via an automatic approach
* Matching Elastic Contours
* Matching Perspective Views of Coplanar Structures Using Projective Unwarping and Similarity Matching
* Mixture Models for Optical Flow Computation
* Model Based Corner Detection
* Model-Based Invariants for 3-D Vision
* Modeling and Identifying 3-D Color Textures
* Modeling Surfaces of Arbitrary Topology with Dynamic Particles
* Monocular Pose Determination from Lines: Critical Sets and Maximum Number of Solutions
* Motion of a Stereo Rig: Strong, Weak, and Self-Calibration
* Multi-Layer Surface Segmentation Using Energy Minimization
* Multi-Resolution Technique for Comparing Images Using the Hausdorff Distance
* Multi-Scale Structure from Multi-Views by ∇²G Filtered 3D Voting
* Multi-Scale, Torsion-Based Shape Representations for Space Curves
* Multiscale relaxation labeling of fractal images
* New Developments on Geometric Hashing for Curve Matching
* New Method for Acquiring Time-Sequential Range Images by Integrating Stereo Pairs of Thermal and Intensity Images, A
* Nonlinear Diffusion Model for Discontinuous Disparity and Half-Occlusions in Stereo, A
* Nonlinear Phase Portrait Models for Oriented Textures
* Nonparametric Algorithm for Edge Localization, A
* Normalized and Differential Convolution
* Nulling Filters and the Separation of Transparent Motions
* Object Contour Extraction Using Color and Motion
* Obtaining Optical Flow with Multi-Orientation Filters
* On Hierarchical Color Segmentation and Applications
* On solving exact Euclidean distance transformation with invariance to object size
* On the recognition of occluded shapes and generic faces using multiple-template expansion matching
* On the use of snakes for 3-D robotic visual tracking
* On Using Geometric Distance Fits to Estimate 3D Object Shape, Pose, and Deformation from Range, CT, and Video Images
* Optimal Partition of Moving Edge Segments, The
* Optimisation Approach to Improving the Accuracy of the Hough Transform: Plane Orientations from Skew Symmetry, An
* Orientation Normalized Vector Quantizer for Flow-Like Image Coding, An
* Parallel dense depth-from-motion on the image understanding architecture
* Parallel Implementation of a Multisensor Feature-Based Range-Estimation Method, A
* Parallel Line Grouping in Irregular Curve Pyramids
* Partitioning Range Images Using Curvature and Scale
* Parts of Visual Form: Computational Aspects
* Perception Framework for Inspection and Reverse Engineering, A
* Performing Segmentation of Ultrasound Images Using Temporal Information
* Pinta: A System for Visualizing the Anatomical Structures of the Brain from MR Imaging
* Pose Determination and Recognition of 3D Polyhedral Objects from a Single Perspective View
* Practical Stereo Vision System, A
* Quadratic Filter and Feature Detection
* Qualitative Approach to Quantitative Recovery of SHGC's Shape and Pose from Shading and Contour, A
* Qualitative visual navigation using weighted correlation
* Quasi-Invariant Properties and 3-D Shape Recovery of Non-Straight, Non-Constant Generalized Cylinders
* Rapid recognition of freeform objects in noisy range images using tripod operators
* real-time edge linker, A
* Real-Time Motion Stereo
* Real-Time Recognition with the Entire Brodatz Texture Database
* Recognition by Prototypes
* Reconstruction of HOT Curves from Image Sequences
* Recovering 3D Shape and Motion from Image Streams Using Non-Linear Least Squares
* Recovering and Characterizing Image Features Using an Efficient Model Based Approach
* Recovering and Tracking Pose of Curved 3D Objects from 2D Images
* Recovering Planar Surfaces by Stereovision Based on Projective Geometry
* Recovery of Non-Rigid Motion from Stereo Images, The
* Recursive Estimation of Structure and Motion Using Relative Orientation Constraints
* Recursive Motion and Structure Estimation with Complete Error Characterization
* Relative 3D Reconstruction Using Multiple Uncalibrated Images
* Removal of Specularities Using Color and Polarization
* Resolving Edge-Line Ambiguities Using Probabilistic Relaxation
* Robust Affine Invariant Matching with Application to Line Features
* Robust and efficient detection of convex groups
* Robust Shape Recovery from Occluding Contours Using a Linear Smoother
* Robustness of Model-Based Recognition in Cluttered Images
* Role of Vision in Two-Arms Manipulation, The
* Roughness and Shape of Specular Lobe Surfaces Using Photometric Sampling Method
* Scalable geometric hashing on MasPar machines
* Scaling Images and Image Features via the Renormalization Group
* Scene Reconstruction and Description: Geometric Primitive Extraction from Multiple View Scattered Data
* Self-Calibration of the Intrinsic Parameters of Cameras for Active Vision Systems
* Semi-Local Invariants
* Shape from Perspective Trihedral Angle Constraint
* Shape from Photomotion
* Shape-Based Tracking of Naturally Occurring Annuli in Image Sequences
* short note on local region growing by pseudophysical simulation, A
* Simulated Tearing: An Algorithm for Discontinuity-Preserving Visual Surface Reconstruction
* Space-Time Gestures
* Spatiotemporal Representation of Dynamic Objects
* Stereopsis for Verging Systems
* Stochastic Performance Modeling and Evaluation of Obstacle Detectability with Imaging Range Sensors
* Support Function, Curvature Functions and 3-D Attitude Determination, The
* Surface Matching Algorithm for Two Perspective Views, A
* System for Real-Time Fire Detection, A
* Systematic Design of Indexing Strategies for Object Recognition
* Systolic Architecture for Labeling the Connected Components in Multi-Valued Images in Real Time, A
* Temporal-Color Space Analysis of Reflection
* Texture Classification Using Noncausal Hidden Markov-Models
* Texture Discrimination Using Wavelets
* Three-Dimensional Shape Reconstruction by Active Rangefinder
* Threshold decomposition of soft morphological filters
* Toward Global Surface Reconstruction by Purposive Viewpoint Adjustment
* Toward Template-Based Tolerancing from a Bayesian Viewpoint
* Transform for Detection of Multiscale Image Structure, A
* Unsupervised Segmentation of Textured Color Images Using Markov Random Field Models
* Using Differential Geometry in R4 to Extract Typical Features in 3D Images
* Using Isolated Landmarks and Trajectories in Robot Navigation
* Using Stability of Interpretation as Verification for Low Level Processing: An Example from Egomotion and Optic Flow
* Using Topological Information of Images to Improve Stereo Matching
* Variable Duration Hidden Markov Model and Morphological Segmentation for Handwritten Word Recognition
* Visual keyword recognition using hidden Markov models
* Wavelet Transform Embeddings in Mesh Architectures
* What is the Center of the Image?
187 for CVPR93

CVPR94 * *CVPR
* 2D Matching of 3D Moving Objects in Color Outdoor Scenes
* 3D Geometry from Planar Parallax
* Accurate Structure and Motion Computation in the Presence of Range Image Distortions Due to Sequential Acquisition
* Accurate Vergence Control in Complex Scenes
* Active Part-Decomposition, Shape and Motion Estimation of Articulated Objects: A Physics-based Approach
* Active stereo vision and cyclotorsion
* Adaptive Polynomial Modelling of the Reflectance Map for Shape Estimation from Stereo and Shading
* Adaptive-complexity registration of images
* Affine Invariant Detection of Periodic Motion
* Affine-Invariant B-Spline Weighted Moments for Object Recognition from Image Curves
* Age Classification from Facial Images
* Analysis of Shape from Shading Techniques
* Analytical Studies of Low-Level Motion Estimators in Space-Time Images Using a Unified Filter Concept
* Analyzing and Recognizing Walking Figures in XYT
* Automated Design of Bayesian Perceptual Inference Networks
* Automated Discovery of Detectors and Iteration-Performing Calculations to Recognize Patterns in Protein Sequences Using Genetic Programming
* Automatic Registration Method for Frameless Stereotaxy, Image Guided Surgery and Enhanced Reality Visualization, An
* Automatic Selection of Tuning Parameters for Feature Extraction Sequences
* Automating the Hunt for Volcanoes on Venus
* Autonomous Exploration: Driven by Uncertainty
* Benchmarking Page Segmentation Algorithms
* Blended Deformable Models
* Capacity of Color Histogram Indexing, The
* Closed-Form Attitude Determination under Spectrally Varying Illumination
* Complementary Data Fusion for Limited-Angle Tomography
* Computing Differential Properties of 3D Shapes from Stereoscopic Images without 3D Models
* Computing Spatio-Temporal Representations of Human Faces
* Constraint-Fusion for Localization and Interpretation of Constrained Objects
* Controlled Active Exploration of Uncalibrated Environments
* Data and Model-Driven Selection Using Closely-Spaced Parallel-Line Groups
* Deformable Boundary Finding Influenced by Region Homogeneity
* Deformable Contours: Modeling and Extraction
* Deformable Models With Parameter Functions: Application to Heart-Wall Modeling
* Dense, Time-varying Range Data Acquisition from Stereo Pairs of Thermal and Intensity Images
* Depth from Focus with One Image
* Detecting Multiple Image Motions by Exploiting Temporal Coherence of Apparent Motion over a Long Image Sequence
* Detection of Buildings Using Perceptual Groupings and Shadows
* Determination of Motion Parameters and Estimation of Point Correspondences in Small Nonrigid Deformations
* Development of the Image Understanding Environment, The
* Document Image Analysis: Geometric and Logical Layout
* Efficient Indexing Techniques for Model Based Sensing
* Efficient Parallel Multigrid Relaxation Algorithms for Markov Random Field-Based Low-Level Vision Applications
* Emerging hypothesis verification using function-based geometric models and active vision strategies
* Error Propagation in Full 3-D and 2-D Object Recognition
* Euclidean Reconstruction from Uncalibrated Views
* Executing Reactive Behavior for Autonomous Navigation
* Extraction of 3D Shape from Optic Flow: A Geometric Approach
* Extraction of the Zero-Crossings of the Curvature Derivative in Volumetric 3D Medical Images: A Multi-Scale Approach
* Extremal Points: Definition and Application to 3D Image Registration
* Extruded Generalized Cylinder: A Deformable Model for Object Recovery, The
* Face Recognition Under Varying Pose
* Fast Algorithm for MDL-Based Multi-Band Image Segmentation, A
* Feature Matching for Building Extraction from Multiple Views
* Focused Image Recovery from Two Defocused Images Recorded with Different Camera Settings
* Framework for Recovering Affine Transforms Using Points, Lines or Image Brightnesses, A
* Framework for Segmentation Using Physical Models of Image Formation, A
* Gaussian Normalization of Morphological Size Distributions for Increasing Sensitivity to Texture Variations and its Application to Pavement Distress Classification
* Geometric Heat-Equation and Nonlinear Diffusion of Shapes and Images
* Global Surface Reconstruction by Purposive Control of Observer Motion
* Good Features to Track
* Height Recovery From Intensity Gradient
* Hexagonal Wavelet Representations for Recognizing Complex Annotations
* Hierarchical Gabor Filters for Object Detection in Infrared Images
* Hierarchical Spline-Based Image Registration
* Hierarchical Statistical Framework for the Segmentation of Deformable Objects in Image Sequences, A
* HOT Curves for Modeling and Recognition of Smooth Curved 3D Objects
* Illumination Planning for Object Recognition in Structured Environments
* Independent Motion Segmentation and Collision Prediction for Road Vehicles
* Information Extraction from Telephone Company Drawings
* Initializing Snakes
* Integration of Bottom-Up and Top-Down Cues for Visual Attention Using Non-Linear Relaxation
* Integration of Transitory Image Sequences
* Invited Talk: Pattern Recognition: Present and Future
* Learning Indexing Functions for 3-D Model-Based Object Recognition
* Localized Radon Transform-Based Detection of Linear Features in Noisy Images
* low-dimensional representation of human faces for arbitrary lighting conditions, A
* Markov Random Field Model for Object Matching under Relational Constraints, A
* Maximum Likelihood N-Camera Stereo Algorithm, A
* MDL-Based Spatiotemporal Segmentation from Motion in a Long Image Sequence
* Medium Level Scene Representation Using VLSI Smart Hexagonal Sensor with multi-resolution Edge Extraction Capability and Scale Space Integration Co-Processor
* Merging Range Images of Arbitrarily Shaped Objects
* Model-based Next View Planning By Using Rules: Automatic Feature Prediction and Detection
* Modelled Object Pose Estimation and Tracking by a Multi-Camera System
* Motion and Structure from One Dimensional Optical Flow
* Motion Estimation and Vector Splines
* New Closed-Form Solution for Absolute Orientation, A
* New Method to Calculate Looming for Autonomous Obstacle Avoidance, A
* New Robust Operator for Computer Vision: Application to Range Data, A
* New Robust Operator for Computer Vision: Theoretical Analysis, A
* Nonlinear Diffusion of Scalar Images using Well-Posed Differential Operators with Applications in Medical Imaging
* O(N) Iterative Solution to the Poisson Equation in Low-Level Vision Problems, An
* Object Recognition Using Multilayer Hopfield Neural-Network
* Object Representation for Object Recognition
* Obstacle Detection Analysis
* Occluding Contour Detection Using Affine Invariants and Purposive Viewpoint Control
* On Integration of Vision Modules
* On the Relative Brightness of Specular and Diffuse Reflection
* On-Line Cursive Word Recognition System, An
* Optimal Estimation of 3D Structures Using Visual Servoing
* Orientation-Based Representations of 3-D Shape
* Outlier Process: Unifying Line Processes and Robust Statistics, The
* Overcomplete Steerable Pyramid Filters and Rotation Invariance
* Panel on Computer Vision: Past, Present and Future
* Perceptual Completion of Occluded Surfaces
* Pictures and Trails: a New Framework for the Computation of Shape and Motion from Perspective Image Sequences
* Practical Edge Finding with a Robust Estimator
* Practical Pattern Recognition System for Translation, Scale and Rotation Invariance, A
* Principal Component Analysis with Missing Data and Its Application to Object Modeling
* Probe Based Recognition of Targets in Infrared Images
* Projective and Object Space Geometry for Monocular Building Extraction
* Projective Reconstruction from Line Correspondences
* Qualitative Image Analysis of Group Behaviour
* Qualitative Obstacle Detection
* Qualitative Tracking of 3-D Objects using Active Contour Networks
* Quantitative Performance Evaluation of Thinning Algorithms Under Noisy Conditions
* Reactive View Planning for Quantification of Local Geometry
* Real-Time Feature Tracking and Projective Invariance as a Basis for Hand-Eye Coordination
* Realistic Range Rendering
* Recognition by Functional Parts
* Recognition by Using an Active/Space-Variant Sensor
* Recognizing Object Function Through Reasoning About 3-D Shape and Dynamic Physical Properties
* Recognizing Off-Line Cursive Handwriting
* Reconstruction of High Resolution 3D Visual Information Using Sub-pixel Camera Displacements
* Recovering Parametric Geons from Multiview Range Data
* Recovering the Shape of Polyhedra Using Line-Drawing Analysis and Complex Reflectance Models
* Recovery of Ego-Motion Using Image Stabilization
* Registration of Multiple Range Views for Automatic 3-D Model Building
* Registration without Correspondences
* Relative Affine Structure: Theory and Application to 3D Reconstruction from Perspective Views
* Representation and Computation of the Spatial Environment for Indoor Navigation
* Resolvability Ellipsoid for Visual Servoing, The
* Rigid, Affine and Locally Affine Registration of Free-Form Surfaces
* Robot Pose Estimation in Unknown Environments by Matching 2D Range Scans
* Robust Feature Selection for Object Recognition using Uncertain 2D Image Data
* Robust Motion Analysis
* Salient Structure Analysis for Fluid Flow
* Scene Understanding from Propagation and Consistency of Polarization-Based Constraints
* Segmentation of Surface Curvature Using a Photometric Invariant
* Skeleton-Space: a Multiscale Shape Description Combining Region and Boundary Information
* Shape Analysis of Brain Structures Using Physical and Experimental Modes
* Shape-from-Texture Algorithm Based on the Human Visual Psychophysics, A
* Simplex Meshes: a General Representation for 3D Shape Reconstruction
* Simultaneous Segmentation and Approximation of Complex Patterns
* Singularity Analysis and Derivative Scale-Space
* Site Model Supported Monitoring of Aerial Images
* Structure from Image Sequences Captured Through a Monocular Extra-Wide Angle Lens
* Study Relating Image Sampling Rate and Image Pattern Recognition, A
* Subpixel Contour Matching Using Continuous Dynamic Programming
* Surface Curvature from Integrability
* Surface Description of Complex Images from Multiple Range Images
* Survey of Motion Analysis from Moving Light Displays, A
* Three-Dimensional Image Registration for Spiral CT Angiography
* Time and Space Efficient Pose Clustering
* Time-to-X: Analysis of Motion through Temporal Parameters
* Tracking of Tubular Objects for Scientific Applications
* Using Global Consistency to Recognise Euclidean Objects with an Uncalibrated Camera
* Using Illumination Invariant Color Histogram Descriptors for Recognition
* Variable Window Gabor Filters and Their Use in Focus and Correspondence
* View-Based and Modular Eigenspaces for Face Recognition
* Viewpoint Selection for Visual Search Tasks
* Vision System for Observing and Extracting Facial Action Parameters, A
* VISIPLAN: A Hierarchical Planning Framework for Composing Biomedical Image Analysis Processes
* Vista: A Software Environment for Computer Vision Research
* Visual-Motion Fixation Invariant, A
* Visually-Guided Navigation by Comparing Two-Dimensional Edge Images
* Voronoi Diagrams of Polygons: A Framework for Shape Representation
* X-Y separable pyramid steerable scalable kernels
168 for CVPR94

CVPR96 * *CVPR
* 3-D Object Pose Estimation by Shading and Edge Data Fusion: Stimulating Virtual Manipulation on Mental Images
* 3-D Scene Data Recovery Using Omnidirectional Multibaseline Stereo
* 3D Model-Based Tracking of Humans in Action: A Multi-View Approach
* Active Face Tracking and Pose Estimation in an Interactive Room
* Active Intrinsic Calibration Using Vanishing Points
* Affine Invariants Detection: Edges, Active Contours, and Segments
* Affine Structure and Photometry
* Attention Control for Robot Vision
* Autonomous Recognition: Driven by Ambiguity
* Bayesian Face Recognition Using Deformable Intensity Surfaces
* Bayesian Image Restoration and Segmentation by Constrained Optimization
* Blurring Strategies for Image Segmentation Using a Multiscale Linking Model
* Calibration of a Foveated Wide Angle Lens on an Active Vision Head
* Canonical Decomposition of Steerable Functions
* Closed-Loop Object Recognition Using Reinforcement Learning
* Closest Point Search in High Dimensions
* Combination of Multiple Classifiers Using Local Accuracy Estimates
* Combining Grey Value Invariants with Local Constraints for Object Recognition
* Common Framework for Curve Evolution, Segmentation and Anisotropic Diffusion, A
* Comparison of Approaches to Egomotion Computation
* Comparison of Edge Detectors: A Methodology and Initial Study
* Competitive Mixture of Deformable Models for Pattern Classification
* Complexity analysis of RBF networks for Pattern Recognition
* Connectionist Networks for Feature Indexing and Object Recognition
* Constrained Phase Congruency: Simultaneous Detection of Interest Points and of their Scales
* Content-Based Retrieval: Research Issues and Directions
* Controlled Camera Motions for Scene Reconstruction and Exploration
* Convexity Analysis of Active Contour Problems
* Convolutional Neural Networks for Face Recognition
* Coregistration of Range and Optical Images Using Coplanarity and Orientation Constraints
* Dealing with Occlusions in the Eigenspace Approach
* Dense Non-Rigid Motion Tracking from a Sequence of Velocity Fields
* Determining Correspondences and Rigid Motion of 3D Point Sets with Missing Data
* Edge Detection with Automatic Scale Selection
* Education for Computer Vision: Panel
* Efficient Image Gradient-Based Object Localization and Recognition
* Eigenfeatures for Planar Pose Measurement of Partially Occluded Objects
* Epipolar Geometry and Linear Subspace Methods: A New Approach to Weak Calibration
* Extracting Salient Curves from Images: An Analysis of the Saliency Network
* Extraction of Maximal Inscribed Disks from Discrete Euclidean Distance Maps
* Factorization Method for Shape and Motion from Line Correspondences, A
* Factorization Methods for Projective Structure and Motion
* Fast and Flexible Statistical Method for Text Extraction in Document Pages, A
* Feature Correspondence by Interleaving Shape and Texture Computations
* Feature-Based Face Recognition Using Mixture Distance
* Finding Correspondence Points Based on Bayesian Triangulation
* FRAME: Filters, Random fields and Maximum Entropy: Towards the Unified Theory for Texture Modeling
* Framework for Recognizing a Facial Image from a Police Sketch, A
* From Projective to Euclidean Reconstruction
* Further Constraints on Visual Articulated Motions
* Gauging Relational Consistency and Correcting Structural Errors
* Geometric and Photometric Constraints for Surface Recovery
* Gesture Recognition Using Perseus Architecture
* Global Minimum for Active Contour Models: A Minimum Path Approach
* Global Models with Parametric Offsets as Applied to Cardiac Motion Recovery
* Graph Matching by Graduated Assignment
* Hand Segmentation Using Learning-Based Prediction and Verification for Hand-Sign Recognition
* Hierarchical Approach to High Resolution Edge Contour Reconstruction, A
* High Resolution PET and High Field Magnetic Resonance in Study of Human Physiology and Disease
* Human Assisted Computer Vision and Artificial Intelligence -- Why Not?
* Illumination and Geometry Invariant Recognition of Texture in Color Images
* Incremental Focus of Attention for Robust Visual Tracking
* Indexing to 3D Model Aspects using 2D Contour Features
* Industry Needs for Computer Vision and Pattern Recognition: Panel
* Inference of Segmented, Volumetric Shape from Intensity Images
* Integration of Optical Flow and Deformable Models: Applications to Human Face Shape and Motion Estimation, The
* Interactive Learning with a Society of Models
* Interpreting and Representing Tabular Documents
* Invariant Histograms and Deformable Template Matching for SAR Target Recognition
* Isotropic Corner and Edge Detection
* Isotropic Processing for Gradient Estimation
* Lambda-Tau-Space Representation of Images and Generalized Edge Detector
* Lie Group Theory, Space Variant Fourier Analysis and the Exponential Chirp Transform
* Local Parallel Computation of Stochastic Completion Fields
* MIMD Computing Platform for a Hierarchical Foveal Machine Vision System, An
* Minimal Operator Set for Passive Depth from Defocus
* Mirror and Point Symmetry under Perspective Skewing
* Model-Based Estimation of 3D Human Motion with Occlusion Based on Active Multi-Viewpoint Selection
* Modeling Clutter and Context for Target Detection in Infrared Images
* Motion from Fixation
* Multi-Stage Target Recognition Using Modular Vector Quantizers and Multi-Layer Perceptrons
* Multilinear Forms in the Velocity Case
* MUSE: Robust Surface Fitting Using Unbiased Scale Estimates
* Neural Network-Based Face Detection
* New, Faster, More Controlled Fitting of Implicit Polynomial 2D Curves and 3D Surfaces to Data
* Non-Rigid Matching Using Demons
* Nonlinear Shape Restoration for Document Images
* Novel Active Vision-Based Visual Threat Cue for Autonomous Navigation Tasks
* Occlusion Detectable Stereo -- Occlusion Patterns in Camera Matrix
* On 3D Shape Similarity
* On a Spectral Attentional Mechanism
* Optimal Servoing for Active Foveated Vision
* Ordinal Measures for Visual Correspondence
* Panoramic Image Acquisition
* Parametric Feature Detection
* Pattern Rejection
* Physics-Based Segmentation: Moving beyond Color
* Polarization Phase-Based Method for Material Classification and Object Recognition in Computer Vision
* Progress in Machine Vision for Motion Control
* Quantitative Measures of Change Based on Feature Organization: Eigenvalues and Eigenvectors
* Randomness and Geometric Figures in Computer Vision
* Real Time Tracking of Image Regions with Changes in Geometry and Illumination
* Realtime Extraction of Connected Component in 3D Sonar Range Images
* Recognition of Handwritten Phrases as Applied to Street Name Images
* Recognition of Planar Object Classes
* Recognition via Consensus of Local Moments of Brightness and Orientation
* Recognizing 3D Objects by Generating Random Actions
* Recognizing Three Dimensional Objects by Comparing Two-Dimensional Images
* Recovering Affine Motion and Defocus Blur Simultaneously
* Recovering the Viewing Parameter of Random, Translated and Noisy Projections of Asymmetric Objects
* Recovery of Global Nonrigid Motion: A Model-Based Approach without Point Correspondences
* Reducing Structure from Motion: A General Framework for Dynamic Vision with Experimental Evaluation
* Robust Clustering Algorithm Based on Competitive Agglomeration and Soft Rejection of Outliers, A
* Robust Recovery of Camera Rotation from Three Frames
* Runway Obstacle Detection by Controlled Spatiotemporal Image Flow Disparity
* Scale Space Localization, Blur, and Contour-Based Image Coding
* Shadows and Shading Flow Fields
* Shock Grammar for Recognition, A
* Similarity Queries in Image Databases
* SiteCity: A Semi-Automated Site Modeling System
* Skin and Bones: Multi-layer, Locally Affine, Optical Flow and Regularization with Transparency
* Space-Sweep Approach to True Multi-Image Matching, A
* Sparse Representations for Image Decomposition with Occlusions
* Stereo Machine for Video-Rate Dense Depth Mapping and Its New Applications, A
* Stereo Matching With Nonlinear Diffusion
* Stereo Vision for View Synthesis
* Structure and Motion of Curved 3D Objects from Monocular Silhouettes
* Structure from Linear or Planar Motions
* Structure from Multiple 2D Affine Correspondences without Camera Calibration
* Subpixel Image Registration by Estimating the Polyphase Decomposition of the Cross Power Spectrum
* Target Detection in Foveal ATR Systems
* Texture Features and Learning Similarity
* Towards Accurate Recovery of Shape from Shading under Diffuse Light
* Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models, A
* Use of Hybrid Models to Recover Cardiac Wall Motion in Tagged MR Images, The
* Using a Spectral Reflectance Model for the Illumination-Invariant Recognition of Local Image Structure
* Using Physics-based Invariant Representations for the Recognition of Regions in Multispectral Satellite Images
* Vector-Valued Active Contours
* Video Browsing Using Edges and Motion
* View Point Variation in the Noise Sensitivity of Pose Estimation
* Visual Organization for Figure/Ground Separation
* What is the Set of Images of an Object Under All Possible Lighting Conditions?
* Word Spotting: A New Approach to Indexing Handwriting
144 for CVPR96

CVPR97 * *CVPR
* 3D Reconstruction of the Human Jaw from a Sequence of Images
* 3D to 2D Recognition with Regions
* Adaptive B-Splines and Boundary Estimation
* Analysis of Gesture and Action in Technical Talks
* Analyzing Articulated Motion Using Expectation-Maximization
* Appearance Matching of Occluded Objects Using Coarse-to-Fine Adaptive Masks
* Are Textureless Scenes Recoverable?
* Area and Length Minimizing Flows for Shape Segmentation
* Assessment of Information Criteria for Motion Model Selection, An
* Autocalibration and the Absolute Quadric
* Automatic Line Matching Across Views
* Automatic Model Acquisition from Range Images with View Planning
* Bas-Relief Ambiguity, The
* Bayesian Segmentation Framework for Textured Visual Images, A
* Body Plans
* Bootstrapping Algorithm for Learning Linear Models of Object Classes, A
* Building Reconstruction from Optical and Range Images
* Calibration of a Structured Light System: A Projective Approach
* Catadioptric Omnidirectional Cameras
* Character Extraction of License Plates from Video
* Characterization of Errors in Compositing Panoramic Images
* Class of Probabilistic Shape Models, A
* Color Based Tracking of Heads and Other Mobile Objects at Video Frame Rates
* Combining Region Splitting and Edge Detection Through Guided Delaunay Image Subdivision
* Completion Energies and Scale
* Computational Approach to Steerable Functions, A
* Configuration Based Scene Classification and Image Indexing
* Confounding of Translation and Rotation in Reconstruction from Multiple Views, The
* Content-Based Trademark Retrieval System Using Visually Salient Features
* Controlling View-Based Algorithms Using Approximate World Models and Action Information
* Corner Detection with Covariance Propagation
* Coupled Hidden Markov Models for Complex Action Recognition
* Creating Image Based VR Using a Self-Calibrating Fisheye Lens
* Critical Motion Sequences for Monocular Self-Calibration and Uncalibrated Euclidean Reconstruction
* Curvature Based Descriptor Invariant to Pose and Albedo Derived from Photometric Data, A
* Cylindrical Rectification to Minimize Epipolar Distortion
* Deformable Multi Template Matching with Application to Portal Images
* Depth from Scattering
* Detection and Description of Buildings from Multiple Aerial Images
* Deterioration Detection for Digital Film Restoration
* Determining a Polyhedral Shape Using Interreflections
* Direct Method for Stereo Correspondence Based on Singular Value Decomposition, A
* Direct Methods for Estimation of Structure and Motion from Three Views
* Disparity Component Matching for Visual Correspondence
* Divide and Conquer Strategy in Shape from Shading Problems, A
* Dynamic Appearance-Based Recognition
* Edge Flow: A Framework of Boundary Detection and Image Segmentation
* Edge Localization in Surface Reconstruction Using Optimal Estimation Theory
* Efficient Approximation of Range Images Through Data-Dependent Adaptive Triangulations
* Efficient Guaranteed Search for Gray-Level Patterns
* Efficient Regularity-Based Grouping
* Efficient Stereo with Multiple Windowing
* Ego Motion Estimation Using Optical Flow Fields Observed from Multiple Cameras
* Empirical Bayesian EM Based Motion Segmentation
* Error Analysis of a Real Time Stereo System
* Euclidean Reconstruction from Image Sequences with Varying and Unknown Focal Length and Principal Point
* Experimental Performance Evaluation of Feature Grouping Modules
* Extracting Surface Textures and Microstructures from Multiple Aerial Images
* Extracting Symmetry Features from Color Images
* Face Detection with Information Based Maximum Discrimination
* Facial Expression Recognition and its Degree Estimation
* Fast 3D Stabilization and Mosaic Construction
* Fast Binary Image Processing Using Binary Decision Diagrams
* Feature Tracking from an Image Sequence Using Geometric Invariants
* FERET Evaluation Methodology for Face-Recognition Algorithms, The
* FOCUS: Searching for Multi-Colored Objects in a Diverse Image Database
* Four-Step Camera Calibration Procedure with Implicit Image Correction, A
* Fully 3D Active Surface Models with Self-Inflation and Self-Deflation Forces
* General Filter for Measurements with Any Probability Distribution, A
* Global Training of Document Processing Systems Using Graph Transformer Networks
* Gradient Vector Flow: A New External Force for Snakes
* Hierarchical Recognition of Articulated Objects from Single Perspective Views
* Hybrid Framework for Surface Registration and Deformable Models, An
* Hyperpatches for 3D Model Acquisition and Tracking
* Identification of Salient Contours in Cluttered Images
* Image Based View Synthesis of Articulated Agents
* Image Based Visual Motion Cue for Autonomous Navigation, An
* Image Indexing Using Color Correlograms
* Image Velocity Estimation from Trajectory Surface in Spatiotemporal Space
* Images as Embedding Maps and Minimal Surfaces: Movies, Color, and Volumetric Medical Images
* Independent 3D Motion Detection Based on Depth Elimination in Normal Flow Fields
* Interaction with On-Screen Objects Using Visual Gesture Recognition
* LAFTER: Lips and Face Real Time Tracker
* Learning and Recognizing Human Dynamics in Video Sequences
* Learning Bilinear Models for Two Factor Problems in Vision
* Learning Generic Prior Models for Visual Computation
* Learning Parameterized Models of Image Motion
* Lens Distortion Calibration Using Point Correspondences
* Linear Fitting with Missing Data: Applications to Structure from Motion and to Characterizing Intensity Images
* Lip Reading from Scale-Space Measurements
* Local Blur Estimation and Super Resolution
* Matching 3-D Arcs
* MDL Estimation for Small Sample Sizes and Its Application to Segmenting Binary Strings
* Minimum-Variance Adaptive Surface Mesh, A
* Model-Based Approach to Accurate and Consistent 3-D Modeling of Drainage and Surrounding Terrain
* Model-Based Multi Objective Analysis of Ultrasound Image Sequences in Prenatal Diagnosis
* Modelling of Single Mode Distributions of Colour Data Using Directional Statistics
* Motion Estimation Using Ordinal Measures
* Motion of Disturbances: Detection and Tracking of Multi-Body Non-Rigid Motion
* Multi Image Focus of Attention for Rapid Site Model Construction
* Multi-Modal Tracking of Faces for Video Communications
* Multi-Scale Classification of 3D Objects
* Name-It: Association of Face and Name in Video
* Non-Parametric Similarity Measures for Unsupervised Texture Segmentation and Image Retrieval
* Nonlinear Operators in Image Restoration
* Normalized Cuts and Image Segmentation
* Novel View Synthesis in Tensor Space
* Object Based Video Indexing for Virtual Studio Productions
* Object Detection Using Hierarchical MRF and MAP Estimation
* Object Detection with Vector Quantized Binary Features
* Object Recognition Using Appearance Based Parts and Relations
* Object Recognition Using Invariant Profiles
* Object Tracking Using Affine Structure for Point Correspondences
* On Occluding Contour Artifacts in Stereo Vision
* On Perpendicular Texture or: Why Do We See More Flowers in the Distance?
* Optic Flow Calculation Using Robust Statistics
* Optimal Selection of Camera Parameters for Recovery of Depth from Defocused Images
* Orientation Diffusions
* Panoramic Mosaics By Manifold Projection
* Parameterized Structure from Motion for 3D Adaptive Feedback Tracking of Faces
* Pedestrian Detection Using Wavelet Templates
* Photometric Computation of the Sign of Gaussian Curvature using a Curve-Orientation Invariant
* Photorealistic Scene Reconstruction by Voxel Coloring
* Physically Based Fluid Flow Recovery from Image Sequences
* Pictorial Recognition Using Affine-Invariant Spectral Signatures
* Prediction Intervals for Surface Growing Range Segmentation
* Projective Registration with Difference Decomposition
* Rational Discrete Generalized Cylinders and Their Application to Shape Recovery in Medical Images
* Real Time Closed World Tracking
* Real Time Computer Vision System for Measuring Traffic Parameters, A
* Real-Time Estimation of Human Body Posture from Monocular Thermal Images
* Recognizing Objects by Matching Oriented Points
* Reconstruction of 3-D Curves from 2-D Images Using Affine Shape Methods for Curves
* Recursive Structure and Motion from Image Sequences Using Shape and Depth Spaces
* Reflectance and Texture of Real-World Surfaces
* Region-Level Graph Labeling Approach to Motion-Based Segmentation, A
* Registering Multiple Cartographic Models with the Hierarchical Mixture of Experts Algorithm
* Removing the Bias from Line Detection
* Representation and Recognition of Action Using Temporal Templates, The
* Representation of Objects in a Volumetric Frequency Domain with Application to Face Recognition
* Resolving Occlusion in Augmented Reality: A Contour Based Approach Without 3D Reconstruction
* Ridge's Corner Detection and Correspondence
* Robust Analysis of Feature Spaces: Color Image Segmentation
* Robust and Convergent Iterative Approach for Determining the Dominant Plane from Two Views Without Correspondence and Calibration, A
* Robust Occluding Contour Detection Using the Hausdorff Distance
* Scale-Space Vector Fields for Feature Analysis
* Self-Maintaining Camera Calibration over Time
* Shape and Albedo from Multiple Images Using Integrability
* Shape from the Light Field Boundary
* Shape Indexing Using Approximate Nearest-Neighbour Search in High-Dimensional Spaces
* Shocks from Images: Propagation of Orientation Elements
* Smoothness in Layers: Motion Segmentation Using Nonparametric Mixture Estimation
* Stereo Coupled Active Contours
* Stereo-Motion That Complements Stereo and Motion Analyses
* Stratified Approach To Metric Self-Calibration, A
* Surface Shape from Warping
* Temporal Classification of Natural Gesture with Application to Video Coding
* Temporal Multi-Scale Models for Flow and Acceleration
* Tracking Non-Rigid Moving Objects Based on Color Cluster Flow
* Training Support Vector Machines: An Application to Face Detection
* True Multi-Image Alignment and Its Application to Mosaicing and Lens Distortion Correction
* Uncalibrated 1D Camera and 3D Affine Reconstruction of Lines
* Using 3D Features to Improve Terrain Classification
* Using Chromaticity Distributions and Eigenspace Analysis for Pose-, Illumination- and Specularity-Invariant Recognition of 3D Objects
* Using Differential Constraints to Reconstruct Complex Surfaces from Stereo
* Using Geometric Corners to Build a 2D Mosaic from a Set of Images
* Using Local 3D Structure for Segmentation of Bone from Computer Tomography Images
* Velocity and Disparity Cues for Robust Real-Time Binocular Tracking
* Verifying Model-Based Alignments in the Presence of Uncertainty
* Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques
* Vision for a Smart Kiosk
* What Is A Light Source?
173 for CVPR97

CVPR98 * *CVPR
* 2D Fluid Motion Analysis from a Single Image
* 3D Object Depth Recovery from Highlights Using Active Sensor and Illumination Control
* Acquisition and Use of Interaction Behaviour Models, The
* Acquisition of a Large Pose-Mosaic Dataset
* Action Recognition Using Probabilistic Parsing
* Analyzing the Bidirectional Texture Function
* Anomaly Detection through Registration
* Appearance Based Behavior Recognition by Event Driven Selective Attention
* Automated Mosaicing with Super-resolution Zoom
* Automatic Hierarchical Classification of Silhouettes of 3D Objects
* Background Modeling for Segmentation of Video-rate Stereo Sequences
* Bagging in Computer Vision
* Bayesian Framework for Semantic Content Characterization, A
* Boundary Finding with Correspondence Using Statistical Shape Models
* Camera Calibration and 3D Euclidean Reconstruction from Known Observer Translations
* Classification Based Euclidean Similarity Metric for 3D Image Retrieval, A
* Clustering Appearances of 3D Objects
* Comparing Images Under Variable Illumination
* Computer Modeling, Analysis, and Synthesis of Dressed Humans
* Computing the Camera Heading from Multiple Frames
* Connected Vibrations: A Modal Analysis Approach to Non-Rigid Motion Tracking
* Content-based Search Engine for VRML Databases, A
* Correspondence Between two Different Views of X-Ray Mammograms Using Simulation of Breast Deformation
* Coupled PDE Model of Nonlinear Diffusion for Image Smoothing and Segmentation, A
* Creaseness Measures for CT and MR Image Registration
* Curves Matching Using Geodesic Paths
* Dense Shape and Motion from Region Correspondences by Factorization
* Depth Measurement by the Multi-focus Camera
* Direct Estimation of Motion and Extended Scene Structure from a Moving Stereo Rig
* Direct Shape from Texture Using a Parametric Surface Model and an Adaptive Filtering Technique
* Edge/region-based Segmentation and Reconstruction of Underwater Acoustic Images By Markov Random Fields
* Efficient Multiple Model Recognition in Cluttered 3-d Scenes
* Efficient Optimization of a Deformable Template Using Dynamic Programming
* Efficient Query Refinement for Image Retrieval
* Elliptical Head Tracking Using Intensity Gradients and Color Histograms
* Empirical Performance Analysis of Linear Discriminant Classifiers
* Exhaustive Detection of Manufacturing Flaws as Abnormalities
* Exploitation of Natural Image Statistics by Biological Vision Systems: 1/f^2 Power Spectra and Self-Similar Bandpass Decompositions
* Extraction and Classification of Visual Motion Patterns for Hand Gesture Recognition
* Extraction and Tracking of the Tongue Surface from Ultrasound Image Sequences
* Face Recognition Based on Nearest Linear Combinations
* Fingerprint Preselection using Eigenfeatures
* Fitting Undeformed Superquadrics to Range Data: Improving Model Recovery and Classification
* From Parametric Warping to the Cooperation of Local Features and Global Models
* Fuzzy Convergence
* Fuzzy Relational Distance for Large-scale Object Recognition
* Generation of the Euclidean Skeleton from the Vector Distance Map by a Bisector Decision Rule
* Ground from Figure Discrimination
* Head Tracking via Robust Registration in Texture Map Images
* Hierarchical Morphable Models
* Hierarchical Organization of Appearance Based Parts and Relations for Object Recognition
* Histogram Model for 3D Textures
* Human Action Detection Using PNF Propagation of Temporal Constraints
* Illumination Cones for Recognition under Variable Lighting: Faces
* Image Alignment for Precise Camera Fixation and Aim
* Image Editing in the Contour Domain
* Image Indexing and Retrieval Based on Human Perceptual Color Clustering
* Image Segmentation Based on the Integration of Pixel Affinity and Deformable Models
* Image Segmentation for Human Tracking using Sequential-Image-Based Hierarchical Adaptation
* Image Segmentation Using Local Variation
* Image Understanding from Thermal Emission Polarization
* Incorporating Illumination Constraints in Deformable Models
* Incremental Tracking of Human Actions from Multiple Views
* Inferring Segmented Surface Description from Stereo Data
* Integrated Person Tracking Using Stereo, Color, and Pattern Detection
* Interactive Construction of 3D Models from Panoramic Mosaics
* Interactive Sensor Planning
* Joint Probabilistic Techniques for Tracking Multi-part Objects
* Layered Approach to Stereo Reconstruction, A
* Learning to Form Large Groups of Salient Image Features
* Magic Morphin' Mirror: Person Detection and Tracking
* Making Good Features to Track Better
* Markov Random Fields with Efficient Approximations
* Methodology for Deriving Probabilistic Correctness Measures from Recognizers, A
* Metric Rectification for Perspective Images of Planes
* Minimax Entropy and Learning by Diffusion
* Mixtures of Local Linear Subspaces for Face Recognition
* Model-based Multiscale Detection of 3D Vessels
* Model-Based Target Recognition in Pulsed Ladar Imagery
* Moment Invariants and Quantization Effects
* Mosaics of Scenes with Moving Objects
* Motion Feature Detection Using Steerable Flow Fields
* Multiple Feature Integration for Robust Object Localization
* New Complex Basis for Implicit Polynomial Curves and its Simple Exploitation for Pose Estimation and Invariant Recognition, A
* New Linear Method for Euclidean Motion/structure from Three Calibrated Affine Views, A
* Nonlinear PHMMs for the Interpretation of Parameterized Gesture
* Nonrigid Motion Analysis Based on Dynamic Refinement of Finite Element Models
* Object Localization by Dynamic Template Warping
* Objective Comparison Methodology of Edge Detection Algorithms Using a Structure from Motion Task, An
* On 3-D Surface Reconstruction Using Shape from Shadows
* Optimal Structure from Motion: Local Ambiguities and Global Estimates
* Optimized Interaction Strategy for Bayesian Relevance Feedback, An
* Orientation Space Filtering for Multiple Orientation Line Segmentation
* Performance Characterization and Comparison of Video Indexing Algorithms
* Probabilistic Affine Invariants for Recognition
* Probabilistic Formulation for Hausdorff Matching, A
* Probabilistic Formulation for Object Recognition
* Probabilistic Reasoning Models for Face Recognition
* Projective Translations and Affine Stereo Calibration
* Qualitative and Quantitative Car Tracking from a Range Image Sequence
* Radial Cumulative Similarity Transform for Robust Image Correspondence, A
* Real Time Tracking for Enhanced Tennis Broadcasts
* Real-Time 2-D Feature Detection on a Reconfigurable Computer
* Recognizing Abruptly Changing Facial Expressions from Time-Sequential Face Images
* Reliable Tracking of Human Arm Dynamics by Multiple Cue Integration and Constraint Fusion
* Remote Reality Demonstration
* Retrieval of Commercials by Video Semantics
* Robust Defect Segmentation in Woven Fabrics
* Robust Recognition of Scaled Eigenimages Through a Hierarchical Approach
* Rotated Wedge Averaging Method for Junction Classification
* Rotation Invariant Neural Network-based Face Detection
* Rotation Invariant Neural Network-based Face Detection
* Salient and Multiple Illusory Surfaces
* Sample Tree: A Sequential Hypothesis Testing Approach to 3D Object Recognition, The
* Scene Segmentation from 3D Motion
* Segmentation by Grouping Junctions
* Segmenting 3-d Surfaces using Multicolored Illumination
* Shape from Equal Thickness Contours
* Singularity Analysis for Articulated Object Tracking
* Soccer Image Sequence Computed by a Virtual Camera
* Spatiotemporal Motion Model for Video Summarization, A
* Specialized Multibaseline Stereo Technique for Obstacle Detection, A
* Statistical Framework for Long-Range Feature Matching in Uncalibrated Image Mosaicing, A
* Stereo and Color Analysis for Dynamic Obstacle Avoidance
* Stochastic Computation of Medial Axis in Markov Random Fields
* Subtly Different Facial Expression Recognition and Expression Intensity Estimation
* Texture Recognition Using a Non-parametric Multi-scale Statistical Model
* Theory of Selection for Gamut Mapping Color Constancy, A
* Towards the Computational Perception of Action
* Tracking People with Twists and Exponential Maps
* Using Adaptive Tracking to Classify and Monitor Activities in a Site
* Using Hierarchical Shape Models to Spot Keywords in Cursive Handwriting Data
* VADIS: A Video Analysis, Display and Indexing System
* Variable-Scale Smoothing and Edge Detection Guided by Stereoscopy
* Variants of Dynamic Link Architecture Based on Mathematical Morphology for Frontal Face Authentication
* Varying Focal Length Self-calibration and Pose Estimation
* Verification Protocol and Statistical Performance Analysis for Face Recognition Algorithms, A
* Video Scene Segmentation Via Continuous Video Coherence
* Volumetric Layer Segmentation Using Coupled Surfaces Propagation
* W4: A Real Time System for Detecting and Tracking People
* Weak Orthogonalization of Face and Perturbation for Recognition
* Weighted Combination of Classifiers Employing Shared and Distinct Representations, A
* Well-behaved, Tunable 3D Affine Invariants
* What is the Spectral Dimensionality of Illumination Functions in Outdoor Scenes?
145 for CVPR98

CVPR99 * *CVPR
* 3D Deformable Image Matching Using Multiscale Minimization of Global Energy Functions
* 3D Geometric Invariant Alignment of Surfaces with Application in Brain Mapping
* 3D Trajectory Recovery for Tracking Multiple Objects and Trajectory-Guided Recognition of Actions
* Adaptive Background Mixture Models for Real-time Tracking
* Adaptive Balloon Models
* Advances in Daylight Statistical Colour Modelling
* Algebraic Curves that Work Better
* Applying Perceptual Grouping to Content-based Image Retrieval: Building Images
* Area-Based Computation of Stereo Disparity with Model-Based Window Size Selection
* Audio-Visual Person Verification
* Automatic Aircraft Recognition: Toward Using Human Similarity Measure in a Recognition System
* Automatic Differentiation Facilitates OF-Integration into Steering-Angle-Based Road Vehicle Tracking
* Automatic Hierarchical Classification Using Time-based Co-occurrences
* Automatic Reconstruction of Piecewise Planar Models from Multiple Views
* Background Estimation and Removal Based on Range and Color
* Bayesian Approach to Spread Spectrum Watermark Detection and Secure Copyright Protection for Digital Image Libraries, A
* Bayesian Multi-camera Surveillance
* Bias Field Estimation and Adaptive Segmentation of MRI Data Using a Modified Fuzzy C-Means Algorithm
* Biprism-Stereo Camera System, A
* Calibration of an Outdoor Active Camera System
* Calibration of Image Sequences for Model Visualisation
* Color Edge Detection with the Compass Operator
* Color Image Segmentation
* Combining Central and Peripheral Vision for Reactive Robot Navigation
* Combining Information using Hard Constraints
* Comparison of Edge Detectors Using an Object Recognition Task
* Computational Approach to Semantic Event Detection, A
* Computing Rectifying Homographies for Stereo Vision
* Constrained Self-Calibration
* Convolver-Based Real-Time Stereo Machine (SAZAN), A
* Critical Motions in Euclidean Structure from Motion
* Customized-Queries Approach to CBIR Using EM, The
* Data-driven Shape-from-Shading using Curvature Consistency
* Deformable Shape Detection and Description via Model-Based Region Grouping
* Deformable Template and Distribution Mixture-based Data Modeling for the Endocardial Contour Tracking in an Echographic Sequence
* Dependency-based Framework of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals, A
* Detecting and Tracking Moving Objects for Video Surveillance
* Detection and Characterization of Multiple Motion Points
* Detection and Removal of Line Scratches in Motion Picture Films
* Detection of Gradual Transitions through Temporal Slice Analysis
* Discriminatory Power of Ordinal Measures: Towards a New Coefficient, The
* Dynamic Occluding Contours: A New External-energy Term for Snakes
* Edge Detector Evaluation Using Empirical ROC Curves
* Edge Preserving Orientation Adaptive Filtering
* Efficient Bundle Adjustment with Virtual Key Frames: A Hierarchical Approach to Multi-frame Structure from Motion
* Efficient Iterative Solutions to M-View Projective Reconstruction Problem
* Efficient Recursive Factorization Method for Determining Structure from Motion, An
* Efficient Techniques for Wide-Angle Stereo Vision Using Surface Projection Models
* Eigen-Texture Method: Appearance Compression based on 3D Model
* Eigenshapes for 3D Object Recognition in Range Data
* Elastic Matching of Diffusion Tensor MRIs
* Elastic Registration of Medical Images Using Radial Basis Functions with Compact Support
* Estimating Mixture Models of Images and Inferring Spatial Transformations Using the EM Algorithm
* Estimating Model Parameters and Boundaries By Minimizing a Joint, Robust Objective Function
* Estimation of Epipolar Geometry from Apparent Contours: Affine and Circular Motion Cases
* Evaluation of Texture Segmentation Algorithms
* Explaining Optical Flow Events with Parameterized Spatio-temporal Models
* Explanation-based Facial Motion Tracking Using a Piecewise Bezier Volume Deformation Model
* Exploiting the Dependencies in Information Fusion
* Extracting Nonrigid Motion and 3D Structure of Hurricanes from Satellite Image Sequences without Correspondences
* Extracting Textured Vertical Facades from Controlled Close-Range Imagery
* Face Recognition Using Shape and Texture
* Factorization as a Rank 1 Problem
* Fast Scheme for Altering Resolution in the Compressed Domain, A
* Fast, Reliable Head Tracking under Varying Illumination
* Fast, Robust, and Consistent Camera Motion Estimation
* FingerCode: A Filterbank for Fingerprint Representation and Matching
* Folded Catadioptric Cameras
* Framework for Learning Query Concepts in Image Classification, A
* Fundamental Bounds on Edge Detection: An Information Theoretic Evaluation of Different Edge Cues
* Fuzzy K-NN Algorithm using Weights from the Variance of Membership Values, A
* Generic Object Detection using Model Based Segmentation
* Geodesic Active Contours for Supervised Texture Segmentation
* Gesture Localization and Recognition Using Probabilistic Visual Learning
* Global Measures of Coherence for Edge Detector Evaluation
* Graph-Theoretic Clustering for Image Grouping and Retrieval
* Harmonic Maps and their Applications in Surface Matching
* High-Level and Generic Models for Visual Search: When Does High Level Knowledge Help?
* Histogram Clustering for Unsupervised Image Segmentation
* Illumination Distribution from Shadows
* Image Interpolation by Joint View Triangulation
* Implicit Representation and Scene Reconstruction from Probability Density Functions
* Improving Identification Performance by Integrating Evidence from Sequences
* Independent Motion: The Importance of History
* Indexing Using a Spectral Encoding of Topological Structure
* Integral Formulation for Differential Photometric Stereo, An
* Integrating Shape from Shading and Range Data Using Neural Networks
* Interpolating View and Scene Motion by Dynamic View Morphing
* Invariant Recognition in Hyperspectral Images
* Joint Detection for Potsherds of Broken Earthenware
* Learning 2D Shape Models
* Ligature Instabilities in the Perceptual Organization of Shape
* Linear Self-Calibration of a Rotating and Zooming Camera
* Locating Indexing Structures in Engineering Drawing Databases Using Location Hashing
* Low-cost Interactive Active Monocular Range Finder
* Material Classification for 3D Objects in Aerial Hyperspectral Images
* Measurement of Surface Orientations of Transparent Objects Using Polarization in Highlight
* Minimal Projective Reconstruction with Missing Data
* Minimum-Entropy Models of Scene Activity
* Model-Based Segmentation of Nuclei
* Model-guided Segmentation of Corpus Callosum in MR Images
* Motion Segmentation: A Synergistic Approach
* Multi-Frame Alignment of Planes
* Multi-scale Feature Selection in Stereo
* Multi-View Approach to Motion and Stereo, A
* Multiple Hypothesis Approach to Figure Tracking, A
* Multiple-Hand-Gesture Tracking Using Multiple Cameras
* Multiscale Image Registration Using Scale Trace Correlation
* New Bayesian Framework for Object Recognition, A
* New Multi-Level Framework for Deformable Contour Optimization, A
* New Structure-from-Motion Ambiguity, A
* New Visualization Paradigm for Multispectral Imagery and Data Fusion, A
* Nonmetric Calibration of Wide-Angle Lenses and Polycameras
* Norm^2 Based Face Recognition
* Novel Bayesian Method for Fitting Parametric and Non-Parametric Models to Noisy Data, A
* Object Recognition with Color Co-occurrence Histograms
* On Plane-Based Camera Calibration: A General Algorithm, Singularities, Applications
* On the Intrinsic Reconstruction of Shape from its Symmetries
* Optimal Eigenfeature Selection by Optimal Image Registration
* Optimal Multi-Scale Matching
* Optimal Rigid Motion Estimation and Performance Evaluation with Bootstrap
* Panoramic EPI Generation and Analysis of Video from a Moving Platform with Vibration
* Parameterized Image Varieties and Estimation with Bilinear Constraints
* Perceptual Organization of Occluding Contours Generated by Opaque Surfaces
* Perceptual Organization via the Symmetry Map and Symmetry Transforms
* Performance Prediction and Validation for Object Recognition
* Planar Catadioptric Stereo: Geometry and Calibration
* Pose Clustering with Density Estimation and Structural Constraints
* Probabilistic Framework for Embedded Face and Facial Expression Recognition, A
* Probabilistic Recognition of Activity Using Local Appearance
* Progressive Probabilistic Hough Transform for Line Detection
* Projective Framework for Scene Segmentation in the Presence of Moving Objects, A
* Projective Rectification without Epipolar Geometry
* Projective Rotations Applied to a Pan-Tilt Stereo Head
* Q-Warping: Direct Computation of Quadratic Reference Surfaces
* Quotient Image: Class Based Recognition and Synthesis Under Varying Illumination Conditions, The
* Radiometric Self Calibration
* Real-time Gabor-type Filtering Using Analog Focal Plane Image Processors
* Real-Time Periodic Motion Detection, Analysis, and Applications
* Recognition of Strings Using Non-stationary Markovian Models: An Application to ZIP Code Recognition
* Recognizing Color Patterns Irrespective of Viewpoint and Illumination
* Recognizing Hand Gestures Using Motion Trajectories
* Reconstruction of Linearly Parameterized Models from Single Images with a Camera of Unknown Focal Length
* Representing and Recognizing Complete Set of Geons Using Extended Superquadrics
* Robot Localization Using Uncalibrated Camera Invariants
* Robust and Efficient Image Alignment with Spatially-Varying Illumination Models
* Robust Hierarchical Algorithm for Constructing a Mosaic from Images of the Curved Human Retina
* Robust Visual Servoing Based on Relative Orientation
* Sampling, Resampling and Colour Constancy
* Sensor Planning for a Trinocular Active Vision System
* Separating Reflections and Lighting Using Independent Components Analysis
* Shadow Gestures: 3D Hand Pose Estimation Using a Single Camera
* Shape from Recognition and Learning: Recovery of 3D Face Shapes
* Shape from Video
* Shape Reconstruction in Projective Voxel Grid Space from Large Number of Images
* Simple Technique for Self-Calibration, A
* Simultaneous Depth Recovery and Image Restoration from Defocused Images
* Simultaneous Extraction of Functional Face Sub-spaces
* Simultaneous Image Classification and Restoration Using a Variational Approach
* Space-Variant Dynamic Neural Fields for Visual Attention
* Spatial Filter Selection for Illumination-Invariant Color Texture Discrimination
* Statistical Biases in Optic Flow
* Statistical Color Models with Application to Skin Detection
* Statistics of Natural Images and Models
* Stereo Correspondence from Motion Correspondence
* Stereo Panorama with a Single Camera
* Stochastic Image Segmentation by Typical Cuts
* Structured Probabilistic Model for Recognition, A
* Subset Selection for Active Object Recognition
* Surface Reconstruction from Multiple Aerial Images in Dense Urban Areas
* Tensors of Three Affine Views, The
* Time-Series Classification Using Mixed-State Dynamic Bayesian Networks
* Toboggan-Based Intelligent Scissors with a Four Parameter Edge Model
* Torque-based Recursive Filtering Approach to the Recovery of the 3-D Articulated Motion from Image Sequences
* Toward a Scale-Space Aspect Graph: Solids of Revolution
* Toward Learning Visual Discrimination Strategies
* Toward Recovering Shape and Motion of 3D Curves from Multi-View Image Sequences
* Tracking from Multiple View Points: Self-calibration of Space and Time
* Trajectory Triangulation of Lines: Reconstruction of a 3D Point Moving Along a Line from a Monocular Image Sequence
* Unifying Boundary and Region-based Information for Geodesic Active Tracking
* User Assisted Modeling of Buildings from Aerial Images
* Using a Linear Subspace Approach for Invariant Subpixel Material Identification in Airborne Hyperspectral Imagery
* Using Planar Parallax to Estimate the Time-to-Contact
* Using the Condensation Algorithm for Robust, Vision-based Mobile Robot Localization
* Virtual Snakes for Occlusion Analysis
* Vision-Based Speaker Detection Using Bayesian Networks
* Visual Recognition of Multi-agent Action Using Binary Temporal Relations
* Visual Signature Verification Using Affine Arc-length
* Visual Tracking and Control using Lie Algebras
* Volumetric Stereo Matching Method: Application to Image-Based Modeling, A
* What Do Planar Shadows Tell Us About Scene Geometry?
* Yet Another Method for Pose Estimation: A Probabilistic Approach using Points, Lines, and Cylinders
193 for CVPR99

CVPRWS09 * *CVPR Workshops
* Robust feature matching in 2.3µs

CVPRWS11 * *CVPR Workshops

Last update: 16-Mar-24 21:12:13
Use price@usc.edu for comments.