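ACM MM is the number-one top conference in the multimedia field, and one that everybody dreams of getting into.
One of my teachers told me about a student of his who spent three years in the PhD program, submitted to MM several times without a single acceptance, and then asked to postpone graduation, saying: "In three years I could have won over a girl, yet I just cannot get a paper into this conference..."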
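This year's ACM MM (2018) will be held in Seoul, South Korea, on October 22-26. The conference program has been published on the official website
http://www.acmmm/2018/
and the list of accepted papers is as follows: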
Accepted Papers
Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing
Incremental Deep Hidden Attribute Learning
Step-by-step Erasion, One-by-one Collection: A Weakly Supervised Temporal Action Detector
Visual Domain Adaptation with Manifold Embedded Distribution Alignment
Object-Difference Attention: A simple relational attention for Visual Question Answering
Robust Billboard-based, Free-viewpoint Video Synthesis Algorithm to Overcome Occlusions under Challenging Outdoor Sport Scenes
Multi-Human Parsing Machines
Deep Priority Hashing
CropNet: Real-Time Thumbnailing
Learning to Transfer: Generalizable Attribute Learning with Multitask Neural Model Search
Supervised Online Hashing via Hadamard Codebook Learning
Shared Linear Encoder-based Gaussian Process Latent Variable Model for Visual Classification
Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval
Fine-grained Grocery Product Recognition by One-shot Learning
Fine-grained Representation Learning and Recognition by Exploiting Hierarchical Semantic Embedding
Style Separation and Synthesis via Generative Adversarial Networks
Attention-based Pyramid Aggregation Network for Visual Place Recognition
Dance with Melody : An LSTM-autoencoder Approach on Music-oriented Dance Synthesis
Fast Parameter Adaptation for Few-shot Image Captioning and Visual Question Answering
Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data
Post Tuned Hashing: A New Approach to Indexing High-dimensional Data
Joint Sign Language Recognition and Education System with ST-Net
Aesthetic-Driven Image Enhancement by Adversarial Learning
Cascaded Feature Augmentation with Diffusion for Image Retrieval
Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM
Temporal Sequence Distillation: Towards Few Frame Action Recognition
Joint Global and Co-Attentive Representation Learning for Image-Sentence Retrieval
Multi-View Image Generation from a Single-View
Slackliner — An Interactive Slackline Training Assistant
Hierarchical Memory Modelling for Video Captioning
Group Re-Identification: Leveraging and Integrating Multi-Grain Information
Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions
Multi-modal Preference Modeling for Product Search
GraphNet: Learning Image Pseudo Annotations for Weakly-Supervised Semantic Segmentation
Deep Triplet Quantization
Previewer for Multiple-Scale Object Detector
QARC: Video Quality Aware Rate Control for Real-Time Video Streaming based on Deep Reinforcement Learning
What dress fits me best? Fashion Recommendation on the Clothing Style for Personal Body Shape
SCRATCH: A Scalable Discrete Matrix Factorization Hashing for Cross-Modal Retrieval
OSMO: Online Specific Models for Occlusion in Multiple Object Tracking under Surveillance Scene
Cross-modal Moment Localization in Videos
Attribute-Aware Attention Model for Fine-grained Representation Learning
Video Forecasting with Forward-Backward-Net: Delving Deeper into Spatiotemporal Consistency
Learning Discriminative Features with Multiple Granularities for Person Re-Identification
StripNet: Towards Topology Consistent Strip Structure Segmentation
Attention-based Multi-Patch Aggregation for Image Aesthetic Assessment
An End-to-End Quadrilateral Regression Network for Comic Panel Extraction
CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming
Only Learn One Sample: Fine-Grained Visual Categorization with One Sample Training
Life-long Cross-media Correlation Learning
Text-to-image Synthesis via Symmetrical Distillation Networks
Multi-Scale Correlation for Sequential Cross-modal Hashing Learning
Jaguar: Low Latency Mobile Augmented Reality with Flexible Tracking
Feature Constrained by Pixel: Hierarchical Adversarial Deep Domain Adaptation
Explore Multi-Step Reasoning in Video Question Answering
Monocular Camera Based Real-Time Dense Mapping Using Generative Adversarial Network
Learning Collaborative Generation Correction Modules for Blind Image Deblurring and Beyond
Watch, Think and Attend: End-to-End Video Classification via Dynamic Knowledge Evolution Modeling
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
Fast and Light Manifold CNN based 3D Facial Expression Recognition across Pose Variations
Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation
Real-time 3D Face-Eye Performance Capture of a Person Wearing VR Headset
Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling
Participation-Contributed Temporal Dynamic Model for Group Activity Recognition
A Unified Generative Adversarial Framework for Image Generation and Person Re-identification
Facial Expression Recognition in the Wild: A Cycle-Consistent Adversarial Attention Transfer Approach
Inferring User Emotive State Changes in Realistic Human-Computer Conversational Dialogs
Mining Semantics-Preserving Attention for Group Activity Recognition
Causally Regularized Learning on Data with Agnostic Bias
I read, I saw, I tell: Texts Assisted Fine-Grained Visual Classification
Context-Aware Unsupervised Text Stylization
Bridge The Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Database and A Deep Learning Model
When to Learn What: Deep Cognitive Subspace Clustering
Look Deeper See Richer: Depth-aware Image Paragraph Captioning
Depth Structure Preserving Scene Image Generation
CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification
Learning Multimodal Taxonomy via Variational Deep Graph Embedding and Clustering
Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training
GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning
A Distributed Approach for Bitrate Selection in HTTP Adaptive Streaming
Generative Adversarial Product Quantisation
EmotionGAN: Unsupervised Domain Adaptation for Learning Discrete Probability Distributions of Image Emotions
Few-Shot Adaptation for Video Semantic Indexing
Historical Context-based Style Classification of Painting Images via Label Distribution Learning
Sparsely Grouped Multi-task Generative Adversarial Networks for Facial Attribute Manipulation
High-Quality Exposure Correction of Underexposed Photos
Fashion Sensitive Clothing Recommendation using Hierarchical Collocation Model
A Margin-based MLE for Crowdsourced Partial Ranking
Personalized Serious Games for Cognitive Intervention with Lifelog Visual Analytics
PHD-GIFs: Personalized Highlight Detection for Automatic GIF Creation
iHuman3D: Intelligent Human Body 3D Reconstruction using a Single Flying Camera
Face-Voice Matching using Cross-modal Embeddings
Multi-Scale Context Attention Network for Image Retrieval
When Deep Fool Meets Deep Prior: Adversarial Attack on Image Super-Resolution
Musicality-Novelty Generative Adversarial Nets for Algorithmic Composition
Knowledge-aware Multimodal Dialogue Systems
Cross-Domain Adversarial Feature Learning for Sketch Re-identification
Comprehensive Distance-Preserving Autoencoders for Cross-Modal Retrieval
Facial Expression Recognition Enhanced by Thermal Images through Adversarial Learning
CSAN: Contextual Self-Attention Network for User Sequential Recommendation
Semantic Human Matting
Visual Spatial Attention Network for Relationship Detection
Geometry Guided Adversarial Facial Expression Synthesis
Personalized multiple facial action unit recognition through generative adversarial recognition network
Learning Joint Multimodal Representation with Adversarial Attention Networks
Detecting Abnormality without Knowing Normality: A Two-stage Approach for Unsupervised Video Abnormal Event Detection
WildFish: A Large Benchmark for Fish Recognition in the Wild
Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction
BeautyGAN: Instance-level Facial Makeup Transfer with Deep Generative Adversarial Network
Songle Sync: A Large-Scale Web-based Platform for Controlling Various Devices in Synchronization with Music
CloudVR: Cloud Accelerated Interactive Mobile Virtual Reality
RGCNN: Regularized Graph CNN for Point Cloud Segmentation
Video-based Person Re-identification via Self-Paced Learning and Deep Reinforcement Learning Framework
Photo Squarization by Deep Multi-Operator Retargeting
Predicting Visual Context for Unsupervised Event Segmentation in Continuous Photo-streams
Semantic Image Inpainting with Progressive Generative Networks
Attentive Interactive Convolutional Matching for Community Question Answering in Social Multimedia
Deep Understanding of Cooking Procedure for Cross-modal Recipe Retrieval
LA-Net: Layout-Aware Dense Network for Monocular Depth Estimation
Direction-aware Neural Style Transfer
Reconfigurable Inverted Index
Learning and Fusing Multimodal Deep Features for Acoustic Scene Categorization
Context-Aware Visual Policy Network for Sequence-Level Image Captioning
A Unified Framework for Multimodal Domain Adaptation
Trusted Guidance Pyramid Network for Human Parsing
USAR: an interactive user-specific aesthetic ranking framework for images
Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining
Structure Guided Photorealistic Style Transfer
Tracking-assisted Weakly Supervised Online Visual Object Segmentation in Unconstrained Videos
An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks
Decoupled Novel Object Captioner
ThoughtViz: Visualizing Human Thoughts Using Generative Adversarial Network
Optimizing Personalized Interaction Experience in Crowd-Interactive Livecast: A Cloud-Edge Approach
End-to-End Blind Quality Assessment of Compressed Video Using Deep Neural Networks
Dynamic Sound Field Synthesis for Speech and Music Optimization
Local Convolutional Neural Networks for Person Re-Identification
Interpretable Multimodal Retrieval for Fashion Products
Conditional Expression Synthesis with Face Parsing Transformation
A Feature-Adaptive Semi-Supervised Framework for Co-Saliency Detection
Attentive Recurrent Neural Network for Weak-supervised Multi-label Image Classification
iSPA-Net: Iterative Semantic Pose Alignment Network
Extractive Video Summarizer with Memory Augmented Neural Networks
ModaNet: A Large-Scale Street Fashion Dataset with Polygon Annotations
Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images
From data to knowledge: deep learning model compression, transmission and communication
ChipGAN: A Generative Adversarial Network for Chinese Ink Wash Painting Style Transfer
Dest-ResNet: a Deep Spatiotemporal Residual Network for Hotspot Traffic Speed Prediction
Boosting Scene Parsing Performance via Reliable Scale Prediction
Deep Cross modal learning for Caricature Verification and Identification (CaVINet)
Online Action Tube Detection via Resolving the Spatio-temporal Context Pattern
Adaptive Temporal Encoding Network for Video Instance-level Human Parsing
User-Guided Deep Anime Line Art Colorization with Conditional Adversarial Networks
Enhancing Visual Question Answering Using Dropout
Online Inter-Camera Trajectory Association Exploiting Person Re-Identification and Camera Topology
Improving QoE of ABR Streaming Sessions through QUIC Retransmissions
Temporal Cross-Media SubSpaces Learning with Soft-Constraints
Learning Local Descriptors with Adversarial Enhancer from Volumetric Geometry Patches
SibNet: Sibling Convolutional Encoder for Video Captioning
Context-Dependent Diffusion Network for Visual Relationship Detection
Your Attention is Unique: Detecting 360-Degree Video Saliency in Head-Mounted Display for Head Movement Prediction
Generating Defensive Plays in Basketball Games
Connectionist temporal fusion for Sign Language Translation
JPEG Decompression in the Homomorphic Encryption Domain
BitStream: Efficient Computing Architecture for Real-Time Low-Power Inference of Binary Neural Networks on CPUs
Support Neighbor Loss for Person Re-Identification
A Large Scale RGB-D Database for Arbitrary-view Human Action Recognition
FlexStream: Towards Flexible Adaptive Video Streaming on End Devices using Extreme SDN
Spotting and Aggregating Salient Regions for Video Captioning
Structural inpainting
Partial Multi-View Subspace Clustering
FoV-Aware Edge Caching for Adaptive 360° Video Streaming
Attentive LSTM Crowd Flow Machines
Perceptual Temporal Incoherence Aware Stereo Video Retargeting
Fast Discrete Cross-modal Hashing With Regressing From Semantic Labels
Dense Auto-Encoder Hashing for Robust Cross-Modality Retrieval
Investigation of Small Group Social Interactions using Deep Visual Activity-Based Nonverbal Features
Dissimilarity Representation Learning for Generalized Zero-Shot Recognition
Examine before You Answer: Multi-task Learning with Adaptive-attentions for Multiple-choice VQA
Cumulative Nets for Edge Detection
Beyond the Product: Discovering Image Posts for Brands in Social Media
Robustness and Discrimination Oriented Hashing Combining Texture and Invariant Vector Distance
SLIONS: A Karaoke Application to Enhance Foreign Language Learning
Drawing in a Virtual 3D Space – Introducing VR Drawing in Elementary School Art Education
Semi-Supervised DFF: Decoupling Detection and Feature Flow for Video Object Detectors
Residual-Guide Feature Fusion Network for Single Image Deraining
Paragraph generation network with visual relationship detection
Hybrid Point Cloud Attribute Compression Using Slice-based Layered Structure and Block-based Intra Prediction
CIRCE: Real-Time Caching for Instance Recognition on Cloud Environments and Multi-Core Architectures
From Volcano to Toyshop: Adaptive Discriminative Region Discovery for Scene Recognition
Unsupervised Learning of 3D Model Reconstruction from Hand-Drawn Sketches
Learning to Synthesize 3D Indoor Scenes from Monocular Images
DASH for 3D Networked Virtual Environment
PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
The Effect of Foveation on High Dynamic Range Video Perception
GestureGAN for Hand Gesture-to-Gesture Translation in the Wild
MiniView Layout for Bandwidth-Efficient 360-Degree Video
An Efficient Deep Quantized Compressed Sensing Coding Framework of Natural Images
Deep Multimodal Image-Repurposing Detection
Video-to-Video Translation with Global Temporal Consistency
Robust Correlation Filter Tracking with Shepherded Instance-Aware Proposals
Cross-Species Learning: A Low-Cost Approach to Learning Human Fight from Animal Fight
PoB: Toward Reasoning Patterns of Beauty in Image Data
Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval
Deep Adaptive Temporal Pooling for Activity Recognition
Human Conversation Analysis Using Attentive Multimodal Networks with Hierarchical Encoder-Decoder
Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning
Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation
Person Re-identification with Hierarchical Deep Learning Feature and efficient XQDA Metric
EmoCeleb: Emotion recognition in speech using Cross-Modal Transfer in the wild
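A quick search of the list for the keyword 'generative' turns up 14 papers, which suggests that GANs, VAEs, and other generative models are still a major direction.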
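The keyword search mentioned here can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `find_titles` is hypothetical, and `sample_titles` holds just a handful of titles copied from the list above, so the full list has to be pasted in to check the overall count.

```python
def find_titles(titles, keyword):
    """Return the titles that contain `keyword`, ignoring case."""
    needle = keyword.lower()
    return [title for title in titles if needle in title.lower()]


# A small sample copied from the accepted-paper list (not the full list).
sample_titles = [
    "Deep Priority Hashing",
    "Style Separation and Synthesis via Generative Adversarial Networks",
    "Generative Adversarial Product Quantisation",
    "Personalized multiple facial action unit recognition through generative adversarial recognition network",
    "Semantic Human Matting",
]

matches = find_titles(sample_titles, "generative")
print(len(matches))
for title in matches:
    print(title)
```

The match is case-insensitive on purpose, since at least one title in the list writes "generative" in lowercase.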