Marcus Rohrbach

61 publications

11 venues

H Index 29

Affiliation

TU Darmstadt, Germany

Name	Venue	Year	Citations
Predicting Implicit Arguments in Procedural Video Instructions.	ACL	2025	0
DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts.	ICML	2025	0
V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts.	CVPR	2025	0
Efficient Pre-training for Localized Instruction Generation of Procedural Videos.	ECCV	2024	1
Improving Selective Visual Question Answering by Learning from Your Peers.	CVPR	2023	28
Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition.	ECCV	2022	48
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly.	ECCV	2022	81
Learning To Recognize Procedural Activities with Distant Supervision.	CVPR	2022	100
CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition.	ECCV	2022	0
FLAVA: A Foundational Language And Vision Alignment Model.	CVPR	2022	0
SMART Frame Selection for Action Recognition.	AAAI	2021	0
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA.	CVPR	2021	0
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting.	ICLR	2021	0
In Defense of Grid Features for Visual Question Answering.	CVPR	2020	360
TextCaps: A Dataset for Image Captioning with Reading Comprehension.	ECCV	2020	530
Adversarial Continual Learning.	ECCV	2020	227
12-in-1: Multi-Task Vision and Language Representation Learning.	CVPR	2020	35
Learning to Generate Grounded Visual Captions Without Localization Supervision.	ECCV	2020	0
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA.	CVPR	2020	0
Decoupling Representation and Classifier for Long-Tailed Recognition.	ICLR	2020	0
Uncertainty-guided Continual Learning with Bayesian Neural Networks.	ICLR	2020	0
Towards VQA Models That Can Read.	CVPR	2019	1856
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition.	CVPR	2019	133
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution.	ICCV	2019	650
Cycle-Consistency for Robust Visual Question Answering.	CVPR	2019	206
Large-Scale Visual Relationship Understanding.	AAAI	2019	0
Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering.	ICML	2019	0
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication.	ACL	2019	0
Grounded Video Description.	CVPR	2019	0
Graph-Based Global Reasoning Networks.	CVPR	2019	0
Adversarial Inference for Multi-Sentence Video Description.	CVPR	2019	0
A Dataset for Telling the Stories of Social Media Videos.	EMNLP	2018	63
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence.	CVPR	2018	461
Exploring the Challenges Towards Lifelong Fact Learning.	ACCV	2018	12
Memory Aware Synapses: Learning What (not) to Forget.	ECCV	2018	11
Visual Coreference Resolution in Visual Dialog Using Neural Module Networks.	ECCV	2018	172
Learning to Reason: End-to-End Module Networks for Visual Question Answering.	ICCV	2017	599
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training.	ICCV	2017	15
Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.	TPAMI	2017	312
Generating Descriptions with Grounded and Co-referenced People.	CVPR	2017	72
Modeling Relationships in Referential Expressions with Compositional Modular Networks.	CVPR	2017	0
Captioning Images with Diverse Objects.	CVPR	2017	0
Generating Visual Explanations.	ECCV	2016	649
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data.	CVPR	2016	30
Segmentation from Natural Language Expressions.	ECCV	2016	520
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding.	EMNLP	2016	1556
Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags.	AAAI	2016	29
Grounding of Textual Phrases in Images by Reconstruction.	ECCV	2016	0
Neural Module Networks.	CVPR	2016	0
Natural Language Object Retrieval.	CVPR	2016	0
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images.	ICCV	2015	634
Spatial Semantic Regularisation for Large Scale Object Detection.	ICCV	2015	25
A dataset for Movie Description.	CVPR	2015	547
Long-term recurrent convolutional networks for visual recognition and description.	CVPR	2015	0
Sequence to Sequence - Video to Text.	ICCV	2015	0
Transfer Learning in a Transductive Setting.	NIPS/NeurIPS	2013	253
Translating Video Content to Natural Language Descriptions.	ICCV	2013	377
A database for fine grained activity detection of cooking activities.	CVPR	2012	626
Script Data for Attribute-Based Recognition of Composite Activities.	ECCV	2012	200
Evaluating knowledge transfer and zero-shot learning in a large-scale setting.	CVPR	2011	374
What helps where - and why? Semantic relatedness for knowledge transfer.	CVPR	2010	0