VideoGEM: Training-free Action Grounding in Videos. | CVPR | 2025 | 0 |
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks. | CVPR | 2025 | 1 |
Canonical Rank Adaptation: An Efficient Fine-Tuning Strategy for Vision Transformers. | ICML | 2025 | 2 |
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment. | CVPR | 2025 | 3 |
Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation. | ICLR | 2025 | 0 |
Convolutional Differentiable Logic Gate Networks. | NIPS/NeurIPS | 2024 | 35 |
Uncertainty Quantification via Stable Distribution Propagation. | ICLR | 2024 | 10 |
Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs. | ECCV | 2024 | 33 |
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs. | NIPS/NeurIPS | 2024 | 19 |
Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms. | NIPS/NeurIPS | 2024 | 2 |
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale. | ECCV | 2024 | 0 |
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. | CVPR | 2024 | 0 |
Grounding Everything: Emerging Localization Properties in Vision-Language Transformers. | CVPR | 2024 | 0 |
ISAAC Newton: Input-based Approximate Curvature for Newton's Method. | ICLR | 2023 | 5 |
Learning by Sorting: Self-supervised Learning with Group Ordering Constraints. | ICCV | 2023 | 15 |
Temperature Schedules for self-supervised contrastive methods on long-tail data. | ICLR | 2023 | 63 |
Learning Human Action Recognition Representations Without Real Humans. | NIPS/NeurIPS | 2023 | 8 |
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. | ICCV | 2023 | 49 |
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval. | ICCV | 2023 | 6 |
Preserving Modality Structure Improves Multi-Modal Learning. | ICCV | 2023 | 13 |
Learning Situation Hyper-Graphs for Video Question Answering. | CVPR | 2023 | 23 |
Video Test-Time Adaptation for Action Recognition. | CVPR | 2023 | 0 |
Contrastive Audio-Visual Masked Autoencoder. | ICLR | 2023 | 0 |
How Transferable are Video Representations Based on Synthetic Data? | NIPS/NeurIPS | 2022 | 42 |
Weakly Supervised Grounding for VQA in Vision-Language Transformers. | ECCV | 2022 | 5 |
Deep Differentiable Logic Gate Networks. | NIPS/NeurIPS | 2022 | 70 |
Differentiable Top-k Classification Learning. | ICML | 2022 | 43 |
CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video. | ECCV | 2022 | 9 |
Monotonic Differentiable Sorting Networks. | ICLR | 2022 | 29 |
Unsupervised Domain Generalization by Learning a Bridge Across Domains. | CVPR | 2022 | 0 |
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. | CVPR | 2022 | 0 |
Detector-Free Weakly Supervised Grounding by Separation. | ICCV | 2021 | 31 |
Generalized and Incremental Few-Shot Learning by Explicit Learning and Calibration without Forgetting. | ICCV | 2021 | 70 |
Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision. | ICML | 2021 | 1 |
Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules. | CVPR | 2021 | 41 |
Learning with Algorithmic Supervision via Continuous Relaxations. | NIPS/NeurIPS | 2021 | 33 |
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. | ICCV | 2021 | 97 |
A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation. | TPAMI | 2020 | 0 |
Unsupervised Learning of Action Classes With Continuous Temporal Embedding. | CVPR | 2019 | 127 |
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation. | NIPS/NeurIPS | 2019 | 133 |
NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning. | CVPR | 2018 | 151 |
Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints. | CVPR | 2018 | 0 |
Weakly Supervised Action Learning with RNN Based Fine-to-Coarse Modeling. | CVPR | 2017 | 213 |
The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities. | CVPR | 2014 | 671 |
HMDB: A large video database for human motion recognition. | ICCV | 2011 | 4199 |
Combined intention, activity, and motion recognition for a humanoid household robot. | IROS | 2011 | 39 |
HMM-based human motion recognition with optical flow data. | Humanoids | 2009 | 38 |