HowToCaption: Prompting LLMs to Transform Video Annotations at Scale. | ECCV | 2024 | 0 |
Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs. | ECCV | 2024 | 0 |
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. | CVPR | 2024 | 0 |
Grounding Everything: Emerging Localization Properties in Vision-Language Transformers. | CVPR | 2024 | 0 |
Uncertainty Quantification via Stable Distribution Propagation. | ICLR | 2024 | 0 |
Video Test-Time Adaptation for Action Recognition. | CVPR | 2023 | 0 |
Learning Situation Hyper-Graphs for Video Question Answering. | CVPR | 2023 | 0 |
Learning Human Action Recognition Representations Without Real Humans. | NIPS/NeurIPS | 2023 | 0 |
ISAAC Newton: Input-based Approximate Curvature for Newton's Method. | ICLR | 2023 | 0 |
Contrastive Audio-Visual Masked Autoencoder. | ICLR | 2023 | 0 |
Temperature Schedules for self-supervised contrastive methods on long-tail data. | ICLR | 2023 | 0 |
Learning by Sorting: Self-supervised Learning with Group Ordering Constraints. | ICCV | 2023 | 0 |
Preserving Modality Structure Improves Multi-Modal Learning. | ICCV | 2023 | 0 |
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. | ICCV | 2023 | 0 |
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval. | ICCV | 2023 | 0 |
Differentiable Top-k Classification Learning. | ICML | 2022 | 4 |
How Transferable are Video Representations Based on Synthetic Data? | NIPS/NeurIPS | 2022 | 4 |
Weakly Supervised Grounding for VQA in Vision-Language Transformers. | ECCV | 2022 | 0 |
Monotonic Differentiable Sorting Networks. | ICLR | 2022 | 5 |
CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video. | ECCV | 2022 | 0 |
Unsupervised Domain Generalization by Learning a Bridge Across Domains. | CVPR | 2022 | 0 |
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. | CVPR | 2022 | 0 |
Deep Differentiable Logic Gate Networks. | NIPS/NeurIPS | 2022 | 0 |
Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules. | CVPR | 2021 | 13 |
Generalized and Incremental Few-Shot Learning by Explicit Learning and Calibration without Forgetting. | ICCV | 2021 | 11 |
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. | ICCV | 2021 | 34 |
Learning with Algorithmic Supervision via Continuous Relaxations. | NIPS/NeurIPS | 2021 | 8 |
Detector-Free Weakly Supervised Grounding by Separation. | ICCV | 2021 | 6 |
Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision. | ICML | 2021 | 8 |
A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation. | TPAMI | 2020 | 0 |
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation. | NIPS/NeurIPS | 2019 | 93 |
Unsupervised Learning of Action Classes With Continuous Temporal Embedding. | CVPR | 2019 | 59 |
NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning. | CVPR | 2018 | 100 |
Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints. | CVPR | 2018 | 0 |
Weakly Supervised Action Learning with RNN Based Fine-to-Coarse Modeling. | CVPR | 2017 | 160 |
The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities. | CVPR | 2014 | 373 |
HMDB: A large video database for human motion recognition. | ICCV | 2011 | 2909 |
Combined intention, activity, and motion recognition for a humanoid household robot. | IROS | 2011 | 38 |
HMM-based human motion recognition with optical flow data. | Humanoids | 2009 | 38 |