Barbara Plank

71 publications

10 venues

H Index 28

Affiliation

LMU Munich, Germany
IT University of Copenhagen, Denmark

Name	Venue	Year	Citations
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It.	EMNLP	2025	7
Reason to Rote: Rethinking Memorization in Reasoning.	EMNLP	2025	3
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns.	ACL	2025	4
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges.	ACL	2025	20
Disentangling Subjectivity and Uncertainty for Hate Speech Annotation and Modeling using Gaze.	EMNLP	2025	1
Cross-Dialect Information Retrieval: Information Access in Low-Resource and High-Variance Languages.	COLING	2025	0
LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference.	EMNLP	2025	2
Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation.	EMNLP	2025	3
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis.	EMNLP	2025	7
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set.	ACL	2025	6
Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies Between Model Predictions and Human Responses in VQA.	AAAI	2025	0
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models.	ACL	2025	0
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks.	ACL	2025	0
Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study.	ACL	2025	0
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation.	ICLR	2025	0
Evaluating Pixel Language Models on Non-Standardized Languages.	COLING	2025	0
RAcQUEt: Unveiling the Dangers of Overlooked Referential Ambiguity in Visual LLMs.	EMNLP	2025	0
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification.	ACL	2024	7
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models.	EMNLP	2024	11
VariErr NLI: Separating Annotation Error from Human Label Variation.	ACL	2024	43
Position: Insights from Survey Methodology can Improve Training Data.	ICML	2024	11
Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?	EACL	2024	17
NNOSE: Nearest Neighbor Occupational Skill Extraction.	EACL	2024	11
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning.	ACL	2024	16
Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties.	EACL	2024	6
ACTOR: Active Learning with Annotator-specific Classification Heads to Embrace Human Label Variation.	EMNLP	2023	15
How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives.	ACL	2023	11
What Comes Next? Evaluating Uncertainty in Neural Text Generators Against Human Production Variability.	EMNLP	2023	46
From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification.	EMNLP	2023	15
ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain.	ACL	2023	28
Establishing Trustworthiness: Rethinking Tasks and Model Evaluation.	EMNLP	2023	3
On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers.	CoNLL	2022	5
Probing for Labeled Dependency Trees.	ACL	2022	10
Spectral Probing.	EMNLP	2022	3
Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning.	JAIR	2022	61
Stop Measuring Calibration When Humans Disagree.	EMNLP	2022	67
Evidence > Intuition: Transferability Estimation for Encoder Selection.	EMNLP	2022	1
The "Problem" of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation.	EMNLP	2022	0
Genre as Weak Supervision for Cross-lingual Dependency Parsing.	EMNLP	2021	20
Learning from Disagreement: A Survey.	JAIR	2021	254
Biomedical Event Extraction as Sequence Labeling.	EMNLP	2020	74
DaN+: Danish Nested Named Entities and Lexical Normalization.	COLING	2020	40
Neural Unsupervised Domain Adaptation in NLP - A Survey.	COLING	2020	0
Psycholinguistics Meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering.	ACL	2019	55
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction.	ACL	2018	63
Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging.	EMNLP	2018	53
Strong Baselines for Neural Semi-Supervised Learning under Domain Shift.	ACL	2018	177
Parsing Universal Dependencies without training.	EACL	2017	19
Learning to select data for transfer learning with Bayesian Optimization.	EMNLP	2017	201
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures (Extended Abstract).	IJCAI	2017	25
Cross-lingual tagger evaluation without test data.	EACL	2017	6
When is multitask learning effective? Semantic sequence prediction under varying data conditions.	EACL	2017	0
Semantic Tagging with Deep Residual Networks.	COLING	2016	80
Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss.	ACL	2016	419
Multi-view and multi-task training of RST discourse parsers.	COLING	2016	56
Keystroke dynamics as signal for shallow syntactic parsing.	COLING	2016	44
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures.	JAIR	2016	381
Do dependency parsing metrics correlate with human judgments?	CoNLL	2015	18
Inverted indexing for cross-lingual NLP.	ACL	2015	92
Semantic Representations for Domain Adaptation: A Case Study on the Tree Kernel-based Method for Relation Extraction.	ACL	2015	42
Using Frame Semantics for Knowledge Extraction from Twitter.	AAAI	2015	34
Linguistically debatable or just plain wrong?	ACL	2014	133
Adapting taggers to Twitter with not-so-distant supervision.	COLING	2014	39
Learning part-of-speech taggers with inter-annotator agreement loss.	EACL	2014	126
Importance weighting and unsupervised domain adaptation of POS taggers: a negative result.	EMNLP	2014	22
Experiments with crowdsourced re-annotation of a POS tagging data set.	ACL	2014	48
Opinion Mining on YouTube.	ACL	2014	48
What's in a p-value in NLP?	CoNLL	2014	0
Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction.	ACL	2013	126
Reversible Stochastic Attribute-Value Grammars.	ACL	2011	51
Effective Measures of Domain Similarity for Parsing.	ACL	2011	109