arXiv

From the wire

Original publisher articles citing arXiv. Broadside has not yet written editorial coverage for these.

PASC: Pipeline-Aware Conformal Prediction with Joint Coverage Guarantees for Multi-Stage NLP and LLM Pipelines
via arxiv.org · 4 days ago
Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models
via arxiv.org · about 14 hours ago
Benchmarking and Improving Monitors for Out-Of-Distribution Alignment Failure in LLMs
via arxiv.org · about 14 hours ago
The Scaling Laws of Skills in LLM Agent Systems
via arxiv.org · 5 days ago
From Flat Language Labels to Typological Priors: Structured Language Conditioning for Multilingual Speech-to-Speech Translation
via arxiv.org · 6 days ago
DeepSlide: From Artifacts to Presentation Delivery
via arxiv.org · 6 days ago
Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models
via arxiv.org · 5 days ago
LISTEN to Your Preferences: An LLM Framework for Multi-Objective Selection
via arxiv.org · 5 days ago
Learning Bilevel Policies over Symbolic World Models for Long-Horizon Planning
via arxiv.org · 6 days ago
Vector Policy Optimization: Training for Diversity Improves Test-Time Search
via arxiv.org · about 14 hours ago
DrugSAGE: Self-evolving Agent Experience for Efficient State-of-the-Art Drug Discovery
via arxiv.org · 6 days ago
SWE-Mutation: Can LLMs Generate Reliable Test Suites in Software Engineering?
via arxiv.org · about 14 hours ago
Teaching Language Models to Forecast Research Success Through Comparative Idea Evaluation
via arxiv.org · about 14 hours ago
SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents
via arxiv.org · 3 days ago
The Annotation Scarcity Paradox in Low-Resource NLP Evaluation: A Decade of Acceleration and Emerging Constraints
via arxiv.org · 4 days ago
Polar probe linearly decodes semantic structures from LLMs
via arxiv.org · 5 days ago
Responsible Federated LLMs via Safety Filtering and Constitutional AI
via arxiv.org · 5 days ago
Readers make targeted regressions to plausible errors in reanalysis of "noisy-channel garden-path" sentences
via arxiv.org · 5 days ago
MeMo: Memory as a Model
via arxiv.org · 8 days ago
Herculean: An Agentic Benchmark for Financial Intelligence
via arxiv.org · 8 days ago