DUCTILE: Agentic LLM Orchestration of Engineering Analysis in Product Development Practice

This paper introduces DUCTILE, an agentic LLM orchestration framework that separates adaptive decision-making from deterministic tool execution to automate engineering analysis in product development, successfully handling input deviations in an aerospace case study while highlighting the emerging tension between task automation and the creation of exhausting supervisory roles.
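
The core split the summary describes, adaptive decision-making on top of deterministic tool execution, can be sketched as a tiny orchestration loop. This is an illustrative sketch only, not DUCTILE's actual code; the tool names and state fields are invented for the example, and the `decide_next_tool` function stands in for the LLM's adaptive choice.

```python
from typing import Callable, Dict, Optional

# Deterministic tools: pure functions over the analysis state.
# "mesh" and "solve" are hypothetical stand-ins for CAE steps.
TOOLS: Dict[str, Callable[[dict], dict]] = {
    "mesh": lambda state: {**state, "meshed": True},
    "solve": lambda state: {**state, "stress": 42.0},
}

def decide_next_tool(state: dict) -> Optional[str]:
    # Stand-in for the adaptive (LLM) decision layer:
    # pick the next unmet step, or None when done.
    if not state.get("meshed"):
        return "mesh"
    if "stress" not in state:
        return "solve"
    return None

def run_analysis(state: dict) -> dict:
    # Orchestration loop: adaptive choice, deterministic execution.
    while (tool := decide_next_tool(state)) is not None:
        state = TOOLS[tool](state)
    return state
```

Keeping the tools deterministic means deviations are absorbed only in the decision layer, which is what makes the execution side auditable.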

Alejandro Pradas-Gomez, Arindam Brahma, Ola Isaksson · Thu, 12 Ma · cs.AI

Building Privacy-and-Security-Focused Federated Learning Infrastructure for Global Multi-Centre Healthcare Research

This paper introduces FLA³, a governance-aware federated learning platform that integrates authentication, authorization, and accounting mechanisms to enable secure, privacy-preserving, and regulatory-compliant multi-center healthcare research, demonstrating its operational feasibility and clinical utility across international institutions while achieving predictive performance comparable to centralized training.
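
The "comparable to centralized training" claim rests on standard federated aggregation: each centre trains locally and shares only model parameters. A minimal federated-averaging (FedAvg) sketch, which is the common baseline and not necessarily FLA³'s exact aggregation rule, looks like this:

```python
# Minimal FedAvg sketch: weighted average of per-centre parameter
# vectors, weighted by local dataset size. Patient data never leaves
# the centre; only the weight vectors are communicated.
def fed_avg(client_weights, client_sizes):
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_w = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            global_w[i] += (n / total) * w[i]
    return global_w
```

The governance layer the paper adds (authentication, authorization, accounting) wraps around exactly this exchange, deciding which centres may contribute updates and logging who did.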

Fan Zhang, Daniel Kreuter, Javier Fernandez-Marques, BloodCounts Consortium, Gregory Verghese, Bernard Butler, Nicholas Lane, Suthesh Sivapalaratnam, Joseph Taylor, Norbert C. J. de Wit, Nicholas S. Gleadall, Carola-Bibiane Schönlieb, Michael Roberts · Thu, 12 Ma · cs

SBOMs into Agentic AIBOMs: Schema Extensions, Agentic Orchestration, and Reproducibility Evaluation

This paper introduces Agentic AIBOMs, a multi-agent framework that extends static Software Bills of Materials (SBOMs) with autonomous, policy-constrained reasoning to dynamically capture runtime behavior and environmental drift, thereby enhancing supply-chain security through reproducible, context-aware vulnerability assessment and minimal schema extensions to existing standards.
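
The "minimal schema extensions" idea can be illustrated by decorating a conventional SBOM component record with a runtime block. The field names below are purely illustrative assumptions, not the paper's actual schema:

```python
# A static SBOM component, as a plain record.
sbom_component = {
    "name": "requests",
    "version": "2.32.0",
}

# Hypothetical agentic extension: a monitoring agent appends what it
# observed at runtime, so drift becomes part of the bill of materials.
aibom_component = {
    **sbom_component,
    "agentic": {                      # illustrative extension block
        "observed_at_runtime": True,
        "environment_drift": ["openssl 3.0 -> 3.2"],
    },
}
```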

Petar Radanliev, Carsten Maple, Omar Santos, Kayvan Atefi · Thu, 12 Ma · cs.AI

Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction

This paper presents and evaluates five prompt engineering strategies for reducing LLM hallucinations in industrial settings without modifying model weights, finding that an Enhanced Data Registry (M4) achieved perfect consistency in initial trials while a revised Decomposed Model-Agnostic Prompting (M2) showed the most significant improvement in subsequent verification.
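
A data-registry strategy in this family grounds the model in a verified fact store and instructs it to refuse beyond it. The sketch below is a generic illustration of that pattern, with an invented registry key; it is not the paper's M4 implementation:

```python
# Verified fact store; the key/value here is a made-up example.
REGISTRY = {"pump_a_max_rpm": "3600"}

def build_prompt(question: str) -> str:
    # Ground the model in registry facts and constrain it to refuse
    # rather than hallucinate when the facts are insufficient.
    facts = "\n".join(f"- {k}: {v}" for k, v in REGISTRY.items())
    return (
        "Answer using ONLY the facts below; "
        "say 'unknown' if they are insufficient.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}"
    )
```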

Brian Freeman, Adam Kicklighter, Matt Erdman, Zach Gordon · Thu, 12 Ma · cs.AI

One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

This paper presents the first comprehensive evaluation of parameter-efficient fine-tuning (PEFT) for multitask code analysis, demonstrating that a single shared PEFT module can match or surpass full fine-tuning performance while significantly reducing computational and storage costs, provided that tasks are strategically grouped based on factors like complementarity and stability.
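
The cost argument behind PEFT is easy to see with a back-of-envelope count: a LoRA-style adapter reparameterizes a weight update as `B @ A` with small rank `r`, so it trains `d*r + r*k` parameters instead of `d*k`. This is a generic LoRA arithmetic sketch, not the paper's specific configuration:

```python
def lora_param_counts(d: int, k: int, r: int):
    """Trainable-parameter count: full fine-tuning vs a rank-r adapter."""
    full = d * k            # update the whole d x k weight matrix
    adapter = d * r + r * k # low-rank factors B (d x r) and A (r x k)
    return full, adapter

# For a 4096 x 4096 layer at rank 8:
# 16,777,216 full parameters vs 65,536 adapter parameters (256x fewer).
full, adapter = lora_param_counts(4096, 4096, 8)
```

This is also why a single shared module across grouped tasks is attractive: the storage cost per task collapses from a full model copy to one small adapter.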

Amal Akli, Maxime Cordy, Mike Papadakis, Yves Le Traon · Thu, 12 Ma · cs

UniCoR: Modality Collaboration for Robust Cross-Language Hybrid Code Retrieval

UniCoR is a novel self-supervised framework that addresses the challenges of insufficient semantic understanding, inefficient modality fusion, and weak cross-language generalization in hybrid code retrieval by employing multi-perspective supervised contrastive learning and representation distribution consistency, thereby achieving state-of-the-art performance on both empirical and large-scale benchmarks.
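
The contrastive-learning core can be illustrated with a toy InfoNCE-style loss: an anchor embedding (say, a query) is pulled toward its matching code embedding and pushed away from mismatched ones. This is an illustration of the general technique, not UniCoR's actual multi-perspective loss:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(anchor, positive, negatives, tau=0.1):
    # InfoNCE: -log( exp(sim+/tau) / (exp(sim+/tau) + sum exp(sim-/tau)) )
    pos = math.exp(dot(anchor, positive) / tau)
    negs = sum(math.exp(dot(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + negs))
```

A well-aligned pair yields a near-zero loss, while a mismatched pair is penalized heavily, which is the pressure that fuses the text and code modalities into one retrieval space.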

Yang Yang, Li Kuang, Jiakun Liu, Zhongxin Liu, Yingjie Xia, David Lo · Mon, 09 Ma · cs

ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel

This paper proposes a structured methodology that integrates the Robot Operating System (ROS) with Model-Based Systems Engineering (MBSE) through a specialized SysML metamodel called MeROS and an adapted V-model, aiming to enhance the semantic coherence, structural traceability, and reliable coordination of complex heterogeneous robotic systems.

Tomasz Winiarski, Jan Kaniuka, Daniel Giełdowski, Jakub Ostrysz, Krystian Radlak, Dmytro Kushnir · Mon, 09 Ma · cs

Story Point Estimation Using Large Language Models

This study demonstrates that large language models can effectively predict software story points without training data or with only a few examples, outperforming traditional supervised deep learning models, while also finding that comparative judgments, though not inherently easier to predict, can serve as effective few-shot examples to further enhance estimation accuracy.
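
Few-shot story point estimation amounts to showing the model previously estimated issues before the new one. The template below is an assumption for illustration, not the paper's exact prompt format:

```python
def story_point_prompt(examples, new_issue):
    """Build a few-shot prompt from (issue_text, points) pairs.

    The trailing 'Story points:' leaves the estimate for the model
    to complete.
    """
    shots = "\n".join(
        f"Issue: {text}\nStory points: {points}"
        for text, points in examples
    )
    return f"{shots}\nIssue: {new_issue}\nStory points:"
```

The finding that comparative judgments make good few-shot examples would correspond here to choosing `examples` that bracket the new issue's difficulty rather than picking them at random.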

Pranam Prakash Shetty, Adarsh Balakrishnan, Mengqiao Xu, Xiaoyin Xi, Zhe Yu · Mon, 09 Ma · cs