Fish Audio S2 Technical Report

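Fish Audio S2 is an open-source, advanced text-to-speech system that supports multiple speakers, multi-turn generation, and control through natural-language instructions. Using multi-stage training and a staged data pipeline, it achieves production-grade streaming inference (RTF 0.195, time to first audio below 100 ms). The release includes the model weights, fine-tuning code, and an SGLang-based inference engine.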

Shijia Liao, Yuxuan Wang, Songting Liu, Yifan Cheng, Ruoyi Zhang, Tianyu Li, Shidong Li, Yisheng Zheng, Xingwei Liu, Qingzheng Wang, Zhizhuo Zhou, Jiahua Liu, Xin Chen, Dawei Han | Wed, 11 Ma | 🤖 cs.AI

From Word2Vec to Transformers: Text-Derived Composition Embeddings for Filtering Combinatorial Electrocatalysts

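This study proposes a text-driven screening strategy that requires no electrochemical labels. It compares Word2Vec-based and Transformer-based composition embeddings and uses "conductivity" and "dielectric" concept directions to filter complex combinatorial electrocatalyst candidates across 15 materials libraries. The lightweight Word2Vec baseline reduces the number of candidates while maintaining strong screening performance.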

Lei Zhang, Markus Stricker | Wed, 11 Ma | 🔬 cond-mat.mtrl-sci

ConFu: Contemplate the Future for Better Speculative Sampling

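This paper proposes ConFu, a new speculative sampling framework built around a "contemplate the future" mechanism (contemplate tokens, soft prompts, and a dynamic mixture-of-experts model). The mechanism lets the draft model use future-oriented signals from the target model, which significantly improves token acceptance rate and generation speed on Llama-3 models at almost no additional cost.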

Zongyue Qin, Raghavv Goel, Mukul Gagrani, Risheek Garrepalli, Mingu Lee, Yizhou Sun | Wed, 11 Ma | 💬 cs.CL

SciTaRC: Benchmarking QA on Scientific Tabular Data that Requires Language Reasoning and Complex Computation

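This paper introduces SciTaRC, a benchmark of expert-written question-answering tasks over tabular data from scientific papers. It shows that current state-of-the-art AI models (including Llama-3.3-70B) perform poorly on deep language reasoning and complex computation because of a pervasive "execution bottleneck", failing on at least 23% of the tasks.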

Hexuan Wang, Yaxuan Ren, Srikar Bommireddypalli, Shuxian Chen, Adarsh Prabhudesai, Rongkun Zhou, Elina Baral, Philipp Koehn | Wed, 11 Ma | 💬 cs.CL

PathoScribe: Transforming Pathology Data into a Living Library with a Unified LLM-Driven Framework for Semantic Retrieval and Clinical Integration

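This paper proposes PathoScribe, a unified framework based on retrieval-augmented large language models that turns static archives of pathology reports into a "living library" supporting natural-language retrieval, automated cohort construction, and clinical reasoning. The authors report significant gains in retrieval efficiency and in the value of pathology data for clinical decision-making.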

Abdul Rehman Akbar, Samuel Wales-McGrath, Alejadro Levya, Lina Gokhale, Rajendra Singh, Wei Chen, Anil Parwani, Muhammad Khalid Khan Niazi | Wed, 11 Ma | 🤖 cs.AI

Automated Thematic Analysis for Clinical Qualitative Data: Iterative Codebook Refinement with Full Provenance

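This paper proposes an automated thematic analysis framework that combines iterative codebook refinement with full provenance tracking, aiming to address the scalability and reproducibility challenges of processing clinical qualitative data. Evaluated on multiple datasets, it shows significant advantages over baseline methods in code reusability, distributional consistency, and alignment with expert-derived themes.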

Seungjun Yi, Joakim Nguyen, Huimin Xu, Terence Lim, Joseph Skrovan, Mehak Beri, Hitakshi Modi, Andrew Well, Carlos M. Mery, Yan Zhang, Mia K. Markey, Ying Ding | Wed, 11 Ma | 💬 cs.CL

Learning When to Sample: Confidence-Aware Self-Consistency for Efficient LLM Chain-of-Thought Reasoning

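This paper proposes a confidence-based adaptive sampling framework that analyzes intermediate-state features within a single reasoning trajectory to choose reasoning paths dynamically. It significantly reduces the computational cost of chain-of-thought reasoning in large language models while maintaining accuracy comparable to multi-path methods.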

Juming Xiong, Kevin Guo, Congning Ni, Chao Yan, Katherine Brown, Avinash Baidya, Xiang Gao, Bradley Marlin, Zhijun Yin | Wed, 11 Ma | 💬 cs.CL

From Days to Minutes: An Autonomous AI Agent Achieves Reliable Clinical Triage in Remote Patient Monitoring

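This study introduces Sentinel, an autonomous AI agent that uses the Model Context Protocol (MCP) to perform multi-step reasoning and contextualized triage over remote patient monitoring data. It outperforms human clinicians on key metrics such as emergency sensitivity and delivers scalable automated monitoring at very low cost, addressing the core problem that caused earlier remote patient monitoring trials to fail: data overload.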

Seunghwan Kim (AnsibleHealth Inc., San Francisco, USA), Tiffany H. Kung (AnsibleHealth Inc., San Francisco, USA; Stanford School of Medicine, Stanford, USA), Heena Verma (AnsibleHealth Inc., San Francisco, USA), Dilan Edirisinghe (AnsibleHealth Inc., San Francisco, USA), Kaveh Sedehi (AnsibleHealth Inc., San Francisco, USA), Johanna Alvarez (AnsibleHealth Inc., San Francisco, USA), Diane Shilling (AnsibleHealth Inc., San Francisco, USA), Audra Lisa Doyle (AnsibleHealth Inc., San Francisco, USA), Ajit Chary (AnsibleHealth Inc., San Francisco, USA), William Borden (AnsibleHealth Inc., San Francisco, USA; George Washington University, Washington, D.C., USA), Ming Jack Po (AnsibleHealth Inc., San Francisco, USA) | Wed, 11 Ma | 🤖 cs.AI