Dial: A Knowledge-Grounded Dialect-Specific NL2SQL System

This paper introduces Dial, a knowledge-grounded framework that addresses the challenges of generating executable SQL across heterogeneous database systems by employing dialect-aware logical planning, a hierarchical intent-aware knowledge base, and an execution-driven debugging loop, achieving significant improvements in translation accuracy and dialect feature coverage on the newly constructed DS-NL2SQL benchmark.
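
The execution-driven debugging loop named in the summary can be illustrated with a minimal, generic sketch (not Dial's actual implementation): generate SQL, execute it, and feed any execution error back into the next generation attempt. `mock_generate_sql` is a hypothetical stand-in for the LLM call, and the toy schema is invented for illustration.

```python
import sqlite3

def mock_generate_sql(question, schema, error=None):
    """Stand-in for an LLM call; a real system would prompt the model
    with the schema, the question, and any prior execution error."""
    if error is None:
        return "SELECT nme FROM users"   # first attempt: typo in column name
    return "SELECT name FROM users"      # repaired after seeing the error

def debug_loop(question, schema, conn, max_rounds=3):
    """Generate SQL, execute it, and feed execution errors back for repair."""
    error = None
    for _ in range(max_rounds):
        sql = mock_generate_sql(question, schema, error)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)             # surfaced to the next generation round
    raise RuntimeError(f"gave up after {max_rounds} rounds: {error}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")
print(debug_loop("list user names", "users(name)", conn))  # [('ada',)]
```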

Xiang Zhang, Hongming Xu, Le Zhou, Wei Zhou, Xuanhe Zhou, Guoliang Li, Yuyu Luo, Changdong Liu, Guorun Chen, Jiang Liao, Fan Wu · Tue, 10 Ma · cs.LG

Skip to the Good Part: Representation Structure & Inference-Time Layer Skipping in Diffusion vs. Autoregressive LLMs

This paper reveals that diffusion language models develop distinct, hierarchical internal representations with early-layer redundancy compared to autoregressive models, enabling a novel, training-free layer-skipping inference method that significantly reduces computational costs while maintaining high performance.
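
Inference-time layer skipping in general can be sketched with a toy residual stack (this is not the paper's model, only an illustration of the idea): calibrate once on a probe input, mark layers whose residual update barely moves the hidden state, then skip them at inference. All names and scales below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy residual stack standing in for a transformer: h <- h + tanh(h @ W_i).
# The first two layers have tiny weights, i.e. near-identity updates.
scales = (0.01, 0.01, 0.5, 0.5)
weights = [rng.normal(scale=s, size=(8, 8)) for s in scales]
layers = [lambda h, W=W: h + np.tanh(h @ W) for W in weights]

def forward(h, skip=frozenset()):
    """Run the stack, omitting any layer index in `skip`."""
    for i, layer in enumerate(layers):
        if i not in skip:
            h = layer(h)
    return h

# Training-free calibration: a layer is skippable when its residual
# update is small relative to the hidden state on a probe input.
h = rng.normal(size=8)
skippable = set()
for i, layer in enumerate(layers):
    out = layer(h)
    if np.linalg.norm(out - h) / np.linalg.norm(h) < 0.1:
        skippable.add(i)
    h = out

print(sorted(skippable))   # the near-identity early layers
```

The calibration cost is one forward pass; skipping then saves the skipped layers' compute on every subsequent input.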

Raghavv Goel, Risheek Garrepalli, Sudhanshu Agrawal, Chris Lott, Mingu Lee, Fatih Porikli · Tue, 10 Ma · cs.CL

Bolbosh: Script-Aware Flow Matching for Kashmiri Text-to-Speech

This paper introduces Bolbosh, the first open-source neural Text-to-Speech system for Kashmiri, which utilizes a script-aware, supervised cross-lingual adaptation strategy based on Optimal Transport Conditional Flow Matching and a three-stage acoustic enhancement pipeline to overcome the limitations of zero-shot multilingual baselines and achieve significantly higher speech quality and intelligibility.
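
The training target in Optimal Transport Conditional Flow Matching can be written down in a few lines (a generic textbook sketch, not Bolbosh's implementation): sample a point on the straight path between noise and data, and regress the network onto the constant displacement velocity. The mel-frame example values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_pair(x0, x1, t):
    """Linear (OT displacement) interpolation used in conditional flow
    matching: the model is regressed onto the constant velocity x1 - x0."""
    xt = (1.0 - t) * x0 + t * x1   # point on the straight noise-to-data path
    ut = x1 - x0                   # target velocity at (xt, t)
    return xt, ut

x0 = rng.normal(size=4)                 # noise sample
x1 = np.array([1.0, 2.0, 3.0, 4.0])     # data sample, e.g. a mel-frame
xt, ut = cfm_pair(x0, x1, t=0.5)
# The CFM loss for a model v is the mean squared error ||v(xt, t) - ut||^2.
```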

Tajamul Ashraf, Burhaan Rasheed Zargar, Saeed Abdul Muizz, Ifrah Mushtaq, Nazima Mehdi, Iqra Altaf Gillani, Aadil Amin Kak, Janibul Bashir · Tue, 10 Ma · cs.CL

TableMind++: An Uncertainty-Aware Programmatic Agent for Tool-Augmented Table Reasoning

TableMind++ enhances the existing TableMind framework for tool-augmented table reasoning by introducing an uncertainty-aware inference framework that mitigates hallucinations through memory-guided plan pruning, confidence-based action refinement, and dual-weighted trajectory aggregation, thereby achieving superior performance on diverse benchmarks.
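
Dual-weighted trajectory aggregation can be sketched as a weighted vote over candidate answers (a minimal illustration; the two weights here are hypothetical stand-ins for TableMind++'s actual scores): each trajectory's answer accumulates the product of a confidence weight and a consistency weight, and the highest-scoring answer wins.

```python
from collections import defaultdict

def aggregate(trajectories):
    """Weighted vote over reasoning trajectories: each candidate answer
    accumulates confidence * consistency, and the top answer is returned."""
    scores = defaultdict(float)
    for answer, confidence, consistency in trajectories:
        scores[answer] += confidence * consistency
    return max(scores, key=scores.get)

trajectories = [
    ("42", 0.9, 0.8),   # (answer, confidence, consistency)
    ("41", 0.6, 0.9),
    ("42", 0.7, 0.5),
]
print(aggregate(trajectories))  # 42: 0.72 + 0.35 = 1.07 beats 41: 0.54
```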

Mingyue Cheng, Shuo Yu, Chuang Jiang, Xiaoyu Tao, Qingyang Mao, Jie Ouyang, Qi Liu, Enhong Chen · Tue, 10 Ma · cs.CL

MAWARITH: A Dataset and Benchmark for Legal Inheritance Reasoning with LLMs

The paper introduces MAWARITH, a large-scale Arabic dataset and the MIR-E evaluation metric designed to benchmark and improve large language models' ability to perform complex, multi-step reasoning for Islamic inheritance law, revealing that while advanced models like Gemini-2.5-flash achieve high performance, many others struggle with critical legal rules and error propagation.

Abdessalam Bouchekif, Shahd Gaben, Samer Rashwani, Somaya Eltanbouly, Mutaz Al-Khatib, Heba Sbahi, Mohammed Ghaly, Emad Mohamed · Tue, 10 Ma · cs.CL

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

This paper introduces Nwāchā Munā, the first manually transcribed Devanagari speech corpus for the endangered Nepal Bhasha, and demonstrates that proximal cross-lingual transfer from Nepali achieves competitive automatic speech recognition performance comparable to large multilingual models while being significantly more computationally efficient.

Rishikesh Kumar Sharma, Safal Narshing Shrestha, Jenny Poudel, Rupak Tiwari, Arju Shrestha, Rupak Raj Ghimire, Bal Krishna Bal · Tue, 10 Ma · cs.CL

KCoEvo: A Knowledge Graph Augmented Framework for Evolutionary Code Generation

KCoEvo is a knowledge graph-augmented framework that addresses the challenges of API-driven code evolution by decomposing migration into path retrieval and informed generation stages, significantly improving accuracy and execution success over standard LLM baselines through structured reasoning and synthetic supervision.
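
The path-retrieval stage can be illustrated with a generic breadth-first search over a migration knowledge graph (a minimal sketch; the graph, node names, and edge semantics below are invented, not KCoEvo's schema): the retrieved path is then handed to the generator as structured context.

```python
from collections import deque

# Hypothetical API-migration graph: an edge means "documented replacement".
graph = {
    "lib.old_call": ["lib.mid_call"],
    "lib.mid_call": ["lib.new_call"],
    "lib.other": ["lib.new_call"],
}

def retrieve_path(src, dst):
    """BFS over the knowledge graph: returns the shortest chain of
    documented replacements from the deprecated API to the target."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # no known migration path

print(retrieve_path("lib.old_call", "lib.new_call"))
# ['lib.old_call', 'lib.mid_call', 'lib.new_call']
```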

Jiazhen Kang, Yuchen Lu, Chen Jiang, Jinrui Liu, Tianhao Zhang, Bo Jiang, Ningyuan Sun, Tongtong Wu, Guilin Qi · Tue, 10 Ma · cs.CL

StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

This paper introduces StyleBench, a multi-turn dialogue benchmark designed to systematically evaluate and quantify the ability of speech language models to control conversational speaking styles across emotion, speed, volume, and pitch dimensions, revealing performance gaps between current models and highlighting directions for future improvement.

Haishu Zhao, Aokai Hao, Yuan Ge, Zhenqiang Hong, Tong Xiao, Jingbo Zhu · Tue, 10 Ma · cs.CL

KohakuRAG: A simple RAG framework with hierarchical document indexing

KohakuRAG is an open-source, hierarchical RAG framework that achieves state-of-the-art performance on the WattBot 2025 Challenge by preserving document structure through a four-level tree representation, enhancing retrieval via LLM-powered query planning, and stabilizing outputs with ensemble voting, thereby outperforming existing methods in precision and citation accuracy.
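
A hierarchical tree index of this kind can be sketched in a few lines (a generic illustration, not KohakuRAG's code; the four levels and example text are invented): retrieval returns each matching leaf together with its structural path, which is what makes precise citation possible.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node in a document tree; levels here run
    doc > section > paragraph > sentence, a four-level hierarchy."""
    level: str
    text: str
    children: list = field(default_factory=list)

def retrieve(node, query, path=()):
    """Return matching sentences with their full ancestry, so an answer
    can cite exactly which section and paragraph it came from."""
    hits = []
    here = path + (node.text,)
    if node.level == "sentence" and query.lower() in node.text.lower():
        hits.append(here)
    for child in node.children:
        hits.extend(retrieve(child, query, here))
    return hits

doc = Node("doc", "Manual", [
    Node("section", "Safety", [
        Node("paragraph", "P1", [
            Node("sentence", "Unplug the unit before servicing."),
        ]),
    ]),
])
print(retrieve(doc, "unplug"))
# [('Manual', 'Safety', 'P1', 'Unplug the unit before servicing.')]
```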

Shih-Ying Yeh, Yueh-Feng Ku, Ko-Wei Huang, Buu-Khang Tu · Tue, 10 Ma · cs.CL

Scalable Training of Mixture-of-Experts Models with Megatron Core

This paper presents Megatron Core, a scalable and production-ready open-source framework that addresses the coupled memory, communication, and computation challenges of Mixture-of-Experts (MoE) training through integrated system-level optimizations, enabling high-performance training of models ranging from billions to trillions of parameters on large-scale GPU clusters.
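
The routing step at the heart of any MoE layer can be shown with standard top-k gating (a generic NumPy sketch of the common technique, not Megatron Core's implementation): each token keeps its k highest-scoring experts and renormalizes their gate weights.

```python
import numpy as np

def top_k_route(logits, k=2):
    """Standard top-k MoE routing: pick the k highest-scoring experts
    per token and softmax-normalize their gate weights."""
    idx = np.argsort(logits, axis=-1)[:, -k:]           # chosen expert ids
    gates = np.take_along_axis(logits, idx, axis=-1)
    gates = np.exp(gates - gates.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)               # softmax over top-k
    return idx, gates

tokens = np.array([[0.1, 2.0, 0.3, 1.5],
                   [1.2, 0.0, 0.9, 0.1]])   # router logits, 4 experts
idx, gates = top_k_route(tokens)
print(idx)    # per row: indices of the two selected experts
```

At scale, the systems challenge is that the tokens routed to each expert must be exchanged across GPUs (expert parallelism), which is where the communication optimizations the paper describes come in.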

Zijie Yan, Hongxiao Bai, Xin Yao, Dennis Liu, Tong Liu, Hongbin Liu, Pingtian Li, Evan Wu, Shiqing Fan, Li Tao, Robin Zhang, Yuzhong Wang, Shifang Xu, Jack Chang, Xuwen Chen, Kunlun Li, Yan Bai, Gao Deng, Nan Zheng, Vijay Anand Korthikanti, Abhinav Khattar, Ethan He, Soham Govande, Sangkug Lym, Zhongbo Zhu, Qi Zhang, Haochen Yuan, Xiaowei Ren, Deyu Fu, Tailai Ma, Shunkang Zhang, Jiang Shao, Ray Wang, Santosh Bhavani, Xipeng Li, Chandler Zhou, David Wu, Yingcan Wei, Ashwath Aithal, Michael Andersch, Mohammad Shoeybi, Jiajie Yao, June Yang (all NVIDIA) · Tue, 10 Ma · cs.LG

Large Language Model for Discrete Optimization Problems: Evaluation and Step-by-step Reasoning

This paper evaluates the capabilities of various large language models, including Llama-3 and ChatGPT, in solving diverse discrete optimization problems using natural language datasets, revealing that while stronger models generally perform better, Chain-of-Thought reasoning is not universally effective and data augmentation can improve performance on simpler tasks despite introducing instability.

Tianhao Qian, Guilin Qi, Z. Y. Wu, Ran Gu, Xuanyi Liu, Canchen Lyu · Tue, 10 Ma · cs.CL

3ViewSense: Spatial and Mental Perspective Reasoning from Orthographic Views in Vision-Language Models

To address the "spatial intelligence gap" where Vision-Language Models struggle with elementary 3D tasks despite strong logical reasoning, the paper introduces 3ViewSense, a framework that leverages an engineering-inspired "Simulate-and-Reason" mechanism to ground spatial understanding in orthographic views, significantly improving performance on occlusion-heavy counting and view-consistent reasoning benchmarks.

Shaoxiong Zhan, Yanlin Lai, Zheng Liu, Hai Lin, Shen Li, Xiaodong Cai, Zijian Lin, Wen Huang, Hai-Tao Zheng · Tue, 10 Ma · cs.CL

Whitening Reveals Cluster Commitment as the Geometric Separator of Hallucination Types

This paper demonstrates that applying PCA-whitening to GPT-2-small embeddings reveals cluster commitment as the geometric separator distinguishing hallucination types, specifically resolving the previously indistinguishable "wrong-well convergence" and "coverage gap" failures while identifying the inability to separate "center-drift" from "wrong-well convergence" as a model capacity limitation rather than a measurement artifact.
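
PCA-whitening itself is a standard transform and can be sketched directly (a generic implementation, not the paper's code; the toy anisotropic data below is invented): center the embeddings, rotate onto the principal axes, and rescale each axis to unit variance so that distances are not dominated by a few high-variance directions.

```python
import numpy as np

def pca_whiten(X, eps=1e-8):
    """PCA-whitening: center, rotate onto principal axes via the
    eigendecomposition of the covariance, and rescale to unit variance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)
    return Xc @ vecs / np.sqrt(vals + eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([10.0, 1.0, 0.1])  # anisotropic
W = pca_whiten(X)
print(np.round(np.cov(W.T), 2))   # ≈ identity: all directions comparable
```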

Matic Korun · Tue, 10 Ma · cs.CL

QuadAI at SemEval-2026 Task 3: Ensemble Learning of Hybrid RoBERTa and LLMs for Dimensional Aspect-Based Sentiment Analysis

The QuadAI system for SemEval-2026 Task 3 achieves superior performance in dimensional aspect-based sentiment regression by employing an ensemble learning framework that combines a hybrid RoBERTa encoder with large language models, leveraging the complementary strengths of both architectures to significantly reduce RMSE and improve correlation scores.
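
Why ensembling two regressors of different architecture reduces RMSE can be shown with a simulation (an illustration of the general effect, not QuadAI's pipeline; the two "models" below are simulated with independent noise): averaging predictions with independent errors shrinks the error by roughly a factor of sqrt(2).

```python
import numpy as np

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

rng = np.random.default_rng(0)
target = rng.uniform(1, 9, size=500)    # gold sentiment-dimension scores

# Two imperfect regressors with independent error, standing in for an
# encoder-based model and an LLM scorer (both simulated here).
model_a = target + rng.normal(scale=1.0, size=500)
model_b = target + rng.normal(scale=1.0, size=500)
ensemble = (model_a + model_b) / 2      # simple averaging ensemble

print(rmse(model_a, target), rmse(ensemble, target))
# Averaging independent errors cuts RMSE by roughly sqrt(2).
```

The gain shrinks as the two models' errors become correlated, which is why combining architecturally different models (as QuadAI does) helps more than ensembling near-identical ones.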

A. J. W. de Vink, Filippos Karolos Ventirozos, Natalia Amat-Lefort, Lifeng Han · Tue, 10 Ma · cs.CL