MHCXGraph: A Graph-Based approach to detecting T cell receptor cross-reactivity

This paper introduces MHCXGraph, a scalable and interpretable graph-based computational tool that integrates structural information to identify conserved regions in peptide-MHC complexes, thereby overcoming the limitations of sequence-based methods for detecting T cell receptor cross-reactivity in therapeutic and vaccine development.

Simoes, C. D. M. S., Maidana, R. L. B. R., De Assis, S. C. + 2 more · 2026-04-10 · 💻 bioinformatics

Synolog: A Scalable Synteny-Based Framework for Genome Architecture Characterization

The paper introduces Synolog, a scalable bioinformatic toolkit that leverages synteny-based analysis to identify orthologs, synteny clusters, retrogenes, and segmental duplications across diverse genomes, demonstrating its utility in characterizing genome architecture, detecting local gene expansions, and reconstructing chromosome-level assemblies while highlighting the advantages of synteny over pure sequence similarity.

Madrigal, G., Catchen, J. M. · 2026-04-10 · 💻 bioinformatics

PERREO: An integrated pipeline for repetitive elements analysis enables the repeatome expression profiling in cancer

The paper introduces PERREO, a comprehensive and user-friendly pipeline that overcomes the limitations of standard RNA-seq tools by enabling sensitive, repeat-aware analysis of repetitive element expression across short- and long-read data to facilitate the study of the repeatome's role in cancer and other diseases.

Rodriguez-Martin, F., Masero-Leon, M., Gomez-Cabello, D. · 2026-04-10 · 💻 bioinformatics

Benchmarking ambient RNA removal across droplet and well-plate platforms reveals artificial count generation as a critical failure mode of scAR and CellClear

This study systematically benchmarks six ambient RNA removal tools across diverse single-cell platforms, revealing that while CellBender and SoupX offer reliable denoising, tools like scAR and CellClear critically fail by generating artificial counts and spurious cell types, thereby establishing count matrix integrity as a paramount criterion for tool selection.

Schroeder, L., Gerber, S., Ruffini, N. · 2026-04-10 · 💻 bioinformatics

Structure-aware geometric graph learning for modeling protease-substrate specificity at scale

The paper introduces OmniCleave, a scalable, structure-aware geometric graph learning framework that outperforms existing methods in modeling protease-substrate specificity by integrating multi-scale structural graphs and higher-order relational topology, thereby enabling the discovery of novel substrates and cleavage sites across diverse protease families.

Guo, X., Bi, Y., Ran, Z. + 9 more · 2026-04-10 · 💻 bioinformatics

A computational model for quantifying instability of tandem repeats across the genome

This paper introduces a general-purpose computational model that leverages long-read sequencing data to accurately quantify genome-wide tandem repeat instability by characterizing read-to-consensus deviations, revealing that instability is primarily driven by repeat composition rather than length and enabling the detection of significant mosaicism in pathogenic expansions.

Dolzhenko, E., English, A., Mokveld, T. + 14 more · 2026-04-10 · 💻 bioinformatics

Statistical Principles Define an Open-Source Differential Analysis Workflow for Mass Spectrometry Imaging Experiments with Complex Designs

This paper presents an open-source, statistically rigorous workflow for analyzing complex mass spectrometry imaging experiments, demonstrating through case studies and simulations how critical decisions regarding signal processing, region selection, and statistical modeling impact the detection of differentially abundant analytes.

Rogers, E. B. T., Lakkimsetty, S. S., Bemis, K. A. + 4 more · 2026-04-10 · 💻 bioinformatics

Divergent landscapes of positive and negative selection signatures across residue-resolved human-virus protein-protein interaction interfaces

By integrating human-virus protein-protein interaction maps with residue-resolved contact data, this study reveals that positive and negative selection signatures exhibit distinct spatial patterns across virus-targeted host proteins, with positively selected residues clustering more prominently on interfaces shared between viral and endogenous partners, thereby highlighting these "mimic-targeted" sites as focal points of adaptive evolution.

Su, W.-C., Xia, Y. · 2026-04-10 · 💻 bioinformatics

CoPhaser: generic modeling of biological cycles in scRNA-seq with context-dependent periodic manifolds

CoPhaser is a versatile, biologically informed variational autoencoder that disentangles context-dependent periodic trajectories from other sources of cellular variability in single-cell RNA sequencing data, enabling the accurate reconstruction and analysis of diverse biological cycles such as the cell cycle, circadian rhythms, and developmental clocks across various tissues and disease states.

Paychere, Y., Salati, A., Gobet, C. + 1 more · 2026-04-09 · 💻 bioinformatics

Quaternion Spectral Fingerprinting of DNA: GPU-Accelerated Multi-Channel Fourier Analysis for Alignment-Free Genomics

This paper introduces a GPU-accelerated quaternion Fourier transform framework that encodes DNA as a quaternion-valued signal to enable alignment-free genomic analysis. Its multi-channel spectral fingerprints reveal universal structural periodicities such as the helical repeat and species-specific features such as nucleosome positioning, while whole-genome processing takes under one second on commodity hardware.

Bergach, M. A. · 2026-04-09 · 💻 bioinformatics

End-to-end evaluation of pipelines for metagenome-assembled genomes reveals hidden performance gaps

This paper introduces MAG-E, a simulation-based framework for end-to-end evaluation of metagenome-assembled genome (MAG) pipelines, which reveals that while metaSPAdes and COMEBin generally outperform alternatives in the human gut microbiome, current tools struggle with prophages and shared contigs, and quality control metrics like CheckM2 often misestimate genome quality.

Coleman, I., Ma, J., Qian, G. + 3 more · 2026-04-09 · 💻 bioinformatics

Germline VCF Annotator: a lightweight pipeline for processing germline VCFs with robust variant extraction and read evidence quality control

This study introduces the Germline VCF Annotator, a lightweight pipeline that converts raw VCF files into reproducible, human-readable tables with standardized annotations and read-evidence quality control. Applied to germline variant burdens in normal colon crypts, it found no age-related trends at DNA damage response loci.

Manojlovic, Z. · 2026-04-09 · 💻 bioinformatics

IEKB: a comprehensive knowledge base for inner ear genetics integrating curated associations, cochlear interactions, Bayesian candidate prioritisation, explainable dark-gene support relations, and a scientific entity network

The paper introduces the Inner Ear Knowledge Base (IEKB), an open-access resource that unifies curated gene-phenotype-disease associations, cochlear interactions, Bayesian candidate prioritization, explainable support relations, and a multi-entity scientific network to advance inner-ear genetics research through automated curation and interactive exploration tools.

Wang, H., Chen, W., Ning, H. + 6 more · 2026-04-09 · 💻 bioinformatics