Short talks and Posters of RECOMB-Seq

Short Talks Selected from Abstracts

2025

Metrics Matter: why we need to stop using silhouette in single-cell benchmarking. Pia Rautenstrauch and Uwe Ohler.
Full length isoform reconstruction in single cell data. Marie Van Hecke, Koen Deserranno, Elise Callens, Filip Van Nieuwerburgh and Kathleen Marchal.
Masked superstrings as a compact, indexable, and dynamic representation of unconstrained k-mer sets. Ondřej Sladký, Pavel Veselý and Karel Brinda.
Automated annotation of satellite DNA. Alexander Sweeten, Adam Phillippy and Michael Schatz.
b-move: faster lossless approximate pattern matching in a run-length compressed index. Lore Depuydt, Luca Renders, Simon Van de Vyver, Lennart Veys, Travis Gagie and Jan Fostier.
strangepg: toward pangenome scale graph visualization. Konstantinn Bonnet and Tobias Marschall.
Improved variant calling via latent breakpoint graphs. Megan Le, Lillian Zhang, Can Koçkan, Barış Ekim, Houlin Yu, Brian Haas, Aziz Al’Khafaji, Bonnie Berger and Victoria Popic.
A novel k-mer masking approach for improving specificity in metagenomic pathogen detection. Junqiong Qiu, Seungmo Lee, Vivek Agarwal and William O’Brien.
Pre-training dataset deduplication improves genomic LLMs. Mahler Revsine, Daniel Khashabi and Michael Schatz.
Edgecopy: accurate CNV calling in duplicated genes using whole-exome sequencing. Sang Yoon Byun and Vikas Bansal.
Vizitig: context-rich exploration of sequencing datasets. Bastien Degardins, Charles Paperman and Camille Marchet.
Inverted colored de Bruijn graph for practical kmer set storage. Timothé Rouzé, Rayan Chikhi and Antoine Limasset.
Reindeer2: practical abundance index at scale. Yohan Hernandez Courbevoie, Mikaël Salson, Chloé Bessière, Haoliang Xue, Daniel Gautheret, Camille Marchet and Antoine Limasset.
Multi-sample, multi-platform isoform quantification using empirical Bayes. Arghamitra Talukder, Shree Thavarekere, Madison Mehlferber, Gloria M Sheynkman and David A. Knowles.
De Bruijn graphs for pangenomics: in-depth performance benchmarking of de Bruijn graph-based tools for read mapping. Zülal Bingöl, Berkan Şahin, Konstantina Koliogeorgi, Ricardo Roman-Brenes, Klea Zambaku, Can Firtina, Onur Mutlu and Can Alkan.
Efficient algorithm for resolving scenarios of complex chromosomal rearrangements. Barbara Poszewiecka, Krzysztof Gogolewski and Anna Gambin.

2024

A whole-genome probe design for massively parallel variant validation using selective circularization. Daniel Newburger, Georges Natsoulis, Hua Xu, Sue Grimes, John Bell and Hanlee Ji.
Accurate estimation of gene expression levels from DGE sequencing data. Marius Nicolae and Ion Mandoiu.
Constrained traversal of repeats with paired sequences. Sébastien Boisvert, Élénie Godzaridis, François Laviolette and Jacques Corbeil.
Contig graph mining for duplication breakpoints. Jurgen F. Nijkamp, Jean-Marc Daran, Marcel J.T. Reinders and Dick De Ridder.
Counting k-mers with a Bloom Filter. Pall Melsted and Jonathan Pritchard.
Finding deletions with exact break points from noisy low coverage paired-end short sequence reads. Jin Zhang and Yufeng Wu.
Improved variant discovery and allele frequency estimation from pooled DNA resequencing with Bayesian latent class analysis and compositional bias models. Shom Paul and Aaron Mackey.
Modeling and automation of sequencing-based determination of RNA structure. Sharon Aviran, Cole Trapnell, Julius Lucks, Stefanie Mortimer, Shujun Luo, Gary Schroth, Jennifer Doudna, Adam Arkin and Lior Pachter.
mTiM: margin-based transcript mapping from RNA-seq. Georg Zeller, Nico Goernitz, Gunnar Raetsch, Jonas Behr, Andre Kahles, Soeren Sonnenburg and Pramod Mudrakarta.
Separating metagenomic data into genomes via clustering. Olga Tanaseichuk and Tao Jiang.
TavernaPBS: custom next-generation sequence analysis workflows using high-performance computing resources with Taverna and PBS. Mark Lawson, Paul Shuber and Aaron Mackey.

2023

TRIBAL: Tree inference of B cell clonal lineages. Leah Weber, Derek Reiman, Mohammed El-Kebir and Aly Khan.
A comprehensive analysis of the reusability of public omics data across 3.8 million research publications. Serghei Mangul, Mohammad Vahed, Nicholas Darci-Maher, Kerui Peng, Jaqueline Brito, JungHyun Jung, Anushka Rajesh, Andrew Smith, Reid F. Thompson, Casey Greene, Jonathan Jacobs, Dat Duong and Eleazar Eskin.
ClairS: accurate haplotype-aware long-read somatic variant calling using deep learning-based synthetic data learning. Zhenxian Zheng, Junhao Su, Tak-Wah Lam and Ruibang Luo.
A probabilistic framework for parametrizing RNA velocity fields with manifold-consistent cell cycle dynamics. Alex Lederer, Lorenzo Talamanca, Colas Droin, Maxine Leonardi, Irina Khven, Hugo Carvalho, Felix Naef and Gioele La Manno.
Pairwise sequence alignment with block and character edit operations. Ahmet Cemal Alıcıoğlu, Mahmud Sami Aydın and Can Alkan.
Genome misassembly detection using Stash: A data structure based on stochastic tile hashing. Armaghan Sarvar, Lauren Coombe, René Warren and Inanc Birol.
Sigmoni: efficient pangenome multi-classification of nanopore signal. Vikram Shivakumar, Omar Ahmed, Sam Kovaka, Mohsen Zakeri and Ben Langmead.
Panagram: alignment-free and interactive pan-genome visualization. Katharine Jenike, Sam Kovaka, Matthias Benoit, Srividya Ramakrishnan, Shujun Ou, James Saterlee, Stephan Hwang, Iacopo Gentile, Anat Hendelman, Michael Passalacqua, Xingang Wang, Michael Alonge, Hamsini Suresh, Ryan Santos, Blaine Fitzgerald, Gina Robitaille, Edeline Gagnon, Melissa Kramer, Sara Goodwin, W. Richard McCombie, Jaime Prohens, Tiina E. Särkinen, Amy Frary, Jesse Gillis, Joyce Van Eck, Ben Langmead, Zachary B. Lippman and Michael C. Schatz.
Integrating Hi-C sequencing data in verkko for gapless haplotype-resolved assembly. Dmitry Antipov, Shilpa Garg, Adam Phillippy and Sergey Koren.

2022

A trans-ancestry genomics-based approach to study the interplay between the immune system, pathogen virulence, and HLA type, ancestry, and sepsis outcome. Serghei Mangul.
Genotyping short tandem repeats using long reads. Helyaneh Ziaei Jam and Melissa Gymrek.
Metabuli: a metagenomic classifier that combines protein- and DNA-level classification to achieve both high sensitivity and specificity. Jaebeom Kim and Martin Steinegger.
Rigorous benchmarking of T cell receptor repertoire profiling methods for cancer RNA sequencing. Kerui Peng and Serghei Mangul.
Detection of somatic mosaicism at short tandem repeats from NGS data. Aarushi Sehgal and Melissa Gymrek.

2020

Visualisation of multiple sequence alignment structures. Paulina Knut, Paulina Dziadkiewicz and Norbert Dojer.
Comparison of kNN and k-means optimization methods of reference set selection for improved CNV callers performance. Wiktor Kuśmirek.
Enough to speak true: benchmarking long-read genome sequence alignment tools for human genomics applications. Jonathan Lotempio, Emmanuele Delot and Eric Vilain.
Association of alternative splicing in PICALM with Alzheimer’s disease progression. Juhyun Park, Senggyun Han, Kwangsik Nho and Younghee Lee.
Weighted minimizer sampling improves long read mapping. Chirag Jain, Arang Rhie, Haowen Zhang, Claudia Chu, Sergey Koren and Adam Phillippy.
EpiScanpy: a single cell epigenomics analysis pipeline. Anna Danese, Maria Richter, Kridsadakorn Chaichoompu, David Fischer, Fabian Theis and Maria Colomé-Tatché.

2018

Solving scaffolding problem with repeats. Igor Mandric and Alex Zelikovsky.
Signal enrichment of metagenome sequencing reads using topological data analysis. Aldo Guzman-Saenz, Niina Haiminen, Saugata Basu and Laxmi Parida.
CliqueSNV: scalable reconstruction of intra-host viral populations from NGS reads. Sergey Knyazev, Viachaslau Tsyvina, Andrii Melnyk, Alexander Artyomenko, Tatiana Malygina, Yuri Porozov, Ellsworth Campbell, William Switzer, Pavel Skums and Alex Zelikovsky.
Interactive single cell RNA-Seq analysis with the Single Cell Toolkit (SCTK). David Jenkins, Tyler Faits, Emma Briars, Sebastian Carrasco Pro, Steve Cunningham, Masanao Yajima and W. Evan Johnson.
GRASS-C - graph-based RNA-Seq analysis in single cell level subgraph clustering. Harry Taegyun Yang.
Tigmint: correct assembly errors using linked reads from large molecules. Shaun D Jackman, Lauren Coombe, Justin Chu, Rene Warren, Ben Vandervalk, Sarah Yeo, Hamid Mohamadi, Joerg Bohlmann, Steven Jones and Inanc Birol.
Fast expectation maximization source tracking. Liat Shenhav, Mike Thompson, Tyler Joseph, Ori Furman, David Bogumil, Itzik Mizrahi and Eran Halperin.
Fast and accurate bisulfite alignment and methylation calling for mammalian genomes. Jonas Fischer and Marcel Schulz.
Identification of transcriptional signatures for cell types from single-cell RNA-Seq. Lynn Yi, Vasilis Ntranos, Pall Melsted and Lior Pachter.

2017

Genomic reads forests for compressed representation of high throughput sequence data. Tony Ginart, Kaiyuan Zhu, Joseph Hui, Ibrahim Numanagic, David Tse, Thomas Courtade and Cenk Sahinalp.
Faster omnitig listing for safe and complete contig assembly. Massimo Cairo, Paul Medvedev, Nidia Obscura Acosta, Romeo Rizzi and Alexandru I. Tomescu.
Probabilistic estimation of overlap graphs for large sequence datasets. Rahul Nihalani, Sriram P. Chockalingam, Shaowei Zhu, Vijay Vazirani and Srinivas Aluru.
TransPac: transposon detection and characterization from long-reads. Xintong Chen, Oscar Rodriguez, Matthew Pendleton, Bojan Losic and Ali Bashir.
Variant tolerant read mapping using min-hashing. Jens Quedenfeld and Sven Rahmann.
MetaCherchant – an algorithm for analyzing genomic environment of antibiotic resistance gene in gut microbiota. Evgenii I. Olekhnovich, Artem T. Vasilyev, Vladimir I. Ulyantsev and Alexander V. Tyakht.
ARCS: assembly roundup by chromium scaffolding. Sarah Yeo, Lauren Coombe, Justin Chu, Rene Warren and Inanc Birol.
Kohdista: a succinct solution to Rmap alignment. Martin Muggli, Simon Puglisi and Christina Boucher.
A tensor factorization framework for haplotype assembly of diploids and polyploids. Abolfazl Hashemi, Banghua Zhu and Haris Vikalo.
LEAP: a generalization of the Landau-Vishkin algorithm with custom gap penalties. Hongyi Xin, Jeremie Kim, Sunny Nahar, Can Alkan and Onur Mutlu.
Theory and algorithm for the minimum path flow decomposition problem. Mingfu Shao and Carl Kingsford.

2016

Recycler: an algorithm for detecting plasmids from de novo assembly graphs. Roye Rozov, Aya Brown Kav, David Bogumil, Itzhak Mizrahi, Eran Halperin and Ron Shamir.
plasmidSPAdes: assembling plasmids from whole genome sequencing data. Dmitry Antipov, Nolan Hartwick, Max Shen, Michael Rayko, Alla Lapidus and Pavel Pevzner.
Fast, lightweight clustering of de novo transcriptomes using fragment equivalence classes. Avi Srivastava, Hirak Sarkar, Laraib Malik and Robert Patro.
Single-molecule protein identification by sub-nanopore sensors. Mikhail Kolmogorov, Eamonn Kennedy, Zhuxin Dong, Gregory Timp and Pavel Pevzner.
IgReC 2.0: new algorithmic challenges of adaptive immune repertoire construction. Alexander Shlemov, Sergey Bankevich, Andrey Bzikadze and Yana Safonova.
Efficient index maintenance under dynamic genome modification. Nitish Gupta, Komal Sanjeev, Tim Wall, Carl Kingsford and Rob Patro.
NASCUP: nucleic acid sequence classification by universal probability. Sunyoung Kwon, Gyuwan Kim, Byunghan Lee, Sungroh Yoon and Young-Han Kim.

2015

Chromatin segmentation with a joint model for reads explains a larger portion of the epigenome. Alessandro Mammana and Ho-Ryun Chung.
Using csaw to detect differentially bound regions in ChIP-seq data. Aaron Lun and Gordon Smyth.
Computational detection of DNA double-stranded breaks and inferring mechanisms of their formation. Norbert Dojer, Abhishek Mitra, Yea-Lih Lin, Anna Kubicka, Magdalena Skrzypczak, Krzysztof Ginalski, Philippe Pasero and Maga Rowicka.
A framework for inferring fitness landscapes of patient-derived viruses using quasispecies theory. David Seifert, Francesca Di Giallonardo, Karin J. Metzner, Huldrych F. Günthard and Niko Beerenwinkel.
Genome-wide mapping and computational analysis of non-B DNA structures in vivo. Damian Wójtowicz, Fedor Kouzine, Arito Yamane, Craig J. Benham, Rafael C. Casellas, David Levens and Teresa M. Przytycka.

2011

A whole-genome probe design for massively parallel variant validation using selective circularization. Daniel Newburger, Georges Natsoulis, Hua Xu, Sue Grimes, John Bell and Hanlee Ji.
Accurate estimation of gene expression levels from DGE sequencing data. Marius Nicolae and Ion Mandoiu.
Constrained traversal of repeats with paired sequences. Sébastien Boisvert, Élénie Godzaridis, François Laviolette and Jacques Corbeil.
Contig graph mining for duplication breakpoints. Jurgen F. Nijkamp, Jean-Marc Daran, Marcel J.T. Reinders and Dick De Ridder.
Counting k-mers with a Bloom Filter. Pall Melsted and Jonathan Pritchard.
Finding deletions with exact break points from noisy low coverage paired-end short sequence reads. Jin Zhang and Yufeng Wu.
Improved variant discovery and allele frequency estimation from pooled DNA resequencing with Bayesian latent class analysis and compositional bias models. Shom Paul and Aaron Mackey.
Modeling and automation of sequencing-based determination of RNA structure. Sharon Aviran, Cole Trapnell, Julius Lucks, Stefanie Mortimer, Shujun Luo, Gary Schroth, Jennifer Doudna, Adam Arkin and Lior Pachter.
mTiM: margin-based transcript mapping from RNA-seq. Georg Zeller, Nico Goernitz, Gunnar Raetsch, Jonas Behr, Andre Kahles, Soeren Sonnenburg and Pramod Mudrakarta.
Separating metagenomic data into genomes via clustering. Olga Tanaseichuk and Tao Jiang.
TavernaPBS: custom next-generation sequence analysis workflows using high-performance computing resources with Taverna and PBS. Mark Lawson, Paul Shuber and Aaron Mackey.

Posters

2025

Accelerating gkm-SVM training through GPU implementation. Dongwon Lee.
Identifying potential therapeutic targets for heart failure through systematic transcriptome analysis. Min-Ju Kim and Haeseung Lee.
Population-specific and universal molecular features of skeletal muscle aging: Comparative transcriptome analysis of Korean and GTEx datasets. Byeong-Don Min and Sang-Min Park.
Unlocking hidden protein functions with a biochemically informed annotation strategy. Olga Botvinnik.
Unraveling miRNA-seq data: a statistical framework to account for competition for expression towards accurate differential expression analysis. Seong-Hwan Jun.
Systematic evaluation of dimensionality reduction methods for capturing transcriptomic signatures responding to drug treatments. Yuseong Kwon, Sojeong Park, Soyoung Park and Haeseung Lee.
GreedyMini: generating low-density DNA minimizers. Shay Golan, Ido Tziony, Matan Kraus, Yaron Orenstein and Arseny Shur.
GPU-accelerated homology search with MMseqs2. Felix Kallenborn, Alejandro Chacon, Christian Hundt, Hassan Sirelkhatim, Kieran Didi, Sooyoung Cha, Christian Dallago, Milot Mirdita, Bertil Schmidt and Martin Steinegger.
strangepg: toward pangenome scale graph visualization. Konstantinn Bonnet and Tobias Marschall.
Vizitig: context-rich exploration of sequencing datasets. Bastien Degardins, Charles Paperman and Camille Marchet.
Identifying gene-environment interactions for cancer incidence using epigenomic profiles. Younghoon Kim.
stDyer enables spatial domain clustering with dynamic graph embedding. Ke Xu, Yu Xu, Zirui Wang, Xin Zhou and Lu Zhang.
Splicing junction classifier for detecting abnormal KEAP1-NRF2 system activation. Raul Mateos, Wira Winardi, Kenichi Chiba, Ai Okada, Ayako Suzuki, Yoichiro Mitsuishi and Yuichi Shiraishi.
Adapting broad protein language models to viruses. Spyros Lytras, Adam Strange, Jumpei Ito and Kei Sato.
b-move: faster lossless approximate pattern matching in a run-length compressed index. Lore Depuydt, Luca Renders, Simon Van de Vyver, Lennart Veys, Travis Gagie and Jan Fostier.
Full length isoform reconstruction in single cell data. Marie Van Hecke, Koen Deserranno, Elise Callens, Filip Van Nieuwerburgh and Kathleen Marchal.
A novel computational pipeline for the functional characterization and deorphanization of G-protein coupled receptors. Catherine Zhou.
Parallel and space efficient exact local alignment. Evelin Aasna.
Pre-training dataset deduplication improves genomic LLMs. Mahler Revsine, Daniel Khashabi and Michael Schatz.

2024

Compressed Indexing for Pangenome Substring Queries. Stephen Hwang, Nathaniel K. Brown, Omar Y. Ahmed, Katharine Jenike, Sam Kovaka, Michael C. Schatz and Ben Langmead.
Pan-genome de Bruijn Graph using the Bidirectional FM-index. Lore Depuydt, Luca Renders, Thomas Abeel and Jan Fostier.
Mumemto: efficient maximal matching across multiple genomes. Vikram Shivakumar and Ben Langmead.
A*PA & A*PA2: Up to 20 times faster exact global pairwise alignment. Ragnar Groot Koerkamp and Pesho Ivanov.
Accelerating whole-genome alignment using parallel chaining algorithm. Ghanshyam Chandra and Chirag Jain.
Full resolution HLA and KIR genes annotation for human genome assemblies. Ying Zhou, Li Song and Heng Li.
Combining DNA and protein alignments to improve genome annotation with LiftOn. Kuan-Hao Chao, Jakob M Heinz, Celine Hoh, Alan Mao, Alaina Shumate, Mihaela Pertea and Steven L Salzberg.
DupCaller enables robust detection of somatic mutations from Error-Corrected Sequencing. Yuhe Cheng and Ludmil B Alexandrov.
Comprehensive Tissue-Specific Somatic Mutation Profiling via RNA-seq in Diverse Mice. Alexis Garretson and Beth L Dumont.
VISTA: An integrated framework for structural variant discovery. Varuni Sarwal, Seungmo Lee, Jianzhi Yang, Sriram Sankararaman, Mark Chaisson, Eleazar Eskin and Serghei Mangul.
An efficient and accurate germline SNP caller for long-read RNA sequencing data. Neng Huang and Heng Li.
Analyzing the relatedness of genomic variation in malaria parasites using a reference-free approach. Cecile P G Meier-Scherling, Tavor Baharav, Karamoko Niaré, Julia Salzman, Lorin Crawford and Jeffrey A Bailey.
Identification of B cell subsets based on antigen receptor sequences using deep learning. Hyunho Lee, Kyoungseob Shin, Yongju Lee, Soobin Lee, Seungyoun Lee, Eunjae Lee, Seung Woo Kim, Ha Young Shin, Jong Hoon Kim, Junho Chung and Sunghoon Kwon.
Optimizing Design of Genomics Studies for Clonal Evolution Analysis. Arjun Srivatsa and Russell Schwartz.
Comprehensive benchmarking of methods to infer from TCR-Seq data. Mohammad Vahed, Yu Ning Huang, Jiaqi Fu, Kerui Peng and Serghei Mangul.
PA-Bench: A framework for benchmarking pairwise aligners. Ragnar Groot Koerkamp and Daniel Liu.
Block Aligner: an adaptive SIMD-accelerated aligner for sequences and position-specific scoring matrices. Daniel Liu and Martin Steinegger.
Gut derived Extracellular Vesicles Reaching Kupffer Cells: An alternative route for lipid transport out of the gut. Estefania Torrejón, Akiko Teshima, Inês Ferreira, Ana Sofia Carvalho, Hans Christian Beck, Rune Matthiesen, Fabrizia Carli, Amalia Gastaldelli, Maria Paula Macedo and Rita Machado de Oliveira.
A Unitig-Centered Pan-Genome Approach for Predicting Antibiotic Resistance and Discovering Novel Resistance Genes in Bacterial Strains. Thi Duyen Do, Ming-Ren Yang and Yu-Wei Wu.
Assessing Microbial Genome Representation Across Various Reference Databases: A Comprehensive Evaluation. Grigore Boldirev, Nitesh Sharma, Alex Zelikovsky and Serghei Mangul.
Validating a liquid soil model to explore soil microbial community dynamics. Siqin Li, Nicole Genesis Nicole Carpio Paucar and Natalie Farny.
Genetic Architecture of the Germline Mutation Rate and Reproductive Success in the Collaborative Cross. Alexis Garretson and Beth L Dumont.
WGTDA: A Topological Framework for Biomarker Discovery in Gene Expression Data. Ndivhuwo Nyase, Lebohang Mashatola, Stephanie Muller, Aviwe Kohlakala and Kahn Rhrissorrakrai.
QDSWorkflow: An Elastic Net-Based Tool for Modeling Cellular Dormancy. Michelle Wei and Guang Yao.
Comprehensive characterization of pseudogenes across 26 human tissues. Yunzhe Jiang, Beatrice Borsari and Mark Gerstein.
Partial gene predictions on unassembled reads: evaluating the Good, the Bad and the slightly ORF. Amanda Clare, Wayne Aubrey, Mike Surette and Nicholas Dimonaco.
A unified hypothesis-free feature extraction framework for diverse epigenomic data. Maria Chikina and Tugrul Balci.
Benchmarking of machine learning algorithms to predict mortality in sepsis from transcriptomic data. Karishma Chhugani, Serghei Mangul, Oleg Arnaut, Nitesh Sharma and Belin Korukoğlu.
Comparative Evaluation of T-Cell Receptor Repertoire Sequencing Methods. Dhrithi Deshpande and Serghei Mangul.
Statistical assessment of gene functional annotation clustering in graph models of chromosome conformation capture data. Dallas Nygard, Julie St-Pierre and Mathieu Lavallée-Adam.
Community: A Novel R-Tool for Enhanced Differential Communication Analysis in scRNAseq Data. Muhammet Celik, Felix Roman Salcher, Frank Ziemann, Maria Solovey and Maria Colome-Tatche.
Recovering approximate single cell distribution from aggregate measurements. Pratik Worah.
scFedVI: A Privacy-Preserving Approach to Mitigating Batch Effects in Single-Cell RNA-Sequencing Data. Parishad Mokhber, Alireza Gargoorimotlagh and Babak Khalaj.
sceptre: statistically rigorous, computationally efficient, and user-friendly single-cell CRISPR screen data analysis. Timothy Barry, Joseph Deutch, Xihong Lin and Eugene Katsevich.

2023

SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping. Damla Senol Cali, Konstantinos Kanellopoulos, Joel Lindegger, Zülal Bingöl, Gurpreet Singh Kalsi, Ziyi Zuo, Can Firtina, Meryem Banu Cavlak, Jeremie S. Kim, Nika Mansouri Ghiasi, Gagandeep Singh, Juan Gómez Luna, Nour Almadhoun Alserr, Mohammed Alser, Sreenivas Subramoney, Can Alkan, Saugata Ghose and Onur Mutlu.
Detection of large tandem duplications in HMPV isolates. Thomas Krannich, Stephan Fuchs and Sophie Köndgen.
Predicting the origin of soil samples - performance evaluation of a new targeted high-throughput sequencing metagenomic tool. Kamila Marszałek, Michał B. Kowalski, Andrzej Ossowski, Rafał Płoski, Renata Zbieć Piekarska, Paweł P. Łabaj and Wojciech Branicki.
A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater. Viorel Munteanu, Khooshbu Kantibhai Patel, Nitesh Kumar Sharma, Sergey Knyazev and Serghei Mangul.
Rigorous benchmarking of HLA callers for RNA-seq data. Ram Ayyala, Dottie Yu, Sergey Knyazev and Serghei Mangul.
RNA-Seq-based methods are able to effectively capture the clonotypes and estimate the diversity of TCR repertoires in T cell rich tissues and certain repertoires. Serghei Mangul.
Machine learning enabled pattern discovery in large-scale spatial gene expression datasets. Reza Abbasi-Asl.
Assessing the completeness of immunogenetics databases across diverse populations. Yu-Ning Huang, Yiting Meng, Naresh Amrat Patel, Jay Himanshu Mehta, Brittney Hua, Marina Fayzullina, Houda Alachkar and Serghei Mangul.
The systematic assessment of completeness of public metadata accompanying omics studies. Yu-Ning Huang, Anushka Rajesh, Ram Ayyala, Aditya Sarkar, Ruiwei Guo, Irina Nakashidze, Shirley Monge, Dottie Yu, Qiushi Peng, Grace Scheg, Khooshbu Kantibhai Patel, Tejasvene Ramesh, Anushka Yadav, Fangyun Liu, Jay Himanshu Mehta and Serghei Mangul.
SurfR: Surfing the cells’ surfaceome. Aurora Maurizio, Anna Sofia Tascini and Marco Jacopo Morelli.
GoPeaks: histone modification peak calling for CUT&Tag. William Yashar, Garth Kong, Jake Vancampen, Brittany Curtiss, Daniel Coleman, Lucia Carbone, Galip Yardimci, Julia Maxson and Theodore Braun.
ClusterV: accurate detection of HIV quasispecies and drug resistance mutations using ONT sequencing data. Junhao Su, Tak-Wah Lam and Ruibang Luo.
Unikseq: unique region identification in genome sequences using a k-mer approach, to empower environmental DNA assay designs and comparative genomics studies. Rene Warren, Michael J Allison, M. Louie Lopez, Neha Acharya-Patel, Lauren Coombe, Cecilia L. Yang, Caren C Helbing and Inanc Birol.
ntHits: streaming through raw sequencing data to profile and filter k-mers with selected multiplicities. Parham Kazemi, Hamid Mohamadi, Justin Chu, Lauren Coombe, Rene L Warren and Inanc Birol.
Evaluating the Robustness and Reproducibility of RNA-Seq Quantification Tools. Fangyun Liu, Brian Nadel, Pelin Icer Baykal and Serghei Mangul.
Copy number estimation using Counting Bloom Filters in de novo assembled genomes. Klea Zambaku, Ricardo Roman-Brenes, Ömer Yavuz Öztürk, Can Alkan and Inanç Birol.
Improving functional annotation of bacterial genomes with COGtools. Karel Sedlar, Petra Polakovicova and Ralf Zimmer.
Orthanq: orthogonal evidence based haplotype quantification. Hamdiye Uzuner and Johannes Köster.
Minichain: a new method for pangenome graph construction. Ghanshyam Chandra and Chirag Jain.
ALIBI2: improved linearization of pangenome graphs. Anna Lisiecka and Norbert Dojer.
Characterization of alignment and search algorithms for short read, long read, and graph mappers. Ecem İlgün, Ömer Yavuz Öztürk, Klea Zambaku, Juan Gómez Luna, Mohammed Alser, Ricardo Roman-Brenes, Can Alkan and The Biopim Project.
Nanopore signal alignment, analysis, and visualization with Uncalled4. Sam Kovaka, Paul W. Hook, Vikram Shivakumar, Katharine M. Jenike, Luke Morina, Roham Razaghi, Winston Timp and Michael C. Schatz.
Mod.Plot: a rapid and interactive visualization of tandem repeats. Alexander Sweeten, Adam Phillippy and Michael Schatz.
Using minimizer interarrival distances for read-until human read detection from blood samples sequenced by Oxford Nanopore. Sina Barazandeh, Mahmud Sami Aydin, Berke Ucar, Can Alkan and Inanc Birol.

2022

Finding Significant Genes and Pathways Across Viruses in CRISPR-Cas9 Screen Database. Elianna Kondylis, Jacklyn Luu, Kyle Awayan, Andreas Puschnik and Angela Pisco.
Metabuli: a metagenomic classifier that combines protein- and DNA-level classification to achieve both high sensitivity and specificity. Jaebeom Kim and Martin Steinegger.
Sketching and sampling approaches for fast and accurate long read classification. Arun Das and Michael Schatz.
Theory of local k-mer selection with applications to long-read alignment. Jim Shaw and Yun William Yu.
Metagenome assembly of high-fidelity long reads with hifiasm-meta. Xiaowen Feng, Haoyu Cheng, Daniel Portik and Heng Li.
Automated telomere-to-telomere genome assembly with PacBio HiFi and ultra-long ONT data. Mikko Rautiainen, Sergey Nurk, Brian Walenz, Adam M Phillippy and Sergey Koren.
Unlocking the microblogging potential for science and medicine. Karishma Chhugani and Serghei Mangul.

2018

mirLibSpark: a scalable NGS microRNA prediction pipeline with data aggregation. Chao-Jung Wu, Mohamed Amine Remita and Abdoulaye Baniré Diallo.
K-merator, an efficient design of highly specific k-mers for quantification of transcriptional signatures in large scale RNAseq cohorts. Sébastien Riquier, Anne-Laure Bougé, Benoit Guibert, Jérôme Audoux, Daniel Gautheret, Thérèse Commes and Anthony Boureux.
Kevlar: Mapping-free approach for accurate discovery of de novo variants. Daniel Standage, C. Titus Brown and Fereydoun Hormozdiari.
Promoter and enhancer chromatin dynamics during pancreatic differentiation. Henriette Miko, Scott A. Lacadie and Uwe Ohler.
Ultrafast space-efficient k-mer indexing. Sven Rahmann.
ARKS: chromosome-scale human genome scaffolding with linked read kmers. Rene Warren, Lauren Coombe, Jessica Zhang, Ben Vandervalk, Justin Chu, Shaun Jackman and Inanc Birol.
Tigmint: correct assembly errors using linked reads from large molecules. Shaun Jackman, Lauren Coombe, Justin Chu, Rene Warren, Ben Vandervalk, Sarah Yeo, Zhuyi Xue, Hamid Mohamadi, Joerg Bohlmann, Steven Jones and Inanc Birol.
Multi-Index Bloom Filters: A probabilistic data structure for sensitive multi-reference sequence classification with multiple spaced seeds. Justin Chu, Emre Erhan, Hamid Mohamadi, Ben Vandervalk, Jeffrey Tse, Sarah Yeo, Shaun Jackman, Ka Ming Nip, Rene Warren and Inanc Birol.
De novo clustering of gene expressed variants in transcriptomic long reads data sets. Camille Marchet, Lolita Lecompte, Jean-Marc Aury, Corinne Da Silva, Corinne Cruaud, Jacques Nicolas and Pierre Peterlongo.
ONTig: contiguating genome assembly using oxford nanopore long reads. Hamid Mohamadi, Ben Vandervalk, Shaun Jackman, Lauren Coombe, Justin Chu, Rene Warren and Inanc Birol.
Rapid and precise analysis of human gut metagenomes using Oxford Nanopore sequencing technology. Hugo Roume, Mathieu Almeida, Florian Plaza Oñate and S. Dusko Ehrlich.
S3A: a scalable and accurate annotated assembly tool for targeted gene assembly. Laurent David, Hugues Richard, Riccardo Vicedomini and Alessandra Carbone.
Pan-genome structural analysis and visualisation. Paulina Dziadkiewicz, Jakub Tyrek and Norbert Dojer.
Reference-guided genome assembly in metagenomic samples. Cervin Guyomar, Wesley Delage, Fabrice Legeai, Christophe Mougel, Jean-Christophe Simon and Claire Lemaitre.
Nonparametric identification of epigenomic networks from large-scale ChIP-seq experiments. Gabriele Schweikert and Guido Sanguinetti.
Accelerating approximate pattern matching with processing-in-memory (PIM) and single-instruction multiple-data (SIMD) programming. Damla Senol Cali, Zulal Bingol, Jeremie Kim, Rachata Ausavarungnirun, Saugata Ghose, Can Alkan and Onur Mutlu.
Isoform assembly with quasi-lossless compression of quality scores in RNA-seq data. Ana Hernandez.
Map2Peak: from unmapped reads to ChIP-seq peaks in half the time. Krishna Reddy Gujjula and Kiavash Kianfar.

2015

Deep sequencing characterization of Sus scrofa piRNA fraction shared between female and male gonads. Aleksandra Swiercz, Dorota Kowalczywiewicz, Luiza Handschuh, Katarzyna Lesniak, Marek Figlerowicz and Jan Wrzesinski.
GPU-accelerated whole genome assembly. Michał Kierzynka, Wojciech Frohmberg, Jacek Błażewicz, Piotr Żurkowski, Marta Kasprzak and Paweł Wojciechowski.
Scaling ABySS to longer reads using spaced k-mers and Bloom filters. Shaun Jackman, Karthika Raghavan, Benjamin Vandervalk, Daniel Paulino, Justin Chu, Hamid Mohamadi, Anthony Raymond, Rene Warren and Inanc Birol.
Conditional entropy in variation-adjusted windows detects positive selection signatures relevant to next generation sequencing. Samuel K. Handelman, Michal Seweryn, Ryan M. Smith, Katherine Hartmann, Danxin Wang, Maciej Pietrzak, Andrew D. Johnson, Andrzej Kloczkowski and Wolfgang Sadee.

2012

Detection of chromosomal inversions with paired-end sequencing. José Ignacio Lucas Lledó and Mario Cáceres.
A context-based approach to identify the most likely mapping for RNA-seq experiments. Thomas Bonfert, Gergely Csaba, Ralf Zimmer and Caroline C. Friedel.
Understanding the nucleation of the microRNA-mRNA pairing by using CLIP-Seq and RNA folding data. Ray Marin and Jiri Vanicek.
Calling inversions from next-generation sequencing paired-end mapping data with GRIAL. Sònia Casillas, Can Alkan, Evan E Eichler and Mario Cáceres.
Empirical evaluation of different modern reference panels for imputation and their implication for Genome Wide Association Studies. Sílvia Bonàs, Josep M. Mercader and David Torrents.
Filtering duplicate reads from 454 pyrosequencing data. Susanne Balzer, Ketil Malde, Inge Jonassen and Markus A. Grohme.
Improving loss of heterozygosity identification by tumor purity estimation. Eva König, Lars Feuerbach, Barbara Hutter, Matthias Schlesner, Qi Wang, Benedikt Brors and Thomas Lengauer.
A probabilistic method for structural variant prediction from strobe sequencing data. Anna Ritz, Suzanne Sindi, Ali Bashir and Benjamin Raphael.
Integrative de novo transcriptome assembly in fruit fly. Nathan Boley.
The devil is in the detail: mining and annotating genomic variants in the Tasmanian Devil facial tumour genome. Ole Schulz-Trieglaff, Elizabeth Murchison, Zemin Ning and Anthony Cox.
Automated workflow for RNA-Seq analysis: application and testing with various types of RNA-Seq protocols. Irina Khrebtukova, Ryan Kelley, Shujun Luo, Tim Hill, Patrick Lau, Jennifer Chiniquy, Kathryn Stephens, Semyon Kruglyak and Gary P Schroth.
The de novo Genome Assembly Assessment Server. André Corvelo and Tyler Alioto.
Improving RNA sequencing interpretation: a case study on breast cancer cell lines. Kirstine Belling, David Flores, Daniel Elias, Jan Stenvang, Jun Wang, Nils Brünner, Henrik Ditzel and Ramneek Gupta.
Coalescing discordant read mapping signatures for structural variant breakpoint detection. Ryan M. Layer, Aaron R. Quinlan and Ira M. Hall.
Analysis of pandemic (H1N1) 2009 Influenza A virus circulating in Mexico during the 2011-2012 season by ultra-deep sequencing. Joanna Ortiz Alcantara, Elizabeth González Durán, Araceli Rodriguez Castillo, Fabiola Garcés Ayala, José Miguel Segura Candelas, Claudia Wong Arámbula, Patricia Alcántara Pérez, Abril Rodríguez, Brisia Rodríguez, Juan Carlos Del Mazo, Susana Serrano, Gisela Barrera Badillo, Irma López Martínez, Lucía Hernández Rivas, Hugo López-Gatell, Celia Alpuche Aranda and José Ernesto Ramírez González.
YAHA: fast and flexible long-read alignment with optimal breakpoint detection. Gregory Faust and Ira Hall.
Efficient and error-tolerant sequencing read mapping. Norbert Dojer and Piotr Jaroszyński.
Identifying genomic copy number alteration and loss of heterozygosity in next-generation sequence data. John R. McPherson, Yingting Wu, Patrick Tan and Steve Rozen.
Algorithms to find mutated pathways in cancer. Fabio Vandin, Hsin-Ta Wu, Eli Upfal and Ben Raphael.
Torrent Variant Caller: it’s all about speed, accuracy, and long indels. Dumitru Brinza, Zheng Zhang, Eric Tsung, Charles Scafe, Onur Sakarya, Alexander Joyner, Sowmi Utiramerur, Guy Del Mistro, Fiona Hyland and Ellen Beasley.
The GEM toolkit: world-class short read mapping, and more. Santiago Marco Sola and Paolo Ribeca.
Strategies for sequencing and analysis of low-diversity samples. Maga Rowicka and Abhishek Mitra.
Instant-Seq: an integrated tool with web interface for fast analysis of ChIP-Seq data. Abhishek Mitra and Maga Rowicka.
Quasispecies spectrum reconstruction using multi-commodity flows. Nicholas Mancuso, Bassam Tork, Pavel Skums, Ion Mandoiu and Alex Zelikovsky.
On the comparison of sets of alternative transcripts. Aida Ouangraoua, Krister Swenson and Anne Bergeron.
CNVeM: Copy number variation detection using uncertainty of read mapping. Zhanyong Wang, Farhad Hormozdiari, Wen-Yun Yang, Eran Halperin and Eleazar Eskin.
GRAPE RNAseq analysis pipeline environment. David Gonzalez-Knowles, Maik Roder, Angelika Merkel and Roderic Guigo.

2011

Epigenetics of atherosclerosis. Lauren Mills, Brian Wamhoff, Brett Blackman, Aaron Mackey and Jessica Connelly.
Genome-scale analysis of promoter melting in eukaryotic gene transcription. Fedor Kouzine, Damian Wójtowicz, Arito Yamane, Wolfgang Resch, Teresa M. Przytycka, David Levens and Rafael Casellas.
Haplotype discovery based on unassembled sequences estimation. Serghei Mangul and Alex Zelikovsky.
RGASP evaluation of RNA-seq read alignment algorithms. Andre Kahles, Regina Bohnert, Paolo Ribeca, Jonas Behr and Gunnar Raetsch.
Screening for transposable element-induced adaptations in Drosophila melanogaster using next-gen sequencing data. Anna-Sophie Fiston-Lavier, Dmitri Petrov and Josefa Gonzalez.
SeqMDD: symbolic data structures for accurate mapping. Marco Beccuti, Francesca Cordero, Susanna Donatelli and Raffaele Calogero.
SlideSort: A fast and exact tool for finding all similar pairs from next-generation sequencing data. Kana Shimizu and Koji Tsuda.
Smarti - A fast short read alignment algorithm. Florian Schatz, Sascha Möller and Manfred Schimmler.
Toward assessing the quality of de novo assembly. Rasiah Loganantharaj.