Publications

RNAcare: integrating clinical data with transcriptomic evidence using rheumatoid arthritis as a case study

Authors: Mingcan Tang, William Haese-Hill, Fraser Morton, Carl Goodyear, Duncan Porter, Stefan Siebert, Thomas D Otto

BMC Medical Genomics

Gene expression analysis is a crucial tool for uncovering the biological mechanisms that underlie differences between patient subgroups, offering insights that can inform clinical decisions. However, despite its potential, gene expression analysis remains challenging for clinicians due to the specialised skills required to access, integrate, and analyse large datasets. Existing tools primarily focus on RNA-Seq data analysis, providing user-friendly interfaces but often falling short in several critical areas: they typically do not integrate clinical data, lack support for patient-specific analyses, and offer limited flexibility in exploring relationships between gene expression and clinical outcomes in disease cohorts. Users, including clinicians who have a general knowledge of transcriptomics but may have limited programming experience, are increasingly seeking tools that go beyond traditional analysis. To overcome these issues, computational tools must incorporate advanced techniques, such as machine learning, to better understand how gene expression correlates with patient symptoms of interest.

paraCell: a novel software tool for the interactive analysis and visualization of standard and dual host–parasite single-cell RNA-seq data

Authors: Edward Agboraw, William Haese-Hill, Franziska Hentzschel, Emma Briggs, Dana Aghabi, Anna Heawood, Clare R Harding, Brian Shiels, Kathryn Crouch, Domenico Somma, Thomas D Otto

Nucleic Acids Research

Advances in sequencing technology have led to a dramatic increase in the number of single-cell transcriptomic datasets. In the field of parasitology, these datasets typically describe the gene expression patterns of a given parasite species at the single-cell level under experimental conditions, in specific hosts or tissues, or at different life cycle stages. However, while this wealth of available data represents a significant resource, analysing these datasets often requires expert computational skills, preventing a considerable proportion of the parasitology community from meaningfully integrating existing single-cell data into their work. Here, we present paraCell, a novel software tool that allows the user to visualize and analyse pre-loaded single-cell data without requiring any programming ability. The source code is freely available to allow remote installation. On our web server, we demonstrate how to visualize and re-analyse published Plasmodium and Trypanosoma datasets. We have also generated Toxoplasma–mouse and Theileria–cow scRNA-seq datasets to highlight the functionality of paraCell for pathogen–host interaction. The analysis of the data highlights the impact of the host interferon-γ response and gene expression profiles associated with disease susceptibility by these intracellular parasites, respectively.

Annotation and visualization of parasite, fungi and arthropod genomes with Companion

Authors: William Haese-Hill, Kathryn Crouch, Thomas D Otto

Nucleic Acids Research

As genome sequencing has become increasingly popular, the need for annotation of the resulting assemblies is growing. Structural and functional annotation is still challenging, as it includes finding the correct gene sequences, annotating other elements such as RNA, and being able to submit those data to databases to share them with the community. Compared to de novo assembly, where contiguous chromosomes are a sign of high quality, it is difficult to visualize and assess the quality of annotation. We developed the Companion web server to allow non-experts to annotate their genome using a reference-based method, enabling them to assess the output before submitting to public databases. In this update paper, we describe how we have included novel methods for gene finding and made the Companion server more efficient for annotation of genomes of up to 1 Gb in size. The reference set was increased to include genomes of interest for human and animal health from the fungi and arthropod kingdoms. We show that Companion outperforms existing comparable tools where closely related references are available.

peaks2utr: a robust Python tool for the annotation of 3′ UTRs

Authors: William Haese-Hill, Kathryn Crouch, Thomas D Otto

Bioinformatics

Annotation of nonmodel organisms is an open problem, especially the detection of untranslated regions (UTRs). Correct annotation of UTRs is crucial in transcriptomic analysis to accurately capture the expression of each gene, yet it is mostly overlooked in annotation pipelines. Here we present peaks2utr, an easy-to-use Python command line tool that uses the UTR enrichment of single-cell technologies, such as 10× Chromium, to accurately annotate 3′ UTRs for a given canonical annotation.