In an interdisciplinary team of three, we used Nextflow, a bioinformatics workflow manager, to replicate the single-cell RNA-sequencing (scRNA-seq) data processing of a popular 2020 paper published in Science Immunology. We successfully reproduced the Cell Ranger feature-barcode matrices for two large (200 GB+) original FASTQ scRNA-seq data sets published alongside the paper. These matrices are the main input for the downstream data analysis in R.
We carried out this project during the project phase of the BlueDot Biosecurity Fundamentals course, which I completed in winter 2023/24. Our goal was to learn about Nextflow and bioinformatics workflows in general. As part of the project, we completed the Nextflow Fundamentals program.
Within our team, I was responsible for setting up all server and software infrastructure. I also acquired the raw data, supervised all pipeline runs, and wrote large parts of our project report. I prepared and gave our presentation at the course closing event. We placed first in our project category, and BlueDot Impact awarded us the first prize of £500.