Reporting of bulk-RNA-Seq data analysis

Otoniel Maya

Warning!

This tutorial mainly focuses on three topics:

According to Conesa et al. (2019), a generic bulk RNA-Seq analysis includes three main steps:

Preprocessing: includes experimental design, sequencing design and quality control steps.
Core analysis: includes transcriptome profiling, differential gene expression, and functional profiling.
Advanced analysis: includes visualization, other RNA-Seq technologies, and data integration.

And you should keep a record of all of them, but…

FAIR stands for findability, accessibility, interoperability, and reusability of data.

The data refers to:

Raw sequencing files
Metadata in a plain text file
Expression profile files: quantification, annotation, and differential expression files in binary or text format.
Analysis source code files
Plots

Assign a globally unique and persistent identifier to your data (such as a DOI or UUID)
Describe your data with rich metadata that includes essential information.
Explicitly link metadata to the data they describe.
Ensure that your data are registered or indexed in a searchable resource.
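
As a rough illustration of the first three points above, the following Python sketch writes a small metadata "sidecar" file next to a data file. The function name `describe_file`, the field names, and the `.metadata.json` suffix are assumptions made for this example, not a standard. A UUID generated this way is unique but only becomes persistent if you store it and keep using it; a DOI would have to be obtained from a repository or registration agency instead.

```python
import hashlib
import json
import uuid
from pathlib import Path


def describe_file(data_path, description, extra=None):
    """Write a JSON sidecar describing one data file and return the record.

    The field names below are illustrative assumptions, not a formal schema.
    """
    data_path = Path(data_path)
    # The checksum ties the metadata to the exact bytes it describes.
    # Note: read_bytes() loads the whole file; very large FASTQ files
    # would be better hashed in chunks.
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    record = {
        "identifier": str(uuid.uuid4()),  # globally unique identifier
        "file_name": data_path.name,      # explicit link to the data file
        "sha256": digest,
        "size_bytes": data_path.stat().st_size,
        "description": description,
    }
    # Extra fields (organism, condition, replicate, ...) are supplied by the caller.
    record.update(extra or {})
    sidecar = data_path.with_name(data_path.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

Calling `describe_file("counts.tsv", "gene-level count table")` on an existing file would create `counts.tsv.metadata.json` in the same folder; the file name here is only a placeholder.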

Represent your data using a formal, shared, and broadly applicable language, format, and structure.

You are familiar with:

Have you tried to reproduce a bioinformatics Methods section?

Select the right shell: Bash
Use environments: renv, python environments (conda, pyenv, pipenv, poetry, pipx, etc.)
Select a good text editor/IDE
Use version control
Use notebooks: Rmd, Quarto, Jupyter, Pluto
Add a README file and, optionally, a LICENSE
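
One practical way to support the checklist above is to save the software versions alongside each report. The sketch below is a minimal Python example using only the standard library; the function name `environment_report` and the output layout are assumptions for illustration. In R, `sessionInfo()` or an `renv` lockfile serves a similar purpose.

```python
import platform
from importlib import metadata


def environment_report(packages):
    """Return a plain-text summary of the interpreter and selected packages."""
    lines = [
        f"python=={platform.python_version()}",
        f"platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            # metadata.version() looks up the installed distribution by name.
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            lines.append(f"{name}: not installed")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The package names here are placeholders; list the ones your analysis imports.
    print(environment_report(["numpy", "pandas"]))
```

Saving this text to a file next to the notebook, and committing it with version control, gives readers a concrete record of what was installed when the analysis ran.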

In RStudio, go to “File > New Project”
Click on “Version Control: Checkout a project from a version control repository”
Click on “Git: Clone a project from a repository”
Fill in the info: URL: https://github.com/ATGenomics/rnaseq_report
Browse to where you would like to create this folder: ~/workshop
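
For readers working outside RStudio, the same clone can be scripted. The sketch below assumes that `git` is installed and on the PATH and that, as in the RStudio dialog, the repository should end up in a subfolder of `~/workshop` named after the repository. The helper names `build_clone_command` and `clone` are hypothetical.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/ATGenomics/rnaseq_report"


def build_clone_command(url, parent_dir):
    """Return the git command and the target folder for cloning `url`."""
    parent = Path(parent_dir).expanduser()
    # Derive the folder name from the last part of the URL.
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[:-4]
    target = parent / name
    return ["git", "clone", url, str(target)], target


def clone(url=REPO_URL, parent_dir="~/workshop"):
    """Clone the repository; requires git and network access."""
    command, target = build_clone_command(url, parent_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(command, check=True)
    return target
```

Calling `clone()` performs the download; `build_clone_command` can be used on its own to inspect the command before running it.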