Data QC and storage
Data is initially stored on our local storage cluster, where quality control and primary data analysis are performed on our HPC cluster. Insight into the quality of your data is provided by the widely used FastQC and MultiQC tools. These tools generate HTML reports that are easy to access and contain intuitive graphs illustrating run performance and data quality. Data is demultiplexed and split into a FASTQ file for each sample, subject or patient. Raw data is stored for 3 months; data after QC (i.e. FASTQ files, BAM files, QC'd chip array data) is stored for one year. These storage services are included in the sequencing costs when sequencing is done by CFG. Services to store data for longer periods can be ordered separately. Contact us for more information.
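As a rough illustration of the retention periods above, the following Python sketch computes the dates until which raw and QC'd data would be kept. It assumes that the retention period starts on the day the data is first stored, which is not stated here, and the function names are hypothetical; contact us for the actual dates that apply to your project.

```python
import calendar
from datetime import date

# Retention periods taken from the text above.
RAW_RETENTION_MONTHS = 3   # raw data
QC_RETENTION_MONTHS = 12   # FASTQ, BAM and QC'd chip array data


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 30 November plus 3 months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def retention_deadlines(stored_on: date) -> dict[str, date]:
    """Assumption: retention is counted from the first day of storage."""
    return {
        "raw": add_months(stored_on, RAW_RETENTION_MONTHS),
        "qc": add_months(stored_on, QC_RETENTION_MONTHS),
    }


if __name__ == "__main__":
    for kind, deadline in retention_deadlines(date(2024, 11, 30)).items():
        print(kind, deadline.isoformat())
```

If you need data to remain available beyond these dates, order extended storage before the relevant deadline.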
Data analysis
We also provide a limited number of pipelines developed and maintained for use in diagnostics. For whole exome sequencing and whole genome sequencing, our variant calling pipeline is based on GATK. For deep and shallow whole genome sequencing, as well as for targeted sequencing, our copy number pipeline is based on QDNAseq, ExomeDepth or XHMM. A limma/edgeR-based pipeline is available for RNA-seq, and a Cell Ranger pipeline is available for single-cell RNA-seq data analysis. These pipelines are also available for research projects. In this way we provide our research community with validated, up-to-date analysis tools.
High performance computation services
In collaboration with the HPC Center of Expertise, we also offer advanced users access to an HPC compute cluster. Resources on this cluster are managed by SLURM, and the compute nodes are connected to a high-performance, low-latency BeeGFS storage cluster. A large collection of frequently used tools is readily available on the cluster and can be extended upon request. Contact Daoud Sie for more information on this cluster: d.sie@amsterdamumc.nl.
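For users new to SLURM, the Python sketch below shows one way a batch script for a FastQC and MultiQC run might be generated. The function name, file paths and resource values are hypothetical, and the sketch assumes that `fastqc` and `multiqc` are available on the compute nodes without further setup; the actual partitions, limits and software environment on the cluster may differ.

```python
import shlex


def build_fastqc_job(fastq_files, outdir, threads=4, mem_gb=8,
                     time_limit="02:00:00"):
    """Return the text of a SLURM batch script (hypothetical helper).

    The #SBATCH lines request CPUs, memory and wall time; the job then
    runs FastQC on the given files and summarises the reports with MultiQC.
    """
    quoted_out = shlex.quote(outdir)
    quoted_files = " ".join(shlex.quote(f) for f in fastq_files)
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=fastqc",
        f"#SBATCH --cpus-per-task={threads}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=fastqc_%j.log",  # %j expands to the job ID
        "",
        f"mkdir -p {quoted_out}",
        f"fastqc -t {threads} -o {quoted_out} {quoted_files}",
        f"multiqc {quoted_out} -o {quoted_out}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example file names are placeholders, not real project data.
    script = build_fastqc_job(
        ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"], "qc_reports"
    )
    print(script)
```

The printed script can be saved to a file and submitted with `sbatch`. Requesting only the resources a job needs generally helps it start sooner on a shared cluster.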