Datasets are typically ChIP-seq samples, but can also come from e.g. DIP-seq, RIP-seq, RNA-seq, ATAC-seq, or CAGE-seq.

Next-generation sequencing produces a large number of individual sequences that are typically 30-100 bp long. These sequences are commonly called reads.

Datasets need to be mapped (a.k.a. aligned) to a reference genome before they can be imported into EaSeq, because EaSeq cannot use unmapped reads, which lack coordinates. We have a guide on how to do this here.

During import, the only information used by EaSeq is the genomic coordinate, and other information such as sequence or quality metrics is discarded.
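As an illustration of what "keeping only the genomic coordinate" means, the sketch below reads one line of a tab-separated bed-file and retains just the position, dropping the name and score fields. It is not EaSeq's own importer: the names `Coordinate` and `parse_bed_line` are hypothetical, and treating the strand as part of the coordinate is an assumption made here for the example.

```python
from typing import NamedTuple, Optional


class Coordinate(NamedTuple):
    """Hypothetical container for the position of one mapped read."""
    chrom: str
    start: int
    end: int
    strand: Optional[str]  # assumption: strand is kept when the file provides it


def parse_bed_line(line: str) -> Optional[Coordinate]:
    """Return the coordinate of one bed line, or None for headers and blank lines."""
    stripped = line.strip()
    # Skip empty lines, comments, and UCSC-style header lines.
    if not stripped or stripped.startswith(("#", "track", "browser")):
        return None
    fields = stripped.split("\t")
    if len(fields) < 3:
        return None  # a bed line needs at least chromosome, start, and end
    # The strand, when present, is the sixth bed column.
    strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else None
    # Name (column 4) and score (column 5) are deliberately discarded.
    return Coordinate(fields[0], int(fields[1]), int(fields[2]), strand)
```

For example, `parse_bed_line("chr1\t100\t136\tread1\t60\t+")` keeps the chromosome, start, end, and strand, while the read name and score are not stored.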

EaSeq allows import of two distinct types of Datasets: Read-based Datasets (e.g. bed-files, bam-files, and aln-files) and Coverage-based Datasets (wig-files and bedgraph/bg-files).
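The two Dataset types can be told apart by file extension before import. The sketch below is a minimal helper for sorting files into the two groups named above. The function name `classify_dataset` and the exact extension spellings are assumptions made for illustration, and EaSeq itself may recognise further formats or naming variants.

```python
import os
from typing import Optional

# Extensions taken from the formats listed above; the spellings are assumptions.
READ_BASED = {".bed", ".bam", ".aln"}
COVERAGE_BASED = {".wig", ".bedgraph", ".bg"}


def classify_dataset(filename: str) -> Optional[str]:
    """Return 'read-based', 'coverage-based', or None for an unlisted extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in READ_BASED:
        return "read-based"
    if extension in COVERAGE_BASED:
        return "coverage-based"
    return None  # e.g. unmapped fastq files, which cannot be imported as-is
```

A file such as `sample.bam` would be classified as read-based and `signal.bedgraph` as coverage-based, while an unmapped `reads.fastq` returns `None`.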

