Configuration¶
Learn how to set up the configuration file that celseq2 requires as input.
The configuration file is global and reusable.
Why configuration¶
The global configuration file serves three purposes:
- Tell celseq2 how CEL-Seq2 was performed. For example, what are the sequences of the cell barcodes in use?
- Specify the bioinformatics details that celseq2 needs. For example, what is the absolute path to the genome annotation on your computer?
- Make the setup reusable. The same configuration file can be used again as long as the same experimental and bioinformatics protocols are applied to the same species.
Create a configuration file from template¶
celseq2 provides a shell command, new-configuration-file, that generates
a new configuration template for the user to fill in:
new-configuration-file -o /path/to/my_wonderful_config.yaml
How to specify configuration¶
Here is a real example of a global configuration file:
###########################
## CEL-seq2 Tech Setting
###########################
BC_INDEX_FPATH: 'yanailab/refs/barcodes/barcodes_cel-seq_umis96.tab'
BC_IDs_DEFAULT: '1-96'
UMI_LENGTH: 6
BC_LENGTH: 6
###########################
## Bowties Index
###########################
BOWTIE2_INDEX_PREFIX: 'yanailab/refs/danio_rerio/danRer10_87/genome/Danio_rerio.GRCz10.dna.toplevel'
BOWTIE2: '/local/apps/bowtie2/2.3.1/bowtie2'
###########################
## Annotations
###########################
GFF: 'yanailab/refs/danio_rerio/danRer10_87/gtf/Danio_rerio.GRCz10.87.gtf.gz'
###########################
## Demultiplexing
###########################
FASTQ_QUAL_MIN_OF_BC: 10
CUT_LENGTH: 35
###########################
## Alignment
###########################
ALIGNER: 'bowtie2'
###########################
## UMI Count
###########################
ALN_QUAL_MIN: 0
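Before launching a run, it can help to check that a filled-in configuration still contains every key shown in the example above. The following is a minimal sketch, not part of celseq2: `parse_flat_config`, `missing_keys`, and `EXPECTED_KEYS` are hypothetical names, the parser only understands flat `KEY: value` lines (a real YAML library is the better choice in practice), and whether celseq2 treats all of these keys as mandatory is an assumption.

```python
# Keys that appear in the example configuration above.
# Assumption: all of them are expected to be present; celseq2 itself may differ.
EXPECTED_KEYS = [
    "BC_INDEX_FPATH", "BC_IDs_DEFAULT", "UMI_LENGTH", "BC_LENGTH",
    "BOWTIE2_INDEX_PREFIX", "BOWTIE2", "GFF",
    "FASTQ_QUAL_MIN_OF_BC", "CUT_LENGTH", "ALIGNER", "ALN_QUAL_MIN",
]


def parse_flat_config(text):
    """Parse flat `KEY: value` lines. This is not a general YAML parser."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comment lines, and anything without a key separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]          # quoted string: drop the quotes
        elif value.lstrip("-").isdigit():
            value = int(value)           # bare integer such as 6 or 35
        config[key.strip()] = value
    return config


def missing_keys(config):
    """Return the expected keys that are absent from `config`, in order."""
    return [key for key in EXPECTED_KEYS if key not in config]


# A deliberately incomplete configuration, to show what the check reports.
partial_text = """
## CEL-seq2 Tech Setting
BC_INDEX_FPATH: 'yanailab/refs/barcodes/barcodes_cel-seq_umis96.tab'
BC_IDs_DEFAULT: '1-96'
UMI_LENGTH: 6
BC_LENGTH: 6
"""
config = parse_flat_config(partial_text)
print("parsed:", config)
print("missing:", missing_keys(config))
```

Running the check on the full example above should report no missing keys.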
Explanations of key parameters¶
BC_INDEX_FPATH¶
Absolute path to a space- or tab-separated file that lists the sequences of all cell barcodes.
Here are the first 11 lines of the file that BC_INDEX_FPATH points to:
#barcode_id sequence
1 AGACTC
2 AGCTAG
3 AGCTCA
4 AGCTTC
5 CATGAG
6 CATGCA
7 CATGTC
8 CACTAG
9 CAGATC
10 TCACAG
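A barcode file in this layout can be read with a few lines of Python, for instance to confirm that every sequence matches BC_LENGTH. This is only an illustrative sketch: `read_barcodes` is a hypothetical helper, not a celseq2 function, and treating lines that start with `#` as a header is an assumption based on the excerpt above.

```python
import io


def read_barcodes(handle, expected_length=None):
    """Read `barcode_id sequence` rows into a dict keyed by integer ID.

    Lines starting with '#' are treated as header/comment lines (assumption).
    Columns may be separated by spaces or tabs.
    """
    barcodes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        barcode_id, sequence = line.split()[:2]
        if expected_length is not None and len(sequence) != expected_length:
            raise ValueError(
                f"barcode {barcode_id} has length {len(sequence)}, "
                f"expected {expected_length}"
            )
        barcodes[int(barcode_id)] = sequence
    return barcodes


# In-memory stand-in for the first rows of the barcode file shown above.
sample = io.StringIO("#barcode_id\tsequence\n1\tAGACTC\n2\tAGCTAG\n3 AGCTCA\n")
barcodes = read_barcodes(sample, expected_length=6)  # 6 mirrors BC_LENGTH
print(barcodes)
```

With a real file, the same function can be given an open file handle, for example `open(path)` where `path` is the value of BC_INDEX_FPATH.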
UMI_LENGTH, BC_LENGTH, CUT_LENGTH¶
CEL-Seq2 uses paired-end sequencing. Read 1 records the sequences of the UMI and the
cell barcode, while read 2 records the sequence of the RNA transcript. celseq2
takes a subsequence of length CUT_LENGTH, starting
from the left-most end of
read 2, and this subsequence is what gets aligned.
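How the three length parameters carve up a read pair can be sketched as follows. `split_read_pair` is a hypothetical helper, not celseq2's own code, and the layout of read 1 (UMI first, cell barcode immediately after) is an assumption that the text above does not state; swap the two slices if your protocol orders them differently.

```python
def split_read_pair(read1, read2, umi_length=6, bc_length=6, cut_length=35):
    """Split a read pair into (umi, barcode, insert).

    Assumption: read 1 starts with the UMI, followed directly by the
    cell barcode. Defaults mirror the example configuration values.
    """
    umi = read1[:umi_length]
    barcode = read1[umi_length:umi_length + bc_length]
    # Keep only the left-most CUT_LENGTH bases of read 2 for alignment.
    insert = read2[:cut_length]
    return umi, barcode, insert


# Made-up sequences for illustration only.
read1 = "ACGTAC" + "AGACTC" + "TTTTTTTT"
read2 = "G" * 50
umi, barcode, insert = split_read_pair(read1, read2)
print(umi, barcode, len(insert))
```

If read 2 is shorter than CUT_LENGTH, the slice simply returns the whole read.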