Determining which Illumina sequencing platform produced your NGS data

For the latest classification, see the link above.

| Leading characters of the instrument ID | Sequencing platform |
|----|----|
| HWI-M[0-9]{4}$ | MiSeq |
| HWUSI | Genome Analyzer IIx |
| M[0-9]{5}$ | MiSeq |
| HWI-C[0-9]{5}$ | HiSeq 1500 |
| C[0-9]{5}$ | HiSeq 1500 |
| HWI-D[0-9]{5}$ | HiSeq 2500 |
| D[0-9]{5}$ | HiSeq 2500 |
| J[0-9]{5}$ | HiSeq 3000 |
| K[0-9]{5}$ | HiSeq 3000 (now rarely used), HiSeq 4000 |
| E[0-9]{5}$ | HiSeq X |
| NB[0-9]{6}$ | NextSeq |
| NS[0-9]{6}$ | NextSeq |
| MN[0-9]{5}$ | MiniSeq |
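
The lookup in the table above can be sketched in Python. This is only an illustration: the names `INSTRUMENT_PATTERNS` and `guess_platforms` are hypothetical, and the choice of `re.match` (which anchors each pattern at the start of the ID) is an assumption based on the table describing leading characters.

```python
import re

# (pattern, platforms) pairs transcribed from the table above.
# The list name is hypothetical, chosen for this sketch only.
INSTRUMENT_PATTERNS = [
    (r"HWI-M[0-9]{4}$", ["MiSeq"]),
    (r"HWUSI", ["Genome Analyzer IIx"]),
    (r"M[0-9]{5}$", ["MiSeq"]),
    (r"HWI-C[0-9]{5}$", ["HiSeq 1500"]),
    (r"C[0-9]{5}$", ["HiSeq 1500"]),
    (r"HWI-D[0-9]{5}$", ["HiSeq 2500"]),
    (r"D[0-9]{5}$", ["HiSeq 2500"]),
    (r"J[0-9]{5}$", ["HiSeq 3000"]),
    (r"K[0-9]{5}$", ["HiSeq 3000", "HiSeq 4000"]),
    (r"E[0-9]{5}$", ["HiSeq X"]),
    (r"NB[0-9]{6}$", ["NextSeq"]),
    (r"NS[0-9]{6}$", ["NextSeq"]),
    (r"MN[0-9]{5}$", ["MiniSeq"]),
]


def guess_platforms(instrument_id):
    """Return every platform whose pattern matches from the start of the ID."""
    platforms = []
    for pattern, names in INSTRUMENT_PATTERNS:
        # Assumption: re.match anchors at the start of the string, so only
        # the leading characters of the instrument ID are compared.
        if re.match(pattern, instrument_id):
            for name in names:
                if name not in platforms:
                    platforms.append(name)
    return platforms


print(guess_platforms("E00552"))  # ['HiSeq X']
print(guess_platforms("A00262"))  # [] (prefix A is not in the table)
```

An empty result only means that the prefix is missing from this table, not that the ID is invalid.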

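Flow cell classification (each entry maps a flow cell ID pattern to the possible platforms and the flow cell type)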
         "C[A-Z,0-9]{4}ANXX$" : (["HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v4 flow cell"),
         "C[A-Z,0-9]{4}ACXX$" : (["HiSeq 1000", "HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v3 flow cell"),
         "H[A-Z,0-9]{4}ADXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v1 flow cell"),
         "H[A-Z,0-9]{4}BCXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
         "H[A-Z,0-9]{4}BCXY$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
         "H[A-Z,0-9]{4}BBXX$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
         "H[A-Z,0-9]{4}BBXY$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
         "H[A-Z,0-9]{4}CCXX$" : (["HiSeq X"], "(8-lane) flow cell"),
         "H[A-Z,0-9]{4}CCXY$" : (["HiSeq X"], "(8-lane) flow cell"),
         "H[A-Z,0-9]{4}ALXX$" : (["HiSeq X"], "(8-lane) flow cell"),
         "H[A-Z,0-9]{4}BGXX$" : (["NextSeq"], "High output flow cell"),
         "H[A-Z,0-9]{4}BGXY$" : (["NextSeq"], "High output flow cell"),
         "H[A-Z,0-9]{4}BGX2$" : (["NextSeq"], "High output flow cell"),
         "H[A-Z,0-9]{4}AFXX$" : (["NextSeq"], "Mid output flow cell"),
         "A[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
         "B[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
         "D[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq nano flow cell"),
         "G[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq micro flow cell"),
         "H[A-Z,0-9]{4}DMXX$" : (["NovaSeq"], "S2 flow cell")}

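As a rough sketch of how the entries above might be used, the snippet below copies a few of them into a dictionary and tests a flow cell ID against each pattern. The names `FLOWCELL_PATTERNS` and `lookup_flowcell` are hypothetical, and using `re.fullmatch` is an assumption of this sketch: it requires the whole ID to match, so only the nine-character patterns are included and the shorter MiSeq patterns are left out.

```python
import re

# A subset of the entries listed above; the dictionary name is hypothetical.
FLOWCELL_PATTERNS = {
    "C[A-Z,0-9]{4}ANXX$": (["HiSeq 1500", "HiSeq 2000", "HiSeq 2500"],
                           "High Output (8-lane) v4 flow cell"),
    "H[A-Z,0-9]{4}BBXY$": (["HiSeq 4000"], "(8-lane) v1 flow cell"),
    "H[A-Z,0-9]{4}CCXY$": (["HiSeq X"], "(8-lane) flow cell"),
    "H[A-Z,0-9]{4}BGXY$": (["NextSeq"], "High output flow cell"),
    "H[A-Z,0-9]{4}DMXX$": (["NovaSeq"], "S2 flow cell"),
}


def lookup_flowcell(flowcell_id):
    """Return (platforms, flow cell type) for the first matching pattern, else None."""
    for pattern, result in FLOWCELL_PATTERNS.items():
        # Assumption: the whole flow cell ID must match the pattern.
        if re.fullmatch(pattern, flowcell_id):
            return result
    return None


print(lookup_flowcell("H23NGCCXY"))  # (['HiSeq X'], '(8-lane) flow cell')
print(lookup_flowcell("HTG2NDSXX"))  # None (suffix DSXX is not in the list)
```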
Use zless to view the beginning of the raw sequencing file: zless sample.fastq.gz|head -5

@E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG
NTTTGCTAAACGGAAGGACTAAAGTAGGAACTGATTGGCTTTAGTCTCTAGTCTCTCACATGGGTGCTAAAAGGGACTAGAGGGTAACATTTACTCCAATTGCCTTTGCCTAGAGTTGGAATATAATATAAGTGAATTGTCCACCTTCTT
+
#AAFAFJAJJ-FFFJJJ7JJJFJJJJJFJJJJ<FFFAJJJJFJJJJJJJJJJFAJ<AJJFJJJJ-FF7FJJJJJJJJF<FJJJJAFAJFFFJJJJJJJFJ-FJJJJFJ<J-FJFF-7AF7FJF7FJJ7FAFJ-<<7<-AAJJJ<JA-F<-
@E00552:40:H23NGCCXY:5:1101:2777:1520 1:N:0:NCAGTG

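The instrument ID (E00552) starts with E, which indicates a HiSeq X, and the flow cell ID (H23NGCCXY) ends in CCXY, which corresponds to the HiSeq X (8-lane) flow cell.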
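
The same inspection can be done programmatically. The sketch below assumes the colon-separated header layout seen in the example (instrument, run number, flow cell ID, lane, tile, x, y, followed by a space and the read information). The function names `parse_header` and `first_header` are hypothetical.

```python
import gzip


def parse_header(header):
    """Split a FASTQ header like the one above into its leading fields."""
    if not header.startswith("@"):
        raise ValueError("not a FASTQ header line: %r" % header)
    # Drop the leading '@' and the part after the first space (read info).
    name = header[1:].split()[0]
    fields = name.split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields, got %d" % len(fields))
    instrument, run, flowcell, lane = fields[:4]
    return {
        "instrument": instrument,
        "run": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
    }


def first_header(path):
    """Return the first line of a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return handle.readline().rstrip("\n")


info = parse_header("@E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG")
print(info["instrument"], info["flowcell"], info["lane"])  # E00552 H23NGCCXY 5
```

The instrument and flow cell fields can then be compared against the patterns listed earlier. Headers in older formats will raise `ValueError` here rather than being parsed incorrectly.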

Example 2: zless sample2.fastq.gz|head -5

@A00262:358:HTG2NDSXX:2:1101:1127:1031 1:N:0:GTTATA+GTTATAC
GNCTACATTTACCTAGCATTTTTCTTCTATCTTACATAGTTTTTGGGTAAACATACTATCCTTATGAGCATTGGGTGTAATGTTTGTTGTTTTATGTTGATTGCTTATTTGGGTAGAAATGACTAACCTATGCTTCATTCCTGCGGATGG
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,FFFF,FFFF,F:FFF:FFF,FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00262:358:HTG2NDSXX:2:1101:1181:1031 1:N:0:GTTATA+GTTATAC