Steps for Clustering an Organism's Sequences
- Selection of the organism's sequences (ESTs and cDNAs) from EMBL
- Annotation of vector sequences, repeats, and low-quality regions
- Sequence clipping
- Clustering of mRNAs and related ESTs
- Search for all-against-all similarities between remaining ESTs using QUASAR
- Clustering of remaining ESTs
- Assembly of all clusters, generation of Staden projects
- Extraction of consensus sequences for each cluster (contig)
- Selection of a representative clone per cluster
- Picking of the non-redundant clone-set
- Annotation of the resulting clusters (contigs)
- Generation of Presentation for the Web
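
The step "Clustering of remaining ESTs" can be illustrated with a minimal sketch. It assumes that the all-against-all search has already been reduced to a list of (id, id) pairs judged similar, and that clusters are formed by single linkage (any chain of hits joins two sequences). Both points are assumptions for illustration; the outline does not state the linkage rule, and the pair list is a hypothetical stand-in, not the QUASAR output format.

```python
from collections import defaultdict


def cluster_by_similarity(seq_ids, similar_pairs):
    """Group sequences so that any chain of similarity hits shares a cluster.

    seq_ids: iterable of sequence identifiers (hypothetical names).
    similar_pairs: iterable of (id_a, id_b) tuples, assumed to come from a
    previously filtered all-against-all similarity search.
    """
    # Union-find: every sequence starts as its own cluster root.
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every similar pair.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect members per root; sequences without hits stay singletons.
    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    return sorted(sorted(members) for members in clusters.values())


ests = ["est1", "est2", "est3", "est4", "est5"]
hits = [("est1", "est2"), ("est2", "est3"), ("est4", "est5")]
print(cluster_by_similarity(ests, hits))
# prints [['est1', 'est2', 'est3'], ['est4', 'est5']]
```

Single linkage is the simplest choice, but it can chain unrelated sequences together through chimeric or repeat-containing ESTs, which is one reason the annotation and clipping steps come first.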
Consensus of human sequences based on Unigene
- Obtain Clustering Information from Unigene
  Unigene - Hs.seq.all.Z
- Proceed with steps 2, 3, 7, 8, 9, 10, 11, 12 of the clustering procedure above
Criteria for selection of 'optimal' clones
- Clone availability at RZPD
- Quality of Library
- 5' and 3' read within the same cluster
- Correct orientation of clone
- Distance between 5' and 3' read
- Position in consensus sequence
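
One way to apply the criteria above is to compare candidate clones lexicographically, in the order the criteria are listed. The sketch below is an illustration only: the `Clone` fields, the integer library-quality score, and the preference directions (a larger distance between the 5' and 3' reads is better, a position closer to the start of the consensus is better) are all assumptions, since the outline names the criteria but not how they are scored or weighted.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    # All field names are hypothetical, chosen to mirror the listed criteria.
    clone_id: str
    available: bool              # clone availability at RZPD
    library_quality: int         # assumed score, higher is better
    both_reads_in_cluster: bool  # 5' and 3' read within the same cluster
    orientation_ok: bool         # correct orientation of clone
    read_distance: int           # distance between 5' and 3' read (assumed: larger is better)
    consensus_start: int         # position in consensus (assumed: smaller is better)


def rank_key(clone):
    """Sort key: criteria compared one after another, in listed order."""
    return (
        clone.available,
        clone.library_quality,
        clone.both_reads_in_cluster,
        clone.orientation_ok,
        clone.read_distance,
        -clone.consensus_start,  # negated so that a smaller start ranks higher
    )


def pick_representative(clones):
    """Return the best-ranked clone of one cluster."""
    return max(clones, key=rank_key)


candidates = [
    Clone("cloneA", True, 2, True, True, 900, 120),
    Clone("cloneB", True, 2, True, True, 1400, 40),
    Clone("cloneC", False, 3, True, True, 2000, 0),
]
print(pick_representative(candidates).clone_id)
# prints cloneB
```

A strict lexicographic order means an unavailable clone never wins over an available one, however good its other properties; a weighted score would be an alternative if the criteria should trade off against each other.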