Steps for Clustering an Organism's Sequences

1. Selection of the organism's sequences (ESTs and cDNAs) from EMBL
2. Annotation of vector sequences, repeats, and low-quality regions
3. Sequence clipping
4. Clustering of mRNAs and related ESTs
5. All-against-all similarity search between the remaining ESTs using QUASAR
6. Clustering of the remaining ESTs
7. Assembly of all clusters and generation of Staden projects
8. Extraction of consensus sequences for each cluster (contig)
9. Selection of a representative clone per cluster
10. Picking of the non-redundant clone set
11. Annotation of the resulting clusters (contigs)
12. Generation of the presentation for the Web
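
The step that turns all-against-all similarity hits into clusters can be sketched as single-linkage grouping with a union-find structure. This is only an illustration under stated assumptions: the source does not say which clustering rule is applied to the QUASAR hits, and the function name, the EST identifiers, and the pair list are hypothetical.

```python
from collections import defaultdict


def cluster_by_similarity(sequence_ids, similar_pairs):
    """Group sequences by single linkage: two sequences share a cluster
    if a chain of pairwise similarities connects them.

    Assumption: every id in similar_pairs also appears in sequence_ids.
    """
    parent = {seq_id: seq_id for seq_id in sequence_ids}

    def find(seq_id):
        # Walk up to the root, shortening the path on the way.
        while parent[seq_id] != seq_id:
            parent[seq_id] = parent[parent[seq_id]]
            seq_id = parent[seq_id]
        return seq_id

    # Merge the clusters of every similar pair.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for seq_id in sequence_ids:
        clusters[find(seq_id)].append(seq_id)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical input: five ESTs and three similarity hits.
ests = ["EST1", "EST2", "EST3", "EST4", "EST5"]
hits = [("EST1", "EST2"), ("EST2", "EST3"), ("EST4", "EST5")]
print(cluster_by_similarity(ests, hits))
# prints [['EST1', 'EST2', 'EST3'], ['EST4', 'EST5']]
```

Single linkage is the simplest choice, but it can chain unrelated sequences together through a shared repeat or vector fragment, which is one reason the annotation and clipping steps come before clustering.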

Consensus of human sequences based on UniGene

Obtain clustering information from UniGene (file Hs.seq.all.Z)
Proceed with steps 2, 3, and 7 to 12 of the clustering procedure above
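
Reading cluster membership from the UniGene sequence file could look like the sketch below. It assumes the file has already been uncompressed and that each FASTA header carries `/ug=` (cluster identifier) and `/gb=` (GenBank accession) fields. That header layout is an assumption to be checked against the actual file, and the function name and example entries are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed header fields: "/ug=Hs.N" for the cluster, "/gb=ACC" for the accession.
UG_FIELD = re.compile(r"/ug=(\S+)")
GB_FIELD = re.compile(r"/gb=(\S+)")


def clusters_from_headers(lines):
    """Map each cluster identifier to the accessions listed under it."""
    clusters = defaultdict(list)
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a FASTA header
        ug = UG_FIELD.search(line)
        gb = GB_FIELD.search(line)
        if ug and gb:
            clusters[ug.group(1)].append(gb.group(1))
    return dict(clusters)


# Hypothetical entries; real accessions and descriptions will differ.
example = [
    ">gnl|UG|Hs#S0000001 example entry /gb=AA000001 /ug=Hs.1 /len=12",
    "ACGTACGTACGT",
    ">gnl|UG|Hs#S0000002 example entry /gb=AA000002 /ug=Hs.1 /len=8",
    "ACGTACGT",
    ">gnl|UG|Hs#S0000003 example entry /gb=AA000003 /ug=Hs.2 /len=8",
    "TTGCAACG",
]
print(clusters_from_headers(example))
# prints {'Hs.1': ['AA000001', 'AA000002'], 'Hs.2': ['AA000003']}
```

For a real file, the same function can be given an open file handle, since it only iterates over lines.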

Criteria for selection of 'optimal' clones

Clone availability at RZPD
Quality of the library
5' and 3' reads within the same cluster
Correct orientation of the clone
Distance between the 5' and 3' reads
Position in consensus sequence
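
One way to combine the criteria above is to compare clones lexicographically in the listed order. The sketch below does this under explicit assumptions that the source does not confirm: the criteria are applied strictly in that order, a larger 5'/3' read distance is preferred, and a position nearer the start of the consensus is preferred. The `Clone` fields, the quality scale, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    clone_id: str
    available: bool              # clone availability at RZPD
    library_quality: int         # hypothetical scale, higher is better
    both_reads_in_cluster: bool  # 5' and 3' reads fall in the same cluster
    orientation_ok: bool         # clone has the correct orientation
    read_distance: int           # distance between 5' and 3' reads (bases)
    consensus_position: int      # start position in the consensus sequence


def preference_key(clone):
    """Tuple compared left to right; larger tuples are preferred.

    Assumptions: longer read distance is better, and a smaller
    consensus position is better (hence the negation).
    """
    return (
        clone.available,
        clone.library_quality,
        clone.both_reads_in_cluster,
        clone.orientation_ok,
        clone.read_distance,
        -clone.consensus_position,
    )


def pick_representative(clones):
    """Return the most preferred clone of one cluster."""
    return max(clones, key=preference_key)


# Hypothetical candidates for a single cluster.
candidates = [
    Clone("c1", False, 3, True, True, 900, 10),
    Clone("c2", True, 2, True, True, 700, 50),
    Clone("c3", True, 2, True, True, 1200, 80),
]
print(pick_representative(candidates).clone_id)
# prints c3
```

Here c1 loses because it is not available, and c3 beats c2 on read distance before consensus position is ever compared. A weighted score would be an alternative if no single criterion should dominate.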