SHREC 2007-Protein Challenge
Participants
Two groups participated in this track, one from Purdue and one from ITI. The group from ITI participated with two different methods, Trace and Graph.
Each group submitted a ranked list for each of the 30 unknown protein structures, together with the distance of each query to each of the 633 proteins in the training set, as computed by their method. The SCOP classification [1] was considered the ground truth. Only the ATOM section of the PDB [2] files was provided.
We also compared the results to the classification achieved by our own method (LMB, Germany) [3]. Since we organized the track, our results are out of competition.
Methods
Li et al. focus on the topology of each protein: they use STRIDE [4] to detect the secondary structure, including the hydrogen bonds. They then compute the beta sheets (beta strands connected by hydrogen bonds) and their order. For the main classes a, b, c, d, and g, and for the folds of classes a and g, they used the length and percentage of alpha helix and beta strand to classify. For each fold in classes b, c, and d, they used the orders to classify.
P. Daras and V. Tsatsaias submitted two ranked lists computed with two different methods. The first method (Trace) is described in [5]. The second method (Graph) is called '3D Protein Classification Using Topological and Geometrical Information'. The 3D objects are first segmented according to their molecular structure. Descriptors are then extracted for each segment using spherical harmonics, and graphs are constructed for the molecules. Finally, a sub-graph matching procedure provides the final similarity distances between the graphs.
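The helix and strand statistics that Li et al. rely on can be illustrated with a minimal sketch. It assumes a per-residue secondary-structure string in which `H` marks alpha helix and `E` marks beta strand, following the one-letter convention used by tools such as STRIDE. The function name and the returned keys are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter


def secondary_structure_stats(ss: str) -> dict:
    """Return chain length and helix/strand fractions.

    `ss` is assumed to hold one secondary-structure letter per residue,
    with 'H' for alpha helix and 'E' for beta strand (an assumption,
    not something specified by the track).
    """
    if not ss:
        raise ValueError("empty secondary-structure string")
    counts = Counter(ss)
    n = len(ss)
    return {
        "length": n,
        "helix_fraction": counts["H"] / n,
        "strand_fraction": counts["E"] / n,
    }


# Hypothetical 10-residue assignment: 4 helix residues, 2 strand residues.
stats = secondary_structure_stats("CCHHHHEECC")
```

Features of this kind are cheap to compute, which matches the description of the Purdue features as simple.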
Evaluation
The ranked lists were evaluated by the following simple method: the nearest neighbor in the ranked list, meaning the protein domain with the smallest distance to the query protein, is considered, and the query protein is assigned to its class. Two points are scored for the correct SCOP fold, one point if only the SCOP class is correct, and zero points if neither is correct. The maximum score is 60, reached when the fold of every query protein is classified correctly.
Of the three submitted methods, the one from the Purdue team performed best (total score 45), even though it uses simple features. Both methods submitted by the ITI team misclassified about half of the query proteins (15 for Trace, 14 for Graph); their better method, Graph, scored 29 points. The LMB method, which is out of competition, achieved an even better classification, with a total score of 52.
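The scoring rule described above can be sketched in a few lines. This is an illustrative sketch, not the organizers' evaluation script: the function names are hypothetical, SCOP fold identifiers are assumed to have the form `class.fold` (for example `c.55`), and the distances in the usage example are made up.

```python
from typing import Mapping


def scop_class(fold_id: str) -> str:
    """Return the class letter of a fold identifier such as 'c.55'."""
    return fold_id.split(".")[0]


def score_prediction(true_fold: str, predicted_fold: str) -> int:
    """2 points for the correct fold, 1 for the correct class only, else 0."""
    if predicted_fold == true_fold:
        return 2
    if scop_class(predicted_fold) == scop_class(true_fold):
        return 1
    return 0


def nearest_neighbor(distances: Mapping[str, float]) -> str:
    """Return the training protein with the smallest distance to the query."""
    return min(distances, key=distances.get)


def score_query(true_fold: str,
                distances: Mapping[str, float],
                training_folds: Mapping[str, str]) -> int:
    """Score one query from its distances to the training proteins."""
    return score_prediction(true_fold, training_folds[nearest_neighbor(distances)])


# Hypothetical distances for a query whose true fold is c.55.
example_distances = {"1itg": 0.2, "1npk": 0.9}
example_folds = {"1itg": "c.55", "1npk": "d.58"}
example_score = score_query("c.55", example_distances, example_folds)
```

Summing `score_query` over all 30 query proteins gives a method's total, with 60 as the maximum.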
The query set was chosen randomly from the 27 SCOP folds. Some proteins consisted of only one domain; others (e.g. Protein2, Protein8, Protein23) consisted of several domains, all of which belonged to the same fold. The size of the protein domains ranged from 31 amino acids (Protein11) to 364 amino acids (Protein16).
References
[1] A.G. Murzin, S.E. Brenner, T. Hubbard and C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. 247, pp. 536-540, 1995.
[2] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov and P.E. Bourne, The Protein Data Bank, Nucleic Acids Research, Vol. 28, pp. 235-242, 2000.
[3] M. Temerinac, M. Reisert and H. Burkhardt, Invariant Features for Searching in Protein Fold Databases, International Journal on Computer Mathematics, 'Special Issue on Bioinformatics', to appear 2007.
[4] D. Frishman and P. Argos, Knowledge-Based Protein Secondary Structure Assignment, Proteins: Structure, Function, and Genetics, Vol. 23, pp. 566-579, 1995.
[5] P. Daras, D. Zarpalas, A. Axenopoulos, D. Tzovaras and M.G. Strintzis, Three-Dimensional Shape-Structure Comparison Method for Protein Classification, IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 3, No. 3, pp. 193-207, July 2006.
Group | Wrong classification (0 points) | Correct SCOP class only (1 point) | Correct SCOP fold (2 points) | Total score |
Purdue | 5 | 5 | 20 | 5*1+20*2=45 |
ITI(Trace) | 15 | 8 | 7 | 8*1+7*2=22 |
ITI(Graph) | 14 | 3 | 13 | 3*1+13*2=29 |
LMB | 2 | 4 | 24 | 4*1+24*2=52 |
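As a quick consistency check, the totals in the table above can be recomputed from the three counts per group. The sketch below is only an illustration; the dictionary layout and function name are assumptions, while the counts are copied from the table.

```python
# (wrong, correct class only, correct fold) per group, copied from the table.
counts = {
    "Purdue": (5, 5, 20),
    "ITI(Trace)": (15, 8, 7),
    "ITI(Graph)": (14, 3, 13),
    "LMB": (2, 4, 24),
}


def total_score(wrong: int, class_only: int, fold: int) -> int:
    """Wrong answers score 0, class-only hits score 1, fold hits score 2."""
    return wrong * 0 + class_only * 1 + fold * 2


totals = {group: total_score(*c) for group, c in counts.items()}
# Every group must account for all 30 query proteins.
queries_per_group = {group: sum(c) for group, c in counts.items()}
```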
The table below lists, for each group, the predicted SCOP fold and the resulting score for each query protein, according to the nearest neighbor.
Query Protein | ProteinID | SCOP ID | Purdue | ITI(Trace) | ITI(Graph) | LMB |
Protein0 | 1agt | g.3 | b.40 / 0 | g.3 / 2 | g.3 / 2 | g.3 / 2 |
Protein1 | 1b0b | a.1 | a.1 / 2 | c.2 / 0 | a.1 / 2 | a.1 / 2 |
Protein2 | 1c6vA | c.55 | d.58 / 0 | c.94 / 1 | b.40 / 0 | c.55 / 2 |
Protein3 | 1cch | a.3 | a.3 / 2 | d.169 / 0 | a.3 / 2 | a.3 / 2 |
Protein4 | 1cor | a.3 | a.3 / 2 | a.3 / 2 | a.3 / 2 | a.3 / 2 |
Protein5 | 1dp4 | c.93 | c.69 / 1 | c.37 / 1 | b.40 / 0 | c.69 / 1 |
Protein6 | 1dyzA | b.6 | b.6 / 2 | a.1 / 0 | b.6 / 2 | b.6 / 2 |
Protein7 | 1e9m | d.15 | b.6 / 0 | a.1 / 0 | c.3 / 0 | d.15 / 2 |
Protein8 | 1eq2B | c.2 | c.2 / 2 | c.93 / 1 | a.3 / 0 | c.2 / 2 |
Protein9 | 1eylA | b.42 | b.42 / 2 | b.60 / 1 | c.2 / 0 | b.42 / 2 |
Protein10 | 1fe0A | d.58 | a.4 / 0 | a.39 / 0 | d.58 / 2 | d.58 / 2 |
Protein11 | 1g26 | g.3 | g.3 / 2 | a.26 / 0 | b.34 / 0 | g.3 / 2 |
Protein12 | 1gcpA | b.34 | b.40 / 1 | b.1 / 1 | b.34 / 2 | b.34 / 2 |
Protein13 | 1gglA | b.60 | b.60 / 2 | a.3 / 0 | b.60 / 2 | b.60 / 2 |
Protein14 | 1gqzA | d.19 | d.58 / 1 | c.1 / 0 | c.2 / 0 | c.2 / 0 |
Protein15 | 1gyvA | b.1 | b.47 / 1 | b.1 / 2 | c.93 / 0 | b.7 / 1 |
Protein16 | 1icp | c.1 | c.1 / 2 | a.24 / 0 | a.4 / 0 | c.1 / 2 |
Protein17 | 1ihmA | b.121 | b.121 / 2 | c.3 / 0 | d.58 / 0 | b.121 / 2 |
Protein18 | 1il6 | a.26 | a.26 / 2 | b.6 / 0 | c.69 / 0 | a.26 / 2 |
Protein19 | 1jjf | c.69 | c.69 / 2 | c.23 / 1 | c.37 / 1 | c.37 / 1 |
Protein20 | 1jr6 | c.37 | d.15 / 0 | b.6 / 0 | c.2 / 1 | c.23 / 1 |
Protein21 | 1jzmA | a.1 | a.1 / 2 | a.1 / 2 | a.1 / 2 | a.1 / 2 |
Protein22 | 1kt7 | b.60 | b.60 / 2 | b.40 / 1 | b.60 / 2 | b.60 / 2 |
Protein23 | 1mi3 | c.1 | c.1 / 2 | c.47 / 1 | c.1 / 2 | c.1 / 2 |
Protein24 | 1mi3A | c.1 | c.1 / 2 | c.1 / 2 | c.23 / 1 | c.1 / 2 |
Protein25 | 1pruA | a.4 | a.4 / 2 | g.3 / 0 | b.1 / 0 | a.4 / 2 |
Protein26 | 1rfjA | a.39 | a.39 / 2 | a.39 / 2 | a.39 / 2 | a.39 / 2 |
Protein27 | 1vavA | b.29 | b.47 / 1 | c.69 / 0 | d.15 / 0 | b.29 / 2 |
Protein28 | 1wat | a.24 | a.24 / 2 | c.37 / 0 | c.37 / 0 | d.58 / 0 |
Protein29 | 1xnc | b.29 | b.29 / 2 | b.29 / 2 | b.29 / 2 | b.29 / 2 |
Total: | | | 45/60 | 22/60 | 29/60 | 52/60 |
In the table below, each results cell gives the PDB ID of the query protein followed by its nearest neighbor for the method named in the column header. On the original page, the query ID links to the full ranked list computed with that method, and the nearest neighbor is shown in green.
Query Protein | ProteinID | SCOP ID | Results(Purdue) | Results(ITI(Trace)) | Results(ITI(Graph)) | Results(LMB) |
Protein0 | 1agt | g.3 | 1agt 1mjc | 1agt 1tsk | 1agt 1ktx | 1agt 2crd |
Protein1 | 1b0b | a.1 | 1b0b 1flp | 1b0b 1bdma1 | 1b0b 1eca | 1b0b 1flp |
Protein2 | 1c6vA | c.55 | 1c6vA 1npk | 1c6vA 1omp | 1c6vA 1igp | 1c6vA 1itg |
Protein3 | 1cch | a.3 | 1cch 1dvh | 1cch 1prtc2 | 1cch 351c | 1cch 1cor |
Protein4 | 1cor | a.3 | 1cor 1cyi | 1cor 2pac | 1cor 351c | 1cor 2pac |
Protein5 | 1dp4 | c.93 | 1dp4 1crl | 1dp4 1mmd_2 | 1dp4 1kab | 1dp4 1pea |
Protein6 | 1dyzA | b.6 | 1dyzA 1aiza | 1dyzA 1gdlo1 | 1dyzA 1aiza | 1dyzA 2aza |
Protein7 | 1e9m | d.15 | 1e9m 1jer | 1e9m 1aofa1 | 1e9m 1coy_1 | 1e9m 1put |
Protein8 | 1eq2B | c.2 | 1eq2B 1hrda1 | 1eq2B 2dri | 1eq2B 351c | 1eq2B 1xel |
Protein9 | 1eylA | b.42 | 1eylA 1wba | 1eylA 1mup | 1eylA 1cyda | 1eylA 1tie |
Protein10 | 1fe0A | d.58 | 1fe0A 1cgpa1 | 1fe0A 4icb | 1fe0A 1afi | 1fe0A 1fwp |
Protein11 | 1g26 | g.3 | 1g26 1chl | 1g26 1rfba | 1g26 1mmd_1 | 1g26 1gur |
Protein12 | 1gcpA | b.34 | 1gcpA 1rip | 1gcpA 1yaia | 1gcpA 1shfa | 1gcpA 1shfa |
Protein13 | 1gglA | b.60 | 1gglA 1hms | 1gglA 1ccr | 1gglA 1hms | 1gglA 1hms |
Protein14 | 1gqzA | d.19 | 1gqzA 1vaoa1 | 1gqzA 1edt | 1gqzA 1cyda | 1gqzA 1scu |
Protein15 | 1gyvA | b.1 | 1gyvA 1sgc | 1gyvA 1tnn | 1gyvA 2lbp | 1gyvA 1rsy |
Protein16 | 1icp | c.1 | 1icp 1jdc_2 | 1icp 1was | 1icp 1aplc | 1icp 1oyb |
Protein17 | 1ihmA | b.121 | 1ihmA 2bpa1 | 1ihmA 1gnd_1 | 1ihmA 1pil | 1ihmA 2tbv |
Protein18 | 1il6 | a.26 | 1il6 1huw | 1il6 1aiza | 1il6 1wht.1 | 1il6 1ifa |
Protein19 | 1jjf | c.69 | 1jjf 3tgl | 1jjf 1scua2 | 1jjf 1deka | 1jjf 1dar_2 |
Protein20 | 1jr6 | c.37 | 1jr6 1se4_2 | 1jr6 1jer | 1jr6 2cmd_1 | 1jr6 1ntr |
Protein21 | 1jzmA | a.1 | 1jzmA 3sdha | 1jzmA 3sdha | 1jzmA 3sdha | 1jzmA 3sdha |
Protein22 | 1kt7 | b.60 | 1kt7 1hbq | 1kt7 2prd | 1kt7 1hbq | 1kt7 1hbp |
Protein23 | 1mi3 | c.1 | 1mi3 5ruba1 | 1mi3 2trcp | 1mi3 1nal1 | 1mi3 1ads |
Protein24 | 1mi3A | c.1 | 1mi3A 5ruba1 | 1mi3A 3rubl1 | 1mi3A 5nul | 1mi3A 1ads |
Protein25 | 1pruA | a.4 | 1pruA 1yrna | 1pruA 4cpai | 1pruA 2hft_2 | 1pruA 1oct |
Protein26 | 1rfjA | a.39 | 1rfjA 1osa | 1rfjA 1osa | 1rfjA 1ctaa | 1rfjA 1osa |
Protein27 | 1vavA | b.29 | 1vavA 1agja | 1vavA 1tib | 1vavA 1pga | 1vavA 1kit_1 |
Protein28 | 1wat | a.24 | 1wat 1was | 1wat 1deka | 1wat 1dar_2 | 1wat 1ab8 |
Protein29 | 1xnc | b.29 | 1xnc 1xnb | 1xnc 1xnb | 1xnc 1xnb | 1xnc 1xnb |
This Homepage was created by Maja Temerinac for the SHREC 2007 Protein Challenge.
For more information, please contact:
Maja Temerinac temerina(at)informatik.uni-freiburg.de or
Marco Reisert reisert(at)informatik.uni-freiburg.de