SUPPLEMENTS - DISTRIBUTION ACROSS FUNCTIONAL CATEGORIES AND CLASSES
===================================================================

1105 sfs with functional annotation
41 small functional categories

Duplets:
----------
All dup
  9398
supra dup
  2368
  2064 complex
  304 repeat
over-represented supra dup
  1203 total over-represented supra-domains out of 1471
  A 307 significant supra-domains in kingdom
  B 621 significant supra-domains in kingdom
  E 1017 significant supra-domains in kingdom

Triplets:
-----------
All trip
  4323 entries read
supra trip
  935 entries read
  424 complex
  511 repeat
over-represented supra trip
  166 total over-represented supra-domains out of 675
  B 45 significant supra-domains in kingdom
  E 137 significant supra-domains in kingdom

Other:
----------
Single domains:
  1216 entries read
Top 200 most duplicated dup:
  200 top entries
  140 complex
  60 repeat
Top 200 most versatile dup:
  200 top entries
  123 complex
  77 repeat
Top 200 most duplicated trip:
  200 top entries
  140 complex
  60 repeat
Top 200 most versatile trip:
  200 top entries
  123 complex
  77 repeat

Notes:
---------
- Note the difference between the 41 functional categories (small cats)
  and the 7 broader functional classes (large cats).
- For combinations, we counted the functional annotation of each single
  domain (hence there are twice as many counts for domain combinations
  as there are combinations in one set).
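The per-domain counting described in the notes can be sketched as follows. This is a minimal illustration, not the script that produced the tables: the function name, the toy superfamily identifiers, and the choice to skip domains without an annotation are all assumptions. The last two lines only check that the fraction printed next to each count is consistent with count divided by the number of combinations in the set, using values from the tables below.

```python
from collections import Counter


def tally_categories(combinations, category_of):
    """Count one functional annotation per domain in every combination.

    combinations: list of tuples of superfamily ids (hypothetical ids here).
    category_of: dict mapping a superfamily id to its functional category.
    Domains without an annotation are skipped (an assumption of this sketch).
    """
    counts = Counter()
    for combo in combinations:
        for domain in combo:
            category = category_of.get(domain)
            if category is not None:
                counts[category] += 1
    n_combinations = len(combinations)
    # Fractions are relative to the number of combinations, not the number
    # of domains, so they can exceed 1 when a category occurs several times
    # within one combination.
    fractions = {cat: c / n_combinations for cat, c in counts.items()}
    return counts, fractions


# Toy input (hypothetical): two duplets built from two superfamilies.
toy_category_of = {"sf1": "Regulation", "sf2": "Metabolism"}
toy_combinations = [("sf1", "sf2"), ("sf1", "sf1")]
toy_counts, toy_fractions = tally_categories(toy_combinations, toy_category_of)

# Consistency check against two rows of the tables in this file.
energy_fraction = 955 / 9398        # Energy, 9398 total two-domain combinations
regulation_fraction = 1044 / 935    # Regulation, 935 supra three-domain combinations
```

In the toy input, Regulation is counted three times across two combinations, which is how a fraction above 1 (such as Regulation in the three-domain sets) can arise.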
FUNCTION
=============================================================

Whole SCOP with 1437 superfamilies

Energy  63
General  47
Information  166
Metabolism  318
Processes  106
Regulation  206
Unknown  199

Small cats >= 5
C  Energy production and conversion  30
CA  Electron transfer/transport  23
CB  Enzymes of photosynthesis  10
D  Cell division and chromosome partitioning, cell  13
E  Amino acid transport and metabolism  19
F  Nucleotide transport and metabolism  28
G  Carbohydrate transport and metabolism  26
GA  Polysaccharide metabolism  17
H  Coenzyme metabolism  50
HA  Small molecule binding  7
I  Lipid metabolism  9
IA  Phospholipid metabolism  6
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  79
K  Transcription  24
L  DNA replication, recombination, repair  52
LA  DNA-binding (transcription factors)  57
LB  RNA-binding, RNA processing, dynamics  11
MA  Cell adhesion domains  20
N  Cell motility, cytoskeleton  11
O  Posttranslational modification, protein turnover, chaperones  25
OA  Proteases, peptidases and their inhibitors  49
OB  Kinases and phosphatases and inhibitors  12
P  Inorganic ion transport and metabolism  7
Q  Secondary metabolites biosynthesis, transport and catabolism  12
R  General Function prediction  28
RA  Oxidation/Reduction  51
RB  Transferases  24
RC  Other enzymes  81
RD  Protein-protein interaction (dimerization domains)  12
RE  Immune response  14
RF  Transport  22
S  Function unknown  160
SA  Viral proteins  39
SB  Toxins and defense enzymes  28
T  Signal transduction  43

------------------------------------------------------------------------------------
9398 total two-domain combinations

Large cats same:  0.3087  0.462  1.4965
Small cats same:  0.1333  0.3399  2.5498
Superfamilies same:  0.0509  0.2199  4.3202

Energy  955  0.101617365396893
General  2418  0.25728878484784
Information  1564  0.166418386890828
Metabolism  5730  0.609704192381358
Processes  1236  0.131517344115769
Regulation  4961  0.52787827197276
Unknown  997  0.106086401361992

Small cats >= 5
C  Energy production and conversion  441  0.0469248776335391
CA  Electron transfer/transport  425  0.0452223877420728
CB  Enzymes of photosynthesis  89  0.00947010002128112
D  Cell division and chromosome partitioning, cell  157  0.0167056820600128
E  Amino acid transport and metabolism  213  0.0226643966801447
EA  Nitrogen metabolism  61  0.00649074271121515
F  Nucleotide transport and metabolism  547  0.0582038731645031
G  Carbohydrate transport and metabolism  555  0.0590551181102362
GA  Polysaccharide metabolism  342  0.0363907214300915
H  Coenzyme metabolism  649  0.0690572462226006
HA  Small molecule binding  1090  0.11598212385614
I  Lipid metabolism  75  0.00798042136624814
IA  Phospholipid metabolism  69  0.00734198765694829
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  460  0.0489465843796552
K  Transcription  238  0.0253245371355608
L  DNA replication, recombination, repair  676  0.0719301979144499
LA  DNA-binding (transcription factors)  1691  0.179931900404341
LB  RNA-binding, RNA processing, dynamics  190  0.020217067461162
M  Cell envelope biogenesis, outer membrane  47  0.00500106405618217
MA  Cell adhesion domains  709  0.0754415833155991
N  Cell motility, cytoskeleton  92  0.00978931687593105
O  Posttranslational modification, protein turnover, chaperones  372  0.0395828899765908
OA  Proteases, peptidases and their inhibitors  884  0.0940625665035114
OB  Kinases and phosphatases and inhibitors  308  0.0327729304107257
P  Inorganic ion transport and metabolism  259  0.0275590551181102
Q  Secondary metabolites biosynthesis, transport and catabolism  271  0.0288359225367099
R  General Function prediction  869  0.0924664822302618
RA  Oxidation/Reduction  881  0.0937433496488615
RB  Transferases  753  0.0801234305171313
RC  Other enzymes  1383  0.147158969993616
RD  Protein-protein interaction (dimerization domains)  459  0.0488401787614386
RE  Immune response  130  0.0138327303681634
RF  Transport  268  0.02851670568206
RG  Blood clotting  51  0.00542668652904873
S  Function unknown  816  0.0868269844647797
SA  Viral proteins  181  0.0192594168972122
SB  Toxins and defense enzymes  163  0.0173441157693126
T  Signal transduction  997  0.106086401361992

------------------------------------------------------------------------------------
2368 supra two-domain combinations

Large cats same:  0.423  0.4942  1.1683
Small cats same:  0.2683  0.4432  1.6518
Superfamilies same:  0.1325  0.3391  2.5592

Energy  177  0.0747466216216216
General  805  0.339949324324324
Information  321  0.135557432432432
Metabolism  1208  0.510135135135135
Processes  287  0.121199324324324
Regulation  1455  0.614442567567568
Unknown  244  0.103040540540541

Small cats >= 5
C  Energy production and conversion  77  0.0325168918918919
CA  Electron transfer/transport  88  0.0371621621621622
CB  Enzymes of photosynthesis  12  0.00506756756756757
D  Cell division and chromosome partitioning, cell  31  0.0130912162162162
E  Amino acid transport and metabolism  41  0.0173141891891892
EA  Nitrogen metabolism  12  0.00506756756756757
F  Nucleotide transport and metabolism  116  0.0489864864864865
G  Carbohydrate transport and metabolism  129  0.0544763513513514
GA  Polysaccharide metabolism  88  0.0371621621621622
H  Coenzyme metabolism  140  0.0591216216216216
HA  Small molecule binding  409  0.172719594594595
I  Lipid metabolism  14  0.00591216216216216
IA  Phospholipid metabolism  16  0.00675675675675676
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  96  0.0405405405405405
K  Transcription  67  0.0282939189189189
L  DNA replication, recombination, repair  118  0.0498310810810811
LA  DNA-binding (transcription factors)  456  0.192567567567568
LB  RNA-binding, RNA processing, dynamics  40  0.0168918918918919
M  Cell envelope biogenesis, outer membrane  15  0.00633445945945946
MA  Cell adhesion domains  251  0.105996621621622
N  Cell motility, cytoskeleton  27  0.011402027027027
O  Posttranslational modification, protein turnover, chaperones  91  0.0384290540540541
OA  Proteases, peptidases and their inhibitors  208  0.0878378378378378
OB  Kinases and phosphatases and inhibitors  107  0.0451858108108108
P  Inorganic ion transport and metabolism  53  0.0223817567567568
Q  Secondary metabolites biosynthesis, transport and catabolism  58  0.0244932432432432
R  General Function prediction  252  0.106418918918919
RA  Oxidation/Reduction  176  0.0743243243243243
RB  Transferases  177  0.0747466216216216
RC  Other enzymes  257  0.108530405405405
RD  Protein-protein interaction (dimerization domains)  144  0.0608108108108108
RE  Immune response  33  0.0139358108108108
RF  Transport  65  0.0274493243243243
RG  Blood clotting  14  0.00591216216216216
S  Function unknown  220  0.0929054054054054
SA  Viral proteins  24  0.0101351351351351
SB  Toxins and defense enzymes  33  0.0139358108108108
T  Signal transduction  342  0.144425675675676

------------------------------------------------------------------------------------
1203 principal two-domain combinations

Large cats same:  0.5302  0.4993  0.9417
Small cats same:  0.4065  0.4914  1.2088
Superfamilies same:  0.2204  0.4147  1.8815

Energy  79  0.0656691604322527
General  346  0.28761429758936
Information  205  0.170407315045719
Metabolism  589  0.489609310058188
Processes  134  0.111388196176226
Regulation  733  0.609310058187864
Unknown  160  0.133000831255195

Small cats >= 5
C  Energy production and conversion  41  0.0340814630091438
CA  Electron transfer/transport  35  0.029093931837074
D  Cell division and chromosome partitioning, cell  14  0.0116375727348296
E  Amino acid transport and metabolism  24  0.0199501246882793
F  Nucleotide transport and metabolism  57  0.0473815461346633
G  Carbohydrate transport and metabolism  63  0.0523690773067332
GA  Polysaccharide metabolism  40  0.0332502078137988
H  Coenzyme metabolism  76  0.0631753948462178
HA  Small molecule binding  156  0.129675810473815
I  Lipid metabolism  10  0.00831255195344971
IA  Phospholipid metabolism  8  0.00665004156275977
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  75  0.0623441396508728
K  Transcription  37  0.0307564422277639
L  DNA replication, recombination, repair  69  0.057356608478803
LA  DNA-binding (transcription factors)  204  0.169576059850374
LB  RNA-binding, RNA processing, dynamics  24  0.0199501246882793
M  Cell envelope biogenesis, outer membrane  5  0.00415627597672485
MA  Cell adhesion domains  117  0.0972568578553616
N  Cell motility, cytoskeleton  18  0.0149625935162095
O  Posttranslational modification, protein turnover, chaperones  46  0.0382377389858687
OA  Proteases, peptidases and their inhibitors  115  0.0955943474646717
OB  Kinases and phosphatases and inhibitors  43  0.0357439733998337
P  Inorganic ion transport and metabolism  18  0.0149625935162095
Q  Secondary metabolites biosynthesis, transport and catabolism  28  0.0232751454696592
R  General Function prediction  125  0.103906899418121
RA  Oxidation/Reduction  104  0.086450540315877
RB  Transferases  68  0.056525353283458
RC  Other enzymes  115  0.0955943474646717
RD  Protein-protein interaction (dimerization domains)  65  0.0540315876974231
RE  Immune response  20  0.0166251039068994
RF  Transport  32  0.0266001662510391
RG  Blood clotting  11  0.00914380714879468
S  Function unknown  144  0.119700748129676
SA  Viral proteins  16  0.0133000831255195
SB  Toxins and defense enzymes  8  0.00665004156275977
T  Signal transduction  208  0.172901080631754

------------------------------------------------------------------------------------
4323 total three-domain combinations

Large cats same:  0.4266  0.4946  1.1593
Small cats same:  0.265  0.4414  1.6656
Superfamilies same:  0.1417  0.3488  2.4615

Energy  475  0.109877399953736
General  2067  0.478140180430257
Information  913  0.211195928753181
Metabolism  3034  0.701827434651862
Processes  861  0.199167244968772
Regulation  4212  0.974323386537127
Unknown  785  0.181586860976174

Small cats >= 5
C  Energy production and conversion  220  0.0508905852417303
CA  Electron transfer/transport  217  0.0501966227157067
CB  Enzymes of photosynthesis  38  0.00879019199629887
D  Cell division and chromosome partitioning, cell  73  0.0168864214665741
E  Amino acid transport and metabolism  123  0.0284524635669674
EA  Nitrogen metabolism  22  0.00508905852417303
F  Nucleotide transport and metabolism  254  0.0587554938699977
G  Carbohydrate transport and metabolism  352  0.0814249363867685
GA  Polysaccharide metabolism  200  0.046264168401573
H  Coenzyme metabolism  393  0.0909090909090909
HA  Small molecule binding  1173  0.271339347675226
I  Lipid metabolism  38  0.00879019199629887
IA  Phospholipid metabolism  39  0.00902151283830673
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  258  0.0596807772380291
K  Transcription  146  0.0337728429331483
L  DNA replication, recombination, repair  390  0.0902151283830673
LA  DNA-binding (transcription factors)  1133  0.262086513994911
LB  RNA-binding, RNA processing, dynamics  119  0.0275271801989359
M  Cell envelope biogenesis, outer membrane  55  0.0127226463104326
MA  Cell adhesion domains  1012  0.234096692111959
N  Cell motility, cytoskeleton  80  0.0185056673606292
O  Posttranslational modification, protein turnover, chaperones  260  0.0601434189220449
OA  Proteases, peptidases and their inhibitors  586  0.135554013416609
OB  Kinases and phosphatases and inhibitors  173  0.0400185056673606
P  Inorganic ion transport and metabolism  124  0.0286837844089752
Q  Secondary metabolites biosynthesis, transport and catabolism  161  0.0372426555632663
R  General Function prediction  571  0.132084200786491
RA  Oxidation/Reduction  457  0.105713624797594
RB  Transferases  366  0.0846634281748786
RC  Other enzymes  668  0.154522322461254
RD  Protein-protein interaction (dimerization domains)  323  0.0747166319685404
RE  Immune response  126  0.029146426092991
RF  Transport  230  0.0532037936618089
RG  Blood clotting  60  0.0138792505204719
S  Function unknown  709  0.164006476983576
SA  Viral proteins  76  0.0175803839925977
SB  Toxins and defense enzymes  74  0.017117742308582
T  Signal transduction  1048  0.242424242424242

------------------------------------------------------------------------------------
935 supra three-domain combinations

Large cats same:  0.5444  0.4983  0.9153
Small cats same:  0.4044  0.491  1.2141
Superfamilies same:  0.2333  0.4231  1.8135

Energy  80  0.0855614973262032
General  475  0.508021390374332
Information  177  0.189304812834225
Metabolism  544  0.581818181818182
Processes  165  0.176470588235294
Regulation  1044  1.11657754010695
Unknown  193  0.206417112299465

Small cats >= 5
C  Energy production and conversion  44  0.0470588235294118
CA  Electron transfer/transport  34  0.0363636363636364
D  Cell division and chromosome partitioning, cell  10  0.0106951871657754
E  Amino acid transport and metabolism  14  0.0149732620320856
F  Nucleotide transport and metabolism  48  0.0513368983957219
G  Carbohydrate transport and metabolism  69  0.0737967914438503
GA  Polysaccharide metabolism  35  0.0374331550802139
H  Coenzyme metabolism  75  0.0802139037433155
HA  Small molecule binding  274  0.293048128342246
I  Lipid metabolism  7  0.00748663101604278
IA  Phospholipid metabolism  7  0.00748663101604278
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  54  0.0577540106951872
K  Transcription  23  0.0245989304812834
L  DNA replication, recombination, repair  65  0.0695187165775401
LA  DNA-binding (transcription factors)  232  0.248128342245989
LB  RNA-binding, RNA processing, dynamics  35  0.0374331550802139
M  Cell envelope biogenesis, outer membrane  17  0.0181818181818182
MA  Cell adhesion domains  307  0.328342245989305
N  Cell motility, cytoskeleton  11  0.0117647058823529
O  Posttranslational modification, protein turnover, chaperones  61  0.06524064171123
OA  Proteases, peptidases and their inhibitors  98  0.104812834224599
OB  Kinases and phosphatases and inhibitors  29  0.0310160427807487
P  Inorganic ion transport and metabolism  14  0.0149732620320856
Q  Secondary metabolites biosynthesis, transport and catabolism  36  0.0385026737967914
R  General Function prediction  127  0.135828877005348
RA  Oxidation/Reduction  98  0.104812834224599
RB  Transferases  58  0.0620320855614973
RC  Other enzymes  102  0.109090909090909
RD  Protein-protein interaction (dimerization domains)  74  0.079144385026738
RE  Immune response  34  0.0363636363636364
RF  Transport  46  0.0491978609625668
RG  Blood clotting  20  0.0213903743315508
S  Function unknown  179  0.19144385026738
SA  Viral proteins  14  0.0149732620320856
SB  Toxins and defense enzymes  6  0.00641711229946524
T  Signal transduction  317  0.33903743315508

------------------------------------------------------------------------------------
166 principal three-domain combinations

Large cats same:  0.61  0.4893  0.8021
Small cats same:  0.4842  0.5013  1.0353
Superfamilies same:  0.3773  0.4862  1.2886

Energy  3  0.0180722891566265
General  86  0.518072289156627
Information  31  0.186746987951807
Metabolism  61  0.367469879518072
Processes  32  0.192771084337349
Regulation  225  1.35542168674699
Unknown  41  0.246987951807229

Small cats >= 5
F  Nucleotide transport and metabolism  6  0.036144578313253
G  Carbohydrate transport and metabolism  18  0.108433734939759
HA  Small molecule binding  35  0.210843373493976
J  Translation, ribosomes, ribosome biogenesis; tRNA metabolism  8  0.0481927710843374
K  Transcription  7  0.0421686746987952
L  DNA replication, recombination, repair  15  0.0903614457831325
LA  DNA-binding (transcription factors)  40  0.240963855421687
M  Cell envelope biogenesis, outer membrane  8  0.0481927710843374
MA  Cell adhesion domains  87  0.524096385542169
O  Posttranslational modification, protein turnover, chaperones  12  0.072289156626506
OA  Proteases, peptidases and their inhibitors  25  0.150602409638554
Q  Secondary metabolites biosynthesis, transport and catabolism  13  0.0783132530120482
R  General Function prediction  42  0.253012048192771
RB  Transferases  7  0.0421686746987952
RC  Other enzymes  9  0.0542168674698795
RD  Protein-protein interaction (dimerization domains)  9  0.0542168674698795
RE  Immune response  10  0.0602409638554217
RF  Transport  10  0.0602409638554217
S  Function unknown  40  0.240963855421687
T  Signal transduction  59  0.355421686746988
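The file does not label the three numbers on the "Large cats same", "Small cats same" and "Superfamilies same" rows. The sketch below rests on an assumption inferred from the values only: that they are the mean, the standard deviation, and the ratio of standard deviation to mean of a 0/1 "same category" flag. The function names and the toy flags are hypothetical. The second helper only checks that the third reported number matches the second divided by the first, within rounding.

```python
import statistics


def summarize_same_flags(flags):
    """Return (mean, sample standard deviation, sd / mean) for 0/1 flags.

    flags: one entry per comparison, 1 if the domains share the category,
    0 otherwise (an assumed encoding, not stated in the file).
    """
    mean = statistics.mean(flags)
    sd = statistics.stdev(flags)
    return mean, sd, sd / mean


def ratio_matches(mean, sd, ratio, tol=1e-3):
    """Check whether a reported triple satisfies ratio ~ sd / mean."""
    return abs(sd / mean - ratio) < tol


# A few triples copied from the tables above.
reported = {
    "9398 total two-domain, large cats": (0.3087, 0.462, 1.4965),
    "9398 total two-domain, superfamilies": (0.0509, 0.2199, 4.3202),
    "935 supra three-domain, small cats": (0.4044, 0.491, 1.2141),
    "166 principal three-domain, large cats": (0.61, 0.4893, 0.8021),
}
all_consistent = all(ratio_matches(*triple) for triple in reported.values())

# Toy flags (hypothetical): two of four comparisons share a category.
toy_mean, toy_sd, toy_ratio = summarize_same_flags([1, 0, 0, 1])
```

If this reading is right, a lower third number means the "same category" flag is less dispersed relative to its mean, which is the trend from the total sets to the principal sets in the tables.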