By Ms Lim Yun Ping, Bioinformatics Institute, A*STAR Singapore
1.
Given an unknown sequence from molecule X, find out its potential function and pick out several sequences similar to it.
>SeqX
ATGGGCCACAGCCACAGCACCGGCAAGGAGATCAACGACAACGAGCTGTTCACCTGCGAGGACCCCGTGTTCGACCAGCCCGTGGCCAGCCCCAAGAGCGAGATCAGCAGCAAGCTGGCC
GAGGAGATCGAGAGAAGCAAGAGCCCCCTGATCCTGGAGGTGAGCCCCAGAACCCCCGACAGCGTGCAGATGTTCAGACCCACCTTCGACACCTTCAGACCCCCCAACAGCGACAGCAGC
ACCTTCAGAGGCAGCCAGAGCAGAGAGGACCTGGTGGCCTGCAGCAGCATGAACAGCGTGAACAACGTGCACGACATGAACACCGTGAGCAGCAGCAGCAGCAGCAGCGCCCCCCTGTTC
GTGGCCCTGTACGACTTCCACGGCGTGGGCGAGGAGCAGCTGAGCCTGAGAAAGGGCGACCAGGTGAGAATCCTGGGCTACAACAAGAACAACGAGTGGTGCGAGGCCAGACTGTACAGC
ACCAGAAAGAACGACGCCAGCAACCAGAGAAGACTGGGCGAGATCGGCTGGGTGCCCAGCAACTTCATCGCCCCCTACAACAGCCTGGACAAGTACACCTGGTACCACGGCAAGATCAGC
AGAAGCGACAGCGAGGCCATCCTGGGCAGCGGCATCACCGGCAGCTTCCTGGTGAGAGAGAGCGAGACCAGCATCGGCCAGTACACCATCAGCGTGAGACACGACGGCAGAGTGTTCCAC
TACAGAATCAACGTGGACAACACCGAGAAGATGTTCATCACCCAGGAGGTGAAGTTCAGAACCCTGGGCGAGCTGGTGCACCACCACAGCGTGCACGCCGACGGCCTGATCTGCCTGCTG
ATGTACCCCGCCAGCAAGAAGGACAAGGGCAGAGGCCTGTTCAGCCTGAGCCCCAACGCCCCCGACGAGTGGGAGCTGGACAGAAGCGAGATCATCATGCACAACAAGCTGGGCGGCGGC
CAGTACGGCGACGTGTACGAGGGCTACTGGAAGAGACACGACTGCACCATCGCCGTGAAGGCCCTGAAGGAGGACGCCATGCCCCTGCACGAGTTCCTGGCCGAGGCCGCCATCATGAAG
GACCTGCACCACAAGAACCTGGTGAGACTGCTGGGCGTGTGCACCCACGAGGCCCCCTTCTACATCATCACCGAGTTCATGTGCAACGGCAACCTGCTGGAGTACCTGAGAAGAACCGAC
AAGAGCCTGCTGCCCCCCATCATCCTGGTGCAGATGGCCAGCCAGATCGCCAGCGGCATGAGCTACCTGGAGGCCAGACACTTCATCCACAGAGACCTGGCCGCCAGAAACTGCCTGGTG
AGCGAGCACAACATCGTGAAGATCGCCGACTTCGGCCTGGCCAGATTCATGAAGGAGGACACCTACACCGCCCACGCCGGCGCCAAGTTCCCCATCAAGTGGACCGCCCCCGAGGGCCTG
GCCTTCAACACCTTCAGCAGCAAGAGCGACGTGTGGGCCTTCGGCGTGCTGCTGTGGGAGATCGCCACCTACGGCATGGCCCCCTACCCCGGCGTGGAGCTGAGCAACGTGTACGGCCTG
CTGGAGAACGGCTTCAGAATGGACGGCCCCCAGGGCTGCCCCCCCAGCGTGTACAGACTGATGCTGCAGTGCTGGAACTGGAGCCCCAGCGACAGACCCAGATTCAGAGACATCCACTTC
AACCTGGAGAACCTGATCAGCAGCAACAGCCTGAACGACGAGGTGCAGAAGCAGCTGAAGAAGAACAACGACAAGAAGCTGGAGAGCGACAAGAGAAGAAGCAACGTGAGAGAGAGAAGC
GACAGCAAGAGCAGACACAGCAGCCACCACGACAGAGACAGAGACAGAGAGAGCCTGCACAGCAGAAACAGCAACCCCGAGATCCCCAACAGAAGCTTCATCAGAACCGACGACAGCGTG
AGCTTCTTCAACCCCAGCACCACCAGCAAGGTGACCAGCTTCAGAGCCCAGGGCCCCCCCTTCCCCCCCCCCCCCCAGCAGAACACCAAGCCCAAGCTGCTGAAGAGCGTGCTGAACAGC
AACGCCAGACACGCCAGCGAGGAGTTCGAGAGAAACGAGCAGGACGACGTGGTGCCCCTGGCCGAGAAGAACGTGAGAAAGGCCGTGACCAGACTGGGCGGCACCATGCCCAAGGGCCAG
AGAATCGACGCCTACCTGGACAGCATGAGAAGAGTGGACAGCTGGAAGGAGAGCACCGACGCCGACAACGAGGGCGCCGGCAGCAGCAGCCTGAGCAGAACCGTGAGCAACGACAGCCTG
GACACCCTGCCCCTGCCCGACAGCATGAACAGCAGCACCTACGTGAAGATGCACCCCGCCAGCGGCGAGAACGTGTTCCTGAGACAGATCAGAAGCAAGCTGAAGAAGAGAAGCGAGACC
CCCGAGCTGGACCACATCGACAGCGACACCGCCGACGAGACCACCAAGAGCGAGAAGAGCCCCTTCGGCAGCCTGAACAAGAGCAGCATCAAGTACCCCATCAAGAACGCCCCCGAGTTC
AGCGAGAACCACAGCAGAGTGAGCCCCGTGCCCGTGCCCCCCAGCAGAAACGCCAGCGTGAGCGTGAGACCCGACAGCAAGGCCGAGGACAGCAGCGACGAGACCACCAAGGACGTGGGC
ATGTGGGGCCCCAAGCACGCCGTGACCAGAAAGATCGAGATCGTGAAGAACGACAGCTACCCCAACGTGGAGGGCGAGCTGAAGGCCAAGATCAGAAACCTGAGACACGTGCCCAAGGAG
GAGAGCAACACCAGCAGCCAGGAGGACCTGCCCCTGGACGCCACCGACAACACCAACGACAGCATCATCGTGATCCCCAGAGACGAGAAGGCCAAGGTGAGACAGCTGGTGACCCAGAAG
GTGAGCCCCCTGCAGCACCACAGACCCTTCAGCCTGCAGTGCCCCAACAACAGCACCAGCAGCGCCATCAGCCACAGCGAGCACGCCGACAGCAGCGAGACCAGCAGCCTGAGCGGCGTG
TACGAGGAGAGAATGAAGCCCGAGCTGCCCAGAAAGAGAAGCAACGGCGACACCAAGGTGGTGCCCGTGACCTGGATCATCAACGGCGAGAAGGAGCCCAACGGCATGGCCAGAACCAAG
AGCCTGAGAGACATCACCAGCAAGTTCGAGCAGCTGGGCACCGCCAGCACCATCGAGAGCAAGATCGAGGAGGCCGTGCCCTACAGAGAGCACGCCCTGGAGAAGAAGGGCACCAGCAAG
AGATTCAGCATGCTGGAGGGCAGCAACGAGCTGAAGCACGTGGTGCCCCCCAGAAAGAACAGAAACCAGGACGAGAGCGGCAGCATCGACGAGGAGCCCGTGAGCAAGGACATGATCGTG
AGCCTGCTGAAGGTGATCCAGAAGGAGTTCGTGAACCTGTTCAACCTGGCCAGCAGCGAGATCACCGACGAGAAGCTGCAGCAGTTCGTGATCATGGCCGACAACGTGCAGAAGCTGCAC
AGCACCTGCAGCGTGTACGCCGAGCAGATCAGCCCCCACAGCAAGTTCAGATTCAAGGAGCTGCTGAGCCAGCTGGAGATCTACAACAGACAGATCAAGTTCAGCCACAACCCCAGAGCC
AAGCCCGTGGACGACAAGCTGAAGATGGCCTTCCAGGACTGCTTCGACCAGATCATGAGACTGGTGGACAGA
2.
Find the ORFs for this sequence and translate it into a protein sequence.
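The ORF search can also be sketched by hand, as a complement to a tool such as NCBI ORFfinder. The snippet below is a minimal illustration, not a definitive implementation: it assumes the standard genetic code (NCBI translation table 1), treats an ORF as running from the first ATG in a stop-free stretch to the next stop codon (or to the end of the sequence if no stop is found), and uses hypothetical helper names (`translate`, `find_orfs`, `min_len_aa`).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_len_aa=30):
    """Scan all six frames and return (strand, frame, start, protein) tuples.

    `start` is the 0-based position of the ATG on the scanned strand
    (for '-' this is a coordinate on the reverse complement).
    Stretches that reach the end of the sequence without a stop are kept.
    Results are sorted with the longest protein first.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = 0  # index of the current segment within `protein`
            for segment in protein.split("*"):
                m = segment.find("M")
                if m != -1 and len(segment) - m >= min_len_aa:
                    orfs.append((strand, frame, frame + 3 * (pos + m), segment[m:]))
                pos += len(segment) + 1  # skip the segment and its stop codon
    return sorted(orfs, key=lambda o: len(o[3]), reverse=True)


if __name__ == "__main__":
    # First 30 nucleotides of SeqX, used only as a short demonstration.
    print(translate("ATGGGCCACAGCCACAGCACCGGCAAGGAG"))
```

To apply it to the exercise, join the SeqX lines into one string and pass it to `find_orfs`; the first tuple returned holds the longest candidate protein.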
3.
What are the pI (isoelectric point) and molecular weight of this protein sequence?
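Both values can be estimated directly from the amino acid composition. The sketch below is a rough illustration only: the molecular weight is the sum of average residue masses plus one water, and the pI is found by bisection on the net charge computed with the Henderson-Hasselbalch equation. The pKa values are an assumed, approximate textbook-style set; web tools use their own sets, so expect their answers to differ slightly. The function names are hypothetical.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01528

# ASSUMPTION: approximate pKa values; other published sets give slightly different pI.
PKA_NTERM, PKA_CTERM = 9.69, 2.34
PKA_POSITIVE = {"K": 10.5, "R": 12.4, "H": 6.0}
PKA_NEGATIVE = {"D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.0}


def molecular_weight(protein):
    """Average molecular weight in Da of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


def net_charge(protein, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))
    for aa, pka in PKA_POSITIVE.items():
        positive += protein.count(aa) / (1.0 + 10 ** (ph - pka))
    negative = 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))
    for aa, pka in PKA_NEGATIVE.items():
        negative += protein.count(aa) / (1.0 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(protein, tol=1e-4):
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(protein, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    peptide = "MGHSHSTGKE"  # first ten residues of the SeqX translation, for demonstration
    print(round(molecular_weight(peptide), 2), round(isoelectric_point(peptide), 2))
```

Pass the full translated protein from question 2 to both functions to get estimates to compare against a dedicated pI/Mw tool.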
4.
Use pairwise local alignment to find the percentage identity between the 2 sequences below.
What is the percentage identity of the pairwise global alignment for the same 2 sequences?
Are there any differences between the local and global results?
>Mouse|P00520
CKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALY
DFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYL
LSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLI
TTLHYPAPKRNKPTIYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEV
EEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSA
MEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKS
DVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIH
QAFETMFQESSISDEVEKELGKRGTRGGAGSMLQAPELPTKTRTCRRAAEQKDAPDTPELLHTKGLGESD
ALDSEPAVSPLLPRKERGPPDGSLNEDERLLPRDRKTNLFSALIKKKKKMAPTPPKRSSSFREMDGQPDR
RGASEDDSRELCNGPPALTSDAAEPTKSPKASNGAGVPNGAFREPGNSGFRSPHMWKKSSTLTGSRLAAA
EEESGMSSSKRFLRSCSASCMPHGARDTEWRSVTLPRDLPSAGKQFDSSTFGGHKSEKPALPRKRTSESR
SEQVAKSTAMPLPGWLKKNEEAAEEGFKDTESSPGSSPPSLTPKLLRRQVTASPSSGLSHKEEATKGSAS
GMGTPATAEPAPPSNKVGLSKASSEEMRVRRHKHSSESPGRDKGRLAKLKPAPPPPPACTGKAGKPAQSP
SQEAGEAGGPTKTKCTSLAMDAVNTDPTKAGPPGEGLRKPVPPSVPKPQSTAKPPGTPTSPVSTPSTAPA
PSPLAGDQQPSSAAFIPLISTRVSLRKTRQPPERIASGTITKGVVLDSTEALCLAISRNSEQMASHSAVL
EAGKNLYTFCVSYVDSIQQMRNKFAFREAINKLESNLRELQICPATASSGPAATQDFSKLLSSVKEISDI
VRR
>Human|P00519
MLEICLKLVGCKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALY
DFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYL
LSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLI
TTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEV
EEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSA
MEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKS
DVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIH
QAFETMFQESSISDEVEKELGKQGVRGAVSTLLQAPELPTKTRTSRRAAEHRDTTDVPEMPHSKGQGESD
PLDHEPAVSPLLPRKERGPPEGGLNEDERLLPKDKKTNLFSALIKKKKKTAPTPPKRSSSFREMDGQPER
RGAGEEEGRDISNGALAFTPLDTADPAKSPKPSNGAGVPNGALRESGGSGFRSPHLWKKSSTLTSSRLAT
GEEEGGGSSSKRFLRSCSASCVPHGAKDTEWRSVTLPRDLQSTGRQFDSSTFGGHKSEKPALPRKRAGEN
RSDQVTRGTVTPPPRLVKKNEEAADEVFKDIMESSPGSSPPNLTPKPLRRQVTVAPASGLPHKEEAEKGS
ALGTPAAAEPVTPTSKAGSGAPGGTSKGPAEESRVRRHKHSSESPGRDKGKLSRLKPAPPPPPAASAGKA
GGKPSQSPSQEAAGEAVLGAKTKATSLVDAVNSDAAKPSQPGEGLKKPVLPATPKPQSAKPSGTPISPAP
VPSTLPSASSALAGDQPSSTAFIPLISTRVSLRKTRQPPERIASGAITKGVVLDSTEALCLAISRNSEQM
ASHSAVLEAGKNLYTFCVSYVDSIQQMRNKFAFREAINKLENNLRELQICPATAGSGPAATQDFSKLLSS
VKEISDIVQR
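The difference between the two alignment modes can be illustrated with a small dynamic-programming sketch. This is a simplified stand-in, not what the usual web tools do: it assumes a flat match/mismatch score and a linear gap penalty instead of a substitution matrix such as BLOSUM62 with affine gaps, and it defines percentage identity as identical columns divided by alignment length. The function names and score values are hypothetical.

```python
def align(a, b, local=False, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch (global) or Smith-Waterman (local) with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:  # global mode penalises leading gaps
        for i in range(1, n + 1):
            score[i][0] = i * gap
        for j in range(1, m + 1):
            score[0][j] = j * gap
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            v = max(score[i - 1][j - 1] + s, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                v = max(v, 0)  # local mode may restart anywhere
                if v > best:
                    best, best_pos = v, (i, j)
            score[i][j] = v
    # Traceback: from the best cell (local) or the bottom-right corner (global).
    i, j = best_pos if local else (n, m)
    top, bottom = [], []
    while i > 0 or j > 0:
        if local and score[i][j] == 0:
            break
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(top, bottom):
    """Identical columns as a percentage of the alignment length."""
    if not top:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(top, bottom))
    return 100.0 * same / len(top)


if __name__ == "__main__":
    a, b = "GGGGWCHWKSSSS", "WCHWK"  # toy sequences, not taken from the exercise
    for mode in (False, True):
        top, bottom = align(a, b, local=mode)
        print("local" if mode else "global", top, bottom, round(percent_identity(top, bottom), 1))
```

Because a local alignment reports only the best-matching region while a global alignment must span both sequences end to end, the local percentage identity is usually the higher of the two when the sequences differ in length or at their ends. The mouse and human sequences above are about 1,100 residues each, which this pure-Python version can handle, though slowly.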
5.
Align the following sequences to find the best conserved regions and determine how similar they are to each other. Which two sequences are closest to each other, and which one is the furthest apart? Draw the phylogram showing the similarity for these sequences.
>SeqX
MGHSHSTGKEINDNELFTCEDPVFDQPVASPKSEISSKLAEEIERSKSPLILEVSPRTPDSVQMFRPTFD
TFRPPNSDSSTFRGSQSREDLVACSSMNSVNNVHDMNTVSSSSSSSAPLFVALYDFHGVGEEQLSLRKGD
QVRILGYNKNNEWCEARLYSTRKNDASNQRRLGEIGWVPSNFIAPYNSLDKYTWYHGKISRSDSEAILGS
GITGSFLVRESETSIGQYTISVRHDGRVFHYRINVDNTEKMFITQEVKFRTLGELVHHHSVHADGLICLL
MYPASKKDKGRGLFSLSPNAPDEWELDRSEIIMHNKLGGGQYGDVYEGYWKRHDCTIAVKALKEDAMPLH
EFLAEAAIMKDLHHKNLVRLLGVCTHEAPFYIITEFMCNGNLLEYLRRTDKSLLPPIILVQMASQIASGM
SYLEARHFIHRDLAARNCLVSEHNIVKIADFGLARFMKEDTYTAHAGAKFPIKWTAPEGLAFNTFSSKSD
VWAFGVLLWEIATYGMAPYPGVELSNVYGLLENGFRMDGPQGCPPSVYRLMLQCWNWSPSDRPRFRDIHF
NLENLISSNSLNDEVQKQLKKNNDKKLESDKRRSNVRERSDSKSRHSSHHDRDRDRESLHSRNSNPEIPN
RSFIRTDDSVSFFNPSTTSKVTSFRAQGPPFPPPPQQNTKPKLLKSVLNSNARHASEEFERNEQDDVVPL
AEKNVRKAVTRLGGTMPKGQRIDAYLDSMRRVDSWKESTDADNEGAGSSSLSRTVSNDSLDTLPLPDSMN
SSTYVKMHPASGENVFLRQIRSKLKKRSETPELDHIDSDTADETTKSEKSPFGSLNKSSIKYPIKNAPEF
SENHSRVSPVPVPPSRNASVSVRPDSKAEDSSDETTKDVGMWGPKHAVTRKIEIVKNDSYPNVEGELKAK
IRNLRHVPKEESNTSSQEDLPLDATDNTNDSIIVIPRDEKAKVRQLVTQKVSPLQHHRPFSLQCPNNSTS
SAISHSEHADSSETSSLSGVYEERMKPELPRKRSNGDTKVVPVTWIINGEKEPNGMARTKSLRDITSKFE
QLGTASTIESKIEEAVPYREHALEKKGTSKRFSMLEGSNELKHVVPPRKNRNQDESGSIDEEPVSKDMIV
SLLKVIQKEFVNLFNLASSEITDEKLQQFVIMADNVQKLHSTCSVYAEQISPHSKFRFKELLSQLEIYNR
QIKFSHNPRAKPVDDKLKMAFQDCFDQIMRLVDR
>Fly|P00522
MGAQQGKDRGAHSGGGGSGAPVSCIGLSSSPVASVSPHCISSSSGVSSAPLGGGSTLRGSRIKSSSSGVA
SGSGSGGGGGGSGSGLSQRSGGHKDARCNPTVGLNIFTEHNGTKHSSFRGHPGKYHMNLEALLQSRPLPH
IPAGSTRPLFWRIAELQQHQQDSGGLGLQGSSLGGGHSSTTSVFESAHRWTSKENLLAPGPEEDDPQLFV
ALYDFQAGGENQLSLKKGEQVRILSYNKSGEWCEAHSDSGNVGWVPSNYVTPLNSLEKHSWYHGPISRNA
AEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRISEDPDGKVFVTQEAKFNTLAELVHHHSVPH
EGHGLITPLLYPAPKQNKPTVFPLSPEPDEWEICRTDIMMKHKLGGGQYGEVYEAVWKRYGNTVAVKTLK
EDTMALKDFLEEAAIMKEMKHPNLVQLIGVCTREPPFYIITEFMSHGNLLDFLRSAGRETLDAVALLYMA
TQIASGMSYLESRNYIHRDLAARNCLVGDNKLVKVADFGLARLMRDDTYTAHAGAKFPIKWTAPEGLAYN
KFSTKSDVWAFGVLLWEIATYGMSPYPAIDLTDVYHKLDKGYRMERPPGCPPEVYDLMRQCWQWDATDRP
TFKSIHHALEHMFQESSITEAVEKQLNANATSASSSAPSTSGVATGGGATTTTAASGCASSSSATASLSL
TPQMVKKGLPGGQALTPNAHHNDPHQQQASTPMSETGSTSTKLSTFSSQGKGNVQMRRTTNKQGKQAPAP
PKRTSLLSSSRDSTYREEDPANARCNFIDDLSTNGLARDINSLTQRYDSETDPAADPDTDATGDSLEQSL
SQVIAAPVTNKMQHSLHSGGGGGGIGPRSSQQHSSFKRPTGTPVMGNRGLETRQSKRSQLHSQAPGPGPP
STQPHHGNNGVVTSAHPITVGALDVMNVKQVVNRYGTLPKGARIGAYLDSLEDSSEAAPALPATAPSLPP
ANGHATPPAARLNPKASPIPPQQMIRSNSSGGVTMQNNAAASLNKLQRHRTTTEGTMMTFSSFRAGGSSS
SPKRSASGVASGVQPALANLEFPPPPLDLPPPPEEFEGGPPPPPPAPESAVQAIQQHLHAQLPNNGNISN
GNGTNNNDSSHNDVSNIAPSVEEASSRFGVSLRKREPSTDSCSSLGSPPEDLKEKLITEIKAAGKDTAPA
SHLANGSGIAVVDPVSLLVTELAESMNLPKPPPQQQQKLTNGNSTGSGFKAQLKKVEPKKMSAPMPKRTA
NTIIDFKAHLRRVDKEKEPATPAPAPATVAVANNANCNTTGTLNRKEDGSKKFSQAMQKTEIKIDVTNSN
VEADAGAAGEGDLGKRRSTDDEEQSHTEGLGSGGQGSADMTQSLYEQKPQIQQKPAVPHKPTKLTIYATP
IAKLTEPASSASSTQISRESILELVGLLEGSLKHPVNAIAGSQWLQLSDKLNILHNSCVIFAENGAMPPH
SKFQFRELVTRVEAQSQHLRSAGSKNVQDNERLVAEVGQSLRQISNALNR
>Mouse|P00520
CKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALY
DFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYL
LSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLI
TTLHYPAPKRNKPTIYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEV
EEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSA
MEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKS
DVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIH
QAFETMFQESSISDEVEKELGKRGTRGGAGSMLQAPELPTKTRTCRRAAEQKDAPDTPELLHTKGLGESD
ALDSEPAVSPLLPRKERGPPDGSLNEDERLLPRDRKTNLFSALIKKKKKMAPTPPKRSSSFREMDGQPDR
RGASEDDSRELCNGPPALTSDAAEPTKSPKASNGAGVPNGAFREPGNSGFRSPHMWKKSSTLTGSRLAAA
EEESGMSSSKRFLRSCSASCMPHGARDTEWRSVTLPRDLPSAGKQFDSSTFGGHKSEKPALPRKRTSESR
SEQVAKSTAMPLPGWLKKNEEAAEEGFKDTESSPGSSPPSLTPKLLRRQVTASPSSGLSHKEEATKGSAS
GMGTPATAEPAPPSNKVGLSKASSEEMRVRRHKHSSESPGRDKGRLAKLKPAPPPPPACTGKAGKPAQSP
SQEAGEAGGPTKTKCTSLAMDAVNTDPTKAGPPGEGLRKPVPPSVPKPQSTAKPPGTPTSPVSTPSTAPA
PSPLAGDQQPSSAAFIPLISTRVSLRKTRQPPERIASGTITKGVVLDSTEALCLAISRNSEQMASHSAVL
EAGKNLYTFCVSYVDSIQQMRNKFAFREAINKLESNLRELQICPATASSGPAATQDFSKLLSSVKEISDI
VRR
>Human|P00519
MLEICLKLVGCKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALY
DFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYL
LSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLI
TTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEV
EEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSA
MEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKS
DVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIH
QAFETMFQESSISDEVEKELGKQGVRGAVSTLLQAPELPTKTRTSRRAAEHRDTTDVPEMPHSKGQGESD
PLDHEPAVSPLLPRKERGPPEGGLNEDERLLPKDKKTNLFSALIKKKKKTAPTPPKRSSSFREMDGQPER
RGAGEEEGRDISNGALAFTPLDTADPAKSPKPSNGAGVPNGALRESGGSGFRSPHLWKKSSTLTSSRLAT
GEEEGGGSSSKRFLRSCSASCVPHGAKDTEWRSVTLPRDLQSTGRQFDSSTFGGHKSEKPALPRKRAGEN
RSDQVTRGTVTPPPRLVKKNEEAADEVFKDIMESSPGSSPPNLTPKPLRRQVTVAPASGLPHKEEAEKGS
ALGTPAAAEPVTPTSKAGSGAPGGTSKGPAEESRVRRHKHSSESPGRDKGKLSRLKPAPPPPPAASAGKA
GGKPSQSPSQEAAGEAVLGAKTKATSLVDAVNSDAAKPSQPGEGLKKPVLPATPKPQSAKPSGTPISPAP
VPSTLPSASSALAGDQPSSTAFIPLISTRVSLRKTRQPPERIASGAITKGVVLDSTEALCLAISRNSEQM
ASHSAVLEAGKNLYTFCVSYVDSIQQMRNKFAFREAINKLENNLRELQICPATAGSGPAATQDFSKLLSS
VKEISDIVQR
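Once the sequences are aligned, the tree-drawing step amounts to turning pairwise distances into a clustering. The sketch below is one simple way to see the idea, under stated assumptions: it uses the p-distance (fraction of differing columns) on rows that are already aligned, and UPGMA as the clustering method, which is a simpler stand-in for the neighbour-joining trees that multiple-alignment tools commonly report. The toy rows and the names `p_distance` and `upgma` are hypothetical and are not derived from the exercise sequences.

```python
from itertools import combinations


def p_distance(row_a, row_b):
    """Fraction of differing columns between two pre-aligned rows of equal length.

    Columns where both rows have a gap are ignored.
    """
    cols = [(x, y) for x, y in zip(row_a, row_b) if not (x == "-" and y == "-")]
    return sum(x != y for x, y in cols) / len(cols)


def upgma(names, dist):
    """Build a UPGMA tree and return it as a Newick string.

    `dist` maps frozenset({name_a, name_b}) to a distance for every pair.
    """
    size = {name: 1 for name in names}      # cluster label -> number of leaves
    height = {name: 0.0 for name in names}  # cluster label -> node height
    d = dict(dist)
    while len(size) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        h = d[frozenset((a, b))] / 2
        merged = f"({a}:{h - height[a]:.3f},{b}:{h - height[b]:.3f})"
        size_a, size_b = size.pop(a), size.pop(b)
        # Distance to every remaining cluster is the leaf-weighted average.
        for c in size:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        size[merged] = size_a + size_b
        height[merged] = h
    return next(iter(size)) + ";"


if __name__ == "__main__":
    # Toy pre-aligned rows for illustration only.
    rows = {
        "toy1": "ACDEFGHIKL",
        "toy2": "ACDEFGHIKM",
        "toy3": "ACDQFGHLKM",
        "toy4": "TCWQYGNLRM",
    }
    dist = {frozenset((a, b)): p_distance(rows[a], rows[b])
            for a, b in combinations(rows, 2)}
    print(upgma(list(rows), dist))
```

The Newick string can be pasted into any tree viewer to draw the phylogram. For the exercise itself, replace the toy rows with the rows of the multiple alignment of SeqX, Fly, Mouse and Human; the pair that merges first is the closest pair, and the sequence that joins last is the furthest apart.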
6.
Find out whether molecule X has any similar structures in the PDB, and download the most similar molecule from MMDB for viewing in Cn3D.
What is the PDB accession number for the molecule most similar to our molecule X?
What is the percentage identity of the most similar protein structure to molecule X?
When was it deposited?
What experimental method was used to obtain this structure?
What is the resolution of the molecule?
How many polymer chains are there in the molecule?
How many chemical components are there?