




Gene Functional Annotation Tools and Databases
zhangmin
Providing advanced genomic solutions!

Outline
- What is functional annotation?
- Popular tools - BLAST and HMMER
- Nucleotide and protein databases
- Gene functional annotation and classification
- InterPro and InterProScan
- Practice

What is functional annotation?

Genome Assembly
Assemble the pieces right.

Gene Prediction
"When on board HMS Beagle, as naturalist, I was much struck with certain facts in the distribution of the inhabitants of South America, and in the geological relations of the present to the past inhabitants of that continent. These facts seemed to me to throw some light on the origin of species - that mystery of mysteries, as it has been called by one of our greatest philosophers."
Identify the words.

Functional Annotation
naturalist [nach-er-uh-list, nach-ruh-], noun
1. a person who studies or is an expert in natural history, especially a zoologist or botanist.
2. an adherent of naturalism in literature or art.
Origin: 1580-90; natural + -ist

Origin of Species, The, noun
(On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life) a treatise (1859) by Charles Darwin setting forth his theory of evolution.

Identify the function (i.e., meaning) of each word.
Databases, profiles

What information can be used for functional annotation?
- Sequence-based approaches: protein A has function X, and protein B is a homolog (ortholog) of protein A; hence B has function X.
- Structure-based approaches: protein A has structure X, and X has such-and-such structural features; hence A's functional sites are ...
- Motif-based approaches (sequence motifs, 3D motifs): a group of genes have function X and they all have motif Y; protein A has motif Y; hence protein A's function might be related to X.
- "Guilt by association": gene A has function X and gene B is often "associated" with gene A; hence B might have a function related to X. Examples: domain fusion, phylogenetic profiling, PPI, etc.
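
The sequence-based rule above (B is a homolog of A, so B inherits A's function) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hit tuples, the `known_functions` table, the function name `transfer_annotations`, and the 1e-5 E-value cutoff are all hypothetical and do not come from the slides.

```python
def transfer_annotations(hits, known_functions, max_evalue=1e-5):
    """Assign each query the function of its best annotated hit.

    hits: iterable of (query_id, subject_id, evalue) tuples (hypothetical format).
    known_functions: dict mapping subject_id -> function label.
    max_evalue: illustrative cutoff; hits with a larger E-value are ignored.
    """
    best = {}  # query_id -> (evalue, subject_id) of the best usable hit so far
    for query_id, subject_id, evalue in hits:
        # Skip weak hits and hits to proteins without a known function.
        if evalue > max_evalue or subject_id not in known_functions:
            continue
        if query_id not in best or evalue < best[query_id][0]:
            best[query_id] = (evalue, subject_id)
    return {q: known_functions[s] for q, (_, s) in best.items()}


# Toy data, invented for illustration only.
known_functions = {"protA": "function X"}
hits = [
    ("protB", "protA", 1e-40),  # strong hit to an annotated protein
    ("protB", "protC", 1e-50),  # better hit, but protC has no annotation
    ("protD", "protA", 0.5),    # too weak to support a transfer
]
annotations = transfer_annotations(hits, known_functions)
print(annotations)
```

In this toy run only protB receives an annotation. A transfer of this kind is a hypothesis rather than a proof of function, which is why the slides phrase the weaker approaches with "might".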
Popular tools - BLAST and HMMER

Biological Sequences
Sequence similarity is a powerful tool for discovering biological function. Just as the ancient Greeks used comparative anatomy to understand the human body and linguists used the Rosetta stone to decipher Egyptian hieroglyphs, today we can use comparative sequence analysis to understand genomes, RNAs, and proteins. But why are biological sequences similar to one another in the first place? The answer to this question isn't simple and requires an understanding of molecular and evolutionary biology. Biological sequences like proteins may have important functions necessary for the survival of an organism. But DNA sequence can mutate randomly, and this may change how a sequence functions. Over time, both functional constraints and random processes impact the course of sequence evolution. The degree to which a sequence follows a functional or random path depends on natural selection and neutral evolution. So sequences are similar to one another because they start out similar and then follow different paths.

Basic Local Alignment Search Tool (BLAST)
- Divide a query sequence into short chunks called words.
- Look for exact matches.
- In case of a hit, try extending the alignment.
- Statistical assessment.

Different flavors!
- BLASTN: queries nucleotide vs. nucleotide sequences.
- BLASTP: queries protein vs. protein sequences.
- BLASTX: queries the 6 possible frames of nucleotide sequences vs. protein sequences.
- TBLASTN: reciprocal of BLASTX.
- TBLASTX: queries the 6 possible frames of nucleotide sequences vs. the 6 possible frames of nucleotide sequences inside the database (both the database and the query nucleotide sequences are translated in 6 frames).

HMMER
HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).
[Figure: Representation of a hidden Markov model based on a multiple sequence alignment.]

HMMER algorithms
- hmmscan: searches protein sequences against collections of profiles, e.g. Pfam. In HMMER2 this was called hmmpfam.
- hmmsearch: searches one or more profiles against a protein sequence database.
- jackhmmer: iteratively searches a query protein sequence, multiple sequence alignment or profile HMM against the target protein sequence database.
- phmmer: searches one or more query protein sequences against a protein sequence database.
/search/hmmscan

Nucleotide and protein databases
- NCBI (USA), EMBL (Europe), DDBJ (Japan)
- EST, STS, GSS, Genomes, RefSeq, HTG, etc.

International Nucleotide Sequence Database Collaboration
GenBank: CoreNucleotide - Nt/Nr, dbEST, dbGSS

NCBI Nt/Nr
- Nt (nucleotide collection): consists of GenBank + EMBL + DDBJ + PDB + RefSeq sequences, but excludes EST, STS, GSS, WGS, TSA and patent sequences as well as phase 0, 1, and 2 HTGS sequences. The database is partially non-redundant.
- Nr (non-redundant protein sequences): all non-redundant GenBank CDS translations + PDB + SwissProt + PIR + PRF, excluding environmental samples from WGS projects.

UniProtKB/Swiss-Prot
UniProtKB, the protein knowledgebase, consists of two sections:
- Swiss-Prot: manually annotated and reviewed.
- TrEMBL: automatically annotated and not reviewed.

Model Organism Genomes
Useful tools:
- Keyword search
- BLAST, BLAT
- Genome browser
- BioMart
- Other functional resources

Gene functional annotation and classification
To interpret a protein in the context of biological function: protein domains, families, functional sites, pathways or other biologically meaningful aspects.
- Protein domain family: Pfam
- Gene Ontology
- KEGG pathway
- KOG/COG

Pfam
- 14831 families; high-quality Pfam-A, low-quality Pfam-B.
- Annotation tool: hmmscan (HMMER 3.0)
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
[Figure: Pfam features]

Gene Ontology
Aims to standardize the representation of gene and gene product attributes across species and databases.
GO covers three domains: biological process, cellular component and molecular function.
Example: Cytochrome P450 11B1, mitochondrial (mitochondrial P450)
- GO cellular component term (where is it?): GO:0005743, mitochondrial inner membrane
- GO molecular function term (what does it do?): GO:0004497, monooxygenase activity (substrate + O2 = CO2 + H2O + product)
- GO biological process term (which process is this?): GO:0006118, electron transport
GO terms form a directed acyclic graph (DAG) with part_of and is_a relationships.

GO Annotation
- Mappings to GO: EC2GO, Pfam2GO, COG2GO
- Annotation tools: Blast2GO, GOanna, GOtcha

COG/KOG
- COG: Clusters of Orthologous Groups of proteins
- KOG: euKaryotic Ortholog Groups
- How to define an ortholog? BeT: best hit.
- Each COG includes proteins from at least three sufficiently distant species.

Kyoto Encyclopedia of Genes and Genomes
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of online databases dealing with genomes, enzymatic pathways, and biological chemicals. The PATHWAY database records networks of molecular interactions in the cells, and variants of them specific to particular organisms.
Kanehisa Laboratories

KEGG orthology
- KAAS, for ortholog assignment and pathway mapping
- A set of representative genomes, bi-directional best hit

KEGG pathway
hsa00010, ko00010, map00010: Glycolysis / Gluconeogenesis

KEGG API
http://rest.kegg.jp/<operation>/<argument>/...
<operation> = info | list | find | get | conv | link
<argument> = <database> | <dbentries>
<database>: path for KEGG pathway, ko for KEGG orthology
<dbentries>: <dbentry>+<dbentry>+...

TASK
1. Get a KEGG pathway map.
2. Get the list of genes involved in that pathway.

TASK 1: Get a KEGG pathway map.
http://rest.kegg.jp/info/pathway
http://rest.kegg.jp/list/pathway
http://rest.kegg.jp/get/map00010/image (Glycolysis / Gluconeogenesis)

TASK 2: Get the list of genes involved in that pathway.
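
The KEGG API URL scheme above can be sketched as a small Python helper. This is a minimal sketch, not an official client: the names `kegg_url` and `kegg_fetch` are hypothetical, the base URL is the one given in the slides, and the convention of passing a list to stand for several `+`-joined entries is an assumption made for this example.

```python
from urllib.request import urlopen

KEGG_BASE = "http://rest.kegg.jp"  # base URL as given in the slides
OPERATIONS = {"info", "list", "find", "get", "conv", "link"}


def kegg_url(operation, *arguments):
    """Build a URL of the form BASE/<operation>/<argument>/..."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown KEGG operation: {operation!r}")
    if not arguments:
        raise ValueError("at least one argument is required")
    parts = []
    for arg in arguments:
        # A list or tuple stands for several database entries joined with '+'.
        if isinstance(arg, (list, tuple)):
            arg = "+".join(arg)
        parts.append(arg)
    return "/".join([KEGG_BASE, operation, *parts])


def kegg_fetch(operation, *arguments, timeout=30):
    """Download a text resource; needs network access and is not meant for images."""
    with urlopen(kegg_url(operation, *arguments), timeout=timeout) as response:
        return response.read().decode("utf-8")


# The three TASK 1 URLs, rebuilt from their parts (no network access needed).
task1_urls = [
    kegg_url("info", "pathway"),
    kegg_url("list", "pathway"),
    kegg_url("get", "map00010", "image"),
]
for url in task1_urls:
    print(url)
```

Calling `kegg_fetch("list", "pathway")` would download the pathway list as text. Check the server's current behaviour before relying on it, since the scheme and host may have changed since the slides were written.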