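Hi everyone,
Is it possible to build a GO annotation pipeline in R? The main features I would like are:
I. Input:
1. Support for common database identifiers, such as protein GI numbers, protein names, UniProt IDs, gene GI numbers, gene names, and gene IDs;
2. Support for gene or protein sequences as input, similar to blast2go;
II. Output:
1. The mapping between gene products and GO terms;
2. GO classification charts such as pie charts, similar to the category plots produced by WEGO;
3. GO enrichment analysis.
Any advice would be much appreciated.
Thanks~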
The "topGO" package in Bioconductor is suitable for these jobs, especially the GO enrichment analysis and category plotting (the "Output" items in your query). For the "Input" part, converting between gene identifiers can easily be done with web-based tools such as "gene converter" (http://idconverter.bioinfo.cnio.es/). R packages such as "biomaRt" are also recommended if you are familiar with the R language. For gene/protein sequence queries, I have not found an R package with this kind of function. My suggestion is to convert each sequence to its gene name first, and then use the tools mentioned above.
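To make the "Output" items more concrete, here is a rough base-R sketch of what they boil down to: a gene-to-GO table, per-term counts for a WEGO-style chart, and a one-sided hypergeometric enrichment test. The gene names, the selected gene set, and the tiny mapping are made up purely for illustration, and the GO IDs serve only as labels. This is not how topGO is implemented; topGO can additionally take the GO graph structure into account, which this sketch ignores.

```r
# Toy gene-to-GO mapping (hypothetical genes; GO IDs are used only as labels).
gene2go <- list(
  geneA = c("GO:0006412", "GO:0008152"),
  geneB = c("GO:0006412"),
  geneC = c("GO:0008152"),
  geneD = c("GO:0006412", "GO:0007165"),
  geneE = c("GO:0007165"),
  geneF = c("GO:0008152"),
  geneG = c("GO:0006412"),
  geneH = c("GO:0007165")
)
universe <- names(gene2go)                 # background: all annotated genes
selected <- c("geneA", "geneB", "geneD")   # hypothetical genes of interest

# Output 1: one row per (gene, GO term) pair.
pairs <- data.frame(
  gene = rep(names(gene2go), lengths(gene2go)),
  go   = unlist(gene2go, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Output 2: genes per term, i.e. the counts a WEGO-style chart would show.
term_counts <- table(pairs$go)
# barplot(term_counts, las = 2)   # uncomment to draw the chart

# Output 3: one-sided hypergeometric test for over-representation of a term.
enrich <- function(term) {
  in_term <- unique(pairs$gene[pairs$go == term])
  k <- sum(selected %in% in_term)   # selected genes annotated with the term
  m <- length(in_term)              # background genes annotated with the term
  n <- length(universe) - m         # background genes without the term
  # P(X >= k) when drawing length(selected) genes without replacement
  phyper(k - 1, m, n, length(selected), lower.tail = FALSE)
}

terms <- names(term_counts)
p_values <- vapply(terms, enrich, numeric(1))
result <- data.frame(
  go        = terms,
  annotated = as.integer(term_counts),
  p         = p_values,
  p_adj     = p.adjust(p_values, method = "BH")  # multiple-testing correction
)
print(result[order(result$p), ])
```

With real data, gene2go would come from an annotation source (for example a biomaRt query) rather than being typed in by hand, and the background set should be all genes that could have been selected, not just the annotated ones you happen to have.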
Since GO analysis is quite a hot topic, if anyone has better ideas, please share them with me. Thanks.