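Hi everyone,

Is it possible to build a GO annotation pipeline in R? I would mainly like it to do the following:

I. Input:

1. Accept common database identifiers, such as protein GI numbers, protein names, UniProt IDs, gene GI numbers, gene names, and gene IDs;

2. Accept gene or protein sequences as input, as Blast2GO does;

II. Output:

1. The mapping between gene products and GO terms;

2. GO classification charts such as pie charts, similar to the classification plots produced by WEGO;

3. GO enrichment analysis.

Any advice from the experts here would be much appreciated.

Thanks~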

The "topGO" package in Bioconductor is suitable for these jobs, especially the GO enrichment analysis and category plotting (the "Output" items in your question). For the "Input" side, gene name conversion can easily be done with web-based tools such as "gene converter" (http://idconverter.bioinfo.cnio.es/). The R package "biomaRt" is also recommended if you are familiar with the R language. For gene/protein sequence queries, I have not found an R package that offers this kind of function. My suggestion is to first convert each sequence to its gene name, and then use the tools mentioned above.
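To make the enrichment step more concrete, here is a minimal sketch of a one-sided hypergeometric (Fisher-type) over-representation test, which as far as I know is the basic kind of test that GO enrichment tools build on. It is written in plain Python with the standard library only and is not topGO's code. The function names `enrichment_p_value` and `go_enrichment`, the gene labels and the GO labels are all made up for illustration, and no multiple-testing correction is applied.

```python
from math import comb


def enrichment_p_value(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_universe:  number of genes in the background set
    n_annotated: background genes annotated to the GO term
    n_selected:  genes of interest (e.g. differentially expressed)
    n_overlap:   genes of interest annotated to the GO term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def go_enrichment(gene2go, selected):
    """Test every GO term found in gene2go for over-representation.

    gene2go:  dict mapping gene -> set of GO labels (defines the background)
    selected: iterable of genes of interest
    Returns a list of (term, n_annotated, n_overlap, p_value), sorted by p-value.
    """
    universe = set(gene2go)
    selected = set(selected) & universe  # ignore genes without annotation

    # Invert the gene -> terms mapping into term -> genes.
    term2genes = {}
    for gene, terms in gene2go.items():
        for term in terms:
            term2genes.setdefault(term, set()).add(gene)

    results = []
    for term, genes in term2genes.items():
        overlap = len(genes & selected)
        p = enrichment_p_value(len(universe), len(genes), len(selected), overlap)
        results.append((term, len(genes), overlap, p))
    return sorted(results, key=lambda row: row[3])


# Toy data: placeholder gene and GO labels, not real annotations.
toy_gene2go = {
    "g1": {"GO:A"},
    "g2": {"GO:A"},
    "g3": {"GO:A", "GO:B"},
    "g4": {"GO:B"},
    "g5": {"GO:B"},
    "g6": {"GO:B"},
}
toy_results = go_enrichment(toy_gene2go, ["g1", "g2", "g3"])
for term, n_annotated, n_overlap, p in toy_results:
    print(term, n_annotated, n_overlap, round(p, 4))
```

In a real analysis the `gene2go` mapping would come from the ID conversion step (for example a table exported from biomaRt), and the p-values would need a correction for multiple testing before being interpreted.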

Since GO analysis is quite a hot topic, if anyone has better ideas, please share them. Thanks.