Galaxy | Tool Preview

Cluster KEGG (version 1.0.0)

What it does

The program builds a network in which each node is a gene category and two categories are connected when they share genes. Each edge is weighted by the number of genes the two categories share. The clustering coefficient, c_u, is then calculated for each node u using the formula:

/repository/static/images/4ab3a886a95d362e/cluster_kegg_formula.png

where deg(u) is the degree of u and the edge weights, w_uv, are normalized by the maximum weight in the network. The program then filters the clustering coefficients against a threshold (either a percentile or a value chosen by the user) and deletes every node whose clustering coefficient is lower than this threshold. Finally, the program reports each connected component of the remaining network as a cluster of gene classifications. This yields fewer gene categories, but the results are easier to interpret because they exclude genes present in many gene groups.
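The steps described here can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code. It assumes the formula is the geometric-mean weighted clustering coefficient, c_u = 1 / (deg(u) (deg(u) - 1)) * sum over neighbor pairs v, w of (w_uv * w_uw * w_vw)^(1/3), with the sum taken over ordered pairs. That form matches the description of deg(u) and the normalized weights, but it is an assumption, since the formula itself is only available as an image. The function names, the category names and the gene sets are all hypothetical, and only a fixed-value threshold is shown (the percentile option is omitted).

```python
from itertools import combinations


def build_network(categories):
    """Connect two categories that share genes; edge weight = number of shared genes."""
    edges = {}
    for a, b in combinations(sorted(categories), 2):
        shared = len(categories[a] & categories[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def clustering_coefficients(nodes, edges):
    """Weighted clustering coefficient per node (geometric-mean form, assumed)."""
    max_weight = max(edges.values(), default=1)
    adj = {n: {} for n in nodes}
    for (a, b), weight in edges.items():
        # Normalize every edge weight by the maximum weight in the network.
        adj[a][b] = adj[b][a] = weight / max_weight
    coeffs = {}
    for u in nodes:
        deg = len(adj[u])
        if deg < 2:
            coeffs[u] = 0.0  # no neighbor pairs, so no triangles are possible
            continue
        total = 0.0
        for v, x in combinations(adj[u], 2):
            if x in adj[v]:  # u, v and x form a triangle
                total += (adj[u][v] * adj[u][x] * adj[v][x]) ** (1 / 3)
        # Each triangle is visited once above; the factor 2 corresponds to
        # summing over ordered neighbor pairs (assumption about the formula).
        coeffs[u] = 2 * total / (deg * (deg - 1))
    return coeffs


def clusters(nodes, edges, coeffs, threshold):
    """Delete nodes below the threshold and return the connected components."""
    kept = {n for n in nodes if coeffs[n] >= threshold}
    adj = {n: set() for n in kept}
    for a, b in edges:
        if a in kept and b in kept:
            adj[a].add(b)
            adj[b].add(a)
    seen, components = set(), []
    for start in sorted(kept):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components


# Hypothetical gene categories, for illustration only.
categories = {
    "catA": {"g1", "g2", "g3"},
    "catB": {"g2", "g3", "g4"},
    "catC": {"g3", "g4", "g5"},
    "catD": {"g5", "g6"},
    "catE": {"g7"},
}
edges = build_network(categories)
coeffs = clustering_coefficients(categories, edges)
print(clusters(categories, edges, coeffs, threshold=0.5))
```

In this toy input, catC is linked to three categories but only two of them are linked to each other, so its coefficient is low and it is dropped at a threshold of 0.5, leaving catA and catB as the single reported cluster.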