 

ChIPCodis

Mining of Regulatory Transcription Factor data in Yeast
Citation: Abascal F, Carmona-Saez P, Carazo JM, Pascual-Montano A. (2008)
ChIPCodis: Mining complex regulatory systems in yeast by concurrent enrichment analysis of Chip-on-chip data.
Bioinformatics, 24: 1208-1209. [Supplementary material]

 
 


[ CNB ]



 


Reference Dataset   [?]
The Harbison et al. dataset contains chip-on-chip data obtained under different environmental conditions (more details).
Harbison et al. (2004) Nature 431, 99-104
The MacIsaac et al. dataset is a subset of the Harbison et al. dataset, and contains chip-on-chip data for which a TF binding site has been successfully identified in the yeast genome. Hence, it is a smaller but more trusted dataset. As implemented in ChIPCodis, the MacIsaac et al. dataset represents a static view of TF binding that does not consider the condition dependence or dynamic nature of TF regulation.
MacIsaac et al. (2006) BMC Bioinformatics 7, 113

Harbison et al (chip-on-chip data, 203 TFs)
MacIsaac et al (chip-on-chip data with binding sites identified, 121 TFs)


Harbison options   [?]
There are three datasets related to the chip-on-chip data of Harbison et al.
  • All available conditions (14): It contains chip-on-chip data for yeast TFs under a series of conditions. Hence, it embodies a dynamic perspective of the regulation of transcription in yeast. It is important to keep in mind that not all the TFs were tested under all 14 conditions, and that those 14 conditions are only a partial view of the environmental conditions faced by yeast.
  • Growth in rich medium: It is a subset of the previous dataset, restricted to the condition of growth in rich medium (the only one for which the complete repertoire of yeast TFs was tested).
  • Compiled data: This dataset represents a static (condition-independent) view of the data of Harbison et al. In the Compiled dataset a TF is said to be bound to a gene if it was bound in at least one of the 14 conditions.
Chip-on-chip conditions     
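The relationship between the three Harbison views can be illustrated with a minimal Python sketch. The input layout (a list of TF, gene, condition triples), the function names and the condition labels are assumptions made for illustration; they are not the ChIPCodis data format.

```python
from collections import defaultdict

# Hypothetical input: (tf, gene, condition) triples meaning
# "tf was found bound to gene under condition".
bindings = [
    ("TF1", "geneA", "rich_medium"),
    ("TF1", "geneA", "heat_shock"),
    ("TF1", "geneB", "heat_shock"),
    ("TF2", "geneB", "heat_shock"),
]

def compile_bindings(triples):
    """Static view: a TF is bound to a gene if bound in at least one condition."""
    compiled = defaultdict(set)
    for tf, gene, _condition in triples:
        compiled[tf].add(gene)
    return dict(compiled)

def restrict_to_condition(triples, condition):
    """Single-condition view, e.g. growth in rich medium only."""
    return compile_bindings(t for t in triples if t[2] == condition)

compiled = compile_bindings(bindings)
rich = restrict_to_condition(bindings, "rich_medium")
```

With this toy input, the compiled view keeps both genes for TF1, while the rich-medium view keeps only geneA and drops TF2 entirely.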

MacIsaac options   [?]
Evolutionary information, in the form of conservation of potential binding sites across closely related species, can also be considered for assessing the reliability of chip-on-chip data.
According to this criterion, the dataset can be restricted to binding sites that are strongly conserved or at least weakly conserved. No evolutionary restriction is applied if the "all predicted sites" option is selected.
According to conservation, include     

P-value threshold   [?]
This p-value provides an estimation of the statistical relevance of the chip-on-chip hybridization signal.
See: Lee et al. (2002) Science 298, 799-804.
Statistical relevance of the chip-on-chip hybridization signal
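As a rough illustration of how such a threshold acts on the data, the sketch below keeps only the TF-gene pairs whose binding p-value passes a chosen cutoff. The dictionary layout, the function name and the use of an inclusive comparison are assumptions, not a description of the ChIPCodis implementation.

```python
# Hypothetical binding p-values keyed by (TF, gene) pair.
binding_pvalues = {
    ("TF1", "geneA"): 0.0004,
    ("TF1", "geneB"): 0.02,
    ("TF2", "geneA"): 0.001,
}

def bound_pairs(pvalues, threshold):
    """Return the (TF, gene) pairs whose p-value is at or below the threshold.

    Whether the real tool treats the cutoff as inclusive is an assumption.
    """
    return {pair for pair, p in pvalues.items() if p <= threshold}

strict = bound_pairs(binding_pvalues, 0.001)
relaxed = bound_pairs(binding_pvalues, 0.05)
```

A stricter threshold yields fewer but more reliable TF-gene associations; a relaxed one includes weaker hybridization signals.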

Paste list of genes   [+] allowed IDs
Type                 Example
Orf name             YCL061C
Systematic name      S000000566
Gene name            mrc1
Uniprot accession    P25588
RefSeq               NP_009871
Example (TCA cycle genes)


Paste list of reference genes (optional)

Other options   [+]
Minimum number of genes   [?]
TFs or combinations of TFs that do not appear in at least the selected number of genes will not be shown.
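The effect of this option can be sketched in Python as follows. The example counts, for every TF and TF combination, the number of input genes in which it appears, and then discards those below the minimum. The brute-force enumeration with `itertools.combinations` is a simple stand-in for the association rule mining used by the tool, and the data and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotation of the input genes with the TFs bound to them.
gene_tfs = {
    "geneA": {"TF1", "TF2"},
    "geneB": {"TF1", "TF2"},
    "geneC": {"TF1"},
}

def tf_set_support(gene_to_tfs, max_size=2):
    """Count in how many genes each TF set (up to max_size TFs) co-occurs."""
    counts = Counter()
    for tfs in gene_to_tfs.values():
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(tfs), k):
                counts[combo] += 1
    return counts

def filter_by_min_genes(counts, min_genes):
    """Keep only TF sets that appear in at least min_genes genes."""
    return {combo: n for combo, n in counts.items() if n >= min_genes}

support = tf_set_support(gene_tfs)
shown = filter_by_min_genes(support, min_genes=3)
```

In this toy case only TF1 alone appears in three genes, so raising the minimum to 3 hides TF2 and the TF1+TF2 combination.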
Statistical test   [?]
The combined annotations identified with the Association Rules Discovery technology can be statistically assessed with two alternative tests: the hypergeometric distribution and the Chi-square test.
More details on these methods can be found here.
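For orientation, a one-sided hypergeometric enrichment p-value can be computed as in the sketch below. It gives the probability of seeing at least k annotated genes in a list of n genes drawn from a reference set of N genes, K of which carry the annotation. The function and parameter names are illustrative assumptions, and only the hypergeometric alternative is shown.

```python
from math import comb

def hypergeom_enrichment_pvalue(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the reference set
    K: reference genes carrying the annotation (e.g. bound by a TF set)
    n: genes in the input list
    k: input genes carrying the annotation
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers: 10 reference genes, 4 annotated, 3 in the list, 2 of them annotated.
p = hypergeom_enrichment_pvalue(N=10, K=4, n=3, k=2)
```

For these toy numbers the tail sums to (6*6 + 4*1) / 120, that is, one third.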
P-value correction   [?]
Select either the FDR or the Permutations method if you want to correct p-values for multiple hypothesis testing.
  • The FDR correction implements the False Discovery Rate method of Benjamini and Hochberg (J R Stat Soc 1995, 57:289-300).
  • The simulation correction method (permutation) implements the approach described in Boyle et al. (Bioinformatics 2004, 20:3710-3715). Briefly, a gene list of the same size as the input list is generated by randomly selecting genes from the set of genes defined as the reference distribution. Frequent sets of annotations are extracted from this random list, and p-values for the annotations and combinations of annotations are calculated using the same statistical test. This process is repeated 1000 times, and the corrected p-value for each set of k annotations is calculated as the fraction of permutations having any annotation set of the same size k with a p-value as good as or better than the observed p-value.
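Both corrections can be sketched in a few lines of Python. The first function is the standard Benjamini-Hochberg step-up adjustment. The second follows the permutation scheme described above, but assumes the p-values from the random gene lists have already been computed and are passed in as `null_runs`. All names and data structures are hypothetical, not those of ChIPCodis.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def permutation_corrected(observed, null_runs):
    """Permutation-corrected p-values.

    observed:  {tuple_of_annotations: p-value} for the input gene list
    null_runs: one dict of the same shape per random gene list
    For each observed set of size k, count the runs containing any set of
    size k with a p-value at least as good as the observed one.
    """
    corrected = {}
    for combo, p_obs in observed.items():
        k = len(combo)
        hits = sum(
            1
            for run in null_runs
            if any(len(c) == k and p <= p_obs for c, p in run.items())
        )
        corrected[combo] = hits / len(null_runs)
    return corrected

fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

observed = {("TF1",): 0.01, ("TF1", "TF2"): 0.02}
null_runs = [
    {("TF3",): 0.005},
    {("TF3",): 0.5, ("TF1", "TF3"): 0.01},
    {},
    {("TF2",): 0.2},
]
perm = permutation_corrected(observed, null_runs)
```

The toy example uses four random runs only to keep it readable; the procedure described above uses 1000.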

E-mail (optional)   

    
 

 
