TRANSFAC

TRANSFAC® release 2018.3 is out now!

The gold standard in the area of transcriptional regulation.

TRANSFAC® is the database of eukaryotic transcription factors, their genomic binding sites and DNA-binding profiles. It dates back to a very early compilation, has been carefully maintained and curated ever since, and has become the gold standard in the field. It can be used from within the geneXplain platform.

In particular, its library of positional weight matrices is a unique collection of DNA-binding models, suitable for a comprehensive analysis of genomic sequences for potential transcription factor binding sites (TFBSs).

You can use TRANSFAC® as an encyclopedia of transcriptional regulation, or as a tool to identify potential TFBSs. The latter can be done with the proven tool Match™, or with any of the respective modules in the geneXplain platform.

Structure

The core of TRANSFAC® comprises two domains: one documents TF binding sites, usually in promoters or enhancers; the other describes the binding proteins (TFs).

On top of each of these two domains, an abstract view of its contents is provided:

Binding sites referring to the same TF are merged into a positional weight matrix. Such a matrix reflects the frequency with which each nucleotide is found in each position of this TF’s binding sites and, thus, the base preference in each position.

Transcription factors are grouped into classes, based on the general properties of their DNA-binding domains. This early attempt has been expanded to a comprehensive TF classification, the latest version of which can be found here.
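
The matrix construction described above (merging the binding sites of one TF into per-position nucleotide frequencies) can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used to build the TRANSFAC® matrices: it assumes the sites are already aligned, have equal length, and contain only the uppercase letters A, C, G and T. The function names and the example sites are invented for this sketch.

```python
from collections import Counter

BASES = "ACGT"

def count_matrix(sites):
    """Count how often each base occurs at each position of aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        matrix.append({base: counts.get(base, 0) for base in BASES})
    return matrix

def frequency_matrix(counts):
    """Convert per-position counts into relative frequencies."""
    freqs = []
    for column in counts:
        total = sum(column.values())
        freqs.append({base: column[base] / total for base in BASES})
    return freqs

# Hypothetical aligned binding sites, invented for illustration only.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
counts = count_matrix(example_sites)
freqs = frequency_matrix(counts)
for pos, column in enumerate(freqs, start=1):
    print(pos, column)
```

Each printed row is one position of the matrix; a position where a single base has a frequency close to 1 indicates a strong base preference, while a flat row indicates a position that tolerates variation.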

Key features

Interlinked reports connecting transcription factors, their experimentally-characterized binding sites and regulated genes, as well as promoter reports with mapped annotated TF-binding sites and high-throughput data (ChIP-seq etc.)

More than 70,000 site reports containing details from the primary literature for more than 300 species, with a focus on human, mouse, rat, yeast, and plants

More than 23,000 transcription factor (and 1,200 miRNA) reports, a subset of which provide GO functional assignments, disease associations and expression pattern assignments

More than 67,000 manually annotated transcription factor–site interactions; plus more than 57,000 miRNA-target site interactions

More than 2,000 high-throughput TFBS ChIP experiment reports comprising 27M transcription factor bound fragments/intervals, many of which have been annotated with the best-scoring binding site and neighboring genes, as well as 161 DNase hypersensitivity ChIP-Seq experiments comprising 15M fragments and 1M histone modification fragments

More than 7,000 positional weight matrices, to be used by Match™, FMatch, CMsearch and a number of geneXplain bricks to predict TF binding sites

More than 360,000 promoter reports for human and nine other organisms, including transcription start sites, CpG islands, single nucleotide polymorphisms (SNPs) and various other annotations

Tools for TF-binding site prediction, de novo motif identification, matrix comparison and miRNA regulator identification

A pathway visualization tool for building custom regulatory networks from experimentally demonstrated factor-DNA and factor-factor interactions, as well as a functional analysis tool for identifying shared attributes in an analyzed gene set

More detailed statistics can be obtained here.

Benefits

Quickly access detailed reports for transcription factors, their experimentally-characterized binding sites and regulated genes, and ChIP experiments without tedious and time-consuming literature searches.

Predict transcription factor binding sites within a DNA sequence using your own or TRANSFAC’s positional weight matrices.

Build custom transcription regulatory networks.

Use TRANSFAC®'s positional weight matrices as an integral part of the geneXplain platform.
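
To make the idea of predicting binding sites with a positional weight matrix more concrete, here is a minimal sketch in Python. It is a generic log-odds scan and not the Match™ algorithm, whose scoring is not described on this page. The sketch assumes a uniform background (0.25 per base), a pseudocount of 0.5, and forward-strand scanning only; the count matrix, the sequence, the threshold and all function names are invented for illustration.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log2-odds scores against a uniform background."""
    scores = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        scores.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return scores

def scan(sequence, scores, threshold):
    """Return (start, window, score) for each forward-strand window scoring >= threshold."""
    width = len(scores)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous bases such as N
        score = sum(scores[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical count matrix (one dict per position), invented for illustration.
example_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 4},
    {"A": 0, "C": 0, "G": 4, "T": 0},
    {"A": 4, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 3, "G": 1, "T": 0},
]
pwm = log_odds_matrix(example_counts)
hits = scan("CCTGACGGTGAGAA", pwm, threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

The threshold controls the trade-off between missed sites and false positives; a real analysis would also scan the reverse complement and use a background model suited to the genome in question.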

New release

TRANSFAC® release 2018.2

The TRANSFAC® database on transcription factors, their genomic binding sites and DNA-binding motifs (PWMs), contains these new data features:

  • Performance assessment of TRANSFAC® PWMs and derived matrix recommendations

Out of the large collection of PWMs in the TRANSFAC® database, a non-redundant library was compiled comprising the best-performing DNA-binding motifs of 2,799 transcription factors in total.

The user can now choose among four new PWM profiles consisting of recommended matrices for vertebrate, plant, fungal, and insect factors to be used with MATCH (to predict transcription factor binding sites, TFBSs, in DNA sequences) or FMATCH (to identify enriched TFBSs in a set of DNA sequences).

  • Integration of new human ChIP-Seq experiments from ENCODE

164 new human transcription factor binding site ChIP-Seq experiments released by the ENCODE phase 3 project between October 2017 and January 2018 have been integrated. The data sets comprise 2,570,897 fragments bound by 122 distinct transcription factors, of which 68 factors were not yet covered by ChIP-Seq data.

For 76 of the sets, an existing positional weight matrix for the respective transcription factor was used together with the MATCH tool to predict a total of 1,497,691 best binding sites inside the fragments.

Predicted best binding sites as well as complete fragments are available in FASTA and BED format via the ChIP Experiment Reports, as are lists of genes located within a user-specified distance of the fragments.

  • Addition of public human ChIP-Seq experiments from other sources

1,757 human ChIP-Seq data sets published in GEO and ArrayExpress and re-analyzed by the ReMap 2018 project have been incorporated. The experiments involve 48,509,720 fragments bound by 342 distinct transcription factors, including 190 without a previous ChIP-Seq data set in the database. The peaks were taken from the “all peaks” catalog, which preserves the cell specificity of the original experiments.

  • Ensembl version update

Genomic information for genes, promoters, and ChIP fragments for the species human, mouse, rat, macaque, and Arabidopsis is now based on Ensembl release 91.
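
As an illustration of how fragments exported in BED format can be related to nearby genes, the following Python sketch parses BED lines and lists the genes whose transcription start site (TSS) lies within a chosen distance of each fragment. It is not the implementation behind the ChIP Experiment Reports. It assumes standard BED coordinates (0-based start, exclusive end) and 0-based TSS positions; the fragments, gene names, positions and function names are all hypothetical.

```python
def parse_bed(text):
    """Parse BED lines into (chrom, start, end) tuples; starts are 0-based, ends exclusive."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        fields = line.split()
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals

def distance(start, end, position):
    """Distance in bp from a 0-based position to the interval [start, end); 0 if inside."""
    if position < start:
        return start - position
    if position >= end:
        return position - end + 1
    return 0

def genes_near(intervals, genes, max_distance):
    """Map each interval to the names of genes whose TSS lies within max_distance bp."""
    result = {}
    for chrom, start, end in intervals:
        result[(chrom, start, end)] = [
            name for name, (g_chrom, tss) in genes.items()
            if g_chrom == chrom and distance(start, end, tss) <= max_distance
        ]
    return result

# Hypothetical fragments and gene TSS positions, invented for illustration.
bed_text = "chr1\t1000\t1200\nchr2\t500\t650\n"
example_genes = {
    "geneA": ("chr1", 1500),
    "geneB": ("chr1", 9000),
    "geneC": ("chr2", 600),
}
nearby = genes_near(parse_bed(bed_text), example_genes, max_distance=1000)
for interval, names in nearby.items():
    print(interval, names)
```

The nested loop is adequate for small inputs; for millions of fragments, an interval tree or sorted sweep would be the more appropriate design.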

Free trial

Thank you very much for your interest in our programs!

Please contact us and you will be provided with your free trial version.

Promoter analysis

Learn more about promoter analysis with TRANSFAC® in the geneXplain platform.

Transcription Factor Classification

Most transcription factors (TFs) possess a DNA-binding domain (DBD), which mediates the recognition of specific, short DNA sequence elements in promoters, enhancers, etc. In order to approach the problem of deciphering the underlying DNA-protein recognition code, we have completely revised an earlier TF classification scheme (1,2) by adapting it to the wealth of data that were reported during the last ten years (TFClass; 3-5). TFClass has been implemented at the Dept. of Bioinformatics at the University Medical Center Göttingen (3,6).
Part of this work was done in the context of the Syscol project, where our partners at the Karolinska Institute (Prof. J. Taipale and his team) have characterized the DNA-binding profiles of more than 400 mammalian TFs (7). It will be tempting to compare the similarities of their matrices with the DBD classification reported here, and with our own approaches to classify DNA-binding profiles (8).

References

  1. Wingender, E. (1997) Classification scheme of eukaryotic transcription factors. Mol. Biol. Engl. Tr. 31, 498-512.
  2. Heinemeyer, T., Chen, X., Karas, H., Kel, A.E., Kel, O.V., Liebich, I., Meinhardt, T., Reuter, I., Schacherer, F., Wingender, E. (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res. 27, 318-322.
  3. Wingender, E., Schoeps, T., Dönitz, J. (2013) TFClass: An expandable hierarchical classification of human transcription factors. Nucleic Acids Res. 41, D165-D170.
  4. Wingender, E. (2013) Criteria for an updated classification of human transcription factor DNA-binding domains. J. Bioinform. Comput. Biol. 11, in press.
  5. http://www.edgar-wingender.de/huTF_classification.html
  6. http://tfclass.bioinf.med.uni-goettingen.de/tfclass
  7. Jolma, A., et al. (2013) DNA-Binding Specificities of Human Transcription Factors. Cell 152, 327-339.
  8. Stegmaier, P., Kel, A., Wingender, E., Borlak, J. (2013) A discriminative approach for unsupervised clustering of DNA sequence motifs. PLoS Comput. Biol. 9, e1002958.
  9. Wingender, E., Schoeps, T., Haubrock, M., Dönitz, J. (2015) TFClass: a classification of human transcription factors and their rodent orthologs. Nucleic Acids Res. 43, D97-D102.
  10. Wingender, E., Schoeps, T., Haubrock, M., Krull, M., Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347.

Information downloads

TRANSFAC® Statistics 2018.3 (download)
TRANSFAC® Release 2018.3 (download)
TRANSFAC® Statistics 2018.2 (download)
TRANSFAC® Release 2018.2 (download)
TRANSFAC® Release and Statistics 2018.1 (download)
TRANSFAC® Flyer (download)
TRANSFAC® Documentation (download)
TRANSFAC® Video (at YouTube)
See also the TRANSFAC® entry at Wikipedia.
More about TRANSFAC as a scientific project and its history on the pages of Edgar Wingender.
TRANSFAC® is a registered trademark of QIAGEN.

Recent publications

Wingender, E., Schoeps, T., Haubrock, M., Krull, M. and Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347. PubMed.

Kaplun, A., Krull, M., Lakshman, K., Matys, V., Lewicki, B., Hogan, J.D. (2016) Establishing and validating regulatory regions for variant annotation and expression analysis. BMC Genomics 17 (Suppl. 2):393. PubMed.

Wingender, E. (2008) The TRANSFAC project as an example of framework technology that supports the analysis of genomic regulation. Brief. Bioinform. 9:326-332. PubMed.

Matys, V., Kel-Margoulis, O.V., Fricke, E., Liebich, I., Land, S., Barre-Dirrie, A., Reuter, I., Chekmenev, D., Krull, M., Hornischer, K., Voss, N., Stegmaier, P., Lewicki-Potapov, B., Saxel, H., Kel, A.E., Wingender, E. (2006) TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34:D108-D110. PubMed.

Kel, A.E., Gössling, E., Reuter, I., Cheremushkin, E., Kel-Margoulis, O.V., Wingender, E. (2003) MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 31:3576-3579. PubMed

Wingender, E., Dietze, P., Karas, H., Knüppel, R. (1996) TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 24:238-241. PubMed

Knüppel, R., Dietze, P., Lehnberg, W., Frech, K., Wingender, E. (1994) TRANSFAC retrieval program: a network model database of eukaryotic transcription regulating sequences and proteins. J. Comput. Biol. 1:191-198. PubMed

Wingender, E. (1988) Compilation of transcription regulating proteins. Nucleic Acids Res. 16:1879-1902. PubMed
