TRANSFAC download

The TRANSFAC® flat file download (including the databases TRANSCompel® and TRANSPro™) contains eukaryotic transcription factors (and miRNAs), their experimentally determined genomic binding sites and consensus DNA-binding motifs (PWMs), as well as data on combinatorial gene regulation and factor-factor interactions. Promoters, enhancers and silencers annotated with transcription factor ChIP-Seq, DNase hypersensitivity and histone-methylated intervals from the ENCODE project and other sources complement the manually curated binding site data. Based on the positional weight matrices (PWMs), transcription factor binding sites can be predicted in regulatory regions. In the TRANSFAC® flat file download, the tools of the Match™ Library can be run on the command line, or the PWMs can be used with the user's own tools.

 

The following screenshot of the TRANSFAC download page shows all the archives included in the download package and their sizes:

[Figure: TRANSFAC download archives]

Key features

  • Intended for bioinformaticians
  • No installation is needed – just download and unzip the archives
  • Data files are provided in DAT and JSON formats
  • Promoters are provided in DAT and GTF formats
  • Direct data access without a user interface: data can be extracted with Perl scripts or other programs written by the user
  • Java-based tools for TFBS search (Match Library) are accessible via the command line
  • For use with customer tools and incorporation into user-specific pipelines
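As an illustration of direct data access, the following Python sketch reads records in the two-letter-tag flat file style and collects the count matrix of each entry. It is only a sketch under stated assumptions: the sample record is hypothetical, the helper names `parse_records` and `count_matrix` are made up for this example, and the assumed layout (records ending in `//`, `XX` separator lines, numbered rows holding A, C, G, T counts) should be checked against the documentation shipped with the download.

```python
from collections import defaultdict

# Hypothetical miniature record in the two-letter-tag flat file style.
# Real entries carry many more fields; verify the layout against your release.
SAMPLE = """\
AC  M00000
XX
ID  V$EXAMPLE_01
XX
P0      A      C      G      T
01     10      0      0      0      A
02      0      0     10      0      G
03      5      5      0      0      M
XX
//
"""

def parse_records(text):
    """Yield one dict per record, mapping each line tag to its list of values."""
    record = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):           # assumed record terminator
            if record:
                yield dict(record)
            record = defaultdict(list)
            continue
        tag, _, value = line.partition(" ")
        if tag == "XX" or not tag:          # skip separator and empty lines
            continue
        record[tag].append(value.strip())
    if record:                              # tolerate a missing final terminator
        yield dict(record)

def count_matrix(record):
    """Collect the numbered rows (01, 02, ...) as [A, C, G, T] counts.

    Assumes the column order A, C, G, T given in the sample header line.
    """
    rows = []
    for tag in sorted(t for t in record if t.isdigit()):
        fields = record[tag][0].split()
        rows.append([float(x) for x in fields[:4]])
    return rows

records = list(parse_records(SAMPLE))
print(records[0]["AC"][0], records[0]["ID"][0])
print(count_matrix(records[0]))
```

Keeping every tag as a list of values lets the same reader handle tags that occur more than once in a record without special cases.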

Structure

The structure of the TRANSFAC® flat file download release is as follows:

[Figure: TRANSFAC download structure]

At the center of TRANSFAC® are transcription factors and their DNA-binding sites, based on experimental evidence extracted from scientific publications. The interlinked factor and site entries are captured in the respective flat files. Interactions between two or more factors are also included. TF-binding sites may or may not be linked to a gene. Genomic sites are mapped to promoter sequences in the TRANSPro™ module of TRANSFAC®. Besides the linkage of factors via binding sites to regulated genes, monomeric factors are also linked to their encoding genes. In some cases this reveals autoregulatory loops, in which a factor regulates the expression of its own gene. Based on the gene-factor and (indirect) factor-gene links, gene-regulatory networks can be extracted.
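The network extraction described above can be sketched in a few lines of Python. The factor and gene names below are hypothetical, and the two dictionaries stand in for links that would in practice be collected from the factor, site and gene flat files. The function names are assumptions made for this example, not part of any TRANSFAC tool.

```python
# Hypothetical links; real ones would be extracted from the flat files.
regulates = {            # factor -> genes it regulates via its binding sites
    "FactorA": {"geneA", "geneB"},
    "FactorB": {"geneC"},
}
encoded_by = {           # factor -> gene that encodes it
    "FactorA": "geneA",
    "FactorB": "geneB",
}

def autoregulatory(regulates, encoded_by):
    """Return factors that regulate the gene encoding themselves."""
    return sorted(f for f, genes in regulates.items()
                  if encoded_by.get(f) in genes)

def factor_network(regulates, encoded_by):
    """Return factor -> factor edges: A -> B when A regulates the gene encoding B."""
    gene_to_factors = {}
    for factor, gene in encoded_by.items():
        gene_to_factors.setdefault(gene, set()).add(factor)
    edges = set()
    for factor, genes in regulates.items():
        for gene in genes:
            for target in gene_to_factors.get(gene, ()):
                edges.add((factor, target))
    return sorted(edges)

loops = autoregulatory(regulates, encoded_by)
edges = factor_network(regulates, encoded_by)
print(loops)
print(edges)
```

In this toy data, FactorA regulates its own gene and also the gene encoding FactorB, so the network contains one self-loop and one factor-to-factor edge.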

The transcription factor classification from TFClass has been integrated. Based on the collection of DNA-binding sites for a transcription factor, consensus binding motifs in the form of positional weight matrices (PWMs) are derived, which can be used by the command line tools in the Match Library for transcription factor binding site prediction in regulatory DNA sequences. Synergistic and antagonistic interactions between transcription factors binding to closely situated sites, so-called composite regulatory elements, are included in the COMPEL flat file of the TRANSCompel® module of TRANSFAC®. ChIP-seq and other high-throughput data are also included and are mapped to promoter and enhancer regions.
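To make PWM-based site prediction concrete, here is a minimal Python sketch that converts a count matrix to log-odds scores and slides it along a sequence. It is a generic log-odds scan, not the scoring scheme used by the Match tools. The counts, pseudocount, uniform background and threshold are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def log_odds(counts, pseudocount=0.5, background=0.25):
    """Turn per-position [A, C, G, T] counts into log2 odds against a uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        pwm.append([math.log2((c + pseudocount) / total / background) for c in row])
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring at or above threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue                        # skip windows with ambiguous bases
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

# Hypothetical counts for a three-position motif favoring "AGT".
counts = [[10, 0, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
pwm = log_odds(counts)
hits = scan("ccAGTttagtN", pwm, threshold=3.0)
print(hits)
```

A production scan would also score the reverse complement and choose thresholds per matrix. Both are left out here to keep the example short.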

TRANSFAC products

TRANSFAC® online

  • Interface: Web interface (GUI)
  • Data access / download: Export of subsets of certain data types and analysis results
  • Data: Core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases, plus, for TFs, additional data from HumanPSD (GO, tissue expression)
  • PWMs: No download for PWMs. PWMs can only be used with the included tools
  • Tools / functionality: Web-based tools for TFBS prediction in single promoters or in gene/sequence sets (plus additional tools)
  • Input data type: DNA sequences, gene lists, genomic intervals
  • Analysis size: Limited to pre-processed data sets (of a few hundred genes / sequences / intervals)

TRANSFAC® online + geneXplain® platform

  • Interface: GUI and API
  • Data access / download: Export of subsets of certain data types and analysis results
  • Data: Core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases, plus, for TFs, additional data from HumanPSD (GO, tissue expression)
  • PWMs: No download for PWMs. PWMs can only be used with the included tools
  • Tools / functionality: Tools and customizable workflows for omics data analysis (optionally extendable to pathway and upstream analysis)
  • Input data type: Processed and raw RNA-seq, SNPs, omics data, etc.
  • Analysis size: Complete analysis of omics data or of whole genomes

TRANSFAC® flat file download

  • Interface: Command line
  • Data access / download: Download of flat files with all data (DAT/JSON) for direct access
  • Data: Core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases
  • PWMs: Download of PWMs in matrix.dat for use with internal or external tools
  • Tools / functionality: Command line tools for TFBS prediction (Match Library) or use of PWMs with (compatible) tools of the user
  • Input data type: DNA sequences
  • Analysis size: Whole genome analysis

 


Information downloads

TRANSFAC® statistics (view)
TRANSFAC® download video (at YouTube)
See also the TRANSFAC® entry at Wikipedia.
More about TRANSFAC as a scientific project and its history on the pages of Edgar Wingender.
TRANSFAC® is a registered trademark of QIAGEN.


Transcription Factor Classification

Most transcription factors (TFs) possess a DNA-binding domain (DBD), which mediates the recognition of specific, short DNA sequence elements in promoters, enhancers, etc. To approach the problem of deciphering the underlying DNA-protein recognition code, we have completely revised an earlier TF classification scheme (9,10) by adapting it to the wealth of data reported during the last ten years (TFClass; 6-8). TFClass has been implemented at the Dept. of Bioinformatics at the University Medical Center Göttingen (5,8).
Part of this work was done in the context of the Syscol project, where our partners at the Karolinska Institute (Prof. J. Taipale and his team) have characterized the DNA-binding profiles of more than 400 mammalian TFs (4). It will be tempting to compare the similarities of their matrices with the DBD classification reported here, and with our own approaches to classifying DNA-binding profiles (3).

References

  1. Wingender, E., Schoeps, T., Haubrock, M., Krull, M., Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347.
  2. Wingender, E., Schoeps, T., Haubrock, M., Dönitz, J. (2015) TFClass: a classification of human transcription factors and their rodent orthologs. Nucleic Acids Res. 43, D97-D102.
  3. Stegmaier, P., Kel, A., Wingender, E., Borlak, J. (2013) A discriminative approach for unsupervised clustering of DNA sequence motifs. PLoS Comput. Biol. 9, e1002958.
  4. Jolma, A., et al. (2013) DNA-Binding Specificities of Human Transcription Factors. Cell 152, 327–339.
  5. http://tfclass.bioinf.med.uni-goettingen.de
  6. http://www.edgar-wingender.de/huTF_classification.html
  7. Wingender, E. (2013) Criteria for an updated classification of human transcription factor DNA-binding domains. J. Bioinform. Comput. Biol. 11, 1340007.
  8. Wingender, E., Schoeps, T., Dönitz, J. (2013) TFClass: An expandable hierarchical classification of human transcription factors. Nucleic Acids Res. 41, D165-D170.
  9. Heinemeyer, T., Chen, X., Karas, H., Kel, A.E., Kel, O.V., Liebich, I., Meinhardt, T., Reuter, I., Schacherer, F., Wingender, E. (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res. 27, 318–322.
  10. Wingender, E. (1997) Classification scheme of eukaryotic transcription factors. Mol. Biol. Engl. Tr. 31, 498-512.

Publications

Wingender, E., Schoeps, T., Haubrock, M., Krull, M., Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347.

Kaplun, A., Krull, M., Lakshman, K., Matys, V., Lewicki, B., Hogan, J.D. (2016) Establishing and validating regulatory regions for variant annotation and expression analysis. BMC Genomics 17 (Suppl. 2), 393.

Wingender, E. (2008) The TRANSFAC project as an example of framework technology that supports the analysis of genomic regulation. Brief. Bioinform. 9, 326-332.

Matys, V., Kel-Margoulis, O.V., Fricke, E., Liebich, I., Land, S., Barre-Dirrie, A., Reuter, I., Chekmenev, D., Krull, M., Hornischer, K., Voss, N., Stegmaier, P., Lewicki-Potapov, B., Saxel, H., Kel, A.E., Wingender, E. (2006) TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34, D108-D110.

Kel, A.E., Gössling, E., Reuter, I., Cheremushkin, E., Kel-Margoulis, O.V., Wingender, E. (2003) MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 31, 3576-3579.

Wingender, E., Dietze, P., Karas, H., Knüppel, R. (1996) TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 24, 238-241.

Knüppel, R., Dietze, P., Lehnberg, W., Frech, K., Wingender, E. (1994) TRANSFAC retrieval program: a network model database of eukaryotic transcription regulating sequences and proteins. J. Comput. Biol. 1, 191-198.

Wingender, E. (1988) Compilation of transcription regulating proteins. Nucleic Acids Res. 16, 1879-1902.
