TRANSFAC download

The TRANSFAC® flat file download (including the databases TRANSCompel® and TRANSPro™) contains eukaryotic transcription factors (and miRNAs), their experimentally determined genomic binding sites and consensus DNA-binding motifs (PWMs), as well as data on combinatorial gene regulation and factor-factor interactions. Promoters, enhancers and silencers annotated with transcription factor ChIP-Seq, DNase hypersensitivity and histone methylation intervals from the ENCODE project and other sources complement the manually curated binding site data. Based on the positional weight matrices (PWMs), transcription factor binding sites can be predicted in regulatory regions. With the TRANSFAC® flat file download, the tools of the Match™ Library can be run on the command line, or the PWMs can be used with the user's own tools.
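To illustrate what PWM-based binding site prediction means in practice, here is a minimal Python sketch of scanning a DNA sequence with a log-odds matrix. It is not the Match™ algorithm and does not read any TRANSFAC® file; the toy counts, the pseudocount, the uniform background and the threshold are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

# Toy count matrix, one dict per motif position (values are invented).
TOY_COUNTS = [
    {"A": 1, "C": 1, "G": 0, "T": 8},
    {"A": 0, "C": 1, "G": 9, "T": 0},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 0, "C": 8, "G": 1, "T": 1},
]


def to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append(
            {b: math.log2((column[b] + pseudocount) / total / background) for b in BASES}
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring at least threshold."""
    hits = []
    width = len(pwm)
    sequence = sequence.upper()
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


hits = scan("AATGACGG", to_log_odds(TOY_COUNTS), threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

A production scan would also score the reverse strand and use matrix-specific cut-offs; both are left out here to keep the sketch short.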

 

Here is the screenshot of the TRANSFAC download page showing all the archives included in the download package and their sizes:

TRANSFAC download 2024.1

Key features

  • Intended for bioinformaticians
  • No installation is needed: just download and unzip the archives
  • Data files are provided in DAT and JSON formats
  • Promoters are provided in DAT and GTF formats
  • Direct data access without a user interface: data can be extracted with Perl scripts or other user-written programs
  • Java-based tools for TFBS search (Match Library) are accessible via the command line
  • For use with customers' own tools and incorporation into user-specific pipelines
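As a sketch of the script-based data extraction mentioned above, the following Python snippet parses a line-oriented flat file. It assumes, without confirmation from this page, that each line starts with a two-letter field code followed by its value and that each record ends with a `//` line; the sample text, accession numbers and field codes are invented, so check the TRANSFAC® documentation for the actual layout before relying on it.

```python
from collections import defaultdict

# Invented sample text; the field codes and values are placeholders.
SAMPLE = """\
AC  X00001
ID  EXAMPLE$ONE
XX
DE  first made-up entry
//
AC  X00002
ID  EXAMPLE$TWO
//
"""


def parse_flat_file(text):
    """Split text into records and group each record's values by field code."""
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):  # assumed end-of-record marker
            if current:
                records.append(dict(current))
            current = defaultdict(list)
            continue
        code, value = line[:2], line[2:].strip()
        if not value:
            continue  # skip blank lines and value-less spacer lines
        current[code].append(value)
    if current:  # keep a final record that lacks a terminator
        records.append(dict(current))
    return records


records = parse_flat_file(SAMPLE)
for record in records:
    print(record["AC"][0], record["ID"][0])
```

Each field maps to a list because a code may occur on several lines of one record.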

Structure

The structure of the TRANSFAC® flat file download release is as follows:

TRANSFAC download structure

At the center of TRANSFAC® are transcription factors and their DNA binding sites, based on experimental evidence extracted from scientific publications. The interlinked factor and site entries are captured in the respective flat files. Interactions between two or more factors are also included. TF-binding sites may or may not be linked to a gene. Genomic sites are mapped to promoter sequences in the TRANSPro™ module of TRANSFAC®. Besides the linkage of factors via binding sites to regulated genes, monomeric factors are also linked to their encoding genes. In some cases this reveals autoregulatory loops, where a factor regulates the expression of its own gene. Based on gene-factor and (indirect) factor-gene links, gene-regulatory networks can be extracted.
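The network extraction described above can be sketched in a few lines of Python. The factor and gene identifiers below are hypothetical, and the two dictionaries stand in for links that would have to be read from the factor, site and gene files; the sketch only shows how combining "factor regulates gene" with "gene encodes factor" yields a factor-to-factor network and exposes autoregulatory loops.

```python
# Hypothetical links: which genes each factor regulates via its binding sites.
REGULATES = {
    "F1": {"geneA", "geneB"},
    "F2": {"geneC"},
    "F3": {"geneD"},
}

# Hypothetical links: which factor each gene encodes (geneD encodes none).
ENCODED_FACTOR = {"geneA": "F1", "geneB": "F2", "geneC": "F3"}


def factor_network(regulates, encoded_factor):
    """Map each factor to the factors encoded by the genes it regulates."""
    network = {}
    for factor, genes in regulates.items():
        network[factor] = {encoded_factor[g] for g in genes if g in encoded_factor}
    return network


def autoregulatory_factors(network):
    """Return factors that regulate the gene encoding themselves."""
    return sorted(f for f, targets in network.items() if f in targets)


network = factor_network(REGULATES, ENCODED_FACTOR)
print(network)
print(autoregulatory_factors(network))
```

In this toy data, F1 regulates geneA, which encodes F1 itself, so F1 is reported as autoregulatory.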

The transcription factor classification from TFClass has been integrated. Based on the collection of DNA-binding sites for a transcription factor, consensus binding motifs in the form of positional weight matrices (PWMs) are derived, which can be used by the command line tools in the Match Library for transcription factor binding site prediction in regulatory DNA sequences. Synergistic and antagonistic interactions between transcription factors binding to closely situated sites, so-called composite regulatory elements, are included in the COMPEL flat file of the TRANSCompel® module of TRANSFAC®. ChIP-seq and other high-throughput data are also included and are mapped to promoter and enhancer regions.
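As a rough illustration of how a matrix can be derived from a collection of binding sites, the Python sketch below counts bases per position in a set of aligned sites and reads off a majority consensus. The site sequences are invented, the sites are assumed to be already aligned and of equal length, and the procedure actually used to build the TRANSFAC® matrices is not described here and may differ.

```python
BASES = "ACGT"

# Invented, pre-aligned binding sites of equal length.
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]


def count_matrix(sites):
    """Count how often each base occurs at each position of the aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = [{base: 0 for base in BASES} for _ in range(width)]
    for site in sites:
        for position, base in enumerate(site.upper()):
            matrix[position][base] += 1  # raises KeyError on non-ACGT input
    return matrix


def consensus(matrix):
    """Pick the most frequent base per position (ties go to the first of A, C, G, T)."""
    return "".join(max(BASES, key=column.get) for column in matrix)


matrix = count_matrix(SITES)
for column in matrix:
    print(column)
print(consensus(matrix))
```

A fuller treatment would express mixed positions with IUPAC ambiguity codes rather than a single majority base.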

TRANSFAC products


Download of TRANSFAC data in flat files (.dat) without user interface (factor, matrix, site, gene, enhancer, cell, class, classification, reference, fragment)

Download of TRANSFAC data in JSON without user interface (factor, matrix, site, gene), e.g. for import into a PostgreSQL database

 

Download of TRANSCompel data in flat files (.dat) without user interface (compel, evidence)  

Download of TRANSPro data in flat files (.dat, .gtf) without user interface (promoters) 

Command line tools (Java) for TF-binding site prediction (Match Library) in DNA sequences (FASTA)

   
Fully in-house integration into your analysis pipeline

Use of PWMs with compatible third-party tools or with your own tools


User-friendly interface (GUI)


API (Java, R, Jupyter notebook)

Partial data export via user interface or via API (excluding PWMs)

Prediction of enriched TFBSs in gene/promoter sets, in DNA sequence sets or in sets of genomic intervals

Analysis of human genes for tissue-specific (Human Protein Atlas) and GO-specific transcription factors

Identification of regulatory DNA motifs affected by genomic variants 

Automatic promoter extraction for TFBS prediction

Whole genomes included at the backend, allowing upload of genomic intervals (with the option of uploading further genomes)

Visualization of predicted TF-binding sites together with other features in the included genome browser

Tools and customizable workflows for omics data analysis

Inclusion of certain data from HumanPSD™ (Gene Ontology, tissue expression)


Available packages: TRANSFAC® download; TRANSFAC® omics; TRANSFAC® omics + TRANSFAC® download.


Information downloads

TRANSFAC® statistics (view)
TRANSFAC® download video (at YouTube)
TRANSFAC® Documentation (download)
See also the TRANSFAC® entry at Wikipedia.
More about TRANSFAC as a scientific project and its history can be found on the pages of Edgar Wingender.
TRANSFAC® is a registered trademark of geneXplain GmbH.

Transcription Factor Classification

Most transcription factors (TFs) possess a DNA-binding domain (DBD), which mediates the recognition of specific, short DNA sequence elements in promoters, enhancers and other regulatory regions. To approach the problem of deciphering the underlying DNA-protein recognition code, we have completely revised an earlier TF classification scheme (1,2) by adapting it to the wealth of data reported during the last ten years (TFClass; 3-5). TFClass has been implemented at the Dept. of Bioinformatics at the University Medical Center Göttingen (3,6).
Part of this work was done in the context of the Syscol project, in which our partners at the Karolinska Institute (Prof. J. Taipale and his team) have characterized the DNA-binding profiles of more than 400 mammalian TFs (7). It will be tempting to compare the similarities of their matrices with the DBD classification reported here, and with our own approaches to classifying DNA-binding profiles (8).

References

  1. Wingender, E. (1997) Classification scheme of eukaryotic transcription factors. Mol. Biol. Engl. Tr. 31, 498-512.
  2. Heinemeyer, T., Chen, X., Karas, H., Kel, A.E., Kel, O.V., Liebich, I., Meinhardt, T., Reuter, I., Schacherer, F., Wingender, E. (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res. 27, 318-322.
  3. Wingender, E., Schoeps, T., Dönitz, J. (2013) TFClass: An expandable hierarchical classification of human transcription factors. Nucleic Acids Res. 41, D165-D170.
  4. Wingender, E. (2013) Criteria for an updated classification of human transcription factor DNA-binding domains. J. Bioinform. Comput. Biol. 11, 1340007.
  5. http://www.edgar-wingender.de/huTF_classification.html
  6. http://tfclass.bioinf.med.uni-goettingen.de
  7. Jolma, A., et al. (2013) DNA-Binding Specificities of Human Transcription Factors. Cell 152, 327-339.
  8. Stegmaier, P., Kel, A., Wingender, E., Borlak, J. (2013) A discriminative approach for unsupervised clustering of DNA sequence motifs. PLoS Comput. Biol. 9, e1002958.
  9. Wingender, E., Schoeps, T., Haubrock, M., Dönitz, J. (2015) TFClass: a classification of human transcription factors and their rodent orthologs. Nucleic Acids Res. 43, D97-D102.
  10. Wingender, E., Schoeps, T., Haubrock, M., Krull, M., Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347.

Publications

Wingender, E., Schoeps, T., Haubrock, M., Krull, M., Dönitz, J. (2018) TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 46, D343-D347.

Kaplun, A., Krull, M., Lakshman, K., Matys, V., Lewicki, B., Hogan, J.D. (2016) Establishing and validating regulatory regions for variant annotation and expression analysis. BMC Genomics 17 (Suppl. 2), 393.

Wingender, E. (2008) The TRANSFAC project as an example of framework technology that supports the analysis of genomic regulation. Brief. Bioinform. 9, 326-332.

Matys, V., Kel-Margoulis, O.V., Fricke, E., Liebich, I., Land, S., Barre-Dirrie, A., Reuter, I., Chekmenev, D., Krull, M., Hornischer, K., Voss, N., Stegmaier, P., Lewicki-Potapov, B., Saxel, H., Kel, A.E., Wingender, E. (2006) TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34, D108-D110.

Kel, A.E., Gössling, E., Reuter, I., Cheremushkin, E., Kel-Margoulis, O.V., Wingender, E. (2003) MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 31, 3576-3579.

Wingender, E., Dietze, P., Karas, H., Knüppel, R. (1996) TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 24, 238-241.

Knüppel, R., Dietze, P., Lehnberg, W., Frech, K., Wingender, E. (1994) TRANSFAC retrieval program: a network model database of eukaryotic transcription regulating sequences and proteins. J. Comput. Biol. 1, 191-198.

Wingender, E. (1988) Compilation of transcription regulating proteins. Nucleic Acids Res. 16, 1879-1902.
