The structure of the TRANSFAC® flat file download release is as follows:

[Figure: TRANSFAC download structure]

At the center of TRANSFAC® are transcription factors and their DNA binding sites, which are based on experimental evidence extracted from scientific publications. The interlinked factor and site entries are captured in their respective flat files. Interactions between two or more factors are also included. A TF binding site may or may not be linked to a gene. Genomic sites are mapped to promoter sequences in the TRANSPro™ module of TRANSFAC®. Besides the linkage of factors to regulated genes via binding sites, monomeric factors are also linked to their encoding genes. In some cases this leads to autoregulatory loops, where a factor regulates the expression of its own gene. Based on gene-factor and (indirect) factor-gene links, gene-regulatory networks can be extracted.
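As a rough illustration of how such a network might be assembled once the links have been read from the flat files, the sketch below collapses factor-to-gene ("regulates") and gene-to-factor ("encodes") links into factor-to-factor edges and reports autoregulatory loops. The identifiers, the dictionary layout, and the function names are all made up for this example; they are not TRANSFAC accession numbers or part of any TRANSFAC tool, and parsing of the flat files is deliberately left out.

```python
from collections import defaultdict

# Toy, invented links (NOT taken from TRANSFAC).
# factor -> genes it regulates via binding sites
regulates = {
    "FactorA": {"geneA", "geneB"},
    "FactorB": {"geneC"},
    "FactorC": set(),
}
# gene -> factor it encodes
encodes = {"geneA": "FactorA", "geneB": "FactorB", "geneC": "FactorC"}


def factor_network(regulates, encodes):
    """Collapse factor -> gene -> factor paths into factor -> factor edges."""
    edges = defaultdict(set)
    for factor, genes in regulates.items():
        for gene in genes:
            target = encodes.get(gene)
            if target is not None:  # gene may not encode a known factor
                edges[factor].add(target)
    return dict(edges)


def autoregulatory(edges):
    """Return factors that regulate the gene encoding themselves."""
    return sorted(f for f, targets in edges.items() if f in targets)


network = factor_network(regulates, encodes)
loops = autoregulatory(network)
print(network)
print(loops)
```

In this toy data, FactorA regulates geneA, which encodes FactorA, so it is reported as an autoregulatory loop.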

The transcription factor classification from TFClass has been integrated. From the collection of DNA-binding sites for a transcription factor, consensus binding motifs in the form of positional weight matrices (PWMs) are derived; these can be used by the command-line tools in the Match Library to predict transcription factor binding sites in regulatory DNA sequences. Synergistic and antagonistic interactions between transcription factors binding to closely situated sites, so-called composite regulatory elements, are included in the COMPEL flat file of the TRANSCompel® module of TRANSFAC®. ChIP-seq and other high-throughput data are also included and are mapped to promoter and enhancer regions.
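To make the idea of deriving a matrix from a set of binding sites more concrete, here is a minimal sketch of a generic log-odds weight matrix built from aligned sites and used to scan a sequence. It is not the algorithm used by the Match Library tools and does not reproduce how TRANSFAC® matrices are constructed; the example sites, the pseudocount, the uniform background of 0.25, and all function names are assumptions chosen for illustration. Input is assumed to be uppercase A, C, G, T only.

```python
import math

BASES = "ACGT"


def count_matrix(sites):
    """Count each base at each position of equally long, aligned sites."""
    length = len(sites[0])
    counts = [{b: 0 for b in BASES} for _ in range(length)]
    for site in sites:
        for i, base in enumerate(site):
            counts[i][base] += 1
    return counts


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log2 odds against a uniform background (assumed)."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + len(BASES) * pseudocount
        pwm.append({
            b: math.log2((col[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def best_hit(pwm, sequence):
    """Score every window of the sequence; return (score, start) of the best."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Toy, hand-written sites used only for illustration.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
counts = count_matrix(sites)
pwm = log_odds(counts)
score, position = best_hit(pwm, "GGGTGACTCAGGG")
print(position, round(score, 2))
```

A real workflow would normally apply a score cutoff and scan both strands; both are omitted here to keep the sketch short.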