Archive - geneXplain

TRANSFAC download

The TRANSFAC® flat file download (including the databases TRANSCompel® and TRANSPro™) contains eukaryotic transcription factors (and miRNAs), their experimentally determined genomic binding sites and consensus DNA-binding motifs (PWMs), as well as data on combinatorial gene regulation and factor-factor interaction. Promoters, enhancers and silencers annotated with transcription factor ChIP-Seq, DNase hypersensitivity and histone-methylated intervals from the ENCODE project and from other sources complement the manually curated binding site data. Based on the positional weight matrices (PWMs), transcription factor binding sites can be predicted in regulatory regions. In the TRANSFAC® flat file download, the tools of the Match™ Library can be run on the command line, or the PWMs can be used with the user's own tools.
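To make the idea of PWM-based binding site prediction concrete, here is a minimal Python sketch. It is not the Match™ algorithm and does not use any TRANSFAC® data: it is a generic log-odds scan against a uniform background, and the count matrix, pseudocount, threshold, and test sequence are all invented for illustration.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 1, "G": 0, "T": 8},
]


def to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position counts to log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = to_log_odds(COUNTS)
HITS = scan("TTACGTGGACGTAA", PWM, threshold=4.0)
for start, window, score in HITS:
    print(start, window, round(score, 2))
```

In practice the threshold choice drives the trade-off between false positives and false negatives, so a fixed cut-off like the one above should be treated as a placeholder.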


[Screenshot: TRANSFAC download 2024.1 page, showing all the archives included in the download package and their sizes]

Key features

  • Intended for Bioinformaticians
  • No installation is needed – just download and unzip archives
  • Data files are provided in DAT and JSON formats
  • Promoters are provided in the DAT and GTF formats
  • Direct data access without user interface: data extraction is possible via Perl scripts or other programs written by the user
  • Java-based tools for TFBS search (Match Library) are accessible via a command line
  • For use with customer tools and incorporation into user-specific pipelines
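As an illustration of direct data access from a user-written program, the following Python sketch parses a line-tagged flat file. It assumes an EMBL-style layout in which each line starts with a short field code, `XX` lines act as spacers, and `//` terminates a record. The sample text and its contents are invented, so the actual field codes should be checked against the format documentation shipped with the release.

```python
from collections import defaultdict

# Invented sample text; not taken from any TRANSFAC(R) file.
SAMPLE = """\
AC  X00001
ID  EXAMPLE$FACTOR_A
XX
DE  hypothetical entry one
//
AC  X00002
ID  EXAMPLE$FACTOR_B
DE  hypothetical entry two,
DE  continued on a second line
//
"""


def parse_flat_file(text):
    """Yield one dict per record, mapping each field code to a list of values."""
    record = defaultdict(list)
    for raw in text.splitlines():
        line = raw.rstrip()
        if line == "//":  # assumed record terminator
            if record:
                yield dict(record)
            record = defaultdict(list)
            continue
        code, _, value = line.partition(" ")
        if not code or code == "XX":  # skip blank lines and assumed spacer lines
            continue
        record[code].append(value.strip())
    if record:  # tolerate a missing final terminator
        yield dict(record)


RECORDS = list(parse_flat_file(SAMPLE))
for entry in RECORDS:
    print(entry["AC"][0], entry["ID"][0], " ".join(entry.get("DE", [])))
```

Keeping every field as a list of values preserves repeated and continued lines, which a later step can join or interpret per field.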


The structure of the TRANSFAC® flat file download release is as follows:

[Figure: TRANSFAC download structure]

At the center of TRANSFAC® are transcription factors and their DNA binding sites based on experimental evidence extracted from scientific publications. The interlinked factor and site entries are captured in the respective flat files. Interactions between two or more factors are also included. TF-binding sites may or may not be linked to a gene. Genomic sites are mapped to promoter sequences in the TRANSPro™ module of TRANSFAC®. Besides the linkage of factors via binding sites to regulated genes, monomeric factors are also linked to their encoding genes. In some cases this also leads to autoregulatory loops, where a factor regulates the expression of its own gene. Based on gene-factor and (indirect) factor-gene links, gene-regulatory networks can be extracted.
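The network extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor and gene names are placeholders, and the two link tables stand in for whatever the user extracts from the factor, site, and gene files.

```python
# Hypothetical links; names are placeholders, not database content.
FACTOR_REGULATES_GENE = [
    ("TF_A", "gene_a"),  # TF_A binds a site linked to gene_a
    ("TF_A", "gene_b"),
    ("TF_B", "gene_c"),
    ("TF_C", "gene_b"),
]
GENE_ENCODES_FACTOR = {"gene_a": "TF_A", "gene_b": "TF_B", "gene_c": "TF_C"}


def factor_network(regulates, encodes):
    """Collapse factor -> gene -> encoded factor paths into factor -> factor edges."""
    edges = set()
    for factor, gene in regulates:
        target = encodes.get(gene)
        if target is not None:
            edges.add((factor, target))
    return edges


def autoregulatory_factors(edges):
    """Factors that regulate the gene encoding themselves."""
    return sorted(source for source, target in edges if source == target)


NETWORK = factor_network(FACTOR_REGULATES_GENE, GENE_ENCODES_FACTOR)
LOOPS = autoregulatory_factors(NETWORK)
print(sorted(NETWORK))
print(LOOPS)
```

Genes that do not encode a factor simply drop out of the collapsed network, which keeps the result restricted to regulator-to-regulator edges.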

The transcription factor classification from TFClass has been integrated. Based on the collection of DNA-binding sites for a transcription factor, consensus binding motifs in the form of positional weight matrices (PWMs) are derived, which can be used by the command line tools in the Match Library for transcription factor binding site prediction in regulatory DNA sequences. Synergistic and antagonistic interactions between transcription factors binding to closely situated sites, so-called composite regulatory elements, are included in the COMPEL flat file of the TRANSCompel® module of TRANSFAC®. ChIP-seq and other high-throughput data are also included and are mapped to promoter and enhancer regions.

TRANSFAC products

Features compared for TRANSFAC® omics + and TRANSFAC® download (the per-product availability marks of the comparison table are not preserved in this text):

  • Download of TRANSFAC data in flat files (.dat) without user interface (factor, matrix, site, gene, enhancer, cell, class, classification, reference, fragment)
  • Download of TRANSFAC data in JSON without user interface (factor, matrix, site, gene), e.g. for import into a PostgreSQL database
  • Download of TRANSCompel data in flat files (.dat) without user interface (compel, evidence)
  • Download of TRANSPro data in flat files (.dat, .gtf) without user interface (promoters)
  • Command line tools (Java) for TF-binding site prediction (Match Library) in DNA sequences (FASTA)
  • Fully in-house integration into your analysis pipeline
  • Use of PWMs with compatible third-party tools or with your own tools
  • User-friendly interface (GUI)
  • API (Java, R, Jupyter notebook)
  • Partial data export via user interface or via API (excluding PWMs)
  • Prediction of enriched TFBSs in gene/promoter sets, in DNA sequence sets or in sets of genomic intervals
  • Analysis of human genes for tissue-specific (Human Protein Atlas) and GO-specific transcription factors
  • Identification of regulatory DNA motifs affected by genomic variants
  • Automatic promoter extraction for TFBS prediction
  • Whole genomes included at the backend, allowing upload of genomic intervals (with the option to upload further genomes)
  • Visualization of predicted TF-binding sites together with other features in the included genome browser
  • Tools and customizable workflows for omics data analysis
  • Inclusion of certain data from HumanPSD™ (Gene Ontology, tissue expression)



Genome Enhancer

Release 3.4 is now out!


Welcome to the new era of Precision Medicine

Meet Genome Enhancer – a fully automated pipeline for patient omics data analysis, which identifies prospective drug targets and corresponding treatments by reconstructing the molecular mechanism of the studied pathology. Proven applications of Genome Enhancer include cancer, neurodegenerative diseases, infectious diseases, diabetes and metabolic diseases, hypertension.

Genome Enhancer is also available as a part of TRANSFAC full package – a powerful synergism between the automatic pipeline for multi-omics data processing of Genome Enhancer and the comprehensive bioinformatics toolbox of the geneXplain® platform integrated with TRANSFAC, TRANSPATH, and HumanPSD databases. Find details on the TRANSFAC full package here.





Genome Enhancer uses Upstream Analysis, an integrated promoter and pathway analysis, to identify potential drug targets of the studied pathology.

In the first step of this analysis, the transcription factors that regulate differentially expressed or mutated genes are identified using the TRANSFAC® database of transcription factor binding sites.

The second step searches for common master regulators of the identified transcription factors by building a personalized signal transduction network of the studied pathology using the TRANSPATH® database of mammalian signal transduction and metabolic pathways. The identified master regulators are prospective drug target candidates. They are used for further selection of chemical compounds that may bring therapeutic benefit in the studied clinical case. In this step the HumanPSD database is employed to identify drugs that have been tested in clinical trials. The cheminformatic tool PASS predicts small molecules that can affect the identified targets.

Finally, Genome Enhancer generates a comprehensive analysis report about the personalized drug targets identified for a certain patient, or a group of patients, and the drugs that may be effective in this case. You can view a number of Genome Enhancer demo reports at the corresponding section of this page.


A detailed description of Genome Enhancer analysis schema can be found on this page.

Check out the Genome Enhancer video channel run by the CSO of geneXplain GmbH, Dr. Alexander Kel. See how various omics data can be analyzed in Genome Enhancer, or send your own data to Dr. Kel and he will show its analysis in one of the next videos!

Key features

  • Identifies activated targets in the examined patient data and suggests known and re-purposing drugs
  • Based on the well-accepted databases TRANSFAC®, TRANSPATH®, HumanPSD™
  • Applies unique in-house developed algorithms
  • Suitable for use by researchers and by medical doctors
  • Does not require special skills to operate
  • Analyses all types of omics data starting either with raw or pre-processed data
  • Generates a comprehensive report on the identified drug targets and prospective therapies
  • Generates an MTB (Molecular Tumor Board) report on the patient’s genomics data for a number of pathologies (if no more than two conditions were selected at analysis launch)

In addition, all our customers enjoy the following advantages of our products:

–       Technical and scientific support for your research: our professional support team always provides rapid answers to your questions

–       Secure cloud space connected with online licenses that can be accessed from any location

–       Extensive manuals, documentation, examples, and tutorials available for all our products

–       Frequent releases and updates of our databases contents and software functionality

–       Ability to request personal training sessions

–       All our servers are running on CO2-neutral water or wind power


Available solutions

Genome Enhancer can be licensed independently or as an integrative part of the TRANSFAC full package providing a powerful synergism between the automatic pipeline for multi-omics data processing of Genome Enhancer and the comprehensive bioinformatics toolbox of the geneXplain® platform integrated with TRANSFAC, TRANSPATH, and HumanPSD databases. With TRANSFAC full package you will be able to perform further processing of your analysis results received from Genome Enhancer, create, modify, and use already pre-defined workflows for your multi-omics data analysis, and work with data coming from other model organisms. For more info on TRANSFAC full package please refer to this table.

Only three steps to launch the analysis


Running Genome Enhancer takes only three steps:

1. Upload your data to the server and specify the import options (data type)

2. Split your data by the conditions you want to compare

3. Launch the analysis by specifying the conditions to be compared and the disease and tissue types (optional)

The analysis report will be ready shortly. Depending on your input data, it will include lists of differentially expressed or mutated genes; the transcription factors regulating those genes; the reconstructed signaling network of the studied pathological process; potential drug targets with the corresponding known and repurposing drugs that may be effective in the studied case; and further cheminformatically predicted drug-like compounds. The report also contains a description of the analysis methods used and the references.

Acceptable input data formats

Genome Enhancer works with genomics, transcriptomics, epigenomics, proteomics and metabolomics input data types of the following formats:

Transcriptomics (RNA-seq, microarrays)

*.txt, *.csv, *.xls (table with gene identifiers)

*.CEL (Affymetrix)

*.txt (special Agilent format)

*.txt (special Illumina format)

Epigenomics (ChIP-seq)

*.bam (hg38 only)

*.bed (hg38 only)

*.txt (table with Illumina methylation probe IDs, cg*)

Genomics

*.txt, *.csv, *.xls (table data with SNP identifiers, rs*), *.tsv

Proteomics

*.txt, *.csv, *.xls (table with protein identifiers)

Metabolomics

*.txt, *.csv, *.xls (table with the list of metabolites from the ChEBI database, e.g. CHEBI:57316)

Files of one data format can be uploaded in a .zip archive

What we offer

Multi-omics analysis

Use genomics, transcriptomics, metabolomics, proteomics, and epigenomics data in one analysis run and receive an integrated report


Easy interface

Due to complete automation of the analysis, the system can be used by medical doctors and biologists without any bioinformatics skills


Personalized medicine

Running the analysis on omics data of a certain patient, you will identify personalized prospective drug targets and corresponding treatments


Scientific base

Integration of promoter and enhancer analysis with pathway reconstruction gives unrivaled disease molecular mechanism modeling accuracy

Drug target identification

Genome Enhancer reconstructs a complex network of signal transduction pathways that are activated in the pathology and identifies their key regulators



Flexible pricing

Select the license which fits you best – exclusive access to Genome Enhancer or TRANSFAC full package including a wide range of tools and functionalities.

Report examples

You can view various analysis report examples generated by Genome Enhancer on the basis of different omics input data types and various origins of the studied pathologies.

You are also very welcome to view or download the Genome Enhancer flyer.



Below you will find a compilation of our videos referring to different aspects of Genome Enhancer.

In English

  • This video introduces you to the fully automated pipeline for the easy bioinformatics analysis of multi-omics data. (2:02 min)
  • This video shows how Genome Enhancer results can be interpreted for further use in the clinic, using the example of sensitivity prediction towards VEGFA-targeted therapy for three colorectal cancer patients. (5:51 min)

In Chinese

  • The video provides a preview of the user interface of Genome Enhancer and shows how to perform an analysis using Genome Enhancer. (4:42 min)
  • This video uses transcriptomics data as an example to highlight the detailed content of the data analysis report obtained as the result of an analysis by Genome Enhancer. (3:09 min)
  • Here, Genome Enhancer is applied to analyze multi-omics data of colorectal cancer cell lines to find master regulators and to predict drug compounds affecting them. Thereafter, the effect of the predicted drug compounds on tumors is verified in an animal experiment.





The results of Genome Enhancer analysis, contained in any of the reports produced by this pipeline, are intended for research use only and should not be used for medical or professional advice. GeneXplain GmbH makes no guarantee of the comprehensiveness, reliability or accuracy of the information contained in the reports generated by Genome Enhancer.

Decisions regarding care and treatment of patients should be fully made by attending doctors. The predicted chemical compounds listed in the reports are given only for doctor’s consideration and they cannot be treated as prescribed medication. It is the physician’s responsibility to independently decide whether any, none or all of the predicted compounds can be used solely or in combination for patient treatment purposes, taking into account all applicable information regarding FDA prescribing recommendations for any therapeutic and the patient’s condition, including, but not limited to, the patient’s and family’s medical history, physical examinations, information from various diagnostic tests, and patient preferences in accordance with the current standard of care. Whether or not a particular patient will benefit from a selected therapy is based on many factors and can vary significantly.

The compounds predicted to be active against the identified drug targets in the reports are not guaranteed to be active against any particular patient’s condition. GeneXplain GmbH does not give any assurances or guarantees regarding the treatment information and conclusions given in the reports. There is no guarantee that any third party will provide a refund for any of the treatment decisions made based on these results. None of the listed compounds was checked by Genome Enhancer for adverse side-effects or even toxic effects.

The analysis reports contain information about chemical drug compounds, clinical trials and disease biomarkers retrieved from the HumanPSD™ database of gene-disease assignments maintained and exclusively distributed worldwide by geneXplain GmbH. The information contained in this database is collected from scientific literature and public clinical trials resources. It is updated to the best of geneXplain’s knowledge; however, we do not guarantee the completeness and reliability of this information, leaving the final check and consideration of the predicted therapies to the medical doctor. In all cases, the end user (including researchers and medical doctors) accepts full responsibility for all risks associated with the use of information contained in the reports generated by Genome Enhancer.

The scientific analysis underlying the Genome Enhancer reports employs a complex analysis pipeline which uses geneXplain’s proprietary Upstream Analysis approach, integrated with TRANSFAC® and TRANSPATH® databases maintained and exclusively distributed worldwide by geneXplain GmbH. The pipeline and the databases are updated to the best of geneXplain’s knowledge and belief, however, geneXplain GmbH shall not give a warranty as to the characteristics or to the content and any of the results produced by Genome Enhancer. Moreover, any warranty concerning the completeness, up-to-dateness, correctness and usability of Genome Enhancer information and results produced by it, shall be excluded.

The results produced by Genome Enhancer, including the analysis reports, depend strongly on the quality of the input data used for the analysis. It is the responsibility of Genome Enhancer users to check the input data quality and the parameters used for running the Genome Enhancer pipeline.

Note that the text given in the reports is not unique and can be fully or partially repeated in other Genome Enhancer analysis reports, including reports of other users. This should be considered when publishing any results or excerpts from the reports. This restriction refers only to the general description of the analysis methods used for generating the reports. All data and graphics referring to the concrete set of input data, including lists of mutated genes, differentially expressed genes/proteins/metabolites, functional classifications, identified transcription factors and master regulators, constructed molecular networks, lists of chemical compounds and the reconstructed model of molecular mechanisms of the studied pathology, are unique with respect to the input data set and the Genome Enhancer pipeline parameters used for the current run.

Upstream Analysis


What is Upstream Analysis?

GeneXplain’s proprietary approach to analyzing gene expression data is called Upstream Analysis. The term indicates that it is a causal analysis, providing a clue about the reason why a certain set of genes has been up- (or down-) regulated in the system under study. In contrast, conventional analyses usually reveal the effects of the differentially expressed genes, e.g. by mapping them onto ontological categories.


How does it work?

GeneXplain’s Upstream Analysis is an integrated promoter-pathway analysis. It starts from any list of differentially expressed genes (DEGs), which you may have extracted from your raw data with the aid of the geneXplain platform, and comprises two main steps:
  • First, the promoters of the differentially regulated genes are retrieved and analyzed for potential transcription factor (TF) binding sites and their combinations. From that, a set of TFs is identified that may have regulated the found DEGs.
  • In a second step, the pathways that are known to activate the previously hypothesized TFs are reconstructed. Molecules where these pathways converge are considered potential master regulators of the process under study.

Step 1: Promoter analysis

First, potential transcription factor binding sites (TFBSs) are identified in all promoters of the DEGs of your experiment (Yes set) as well as in a negative control set (No set). This is usually done with a library of position-specific scoring matrices or positional weight matrices (PSSMs or PWMs).
We recommend applying the most comprehensive matrix library available, the TRANSFAC® database, and using the Match™ algorithm for the sequence analysis.
Next, out of all these potential TFBSs, those that are characteristic of the DEG set under study are identified. This is done by rigorously determining their enrichment in the Yes set compared to the No set.
Learn more about promoter analysis with TRANSFAC® in the geneXplain platform.
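One common way to quantify enrichment of a motif in the Yes set relative to the No set is a one-sided Fisher exact test on the number of promoters that contain at least one predicted site. The Python sketch below illustrates that generic test only; it is an assumption for illustration, not a description of the statistic actually used in the geneXplain platform, and the counts are invented.

```python
from math import comb


def enrichment_p_value(yes_with, yes_total, no_with, no_total):
    """One-sided Fisher exact test via the hypergeometric upper tail.

    yes_with / no_with: promoters with at least one predicted site.
    yes_total / no_total: sizes of the Yes and No promoter sets.
    Returns P(X >= yes_with) when sites are unrelated to set membership.
    """
    total = yes_total + no_total
    with_site = yes_with + no_with
    upper = min(yes_total, with_site)
    tail = sum(
        comb(with_site, k) * comb(total - with_site, yes_total - k)
        for k in range(yes_with, upper + 1)
    )
    return tail / comb(total, yes_total)


# Invented counts: 8 of 10 Yes promoters vs 2 of 10 No promoters carry the motif.
P_VALUE = enrichment_p_value(8, 10, 2, 10)
print(round(P_VALUE, 4))  # about 0.0115
```

When many matrices are tested at once, the resulting p-values would normally need a multiple-testing correction before any motif is called enriched.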

Step 2: Pathway analysis

Step 1 resulted in a set of transcription factors (TFs) that are likely responsible for the differential regulation of the observed set of DEGs. From available pathway data, we have extracted information about all relevant signaling cascades that regulate the activity of TFs; optimally, the TRANSPATH® database is used for this and the further analysis.
As has been proven in a large number of use cases, these pathways usually converge in a couple of key nodes, which qualify as candidate master regulators of the process under study.

Activities of transcription factors (TFs, blue circles) are regulated by upstream signaling cascades (components shown as green circles). These converge in certain nodes, representing molecules that are potential master regulators of the process under study.
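The convergence idea can be illustrated with a small graph search. The Python sketch below walks a signaling network upstream from each TF for a limited number of steps and ranks nodes by how many TFs they can reach. It is a simplified stand-in, not the proprietary master regulator search: the network, node names, and step limit are hypothetical.

```python
from collections import deque

# Hypothetical directed signalling edges (upstream -> downstream).
EDGES = [
    ("receptor", "kinase_1"),
    ("kinase_1", "kinase_2"),
    ("kinase_1", "kinase_3"),
    ("kinase_2", "TF_A"),
    ("kinase_2", "TF_B"),
    ("kinase_3", "TF_C"),
    ("adaptor", "TF_C"),
]


def upstream_nodes(target, edges, max_steps):
    """Breadth-first search against edge direction, up to max_steps away."""
    parents = {}
    for source, destination in edges:
        parents.setdefault(destination, []).append(source)
    seen = {target}
    found = set()
    queue = deque([(target, 0)])
    while queue:
        node, distance = queue.popleft()
        if distance == max_steps:
            continue
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                found.add(parent)
                queue.append((parent, distance + 1))
    return found


def rank_master_regulators(tfs, edges, max_steps=3):
    """Rank upstream nodes by the number of TFs they reach (ties by name)."""
    reach = {}
    for tf in tfs:
        for node in upstream_nodes(tf, edges, max_steps):
            reach[node] = reach.get(node, 0) + 1
    return sorted(reach.items(), key=lambda item: (-item[1], item[0]))


RANKING = rank_master_regulators(["TF_A", "TF_B", "TF_C"], EDGES)
for node, tf_count in RANKING:
    print(node, tf_count)
```

A plain reach count favors nodes far upstream, so a more realistic scoring would also weigh path length and compare against what is expected for random TF sets.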


PharmaExpert analyzes the relationships between biological activities, drug-drug interactions and multiple targeting of chemical compounds and selects compounds that have a pre-defined biological activity. It helps answer a question like “How to select the most promising compounds among those known to interact with the selected protein?”


Key features of PharmaExpert

PharmaExpert is an expert system taking into account the known relationships between pharmacotherapeutic effects and mechanisms of action of biologically active substances.

Fields of application: Analysis of the cause-effect relationships between the biological activities, estimation of possible positive and negative pharmacokinetic and pharmacodynamic drug-drug interactions, selection of compounds with the needed activity spectra predicted by PASS, identification of compounds with multiple mechanisms of particular pharmacological action.

PharmaExpert is designed to visualize and to analyze the prediction results of PASS and GUSAR software. It provides the following functions:

– reading the SD files containing information about the structures of organic compounds and prediction results of their spectra of biological activity provided by PASS, as well as the prediction results of (Q)SAR models of GUSAR;

– visualization of relationships between the predicted biological activities based on the known data about the causal relationships between them, and “target-pathway-effect” relationships;

– selecting compounds with desired biological activities in one or more SD files;

– analysis of possible positive and negative drug-drug interactions for individual pairs of compounds or for all compounds contained in the SD file;

– saving identified relationships between the activities and the results of the selection of compounds with desired biological activities in SD or TXT file.


PharmaExpert 2022 contains a knowledgebase with over 15 thousand known interactions between biological activities, as well as the relationships between proteins, signalling/regulatory pathways (KEGG or Reactome), Gene Ontology biological processes and therapeutic and adverse effects.

[Figure: PharmaExpert knowledgebase]

All biological activities are divided into seven types: (1) mechanisms of action; (2) pharmacological effects; (3) toxic and side effects; (4) interaction with antitargets; (5) interaction with drug metabolizing enzymes (inhibition, induction, interaction as a substrate); (6) changing gene expression of individual genes (increase, decrease); (7) interaction with transporter protein (inhibition, stimulation, interaction as a substrate).

Automatic search is provided for compounds acting on any of the mechanisms of action (or simultaneously on several mechanisms of action, up to ten) associated with the therapeutic effect or signalling/regulatory pathway (KEGG or Reactome) and biological process of Gene Ontology.

Analysis of possible drug-drug interactions is performed simultaneously for all seven types of biological activity.

