Archive - geneXplain

TRANSFAC download

The TRANSFAC® flat file download (including the databases TRANSCompel® and TRANSPro™) contains eukaryotic transcription factors (and miRNAs), their experimentally determined genomic binding sites and consensus DNA-binding motifs (PWMs), as well as data on combinatorial gene regulation and factor-factor interaction. Promoters, enhancers and silencers annotated with transcription factor ChIP-Seq, DNase hypersensitivity and histone methylated intervals from the ENCODE project and from other sources complement the manually curated binding site data. Based on the positional weight matrices (PWMs), transcription factor binding sites can be predicted in regulatory regions. In the TRANSFAC® flat file download, the tools of the Match™ Library can be used on the command line, or the PWMs can be used with the user's own tools.


Here is the screenshot of the TRANSFAC download page showing all the archives included in the download package and their sizes:

TRANSFAC download archives

Key features

  • Intended for Bioinformaticians
  • No installation is needed – just download and unzip archives
  • Data files are provided in DAT and JSON formats
  • Promoters are provided in the DAT and GTF formats
  • Direct data access without user interface: data extraction is possible via Perl scripts or other programs written by the user
  • Java-based tools for TFBS search (Match Library) are accessible via a command line
  • For use with customer tools and incorporation into user-specific pipelines
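As one illustration of direct data access, the following Python sketch reads matrix entries from DAT-style text. The record layout used here (two-letter line tags, `XX` separator lines, a `P0` header row followed by numbered count rows, and `//` closing each entry) and the sample entry itself are assumptions made for illustration, so check them against the format documentation shipped with your release.

```python
# Hypothetical sample entry; accession, identifier, and counts are invented.
SAMPLE = """\
AC  M99999
XX
ID  V$EXAMPLE_01
XX
P0      A      C      G      T
01      8      1      0      1
02      0      9      1      0
03      1      0      9      0
XX
//
"""


def read_matrices(text):
    """Yield one dict per entry with accession, identifier, and count rows."""
    entry, columns = {"counts": []}, []
    for line in text.splitlines():
        tag, _, rest = line.partition(" ")
        rest = rest.strip()
        if tag == "//":                      # assumed end-of-entry marker
            if entry.get("AC"):
                yield entry
            entry, columns = {"counts": []}, []
        elif tag in ("AC", "ID"):            # accession and identifier lines
            entry[tag] = rest
        elif tag == "P0":                    # header naming the count columns
            columns = rest.split()
        elif tag.isdigit() and columns:      # one row of counts per position
            values = [float(v) for v in rest.split()[: len(columns)]]
            entry["counts"].append(dict(zip(columns, values)))


matrices = list(read_matrices(SAMPLE))
```

Each parsed entry keeps its counts as one dictionary per motif position, which makes it easy to hand the matrix to your own scoring code.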


Structure of the TRANSFAC® flat file download release is as follows:

TRANSFAC download structure

At the center of TRANSFAC® are transcription factors and their DNA binding sites based on experimental evidence extracted from scientific publications. The interlinked factor and site entries are captured in the respective flat files. Interactions between two or more factors are also included. TF-binding sites can be with or without a connection to a gene. Genomic sites are mapped to promoter sequences in the TRANSPro™ module of TRANSFAC®. Besides the linkage of factors via binding sites to regulated genes, monomeric factors are also linked to their encoding genes. In some cases this reveals autoregulatory loops, where a factor regulates the expression of its own gene. Based on gene-factor and (indirect) factor-gene links, gene-regulatory networks can be extracted.
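The links described above can be treated as a directed graph. The sketch below is a minimal illustration with invented factor and gene names: `regulates` holds factor-to-gene links (via binding sites), `encoded_by` holds the gene that encodes each factor, and both structures are hypothetical stand-ins for data you would extract from the flat files yourself.

```python
from collections import defaultdict

# Hypothetical toy links; factor and gene names are invented.
regulates = [("TF_A", "geneA"), ("TF_A", "geneB"), ("TF_B", "geneC")]
encoded_by = {"TF_A": "geneA", "TF_B": "geneB", "TF_C": "geneC"}


def factor_network(regulates, encoded_by):
    """Derive factor -> factor edges: A regulates a gene that encodes B."""
    gene_to_factors = defaultdict(set)
    for factor, gene in encoded_by.items():
        gene_to_factors[gene].add(factor)
    edges = set()
    for factor, gene in regulates:
        for target in gene_to_factors.get(gene, ()):
            edges.add((factor, target))
    return edges


edges = factor_network(regulates, encoded_by)
# A self-edge means the factor regulates its own encoding gene.
autoregulatory = sorted(f for f, target in edges if f == target)
```

In this toy data set, `TF_A` regulates `geneA`, which encodes `TF_A`, so it shows up as an autoregulatory loop.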

The transcription factor classification from TFClass has been integrated. Based on the collection of DNA-binding sites for a transcription factor, consensus binding motifs in the form of positional weight matrices (PWMs) are derived, which can be used by the command line tools in the Match Library for transcription factor binding site prediction in regulatory DNA sequences. Synergistic and antagonistic interactions between transcription factors binding to closely situated sites, so-called composite regulatory elements, are included in the COMPEL flat file of the TRANSCompel® module of TRANSFAC®. ChIP-seq and other high-throughput data are also included and are mapped to promoter and enhancer regions.
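To make the idea of deriving a PWM from a collection of binding sites more concrete, here is a minimal Python sketch. It is not the Match algorithm: it uses plain log-odds weights with a pseudocount and a uniform background, and the aligned sites and the scanned sequence are invented for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Return a list of {base: log-odds weight} dicts, one per position."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]  # invented aligned sites
pwm = build_pwm(sites)
best_start, best_score = max(scan("GGCATGACTCAGCA", pwm), key=lambda hit: hit[1])
```

The highest-scoring window is the one that matches the most frequent base at every position; a real prediction tool would additionally apply score cut-offs and scan both strands.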

TRANSFAC products

TRANSFAC® online
  • Data access / download: Web interface (GUI). Export of subsets of certain data types and analysis results.
  • Data core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases: Data core content plus, for TFs, additional data from HumanPSD (GO, tissue expression). No download for PWMs; PWMs can only be used with the included tools.
  • Tools / Functionality: Web-based tools for TFBS prediction in single promoters or in gene/sequence sets (plus additional tools)
  • Input data type: DNA sequences, gene lists, genomic intervals
  • Analysis size: Limited to pre-processed data sets (of a few hundred genes / seqs / intervals)

TRANSFAC® online +
  • Data access / download: Export of subsets of certain data types and analysis results.
  • Data core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases: Data core content plus, for TFs, additional data from HumanPSD (GO, tissue expression). No download for PWMs; PWMs can only be used with the included tools.
  • Tools / Functionality: Tools and customizable workflows for omics data analysis (optionally extendable to pathway and upstream analysis)
  • Input data type: Processed and raw RNAseq, SNPs, omics data, etc.
  • Analysis size: Complete analysis of omics data or of whole genomes

TRANSFAC® flat file download
  • Data access / download: Command line. Download of flat files with all data (DAT/JSON) for direct access.
  • Data core content from TRANSFAC® (TFBSs, PWMs, etc.) and accompanying databases: Data core content. Download of PWMs in matrix.dat for use with internal or external tools.
  • Tools / Functionality: Command line tools for TFBS prediction (Match Library) or use of PWMs with (compatible) tools of the user
  • Input data type: DNA sequences
  • Analysis size: Whole genome analysis


Genome Enhancer

Release 3.1 is now out!


Welcome to the new era of Precision Medicine

Meet Genome Enhancer – a fully automated pipeline for patient omics data analysis, which identifies prospective drug targets and corresponding treatments by reconstructing the molecular mechanism of the studied pathology. Proven applications of Genome Enhancer include cancer, neurodegenerative diseases, infectious diseases, diabetes and metabolic diseases, and hypertension.

Starting from release 2.0 Genome Enhancer is also available as Genome Enhancer Expert solution – a powerful synergism between the automatic pipeline for multi-omics data processing of Genome Enhancer and the comprehensive bioinformatics toolbox of the geneXplain® platform. More details about Genome Enhancer Expert solution are available here.

You can log in to Genome Enhancer directly with your geneXplain® platform account (get one by registering).


Genome Enhancer uses Upstream Analysis, an integrated promoter and pathway analysis, to identify potential drug targets of the studied pathology.

In the first step of this analysis, the transcription factors that regulate differentially expressed or mutated genes are identified using the TRANSFAC® database of transcription factor binding sites.

The second step searches for common master regulators of the identified transcription factors by building a personalized signal transduction network of the studied pathology using the TRANSPATH® database of mammalian signal transduction and metabolic pathways. The identified master regulators are prospective drug target candidates. They are used for further selection of chemical compounds that can bring therapeutic benefit for the studied clinical case. In this step the HumanPSD database is employed to identify drugs that have been tested in clinical trials. The cheminformatic tool PASS predicts small molecules that can affect the identified targets.

Finally, Genome Enhancer generates a comprehensive analysis report about the personalized drug targets identified for a certain patient, or a group of patients, and the drugs that may be effective in this case. You can view a number of Genome Enhancer demo reports at the corresponding section of this page.


A detailed description of Genome Enhancer analysis schema can be found on this page.

Check out the Genome Enhancer video channel run by the CSO of geneXplain GmbH, Dr. Alexander Kel. See how various omics data can be analyzed in Genome Enhancer, or even send your own data to Dr. Kel and he will show its analysis in one of the next videos!

Key features

  • Identifies activated targets in the examined patient data and suggests known and re-purposing drugs
  • Based on the well-accepted databases TRANSFAC®, TRANSPATH® and HumanPSD™
  • Applies unique in-house developed algorithms
  • Suitable for use by researchers and by medical doctors
  • Does not require special skills to operate
  • Analyses all types of omics data starting either with raw or pre-processed data
  • Generates a comprehensive report on the identified drug targets and prospective therapies
  • Generates an MTB (Molecular Tumor Board) report on the patient’s genomics data for a number of pathologies (if no more than 2 conditions were selected when launching the analysis)


Available solutions



Genome Enhancer Expert gives you access to the full functionality of the geneXplain® platform with the TRANSFAC®, TRANSPATH® and HumanPSD™ databases connected. In the platform view you can further process the analysis results received from Genome Enhancer; create, modify and use pre-defined workflows for your multi-omics data analysis; and work with data from other model organisms. For more information on geneXplain® platform functionality, please refer to the platform product page.

Only three steps to launch the analysis


Running Genome Enhancer takes only three steps:

1. Upload your data to the server and specify the import options (data type)

2. Split your data by the conditions you want to compare

3. Launch the analysis by specifying the conditions to be compared and the disease and tissue types (optional)

The analysis report will be ready shortly. Depending on your input data, it will include lists of differentially expressed or mutated genes; the transcription factors regulating those genes; the reconstructed signaling network of the studied pathological process; potential drug targets with corresponding known drugs and re-purposing drugs that may be effective in the studied case; and further cheminformatically predicted drug-like compounds. The report also contains a description of the analysis methods used and the references.

Acceptable input data formats

Genome Enhancer works with genomics, transcriptomics, epigenomics, proteomics and metabolomics input data types of the following formats:

Transcriptomics (RNA-seq, microarrays)
  • *.txt, *.csv, *.xls (table with gene identifiers)
  • *.CEL (Affymetrix)
  • *.txt (special Agilent format)
  • *.txt (special Illumina format)

Epigenomics (ChIP-seq)
  • *.bam (hg38 only)
  • *.bed (hg38 only)
  • *.txt (table with Illumina methylation probe ids, cg*)

Genomics
  • *.txt, *.csv, *.xls (table data with SNP identifiers, rs*), *.tsv

Proteomics
  • *.txt, *.csv, *.xls (table with protein identifiers)

Metabolomics
  • *.txt, *.csv, *.xls (table with the list of metabolites from the ChEBI database, e.g. CHEBI:57316)

Files of one data format can be uploaded in a .zip archive.
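Before uploading, it can be handy to check which data types a file extension is compatible with. The sketch below encodes the extensions listed above in a plain dictionary; the helper name `candidate_types` is hypothetical, and the grouping of the SNP, protein and metabolite tables under genomics, proteomics and metabolomics is assumed from the introductory sentence of this section. It only looks at the file name, not at the file content.

```python
import os

# Extensions per data type as listed above (grouping partly assumed).
ACCEPTED = {
    "transcriptomics": {".txt", ".csv", ".xls", ".cel"},
    "epigenomics": {".bam", ".bed", ".txt"},
    "genomics": {".txt", ".csv", ".xls", ".tsv"},
    "proteomics": {".txt", ".csv", ".xls"},
    "metabolomics": {".txt", ".csv", ".xls"},
}


def candidate_types(filename):
    """Return the data types whose accepted extensions include this file's."""
    ext = os.path.splitext(filename)[1].lower()
    return sorted(kind for kind, exts in ACCEPTED.items() if ext in exts)
```

Because several data types share the same table extensions, a `.txt` or `.csv` file matches more than one type, which is why the data type still has to be specified during import.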

What we offer

Multi-omics analysis

Use genomics, transcriptomics, metabolomics, proteomics, and epigenomics data in one analysis run and receive an integrated report


Easy interface

Due to complete automation of the analysis, the system can be used by medical doctors and biologists without any bioinformatics skills


Personalized medicine

Running the analysis on omics data of a certain patient, you will identify personalized prospective drug targets and corresponding treatments


Scientific base

Integration of promoter and enhancer analysis with pathway reconstruction gives unrivaled disease molecular mechanism modeling accuracy

Drug target identification

Genome Enhancer reconstructs a complex network of signal transduction pathways that are activated in the pathology and identifies their key regulators



Flexible pricing

Select the license which fits you best – both short-term and long-term licenses are supported by our flexible tariffs. For more details visit the geneXplain store

Report examples

You can view various analysis report examples generated by Genome Enhancer on the basis of different omics input data types and various origins of the studied pathologies:

You are also very welcome to view or download the Genome Enhancer flyer.



Below you will find a compilation of our videos referring to different aspects of Genome Enhancer.

In English
  • This video introduces you to the fully automated pipeline for the easy bioinformatics analysis of multi-omics data. (2:02 min)
  • This video shows how Genome Enhancer results can be interpreted for further clinical use, taking as an example the prediction of sensitivity to VEGFA-targeted therapy for three colorectal cancer patients. (5:51 min)

In Chinese
  • The video provides a preview of the Genome Enhancer user interface and shows how to perform an analysis using Genome Enhancer. (4:42 min)
  • This video uses transcriptomics data as an example to highlight the detailed content of the data analysis report produced by Genome Enhancer. (3:09 min)
  • Here, Genome Enhancer is applied to analyze multi-omics data of colorectal cancer cell lines to find master regulators and to predict drug compounds affecting them. Thereafter, the effect of the predicted drug compounds on tumors is verified in an animal experiment.





The results of Genome Enhancer analysis, contained in any of the reports produced by this pipeline, are intended for research use only and should not be used for medical or professional advice. GeneXplain GmbH makes no guarantee of the comprehensiveness, reliability or accuracy of the information contained in the reports generated by Genome Enhancer.

Decisions regarding care and treatment of patients should be fully made by attending doctors. The predicted chemical compounds listed in the reports are given only for doctor’s consideration and they cannot be treated as prescribed medication. It is the physician’s responsibility to independently decide whether any, none or all of the predicted compounds can be used solely or in combination for patient treatment purposes, taking into account all applicable information regarding FDA prescribing recommendations for any therapeutic and the patient’s condition, including, but not limited to, the patient’s and family’s medical history, physical examinations, information from various diagnostic tests, and patient preferences in accordance with the current standard of care. Whether or not a particular patient will benefit from a selected therapy is based on many factors and can vary significantly.

The compounds predicted to be active against the identified drug targets in the reports are not guaranteed to be active against any particular patient’s condition. GeneXplain GmbH does not give any assurances or guarantees regarding the treatment information and conclusions given in the reports. There is no guarantee that any third party will provide a refund for any of the treatment decisions made based on these results. None of the listed compounds was checked by Genome Enhancer for adverse side-effects or even toxic effects.

The analysis reports contain information about chemical drug compounds, clinical trials and disease biomarkers retrieved from the HumanPSD™ database of gene-disease assignments maintained and exclusively distributed worldwide by geneXplain GmbH. The information contained in this database is collected from scientific literature and public clinical trials resources. It is updated to the best of geneXplain’s knowledge; however, we do not guarantee the completeness and reliability of this information, leaving the final check and consideration of the predicted therapies to the medical doctor. In all cases, the end user (including researchers and medical doctors) accepts full responsibility for all risks associated with using the information contained in the reports generated by Genome Enhancer.

The scientific analysis underlying the Genome Enhancer reports employs a complex analysis pipeline which uses geneXplain’s proprietary Upstream Analysis approach, integrated with TRANSFAC® and TRANSPATH® databases maintained and exclusively distributed worldwide by geneXplain GmbH. The pipeline and the databases are updated to the best of geneXplain’s knowledge and belief, however, geneXplain GmbH shall not give a warranty as to the characteristics or to the content and any of the results produced by Genome Enhancer. Moreover, any warranty concerning the completeness, up-to-dateness, correctness and usability of Genome Enhancer information and results produced by it, shall be excluded.

The results produced by Genome Enhancer, including the analysis reports, depend strongly on the quality of the input data used for the analysis. It is the responsibility of Genome Enhancer users to check the input data quality and the parameters used for running the Genome Enhancer pipeline.

Note that the text given in the reports is not unique and can be fully or partially repeated in other Genome Enhancer analysis reports, including reports of other users. This should be considered when publishing any results or excerpts from the reports. This restriction refers only to the general description of analysis methods used for generating the reports. All data and graphics referring to the concrete set of input data, including lists of mutated genes, differentially expressed genes/proteins/metabolites, functional classifications, identified transcription factors and master regulators, constructed molecular networks, lists of chemical compounds and reconstructed model of molecular mechanisms of the studied pathology are unique in respect to the used input data set and Genome Enhancer pipeline parameters used for the current run.

Upstream Analysis


What is Upstream Analysis?

GeneXplain’s proprietary approach to analyzing gene expression data is called Upstream Analysis. The term indicates that it is a causal analysis, providing a clue about the reason why a certain set of genes has been up- (or down-) regulated in the system under study. In contrast, conventional analyses usually reveal the effects of the differentially expressed genes, e.g. by mapping them onto ontological categories.


How does it work?

GeneXplain’s Upstream Analysis is an integrated promoter – pathway analysis. It starts from any list of differentially expressed genes (DEGs), which you may have extracted from your raw data with the aid of the geneXplain platform, and comprises two main steps:
  • First, the promoters of the differentially expressed genes are retrieved and analyzed for potential transcription factor (TF) binding sites and their combinations. From that, a set of TFs is identified that may have regulated the DEGs found.
  • In a second step, the pathways known to activate the previously hypothesized TFs are reconstructed. Molecules where these pathways converge are considered potential master regulators of the process under study.

Step 1: Promoter analysis

First, potential transcription factor binding sites (TFBSs) are identified in all promoters of the DEGs of your experiment (Yes set) as well as in a negative control set (No set). This is usually done with a library of position-specific scoring matrices or positional weight matrices (PSSMs or PWMs).
We recommend applying the most comprehensive matrix library available, the TRANSFAC® database, and using the MATCH™ algorithm for the sequence analysis.
Next, out of all these potential TFBSs, those that are characteristic of the DEG set under study are identified. This is done by rigorously determining their enrichment in the Yes set compared to the No set.
Learn more about promoter analysis with TRANSFAC® in the geneXplain platform.
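The enrichment step can be illustrated with a small Python sketch. The text does not say which statistic the platform applies, so the one-sided Fisher exact test below is a swapped-in standard choice, and the counts are invented: for one matrix, we count how many promoters in the Yes set and in the No set contain at least one predicted site.

```python
from math import comb


def fisher_right_tail(hits_yes, n_yes, hits_no, n_no):
    """P(X >= hits_yes) under the hypergeometric null of no enrichment."""
    total = n_yes + n_no          # all promoters
    hits = hits_yes + hits_no     # promoters with at least one predicted site
    upper = min(hits, n_yes)
    tail = sum(
        comb(hits, k) * comb(total - hits, n_yes - k)
        for k in range(hits_yes, upper + 1)
    )
    return tail / comb(total, n_yes)


# Invented counts: 40 of 100 Yes promoters vs. 50 of 500 No promoters carry a site.
p_value = fisher_right_tail(40, 100, 50, 500)
```

A small p-value suggests that sites for this matrix are over-represented in the Yes set. When many matrices are tested, a multiple-testing correction would be needed on top of this.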

Step 2: Pathway analysis

Step 1 resulted in a set of transcription factors (TFs) that are likely responsible for the differential regulation of the observed set of DEGs. From available pathway data, we have extracted information about all relevant signaling cascades that regulate the activity of TFs; optimally, the TRANSPATH® database is used for this and the further analysis.
As has been shown in a large number of use cases, these pathways usually converge in a few key nodes, which qualify as candidate master regulators of the process under study.

Activities of transcription factors (TFs, blue circles) are regulated by upstream signaling cascades (components shown as green circles). These converge in certain nodes, representing molecules that are potential master regulators of the process under study.
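The convergence idea can be sketched as a graph search. The following Python example is only an illustration of the principle, not geneXplain's algorithm: it walks a toy signaling network upstream from each TF for a limited number of steps and ranks molecules by how many TFs they can reach. All node names and edges are invented.

```python
from collections import deque

# Hypothetical signaling edges (upstream molecule -> downstream molecule).
edges = [
    ("Receptor", "KinaseA"), ("KinaseA", "KinaseB"),
    ("KinaseA", "TF1"), ("KinaseB", "TF2"), ("KinaseB", "TF3"),
    ("Other", "TF3"),
]
tfs = {"TF1", "TF2", "TF3"}


def upstream_within(edges, start, max_steps):
    """Return all nodes that reach `start` in at most `max_steps` edges."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    seen, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_steps:      # do not expand beyond the radius
            continue
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen[parent] = seen[node] + 1
                queue.append(parent)
    return set(seen) - {start}


def rank_regulators(edges, tfs, max_steps=3):
    """Rank upstream molecules by the number of TFs they can reach."""
    reached = {}
    for tf in tfs:
        for node in upstream_within(edges, tf, max_steps):
            reached.setdefault(node, set()).add(tf)
    return sorted(reached.items(), key=lambda item: (-len(item[1]), item[0]))


ranking = rank_regulators(edges, tfs)
```

In this toy network, `KinaseA` and `Receptor` lie upstream of all three TFs and therefore rank first. A production method would also weigh path length and specificity rather than just counting reachable TFs.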


PharmaExpert analyzes the relationships between biological activities, drug-drug interactions and multiple targeting of chemical compounds and selects compounds that have a pre-defined biological activity. It helps answer a question like “How to select the most promising compounds among those known to interact with the selected protein?”

PharmaExpert interface

Key features of PharmaExpert

PharmaExpert is an expert system taking into account the known relationships between pharmacotherapeutic effects and mechanisms of action of biologically active substances.

Fields of application: Analysis of the cause-effect relationships between the biological activities, estimation of possible positive and negative pharmacokinetic and pharmacodynamic drug-drug interactions, selection of compounds with the needed activity spectra predicted by PASS, identification of compounds with multiple mechanisms of particular pharmacological action.

PharmaExpert is designed to visualize and to analyze the prediction results of PASS and GUSAR software. It provides the following functions:

– reading the SD files containing information about the structures of organic compounds and prediction results of their spectra of biological activity provided by PASS, as well as the prediction results of (Q)SAR models of GUSAR;

– visualization of relationships between the predicted biological activities based on the known data about the causal relationships between them, and “target-pathway-effect” relationships;

– selecting compounds with desired biological activities in one or more SD files;

– analysis of possible positive and negative drug-drug interactions for individual pairs of compounds or for all compounds contained in the SD file;

– saving identified relationships between the activities and the results of the selection of compounds with desired biological activities in SD or TXT file.
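For readers who want to inspect such SD files outside PharmaExpert, the sketch below reads the data fields of each record using the general SD file conventions (`> <FIELD>` headers, a blank line ending each data item, `$$$$` ending each record). The field name `ACTIVITIES` and the sample records are hypothetical; the actual field names written by PASS or GUSAR are not given here and must be taken from your own files.

```python
# Hypothetical two-record SD text with empty structure blocks.
SAMPLE_SD = """\
compound_1


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ACTIVITIES>
Kinase inhibitor
Antineoplastic

$$$$
compound_2


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ACTIVITIES>
Antihypertensive

$$$$
"""


def read_sd_fields(text):
    """Return one {'title', 'fields'} dict per record (assumes non-empty titles)."""
    records = []
    for block in text.split("$$$$"):
        lines = block.strip("\n").splitlines()
        if not lines:
            continue
        record = {"title": lines[0].strip(), "fields": {}}
        name = None
        for line in lines[1:]:
            if line.startswith(">") and "<" in line:     # data header line
                name = line[line.index("<") + 1 : line.rindex(">")]
                record["fields"][name] = []
            elif name is not None:
                if line.strip():
                    record["fields"][name].append(line.strip())
                else:                                    # blank line ends the item
                    name = None
        records.append(record)
    return records


records = read_sd_fields(SAMPLE_SD)
```

Each record then exposes its data items as lists of lines, so a selection by activity becomes a simple membership test on `record["fields"]`.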


PharmaExpert 2022 contains a knowledgebase with over 15 thousand known interactions between biological activities, as well as the relationships between proteins, signalling/regulatory pathways (KEGG or Reactome), Gene Ontology biological processes and therapeutic and adverse effects:

PharmaExpert knowledgebase

All biological activities are divided into seven types: (1) mechanisms of action; (2) pharmacological effects; (3) toxic and side effects; (4) interaction with antitargets; (5) interaction with drug metabolizing enzymes (inhibition, induction, interaction as a substrate); (6) changing gene expression of individual genes (increase, decrease); (7) interaction with transporter protein (inhibition, stimulation, interaction as a substrate).

Automatic search is provided for compounds acting on any of the mechanisms of action (or simultaneously on several mechanisms of action, up to ten) associated with the therapeutic effect or signalling/regulatory pathway (KEGG or Reactome) and biological process of Gene Ontology.
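The kind of search described above can be mimicked on your own data with a few lines of Python. This is an illustrative sketch, not PharmaExpert's implementation: `predictions` maps invented compound names to invented sets of predicted mechanisms, and the function selects compounds acting on any, or on all, of up to ten requested mechanisms.

```python
MAX_MECHANISMS = 10  # limit stated in the text above

# Hypothetical predictions: compound name -> set of predicted mechanisms.
predictions = {
    "cmpd_1": {"EGFR inhibitor", "VEGFR2 inhibitor"},
    "cmpd_2": {"EGFR inhibitor"},
    "cmpd_3": {"COX-2 inhibitor"},
}


def select_compounds(predictions, mechanisms, require_all=False):
    """Select compounds acting on any (or all) of the given mechanisms."""
    wanted = set(mechanisms)
    if not 1 <= len(wanted) <= MAX_MECHANISMS:
        raise ValueError("between 1 and 10 mechanisms must be given")
    if require_all:
        matches = lambda acts: wanted.issubset(acts)
    else:
        matches = lambda acts: not wanted.isdisjoint(acts)
    return sorted(name for name, acts in predictions.items() if matches(acts))


any_hit = select_compounds(predictions, ["EGFR inhibitor", "VEGFR2 inhibitor"])
multi_target = select_compounds(
    predictions, ["EGFR inhibitor", "VEGFR2 inhibitor"], require_all=True
)
```

The `require_all=True` case corresponds to looking for compounds that act on several mechanisms simultaneously.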

Analysis of possible drug-drug interactions is performed simultaneously for all seven types of biological activity.


Find history in our archive.
May 25, 2023

Coffee break with TRANSFAC May 25th 5 PM CEST

The next Coffee break with TRANSFAC will be held on May 25th at 5 PM CEST. Submit a question for the upcoming event or have the joining link sent to your email address. See you soon!

May 11, 2023


The OxidoResist project kick-off meeting is taking place today and tomorrow at Karolinska Institute in Stockholm. Check out the project page for details on the aims, goals, and partners of the OxidoResist initiative.

April 25, 2023

Coffee break with TRANSFAC April 25th 11 AM CEST

The next Coffee break with TRANSFAC will be held on April 25th at 11 AM CEST. Submit a question for the upcoming event or have the joining link sent to your email address. See you soon!

April 4, 2023

Coffee Break with TRANSFAC on April 4th at 11 AM CEST

The next Coffee break with TRANSFAC will be held on April 4th at 11 AM CEST. Submit a question for the upcoming event or have the joining link sent to your email address. See you soon!

March 27, 2023

Check out our new publication “Epigenome-Wide Changes in the Cell Layers of the Vein Wall When Exposing the Venous Endothelium to Oscillatory Shear Stress”, which made extensive use of Genome Enhancer and our other tools and databases.