Coffee break with TRANSFAC

Welcome to the “Coffee break with TRANSFAC”

A series of online sessions hosted by Dr. Alexander Kel, CEO of geneXplain GmbH


The next “Coffee break with TRANSFAC” will be announced soon.
Stay tuned to our news for the latest updates.


This series of Q&A sessions with the leading bioinformatics expert Dr. Alexander Kel is intended to support all researchers interested in applied bioinformatics.

Join our sessions simply as a listener, or take an active part in this recurring event and ask Dr. Kel your own questions to help steer the live discussion.

You can send your questions via the form below or by email with the subject “Question to Dr. Kel”.

Questions can also be asked live during the online event.



What to ask about?

Any question that you need assistance with while performing your bioinformatics analysis, e.g.

  • Promoter analysis? Pathway analysis?
  • Can AI analyze NGS data?
  • How to combine DNA methylation and metabolome data?
  • Where to start and how to interpret the results obtained?

and much more…


Check out the video recordings of the previous “Coffee break with TRANSFAC” sessions to find answers to questions that have already been addressed.


19 September 2023, the ninth “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

00:20 Gene regulatory networks – what are they and how to construct them

21:12 MATCH Suite and gene regulatory networks based on predictions of transcription factor binding sites

27:35 MATCH Suite results visualization: constructing gene regulatory networks using Python* in a Jupyter notebook (geneXplain platform API)

*The Python code shown in the online demo is available here

40:42 Which online software can I use to get the sequences of transcription factors?

50:30 Can TRANSFAC be applied to other species, e.g. Drosophila or plants? What are the limitations?


8 June 2023, the eighth “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

00:02 Introduction: TRANSFAC in Cancer research

00:56 World Brain Tumor Day

01:45 Transcription factors: how do they regulate their target genes

02:21 Positional weight matrices – TFBS model

03:07 Searching for TFBS enrichment

03:48 Composite complexes of transcription factors

04:47 Search for Master Regulators

06:26 Intracellular signal transduction: complex cascades consisting of signaling reactions

08:18 Construction of potential network of possible reactions within the cell

09:01 Modeling of regulation of a particular set of genes

11:59 Algorithm of the Master Regulator search

13:42 Integration of context protein expression information in the search for Master Regulators

15:23 Verification of Master Regulators quality

18:08 Walking pathways concept

20:35 How to select the “true” Master Regulator

22:08 Therapeutic targets and biomarkers of the studied processes

22:59 Methylation marks located in regulatory regions of Master Regulator genes are good diagnostic or prognostic biomarkers

23:18 The Upstream Analysis concept (integrated promoter and pathway analysis)

23:58 Genome Enhancer – the fully automated pipeline for prospective drug target identification

26:43 Analysis of Glioblastoma short-term survival patients vs. long-term survival patients using Genome Enhancer (this analysis resulted in the following publication: IGFBP2 Is a Potential Master Regulator Driving the Dysregulated Gene Network Responsible for Short Survival in Glioblastoma Multiforme)

41:47 Switch from Genome Enhancer to the geneXplain platform interface with extended functionality. Mapping of prospective Master Regulators to diseases in which they are known biomarkers.

47:38 Immunotherapy sensitivity prediction for Glioblastoma patients

51:26 Are there transcription factor and enhancer databases for nonhuman primates?

56:30 Is there a classical publication involving geneXplain related to the topic?

Yes, please check those on Glioblastoma: (1) and (2), this one on the Walking Pathways concept, and other publications that can be found here.

58:57 Can we check these results with the data on mutations in samples? Won’t this help find up- or down-regulated genes with respect to their mutation profile?

01:04:53 If I have RNA-seq data for my patient, which additional omics data could improve the analysis with your tool effectively? E.g. variants, protein expression, methylation?


25 May 2023, the seventh “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

02:47 TRANSFAC brief overview

05:08 Site enrichment analysis

07:59 Site combinations: complexes of transcription factor binding sites; composite modules

14:27 TRANSFAC application for different model organisms

21:27 Example of Drosophila gene analysis from GSE149116 – combination of RNA-seq and ChIP-seq data

22:31 What are the target genes of my transcription factor? Is this the right question to ask?

23:16 A human TF was introduced as a construct and expressed in Drosophila –> the human TF started regulating the Drosophila genes! Wow!

25:29 Different experiment types being united in one analysis: intersection of RNA-seq data with ChIP-seq data

28:25 Drosophila data analysis in the geneXplain platform (using the GSE149116 dataset as an example)

41:02 What the site search summary looks like in the geneXplain platform

42:21 Identification of “master” transcription factors working together in combinations; CMA (Composite Module Analyst) analysis in the geneXplain platform

46:44 Can we use genomes other than human for TFBS analysis with TRANSFAC in the geneXplain platform?

47:15 How to upload the palm genome to the geneXplain platform

50:33 Extraction of promoters of genes responsible for palm resistance to different bacteria or infections. “Master” transcription factors regulating the resistance genes in the palm.

54:35 What are the “standard” genomes in the geneXplain platform?

56:01 How can I find master transcription factors that regulate specific pathways in Arabidopsis?

58:17 Which profile (collection of PWMs – positional weight matrices) to use for the analysis?


25 April 2023, the sixth “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • Mutations in cancer: how to analyze them using TRANSFAC?
    • Introduction to Genome Enhancer – an automated pipeline for multi-omics data analysis. Analysis of cancer mutations using Genome Enhancer.
    • Lung cancer cell lines analysis using Genome Enhancer. A more detailed video devoted to this example can be found here.
    • By which authority is Genome Enhancer certified for application to patient data in hospitals?
    • Mutations located in non-coding regions of genes: can they destroy or create TFBS? 
    • Can we predict the effect of these gene mutations on drug binding to targets? 
    • Analysis of binding site compositions: search for co-factors – transcription factor binding sites (TFBS) working together. Identification of TFBS compositions around clusters of mutations compared to regions without mutations.
    • Switch to the geneXplain platform perspective: a more detailed view of the results produced by the Genome Enhancer pipeline with regard to the TFBS analysis.
    • Brief overview of the next sections of the Genome Enhancer report: pathway analysis, identification of master regulators in networks, and selection of prospective drug targets and associated treatments.


4 April 2023, the fifth “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • SNP data analysis: motifs destroyed and created by SNPs – how to find them and what is their effect?
    • Overview of the geneXplain platform interface
    • Analyzing SNP data with the geneXplain platform (Variant Analysis)
    • Is it possible to use TRANSFAC to calculate p-values for binding with a specific allele at the SNP and to rank predicted TFs based on their binding score?


23 March 2023, the fourth “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • De novo motifs. What are they? How to find and use them? (short intro)
    • Why are there so many motifs (PWMs – positional weight matrices) collected for 1 transcription factor? All of them are different. Which one is “true”? (spoiler: they ALL are!)
    • Searching for de novo motifs: when and why do we need new motifs? How are they discovered?
    • ChIPMunk motif discovery tool – what is it?
    • De novo motif search using the geneXplain platform
    • Can I perform TFBS search with TRANSFAC alone?
    • Can the geneXplain platform find mutations in an intron (when compared with a reference sequence, e.g. hg38)?


2 March 2023, the third “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • How to compute transcription factor binding affinity?
    • What is the difference between the 5’ and the 3’ promoters?
    • I have two genes with overlapping promoters but different TFBS. Can you comment on this?
    • If I design a minimal promoter, what would be the basis for screening and selecting the best TFBS to include in it if I use the MATCH Suite tool? In brief, how do I prioritize these TFs to select the most relevant ones?
    • Could you explain the difference between the p-values in the MATCH Suite analysis report?
    • For one TF there can be numerous PWMs, distinguished by a number in the name, such as V$AP1_02. Does this number correspond to a new version of the PWM based on the alignment of known TFBS? Do you advise using the latest PWM built on the alignment of known binding sites as the most up-to-date one, or do we need to use all PWMs for a TF to perform the prediction for that TF?
    • Alexander, as I understand it, the affinity score is a floating-point number. How does the program define the cut-off (maybe I missed that): every time depending on the experiment, or is it set on an average basis?
    • Is it important to have a cell-specific analysis and not only a tissue- or organ-specific one?


7 February 2023, the second “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • TFBS prediction in DNA based on positional weight matrices
    • How to define the cutoffs for binding site scores
    • In calculating FP and FN you need to know the true binding sites. How are they determined?
    • I have some questions about my data derived from the TRANSFAC 2012 professional matrices. 1) Some of the genes of interest have binding sites for IRF (without further specification of the subtype), STAT (again, without further specification of the subtype). Can you explain what this means? 2) Some other genes have binding sites for STAT1STAT1 – does this mean STAT1 dimer, also known as the GAS element?
    • I am interested to find out what TFs, promoter & enhancer regions, and epigenetic signatures are relevant for the regulation of specific genes in rat hippocampal pyramidal and inhibitory neurons. I know this is not an easy task, and needs not only a bioinformatic but also an experimental approach. I am not aware how transcriptome (RNAseq) and epigenetic (ATAC-seq) data are integrated in the TRANSFAC platform and whether TRANSFAC could help in refining this.
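
For readers new to the first two topics in this list, the general idea of PWM-based site prediction with a score cutoff can be sketched in a few lines of Python. This is a generic log-odds scan written for illustration only: it is not the MATCH algorithm and not TRANSFAC's scoring scheme, and the count matrix, pseudocount, background frequency and cutoff below are all invented assumptions.

```python
import math

# Hypothetical 4-position count matrix (one dict per position).
# The numbers are invented for illustration; this is NOT a TRANSFAC matrix.
COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 8, "G": 2, "T": 0},
]


def log_odds(counts, pseudocount=0.5, background=0.25):
    """Turn counts into log2-odds weights against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, pwm, cutoff):
    """Score every window on both strands; keep those at or above the cutoff.

    Assumes seq contains only uppercase A, C, G, T.
    Each hit is (start on the + strand, strand, matched window, score).
    """
    width = len(pwm)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            score = sum(pwm[j][base] for j, base in enumerate(window))
            if score >= cutoff:
                # Convert minus-strand offsets back to + strand coordinates.
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, window, round(score, 2)))
    return hits


pwm = log_odds(COUNTS)
for hit in scan("AATGACGG", pwm, cutoff=5.0):
    print(hit)
```

Raising the cutoff reduces false positives at the cost of missing weaker sites, and lowering it does the opposite, which is why the choice of cutoff is discussed as a topic of its own in the session.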


17 January 2023, the first “Coffee break with TRANSFAC” session:



Your questions addressed within this session:

    • How can I start a TRANSFAC bioinformatics analysis? What types of data do I need?
    • What is the significance of a transcription factor binding site being on the plus or minus strand of the gene?
    • New transcription factors – do you survey those all the time? How many targets are necessary for inclusion of one (creation of a matrix) in the program?
    • What advantages does TRANSFAC have over the HOCOMOCO and JASPAR databases?
    • Regarding TF binding analysis: even after we get a putative motif for TFs, how many TFs could possibly bind at a single site in the genome? Depending on the significance cut-off, I find from 2-3 to almost 40-50 TF binding sites at the same location. How do we filter the noise or get accurate results?
    • Is it possible to use gene sequences directly to search for TFBS, for example sequences from the Ensembl database or a FASTA file?
    • Whether on the + strand or the – strand, one should read the binding site from the 5′ end to the 3′ end. Is that correct?
    • Can a gene have two active promoters and corresponding TSS in the same cell?
    • In order to investigate how much a genetic mutation could affect a TF binding site, is it possible that such a mutation could affect multiple TFs binding at that region? Would it be advisable to focus only on TFs based on their binding score?
    • The co-occurrence of TFs proposed by TRANSFAC is based on the knowledge coming from different experiments, so what are the chances of co-occurrence actually?



Stay tuned to our news so that you don't miss the announcement of the next “Coffee break with TRANSFAC” session.

Follow our social media channels for updates.