Research reveals the “grammar” behind human gene regulation


Figure 1: Few TFs show strong transcriptional activity in cells. a, Schematic of the MPRA (STARR-seq) libraries. Enhancer activity assays clone DNA libraries containing synthetic TF motifs (i), human genomic fragments (ii), or completely random synthetic DNA oligonucleotides (iii) into the 3' UTR of a reporter gene (open reading frame, ORF) driven by a minimal δ1-crystallin gene (Sasaki) promoter or the EF1α promoter. In the binary promoter-enhancer activity assay (iv), random synthetic DNA sequences are used in place of the minimal promoter and in the 3' UTR (see Methods, Supplementary Note, and Supplementary Tables 3 and 4). b, MPRA (STARR-seq) reporter constructs and their variations, and the experimental workflow for measuring promoter or enhancer activity. The MPRA libraries are transfected into human cells and RNA is isolated 24 hours later, followed by enrichment of reporter-specific RNA, library preparation, sequencing, and data analysis. Active promoters are recovered by mapping the transcribed enhancers to the input DNA and identifying the corresponding promoter. c, Enhancer activity of HT-SELEX motifs measured from the synthetic TF motif library in GP5d cells. The median fold change over the input library is shown for sequence patterns containing a single instance of the motif consensus or its reverse complement. The red line marks 1% of the activity of the strongest motif. Dimeric motifs are labeled by their orientation with respect to the core consensus sequence (GGAA for ETS, ACAA for SOX, AACCGG for GRHL, GAAA for IRF; HH, head to head; HT, head to tail; TT, tail to tail), followed by the gap length between the core sequences. The asterisk indicates an A-rich sequence on the 5' side of the IRF HT2 dimer. Supplementary Table 5 explains the naming of the motifs in each figure.
d, Effect of mismatches on the enhancer activity of the p53 family (p63) motif when the consensus base is replaced by another base at one position at a time. The log2 fold change compared to input is plotted for the same motif pattern in two different sequence contexts. PWMs for the HT-SELEX and STARR-seq motifs are shown. Note that mutating G to any other base (H) at position 5 (H05) results in almost complete loss of activity. Credit: DOI: 10.1038/s41588-021-01009-4

A research group at the University of Helsinki has uncovered the logic that controls gene regulation in human cells. In the future, this new knowledge could be used to investigate cancer and other genetic disorders.


Gene regulation is a key process that controls the activity of genes in the cell. Faulty gene regulation can contribute to the development of many illnesses, including cancer.

The DNA of the human genome contains genes that encode proteins, which give muscle cells their strength and brain cells their ability to process information. DNA also contains gene regulatory elements, which determine when and where genes are expressed and ensure that muscle genes are expressed in muscle and brain genes in the brain.

However, the regulatory code that determines gene activity remains poorly understood. Although the human genome consists of about 3 billion base pairs, the genome sequence alone is too short a text from which to learn the gene regulatory code. The problem resembles the one faced by linguists trying to decipher a forgotten language from only a few short texts.

Professor Jussi Taipale's research group at the Academy of Finland's Center for Tumor Genetics Research has found a way around this problem in order to solve the regulatory code.

The new study was recently published in the journal Nature Genetics.

“We measured gene regulatory activity from a collection of DNA sequences 100 times larger than the whole human genome,” says Biswajyoti Sahu, a researcher at the Academy of Finland and the lead author of the study.

“Instead of using natural genome sequences, we introduced random synthetic DNA sequences into human cells. The cells themselves were then able to read the new DNA and highlight the sequences that act as active regulatory elements,” Sahu adds, explaining the innovative approach.

Researchers identify key atomic units of gene expression

Researchers have created extensive data sets using a technique known as the massively parallel reporter assay. With this technique, the regulatory activity of millions of DNA sequences can be studied simultaneously in one large assay. The data was analyzed using artificial intelligence tools.
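An assay of this kind is commonly scored by comparing how often each sequence appears in the reporter RNA with how often it appears in the input DNA library; the figure caption reports this as a log2 fold change compared to input. The Python sketch below shows one minimal way such a score could be computed. The function name, the pseudocount, and the toy counts are assumptions made for illustration, not the study's pipeline or data.

```python
import math

def log2_fold_change(rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA fraction / input-DNA fraction) for each sequence.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for sequences that were never observed in the RNA.
    """
    n = len(dna_counts)
    rna_total = sum(rna_counts.get(seq, 0) for seq in dna_counts) + pseudocount * n
    dna_total = sum(dna_counts.values()) + pseudocount * n
    scores = {}
    for seq, dna in dna_counts.items():
        rna = rna_counts.get(seq, 0)
        rna_frac = (rna + pseudocount) / rna_total
        dna_frac = (dna + pseudocount) / dna_total
        scores[seq] = math.log2(rna_frac / dna_frac)
    return scores

# Toy read counts, invented for illustration only.
dna_counts = {"seqA": 100, "seqB": 100, "seqC": 100}
rna_counts = {"seqA": 400, "seqB": 100, "seqC": 25}

scores = log2_fold_change(rna_counts, dna_counts)
for seq, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{seq}: {score:+.2f}")
```

Sequences enriched in the RNA relative to the input score above zero, and depleted sequences score below zero.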

Gene expression is regulated by proteins known as transcription factors that bind to DNA. The researchers found that the very short DNA sequences to which these factors bind constitute important atomic units of gene expression. Individual transcription factors contribute additively to gene regulation. In other words, each factor independently increases regulatory activity without specific interaction with other factors. In addition, transcription factors may have several parallel functions in the gene regulatory process, such as increasing the rate of gene expression and defining the position in the genome where transcription begins.
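The additive picture described above can be illustrated with a toy model in which each motif occurrence adds a fixed amount to the predicted activity, whatever else is nearby. The sketch below is only an illustration under stated assumptions: the core sequences come from the figure caption, but the weights are hypothetical, only the forward strand is scanned, and this is not the model fitted in the study.

```python
# Core consensus sequences taken from the figure caption; the weights are
# hypothetical values in arbitrary units, chosen only for illustration.
MOTIF_WEIGHTS = {"GGAA": 1.0, "ACAA": 0.5, "AACCGG": 2.0}

def count_occurrences(sequence, motif):
    """Count (possibly overlapping) forward-strand occurrences of motif."""
    width = len(motif)
    return sum(
        1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == motif
    )

def additive_activity(sequence, weights=MOTIF_WEIGHTS):
    """Sum independent per-motif contributions; order and spacing are ignored."""
    return sum(w * count_occurrences(sequence, m) for m, w in weights.items())

spacer = "TTTT"
order_1 = "GGAA" + spacer + "AACCGG"
order_2 = "AACCGG" + spacer + "GGAA"

print(additive_activity(order_1))
print(additive_activity(order_2))
```

Because the contributions are simply summed, swapping the order of the two motifs leaves the predicted activity unchanged in this toy model.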

“Transcription factor binding motifs can be thought of as words that together define a cell's gene regulatory code,” Professor Jussi Taipale explains.

Researchers have found that the grammar of the code is relatively weak and that most words can be placed in almost any order without changing their meaning.

“But sometimes, as with compound words, the grammar is strong, and certain combinations of factors need to bind in a particular order to activate gene expression,” says Taipale.
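The “compound word” case, in which order matters, can be sketched as a check that one motif sits upstream of another within a short gap. The example below is a hypothetical illustration in Python: the function name, the choice of motif pair, and the maximum gap are assumptions, not rules reported by the study.

```python
import re

def has_ordered_pair(sequence, first, second, max_gap=5):
    """Return True if `first` is followed by `second` within `max_gap` bases.

    Only the forward strand is checked; the gap limit is an assumption
    of this sketch.
    """
    pattern = re.compile(
        f"{re.escape(first)}[ACGT]{{0,{max_gap}}}{re.escape(second)}"
    )
    return pattern.search(sequence) is not None

# Same two motifs, opposite order (toy sequences, invented for illustration).
forward = "TTGGAATTACAATT"
swapped = "TTACAATTGGAATT"

print(has_ordered_pair(forward, "GGAA", "ACAA"))
print(has_ordered_pair(swapped, "GGAA", "ACAA"))
```

Under such a rule the two toy sequences differ, even though they contain the same motifs, which is the opposite of the order-independent behavior that the article says applies to most motifs.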

Only a handful of highly active transcription factors in the cell

The researchers compared three different human cell types: colon cancer cells, liver cancer cells, and normal cells from the retina. They found that only a handful of transcription factors were very active in the cells. In addition, the activity of most transcription factors was similar regardless of cell type.

According to the results, gene regulatory elements in human cells can be categorized into different types based on their chromatin context, that is, whether they lie in closed chromatin regions, where the DNA is tightly packed, or in a more open chromatin environment, where the DNA is not wrapped as tightly around the histone proteins.

Traditionally, active regulatory elements have been thought to lie within open chromatin regions, where the DNA is easily accessible to transcription factors. The discovery of active regulatory elements that function within closed chromatin regions is therefore one of the central new observations of the study. In addition, the researchers identified chromatin-dependent regulatory elements. These elements are active at their native sites in the genome, but their activity is significantly reduced when they are removed from their original location and moved close to another gene.




For more information:
Biswajyoti Sahu et al, Sequence determinants of human gene regulatory elements, Nature Genetics (2022). DOI: 10.1038/s41588-021-01009-4

Citation: Research reveals the “grammar” behind human gene regulation (February 21, 2022), retrieved February 21, 2022 from https://phys.org/news/2022-02-uncovers-grammar-human-gene.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.


