Thursday, February 6, 2025
New AI model predicts gene expression across human cell types



Summary: A team of investigators from Dana-Farber Cancer Institute, The Broad Institute of MIT and Harvard, Google, and Columbia University has created an artificial intelligence model that can predict which genes are expressed in any type of human cell. The model, called EpiBERT, was inspired by BERT, a deep learning model designed to understand and generate human-like language.

EpiBERT was trained on data from hundreds of human cell types in multiple phases. It was fed the genomic sequence, which is 3 billion base pairs long, along with maps of chromatin accessibility that indicate which of these sequences are unwound from the chromosome and read by the cell. The model was first trained to learn the relationship between DNA sequence and chromatin accessibility across large chunks of the genome in a specific cell type. It then used these learned relationships to predict which genes were active in the corresponding cell type. It accurately identified regulatory elements (parts of the genome recognized by transcription factors) and their influence on gene expression across many cell types, building a “grammar” that is generalizable and predictive. This grammar-building process can be likened to the way a large language model, such as ChatGPT, learns to build meaningful sentences and paragraphs from many examples of text. The EpiBERT model can process accessibility data and predict functional bases as well as RNA expression for a never-before-seen cell type.

Significance: Every cell in the body has the same genome sequence, so the difference between two types of cells is not the genes in the genome, but which genes are turned on, when, and how much. Approximately 20% of the genome codes for regulatory elements that determine which genes are turned on, but very little is known about where those codes are in the genome, what their instructions look like, or how mutations affect function in a cell. EpiBERT will shed light on how genes are regulated in cells and, potentially, how a cell’s regulatory system can be mutated in ways that lead to diseases such as cancer.

Funding: The Broad Institute, the Novo Nordisk Foundation, the National Genome Research Institute, the Sharf Green Cancer Research Fund, the Richard and Nancy Lubin Family, and the American Cancer Society. Tensor Processing Unit (TPU) access and support provided by Google.

Source:

Journal reference:

Javed, N., et al. (2025). A multi-modal transformer for cell type-agnostic regulatory predictions. Cell Genomics. doi.org/10.1016/j.xgen.2025.100762.
