Google Research Presents a Novel AI Method for Genetic Discovery that can Harness Hidden Information in High-Dimensional Clinical Data

High-dimensional clinical data (HDCD) refers to healthcare datasets in which the number of variables (features) is much larger than the number of patients (observations). As the number of variables increases, the data space grows exponentially, which demands substantial computational resources and makes the data difficult to process and analyze. Models built on high-dimensional data can also be hard to interpret, which hinders clinical decision-making. Two further obstacles restrict the effective use of HDCD in genomic studies: large datasets with comprehensive disease labels are difficult to obtain, and standard disease labels only partially reflect complex biological traits.

Google AI researchers address the challenge of harnessing high-dimensional clinical data (HDCD), such as spirograms, photoplethysmograms (PPGs), and imaging data, for genetic discovery and disease prediction. Current genomic studies typically run genome-wide association studies (GWAS) either on expert-defined features extracted from HDCD or directly on the high-dimensional data coordinates. Both approaches have drawbacks: computational expense, a high multiple-testing burden, and limited ability to uncover complex genetic associations.
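To make the multiple-testing burden concrete, here is a minimal sketch of how a Bonferroni correction tightens the significance threshold when one GWAS is run per data coordinate. The value 5e-8 is the conventional genome-wide significance threshold; the trait counts (1,000 raw coordinates versus 5 embedding coordinates) and the function name are hypothetical, chosen only for illustration. The article does not say how the researchers handle this correction.

```python
# Conventional genome-wide significance threshold for a single GWAS.
GENOME_WIDE_ALPHA = 5e-8


def per_trait_threshold(num_traits: int, alpha: float = GENOME_WIDE_ALPHA) -> float:
    """Bonferroni-corrected p-value threshold when one GWAS is run per trait."""
    if num_traits < 1:
        raise ValueError("num_traits must be at least 1")
    return alpha / num_traits


# Hypothetical sizes: a raw signal with 1,000 coordinates vs. a 5-dimensional embedding.
for label, n_traits in [("raw coordinates", 1000), ("embedding coordinates", 5)]:
    threshold = per_trait_threshold(n_traits)
    print(f"{label:>22}: {n_traits:>5} GWAS runs, threshold {threshold:.1e}")
```

Bonferroni is conservative when the traits are correlated, as neighboring coordinates of a clinical signal usually are, but the direction of the effect holds: fewer traits means a less punishing threshold per test.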

Google’s approach, REpresentation Learning for Genetic discovery on Low-dimensional Embeddings (REGLE), is designed to address these limitations. REGLE uses unsupervised representation learning to transform HDCD into lower-dimensional embeddings without requiring disease labels. It also integrates expert-defined features (EDFs) where available, enabling more efficient and comprehensive genetic analysis.
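As a rough illustration of what a label-free, low-dimensional embedding looks like, the sketch below compresses simulated 1,000-point recordings into two coordinates per person. It uses principal component analysis, a simple linear technique, purely as a stand-in; it is not the representation-learning model REGLE uses. The data, sizes, and the `pca_embed` helper are all hypothetical.

```python
import numpy as np


def pca_embed(X: np.ndarray, k: int) -> np.ndarray:
    """Project the rows of X onto their top-k principal components (no labels used)."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Economy SVD: Xc = U @ diag(S) @ Vt; the rows of Vt are principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T  # shape (n_samples, k)


rng = np.random.default_rng(0)
n_people, n_points = 200, 1000  # hypothetical: 200 recordings, 1,000 samples each

# Simulated recordings driven by two hidden factors plus measurement noise.
latent = rng.normal(size=(n_people, 2))
basis = rng.normal(size=(2, n_points))
X = latent @ basis + 0.1 * rng.normal(size=(n_people, n_points))

Z = pca_embed(X, k=2)
print(X.shape, "->", Z.shape)
```

Each row of `Z` summarizes one recording with two numbers, and each column can then be treated as a trait for downstream genetic analysis. A non-linear model can capture structure that a linear projection like this one cannot.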

REGLE employs a variational autoencoder (VAE) to learn non-linear, low-dimensional, disentangled representations of HDCD. The process involves three main steps: (1) learning embeddings of HDCD with the VAE, (2) performing GWAS on these embeddings to identify genetic associations, and (3) creating polygenic risk scores (PRSs) from the embeddings to predict specific diseases or traits, potentially using a small number of disease labels.

The method was validated on two types of HDCD, spirograms and PPGs, and improved on existing approaches for both. REGLE detected novel genetic loci associated with lung and cardiovascular function that were not identified through traditional methods. For instance, REGLE found 45% more significant loci for PPG data and improved risk prediction for diseases such as COPD and asthma compared with methods based on EDFs or principal component analysis (PCA). The embeddings also yielded interpretable results, highlighting features such as airway obstruction that are not well represented by standard EDFs.
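The GWAS and PRS steps described above can be sketched on simulated data. This is a toy illustration under stated assumptions, not the researchers' pipeline: the genotypes are random, the "embedding" is a simulated stand-in for one coordinate of a learned embedding, the association test is a plain per-variant linear regression with a normal approximation for p-values, and the names `gwas_linear`, `hits`, and `prs` are hypothetical. Real GWAS software also adjusts for covariates such as ancestry, which is omitted here.

```python
import math

import numpy as np


def gwas_linear(G: np.ndarray, y: np.ndarray):
    """Regress y on each variant separately; return slopes and two-sided p-values.

    Uses a normal approximation to the t-statistic (reasonable for large n)
    and assumes no variant is constant across individuals.
    """
    n = G.shape[0]
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    ssg = (Gc ** 2).sum(axis=0)                   # per-variant sum of squares
    beta = Gc.T @ yc / ssg                        # OLS slope for each variant
    resid_ss = (yc ** 2).sum() - beta ** 2 * ssg  # residual sum of squares
    se = np.sqrt(resid_ss / (n - 2) / ssg)        # standard error of each slope
    z = beta / se
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return beta, p


rng = np.random.default_rng(0)
n, m = 4000, 200                                  # hypothetical: 4,000 people, 200 variants
maf = rng.uniform(0.1, 0.5, size=m)               # minor allele frequencies
G = rng.binomial(2, maf, size=(n, m)).astype(float)  # allele counts 0, 1, or 2

true_beta = np.zeros(m)
true_beta[:5] = 0.5                               # five simulated causal variants
# Simulated stand-in for one embedding coordinate (not produced by a real VAE).
embedding = G @ true_beta + rng.normal(size=n)

train, test = slice(0, 3000), slice(3000, n)

# Step 2: GWAS on the embedding coordinate, Bonferroni-corrected over m variants.
beta, p = gwas_linear(G[train], embedding[train])
hits = np.flatnonzero(p < 0.05 / m)

# Step 3: PRS as a weighted sum of allele counts at the significant variants.
prs = G[test][:, hits] @ beta[hits]
r = float(np.corrcoef(prs, embedding[test])[0, 1])

print("significant variants:", hits.tolist())
print("held-out correlation between PRS and embedding:", round(r, 2))
```

In the workflow the article describes, one such score would be built per embedding coordinate, and the scores would then be combined to predict a disease or trait, which is where a small number of disease labels could come in.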

In conclusion, REGLE offers a robust approach to genetic analysis of high-dimensional clinical data, using unsupervised learning to uncover hidden genetic signals and improve disease prediction. By removing the need for extensive disease labels and incorporating expert features, it addresses key limitations of traditional methods. The reported gains in novel loci discovery and risk prediction underscore REGLE’s potential to advance genomic research and enhance personalized medicine through a more comprehensive analysis of HDCD.

All credit for this research goes to the researchers of this project.

Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast with a keen interest in software and data science applications, and she follows developments across different fields of AI and ML.
