background

Professor, Computer Science

Xin Gao

Email

xin.gao@kaust.edu.sa

Computer, Electrical and Mathematical Sciences and Engineering (CEMSE)

https://cemse.kaust.edu.sa/

Principal Investigator, Structural and Functional Bioinformatics Group

https://cemse.kaust.edu.sa/sfb

Acting Associate Director, Computational Bioscience Research Center

https://cemse.kaust.edu.sa/cbrc/people/person/xin-gao

ORCID

https://orcid.org/0000-0002-7108-3574

 
“I view bioengineering as the science and art of understanding, optimizing, and even creating life.”

Professor Gao earned a bachelor’s degree in computer science in 2004 from Tsinghua University and his Ph.D. in computer science from the University of Waterloo in 2009. Prior to joining KAUST, he was a Lane Fellow at the Lane Center for Computational Biology in the School of Computer Science at Carnegie Mellon University. 

At KAUST, Professor Gao wears many hats. He is currently a professor of computer science in the Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) division, acting associate director of the Computational Bioscience Research Center (CBRC), deputy director of the Smart Health Initiative (SHI), and leader of the Structural and Functional Bioinformatics Group.

 

Focus and Technology Areas

Professor Gao’s research lies at the intersection of computer science and biology. In computer science, he develops machine learning theories and methodologies related to deep learning, probabilistic graphical models, kernel methods, and matrix factorization.

In the field of bioinformatics, his group works on building computational models, developing machine learning techniques, and designing efficient and effective algorithms to tackle critical open problems related to bioinformatics. Professor Gao focuses on a variety of high-impact application areas, including biological sequence analysis, 3D structure determination, function annotation, and—more recently—biomedicine and healthcare.

 

Professor Gao developed an end-to-end pipeline for nanopore sequencing data analysis and applications, built on three main research components:

  • a novel, bi-directional, WaveNet-based base-calling method that decodes the raw electrical current signals from nanopore sequencers to long DNA reads
  • the world’s first signal-level simulator for nanopore sequencing to alleviate the data scarcity issue for deep learning training
  • an ultra-fast signal-sequence alignment algorithm, which is not only as accurate as the optimal alignment algorithm, but also 3,000+ times faster

He has applied this pipeline to the genomic diagnosis of Saudi genetic diseases in collaboration with the King Faisal Specialist Hospital and Research Center, to antibiotic resistance gene detection in the environment, and to structural variation detection after CRISPR/Cas9 genome editing. As a result of this research, he diagnosed a previously unsolved Saudi genetic disorder involving microcephaly, abnormality of the cerebral white matter, and intellectual disability, caused by a partial exonic deletion of 38 bp of KCTD3. He also diagnosed four consanguineous families with severe global developmental delay, microcephaly, facial dysmorphism, and variable congenital heart and eye malformations as caused by recessive deleterious variants in SMG8.

He co-founded a startup, Peregrine Genomics, to further pursue technology transfer and economic development of these techniques. Peregrine Genomics was a winner of the 2019 TAQADAM Startup Accelerator Program, was a finalist in the GITEX Supernova Challenge, ranked fifth among 15,000 startups in the 2020 Entrepreneurship World Cup (EWC) Saudi Arabia, and placed in the top 25 among 175,000 startups from 200 countries in the 2020 EWC global final.

As a core member of the R3T team, Professor Gao contributed rapid-response research to combat COVID-19. This included:

  • Working on AI-based CT diagnosis, drug repositioning, sentiment analysis, rehabilitation, and sequela prediction.
  • Building a fully automatic AI-based system for COVID-19 CT-scan diagnosis, segmentation, and quantification, which inspired KAUST’s first publication on COVID-19 and was featured in one of 12 papers in the world’s first special issue on imaging-based diagnosis of COVID-19.
  • Deploying a pipeline to the King Faisal Specialist Hospital and helping front-line radiologists. The response to this work was highly promising, with one radiologist noting that “the model was fast to use and each case took approximately less than a minute to be processed. It is expected that such a model will make an important contribution to chest imaging, especially with the current pandemic.”
 

In addition to the research mentioned above, Professor Gao is a well-known, world-leading expert on protein bioinformatics. Since joining KAUST, he has been working on computational methodology development for analyzing and understanding protein sequences, 3D structures, functions, and their behaviors in complex biological networks. Two of his most representative works are:

  • The development of DEEPre, the first deep-learning-based enzyme function predictor, which accurately predicts the detailed functions of enzymes, a very important family of proteins. It quickly gained momentum among researchers in the bioinformatics and protein science fields, becoming a highly popular tool. Since its publication in 2018, the paper has been cited more than 100 times and the web server has processed more than 200,000 queries for the community. Researchers from top institutes such as Harvard Medical School, UC Berkeley, the University of Michigan, EBI, MPI, BGI, KAIST, and the University of Tokyo have used, cited, and followed this work.
  • The development of NucleicNet, the first computational method able to predict the high-resolution interactions between RNA constituents (ribose, phosphate, and four bases) and any location of a given protein structure surface. NucleicNet combines the strengths of physicochemical characteristics and deep learning not only to predict the most likely RNA for RNA-binding proteins (RBPs), but also to identify entirely new RBPs. NucleicNet has been highlighted by various media outlets including F1000Prime, PHYS.ORG, News Medical, and KAUST Discovery.

Why did you choose this research topic?

Although I’m trained purely as a computer scientist, I’m completely fascinated by how life works and by the possibility of tackling life science problems with computational methods. I’m particularly interested in developing new principled computational methods to solve biological and biomedical problems with high significance and impact.

 
Why KAUST?

When I first visited the KAUST campus, I was deeply impressed by the infrastructure and the inspiring vision of King Abdullah. I knew that by joining this brand-new university, I could be a part of history.

 
What are your future plans?

I plan to become the leading expert in developing novel computational methods to solve open problems in biology, biomedicine, and healthcare, with a game-changing impact on improving human health. I expect my group to make breakthroughs and solve the most critical problems to benefit human health.


What does bioengineering mean to you?

I view bioengineering as the science and art of understanding, optimizing, and even creating life, which requires the convergence of people from many disciplines.

 
