In a new study, the first of its kind to functionally link noncoding DNA mutations to autism, researchers used artificial intelligence to show how mutations in “junk” DNA can cause autism.
Their findings are published in Nature Genetics.
Using machine learning, the research team analyzed the complete genomes of 1,790 individuals with autism and their unaffected parents and siblings. The participants had no family history of autism, which means their autism was most likely caused by spontaneous mutations rather than inherited ones.
The machine learning system predicted that autism-causing mutations lie in parts of the genome that do not encode proteins, regions often called “junk” DNA. The number of autism cases linked to these noncoding mutations was similar to the number linked to protein-coding mutations that disable gene function. The researchers say the implications extend beyond autism.
“This is the first clear demonstration of non-inherited, noncoding mutations causing any complex human disease or disorder,” said study leader Olga Troyanskaya, the deputy director for genomics at the Flatiron Institute’s Center for Computational Biology (CCB) and a professor of computer science at Princeton University.
“This [discovery] enables a new perspective on the cause of not just autism, but many human diseases,” added co-author Jian Zhou of CCB and Princeton.
Mutations in protein-coding regions, which make up only about 1 to 2 percent of the genome, account for at most 30 percent of autism cases in people without a family history of the condition. The research team found evidence that autism-causing mutations must occur elsewhere in the genome as well.
To identify which noncoding mutations may cause autism, the team used machine learning to predict how any given DNA sequence would affect gene expression. They applied the technique to a genetic data set called the Simons Simplex Collection, which contains the whole genomes of about 2,000 “quartets,” each consisting of a child with autism, an unaffected sibling, and two unaffected parents.
“This is a shift in thinking about genetic studies that we’re introducing with this analysis,” said Chandra Theesfeld, a research scientist in Troyanskaya’s lab at Princeton. “In addition to scientists studying shared genetic mutations across large groups of individuals, here we’re applying a set of smart, sophisticated tools that tell us what any specific mutation is going to do, even those that are rare or never observed before.”
“The design of the Simons Simplex Collection is what allowed us to do this study,” said Zhou. “The unaffected siblings are a built-in control.”
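For readers who want a concrete picture of what a “built-in control” means, here is a minimal sketch of a sibling-controlled comparison. It is not the study's actual pipeline. It assumes each child has already been assigned a single predicted-impact score for their spontaneous noncoding mutations (the scoring model is not shown), and it uses a standard paired sign-flip permutation test to ask whether affected children score higher than their own unaffected siblings. The function name and the numbers are hypothetical and invented for illustration only.

```python
import random
from statistics import mean


def paired_permutation_test(proband_scores, sibling_scores,
                            n_permutations=10_000, seed=0):
    """One-sided paired sign-flip permutation test.

    Asks whether affected children (probands) tend to carry higher
    predicted-impact scores than their own unaffected siblings.
    Returns (observed mean difference, p-value).
    """
    if len(proband_scores) != len(sibling_scores):
        raise ValueError("each proband needs exactly one matched sibling")

    # Within-family differences: the sibling acts as that family's control.
    diffs = [p - s for p, s in zip(proband_scores, sibling_scores)]
    observed = mean(diffs)

    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the two labels are exchangeable within
        # a family, so each difference is equally likely to have either sign.
        permuted = mean(d if rng.random() < 0.5 else -d for d in diffs)
        if permuted >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up illustrative scores, NOT data from the study.
probands = [0.82, 0.40, 0.91, 0.55, 0.73, 0.66, 0.88, 0.47]
siblings = [0.31, 0.45, 0.52, 0.20, 0.70, 0.35, 0.41, 0.50]

observed, p = paired_permutation_test(probands, siblings)
print(f"mean difference = {observed:.3f}, p = {p:.4f}")
```

Pairing each child with a sibling from the same family removes much of the variation that comes from shared ancestry and environment, which is why the comparison is made within families rather than across the whole group.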
The analysis showed that, in many of the children with autism, noncoding mutations altered gene regulation. These mutations affected gene expression in the brain and genes already linked to autism.
“This is consistent with how autism most likely manifests in the brain,” said study co-author Christopher Park, a research scientist at CCB. “It’s not just the number of mutations occurring, but what kind of mutations are occurring.”
Troyanskaya hopes to extend the team's methods and use them to improve how genetic data is used to diagnose and treat diseases and disorders beyond autism.
“Right now, 98 percent of the genome is usually being thrown away,” Troyanskaya said. “Our work allows you to think about what we can do with the 98 percent.”