Genetic “Mesearch”

Personal genomics

Personal, consumer genomics has exploded in the past decade and shows no signs of slowing. Yet there is an inherent tension in attempts to translate the findings of genetic research, which is done in large groups of people, into insights for individuals. That is, the research question, “Does this genetic variant, or set of variants, occur more often in people with heart disease than in those without?” does not map comfortably onto the related question, “If I have this variant (or variants), will I get heart disease?” Direct-to-consumer (DTC) testing companies rely on this type of mapping between research findings and customer reports. (Granted, companies clarify that reports are not diagnostic and simply convey a likely increase or decrease in risk for a given disease or condition.)

Translating genomics…into what?

A buzzword in research, “translational genomics” typically refers to the application of basic research findings to clinical practice. However, inherent in that pathway is translation from studies of populations to predictions or insights for individuals. In consumer genomics, those individuals often aren’t currently sick; they are coming from a place of curiosity rather than medical necessity or urgency. These circumstances increase the potential awkwardness of translation, since the end goal isn’t even to affect medical care. Furthermore, information about disease risk is hard to process when our experience as individuals is a binary “yes/no”: do I have this, or will I get it? This disconnect is especially salient in DTC reports on physical traits: people balk when 23andMe gets their eye or hair color prediction wrong. Our individual truth on these easily observable facts is so obvious that we can forget the body of genetic research on eye color is complex and evolving.

Genetic research

So where do those genetic research findings come from? Many come from genome-wide association studies (GWAS), a popular study design enabled by the advent of microarray-based genotyping in the mid-2000s. The idea is to measure a million or so common variants across the genome and compare their frequency in a group of unrelated people with the disease (cases) versus those without (controls), aka “case-control association testing.” More recently, falling costs of whole-genome sequencing have made it possible to try a similar tactic with rare genetic variants rather than common ones. Either way, you’re looking for variants that occur at significantly different frequencies in cases and controls. Because most common diseases are caused by a combination of genetic and non-genetic factors, any single association found is unlikely to explain much of a person’s risk.
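To make the case-control comparison concrete, here is a minimal sketch of the test for a single variant: a chi-square test on a 2x2 table of allele counts. The function name and the counts are made up for illustration, and real GWAS pipelines use dedicated software and adjust for things like ancestry, so treat this as a toy version of the idea rather than how any particular study did it.

```python
import math

def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    # Compare each observed count with the count expected if allele
    # frequency were the same in cases and controls.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio

# Made-up counts: 1,000 cases and 1,000 controls, two alleles per person.
chi2, p_value, odds_ratio = allelic_association(
    case_alt=460, case_ref=1540, control_alt=400, control_ref=1600
)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, odds ratio={odds_ratio:.2f}")
```

With a million or so variants tested at once, a result like this one would not count as a finding: the conventional genome-wide significance threshold is p < 5e-8, far stricter than the familiar 0.05.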

One slightly controversial way that researchers are trying to get around this is by adding up all the small effects of many variants to create an overall score, or “polygenic risk score” (PRS), an approach also used in some DTC reports. But opponents argue a PRS that sums up hundreds or even thousands of small effects is just a lazy way to say “we don’t really understand how the genome is working here, so let’s just throw some math at it.” One PRS paper made a splash earlier this month by showing how well risk scores predicted five common diseases, including heart attack and type 2 diabetes. The researchers plan to create a website where DTC customers can upload their raw genetic data (more on that below) and calculate their own scores — the ultimate goal in translating population-level research into insights for individuals.
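For a sense of what “adding up all the small effects” looks like, here is a minimal sketch of a PRS as a weighted sum: the number of risk alleles a person carries at each variant (0, 1, or 2) times that variant’s effect size. The variant IDs, effect sizes, and tiny reference group are all hypothetical, and this is not the method of the paper mentioned above or of any DTC company.

```python
import math

# Hypothetical per-variant effect sizes (log odds ratios), standing in for
# the kind of numbers a GWAS might report.
effect_sizes = {
    "rs0001": math.log(1.20),
    "rs0002": math.log(1.05),
    "rs0003": math.log(0.90),
}

def polygenic_risk_score(dosages, weights):
    """Sum of (risk-allele count x effect size) over variants in both inputs."""
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)

def percentile(score, reference_scores):
    """Percentage of reference scores at or below `score`."""
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100.0 * at_or_below / len(reference_scores)

# Risk-allele counts for one made-up person.
my_dosages = {"rs0001": 2, "rs0002": 1, "rs0003": 0}
my_score = polygenic_risk_score(my_dosages, effect_sizes)

# A made-up reference group of four people, scored the same way.
reference = [
    polygenic_risk_score(d, effect_sizes)
    for d in (
        {"rs0001": 0, "rs0002": 0, "rs0003": 2},
        {"rs0001": 1, "rs0002": 1, "rs0003": 1},
        {"rs0001": 0, "rs0002": 2, "rs0003": 0},
        {"rs0001": 2, "rs0002": 2, "rs0003": 0},
    )
]
print(f"score={my_score:.3f}, percentile={percentile(my_score, reference):.0f}")
```

The raw score means little by itself. It only becomes interpretable as a position within a reference distribution, which is still a statement about a population rather than a prediction for one person.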

I should also note that most of this drives clinical geneticists bonkers, as blindly mapping research results onto individuals ignores key contextual factors such as family and clinical history. One clinical geneticist told me that you need board-certified geneticists to interpret the clinical impact of any variant (or variants) for a given patient, rather than just “throwing it into a machine” (i.e., an automated algorithm, such as a PRS calculation).

Genetic mesearch

[Image: cartoon stick figure holding a double helix]

For the past four years, I’ve been studying people who take this quest to translate research findings into individual meaning to the next level. Let’s call them the “genetic mesearchers.” These are folks who go beyond the DTC companies’ processed reports by downloading their raw genetic data file and plugging it into a suite of additional, third-party interpretation tools. Some of these tools work like the DTC companies, providing curated reports that attempt to synthesize research across several genetic variants at once. But others take users right to the raw, unprocessed fruits of research by linking them to research papers on single variants, acting as a “bridge to the literature.” With these tools, mesearchers have to do even more work to process what those research findings may or may not mean for them as individuals.

Mesearchers know a lot for “non-experts”

The developers of some of these third-party tools I talked to see directly bridging to the literature as a more transparent way to relay the inherently uncertain and iterative nature of genetic research. And indeed, in talking to some of these tool users, I saw remarkably sophisticated ways of processing this complex and probabilistic information. For example:

“I think some people don’t understand the statistics involved in some of these [third-party tool outputs]. This isn’t a test score, this is where you are on the curve and it doesn’t mean…[it’s] not a diagnosis. Don’t use it like one.” (participant 798)

At the same time, mesearchers often use their raw data for both health information and genealogy research. In identifying close family relatives, “DNA doesn’t lie,” as one of my interviewees told me. And that’s largely true: it’s rather easy to determine if someone is your parent or your sibling by examining DNA. But this certainty breaks down quickly when you’re using that same raw DNA file to try to understand health risks. There, DNA doesn’t lie outright, but it’s definitely not forthcoming and may ultimately be misleading.


The interview quotes presented here come from my unpublished dissertation research. I am grateful to these participants for sharing their time and experiences with me.
