“If I build it, I will come”

For my dissertation, I have been interviewing developers of third-party interpretation tools for consumer genomic data. These are tools such as Promethease, openSNP, and DNA.land, among many others, where people who have their genetic data file from consumer testing can seek further analysis and/or contribute their genome to research. Even though I’m only a few interviews in, I’m already seeing some interesting themes. This isn’t a formal analysis, just some of my initial impressions — “field notes,” you might call them.

[Image: a door sign reading "Interview in Progress." Image credit: Laura Pasquini, https://www.flickr.com/photos/souvenirsofcanada/15518999970.]
  • Theme 1. Tools are heterogeneous in terms of their creators/developers, functions, purposes, and processes.

Just like with most studies, I have inclusion criteria to identify eligible tools. (Also called “scoping,” this is an essential process for any successful…and completable…graduate student project! We just can’t get enough of the scoping…). My three criteria for tool eligibility are:

  • Must allow users to upload (or analyze locally) their genetic data file
  • Must return some type of information or interpretation to the user
  • Must be active at the time of my study (some tools failed this criterion, i.e., they are now defunct)

This is a very user- or customer-facing set of inclusion criteria, and that was on purpose. That is, I pictured a user sitting at their computer, having just downloaded their genetic data file from 23andMe or AncestryDNA. What tools might they search for or stumble upon as they pursued further self-directed analysis of their genetics? Those were the tools I wanted to study.

At the outset, I was aware I might be grouping together a lot of very different tools, and indeed after months of closer study this is definitely the case. Some tools might not even want to be called “interpretation tools,” as they are more focused on users contributing data to research and just happen to give some tidbits of info in return. I also wasn’t expecting to include some direct-to-consumer (DTC) testing companies, but it turns out some of them also offer analysis of existing genetic data, i.e., data from another testing company.

So I have a sort of motley crew of a dataset, but I’m optimistic it will make for an interesting and fruitful analytic substrate.

  • Theme 2. Several developers built the tool they wanted for themselves.

This isn’t across the board, but I was struck by how many of the tools were born out of the developers’ own needs and desires. They had their own genetic data in hand and wanted to do something with it that existing tools couldn’t do, or couldn’t do in exactly the way they wanted. So, having sufficient programming and bioinformatics skills, they built the tool they wanted and then expanded it for broader use. It reminds me of the phrase “If you build it, they will come.” Except here we have the developer saying something like: “If I build it, I will come…and then I’ll let everyone else who’s interested come, too.”

  • Theme 3. The process of getting information is a source of information in and of itself.

I’ve been learning about tools by (1) studying each tool’s website and any associated papers or media coverage and (2) interviewing tool developers. How difficult or easy it is to access these routes of information is sometimes telling about the tool itself. Some tools are run by companies where it’s very difficult to tease out any proprietary details of how they work (e.g., which analytical tools and resources they use). Other tools are built on an idea of openness and are accordingly incredibly transparent about how they work, how many people are using them, and so on.

Reaching out to tool developers for interviews is similarly illuminating, but not always in the way I expected. Not surprisingly, several people are either ignoring or declining my inquiries, and it’s fair to say more companies fall into that category than not. But some companies are talking to me, and on the flip side, some academics are not. My goal is to get interviews for at least half of the tools I’m studying, which would be a ~50% “response rate.” The goal with qualitative interviews such as these is not to reach a statistically significant number, per se, but rather to add depth and texture to my understanding of who made these tools and why. Stay tuned for more!
