Today DNA sequencing is dominated by research applications. This is likely to continue in the short term, but in the medium and long term the future of DNA sequencing is likely to look very different.
Fundamentally I see sequencing as a new class of sensor. People generally discuss sequencing in the context of human health. But as sequencing decreases in cost and becomes easier to use, it becomes more like a general-purpose sensor. Like CMOS imaging chips, for example, it will have research, clinical, and consumer applications.
The Medium Term
In the medium term it’s clear that DNA sequencing will be making its way into the clinic. Companies like 23andMe, which screen for inherited genetic traits, are one obvious application. The genetic screening of every child at birth is somewhat attractive, and could provide everyone with a dataset they could draw upon throughout their lifetime. With 4 million births a year in the US, this is a pretty big market, but it’s perhaps not the biggest.
There are 1.7 million new cancers reported each year in the US. Cancer is an inherently genetic disease, and each of these cases would ideally be sequenced in full to better understand its genetic cause.
But better than understanding a cancer once you know it exists is to detect it early so you can do something about it. Companies like Illumina’s newly founded GRAIL seek to use DNA sequencing to regularly screen for cancer. For various reasons both complete and fragmented cancer cells end up in a patient’s blood. By taking a simple blood sample you should therefore be able to screen for cancer using DNA sequencing. Screening every US adult every 5 years gives you a market for 50 million tests a year.
The screening market is much bigger than this. There are more than a million sepsis infections a year in the US; screening for early detection of sepsis and other infectious diseases would be of huge value. In fact it seems obvious that when bulk sequencing for <$100 becomes generally available, blood serum would be sequenced as a matter of course. That single test could detect cancers, infectious diseases, and provide a genetic profile of any unborn children as well as the patient themselves.
If we sequenced every patient admitted to hospital, that would result in ~30 million genetic tests in the US alone.
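To put the medium-term numbers side by side, here is a rough sketch that tabulates the annual US volumes quoted above. The $100 price is only the "<$100" upper bound mentioned in the text, not a forecast, and the function names are my own. The categories overlap (a hospital patient may also be a cancer case), so the rows are not meant to be summed.

```python
# Annual US test volumes as quoted in the text (tests per year).
US_ANNUAL_TESTS = {
    "newborn screening": 4_000_000,     # ~4 million births a year
    "cancer sequencing": 1_700_000,     # ~1.7 million new cancers a year
    "cancer screening": 50_000_000,     # every adult screened every 5 years
    "hospital admissions": 30_000_000,  # ~30 million admissions
}


def implied_population(tests_per_year: int, interval_years: int) -> int:
    """Population implied by testing everyone once per interval."""
    return tests_per_year * interval_years


def revenue_at_price(tests_per_year: int, price_usd: float) -> float:
    """Annual spend if every test cost price_usd (an assumed price)."""
    return tests_per_year * price_usd


if __name__ == "__main__":
    assumed_price = 100.0  # assumption: the "<$100" bound from the text
    for name, volume in sorted(US_ANNUAL_TESTS.items(), key=lambda kv: kv[1]):
        spend = revenue_at_price(volume, assumed_price)
        print(f"{name:20s} {volume:>12,d} tests/yr  ${spend:>15,.0f}")
    # 50 million tests a year on a 5-year cycle implies this many adults:
    print(f"implied adults screened: {implied_population(50_000_000, 5):,d}")
```

The 50 million figure implies a screened adult population of roughly 250 million, which is in the right range for the US.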
The Long Term
In the longer term it’s clear that sequencing costs will drop even further. My guess is that the basic sensing technology will drop as low as cheap CMOS imaging sensors are today. These cost almost nothing, on the order of a dollar. Unlike CMOS sensors though I expect DNA sequencers to always be single (or few) sample use. But overall it seems reasonable to expect that the cost of sequencing will drop to the $10 mark.
Combined with on-chip extraction techniques this could create a simple consumer grade platform. But what exactly would a consumer do with cheap sequencing?
There are a few ideas I’ve heard thrown around. One is routine sequencing at borders. This would allow quick detection and containment of viral outbreaks. This might be feasible at the $10 mark, though it may have significant legal and moral implications.
I think it’s more likely that at that level people would be routinely sequencing themselves anyway. Every time you have a cold, or feel a bit under the weather, you’d sequence some samples and find out exactly what was wrong with you. It’s possible that this could lead to targeted medication, but reassurance that “there’s nothing serious wrong” is probably worth $10 to most people, and is a lot more comforting than a doctor saying “it’s probably nothing, come back if you still feel bad in a week”.
You’d likely be using DNA sequencing as a QC in agriculture and food processing too. To track contamination, source of infection, or monitor supply chains. It’s so cheap that it makes sense to do this on a per-batch basis.
And in agriculture, it will be used to monitor the health of livestock, in much the same way as it would be used to monitor human health.
Some have even suggested that DNA sequencing will be integrated directly into toilets, which will no doubt cheerily intone your wellbeing, or set up a doctor’s appointment for further tests.
DNA sensors in public spaces might continuously monitor for airborne viruses.
Ultimately I think everyone will have a DNA sequencer in their home, either continuously monitoring the occupants’ health, or used as required. Perhaps enabling them to answer questions like “is this really not horse meat?”
If we can beat a $10 sequencing run and head toward $1, even more applications open up. “What’s this plant?” You could try and look it up, but why not just sequence it, and as a bonus get a complete genotype. How clean are my work surfaces? Sequence the bacterial population.
The eventual global market for sequencing is likely to be in the 10s of billions of tests per year.
Disclosure: I always try to be as unbiased as I can. However, I have worked for DNA sequencing companies. I own stock in DNA sequencing companies. And I continue to work in the industry.
There’s been much talk of the $1000 genome. But it’s clear that the price will continue to drop even further. Ultra-cheap sequencing (and sample prep) would open up entirely new applications. The route to ultra-cheap sequencing may hold some surprises and it’s interesting to run the numbers. Let’s begin by imagining our ideal sequencing platform.
DNA sequencers are often characterised in terms of throughput. That is, how much DNA they can sequence per unit of time. The human genome is about 3 billion basepairs. You’d be forgiven for thinking that you’d only ever want a sensor that delivers that much sequence. Turns out there are lots of applications where lots more sequencing would be useful. Often you’re sequencing larger populations of organisms. Or you’re looking for something that doesn’t happen very often (low abundance fragments of cancer in blood plasma).
As a convenient guesstimate I’d say we want to be able to sequence 1000x the size of a human genome. Oh and I’d like to be able to run this in 5 mins. For fear of being accused of overkill I’ll leave the specs at that (this much sequencing costs 1000s of dollars and would take days currently).
How many sensing elements and what kind of throughput would be needed? Sensing DNA at more than 100 bases per second (bps) is likely to be tough. Nanopore approaches generally generate signals in the picoamp range. Amplifying these signals at anything more than a few tens of kilohertz gets hard. SBS approaches will certainly also have issues running at speeds faster than this.
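How many sensing elements do we need? For a single human genome, 3×10^9 bases / (5 × 60 seconds) is 10 million bases per second. At 100bps per sensing element, that gives us 100,000 sensing elements. The 1000x target multiplies both figures by 1000.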
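A minimal sketch of this arithmetic, with the run time and per-element read rate as parameters so other targets can be tried. The constant and function names are my own; the values are the ones used in the text (3 billion bases, 5 minutes, 100 bases per second per element).

```python
HUMAN_GENOME_BASES = 3_000_000_000  # ~3 billion base pairs
RUN_SECONDS = 5 * 60                # the 5 minute run target
BASES_PER_SECOND_PER_ELEMENT = 100  # assumed ceiling per sensing element


def required_throughput(total_bases: int, run_seconds: int = RUN_SECONDS) -> float:
    """Aggregate bases per second needed to finish within the run time."""
    return total_bases / run_seconds


def required_elements(
    total_bases: int,
    run_seconds: int = RUN_SECONDS,
    rate: int = BASES_PER_SECOND_PER_ELEMENT,
) -> int:
    """Sensing elements needed, rounded up to a whole element."""
    per_element = run_seconds * rate  # bases one element reads in a run
    return -(-total_bases // per_element)  # integer ceiling division


if __name__ == "__main__":
    for label, bases in [
        ("1x human genome", HUMAN_GENOME_BASES),
        ("1000x human genome", 1000 * HUMAN_GENOME_BASES),
    ]:
        print(
            f"{label:20s} {required_throughput(bases):>16,.0f} bases/s "
            f"{required_elements(bases):>12,d} elements"
        )
```

One genome in 5 minutes needs 100,000 elements; the 1000x target needs 100 million at the same per-element rate.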
An iPhone 6s camera can reliably sustain these kinds of data rates: 120fps at 1080p (~2 megapixels). That camera module regularly goes for about $20 on eBay; similar modules are likely available OEM at much lower prices. Which is to say, we can build CMOS chips at volume which produce data rates in the right ballpark.
While $20 is cheap, it’s still not cheap enough to be a throwaway component in a $1 sequencing system. And while semiconductors are crazy cheap (you certainly can buy cheap CMOS sensors for around $1), it’s hard for me to imagine shipping a complete, semiconductor derived, consumable for this price.
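As a rough check on "the right ballpark", the sketch below compares the camera's raw pixel rate with the sensing requirement. It assumes 1080p means 1920×1080 pixels and treats one pixel readout as one sensor sample, which is a simplification; the function names are hypothetical.

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Raw pixel samples per second produced by an imaging sensor."""
    return width * height * fps


def samples_per_base(rate: int, elements: int, bases_per_second: int) -> float:
    """Samples available per base if the pixel rate is shared evenly
    across all sensing elements, each reading bases_per_second."""
    return rate / (elements * bases_per_second)


if __name__ == "__main__":
    camera = pixel_rate(1920, 1080, 120)  # assumption: 1080p = 1920x1080
    print(f"camera pixel rate: {camera:,d} samples/s")
    # 100,000 sensing elements at 100 bases per second, as in the text.
    ratio = samples_per_base(camera, 100_000, 100)
    print(f"samples per base: {ratio:.1f}")
```

Under those assumptions the camera delivers roughly 250 million samples per second, or about 25 samples per base across 100,000 elements.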
So what does the technology behind a $1 run sequencer look like?
The great hope for the future of sequencing has always been nanopores. In this technique the DNA passes through a small aperture. As it moves through the hole the DNA sequence is read off using one technique or another.
Nanopore arrays are based around semiconductor fabrication, and the cost profile is likely to be similar. It’s hard for me to see how a nanopore sensor could be produced for less than $1, and delivered to a user for less than $10. It’s also hard for me to imagine that the array is reusable. Any system where the DNA is in contact with the sensor will result in the contamination of that sensor, and its rapid degeneration.
So what might a $1 sequencer look like? Somewhat controversially, I think it might decouple the sensing technology from the substrate in much the same way current massively parallel sequencing platforms do today. An optical system, with cheap reagents, and a cheaply manufactured substrate (which costs <$1) would seem logical (though a reusable FET sensor might work too). The instrument itself might use more expensive CMOS cameras and cost a few hundred dollars, but because the sensing and sample are decoupled it could be used repeatedly.
Disclosure: While I always try to remain unbiased, I own stock in sequencing companies. I’ve worked for DNA sequencing companies. And I continue to work in the industry. Exercise your own judgement.
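To see how decoupling changes the economics, here is a small sketch that amortises the instrument over its lifetime. Every number in the usage example is a placeholder: $300 stands in for "a few hundred dollars" and $0.75 for substrate plus reagents. None of these are measured costs, and the function names are hypothetical.

```python
import math


def cost_per_run(instrument_usd: float, lifetime_runs: int, consumable_usd: float) -> float:
    """Per-run cost: amortised instrument plus single-use consumables."""
    if lifetime_runs <= 0:
        raise ValueError("lifetime_runs must be positive")
    return instrument_usd / lifetime_runs + consumable_usd


def runs_to_hit_target(instrument_usd: float, consumable_usd: float, target_usd: float) -> int:
    """Lifetime runs needed before the per-run cost falls to target_usd."""
    margin = target_usd - consumable_usd
    if margin <= 0:
        raise ValueError("consumables alone already exceed the target")
    return math.ceil(instrument_usd / margin)


if __name__ == "__main__":
    instrument = 300.0  # assumption: "a few hundred dollars"
    consumable = 0.75   # assumption: substrate (<$1) plus reagents
    runs = runs_to_hit_target(instrument, consumable, target_usd=1.0)
    print(f"runs needed for a $1 run: {runs:,d}")
    print(f"cost per run at that point: ${cost_per_run(instrument, runs, consumable):.2f}")
```

With these placeholder values the instrument pays for itself over about 1,200 runs. The result is very sensitive to the consumable cost, which is why the substrate price matters so much.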
AGBT has come and gone, and with it further details of the Illumina Firefly platform have appeared. The speculation I previously posted appears to be largely correct. Genomeweb has an excellent report on what was presented by Jay Flatley. From the Genomeweb article we note “The Firefly device is essentially a CMOS sensor with nanowells. The nanowells are fabricated over photodiodes to enable DNA deposition to be aligned one-to-one with each photodiode. Clustering and sequencing then occurs directly on the CMOS chip.” and that Jay said it’s “inherently a one-channel device”. This means that they are using a single photodiode per well, unlike CrackerBio who have suggested that a triple-junction photodiode could give a decently specific signal dependent on differing wavelength.
A new chemistry has been developed for this one-channel system. He appears to have suggested that a dye needs to be cleaved and another added during imaging. That makes sense of course, but does add complexity to the chemistry. It likely means at least a slight decrease in read length/increase in error rate (all other things being equal, which of course they are not).
Using a single photodiode is the conservative move at this point, and makes sense to me when introducing an entirely new detection mechanism. Jay also said that they’ve demonstrated 150bp reads and error rates similar to the HiSeq X (to which I would say as always “show me the data”).
Announcements like this are interesting. Jay said it was because it would be difficult to keep it secret as they will be ramping up their supply chain. I’m not sure I buy that. They’re a public company and there’s always pressure to break into new markets, release something new, and show shareholders that things will get better and better.
Disclosure: While I generally try to remain unbiased, I have worked for multiple DNA sequencing companies, I continue to work in sequencing, and I have interests in sequencing companies. As always, you should form your own opinions based on the facts available.