Nanopore DNA Sequencing Research Groups (August 2018)

This is a draft list of research groups engaged in research on the use of nanopores for DNA sequencing. It’s based on this list. I went through this and searched for groups that had published research on DNA sensing using nanopores since 2015. I also added a couple of groups. The list is likely incomplete, and I plan on tidying it up over the next couple of days.

Lab name | Website | Recent Publication
Nanopore Group – UC Santa Cruz | Group Page | Mapping DNA methylation with high-throughput nanopore sequencing
The Aksimentiev Group | Group Page | DNA sequence-dependent ionic currents in ultra-small solid-state nanopores
The Albrecht Group | Group Page | Low Noise Nanopore Platforms Optimised for the Synchronised Optical and Electrical Detection of Biomolecules
Anselmetti lab | Group Page | Controlled translocation of DNA through nanopores in carbon nano-, silicon-nitride- and lipid-coated membranes
Bayley Group | Group Page | Nucleobase Recognition by Truncated α-Hemolysin Pores
Behrends Laboratory | Group Page | Length- and Species-Selective Detection of Short Oligonucleotides using a Microelectrode Cavity Array of Biological Nanopores
Cees Dekker Lab | Group Page |
Chen Research Group | Group Page |
DRNDIĆ Lab | Group Page | Monolayer WS2 Nanopores for DNA Translocation with Light-Adjustable Sizes
Edel Group | Group Page | Double Barrel Nanopores as a New Tool for Controlling Single-Molecule Transport
The Nanopore Group at Harvard | Group Page |
UW Nanopore Biophysics | Group Page |
The Hall Lab | Group Page |
Keyser Lab | Group Page |
Jiali Li Lab | Group Page | A tip-attached tuning fork sensor for the control of DNA translocation through a nanopore
Xinsheng Ling | Group Page | Rapid fabrication of solid-state nanopores with high reproducibility over a large area using a helium ion microscope
Maglia Lab | Group Page | Alpha‐Helical Fragaceatoxin C Nanopore Engineered for Double‐Stranded and Single‐Stranded Nucleic Acid Analysis
Meller Group | Group Page |
McGrath Lab | Group Page | DNA Translocations through Nanopores under Nanoscale Preconfinement
Muthukumar Lab | Group Page | Temperature effect on ionic current and ssDNA transport through nanopores
Laboratory of Nanoscale Biology (Aleksandra Radenovic) | Group Page | Identification of single nucleotides in MoS2 nanopores
Rosenstein Laboratory | Group Page | In Situ Nanopore Fabrication and Single-Molecule Sensing with Microscale Liquid Contacts
Stein Lab | Group Page | Nanopore Sequencing: Forcing Improved Resolution
Stuart Lindsay | Group Page | Universal Readers Based on Hydrogen Bonding or π–π Stacking for Identification of DNA Nucleotides in Electron Tunnel Junctions
Taniguchi Lab | Group Page | Quantitative analysis of DNA with single-molecule sequencing
The T.-Cossa LAB | Group Page | DNA Translocations through Nanopores under Nanoscale Preconfinement
Wanunu Lab | Group Page | Length-independent DNA packing into nanopore zero-mode waveguides for low-input DNA sequencing
Yitao Long | Group Page | Construction of an aerolysin nanopore in a lipid bilayer for single-oligonucleotide analysis

Need to add:
Zero-mode waveguide detection of DNA translocation through FIB-organised arrays of engineered nanopores

Virtual Nanopores and DNA Synthesis

The Genapsys concept of virtual wells (wells defined by fields), together with Iridia’s use of nanopores to selectively expose a template under synthesis to a polymerase, had me pondering the concept of a virtual nanopore. That is to say, a nanopore that is defined by fields rather than physically.

My initial thoughts were that a larger pore (>1nm, perhaps as large as 10s to 100s of nm) could have embedded negatively charged electrodes. The field generated by these electrodes might be used to further restrict the area through which a strand can translocate the pore.

It seems unlikely (though not impossible?) that this would produce a pore that could be used for sequencing (the height of the pore, and constriction point being difficult to control precisely).

However, in the Iridia concept the constriction point doesn’t need to be thin; the diameter of the pore just needs to be small enough that the polymerase (also negatively charged?) cannot translocate through the pore. So such a pore could be valuable for synthesis if it could be made to work.

In addition to this, because the size of the nanopore is adjustable, it might be possible to completely close the pore. This could be valuable in some designs.

Googling around for virtual nanopores, I came across a paper describing a related concept [1].

The paper “Tunable Aqueous Virtual Micropore” describes the application of a quadrupole trap to the translocation of biomolecules. The concept is demonstrated using a planar micron scale system using beads.

The quadrupole approach is used in mass spectrometers to direct ionised particles. In this system the quadrupole is used to direct the motion of a particle based on its mass/charge ratio.

To me this system seems more complex than required to confine the strand, but there’s quite likely something I’m missing. In a nanopore system, I assume electrodes would be embedded in the side of the pore. This would allow the strand to be confined as it translocates. Unfortunately the paper was published in 2012 and there doesn’t appear to have been a follow up.

I’m curious to see if some kind of field based confinement ever gets applied to synthesis systems. It also seems possible that the virtual well concept could be of value here.

[1] https://onlinelibrary.wiley.com/doi/abs/10.1002/smll.201101739

Genapsys

Sequencing system image from patent.

Building on my list of sequencing companies, I’ve put together a few brief notes on Genapsys.

Business

Genapsys was founded in 2010. They have raised in excess of 84M USD in total (110M USD according to their website; 84M is accounted for on Crunchbase). Investors appear to include Decheng Capital, IPV Capital, Plug and Play Ventures and possibly Ampersand Capital Partners. Yuri Borisovich Milner (DST Global) is also said to be an investor. Their last round was a Series C in January 2018, in which they raised 32.5M USD. In addition to venture funding they have received some grant funding (~4M USD) [1].

The company was founded by Dr. Hesaam Esfandyarpour and incubated at the Stanford Genome Technology Center.

Technology

The patents I’ve reviewed describe two key components to the Genapsys approach. The first is a method for confining DNA and reagents using Virtual Wells [3]. The second is a method for detecting base incorporation (to build a single channel sequencing-by-synthesis platform).

Virtual Wells

One of the key elements of the 454 and Ion Torrent platforms was their use of wells to confine beads on an array (and in Ion Torrent’s case over a sensor). Both these platforms used an off-chip process to amplify DNA on the bead (emulsion PCR). This added an extra and somewhat awkward step to the sequencing process.

The Genapsys approach suggests doing away with wells completely. Instead magnetic and electric fields would be used to confine beads and reagents. Magnetic beads would be used which would be attracted to magnets on the chip, localising the bead over sensors.

They then suggest using electric fields to confine nucleotides, strands, enzymes and potentially other reagents. Using the virtual well technology the amplification process could take place on chip. I assume this would be the same process as used for emulsion PCR. However, rather than the reaction vessel being a tiny water bubble, the charged reagents are confined by the electric field.

This seems like a rather neat idea, but I’m curious to know how well it works in practice. I would guess confinement is not perfect.

Detection

Initial reports all suggested that incorporation would be detected either through changes in pH (ISFET) or heat [2]. While recent patents still mention pH, heat, and charge based detection, they also discuss detection of incorporation through conductivity and impedance changes in the Debye layer of the bead [5].

The Debye layer as far as I can tell just means the double layer. My understanding is that in this scenario the bead will be negatively charged. There will therefore be a double layer formed around the bead:

By placing electrodes on either side of the bead you can measure conductivity and impedance/capacitance. I would imagine that most of the current will come from the bulk of the solution; however, the patents suggest that the contribution from the double layer can be measured.

Exactly how additional nucleotides affect the conductivity/impedance of the double layer is less clear to me. However, as they will increase the negative charge in the vicinity of the bead, it seems logical that they would.

If it really is just the additional charge they add, the system feels similar to other methods of detecting the bead’s overall charge (like the Caerus approach). However, perhaps by looking at the change in double layer current, background contributions are reduced or the effect of the charge difference is amplified.

One patent does show “sequencing data”. As far as I can tell this is more likely to have come from simulation than a real experiment. I’ve highlighted a large deflection caused by multiple incorporations in the figure below:

Another factor that leads me to believe they may be pursuing a bead based, Debye layer based approach (rather than using ISFETs) is that pairs of electrodes are shown alongside beads repeatedly.

There are a few images of seemingly real chip systems, but I didn’t see any SEM images. One example is this figure showing bead occupancy (which looks pretty good!):

Overall, while Genapsys haven’t released a whole lot of public information, the patents seem to give a reasonable indication of what they’re working on. Charge based approaches seem attractive. One advantage they have over other chip based approaches is that you don’t need to monitor the incorporation in real time. Bases can be incorporated, and the charge difference measured at low speed, and potentially under different reagent conditions.

However, while patents suggest chips may be reusable, you only get a single read from each “virtual well”. As with Ion Torrent and 454, this could ultimately limit throughput.

Overall, some interesting ideas. And I look forward to seeing how things pan out.

Notes

[1] http://grantome.com/grant/NIH/R01-HG006889-03

[2] Genomeweb article: https://www.genomeweb.com/sequencing/genapsys-develop-microelectronic-sequencer

“DNA sequencing method that is based on direct heat or pH measurement”

US Patent No. 7,932,034, “Heat and pH measurement for sequencing of DNA.”

“In 2010, the firm won a $250,000 grant through the Qualifying Therapeutic Discovery Project Program for a project entitled “Development of an inexpensive, ultra-high throughput micro-electronic medical sequencer””

[3] http://www.freepatentsonline.com/9399217.html
“As used herein, “virtual wells” refers to local electric field or local magnetic field confinement zones where the species or set of species of interest, typically DNA or beads, generally does not migrate into neighboring “virtual wells” during a period of time necessary for a desired reaction or interaction.”

Virtual Well

The a virtual well or “chamber-free array”, may detect or manipulate particles (e.g., beads, cells, DNA, RNA, proteins, ligands, biomolecules, other particulate moieties, or combinations thereof) in an array wherein said array captures, holds, confines, isolates or moves the particles through an electrical, magnetic or electromagnetic force and may be used for a reaction and or detection of the particles and or a reaction involving said particles. Said “virtual well” may provide a powerful tool for capturing/holding/manipulating of beads, cells, other biomolecules, or their carriers and may subsequently concentrate, confine, or isolate moieties in different pixels or regions of the array from other pixels or regions in said array utilizing electrical, magnetic, or electromagnetic force(s). In one embodiment the array is in a fluidic environment. Sensing may be done by measurement of charge, pH, current, voltage, heat, optical or other methods.

[4] 2013 patent: http://www.freepatentsonline.com/20130096013.pdf
ISFET, chemFET or nanobridge.

[5] Recent patent: http://www.freepatentsonline.com/y2018/0119215.html
“detect a change in conductivity within a Debye layer” “can detect nucleotide incorporation events by measuring local impedance changes of the magnetic beads 220 and/or the amplified DNA (or other nucleic acid) 255 associated with the magnetic beads 220. Such measurement can be made, for example, by directly measuring local impedance change or measuring a signal that is indicative of local impedance change. In some cases, detection of impedance occurs within the Debye length (e.g., Debye layer) of the magnetic beads 220 and/or the amplified DNA 245 associated with the magnetic beads 220. Nucleotide incorporation events may also be measured by directly measuring a local charge change or local conductivity change or a signal that is indicative of one or more of these as described elsewhere herein. Detection of charge change or conductivity change can occur within the Debye length (e.g., Debye layer) of the magnetic beads 220 and/or amplified DNA 245 associated with the magnetic beads 220.”

“using the sensor to detect a change in conductivity within a Debye layer of the bead upon incorporation of at least one nucleotide of the population of nucleotides into a growing nucleic acid strand, which growing nucleic acid strand is derived from the primer and is complementary to the nucleic acid template”

Sequencing with Mixtures of Three Bases

A previous post discussed Cygnus’ approach to sequencing, using mixtures of bases and multiple reads of the same template. Centrillion also have a patent that appears to cover a related approach.

The Cygnus approach, as described in their paper, uses mixtures of 2 bases. I thought it might be interesting to work through corrections using mixtures of 3 bases. It’s possible this is covered somewhere in their supplementary info, or in their huge 200+ page patent. I’ve not checked and this is just for fun.

There are 4 possible sets of 3 different base types: ATG, ATC, TGC and AGC. The difference between each of these sets is clearly a single base (3 bases out of ATGC in the set, and 1 left out).

To recap on the previous post, a template is exposed to alternating sets (mixtures) of bases, and we measure the incorporation intensity to learn how many bases incorporate (the same as for a normal single channel unterminated sequencing chemistry). In order to process the entire strand, the sets we alternate between must together contain all base types. For the sets of 3 base types this is no problem: any pair of sets will contain all four base types and differ by only a single base type.

There are 6 possible pairings:

a ATG,ATC

b ATG,TGC

c ATG,AGC

d ATC,TGC

e ATC,AGC

f TGC,AGC

We could vary the order of the pairs. But we don’t really need to. Working through all possible 2bp repeats [1] it’s clear that we can accurately resolve all sequences using 3 out of the 6 alternating pairs.

In all cases, one pairing supplies the base transition information. For example, for the repeat ATATAT this is pairing f above: it is the only pairing that blocks incorporation at transitions between A and T. Each pairing blocks on one of the six possible transition types (G<->C, A<->T, A<->G, A<->C, T<->G, T<->C). To accurately resolve all sequences, all six pairings are therefore required. In the example 2bp repeats, one pairing provides the “transition” information and 2 other pairings are required to resolve the sequence to one of the four bases.

You therefore need to sequence each template six times. However, at any given base information from only 3 of the “mixture sequences” is required to resolve the strand. The other 3 sequences provide redundant information for error correction. This information could be used in a number of ways (either masking likely errored bases, taking a majority vote, or using this information in a more complex error correction model).

How much sequencing does this require as compared to standard single base sequencing?

Well, there will always be degenerate sequences, both in this scheme and the Cygnus approach. These sequences will require very slightly more sequencing than using a normal single base incorporation system.

However we can simulate the number of cycles required (a cycle being the incorporation of a single base type, or a single mixture type). I quickly threw some code together to do this [2]. Assuming this hastily thrown together code is correct the single base incorporation scheme requires 1.481 cycles per base (or ~2.7 bases incorporated per set of 4 bases). The mix of 3 scheme described above requires 1.4905 cycles per base.
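As a rough cross-check on those numbers, here is a back-of-envelope estimate. It assumes a long template with independent, uniformly distributed bases (the same model as the simulation) and ignores end effects.

```latex
% Single-base flows: homopolymer runs have mean length 1/(1 - 1/4) = 4/3,
% and the base of the next run is equally likely to be 1, 2 or 3 flows away.
\[
\frac{\text{cycles}}{\text{base}}
  \approx \frac{(1 + 2 + 3)/3}{4/3} = \frac{2}{4/3} = 1.5
\]
% Mixtures of three: a pairing that omits X and Y starts a new cycle each
% time the template switches between X and Y (other bases are ignored).
% Half the bases are X or Y, and each of those differs from the previous
% X-or-Y base with probability 1/2. There are six pairings.
\[
\frac{\text{cycles}}{\text{base}}
  \approx 6 \times \frac{1}{2} \times \frac{1}{2} = 1.5
\]
```

Under this model both schemes come out at about 1.5 cycles per base, which suggests the small gap between the simulated 1.481 and 1.4905 is mostly sampling noise from a single 10000 base template rather than a real overhead.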

So, if you just go by this, there’s very little overhead.

One downside of the base mixture incorporations is that the sequencing system has to cope with longer homopolymers (or rather runs of 1 of 3 different base types). Again this is true of the approach described here, and the Cygnus system. What issues this causes will depend on the error profile of the underlying technology.

While I’ve discussed mixtures of 3 bases here, it might also be interesting to look at combinations of mixtures of 2 and 3 bases. For example you might have set pairs of ATG, and ATC. Then a set of CA and GT to resolve the ambiguity (this could be extended to create a complete sequencing system).

Maybe that’s another fun project for another time.

Notes

[1]

[2]

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// The four possible mixtures of three base types
string s1 = "ATG";
string s2 = "ATC";
string s3 = "TGC";
string s4 = "AGC";

// Count the cycles needed to read temp when alternating between the two
// mixtures in pair, starting with pair[0]
int mix_incorp(const string& temp, const vector<string>& pair) {

  int p=0;
  int cycles=0;
  for(size_t n=0;n<temp.size();) {

    // Incorporate every consecutive base present in the current mixture
    while(n<temp.size() && pair[p].find(temp[n]) != string::npos) n++;

    cycles++;
    p = 1-p; // switch to the other mixture
  }

  return cycles;

}

int main() {

  string temp;

  // Generate a random 10000 base template (rand() is unseeded, so every run
  // produces the same sequence)
  const string bases = "ATGC";
  for(int n=0;n<10000;n++) temp += bases[rand()%4];

  cout << "Sequence: " << temp << endl;

  // Single base incorps: flow A, T, G, C in turn, one base type per cycle
  string order="ATGC";
  size_t pos=0;
  int cycle_count=0;
  for(size_t n=0;n<temp.size();) {

    // Incorporate the whole homopolymer run if it matches the current base
    while(n<temp.size() && temp[n] == order[pos]) n++;

    pos++;
    cycle_count++;
    if(pos == order.size()) pos=0;
  }
  cout << "Average cycles per base, single base incorps: " << ((float)cycle_count)/((float)temp.size()) << endl;

  // All 6 pairings of the 4 mixtures, in the order a to f listed above
  string sets[4] = {s1, s2, s3, s4};
  vector<vector<string> > pairs;
  for(int i=0;i<4;i++) {
    for(int j=i+1;j<4;j++) {
      vector<string> pr;
      pr.push_back(sets[i]);
      pr.push_back(sets[j]);
      pairs.push_back(pr);
    }
  }

  // Total cycles over all 6 reads of the same template
  int total=0;
  for(size_t n=0;n<pairs.size();n++) total += mix_incorp(temp,pairs[n]);
  cout << "Average cycles per base, mixture incorps: " << ((float)total)/((float)temp.size()) << endl;
}