Evonetix and other thoughts

This post isn’t really about the Evonetix approach to DNA synthesis, but rather about related ideas. To set the context, I review an approach described in one of their older patents (their current approach is described pretty well on their website).

Evonetix was incorporated in February 2015 (Cambridge, UK). They have raised somewhere on the order of 15MUSD to date. Investors include Cambridge Consultants, Hermann Hauser, DCVC, Draper Esprit, Morningside Group, Rising Tide, and Civilization Ventures.

Their patents [1] describe the basic approach. They seem quite readable and I recommend taking a look.

Essentially, they describe using a substrate on which sites are coated with a waxy layer (it seems n-alkane is preferred). They then selectively melt the wax in order to expose the substrate to reagents. They don’t mention enzymatic DNA synthesis methods at all in the patents I’ve read. It looks like phosphoramidite synthesis is the focus.

They discuss two approaches to applying heat, either using a laser or via on-chip heating elements. Given their recent deal with LioniX [4] it seems likely that they will continue down the silicon-based route. As usual, a number of configurations are discussed in the patent, but in particular a 0.5 micron spacing is mentioned. In the patent and elsewhere a billion wells seems to be the target. On a square grid this would result in a chip something like ~16mm on a side (~250mm^2). Big enough that I’d hope it would be reusable, but not massive.
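A quick sanity check on the chip size. This is only a sketch: it assumes the wells sit on a simple square grid at a uniform 0.5 micron pitch, which is my simplification (the patent discusses several configurations). The function name is just for illustration.

```python
import math

def chip_side_mm(n_wells: float, pitch_um: float) -> float:
    """Side length (mm) of a square grid holding n_wells at the given pitch."""
    wells_per_side = math.sqrt(n_wells)
    return wells_per_side * pitch_um / 1000.0  # microns -> mm

# Assumed figures from the text: a billion wells at 0.5 micron spacing.
side = chip_side_mm(1e9, 0.5)
area = side ** 2
print(f"side = {side:.1f} mm, area = {area:.0f} mm^2")
```

Under those assumptions the side comes out a little under 16mm, so the area is around 250mm^2.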

I would guess one of the issues would be ensuring that the thermal changes are sufficiently localized. The patent suggests a couple of methods [2] [3] to help with this, but I imagine it will be a significant challenge. In fact, you also want to uniformly cool the chip after each cycle… so all round, accurate thermal control will be important.

Thoughts

Essentially the Evonetix play is around designing a system to selectively expose a “virtual well” to reagents. That’s quite an interesting idea, but it made me wonder about related approaches.

Activating Enzymes

The first thought that occurs to me is how a similar system might be used in an enzymatic DNA synthesis platform. Rather than using heating elements to expose a strand under extension to reagents, could you use heat to either activate or deactivate enzymes?

For example, let’s say you cool the whole chip down to a point where the template-independent polymerase (TdT) is largely inactive. You then locally heat the chip only where you want incorporation to take place (which would occur cyclically).

Confining the heat in the presence of reagents might be more of an issue here.

Using an electric field

Rather than deactivating the polymerase thermally, could you alter the local conditions around a strand under synthesis using an electric field? Initially I was wondering if a negative field would push nucleosides away. However, it seems [5] that a field can also be used to locally change the pH. Could you therefore use local changes in pH to activate/deactivate an enzyme?

Non-virtual wells

The Evonetix play is around the use of a “virtual well”. But how might you go about creating real wells with caps which could easily be opened and closed to selectively expose the contents of the well to reagents?

One approach might be to use a larger (maybe 500nm?) bead as a cap. The bead could be magnetic, or charged. In those cases, you could use a magnetic or electric field to selectively push the bead away from the well just enough to allow reagents to enter the well. You might want to coat the well with something to help create a seal perhaps.

Local heating could also be used to push the bead off the well perhaps, or other methods like optical tweezers… these seem less attractive however.

Those were my initial thoughts anyway… Evonetix seem like an interesting company, and I look forward to seeing how things develop.

Notes

[1] https://patents.google.com/patent/US20160184788A1/en US20160184788A1

[2] “To achieve a great melting and coalescence of the masking material within a limited area, a series of discrete heating pulses may be used, each pulse being separated by period of cooling. For example, a series of 1,000-20,000…heating pulses… about 1 ns… with about 1 microsecond of cooling between each pulse may be suitable.”

[3] “The thermal energy may be applied to the selected ones of the sites simultaneously, but alternatively it may be desired to stagger the application of thermal energy to the select sites in order to allow for more efficient diffusion of thermal energy away from a selected site following the application of thermal energy. Suitably, the application of thermal energy to the selected site is carried out so that the adjacent sites are not heated simultaneously and optionally not immediately one after another.”

[4] https://www.evonetix.com/evonetix-lionix/

[5] https://patents.google.com/patent/US20130344539

An Encoding And Correction Approach for DNA Data Storage

I’ve previously noted that there’s significant interest in using DNA as a data storage medium. In my previous post, I discussed a correction/selective amplification approach which might help remove errors in errored synthesis platforms.

In this post I consider an encoding and selective amplification approach that might work particularly well for DNA data storage.

In this approach only a subset of bases are used to encode information. Other bases are used to provide synchronisation points.

For example, we might use the bases A, T, and C to encode information. G would be used as a synchronisation base. We might, for example, have 9 bases of information followed by a synchronisation “G” [1].
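A minimal sketch of what such an encoding could look like. The mapping of base-3 digits onto A/T/C, and the function names, are my own assumptions for illustration. Only the idea of 9 information bases followed by a “G” comes from the scheme above.

```python
DATA_BASES = "ATC"   # bases carrying information (digit 0 -> A, 1 -> T, 2 -> C; assumed mapping)
SYNC_BASE = "G"      # synchronisation base
BLOCK = 9            # information bases between synchronisation bases

def encode(trits):
    """Map base-3 digits to A/T/C and append a G after every 9 information bases."""
    out = []
    for i, t in enumerate(trits, start=1):
        out.append(DATA_BASES[t])
        if i % BLOCK == 0:
            out.append(SYNC_BASE)
    return "".join(out)

def decode(strand):
    """Drop synchronisation bases and map A/T/C back to base-3 digits."""
    return [DATA_BASES.index(b) for b in strand if b != SYNC_BASE]

def sync_ok(strand):
    """True if G appears at every 10th position and nowhere else."""
    return all(
        (b == SYNC_BASE) == (i % (BLOCK + 1) == BLOCK)
        for i, b in enumerate(strand)
    )

trits = [0, 1, 2] * 6
strand = encode(trits)
print(strand)
print(sync_ok(strand), decode(strand) == trits)
```

The `sync_ok` check is just an in-silico illustration of why the regular “G”s are useful: a single insertion shifts every later “G” out of its expected position.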

We can see how this could work by way of the following example:

True sequence:
0123456789012345678901234567890123456789
TACTACTATCGTCATCATCTGCTAATCATTGACTTTACTA

Our synchronisation “G”s will allow us to selectively amplify those synthesized strands matching the “true” (desired) sequence which do not contain insertion errors.

For example, the following strand contains an insertion error at position 7. We would use the technique previously described, that is, we would use a normal polymerase and perform stepwise incorporation by flowing in bases in the “true”/desired order.

Error at position 7:
01234567890123456789012345678901234567890
TACTACTCATCGTCATCATCTGCTAATCATTGACTTTACTA
ATGATGAGTAGCAGTAGTA

The presence of regular synchronisation “G”s makes it harder for an errored strand to advance when undergoing stepwise synthesis, as the strand needs to wait for up to 9 bases to flow through the system until it can start to advance when out of sync.

As previously noted, this scheme can be used to selectively amplify strands without insertion errors (between rounds of melting). The amplification scheme could be applied at regular intervals to remove errored strands from the system.

This amplification scheme does not help with deletion errors. These are possibly less critical here, as they appear as a length error (which may be eliminated through size selection). The most critical errors may be combinations of insertions and deletions which result in strands of the same length as our desired strand. This scheme could help remove these.

Notes

[1] Naturally, different bases, and different spacing could be used. Potentially you might want to switch between using different sets of bases to encode information, and for synchronisation throughout a strand.

The encoding used could be one of a number of schemes. Of particular interest might be an encoding that minimises the impact of deletion errors with respect to the desired sequence (for example, one that uses longer homopolymers to encode data).

Using an SBS-like approach to selectively amplify

Today I was pondering the fact that there are DNA synthesis approaches that may result in high error rates.

One significant class of errors is insertions, in particular homopolymer errors. One of the issues with enzymatic DNA synthesis is that so far, it’s been difficult to incorporate bases with reversible terminators.

One approach could be to limit the number of bases incorporated purely through the concentration present. This is likely to result in a highly errored product however. Even if your per-base error rate is only 5%, after incorporating 100 bases less than 1% of your product (0.95^100, or about 0.6%) will be fully correct.
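To make the arithmetic explicit, here is a small sketch. It assumes errors are independent and occur at the same rate at every position, which is a simplification.

```python
def fraction_correct(per_base_error: float, length: int) -> float:
    """Fraction of strands with no errors, assuming independent per-base errors."""
    return (1.0 - per_base_error) ** length

# A 5% per-base error rate at a few strand lengths.
for n in (20, 50, 100):
    print(n, f"{fraction_correct(0.05, n):.2%}")
```

At 20 bases roughly a third of strands are still fully correct under this model, which is why shorter lengths are used in the examples that follow.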

If we could selectively amplify only the correct strands, this might give us more utility out of an inefficient/errored synthesis platform.

Let’s say we get some reasonable fraction of fully correct strands at 20 bases [1]. Size selection might be problematic [2] as many errors will be either the same length, or nearly the same length. We assume that insertion errors dominate, and it’s these errors that we’re mostly interested in removing.

One approach might be to selectively amplify, to full length, only those strands which don’t contain insertions. You can do this by stepwise synthesis of a complementary strand, exposing the strand to reversibly terminated [3] bases in the correct order only. The scheme is somewhat similar to sequencing-by-synthesis, but here it is used for selective amplification.

To take an example, say we have attempted to synthesize the sequence CGTCCCTAGTCGACTGACGT. We would expose the synthesized strands to complementary bases in the correct order [4] during stepwise synthesis. This stepwise process would be similar to sequencing-by-synthesis (incorporate, wash/remove, cleave terminators etc.).

A fully correct strand, or one containing deletions only will incorporate a base at every position. A complete complementary strand will therefore be created.

A strand with an insertion however will become out of sync with the correct/desired bases. It will therefore not incorporate a base at every position.

In the example below, we can see how a single insertion error will result in a strand roughly half the size of the original. Insertion errors are therefore converted to larger fragment size errors (and produce significantly smaller fragments in many cases).

In this example bases are flowed into the pool in the order G,C,A,G etc. and incorporated from the 3′ end of the template.

In the errored strand, bases incorporate correctly until the 6th position. At this point, the synthesis process gets out of sync. An A,T,C, and A fail to incorporate, before another G is encountered. The final synthesised strand is ~50% smaller than the fully correct template.

True sequence
   01234567890123456789
3' CGTCCCTAGTCGACTGACGT 5'
5' GCAGGGATCAGCTGACTGCA 3'

Insertion
   012345678901234567890
3' CGTCCCCTAGTCGACTGACGT 5'
5' GCAGGGGATCA           3'
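The example above can be reproduced with a small simulation. This is a sketch under idealised assumptions: perfect reversible terminators (at most one base added per flow), no misincorporation, and one flow per base of the desired sequence. The function name is just for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def stepwise_copy(template, desired):
    """Flow the complement of `desired`, one base per step, over `template`.

    Both strands are read from the template's 3' end. A flowed base is
    incorporated only if it pairs with the next unpaired template base.
    """
    flows = [COMPLEMENT[b] for b in desired]
    product = []
    pos = 0  # next template position waiting for a base
    for base in flows:
        if pos < len(template) and COMPLEMENT[template[pos]] == base:
            product.append(base)
            pos += 1
    return "".join(product)

desired = "CGTCCCTAGTCGACTGACGT"
with_insertion = "CGTCCCCTAGTCGACTGACGT"  # extra C, as in the example above

print(stepwise_copy(desired, desired))         # GCAGGGATCAGCTGACTGCA
print(stepwise_copy(with_insertion, desired))  # GCAGGGGATCA
```

In this model a strand with only a deletion skips one flow and then falls back into step, so it is still copied along its whole (shorter) length.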

The process described would most likely need to be performed cyclically (between rounds of melting) to amplify the pool of strands sufficiently. After this selective amplification process, size selection [6] could take place to select the correct strands (or a “more correct” subset). This subset might be used for downstream applications, or as a substrate for further synthesis [5].

This amplification process might remove the most problematic errored strands from the synthesis process [6], as well as potentially allowing us to get more utility from an errored synthesis process.

Notes

[1] I’m selecting 20 bases to keep the examples simple.

[2] Again size selection of short fragments is problematic anyway, but this is just an example.

[3] Or maybe without terminators, if you don’t care so much about homopolymer errors and are only interested in removing other insertions.

[4] Appropriately primed, with a normal polymerase suitable for incorporating the bases we are using.

[5] Effectively you might try and “reset” the synthesis process periodically, by removing errors from the pool.

[6] The most problematic errored strands might be those that are the same size as the fully correct template. These strands would need to be the result of at least an insertion and a deletion. The above scheme will not completely amplify these strands, and could therefore help mitigate against this issue.

Eve Biomedical

Today I’m going to briefly review a slightly less well known DNA sequencing company, Eve Biomedical. Check out the complete list of sequencing companies for links to all other posts.

Business

Eve Biomedical is a Californian company (C2934369) founded on 11/13/2006. They received ~1MUSD of SBIR grants, starting in 2013. Crunchbase lists them as having raised 7.7MUSD (from DFJ) most recently in December of 2012. I could find only 3 current employees on LinkedIn, and around 10 previous employees.

Technology

I could find 2 patents assigned to Eve Biomedical [1]. I’m going to focus on the earliest, because it’s the most fascinating to me [1a]. The patent describes a process they call “Rotation-dependent transcriptional sequencing”.

The sequencing approach relies on the fact that as an RNA polymerase incorporates a base, it rotates the template DNA strand. The approach is therefore sequencing-by-synthesis. However, rather than synthesising a complementary strand of DNA (as in Illumina and other approaches), DNA is being transcribed to RNA and we are detecting the incorporation process as it takes place.

How do we detect the rotation of single strands of DNA? Stick a bead on it… An asymmetric magnetic bead is used. This allows us to put it under slight tension using a magnetic field (so it doesn’t move around under Brownian motion). The patent shows 2.7 micron beads. It’s slightly surprising to me that a single polymerase/strand is able to move what I assume is a comparatively huge bead (if anyone has dimensions for polymerases I’d be very interested!).

A schematic of the system (from the patent) which I’ve annotated is shown below:

With this basic system in place, there are a couple of questions remaining. The first is, what exactly do these tags look like? The patent describes tags that look like a bigger bead with a smaller one stuck to it. I didn’t look into the construction details, but I’d imagine these can easily be fabricated:

The second question is how do we use these parts to build a DNA sequencing platform?

If we are able to detect incorporation events, it’s hopefully clear that we can use this system to sequence DNA. One option is to simply flow bases in one at a time, and detect when bases are incorporated. This approach is similar to other single-channel, unterminated sequencing-by-synthesis approaches (Direct Genomics comes to mind, Ion Torrent would be a related bulk approach).

The patent mentions this of course, but focuses on a different approach. Here they supply all but one base in equal quantities. The 4th base is supplied in limiting quantity. This means that every time the polymerase needs to incorporate the limiting base, it pauses and hangs around waiting for one to come along.

Assuming that the polymerase otherwise incorporates at a constant rate, you can use these pauses to detect where that base occurs on a strand.

You then need to perform the sequencing experiment 4 times [6], limiting one base each time. The plot below shows an example trace where slower incorporation indicates the incorporation of a “G”:

The patent also shows what appears to be a complete, worked-through real dataset (the text implies it’s real). In this example, 4 experiments are performed, each reaction containing a limited quantity of one base. The rotation traces are captured and then combined to determine the template sequence:

The sequence contains a couple of homopolymers, I guess the incorporation process slows down for twice as long at these points. Overall, it’s difficult to tell exactly how well the process is working from the trace above (or get any idea of what the error rate might be). But if this is proof of concept real data, it looks pretty reasonable!
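As a toy illustration of how the four runs might be combined (this is my own sketch, not the patent’s method): assume each rotation trace has already been segmented into a dwell time per incorporated base, and call each position as the base whose limiting run shows the longest pause. The dwell values below are made up, and the calls are of the incorporated RNA bases.

```python
BASES = "ACGU"  # nucleotides incorporated into the transcript

def call_sequence(dwell_by_limited_base):
    """Call each position as the base whose limiting run paused longest there.

    dwell_by_limited_base maps a base to per-position dwell times
    (arbitrary units) from the run in which that base was limiting.
    """
    n = len(dwell_by_limited_base[BASES[0]])
    calls = []
    for i in range(n):
        calls.append(max(BASES, key=lambda b: dwell_by_limited_base[b][i]))
    return "".join(calls)

# Hypothetical, hand-made dwell times for a 5-base stretch.
traces = {
    "A": [5, 1, 1, 1, 6],
    "C": [1, 1, 7, 1, 1],
    "G": [1, 6, 1, 1, 1],
    "U": [1, 1, 1, 5, 1],
}
print(call_sequence(traces))
```

In practice the hard part is presumably the step this sketch skips: aligning four independent real-time traces to each other, particularly through homopolymers.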

The approach seems attractive because it vastly reduces the number of cycles/washes that would need to be performed (just 4 versus 1 for every base position sequenced).

The Eve Biomedical approach has other advantages too. Because the beads are big, the optical system should be relatively simple and cheap (commodity mobile phone CMOS sensors?). But there are disadvantages too. Because the sequencing is occurring in real time, sequencing is limited to a single field of view. This might make it harder to scale the platform.

Eve Biomedical have a second patent [1]. On my brief reading, this refers to using the same RNA polymerase, limited quantity of one base system. However, in this patent they suggest using it with a nanopore/nanostructure platform. The patent doesn’t seem to include real data/rig images.

The Eve Biomedical approach is really unique, and I’ve not seen anyone else suggest using rotation to detect incorporation. While a very different approach, it reminds me somewhat of the scheme Depixus have presented.

I’d love to see it played out a little more, if only because it’s so different.

Notes

[1] https://patents.google.com/patent/US20180010181A1

[1a] https://patents.google.com/patent/US20120214171A1
“As a consequence of transcription, the RNA polymerase exerts torque on the nucleic acid, which, in turn, manifests itself as rotation of a tag attached to the nucleic acid.”

“Such a method generally includes contacting an RNA polymerase with a target nucleic acid molecule under sequencing conditions, detecting the rotational pattern of the rotation tag, and repeating the contacting and detecting steps a plurality of times.”

“acid molecule comprises a rotation tag. The sequence of the target nucleic acid molecules is based, sequentially, on the presence or absence of a change in the rotational pattern in the presence of the at least one nucleoside triphosphate”

“FIG. 3 are graphs showing two modes of nucleic acid sequencing described herein: Panel A shows an asynchronous, real-time ‘nucleotide pattern’ sequencing strategy, where a limited concentration of a single nucleoside triphosphate (guanine (G) in this Panel) causes the polymerase to pause when incorporating G nucleotides into the nascent strand. Panel B shows a synchronous sequencing strategy, where a ‘base-by-base’ introduction of nucleoside triphosphates results in a continuous decoding of the nucleotide sequence.”

“Rotation-dependent transcriptional sequencing relies upon transcription of target nucleic acid molecules by RNA polymerase. The RNA polymerase is immobilized on a solid surface, and a rotation tag is bound to the target nucleic acid molecules. During transcription, RNA polymerase establishes a transcription bubble in the template nucleic acid that contains within it an RNA:DNA hybrid of approximately 8 bases. As the RNA polymerase advances along the double-stranded nucleic acid template, it must unwind the helix at the leading edge of the bubble and reanneal the strands at the trailing edge. The torque produced as a result of the unwinding of the double-stranded helix results in rotation of the template nucleic acid relative to the RNA polymerase of about 36° per nucleotide incorporated. Therefore, when the RNA polymerase is immobilized on a solid surface and a rotation tag is attached to the template nucleic acid, the rotation of the template nucleic acid can be observed and is indicative of transcriptional activity (i.e., incorporation of a nucleoside triphosphate) by the enzyme.”

[2] https://www.genome.gov/27554929/

[3] ~1MUSD in SBIR grants: https://www.sbir.gov/sbirsearch/detail/671328

[4] Crunchbase lists a total of 7.7MUSD raised, from one disclosed investor DFJ. https://www.crunchbase.com/organization/eve-biomedical

[5] Here is a figure from the patent which, as I understand it, shows magnetic beads going in and out of focus as the field strength is varied. This process isn’t used in sequencing, but shows how the beads/strands can be put under slight tension during the sequencing process.

[6] Strictly speaking 3 times, as you could obviously infer one of the bases as “not being any of the others”. You could also probably use the Cygnus type “mixtures of bases” error correction schemes with this approach.