Interfacing Hitachi HD44780 LCDs using I2C on an Arduino

Akiba was showing me these modules he picked up in Shenzhen which can be soldered directly to the back of the HD44780 alphanumeric displays. One of the issues with these displays is that they are driven by a parallel data stream. While this makes them very easy to code for, it uses up a lot of pins on your microcontroller.

This is where the HD44780 to I2C module comes in handy. It uses the I2C serial bus on (for example) the Arduino. This means you go from 7 pins to only 2, and you can also put multiple modules on the same bus with minimal overhead.

The module is based around the PCF8574, which is a general purpose I2C I/O expander. It’s going to be a part I keep in mind for other projects, as it looks really useful for any situation where you’re running out of pins.
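To give a feel for how an 8-bit expander can drive the display, here is a minimal Python sketch of how one HD44780 byte could be split into the bytes written to the PCF8574. The pin mapping (P0 = RS, P1 = RW, P2 = EN, P3 = backlight, P4 to P7 = D4 to D7) is an assumption based on a common backpack layout, not something checked against this particular module, and the function name is made up for illustration.

```python
# Assumed PCF8574 pin mapping (common on these backpacks, but boards vary).
RS = 0x01         # P0: register select (0 = command, 1 = character data)
RW = 0x02         # P1: read/write (left low here, we only ever write)
EN = 0x04         # P2: enable strobe, the LCD latches data on its falling edge
BACKLIGHT = 0x08  # P3: backlight control


def expander_writes(value, is_data, backlight=True):
    """Return the bytes to write to the PCF8574 to send one byte to the LCD.

    The LCD runs in 4-bit mode, so the byte goes out as two nibbles (high
    nibble first) on P4..P7, each one strobed by taking EN high then low.
    """
    control = (RS if is_data else 0) | (BACKLIGHT if backlight else 0)
    writes = []
    for nibble in (value & 0xF0, (value << 4) & 0xF0):
        writes.append(nibble | control | EN)  # EN high: present the nibble
        writes.append(nibble | control)       # EN low: the LCD latches it
    return writes


if __name__ == "__main__":
    # Sending the character 'A' (0x41) as display data.
    print([hex(b) for b in expander_writes(ord("A"), is_data=True)])
```

Each display byte turns into four small I2C writes, which is the price paid for freeing up the parallel pins. In practice the library handles all of this for you.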

The NewLiquidCrystalDisplay library supports the adapter module on the Arduino. There are a couple of gotchas, but the code below includes the library and is set up to use the default address (which is configured by 3 unplaced resistors on the back of the module). Aside from the code, the wiring process is really easy: you just need to connect the module to +5V, ground, and SCL/SDA (either via the dedicated pins or via A4 and A5 on the Arduino Uno). Here’s my horribly messy wiring (I’m afraid I didn’t have any SIL connectors to hand):


There’s one slight gotcha. If you find that the display is illuminated but blank, make sure the display contrast is set correctly: just use a screwdriver to turn the pot on the back of the module until you can see the text well.

You should be able to buy the module on the freaklabs store soon.

Complete example codebase, including library: i2cLCD.

Notes on Genia’s new paper – nanopore SBS

Genia have released a new paper showing recent data from their “nanopore SBS” platform.

Summary: The best data in this paper is a 20bp read on a synthetic template with no homopolymers. This has long dwells (multiple seconds) and the levels look clearly differentiated. The second dataset has short dwells (100ms?) and was collected under different experimental conditions, which they say give better resolution on homopolymers.

The first dataset looks like reasonable progress. The second I’m not sure I buy, and it is very low complexity in any case (just 3 homopolymer runs).

Overall this is an R&D level system. It’s interesting progress, but not useful for any application at present.

Genia’s nanopore SBS technology is shown in the figure above. To a computational scientist like myself it seems like an interesting system. Genia have had modified nucleotides created such that each nucleotide has an oligo hanging off it. That seems pretty amazing, but it appears that the nucleotides are still incorporated by the polymerase. During the nucleotide incorporation process, the tag breaks off and passes through the pore. The cartoon below shows the basic idea:


The diagram above shows each nucleotide tagged with a longer oligo which goes down the pore. When the tags sit in the pore they block the flow of ionic current through the pore. While in the diagram above I show polyN tags, Genia have selected tags to give a good spread of current blockages (and have included modified bases in the tags). Using oligo tags has two benefits over competing systems. Firstly, each tag provides a signal from a single base. In some competing nanopore systems multiple template bases are in the pore at the same time. This means that more than one base affects the readout. This results in a convoluted signal from which it can be difficult to extract the original template sequence [1]. The second advantage is that you can optimise the spread of the tags so that each tag can be easily differentiated.
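A toy Python sketch of the difference between the two kinds of readout follows. The current levels are made-up numbers, and averaging over a 3 base window is a deliberately crude stand-in for whatever the real physics does. The only point is that when neighbouring bases mix into each level, different sequences can give the same reading.

```python
# Hypothetical per-base current contributions in picoamps (made-up numbers).
BASE_LEVEL = {"A": 10.0, "C": 14.0, "G": 18.0, "T": 22.0}


def tag_signal(template):
    """One level per base: each tag reports on a single template base."""
    return [BASE_LEVEL[base] for base in template]


def kmer_signal(template, k=3):
    """One level per k-base window: every base in the window contributes."""
    return [
        sum(BASE_LEVEL[base] for base in template[i:i + k]) / k
        for i in range(len(template) - k + 1)
    ]


if __name__ == "__main__":
    for template in ("ACG", "GCA"):
        print(template, "tags:", tag_signal(template),
              "window:", kmer_signal(template))
```

In this toy model ACG and GCA give different tag readouts but the same windowed level. Real systems are less degenerate than this, and overlapping windows help untangle the sequence, but that untangling is exactly the extra work the tag approach avoids.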


There seems to be one other trick in the system, described with this sentence: “The applied voltage is adjusted to ensure that, in a majority of cases, one and only one pore is inserted into the membranes of each well.” My understanding was that the number of pores in each membrane is Poisson limited. But if they’re able to control the pore insertion with an applied voltage that’s pretty neat (perhaps someone who understands this can comment). The paper discusses a 264 pore chip, which struck me as odd, as I believe they’ve talked about chips with many more pores.
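For reference, this is what Poisson limited means in practice. If pores insert independently and at random, the number of pores per well follows a Poisson distribution, and the fraction of wells with exactly one pore peaks at about 37% (when the mean is one pore per well). The sketch below is just that textbook calculation, not a model of Genia’s chip.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k pores in a well when insertion is Poisson."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    for mean in (0.5, 1.0, 2.0):
        print(f"mean pores per well {mean}: "
              f"P(exactly one) = {poisson_pmf(1, mean):.3f}")
```

Getting exactly one pore “in a majority of cases” would therefore mean beating the random insertion limit, which is why the voltage trick is interesting.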



The first dataset is shown to the right. This is the dataset that contains no homopolymer runs. To my mind it’s the most convincing dataset in the paper. Raw data for this plot isn’t available (why is that still ok in 2016?). So we’re forced to draw our conclusions from eyeballing the data, and their analysis.

The data, however, looks quite clean. I’d assume this is the best data they’ve seen on the chip, and it’s a shame there aren’t more examples of this read. The base dwells seem to be all over the place, and I’d assume, much like other nanopore systems, they are exponentially distributed.
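To show how spread out exponentially distributed dwells are, here is a small Python sketch using the standard exponential quantile formula. The 1 second mean dwell is a made-up number for illustration, not a value taken from the paper.

```python
import math


def exponential_quantile(p, mean_dwell):
    """Dwell time below which a fraction p of events fall (exponential model)."""
    return -mean_dwell * math.log(1.0 - p)


if __name__ == "__main__":
    mean_dwell = 1.0  # seconds, hypothetical
    for p in (0.1, 0.5, 0.9):
        print(f"{int(p * 100)}% of dwells are shorter than "
              f"{exponential_quantile(p, mean_dwell):.2f} s")
```

Whatever the mean, the 90th percentile dwell is roughly 20 times longer than the 10th percentile, so dwells that look all over the place are what you would expect.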

The second dataset describes their experiments optimising the system for homopolymer detection. I find this less convincing. It’s a short run, and it’s hard to tell how much longer the ‘T’ calls are than the noise spikes that appear to be at almost the same level. The following statement also gives me some concern:

“Base calling was carried out by manual inspection of the current level of each deflection, ignoring ones with dwell times less than 10 ms.”

I guess this is effectively thresholding the data, but in that case why not say that? Regardless, the fact that an automated base caller wasn’t used most likely means that datasets are very small at the moment.
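By thresholding I mean something like the minimal Python sketch below. The event list and the function name are made up for illustration; only the 10 ms cutoff comes from the paper’s description.

```python
def threshold_events(events, min_dwell_ms=10.0):
    """Keep only events whose dwell time is at least min_dwell_ms.

    Each event is a (current_level_pA, dwell_ms) pair.
    """
    return [(level, dwell) for level, dwell in events if dwell >= min_dwell_ms]


if __name__ == "__main__":
    # Made-up events, purely for illustration.
    events = [(32.0, 140.0), (25.5, 4.0), (28.1, 60.0), (25.9, 9.9), (21.7, 12.0)]
    print(threshold_events(events))
```

A filter like this is trivial to automate, which is part of why calling the bases by manual inspection stands out.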

Overall this is interesting progress and represents a solid milestone in their development. It’s not clear that this actually represents the state of the art Genia system. It may be that this is an older platform and doesn’t reflect the current system, as the low pore count might indicate. However, it’s common for every vendor to say this when a new paper is released, and it’s difficult to discern the truth without further disclosures.


[1] In the pore used here this is particularly important. You have about 15 picoamps between the maximal and minimal blockage. This isn’t a huge amount of signal. Before even considering thermal noise, if we were to sample at 1MHz, 1 picoamp would be about 6 electrons per timestep. As a colleague used to say… so few electrons that you could name them.
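A quick back-of-the-envelope check of the electron count, as a Python sketch. The only input beyond the numbers above is the elementary charge (about 1.602e-19 coulombs). With that, 1 picoamp sampled at 1MHz works out to roughly 6 electrons per sample, so a handful either way.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs per electron


def electrons_per_sample(current_amps, sample_rate_hz):
    """Average number of electrons passing during one sample period."""
    charge_per_sample = current_amps / sample_rate_hz  # coulombs
    return charge_per_sample / ELEMENTARY_CHARGE


if __name__ == "__main__":
    print(f"{electrons_per_sample(1e-12, 1e6):.1f} electrons per sample at 1 pA")
    print(f"{electrons_per_sample(15e-12, 1e6):.0f} electrons per sample at 15 pA")
```

Even the full 15 picoamp range is under a hundred electrons per sample at that rate.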

Disclosure/Disclaimer: I have worked and continue to work in the DNA sequencing industry. I own stock in DNA sequencing companies. While I have tried to be unbiased this represents my opinion and speculation only. I recommend reading the publicly available paper for yourself.

PunkSeq10 Schematics and Gerbers


This post contains the current gerbers, schematic and layout files for the PunkSeq10. The currently shipping version is r2. You can buy this from my shop.

Schematic as pdf: punkseq10

KiCad files (including gerbers): BabySeq.tar

Are you sure this isn’t horse? – DNA Sequencing is Universal Sensing

Today DNA sequencing is dominated by research applications. This is likely to continue in the short term, but in the medium and long term the future of DNA sequencing is likely to look very different.

Fundamentally I see sequencing as a new class of sensor. People generally discuss sequencing in the context of human health. But as sequencing decreases in cost and becomes easier to use it becomes more like a general purpose sensor. Like CMOS imaging chips, for example, it will have research, clinical, and consumer applications.

The Medium Term

In the medium term it’s clear that DNA sequencing will be making its way into the clinic. Companies like 23andMe, which screen for inherited genetic traits, are one obvious application. The genetic screening of every child at birth is somewhat attractive, and could provide everyone with a dataset they could draw upon throughout their lifetime. With 4 million births a year in the US, this is a pretty big market, but it’s perhaps not the biggest.

There are 1.7 million new cancers reported each year in the US. Cancer is an inherently genetic disease, and each of these cases would ideally be sequenced in full to better understand its genetic cause.
But better than understanding a cancer once you know it exists is to detect it early so you can do something about it. Companies like Illumina’s newly founded GRAIL seek to use DNA sequencing to regularly screen for cancer. For various reasons both complete and fragmented cancer cells end up in a patient’s blood. By taking a simple blood sample you should therefore be able to screen for cancer using DNA sequencing. Screening every US adult every 5 years gives you a market of 50 million tests a year.

The screening market is much bigger than this. There are more than a million sepsis infections a year in the US, and screening for early detection of sepsis and other infectious diseases would be of huge value. In fact it seems obvious that when bulk sequencing for <$100 becomes generally available, blood serum would be sequenced as a matter of course. That single test could detect cancers, infectious diseases, and provide a genetic profile of any unborn children as well as the patient themselves.
If we sequenced every patient admitted to hospital, that would result in ~30 million genetic tests in the US alone.

The Long Term

In the longer term it’s clear that sequencing costs will drop even further. My guess is that the basic sensing technology will drop as low as cheap CMOS imaging sensors are today. These cost almost nothing, on the order of a dollar. Unlike CMOS sensors though, I expect DNA sequencers to always be single (or few) sample use. But overall it seems reasonable to expect that the cost of sequencing will drop to the $10 mark.
Combined with on chip extraction techniques this could create a simple consumer grade platform. But what exactly would a consumer do with cheap sequencing?

There are a few ideas I’ve heard thrown around. One is routine sequencing at borders. This would allow quick detection and containment of viral outbreaks. This might be feasible at the $10 mark, though it may have significant legal and moral implications.

I think it’s more likely that at that level people would be routinely sequencing themselves anyway. Every time you have a cold, or feel a bit under the weather, you’d sequence some samples and find out exactly what was wrong with you. It’s possible that this could lead to targeted medication, but reassurance that “there’s nothing seriously wrong” is probably worth $10 to most people, and is a lot more comforting than a doctor saying “it’s probably nothing, come back if you still feel bad in a week”.

You’d likely be using DNA sequencing as a QC in agriculture and food processing too, to track contamination, find the source of an infection, or monitor supply chains. It would be so cheap that it makes sense to do this on a per-batch basis.

And in agriculture, it will be used to monitor the health of livestock, in much the same way as it would be used to monitor human health.
Some have even suggested that DNA sequencing will be integrated directly into toilets, which will no doubt cheerily intone your wellbeing, or set up a doctor’s appointment for further tests.

DNA sensors in public spaces might continuously monitor for airborne viruses.

Ultimately I think everyone will have a DNA sequencer in their home, either continuously monitoring the occupants’ health, or used as required. Perhaps enabling them to answer questions like “is this really not horse meat?”
If we can beat a $10 sequencing run and head toward $1, even more applications open up. “What’s this plant?” You could try and look it up, but why not just sequence it, and as a bonus get a complete genotype. How clean are my work surfaces? Sequence the bacterial population.
The eventual global market for sequencing is likely to be in the 10s of billions of tests per year.

Disclosure: I always try to be as unbiased as I can. However, I have worked for DNA sequencing companies. I own stock in DNA sequencing companies. And I continue to work in the industry.