I found an MFC-5895CW in Ryman’s today, discounted to 95 pounds. The MFC-5895CW is a multifunction printer/scanner/fax machine. What attracted me, though, is that it has a document feeder for scanning. I’ve been looking for a way to scan in my books (even if the process is destructive). As an aside, I’m told that book scanning services are common (and cheap) in Japan; it’s a shame it’s such a pain to get them there…
For my first effort I decided to use a book I was planning to throw out anyway: “Digital System Design with VHDL” by Mark Zwolinski (sorry Mark, you were a great lecturer but I just don’t find myself doing much VHDL these days…).
To start with, I removed the front and back cover.
I then cut along the spine using a Stanley Knife:
Unfortunately, as I cut into the book I seem to have moved the knife nearer to the spine (something to avoid next time).
After slicing the whole thing up, it’s ready to scan!
The scanner can cope with about 60 pages at a time. However, it’s not a duplexing scanner, so once you’ve scanned one side you need to reinsert the stack of pages to scan the other side.
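As a rough guide to how many passes that works out to, here is a small sketch of the arithmetic. The helper name `batches_needed` and the 150-sheet figure are made up for illustration; only the limit of roughly 60 pages per load comes from the scanner described above.

```shell
#!/bin/bash
# Hypothetical helper: how many feeder loads a stack of sheets needs.
# $1 = number of sheets, $2 = feeder capacity (defaults to 60).
batches_needed() {
  local sheets=$1 capacity=${2:-60}
  # Integer ceiling division.
  echo $(( (sheets + capacity - 1) / capacity ))
}

sheets=150                               # illustrative figure, not the real book
batches=$(batches_needed "$sheets" 60)
echo "feeder loads: $batches"            # each load goes through twice
echo "PDF files:    $(( batches * 2 ))"  # one file per side per load
```

Since every load is scanned once per side, the number of files you end up with is twice the number of loads.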
The MFC-5895CW scans directly to PDF. It creates files with a basename followed by a two-digit number (01, 02, 03, etc.). It will also scan directly to a USB stick, which is rather neat.
So, after scanning you’re left with a series of PDF files on a USB stick. Odd- and even-numbered files form pairs: the front and back sides of the same pages. You now need to join all these together.
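To make the pairing concrete, here is a minimal sketch that just prints which files belong together. The function name `list_pairs` is invented for this example, and the basename and upper-case `.PDF` extension are assumed to match what the scanner wrote to my stick; yours may differ.

```shell
#!/bin/bash
# Hypothetical helper: print the expected front/back file pairs.
# $1 = basename, $2 = number of scanned files on the stick.
list_pairs() {
  local base=$1 count=$2 i=1
  while [ "$i" -lt "$count" ]; do
    # Odd number = front sides, following even number = back sides.
    printf '%s%02d.PDF %s%02d.PDF\n' "$base" "$i" "$base" "$((i + 1))"
    i=$((i + 2))
  done
}

list_pairs 010111 4
```

With four files this prints two lines, pairing 01 with 02 and 03 with 04. That only lists the pairs; the joining itself still has to be done.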
I used pdftk on Linux to do this. Here’s my bash script (you’ll probably need to change the basename if you use it). It assumes it’s being run in the same directory as the input files.
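#!/bin/bash
# Join each pair of scans (front sides, back sides) into one interleaved PDF,
# then join those into complete.pdf. Run it in the directory holding the scans.
basename="010111"

for ((i=1; i<=99; i+=2))
do
  file1=$basename$(printf "%02d" $i).PDF          # front sides
  file2=$basename$(printf "%02d" $((i+1))).PDF    # back sides
  fileout=$basename$(printf "%02d" $i)join.pdf

  # Skip numbers for which no pair of scans exists.
  [ -f "$file1" ] && [ -f "$file2" ] || continue

  echo "file1: $file1"
  echo "file2: $file2"

  mkdir join
  cd join

  # Reverse the back sides so they run in the same order as the fronts.
  pdftk "../$file2" cat end-1 output second.pdf

  # Split both files into single pages; the _A/_B suffixes make the
  # pages sort as front, back, front, back...
  pdftk "../$file1" burst output %04d_A.pdf
  pdftk second.pdf burst output %04d_B.pdf
  rm second.pdf

  # Rejoin the single pages in sorted (interleaved) order.
  pdftk *.pdf cat output out.pdf
  cp out.pdf "../$fileout"

  cd ..
  rm -rf join
done

# Join all the per-pair files into the finished book.
pdftk *join.pdf cat output complete.pdf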
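Newer versions of pdftk also have a `shuffle` operation that collates two inputs page by page, which could stand in for the burst-and-rejoin step. This is a sketch rather than something verified on these scans, and it assumes a pdftk recent enough to support `shuffle`. The function name `collate_pair` and the `PDFTK` override variable are invented here so the command can be previewed without running it.

```shell
#!/bin/bash
# Hypothetical helper: interleave front sides ($1) with back sides ($2),
# taking the back sides in reverse order, and write the result to $3.
# Setting PDFTK=echo prints the command instead of running pdftk.
collate_pair() {
  "${PDFTK:-pdftk}" A="$1" B="$2" shuffle A Bend-1 output "$3"
}

# Dry run: show the command that would be used for the first pair.
PDFTK=echo collate_pair 01011101.PDF 01011102.PDF 01011101join.pdf
```

Dropping the `PDFTK=echo` prefix runs pdftk for real. The `Bend-1` range reads the back sides from last page to first, which is the same reversal the script does with `cat end-1`.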
It all works pretty well for the most part. Some of the pages came out a little askew:
This may be due to my poor cutting, not having set the feeder correctly, or the generally dog-eared nature of the book.
Diagrams came out pretty well:
Though you can see some compression artifacts. It could also do with some post-processing to increase the contrast, perhaps. I might try the next book at 300dpi (I should also probably use the black and white scanning mode).