{"id":762,"date":"2012-02-19T13:08:44","date_gmt":"2012-02-19T13:08:44","guid":{"rendered":"http:\/\/41j.com\/blog\/?p=762"},"modified":"2012-02-19T13:08:44","modified_gmt":"2012-02-19T13:08:44","slug":"scanning-books-with-a-mfc-5895cw","status":"publish","type":"post","link":"https:\/\/41j.com\/blog\/2012\/02\/scanning-books-with-a-mfc-5895cw\/","title":{"rendered":"Scanning books with a MFC-5895CW"},"content":{"rendered":"<p>I found a MFC-5895CW in Ryman&#8217;s today, discounted to 95 pounds. The MFC-5895CW is a multifunction printer\/scanner\/fax machine. What attracted me, though, is that it has a document feeder for scanning. I&#8217;ve been looking for a way to scan in my books (even if the process is destructive). As an aside, I&#8217;m told that book scanning services are common (and cheap) in Japan; it&#8217;s a shame it&#8217;s such a pain to get them there&#8230;<\/p>\n<p>For my first effort I decided to use a book I was planning to throw out anyway: &#8220;Digital System Design with VHDL&#8221; by Mark Zwolinski (sorry Mark, you were a great lecturer but I just don&#8217;t find myself doing much VHDL these days&#8230;).<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-7.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-7-288x300.jpg\" alt=\"\" title=\"photo-7\" width=\"288\" height=\"300\" class=\"aligncenter size-medium wp-image-763\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-7-288x300.jpg 288w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-7.jpg 838w\" sizes=\"auto, (max-width: 288px) 100vw, 288px\" \/><\/a><\/p>\n<p>To start with, I removed the front and back cover.<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-8.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-8-285x300.jpg\" alt=\"\" title=\"photo-8\" width=\"285\" 
height=\"300\" class=\"aligncenter size-medium wp-image-764\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-8-285x300.jpg 285w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-8.jpg 820w\" sizes=\"auto, (max-width: 285px) 100vw, 285px\" \/><\/a><\/p>\n<p>I then cut along the spine using a Stanley knife:<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-9.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-9-223x300.jpg\" alt=\"\" title=\"photo-9\" width=\"223\" height=\"300\" class=\"aligncenter size-medium wp-image-765\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-9-223x300.jpg 223w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-9-762x1024.jpg 762w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-9.jpg 864w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/a><\/p>\n<p>Unfortunately, as I cut into the book I seem to have moved the knife nearer to the spine (something to avoid next time).<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-10.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-10-247x300.jpg\" alt=\"\" title=\"photo-10\" width=\"247\" height=\"300\" class=\"aligncenter size-medium wp-image-766\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-10-247x300.jpg 247w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/photo-10.jpg 603w\" sizes=\"auto, (max-width: 247px) 100vw, 247px\" \/><\/a><\/p>\n<p>After slicing the whole thing up, it&#8217;s ready to scan!<\/p>\n<p><iframe loading=\"lazy\" width=\"560\" height=\"315\" src=\"http:\/\/www.youtube.com\/embed\/uebMz1PUxL8\" frameborder=\"0\" allowfullscreen><\/iframe><\/p>\n<p>The scanner can cope with about 60 pages at a time. However, it&#8217;s not a duplexing scanner. 
So once you&#8217;ve scanned one side you need to reinsert the stack of pages to scan the other side.<\/p>\n<p>The MFC-5895CW scans directly to PDF. It creates files with a basename followed by a two-digit number (01, 02, 03 etc.). It will also scan directly to a USB stick, which is rather neat.<\/p>\n<p>So, after scanning you&#8217;re left with a series of PDF files on a USB stick. Each odd-numbered file and the even-numbered file that follows it form a pair: the front and back sides of the same stack of pages. You now need to join all these together.<\/p>\n<p>I used pdftk on Linux to do this. Here&#8217;s my bash script (you&#8217;ll probably need to change the basename if you use it). It assumes it&#8217;s being run in the same directory as the input files.<\/p>\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\nbasename=&quot;010111&quot;\r\n\r\n# Odd-numbered files hold the front sides, even-numbered files the backs.\r\nfor ((i=1; i&lt;=99; i+=2))\r\ndo\r\n  file1=$basename$(printf &quot;%02d&quot; $i).PDF\r\n  file2=$basename$(printf &quot;%02d&quot; $((i+1))).PDF\r\n  fileout=$basename$(printf &quot;%02d&quot; $i)join.pdf\r\n  # Stop at the first incomplete pair instead of running on to 99.\r\n  [ -f &quot;$file1&quot; ] || break\r\n  [ -f &quot;$file2&quot; ] || break\r\n  echo &quot;file1: $file1&quot;\r\n  echo &quot;file2: $file2&quot;\r\n  mkdir join\r\n  cd join\r\n  cp &quot;..\/$file1&quot; first.pdf\r\n  # Reverse the backs: the reinserted stack feeds its last page first.\r\n  pdftk &quot;..\/$file2&quot; cat end-1 output second.pdf\r\n  # Split both files into single pages named 0001_A, 0001_B, 0002_A and so on.\r\n  pdftk first.pdf burst output %04d_A.pdf\r\n  pdftk second.pdf burst output %04d_B.pdf\r\n  rm first.pdf second.pdf\r\n  # Joining in glob order interleaves the fronts and backs.\r\n  pdftk *.pdf cat output out.pdf\r\n  cp out.pdf &quot;..\/$fileout&quot;\r\n  cd ..\r\n  rm -rf join\r\ndone\r\n\r\npdftk *join.pdf cat output complete.pdf\r\n<\/pre>\n<p>It all works pretty well for the most part. 
Some of the pages came out a little askew:<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/skew.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/skew-267x300.png\" alt=\"\" title=\"skew\" width=\"267\" height=\"300\" class=\"aligncenter size-medium wp-image-767\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/skew-267x300.png 267w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/skew-912x1024.png 912w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/skew.png 1189w\" sizes=\"auto, (max-width: 267px) 100vw, 267px\" \/><\/a><\/p>\n<p>This may be due to my poor cutting, not having set the feeder correctly, or the generally dog-eared nature of the book.<\/p>\n<p>Diagrams came out pretty well:<\/p>\n<p><a href=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/diapaged.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/diapaged-262x300.png\" alt=\"\" title=\"diapaged\" width=\"262\" height=\"300\" class=\"aligncenter size-medium wp-image-768\" srcset=\"https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/diapaged-262x300.png 262w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/diapaged-897x1024.png 897w, https:\/\/41j.com\/blog\/wp-content\/uploads\/2012\/02\/diapaged.png 1160w\" sizes=\"auto, (max-width: 262px) 100vw, 262px\" \/><\/a><\/p>\n<p>You can see some compression artifacts, though. It could also do with some post-processing, perhaps to increase the contrast. I might try the next book at 300dpi (I should probably also use the black and white scanning mode).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I found a MFC-5895CW in Ryman&#8217;s today, discounted to 95 pounds. The MFC-5895CW is a multifunction printer\/scanner\/fax machine. What attracted me though is that it has a document feeder for scanning. 
I&#8217;ve been looking for a way to scan in my books (even if the process is destructive). As an aside, I&#8217;m told that book [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[],"class_list":["post-762","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p1RRoU-ci","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/762","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/comments?post=762"}],"version-history":[{"count":2,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/762\/revisions"}],"predecessor-version":[{"id":771,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/posts\/762\/revisions\/771"}],"wp:attachment":[{"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/media?parent=762"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/categories?post=762"},{"taxon
omy":"post_tag","embeddable":true,"href":"https:\/\/41j.com\/blog\/wp-json\/wp\/v2\/tags?post=762"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}