Print-to-Digital Book Conversion Walkthrough

Scan a book

After my recent review of the new ScanSnap SV600, I got emails from several readers wanting more information about converting a print workbook to a digital version. In my previous review of the ScanSnap iX500, I had briefly mentioned how I ripped apart a book of mazes and scanned in the pages so my oldest son could solve the mazes on the iPad over and over again. Another related question was how the SV600 compares to the iX500 for this task, both in the process and in the final results. And yet another reader asked if I would create a video showing the entire process.

Rather than shoot a video, I believe I can answer many of the questions with some photos and simple instructions on the process that I’ve now got in place for doing this. I’m going to go through every step I take to convert an actual book to a PDF file and show you what it looks like on the iPad when all is said and done.

I’m going to start with a book titled Mummy Mazes by Don-Oliver Matthies. It’s a 40-page book with 28 pages devoted to mazes, and the remaining pages dedicated to the solutions (which I won’t be scanning) and a few opening pages including one where the child writes his/her name in the book and draws or tapes a picture of himself/herself.

Before I begin the actual scanning, one item to note is that two of the mazes are two-page spreads. The SV600 software will allow me to scan a book two pages at a time and later “break” the two-page image into two separate pages. I won’t be splitting the two-page mazes, so you’ll get to see what a full two-page scan looks like. With the iX500, I would have had to rip the book apart and use the special stitching folder that comes with that scanner to stitch the two scans together into one original maze. More on this later.

So, to get started, I’m going to adjust some configuration settings first. I right-click on the ScanSnap icon in the lower right corner of my screen (on a Windows 7 PC) and choose the “Scan Button Settings” option, which controls what happens when I press the Scan button on the front of the SV600’s base. On the File Option tab, I select PDF as the File Format, as you can see below.

SV600 Settings

I’ll next switch over to the Scanning tab and make certain the image quality and color mode are set to automatic and I’ll be doing single-sided scans, of course. (The reason for this drop-down menu is that if I’m using the iX500 I can select duplex-scan for scanning both sides of a page at once — the only issue here is it would require me to rip out the pages and I’m trying to avoid that now.)

Settings cont.

I’m not going to be doing OCR (Optical Character Recognition) here, and I’m not concerned about compression, but these are options you can certainly tweak if you want them.

Now it’s time to start scanning. What I love about the SV600 software is that it can tell a single-page scan from a two-page scan (or at least it has always gotten it right in my experience). I’ll start with the cover, so I press the Scan button on the base of the SV600. The book doesn’t have to be squared up perfectly, since the software will correct an amazing amount of skew. Still, I try my best to keep the book’s top edge as parallel to the front of the base as possible.

Cover scan

Now for the tricky part. A book like this tends to want to close. The software does a great job of removing your fingertips from the margins if you choose to hold down the pages. For this kind of book, I’m just using my fist to flatten the seam as much as possible. I don’t care if the book gets a little bent out of shape from this process. If you’re scanning a thick book, however, you’re going to have to use your fingers to reduce the page curl as much as possible. Here’s a photo of the book open to the first set of mazes and ready to scan. Notice that the left side of the book has a noticeable curl but it’s not huge… I’m NOT going to be using my fingers, so watch what happens after I do the scan.

Two page scan

As you can see in the next image, the curl has been flattened with almost zero defects in the final image and the software is now waiting on me to turn the page and scan the next set of mazes.

Flattened pages

Before I press the scan button for each set of mazes, I let the pages settle first. After I press them down and try to flatten the seam, there’s always a small amount of movement. After the pages have settled, I keep going. The next photo shows you one of those mazes that takes up both left and right pages. With the iX500 I would have scanned each of these pages separately and then stitched them together into one image using the special stitching folder that comes with the iX500. With the SV600, I don’t have to do that anymore. I simply flatten the pages as best I can and then scan the entire thing as one large maze.

Two page maze

You may be wondering what’s happening in the scan when the pages underneath the current page being scanned start showing. This always happens because of the curl, and my scans do indeed have little slivers on the left side for each of the previous pages. The good news is I can fix that shortly during the cleanup/edit phase and it doesn’t take long at all. I’ll come back to that in a bit.

After I finish all the scanning, I’ve got 15 scans as you can see in the image below. But keep in mind that, apart from the cover and the two full-spread mazes, each of those is one image with two mazes in it. Those 12 will need to be broken up. This means Cover + (12 scans x 2 mazes) + (2 scans x 1 maze) for a total of 27 pages for the final PDF file. I’ll click the Finish Scanning button to enter the editing/cleanup stage.

Scanning done
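
If you want to double-check that tally before saving, here is a minimal sketch of the arithmetic in Python. The function name and its parameters are my own labels for illustration; nothing here comes from the ScanSnap software.

```python
def final_page_count(cover_scans, split_scans, spread_scans):
    """Return (raw scans, pages in the final PDF).

    split_scans are two-maze scans that get broken into two pages each;
    spread_scans are two-page mazes that stay as a single page.
    """
    raw_scans = cover_scans + split_scans + spread_scans
    final_pages = cover_scans + split_scans * 2 + spread_scans
    return raw_scans, final_pages


# Numbers from the walkthrough: 1 cover, 12 two-maze scans, 2 full-spread mazes.
raw, pages = final_page_count(cover_scans=1, split_scans=12, spread_scans=2)
print(raw, pages)  # 15 27
```

The same function works for any other workbook: count the scans you plan to split and the ones you plan to keep whole, and you know how many pages to expect after cleanup.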

In the Cleanup stage, I will get a chance to look at each scan to make certain the scan was good. The first scan of the cover looks great (image skipped in the interest of brevity), so it’s time to move to the next two-maze image.

As you can see in the image below, the curve of the left page was picked up by the scanner. But notice on the right side of the screen that the scan has already been flattened. It looks good, but I want to split this page into two individual pages so I can see the details a bit better.

Start cleanup

To do that, I make sure I’ve got the image thumbnail selected in the right scrolling window and then click on the 1|2 button near the top followed by the Apply button. As you can see below, the image was successfully split into two and I can view the pages a bit larger to make sure the scan was good. (If the scan were bad, I could scan the page in question again and later use the included Acrobat X software to insert the new scan into the book PDF in its proper spot.)

Separated scan

Now let me show you one of those mazes that takes up two full pages as well as the slivers of pages underneath showing up in the scan. Take a look at the next screenshot.


Look down the left side of the scan and you’ll see a mix of colors that are the slivers of pages underneath the current two-page spread. Luckily the software allows me to redraw the selection window using the red dotted outline you see here. I can drag it so that it completely ignores the slivers on the left. I won’t be splitting this page, so I simply select the 1 button and click Apply to have the image cropped so it remains a single, two-page maze but without the slivers showing through!

Slivers removed
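
For the curious, both cleanup moves (trimming the slivers and splitting a scan in two) boil down to simple crop-box arithmetic. The sketch below is only an illustration of that idea, not how the ScanSnap software works internally. The function names, the 3000 x 2000 pixel scan size, and the 40-pixel sliver width are all made up for the example. Boxes are written as (left, top, right, bottom), a convention many image libraries also use.

```python
def trim_left(box, sliver_px):
    """Shrink a (left, top, right, bottom) box to drop a strip on its left edge."""
    left, top, right, bottom = box
    if not 0 <= sliver_px < right - left:
        raise ValueError("sliver must be narrower than the box")
    return (left + sliver_px, top, right, bottom)


def split_in_two(box):
    """Split a box at its horizontal midpoint into left-page and right-page boxes."""
    left, top, right, bottom = box
    middle = left + (right - left) // 2
    return (left, top, middle, bottom), (middle, top, right, bottom)


# Hypothetical scan size in pixels; real dimensions depend on the scan settings.
scan = (0, 0, 3000, 2000)

# Two-page maze: cut off an assumed 40-pixel sliver and keep it as one page.
spread = trim_left(scan, 40)

# Ordinary scan with two mazes: break it into two separate pages.
left_page, right_page = split_in_two(scan)

print(spread)      # (40, 0, 3000, 2000)
print(left_page)   # (0, 0, 1500, 2000)
print(right_page)  # (1500, 0, 3000, 2000)
```

A fixed midpoint split assumes the seam sits in the center of the scan, which is one more reason to keep the book reasonably square on the mat.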

After the Cleanup process is done, I have 27 pages ready to be saved. I click on the Save & Exit button shown below.

Cleanup finished

I’m next asked where to save the PDF. I typically save to my desktop or Evernote or to Dropbox, but there are a dozen options here.

Where to save

I collect these kinds of scans in Dropbox so I can use an app on my iPad to allow my sons to do the various worksheets and books I scan. Any app that can open a PDF file and let you scribble over the original will work: PDFPen, Doodle Buddy, and even GoodReader. (Alternatively, you can scan each page as an individual JPEG.)

Keep in mind, my main reason for doing this is that I have two sons, ages 6 and 3. I’d prefer to not buy a book twice. I was burned many times with my first son when I purchased workbooks and he’d scribble a single line with a crayon over a page or a couple dozen. Knowing my youngest would one day enjoy books like these maze books, I have found that scanning in these types of books will let my boys do the activities over and over and over again…

Below you can see one of the mazes opened up in PDFPen. It’s a great app that lets you mark up PDFs. Even better, it lets me download a copy of the mazes from Dropbox so the originals are left intact. And even better… PDFPen will let my son undo a markup, in essence erasing a worksheet or maze for another try.

PDFPen Scan

And here’s another page from the book opened in GoodReader and solved by my son. (I’m trying to get him to use a stylus, but he insists on using his fingers.)

GoodReader Scan

With both apps, you have to turn the iPad to Landscape mode for the large mazes that cross two pages, but it still works great and requires no dragging around of the screen — the entire thing is displayed on the large screen.

Now here’s the deal: everything you see above can be done by most any of the ScanSnap family of scanners. With the iX500 and other feeder-type scanners, you’re going to have to pull apart the workbooks so you can scan the pages, but you can save time by using the duplex-mode scanning feature to scan both sides of a page at once. (Be sure to trim the edges off the pages where the glue in the binding remains… this can gum up a scanner and cause a page to misfeed. I used scissors with my early maze book scans to trim the edges so the pages fed in nice and neat.)

Now, to answer the comparison question: iX500 versus SV600. For a single page or a stack of 10-20 separated sheets, the iX500 is much faster since it can grab one page at a time from the hopper. Even so, I use the SV600 for individual scans (a single sheet such as a legal document or a brochure) because it’s just easy to set the sheet on the mat and press the button. For batches that mix unusual document sizes, I go with the SV600. For standard stacks (such as 8.5 x 11 sheets or three pages ripped from a magazine) I go with the iX500 for sheer speed. For books (mine or kids’ books) I go with the SV600 now that I see how well it does on scanning two pages at once and flattening them. Both scanners can deliver the same resolution options and save to PDF or JPG, so it’s really an issue of my time and which scanner is best suited for the type of scanning I will be doing. Again, the quality of the scans from both scanners is identical as far as I can tell.
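
To boil that comparison down, here is a small sketch of how I would write my own rule of thumb as a Python function. It encodes personal preference only, not any official guidance, and the function name and parameters are invented for the example.

```python
def pick_scanner(bound=False, odd_sizes=False, sheet_count=1):
    """Suggest a scanner based on the rule of thumb described above."""
    # Bound books and odd-sized items go on the mat under the overhead SV600.
    if bound or odd_sizes:
        return "SV600"
    # A stack of standard loose sheets goes through the iX500 feeder for speed.
    if sheet_count > 1:
        return "iX500"
    # A single loose sheet: the iX500 is faster, but the SV600 is more convenient.
    return "SV600"


print(pick_scanner(bound=True))      # SV600 (a maze book)
print(pick_scanner(sheet_count=15))  # iX500 (a stack of 8.5 x 11 sheets)
print(pick_scanner())                # SV600 (one brochure)
```

If speed matters more to you than convenience for single sheets, change the last line to return the iX500 instead.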

Over the last week, I’ve now managed to scan in six additional maze books (added to the original Pirate Mazes book I scanned with the iX500) in the Maze Craze series. My son loves this series because there’s a story that ties all the mazes together… and in a year or two, my youngest will probably be ready to enjoy solving them as well.

Mummy Mazes
Castle Mazes
Explorer Mazes
Magician’s Castle
Detective Mazes
Spooky Mazes
Pirate Mazes

I should add that I also scan about 20-30% of the worksheets, scribbles, coloring sheets, and other stuff that comes home from school. I don’t know about your school(s), but my youngest brings home 2-3 sheets per day and my oldest brings home a pile of worksheets every Thursday. In a month, this pile can easily hit 2-3″ in height. I’ve gotten into the habit of picking out some of the best stuff and scanning it right into a single PDF. I have folders for both boys labeled by age or school year and I’m hoping there’ll come a day in the distant future when a daughter-in-law or grandchild might be interested in seeing not only photos from their dad’s childhood but also a collection of artwork, spelling tests, report cards, and other stuff. I read somewhere that studies have shown a photo of keepsakes and special artwork triggers something in the brain that’s identical to holding the object in your hands… I can’t keep all the stuff my boys bring home, so scanning gives me the next best thing. Just a thought.
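
If you want to keep a folder scheme like that consistent, a tiny helper can build the file path for each scan. This is just a sketch: the root folder, the child and age labels, and the file-naming rule are placeholders I picked for the example, so substitute whatever labels you actually use.

```python
from pathlib import PurePosixPath


def keepsake_path(root, child, label, title):
    """Build a path like root/child/label/title.pdf with a tidy file name."""
    # Lowercase the title and join its words with hyphens.
    file_name = "-".join(title.lower().split()) + ".pdf"
    return PurePosixPath(root) / child / label / file_name


# Placeholder names; "age-6" could just as easily be a school year.
print(keepsake_path("Keepsakes", "oldest", "age-6", "Spelling Test"))
# Keepsakes/oldest/age-6/spelling-test.pdf
```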

Finally, I’d be remiss if I didn’t close this walkthrough by stating clearly that I do not violate copyright laws and I do not redistribute content. I was asked if I could send a copy of the Pirate Mazes PDF “to look over” but I politely declined. Everything I scan for use by my children is something I purchased. As a full-time writer, I do not wish to steal from my fellow writers/artists and I would humbly ask that you adopt the same position should you choose to scan any books in your library.

James Floyd Kelly is a full-time writer. His latest three books are Digital Engineering with Minecraft, Tinkercad for Beginners and The Ultimate iPad. Learn more by visiting his website.