I have always been a lazy user of scanners. I really hate pulling out a scanner every time I need to scan a small document, then scanning, copying and emailing it; it is tedious work for me. These days there are cameras everywhere: in mobile phones, computers and laptops. There are even apps that use a video camera as a scanner. I decided to try it for myself and found that, with a few lines of code using OpenCV and Python, you can capture document scans from a video camera.

I first convert each video frame to grayscale, then to a black-and-white image (for clarity), and then capture a frame. For the black-and-white step I applied the adaptive mean threshold algorithm, which works very well and gives a nearly noise-free result.
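To make the thresholding step a bit more concrete, here is a minimal NumPy sketch of what adaptive mean thresholding does as I understand it: each pixel is compared against the mean of its own neighbourhood minus a small constant, instead of against one global cut-off. The function name `adaptive_mean_threshold`, the synthetic `page` image and the edge-replication border handling are my own assumptions for illustration. This is not OpenCV's implementation, and its output may differ slightly from `cv2.adaptiveThreshold`.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=25, c=9, max_value=255):
    """Illustrative re-implementation (hypothetical helper, not OpenCV code)."""
    if block_size % 2 == 0 or block_size < 3:
        raise ValueError("block_size must be an odd number >= 3")
    pad = block_size // 2
    # Replicate edge pixels so every pixel has a full neighbourhood (assumption)
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Summed-area table: lets us read any window sum with four lookups
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    k = block_size
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    local_mean = window_sum / (k * k)
    # White where the pixel is brighter than (local mean - c), black otherwise
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)

# Synthetic "page": unevenly lit background (dark on the left, bright on
# the right) with a horizontal band of "ink" 50 grey levels darker.
h, w = 60, 200
page = np.tile(np.linspace(60, 220, w), (h, 1))
page[28:32, :] -= 50
page = page.round().astype(np.uint8)

out = adaptive_mean_threshold(page)                          # local threshold
global_out = np.where(page > 128, 255, 0).astype(np.uint8)   # one fixed threshold
```

On this synthetic page the fixed threshold turns the dim left side of the paper black and misses the ink on the bright right side, while the local threshold only has to decide whether a pixel is darker than its surroundings. That is the reason I would expect it to cope better with the uneven lighting you get from a webcam.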

Let's see the code:

import cv2

cap = cv2.VideoCapture(0)

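# Stop early if the camera could not be opened
if not cap.isOpened():
    raise RuntimeError("Could not open camera 0")

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        # No frame was returned (camera unplugged or busy)
        break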

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Adaptive mean threshold: 25x25 neighbourhood, constant 9 subtracted from the local mean
    thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 25, 9)

    # Display the resulting black-and-white frame
    cv2.imshow('frame', thresh)

    # Press q to save the current frame and quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.imwrite("snap.jpg", thresh)
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()

This code shows a live video. Grab a document, hold it up to the camera, and press the q key to capture a frame and quit; the frame is saved as snap.jpg. Then crop the image to whatever region you want and convert it to PDF. You can see a sample output below (deliberately made to look unclear). This is just a frame capture from a video camera.
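The crop-and-convert step can also be scripted. The sketch below is one possible way to do it, assuming the Pillow library is installed; Pillow is not used anywhere else in this post. The helper name `crop_to_pdf`, the output name `scan.pdf`, the crop box in the usage comment and the 150 dpi resolution are all made up for illustration.

```python
from PIL import Image

def crop_to_pdf(src_path, pdf_path, box, resolution=150.0):
    """Crop an image and write it as a one-page PDF (hypothetical helper).

    box is (left, upper, right, lower) in pixels, as Pillow's crop() expects.
    Returns the (width, height) of the cropped page.
    """
    with Image.open(src_path) as img:
        # The saved frame is already black and white; 8-bit grayscale keeps it small
        page = img.convert("L").crop(box)
        page.save(pdf_path, "PDF", resolution=resolution)
        return page.size

# Example call (the crop box is a placeholder; pick one that fits your frame):
# crop_to_pdf("snap.jpg", "scan.pdf", (100, 50, 540, 430))
```

The crop box has to be found by eye for now, for example by opening snap.jpg in an image viewer and reading off the pixel coordinates of the document's corners.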

I noticed that camera shake is a major problem. I later tried applying Canny edge detection to this image, but simple thresholding did a great job!
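One possible way to deal with shake, which I have not tried in the code above, is to score each grayscale frame for sharpness and only save frames that score well. The sketch below uses the variance of a simple 4-neighbour Laplacian as that score. The helper name `sharpness_score` and the synthetic stripe images are my own, purely for illustration, and any cut-off value would have to be tuned by hand for a given camera.

```python
import numpy as np

def sharpness_score(gray):
    """Variance of a 4-neighbour Laplacian; higher usually means sharper.

    Hypothetical helper for illustration, computed on interior pixels only.
    """
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

# Synthetic check: crisp vertical stripes, then the same stripes smeared
# sideways by averaging five horizontally shifted copies (a crude motion blur).
stripes = np.tile(np.repeat([0, 255], 4), 8)            # one row, 64 pixels
sharp = np.tile(stripes, (32, 1)).astype(np.uint8)
blurred = np.mean([np.roll(sharp, s, axis=1) for s in range(-2, 3)], axis=0)

sharp_score = sharpness_score(sharp)
blurred_score = sharpness_score(blurred)
print(sharp_score > blurred_score)
```

In the capture loop, the same idea could be applied to `gray` before thresholding. With OpenCV, `cv2.Laplacian(gray, cv2.CV_64F).var()` should give a comparable measure.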

[Image: snap]