I have always been a lazy user of scanners. I really hate pulling up a scanner every time I need to scan a small document, and then having to scan, copy, and email it; it is such tedious work. These days there are cameras everywhere: in mobile phones, computers, and laptops. There are even apps that use the video camera as a scanner. I decided to try it for myself and found that, with a few lines of code using OpenCV and Python, you can capture document scans from a video camera.
I first convert each video frame to grayscale, then to a black-and-white image (for clarity), and then capture a frame. For the black-and-white step I applied OpenCV's adaptive mean threshold, which worked very well and gave a noise-free result.
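To make the thresholding step concrete, here is a minimal pure-Python sketch of the idea behind an adaptive mean threshold: each pixel is compared with the mean of its own neighborhood minus a constant, rather than with one global cutoff. The function name `adaptive_mean_threshold` and the clamped border handling are my own assumptions for illustration; OpenCV's implementation is far faster and may differ in rounding and at the image edges.

```python
def adaptive_mean_threshold(gray, block_size=25, c=9, max_value=255):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    Illustrative sketch only, not OpenCV's actual implementation.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be odd and at least 3")

    height, width = len(gray), len(gray[0])
    half = block_size // 2
    area = block_size * block_size

    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Sum the block_size x block_size neighborhood around (x, y).
            total = 0
            for dy in range(-half, half + 1):
                # Assumption: repeat edge pixels for neighbors outside the image.
                yy = min(max(y + dy, 0), height - 1)
                for dx in range(-half, half + 1):
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]

            # The threshold is local: neighborhood mean minus the constant c.
            local_threshold = total / area - c
            row.append(max_value if gray[y][x] > local_threshold else 0)
        out.append(row)
    return out
```

Because the cutoff follows the local brightness, a shadow across part of the page shifts the threshold along with it, which is probably why this copes with uneven lighting better than a single global threshold would.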
Let's see the code:
import cv2

# Open the default camera (device index 0)
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        # Stop if the camera did not return a frame
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Binarize with an adaptive mean threshold (block size 25, constant 9)
    thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 25, 9)

    # Display the resulting frame
    cv2.imshow('frame', thresh)

    # Press q to save the current frame and quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.imwrite("snap.jpg", thresh)
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
This code shows a live video. Grab a document, hold it up to the camera, and press the q key to capture and quit; this saves the thresholded frame as an image (snap.jpg). Then crop the image to whatever region you want and convert it to PDF. You can see a sample output below (deliberately made to look unclear). This is just a frame capture using a video camera.
I noticed that camera shake is a major problem. I later tried applying Canny edge detection to this image, but simple thresholding did a great job!
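I did the crop-and-convert step by hand, but it could also be scripted. The sketch below is one possible way, assuming the Pillow library is installed (it is not used anywhere in the script above); the helper name `crop_to_pdf` and the crop box in the example call are hypothetical.

```python
from PIL import Image  # assumption: Pillow is installed (pip install Pillow)


def crop_to_pdf(src_path, pdf_path, box):
    """Crop an image to box = (left, upper, right, lower) and save it as a one-page PDF."""
    with Image.open(src_path) as img:
        cropped = img.crop(box)
        # Normalize to 8-bit grayscale, a mode Pillow's PDF writer accepts.
        cropped.convert("L").save(pdf_path, "PDF")


# Example call (hypothetical crop box; adjust it to where the page sits in your frame):
# crop_to_pdf("snap.jpg", "scan.pdf", (50, 50, 600, 430))
```

The crop box is in pixel coordinates of the saved frame, so it depends on the camera resolution and on how the document was held.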