K-means is a clustering algorithm that can be applied to a variety of machine learning problems. Let's take a look at how to apply k-means to grayscale faces to detect the coordinates of the left and right eye centers. We first apply a Gabor filter to the face to get a maximum response at the eye centers, then slice out the eye region to shrink the search area. Finally, we apply thresholding to pick the (x, y) coordinates of the dark pixels and run k-means on them to find the cluster centers (2 cluster centroids, one each for the left and right eye).
Remember that k-means cannot be applied directly to the 96×96 pixel images. If you cluster the raw images, every centroid has the same dimensionality as an image, i.e. 96*96 = 9216 dimensions, instead of being a point inside the image. So you need 2D (x, y) coordinate data for k-means to return usable center points.
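To make the dimensionality point concrete, here is a small sketch using NumPy only. The random faces array is a hypothetical stand-in for the real data, and a plain mean is used in place of a fitted k-means centroid, since a centroid is just a mean of the samples in its cluster.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a stack of 96x96 grayscale faces.
faces = rng.integers(0, 256, size=(10, 96, 96))

# Clustering whole images: each sample is a flattened 9216-dimensional vector,
# so any centroid (a mean of samples) is itself a 9216-dimensional "average face".
flat = faces.reshape(len(faces), -1)
image_centroid = flat.mean(axis=0)
print(flat.shape, image_centroid.shape)

# Clustering pixel coordinates: each sample is a (row, col) pair,
# so a centroid is a 2D point that can be read as a position in the image.
coords = np.column_stack(np.where(faces[0] < 50))
coord_centroid = coords.mean(axis=0)
print(coords.shape[1], coord_centroid.shape)
```

Only the second kind of centroid can be interpreted as an eye location, which is why the pixel coordinates are extracted before clustering.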
Let's look at the input grayscale facial images. Our goal is to run an unsupervised approach that detects the left and right eye centers and is insensitive to orientation.
We apply a Gabor filter with a wavelength of 1.5 (the freq value in the code, which is passed as OpenCV's lambd argument), theta = pi/2 and ksize = 11 to get the maximum response from the images. You can tweak sigma, starting from 1.0, to see the various filter responses.
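To see what each of these parameters does, the following is a minimal NumPy sketch of the textbook real-valued Gabor function: a Gaussian envelope multiplied by a cosine carrier. The function name gabor_kernel is hypothetical, and this is not OpenCV's implementation; cv2.getGaborKernel may use different sign and indexing conventions, so treat the sketch as an illustration of the parameters only.

```python
import numpy as np

def gabor_kernel(ksize=11, sigma=1.0, theta=np.pi / 2, lambd=1.5, gamma=0.5, psi=0.0):
    """Real part of a Gabor function sampled on a ksize x ksize grid (illustrative)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate system by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope (gamma squashes it along y_r) times a cosine carrier
    # whose wavelength is lambd and whose phase offset is psi.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lambd + psi)
    return envelope * carrier

kern = gabor_kernel()
print(kern.shape)
```

A larger sigma widens the envelope so more stripes of the carrier fit inside the kernel, while theta sets the direction across which the stripes alternate.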
    import cv2
    import numpy as np

    # X (the stack of 96x96 grayscale faces) and image_histogram_equalization
    # are assumed to be defined earlier.

    def build_filters():
        """Build the Gabor kernels (a single orientation, theta = pi/2)."""
        filters = []
        ksize = 11
        freq = np.array([1.5])  # passed to getGaborKernel as lambd (wavelength)
        for theta in np.arange(np.pi / 2, np.pi, np.pi / 2):
            for lamda in freq:
                # Arguments: ksize, sigma, theta, lambd, gamma, psi
                kern = cv2.getGaborKernel((ksize, ksize), 1.0, theta, lamda, 0.5, 0, ktype=cv2.CV_32F)
                kern /= 1.5 * kern.sum()
                filters.append(kern)
        return filters

    def process(img, filters):
        """Filter img with every kernel and keep the per-pixel maximum response."""
        accum = np.zeros_like(img)
        for kern in filters:
            fimg = cv2.filter2D(img, cv2.CV_8UC3, kern)
            np.maximum(accum, fimg, accum)
        return accum

    filters = build_filters()
    res = []
    for k in range(len(X)):
        X[k, :, :] = image_histogram_equalization(X[k, :, :])[0]
        # Pass the whole kernel list; process() iterates over it.
        res.append(process(X[k], filters))

    print('Gabor Filters', np.asarray(filters).shape)
    output = np.asarray(res)
The output looks like this:
Here we can clearly identify the dark pixels at the eye centers, eyebrows and mouth corners. Not bad!!
We take the Gabor filter output of the face, slice the eye region to 70×30 pixels, then collect the coordinates of the pixels that mark the dark eye regions (the code selects them by thresholding the filter response at 120), and then apply k-means with 2 clusters, one per eye. Finally, we print the centroids.
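The thresholding and clustering step can be tried on synthetic data without any face images. The sketch below uses a plain Lloyd's k-means written in NumPy rather than scikit-learn's KMeans; the function name two_means, the synthetic blobs, and the leftmost/rightmost seeding are all assumptions made for illustration.

```python
import numpy as np

def two_means(points, n_iter=20):
    """Plain Lloyd's k-means for k=2 on an (n, 2) array of (row, col) points."""
    pts = np.asarray(points, dtype=float)
    # Assumption: seed with the leftmost and rightmost points (one per eye).
    centroids = np.stack([pts[np.argmin(pts[:, 1])], pts[np.argmax(pts[:, 1])]])
    for _ in range(n_iter):
        # Distance of every point to both centroids, shape (n, 2).
        dists = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each centroid to the mean of its points (keep it if it has none).
        new_centroids = np.array([
            pts[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids

# Synthetic 30x70 "eye strip": two bright 5x5 blobs on a dark background.
eyes = np.zeros((30, 70), dtype=np.uint8)
eyes[12:17, 15:20] = 200   # left blob, centred at (row 14, col 17)
eyes[12:17, 50:55] = 200   # right blob, centred at (row 14, col 52)

# Keep the coordinates of the pixels above the threshold of 120.
peaks = np.column_stack(np.where(eyes > 120))   # shape (n_pixels, 2) as (row, col)
centroids = two_means(peaks)
print(centroids)
```

Each row of centroids is a (row, col) pair, so when plotting on top of the image the column goes on the x axis and the row on the y axis.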
    import matplotlib.pyplot as pl
    from sklearn.cluster import KMeans

    # Slice the eye region and threshold it
    for k in range(5):
        eyes = output[k, 20:50, 10:80]  # 30 rows x 70 columns
        print('Eyes', eyes.shape)
        # thresh = filter.threshold_otsu(eyes)
        # eyes = eyes > thresh
        # (row, col) coordinates of the pixels above the threshold, shape (n, 2)
        peaks = np.array(np.where(eyes > 120))
        peaks = np.rollaxis(peaks, -1)
        print('Peaks', peaks.shape)
        estimator = KMeans(n_clusters=2)
        estimator.fit(peaks)
        centroids = np.asarray(estimator.cluster_centers_)
        print('Cluster centers', centroids.shape, centroids)
        pl.figure()
        pl.imshow(eyes.reshape(30, 70), cmap='gray')
        # pl.scatter(peaks[:, 1], peaks[:, 0], marker='o')
        # Each centroid is a (row, col) pair: x is column 1, y is column 0
        pl.scatter(centroids[:, 1], centroids[:, 0], marker='x', s=100)
        pl.show()
Let's see what the output looks like... Wow, looks nice.
The drawback of using k-means this way is that it makes more errors and the algorithm needs fine tuning. Still, the approach is very simple, and the same idea can be applied to the nose, mouth and lips.