SIFT
1) Overview
- SIFT extracts features that are invariant to scale, rotation, and intensity changes.
- SIFT features can be matched reliably across changes in 3D viewpoint and under noise.
- SIFT includes both an interest point detector and a descriptor.
Interest point detector
- It is an algorithm that chooses points from an image based on some criterion.
- Harris, Min Eigen (Shi-Tomasi), and FAST are examples of interest point detectors.
Descriptor
- It is a vector of values that describes the image patch around an interest point (raw pixel values, histogram of gradients, etc.).
(ref : https://dsp.stackexchange.com/questions/24346/what-is-the-difference-between-feature-detectors-and-feature-descriptors)
STEPS
1) DoG
- SIFT interest point locations are found using difference-of-Gaussian functions (DoG).
$$D(X, \sigma) = [G_{k\sigma}(X)-G_{\sigma}(X)]*I(X)=[G_{k\sigma}-G_{\sigma}]*I=I_{k\sigma}-I_{\sigma}$$
- Before detecting interest points, the image is blurred with a Gaussian to suppress noise.
- Here $I_{\sigma} = G_{\sigma}*I$ is the grayscale image blurred with a Gaussian of standard deviation $\sigma$, and $k$ is a constant that determines the separation between adjacent scales.
- Interest points are the maxima and minima of D(x, sigma) across both image location and scale.
DoG
- DoG is computed by subtracting two images blurred with Gaussian kernels of different standard deviations.
- The result closely approximates the LoG (Laplacian of Gaussian).
- LoG requires computing second derivatives, so it needs more computing power.
- Furthermore, for scale invariance LoG must be normalized by $\sigma^2$; DoG already includes this normalization approximately.
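A minimal sketch of the DoG computation above, using SciPy's `gaussian_filter` for the blur. The toy image, $\sigma = 1.6$, and $k = \sqrt{2}$ are example values chosen here, not taken from the text:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy grayscale "image": a bright square on a dark background.
I = np.zeros((64, 64), dtype=np.float64)
I[24:40, 24:40] = 1.0

sigma = 1.6      # example base scale
k = 2 ** 0.5     # example separation factor between adjacent scales

I_sigma = gaussian_filter(I, sigma)       # I_sigma  = G_sigma  * I
I_ksigma = gaussian_filter(I, k * sigma)  # I_ksigma = G_ksigma * I

# D(X, sigma) = I_ksigma - I_sigma, an approximation of the normalized LoG
D = I_ksigma - I_sigma
print(D.shape)  # (64, 64)
```

Note that D takes both signs: it is negative at the center of the blob (the more-blurred image has a lower peak) and positive around it, which is exactly the blob-like response that the extremum search exploits.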
2) Find interest points (maxima and minima of D(x, sigma); DoG)
- The DoG images are searched for local maxima and minima over scale and space.
- The picture above shows an example. Within an octave, each pixel is compared with its 8 neighbors at the same scale and with the 9 pixels at each of the two adjacent scales (26 neighbors in total). If it is larger or smaller than all of them, it is a potential keypoint.
- To detect extrema at n scales per octave, each octave needs n + 2 DoG images, i.e. n + 3 Gaussian-blurred images. The scale factor between adjacent images is $k = 2^{1/n}$.
Octave : a group of images of the same size
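The 26-neighbor comparison above can be sketched with NumPy. Here `dog` is an assumed stack of DoG images with shape (scales, height, width); the planted maximum is just a toy example:

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """Check whether dog[s, y, x] is a strict maximum or minimum among its
    26 neighbors: 8 at the same scale plus 9 at each adjacent scale."""
    val = dog[s, y, x]
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]  # 3 x 3 x 3 neighborhood
    if val == cube.max() and (cube == val).sum() == 1:
        return True   # strict maximum over all 26 neighbors
    if val == cube.min() and (cube == val).sum() == 1:
        return True   # strict minimum over all 26 neighbors
    return False

# Toy DoG stack with one planted maximum at the middle scale.
dog = np.zeros((3, 5, 5), dtype=np.float64)
dog[1, 2, 2] = 1.0
print(is_extremum(dog, 1, 2, 2))  # True
print(is_extremum(dog, 1, 2, 3))  # False
```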
3) Interest point localization
- Refine the potential keypoints and discard poor ones (edge responses and low-contrast keypoints) to get more accurate results.
(1) Elimination of edge responses
- DoG has a strong response along edges, even though the location along the edge is poorly determined.
- Edge responses are rejected by thresholding the ratio of principal curvatures, computed from a 2×2 Hessian of D, in the same spirit as the Harris corner detector.
(2) Elimination of low contrast keypoints
- Use a Taylor series expansion of the scale-space function to get a more accurate (sub-pixel) location of each extremum.
- If the contrast (the DoG value at the refined extremum) is less than a threshold, reject the keypoint.
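The edge-rejection test in (1) can be sketched as follows. The finite-difference Hessian and the curvature-ratio threshold `r = 10` follow Lowe's paper; the two toy patches are examples constructed here:

```python
import numpy as np

def passes_edge_test(D, y, x, r=10.0):
    """Reject edge-like DoG extrema by thresholding the ratio of principal
    curvatures via the 2x2 Hessian of D at (y, x). Lowe suggests r = 10."""
    dxx = D[y, x + 1] - 2 * D[y, x] + D[y, x - 1]
    dyy = D[y + 1, x] - 2 * D[y, x] + D[y - 1, x]
    dxy = (D[y + 1, x + 1] - D[y + 1, x - 1]
           - D[y - 1, x + 1] + D[y - 1, x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures have different signs: not an extremum
        return False
    # Tr(H)^2 / Det(H) grows with the curvature ratio; keep only compact blobs.
    return tr * tr / det < (r + 1) ** 2 / r

blob = np.array([[0, 1, 0], [1, 2, 1], [0, 1, 0]], dtype=float)  # corner-like
edge = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=float)  # ridge/edge
print(passes_edge_test(blob, 1, 1))  # True
print(passes_edge_test(edge, 1, 1))  # False
```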
4) Orientation assignment
- To achieve invariance to rotation, compute the gradient magnitude and direction in a neighborhood around the keypoint.
- The samples are weighted by a Gaussian circular window with a standard deviation of 1.5 times the keypoint's scale.
- An orientation histogram with 36 bins covering 360 degrees is created.
- The highest peak of the histogram gives the keypoint's orientation. If another peak is within 80% of the highest one, an additional keypoint is created with that orientation.
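A simplified version of the orientation histogram above, assuming a square grayscale patch around the keypoint. The Gaussian weighting (1.5× the keypoint scale) and the extra-keypoint logic for secondary peaks are omitted here for brevity:

```python
import numpy as np

def orientation_histogram(patch, num_bins=36):
    """36-bin gradient orientation histogram (10 degrees per bin),
    weighted by gradient magnitude. Real SIFT additionally weights each
    sample by a Gaussian window with sigma = 1.5 * keypoint scale."""
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist = np.zeros(num_bins)
    bin_width = 360.0 / num_bins
    for m, a in zip(mag.ravel(), ang.ravel()):
        hist[int(a // bin_width) % num_bins] += m
    return hist

# Intensity increases along x, so every gradient points at 0 degrees.
patch = np.tile(np.arange(8, dtype=np.float64), (8, 1))
hist = orientation_histogram(patch)
print(int(hist.argmax()))  # 0  (the bin covering 0-10 degrees)
```

The 80% rule would then be `np.where(hist >= 0.8 * hist.max())[0]`, giving every bin that spawns a keypoint orientation.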
5) Keypoint Descriptor
- Each keypoint's neighborhood is summarized as a vector, the interest point descriptor.
- To achieve invariance to rotation, a reference direction is chosen based on the direction and magnitude of the image gradient around each point.
- Compute a descriptor based on the keypoint's position, scale, and rotation.
- The descriptor takes a grid of subregions around the point and computes an image gradient orientation histogram for each subregion. (Standard setting: 4 × 4 subregions with 8-bin orientation histograms, giving a 4 × 4 × 8 = 128-dimensional vector.)
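A stripped-down sketch of the 4 × 4 × 8 descriptor above, assuming a 16 × 16 patch already rotated to the keypoint's orientation. Gaussian weighting, trilinear interpolation, and the clipping step of real SIFT are omitted:

```python
import numpy as np

def sift_like_descriptor(patch):
    """Split a 16x16 patch into 4x4 subregions of 4x4 pixels each, build an
    8-bin gradient orientation histogram per subregion, concatenate to a
    128-dim vector, and normalize to unit length."""
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    hists = []
    for by in range(4):
        for bx in range(4):
            sl = (slice(4 * by, 4 * by + 4), slice(4 * bx, 4 * bx + 4))
            hist = np.zeros(8)                      # 8 bins of 45 degrees
            for m, a in zip(mag[sl].ravel(), ang[sl].ravel()):
                hist[int(a // 45) % 8] += m
            hists.append(hist)
    desc = np.concatenate(hists)                    # 4 * 4 * 8 = 128 values
    return desc / (np.linalg.norm(desc) + 1e-12)    # unit length

patch = np.random.default_rng(0).random((16, 16))
d = sift_like_descriptor(patch)
print(d.shape)  # (128,)
```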
6) Keypoint matching
- To match a feature in one image to a feature in another, use the ratio of the distances to the two closest matching features.
- Compute the Euclidean distance between each descriptor in one image and every descriptor in the other.
- Accept a match only if the ratio of the distance to the nearest neighbor over the distance to the second-nearest neighbor is below a threshold (0.8).
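The ratio test above can be sketched directly with NumPy. The two-dimensional toy descriptors are made-up examples; real SIFT descriptors are 128-dimensional:

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Lowe's ratio test: for each descriptor in desc1, find its two nearest
    neighbors in desc2 (Euclidean distance) and accept the match only if
    nearest / second-nearest < ratio."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]          # nearest and second-nearest
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

desc1 = np.array([[1.0, 0.0], [0.0, 1.0]])
desc2 = np.array([[0.9, 0.1], [0.0, 1.0], [5.0, 5.0]])
print(ratio_test_match(desc1, desc2))  # [(0, 0), (1, 1)]
```

The OpenCV code below instead uses `BFMatcher` with `crossCheck=True`, which keeps a match only when the two descriptors are mutual nearest neighbors; an alternative is `bf.knnMatch(..., k=2)` followed by this same ratio test.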
Code
1) Keypoints
import cv2
import matplotlib.pyplot as plt

img = cv2.imread('building.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

sift = cv2.SIFT_create()
kp = sift.detect(gray, None)        # detect keypoints only (no descriptors)
img = cv2.drawKeypoints(gray, kp, img)

plt.figure(figsize=(10, 7))
plt.imshow(img)
plt.show()
2) Feature matching
import cv2
import matplotlib.pyplot as plt

img1 = cv2.imread('/users/sejongpyo/downloads/631.jpg')
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.imread('/users/sejongpyo/downloads/632.png')
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# sift (cv2.xfeatures2d.SIFT_create() in older OpenCV versions)
sift = cv2.SIFT_create()
keypoints_1, descriptors_1 = sift.detectAndCompute(gray1, None)
keypoints_2, descriptors_2 = sift.detectAndCompute(gray2, None)

# brute-force matching on L2 distance, keeping only mutual nearest neighbors
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = bf.match(descriptors_1, descriptors_2)
matches = sorted(matches, key=lambda x: x.distance)  # best matches first

# draw the 30 best matches
img3 = cv2.drawMatches(img1, keypoints_1, img2, keypoints_2, matches[:30], None, flags=2)
plt.imshow(img3)
plt.axis('off')
plt.show()
Reference
salkuma.files.wordpress.com/2014/04/sifteca095eba6ac.pdf