SIFT
1) Overview
- SIFT extracts features that are invariant to scale, rotation, and intensity changes.
- SIFT features can be matched reliably across changes in 3D viewpoint and under noise.
- SIFT includes both an interest point detector and a descriptor.
Interest point detector
- An algorithm that selects points from an image based on some criterion.
- Harris, Min Eigen, and FAST are examples of interest point detectors.
Descriptor
- A vector of values that describes the image patch around an interest point (e.g., raw pixel values, histograms of gradients).
(ref : https://dsp.stackexchange.com/questions/24346/what-is-the-difference-between-feature-detectors-and-feature-descriptors)
STEPS
1) DoG

- SIFT interest point locations are found using difference-of-Gaussian functions (DoG).
$D(X, \sigma) = [G_{k\sigma}(X) - G_\sigma(X)] * I(X) = [G_{k\sigma} - G_\sigma] * I = I_{k\sigma} - I_\sigma$
- Before detecting interest points, the image is blurred to suppress noise.
$I_\sigma = G_\sigma * I$ : the grayscale image blurred with a Gaussian of standard deviation $\sigma$
$k$ : a constant factor determining the separation in scale between adjacent blurred images
- Interest points are the maxima and minima of $D(X, \sigma)$ across both image location and scale.
DoG vs. LoG
- The DoG is computed by subtracting two versions of the image blurred with Gaussian kernels of different standard deviations.
- The result closely approximates the Laplacian of Gaussian (LoG).
- LoG requires computing second derivatives, so it needs more computing power.
- Furthermore, for scale invariance, LoG must be normalized by $\sigma^2$; DoG already incorporates this normalization up to a constant factor. A minimal sketch of one DoG layer follows below.
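The sketch below computes a single DoG layer with OpenCV. The file name building.jpg (reused from the keypoint example later) and the values $\sigma = 1.6$, $k = \sqrt{2}$ are illustrative assumptions, not fixed by the text.

import cv2
import numpy as np

# Blur the same grayscale image with two Gaussians whose standard deviations
# differ by a factor k, then subtract: D(X, sigma) = I_{k*sigma} - I_sigma.
gray = cv2.imread('building.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)
sigma, k = 1.6, 2 ** 0.5                        # illustrative values
I_sigma = cv2.GaussianBlur(gray, (0, 0), sigma)
I_ksigma = cv2.GaussianBlur(gray, (0, 0), k * sigma)
dog = I_ksigma - I_sigma                        # one DoG layer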
2) Find interest points (maxima and minima of D(x, sigma); DoG)

- DoG images are searched for local maxima and minima over scale and space.
- The picture above shows an example. Within one octave, each pixel is compared with its 8 neighbors in the same DoG image and the 9 pixels in each of the two adjacent scales, 26 neighbors in total. If it is a local maximum or minimum, it is a potential keypoint (see the sketch after this list).
- To detect extrema at n scales per octave, each octave needs n + 3 Gaussian-blurred images, which yield n + 2 DoG images. The scale factor between adjacent images is $k = 2^{1/n}$.
Octave : a group of images of the same size (the image is downsampled by a factor of 2 between octaves)
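A minimal numpy sketch of the 26-neighbor test, assuming three adjacent DoG images from the same octave and an interior pixel (r, c); the function name is hypothetical:

import numpy as np

def is_scale_space_extremum(dog_below, dog_mid, dog_above, r, c):
    # Stack the 3x3 neighborhoods from the three adjacent scales (27 values:
    # the center plus its 26 neighbors) and test whether the center is the
    # maximum or minimum of the whole cube.
    cube = np.stack([d[r - 1:r + 2, c - 1:c + 2]
                     for d in (dog_below, dog_mid, dog_above)])
    center = dog_mid[r, c]
    return center >= cube.max() or center <= cube.min()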
3) Interest point localization
- Potential keypoints are refined for accuracy, and unstable ones (edge responses and low-contrast keypoints) are rejected.
(1) Elimination of edge response
- DoG has a strong response along edges, where keypoints are poorly localized.
- A Harris-like measure, the ratio of principal curvatures computed from the 2 x 2 Hessian of D, is used to reject edge points.
(2) Elimination of low contrast keypoints
- A Taylor series expansion of the scale space is used to get a more accurate (sub-pixel) location of each extremum.
- If the contrast at a potential keypoint's refined location is less than a threshold value, reject it. A sketch of both tests follows below.
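A minimal sketch of the two rejection tests on a single DoG image, using the thresholds from Lowe's paper (0.03 for contrast, assuming pixel values in [0, 1], and r = 10 for the curvature ratio); the sub-pixel Taylor refinement is omitted and the function name is hypothetical:

import numpy as np

def passes_stability_tests(dog, r, c, contrast_thresh=0.03, edge_r=10.0):
    # (1) Low-contrast test on the DoG response.
    if abs(dog[r, c]) < contrast_thresh:
        return False
    # (2) Edge test: 2x2 Hessian of D via finite differences, then compare
    # tr(H)^2 / det(H) against (r + 1)^2 / r.
    dxx = dog[r, c + 1] - 2 * dog[r, c] + dog[r, c - 1]
    dyy = dog[r + 1, c] - 2 * dog[r, c] + dog[r - 1, c]
    dxy = (dog[r + 1, c + 1] - dog[r + 1, c - 1]
           - dog[r - 1, c + 1] + dog[r - 1, c - 1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    return det > 0 and tr * tr / det < (edge_r + 1) ** 2 / edge_r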
4) Orientation assignment

- To achieve invariance to rotation, gradient magnitudes and directions are computed in a neighborhood around each keypoint at its detected scale.
- The samples are weighted by a Gaussian circular window with standard deviation equal to 1.5 times the keypoint's scale.
- An orientation histogram with 36 bins covering 360 degrees is created.

- The highest peak of the histogram gives the keypoint's orientation. Any other peak above 80% of the highest one creates an additional keypoint with that orientation (see the sketch below).
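A minimal numpy sketch of the orientation histogram, assuming a square grayscale patch already extracted around the keypoint; the Gaussian weighting described above is omitted for brevity and the function name is hypothetical:

import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    # Gradient magnitude and direction at every pixel of the patch.
    dy, dx = np.gradient(patch.astype(np.float32))
    mag = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360
    # 36-bin histogram over 360 degrees, weighted by gradient magnitude.
    hist, _ = np.histogram(ang, bins=num_bins, range=(0, 360), weights=mag)
    # Keep every bin whose height is at least 80% of the highest peak.
    peaks = np.flatnonzero(hist >= peak_ratio * hist.max())
    return peaks * (360.0 / num_bins)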
5) Keypoint Descriptor
- Each interest point is described by a vector of values, the descriptor.
- To achieve invariance to rotation, a reference direction is chosen based on the direction and magnitude of the image gradient around each point (the orientation from step 4).
- The descriptor is computed relative to the keypoint's position, scale, and rotation.

- The descriptor takes a grid of subregions around the point and computes an image gradient orientation histogram for each subregion. (Standard setting: 4 x 4 subregions with 8-bin orientation histograms, giving a 4 x 4 x 8 = 128-dimensional vector; see the sketch below.)
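A simplified numpy sketch of the grid-of-histograms idea, assuming a 16 x 16 patch already rotated to the keypoint's orientation; the Gaussian weighting and trilinear interpolation of the full descriptor are omitted and the function name is hypothetical:

import numpy as np

def sift_like_descriptor(patch):
    # Split a 16x16 patch into 4x4 subregions of 4x4 pixels and build an
    # 8-bin gradient orientation histogram for each: 4 * 4 * 8 = 128 values.
    dy, dx = np.gradient(patch.astype(np.float32))
    mag = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360
    desc = []
    for i in range(0, 16, 4):
        for j in range(0, 16, 4):
            hist, _ = np.histogram(ang[i:i + 4, j:j + 4], bins=8, range=(0, 360),
                                   weights=mag[i:i + 4, j:j + 4])
            desc.append(hist)
    desc = np.concatenate(desc)
    return desc / (np.linalg.norm(desc) + 1e-7)   # normalize the 128-d vector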
6) Keypoint matching
- To match a feature in one image to a feature in another, use the ratio of the distances to the two closest matching features.
- Compute the Euclidean distance between each descriptor in one image and every descriptor in the other.
- If the ratio of the distance to the 1st-nearest match versus the 2nd-nearest is below a threshold (0.8), accept the match; otherwise reject it as ambiguous. A sketch follows below.
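A short OpenCV sketch of the ratio test (the full matching example in the Code section below uses cross-checking instead); the file names are illustrative:

import cv2

# Compute SIFT descriptors for two grayscale images.
sift = cv2.SIFT_create()
gray1 = cv2.imread('631.jpg', cv2.IMREAD_GRAYSCALE)
gray2 = cv2.imread('632.png', cv2.IMREAD_GRAYSCALE)
kp1, des1 = sift.detectAndCompute(gray1, None)
kp2, des2 = sift.detectAndCompute(gray2, None)
# k-nearest-neighbor matching, then the ratio test with threshold 0.8.
bf = cv2.BFMatcher(cv2.NORM_L2)
pairs = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in pairs if m.distance < 0.8 * n.distance]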
Code
1) Keypoints

import cv2
from matplotlib import pyplot as plt

# Detect SIFT keypoints on the grayscale image and draw them.
img = cv2.imread('building.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
sift = cv2.SIFT_create()
kp = sift.detect(gray, None)
img = cv2.drawKeypoints(gray, kp, img)
plt.figure(figsize=(10, 7))
plt.imshow(img)
plt.show()
2) Feature matching

import cv2
from matplotlib import pyplot as plt

# Load both images and convert to grayscale for SIFT.
img1 = cv2.imread('/users/sejongpyo/downloads/631.jpg')
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.imread('/users/sejongpyo/downloads/632.png')
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# SIFT keypoints and descriptors (cv2.SIFT_create in OpenCV >= 4.4;
# older builds exposed it as cv2.xfeatures2d.SIFT_create).
sift = cv2.SIFT_create()
keypoints_1, descriptors_1 = sift.detectAndCompute(gray1, None)
keypoints_2, descriptors_2 = sift.detectAndCompute(gray2, None)

# Brute-force matching with cross-checking, sorted by descriptor distance.
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = bf.match(descriptors_1, descriptors_2)
matches = sorted(matches, key=lambda x: x.distance)

# Draw the 30 best matches, skipping unmatched keypoints.
img3 = cv2.drawMatches(img1, keypoints_1, img2, keypoints_2, matches[:30],
                       None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.imshow(cv2.cvtColor(img3, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()
References
salkuma.files.wordpress.com/2014/04/sifteca095eba6ac.pdf