SIFT (Scale-Invariant Feature Transform)


SIFT

1) Overview

    - SIFT extracts features that are invariant to scale and rotation, and robust to changes in intensity.

    - SIFT features can be matched reliably across changes in 3D viewpoint and in the presence of noise.

    - SIFT includes both an interest point detector and a descriptor.

Interest point detector
- It is an algorithm that chooses points from an image based on some criterion.
- Harris, Min Eigen, and FAST are interest point detectors.

Descriptor
- It is a vector of values that describes the image patch around an interest point (raw pixel values, histogram of gradients, etc.).

(ref: https://dsp.stackexchange.com/questions/24346/what-is-the-difference-between-feature-detectors-and-feature-descriptors)

 

STEPS

1) DoG

(Image source: https://docs.opencv.org/master/da/df5/tutorial_py_sift_intro.html)

    - SIFT interest point locations are found using difference-of-Gaussian functions (DoG).

$$D(X, \sigma) = [G_{k\sigma}(X)-G_{\sigma}(X)]*I(X)=[G_{k\sigma}-G_{\sigma}]*I=I_{k\sigma}-I_{\sigma}$$

    - Before detecting interest points, the image is blurred with a Gaussian to suppress noise.

$$I_{\sigma} = G_{\sigma} * I:\ \text{the grayscale image } I \text{ blurred with a Gaussian of standard deviation } \sigma$$

$$k:\ \text{constant factor determining the separation between adjacent scales}$$

    - Interest points are the maxima and minima of D(X, σ) across both image location and scale.

DoG
- Computed by subtracting two versions of the image blurred with Gaussian kernels of different standard deviations.
- The result approximates the LoG (Laplacian of Gaussian).
- LoG requires two rounds of derivatives, so it needs more computing power.
- Furthermore, for scale invariance, LoG must be normalized by σ²; the DoG already incorporates this normalization (see the sketch below).
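
A minimal sketch of the DoG computation above, assuming only OpenCV and NumPy and reusing the 'building.jpg' file from the Code section below:

import cv2
import numpy as np

img = cv2.imread('building.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)

sigma = 1.6        # base scale (Lowe's paper uses 1.6)
k = 2 ** (1 / 3)   # scale separation for n = 3 scales per octave

# D(X, sigma) = I_{k*sigma} - I_{sigma}: subtract two blurs whose stds differ by k.
# Kernel size (0, 0) lets OpenCV derive the kernel size from sigma.
I_sigma = cv2.GaussianBlur(img, (0, 0), sigma)
I_ksigma = cv2.GaussianBlur(img, (0, 0), k * sigma)
dog = I_ksigma - I_sigma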

 

2) Find interest points (maxima and minima of D(X, σ); DoG)

(Image source: https://docs.opencv.org/master/da/df5/tutorial_py_sift_intro.html)

    - DoG images are searched for local extrema (maxima and minima) over scale and space.

    - The figure referenced above shows an example. Within one octave, each pixel is compared with its 8 neighbors in the same scale image as well as the 9 pixels in each of the two adjacent scale images (26 neighbors in total). If it is larger or smaller than all of them, it is a potential keypoint.

    - To detect extrema at n scales per octave, each octave needs n + 2 DoG images, and therefore n + 3 Gaussian-blurred images. The scale difference between adjacent images is 2^(1/n) (see the sketch after the definition below).

Octave: a group of images of the same size; each successive octave is downsampled by a factor of 2.
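
The 26-neighbor extremum test can be sketched as follows (a hypothetical is_extremum helper; dogs is a list of consecutive DoG images from one octave, built as above, and (y, x) is an interior pixel):

def is_extremum(dogs, s, y, x):
    # Compare the center value against its 3 x 3 x 3 neighborhood:
    # 8 same-scale neighbors plus 9 pixels in each adjacent scale.
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[s - 1:s + 2]])
    center = dogs[s][y, x]
    return center == cube.max() or center == cube.min()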

 

3) Interest point localization

    - Refine the potential keypoints to get more accurate results, rejecting unreliable ones (edge responses and low-contrast keypoints).

 

    (1) Elimination of edge response

        - DoG has a strong response along edges, even when the location along the edge is poorly determined.

        - A 2 × 2 Hessian matrix of the DoG image is used to compare the principal curvatures, a concept similar to the Harris corner detector; keypoints whose curvature ratio exceeds a threshold (r = 10 in the paper) are rejected as edges.

 

    (2) Elimination of low contrast keypoints

        - Use a Taylor series expansion of the scale space to get a more accurate (sub-pixel) location of each extremum.

        - If the DoG value at a potential keypoint's refined location is less than a threshold (0.03 in the paper, with intensities scaled to [0, 1]), it is rejected as a low-contrast keypoint.
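
Both rejection tests can be sketched with finite differences on a single DoG image (a hypothetical passes_tests helper using Lowe's thresholds; it omits the Taylor-series refinement step):

def passes_tests(dog, y, x, contrast_thr=0.03, r=10):
    # Low-contrast rejection (assumes dog values are scaled to [0, 1])
    if abs(dog[y, x]) < contrast_thr:
        return False
    # 2 x 2 Hessian from finite differences; an edge has one large and one
    # small principal curvature, which makes tr^2 / det large.
    dxx = dog[y, x + 1] + dog[y, x - 1] - 2 * dog[y, x]
    dyy = dog[y + 1, x] + dog[y - 1, x] - 2 * dog[y, x]
    dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
           - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4
    tr = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:                      # curvatures of opposite sign: an edge, reject
        return False
    return tr ** 2 / det < (r + 1) ** 2 / r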

 

4) Orientation assignment

    - To achieve invariance to rotation, gradient magnitudes and directions are computed in a neighborhood around each keypoint at its detected scale.

    - The samples are weighted by a Gaussian-weighted circular window with a standard deviation of 1.5 times the keypoint's scale.

    - An orientation histogram with 36 bins covering 360 degrees is created.

(Image source: https://salkuma.files.wordpress.com/2014/04/sifteca095eba6ac.pdf)

    - The highest peak of the histogram gives the keypoint's orientation. If there is another peak higher than 80% of the top peak, a keypoint is created for that orientation as well (same location and scale, different direction).
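
A rough sketch of this 36-bin histogram (a hypothetical dominant_orientations helper; patch is a square float32 region cut around the keypoint and scale is its detected scale):

def dominant_orientations(patch, scale):
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 360
    # Gaussian weighting window with std = 1.5 * scale, centered on the patch
    h, w = patch.shape
    ys, xs = np.mgrid[:h, :w]
    g = np.exp(-((ys - h / 2) ** 2 + (xs - w / 2) ** 2) / (2 * (1.5 * scale) ** 2))
    # 36 bins covering 360 degrees, weighted by gradient magnitude
    hist, _ = np.histogram(ang, bins=36, range=(0, 360), weights=mag * g)
    # Main peak plus any peak above 80% of it, returned as bin centers in degrees
    return np.flatnonzero(hist >= 0.8 * hist.max()) * 10 + 5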

 

5) Keypoint Descriptor

    - The local region around each interest point is summarized as a vector of values, which forms the interest point descriptor.

    - To achieve invariance to rotation, a reference direction is chosen based on the direction and magnitude of the image gradient around each point.

    - Compute a descriptor based on the keypoint's position, scale, and rotation.

(Image source: https://salkuma.files.wordpress.com/2014/04/sifteca095eba6ac.pdf)

    - The descriptor takes a grid of subregions around the point and, for each subregion, computes an image-gradient orientation histogram. (Standard setting: 4 × 4 subregions with 8-bin orientation histograms, giving a 4 × 4 × 8 = 128-dimensional vector.)
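
The 128-dimensional layout is easy to confirm with OpenCV (gray is any grayscale image, as in the Code section below):

sift = cv2.SIFT_create()
kp, des = sift.detectAndCompute(gray, None)
print(des.shape)   # (number_of_keypoints, 128): 4 x 4 subregions x 8 bins each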

 

6) Keypoint matching

    - To match a feature in one image to a feature in another, the ratio of the distances to the two closest matching features is used.

    - Compute the Euclidean distance between each descriptor in one image and every descriptor in the other.

    - Using the ratio of the distances (1st nearest vs. 2nd nearest), check whether the match is reliable: accept it only if the ratio is below a threshold of 0.8 (see the sketch below).
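
The ratio test is a few lines with OpenCV's knnMatch (a sketch assuming descriptors_1 and descriptors_2 are SIFT descriptor arrays as computed in the Code section below; crossCheck must stay disabled when k = 2 neighbors are requested):

bf = cv2.BFMatcher(cv2.NORM_L2)
pairs = bf.knnMatch(descriptors_1, descriptors_2, k=2)           # two nearest neighbors each
good = [m for m, n in pairs if m.distance < 0.8 * n.distance]    # Lowe's ratio test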

 

Code

1) Keypoints

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('building.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect SIFT keypoints (OpenCV >= 4.4; older builds use cv2.xfeatures2d.SIFT_create)
sift = cv2.SIFT_create()
kp = sift.detect(gray, None)

# Draw the detected keypoints on top of the image
img = cv2.drawKeypoints(gray, kp, img)

plt.figure(figsize=(10, 7))
plt.imshow(img)
plt.show()

 

2) Feature matching

img1 = cv2.imread('/users/sejongpyo/downloads/631.jpg')
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)

img2 = cv2.imread('/users/sejongpyo/downloads/632.png')
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# SIFT keypoints and 128-dimensional descriptors
sift = cv2.SIFT_create()
keypoints_1, descriptors_1 = sift.detectAndCompute(gray1, None)
keypoints_2, descriptors_2 = sift.detectAndCompute(gray2, None)

# Brute-force matching on L2 distance; crossCheck keeps only mutual best matches
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = bf.match(descriptors_1, descriptors_2)
matches = sorted(matches, key=lambda x: x.distance)

# Draw the 30 best matches (convert BGR to RGB for matplotlib)
img3 = cv2.drawMatches(img1, keypoints_1, img2, keypoints_2, matches[:30], None, flags=2)
plt.imshow(cv2.cvtColor(img3, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

References

https://salkuma.files.wordpress.com/2014/04/sifteca095eba6ac.pdf

http://programmingcomputervision.com/

https://blueskyvision.tistory.com/21

https://docs.opencv.org/master/da/df5/tutorial_py_sift_intro.html
