
Geometric Transformation

category AI/Computer Vision 2021. 4. 4. 15:50

2D Transformation

ref : SzeliskiBookDraft_20200920.pdf

2D linear transformations

1) Overview

    - A geometric transformation changes the domain of an image (where the pixels are), not its range (the pixel values).

ref : http://www.cs.cornell.edu/courses/cs5670/2019sp/lectures/lec07_transformations.pdf

 

2) Properties of linear transformations

    - Origin maps to origin

    - Lines map to lines

    - Parallel lines remain parallel

    - Ratios are preserved

    - Closed under composition
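
The properties above can be checked numerically. The following is a minimal sketch with NumPy; the matrices `R` and `S` and the test points are arbitrary values chosen for illustration, not taken from the referenced lecture.

```python
import numpy as np

# Two example 2x2 linear maps (illustrative values only).
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])                        # non-uniform scale

# Origin maps to origin.
origin_image = R @ np.zeros(2)

# Parallel lines remain parallel: two parallel direction vectors
# stay parallel (their 2D cross product stays zero).
d = np.array([1.0, 2.0])
d1, d2 = S @ d, S @ (3 * d)
cross = d1[0] * d2[1] - d1[1] * d2[0]

# Ratios are preserved: the midpoint of a segment maps to the midpoint.
p, q = np.array([1.0, 1.0]), np.array([3.0, 5.0])
midpoint_ok = np.allclose(S @ ((p + q) / 2), (S @ p + S @ q) / 2)

# Closed under composition: applying S then R equals the single matrix R @ S.
composition_ok = np.allclose(R @ (S @ p), (R @ S) @ p)
```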

 

3) Parametric transformation

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = T \begin{bmatrix} x \\ y \end{bmatrix}$$

    - The transformation T is a coordinate-changing machine: it takes a point (x, y) and returns a new point (x', y').

    - Scaling is the simplest example: each component is multiplied by a scalar.

ref : http://www.cs.cmu.edu/~16385/lectures/lecture7.pdf
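
As a small illustration of T acting on coordinates, the sketch below applies a 2x2 matrix to the corners of a unit square. The helper name `apply_linear` and the example matrices are assumptions made for this example.

```python
import numpy as np

def apply_linear(T, points):
    """Apply a 2x2 matrix T to an (N, 2) array of points, one point per row."""
    return np.asarray(points, dtype=float) @ np.asarray(T, dtype=float).T

# Corners of the unit square.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

# Scaling: each component is multiplied by its own scalar.
scale = np.array([[2.0, 0.0],
                  [0.0, 3.0]])
# Shear along x: x' = x + 0.5 * y, y' = y.
shear = np.array([[1.0, 0.5],
                  [0.0, 1.0]])

scaled = apply_linear(scale, square)
sheared = apply_linear(shear, square)
```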

 

Homogeneous coordinates

1) Overview

$$x' = x + t_{x}, \;\; y' = y + t_{y}$$

    - Translation is not a linear operation on 2D coordinates, so it cannot be written as a 2x2 matrix product.

    - With homogeneous coordinates, an affine or projective transformation can be expressed as a single matrix.

    - An image taken by a camera is a projection from 3D to 2D. Every point in space is projected to homogeneous coordinates.

    - In heterogeneous (ordinary Cartesian) coordinates, a 2D transformation that includes translation needs a separate addition step.

ref : http://www.cs.cornell.edu/courses/cs5670/2019sp/lectures/lec07_transformations.pdf
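
To make the first two points concrete, here is a minimal sketch of translation written as a 3x3 matrix product in homogeneous coordinates. The helper name `translation_matrix` and the numbers are illustrative assumptions.

```python
import numpy as np

def translation_matrix(tx, ty):
    """3x3 homogeneous matrix for x' = x + tx, y' = y + ty."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

# The point (x, y) = (3, 4) written with w = 1.
point = np.array([3.0, 4.0, 1.0])
moved = translation_matrix(100, 50) @ point

# Two translations compose by matrix multiplication,
# which is not possible with a 2x2 matrix on (x, y) alone.
both = translation_matrix(1, 2) @ translation_matrix(10, 20)
```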

 

2) Mathematical meaning

ref : http://www.cs.cornell.edu/courses/cs5670/2019sp/lectures/lec07_transformations.pdf

$$\begin{bmatrix} x \\ y \end{bmatrix} \Rightarrow \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\stackrel{\text{def}}{=}\begin{bmatrix} wx \\ wy \\ w \end{bmatrix}$$

    - A 2D point is represented by a 3D vector.

    - These 3D vectors are only defined up to scale, so any nonzero scale factor w is ignored.

$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} \Rightarrow (x/w, y/w)$$

    - Converting from homogeneous coordinates back to heterogeneous coordinates: divide by the last component w.
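
The two conversions can be sketched as a pair of NumPy helpers. The function names `to_homogeneous` and `from_homogeneous` are assumptions for this example, and `from_homogeneous` assumes w is nonzero.

```python
import numpy as np

def to_homogeneous(points):
    """(N, 2) array of (x, y) -> (N, 3) array of (x, y, 1)."""
    points = np.asarray(points, dtype=float)
    ones = np.ones((points.shape[0], 1))
    return np.hstack([points, ones])

def from_homogeneous(points_h):
    """(N, 3) array of (x, y, w) -> (N, 2) array of (x/w, y/w). Assumes w != 0."""
    points_h = np.asarray(points_h, dtype=float)
    return points_h[:, :2] / points_h[:, 2:3]

# (2, 4, 2) and (1, 2, 1) differ only by scale, so they are the same 2D point.
same_point = from_homogeneous([[2, 4, 2], [1, 2, 1]])
```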


Affine transformations

1) Overview

    - Combinations of linear transformations and translations.

    - Intuitively, an affine transformation maps a triangle to another triangle; three point correspondences determine it.

    - Reflection is possible.

$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
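
Since three point pairs fix the six unknowns a to f, the matrix can be recovered by solving a small linear system. The sketch below is one way to do this with NumPy; the helper name `affine_from_triangles` is an assumption, and the two triangles are the same point sets used in the affine code example later in this post. It assumes the source triangle is not degenerate (its points are not collinear).

```python
import numpy as np

def affine_from_triangles(src, dst):
    """Solve for the 3x3 affine matrix mapping three src points to three dst points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Each row is [x, y, 1]; then A @ (a, b, c) = x' and A @ (d, e, f) = y'.
    A = np.hstack([src, np.ones((3, 1))])
    params = np.linalg.solve(A, dst)   # shape (3, 2): columns (a, b, c) and (d, e, f)
    M = np.eye(3)
    M[:2, :] = params.T                # top two rows; last row stays [0, 0, 1]
    return M

src = [[200, 100], [400, 100], [200, 200]]
dst = [[200, 300], [400, 200], [200, 400]]
M = affine_from_triangles(src, dst)

# Check: mapping the source points through M reproduces the destination points.
src_h = np.hstack([np.asarray(src, dtype=float), np.ones((3, 1))])
mapped = (src_h @ M.T)[:, :2]
```

The top two rows of `M` play the same role as the 2x3 matrix that `cv2.getAffineTransform` returns.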

 

2) Properties of affine transformations

    - Origin doesn't necessarily map to origin

    - Lines map to lines

    - Parallel lines remain parallel

    - Ratios are preserved

    - Closed under composition

 

Projective transformations aka Homographies

1) Overview

    - Combinations of affine transformations and projective warps.

    - Maps points in one plane to points in another plane.

    - Used for registering images, rectifying images, texture warping, and creating panoramas.

    - Reflection is possible.

    - [[a, b], [d, e]] : rotation, scale, shearing, reflection

    - [c, f] : translation

    - [g, h] : perspective

$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

 

2) Properties of projective transformations

    - Origin doesn't necessarily map to origin

    - Lines map to lines

    - Parallel lines don't necessarily map to parallel lines

    - Ratios aren't necessarily preserved

    - Closed under composition
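
The difference from the affine case can be seen by warping two parallel segments with each kind of matrix. This is a minimal sketch; the helper names `warp_points` and `are_parallel` and the matrix entries are illustrative assumptions. Note the division by w' after the matrix product, which is what makes the projective case behave differently.

```python
import numpy as np

def warp_points(M, pts):
    """Apply a 3x3 matrix to (N, 2) points and divide by w'. Assumes w' != 0."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ np.asarray(M, dtype=float).T
    return out[:, :2] / out[:, 2:3]

def are_parallel(p0, p1, q0, q1, tol=1e-9):
    """True if segment p0-p1 is parallel to segment q0-q1 (2D cross product ~ 0)."""
    u, v = p1 - p0, q1 - q0
    return abs(u[0] * v[1] - u[1] * v[0]) < tol

# Two parallel horizontal segments: (0,0)-(1,0) and (0,1)-(1,1).
segments = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)

affine = np.array([[1.0, 0.5, 2.0],
                   [0.2, 1.0, -1.0],
                   [0.0, 0.0, 1.0]])
projective = affine.copy()
projective[2, :2] = [0.3, 0.1]   # nonzero g, h add perspective

a = warp_points(affine, segments)
p = warp_points(projective, segments)
affine_keeps_parallel = are_parallel(a[0], a[1], a[2], a[3])
projective_keeps_parallel = are_parallel(p[0], p[1], p[2], p[3])
```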

 

Abnormal projective or affine transformations (twisted or concave results due to reflection)
- Check whether a transformation behaved normally or produced a twisted or concave result.
- If any one of the following conditions holds, the transformation is treated as abnormal; they do not all have to hold.
$$D \leq 0,\;\; s_x < 0.1,\;\; s_x > 4,\;\; s_y < 0.1,\;\; s_y > 4,\;\; P > 0.002$$
$$D = ae - bd$$
$$s_x = \sqrt{a^{2} + d^{2}}$$
$$s_y = \sqrt{b^{2} + e^{2}}$$
$$P = \sqrt{g^{2} + h^{2}}$$
- D : determinant of [[a, b], [d, e]]; D <= 0 indicates reflection or twist
- s_x : x-axis scale factor; the allowed range is chosen by the programmer
- s_y : y-axis scale factor; the allowed range is chosen by the programmer
- P : degree of perspective
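
These checks translate directly into a small function. The sketch below is one possible implementation; the function name `is_abnormal_homography` and its keyword arguments are assumptions, with defaults set to the thresholds listed above. It first normalizes H so that the bottom-right entry is 1, which assumes that entry is nonzero.

```python
import numpy as np

def is_abnormal_homography(H, min_scale=0.1, max_scale=4.0, max_perspective=0.002):
    """Return True if any of the determinant, scale, or perspective checks fires."""
    H = np.asarray(H, dtype=float)
    H = H / H[2, 2]                 # assumes H[2, 2] != 0
    a, b = H[0, 0], H[0, 1]
    d, e = H[1, 0], H[1, 1]
    g, h = H[2, 0], H[2, 1]

    D = a * e - b * d               # determinant of [[a, b], [d, e]]
    sx = np.hypot(a, d)             # x-axis scale factor
    sy = np.hypot(b, e)             # y-axis scale factor
    P = np.hypot(g, h)              # degree of perspective

    return bool(
        D <= 0
        or sx < min_scale or sx > max_scale
        or sy < min_scale or sy > max_scale
        or P > max_perspective
    )
```

For example, the identity matrix passes all checks, while `np.diag([-1.0, 1.0, 1.0])` (a reflection) is flagged because its determinant is negative.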

 

The direct linear transform (DLT) -> determining the homography matrix

1) Overview

    - Homographies can be computed directly from corresponding points in two images.

    - Each point correspondence gives 2 equations, one each for the x and y coordinates.

    - DLT computes H from 4 or more correspondences.

 

2) Mathematical formulation
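
As a sketch of where the two equations per correspondence come from, write the homography with a free bottom-right entry i (an assumption made here so the system is homogeneous; the matrix form used earlier fixes i = 1), and let (u, v) = (x'/w', y'/w') denote the matched point in the second image. The names u, v, and i are notation introduced for this sketch.

$$u = \frac{ax + by + c}{gx + hy + i}, \;\; v = \frac{dx + ey + f}{gx + hy + i}$$

Multiplying both sides by the denominator and moving everything to one side gives two equations that are linear in the unknown entries:

$$-ax - by - c + u(gx + hy + i) = 0$$
$$-dx - ey - f + v(gx + hy + i) = 0$$

In matrix form, with the unknowns collected in a vector:

$$\begin{bmatrix} -x & -y & -1 & 0 & 0 & 0 & ux & uy & u \\ 0 & 0 & 0 & -x & -y & -1 & vx & vy & v \end{bmatrix}\mathbf{h} = \mathbf{0}, \;\; \mathbf{h} = \begin{bmatrix} a & b & c & d & e & f & g & h & i \end{bmatrix}^{T}$$

H has 8 degrees of freedom because it is defined only up to scale, so 4 correspondences (8 equations) are the minimum.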

 

    - By stacking the equations from all corresponding points, a least-squares solution for H can be found using singular value decomposition (SVD).
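
A minimal NumPy sketch of this stacking-and-SVD step is shown below. The function name `dlt_homography` is an assumption, and the sketch skips the coordinate normalization that is usually recommended for better numerical conditioning. It also assumes the recovered bottom-right entry is nonzero so that H can be scaled to the form with 1 in that position.

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H (3x3) such that dst ~ H @ src from N >= 4 point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape[0] < 4 or src.shape != dst.shape:
        raise ValueError("need at least 4 matching (x, y) pairs")

    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence, linear in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)

    # The least-squares solution of A h = 0 with |h| = 1 is the right
    # singular vector belonging to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]              # assumes H[2, 2] != 0

# Usage: recover a known homography from five synthetic correspondences.
H_true = np.array([[1.2, 0.1, 0.5],
                   [-0.2, 0.9, 0.3],
                   [0.1, 0.2, 1.0]])
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.3]], dtype=float)
src_h = np.hstack([src, np.ones((len(src), 1))])
proj = src_h @ H_true.T
dst = proj[:, :2] / proj[:, 2:3]
H_est = dlt_homography(src, dst)
```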

 

Determining unknown image warps

    - Forward warping: if a pixel (x', y') lands between pixels, distribute its contribution to several neighboring pixels and normalize later (splatting).

    - If holes still remain, use inverse warping: for each destination pixel, resample the color value from the interpolated source image.
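
The second approach can be sketched as follows for a grayscale image: each destination pixel is mapped back through the inverse of H and the source image is sampled with bilinear interpolation. The function name `inverse_warp` is an assumption. For simplicity, pixels whose 2x2 sampling neighborhood falls outside the source are set to 0, and the sketch assumes H is invertible and w is nonzero for every pixel.

```python
import numpy as np

def inverse_warp(src, H, out_shape):
    """Warp a 2D grayscale image by H using inverse mapping and bilinear sampling."""
    src = np.asarray(src, dtype=float)
    h_out, w_out = out_shape

    # Homogeneous coordinates (x, y, 1) of every destination pixel.
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dst_h = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])

    # Map destination pixels back into the source image.
    src_h = np.linalg.inv(H) @ dst_h
    sx = src_h[0] / src_h[2]
    sy = src_h[1] / src_h[2]

    # Integer corner and fractional offset for bilinear interpolation.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    fx, fy = sx - x0, sy - y0

    # Keep only pixels whose 2x2 neighborhood lies fully inside the source.
    valid = (x0 >= 0) & (y0 >= 0) & (x0 + 1 < src.shape[1]) & (y0 + 1 < src.shape[0])
    x0c = np.clip(x0, 0, src.shape[1] - 2)
    y0c = np.clip(y0, 0, src.shape[0] - 2)

    top = (1 - fx) * src[y0c, x0c] + fx * src[y0c, x0c + 1]
    bottom = (1 - fx) * src[y0c + 1, x0c] + fx * src[y0c + 1, x0c + 1]
    out = np.where(valid, (1 - fy) * top + fy * bottom, 0.0)
    return out.reshape(h_out, w_out)

# Usage: shift a tiny 4x4 test image one pixel to the right.
image = np.arange(16, dtype=float).reshape(4, 4)
shift_right = np.array([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])
shifted = inverse_warp(image, shift_right, (4, 4))
```

Because every destination pixel is visited exactly once, this direction leaves no holes, unlike forward warping.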

 

Code

1) Scaling

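import cv2
import matplotlib.pyplot as plt

img = cv2.imread('/Users/sejongpyo/downloads/yubi.jpg')

# shrink to half size: dsize is None, so the output size comes from fx and fy
res = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_CUBIC)

# zoom to double size: dsize is given explicitly as (width, height)
height, width = img.shape[:2]
res = cv2.resize(img, (2 * width, 2 * height), interpolation=cv2.INTER_CUBIC)

# OpenCV loads images as BGR; convert to RGB for matplotlib
plt.imshow(cv2.cvtColor(res, cv2.COLOR_BGR2RGB))
plt.show()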

 

2) Translation

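import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('/Users/sejongpyo/downloads/yubi.jpg', 0)  # 0: load as grayscale

rows, cols = img.shape[:2]
# M = [[1, 0, tx], [0, 1, ty]]: shift by tx along the x-axis and ty along the y-axis
M = np.float32([[1, 0, 100], [0, 1, 50]])
dst = cv2.warpAffine(img, M, (cols, rows))  # dsize is (width, height)

plt.imshow(dst, cmap='gray')
plt.show()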

 

3) Rotation

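import cv2
import matplotlib.pyplot as plt

img = cv2.imread('/Users/sejongpyo/downloads/yubi.jpg', 0)  # 0: load as grayscale

rows, cols = img.shape[:2]

# rotate 90 degrees about the image center and scale by 0.5
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 90, 0.5)

dst = cv2.warpAffine(img, M, (cols, rows))  # dsize is (width, height)

plt.imshow(dst, cmap='gray')
plt.show()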

 

4) Affine Transformation

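import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('/Users/sejongpyo/downloads/yubi.jpg')
rows, cols, ch = img.shape

# three source points and where each should land in the output
pts1 = np.float32([[200, 100], [400, 100], [200, 200]])
pts2 = np.float32([[200, 300], [400, 200], [200, 400]])

# mark the source points on the image (colors are in BGR order)
cv2.circle(img, (200, 100), 10, (255, 0, 0), -1)
cv2.circle(img, (400, 100), 10, (0, 255, 0), -1)
cv2.circle(img, (200, 200), 10, (0, 0, 255), -1)

# 2x3 affine matrix determined by the three point pairs
M = cv2.getAffineTransform(pts1, pts2)

dst = cv2.warpAffine(img, M, (cols, rows))

# convert BGR to RGB so matplotlib shows the true colors
plt.subplot(121), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.subplot(122), plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))
plt.show()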

 

5) Perspective transformation

import numpy as np
import cv2

coordi = []

def get_coordinates(event, x, y, flags, params):
    '''
    get coordinates of an image by clicking points
    '''
    global coordi
    
    # left top -> right top -> left bottom -> right bottom
    if event == cv2.EVENT_LBUTTONDOWN:
        coordi.append([x, y])
        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
        cv2.imshow('image', img)


img = cv2.imread('museum.jpg')
cv2.imshow('image', img)
cv2.setMouseCallback('image', get_coordinates)
cv2.waitKey(0)  # click the four corners, then press any key

# getPerspectiveTransform needs exactly four points
coor = np.float32(coordi[:4])

# output size from the clicked edge lengths; dsize must be integers
width = int(np.linalg.norm(coor[1] - coor[0]))
height = int(np.linalg.norm(coor[2] - coor[0]))

move = np.float32([[0, 0], [width, 0], [0, height], [width, height]])

M = cv2.getPerspectiveTransform(coor, move)
dst = cv2.warpPerspective(img, M, (width, height))
cv2.imshow('dst', dst)

cv2.waitKey(0)
cv2.destroyAllWindows()

ref.

darkpgmr.tistory.com/79?category=460965

www.cs.cornell.edu/courses/cs5670/2019sp/lectures/lec07_transformations.pdf

www.cs.cmu.edu/~16385/lectures/lecture7.pdf
