[Data Mining] Introduction to Numpy part.2

Broadcasting

Numpy broadcasting is a powerful feature that makes operations between arrays of different shapes possible.

 

  • Through broadcasting, Numpy conceptually stretches the smaller array to the shape of the larger one so that element-wise operations can be performed. This enables efficient vectorized operations without explicit loops.
  • The term broadcasting describes how Numpy treats arrays with different shapes during arithmetic operations.
  • Subject to certain constraints, the smaller array is "broadcast" across the larger array so that the two have compatible shapes. Shapes are compared from the trailing dimension backward; two dimensions are compatible when they are equal or when one of them is 1.

Examples

A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5
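The shape table above can be checked directly. The following is a minimal sketch, assuming arrays of zeros as stand-ins for A and B (the names `cases` and `results` are only illustrative); it adds each pair and records the resulting shape, then shows that an incompatible pair raises an error.

```python
import numpy as np

# Shape pairs taken from the table above: (shape of A, shape of B, expected result)
cases = [
    ((5, 4), (1,), (5, 4)),
    ((5, 4), (4,), (5, 4)),
    ((15, 3, 5), (15, 1, 5), (15, 3, 5)),
    ((15, 3, 5), (3, 5), (15, 3, 5)),
    ((15, 3, 5), (3, 1), (15, 3, 5)),
]

results = []
for shape_a, shape_b, expected in cases:
    A = np.zeros(shape_a)
    B = np.zeros(shape_b)
    results.append((A + B).shape)
    print(shape_a, "+", shape_b, "->", results[-1])

# Trailing dimensions 4 and 3 are neither equal nor 1, so this pair is incompatible
try:
    np.zeros((5, 4)) + np.zeros(3)
    incompatible_failed = False
except ValueError:
    incompatible_failed = True
print("incompatible shapes raise ValueError:", incompatible_failed)
```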

 

np.array([[1,2],[3,4]]) + np.array([[10]])
array([[11, 12],
       [13, 14]])

 

np.array([[1,2],[3,4]]) + np.array([[10,100]])
array([[ 11, 102],
       [ 13, 104]])

 

A = np.array([[1,2]])
B = np.array([[10],[100]])
print(A.shape, B.shape)
C = A + B
C

 

(1, 2) (2, 1)
array([[ 11,  12],
       [101, 102]])

 

X = np.array([[1]]*3) + np.array([[0]*10]) # 3 * 1, 1 * 10
X
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

 

  • np.array([[1]]*3) repeats the inner list [1] three times inside the outer list and turns the result into a 2-d array.
  • Its shape is (3, 1) and its value is [[1], [1], [1]].
  • np.array([[0]*10]) repeats 0 ten times to build a single row of length 10, wrapped in a 2-d array.
  • Its shape is (1, 10) and its value is [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]].
  • Broadcasting (3, 1) + (1, 10) gives X, a 3 x 10 array in which the row [1, 1, 1, 1, 1, 1, 1, 1, 1, 1] is repeated 3 times.

 

# Create array a (shape 3x1)
a = np.array([[1], [2], [3]])

# Transpose a and store it in b (shape 1x3)
b = a.T

# Add a and b; broadcasting expands both operands to 3x3
result = a + b

# Show the resulting array
result
array([[2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])

Meshgrid

numpy.meshgrid is a function that generates multi-dimensional grid coordinates.
  • It is typically useful for building a coordinate system on a 2-d plane or in 3-d space.
  • It is mainly used for plotting functions or visualizing multi-dimensional data.

Using meshgrid

  • Given two 1-d coordinate arrays, numpy.meshgrid returns two 2-d arrays.
  • Together, the returned arrays make up the coordinate grid: one holds the first coordinate of every grid point and the other holds the second.
  • Let's build a D x N mesh grid for vectorized evaluation.
v = np.array([10,20,30])   # N
w = np.array([5,6])        # D
X, Y = np.meshgrid(v, w)
X + Y
array([[15, 25, 35],
       [16, 26, 36]])
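For simple arithmetic like X + Y, the same grid can be obtained through broadcasting alone. The sketch below reuses v and w from the example above and compares the two approaches; the names `grid_sum` and `broadcast_sum` are introduced here only for illustration.

```python
import numpy as np

v = np.array([10, 20, 30])   # N = 3
w = np.array([5, 6])         # D = 2

# meshgrid materializes two full (D, N) arrays
X, Y = np.meshgrid(v, w)
grid_sum = X + Y

# Broadcasting: (1, N) + (D, 1) -> (D, N), without building X and Y explicitly
broadcast_sum = v[np.newaxis, :] + w[:, np.newaxis]

print(X)   # each row is a copy of v
print(Y)   # each column is a copy of w
print(np.array_equal(grid_sum, broadcast_sum))
```

meshgrid is still convenient when the coordinate arrays themselves are needed, for example as inputs to a plotting function.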

Axis ordering

By definition, the axis number of a dimension is the index of that dimension within the array's shape.
  • It is also the position used to access that dimension during indexing.
  • For example, if a 2D array a has shape (5,6), its elements range from a[0,0] to a[4,5].
    • Axis 0 is therefore the first dimension ("rows") and axis 1 is the second dimension ("columns").
    • In higher dimensions, where "row" and "column" stop making sense, think of axes in terms of the shapes and indices involved.
  • For example, np.sum(axis=n) collapses and removes dimension n, and each value in the new array is the sum of the values that were collapsed.
    • For example, if b has shape (5,6,7,8) and c = b.sum(axis=2), then axis 2 (the dimension of size 7) is collapsed and the result has shape (5,6,8).
    • Furthermore, c[x,y,z] equals the sum of all elements b[x,y,:,z].
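The b and c example above can be verified in a few lines. This is a small sketch: the contents of b and the indices x, y, z are arbitrary choices made here for the check.

```python
import numpy as np

# Any array of shape (5, 6, 7, 8) works; consecutive integers are used for convenience
b = np.arange(5 * 6 * 7 * 8).reshape(5, 6, 7, 8)

c = b.sum(axis=2)            # axis 2 (size 7) is collapsed
print(c.shape)               # (5, 6, 8)

x, y, z = 1, 2, 3            # arbitrary indices chosen for the check
print(c[x, y, z] == b[x, y, :, z].sum())   # True

# keepdims=True keeps the reduced axis with length 1 instead of removing it
kept = b.sum(axis=2, keepdims=True)
print(kept.shape)            # (5, 6, 1, 8)
```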

 

X = np.array([[0,0,0], [1,1,1]])
X.shape

# axis 0 is row; axis 1 is column

# Result: (2, 3)
X.sum(axis=0) # dimension 0 is collapsed and removed, i.e. the sum aggregates over dimension 0

# Result: array([1, 1, 1])
X.sum(axis=1) # dimension 1 is collapsed and removed, i.e. the sum aggregates over dimension 1

# Result: array([0, 3])
# Create a 1-d array of the integers from 1 to 24.
X = np.array(range(1, 24 + 1))

# Reshape X to (2, 3, 4), turning it into a 3-d array.
# The array now consists of two 3x4 matrices.
X = X.reshape(2, 3, 4)

# Display the reshaped 3-d array X.
X
X.shape

# (2, 3, 4)

 

X.sum(axis=0)
array([[14, 16, 18, 20],
       [22, 24, 26, 28],
       [30, 32, 34, 36]])

 

X is the 3-d array created with np.arange(1, 25).reshape(2, 3, 4).

 

  • X[0]: the first 3x4 matrix
[[ 1,  2,  3,  4],
 [ 5,  6,  7,  8],
 [ 9, 10, 11, 12]]
  • X[1]: the second 3x4 matrix
[[13, 14, 15, 16],
 [17, 18, 19, 20],
 [21, 22, 23, 24]]

 

  • The sum is taken along axis=0, i.e. X[0] and X[1] are added element by element.
  • First column: [1+13, 5+17, 9+21] = [14, 22, 30]
  • Second column: [2+14, 6+18, 10+22] = [16, 24, 32]
  • Third column: [3+15, 7+19, 11+23] = [18, 26, 34]
  • Fourth column: [4+16, 8+20, 12+24] = [20, 28, 36]

 

axis=0 refers to the first axis of an array. Each axis corresponds to one dimension of the array, and the axis argument is the index of that axis.

 

  • For a 2-d array:
  • axis=0 is the row axis. Summing along axis=0 moves down the rows, so the values in each column are added together.
  • The result therefore contains one sum per column.

 

X.sum(axis=1)
array([[15, 18, 21, 24],
       [51, 54, 57, 60]])
  • X.sum(axis=1) sums the elements that sit in the same column within each 'layer'.
  • The column sums for each 'layer' are:
    • First 'layer':
      • Sum of the first column: 1 + 5 + 9 = 15
      • Sum of the second column: 2 + 6 + 10 = 18
      • Sum of the third column: 3 + 7 + 11 = 21
      • Sum of the fourth column: 4 + 8 + 12 = 24
    • Second 'layer':
      • Sum of the first column: 13 + 17 + 21 = 51
      • Sum of the second column: 14 + 18 + 22 = 54
      • Sum of the third column: 15 + 19 + 23 = 57
      • Sum of the fourth column: 16 + 20 + 24 = 60

 

X.sum(axis=2)
array([[10, 26, 42],
       [58, 74, 90]])
  • X.sum(axis=2) sums the elements in each row of each 'layer':
  • First 'layer':
    • Sum of the first row: 1 + 2 + 3 + 4 = 10
    • Sum of the second row: 5 + 6 + 7 + 8 = 26
    • Sum of the third row: 9 + 10 + 11 + 12 = 42
  • Second 'layer':
    • Sum of the first row: 13 + 14 + 15 + 16 = 58
    • Sum of the second row: 17 + 18 + 19 + 20 = 74
    • Sum of the third row: 21 + 22 + 23 + 24 = 90

 

X.sum(axis=(1,2))
array([ 78, 222])
  • X.sum(axis=(1,2)) computes the total of all elements, over every row and column, within each 'layer':
    • Sum of the first 'layer':
    • (1 + 2 + 3 + 4) + (5 + 6 + 7 + 8) + (9 + 10 + 11 + 12) = 10 + 26 + 42 = 78
    • Sum of the second 'layer':
    • (13 + 14 + 15 + 16) + (17 + 18 + 19 + 20) + (21 + 22 + 23 + 24) = 58 + 74 + 90 = 222
    • As a result, X.sum(axis=(1,2)) is the array [78, 222].

 

X.sum(axis=(0,1,2))

# Result: 300
  • X.sum(axis=(0,1,2)) computes the total of all elements in the array:
    • First 'layer': (1 + 2 + 3 + 4) + (5 + 6 + 7 + 8) + (9 + 10 + 11 + 12) = 78
    • Second 'layer': (13 + 14 + 15 + 16) + (17 + 18 + 19 + 20) + (21 + 22 + 23 + 24) = 222
    • Sum over all 'layers': 78 + 222 = 300
    • X.sum(axis=(0,1,2)) is therefore 300, a scalar equal to the sum of every element in the array.
# Declare two 3-d points X and Y
X = np.array([0, 0, 0])  # first 3-d point X
Y = np.array([1, 1, 1])  # second 3-d point Y

# Compute the Euclidean distance between the points X and Y
distance = np.sqrt(np.sum((X - Y)**2))  # square the difference vector (X - Y), sum, then take the square root

print(distance)  # print the computed distance

# Result: 1.7320508075688772

 

import numpy as np

# Create a 2x3 array in descending order
X = np.arange(2 * 3, 0, -1).reshape(2, 3)
print(X)  # print the array

print()  # blank line

# Sort the array along axis=0
print("axis=0\n", np.sort(X, axis=0))

print()  # blank line

# Sort the array along axis=-1 (the same as axis=1 here)
print("axis=-1\n", np.sort(X, axis=-1))

print()  # blank line

# The default axis is -1 (axis=1 here), so this prints the same result
print("default is -1\n", np.sort(X))

print()  # blank line

# axis=None flattens the whole array to 1-d before sorting
print("axis=None\n", np.sort(X, axis=None))

[[6 5 4]
 [3 2 1]]

axis=0
 [[3 2 1]
 [6 5 4]]

axis=-1
 [[4 5 6]
 [1 2 3]]

default is -1
 [[4 5 6]
 [1 2 3]]

axis=None
 [1 2 3 4 5 6]
  • X: a 2x3 array, with 2 rows and 3 columns.
    • It is built with np.arange(2*3, 0, -1).reshape(2,3), which lays out the numbers 6 down to 1 in a 2x3 array.
  • Sort results:
    • axis=0: the array is sorted along axis 0, so each column is sorted independently. Sorting each column of X gives:
      • First column: [3, 6]
      • Second column: [2, 5]
      • Third column: [1, 4]
    • axis=-1 or axis=1: the array is sorted along the last axis, so each row is sorted independently. Sorting each row of X gives:
      • First row: [4, 5, 6]
      • Second row: [1, 2, 3]
    • axis=None: the array is flattened to 1-d and then sorted. Flattening X and sorting gives [1, 2, 3, 4, 5, 6].
  • Summary
    • axis=0: sorts within each column.
    • axis=1 or axis=-1: sorts within each row.
    • axis=None: flattens the array to 1-d, then sorts everything.

sort vs argsort vs partition vs argpartition

  • argmin, argmax, …
# Initialize the array `X`
X = np.array([4,10,1,20,45,100,2,1])
print('X =\n', X)

# Sort the elements of `X` in ascending order
print('sorted =\n', np.sort(X))

# Indices of the original elements, in the order that would sort `X`
print('argsorted =\n', np.argsort(X))

# Partially sort `X` around index 3: the element at index 3 ends up where it would be in a full sort
# Everything before it is less than or equal to it, everything after is greater than or equal to it; the order within each side is undefined
print('partitioned first 3 =\n', np.partition(X, 3))

# Indices that would produce the partition of `X` around index 3
print('argpartitioned first 3 =\n', np.argpartition(X, 3))

# Partially sort `X` around index -3 (third from the end)
# The last 3 elements are the 3 largest values; the order within each side is undefined
print('partitioned last 3=\n', np.partition(X, -3))

# Indices that would produce the partition of `X` around index -3
print('argpartitioned last 3=\n', np.argpartition(X, -3))
X =
 [  4  10   1  20  45 100   2   1]
sorted =
 [  1   1   2   4  10  20  45 100]
argsorted =
 [2 7 6 0 1 3 4 5]
partitioned first 3 =
 [  2   1   1   4  45 100  10  20]
argpartitioned first 3 =
 [6 7 2 0 4 5 1 3]
partitioned last 3=
 [  2   1   1   4  10  20  45 100]
argpartitioned last 3=
 [6 7 2 0 1 3 4 5]
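A common use of argpartition is picking the k largest values without sorting the whole array. The sketch below is one way to do this, assuming k = 3 and reusing the array X from above; `top_idx` is a name introduced for illustration. Only the k selected candidates are sorted at the end.

```python
import numpy as np

X = np.array([4, 10, 1, 20, 45, 100, 2, 1])
k = 3  # number of largest values wanted (illustrative choice)

# Indices of the k largest values; their relative order is not guaranteed
top_idx = np.argpartition(X, -k)[-k:]

# Sort only those k candidates, largest first
top_idx = top_idx[np.argsort(X[top_idx])[::-1]]

print(top_idx)      # indices of the 3 largest values, largest first
print(X[top_idx])   # the values themselves
```

For large arrays and small k this avoids a full sort, since the partition step only guarantees which elements land on each side of the pivot position.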

 

Lab.

Sort the 2-d array T along axis 0, using the sum of the elements along axis 1 as the sort key.
T = np.array([[2,2],[-1,10],[0,1]])  # initialize the 2-d array T
I = np.argsort(np.sum(T, axis=1))    # sum each row along axis 1, then get the indices that sort those sums in ascending order
T[I, :]                             # reorder the rows of T using the sorted indices

 

  • T = np.array([[2,2],[-1,10],[0,1]]) initializes the 2-d array T, which looks like this:
[[ 2,  2],
 [-1, 10],
 [ 0,  1]]
  • The statement I = np.argsort(np.sum(T, axis=1)) has two parts:
    • np.sum(T, axis=1) computes the sum of each row of T.
    • The row sums are therefore [4, 9, 1].
  • np.argsort(...) returns the indices that would sort the given array in ascending order.
    • Sorting [4, 9, 1] in ascending order gives [1, 4, 9], and the corresponding indices in the original array are [2, 0, 1].
  • T[I, :] reorders the rows of T according to I.
    • I is [2, 0, 1], so the rows of T are rearranged in that index order.
    • This means:
      • Row 2 of T ([0, 1]) moves to the first position.
      • Row 0 of T ([2, 2]) moves to the second position.
      • Row 1 of T ([-1, 10]) moves to the third position.
  • As a result, T[I, :] produces the following array:
[[ 0,  1],
 [ 2,  2],
 [-1, 10]]

  • This array is the original T with its rows reordered from the smallest row sum to the largest.
  • The first row has the smallest sum ([0, 1] sums to 1), next comes [2, 2] with a sum of 4, and last [-1, 10] with a sum of 9, which is ascending order.
array([[ 0,  1],
       [ 2,  2],
       [-1, 10]])

Vectorized Function

A vectorized function is a function that operates on whole arrays, applying its computation element by element.
  • Writing vectorized functions in Numpy pays off in both performance and code brevity.
  • Numpy's vectorization lets you run an operation over an entire array without writing a loop.
  • It is similar to the map function.
import math

# Find the value with the largest absolute value among the given values.
# Given values: 1, 2, 3, 4, 5, -100
# The lambda passed as the key argument compares the values by absolute value.
max_value = max(1, 2, 3, 4, 5, -100, key=lambda x: math.fabs(x))

# Returns the value with the largest absolute value (-100 in this case).
print(max_value)

# Result: -100
from functools import partial
import math

# partial is imported from the functools module above.

# Partially apply max with a key lambda so that the maximum is chosen by absolute value.
mymax = partial(max, key=lambda x: math.fabs(x))

# From now on, every call to mymax applies the key lambda automatically.
# First, plain max: compare the lists [10, 2, 3] and [4, 5, 6] position by position and return the larger value each time.
result = list(map(max, [10,2,3], [4,5,6]))

# The result is [10, 5, 6]: the larger value at each index.
print(result)  # [10, 5, 6]
  • map pairs up the corresponding elements of the two lists and passes each pair to max.
  • That is, max compares the elements at each index and returns the larger one.
    • First pair: max(10, 4) compares 10 and 4 and returns the larger value, 10.
    • Second pair: max(2, 5) compares 2 and 5 and returns the larger value, 5.
    • Third pair: max(3, 6) compares 3 and 6 and returns the larger value, 6.
list(map(mymax, [-10,2,3], [4,5,-6]))

# Result: [-10, 5, -6]
u = np.array([100,2,3,4])
v = np.array([1,2,3,4])
w = np.array([4,3,2,1])
np.vectorize(max)(u, v, w)

# array([100,   3,   3,   4])
dist = np.vectorize(lambda x, y: np.sqrt(x**2 + y**2))
dist(v, w)

# array([4.12310563, 3.60555128, 3.60555128, 4.12310563])
  • As shown above, np.vectorize wraps the lambda so that it is applied to each pair of elements, and the results are returned as an array. The computed values are:
    • Distance for the first pair: sqrt(v[0]**2 + w[0]**2)
    • Distance for the second pair: sqrt(v[1]**2 + w[1]**2)
    • Distance for the third pair: sqrt(v[2]**2 + w[2]**2)
    • Distance for the fourth pair: sqrt(v[3]**2 + w[3]**2)
The result is [4.12310563, 3.60555128, 3.60555128, 4.12310563].
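Note that np.vectorize is documented as a convenience wrapper that is essentially a Python loop, so it does not bring the speed of native array operations. When an equivalent ufunc or array expression exists, it can replace the wrapped function. The sketch below reproduces the two results above that way; the variable names are illustrative.

```python
import numpy as np

u = np.array([100, 2, 3, 4])
v = np.array([1, 2, 3, 4])
w = np.array([4, 3, 2, 1])

# Element-wise maximum across the three arrays using a ufunc reduction
elementwise_max = np.maximum.reduce([u, v, w])
print(elementwise_max)

# sqrt(x**2 + y**2) written directly on whole arrays
dist_direct = np.sqrt(v**2 + w**2)

# np.hypot computes the same quantity in one call
dist_hypot = np.hypot(v, w)
print(dist_direct)
print(np.allclose(dist_direct, dist_hypot))
```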

 

# Compute vectorized distances of 3D points from the origin.
import numpy as np

def calculate_euclidean_distances(points):
    # Compute the Euclidean distance from the origin for each 3D point in the input array.
    # points: 2-d array in which each row is a 3D point [x, y, z].
    
    # Square every element of points, then sum along axis 1 to get x^2 + y^2 + z^2 for each point.
    squared_sum = np.sum(np.square(points), axis=1)
    
    # Take the square root of the sum of squares to get each point's Euclidean distance.
    distances = np.sqrt(squared_sum)
    
    return distances

# Example usage:
# Define an array of 3D points.
points = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [-1, -2, -3]
])

# Compute the Euclidean distances from the origin (0, 0, 0).
distances = calculate_euclidean_distances(points)

# Print the distances.
print(distances)
  • calculate_euclidean_distances takes points, a 2-d array in which each row is a 3D point [x, y, z].
    • Inside the function, every element of points is squared and the squares are summed along axis 1 (x^2 + y^2 + z^2 for each point).
    • The square root of each sum of squares gives the Euclidean distance for that point.
    • The function returns the distance from the origin to each point.
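The same per-point distances can also be obtained with np.linalg.norm and an axis argument. The sketch below compares it with the manual computation on the points used above; `manual` and `with_norm` are names introduced for illustration.

```python
import numpy as np

points = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [-1, -2, -3],
])

# Manual version: square, sum over the coordinate axis, square root
manual = np.sqrt(np.sum(np.square(points), axis=1))

# Equivalent: the vector 2-norm of each row
with_norm = np.linalg.norm(points, axis=1)

print(with_norm)
print(np.allclose(manual, with_norm))
```

The first and last points are mirror images of each other through the origin, so their distances come out equal (both sqrt(14)).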

Numpy linear algebra

Numpy provides many functions that support linear algebra operations.
import numpy as np

# Create a 2 x 3 random matrix
X = np.random.randn(2, 3)
print(X)  # print matrix X

# Transpose of X
print(X.T)  # print the transpose of X

# Create a random vector of size 3
y = np.random.randn(3)
print(y)  # print vector y

# Matrix-vector product of X and y
print(X.dot(y))  # print the matrix-vector product

# Product of X and y using the np.dot() function
print(np.dot(X, y))  # print the matrix-vector product (same as X.dot(y))

# Matrix-matrix product of X and its transpose (X.T)
print(X.dot(X.T))  # print the matrix-matrix product

# Matrix-matrix product of the transpose (X.T) and X
print(X.T.dot(X))  # print the matrix-matrix product
[[-0.67521745 -0.25112232 -0.53902013]
 [-0.31444559  0.26792464 -0.91960302]]
[[-0.67521745 -0.31444559]
 [-0.25112232  0.26792464]
 [-0.53902013 -0.91960302]]
[-0.27857094  0.11301957 -0.0460988 ]
[0.1845624  0.16026873]
[0.1845624  0.16026873]
[[0.80952372 0.64072183]
 [0.64072183 1.01632936]]
[[ 0.55479463  0.08531445  0.65312091]
 [ 0.08531445  0.13484603 -0.11102433]
 [ 0.65312091 -0.11102433  1.13621241]]
y.dot(y) # dot product of the vector y with itself

0.09250029216474008
import numpy as np

# Create a random matrix X of shape 5x3
X = np.random.randn(5, 3)
print(X)

# Compute X^T * X and assign it to C. The result is a 3x3 square matrix.
C = X.T.dot(X)               
print("C = X^T * X:\n", C)

# Compute the inverse of C, which satisfies C * C^-1 = I
invC = np.linalg.inv(C)      
print("Inverse of C:\n", invC)

# Compute the determinant of C.
detC = np.linalg.det(C)      
print("Determinant of C:", detC)

# Compute the eigenvalues S and eigenvectors U of C.
S, U = np.linalg.eig(C)      
print("Eigenvalues S of C:\n", S)
print("Eigenvectors U of C:\n", U)

 

  • X: a random matrix of shape 5x3.
  • C: the square matrix obtained from X^T * X.
  • invC: the inverse of C, computed with np.linalg.inv(C).
  • detC: the determinant of C, computed with np.linalg.det(C).
  • S, U: the eigenvalues S and eigenvectors U of C, computed with np.linalg.eig(C). The eigenvectors are the columns of U.
[[ 1.00517715 -0.29554381 -1.29674166]
 [ 1.28813155  0.05589876 -0.22072513]
 [ 0.46327488  0.5101119  -1.30901555]
 [-0.68836097  0.50845609 -0.06891248]
 [-0.79042016  1.3979979  -1.01016794]]
C = X^T * X:
 [[ 3.98289246 -1.44375394 -1.34831834]
 [-1.44375394  2.56361068 -1.74409033]
 [-1.34831834 -1.74409033  4.46896842]]
Inverse of C:
 [[0.6600184  0.69051673 0.46861787]
 [0.69051673 1.25350588 0.69753544]
 [0.46861787 0.69753544 0.63737548]]
Determinant of C: 12.749408257332224
Eigenvalues S of C:
 [0.46098261 4.83874418 5.71574477]
Eigenvectors U of C:
 [[ 0.48384751  0.75707829 -0.43900347]
 [ 0.73115789 -0.6253672  -0.27262429]
 [ 0.4809363   0.18907227  0.85612613]]
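The quantities above satisfy a few identities that are easy to check numerically. The sketch below is an illustration under assumptions: it draws its own random X with a fixed seed (chosen only for reproducibility, not taken from the original run) and also tries np.linalg.eigh, which is intended for symmetric matrices such as C.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed, chosen only for reproducibility
X = rng.standard_normal((5, 3))
C = X.T @ X                          # 3x3 symmetric matrix

invC = np.linalg.inv(C)
inverse_ok = np.allclose(C @ invC, np.eye(3))        # C times its inverse is the identity

S, U = np.linalg.eig(C)
eig_ok = np.allclose(C @ U, U * S)                   # each column u of U satisfies C u = s u
det_ok = np.isclose(np.linalg.det(C), np.prod(S))    # determinant equals the product of eigenvalues

# For symmetric matrices, eigh returns the eigenvalues in ascending order
S_h, U_h = np.linalg.eigh(C)
eigh_ok = np.allclose(np.sort(S), S_h)

print(inverse_ok, eig_ok, det_ok, eigh_ok)
```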

 

import numpy as np

# Initialize the 2x2 matrix L.
L = np.array([[2, 0], [0, 1]])

# Compute the eigenvalues S and eigenvectors U of L.
S, U = np.linalg.eig(L)

# Print the computed eigenvalues S and eigenvectors U.
print("Eigenvalues S:", S)
print("Eigenvectors U:\n", U)
Eigenvalues S: [2. 1.]
Eigenvectors U:
 [[1. 0.]
 [0. 1.]]
v = np.array([1,1])
v = L.dot(v)
v

# array([2,    1])
  • The result of L.dot(v) is:
  • L.dot(v) = np.array([2*1 + 0*1, 0*1 + 1*1]), i.e. np.array([2, 1]).

The Frobenius norm

The Frobenius norm is the analogue of Euclidean length for the entries of a matrix, and is one way of measuring the size or magnitude of a matrix.
  • It is defined as the square root of the sum of the squares of all entries of the matrix.
# Create a 1-d array.
X = np.array([1, 2])

# Compute the 2-norm (Euclidean norm).
# This is the Euclidean length: square every element, add the squares, and take the square root.
print(np.linalg.norm(X))  # result: sqrt(1^2 + 2^2) = sqrt(5)

# Compute the 1-norm (Manhattan norm).
# Sums the absolute values of all elements.
print(np.linalg.norm(X, ord=1))  # result: abs(1) + abs(2) = 3

# Compute the infinity norm.
# Returns the largest absolute value among the elements (the maximum).
print(np.linalg.norm(X, ord=np.inf))  # result: max(abs(1), abs(2)) = 2

# Compute the negative-infinity norm.
# Returns the smallest absolute value among the elements (the minimum).
print(np.linalg.norm(X, ord=-np.inf))  # result: min(abs(1), abs(2)) = 1
2.23606797749979
3.0
2.0
1.0
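The example above uses a 1-d array, where the default norm is the ordinary Euclidean length. For an actual matrix, the Frobenius definition can be checked as follows; this is a small sketch with an example matrix A chosen here for illustration.

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])             # example matrix chosen for illustration

# Definition: square root of the sum of squared entries, sqrt(1 + 4 + 9 + 16) = sqrt(30)
manual = np.sqrt(np.sum(A ** 2))

# Explicit Frobenius norm
fro = np.linalg.norm(A, ord='fro')

# For 2-d input, the default (ord=None) is also the Frobenius norm
default = np.linalg.norm(A)

print(manual, fro, default)
```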

 

import math
import numpy as np

# Define two vectors x and y.
x = np.array([1, 0])  # vector x is [1, 0]
y = np.array([0, 1])  # vector y is [0, 1]

# First cosine similarity computation:
# take the dot product of x and y, compute the length of each vector, and divide.
print("cosine =", x.dot(y) / (math.sqrt(x.dot(x)) * math.sqrt(y.dot(y))))

# Second cosine similarity computation:
# take the dot product of x and y, compute the vector lengths with numpy's np.linalg.norm, and divide.
print("cosine =", x.dot(y) / (np.linalg.norm(x) * np.linalg.norm(y)))
cosine = 0.0
cosine = 0.0

LAB: distance matrix

  • In case of 1-d points
import numpy as np

pts = np.array([1., 2, 3, 4, 5])  # create a 1-d array

# np.newaxis: inserts one more axis
u = pts[:, np.newaxis]  # expand into a column; the result is a 2-d array of shape (5, 1)
v = pts.T[np.newaxis, :]  # expand into a row; the result is a 2-d array of shape (1, 5)

# Compute the absolute difference between `u` and `v`.
result = np.abs(u - v)

print(result)  # prints the 5x5 matrix of absolute differences between `u` and `v`
# The matrix is symmetric
array([[0., 1., 2., 3., 4.],
       [1., 0., 1., 2., 3.],
       [2., 1., 0., 1., 2.],
       [3., 2., 1., 0., 1.],
       [4., 3., 2., 1., 0.]])
  • pts.T denotes the transpose of the pts array.
  • Transposing reverses the order of an array's axes, which swaps rows and columns for a 2-d array. A 1-d array has only one axis, so pts.T is identical to pts here.
# pts = np.array([1., 2, 3, 4, 5])
pts.T

# array([1., 2., 3., 4., 5.])

 

  • In case of n-d points
import numpy as np

# Initialize the 2-d array `pts`. Its shape is (3, 2): three 2-d points.
pts = np.array([[1, 0], [1, 1], [0, 1]])

# Print the shape of `pts`.
print(pts.shape)

# Add a new trailing axis to create the 3-d array `u`.
# `u` has shape (3, 2, 1): point index, coordinate index, new axis of length 1.
u = pts[:, :, np.newaxis]

# Transpose `pts` and add a new leading axis to create the 3-d array `v`.
# `v` has shape (1, 2, 3): new axis of length 1, coordinate index, point index.
v = pts.T[np.newaxis, :, :]

# Result: (3,2)
import numpy as np

# np.linalg.norm(q) computes the norm of the given array q.
# A norm measures the magnitude of a vector; for a 2-d array the default is the Frobenius norm of the whole array, a single scalar.
# A separate array q is used here so that `pts`, `u`, and `v` defined above stay unchanged for the distance computation that follows.
q = np.array([[0, 0], [1, 1], [0, 0]])

np.linalg.norm(q)

# 1.4142135623730951

 

  • u = pts[:, :, np.newaxis] adds a new axis to the input array pts, turning it into an array of shape (3, 2, 1).
  • v = pts.T[np.newaxis, :, :] adds a new axis to the transpose of pts, turning it into an array of shape (1, 2, 3).
print(v.shape)
print(u.shape)

(1, 2, 3)
(3, 2, 1)
np.sqrt(np.sum((u - v)**2, axis=1))

 

 

  1. Arrays u and v:
    • u and v are 3-d arrays of shape (3, 2, 1) and (1, 2, 3) respectively.
    • u places the points of pts along the first axis, with a new axis of length 1 added at the end.
    • v is the transpose of pts with a new axis of length 1 added at the front, so the points lie along the last axis.
  2. u - v:
    • Because u and v have shapes (3, 2, 1) and (1, 2, 3), broadcasting brings them to a common shape.
    → (3 ,2, 1) (1, 2, 3) ⇒ (3, 2, 3)
    • u - v computes the coordinate-wise difference between every point in u and every point in v.
    • The result has shape (3, 2, 3): entry [i, k, j] is coordinate k of point i minus coordinate k of point j.
  3. np.sum((u - v)**2, axis=1):
    • With axis=1, the squared differences are summed over the coordinate axis (the second dimension).
    • The sum has shape (3, 3) and holds the squared distance between each pair of points.
  4. np.sqrt(np.sum((u - v)**2, axis=1)):
    • sqrt takes the square root of the summed squared differences.
    • This yields the Euclidean distances between the 3 points as an array of shape (3, 3).
array([[0.        , 1.        , 1.41421356],
       [1.        , 0.        , 1.        ],
       [1.41421356, 1.        , 0.        ]])
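An alternative layout, shown here as a sketch rather than as the original author's method, keeps the coordinates on the last axis. It avoids the transpose, and the reduction becomes axis=-1 regardless of the number of coordinates. The names `diff` and `D` are introduced for illustration.

```python
import numpy as np

pts = np.array([[1, 0], [1, 1], [0, 1]])          # shape (3, 2): three 2-d points

# (3, 1, 2) - (1, 3, 2) -> (3, 3, 2); the last axis holds the coordinate differences
diff = pts[:, np.newaxis, :] - pts[np.newaxis, :, :]

# Sum the squared differences over the coordinate axis, then take the square root
D = np.sqrt(np.sum(diff ** 2, axis=-1))           # shape (3, 3) distance matrix

print(diff.shape)
print(D)
```

Entry D[i, j] is the distance between point i and point j, so the matrix is symmetric with zeros on the diagonal.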

 

 

  • np.linalg.norm(u - v, axis=1) computes the Euclidean norm of the difference vectors between u and v, i.e. the Euclidean distance.
    • u and v are NumPy arrays of shape (3, 2, 1) and (1, 2, 3), each holding the same 2-d points. Their shapes differ, but broadcasting makes the operation possible.
    • u - v computes the difference vectors; broadcasting expands both arrays to shape (3, 2, 3).
    • axis=1 tells norm to take the vector norm along the second axis, which holds the coordinates. The result is the Euclidean distance for each pair of points.
np.linalg.norm(u - v, axis=1)
array([[0.        , 1.        , 1.41421356],
       [1.        , 0.        , 1.        ],
       [1.41421356, 1.        , 0.        ]])

 

In real applications, pairwise distances are computed with sklearn.metrics.pairwise.
  • euclidean_distances(pts): computes the Euclidean distance between each pair of rows of pts.
    • The Euclidean distance is the straight-line distance between two points. The return value is a matrix of the distances for every pair of rows in pts.
  • manhattan_distances(pts): computes the Manhattan distance between each pair of rows of pts.
    • The Manhattan distance is the sum of the absolute coordinate differences between two points, also called the taxicab distance. The return value is a matrix of the distances for every pair of rows in pts.
  • cosine_similarity(pts): computes the cosine similarity between each pair of rows of pts.
    • Cosine similarity is the cosine of the angle between two vectors; the closer it is to 1, the more similar their directions. The return value is a matrix of the similarities for every pair of rows in pts.
from sklearn.metrics.pairwise import euclidean_distances, manhattan_distances, cosine_similarity

# Compute the Euclidean distances between the rows of the 2-d point array `pts`.
print(euclidean_distances(pts))

# Compute the Manhattan distances between the rows of the 2-d point array `pts`.
print(manhattan_distances(pts))

# Compute the cosine similarities between the rows of the 2-d point array `pts`.
print(cosine_similarity(pts))
[[0.         1.         1.41421356]
 [1.         0.         1.        ]
 [1.41421356 1.         0.        ]]
[[0. 1. 2.]
 [1. 0. 1.]
 [2. 1. 0.]]
[[1.         0.70710678 0.        ]
 [0.70710678 1.         0.70710678]
 [0.         0.70710678 1.        ]]

 

What if you want to define your own distance here?
  • This code uses the pairwise_distances function to compute the infinity (Chebyshev) distance between each pair of rows of pts.
  • inf_dist is a lambda that returns the maximum absolute element-wise difference between two vectors x and y.
  • pairwise_distances(pts, metric=inf_dist) computes the infinity distance for every pair of rows in pts and returns them as a matrix.
from sklearn.metrics.pairwise import pairwise_distances

# The user-defined distance function inf_dist computes the infinity distance between two points x and y.
# The infinity distance is the maximum absolute element-wise difference between the vectors x and y.
inf_dist = lambda x, y : np.max(np.abs(x - y))

# Use pairwise_distances to compute the infinity distance for every pair of rows in the 2-d point array pts.
# The user-defined function inf_dist is passed through the metric parameter.
print(pairwise_distances(pts, metric=inf_dist))
[[0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 0.]]
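A callable metric is evaluated once per pair of rows in Python, which can be slow for many points. As a cross-check, the same infinity distance matrix can be sketched with Numpy broadcasting alone; this assumes the three points used above, and `D_inf` is a name introduced for illustration.

```python
import numpy as np

pts = np.array([[1, 0], [1, 1], [0, 1]])   # the three 2-d points used above

# Absolute coordinate differences for every pair of points: shape (3, 3, 2)
abs_diff = np.abs(pts[:, np.newaxis, :] - pts[np.newaxis, :, :])

# Infinity (Chebyshev) distance: the largest absolute difference over the coordinates
D_inf = abs_diff.max(axis=-1)

print(D_inf)
```

The result matches the matrix printed by pairwise_distances above, apart from the integer dtype.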