[CV] Implementing Object Detection with OpenCV (Part 1)

OpenCV DNN: Pros and Cons

This post looks at the pros and cons of OpenCV's Deep Neural Network (DNN) module.

  • The OpenCV library was originally developed by Intel. Its strengths and weaknesses are described below.

Advantages of OpenCV DNN

  • Inference can be implemented easily, without installing a deep learning framework.
  • Deep learning can be combined easily with the many computer vision operations and APIs that OpenCV provides.

Disadvantages of OpenCV DNN

  • GPU support is weak.
  • The DNN module originally had no NVIDIA GPU support. Support was announced in October 2019 (the CUDA backend came out of a Google Summer of Code project), but setting up and installing the environment is still difficult. Improvements are ongoing.
  • OpenCV provides no way to train a model; only inference is possible.
  • Inference speed on CPU has improved, but when NVIDIA GPU support is not available, inference is much slower than with other deep learning frameworks.

Integration with Existing Deep Learning Frameworks

  • OpenCV does not create deep learning weight models itself; it converts and loads models created in other frameworks.
  • The DNN package provides readNetFromXXX(weight model file, config file) APIs for loading models that other frameworks have saved to files. The argument order shown here is the TensorFlow one; some loaders, such as readNetFromCaffe, take the config file first.
  • The weight model file is the other framework's model file. The config file is the other framework's model configuration (config) file, converted again into a config file for the DNN package.
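As a rough illustration of the readNetFromXXX family, the sketch below picks a loader from the weight file's extension and puts the arguments in the order that loader expects. The helper names (`LOADERS`, `loader_call`, `load_net`) and the example file names are hypothetical and not from the original post; only the `cv2.dnn.readNetFrom...` function names are real OpenCV calls.

```python
from pathlib import Path

# Assumed mapping for illustration: weight-file extension -> (loader name, argument order).
LOADERS = {
    ".pb": ("readNetFromTensorflow", "weights_first"),
    ".caffemodel": ("readNetFromCaffe", "config_first"),
    ".weights": ("readNetFromDarknet", "config_first"),
    ".onnx": ("readNetFromONNX", "weights_only"),
}


def loader_call(weights_path, config_path=None):
    """Return (loader_name, positional_args) for the given model files."""
    name, order = LOADERS[Path(weights_path).suffix.lower()]
    if order == "weights_only":
        return name, (weights_path,)
    if order == "config_first":
        return name, (config_path, weights_path)
    return name, (weights_path, config_path)


def load_net(weights_path, config_path=None):
    """Load the network with OpenCV; requires the model files to exist on disk."""
    import cv2  # imported lazily so loader_call() works without OpenCV installed

    name, args = loader_call(weights_path, config_path)
    return getattr(cv2.dnn, name)(*args)


tf_call = loader_call("frozen_inference_graph.pb", "graph.pbtxt")
darknet_call = loader_call("yolov3.weights", "yolov3.cfg")
```

OpenCV also offers cv2.dnn.readNet(), which tries to detect the framework from the file extensions, so a dispatch table like this is mostly useful for making the per-framework argument order explicit.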


TensorFlow Models Supported by OpenCV

 

TensorFlow Object Detection API (opencv/opencv on github.com)


Inference Procedure with OpenCV DNN

The steps for running inference with OpenCV DNN are as follows.

 

1. Load the weight model file and the config file to create the inference network model.

import cv2

cvNet = cv2.dnn.readNetFromTensorflow('frozen_inference_graph.pb', 'graph.pbtxt')

# cv2.imread returns the image as a (rows, cols, channels) array in BGR order.
img = cv2.imread('img.jpg')
rows, cols, channels = img.shape

 

 

2. Preprocess the input image and feed it to the network.

cvNet.setInput(cv2.dnn.blobFromImage(img, size=(300, 300), swapRB=True, crop=False))

 

3. Run the inference network and extract its output.

networkOutput = cvNet.forward()

 

4. Using the detection information in the extracted output, visualize the detected objects on the original image.

for detection in networkOutput[0,0]:
    # Implement the visualization logic here: draw each detection's bounding box
    # coordinates and predicted label on the original image.
    pass
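The visualization step could be sketched roughly as follows. It assumes the output layout that OpenCV produces for TensorFlow Object Detection API models, where each row is [image_id, class_id, score, left, top, right, bottom] with coordinates normalized to 0..1. The function names, the 0.3 score threshold, and the drawing colors are illustrative choices, not part of the original post.

```python
def detections_to_boxes(network_output, rows, cols, score_threshold=0.3):
    """Convert normalized detections into pixel-coordinate boxes.

    Each detection row is assumed to be
    [image_id, class_id, score, left, top, right, bottom].
    """
    boxes = []
    for detection in network_output[0][0]:
        score = float(detection[2])
        if score < score_threshold:
            continue  # skip low-confidence detections
        class_id = int(detection[1])
        left = int(detection[3] * cols)
        top = int(detection[4] * rows)
        right = int(detection[5] * cols)
        bottom = int(detection[6] * rows)
        boxes.append((class_id, score, left, top, right, bottom))
    return boxes


def draw_boxes(img, boxes):
    """Draw each box and its class id/score on the image (needs OpenCV)."""
    import cv2  # imported lazily; only the drawing step depends on OpenCV

    for class_id, score, left, top, right, bottom in boxes:
        cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), thickness=2)
        cv2.putText(img, f"{class_id}: {score:.2f}", (left, max(top - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return img


# Synthetic output with one confident and one weak detection (made-up values).
fake_output = [[[[0, 1, 0.875, 0.25, 0.5, 0.75, 1.0],
                 [0, 3, 0.125, 0.0, 0.0, 1.0, 1.0]]]]
boxes = detections_to_boxes(fake_output, rows=100, cols=200)
```

With the real network, the call would be detections_to_boxes(networkOutput, rows, cols) followed by draw_boxes(img, boxes).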

OpenCV blobFromImage()

OpenCV's blobFromImage() preprocesses an image so that it can be fed to the network.

 

  • It resizes the image to a fixed size.
  • It scales the pixel values.
  • It can swap BGR to RGB, and it provides an option to crop the image.

  • OpenCV loads images in BGR channel order, so to work with the image in its original RGB order you need to convert it back to RGB with cvtColor().
img_bgr = cv2.imread('original_image')  # placeholder path
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
  • swapRB=True in blobFromImage() converts the BGR data loaded by cv2.imread() to RGB before it is fed to the network.
img_bgr = cv2.imread('original_image')  # placeholder path
cvNet.setInput(cv2.dnn.blobFromImage(
    img_bgr, size=(300, 300), swapRB=True, crop=False))
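To make the channel swap and the resulting layout more concrete, here is a simplified NumPy sketch of what the preprocessing does to the array. It is an approximation for illustration only: it skips the resize and crop steps, and `to_blob` is a hypothetical helper, not an OpenCV function. The scale factor and pixel values are made up.

```python
import numpy as np


def to_blob(img_bgr, scalefactor=1.0, mean=(0.0, 0.0, 0.0), swap_rb=True):
    """Approximate blobFromImage() without resizing or cropping.

    Takes an HxWx3 BGR image and returns a 1x3xHxW float32 array (NCHW layout).
    """
    img = img_bgr.astype(np.float32)
    if swap_rb:
        img = img[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    # Subtract the per-channel mean first, then apply the scale factor.
    img = (img - np.array(mean, dtype=np.float32)) * scalefactor
    chw = img.transpose(2, 0, 1)  # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis])  # add the batch dimension


# A tiny synthetic image that only has values in the blue channel (index 0 in BGR).
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[..., 0] = 200
blob = to_blob(img, scalefactor=0.5)
```

After the swap, the blue values end up in the last channel of the blob, which is the RGB order that many models trained outside OpenCV expect.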

Video Object Detection with OpenCV Video Stream Capture

This section shows how to use OpenCV's VideoCapture() API to capture a video stream frame by frame and run object detection on each captured image.

input_video = cv2.VideoCapture(input_file_path)  # Iterate frame by frame.
while True:
    has_frame, frame = input_video.read()
    if not has_frame:
        break
    # Run object detection on this frame.
  • In brief, the workflow is as follows:
  • VideoCapture initialization: cv2.VideoCapture(input_file_path) loads the video file at the given path (input_file_path) and initializes the video stream.
    • It is part of the OpenCV library and is used to capture video from a video file or a connected camera.
  • Per-frame processing: the while True loop keeps reading frames from the video stream with input_video.read(), where input_video is the instance created by VideoCapture().
    • The loop runs until the video ends or a stop condition (not shown in the diagram) is met.
  • Object detection on each frame: object detection is run on every extracted frame. This typically uses a pretrained model (for example YOLO, SSD, or Faster R-CNN) to detect and identify the objects in the frame.
  • Output visualization: after processing, the results (for example bounding boxes around detected objects and class labels) are drawn on the frame, and the frames can be displayed or saved as output.
  • Continuity: the process repeats for every new frame until the end of the video stream or until a stop condition is met.
    • This allows video data to be processed in real time or in batch for object detection.
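The frame loop described above can be sketched as a small generator. The names `iter_frames`, `process_video`, `detect_fn`, and `FakeCapture` are hypothetical; `FakeCapture` is only a stand-in that mimics the `read()` return value of `cv2.VideoCapture` so the loop can be tried without a video file.

```python
def iter_frames(capture):
    """Yield frames until read() reports that no frame is left."""
    while True:
        has_frame, frame = capture.read()
        if not has_frame:
            break
        yield frame


def process_video(input_file_path, detect_fn):
    """Run detect_fn on every frame of a video file (needs OpenCV)."""
    import cv2  # imported lazily; iter_frames() itself has no OpenCV dependency

    capture = cv2.VideoCapture(input_file_path)
    try:
        for frame in iter_frames(capture):
            detect_fn(frame)
    finally:
        capture.release()  # always free the underlying video handle


class FakeCapture:
    """Stand-in for cv2.VideoCapture, used here only to demonstrate the loop."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(["frame0", "frame1"])))
```

Separating the iteration from the detection step keeps the stop condition in one place, and any per-frame function (detection, drawing, writing to a file) can be passed in as detect_fn.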

MS-COCO Dataset Object Categories

The MS-COCO detection dataset has 80 object categories.
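A detector's numeric class IDs are usually turned into readable names with a lookup table. The sketch below uses only the first five official COCO category IDs, so the table is deliberately partial. The `id_offset` parameter is an assumption added for illustration, since some converted models number their classes with a shifted index; check the label map that ships with the model you load.

```python
# Partial, illustrative table: only the first five official COCO category IDs.
COCO_ID_TO_NAME = {
    1: "person",
    2: "bicycle",
    3: "car",
    4: "motorcycle",
    5: "airplane",
}


def label_for(class_id, id_offset=0):
    """Return the category name, or a fallback string for IDs not in the table."""
    return COCO_ID_TO_NAME.get(int(class_id) + id_offset, f"id {int(class_id)}")


labels = [label_for(1), label_for(3.0), label_for(0, id_offset=1), label_for(42)]
```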