My Dev & Engineering Repository

In this post we load a TensorFlow pretrained model file in OpenCV and run object detection on images and video.

Viewing the input image

import cv2
import matplotlib.pyplot as plt
%matplotlib inline

img = cv2.imread('../../data/image/beatles01.jpg')
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

print('image shape:', img.shape)
plt.figure(figsize=(12, 12))
plt.imshow(img_rgb)

image shape: (633, 806, 3)
<matplotlib.image.AxesImage at 0x7fda701ddc88>

Creating the inference model

We download a TensorFlow pretrained inference model (frozen graph) and its config file, then use them to create an inference model in OpenCV.

The download URLs are listed on the page linked below.

Home: Open Source Computer Vision Library (opencv/opencv, github.com)

Download the pretrained model from the link below and extract the archive.
http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz

Download the config file for the pretrained model from the link below.
https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/ssd_inception_v2_coco_2017_11_17.pbtxt


Pass the downloaded model file and config file as arguments to load the inference model with the DNN module.

According to that model list, the MobileNet-SSD v2 and Inception-SSD v2 models can run object detection on low-spec devices. It is worth comparing how the two perform.

# mkdir pretrained; cd pretrained
# wget http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz
# tar -xzf ssd_inception_v2_coco_2017_11_17.tar.gz
# wget https://raw.githubusercontent.com/opencv/opencv_extra/master/testdata/dnn/ssd_inception_v2_coco_2017_11_17.pbtxt
# cd ssd_inception_v2_coco_2017_11_17; mv ../ssd_inception_v2_coco_2017_11_17.pbtxt graph.pbtxt
# Alternatively, download the config file straight to graph.pbtxt:
# wget https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/ssd_inception_v2_coco_2017_11_17.pbtxt?raw=true -O ./graph.pbtxt

!pwd
!ls pretrained/ssd_inception_v2_coco_2017_11_17

/home/younggi.kim999/DLCV/Detection/ssd
checkpoint		   model.ckpt.data-00000-of-00001  saved_model
frozen_inference_graph.pb  model.ckpt.index
graph.pbtxt		   model.ckpt.meta

This directory holds the files of the pretrained model.
- The two used here are frozen_inference_graph.pb and graph.pbtxt.

cv_net = cv2.dnn.readNetFromTensorflow('./pretrained/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pb', 
                                     './pretrained/ssd_inception_v2_coco_2017_11_17/graph.pbtxt')

The first argument, 'frozen_inference_graph.pb', is the weight file.
The second, 'graph.pbtxt', is the config file.
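
Since the loader is given two separate paths, checking that both files exist first can make a wrong path easier to diagnose. The following is a minimal sketch, assuming the directory layout shown above; `find_model_files` is a hypothetical helper for illustration and is not part of OpenCV.

```python
from pathlib import Path


def find_model_files(model_dir):
    """Return (weights_path, config_path) as strings, or raise if one is missing.

    Assumes the layout used in this post: frozen_inference_graph.pb (weights)
    and graph.pbtxt (config) sit in the same directory.
    """
    model_dir = Path(model_dir)
    weights = model_dir / 'frozen_inference_graph.pb'
    config = model_dir / 'graph.pbtxt'
    # Collect every missing file so the error message names all of them.
    missing = [str(p) for p in (weights, config) if not p.is_file()]
    if missing:
        raise FileNotFoundError('missing model file(s): ' + ', '.join(missing))
    return str(weights), str(config)
```

The returned pair could then be passed on, for example as `cv2.dnn.readNetFromTensorflow(*find_model_files('./pretrained/ssd_inception_v2_coco_2017_11_17'))`.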

Mapping COCO class IDs to class names

# COCO class names keyed by 0-based ID.
labels_to_names_0 = {0:'person',1:'bicycle',2:'car',3:'motorcycle',4:'airplane',5:'bus',6:'train',7:'truck',8:'boat',9:'traffic light',
                    10:'fire hydrant',11:'street sign',12:'stop sign',13:'parking meter',14:'bench',15:'bird',16:'cat',17:'dog',18:'horse',19:'sheep',
                    20:'cow',21:'elephant',22:'bear',23:'zebra',24:'giraffe',25:'hat',26:'backpack',27:'umbrella',28:'shoe',29:'eye glasses',
                    30:'handbag',31:'tie',32:'suitcase',33:'frisbee',34:'skis',35:'snowboard',36:'sports ball',37:'kite',38:'baseball bat',39:'baseball glove',
                    40:'skateboard',41:'surfboard',42:'tennis racket',43:'bottle',44:'plate',45:'wine glass',46:'cup',47:'fork',48:'knife',49:'spoon',
                    50:'bowl',51:'banana',52:'apple',53:'sandwich',54:'orange',55:'broccoli',56:'carrot',57:'hot dog',58:'pizza',59:'donut',
                    60:'cake',61:'chair',62:'couch',63:'potted plant',64:'bed',65:'mirror',66:'dining table',67:'window',68:'desk',69:'toilet',
                    70:'door',71:'tv',72:'laptop',73:'mouse',74:'remote',75:'keyboard',76:'cell phone',77:'microwave',78:'oven',79:'toaster',
                    80:'sink',81:'refrigerator',82:'blender',83:'book',84:'clock',85:'vase',86:'scissors',87:'teddy bear',88:'hair drier',89:'toothbrush',
                    90:'hair brush'}
# The detection code looks up labels_to_names with the IDs the model reports, which start at 1 (1 = person).
labels_to_names = {class_id + 1: name for class_id, name in labels_to_names_0.items()}

COCO class ID and name mapping in OpenCV and TensorFlow
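
The printed results later in this post pair class ID 1 with person and 3 with car, which suggests that this TensorFlow SSD model reports 1-based COCO IDs. Below is a minimal sketch of deriving a 1-based lookup from a 0-based one. The three-entry dictionary and the names `labels_0_based`, `labels_1_based`, and `class_name` are illustrative only.

```python
# Illustrative subset of a 0-based COCO label table (not the full list).
labels_0_based = {0: 'person', 1: 'bicycle', 2: 'car'}

# Shift every key by one so that ID 1 maps to 'person', matching the IDs the model reports.
labels_1_based = {class_id + 1: name for class_id, name in labels_0_based.items()}


def class_name(class_id, table=labels_1_based):
    """Look up a class name, falling back to a placeholder for unknown IDs."""
    return table.get(class_id, 'unknown({})'.format(class_id))


print(class_name(1))   # person
print(class_name(3))   # car
print(class_name(99))  # unknown(99)
```

Using `dict.get` with a fallback avoids a `KeyError` if the network ever returns an ID that is not in the table.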

Preprocess the image, feed it to the network, run object detection, and visualize the result

# The original image (633, 806) is resized to (300, 300) when it is fed to the network.
# Bounding boxes are predicted relative to the resized image, so the original image shape is needed to map them back.
rows = img.shape[0]
cols = img.shape[1]
# cv2.rectangle() draws directly on the array it is given, so make a separate copy for drawing.
draw_img = img.copy()

# Resize the original image array to (300, 300), swap BGR to RGB, and set it as the network input.
cv_net.setInput(cv2.dnn.blobFromImage(img,  size=(300, 300), swapRB=True, crop=False))
# Run object detection and store the result in cv_out.
cv_out = cv_net.forward()
print(cv_out.shape)

# Colors for the bounding box border and the caption text (BGR order)
green_color=(0, 255, 0)
red_color=(0, 0, 255)

# Iterate over the detected objects and extract their information
for detection in cv_out[0,0,:,:]:
    score = float(detection[2])
    class_id = int(detection[1])
    # Keep only detections with a score above 0.4
    if score > 0.4:
        # Box coordinates are normalized to the 0-1 range, so scale them by the original image size
        left = detection[3] * cols
        top = detection[4] * rows
        right = detection[5] * cols
        bottom = detection[6] * rows
        # Convert class_id to a class name with labels_to_names; the model's class IDs start at 1 (1 = person).
        caption = "{}: {:.4f}".format(labels_to_names[class_id], score)

        # cv2.rectangle() draws a rectangle on draw_img; the position arguments must be integers.
        cv2.rectangle(draw_img, (int(left), int(top)), (int(right), int(bottom)), color=green_color, thickness=2)
        cv2.putText(draw_img, caption, (int(left), int(top - 5)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, red_color, 2)
        print(caption, class_id)

img_rgb = cv2.cvtColor(draw_img, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(12, 12))
plt.imshow(img_rgb)

(1, 1, 100, 7)
person: 0.9696 1
person: 0.9660 1
person: 0.8916 1
person: 0.6298 1
car: 0.8609 3
car: 0.7223 3
car: 0.7184 3
car: 0.7095 3
car: 0.5949 3
car: 0.5511 3
<matplotlib.image.AxesImage at 0x7fda6c0c92b0>

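Shape of the result array: (1, 1, 100, 7)
- This is the output shape of the object detection network.
- (1, 1, 100, 7) means batch size 1, 1 channel, 100 detection slots, and 7 values per detection (image ID, class ID, confidence score, and the four bounding box coordinates left, top, right, bottom).

Detection results
- The detections are listed in order.
- person: 0.9696 1: the detected object is 'person', with a confidence score of 0.9696 and class ID 1.
- The other objects are listed in the same format.
- For example, car: 0.8609 3 means the detected object is 'car', with a confidence score of 0.8609 and class ID 3.

This code runs object detection on an image and draws the bounding box and class name of each detected object on the original image.
First it reads the original image size and makes a copy, then resizes the image to (300, 300), converts it to RGB, and feeds it to the network.
After running detection, for each detected object with a score above 0.4 it scales the bounding box coordinates to the original image size and draws the box and class name on the copy.
Finally, it converts the image to RGB and displays it.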
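
The decoding step described above can be separated from the drawing. Here is a minimal sketch, assuming each output row is laid out as [image_id, class_id, score, left, top, right, bottom] with normalized box values. `decode_detections` and the hand-made `fake_out` array are hypothetical and used only for illustration.

```python
def decode_detections(cv_out, img_shape, score_threshold=0.4):
    """Convert a (1, 1, N, 7) detection array into pixel-space boxes.

    Each row is assumed to be
    [image_id, class_id, score, left, top, right, bottom],
    with the four box values normalized to the 0-1 range.
    """
    rows, cols = img_shape[0], img_shape[1]
    results = []
    for detection in cv_out[0][0]:
        score = float(detection[2])
        if score <= score_threshold:
            continue
        results.append({
            'class_id': int(detection[1]),
            'score': score,
            # Scale normalized coordinates back to the original image size.
            'box': (int(detection[3] * cols), int(detection[4] * rows),
                    int(detection[5] * cols), int(detection[6] * rows)),
        })
    return results


# Hand-made example in the same layout (values are made up, not model output).
fake_out = [[[
    [0, 1, 0.9, 0.25, 0.5, 0.5, 1.0],
    [0, 3, 0.1, 0.0, 0.0, 0.1, 0.1],
]]]
print(decode_detections(fake_out, (600, 800, 3)))
```

Because the function only indexes with `cv_out[0][0]`, it accepts both nested lists and the NumPy array returned by `cv_net.forward()`.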

Wrapping single-image object detection in a function

import time

def get_detected_img(cv_net, img_array, score_threshold, use_copied_array=True, is_print=True):
    
    rows = img_array.shape[0]
    cols = img_array.shape[1]
    
    draw_img = None
    if use_copied_array:
        draw_img = img_array.copy()
        #draw_img = cv2.cvtColor(draw_img, cv2.COLOR_BGR2RGB)
    else:
        draw_img = img_array
    
    cv_net.setInput(cv2.dnn.blobFromImage(img_array, size=(300, 300), swapRB=True, crop=False))
    
    start = time.time()
    cv_out = cv_net.forward()
    
    green_color=(0, 255, 0)
    red_color=(0, 0, 255)

    # Iterate over the detected objects and extract their information
    for detection in cv_out[0,0,:,:]:
        score = float(detection[2])
        class_id = int(detection[1])
        # Keep only detections with a score above score_threshold
        if score > score_threshold:
            # Box coordinates are normalized to the 0-1 range, so scale them by the original image size
            left = detection[3] * cols
            top = detection[4] * rows
            right = detection[5] * cols
            bottom = detection[6] * rows
            # Convert class_id to a class name with labels_to_names; the model's class IDs start at 1 (1 = person).
            caption = "{}: {:.4f}".format(labels_to_names[class_id], score)

            # cv2.rectangle() draws a rectangle on draw_img; the position arguments must be integers.
            cv2.rectangle(draw_img, (int(left), int(top)), (int(right), int(bottom)), color=green_color, thickness=2)
            cv2.putText(draw_img, caption, (int(left), int(top - 5)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, red_color, 2)
    if is_print:
        print('Detection time:', round(time.time() - start, 2), 'sec')

    return draw_img

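This code defines get_detected_img, a function that runs object detection on an input image and returns the image with a bounding box and class name drawn around each detected object.
The function resizes the image to (300, 300) and feeds it to an OpenCV deep learning network.
From the detection results it keeps only the objects whose score exceeds the given threshold and draws their bounding boxes scaled to the original image size,
adding the class name next to each box. The function measures the elapsed time, prints it when requested, and returns the resulting image.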

Example

# Load the image
img = cv2.imread('../../data/image/john_wick01.jpg')

# COCO dataset class name mapping (labels_to_names) is defined earlier

# Load the TensorFlow inference model
cv_net = cv2.dnn.readNetFromTensorflow('./pretrained/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pb', 
                                     './pretrained/ssd_inception_v2_coco_2017_11_17/graph.pbtxt')
# Run object detection and visualize the result
draw_img = get_detected_img(cv_net, img, score_threshold=0.4, use_copied_array=True, is_print=True)

img_rgb = cv2.cvtColor(draw_img, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(12, 12))
plt.imshow(img_rgb)

Running object detection on video

Viewing the original video

from IPython.display import clear_output, Image, display, Video, HTML
Video('../../data/video/John_Wick_small.mp4') # Adjust this path to match your directory layout.

Setting up VideoCapture and VideoWriter

VideoCapture is set up to capture the video frame by frame.
Its properties are used to read the frame size and FPS of the video.
VideoWriter needs an encoding codec and output settings for writing the video.

We iterate over all frames and run object detection on each one.
Handling one frame at a time is much like single-image object detection.

Creating a dedicated video detection function

We define do_detected_video, a function that takes a video file, runs object detection on each frame,
and writes out a video with the detection results drawn in.

def do_detected_video(cv_net, input_path, output_path, score_threshold, is_print):
    
    cap = cv2.VideoCapture(input_path)

    codec = cv2.VideoWriter_fourcc(*'XVID')

    vid_size = (round(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),round(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    vid_fps = cap.get(cv2.CAP_PROP_FPS)

    vid_writer = cv2.VideoWriter(output_path, codec, vid_fps, vid_size) 

    frame_cnt = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    print('Total frame count:', frame_cnt)

    while True:
        hasFrame, img_frame = cap.read()
        if not hasFrame:
            print('No more frames to process.')
            break
        
        # Pass the caller's is_print flag through instead of always printing.
        returned_frame = get_detected_img(cv_net, img_frame, score_threshold=score_threshold, use_copied_array=True, is_print=is_print)
        vid_writer.write(returned_frame)
    # end of while loop

    vid_writer.release()
    cap.release()

do_detected_video(cv_net, '../../data/video/John_Wick_small.mp4', '../../data/output/John_Wick_small_ssd01.avi', 0.4, True)

!gsutil cp ../../data/output/John_Wick_small_ssd01.avi gs://my_bucket_dlcv/data/output/John_Wick_small_ssd01.avi

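The results show that detection works well overall, although small objects are sometimes missed.

Code walkthrough

Opening and configuring the video files
- Opens the video file given by input_path and prepares to save the output video to the path given by output_path.
- Sets the video codec, then reads the frame size and frame rate (FPS) of the input video and uses them for the output video.
- Prints the total number of frames in the input video.
Frame processing loop
- Reads the frames of the video one after another.
- Exits the loop when there are no more frames.
- Calls get_detected_img on each frame to run object detection and draw the results on the frame.
- Writes the annotated frame to the output video file.
Releasing the video files
- Once all frames have been processed, closes the output and input video files and releases their resources.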
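
The read-until-empty loop described above can also be expressed as a generator. This is a minimal sketch, assuming only that the capture object's read() returns a (has_frame, frame) pair, as cv2.VideoCapture.read() does. `iter_frames` and `FakeCapture` are hypothetical names; `FakeCapture` exists so the loop can be tried without a video file.

```python
def iter_frames(cap):
    """Yield frames from a capture object until read() reports no more frames.

    `cap` is anything with a read() method returning (has_frame, frame),
    which is the contract cv2.VideoCapture.read() follows.
    """
    while True:
        has_frame, frame = cap.read()
        if not has_frame:
            break
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the loop can run without a video file."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


for frame in iter_frames(FakeCapture(['frame0', 'frame1'])):
    print(frame)
```

With a real capture, the body of do_detected_video's loop would become `for img_frame in iter_frames(cap): ...`, keeping the end-of-stream check in one place.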

Object detection with SSD + MobileNet

The download URL is at the link below.

Home: Open Source Computer Vision Library (opencv/opencv, github.com)

Model download link

http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz

# !mkdir pretrained; cd pretrained
# !wget  http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz
# !tar -xzf ssd_mobilenet_v2_coco_2018_03_29.tar.gz
# !wget https://raw.githubusercontent.com/opencv/opencv_extra/master/testdata/dnn/ssd_mobilenet_v2_coco_2018_03_29.pbtxt
# cd ssd_mobilenet_v2_coco_2018_03_29; mv ../ssd_mobilenet_v2_coco_2018_03_29.pbtxt graph.pbtxt

cv_net_mobile = cv2.dnn.readNetFromTensorflow('./pretrained/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb', 
                                     './pretrained/ssd_mobilenet_v2_coco_2018_03_29/graph.pbtxt')

Video detection

do_detected_video(cv_net_mobile, '../../data/video/John_Wick_small.mp4', '../../data/output/John_Wick_small_ssd_mobile01.avi', 0.2, True)

# Load the image
img = cv2.imread('../../data/image/beatles01.jpg')

# COCO dataset class name mapping (labels_to_names) is defined earlier

cv_net_mobile = cv2.dnn.readNetFromTensorflow('./pretrained/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb', 
                                     './pretrained/ssd_mobilenet_v2_coco_2018_03_29/graph.pbtxt')
# Run object detection and visualize the result
draw_img = get_detected_img(cv_net_mobile, img, score_threshold=0.4, use_copied_array=True, is_print=True)

img_rgb = cv2.cvtColor(draw_img, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(12, 12))
plt.imshow(img_rgb)

Judging from the captured result, SSD + MobileNet shows better runtime performance than SSD + Inception.
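
A single timed run can be noisy, so one way to compare the two networks is to average several runs. The following is a minimal sketch under that assumption; `average_seconds` is a hypothetical helper, and the default of 5 repeats is arbitrary.

```python
import time


def average_seconds(fn, repeats=5):
    """Call fn() `repeats` times and return the mean wall-clock time per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats
```

For example, `average_seconds(lambda: get_detected_img(cv_net, img, 0.4, is_print=False))` could be compared with the same call using `cv_net_mobile`, on the same image and threshold.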

License: Attribution-NonCommercial-ShareAlike
