KR102776445B1 - Device and method for obtaining depth of space using camera

Info

Publication number: KR102776445B1
Application number: KR1020200051815A
Authority: KR
Inventors: 유병욱; 이건일; 이원우; 정지원
Original assignee: 삼성전자주식회사
Priority date: 2020-04-28
Filing date: 2020-04-28
Publication date: 2025-03-07
Anticipated expiration: 2040-04-28
Also published as: WO2021221436A1; KR20210133083A; US20230037866A1; US12294687B2

Abstract

A device and method for obtaining the depth of a space using a camera are provided. A method by which a device obtains the depth of a feature region using a camera module includes: obtaining a plurality of images by photographing the surroundings of the camera multiple times while sequentially rotating the camera module by a preset angle; identifying a first feature region in a first image and an n-th feature region in an n-th image that is identical to the first feature region, by comparing mutually adjacent images between the first image and the n-th image among the plurality of images; obtaining a baseline value for the first image and the n-th image; obtaining a disparity value between the first feature region and the n-th feature region; and calculating the depth of the first feature region or the n-th feature region based on the baseline value and the disparity value.

Description

{DEVICE AND METHOD FOR OBTAINING DEPTH OF SPACE USING CAMERA}

The present disclosure relates to a device and method for obtaining the depth of a space using a camera, and more particularly, to a device and method for calculating depth values of a surrounding space using a rotating camera.

Augmented reality (AR) is a technology that projects a virtual image onto a physical environment space or a real-world object and displays them together as a single image.

An augmented reality device, while worn on the user's face or head, allows the user to view a real scene and a virtual image together through a see-through display module placed in front of the user's eyes. As research on such augmented reality devices is actively progressing, various types of wearable devices have been released or announced. For example, wearable glasses and head-mounted displays are wearable display devices that have been released or announced.

Such an augmented reality device needs to measure the depth of real objects in order to recognize a user's gesture or to display a virtual object so that it interacts naturally with a real object. However, existing augmented reality devices have measured the depth of the surrounding space using a plurality of cameras or a time-of-flight (TOF) camera in order to recognize objects in the surrounding space, which increases the weight or the number of cameras. This in turn increases the volume, weight, and battery consumption of the augmented reality device.

Accordingly, it is important to make the augmented reality device lightweight and compact so that the user does not feel discomfort even when wearing the device for a long time or frequently.

One embodiment of the present disclosure may provide a device and method that reduce the weight of the device while still effectively calculating the depth of a specific region, by comparing images obtained by photographing the surroundings of the camera multiple times while sequentially rotating a single camera module by a preset angle.

In addition, one embodiment of the present disclosure may provide a device and method that efficiently identify the same specific region across a plurality of sequentially captured images by comparing mutually adjacent images, and that calculate depth values of the specific regions based on the positions of the identified regions.

In addition, one embodiment of the present disclosure may provide a device and method capable of setting the shooting angle range of the camera module according to a shooting mode.

As a technical means for achieving the above-described technical objects, a first aspect of the present disclosure may provide a method by which a device obtains the depth of a feature region using a camera module, the method including: obtaining a plurality of images by photographing the surroundings of the camera multiple times while sequentially rotating the camera module by a preset angle; identifying a first feature region in a first image and an n-th feature region in an n-th image that is identical to the first feature region, by comparing mutually adjacent images between the first image and the n-th image among the plurality of images; obtaining a baseline value for the first image and the n-th image based on the arrangement of the camera module when the first image is captured and the arrangement of the camera module when the n-th image is captured; obtaining a disparity value between the first feature region and the n-th feature region based on the position of the first feature region in the first image and the position of the n-th feature region in the n-th image; and calculating the depth of the first feature region or the n-th feature region based on the baseline value and the disparity value.
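For illustration only, the final calculating step can be sketched with the standard pinhole-stereo relation Z = f · B / d. This relation, the function name `depth_from_disparity`, the focal length parameter, and the sample numbers are assumptions introduced here and are not taken from the disclosure, which states only that the depth is calculated based on the baseline value and the disparity value.

```python
def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Depth in meters from the assumed relation Z = f * B / d.

    focal_length_px: focal length in pixels (assumed known from calibration).
    baseline_m: distance between the two shooting centers, in meters.
    disparity_px: shift of the feature region between the two images, in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: 800 px focal length, 2 cm baseline, 16 px disparity.
depth_m = depth_from_disparity(800.0, 0.02, 16.0)
```

Under this assumed relation, a larger disparity for the same baseline corresponds to a nearer feature region.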

In addition, a second aspect of the present disclosure may provide a device for obtaining the depth of a feature region using a camera module, the device including: a camera module; a shooting direction control unit that rotates the camera module by a predetermined angle; a display; a memory storing one or more instructions; and a processor that executes the one or more instructions, wherein the processor, by controlling the shooting direction control unit and the camera module, obtains a plurality of images by photographing the surroundings of the camera multiple times while sequentially rotating the camera module by a preset angle; identifies a first feature region in a first image and an n-th feature region in an n-th image that is identical to the first feature region, by comparing mutually adjacent images between the first image and the n-th image among the plurality of images; obtains a baseline value for the first image and the n-th image based on the arrangement of the camera module when the first image is captured and the arrangement of the camera module when the n-th image is captured; obtains a disparity value between the first feature region and the n-th feature region based on the position of the first feature region in the first image and the position of the n-th feature region in the n-th image; and calculates the depth of the first feature region or the n-th feature region based on the baseline value and the disparity value.

In addition, a third aspect of the present disclosure may provide a computer-readable recording medium on which a program for executing the method of the first aspect on a computer is recorded.

FIG. 1 is a schematic diagram of a method by which a device obtains the depth of a surrounding area according to one embodiment of the present disclosure.
FIG. 2 is a block diagram of a device according to one embodiment of the present disclosure.
FIG. 3 is a diagram illustrating an example in which a camera module within a device rotates according to one embodiment of the present disclosure.
FIG. 4A is a diagram illustrating an example in which a camera module within a device obtains a plurality of images while rotating according to one embodiment of the present disclosure.
FIG. 4B is a diagram illustrating an example in which identical feature regions in adjacent images are sequentially identified according to one embodiment of the present disclosure.
FIG. 4C is a diagram illustrating an example in which a device identifies identical identification regions within the images having the greatest angular difference according to one embodiment of the present disclosure.
FIG. 5A is a diagram illustrating an example in which a device photographs its surroundings according to one embodiment of the present disclosure.
FIG. 5B is a diagram illustrating an example in which a device virtually arranges a camera module according to one embodiment of the present disclosure.
FIG. 5C is a diagram illustrating an example in which a device calculates a disparity value according to one embodiment of the present disclosure.
FIG. 6 is a diagram illustrating an example in which a device calculates the depth of a feature region according to one embodiment of the present disclosure.
FIG. 7 is a flowchart of a method by which a device calculates the depth of a feature region in an image according to one embodiment of the present disclosure.
FIG. 8 is a diagram illustrating an example in which a device calculates a disparity value according to one embodiment of the present disclosure.
FIG. 9 is a flowchart of a method by which a device recognizes its surroundings according to a shooting mode according to one embodiment of the present disclosure.
FIG. 10A is a diagram illustrating an example in which a device captures a user's hand movement in gesture mode according to one embodiment of the present disclosure.
FIG. 10B is a diagram illustrating an example in which a device (1000) recognizes a gesture from image sets according to one embodiment of the present disclosure.
FIG. 11A is a diagram illustrating an example in which a device groups a plurality of images, obtained by photographing the surroundings of the device, into a plurality of image sets according to one embodiment of the present disclosure.
FIG. 11B is a diagram illustrating an example of a plurality of image sets grouped by a device according to one embodiment of the present disclosure.
FIG. 11C is a diagram illustrating an example in which a device generates and analyzes a depth map representing the space around the device from depth maps generated from a plurality of image sets according to one embodiment of the present disclosure.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the attached drawings so that those of ordinary skill in the art can readily practice them. However, the present disclosure may be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between. Also, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

In the present disclosure, 'augmented reality (AR)' means showing a virtual image together within a physical environment space of the real world, or showing a real object and a virtual image together.

In addition, an 'augmented reality device' is a device capable of presenting augmented reality, and encompasses not only glasses-shaped augmented reality glasses, which a user generally wears on the face, but also a head-mounted display (HMD) apparatus worn on the head, an augmented reality helmet, and the like.

Meanwhile, a 'real scene' is a scene of the real world that a user sees through an augmented reality device, and may include a real-world object. A 'virtual image' is an image generated through an optical engine, and may include both static and dynamic images. Such a virtual image is observed together with the real scene, and may be an image showing information about a real object in the real scene, information about the operation of the augmented reality device, a control menu, or the like.

Accordingly, a typical augmented reality device includes an optical engine for generating a virtual image composed of light generated by a light source, and a waveguide (light guide plate) formed of a transparent material so that it guides the virtual image generated by the optical engine to the user's eyes while also letting the user see the real-world scene. As described above, because the augmented reality device must also allow the real-world scene to be observed, guiding the light generated by the optical engine to the user's eyes through the waveguide requires an optical element that changes the path of light, which inherently travels in a straight line. The light path may be changed by reflection, for example by a mirror, or by diffraction through a diffractive element such as a diffractive optical element (DOE) or a holographic optical element (HOE), but is not limited thereto.

In addition, in the present disclosure, the gesture recognition mode may be a shooting mode for capturing a gesture made by the body of a user located at a short distance, within a shooting angle range smaller than a predetermined threshold, and the space recognition mode may be a shooting mode for capturing the space around the device (1000) within a shooting angle range larger than the predetermined threshold.

In addition, in the present disclosure, the disparity value for two images may be a value indicating how far a feature region in one of the two captured images is shifted in the other image.
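The definition of the disparity value can be illustrated with a minimal sketch. It assumes that a feature region is represented as an axis-aligned box in pixel coordinates and that the two images are aligned so that the shift is purely horizontal; the `FeatureRegion` type, the function name `disparity_px`, and the sample coordinates are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRegion:
    """Axis-aligned box for a feature region, in pixel coordinates."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def center_x(self) -> float:
        return self.x + self.width / 2.0


def disparity_px(first: FeatureRegion, nth: FeatureRegion) -> float:
    """Horizontal shift, in pixels, of the same feature region between two images."""
    return abs(first.center_x - nth.center_x)


# Hypothetical regions: the same object seen at x=420 and then at x=380.
first_region = FeatureRegion(x=420.0, y=200.0, width=40.0, height=40.0)
nth_region = FeatureRegion(x=380.0, y=200.0, width=40.0, height=40.0)
shift = disparity_px(first_region, nth_region)
```

Comparing box centers rather than edges keeps the measured shift unchanged if the region's apparent width changes symmetrically between the two images.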

In addition, in the present disclosure, the baseline value may be a value representing the distance between the shooting center of the camera module when one image is captured and the shooting center of the camera module when another image is captured. The shooting center of the camera module may be, for example, the center point of the lens of the camera module.
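As one possible reading of this definition, the sketch below assumes that the lens center travels on a circle of radius r around the rotation axis, so the distance between two shooting centers is the chord length 2 · r · sin(Δθ / 2). This circular geometry, the function name `baseline_from_rotation`, and the sample values are assumptions made for illustration and are not stated in the disclosure.

```python
import math


def baseline_from_rotation(radius_m: float,
                           angle_first_deg: float,
                           angle_nth_deg: float) -> float:
    """Distance between two shooting centers, assuming a circular lens path.

    radius_m: assumed distance from the rotation axis to the lens center.
    angle_first_deg, angle_nth_deg: pan angles at the two captures.
    """
    delta = math.radians(abs(angle_nth_deg - angle_first_deg))
    # Chord length between two points on a circle separated by angle delta.
    return 2.0 * radius_m * math.sin(delta / 2.0)


# Hypothetical values: lens center 1 cm from the axis, captures 60 degrees apart.
baseline_m = baseline_from_rotation(0.01, 0.0, 60.0)
```

With these hypothetical values the chord equals the radius, because the two shooting centers and the axis form an equilateral triangle.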

Hereinafter, the present disclosure is described in detail with reference to the attached drawings.

FIG. 1 is a schematic diagram of a method by which a device obtains the depth of a surrounding area according to one embodiment of the present disclosure.

Referring to FIG. 1, the device (1000) may sequentially photograph its surroundings through the camera module (1400) while rotating the camera module (1400) by a preset angle, and may thereby obtain a plurality of images of the surroundings of the device (1000). The device (1000) may obtain a plurality of images including a first image through an n-th image, and may identify identical feature regions within the plurality of images by comparing adjacent images among them. Through this, the device (1000) may obtain a disparity value between the identical feature regions in the first image and the n-th image, and may calculate a depth value for the feature region.
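The idea of reaching the n-th image by comparing only adjacent images can be sketched as follows. The sketch assumes that each adjacent comparison has already produced a mapping from region identifiers in one image to region identifiers in the next; the data layout, the function name `track_region`, and the sample identifiers are hypothetical, and the actual comparison method is not specified here.

```python
from typing import Dict, List, Optional


def track_region(adjacent_matches: List[Dict[int, int]],
                 first_region: int) -> Optional[int]:
    """Follow a feature region from the first image to the n-th image.

    adjacent_matches[i] maps a region id in image i+1 to the id of the
    identical region in image i+2 (1-based image numbering).
    Returns the region id in the n-th image, or None if the region is lost.
    """
    current = first_region
    for matches in adjacent_matches:
        if current not in matches:
            return None  # region left the field of view or was not matched
        current = matches[current]
    return current


# Hypothetical matches between images 1-2, 2-3, and 3-4.
matches = [{0: 2, 1: 0}, {2: 1, 0: 3}, {1: 4}]
nth_region_id = track_region(matches, first_region=0)
```

Chaining adjacent matches keeps each comparison between images with a small viewpoint change, while the first and n-th images still provide the largest angular difference for the disparity measurement.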

The device (1000) may be an augmented reality device capable of presenting augmented reality. The device (1000) may include, for example, glasses-shaped augmented reality glasses worn on the user's face, a head-mounted display (HMD) apparatus worn on the head, an augmented reality helmet, and the like.

The device (1000) may also be a smartphone, a tablet PC, a PC, a smart TV, a mobile phone, a personal digital assistant (PDA), a laptop, a media player, a micro server, a global positioning system (GPS) device, an e-book reader, a digital broadcasting terminal, a navigation device, a kiosk, an MP3 player, a digital camera, a home appliance, or another mobile or non-mobile computing device. However, the device (1000) is not limited thereto, and may include any type of device capable of processing images obtained by controlling a camera module.

FIG. 2 is a block diagram of a device according to one embodiment of the present disclosure.

Referring to FIG. 2, a device (1000) according to one embodiment of the present disclosure may include a user input unit (1100), a microphone (1200), a display unit (1300), a camera module (1400), a shooting direction control unit (1500), a communication interface (1600), a storage unit (1700), and a processor (1800).

The user input unit (1100) refers to a means by which a user inputs data for controlling the device (1000). For example, the user input unit (1100) may include, but is not limited to, at least one of a key pad, a dome switch, a touch pad (of a capacitive, resistive, infrared sensing, surface acoustic wave, integral strain gauge, or piezoelectric type, among others), a jog wheel, or a jog switch.

The user input unit (1100) may receive a user input for photographing the surroundings of the device (1000) using the camera module (1400), described later, and for receiving a service from the device (1000) or a server (not shown) based on the captured images.

The microphone (1200) receives an external sound signal and processes it into electrical voice data. For example, the microphone (1200) may receive a sound signal from an external device or a speaker. The microphone (1200) may use various noise removal algorithms to remove noise generated while receiving the external sound signal. The microphone (1200) may receive a user's voice input for controlling the device (1000).

The display unit (1300) outputs information processed by the device (1000). For example, the display unit (1300) may display a user interface for photographing the surroundings of the device (1000) and information related to a service provided based on the captured images of the surroundings of the device (1000). According to one embodiment, the display unit (1300) may provide an augmented reality (AR) image. The display unit (1300) according to one embodiment may include a waveguide (not shown) and a display module (not shown). The waveguide may be made of a transparent material through which a partial area behind it is visible when the user wears the device (1000). The waveguide may be a flat plate with a single-layer or multi-layer structure, made of a transparent material in which light can propagate while being reflected internally. The waveguide may face the emission surface of the display module and receive the light of the projected virtual image. Here, a transparent material means a material through which light can pass; its transparency need not be 100%, and it may have a predetermined color. In one embodiment, because the waveguide is formed of a transparent material, the user can see not only the virtual object of the virtual image through the display unit (1300) but also the external real scene, and the waveguide may therefore be referred to as a see-through display. The display unit (1300) may provide an augmented reality image by outputting the virtual object of the virtual image through the waveguide.

The camera module (1400) may photograph the surroundings of the device (1000). When an application requiring a shooting function is executed, the camera module (1400) may obtain image frames, such as still images or video, through an image sensor. Images captured through the image sensor may be processed by the processor (1800), described later, or by a separate image processing unit (not shown).

The shooting direction control unit (1500) may change the shooting direction of the camera module (1400). The shooting direction control unit (1500) may include a hardware structure capable of changing the shooting direction of the camera module (1400) by panning the camera module (1400). Through the shooting direction control unit (1500), the camera module (1400) may be rotated clockwise or counterclockwise about a predetermined axis, and the camera module (1400) may sequentially photograph the surroundings of the device (1000) while rotating by a predetermined angle. The shooting direction control unit (1500) may include, for example, an electromagnet located near the camera module (1400), and may control the shooting direction of the camera module (1400) using the magnetic force generated by applying electricity to the electromagnet. Alternatively, the shooting direction control unit (1500) may include, for example, a motor physically connected to the camera module (1400), and may control the shooting direction of the camera module (1400) using the motor. However, the method by which the shooting direction control unit (1500) controls the shooting direction is not limited thereto, and the shooting direction of the camera module (1400) may be controlled by rotating the camera module (1400) in various ways.
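The sequential rotation by a predetermined angle can be sketched as a list of pan angles at which an image is captured. The function name `pan_angles`, the use of degrees, and the sample range are assumptions for illustration; how the angles are actually commanded to the electromagnet or motor is not covered.

```python
import math
from typing import List


def pan_angles(start_deg: float, end_deg: float, step_deg: float) -> List[float]:
    """Pan angles from start_deg up to end_deg, one capture every step_deg."""
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    if end_deg < start_deg:
        raise ValueError("end_deg must not be smaller than start_deg")
    # A small epsilon guards against floating-point error in the division.
    count = int(math.floor((end_deg - start_deg) / step_deg + 1e-9)) + 1
    return [start_deg + i * step_deg for i in range(count)]


# Hypothetical sweep: 60 degree range, one capture every 15 degrees.
angles = pan_angles(-30.0, 30.0, 15.0)
```

Computing each angle as `start_deg + i * step_deg` rather than by repeated addition avoids accumulating rounding error over a long sweep.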

The communication interface (1600) may transmit data to and receive data from an external device (not shown) and a server (not shown) in order to receive a service based on images obtained by photographing the surroundings of the device (1000).

The storage unit (1700) may store programs to be executed by the processor (1800), described later, and may store data input to or output from the device (1000).

The storage unit (1700) may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), a magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the storage unit (1700) may be classified into a plurality of modules according to their functions, and may include, for example, a shooting mode identification module (1710), a shooting angle determination module (1720), a shooting module (1730), an image comparison module (1740), an image arrangement module (1750), a depth calculation module (1760), a gesture recognition module (1770), and a space recognition module (1780).

프로세서(1800)는 디바이스(1000)의 전반적인 동작을 제어한다. 예를 들어, 프로세서(1800)는, 저장부(1700)에 저장된 프로그램들을 실행함으로써, 사용자 입력부(1100), 마이크(1200), 디스플레이부(1300), 카메라 모듈(1400), 촬영 방향 제어 유닛(1500), 통신 인터페이스(1600) 및 저장부(1700) 등을 전반적으로 제어할 수 있다. The processor (1800) controls the overall operation of the device (1000). For example, the processor (1800) can control the user input unit (1100), the microphone (1200), the display unit (1300), the camera module (1400), the shooting direction control unit (1500), the communication interface (1600), and the storage unit (1700) by executing programs stored in the storage unit (1700).

The processor (1800) can identify the shooting mode of the camera module (1400) by executing the shooting mode identification module (1710) stored in the storage unit (1700). The shooting mode of the camera module (1400) can include, for example, a gesture recognition mode for recognizing a user's gesture and a space recognition mode for recognizing the space around the device (1000). The gesture recognition mode can be a shooting mode for capturing a gesture made by the body of a user located at close range, within a shooting angle range smaller than a predetermined threshold, and the space recognition mode can be a shooting mode for capturing the space around the device (1000) within a shooting angle range larger than the predetermined threshold.

프로세서(1800)는 기설정된 기준에 따라 카메라 모듈(1400)의 촬영 모드를 식별할 수 있다. 예를 들어, 디바이스(1000)의 전원이 켜지거나 디바이스(1000)가 비활성화 상태에서 활성화되면, 프로세서(1800)는 사용자의 제스처 입력을 수신하기 위하여 카메라 모듈(1400)의 촬영 모드를 제스처 인식 모드로 식별할 수 있다. 또한, 프로세서(1800)는 제스처 인식 모드에서 사용자의 제스처를 인식하고 인식된 제스처에 따라 카메라 모듈(1400)의 촬영 모드를 변경할 수 있다. 프로세서(1800)는 제스처에 대응되는 애플리케이션을 실행하고, 실행된 애플리케이션이 요구하는 촬영 모드에 따라 카메라 모듈(1400)의 촬영 모드를 변경할 수 있다. 또한, 예를 들어, 프로세서(1800)는 카메라 모듈(1400)을 통한 촬영을 필요로 하지 않는 애플리케이션이 실행되면, 카메라 모듈(1400)을 비활성화할 수 있다.The processor (1800) can identify the shooting mode of the camera module (1400) according to preset criteria. For example, when the device (1000) is powered on or the device (1000) is activated from an inactive state, the processor (1800) can identify the shooting mode of the camera module (1400) as a gesture recognition mode in order to receive a user's gesture input. In addition, the processor (1800) can recognize the user's gesture in the gesture recognition mode and change the shooting mode of the camera module (1400) according to the recognized gesture. The processor (1800) can execute an application corresponding to the gesture and change the shooting mode of the camera module (1400) according to a shooting mode required by the executed application. In addition, for example, the processor (1800) can deactivate the camera module (1400) when an application that does not require shooting through the camera module (1400) is executed.

또는, 예를 들어, 디바이스(1000)의 전원이 켜지거나 디바이스(1000)가 비활성화 상태에서 활성화되면 디바이스(1000)는 카메라 모듈(1400)의 촬영 모드를 공간 인식 모드로 식별할 수 있다. 또한, 프로세서(1800)는 공간 인식 모드에서 디바이스(1000)의 주변을 촬영함으로써 디바이스(1000)의 주변 공간을 인식할 수 있다.Alternatively, for example, when the device (1000) is powered on or the device (1000) is activated from an inactive state, the device (1000) may identify the shooting mode of the camera module (1400) as the space recognition mode. In addition, the processor (1800) may recognize the surrounding space of the device (1000) by shooting the surroundings of the device (1000) in the space recognition mode.

Additionally, for example, when a preset user input for the device (1000) is received, the processor (1800) may identify the shooting mode of the camera module (1400) as the gesture recognition mode or the space recognition mode.

프로세서(1800)는 저장부(1700)에 저장된 촬영 각도 결정 모듈(1720)을 실행함으로써, 카메라 모듈(1400)의 촬영 각도의 범위를 결정할 수 있다. 프로세서(1800)는 식별된 촬영 모드에 따라 촬영 각도의 범위를 결정할 수 있다. 이 경우, 촬영 모드에 따른 촬영 각도의 범위가 미리 설정될 수 있다.The processor (1800) can determine the range of the shooting angle of the camera module (1400) by executing the shooting angle determination module (1720) stored in the storage unit (1700). The processor (1800) can determine the range of the shooting angle according to the identified shooting mode. In this case, the range of the shooting angle according to the shooting mode can be set in advance.

For example, when the shooting mode of the camera module (1400) is the gesture recognition mode, the camera module (1400) may be set to photograph the surroundings of the device (1000) within a shooting angle range of -15 degrees to 15 degrees. For example, if the angle at which the camera module (1400) faces the front is taken as 0 degrees and the shooting angle range is -15 degrees to 15 degrees, the camera module (1400) may rotate up to 15 degrees to the left (-15 degrees) and up to 15 degrees to the right (+15 degrees) of the front direction.

For example, when the shooting mode of the camera module (1400) is the space recognition mode, the camera module (1400) may be set to photograph the surroundings of the device (1000) within a shooting angle range of -60 degrees to 60 degrees. For example, if the angle at which the camera module (1400) faces the front is taken as 0 degrees and the shooting angle range is -60 degrees to 60 degrees, the camera module (1400) may rotate up to 60 degrees to the left (-60 degrees) and up to 60 degrees to the right (+60 degrees) of the front direction.

The processor (1800) can determine the shooting interval of the camera module (1400) by executing the shooting angle determination module (1720) stored in the storage unit (1700). The processor (1800) can determine the angular interval at which the surroundings of the device (1000) are photographed within the shooting angle range. For example, the processor (1800) can determine the shooting interval so that the surroundings of the device (1000) are photographed at 5-degree intervals within the shooting angle range. In addition, the shooting interval of the camera module (1400) can be set differently according to, for example, the shooting mode and the shooting environment. The shooting environment can include, but is not limited to, the brightness around the device (1000), the number of subjects, and the movement of the subjects.
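As a non-limiting illustration, the relationship between the shooting mode, the shooting angle range, and the shooting interval may be sketched as follows. The names `ShootingMode`, `ANGLE_RANGE_DEG`, and `shooting_angles` are hypothetical and are not part of the disclosure; the ranges and the 5-degree interval are the example values given above, and integer degree values are assumed.

```python
from enum import Enum


class ShootingMode(Enum):
    GESTURE = "gesture_recognition"
    SPACE = "space_recognition"


# Example ranges from the description: (min_deg, max_deg), with 0 = facing front.
ANGLE_RANGE_DEG = {
    ShootingMode.GESTURE: (-15, 15),
    ShootingMode.SPACE: (-60, 60),
}


def shooting_angles(mode, interval_deg=5):
    """Return the capture angles (integer degrees) for the given shooting mode."""
    if interval_deg <= 0:
        raise ValueError("interval_deg must be a positive integer")
    low, high = ANGLE_RANGE_DEG[mode]
    return list(range(low, high + 1, interval_deg))


if __name__ == "__main__":
    print(shooting_angles(ShootingMode.GESTURE))
```

In this sketch a different shooting environment would simply pass a different `interval_deg`.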

The processor (1800) can photograph the surroundings of the device (1000) through the camera module (1400) by executing the shooting module (1730) stored in the storage unit (1700). By controlling the camera module (1400) and the shooting direction control unit (1500), the processor (1800) can obtain a plurality of images by sequentially photographing the surroundings of the device (1000) at the determined shooting interval within the determined shooting angle range. The camera module (1400) can photograph the surroundings of the device (1000) multiple times at the shooting interval while rotating within the shooting angle range. Adjacent images among the images captured by the camera module (1400) may partially overlap each other.

또는, 프로세서(1800)는 카메라 모듈(1400) 및 촬영 방향 제어 유닛(1500)을 제어함으로써, 촬영 각도 범위 내에서 카메라 모듈(1400)을 회전시키면서 동영상을 촬영할 수 있다. 이 경우, 프로세서(1800)는 촬영된 동영상 내의 프레임들 중에서 복수의 프레임을 촬영 간격에 따라 추출할 수 있다. 추출된 프레임들의 이미지들 중 인접한 이미지들의 일부는 서로 중첩될 수 있다.Alternatively, the processor (1800) may control the camera module (1400) and the shooting direction control unit (1500) to capture a video while rotating the camera module (1400) within a shooting angle range. In this case, the processor (1800) may extract a plurality of frames from among the frames in the captured video according to the shooting interval. Some of the adjacent images from among the images of the extracted frames may overlap each other.
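As a non-limiting illustration of extracting frames according to the shooting interval, the sketch below assumes that the camera module rotates at a constant angular speed and that the video has a fixed frame rate. Neither assumption is stated in the disclosure, and the function name `frame_indices` and its parameters are hypothetical.

```python
def frame_indices(range_deg, interval_deg, deg_per_second, fps):
    """Return indices of the video frames closest to each shooting angle.

    range_deg: (start_deg, end_deg) swept by the camera during the video.
    Assumes constant angular speed and a fixed frame rate (illustrative only).
    """
    start, end = range_deg
    frames_per_step = interval_deg / deg_per_second * fps
    count = int((end - start) / interval_deg) + 1
    return [round(i * frames_per_step) for i in range(count)]


if __name__ == "__main__":
    # Sweep -15..15 degrees at 30 degrees/second, recorded at 30 frames/second.
    print(frame_indices((-15, 15), 5, 30, 30))
```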

The processor (1800) can change the focal length of the camera module (1400) by executing the shooting module (1730) stored in the storage unit (1700). When the camera module (1400) photographs subjects located closer than a predetermined threshold distance, the processor (1800) may leave the focal length of the camera module (1400) unchanged. When the camera module (1400) is activated, its focal length may already be set for subjects located at a close distance, so the processor (1800) need not change the focal length in order to photograph such subjects. In addition, because the focal lengths for photographing nearby subjects around a predetermined distance value are substantially the same, the processor (1800) may leave the focal length of the camera module (1400) unchanged. On the other hand, when the camera module (1400) photographs subjects located farther than the predetermined threshold distance, the processor (1800) can control the camera module (1400) to change the focal length according to the subject and to photograph the surroundings of the device (1000) at the changed focal length.

프로세서(1800)는 저장부(1700)에 저장된 이미지 비교 모듈(1740)을 실행함으로써, 카메라 모듈(1400)을 통해 획득된 복수의 이미지들 중 인접한 이미지들을 비교할 수 있다. 프로세서(1800)는 인접한 이미지들을 서로 비교함으로써, 인접한 이미지들 내의 동일한 특징 영역들을 식별할 수 있다.The processor (1800) can compare adjacent images among a plurality of images acquired through the camera module (1400) by executing the image comparison module (1740) stored in the storage unit (1700). The processor (1800) can identify identical feature regions within the adjacent images by comparing the adjacent images with each other.

예를 들어, 카메라 모듈(1400)로부터 5개의 이미지가 획득된 경우에 프로세서(1800)는 제1 이미지와 제2 이미지를 비교하고, 제2 이미지와 제3 이미지를 비교하고, 제3 이미지와 제4 이미지를 비교하고, 제4 이미지와 제5 이미지를 비교할 수 있다. 또한, 프로세서(1800)는 제1 이미지 내의 제1 특징 영역이 제2 이미지 내의 제2 특징 영역과 동일함을 식별하고, 제2 이미지 내의 제2 특징 영역이 제3 이미지 내의 제3 특징 영역과 동일함을 식별하고, 제3 이미지 내의 제3 특징 영역이 제4 이미지 내의 제4 특징 영역과 동일함을 식별하고, 제4 이미지 내의 제4 특징 영역이 제5 이미지 내의 제5 특징 영역과 동일함을 식별할 수 있다. 이에 따라, 프로세서(1800)는 제1 이미지 내의 제1 특징 영역과 제5 이미지 내의 제5 특징 영역이 동일함을 보다 효과적으로 식별할 수 있다. 또한, 예를 들어, 이미지 내의 특징 영역은 소정의 특징점(feature point)일 수 있다.For example, when five images are acquired from the camera module (1400), the processor (1800) can compare the first image with the second image, the second image with the third image, the third image with the fourth image, and the fourth image with the fifth image. In addition, the processor (1800) can identify that the first feature region in the first image is identical to the second feature region in the second image, identify that the second feature region in the second image is identical to the third feature region in the third image, identify that the third feature region in the third image is identical to the fourth feature region in the fourth image, and identify that the fourth feature region in the fourth image is identical to the fifth feature region in the fifth image. Accordingly, the processor (1800) can more effectively identify that the first feature region in the first image and the fifth feature region in the fifth image are identical. In addition, for example, the feature region in the image may be a predetermined feature point.
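As a non-limiting illustration, the chaining of adjacent-image comparisons described above may be sketched as follows. The representation of each comparison result as a dictionary of feature identifiers, and the name `chain_matches`, are assumptions made for this example only.

```python
def chain_matches(pairwise_matches, start_feature):
    """Follow one feature through consecutive adjacent-image matches.

    pairwise_matches[i] maps a feature id in image i+1 to the id of the
    same feature in image i+2 (a hypothetical data layout).
    Returns the feature id in the last image, or None if the track is lost.
    """
    current = start_feature
    for matches in pairwise_matches:
        current = matches.get(current)
        if current is None:
            return None
    return current


if __name__ == "__main__":
    matches = [{"r1": "r2"}, {"r2": "r3"}, {"r3": "r4"}, {"r4": "r5"}]
    print(chain_matches(matches, "r1"))
```

Because each step compares only adjacent, largely overlapping images, the first and fifth feature regions are linked without having to compare the first and fifth images directly.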

The processor (1800) can arrange two images selected from among the plurality of images by executing the image arrangement module (1750) stored in the storage unit (1700). To calculate the depth of a feature region, the processor (1800) can select, from among the plurality of images, two images whose shooting angles differ by a predetermined threshold or more. For example, the processor (1800) can select the two images that have the largest difference in shooting angle while still containing the same feature region, so that the depth of the feature region is calculated more accurately. For example, when five images are acquired from the camera module (1400), the processor (1800) can select, from among the five images, the first image and the fifth image, whose shooting angles differ by the predetermined threshold or more.
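As a non-limiting illustration, selecting the two images with the largest shooting-angle difference that both contain the same feature region may be sketched as follows. The function name `widest_pair` and the boolean visibility list are hypothetical.

```python
from itertools import combinations


def widest_pair(angles_deg, visible, min_diff_deg=0.0):
    """Pick the pair of images with the largest shooting-angle difference.

    angles_deg[i]: shooting angle of image i.
    visible[i]: True if image i contains the feature region of interest.
    Returns (i, j) with i < j, or None if no pair meets min_diff_deg.
    """
    best = None
    for i, j in combinations(range(len(angles_deg)), 2):
        if not (visible[i] and visible[j]):
            continue
        diff = abs(angles_deg[j] - angles_deg[i])
        if diff >= min_diff_deg and (best is None or diff > best[0]):
            best = (diff, i, j)
    return None if best is None else (best[1], best[2])


if __name__ == "__main__":
    print(widest_pair([-10, -5, 0, 5, 10], [True] * 5))
```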

또한, 프로세서(1800)는 선택된 두 이미지에 대한 디스패리티 값을 획득하기 위하여 두 이미지를 가상으로 배열할 수 있다. 두 이미지에 대한 디스패리티 값은, 촬영된 두 이미지 중에서 한 이미지 내의 어떤 특징 영역이 다른 이미지에서 얼마나 쉬프트되었는 지를 나타내는 값일 수 있다.Additionally, the processor (1800) may virtually arrange the two images to obtain disparity values for the two selected images. The disparity values for the two images may be values indicating how much a feature region in one of the two captured images is shifted in the other image.

프로세서(1800)는 카메라 모듈(1400)의 촬영 방향이 서로 평행해지도록 카메라 모듈(1400)을 가상으로 배치할 수 있다. 예를 들어, 제1 이미지를 촬영할 때의 카메라 모듈(1400)과 제5 이미지를 촬영할 때의 카메라 모듈(1400) 간의 베이스 라인 값을 유지하면서, 제1 이미지를 촬영할 때의 카메라 모듈(1400)의 촬영 방향 및 제5 이미지를 촬영할 때의 카메라 모듈(1400)의 촬영 방향이 평행해지도록, 제1 이미지를 촬영할 때의 카메라 모듈(1400)과 제5 이미지를 촬영할 때의 카메라 모듈(1400)을 가상으로 배치할 수 있다. 베이스 라인 값은 한 이미지를 촬영할 때의 카메라 모듈(1400)의 촬영 중심 및 다른 이미지를 촬영할 때의 카메라 모듈(1400)의 촬영 중심 간의 거리를 나타내는 값일 수 있다. 카메라 모듈(1400)의 촬영 중심은, 예를 들어, 카메라 모듈(1400)의 렌즈의 중심점일 수 있다. 카메라 모듈(1400)을 가상으로 배열하는 방법에 대하여는 후술하기로 한다.The processor (1800) can virtually arrange the camera modules (1400) so that the shooting directions of the camera modules (1400) become parallel to each other. For example, while maintaining the baseline value between the camera module (1400) when shooting the first image and the camera module (1400) when shooting the fifth image, the camera module (1400) when shooting the first image and the camera module (1400) when shooting the fifth image can be virtually arranged so that the shooting direction of the camera module (1400) when shooting the first image and the shooting direction of the camera module (1400) when shooting the fifth image become parallel to each other. The baseline value can be a value representing the distance between the shooting center of the camera module (1400) when shooting one image and the shooting center of the camera module (1400) when shooting another image. The shooting center of the camera module (1400) can be, for example, the center point of the lens of the camera module (1400). A method for virtually arranging camera modules (1400) will be described later.
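As a non-limiting illustration of a base line value, the sketch below assumes that the lens center moves on a circle of known radius around the rotation center, so that the distance between the two shooting centers is the chord of that circle. This geometric model and the name `baseline_from_rotation` are assumptions made for this example and are not taken from the disclosure.

```python
import math


def baseline_from_rotation(radius, angle1_deg, angle2_deg):
    """Distance between two lens-center positions on a circle of given radius.

    Chord length: 2 * r * sin(|delta| / 2), where delta is the rotation
    between the two shots. The result has the same unit as radius.
    """
    delta = math.radians(abs(angle2_deg - angle1_deg))
    return 2.0 * radius * math.sin(delta / 2.0)


if __name__ == "__main__":
    # Hypothetical 2 cm arm, shots taken at -30 and +30 degrees.
    print(baseline_from_rotation(0.02, -30, 30))
```

Under this model a larger shooting-angle difference gives a longer base line, which is consistent with selecting the two images with the largest angle difference.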

프로세서(1800)는 가상으로 배치된 카메라 모듈(1400)의 위치를 기준으로 하여 이미지들을 배열할 수 있다. 예를 들어, 제1 이미지를 촬영할 때의 카메라 모듈(1400)의 배치된 위치 및 제5 이미지를 촬영할 때의 카메라 모듈의 배치된 위치를 기준으로, 제1 이미지 및 제5 이미지가 가상으로 배열될 수 있다. 이 경우, 프로세서(1800)는 가상으로 배열된 제1 이미지 내의 제1 특징 영역의 위치 및 가상으로 배열된 제5 이미지 내의 제5 특징 영역의 위치에 기초하여, 제1 이미지 내의 제1 특징 영역 및 제5 이미지 내의 제5 특징 영역 간의 디스패리티 값을 획득할 수 있다. 예를 들어, 디스패리티 값은 가상으로 배열된 이미지들 내의 동일한 특징 영역들 간의 거리에 기초하여 산출될 수 있다.The processor (1800) can arrange the images based on the positions of the virtually arranged camera modules (1400). For example, the first image and the fifth image can be virtually arranged based on the arranged positions of the camera modules (1400) when capturing the first image and the arranged positions of the camera modules when capturing the fifth image. In this case, the processor (1800) can obtain a disparity value between the first feature region in the first image and the fifth feature region in the fifth image based on the positions of the first feature region in the virtually arranged first image and the fifth feature region in the virtually arranged fifth image. For example, the disparity value can be calculated based on the distance between the same feature regions in the virtually arranged images.

The processor (1800) can calculate the depth of a feature region by executing the depth calculation module (1760) stored in the storage unit (1700). The processor (1800) can calculate the depth of the same feature region in the two images by using the disparity value for that feature region, the base line value of the camera module (1400), and the focal length of the camera module (1400). The processor (1800) can calculate the depth of the feature region from the proportional (similar-triangle) relationship among these values, in which the ratio of the depth to the base line value equals the ratio of the focal length to the disparity value.
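As a non-limiting illustration, under the standard model of two parallel (rectified) views, similar triangles give depth = base line × focal length / disparity. The sketch below assumes that the focal length and the disparity are both expressed in pixels, so that the depth has the unit of the base line; the function name `depth_from_disparity` is hypothetical.

```python
def depth_from_disparity(baseline, focal_length_px, disparity_px):
    """Depth of a feature region for two parallel views.

    baseline: distance between the two shooting centers (e.g. meters).
    focal_length_px, disparity_px: both in pixels (assumed units).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return baseline * focal_length_px / disparity_px


if __name__ == "__main__":
    # Hypothetical values: 5 cm base line, 800 px focal length, 20 px disparity.
    print(depth_from_disparity(0.05, 800.0, 20.0))
```

For a fixed depth, a longer base line yields a larger disparity, so a given pixel error in the disparity has a smaller effect on the calculated depth.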

The processor (1800) can generate a depth map of the space around the device (1000) by calculating the depths of the feature regions in the images. A depth map is an image representing three-dimensional distance information of the objects or spaces present in an image, and each pixel value in the depth map can represent the depth information of the corresponding pixel. The depth information can be a value representing the distance from a viewpoint to the space corresponding to a specific pixel. The depth map can represent parts far from the viewpoint and parts near the viewpoint so that they are distinguishable. For example, the depth map can be displayed as a gradient that gradually darkens from white to black from the part farthest from the user's viewpoint to the part nearest to it. Accordingly, the shapes and depths of the objects in the space around the device (1000) can be distinctly expressed in the depth map.
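As a non-limiting illustration of the gradient display described above, the sketch below maps depth values linearly to 8-bit gray levels, with the farthest value shown as white (255) and the nearest as black (0). The linear mapping and the name `depth_to_gray` are assumptions made for this example.

```python
def depth_to_gray(depth_map):
    """Convert a 2D list of depths to gray levels (far = 255, near = 0)."""
    flat = [d for row in depth_map for d in row]
    near, far = min(flat), max(flat)
    span = far - near
    if span == 0:
        # All pixels are at the same depth; no gradient can be shown.
        return [[0 for _ in row] for row in depth_map]
    return [[round(255 * (d - near) / span) for d in row] for row in depth_map]


if __name__ == "__main__":
    for row in depth_to_gray([[1.0, 2.0], [3.0, 5.0]]):
        print(row)
```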

프로세서(1800)는 소정 단위의 이미지 세트들로부터 디바이스(1000) 주변 공간의 일부들에 대한 복수의 깊이 맵을 생성하고, 생성된 복수의 깊이 맵을 연결함으로써 디바이스(1000) 주변 공간에 대한 깊이 맵을 완성할 수 있다. 예를 들어, 카메라 모듈(1400)이 회전하면서 디바이스(1000)의 주변을 촬영하여 20개의 이미지들을 획득한 경우에, 프로세서(1800)는 첫번째 내지 10번째 이미지를 포함하는 제1 이미지 세트로부터 제1 부분 깊이 맵을 생성하고, 6번째 내지 15번째 이미지를 포함하는 제2 이미지 세트로부터 제2 부분 깊이 맵을 생성하고, 11번째 내지 20번째 이미지를 포함하는 제3 이미지 세트로부터 제3 부분 깊이 맵을 생성할 수 있다. 또한, 프로세서(1800)는 제1 부분 깊이 맵, 제2 부분 깊이 맵 및 제3 부분 깊이 맵을 이용하여, 전체 깊이 맵을 생성할 수 있다.The processor (1800) can generate a plurality of depth maps for parts of the surrounding space of the device (1000) from a predetermined unit of image sets, and complete the depth map for the surrounding space of the device (1000) by connecting the generated plurality of depth maps. For example, when the camera module (1400) acquires 20 images by photographing the surroundings of the device (1000) while rotating, the processor (1800) can generate a first partial depth map from a first image set including the first to tenth images, generate a second partial depth map from a second image set including the sixth to fifteenth images, and generate a third partial depth map from a third image set including the eleventh to twentieth images. In addition, the processor (1800) can generate an entire depth map by using the first partial depth map, the second partial depth map, and the third partial depth map.
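As a non-limiting illustration, the overlapping image sets in the example above (images 1 to 10, 6 to 15, and 11 to 20) correspond to a sliding window of 10 images advanced by 5 images at a time. The function name `image_sets` and its parameters are hypothetical.

```python
def image_sets(num_images, set_size=10, stride=5):
    """Return overlapping sets of 1-based image numbers."""
    sets = []
    start = 0
    while start + set_size <= num_images:
        sets.append(list(range(start + 1, start + set_size + 1)))
        start += stride
    return sets


if __name__ == "__main__":
    for s in image_sets(20):
        print(s[0], "to", s[-1])
```

The shared images between consecutive sets give the partial depth maps a common region in which they can be connected.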

한편, 프로세서(1800)는 인공지능 모델을 이용하여 특징 영역의 깊이를 식별할 수 있다. 이 경우, 프로세서(1800)는 이미지들로부터 특징 영역의 깊이를 산출하도록 훈련된 인공지능 모델에 카메라 모듈(1400)로부터 촬영된 이미지들을 입력함으로써 특징 영역의 깊이에 관한 정보를 획득할 수 있다. 인공지능 모델로부터 획득되는 특징 영역의 깊이 정보는 깊이 산출 모듈(1760)에 의해 산출된 깊이 정보를 검증하는데 이용될 수 있으나, 이에 제한되지 않는다. 또한, 카메라 모듈(1400)을 통해 촬영된 이미지들 및 깊이 산출 모듈(1760)에 의해 산출된 깊이 정보는, 깊이 산출을 위한 인공지능 모델을 업데이트하는데 이용될 수 있다.Meanwhile, the processor (1800) can identify the depth of the feature region using an artificial intelligence model. In this case, the processor (1800) can obtain information about the depth of the feature region by inputting images captured from the camera module (1400) into an artificial intelligence model trained to calculate the depth of the feature region from images. The depth information of the feature region obtained from the artificial intelligence model can be used to verify the depth information calculated by the depth calculation module (1760), but is not limited thereto. In addition, the images captured through the camera module (1400) and the depth information calculated by the depth calculation module (1760) can be used to update the artificial intelligence model for depth calculation.

또한, 프로세서(1800)는 인공지능 모델을 이용하여 깊이 맵을 획득할 수 있다. 이 경우, 프로세서(1800)는 이미지들로부터 깊이 맵을 생성하도록 훈련된 인공지능 모델에 카메라 모듈(1400)로부터 촬영된 이미지들을 입력함으로써, 디바이스(1000) 주변 공간에 대한 깊이 맵을 획득할 수 있다. 인공지능 모델로부터 획득되는 깊이 맵은 깊이 산출 모듈(1760)에 의해 생성된 깊이 맵을 검증하는데 이용될 수 있으나, 이에 제한되지 않는다. 또한, 카메라 모듈(1400)을 통해 촬영된 이미지들 및 깊이 산출 모듈(1760)에 의해 생성된 깊이 맵은, 깊이 맵 생성을 위한 인공지능 모델을 업데이트하는데 이용될 수 있다.In addition, the processor (1800) can obtain a depth map using an artificial intelligence model. In this case, the processor (1800) can obtain a depth map for the space around the device (1000) by inputting images captured from the camera module (1400) into an artificial intelligence model trained to generate a depth map from images. The depth map obtained from the artificial intelligence model can be used to verify the depth map generated by the depth calculation module (1760), but is not limited thereto. In addition, the images captured through the camera module (1400) and the depth map generated by the depth calculation module (1760) can be used to update an artificial intelligence model for generating a depth map.

The processor (1800) can recognize a user's gesture by executing the gesture recognition module (1770) stored in the storage unit (1700). The processor (1800) can acquire images while the camera module (1400) is rotated repeatedly, multiple times, within the shooting angle range. For example, the processor (1800) can acquire a first gesture image set of n images during the first rotation within the shooting angle range, a second gesture image set of n images during the second rotation, and a third gesture image set of n images during the third rotation. In addition, for example, the processor (1800) can calculate the depth of the user's hand from the images in the first gesture image set to identify a first shape and a first position of the user's hand. For example, the processor (1800) can calculate the depth of the user's hand from the images in the second gesture image set to identify a second shape and a second position of the user's hand. For example, the processor (1800) can calculate the depth of the user's hand from the images in the third gesture image set to identify a third shape and a third position of the user's hand. In addition, the processor (1800) can identify a change in the shape of the user's hand based on the first shape, the second shape, and the third shape, and recognize a gesture according to the identified change in shape. In addition, the processor (1800) can identify a change in the position of the hand based on the first position, the second position, and the third position, and recognize a gesture according to the identified change in position. The processor (1800) can also recognize a gesture by considering the change in shape and the change in position of the user's hand together. Although gestures made with the user's hand have been described above, a body part other than the user's hand may be recognized in order to recognize a gesture.
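As a non-limiting illustration of recognizing a gesture from a change in hand position, the sketch below labels the dominant movement between the first and the last identified positions. The coordinate convention, the travel threshold, the label names, and the function name `classify_hand_motion` are all hypothetical and are not taken from the disclosure.

```python
def classify_hand_motion(positions, min_travel=0.05):
    """Label the dominant motion between the first and last hand positions.

    positions: list of (x, y, z) tuples, one per gesture image set.
    Assumed axes: +x to the right, +y upward, +z away from the camera.
    """
    (x0, y0, z0), (x1, y1, z1) = positions[0], positions[-1]
    deltas = {"x": x1 - x0, "y": y1 - y0, "z": z1 - z0}
    axis = max(deltas, key=lambda k: abs(deltas[k]))
    travel = deltas[axis]
    if abs(travel) < min_travel:
        return "hold"
    labels = {
        "x": ("swipe_right", "swipe_left"),
        "y": ("swipe_up", "swipe_down"),
        "z": ("move_away", "move_toward"),
    }
    positive, negative = labels[axis]
    return positive if travel > 0 else negative


if __name__ == "__main__":
    print(classify_hand_motion([(0.0, 0.0, 0.5), (0.1, 0.0, 0.5), (0.3, 0.0, 0.5)]))
```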

The processor (1800) can identify a gesture using an artificial intelligence model. In this case, the processor (1800) can identify the user's gesture by inputting the images captured by the camera module (1400) into an artificial intelligence model trained to identify a user's gesture from images. The gesture identified by the artificial intelligence model can be used to verify the gesture recognized by the gesture recognition module (1770), but is not limited thereto. In addition, the images captured through the camera module (1400) and the gesture recognized by the gesture recognition module (1770) can be used to update the artificial intelligence model for gesture recognition.

The processor (1800) can recognize the space around the device (1000) by executing the space recognition module (1780) stored in the storage unit (1700). The processor (1800) can identify what the feature regions around the device (1000) are from the depth map generated for the surroundings of the device (1000). For example, the processor (1800) can generate identification information indicating which object corresponds to a feature region around the device (1000), coordinate information indicating the position of the feature region relative to the device (1000), and depth information indicating the distance between the device (1000) and the feature region.

또한, 프로세서(1800)는 디바이스(1000) 주변 공간 중에서 인식되지 않은 공간을 식별하고, 인식되지 않은 공간이 촬영되는 경우에 인식되지 않은 공간에 대한 깊이 맵을 생성하여 전체 깊이 맵에 추가할 수 있다. 이 경우, 프로세서(1800)는 카메라 모듈(1400)이 디바이스(1000) 주변의 어떤 영역을 촬영하였는 지에 관한 이력을 저장할 수 있으며, 저장된 이력에 기초하여 카메라 모듈(1400)이 촬영되지 않은 공간을 향하는 경우에 공간 인식 모드를 활성화할 수 있다.In addition, the processor (1800) can identify an unrecognized space among the spaces surrounding the device (1000), and when the unrecognized space is captured, generate a depth map for the unrecognized space and add it to the overall depth map. In this case, the processor (1800) can store a history of which areas around the device (1000) the camera module (1400) captured, and activate a space recognition mode when the camera module (1400) faces the space that has not been captured based on the stored history.
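As a non-limiting illustration of using a stored shooting history, the sketch below records the angles already captured and lists the angles within a full range that have not yet been captured. Representing the history as a list of integer angles, and the name `uncaptured_angles`, are assumptions made for this example.

```python
def uncaptured_angles(history_deg, full_range_deg=(-60, 60), interval_deg=5):
    """Return the shooting angles in the full range that are not in the history."""
    low, high = full_range_deg
    captured = set(history_deg)
    return [a for a in range(low, high + 1, interval_deg) if a not in captured]


if __name__ == "__main__":
    history = [-15, -10, -5, 0, 5, 10, 15]
    print(uncaptured_angles(history, full_range_deg=(-30, 30)))
```

In such a sketch, the space recognition mode would be activated when the camera module faces one of the returned angles.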

The processor (1800) can recognize a space using an artificial intelligence model. In this case, the processor (1800) can recognize the space around the device (1000) by inputting the images captured by the camera module (1400), or the depth map, into an artificial intelligence model trained to identify feature regions in a space from images. The spatial information recognized by the artificial intelligence model can be used to verify the space recognized by the space recognition module (1780), but is not limited thereto. In addition, the images captured through the camera module (1400) or the depth map, and the spatial information recognized by the space recognition module (1780), can be used to update the artificial intelligence model for space recognition.

도 3은 본 개시의 일 실시예에 따른 디바이스 내의 카메라 모듈이 회전하는 예시를 나타내는 도면이다.FIG. 3 is a drawing showing an example of a camera module rotating within a device according to one embodiment of the present disclosure.

Referring to FIG. 3, the shooting direction control unit (1500) can rotate (32) the camera module (1400) clockwise or counterclockwise about the rotation center (30). Accordingly, the camera module (1400) can sequentially photograph the surroundings of the device (1000) while rotating by a predetermined angle. In addition, the camera module (1400) can photograph the surroundings of the device (1000) while being panned or tilted by a predetermined angle about the rotation center (30).

FIG. 4a is a diagram illustrating an example in which a camera module within a device acquires a plurality of images while rotating, according to one embodiment of the present disclosure.

Referring to FIG. 4a, the camera module (1400) in the device (1000) can sequentially photograph the surroundings of the device (1000) while rotating clockwise or counterclockwise by a predetermined angle at a time within a predetermined shooting angle range (40). For example, the device (1000) can sequentially photograph its surroundings while rotating the camera module (1400) clockwise in 5-degree increments. The shooting angle range (40) of the camera module (1400) can be set differently depending on the shooting mode of the device (1000). For example, when the shooting mode of the device (1000) is the gesture recognition mode, the shooting angle range (40) can be set to a range of -15 degrees to +15 degrees with respect to the reference axis (42). In addition, for example, when the shooting mode of the device (1000) is the space recognition mode, the shooting angle range (40) can be a range of -30 degrees to +30 degrees with respect to the reference axis (42). The reference axis (42) can be determined by the shooting direction of the camera module (1400) when the camera module (1400) faces the front of the device (1000).

The device (1000) can obtain first to n-th images by capturing the periphery of the device (1000) through the camera module (1400). In this case, adjacent images among the first to n-th images may partially overlap each other.

FIG. 4B is a diagram illustrating an example in which identical feature regions in adjacent images are sequentially identified according to one embodiment of the present disclosure.

Referring to FIG. 4B, the device (1000) can identify identical feature regions in the images through a stereo matching technique. The device (1000) can identify that a first feature region in the first image is identical to a second feature region in the second image. In addition, the device (1000) can identify that the second feature region in the second image is identical to a third feature region in the third image. In this way, by sequentially comparing pairs of adjacent images, the device (1000) can identify that an (n-1)-th feature region in the (n-1)-th image is identical to an n-th feature region in the n-th image.

FIG. 4C is a diagram illustrating an example in which a device identifies identical feature regions within the images having the greatest shooting angle difference, according to one embodiment of the present disclosure.

Referring to FIG. 4C, the device (1000) can identify, based on the identification result of FIG. 4B, that the first feature region in the first image is identical to the n-th feature region in the n-th image. Accordingly, the device (1000) can identify the same feature region in the two images having the largest shooting angle difference among the first to n-th images, and can therefore calculate the depth of the feature region, described later, more accurately.

Since adjacent images overlap each other to a large extent, and identical feature regions appear at similar positions in adjacent images, the device (1000) can identify identical feature regions more effectively by comparing adjacent images. For example, the first image overlaps the second image more than it overlaps any other image, and the position of the first feature region in the first image differs only slightly from the position of the second feature region in the second image. Therefore, even if the device (1000) compares only a small portion of the first image and the second image, it can effectively and accurately identify that the first feature region and the second feature region are identical. In addition, by sequentially comparing adjacent images and using the results of the sequential comparisons, the device (1000) can more accurately identify that the first feature region of the first image and the n-th feature region of the n-th image are identical.
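One way to picture how the sequential comparison results are combined is to compose the pairwise matches. The sketch below assumes, purely for illustration, that each adjacent-pair comparison yields a dictionary mapping a feature-region identifier in image i to the identifier of the same region in image i+1; the function name `chain_matches` and this data layout are hypothetical and are not taken from the disclosure.

```python
def chain_matches(adjacent_matches):
    """Compose per-pair matches into first-image -> last-image matches.

    adjacent_matches[i] maps a feature-region id in image i to the id of
    the same region in image i + 1. A region that is lost at any step is
    dropped from the result.
    """
    if not adjacent_matches:
        return {}
    chained = dict(adjacent_matches[0])
    for step in adjacent_matches[1:]:
        # Follow each surviving region one image further along the sequence.
        chained = {
            first: step[current]
            for first, current in chained.items()
            if current in step
        }
    return chained


if __name__ == "__main__":
    matches = [
        {"a1": "a2", "b1": "b2"},  # image 1 -> image 2
        {"a2": "a3"},              # image 2 -> image 3 (region b is lost)
        {"a3": "a4"},              # image 3 -> image 4
    ]
    print(chain_matches(matches))
```

Only regions that can be followed through every adjacent pair survive, which mirrors the idea of identifying the first and n-th feature regions through the intermediate images.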

Although FIGS. 4A to 4C describe the device (1000) as identifying identical feature regions using all of the captured images, the present disclosure is not limited thereto. The device (1000) may select a plurality of images that are a subset of the captured images, and identify identical feature regions within the selected images by comparing them. For example, the device (1000) may select the third to (n-3)-th images from among the first to n-th images and, by comparing adjacent images among the third to (n-3)-th images, identify that the third feature region in the third image and the (n-3)-th feature region in the (n-3)-th image are identical.

Although FIGS. 4A to 4C describe the device (1000) as identifying identical feature regions from the first image, captured at the smallest shooting angle within the shooting angle range, and the n-th image, captured at the largest shooting angle, the present disclosure is not limited thereto. The device (1000) may identify identical feature regions from any two images whose shooting angle difference is greater than or equal to a predetermined threshold, and calculate the depth of the feature regions using the identified feature regions. For example, the device (1000) may select, from among the first to n-th images, the first image and the (n-3)-th image, whose shooting angle difference is greater than or equal to the predetermined threshold, and identify identical feature regions from the selected first image and (n-3)-th image.
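The threshold-based selection can be sketched as below. The function `select_pair` is hypothetical, and preferring the widest qualifying pair is an assumption made for this illustration; the text only requires that the angle difference meet the threshold.

```python
from itertools import combinations


def select_pair(angles_deg, min_diff_deg):
    """Return indices (i, j) of two images whose shooting angles differ by
    at least min_diff_deg, preferring the widest difference, or None."""
    best = None
    for i, j in combinations(range(len(angles_deg)), 2):
        diff = abs(angles_deg[j] - angles_deg[i])
        if diff >= min_diff_deg and (best is None or diff > best[0]):
            best = (diff, i, j)
    return None if best is None else (best[1], best[2])


if __name__ == "__main__":
    angles = [-15, -10, -5, 0, 5, 10, 15]
    print(select_pair(angles, 20))
```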

FIGS. 5A to 5C are diagrams illustrating an example in which a device arranges captured images and obtains a disparity value of a feature region according to one embodiment of the present disclosure.

FIG. 5A is a diagram illustrating an example in which a device captures images of its periphery according to one embodiment of the present disclosure.

Referring to FIG. 5A, the device (1000) can sequentially capture images of the periphery of the device (1000) at predetermined shooting angle intervals within the shooting angle range. In addition, a baseline value (b) for the first image and the n-th image, which are the two images with the greatest shooting angle difference within the shooting angle range, can be determined by the shooting center of the camera module (1400-1) when capturing the first image and the shooting center of the camera module (1400-n) when capturing the n-th image. For example, the distance between the lens center of the camera module (1400-1) when capturing the first image and the lens center of the camera module (1400-n) when capturing the n-th image can be the baseline value (b).
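As a geometric illustration of how such a baseline could arise from rotation, the sketch below assumes that the lens center sits at a fixed distance from the rotation center (30) and moves along a circle in a single plane, so that the baseline is the chord between the two lens positions. The radius, the function name `baseline_from_rotation`, and this circular-motion model are assumptions for illustration; the disclosure itself only defines the baseline as the distance between the two lens centers.

```python
import math


def baseline_from_rotation(radius, angle_first_deg, angle_nth_deg):
    """Chord length between two lens-center positions on a circle.

    radius: assumed distance from the rotation center to the lens center
    (same length unit as the returned baseline).
    """
    delta = math.radians(abs(angle_nth_deg - angle_first_deg))
    return 2.0 * radius * math.sin(delta / 2.0)


if __name__ == "__main__":
    # Hypothetical 2 cm radius, views at -30 and +30 degrees.
    print(baseline_from_rotation(0.02, -30, 30))
```

Under this model a larger angle difference gives a longer baseline, which is consistent with selecting the two images with the greatest shooting angle difference.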

FIG. 5B is a diagram illustrating an example in which a device virtually arranges the camera module according to one embodiment of the present disclosure.

Referring to FIG. 5B, the device (1000) can virtually arrange the camera module (1400-1) at the time of capturing the first image and the camera module (1400-n) at the time of capturing the n-th image so that the shooting direction (53) of the camera module (1400-1) and the shooting direction (54) of the camera module (1400-n) become parallel to each other, while maintaining the baseline value (b) between the two camera module positions.

FIG. 5C is a diagram illustrating an example in which a device calculates a disparity value according to one embodiment of the present disclosure.

Referring to FIG. 5C, the device (1000) may arrange the first image and the n-th image in order to calculate a disparity value (d) between the identical feature regions in the first image and the n-th image. For example, the device (1000) may arrange the first image and the n-th image along the vertical axis (55) and the vertical axis (56), respectively. The vertical edges of the first image and the n-th image may not coincide with the vertical axis (55) and the vertical axis (56). In this case, the device (1000) may rectify the first image and the n-th image according to a predetermined algorithm so that their vertical edges coincide with the vertical axis (55) and the vertical axis (56).
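The rectification algorithm is not specified in the text. One possible sketch, offered only as an illustration and not as the disclosed method, reprojects each pixel as if the camera had been turned about its own lens center by a yaw angle, which corresponds to making the two shooting directions parallel. It assumes an ideal pinhole camera without lens distortion; the function `rotate_pixel_yaw`, the parameter names, and the sign convention of the yaw angle are all assumptions.

```python
import math


def rotate_pixel_yaw(x, y, yaw_deg, focal_px, cx, cy):
    """Reproject pixel (x, y) into a virtual view turned by yaw_deg.

    focal_px: focal length in pixels; (cx, cy): principal point in pixels.
    """
    t = math.radians(yaw_deg)
    # Back-project the pixel to a viewing ray (u, v, 1).
    u = (x - cx) / focal_px
    v = (y - cy) / focal_px
    # Rotate the ray about the camera's vertical (y) axis.
    ray_x = u * math.cos(t) + math.sin(t)
    ray_z = -u * math.sin(t) + math.cos(t)
    if ray_z <= 0:
        raise ValueError("pixel falls outside the virtual view")
    # Project the rotated ray back onto the image plane.
    return cx + focal_px * ray_x / ray_z, cy + focal_px * v / ray_z


if __name__ == "__main__":
    print(rotate_pixel_yaw(320, 240, 15, 500, 320, 240))
```

For views captured symmetrically about the reference axis, each image would be warped by the opposite of its own shooting angle so that both virtual views face the same direction; whether the device does this is not stated in the text.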

In addition, the device (1000) can calculate the disparity value (d) between the first feature region and the n-th feature region based on the position of the first feature region in the arranged first image and the position of the n-th feature region in the arranged n-th image. For example, the disparity value (d) between the first feature region and the n-th feature region can be determined by the distance between a vertical line (57) passing through the first feature region and a vertical line (58) passing through the n-th feature region.
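A minimal sketch of this disparity measurement follows. It assumes that each feature region is given as a bounding box `(x_min, y_min, x_max, y_max)` in the arranged image and that the vertical line passes through the horizontal center of the box; both the box representation and the helper names are hypothetical.

```python
def center_x(box):
    """Horizontal center of a bounding box (x_min, y_min, x_max, y_max)."""
    x_min, _, x_max, _ = box
    return (x_min + x_max) / 2.0


def disparity_px(box_first, box_nth):
    """Distance in pixels between the vertical lines through two regions."""
    return abs(center_x(box_first) - center_x(box_nth))


if __name__ == "__main__":
    first_region = (100, 50, 140, 90)  # region in the arranged first image
    nth_region = (60, 52, 100, 92)     # same region in the arranged n-th image
    print(disparity_px(first_region, nth_region))
```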

FIG. 6 is a diagram illustrating an example in which a device calculates the depth of a feature region according to one embodiment of the present disclosure.

Referring to FIG. 6, the depth of the identical feature regions in the two images can be calculated using the disparity value for the identical feature regions, the baseline value of the camera module (1400), and the focal length of the camera module (1400). For example, by similar triangles, the ratio between the depth (z) of the feature region and the focal length (f) of the camera module (1400) is equal to the ratio between the baseline value (b) and the disparity value (d), so the depth (z) of the feature region can be calculated as in Equation 1 below.

<Equation 1>

Depth of feature region (z) = {baseline value (b) x focal length (f)} / disparity value (d)

For example, in Equation 1, the depth (z) of the feature region is the depth of the first feature region and the n-th feature region, which are identical feature regions in the images; the disparity value (d) is the disparity value between the first feature region and the n-th feature region; the baseline value (b) is the distance between the lens center of the camera module (1400) when capturing the first feature region and the lens center of the camera module (1400) when capturing the n-th feature region; and the focal length (f) may be the focal length of the camera module (1400) when capturing the first feature region and the n-th feature region.
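A minimal sketch of the depth computation, using the standard pinhole relation for two parallel (rectified) views, z = f x b / d, with the quantities defined above. The unit choices (focal length and disparity in pixels, baseline in meters) and the function name `depth_from_disparity` are assumptions for illustration.

```python
def depth_from_disparity(disparity_px, baseline_m, focal_px):
    """Depth in meters from disparity (px), baseline (m), focal length (px)."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical values: 40 px disparity, 2 cm baseline, 500 px focal length.
    print(depth_from_disparity(40.0, 0.02, 500.0))
```

Because depth is inversely proportional to disparity, a one-pixel disparity error matters less when the disparity is large, which is one reason a wider baseline tends to give a more accurate depth.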

FIG. 7 is a flowchart of a method by which a device calculates the depth of a feature region in an image according to one embodiment of the present disclosure.

In operation S700, the device (1000) can successively capture images of the periphery of the device (1000) while rotating the camera module (1400). By controlling the camera module (1400) and the shooting direction control unit (1500), the device (1000) can obtain a plurality of images by sequentially capturing the periphery of the device (1000) at a determined shooting interval within the identified shooting angle range. The camera module (1400) can capture the periphery of the device (1000) multiple times at the shooting interval while rotating within the shooting angle range. Adjacent images among the images captured by the camera module (1400) may partially overlap each other.

In operation S710, the device (1000) can identify identical feature regions in the images by comparing adjacent images among the plurality of captured images. The device (1000) can identify that a first feature region in the first image is identical to a second feature region in the second image. In addition, the device (1000) can identify that the second feature region in the second image is identical to a third feature region in the third image. In this way, by sequentially comparing pairs of adjacent images, the device (1000) can identify that an (n-1)-th feature region in the (n-1)-th image is identical to an n-th feature region in the n-th image.

In operation S720, the device (1000) may select, from among the plurality of images, a first image and an n-th image whose shooting angle difference is greater than or equal to a predetermined value. For example, when n images have been captured, the device (1000) may select the first image, captured at the smallest shooting angle within the shooting angle range, and the n-th image, captured at the largest shooting angle, but the selection is not limited thereto. The device (1000) may identify identical feature regions from any two images whose shooting angle difference is greater than or equal to a predetermined threshold, and calculate the depth of the feature regions using the identified feature regions. For example, the device (1000) may select, from among the first to n-th images, the first image and the (n-3)-th image, whose shooting angle difference is greater than or equal to the predetermined threshold, and identify identical feature regions from the selected first image and (n-3)-th image.

In operation S730, the device (1000) can obtain a baseline value based on the shooting center of the camera module (1400) when capturing the first image and the shooting center of the camera module (1400) when capturing the n-th image. The baseline value may be a value representing the distance between the shooting center of the camera module (1400) when capturing one image and the shooting center of the camera module (1400) when capturing the other image. The shooting center of the camera module (1400) may be, for example, the center point of the lens of the camera module (1400).

In operation S740, the device (1000) can calculate a disparity value between the first feature region in the first image and the n-th feature region in the n-th image. The device (1000) can virtually arrange the camera module at the time of capturing the first image and the camera module at the time of capturing the n-th image while maintaining the baseline value, and arrange the first image and the n-th image according to a predetermined criterion. In addition, the device (1000) can calculate the disparity value between the first feature region and the n-th feature region based on the position of the first feature region in the arranged first image and the position of the n-th feature region in the arranged n-th image.

In operation S750, the device (1000) can calculate the depth of the feature region based on the disparity value, the baseline value, and the focal length of the camera module. The device (1000) can calculate the depth of the identical feature regions in the two images using the disparity value for the identical feature regions, the baseline value of the camera module (1400), and the focal length of the camera module (1400). The device (1000) can calculate the depth of the feature region from the proportional relationship among the baseline value, the focal length, the disparity value, and the depth, as in Equation 1.

FIG. 8 is a flowchart of a method by which a device calculates a disparity value according to one embodiment of the present disclosure.

In operation S800, the device (1000) can virtually arrange the camera module (1400-1) at the time of capturing the first image and the camera module (1400-n) at the time of capturing the n-th image while maintaining the baseline value. The device (1000) can virtually arrange the two camera module positions so that the shooting direction of the camera module (1400-1) when capturing the first image and the shooting direction of the camera module (1400-n) when capturing the n-th image become parallel to each other.

In operation S810, the device (1000) can arrange the first image and the n-th image. The device (1000) can arrange the first image and the n-th image along the vertical axis (55) and the vertical axis (56), respectively. When the vertical edges of the first image and the n-th image do not coincide with the vertical axis (55) and the vertical axis (56), the device (1000) can rectify the first image and the n-th image according to a predetermined algorithm so that their vertical edges coincide with the vertical axis (55) and the vertical axis (56).

In operation S820, the device (1000) can calculate the distance between the first feature region in the first image and the n-th feature region in the n-th image. The device (1000) can calculate the distance between the first feature region and the n-th feature region based on the position of the first feature region in the first image arranged in operation S810 and the position of the n-th feature region in the n-th image arranged in operation S810.

FIG. 9 is a flowchart of a method by which a device recognizes its periphery according to a shooting mode, according to one embodiment of the present disclosure.

In operation S900, the device (1000) can identify the shooting mode. For example, when the device (1000) is powered on or is activated from an inactive state, the device (1000) can identify the shooting mode of the camera module (1400) as the gesture recognition mode in order to receive a gesture input from the user. In addition, the device (1000) can recognize the user's gesture in the gesture recognition mode and change the shooting mode of the camera module (1400) according to the recognized gesture. The device (1000) can execute an application corresponding to the gesture and change the shooting mode of the camera module (1400) to the shooting mode required by the executed application. In addition, for example, when an application that does not require capturing through the camera module (1400) is executed, the device (1000) can deactivate the camera module (1400).

Alternatively, for example, when the device (1000) is powered on or is activated from an inactive state, the device (1000) may identify the shooting mode of the camera module (1400) as the space recognition mode.

If the shooting mode is identified as the gesture mode in operation S900, the device (1000) can identify, in operation S905, a first shooting angle range corresponding to the gesture mode. For example, when the shooting mode of the camera module (1400) is the gesture mode, the camera module (1400) can be set to capture the periphery of the device (1000) within a shooting angle range of -15 degrees to 15 degrees. For example, if the angle at which the camera module (1400) faces the front is taken as 0 degrees and the shooting angle range is -15 degrees to 15 degrees, the camera module (1400) can rotate up to 15 degrees to the left of the front (-15 degrees) and up to 15 degrees to the right of the front (+15 degrees).

In operation S910, the device (1000) can capture the periphery of the device (1000) within the first shooting angle range. By controlling the camera module (1400) and the shooting direction control unit (1500), the device (1000) can obtain a plurality of images by sequentially capturing the periphery of the device (1000) at a predetermined shooting interval within the first shooting angle range. In this case, the device (1000) can determine the shooting interval of the camera module (1400).

In operation S915, the device (1000) can identify an object related to a gesture from the captured images. The device (1000) can calculate depth values of feature regions related to the various objects in the captured images and compare the calculated depth values. By comparing the calculated depth values, the device (1000) can identify the object related to the user's gesture. In addition, the device (1000) can identify the shape and position of the identified object.

In operation S920, the device (1000) can identify the gesture based on the shape and position of the object. The device (1000) can identify the user's gesture based on changes in the shape of the object and changes in the position of the object.

If the shooting mode is identified as the space recognition mode in operation S900, the device (1000) can identify, in operation S950, a second shooting angle range corresponding to the space recognition mode. For example, when the shooting mode of the camera module (1400) is the space recognition mode, the camera module (1400) can be set to capture the periphery of the device (1000) within a shooting angle range of -60 degrees to 60 degrees. For example, if the angle at which the camera module (1400) faces the front is taken as 0 degrees and the shooting angle range is -60 degrees to 60 degrees, the camera module (1400) can rotate up to 60 degrees to the left of the front (-60 degrees) and up to 60 degrees to the right of the front (+60 degrees).

In operation S955, the device (1000) can capture the periphery of the device (1000) within the second shooting angle range. By controlling the camera module (1400) and the shooting direction control unit (1500), the device (1000) can obtain a plurality of images by sequentially capturing the periphery of the device (1000) at a predetermined shooting interval within the second shooting angle range. In this case, the device (1000) can determine the shooting interval of the camera module (1400).

In operation S960, the device (1000) can identify surrounding objects from the captured images. The device (1000) can calculate depth values of feature regions related to the various objects in the captured images and compare the calculated depth values. By comparing the calculated depth values, the device (1000) can identify the objects around the device (1000) and thereby recognize the space surrounding the device (1000).

According to one embodiment, the device (1000) can identify an object at which the user is gazing. When the user's pupils remain fixed for a predetermined time or longer, the device (1000) can sequentially capture the area around the direction in which the user is looking while rotating the camera module (1400) within a predetermined shooting angle range. In addition, the device (1000) can generate a depth map of the area around the direction in which the user is looking and identify the object at which the user is gazing. In addition, the device (1000) can display additional information about the identified object. The additional information may include, for example, an identification value of the object and a depth value of the object, but is not limited thereto. The device (1000) may also display a GUI for providing the user with the additional information about the identified object.

FIG. 10A is a diagram illustrating an example in which a device captures a user's hand motion in the gesture mode according to one embodiment of the present disclosure.

Referring to FIG. 10A, the user may perform a gesture of bending a finger (102), and in the gesture mode the device (1000) may sequentially capture the finger (102) multiple times as it bends. For example, the device (1000) may sequentially capture the finger (102) while rotating the camera module (1400) multiple times at a predetermined shooting interval within the first shooting angle range corresponding to the gesture mode. Accordingly, the device (1000) may obtain a plurality of image sets, each including n images. For example, the device (1000) may obtain a first image set (104) captured while the finger (102) is extended and a second image set (105) captured while the finger (102) is bent.

FIG. 10B is a diagram illustrating an example in which the device (1000) recognizes a gesture from image sets according to one embodiment of the present disclosure.

Referring to FIG. 10B, the device (1000) can generate a first depth map (112) from the first image set (104) captured while the finger (102) is extended, and a second depth map (113) from the second image set (105) captured while the finger (102) is bent.

In addition, the device (1000) can identify the extended finger (115) from the first depth map (112) and the bent finger (116) from the second depth map (113). The device (1000) can then analyze the shape of the extended finger (115) and the shape of the bent finger (116) to identify that the user's gesture is a 'click' motion.

FIGS. 11A to 11C are diagrams illustrating examples in which a device analyzes the space around the device according to one embodiment of the present disclosure.

FIG. 11A is a diagram illustrating an example in which a device groups a plurality of images, acquired by photographing the surroundings of the device, into a plurality of image sets according to one embodiment of the present disclosure.

Referring to FIG. 11A, the device (1000) may sequentially photograph the surroundings of the device (1000) while rotating the camera module (1400) multiple times at a predetermined shooting interval within a second shooting angle range corresponding to the space recognition mode. In addition, the device (1000) may obtain a plurality of image sets by grouping adjacent images among the plurality of images. Each image set may consist of consecutive images. Some of the consecutive images in one image set may also belong to an adjacent image set.

FIG. 11B is a diagram illustrating an example of a plurality of image sets grouped by a device according to one embodiment of the present disclosure.

Referring to FIG. 11B, the device (1000) can obtain a first image set (114), a second image set (115), and a third image set (116) from a plurality of images captured around the device (1000). The first image set (114) can include image (1), image (2), image (3), and image (4); the second image set (115) can include image (3), image (4), image (5), and image (6); and the third image set (116) can include image (5), image (6), image (7), and image (8).
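The grouping in this example can be read as a sliding window over the captured sequence. The following Python sketch reproduces the three sets listed above under that reading. The function name `group_into_sets` and the parameters `set_size` and `stride` are hypothetical, and the window of four images advanced by two images is inferred from this example only, not stated as a requirement of the disclosure.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_into_sets(images: Sequence[T], set_size: int = 4, stride: int = 2) -> List[List[T]]:
    """Group consecutive images into overlapping sets (sliding window).

    set_size and stride are illustrative parameters; with stride < set_size,
    neighbouring sets share set_size - stride images.
    """
    if set_size <= 0 or stride <= 0:
        raise ValueError("set_size and stride must be positive")
    sets = []
    for start in range(0, len(images) - set_size + 1, stride):
        sets.append(list(images[start:start + set_size]))
    return sets


# Images (1) to (8), identified here only by their numbers.
image_ids = [1, 2, 3, 4, 5, 6, 7, 8]
image_sets = group_into_sets(image_ids)
for index, image_set in enumerate(image_sets, start=1):
    print(f"image set {index}: {image_set}")
```

With these assumed parameters, each set shares two images with its neighbour, which matches the overlap between adjacent image sets described for FIG. 11A.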

In addition, although the feature region (9) disappears from image (5), it is present in the images of the first image set (114), so the device (1000) can calculate the depth value of the feature region (9) from the first image set (114).
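As an illustration of how a depth value could follow from two images of one set, the sketch below applies the standard pinhole stereo relation, depth = focal length × baseline / disparity, to two views that are assumed to have already been arranged so that their shooting directions are parallel. The helper names, the units, and the chord formula for the baseline (which assumes the lens travels on a circle of known radius around the rotation axis) are assumptions made for illustration, not details taken from the disclosure.

```python
import math


def chord_baseline(lens_radius_m: float, rotation_rad: float) -> float:
    """Distance between two lens positions on a circle of radius lens_radius_m.

    Assumption: the lens sits lens_radius_m from the rotation axis, so rotating
    by rotation_rad moves it along a chord of length 2 * r * sin(angle / 2).
    """
    return 2.0 * lens_radius_m * math.sin(abs(rotation_rad) / 2.0)


def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters from the pinhole stereo relation Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: lens 5 cm from the axis, 20 degrees between the two shots.
baseline = chord_baseline(0.05, math.radians(20.0))
depth = depth_from_disparity(focal_length_px=800.0, baseline_m=baseline, disparity_px=12.0)
print(f"baseline = {baseline:.4f} m, depth = {depth:.3f} m")
```

Because depth is inversely proportional to disparity in this relation, a feature region must be visible in both images being compared, which is why a region that has left the frame by image (5) is measured from an earlier image set.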

FIG. 11C is a diagram illustrating an example in which a device generates and analyzes a depth map representing the space around the device from depth maps generated from a plurality of image sets according to one embodiment of the present disclosure.

In addition, the device (1000) can generate a first partial depth map (117) from the first image set (114), a second partial depth map (118) from the second image set (115), and a third partial depth map (119) from the third image set (116). The device (1000) can then generate and analyze a full depth map (120) using the first partial depth map (117), the second partial depth map (118), and the third partial depth map (119).
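This passage does not spell out how the partial depth maps are combined. One simple possibility, shown purely as a sketch, is to place each partial map at a known column offset in a common panorama and to average the depth values wherever maps overlap. The function name `merge_partial_depth_maps`, the column-offset representation, and the averaging rule are all assumptions.

```python
from typing import List, Optional, Tuple

DepthMap = List[List[float]]


def merge_partial_depth_maps(
    partials: List[Tuple[int, DepthMap]], width: int, height: int
) -> List[List[Optional[float]]]:
    """Merge partial depth maps into one map of size height x width.

    Each entry of partials is (column_offset, depth_map). Overlapping cells
    are averaged; cells covered by no partial map are left as None.
    """
    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for column_offset, depth_map in partials:
        for y, row in enumerate(depth_map):
            for x, value in enumerate(row):
                sums[y][column_offset + x] += value
                counts[y][column_offset + x] += 1
    return [
        [sums[y][x] / counts[y][x] if counts[y][x] else None for x in range(width)]
        for y in range(height)
    ]


# Two hypothetical one-row partial maps that overlap in column 2.
full_map = merge_partial_depth_maps(
    [(0, [[1.0, 1.0, 2.0]]), (2, [[4.0, 3.0]])], width=5, height=1
)
print(full_map)
```

Averaging is only one choice; an implementation could equally keep the value from the image set in which the feature region is measured with the largest disparity.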

An embodiment of the present disclosure may also be implemented in the form of a recording medium containing computer-executable instructions, such as program modules executed by a computer. A computer-readable medium can be any available medium that can be accessed by a computer, and includes volatile and nonvolatile media as well as removable and non-removable media. A computer-readable medium can also include computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal.

In addition, the computer-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory storage medium' means only that the medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between data being stored semi-permanently in the storage medium and data being stored temporarily. For example, a 'non-transitory storage medium' may include a buffer in which data is temporarily stored.

According to one embodiment, the method according to the various embodiments disclosed in this document may be provided as part of a computer program product. The computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., a compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a part of the computer program product (e.g., a downloadable app) may be at least temporarily stored in, or temporarily generated in, a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

In addition, in this specification, a “unit” may be a hardware component such as a processor or a circuit, and/or a software component executed by a hardware component such as a processor.

In addition, in this specification, “including at least one of a, b, or c” can mean including only a, including only b, including only c, including a and b, including b and c, including a and c, or including all of a, b, and c.

The functions related to artificial intelligence according to the present disclosure are performed through a processor and a memory. The processor may consist of one or more processors. The one or more processors may be a general-purpose processor such as a CPU, an AP, or a DSP (Digital Signal Processor), a graphics-dedicated processor such as a GPU or a VPU (Vision Processing Unit), or an AI-dedicated processor such as an NPU. The one or more processors control input data to be processed according to a predefined operation rule or an artificial intelligence model stored in the memory. Alternatively, when the one or more processors are AI-dedicated processors, they may be designed with a hardware structure specialized for processing a specific artificial intelligence model.

The predefined operation rule or artificial intelligence model is characterized by being created through learning. Here, being created through learning means that a basic artificial intelligence model is trained by a learning algorithm on a large amount of training data, thereby producing a predefined operation rule or artificial intelligence model set to perform a desired characteristic (or purpose). Such learning may be performed on the device on which the artificial intelligence according to the present disclosure runs, or through a separate server and/or system. Examples of the learning algorithm include supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, but are not limited to these examples.

The artificial intelligence model may consist of a plurality of neural network layers. Each of the neural network layers has a plurality of weight values and performs its neural network operation by combining the operation result of the previous layer with those weights. The weights of the neural network layers may be optimized by the training result of the artificial intelligence model. For example, the weights may be updated so that a loss value or a cost value obtained from the artificial intelligence model during training is reduced or minimized. The artificial neural network may include a deep neural network (DNN); examples include, but are not limited to, a CNN (Convolutional Neural Network), a DNN (Deep Neural Network), an RNN (Recurrent Neural Network), an RBM (Restricted Boltzmann Machine), a DBN (Deep Belief Network), a BRDNN (Bidirectional Recurrent Deep Neural Network), and Deep Q-Networks.
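The two ideas in this paragraph, a layer computing its output from the previous layer's result and its weights, and weights being updated so that a loss value decreases, can be sketched in a few lines of Python. This is a generic illustration using a ReLU activation, a squared-error loss, and plain gradient descent; none of these choices, and none of the function names, come from the disclosure.

```python
from typing import List


def layer_forward(inputs: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """One layer: weighted sum of the previous layer's outputs, then ReLU."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, z))
    return outputs


def squared_loss(weights: List[float], inputs: List[float], target: float) -> float:
    """Squared error of a single linear neuron."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    return (prediction - target) ** 2


def gradient_step(weights: List[float], inputs: List[float], target: float, learning_rate: float) -> List[float]:
    """One gradient-descent update; d(loss)/d(w_i) = 2 * error * x_i."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = prediction - target
    return [w - learning_rate * 2.0 * error * x for w, x in zip(weights, inputs)]


# Forward pass through a hypothetical layer with two neurons.
hidden = layer_forward([1.0, 2.0], [[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0])

# One weight update on a single neuron, comparing the loss before and after.
weights = [0.5, -0.5]
loss_before = squared_loss(weights, [1.0, 2.0], 1.0)
weights = gradient_step(weights, [1.0, 2.0], 1.0, learning_rate=0.1)
loss_after = squared_loss(weights, [1.0, 2.0], 1.0)
print(hidden, loss_before, loss_after)
```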

The artificial intelligence model according to the present disclosure can take image data as input and output data in which an image, or a feature region within an image, has been recognized. The artificial intelligence model can be created through learning. Here, being created through learning means that a basic artificial intelligence model is trained by a learning algorithm on a large amount of training data, thereby producing a predefined operation rule or artificial intelligence model set to perform a desired characteristic (or purpose). The artificial intelligence model may consist of a plurality of neural network layers. Each of the neural network layers has a plurality of weight values and performs its neural network operation by combining the operation result of the previous layer with those weights.

Visual understanding is a technology for recognizing and processing objects in the way human vision does, and includes object recognition, object tracking, image retrieval, human recognition, scene recognition, spatial understanding (3D reconstruction/localization), and image enhancement.

The above description of the present disclosure is for illustrative purposes, and those of ordinary skill in the art to which the present disclosure pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present disclosure. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise, components described as distributed may be implemented in a combined form.

The scope of the present disclosure is indicated by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be interpreted as falling within the scope of the present disclosure.

Claims

A method by which a device obtains the depth of a feature region using one camera module, the method comprising:
acquiring a plurality of images by sequentially rotating the one camera module at a preset angle and photographing the surroundings of the one camera module multiple times;
identifying a first feature region in a first image and an n-th feature region in an n-th image that is identical to the first feature region, by comparing adjacent images between the first image and the n-th image among the plurality of images;
obtaining a baseline value for the first image and the n-th image based on the arrangement of the one camera module when capturing the first image and the arrangement of the one camera module when capturing the n-th image;
obtaining a disparity value between the first feature region and the n-th feature region based on the location of the first feature region in the first image and the location of the n-th feature region in the n-th image; and
calculating the depth of the first feature region or the n-th feature region based on the baseline value and the disparity value.

The method of claim 1,
wherein acquiring the plurality of images comprises sequentially photographing the surroundings of the one camera module while panning the one camera module at a predetermined angular interval within a predetermined shooting angle range.

The method of claim 1,
wherein identifying the first feature region and the n-th feature region comprises:
identifying feature regions that are identical to the first feature region among feature regions in the first image to the n-th image, by sequentially comparing adjacent images among the first image to the n-th image in the plurality of images.

The method of claim 1,
wherein the shooting angle of the one camera module is changed as the one camera module is sequentially rotated at the preset angle, and
wherein the baseline value for the first image and the n-th image is determined based on a distance value between a position of a camera lens of the one camera module when acquiring the first image and a position of the camera lens of the one camera module when acquiring the n-th image.

The method of claim 1,
wherein obtaining the disparity value comprises:
virtually arranging the one camera module as it was when capturing the first image and the one camera module as it was when capturing the n-th image, so that the shooting direction of the one camera module when capturing the first image and the shooting direction of the one camera module when capturing the n-th image become parallel to each other;
arranging the first image and the n-th image based on the shooting directions of the virtually arranged one camera module; and
obtaining a distance value between the first feature region in the arranged first image and the n-th feature region in the arranged n-th image.

The method of claim 5,
wherein the virtually arranging comprises virtually arranging the one camera module as it was when capturing the first image and the one camera module as it was when capturing the n-th image while maintaining the baseline value for the first image and the n-th image.

The method of claim 1,
wherein calculating the depth comprises calculating the depth based on the baseline value, the disparity value, and the focal length of the one camera module when capturing the first image and the n-th image.

The method of claim 1, further comprising:
determining a shooting mode of the one camera module; and
identifying a preset shooting angle range according to the shooting mode,
wherein acquiring the plurality of images comprises acquiring the plurality of images while sequentially rotating the one camera module at the preset angle within the identified shooting angle range.

The method of claim 8,
wherein the shooting mode includes a gesture recognition mode for identifying a user's gesture around the one camera module and a space recognition mode for recognizing a space around the one camera module.

The method of claim 1,
wherein acquiring the plurality of images further comprises:
determining a focal length of the one camera module,
and wherein the surroundings of the one camera module are photographed according to the determined focal length.

A device that obtains the depth of a feature region using one camera module, the device comprising:
the one camera module;
a shooting direction control unit that rotates the one camera module at a predetermined angle;
a display;
a memory storing one or more instructions; and
a processor that executes the one or more instructions,
wherein the processor is configured to:
control the shooting direction control unit and the one camera module to acquire a plurality of images by sequentially rotating the one camera module at a preset angle and photographing the surroundings of the one camera module multiple times;
identify a first feature region in a first image and an n-th feature region in an n-th image that is identical to the first feature region, by comparing adjacent images between the first image and the n-th image among the plurality of images;
obtain a baseline value for the first image and the n-th image based on the arrangement of the one camera module when capturing the first image and the arrangement of the one camera module when capturing the n-th image;
obtain a disparity value between the first feature region and the n-th feature region based on the location of the first feature region in the first image and the location of the n-th feature region in the n-th image; and
calculate the depth of the first feature region or the n-th feature region based on the baseline value and the disparity value.

The device of claim 11,
wherein the processor, by executing the one or more instructions, sequentially photographs the surroundings of the one camera module while panning the one camera module at a predetermined angular interval within a predetermined shooting angle range.

The device of claim 11,
wherein the processor, by executing the one or more instructions,
identifies feature regions that are identical to the first feature region among feature regions in the first image to the n-th image, by sequentially comparing adjacent images among the first image to the n-th image in the plurality of images.

The device of claim 11,
wherein the shooting angle of the one camera module is changed as the one camera module is sequentially rotated at the preset angle, and
wherein the baseline value for the first image and the n-th image is determined based on a distance value between a position of a camera lens of the one camera module when acquiring the first image and a position of the camera lens of the one camera module when acquiring the n-th image.

The device of claim 11,
wherein the processor, by executing the one or more instructions:
virtually arranges the one camera module as it was when capturing the first image and the one camera module as it was when capturing the n-th image, so that the shooting direction of the one camera module when capturing the first image and the shooting direction of the one camera module when capturing the n-th image become parallel to each other;
arranges the first image and the n-th image based on the shooting directions of the virtually arranged one camera module; and
obtains a distance value between the first feature region in the arranged first image and the n-th feature region in the arranged n-th image.

The device of claim 15,
wherein the processor, by executing the one or more instructions,
virtually arranges the one camera module as it was when capturing the first image and the one camera module as it was when capturing the n-th image while maintaining the baseline value for the first image and the n-th image.

The device of claim 11,
wherein the processor, by executing the one or more instructions,
calculates the depth based on the baseline value, the disparity value, and the focal length of the one camera module when capturing the first image and the n-th image.

The device of claim 11,
wherein the processor, by executing the one or more instructions:
determines a shooting mode of the one camera module;
identifies a preset shooting angle range according to the shooting mode; and
acquires the plurality of images by sequentially rotating the one camera module at the preset angle within the identified shooting angle range.

The device of claim 18,
wherein the shooting mode includes a gesture recognition mode for identifying a user's gesture around the one camera module and a space recognition mode for recognizing a space around the one camera module.

A computer-readable recording medium having recorded thereon a program for executing the method of claim 1 on a computer.