KR20200044175A

KR20200044175A - Electronic apparatus and assistant service providing method thereof

Info

Publication number: KR20200044175A
Application number: KR1020180119100A
Authority: KR
Inventors: 장원남; 김수연; 조성래
Original assignee: 삼성전자주식회사
Priority date: 2018-10-05
Filing date: 2018-10-05
Publication date: 2020-04-29
Also published as: US20200111490A1; US20220270605A1; WO2020071858A1; US11817097B2; US11302319B2

Abstract

An electronic device is disclosed. The electronic device comprises a communication unit, a memory, and a processor that is connected to the communication unit and the memory and controls the electronic device. By executing at least one instruction stored in the memory, the processor, when a user input for executing an assistant service is received, transmits information about a user voice acquired by the electronic device through the communication unit to a plurality of servers providing different assistant services and, when a plurality of pieces of response information are received from the plurality of servers, provides a response to the user voice based on at least one of the received pieces of response information. The plurality of servers provide the assistant services using an artificial intelligence agent.

Description

Electronic apparatus and assistant service providing method thereof {ELECTRONIC APPARATUS AND ASSISTANT SERVICE PROVIDING METHOD THEREOF}

The present disclosure relates to an electronic device and a method of providing an assistant service thereof, and more specifically to an electronic device that provides an assistant service and the method by which it does so.

In recent years, artificial intelligence systems that implement human-level intelligence have been used in various fields. Unlike a conventional rule-based smart system, an artificial intelligence system is one in which the machine learns, makes judgments, and becomes smarter on its own. The more an artificial intelligence system is used, the higher its recognition rate becomes and the more accurately it understands the user's preferences, so existing rule-based smart systems are gradually being replaced by deep-learning-based artificial intelligence systems.

Artificial intelligence technology consists of machine learning (e.g., deep learning) and element technologies that make use of machine learning.

Machine learning is an algorithmic technology that classifies and learns the features of input data on its own. Element technologies use machine learning algorithms such as deep learning to simulate functions of the human brain such as cognition and judgment, and span technical fields such as linguistic understanding, visual understanding, inference/prediction, knowledge representation, and motion control.

Artificial intelligence technology is currently used in various fields. In particular, assistant services (e.g., a voice assistant or a voice assistant service) that use an artificial intelligence model to provide a response to a user input (especially a user voice) have recently become available.

Although each of these assistant services excels in a different domain, electronic devices currently on the market provide only a single assistant service. A way of providing assistant services more usefully to the user is therefore needed.

The present invention addresses the above need, and an object of the present invention is to provide an electronic device capable of providing various assistant services and a method of providing assistant services thereof.

An electronic device according to an embodiment of the present disclosure includes a communication unit, a memory, and a processor that is connected to the communication unit and the memory and controls the electronic device. By executing at least one instruction stored in the memory, the processor, when a user input for executing an assistant service is received, transmits information about a user voice acquired by the electronic device through the communication unit to a plurality of servers providing different assistant services and, when a plurality of pieces of response information are received from the plurality of servers, provides a response to the user voice based on at least one of the received pieces of response information. The plurality of servers provide the assistant services using an artificial intelligence agent.
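The fan-out described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `query_server`, `fan_out`, `provide_response`, the `ResponseInfo` type, and the service names are all hypothetical, and the network call is replaced by a stub.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ResponseInfo:
    service: str  # assistant service that produced the answer
    text: str     # natural-language answer


async def query_server(service: str, voice_info: bytes) -> ResponseInfo:
    """Hypothetical stand-in for a network call to one assistant server."""
    await asyncio.sleep(0)  # placeholder for network latency
    return ResponseInfo(service, f"{service} answer ({len(voice_info)} bytes heard)")


async def fan_out(voice_info: bytes, services: list[str]) -> list[ResponseInfo]:
    # Send the same voice information to every server concurrently.
    tasks = (query_server(s, voice_info) for s in services)
    return list(await asyncio.gather(*tasks))


def provide_response(responses: list[ResponseInfo], preferred: str) -> str:
    # Base the response on at least one piece of response information:
    # here, the one from the preferred service, else the first received.
    for r in responses:
        if r.service == preferred:
            return r.text
    return responses[0].text


responses = asyncio.run(fan_out(b"\x00\x01", ["assistant_a", "assistant_b"]))
answer = provide_response(responses, "assistant_a")
```

All responses are retained even though only one is used for the answer, which is what lets later steps reuse the others.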

Here, when a user input for executing a first assistant service is received, the processor may transmit the information about the user voice to a first server providing the first assistant service and a second server providing a second assistant service, and may provide a response to the user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first and second servers.

In addition, when a first wake-up word for executing the first assistant service is identified in a first user voice acquired by the electronic device, the processor may transmit information about the first user voice to the first server providing the first assistant service and the second server providing the second assistant service. The processor may then provide a response to the first user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first server and the second server in response to that transmission.

Here, the processor may determine whether a second user voice acquired after the first user voice is a voice requesting another assistant service's response to the first user voice.

In addition, when the second user voice is a voice requesting the second assistant service's response to the first user voice, the processor may provide a response to the second user voice based on the response information received from the second server, among the plurality of pieces of response information received in response to the transmission of the information about the first user voice.

In addition, when the second user voice is not a voice requesting another assistant service's response, the processor may transmit information about the second user voice to the first server and the second server. The processor may then provide a response to the second user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first server and the second server in response to that transmission.
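The wake-up word and follow-up handling above can be sketched as a small session object. This is only an illustration: the class `AssistantSession`, the service names `alpha` and `beta`, and the rule that a follow-up "requests another service's result" when it mentions that service's name are all assumptions made for the example; the source does not specify how that check is performed.

```python
class AssistantSession:
    """Keeps every server's latest response so a follow-up can reuse it."""

    def __init__(self, servers, wake_words):
        self.servers = servers        # service name -> callable(utterance) -> text
        self.wake_words = wake_words  # wake-up word -> service name
        self.active = None            # service chosen by the latest wake-up word
        self.cached = {}              # service name -> response to latest query

    def _ask_all(self, utterance):
        # Transmit the utterance to every server and keep all responses.
        self.cached = {name: ask(utterance) for name, ask in self.servers.items()}

    def _other_service_requested(self, lowered):
        # Naive assumption: naming another service asks for its earlier result.
        for name in self.servers:
            if name != self.active and name in lowered:
                return name
        return None

    def handle(self, utterance):
        lowered = utterance.lower()
        for word, service in self.wake_words.items():
            if lowered.startswith(word):
                self.active = service
                self._ask_all(utterance)
                return self.cached[service]
        if self.active is None:
            raise ValueError("no assistant service has been woken up yet")
        other = self._other_service_requested(lowered)
        if other is not None and other in self.cached:
            return self.cached[other]  # reuse the response already received
        self._ask_all(utterance)       # ordinary follow-up: query again
        return self.cached[self.active]


servers = {"alpha": lambda u: f"alpha:{u}", "beta": lambda u: f"beta:{u}"}
session = AssistantSession(servers, {"hi alpha": "alpha", "hi beta": "beta"})
first = session.handle("Hi Alpha, weather today")
second = session.handle("What does beta say?")
third = session.handle("And tomorrow?")
```

The second utterance is answered from the cache without a new round trip, because the second server's response to the first utterance had already been received.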

Meanwhile, when the user voice is acquired after the user input has been received through selection of a preset button, the processor may transmit the information about the user voice to the plurality of servers and, when a plurality of pieces of response information are received from the plurality of servers, provide a plurality of responses to the user voice based on the plurality of pieces of response information.

Here, the processor may output the plurality of responses in different forms based on at least one of the user's preference for the plurality of assistant services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.
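One way to picture "different forms" is to rank the responses and render the top one more prominently. The sketch below is an assumption-laden illustration: the equal weighting of preference and accuracy, the `Candidate` type, and the `"large"`/`"small"` style labels are invented for the example and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    service: str
    text: str
    accuracy: float  # hypothetical confidence score in [0, 1]


def layout(candidates, preference):
    """Rank responses and assign a display style to each (hypothetical rule)."""
    def score(c):
        # Assumed scoring: equal weight on user preference and accuracy.
        return 0.5 * preference.get(c.service, 0.0) + 0.5 * c.accuracy

    ranked = sorted(candidates, key=score, reverse=True)
    styles = ["large"] + ["small"] * (len(ranked) - 1)
    return [(c.service, style) for c, style in zip(ranked, styles)]


candidates = [
    Candidate("assistant_a", "Sunny today.", accuracy=0.6),
    Candidate("assistant_b", "Clear skies.", accuracy=0.9),
]
forms = layout(candidates, preference={"assistant_a": 0.9, "assistant_b": 0.2})
```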

Meanwhile, a method of providing an assistant service of an electronic device according to an embodiment of the present disclosure includes: when a user input for executing an assistant service is received, transmitting information about a user voice acquired by the electronic device to a plurality of servers providing different assistant services; and, when a plurality of pieces of response information are received from the plurality of servers, providing a response to the user voice based on at least one of the received pieces of response information. The plurality of servers provide the assistant services using an artificial intelligence agent.

Here, when a user input for executing a first assistant service is received, the transmitting may transmit the information about the user voice to a first server providing the first assistant service and a second server providing a second assistant service, and the providing may provide a response to the user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first and second servers.

In addition, when a first wake-up word for executing the first assistant service is identified in a first user voice acquired by the electronic device, the transmitting may transmit information about the first user voice to the first server providing the first assistant service and the second server providing the second assistant service. The providing may then provide a response to the first user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first server and the second server in response to that transmission.

Here, the method according to the present disclosure may further include determining whether a second user voice acquired after the first user voice is a voice requesting another assistant service's response to the first user voice.

In addition, the method according to the present disclosure may further include, when the second user voice is a voice requesting the second assistant service's response to the first user voice, providing a response to the second user voice based on the response information received from the second server, among the plurality of pieces of response information received in response to the transmission of the information about the first user voice.

In addition, the method according to the present disclosure may further include, when the second user voice is not a voice requesting another assistant service's response, transmitting information about the second user voice to the first server and the second server, and providing a response to the second user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first server and the second server in response to that transmission.

Meanwhile, when the user voice is acquired after the user input has been received through selection of a preset button, the transmitting may transmit the information about the user voice to the plurality of servers, and the providing may, when a plurality of pieces of response information are received from the plurality of servers, provide a plurality of responses to the user voice based on the plurality of pieces of response information.

Here, the providing may output the plurality of responses in different forms based on at least one of the user's preference for the plurality of assistant services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.

According to various embodiments of the present disclosure, the user can receive answers from various assistant services, which may improve the satisfaction and convenience of a user of assistant services.

FIG. 1 is a diagram illustrating a system for providing an assistant service according to an embodiment of the present disclosure;
FIGS. 2A and 2B are block diagrams illustrating a configuration of an electronic device according to an embodiment of the present disclosure;
FIGS. 3 to 18 are diagrams illustrating methods of providing an assistant service according to various embodiments of the present disclosure;
FIG. 19 is a block diagram illustrating a configuration of an electronic device according to an embodiment of the present disclosure;
FIGS. 20 and 21 are diagrams illustrating methods of providing a guide related to an assistant service according to various embodiments of the present disclosure; and
FIG. 22 is a flowchart illustrating a method of providing an assistant service according to an embodiment of the present disclosure.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings. This is not intended to limit the technology described in this document to specific embodiments, and it should be understood to include various modifications, equivalents, and/or alternatives of the embodiments of this document. In the description of the drawings, similar reference numerals may be used for similar components.

In this document, expressions such as "has," "may have," "includes," or "may include" indicate the presence of the corresponding feature (e.g., a component such as a numerical value, function, operation, or part) and do not exclude the presence of additional features.

In this document, expressions such as "A or B," "at least one of A and/or B," or "one or more of A and/or B" may include all possible combinations of the items listed together. For example, "A or B," "at least one of A and B," or "at least one of A or B" may refer to any of the cases of (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as "first" and "second" used in this document may modify various components regardless of order and/or importance. They are used only to distinguish one component from another and do not limit the components.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be connected to the other component directly or through yet another component (e.g., a third component). In contrast, when a component (e.g., a first component) is referred to as being "directly coupled" or "directly connected" to another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between the two.

The expression "configured to" used in this document may, depending on the situation, be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of." The term "configured to" does not necessarily mean only "specifically designed to" in hardware. Instead, in some situations, the expression "a device configured to" may mean that the device "can" perform an operation together with other devices or parts. For example, the phrase "a subprocessor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing the corresponding operations, or a general-purpose processor (e.g., a CPU or an application processor) capable of performing the corresponding operations by executing one or more software programs stored in a memory device.

An electronic device according to various embodiments of this document may include, for example, at least one of a smartphone, a tablet PC, a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a PDA, a portable multimedia player (PMP), an MP3 player, a medical device, a camera, or a wearable device. The wearable device may include at least one of an accessory type (e.g., a watch, ring, bracelet, anklet, necklace, glasses, contact lens, or head-mounted device (HMD)), a fabric- or clothing-integrated type (e.g., electronic clothing), a body-attached type (e.g., a skin pad or tattoo), or a bio-implantable circuit. In some embodiments, the electronic device may include, for example, at least one of a television, a digital video disk (DVD) player, a speaker, an audio system, a refrigerator, an air conditioner, a vacuum cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set-top box, a home automation control panel, a security control panel, a media box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console (e.g., Xbox™, PlayStation™), an electronic dictionary, an electronic key, a camcorder, or an electronic picture frame.

In other embodiments, the electronic device may include at least one of various medical devices (e.g., various portable medical measuring devices (a blood glucose meter, heart rate monitor, blood pressure monitor, or thermometer), magnetic resonance angiography (MRA), magnetic resonance imaging (MRI), computed tomography (CT), an imaging device, or an ultrasound device), a navigation device, a global navigation satellite system (GNSS), an event data recorder (EDR), a flight data recorder (FDR), an automotive infotainment device, marine electronic equipment (e.g., a marine navigation device, a gyrocompass), avionics, a security device, a vehicle head unit, an industrial or household robot, a drone, an ATM of a financial institution, a point of sales (POS) terminal of a store, or an Internet of Things device (e.g., a light bulb, various sensors, a sprinkler device, a fire alarm, a thermostat, a street light, a toaster, exercise equipment, a hot water tank, a heater, or a boiler).

In this document, the term "user" may refer to a person who uses the electronic device or a device that uses the electronic device (e.g., an artificial intelligence electronic device).

FIG. 1 is a diagram illustrating a system for providing an assistant service according to an embodiment of the present disclosure.

Although FIGS. 1A to 1D show the electronic device 100 as a television, this is only an example, and the electronic device 100 may be implemented as any of the various types of device described above.

First, referring to FIG. 1A, the electronic device 100 may provide an assistant service (or a voice assistant service).

To this end, the electronic device 100 may acquire a user voice and transmit information about the acquired user voice to a plurality of servers 200-1, 200-2, ..., 200-n.

As one example, the information about the user voice may be a voice file. That is, the electronic device 100 may generate a voice file containing the acquired user voice, for example a wave file, and transmit it to the plurality of servers 200-1, 200-2, ..., 200-n.

As another example, the information about the user voice may be a compressed voice file. That is, the electronic device 100 may compress the acquired user voice with a codec or the like to generate a compressed voice file and transmit it to the plurality of servers 200-1, 200-2, ..., 200-n.
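The two file forms above can be illustrated with the Python standard library. This is a sketch under assumptions: the 16 kHz, 16-bit mono format and the function name `to_wav_bytes` are choices made for the example, and `zlib` is a general-purpose lossless compressor used here only as a stand-in for the speech codec the text mentions.

```python
import io
import wave
import zlib


def to_wav_bytes(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container (assumed format)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)          # mono
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


pcm = b"\x00\x00" * 16000             # one second of silence as sample data
wav_bytes = to_wav_bytes(pcm)         # "voice file" form
compressed = zlib.compress(wav_bytes)  # "compressed voice file" form (stand-in)
```

Either `wav_bytes` or `compressed` could then be sent as the payload to each server.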

As another example, the electronic device 100 may perform at least one preprocessing operation on the acquired user voice, such as removing noise, removing the voices of speakers other than a specific speaker, or removing the audio signal that precedes the point at which the actual speech begins, and may transmit the preprocessed user voice to the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the preprocessed user voice may be transmitted to the plurality of servers 200-1, 200-2, ..., 200-n in the form of a voice file or a compressed voice file.
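Of the preprocessing operations listed, removing the audio that precedes the start of speech is the simplest to sketch. The example below assumes a plain amplitude threshold on integer PCM samples; the function name and the threshold value are hypothetical, and a real device would likely use a more robust voice activity detector.

```python
def trim_leading_silence(samples, threshold=500):
    """Drop samples before the first one whose magnitude reaches the threshold.

    Assumption: "speech starts" is approximated by a simple amplitude test.
    """
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return samples[i:]
    return samples[:0]  # nothing reached the threshold: treat as all silence


trimmed = trim_leading_silence([0, 3, -2, 800, -900, 10])
```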

As another example, the electronic device 100 may extract feature points from the acquired user voice and transmit information about the extracted feature points to the plurality of servers 200-1, 200-2, ..., 200-n.

Meanwhile, the plurality of servers 200-1, 200-2, ..., 200-n may be servers that provide assistant services using an artificial intelligence agent.

To this end, the plurality of servers 200-1, 200-2, ..., 200-n may include a dialogue system capable of providing a response to a user voice using an artificial intelligence model.

Specifically, the plurality of servers 200-1, 200-2, ..., 200-n may preprocess the user voice, perform speech recognition on the user voice to convert it into text, and identify the intent and entities of the user voice based on the speech recognition result. The plurality of servers 200-1, 200-2, ..., 200-n may then obtain information for the response to the user voice based on the natural language understanding result and, based on the obtained information, generate natural language as the response information for the user voice. The plurality of servers 200-1, 200-2, ..., 200-n may then transmit the response information to the electronic device 100.

To this end, the plurality of servers 200-1, 200-2, ..., 200-n may include an Automatic Speech Recognition (ASR) module, a Natural Language Understanding (NLU) module, a Dialogue Management (DM) module, a Natural Language Generation (NLG) module, and the like.
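The ASR, NLU, DM, and NLG stages can be pictured as a chain of functions. The sketch below uses toy stubs only to show how data flows between the stages: the "ASR" step simply decodes bytes as text, the intent rule is a keyword match, and the forecast table is made-up data. None of these details come from the source.

```python
FORECASTS = {"today": "sunny", "tomorrow": "rainy"}  # made-up data for the sketch


def asr(voice_info: bytes) -> str:
    # Stub: real ASR converts audio to text; here the bytes already are text.
    return voice_info.decode("utf-8")


def nlu(text: str):
    # Identify the intent and entities with a toy keyword rule.
    words = text.lower().rstrip("?.!").split()
    intent = "get_weather" if "weather" in words else "unknown"
    entities = {}
    for w in words:
        if w in FORECASTS:
            entities["day"] = w
    return intent, entities


def dialogue_manager(intent: str, entities: dict) -> dict:
    # Obtain the information needed for the response.
    if intent == "get_weather":
        day = entities.get("day", "today")
        return {"day": day, "forecast": FORECASTS[day]}
    return {}


def nlg(result: dict) -> str:
    # Turn the obtained information into natural language.
    if not result:
        return "Sorry, I did not understand."
    return f"The weather {result['day']} is {result['forecast']}."


def handle_voice(voice_info: bytes) -> str:
    intent, entities = nlu(asr(voice_info))
    return nlg(dialogue_manager(intent, entities))


reply = handle_voice(b"What is the weather today?")
```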

한편, 전자 장치(100)는 수신된 응답 정보에 기초하여 사용자 음성에 대한 응답을 제공할 수 있다.Meanwhile, the electronic device 100 may provide a response to the user's voice based on the received response information.

예를 들어, 전자 장치(100)는 TTS(Text to Speech)를 통해, 수신된 응답 정보에 포함된 텍스트를 음성으로 변환하여 전자 장치(100)의 스피커 또는 전자 장치(100)와 연결된 스피커를 통해 출력하거나, 또는 해당 텍스트를 포함하는 유저 인터페이스(User interface)를 전자 장치(100)의 디스플레이를 통해 표시할 수 있다.For example, the electronic device 100 converts text included in the received response information into voice through text to speech (TTS), and through a speaker of the electronic device 100 or a speaker connected to the electronic device 100 It is possible to output or display a user interface including the corresponding text through the display of the electronic device 100.

이에 의해, 대화 시스템은 사용자 음성에 대한 응답을 제공할 수 있게 되어, 사용자는 전자 장치(100)와 대화를 수행할 수 있게 된다.Thereby, the conversation system can provide a response to the user's voice, so that the user can perform a conversation with the electronic device 100.

한편, 복수의 서버(200-1, 200-2,..., 200-n)는 사용자 음성이 전자 장치(100) 또는 외부 전자 장치를 제어하기 위한 의도를 포함하는 경우, 제어 대상이 되는 전자 장치를 제어하기 위한 명령을 전자 장치(100) 또는 제어 대상이 되는 전자 장치를 제어할 수 있는 전자 장치(미도시)로 전송할 수 있다. On the other hand, a plurality of servers (200-1, 200-2, ..., 200-n) is the user's voice includes the intention to control the electronic device 100 or an external electronic device, the electronic target of control A command for controlling the device may be transmitted to the electronic device 100 or an electronic device (not shown) capable of controlling the electronic device to be controlled.

For example, assume that the user voice includes an intention to turn on a light (not shown) that constitutes an Internet of Things (IoT) environment together with the electronic device 100. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may transmit a command for turning on the light (not shown) to the electronic device 100. The electronic device 100 then transmits the command to the light (not shown), and the light (not shown) may be turned on by the command received from the electronic device 100.

However, this is only an example, and the plurality of servers 200-1, 200-2, ..., 200-n may transmit control commands in various ways.

Specifically, the plurality of servers 200-1, 200-2, ..., 200-n may transmit the control command to the electronic device (not shown) to be controlled. In the above example, the plurality of servers 200-1, 200-2, ..., 200-n may transmit a command for turning on the light (not shown) to the light (not shown), and the light (not shown) may be turned on by the command received from the plurality of servers 200-1, 200-2, ..., 200-n.

In addition, the plurality of servers 200-1, 200-2, ..., 200-n may transmit the control command to another electronic device (not shown), and the other electronic device (not shown) may transmit the command to the electronic device (not shown) to be controlled. In the above example, assume that a refrigerator (not shown), among the plurality of electronic devices constituting the Internet of Things environment, performs the function of controlling the electronic devices constituting the Internet of Things environment. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may transmit a command for turning on the light (not shown) to the refrigerator (not shown). The refrigerator (not shown) then transmits the command to the light (not shown), and the light (not shown) may be turned on by the command received from the refrigerator (not shown).
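The delivery paths described above (a command sent to the device to be controlled, or relayed through the electronic device 100 or through another controlling device such as the refrigerator) can be sketched as follows. This is only an illustrative sketch: the `Device` class, the `deliver` function, and the command string `"turn_on"` are hypothetical names introduced here and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical stand-in for any device in the IoT environment."""
    name: str
    received: List[str] = field(default_factory=list)

    def receive(self, command: str) -> None:
        # Record the command; a real device would act on it.
        self.received.append(command)


def deliver(command: str, target: Device, relay: Optional[Device] = None) -> List[str]:
    """Send a command to the target, optionally through a relaying device.

    Returns the names of the devices the command passed through, in order.
    """
    hops = []
    if relay is not None:
        # The server sends to the relay, which forwards to the target.
        relay.receive(command)
        hops.append(relay.name)
    target.receive(command)
    hops.append(target.name)
    return hops
```

Calling `deliver("turn_on", light)` models the path to the light itself, while `deliver("turn_on", light, relay=refrigerator)` models the relayed path.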

Meanwhile, the plurality of servers 200-1, 200-2, ..., 200-n may store an artificial intelligence agent for operating the conversation system. Specifically, the plurality of servers 200-1, 200-2, ..., 200-n may use the artificial intelligence agent to generate natural language in response to the user voice. The artificial intelligence agent is a dedicated program for providing artificial intelligence (AI)-based services (for example, a voice recognition service, a secretary service, a translation service, a search service, and the like), and may be executed by an existing general-purpose processor (for example, a CPU) or by a separate AI-dedicated processor (for example, a GPU). In particular, the artificial intelligence agent may control various modules.

In addition, according to an embodiment of the present disclosure, the plurality of servers 200-1, 200-2, ..., 200-n may store an artificial intelligence model trained to generate (or acquire) natural language. The trained artificial intelligence model in the present disclosure may be constructed in consideration of the application field of the recognition model, the computing performance of the device, and the like. To generate natural-sounding language, the trained artificial intelligence model may be, for example, a model based on a neural network. The artificial intelligence model may be designed to simulate the structure of the human brain on a computer, and may include a plurality of weighted network nodes that simulate the neurons of a human neural network. The plurality of network nodes may each form connection relationships so as to simulate the synaptic activity of neurons that exchange signals through synapses. The trained artificial intelligence model may also include, for example, a neural network model or a deep learning model developed from a neural network model. In a deep learning model, the plurality of network nodes may be located at different depths (or layers) and exchange data according to a convolutional connection relationship. Examples of the trained artificial intelligence model include, but are not limited to, a deep neural network (DNN), a recurrent neural network (RNN), and a bidirectional recurrent deep neural network (BRDNN).

Meanwhile, as described above, voice recognition involves operations such as pre-processing the user voice, converting the user voice into text, and recognizing the meaning of the text (that is, identifying the intention, entities, and the like). In the above example, these operations are described as being performed by the plurality of servers 200-1, 200-2, ..., 200-n, but this is only an example, and at least some of these operations may be performed by another electronic device (not shown).

Specifically, at least some of the operations for voice recognition may be performed in the electronic device 100. In this case, a module for performing the corresponding operation may be provided in the electronic device 100.

For example, the electronic device 100 may perform pre-processing on the user voice and convert the user voice into text. The electronic device 100 may then transmit information on the text converted from the user voice to the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may identify the intention, entities, and the like of the user voice from the text, and transmit response information for the user voice to the electronic device 100 based on the natural language understanding result.

As another example, the electronic device 100 may perform pre-processing on the user voice, convert the user voice into text, identify the intention, entities, and the like of the user voice from the text, and transmit information on the natural language understanding result to the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may acquire information on a response to the user voice based on the information received from the electronic device 100, and may acquire natural language as the response information for the user voice based on the acquired information. The plurality of servers 200-1, 200-2, ..., 200-n may then transmit the response information to the electronic device 100.

Meanwhile, at least some of the operations for voice recognition may be performed in another electronic device (not shown). In this case, a module for performing the corresponding operation may be provided in the other electronic device (not shown).

For example, as illustrated in FIG. 1B, assume that the electronic device 100 constitutes an Internet of Things environment in the home together with a refrigerator 10, a washing machine 20, a light 30, and an air conditioner 40.

In this case, the electronic device 100 may transmit the acquired user voice to the refrigerator 10. The refrigerator 10 may perform pre-processing on the user voice received from the electronic device 100, convert the user voice into text, and transmit information on the converted text to the electronic device 100. The electronic device 100 may then transmit the text information received from the refrigerator 10 to the plurality of servers 200-1, 200-2, ..., 200-n. The plurality of servers 200-1, 200-2, ..., 200-n may identify the intention, entities, and the like of the user voice from the text, and transmit response information for the user voice to the electronic device 100 based on the natural language understanding result.

In addition, at least some of the operations for voice recognition may be performed in a plurality of electronic devices (not shown).

For example, in FIG. 1B, the electronic device 100 may transmit the acquired user voice to the refrigerator 10. The refrigerator 10 may perform pre-processing on the user voice received from the electronic device 100 and transmit the pre-processed user voice to the electronic device 100. The electronic device 100 may then transmit the user voice received from the refrigerator 10 to the air conditioner 40. The air conditioner 40 may convert the user voice received from the electronic device 100 into text and transmit information on the converted text to the electronic device 100. The electronic device 100 may then transmit the text information received from the air conditioner 40 to the plurality of servers 200-1, 200-2, ..., 200-n. The plurality of servers 200-1, 200-2, ..., 200-n may identify the intention, entities, and the like of the user voice from the text, and transmit response information for the user voice to the electronic device 100 based on the natural language understanding result.
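The examples above differ only in where each voice recognition operation runs, with the electronic device 100 exchanging intermediate results with the other devices. The following sketch traces the resulting sequence of transfers for a given placement of operations. The stage names, the node labels, and the `transfer_path` function are hypothetical and are used here only for illustration; operations that are not placed on a device are assumed to run on the servers.

```python
from typing import Dict, List

# Hypothetical operation names, in the order in which they are performed.
STAGES = ("preprocess", "speech_to_text", "understand_and_respond")


def transfer_path(placement: Dict[str, str], hub: str = "device_100") -> List[str]:
    """Return the nodes visited in order, starting and ending at the hub.

    Every intermediate result returns to the hub before moving on to the
    next node, as in the FIG. 1B examples.
    """
    path = [hub]
    for stage in STAGES:
        node = placement.get(stage, "servers")  # default: run on the servers
        if node == path[-1]:
            continue  # same node handles consecutive stages, no transfer
        if path[-1] != hub and node != hub:
            path.append(hub)  # result goes back to the hub first
        path.append(node)
    if path[-1] != hub:
        path.append(hub)  # the final response information reaches the hub
    return path
```

With pre-processing placed on the refrigerator and text conversion on the air conditioner, the path alternates between the electronic device 100 and each helper before reaching the servers.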

Meanwhile, although the electronic device 100 is described in the above examples as being implemented as a television, the electronic device 100 may be implemented as various types of devices.

As an example, as illustrated in FIG. 1C, the electronic device may be implemented as a speaker 100 equipped with a microphone. In this case, the speaker 100 may acquire the user voice through the microphone and transmit the acquired user voice to the plurality of servers 200-1, 200-2, ..., 200-n. The speaker 100 may then convert the response information received from the plurality of servers 200-1, 200-2, ..., 200-n into speech and output it.

Meanwhile, the response information may be provided by an electronic device (not shown) other than the electronic device 100.

For example, as illustrated in FIG. 1D, assume that the electronic device is implemented as a speaker 100 equipped with a microphone.

In this case, the speaker 100 may receive the user voice through the microphone and transmit information on the user voice to the plurality of servers 200-1, 200-2, ..., 200-n. The plurality of servers 200-1, 200-2, ..., 200-n may then generate response information for the user voice and transmit it to the speaker 100. The speaker 100 may transmit the response information to a television 50 equipped with a display (not shown), and the television 50 may display, on its display, a user interface including the text contained in the response information received from the speaker 100.

Meanwhile, the plurality of servers 200-1, 200-2, ..., 200-n may be servers that provide different secretary services.

For example, various companies now provide secretary services, such as Samsung's Bixby. Various secretary services thus exist, and according to an embodiment of the present disclosure, each server may be a server that a company operates to provide its secretary service.

Meanwhile, the secretary services currently provided may each excel in different domains (for example, question answering, device control, and the like). Accordingly, if the user can use the secretary service capable of providing the more accurate response, the satisfaction and convenience of the user of the secretary service can be improved.

Accordingly, the electronic device 100 according to an embodiment of the present disclosure may acquire response information from the plurality of servers 200-1, 200-2, ..., 200-n and provide it to the user, as described in more detail below.

FIG. 2 is a diagram illustrating a configuration of an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 2A, the electronic device 100 may include a communication unit 110, a memory 120, and a processor 130.

The communication unit 110 may communicate with various types of external devices. For example, the communication unit 110 may communicate with an external electronic device or a server. In this case, the communication unit 110 may use various short-range communication methods such as Wi-Fi, Bluetooth, and NFC, or various long-range communication methods such as wireless mobile communication and Ethernet. To this end, the communication unit 110 may include at least one of a Wi-Fi chip, an Ethernet chip, a Bluetooth chip, a wireless mobile communication chip, and an NFC chip.

Here, the server may include a server capable of providing a secretary service using a trained artificial intelligence model.

In this case, the communication unit 110 may transmit information on the user voice acquired by the electronic device 100 to the server, and may receive from the server response information corresponding to the user voice that was obtained through the artificial intelligence model. Here, the information on the user voice may be at least one of a voice file, a compressed voice file, a pre-processed user voice, and feature point information extracted from the user voice.

The memory 120 may store instructions or data related to at least one other component of the electronic device 100. In particular, the memory 120 may store at least one instruction related to execution of a secretary service.

The processor 130 may be connected to the communication unit 110 and the memory 120 to control the electronic device 100. Specifically, the processor 130 may be electrically connected to the communication unit 110 and the memory 120 to control the overall operations and functions of the electronic device 100.

In particular, by executing at least one instruction stored in the memory 120, the processor 130 may, when a user input for executing a secretary service is received, transmit information on the user voice acquired by the electronic device 100 through the communication unit 110 to the plurality of servers 200-1, 200-2, ..., 200-n providing different secretary services, and, when a plurality of pieces of response information are received from the plurality of servers 200-1, 200-2, ..., 200-n, provide a response to the user voice based on at least one of the received pieces of response information.

Here, the case where a user input for executing a secretary service is received may include a case where a user voice including a wake-up word (or a wake-up command) is input, and a case where a user input made by selecting a preset button is received.

In this case, the wake-up word is a trigger word for executing a secretary service or requesting a response from the secretary service, and to use the secretary service on the electronic device 100 the user must utter the wake-up word, which serves as a voice call command. The wake-up word may be determined based on the secretary service. For example, the wake-up word may include the name of the artificial intelligence that each company uses for its secretary service. Accordingly, the wake-up word may differ depending on the type of secretary service.

In addition, the wake-up word may be changed according to a user input. That is, the user may access the server through the electronic device 100 or another electronic device (not shown) and change the wake-up word. For example, the user may request that the wake-up word "Hi Bixby" be changed to "Hello Bixby", and the server may change and register the wake-up word according to the user input. However, if all or part of the new wake-up word entered by the user is identical to all or part of the wake-up word of a secretary service provided by another server, the server may reject the change request and leave the wake-up word unchanged.
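The rejection rule above can be sketched as follows. The disclosure does not specify how "all or part" of a wake-up word is compared, so this sketch assumes a case-insensitive comparison of whitespace-separated words. The function name `request_wake_word_change`, the registry layout, and the wake-up word "Hey Helper" are hypothetical.

```python
from typing import Dict, Set


def _words(phrase: str) -> Set[str]:
    """Split a wake-up word into lowercase words (assumed comparison unit)."""
    return set(phrase.lower().split())


def request_wake_word_change(registry: Dict[str, str], service: str, new_word: str) -> bool:
    """Register new_word for service unless it overlaps another service's word.

    registry maps a service name to its current wake-up word and is updated
    in place only when the change is accepted.
    """
    new_words = _words(new_word)
    if not new_words:
        return False  # an empty wake-up word is never accepted
    for other_service, other_word in registry.items():
        if other_service != service and new_words & _words(other_word):
            return False  # shares at least one word with another service
    registry[service] = new_word
    return True
```

Under this assumption, changing "Hi Bixby" to "Hello Bixby" is accepted, whereas changing it to "Hey Bixby" is rejected when another service is registered with "Hey Helper".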

Meanwhile, the user voice may be input through a microphone (not shown) provided in the electronic device 100, or an external electronic device (not shown) connected to the electronic device 100 may receive the user voice and transmit it to the electronic device 100.

Meanwhile, the preset button may be provided on the electronic device 100 or on an external electronic device (not shown) connected to the electronic device 100. In the latter case, when the preset button is selected, the external electronic device (not shown) may transmit a signal indicating that the preset button has been selected to the electronic device 100.

Hereinafter, a method by which the processor 130 provides a response to a user voice through a plurality of secretary services will be described in more detail with reference to FIG. 2B.

First, a case in which a user voice including a specific wake-up word is input as the user input for executing a secretary service, according to an embodiment of the present disclosure, will be described.

When a user input for executing a first secretary service is received, the processor 130 may transmit information on the user voice to a first server providing the first secretary service and a second server providing a second secretary service, and may provide a response to the user voice based on the response information received from the first server among the plurality of pieces of response information received from the first and second servers.

That is, when a first wake-up word for executing the first secretary service is identified in a first user voice acquired by the electronic device 100, the processor 130 may transmit information on the first user voice to the first server providing the first secretary service and the second server providing the second secretary service, and may provide a response to the first user voice based on the response information received from the first server among the plurality of pieces of response information received from the first server and the second server in response to the transmission of the information on the first user voice.

In this case, the processor 130 may use a wake-up engine 131 to check whether a wake-up word is present in the user voice. Here, the wake-up engine 131 may include an artificial intelligence model trained to recognize various wake-up words, and wake-up words may be additionally designated or changed. Meanwhile, although FIG. 2B describes the wake-up engine 131 and the voice recognition module 132 as being implemented in hardware, they may be implemented in software, in which case they may be stored in the memory 120.

Specifically, the processor 130 may store the input user voice in the memory 120 and perform voice recognition on a certain section of the user voice through the wake-up engine 131 to check whether a wake-up word is present in the user voice.

Here, the wake-up word may be a wake-up word for executing any one of the plurality of secretary services provided by the plurality of servers 200-1, 200-2, ..., 200-n.

Accordingly, when a wake-up word is present in the user voice, the processor 130 may transmit information on the user voice to the plurality of servers 200-1, 200-2, ..., 200-n through the communication unit 110.

In this case, the processor 130 may obtain the information on the user voice from the user voice acquired by the electronic device 100.

Specifically, the processor 130 may generate a voice file or a compressed voice file for the user voice, or may perform pre-processing on the user voice and generate a voice file or a compressed voice file for the pre-processed user voice. The processor 130 may also extract feature points from the user voice to obtain the information on the user voice.

Meanwhile, the processor 130 may transmit, to the plurality of servers 200-1, 200-2, ..., 200-n, information on the portion of the user voice that remains after the wake-up word is excluded. For example, the processor 130 may transmit information on the user voice that follows the wake-up word to the plurality of servers 200-1, 200-2, ..., 200-n.
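The following sketch illustrates separating the wake-up word from the rest of the utterance. As a simplification, it operates on an already transcribed text string rather than on audio, which the disclosure does not require. The function `split_after_wake_word` and the wake-up word table are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple


def split_after_wake_word(
    utterance: str, wake_words: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Return (service, remaining utterance) or None if no wake-up word is found.

    wake_words maps a service name to its wake-up word. Matching is
    case-insensitive, and only the text after the wake-up word is kept.
    """
    for service, wake_word in wake_words.items():
        match = re.search(re.escape(wake_word), utterance, re.IGNORECASE)
        if match:
            # Drop the wake-up word and any punctuation that directly follows it.
            remainder = utterance[match.end():].lstrip(" ,.!?").strip()
            return service, remainder
    return None
```

Only the returned remainder would then be sent to the servers, while the returned service name identifies which secretary service the user addressed.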

To this end, the processor 130 may execute (or launch, or activate) the applications 121-1, 121-2, ..., 121-n stored in the memory 120.

Here, the applications 121-1, 121-2, ..., 121-n are programs that provide a secretary service function in conjunction with the respective servers 200-1, 200-2, ..., 200-n. They may exchange various information for providing the secretary service with the plurality of servers 200-1, 200-2, ..., 200-n and provide a response to the user voice.

These applications 121-1, 121-2, ..., 121-n may be downloaded from the plurality of servers 200-1, 200-2, ..., 200-n or from a separate application store and installed on the electronic device 100, or the electronic device 100 may access the plurality of servers 200-1, 200-2, ..., 200-n and use the corresponding functions as a cloud service.

In this case, the processor 130 may execute not only the application 121-1 for providing the secretary service corresponding to the wake-up word but also all of the other applications 121-2, ..., 121-n, and may transmit information on the user voice to the plurality of servers 200-1, 200-2, ..., 200-n through the applications 121-1, 121-2, ..., 121-n.

That is, even when a wake-up word for executing a specific secretary service is input, the processor 130 may transmit information on the user voice not only to the server 200-1 providing that secretary service but also to the other servers 200-2, ..., 200-n.

In this case, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for the user voice based on the information on the user voice received from the electronic device 100, and transmit the response information to the electronic device 100.

Meanwhile, when response information for the user voice is received from the plurality of servers 200-1, 200-2, ..., 200-n, the processor 130 may provide a response to the user voice based on the response information generated by the secretary service corresponding to the wake-up word, among the plurality of pieces of response information.

That is, because uttering a wake-up word that indicates a specific secretary service amounts to requesting a response from that secretary service, the processor 130 may provide a response to the user voice based on the response information received from the server providing the secretary service that matches the wake-up word, among the plurality of pieces of received response information. In this case, the processor 130 may store the other response information in the memory 120.
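The behavior described above (sending the user voice information to every server, answering with the response of the service that matches the wake-up word, and storing the remaining responses) can be sketched as follows. The servers are modeled as plain callables, and the names `ask_all_servers` and `respond`, as well as the use of a thread pool for concurrent requests, are assumptions made for this sketch rather than details taken from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Server = Callable[[str], str]  # takes user voice information, returns a response


def ask_all_servers(servers: Dict[str, Server], voice_info: str) -> Dict[str, str]:
    """Send the same voice information to every server and collect responses."""
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {name: pool.submit(server, voice_info) for name, server in servers.items()}
        return {name: future.result() for name, future in futures.items()}


def respond(
    servers: Dict[str, Server],
    woken_service: str,
    voice_info: str,
    stored: Dict[str, str],
) -> str:
    """Answer with the woken service's response and keep the others in stored."""
    responses = ask_all_servers(servers, voice_info)
    answer = responses.pop(woken_service)
    stored.update(responses)  # stands in for storing them in the memory 120
    return answer
```

Keeping the other responses in `stored` models the statement that the processor 130 may store the other response information in the memory 120.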

Meanwhile, the processor 130 may check whether a second user voice obtained after a first user voice corresponds to a voice requesting another assistant service's response result for the first user voice.

In this case, the processor 130 may use the voice recognition module 132 to check whether the user voice corresponds to a voice requesting the response result of another assistant service.

Specifically, the voice recognition module 132 may perform voice recognition on the user voice to convert it into text and, based on the recognition result, determine whether the intent of the user voice is to request the response result of another assistant service. In this case, the voice recognition module 132 may include an artificial intelligence model trained to determine, through natural language understanding, whether the user voice requests the response result of another assistant service.

Meanwhile, the processor 130 may perform recognition on the user voice only while the applications 121-1, 121-2, ..., 121-n are running, in order to check whether the user voice is intended to request the response result of another assistant service.

Accordingly, when the second user voice does not correspond to a voice requesting the second assistant service's response result for the first user voice, the processor 130 may transmit information about the second user voice to the first server and the second server, and may provide a response to the second user voice based on the response information received from the first server, among the plurality of pieces of response information received from the first server and the second server in response to that transmission.

That is, when the user utters a voice with a different intent rather than requesting information provided by another assistant service, the processor 130 continues to use the assistant service that previously provided a response in order to respond to subsequent user voices. Even in this case, however, the user may later request the answer that another assistant service provides for that subsequent utterance, so the processor 130 also transmits the information about subsequently obtained user voices to the plurality of servers.

Meanwhile, when the second user voice corresponds to a voice requesting the second assistant service's response result for the first user voice, the processor 130 may provide a response to the second user voice based on the response information received from the second server providing the second assistant service, among the plurality of pieces of response information received in response to the transmission of the information about the first user voice.

That is, when the user utters a voice requesting the information that another assistant service provided for the previous user voice, the processor 130 may provide a response to the user voice based on the response information received from the server providing the assistant service requested by the user, among the plurality of pieces of response information received for the previous user voice.

Meanwhile, when the assistant service is changed in this way at the user's request, the processor 130 may use the changed assistant service to respond to subsequent user voices. That is, the processor 130 provides responses using the changed assistant service until a user voice requesting a change to yet another assistant service is input; when such a user voice is input, the processor 130 may provide a response using the assistant service requested by the user.
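The switching behavior can be sketched as a small state holder in Python. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the class name `AssistantSession`, its attributes, and the `requested_assistant` argument (standing in for the outcome of the intent check performed by the voice recognition module 132) are hypothetical.

```python
class AssistantSession:
    """Tracks which assistant answers and caches every reply to the latest query."""

    def __init__(self, servers, active):
        self.servers = servers  # assistant name -> callable(utterance) -> response text
        self.active = active    # assistant currently answering the user
        self.cached = {}        # replies of all assistants to the most recent query

    def handle(self, utterance, requested_assistant=None):
        """Return the response to show for `utterance`.

        `requested_assistant` names the assistant whose earlier result the user
        asked for, or is None when the utterance is an ordinary new query.
        """
        if requested_assistant is not None:
            # Reuse the reply already received for the previous query and switch.
            self.active = requested_assistant
            return self.cached[requested_assistant]
        # Ordinary query: send it to every server, answer with the active assistant.
        self.cached = {name: ask(utterance) for name, ask in self.servers.items()}
        return self.cached[self.active]
```

In this sketch a request for another assistant's result does not trigger a new transmission; it only selects a cached reply and changes which assistant answers the following queries.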

As described above, according to an embodiment of the present disclosure, information about the user voice is transmitted to a plurality of servers and a response to the user voice is provided using the plurality of pieces of response information received from those servers. A plurality of assistant services can therefore be provided to the user seamlessly, and the user can receive answers from various assistant services, which may improve the satisfaction and convenience of the user of the assistant services.

FIGS. 3 to 10 are diagrams for explaining a method of providing a response to a user voice according to an embodiment of the present disclosure.

First, as shown in FIG. 3, assume that the user utters "XXX, what time is it now?". Here, XXX may be the wake-up word for the assistant service provided by the first server 200-1.

In this case, as indicated by ① in FIG. 3, the processor 130 may execute the applications 121-1, 121-2, ..., 121-n and, through these applications, transmit information about "what time is it now", that is, the user voice excluding the wake-up word, to the plurality of servers 200-1, 200-2, ..., 200-n.

Accordingly, as indicated by ② in FIG. 3, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for "what time is it now" received from the electronic device 100 and transmit the generated response information to the electronic device 100.

For example, the first server 200-1 may transmit "It is 9 a.m." to the electronic device 100 as response information, the second server 200-2 may transmit "Shall I tell you the time in Korea?", and the n-th server 200-n may transmit "It is 9 o'clock".

In this case, because the assistant service (or voice assistant) called by the user is XXX, the processor 130 may display on the display of the electronic device 100 a screen including "It is 9 a.m." (410), based on the response information received from the first server 200-1, as shown in FIG. 4.

Next, as shown in FIG. 5, assume that the user utters "How is the weather today?".

In this case, because the applications 121-1, 121-2, ..., 121-n are running, the processor 130 may check whether "How is the weather today" corresponds to a voice requesting the response result of another assistant service. Because it does not, the processor 130 may transmit information about the user voice "How is the weather today" to the plurality of servers 200-1, 200-2, ..., 200-n, as indicated by ① in FIG. 5.

Accordingly, as indicated by ② in FIG. 5, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for "How is the weather today" received from the electronic device 100 and transmit the generated response information to the electronic device 100.

For example, the first server 200-1 may transmit "There will be one spell of rain today" to the electronic device 100 as response information, the second server 200-2 may transmit "The chance of rain today is 60%", and the n-th server 200-n may transmit "Today will be clear, with rain expected between 6 p.m. and 7 p.m.".

In this case, because the response to the previous user voice was provided through the XXX assistant service, the processor 130 may display on the display of the electronic device 100 a screen including "There will be one spell of rain today" (610), based on the response information received from the first server 200-1, as shown in FIG. 6.

Next, as shown in FIG. 7A, when the user utters "Tell me the YYY result too", the processor 130 may check, because the applications 121-1, 121-2, ..., 121-n are running, whether "Tell me the YYY result too" corresponds to a voice requesting the response result of another assistant service. Here, YYY may be the wake-up word for the assistant service provided by the second server 200-2.

Because "Tell me the YYY result too" corresponds to a voice requesting the response result of the assistant service YYY, the processor 130 may display on the display of the electronic device 100 a screen including "The chance of rain today is 60%" (710), based on the response information for "How is the weather today" previously received from the second server 200-2, as shown in FIG. 7B.

Next, as shown in FIG. 8, assume that the user utters "Will it rain this weekend too?".

In this case, because the applications 121-1, 121-2, ..., 121-n are running, the processor 130 may check whether "Will it rain this weekend too" corresponds to a voice requesting the response result of another assistant service. Because it does not, the processor 130 may transmit information about the user voice "Will it rain this weekend too" to the plurality of servers 200-1, 200-2, ..., 200-n, as indicated by ① in FIG. 8.

Accordingly, as indicated by ② in FIG. 8, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for "Will it rain this weekend too" received from the electronic device 100 and transmit the generated response information to the electronic device 100.

In this case, because the assistant service most recently called by the user is YYY, the processor 130 may display on the display of the electronic device 100 a screen including "It does not look like it will rain this weekend" (910), based on the response information received from the second server 200-2, as shown in FIG. 9.

Meanwhile, in the above example, the processor 130 may also provide a user interface indicating that responses from other assistant services are available.

Specifically, when providing the response of the assistant service called by the user, the processor 130 may display the names or the like of the other assistant services together with that response, thereby guiding the user that responses from the other assistant services can also be provided.

For example, as shown in FIG. 10A, when the user utters "XXX, how is the weather today?", the processor 130 may display "There will be one spell of rain today" (1010) on the display based on the response information provided by the assistant service XXX. In this case, the processor 130 may also display YYY (1020) and ZZZ (1030), the names of the other assistant services, on the display.

Next, when the user utters "Tell me the YYY result too" as shown in FIG. 10B, the processor 130 may display "The chance of rain today is 60%" (1040) on the display as shown in FIG. 10C, based on the response information received from the server providing the assistant service YYY. In this case, the processor 130 may also display XXX (1050) and ZZZ (1030), the names of the other assistant services, on the display.
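The screen composition of FIGS. 10A to 10C can be sketched in a few lines of Python. The function name `compose_screen` and the dictionary layout of the result are assumptions made for illustration; the disclosure does not specify a data format for the screen.

```python
def compose_screen(active, responses):
    """Return the active assistant's response plus the names of the other assistants.

    `responses` maps each assistant name to the response text it returned.
    """
    return {
        "assistant": active,
        "response": responses[active],
        # Names shown as hints that other assistants' answers are also available.
        "other_assistants": [name for name in responses if name != active],
    }
```

With responses from XXX, YYY, and ZZZ, composing the screen for XXX lists YYY and ZZZ as hints, and composing it for YYY lists XXX and ZZZ.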

Meanwhile, a case will now be described in which, according to an embodiment of the present disclosure, a user input selecting a preset button is received as the user input for executing an assistant service.

Specifically, when a user voice is obtained after a user input selecting the preset button has been received, the processor 130 may transmit information about the user voice to the plurality of servers 200-1, 200-2, ..., 200-n.

Here, because the preset button is a button for triggering an assistant service, when the preset button is selected the processor 130 may execute the applications 121-1, 121-2, ..., 121-n stored in the memory 120 and transmit information about the subsequently input user voice to the plurality of servers 200-1, 200-2, ..., 200-n through these applications.

Meanwhile, the preset button may be provided on the electronic device 100 or on an external electronic device (not shown) connected to the electronic device 100. In the latter case, when the preset button is pressed, the external electronic device (not shown) may transmit a signal indicating that the button has been pressed to the electronic device 100.

Meanwhile, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for the user voice received from the electronic device 100 and transmit the response information to the electronic device 100.

Meanwhile, when response information for the user voice is received from the plurality of servers 200-1, 200-2, ..., 200-n, the processor 130 may provide a plurality of responses to the user voice based on the plurality of pieces of response information.

In this case, because the user has not requested a response from a particular assistant service, as would be the case when uttering a specific wake-up word, the processor 130 may provide responses to the user voice based on the plurality of pieces of response information provided by the plurality of assistant services.

Here, the processor 130 may provide the plurality of responses in different forms based on at least one of the user's preference for the plurality of assistant services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.

Specifically, based on the user's preference for the plurality of assistant services provided by the plurality of servers, the processor 130 may display the response of the assistant service with the highest user preference in the largest size, in a color different from that of the other assistant services, or highlighted. Alternatively, the processor 130 may display the response of the assistant service with the highest user preference and display only the names of the other assistant services. Alternatively, the processor 130 may display the responses of the assistant services in larger sizes in descending order of user preference. Alternatively, when displaying the responses of the plurality of assistant services, the processor 130 may arrange them in order of user preference.

To this end, the processor 130 may store information about the assistant services used by the user in the memory 120 and update that information each time the user uses an assistant service.

Accordingly, the processor 130 may determine that the user's preference for an assistant service is higher the more frequently the user uses it.
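One simple way to derive preference from usage frequency is a counter that is updated on every use. The following Python sketch is only an illustration: the class name `UsageTracker` and its methods are hypothetical, and the disclosure does not specify how the usage information is stored in the memory 120.

```python
from collections import Counter


class UsageTracker:
    """Counts how often each assistant service is used and ranks them by that count."""

    def __init__(self):
        self.counts = Counter()

    def record_use(self, assistant):
        """Update the stored usage information each time an assistant is used."""
        self.counts[assistant] += 1

    def preference_order(self, assistants):
        """Return `assistants` sorted from most to least frequently used.

        Assistants that were never used count as zero; ties keep their input order.
        """
        return sorted(assistants, key=lambda name: -self.counts[name])
```

Because Python's `sorted` is stable, assistants with equal usage keep the order in which they were passed in, which gives a deterministic display order.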

Meanwhile, based on the accuracy of the plurality of assistant services provided by the plurality of servers, the processor 130 may display the response of an assistant service that fails to provide an accurate response in a smaller size than the responses of the other assistant services, display only the name of that assistant service, or display only its name and part of its response.

For example, any of the plurality of servers 200-1, 200-2, ..., 200-n may be unable to compose response information when, for instance, it cannot analyze the meaning of the user voice. In this case, that server may transmit response information such as "This service is not supported" to the electronic device 100.

In this case, the processor 130 may determine that the assistant service provided by that server cannot provide an accurate response to the user voice, and may display the response of that assistant service so that it is distinguished from the responses of the other assistant services.
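The differentiated display can be sketched as a function that assigns a display size to each response. This Python sketch rests on assumptions: the size labels, the function name `layout`, and the exact-match check against the "This service is not supported" text are hypothetical simplifications of how the processor 130 might judge preference and accuracy.

```python
UNSUPPORTED = "This service is not supported"  # failure text used in the example above


def layout(responses, preferred):
    """Assign a display size to each assistant's response.

    The preferred assistant is shown large, an assistant that could not answer
    is shown small as "No service", and every other response is shown medium.
    """
    items = []
    for name, text in responses.items():
        if text == UNSUPPORTED:
            items.append((name, "No service", "small"))
        elif name == preferred:
            items.append((name, text, "large"))
        else:
            items.append((name, text, "medium"))
    rank = {"large": 0, "medium": 1, "small": 2}
    # Largest first; the stable sort keeps the input order within each size.
    return sorted(items, key=lambda item: rank[item[2]])
```

Other policies described above, such as showing only the names of non-preferred assistants or using a distinct color, could be expressed by returning a different attribute in place of the size label.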

Meanwhile, when a user voice is input after the plurality of responses have been provided, the processor 130 may use the wake-up engine 131 to check whether a wake-up word is present in the user voice.

When no wake-up word is present, the processor 130 may transmit information about the user voice to the plurality of servers 200-1, 200-2, ..., 200-n and provide a plurality of responses based on the plurality of pieces of response information received from those servers. In this case, the processor 130 may display the plurality of responses in a differentiated manner, as described above.

When a wake-up word is present, however, the processor 130 may transmit information about the remainder of the user voice, excluding the wake-up word, to the plurality of servers 200-1, 200-2, ..., 200-n.

Then, when response information for the user voice is received from the plurality of servers 200-1, 200-2, ..., 200-n, the processor 130 may provide a response to the user voice based on, among the plurality of pieces of response information, the response information generated by the assistant service corresponding to the wake-up word.

That is, once the user has uttered a wake-up word indicating a specific assistant service, the processor 130 does not provide the responses of the plurality of assistant services; instead, it may provide a response to the user voice based on the response information received from the server providing the assistant service from which the user requested a response, that is, the assistant service matching the wake-up word. In this case, the processor 130 may store the other response information in the memory 120 and, when the response result of another assistant service is later requested, provide a response using the response information stored in the memory 120.
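The branch on the presence of a wake-up word can be sketched as follows. This Python sketch works on recognized text and assumes the wake-up word appears at the start of the utterance; both are simplifications, since the wake-up engine 131 operates on the user voice itself. The function names are hypothetical.

```python
def split_wake_word(text, wake_words):
    """Return (wake_word, remaining_text), or (None, text) if no wake-up word leads."""
    for word in wake_words:
        if text.startswith(word):
            # Drop the wake-up word and any separating comma or space.
            return word, text[len(word):].lstrip(" ,")
    return None, text


def respond(text, servers):
    """Query every server; return all replies, or only the called assistant's reply."""
    wake_word, query = split_wake_word(text, list(servers))
    responses = {name: ask(query) for name, ask in servers.items()}
    if wake_word is None:
        return responses  # no assistant specified: provide a plurality of responses
    return {wake_word: responses[wake_word]}
```

In both branches every server receives the query, so the replies of the assistants that are not shown could still be stored for a later request.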

Meanwhile, the specific operation of the processor 130 when a wake-up word is present in the voice uttered by the user is the same as described above, so a redundant description is omitted.

As described above, according to an embodiment of the present disclosure, when the user requests execution of an assistant service without specifying a particular one, the responses of a plurality of assistant services to the user voice are provided, and the response of the assistant service subsequently selected by the user can then be provided, which may improve the satisfaction and convenience of the user of the assistant services.

FIGS. 11 to 17 are diagrams for explaining a method of providing a response to a user voice according to an embodiment of the present disclosure.

First, as shown in FIG. 11, assume that the user presses a button provided on the remote controller 1000 and then utters "What time is it now". In this case, the remote controller 1000 may transmit a signal indicating that the button has been pressed to the electronic device 100, and the processor 130 may determine, based on the received signal, that a user input for executing an assistant service has been received.

Accordingly, as indicated by ① in FIG. 11, the processor 130 may execute the applications 121-1, 121-2, ..., 121-n and transmit information about the obtained user voice "What time is it now" to the plurality of servers 200-1, 200-2, ..., 200-n through these applications.

Accordingly, as indicated by ② in FIG. 11, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for "What time is it now" received from the electronic device 100 and transmit the generated response information to the electronic device 100.

In this case, the processor 130 may display a plurality of responses to "What time is it now" on the display of the electronic device 100 based on the plurality of pieces of response information received from the plurality of servers 200-1, 200-2, ..., 200-n.

For example, as shown in FIG. 12A, the processor 130 may display "It is 9 a.m." (1210), based on the response information of XXX for which the user preference is high, in a larger size than "Shall I tell you the time in Korea?" (1220) and "It is 9 o'clock" (1230), which are based on the response information of YYY and ZZZ. In addition, the processor 130 may provide "No service" (1240, 1250) as the responses of aaa and bbb, which could not compose response information, and display them in a smaller size than the responses of the other assistant services.

Next, as shown in FIG. 13, assume that the user utters "How is the weather today?".

In this case, the processor 130 may use the wake-up engine 131 to check whether a wake-up word is present in "How is the weather today?".

Because no wake-up word is present in "How is the weather today?", the processor 130 may transmit information about the user voice "How is the weather today?" to the plurality of servers 200-1, 200-2, ..., 200-n, as indicated by ① in FIG. 13. The plurality of servers 200-1, 200-2, ..., 200-n may then generate response information for "How is the weather today" received from the electronic device 100 and transmit the generated response information to the electronic device 100, as indicated by ② in FIG. 13.

In this case, the processor 130 may display a response to "How is the weather today?" on the display of the electronic device 100 based on the plurality of pieces of response information received from the plurality of servers 200-1, 200-2, ..., 200-n.

Here, because "How is the weather today?" does not include a wake-up word, the processor 130 may display a plurality of responses to "How is the weather today?" based on the plurality of pieces of response information received from the plurality of servers 200-1, 200-2, ..., 200-n.

For example, as shown in FIG. 14, the processor 130 may display "There will be one spell of rain today" (1410), based on the response information of XXX for which the user preference is high, in a larger size than "The chance of rain today is 60%" (1420), "Today will be clear, with rain expected between 6 p.m. and 7 p.m." (1430), and "Rain is expected today" (1440), which are based on the response information of YYY, ZZZ, and bbb. In addition, the processor 130 may provide "No service" (1450) as the response of aaa, which could not compose response information, and display it in a smaller size than the responses of the other assistant services.

Next, as shown in FIG. 15, assume that the user utters "ZZZ, will it rain on the weekend too?". Here, ZZZ may be the wake-up word for a specific assistant service.

In this case, the processor 130 may use the wake-up module 131 to check whether a wake-up word is present in "ZZZ, will it rain on the weekend too?"

Here, since "ZZZ, will it rain on the weekend too?" contains the wake-up word ZZZ, the processor 130 may transmit information about "will it rain on the weekend too", which is the user voice excluding "ZZZ", to the plurality of servers 200-1, 200-2, ..., 200-n, as indicated by ① in FIG. 15. The plurality of servers 200-1, 200-2, ..., 200-n may then generate response information for "will it rain on the weekend too" received from the electronic device 100 and transmit the generated response information to the electronic device 100, as indicated by ② in FIG. 15.

In this case, as shown in FIG. 16, the processor 130 may display "The weekend looks sunny" 1610 on the display of the electronic device 100 based on the response information received from the server that provides ZZZ, the assistant service named by the user, among the plurality of servers 200-1, 200-2, ..., 200-n.

In this way, even when the user has launched the assistant service by selecting a preset button, the processor 130 may display only the response of the requested assistant service if the user requests a response from a specific assistant service. Thereafter, depending on whether a subsequent user voice includes a wake-up word, the processor 130 may continue the conversation with that assistant service or provide a response from another assistant service.
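The behavior of FIGS. 13 to 16 (every server is queried, and a wake-up word only narrows what is shown) can be sketched as below. This is a simplified sketch under stated assumptions: wake-up detection is done here on already transcribed text with a comma separator, whereas the disclosure uses the wake-up module 131 on the user voice; `WAKE_WORDS`, `split_wake_word`, `select_responses`, and the `ask_all` callable are hypothetical names.

```python
WAKE_WORDS = {"XXX", "YYY", "ZZZ"}  # hypothetical wake-up words, one per assistant service


def split_wake_word(utterance):
    """Return (wake_word or None, query with the wake-up word removed)."""
    head, sep, rest = utterance.partition(",")
    if sep and head.strip() in WAKE_WORDS:
        return head.strip(), rest.strip()
    return None, utterance.strip()


def select_responses(utterance, ask_all):
    """Query every server, then keep all responses or only the named service's.

    ask_all is a hypothetical callable that sends the query to all servers and
    returns a dict mapping service name to response text.
    """
    wake, query = split_wake_word(utterance)
    responses = ask_all(query)  # all servers receive the query in both cases
    if wake is None:
        return responses                 # no wake-up word: show every response
    return {wake: responses[wake]}       # wake-up word: show only that service
```

Because the query still reaches every server when a wake-up word is present, the responses of the other services remain available for a later request.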

Meanwhile, the above example describes displaying the response of an assistant service with high user preference larger than the responses of other assistant services, but this is only an example. That is, the processor 130 may display the response of the assistant service with the highest user preference and, for the remaining assistant services, display only their names.

For example, when the user utters "What time is it now?", the processor 130 may display "It is 9 a.m." 1710 on the display, as shown in FIG. 17A, based on the response information of XXX, the assistant service with high user preference. In this case, the processor 130 may also display YYY 1720, ZZZ 1730, aaa 1740, and bbb 1750, the names of the other assistant services.

Thereafter, when the user utters "Tell me the ZZZ result too" as shown in FIG. 17B, the processor 130 may display "It is 9 o'clock" 1760 on the display, as shown in FIG. 17C, based on the response information received from the server providing the assistant service ZZZ. In this case, the processor 130 may also display XXX 1770, YYY 1720, aaa 1740, and bbb 1750, the names of the other assistant services.
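The name-only view of FIGS. 17A to 17C can be sketched as a small helper that returns the full text of one service and the names of the rest. The function name `compact_view` and the dictionary of responses are illustrative assumptions, not part of the disclosure.

```python
def compact_view(responses, shown):
    """Return (text of the shown service, names of the other services).

    responses maps service name to response text, in display order.
    """
    others = [name for name in responses if name != shown]
    return responses[shown], others
```

Switching `shown` from "XXX" to "ZZZ" reuses the responses that were already received, which is why the second view needs no new request to the servers.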

Meanwhile, the above example describes the response to the user voice as being provided through the display of the electronic device 100, but this is only an example. That is, the electronic device 100 may output the response to the user voice as speech through a speaker provided in the electronic device 100. In this case, the electronic device 100 may provide the response in the form of at least one of a screen and speech.

In addition, the response to the user voice may be provided by an electronic device (not shown) other than the electronic device 100. That is, the electronic device 100 acquires the user voice, but the response to the user voice may be provided by another electronic device (not shown) connected to the electronic device 100 rather than by the electronic device 100 itself.

For example, as shown in FIGS. 18A to 18D, assume that the electronic device 100 is implemented as a speaker equipped with a microphone.

First, as shown in FIG. 18A, assume that the user utters "XXX, what time is it now?"

In this case, when the user voice is received through the microphone, the processor 130 may transmit information about the user voice to the plurality of servers 200-1, 200-2, ..., 200-n. Accordingly, the plurality of servers 200-1, 200-2, ..., 200-n may generate response information for the user voice and transmit it to the electronic device 100.

In this case, the processor 130 may transmit, to the television 50, the response information received from the server providing XXX, the assistant service called by the user, among the response information received from the plurality of servers 200-1, 200-2, ..., 200-n.

Accordingly, as shown in FIG. 18B, the television 50 may display "It is 9 a.m." 1810 on its display based on the response information received from the electronic device 100.

Thereafter, as shown in FIG. 18C, assume that the user utters "Tell me the YYY result too."

In this case, when the user voice is received through the microphone, the processor 130 may determine whether the user voice is a voice requesting the response result of another assistant service. Here, since "Tell me the YYY result too" is a voice requesting the response result of the assistant service YYY, the processor 130 may transmit, to the television 50, the response information received from the server providing the assistant service YYY among the previously received response information.

Accordingly, as shown in FIG. 18D, the television 50 may display "Shall I tell you in Korean time?" 1820 on its display based on the response information received from the electronic device 100.

Meanwhile, the above examples describe the operations for speech recognition as being performed by the plurality of servers 200-1, 200-2, ..., 200-n.

However, this is only an example, and at least some of the operations for speech recognition may be performed by another electronic device (not shown).

Specifically, the processor 130 may preprocess the user voice and convert it into text. The processor 130 may then transmit information about the converted text to the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may identify the intent, entities, and the like of the user voice from the text and transmit response information for the user voice to the electronic device 100 based on the natural language understanding result.

As another example, the processor 130 may preprocess the user voice, convert it into text, identify the intent, entities, and the like of the user voice from the text, and transmit information about the natural language understanding result to the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the plurality of servers 200-1, 200-2, ..., 200-n may obtain information for a response to the user voice based on the information received from the electronic device 100 and, based on the obtained information, obtain natural language as the response information for the user voice. The plurality of servers 200-1, 200-2, ..., 200-n may then transmit the response information to the electronic device 100.
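The three ways of dividing the speech-processing work described above can be summarized in a short sketch. The stage names and the `Split` enumeration are hypothetical labels introduced for illustration; the disclosure describes the stages only in prose.

```python
from enum import Enum


class Split(Enum):
    SERVER_ALL = 0   # servers perform all speech-recognition operations
    DEVICE_TEXT = 1  # device preprocesses and converts speech to text
    DEVICE_NLU = 2   # device also identifies intent and entities


# Ordered processing stages (illustrative labels).
STAGES = ["preprocess", "speech_to_text", "understand", "generate_response"]


def assign_stages(split):
    """Return (stages run on the device, stages run on each server)."""
    n_device = {Split.SERVER_ALL: 0, Split.DEVICE_TEXT: 2, Split.DEVICE_NLU: 3}[split]
    return STAGES[:n_device], STAGES[n_device:]
```

In every variant the response itself is still generated by the servers; only the point at which the pipeline is handed over changes.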

FIG. 19 is a block diagram illustrating a configuration of an electronic device according to various embodiments of the present disclosure.

Referring to FIG. 19, the electronic device 100 may include a communication unit 110, a memory 120, a processor 130, a display 140, a microphone 150, an audio output unit 160, and a user input unit 170. These components may be controlled by the processor 130.

Meanwhile, the communication unit 110, the memory 120, and the processor 130 perform the same functions as the communication unit 110, the memory 120, and the processor 130 shown in FIG. 2, so redundant descriptions of these components are omitted.

The communication unit 110 may communicate with various types of external devices according to various types of communication methods. The communication unit 110 may include at least one of a Wi-Fi chip 111, a Bluetooth chip 112, a wireless communication chip 113, an NFC chip 114, and an Ethernet chip 115.

In this case, the processor 130 may communicate with an external server or various external devices using the communication unit 110.

For example, the processor 130 may transmit the user voice acquired by the electronic device 100 to a server through the communication unit 110 and may receive, from the server, response information that corresponds to the user voice and was obtained through an artificial intelligence model. The processor 130 may also receive, through the communication unit 110, a user voice acquired by an external electronic device (not shown). In addition, the processor 130 may receive, through the communication unit 110, a signal indicating that a preset button has been pressed on the external electronic device (not shown).

The memory 120 may store instructions or data related to at least one other component of the electronic device 100. The memory 120 may be implemented as a non-volatile memory, a volatile memory, a flash memory, a hard disk drive (HDD), a solid state drive (SSD), or the like. The memory 120 is accessed by the processor 130, which may read, write, modify, delete, and update data in it. In this document, the term memory may include the memory 120, a ROM (not shown) or a RAM (not shown) in the processor 130, or a memory card (not shown) mounted in the electronic device 100 (e.g., a micro SD card or a memory stick).

In addition, the memory 120 may store programs and data for composing the various screens to be displayed in the display area of the display 140. The memory 120 may also store the various artificial intelligence models of this document, for example, a model for checking whether a user voice is a voice requesting the response result of another assistant service. However, this is only an example, and the model may instead be stored in an external electronic device (not shown). In this case, the processor 130 may transmit the user voice to the external electronic device (not shown) through the communication unit 110 and receive the recognition result from the external electronic device (not shown).

The display 140 may display various screens. Specifically, the processor 130 may display a response to the user voice on the display 140 based on the response information received from the plurality of servers 200-1, 200-2, ..., 200-n. In this case, the screen displayed on the display 140 may include text representing an answer to the user voice.

Meanwhile, at least a portion of the display 140 may be coupled, in the form of a flexible display, to at least one of a front area, a side area, and a rear area of the electronic device 100. A flexible display is characterized by a substrate as thin and flexible as paper that can be bent, curved, or rolled without damage.

In addition, the display 140 may be combined with the touch panel 171 and implemented as a touch screen with a layered structure. The touch screen may provide a display function and may also detect the touch input position, the touched area, and the touch input pressure. It may detect a proximity touch as well as a real touch.

The microphone 150 may acquire a user voice. The microphone 150 may receive the user voice and generate a voice signal from it. In this case, the processor 130 may provide a response to the user voice using the voice signal.

The audio output unit 160 may be implemented as a speaker and may output various notification sounds or voice messages. In this case, the processor 130 may convert the response to the user voice into speech and output it through the audio output unit 160.

The user input unit 170 may receive various user inputs and pass them to the processor 130. The user input unit 170 may include, for example, a touch panel 171 or a key 172. The touch panel 171 may use, for example, at least one of a capacitive, resistive, infrared, or ultrasonic method. The touch panel 171 may further include a control circuit. The touch panel 171 may also include a tactile layer to provide a tactile response to the user. The key 172 may include, for example, a physical button, an optical key, or a keypad.

In particular, the user input unit 170 may pass to the processor 130 a user input that selects a specific button provided on the user input unit 170, and when this user input is received, the processor 130 may provide the assistant service using a user voice acquired afterward.

The processor 130 (or controller) may control the overall operation of the electronic device 100 using various programs stored in the memory 120.

The processor 130 may include a RAM 133, a ROM 134, a graphics processing unit 136, a main CPU 135, first to n-th interfaces 137-1 to 137-n, and a bus 138. The RAM 133, the ROM 134, the graphics processing unit 136, the main CPU 135, and the first to n-th interfaces 137-1 to 137-n may be connected to one another through the bus 138.

Meanwhile, the electronic device 100 according to an embodiment of the present disclosure may provide various guides related to the assistant service, which are described in more detail with reference to FIGS. 20 and 21.

According to an embodiment of the present disclosure, even when the user utters a specific wake-up word, the electronic device 100 may transmit the information about the user voice not only to the server providing the assistant service that corresponds to the wake-up word but also to the other servers.

Accordingly, the service providers that offer assistant services through the plurality of servers 200-1, 200-2, ..., 200-n can use the user voices received by the plurality of servers 200-1, 200-2, ..., 200-n to find out which utterances users make most often when using an assistant service.

Accordingly, when a service provider's own assistant service cannot provide a function or answer that the user requested, the service provider may train the assistant service so that it can provide that function or answer. When the training is complete, the service provider may transmit, through its server, a guide to the electronic device 100 indicating that the function or answer is now available.
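On the service provider side, finding which requests are frequent but not yet supported could look like the sketch below. The log format (pairs of utterance and an `answered` flag) and the function name `unsupported_requests` are assumptions made for illustration; the disclosure does not specify how providers analyze the received user voices.

```python
from collections import Counter


def unsupported_requests(log, top=3):
    """Return the most frequent utterances the service could not answer.

    log is an iterable of (utterance, answered) pairs, where answered is a bool.
    """
    counts = Counter(utterance for utterance, answered in log if not answered)
    return counts.most_common(top)
```

The provider could then train its assistant service on the top entries and send a guide to the electronic device 100 once training is complete.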

In this case, the electronic device 100 may display the information received from the server. For example, as shown in FIG. 20, it may display a guide such as "XXX has learned the lighting control function. From now on, you can control the lights by saying 'XXX, turn off the lights'" 2010.

In addition, service providers may transmit a guide to the electronic device 100 indicating that their own assistant service can also provide the function or answer requested by the user.

In this case, the electronic device 100 may display the information received from the server. For example, as shown in FIG. 21, it may display a guide such as "XXX can also tell you the weather. From now on, you can check the weather by saying 'XXX, tell me tomorrow's weather'" 2110.

FIG. 22 is a flowchart illustrating a method of providing an assistant service by an electronic device according to an embodiment of the present disclosure.

First, when a user input for executing an assistant service is received, information about the user voice acquired by the electronic device is transmitted to a plurality of servers providing different assistant services (S2210).

Then, when a plurality of pieces of response information is received from the plurality of servers, a response to the user voice is provided based on at least one of the received pieces of response information (S2220).

Here, the plurality of servers may provide the assistant services using an artificial intelligence agent.
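Steps S2210 and S2220 amount to a fan-out followed by a selection, which can be sketched as below. This is an assumed structure, not the disclosed implementation: each server is represented by a hypothetical callable standing in for a network request, the requests are issued concurrently with a thread pool, and `choose` is a hypothetical policy that picks or combines the responses.

```python
from concurrent.futures import ThreadPoolExecutor


def provide_assistant_response(voice_info, servers, choose):
    """Send voice_info to every server, then build a response from the replies.

    servers: non-empty dict mapping service name to a callable that takes
        voice_info and returns response information (stand-in for a network call).
    choose: callable that takes the dict of responses and returns the response
        to provide to the user.
    """
    # S2210: transmit the information about the user voice to all servers.
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {name: pool.submit(ask, voice_info) for name, ask in servers.items()}
        responses = {name: future.result() for name, future in futures.items()}
    # S2220: provide a response based on at least one piece of response information.
    return choose(responses)
```

Passing `choose=lambda r: r["XXX"]` would model the case where only the called service answers, while passing `choose=dict` would hand all responses to the display logic.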

Meanwhile, in step S2210, when a user input for executing a first assistant service is received, the information about the user voice may be transmitted to a first server providing the first assistant service and a second server providing a second assistant service, and in step S2220, a response to the user voice may be provided based on the response information received from the first server among the plurality of pieces of response information received from the first and second servers.

In addition, in step S2210, when a first wake-up word for executing the first assistant service is identified in a first user voice acquired by the electronic device, information about the first user voice may be transmitted to the first server providing the first assistant service and the second server providing the second assistant service, and in step S2220, a response to the first user voice may be provided based on the response information received from the first server among the plurality of pieces of response information received from the first and second servers in response to the transmission of the information about the first user voice.

Here, it may be checked whether a second user voice acquired after the first user voice is a voice requesting another assistant service's response result for the first user voice.

In addition, when the second user voice is a voice requesting the second assistant service's response result for the first user voice, a response to the second user voice may be provided based on the response information received from the second server among the plurality of pieces of response information received in response to the transmission of the information about the first user voice.

In addition, when the second user voice is not a voice requesting another assistant service's response result, information about the second user voice may be transmitted to the first server and the second server, and a response to the second user voice may be provided based on the response information received from the first server among the plurality of pieces of response information received from the first and second servers in response to the transmission of the information about the second user voice.
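The two branches for the second user voice can be sketched as a single decision function. All names here are hypothetical: `requested_service` stands in for the determination of whether the voice asks for another service's result, and `ask_all` stands in for transmitting the voice information to both servers and collecting their responses.

```python
def respond_to_second_voice(second_voice, cached, first_service, requested_service, ask_all):
    """Return (service, response) for a follow-up utterance.

    cached: responses to the first user voice, keyed by service name.
    requested_service: callable returning the name of the service whose result
        is requested, or None if the voice is an ordinary new request.
    ask_all: callable that sends the voice information to every server and
        returns a dict of responses keyed by service name.
    """
    target = requested_service(second_voice)
    if target is not None and target in cached:
        # Reuse the response already received for the first user voice.
        return target, cached[target]
    # Otherwise treat it as a new request and keep answering with the first service.
    fresh = ask_all(second_voice)
    return first_service, fresh[first_service]
```

Falling back to a new request when the named service has no cached response is a choice made for this sketch; the text above does not describe that case.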

Meanwhile, in step S2210, when a user voice is acquired after a user input made by selecting a preset button is received, the information about the user voice may be transmitted to the plurality of servers, and in step S2220, when a plurality of pieces of response information is received from the plurality of servers, a plurality of responses to the user voice may be provided based on the plurality of pieces of response information.

In this case, in step S2220, the plurality of responses may be output in different forms based on at least one of the user's preference for the plurality of assistant services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.
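One way to turn preference and accuracy into an output order is a weighted score, sketched below. The weighted sum, the default weights, and the score ranges are assumptions for illustration only; the disclosure states just that at least one of the two factors is used.

```python
def rank_responses(responses, preference, accuracy, w_pref=0.5, w_acc=0.5):
    """Return service names ordered from most to least prominent.

    responses: dict mapping service name to response text.
    preference, accuracy: dicts mapping service name to a score in [0, 1];
        a missing score counts as 0.
    """
    def score(name):
        return w_pref * preference.get(name, 0.0) + w_acc * accuracy.get(name, 0.0)

    return sorted(responses, key=score, reverse=True)
```

The resulting order could then decide the output form, for example showing the first entry largest or speaking it first.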

As used in this document, the term "module" includes a unit composed of hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit or part thereof that performs one or more functions. For example, a module may be implemented as an application-specific integrated circuit (ASIC).

Various embodiments of this document may be implemented as software including instructions stored in a machine-readable storage medium, that is, a storage medium readable by a machine (e.g., a computer). The machine is a device that can call the stored instructions from the storage medium and operate according to the called instructions, and may include an electronic device (e.g., the electronic device A) according to the disclosed embodiments. When an instruction is executed by a processor, the processor may perform the function corresponding to the instruction either directly or by using other components under its control. The instructions may include code generated or executed by a compiler or an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means only that the storage medium is tangible and does not include a signal; it does not distinguish between data being stored semi-permanently and data being stored temporarily in the storage medium.

According to an embodiment, the method according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product can be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., a compact disc read only memory (CD-ROM)) or online through an application store (e.g., Play Store™). In the case of online distribution, at least part of the computer program product may be at least temporarily stored in, or temporarily generated in, a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Each component (e.g., a module or a program) according to the various embodiments may consist of a single entity or a plurality of entities, and some of the sub-components described above may be omitted, or other sub-components may be further included in the various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into a single entity that performs the same or similar functions as those performed by each of the components before integration. Operations performed by a module, a program, or another component according to the various embodiments may be executed sequentially, in parallel, repeatedly, or heuristically, and at least some operations may be executed in a different order or omitted, or other operations may be added.

100: electronic device 110: communication unit
120: memory 130: processor

Claims

An electronic device comprising:
a communication unit;
a memory; and
a processor connected to the communication unit and the memory to control the electronic device,
wherein the processor, by executing at least one instruction stored in the memory,
transmits, when a user input for executing a secretary service is received, information on a user voice obtained by the electronic device to a plurality of servers providing different secretary services through the communication unit, and
provides, when a plurality of pieces of response information are received from the plurality of servers, a response to the user voice based on at least one of the received pieces of response information,
wherein the plurality of servers provide the secretary services using an artificial intelligence agent.
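
The transmit-then-respond flow recited above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the server functions, the `SERVERS` table, and the names `broadcast` and `respond` are hypothetical stand-ins, and real servers would be reached over a network through the communication unit rather than called locally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-ins for remote servers that each provide a
# different secretary service (assumed names, local stubs only).
def first_server(voice_info: str) -> str:
    return f"first: answer to '{voice_info}'"

def second_server(voice_info: str) -> str:
    return f"second: answer to '{voice_info}'"

SERVERS: Dict[str, Callable[[str], str]] = {
    "first": first_server,
    "second": second_server,
}

def broadcast(voice_info: str) -> Dict[str, str]:
    """Send the same voice information to every server and collect all responses."""
    with ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
        futures = {name: pool.submit(ask, voice_info) for name, ask in SERVERS.items()}
        return {name: future.result() for name, future in futures.items()}

def respond(voice_info: str, selected: str) -> str:
    """Provide a response based on one of the received pieces of response information."""
    responses = broadcast(voice_info)
    return responses[selected]
```

In this sketch every server is queried for every utterance, and the choice of which response to present is made only after all responses have arrived.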

The electronic device of claim 1,
wherein the processor,
when a user input for executing a first secretary service is received, transmits the information on the user voice to a first server providing the first secretary service and a second server providing a second secretary service, and provides a response to the user voice based on the response information received from the first server among a plurality of pieces of response information received from the first and second servers.

The electronic device of claim 1,
wherein the processor,
when a first wakeup word for executing a first secretary service is identified in a first user voice acquired by the electronic device, transmits information on the first user voice to a first server providing the first secretary service and a second server providing a second secretary service, and provides a response to the first user voice based on the response information received from the first server among a plurality of pieces of response information received from the first server and the second server in response to the transmission of the information on the first user voice.

The electronic device of claim 3,
wherein the processor
identifies whether a second user voice obtained after the first user voice corresponds to a voice requesting a response result of another secretary service for the first user voice.

The electronic device of claim 4,
wherein the processor,
if the second user voice corresponds to a voice requesting a response result of the second secretary service for the first user voice, provides a response to the second user voice based on the response information received from the second server among the plurality of pieces of response information received in response to the transmission of the information on the first user voice.

The electronic device of claim 4,
wherein the processor,
if the second user voice does not correspond to a voice requesting a response result of another secretary service, transmits information on the second user voice to the first server and the second server, and provides a response to the second user voice based on the response information received from the first server among a plurality of pieces of response information received from the first server and the second server in response to the transmission of the information on the second user voice.
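
The handling of a first and a second user voice described above can be sketched as follows. This is only an illustration under stated assumptions: the class `AssistantSession`, its method names, and the `is_other_service_request` classifier are hypothetical, and the sketch assumes `on_first_voice` is called before `on_second_voice`.

```python
from typing import Callable, Dict, List, Optional

class AssistantSession:
    """Hypothetical sketch: keep every server's response to the last
    transmitted utterance so a follow-up can reuse it without resending."""

    def __init__(
        self,
        servers: Dict[str, Callable[[str], str]],
        is_other_service_request: Callable[[str], Optional[str]],
    ) -> None:
        self.servers = servers
        # Assumed classifier: returns the name of the other service whose
        # result is being requested, or None for an ordinary utterance.
        self.is_other_service_request = is_other_service_request
        self.cached: Dict[str, str] = {}
        self.active: Optional[str] = None
        self.sent: List[str] = []  # utterances actually transmitted

    def _broadcast(self, utterance: str) -> None:
        self.sent.append(utterance)
        self.cached = {name: ask(utterance) for name, ask in self.servers.items()}

    def on_first_voice(self, wakeup_service: str, utterance: str) -> str:
        # The wakeup word selects the service whose response is shown,
        # but the utterance is still sent to every server.
        self.active = wakeup_service
        self._broadcast(utterance)
        return self.cached[self.active]

    def on_second_voice(self, utterance: str) -> str:
        other = self.is_other_service_request(utterance)
        if other is not None and other in self.cached:
            # Reuse the response already received for the first voice.
            return self.cached[other]
        # Otherwise treat it as a new request and keep the active service.
        self._broadcast(utterance)
        return self.cached[self.active]
```

The design point illustrated here is that a request for another service's result needs no new transmission, because that response was already received along with the first one.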

The electronic device of claim 1,
wherein the processor,
when the user voice is obtained after the user input is received through selection of a preset button, transmits the information on the user voice to the plurality of servers, and, when a plurality of pieces of response information are received from the plurality of servers, provides a plurality of responses to the user voice based on the plurality of pieces of response information.

The electronic device of claim 7,
wherein the processor
outputs the plurality of responses in different forms based on at least one of a user's preference for the plurality of secretary services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.
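
One possible way to output a plurality of responses in different forms is sketched below. The claims do not specify how preference and accuracy are measured or combined, so the `Response` fields, the additive score, and the "large"/"small" display forms are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Response:
    service: str
    text: str
    accuracy: float    # assumed confidence score in [0, 1]
    preference: float  # assumed user preference score in [0, 1]

def layout(responses: List[Response]) -> List[Tuple[str, str, str]]:
    """Rank responses by an assumed combined score and assign a display form.

    The top-ranked response is shown in a "large" form and the rest in a
    "small" form; both the score and the forms are hypothetical choices.
    """
    ranked = sorted(responses, key=lambda r: r.preference + r.accuracy, reverse=True)
    return [
        (r.service, "large" if index == 0 else "small", r.text)
        for index, r in enumerate(ranked)
    ]
```

Other forms, such as different ordering, color, or output volume, could be substituted for the size distinction without changing the structure of the sketch.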

A method of providing a secretary service by an electronic device, the method comprising:
transmitting, when a user input for executing a secretary service is received, information on a user voice obtained by the electronic device to a plurality of servers providing different secretary services; and
providing, when a plurality of pieces of response information are received from the plurality of servers, a response to the user voice based on at least one of the received pieces of response information,
wherein the plurality of servers provide the secretary services using an artificial intelligence agent.

The method of claim 9,
wherein the transmitting comprises,
when a user input for executing a first secretary service is received, transmitting the information on the user voice to a first server providing the first secretary service and a second server providing a second secretary service, and
wherein the providing comprises
providing a response to the user voice based on the response information received from the first server among a plurality of pieces of response information received from the first and second servers.

The method of claim 9,
wherein the transmitting comprises,
when a first wakeup word for executing a first secretary service is identified in a first user voice acquired by the electronic device, transmitting information on the first user voice to a first server providing the first secretary service and a second server providing a second secretary service, and
wherein the providing comprises
providing a response to the first user voice based on the response information received from the first server among a plurality of pieces of response information received from the first server and the second server in response to the transmission of the information on the first user voice.

The method of claim 11,
further comprising identifying whether a second user voice obtained after the first user voice corresponds to a voice requesting a response result of another secretary service for the first user voice.

The method of claim 12,
further comprising, if the second user voice corresponds to a voice requesting a response result of the second secretary service for the first user voice, providing a response to the second user voice based on the response information received from the second server among the plurality of pieces of response information received in response to the transmission of the information on the first user voice.

The method of claim 12,
further comprising, if the second user voice does not correspond to a voice requesting a response result of another secretary service, transmitting information on the second user voice to the first server and the second server, and providing a response to the second user voice based on the response information received from the first server among a plurality of pieces of response information received from the first server and the second server in response to the transmission of the information on the second user voice.

The method of claim 9,
wherein the transmitting comprises,
when the user voice is obtained after the user input is received through selection of a preset button, transmitting the information on the user voice to the plurality of servers, and
wherein the providing comprises,
when a plurality of pieces of response information are received from the plurality of servers, providing a plurality of responses to the user voice based on the plurality of pieces of response information.

The method of claim 15,
wherein the providing comprises
outputting the plurality of responses in different forms based on at least one of a user's preference for the plurality of secretary services provided by the plurality of servers and the accuracy of the plurality of pieces of response information.