JPH1138995A

JPH1138995A - Speech recognition device and navigation system

Info

Publication number: JPH1138995A
Application number: JP9191186A
Authority: JP
Inventors: Hisashi Sugiura (杉浦 恒)
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 1997-07-16
Filing date: 1997-07-16
Publication date: 1999-02-12

Abstract

PROBLEM TO BE SOLVED: To make a speech recognition device easy to use even for an inexperienced user by reporting predetermined guide content when no speech has been input even though a predetermined time has elapsed since the start of a speech input period was specified. SOLUTION: After reporting guide content A, a control unit 38 judges whether a push-to-talk (PTT) switch 36 has been turned on, and then judges whether the operation was a click operation. If it was not a click operation, the control unit judges whether there is speech input. If there is no speech input, it judges whether the PTT switch 36 has been turned off; if the switch remains on, it judges whether a predetermined time Ta has elapsed. If the predetermined time Ta has elapsed, guide content C is reported. A concrete example of guide content C is "When setting a destination, please start from the prefecture name." A user who wants to set a destination is thus informed of the input method.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech recognition device that is useful, for example, for allowing settings such as the destination in a navigation system to be entered by voice, and to a navigation system equipped with that speech recognition device.

[0002]

2. Description of the Related Art

Speech recognition devices that compare input speech with a plurality of comparison target pattern candidates stored in advance, and take a candidate with a high degree of match as the recognition result, have already been put into practical use. They are used, for example, to let a user speak a place name to enter the destination to be set in a navigation system. Voice input is particularly effective when the driver personally operates an in-vehicle navigation system, because it involves neither button operation nor looking at the screen and is therefore safe even while the vehicle is moving.

[0003] Because it is a heavy burden for the speech recognition device to be ready for speech input at all times, a configuration is often adopted in which the user specifies the start and end timing of the period during which the speech to be recognized is input. A PTT (Push-To-Talk) switch, for example, is used as the means for specifying this input period. That is, recognition processing is executed on input speech only while the PTT switch is pressed, so that input speech that does not actually need processing is not processed.

[0004] Voice input is thus a highly useful technique, but the following problems arise when the user is not accustomed to the speech recognition system itself. For example, a user who does not know how to use the device may enter speech that is unsuitable for it, so that the device is ineffective, or in an extreme case may not use the device at all. A user who understands the usage in general but is not accustomed to the specifics may fail to come up with the words to say and fall silent.

[0005] An object of the present invention is to solve these problems and to provide a speech recognition device that is easy to use even for a user unaccustomed to it and is therefore more convenient, and a navigation system equipped with that speech recognition device.

[0006]

MEANS FOR SOLVING THE PROBLEMS AND EFFECTS OF THE INVENTION

The speech recognition device according to claim 1, made to achieve the above object, comprises: speech input means for inputting speech; input period specifying means provided so that the user can specify the start and end of the period during which the speech to be recognized is input using the speech input means; recognition means that compares the speech input through the speech input means within the input period specified by the input period specifying means with a plurality of comparison target pattern candidates stored in advance in dictionary means, and takes a candidate with a high degree of match as the recognition result; notification means that reports the recognition result of the recognition means; and post-confirmation processing means that, when a predetermined confirmation instruction is given after the recognition result has been reported by the notification means, treats the recognition result as confirmed and executes predetermined post-confirmation processing. The device is characterized by further comprising guide means that reports predetermined guide content when no speech is input through the speech input means even though a predetermined time has elapsed since the start of the speech input period was specified by the input period specifying means.

[0007] With the speech recognition device of claim 1, the user inputs speech through speech input means such as a microphone. Input period specifying means is provided so that the user can specify the start and end of the period during which the speech to be recognized is input, and the speech input within the period so specified becomes the recognition target. The recognition means compares the input speech with a plurality of comparison target pattern candidates stored in advance in the dictionary means and takes a candidate with a high degree of match as the recognition result, which the notification means reports. If a predetermined confirmation instruction is given after the recognition result has been reported, the post-confirmation processing means treats the recognition result as confirmed and executes predetermined post-confirmation processing.

[0008] When this basic speech recognition processing is executed and no speech is input through the speech input means even though a predetermined time has elapsed since the start of the speech input period was specified by the input period specifying means, the guide means reports predetermined guide content, at least by voice. The fact that the start of the speech input period was specified implies that the user intends to input speech. One likely reason why no speech is input within the predetermined time despite that intention is that the user wants to input speech but does not sufficiently understand how to do so. The predetermined guide content is therefore reported to a user who is having trouble using the device.
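The timeout condition described in the paragraph above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the patent: the names `InputPeriod` and `guide_due`, the 3-second value used in the example, and the rule of reporting the guide only once per input period are all hypothetical. The text itself specifies only "a predetermined time".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputPeriod:
    """State of one speech input period (all field names are illustrative)."""
    started_at: Optional[float] = None  # time the start was specified; None if not started
    speech_detected: bool = False       # True once any speech has been input
    guide_reported: bool = False        # True once the guide has been reported


def guide_due(period: InputPeriod, now: float, timeout_s: float) -> bool:
    """Return True when the timeout guide should be reported at time `now`."""
    if period.started_at is None:   # no input period in progress
        return False
    if period.speech_detected:      # the user is speaking, so no guide is needed
        return False
    if period.guide_reported:       # assumption: do not repeat the same guide
        return False
    return (now - period.started_at) >= timeout_s


# Example with a hypothetical 3-second timeout.
period = InputPeriod(started_at=0.0)
print(guide_due(period, now=2.0, timeout_s=3.0))  # still within the timeout
print(guide_due(period, now=3.5, timeout_s=3.0))  # timeout elapsed with no speech
```

Keeping the check as a pure function of the state and the current time makes it easy to call from whatever polling loop or timer the control unit uses.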

[0009] Various kinds of guide content are conceivable here, so some examples are described below on the assumption that the speech recognition device is applied to voice entry of a destination and the like in a navigation system. (1) First, when the user does not even know what can be entered by voice into the navigation system, guide content that announces the setting items that can be specified by voice, such as "You can set a destination," is conceivable. The content may be changed depending on which stage of a series of processes the target system (here, the navigation device) has reached. For example, if setting a route to a destination is regarded as one series of processes, then after the destination has been set there may be further conditions to specify for the route, such as whether to designate a waypoint. Therefore, if the destination has already been set and no speech is input even though the predetermined time has elapsed since the start of the speech input period was specified, guide content such as "You can designate a waypoint" should be reported instead.
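The stage-dependent choice of guide content can be pictured as a simple lookup. The sketch below is illustrative only: the stage names `destination_not_set` and `destination_set` and the function `settable_item_guide` are hypothetical, while the two messages are the examples given in the text.

```python
# Mapping from the current stage of route setting to the guide that announces
# what can be set by voice. The stage names are assumptions for illustration.
GUIDE_BY_STAGE = {
    "destination_not_set": "You can set a destination.",
    "destination_set": "You can designate a waypoint.",
}


def settable_item_guide(stage: str) -> str:
    """Return the guide content for the given stage of the route-setting process."""
    return GUIDE_BY_STAGE[stage]


print(settable_item_guide("destination_not_set"))
print(settable_item_guide("destination_set"))
```

A table of this kind keeps the guide wording separate from the control flow, so further stages can be added without touching the timeout logic.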

[0010] (2) It is also conceivable that the user knows what can be entered by voice into the navigation system but is not familiar with the specific input method. In this case, an explanation of the input method may be reported as guide content, such as "When setting a destination, please start from the prefecture name." If the user still does not fully understand how to enter the destination and remains unable to do so, a concrete input example may be reported as guide content, such as "For example, say 'Aichi Prefecture, Kariya City, Showa-cho.'" The user thereby learns the specific input method and, given a concrete example, can easily enter the desired destination by following it.
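The escalation from a general explanation to a concrete example can be sketched as below. The function name `input_method_guide` and the idea of counting how many guides have already been reported are assumptions made for illustration; the text says only that the concrete example follows when the user still cannot input.

```python
# Guides ordered from the general explanation to the concrete example.
INPUT_METHOD_GUIDES = [
    "When setting a destination, please start from the prefecture name.",
    "For example, say: Aichi Prefecture, Kariya City, Showa-cho.",
]


def input_method_guide(previous_guides: int) -> str:
    """Return the next guide, given how many guides were already reported.

    The last (most concrete) guide is repeated once the list is exhausted.
    """
    index = max(0, min(previous_guides, len(INPUT_METHOD_GUIDES) - 1))
    return INPUT_METHOD_GUIDES[index]


print(input_method_guide(0))  # first silence: explain the input method
print(input_method_guide(1))  # still silent: give a concrete example
```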

[0011] Of course, the guide content described in (1) and (2) above is only an example. Guide content suited to the processing of the navigation system in question should be provided, and when the device is applied to something other than a navigation system, the content should be adapted to the processing of that system.

[0012] Thus, the speech recognition device of the present invention can be used easily even by a user unaccustomed to it, providing a more convenient and user-friendly device. As a way of providing a guide function for users unaccustomed to the device, it is also conceivable to handle the case where the user actively wants to know the guide content, instead of reporting guide content when no speech is input through the speech input means even though a predetermined time has elapsed since the start of the speech input period was specified, as described above.

[0013] A speech recognition device that takes this point into account is, as set out in claim 2, a speech recognition device comprising: speech input means for inputting speech; input period specifying means provided so that the user can specify the start and end of the period during which the speech to be recognized is input using the speech input means; recognition means that compares the speech input through the speech input means within the input period specified by the input period specifying means with a plurality of comparison target pattern candidates stored in advance in dictionary means, and takes a candidate with a high degree of match as the recognition result; notification means that reports the recognition result of the recognition means; and post-confirmation processing means that, when a predetermined confirmation instruction is given after the recognition result has been reported by the notification means, treats the recognition result as confirmed and executes predetermined post-confirmation processing. The device is characterized by comprising guide means that, when a predetermined guide request that is not a normal specification of the start and end of a speech input period is made through the input period specifying means, reports predetermined guide content corresponding to that guide request.

[0014] With the speech recognition device of claim 2, the user inputs speech through speech input means such as a microphone. Input period specifying means is provided so that the user can specify the start and end of the period during which the speech to be recognized is input, and the speech input within the period so specified becomes the recognition target. The recognition means compares the input speech with a plurality of comparison target pattern candidates stored in advance in the dictionary means and takes a candidate with a high degree of match as the recognition result, which the notification means reports. If a predetermined confirmation instruction is given after the recognition result has been reported, the post-confirmation processing means treats the recognition result as confirmed and executes predetermined post-confirmation processing.

[0015] When this basic speech recognition processing is executed and a predetermined guide request that is not a normal specification of the start and end of a speech input period is made through the input period specifying means, the guide means reports predetermined guide content corresponding to the guide request. Here, "a predetermined guide request that is not a normal specification of the start and end of a speech input period" refers, for example, to the following case. In a normal specification, the start is specified, speech is then input, and the end is specified afterwards, so an interval from start to end of, say, 1 second or less is hard to imagine. Accordingly, a so-called "click operation," in which the interval from the start specification to the end specification is relatively short (for example, 0.5 seconds or less), is judged to be an operation for requesting a guide rather than a specification of the start and end of a speech input period. Of course, any other special operation that the device can tell apart from a normal specification of the start and end of a speech input period may also be treated as a guide request operation. A so-called "double click," for example, is also applicable.
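The click judgment can be sketched as a comparison of the press-to-release interval against a threshold. The 0.5-second default follows the example value given in the text; the function name `classify_ptt_operation` and the returned labels are hypothetical.

```python
def classify_ptt_operation(pressed_at: float, released_at: float,
                           click_max_s: float = 0.5) -> str:
    """Classify one press/release of the PTT switch.

    A press that is released within `click_max_s` seconds is treated as a
    guide request ("click operation"); a longer press is treated as a normal
    speech input period.
    """
    if released_at < pressed_at:
        raise ValueError("release must not precede press")
    duration = released_at - pressed_at
    return "guide_request" if duration <= click_max_s else "speech_input"


print(classify_ptt_operation(10.0, 10.3))  # short press
print(classify_ptt_operation(10.0, 12.5))  # long press
```

A double click could be recognized in the same spirit by checking that two such short presses occur within a short interval of each other, although the text gives no timing values for that case.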

[0016] With the speech recognition device of claim 2, a user who actively wants to know the guide content need only make the predetermined guide request through the input period specifying means, so a guide function for users unaccustomed to the device is provided. Moreover, no special component is needed solely for specifying a guide request: the input period specifying means required for basic voice input processing is also used for the guide request, which helps simplify the configuration.

[0017] Various kinds of guide content are conceivable in this case as well. Of items (1) and (2) described above on the assumption that the device is applied to voice entry of a destination and the like in a navigation system, the content shown in (1) is considered particularly effective here. That is, when the user does not even know what can be entered by voice into the navigation system, the guide announces the setting items that can be specified by voice, such as "You can set a destination." Here too, the content may of course be changed depending on the stage of the series of navigation processes. The guide content shown in (2) is not excluded either: guide content such as "When setting a destination, please start from the prefecture name" or "For example, say 'Aichi Prefecture, Kariya City, Showa-cho'" may also be reported.

[0018] In the speech recognition device of claim 1 described above, the user's intention to input speech can be inferred, yet no speech is actually input, and the user may not sufficiently understand how to input. The predetermined guide content was therefore chosen chiefly to teach a user who is having trouble "how to input." In the speech recognition device of claim 2, by contrast, the guide request is made not when the user is about to input speech but at an earlier stage, so guide content that announces the setting items that can be specified by voice, such as "You can set a destination," is suitable, as described above. However, the content itself is not particularly limited, since it suffices to judge the situation in which the guide request was made and to report guide content appropriate to that situation.

[0019] A speech recognition device that combines the two functions described above may have the configuration set out in claim 3. That is, in the speech recognition device according to claim 1, the guide means is configured so that, when a predetermined guide request that is not a normal specification of the start and end of a speech input period is made through the input period specifying means, it reports predetermined guide content corresponding to that guide request, at least by voice.

[0020] In this speech recognition device, the guide means identifies the following two situations and executes the corresponding processing. In the first, predetermined guide content is reported when no speech is input through the speech input means even though a predetermined time has elapsed since the start of the speech input period was specified by the input period specifying means. In the second, when a predetermined guide request that is not a normal specification of the start and end of a speech input period is made through the input period specifying means, predetermined guide content corresponding to that guide request is reported.
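The two situations handled by the guide means can be combined in a single decision function, sketched below. Everything named here is an assumption for illustration: the function `guide_trigger`, the labels `"requested"` and `"timeout"`, and the 3-second timeout. Only the 0.5-second click threshold comes from an example in the text.

```python
from typing import Optional


def guide_trigger(pressed_at: float, released_at: Optional[float],
                  speech_detected: bool, now: float,
                  click_max_s: float = 0.5,
                  timeout_s: float = 3.0) -> Optional[str]:
    """Decide which guide, if any, applies to the current PTT operation.

    Returns "requested" for a click operation (explicit guide request),
    "timeout" when the switch is still held and no speech has arrived
    within `timeout_s`, and None otherwise.
    """
    if released_at is not None:
        # Switch already released: a very short press is a guide request.
        if not speech_detected and (released_at - pressed_at) <= click_max_s:
            return "requested"
        return None
    # Switch still held: check for prolonged silence.
    if not speech_detected and (now - pressed_at) >= timeout_s:
        return "timeout"
    return None


print(guide_trigger(0.0, 0.3, False, now=0.3))   # click operation
print(guide_trigger(0.0, None, False, now=4.0))  # held, silent past the timeout
print(guide_trigger(0.0, None, True, now=4.0))   # held, user is speaking
```

The caller would then pick guide content suited to each label, for example the list of settable items for a request and the input-method explanation for a timeout.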

[0021] An appropriate guide function can therefore be provided for each of the two situations, described above, of a user unaccustomed to the device. The recognition result may be reported by the notification means, and the guide content by the guide means, in any form that reaches the user, but reporting at least by voice, as set out in claim 4, is considered preferable. One reason is that, when the device is used for in-vehicle equipment such as a car navigation system, voice output means the driver does not have to shift their gaze to the display device, which is advantageous for ensuring safe driving. In addition, if both the recognition result and the guide content are reported by voice, the hardware for the two can be shared. The report may, however, be given as an image, for example, in addition to voice. Voice output is advantageous for in-vehicle equipment, as noted, but the vehicle is not always in motion, so reporting by both voice and image lets the driver confirm the content both on the display and by ear.

[0022] When the speech recognition device according to any one of claims 1 to 6 is used for a navigation system, the configuration set out in claim 7 is conceivable. That is, the system comprises the speech recognition device according to any one of claims 1 to 6 and a navigation device; the speech input means of the speech recognition device is used by the user to enter by voice at least instructions for predetermined navigation-related data that must be specified for the navigation device to perform navigation processing; and the post-confirmation processing means is configured to output the recognition result of the recognition means to the navigation device. A typical example of the "predetermined navigation-related data" is the destination, but it also includes other instructions that must be specified for navigation processing, such as the selection of route search conditions.

[0023] The speech recognition device is not limited to the navigation system described above. For example, when it is used for an air conditioning system, adjustment of the set temperature, selection of the air conditioning mode (cooling, heating, or dry), and selection of the airflow direction mode may be performed by voice input. In that case, the setting items themselves (temperature, air conditioning mode, airflow direction mode, and so on) may be reported as guide content, and the guide may further indicate whether the user should say "Set the temperature to 25 degrees" or "Lower the set temperature by 5 degrees." The same applies to the air conditioning mode and the airflow direction mode.

[0024] The navigation system and air conditioning system described above need not be in-vehicle equipment; they may be, for example, a portable navigation device or an indoor air conditioner. As explained so far, however, the user of in-vehicle equipment may well be the driver, for whom driving itself is the top priority, so operating other in-vehicle equipment should interfere with driving as little as possible. A speech recognition device intended for an in-vehicle navigation system or air conditioning system is therefore all the more advantageous. From this viewpoint, the device can of course be used in the same way for in-vehicle equipment other than navigation and air conditioning systems. Car audio equipment is one example where it is effective. It is likewise effective in a configuration in which opening and closing of power windows, adjustment of mirror angles, and the like are instructed by voice.

[0025] Although the particular advantages for in-vehicle equipment have been described, the speech recognition device of the present invention can be applied in the same way to anything that executes predetermined processing according to voice input instructions from a user. For example, it can equally be applied to portable information terminals and to information terminals installed on streets, in parking areas, and the like.

[0026]

EMBODIMENTS OF THE INVENTION

FIG. 1 is a block diagram showing the schematic configuration of a car navigation system 2 to which a speech recognition device 30 according to an embodiment of the present invention is applied. The car navigation system 2 comprises a position detector 4, a map data input device 6, an operation switch group 8, a control circuit 10 connected to these, and an external memory 12, a display device 14, a remote control sensor 15, and the speech recognition device 30, each connected to the control circuit 10. The control circuit 10 is configured as an ordinary computer and contains a well-known CPU, ROM, RAM, and I/O, together with a bus line connecting them.

[0027] The position detector 4 has a well-known geomagnetic sensor 16, a gyroscope 18, a distance sensor 20, and a GPS receiver 22 for the GPS (Global Positioning System), which detects the position of the vehicle based on radio waves from satellites. Because each of these sensors 16, 18, 20, and 22 has errors of a different nature, they are configured to be used together, each interpolating for the others. Depending on the required accuracy, the position detector may consist of only some of the above, and a steering rotation sensor, wheel sensors for the individual rolling wheels, and the like may additionally be used.

[0028] The map data input device 6 is a device for inputting various data, including so-called map-matching data for improving position detection accuracy, map data, and landmark data. A CD-ROM is generally used as the medium because of the volume of data, but other media such as a memory card may be used.

[0029] The display device 14 is a color display. Its screen can show, superimposed on one another, the vehicle's current position mark input from the position detector 4, the map data input from the map data input device 6, and additional data to be displayed on the map, such as the guidance route and the landmarks of set points described later.

[0030] The car navigation system 2 also has a so-called route guidance function: when the position of a destination is entered from the remote control sensor 15 via a remote control terminal (hereinafter, remote control) 15a, or from the operation switch group 8, the system automatically selects the optimal route from the current position to that destination and forms and displays a guidance route. Techniques such as the Dijkstra method are known for setting an optimal route automatically in this way. The operation switch group 8 consists of, for example, touch switches integrated with the display device 14 or mechanical switches, and is used for various inputs.

[0031] Whereas the operation switch group 8 and the remote control 15a are operated manually to specify a destination and the like, the speech recognition device 30 allows the user to specify a destination and the like in the same way by speaking.

[0032] The speech recognition device 30 includes a speech recognition unit 31 serving as the "recognition means", a dialogue control unit 32 serving as the "post-determination processing means", a speech synthesis unit 33, a speech extraction unit 34, a microphone 35 serving as the "speech input means", a PTT (Push-To-Talk) switch 36 serving as the "input period designating means", a speaker 37, a control unit 38, a display control unit 39, and a display unit 40.

[0033] The speech recognition unit 31 performs recognition processing on the speech data supplied by the speech extraction unit 34, as instructed by the dialogue control unit 32, and returns the recognition result to the dialogue control unit 32. Specifically, it collates the speech data obtained from the speech extraction unit 34 against stored dictionary data, compares it with a plurality of candidate comparison patterns, and outputs the highest-ranking patterns, those with the highest degree of match, to the dialogue control unit 32. To recognize the word sequence in the input speech, the speech data from the speech extraction unit 34 is acoustically analyzed in sequence to extract acoustic features (for example, the cepstrum), yielding time-series data of acoustic features. This time-series data is then divided into sections by a well-known method such as DP matching, an HMM (hidden Markov model), or a neural network, and the word stored in the dictionary data to which each section corresponds is determined.
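The idea of collating input features against dictionary patterns by DP matching can be sketched as a dynamic time warping (DTW) distance followed by ranking. This is a minimal illustration under simplifying assumptions: each feature is a single number rather than a cepstral vector, and the dictionary words and template values are invented for the example.

```python
def dtw_distance(a, b):
    """DP-matching (DTW) distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between two frames
            cost[i][j] = d + min(cost[i - 1][j],      # stretch the template
                                 cost[i][j - 1],      # stretch the input
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def rank_candidates(features, dictionary, top_n=3):
    """Return the top_n dictionary words with the smallest DTW distance."""
    scored = sorted((dtw_distance(features, template), word)
                    for word, template in dictionary.items())
    return [word for _, word in scored[:top_n]]

# Hypothetical templates; real ones would be sequences of feature vectors.
dictionary = {"aichi": [1, 3, 4, 2], "gifu": [5, 5, 1, 0], "mie": [2, 2, 2, 2]}
best = rank_candidates([1, 3, 3, 4, 2], dictionary, top_n=2)
```

Returning several top candidates rather than one mirrors the text's description of passing the highest-ranking patterns to the dialogue control unit 32.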

[0034] Based on the recognition result from the speech recognition unit 31 and instructions from the control unit 38, the dialogue control unit 32 instructs the speech synthesis unit 33 to output a response voice, instructs the display control unit 39 to output a response display, or notifies the control circuit 10, which executes the processing of the system itself, of, for example, the destination needed for navigation processing and instructs it to execute the setting processing. This is the post-determination processing. As a result, using the speech recognition device 30 makes it possible to specify a destination and the like to the navigation system by speech input, without manually operating the operation switch group 8 or the remote control 15a.

[0035] The speech synthesis unit 33 uses speech waveforms stored in a waveform database to synthesize speech in accordance with the response voice output instruction from the dialogue control unit 32. The synthesized speech is output from the speaker 37. Likewise, the display control unit 39 generates a display image in accordance with the response display output instruction from the dialogue control unit 32, and the generated image is output to the display unit 40.

[0036] The speech extraction unit 34 converts the ambient sound picked up by the microphone 35 into digital data and outputs it to the speech recognition unit 31. More specifically, to analyze the features of the input sound, it cuts out frame signals of, for example, several tens of milliseconds at regular intervals and determines whether each input signal belongs to a speech section, which contains speech, or a noise section, which does not. This determination is needed because the signal from the microphone 35 contains noise as well as the speech to be recognized. Many determination methods have been proposed. A commonly adopted one extracts the short-time power of the input signal at fixed intervals and judges a section to be speech or noise according to whether short-time power at or above a predetermined threshold has continued for at least a certain duration. When a section is judged to be a speech section, the input signal is output to the speech recognition unit 31.
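The short-time-power method mentioned above can be sketched in a few lines. The threshold, the minimum run length, and the sample values below are assumptions chosen for illustration; the patent gives no concrete figures beyond a frame length of several tens of milliseconds.

```python
def short_time_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def is_speech(frames, threshold, min_frames):
    """True if power stays at or above `threshold` for `min_frames` frames in a row."""
    run = 0
    for frame in frames:
        run = run + 1 if short_time_power(frame) >= threshold else 0
        if run >= min_frames:
            return True
    return False

# Hypothetical frames: low-level noise and a louder burst.
quiet = [[0.01, -0.02, 0.01]] * 5
loud = [[0.5, -0.6, 0.4]] * 3

speech_found = is_speech(quiet + loud, threshold=0.1, min_frames=3)
too_short = is_speech(quiet + loud[:2] + quiet, threshold=0.1, min_frames=3)
```

Requiring several consecutive frames above the threshold keeps a brief click or bump from being treated as speech.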

[0037] In this embodiment, the user inputs speech through the microphone 35 while holding down the PTT switch 36. Specifically, the control unit 38 monitors when the PTT switch 36 is pressed, when it is released, and how long it remains pressed. When the PTT switch 36 is pressed, the control unit 38 instructs the speech extraction unit 34 to execute its processing; when the switch is not pressed, the processing is not executed. Consequently, the speech data input through the microphone 35 while the PTT switch 36 is held down is output to the speech recognition unit 31. The control unit 38 also receives the speech-section or noise-section determination result from the speech extraction unit 34, so it can determine, for example, that the PTT switch 36 has been pressed but no speech has been input and that this state has continued for a predetermined time.

[0038] Next, the operation of the car navigation system 2 of this embodiment is described. Because the distinctive features lie in the parts related to the speech recognition device 30, the general operation of the car navigation system 2 is described briefly first, followed by a detailed description of the operation of the parts related to the speech recognition device 30.

[0039] After the car navigation system 2 is powered on, the following processing is carried out in either of two cases: when the driver uses the remote control 15a (the operation switch group 8 can be operated in the same way, here and in the rest of the description) to select route information display processing from the menu shown on the display device 14 in order to display a guidance route on the display device 14; or when the driver speaks the desired menu item into the microphone 35 so that, via the speech recognition device 30, the dialogue control unit 32 issues to the control circuit 10 the same instruction that a selection through the remote control 15a would.

[0040] That is, when the driver enters a destination by voice or by operating the remote control or the like, based on the map on the display device 14, the current location of the vehicle is determined from the satellite data obtained by the GPS receiver 22, and costs between the current location and the destination are calculated by the Dijkstra method to find the shortest route from the current location to the destination as the guidance route. The guidance route is then displayed over the road map on the display device 14 to guide the driver along an appropriate route. The calculation and guidance processing for obtaining such a guidance route are generally well known, so their description is omitted.
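Although the description omits the route calculation as well known, a minimal sketch of the Dijkstra method may help readers unfamiliar with it. The road graph, node names, and distances below are invented for illustration, and a real navigation system would use costs richer than plain distance.

```python
import heapq

def shortest_route(graph, start, goal):
    """Dijkstra's algorithm. graph maps node -> {neighbor: distance}.

    Returns (total_distance, path); (inf, []) if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical road network between the current location and the destination.
roads = {
    "current": {"A": 2.0, "B": 5.0},
    "A": {"B": 1.0, "destination": 6.0},
    "B": {"destination": 2.0},
}
distance, route = shortest_route(roads, "current", "destination")
```

Here the route through A and then B (total 5.0) beats both the direct road to B (7.0) and the road from A straight to the destination (8.0).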

[0041] Next, the operation of the speech recognition device 30 is described, taking as an example the case in which the destination for the route guidance described above is input by voice. When the ignition switch 3 is operated and the engine is turned on (S100: YES), the car navigation system 2 is powered on and guide content A is announced first (S110). Here, the dialogue control unit 32 instructs the speech synthesis unit 33 to synthesize and output guide content A as speech, and instructs the display control unit 39 to output guide content A as an image. The synthesized speech is output from the speaker 37 and the image is output to the display unit 40 in accordance with these instructions. Guide content B and later contents are announced in the same form. The specific text of guide content A is "Voice input is available. Click for an explanation." In other words, the user is told that voice input is possible at all, which covers the case where the user does not know this, and a user who wants further explanation is prompted to perform a click operation.

[0042] After guide content A is announced in S110, it is determined in S120 whether the PTT switch 36 has been turned on (pressed). This determination is made by the control unit 38. If the switch has been turned on (S120: YES), it is determined in the following S130 whether the operation is a click. This determination is also made by the control unit 38: specifically, if the PTT switch 36 is turned off within a relatively short time (for example, within 0.5 seconds) after being turned on, the operation is regarded as a click.
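The click judgment in S130 amounts to comparing the press duration with a short limit. A minimal sketch follows; the 0.5-second value is the example given in the text, while the function name and the use of `None` for a switch that is still held are assumptions.

```python
CLICK_MAX_SECONDS = 0.5  # example value from the description

def is_click(pressed_at, released_at, max_seconds=CLICK_MAX_SECONDS):
    """Treat a press as a click if the switch was released within max_seconds.

    released_at is None while the switch is still held down.
    """
    if released_at is None:
        return False
    return (released_at - pressed_at) <= max_seconds

quick_tap = is_click(10.0, 10.3)   # released after 0.3 s
long_hold = is_click(10.0, 11.2)   # released after 1.2 s
still_held = is_click(10.0, None)  # not yet released
```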

[0043] If the operation is judged to be a click (S130: YES), the process moves to S140 and guide content B is announced. The specific text of guide content B is "You can set a destination, enlarge, reduce, and so on." In other words, the user is told what kinds of voice input are accepted. After guide content B is announced, the process returns to S120.

[0044] If, on the other hand, the operation is judged not to be a click (S130: NO), the process moves to S150 to determine whether there is speech input. As described above, the speech extraction unit 34 determines from the data input through the microphone 35 whether the current section is a speech section or a noise section and passes the result to the control unit 38, so the control unit 38 uses this result to determine whether there is speech input.

[0045] If there is speech input (S150: YES), the process moves to S260 in FIG. 4; the processing from S260 onward is described later. If there is no speech input (S150: NO), it is determined whether the PTT switch 36 has been turned off (S160). If the PTT switch 36 has been turned off (S160: YES), the process returns to S120. If the PTT switch 36 has not been turned off, that is, it remains on (S160: NO), it is determined whether a predetermined time Ta has elapsed (S170).

[0046] If the predetermined time Ta has not elapsed (S170: NO), the process returns to S150 and waits for speech input. If the predetermined time Ta has elapsed (S170: YES), the process moves to S180 and guide content C is announced. The specific text of guide content C is "When setting a destination, please start with the prefecture name." An affirmative determination in S170 means that no speech was input before the predetermined time Ta elapsed even though the PTT switch 36 was turned on, so the input method is announced to a user who presumably wants to set a destination.

[0047] After guide content C is announced in S180, the process moves to S190 to determine whether there is speech input. If there is speech input (S190: YES), the process moves to S260 in FIG. 4; the processing from S260 onward is described later. If there is no speech input (S190: NO), it is determined whether the PTT switch 36 has been turned off (S200). If it has (S200: YES), the process returns to S120 in FIG. 2. If the PTT switch 36 has not been turned off, that is, it remains on (S200: NO), it is determined whether the predetermined time Ta has elapsed (S210).

[0048] If the predetermined time Ta has not elapsed (S210: NO), the process returns to S190 and waits for speech input. If the predetermined time Ta has elapsed (S210: YES), the process moves to S220 and guide content D is announced. The specific text of guide content D is "For example, please input 'Aichi-ken, Kariya-shi, Showa-cho'." An affirmative determination in S210 means that no speech was input before the predetermined time Ta elapsed even though the PTT switch 36 was turned on. Taking the preceding steps into account, the predetermined time has again elapsed without speech input even though guide content C, "When setting a destination, please start with the prefecture name," was announced. The explanation of the input method has been given, but the user presumably still does not fully understand how to input and therefore cannot do so. Guide content D, a more concrete input example, is announced so that the user can easily input the desired destination by following it.

[0049] After guide content D is announced in S220, the process moves to S230 to determine whether there is speech input. If there is speech input (S230: YES), the process moves to S260 in FIG. 4; the processing from S260 onward is described later. If there is no speech input (S230: NO), it is determined whether the PTT switch 36 has been turned off (S240). If it has (S240: YES), the process returns to S120 in FIG. 2. If the PTT switch 36 has not been turned off, that is, it remains on (S240: NO), it is determined whether the predetermined time Ta has elapsed (S250).

[0050] If the predetermined time Ta has not elapsed (S250: NO), the process returns to S230 and waits for speech input. If the predetermined time Ta has elapsed (S250: YES), the process returns to S120 in FIG. 2. In the processing up to this point, a predetermined guide content was announced whenever the predetermined time Ta elapsed without speech input, but here the process returns to S120 without announcing anything. Because the user has not spoken even after guide content D, which went as far as giving a concrete input example, the user is presumed to have no intention of inputting speech, and guidance is suspended for the time being.
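The escalating guidance in S150 to S250 (announce guide content C, then D, then stop) can be pictured as a small loop over periodic samples of the switch and speech state. The sketch below is one possible reading, not the patent's implementation: the sample tuple format and the outcome labels are hypothetical, and it assumes the timer for Ta restarts each time a guide is announced, which the text does not state explicitly.

```python
GUIDES = ["C", "D"]  # guide contents announced after successive timeouts

def run_prompt_loop(samples, ta):
    """Walk through (time, ptt_on, speech) samples taken while the PTT is held.

    Returns (outcome, announced) where outcome is "recognize" (go to S260),
    "idle" (back to S120), or "waiting" (samples ran out first).
    """
    announced = []
    stage_start = None
    for t, ptt_on, speech in samples:
        if stage_start is None:
            stage_start = t
        if speech:
            return "recognize", announced   # S150/S190/S230: YES
        if not ptt_on:
            return "idle", announced        # S160/S200/S240: YES
        if t - stage_start >= ta:
            if len(announced) == len(GUIDES):
                return "idle", announced    # S250: YES, stop guiding
            announced.append(GUIDES[len(announced)])
            stage_start = t                 # assumption: timer restarts per stage
    return "waiting", announced

silent_hold = run_prompt_loop([(t, True, False) for t in range(20)], ta=5)
speaks_after_c = run_prompt_loop(
    [(t, True, False) for t in range(7)] + [(7, True, True)], ta=5)
released_early = run_prompt_loop([(0, True, False), (1, False, False)], ta=5)
```

With Ta set to 5 time units, a user who holds the switch silently hears C at t=5 and D at t=10, and the loop gives up at t=15.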

[0051] Next, the processing from S260 onward is described; the process moves to S260 on an affirmative determination in S150, S190, or S230 above, that is, when there is speech input. In S260, recognition processing is performed on the input speech. This recognition processing is executed by the speech recognition unit 31; specifically, the acquired speech data is collated against the stored dictionary data. The highest-ranking comparison pattern determined by the collation is then output to the dialogue control unit 32 as the recognition result.

[0052] In the following S270, the recognition result is talked back and displayed. That is, the dialogue control unit 32 controls the speech synthesis unit 33 and the display control unit 39 so that the recognized result is output as speech from the speaker 37 and text showing the recognition result is displayed on the display unit 40.

[0053] After that, it is determined in S280 whether the recognition is correct. This is determined from the user's instruction; for example, the user may say "yes" into the microphone 35 if the recognition is correct and "no" if it is wrong. These instructions may of course also be input through the operation switch group 8.

[0054] If the recognition is wrong (S280: NO), the process returns to S120 in FIG. 2. If it is correct (S280: YES), the process moves to S290 and the recognition result is finalized. In the following S300, predetermined post-determination processing is executed. In this case, the post-determination processing is, for example, outputting the data on the "destination for route guidance" obtained as the recognition result to the control circuit 10 (see FIG. 1).
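The talk-back and confirmation steps (S270 to S300) can be sketched as a function that commits a recognition result only after the user confirms it. The function name, the accepted reply strings, and the use of a list as a stand-in for the control circuit 10 are assumptions made for illustration.

```python
def confirm_and_commit(candidate, user_reply, commit):
    """Commit `candidate` only if the user's reply confirms it (S280).

    commit is any callable standing in for the post-determination
    processing (S300), such as passing a destination to the navigation side.
    """
    if user_reply.strip().lower() in ("yes", "はい"):
        commit(candidate)  # S290/S300: finalize and hand over the result
        return True
    return False           # S280: NO, caller returns to waiting (S120)

destinations = []  # stand-in for the destination handed to control circuit 10
accepted = confirm_and_commit("Aichi-ken Kariya-shi Showa-cho", "yes",
                              destinations.append)
rejected = confirm_and_commit("Gifu-ken", "no", destinations.append)
```

Passing the commit action in as a callable keeps the confirmation logic independent of what the post-determination processing actually does.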

[0055] After this post-determination processing (S300) is completed, the process moves to S310 to determine whether the PTT switch 36 has been turned on (pressed). If it has (S310: YES), it is determined in the following S320 whether the operation is a click. If it is (S320: YES), the process moves to S330 and guide content E is announced. The specific text of guide content E is "You can search for a route, enlarge, reduce, and so on." Because the destination has already been set by the processing so far, the user is told that route search, the next stage in the overall navigation processing, is now available. After guide content E is announced, the process returns to S310. If the operation is judged not to be a click (S320: NO), the process moves to S340 in FIG. 5 to determine whether there is speech input. If there is speech input (S340: YES), the process moves to S410 in FIG. 6; the processing from S410 onward is described later. If there is no speech input (S340: NO), it is determined whether the PTT switch 36 has been turned off (S350). If it has (S350: YES), the process returns to S310 in FIG. 3. If the PTT switch 36 has not been turned off, that is, it remains on (S350: NO), it is determined whether the predetermined time Ta has elapsed (S360).

[0056] If the predetermined time Ta has not elapsed (S360: NO), the process returns to S340 and waits for speech input. If the predetermined time Ta has elapsed (S360: YES), the process moves to S370 and guide content F is announced. The specific text of guide content F is "To search for a route, please input 'route search'." This assumes a situation in which the user wants to search for a route but is unsure how to input the request, and tells the user that saying "route search" is sufficient.

[0057] After guide content F is announced in S370, the process moves to S380 to determine whether there is speech input. If there is speech input (S380: YES), the process moves to S410 in FIG. 6; the processing from S410 onward is described later. If there is no speech input (S380: NO), it is determined whether the PTT switch 36 has been turned off (S390). If it has (S390: YES), the process returns to S310 in FIG. 4. If the PTT switch 36 has not been turned off, that is, it remains on (S390: NO), it is determined whether the predetermined time Ta has elapsed (S400).

[0058] If the predetermined time Ta has not elapsed (S400: NO), the process returns to S380 and waits for speech input. If the predetermined time Ta has elapsed (S400: YES), the process returns to S310 in FIG. 4. In this case the user has not spoken even though guide content F, "To search for a route, please input 'route search'," was announced (S370), so no further guidance is given.

[0059] Next, the processing from S410 onward is described; the process moves to S410 on an affirmative determination in S340 or S380 above, that is, when there is speech input. The speech recognition processing in S410 and the talk-back and display of the recognition result in S420 are the same as the processing in S260 and S270 described above. In S430, as in S280 above, it is determined whether the recognition is correct. If the recognition is wrong (S430: NO), the process returns to S310 in FIG. 4. If it is correct (S430: YES), the process moves to S440 and the recognition result is finalized. In the following S450, predetermined post-determination processing is executed. In this case, the post-determination processing is, for example, outputting the data on the "instruction to perform a route search" obtained as the recognition result to the control circuit 10 (see FIG. 1).

[0060] After this post-determination processing (S450) is completed, the process moves to S460 to determine whether the PTT switch 36 has been turned on (pressed). If it has (S460: YES), it is determined in the following S470 whether the operation is a click. If it is (S470: YES), the process moves to S480 and guide content G is announced. The specific text of guide content G is "You can change the route search conditions and so on." Because the instruction to perform a route search has already been input by voice, the user is told that the conditions can also be changed if desired. After guide content G is announced, the process returns to S460.
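Across S140, S330, and S480, the guide announced on a click depends on how far the dialogue has progressed. One simple way to express this is a lookup keyed by dialogue stage, sketched below. The stage names are invented for the example; only the guide labels B, E, and G and their step numbers come from the description.

```python
# Guide content announced on a click, keyed by a hypothetical stage name.
CLICK_GUIDES = {
    "awaiting_destination": "B",   # S140: destination setting and so on
    "awaiting_route_search": "E",  # S330: route search and so on
    "awaiting_conditions": "G",    # S480: changing route search conditions
}

def guide_for_click(stage):
    """Return the label of the guide content to announce for a click in `stage`."""
    return CLICK_GUIDES[stage]

first = guide_for_click("awaiting_destination")
after_destination = guide_for_click("awaiting_route_search")
```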

[0061] If, on the other hand, the operation is judged not to be a click (S470: NO), the process moves to S490 to determine whether there is speech input. If there is speech input (S490: YES), the process moves to S520 and speech recognition processing is performed. This speech recognition processing is the same as in S260 and S410 above and is not described again here. The same applies to the subsequent talk-back of the recognition result and related steps.

[0062] If there is no speech input (S490: NO), it is determined whether the PTT switch 36 has been turned off (S500). If it has (S500: YES), the process returns to S460. If the PTT switch 36 has not been turned off, that is, it remains on (S500: NO), it is determined whether the predetermined time Ta has elapsed (S510).

[0063] If the predetermined time Ta has not elapsed (S510: NO), the process returns to S490 and waits for speech input. If the predetermined time Ta has elapsed (S510: YES), predetermined processing not shown here is performed, for example announcing yet another guide content or entering a state of waiting for speech input.

[0064] In this embodiment, the dialogue control unit 32 and the control unit 38 correspond to the "guide means", and the portions of the processing shown in FIGS. 2 to 6 that are not related to speech recognition correspond to the processing performed as the guide means. By executing this processing, the speech recognition device 30 of this embodiment provides the following effects.

[0065] (1) First, when the engine is turned on, guide content A, "Voice input is available. Click for an explanation," is announced. Because the user is told that instructions to the navigation device can be given by voice at all, the device can cope with the case where the user does not know this.

【００６６】(2) The notification also prompts users who actively want an explanation to perform a click operation. When a click operation is performed, for example, guide content B, "Destination setting, enlargement, reduction, and so on are available.", or guide content E, "Route search, enlargement, reduction, and so on are available.", is notified. This lets the user grasp what can be instructed. To serve users who actively want explanations in this way, multiple types of click operation could also be defined so that, for example, a double-click operation yields a more detailed explanation. In other words, the device can respond according to the user's level of understanding.

【００６７】(3) Even a user who actually intends to give an instruction by voice may, when the moment comes, not know how to phrase the input. In such a situation, the user's intention to speak can be inferred from the PTT switch 36 being pressed, yet no voice may actually be input. In that case, for example, guide content C, "When setting a destination, please start with the prefecture name.", or guide content F, "When you want to search for a route, please say route search.", is notified. This provides guidance for users who want to use voice input but do not sufficiently understand how to do so.

【００６８】(4) Furthermore, even after guide content C, "When setting a destination, please start with the prefecture name", has been notified, the user may still be unable to give voice input. As a follow-up for that case, the present embodiment notifies guide content D, "For example, please say Aichi-ken Kariya-shi Showa-cho". By notifying a concrete input example as guide content in this way, the user learns the specific input method and, by following the example, can easily input the desired destination.
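Effects (1) to (4) amount to choosing a guide text from the current processing stage and the triggering event. The sketch below shows one way such a lookup might be organized. It rests on assumptions: the stage names, the event names (`click`, `timeout`, `second_timeout`), the table layout, and the association of guide contents E and F with a stage after the destination has been set are hypothetical. The text states only the guide wordings and that D follows C when voice input still does not arrive.

```python
from typing import Optional

# Guide texts paraphrased from effects (1) to (4). The stage and event keys
# are assumptions made for this sketch, not names from the embodiment.
GUIDES = {
    "before_destination": {
        "click": "Destination setting, enlargement, reduction, and so on are available.",  # B
        "timeout": "When setting a destination, please start with the prefecture name.",   # C
        "second_timeout": "For example, please say Aichi-ken Kariya-shi Showa-cho.",       # D
    },
    "after_destination": {  # assumed stage for guide contents E and F
        "click": "Route search, enlargement, reduction, and so on are available.",         # E
        "timeout": "When you want to search for a route, please say route search.",        # F
    },
}


def select_guide(stage: str, event: str) -> Optional[str]:
    """Return the guide text for a stage/event pair, or None if none is defined."""
    return GUIDES.get(stage, {}).get(event)
```

Keeping the texts in a table rather than in branching code makes it straightforward to add further stages or more detailed explanations, such as the double-click variant suggested in (2).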

【００６９】As described above, with the car navigation system 2 to which the speech recognition device 30 of the present embodiment is applied, even a user who is not accustomed to using the speech recognition device 30 can use it easily, which makes the system more convenient and more user-friendly.

【００７０】The present invention is in no way limited to the embodiment described above and can be implemented in various forms without departing from the gist of the invention. For example, in the above embodiment, recognition results and guide contents are notified in two ways, by voice and by image, but any form may be used as long as the user can be notified. If only one is adopted, notification by voice is preferable. This takes into account that the device is used in an in-vehicle device, the car navigation system 2: with voice output, the driver does not need to shift the line of sight to the display device, which is advantageous for further ensuring safe driving. Of course, if notification is made by both voice and image as in the above embodiment, the driver can confirm the content both on the display and by voice.

【００７１】The application of the speech recognition device is also not limited to the navigation system described above. For example, when the speech recognition device is used for an air conditioning system, adjusting the set temperature, selecting the air conditioning mode (cooling, heating, dry), or selecting the airflow direction mode could be done by voice input. In that case, the setting items themselves (temperature, air conditioning mode, airflow direction mode, and so on) could be notified as guide content, and the guide content could further tell the user whether to say "set the set temperature to 25 degrees" or "lower the set temperature by 5 degrees". The same applies to the air conditioning mode, the airflow direction mode, and so on.
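The two phrasings mentioned for the air conditioning example, an absolute setting ("to 25 degrees") and a relative change ("5 degrees lower"), lead to different updates of the set temperature. The sketch below illustrates the distinction. It assumes recognition has already produced text; the function name, the English command wordings, and the regular expressions are hypothetical and do not come from the patent.

```python
import re


def apply_temperature_command(current: int, utterance: str) -> int:
    """Return the new set temperature after a recognized utterance.

    Handles two hypothetical English phrasings:
      "set temperature to N degrees"       (absolute)
      "lower/raise temperature by N degrees" (relative)
    """
    text = utterance.lower()

    absolute = re.search(r"set temperature to (\d+) degrees", text)
    if absolute:
        return int(absolute.group(1))

    relative = re.search(r"(lower|raise) temperature by (\d+) degrees", text)
    if relative:
        delta = int(relative.group(2))
        return current - delta if relative.group(1) == "lower" else current + delta

    # Unrecognized phrasing: a real device might notify guide content here.
    raise ValueError(f"unsupported command: {utterance!r}")
```

A guide content that tells the user which of the two phrasings the device accepts removes exactly the ambiguity this function has to resolve.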

【００７２】Furthermore, use is not limited to cases where the navigation system or air conditioning system is an in-vehicle device; the device may also be used outside vehicles, for example in a portable navigation device or an indoor air conditioner. However, as described so far, when the device is used for in-vehicle equipment the user is likely to be the driver. In that case driving itself is the top priority, and operation of other in-vehicle devices should interfere with driving as little as possible. Therefore, the speech recognition device 30 is all the more advantageous when it is premised on the car navigation system 2 or an air conditioning system as in-vehicle equipment.

【００７３】Of course, from this viewpoint, the device can likewise be used for in-vehicle devices other than navigation and air conditioning systems; car audio equipment, for example, is a good candidate. In addition, if one considers a configuration in which the opening and closing of power windows or the adjustment of mirror angles is instructed by voice, the device can be applied equally to such controlled objects and is again effective.

【００７４】Although it has been noted that in-vehicle use has its own particular advantages, the speech recognition device 30 can likewise be applied to anything that executes predetermined processing in accordance with a user's voice input instructions. For example, it can equally be applied to a portable information terminal, or to an information terminal installed on a street or in a parking area.

[Brief description of the drawings]

【図１】FIG. 1 is a block diagram showing a schematic configuration of a car navigation system as an embodiment of the present invention.

【図２】FIG. 2 is a part of a flowchart showing processing executed by the speech recognition device.

【図３】FIG. 3 is a part of a flowchart showing processing executed by the speech recognition device.

【図４】FIG. 4 is a part of a flowchart showing processing executed by the speech recognition device.

【図５】FIG. 5 is a part of a flowchart showing processing executed by the speech recognition device.

【図６】FIG. 6 is a part of a flowchart showing processing executed by the speech recognition device.

[Explanation of symbols]

2 ... Car navigation system
4 ... Position detector
6 ... Map data input device
8 ... Operation switch group
10 ... Control circuit
12 ... External memory
14 ... Display device
15 ... Remote control sensor
15a ... Remote control
16 ... Geomagnetic sensor
18 ... Gyroscope
20 ... Distance sensor
22 ... GPS receiver
30 ... Speech recognition device
31 ... Speech recognition unit
32 ... Dialogue control unit
33 ... Speech synthesis unit
34 ... Voice input unit
35 ... Microphone
36 ... PTT switch
37 ... Speaker
38 ... Control unit
39 ... Display control unit
40 ... Display unit

Continued from the front page

(51) Int.Cl.⁶      Identification code      FI
G08G 1/0969                                 G08G 1/0969
G09B 29/00                                  G09B 29/00 F
     29/10                                       29/10 A

Claims

[Claims]

1. A speech recognition device comprising: voice input means for inputting voice; input period designation means provided so that the user can designate the start and end of a period for inputting the voice to be recognized via the voice input means; recognition means for comparing the voice input via the voice input means within the input period designated by the input period designation means with a plurality of comparison target pattern candidates stored in advance in dictionary means, and taking a candidate with a high degree of matching as the recognition result; notifying means for notifying the recognition result of the recognition means; and post-confirmation processing means for executing predetermined post-confirmation processing, treating the recognition result as confirmed, when a predetermined confirmation instruction is given after the recognition result has been notified by the notifying means, the speech recognition device further comprising guide means for notifying predetermined guide content when there is no voice input via the voice input means even after a predetermined time has elapsed from the point at which the start of the voice input period was designated by the input period designation means.

2. A speech recognition device comprising: voice input means for inputting voice; input period designation means provided so that the user can designate the start and end of a period for inputting the voice to be recognized via the voice input means; recognition means for comparing the voice input via the voice input means within the input period designated by the input period designation means with a plurality of comparison target pattern candidates stored in advance in dictionary means, and taking a candidate with a high degree of matching as the recognition result; notifying means for notifying the recognition result of the recognition means; and post-confirmation processing means for executing predetermined post-confirmation processing, treating the recognition result as confirmed, when a predetermined confirmation instruction is given after the recognition result has been notified by the notifying means, the speech recognition device further comprising guide means for notifying, when a predetermined guide request other than the designation of the start and end of a normal voice input period is made via the input period designation means, predetermined guide content corresponding to the guide request.

3. The speech recognition device according to claim 1, wherein the guide means is further configured to notify, when a predetermined guide request other than the designation of the start and end of a normal voice input period is made via the input period designation means, predetermined guide content corresponding to the guide request.

4. The speech recognition device according to claim 1, wherein the notifying means is configured to notify the recognition result at least by voice, and the guide means is configured to notify the predetermined guide content at least by voice.

5. The speech recognition device according to claim 1, wherein the predetermined guide content notified by the guide means relates to processing in a system that uses the recognition result of the speech recognition device.

6. The speech recognition device according to claim 5, wherein the guide means is configured to notify guide content corresponding to the processing execution stage of the system at the time the predetermined guide content is notified.

7. A navigation system comprising the speech recognition device according to claim 1 and a navigation device, wherein the voice input means of the speech recognition device is used by the user to input by voice an instruction of at least predetermined navigation-related data that needs to be specified for the navigation device to perform navigation processing, and the post-confirmation processing means is configured to output the recognition result of the recognition means to the navigation device.