US5761687A - Character-based correction arrangement with correction propagation - Google Patents
- Publication number
- US5761687A (Application No. US08/539,342)
- Authority
- US
- United States
- Prior art keywords
- word
- character
- text
- erroneous
- correct
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
Definitions
- the present invention pertains to the field of user input processing and correction. More particularly, this invention relates to a character-based correction arrangement for a user input recognition system with correction propagation.
- Speech recognition systems inevitably make recognition errors.
- the computer system typically displays a list of alternative words in the vocabulary (e.g., an active lexicon) that best match what the user spoke.
- the correction schemes in prior art dictation systems (or programs) are typically tailored for word-based languages (e.g., English and French). While words in Western languages are clearly delimited by the blank spaces between them, the concept of a "word" in the Chinese language is ambiguous because Chinese has no equivalent word separator. This typically makes correction of Chinese text difficult when a prior art word-based correction scheme is employed to correct the Chinese text.
- a method of correcting a text in a data processing system is described.
- One of the features of the method of the present invention is to allow correction of a document to be relatively fast and effective.
- Another feature of the method of the present invention is to provide a correction system that allows corrections in a document made by a user to be propagated forward in the document such that the document is corrected by the system in accordance with the portion of the document corrected by the user.
- the method in accordance with one embodiment of the invention includes the step of locating a first incorrect character in the text.
- a character list of alternative characters for the first incorrect character is then shown to the user, from which the user chooses a correct character to replace the first incorrect character.
- the user may input a correct character without using a list of characters, such as the character list.
- the change of the first incorrect character is then propagated through the remainder of the text in accordance with a matching score and a language probability score of the remainder of the text.
- FIG. 1 shows a computer system that includes a speech recognition system
- FIG. 2 is a block diagram of the speech recognition system of FIG. 1, wherein the speech recognition system includes a character-based correction system in accordance with one embodiment of the present invention
- FIG. 3 is a flow chart diagram of the process of the character-based correction system
- FIGS. 4A through 4D show different stages of the process of the character-based correction system of FIGS. 2-3 in correcting an English text
- FIGS. 5A through 5D show different stages of the process of the character-based correction system of FIGS. 2-3 in correcting a Chinese text.
- the character-based correction arrangement in accordance with one embodiment of the present invention may be employed in a speech recognition system.
- the correction system of the present invention is not limited to such a system and can apply equally to other pattern recognition systems, such as handwriting recognition systems, or to other user input word processing systems.
- FIG. 1 illustrates a computer system 100 that implements the speech recognition system on which the correction system of the present invention is implemented.
- while FIG. 1 shows some of the basic components of computer system 100, it is neither meant to be limiting nor to exclude other components or combinations of components.
- computer system 100 includes a bus 101 for transferring data and other information.
- Computer system 100 also includes a processor 102 coupled to bus 101 for processing data and information.
- Computer system 100 also includes a memory 104 and a mass storage device 107 coupled to bus 101.
- Computer system 100 also includes an optional digital signal processor 108 which performs digital signal processing functions and offers additional processing bandwidth. Alternatively, computer system 100 may not have a digital signal processor. In that case, the digital signal processing functions are performed by processor 102.
- Computer system 100 may further include a display device 121 coupled to bus 101 for displaying information to a computer user. Keyboard input device 122 is also coupled to bus 101 for communicating information and command selections to processor 102.
- An additional user input device is a pointing device 123, such as a mouse, a trackball, a trackpad, or cursor direction keys, coupled to bus 101 for communicating direction information and command selections to processor 102, and for controlling cursor movement on display 121.
- the pointing device 123 typically includes a signal generation device (such as a button or buttons) which provides signals that indicate command selections to processor 102.
- Another device which may be coupled to bus 101 is a hard copy device 124 which may be used for printing instructions, data, or other information on a medium such as paper, film, or similar types of media.
- System 100 may further include a sound processing device 125 for digitizing sound signals and transmitting such digitized signals to processor 102 or digital signal processor 108 via bus 101. In this manner, sound may be digitized and then processed using processor 108 or 102.
- Sound processing device 125 is coupled to an audio transducer, such as microphone 126.
- Sound processing device 125 typically includes an analog-to-digital (A/D) converter and can be implemented by known sound processing circuits.
- microphone 126 can be implemented by any known microphone or sound receiver.
- system 100 is one of the Macintosh® brand family of personal computers available from Apple Computer, Inc. of Cupertino, Calif.
- Processor 102 is one of the Motorola 680x0 family of processors available from Motorola, Inc. of Schaumburg, Ill., such as the 68020, 68030, or 68040.
- processor 102 may be a PowerPC RISC processor also sold by Motorola Inc.
- Processor 108, in one embodiment, comprises one of the AT&T DSP 3210 series of digital signal processors available from American Telephone and Telegraph (AT&T) Microelectronics of Allentown, Pa.
- Computer system 100 includes a speech recognition system 200 (shown in FIG. 2).
- speech recognition system 200 is implemented as a series of software routines that are run by processor 102, which interacts with data received from digital signal processor 108 via sound processing device 125. It will, however, be appreciated that speech recognition system 200 can also be implemented in discrete hardware or firmware, or in a combination of software and/or hardware.
- FIG. 2 shows speech recognition system 200 in functional block diagram form.
- the speech is first fed to a speech receiver 201 of speech recognition system 200.
- Receiver 201 captures the speech signal and converts the analog speech signal into digital form.
- Receiver 201 can be implemented by microphone 126 and sound processing device 125 of FIG. 1.
- the digitized speech signal is then applied to a speech recognition engine 202 for recognition and for supplying the spoken words to a display interface 205 for display.
- speech recognition engine 202 is implemented by a series of known speech recognition software routines that are run by processors 102 and 108 of FIG. 1.
- display interface 205 is a user interface system, implemented through software, of computer system 100 of FIG. 1. As can be seen from FIG. 2, display interface 205 interfaces with display 121 and cursor control device 123 of FIG. 1. The user of speech recognition system 200 interacts with recognition system 200 through display interface 205 by means of cursor control device 123 and display 121. This will be described in more detail below.
- Speech recognition engine 202 typically includes a speech feature extraction function (not shown).
- the speech feature extraction function is performed, in one embodiment, by digital signal processor 108 of FIG. 1.
- the speech feature extraction function of speech recognition engine 202 recognizes acoustic features of human speech, as distinguished from other sound signal information contained in the digitized sound signals from speech receiver 201.
- the acoustic features from the speech feature extraction function are input to a recognizer process (also not shown in FIG. 2) of speech recognition engine 202 which performs speech recognition using a language model to determine whether the extracted features represent expected words in a vocabulary database 203 recognizable by speech recognition system 200.
- the recognizer uses a recognition algorithm to compare a sequence of frames produced from a speech utterance with the sequence of nodes contained in the acoustic model of each word in vocabulary database 203 to determine if a match exists.
- the result of the recognition matching process is a text output.
- the speech recognition algorithm employed is the Hidden Markov Model (HMM) that is known to a person skilled in the art.
- the recognition algorithm of speech recognition engine 202 uses probabilistic matching and dynamic programming for the acoustic matching process.
- Probabilistic matching determines the likelihood that a given frame of an utterance corresponds to a given node in an acoustic model of a word. This likelihood is not only a function of how closely the amplitude of the individual frequency bands of a frame match the expected frequencies contained in the given node models, but also is a function of how the deviation between the actual and expected amplitudes in each such frequency band compares to the expected deviations for such values.
- Dynamic programming, or Viterbi searching, provides a method to find an optimal, or near optimal, match between the sequence of frames produced from the utterance and the sequence of nodes contained in the model of the word. This is accomplished by expanding and contracting the duration of each node in the acoustic model of a word to compensate for the natural variations in the duration of speech sounds which occur in different utterances of the same word. A score is computed for each time-aligned match, based on the sum of the dissimilarity between the acoustic information in each frame and the acoustic model of the node against which it is time-aligned. The words with the lowest sum of such distances are then selected as the best scoring words. The combined score of the probabilistic matching and dynamic programming is referred to as the acoustic matching score.
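The time alignment described above can be sketched as a simple dynamic program. This is an illustrative reconstruction, not the patent's implementation; `frames`, `nodes`, and the `distance` function are assumed stand-ins for the frame features, acoustic-model nodes, and dissimilarity measure.

```python
def alignment_score(frames, nodes, distance):
    """Return the minimum summed frame-to-node dissimilarity.

    Each node may absorb one or more consecutive frames (its duration
    expands or contracts), which models natural variation in the
    duration of speech sounds across utterances of the same word.
    """
    INF = float("inf")
    n, m = len(frames), len(nodes)
    # dp[i][j]: best cost of aligning frames[:i] against nodes[:j]
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = distance(frames[i - 1], nodes[j - 1])
            dp[i][j] = d + min(dp[i - 1][j],       # current node's duration expands
                               dp[i - 1][j - 1])   # advance to the next node
    return dp[n][m]
```

With a scalar distance such as `lambda f, n: abs(f - n)`, the lowest-scoring alignment corresponds to the best time-aligned match described in the text.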
- speech recognition system 200 employs language model filtering (e.g., natural language processing).
- the total score of a word in a language context for recognition typically includes a language modeling score and the acoustic matching score of that word.
- the acoustic matching score of a word needs to be added to the language modeling score of that word before the best scoring word is selected for display so that words which are most probable in the current context can be selected.
- the language modeling score of a word is obtained through filtering the language model information for that word.
- the language modeling information includes the statistical language model in addition to other language models (e.g., lexicon language model, word pair grammar language model, or syntactic/semantic analysis language model)
- the language modeling score of a word might include a unigram probability score, and/or a bigram probability score, and/or a trigram probability score, and/or other long distance language probability scores (i.e., N-gram probability scores).
- the unigram probability score of a word determines the statistical probability of the word regardless of any prior word or words.
- the bigram probability score of a word determines the statistical probability of the word given a prior adjacent word.
- the trigram probability score of a word determines the statistical probability of the word given two prior adjacent words.
- the N-gram probability (long distance language probability score) of a word determines the statistical probability of the word given N prior adjacent words.
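The unigram and bigram scores described above can be illustrated with maximum-likelihood estimates from counts; the toy corpus and the absence of smoothing are assumptions made only for this sketch.

```python
import math
from collections import Counter

# Toy corpus (an assumption for illustration, not from the patent).
corpus = "the cat sat on the mat the cat ran".split()

unigrams = Counter(corpus)                     # word counts
bigrams = Counter(zip(corpus, corpus[1:]))     # adjacent word-pair counts
total = len(corpus)

def unigram_score(w):
    # log P(w): probability of the word regardless of any prior words
    return math.log(unigrams[w] / total)

def bigram_score(prev, w):
    # log P(w | prev): probability of the word given one prior adjacent word
    return math.log(bigrams[(prev, w)] / unigrams[prev])
```

A trigram score would condition on two prior adjacent words in the same way, and an N-gram score on N prior words.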
- the recognition algorithm uses the language model to determine the language modeling score of a speech utterance.
- the recognition algorithm combines the language modeling score of the speech utterance with the acoustic matching score of the utterance to provide a list (i.e., N-best list) of N possible words arranged in a top-score-down order.
- the top scored word is then output to display interface 205 for display and the remaining words are contained in the N-best list in the order of their scores.
- the N-best word list contains ten (10) possible words. In another embodiment, the N-best list may contain five (5) possible words. Alternatively, the N-best list can contain an arbitrary number of words. In addition, the length of the N-best list may vary from word to word.
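Forming the N-best list from the combined scores can be sketched as follows; the tuple layout and the score values used in the usage example are illustrative assumptions.

```python
def n_best(candidates, n=10):
    """Rank candidates by combined score, top-score-down.

    candidates: list of (word, acoustic_score, language_score) tuples,
    where higher scores indicate a better match.
    """
    ranked = sorted(candidates,
                    key=lambda c: c[1] + c[2],  # acoustic + language score
                    reverse=True)               # top-score-down order
    return [word for word, _, _ in ranked[:n]]
```

The top entry of the returned list is what would be output to the display interface; the remaining entries stay hidden in the N-best list in score order.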
- Speech recognition system 200 also includes a character-based correction system 204 that implements one embodiment of the present invention.
- Correction system 204 is used to correct the text generated by speech recognition engine 202. In other words, correction system 204 is activated to correct the text displayed on display 121. Correction system 204 corrects the text displayed on display 121 through display interface 205. The user of speech recognition system 200 activates correction system 204, in one embodiment via display interface 205 using cursor control device 123 and display 121. Correction system 204 will be described in more detail below, in conjunction with FIGS. 3 through 5D.
- speech recognition system 200 is a non-alphabetic language (e.g., Chinese, Japanese, Korean, etc.) speech recognition system.
- correction system 204 is also a non-alphabetic language correction system.
- speech recognition system 200 could be an alphabetic language (e.g., English, French, Spanish, etc.) recognition system.
- correction system 204 is also an alphabetic language correction system. Correction system 204 will be described in more detail below, in connection with an example of English language (i.e., an alphabetic language) (FIGS. 4A-4D) and an example of Chinese language (i.e., a non-alphabetic language) (FIGS. 5A-5D).
- correction system 204 is a character-based correction system. This means that correction system 204 enables or allows the user to correct incorrect words on a character-by-character basis.
- when correction system 204 is an alphabetic language correction system, the characters refer to the letters of the word.
- Correction system 204 can be implemented by a software module running on processor 102 of FIG. 1, by a hardware integrated circuit or by a combination of software and hardware. Correction system 204 allows for in-word propagation of the correction of a character throughout the word that contains the character. In addition, correction system 204 allows for between-word propagation of the correction of a word throughout a sentence or text. The in-word propagation and between-word propagation adjusts the N-best list for subsequent characters or words as a consequence of correction by the user, thus allowing automatic correction of many words based on only a single correction by the user.
- the N-best character list in character-based correction system 204 is shorter than the N-best word list in a word-based correction system because different words in the N-best word list might share the same character in the same position. Therefore, the character-based correction may make it easier for users to make choices.
- character-based selections can compose many more character combinations than word-based selections.
- Character-based error correction allows the user to make up a word not appearing in the original word list. This can occur if the user speaks a word that is not in the active vocabulary.
- Briefly, the operation of correction system 204 is now described as follows.
- the user first scans through the text (e.g. sequentially scanning through the text) and finds the first erroneous character (e.g. the left-most erroneous character on a given line of text)
- the user may view an alternative character list (e.g., N-best character list) for the erroneous character that is generated by purging the duplicate characters in the N-best word list for that word.
- the N-best word list itself, however, is usually never shown to the user.
- the order of such N-best character list is determined by the order of the word from which the character is extracted.
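Generating the N-best character list by purging duplicates from the (hidden) N-best word list might look like the following sketch; the helper name and list representation are assumptions.

```python
def n_best_characters(word_list, position):
    """Extract the character at `position` from each N-best word,
    purging duplicates while preserving word-list order."""
    seen, chars = set(), []
    for word in word_list:
        if position < len(word):
            ch = word[position]
            if ch not in seen:      # purge duplicate characters
                seen.add(ch)
                chars.append(ch)    # order follows the word list's order
    return chars
```

Because different words often share the same character at the same position, this list is typically shorter than the word list it was derived from, which is the advantage noted above.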
- if the correct character appears on the N-best character list, the user selects it to replace the currently displayed character, and an in-word propagation of the correction for the word is then performed. If the correct character is not on the list, then the user must correct it using some other input method, such as typing or perhaps speech input.
- the in-word propagation is performed by rearranging the order of the N-best word list associated with the word in which the changed character appeared.
- the rearrangement is based on the sum of the acoustic matching score, the unigram probability score, and a language modeling score for each entry.
- the rearrangement is essentially the same as moving the entries containing the changed character in the corresponding position to the top while maintaining their relative order and maintaining the order of the remaining entries unchanged.
- the N-best word list for that word is not normally shown on the display. Only the top entry of the N-best word list is shown replacing the word displayed, if they are different.
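The rearrangement described above amounts to a stable partition: entries whose character at the corrected position matches the user's choice move to the top, each group keeping its relative order. A minimal sketch, with assumed names:

```python
def propagate_in_word(word_list, position, corrected_char):
    """Rearrange an N-best word list after a character correction.

    Entries containing the corrected character at the corrected
    position move to the top; relative order within each group is
    preserved (a stable partition).
    """
    matches, rest = [], []
    for w in word_list:
        if position < len(w) and w[position] == corrected_char:
            matches.append(w)
        else:
            rest.append(w)
    return matches + rest  # the new top entry is the displayed word
```

Only the top entry of the rearranged list would replace the displayed word; the rest of the list stays hidden.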
- Between-word propagation is then performed after in-word propagation, continuing until the first period appears or the sentence otherwise ends.
- Between-word propagation is also performed by rearranging the N-best word list for each of the words subsequent to the changed word based on the acoustic matching score, the unigram probability score, and the long distance language modeling score, given the changed word.
- the long distance language model may include bigram and trigram, etc.
- the rearranged N-best word list for each of the subsequent words is not normally shown on the display.
- the change is propagated forward within the word and towards the end of the sentence or text.
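Between-word propagation can be sketched as re-ranking each subsequent word's N-best list by its acoustic score plus a bigram score conditioned on the corrected word; the function names and the score values in the test are illustrative assumptions.

```python
def propagate_between_words(prev_word, next_candidates, bigram_score):
    """Re-rank the next word's N-best list given the corrected word.

    next_candidates: list of (word, acoustic_score) tuples.
    bigram_score(prev, w): long distance language score of w given prev.
    """
    return sorted(
        next_candidates,
        key=lambda c: c[1] + bigram_score(prev_word, c[0]),
        reverse=True)  # top-score-down order
```

Applying this to each word up to the end of the sentence propagates a single user correction forward, which is how one correction can fix several words automatically.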
- the user may change the same word by selecting, from the N-best character list generated for a subsequent character, the character which was previously changed within the word in order to change the same word. This then causes another in-word propagation and between-word propagation.
- System 204 starts at step 301.
- the text which has been recognized by the system 200 is displayed as displayed text on display device 121.
- the user decides if correction of the displayed text is needed.
- the user of speech recognition system 200 can activate correction system 204 via display interface 205 and cursor control device 123 during speech dictation.
- the user typically activates correction system 204 using cursor control device 123 to select and activate a character correction menu on display 121 via display interface 205. This causes display interface 205 to generate a signal to call for correction system 204.
- the first character of the first word in a sentence that needs to be corrected is located at step 303.
- the leftmost character of the leftmost word that needs to be corrected is located by the user. The user typically does this by using cursor control device 123 to click on the area of display 121 containing the character.
- the N-best character list of that character is obtained and shown.
- Correction system 204 obtains the N-best character list of that character from speech recognition engine 202.
- Correction system 204 provides the N-best character list for display on display 121 (FIG. 2) via display interface 205 (FIG. 2). This allows the user to view the N-best character list of that character using cursor control device 123 (FIG. 2).
- the N-best character list allows the user to select any one of the characters listed in the N-best character list to replace the incorrect character.
- at step 305, the user determines if the correct character is in the list. If the correct character is not in the list, then step 310 is performed. At step 310, the user determines if the correct character needs to be input by typing or speaking. If so, step 311 is performed. If not, step 311 is bypassed, leading to step 312.
- otherwise, step 306 is performed, in which the character is corrected with the correct character selected by the user from the N-best list. This is typically done by the user pointing a cursor on display 121 to the correct character, which is also displayed on display 121. Then step 307 is performed to determine if the in-word propagation process of the invention is required. If so, step 308 is performed. If not, step 308 is bypassed.
- at step 308, the N-best word list for the word having the corrected character is rearranged in light of the corrected character.
- the correction of the character is propagated throughout the word. This is also referred to as in-word propagation and has been described above. This in-word propagation is based on the score of each of the listed words given the corrected character.
- correction system 204 then moves to step 309 at which time a between-word propagation is performed to adjust the N-best word list for each of the subsequent words based on the changed word.
- the between-word propagation is done in accordance with the language modeling score and the acoustic matching score with respect to the corrected word.
- the N-best word list of a word can be automatically adjusted in view of a changed character of the word or in view of a proximately changed word. This makes the correction automatic, which is more efficient and convenient.
- Correction system 204 then returns to step 302 at which the user determines if more correction is needed. If so, step 303 is repeated. If not, the process ends at step 313.
- referring to FIGS. 4A through 4D, the operation of correction system 204 of FIG. 2 is described for an English text.
- a phrase "VIDEO CONFERENCE" is displayed.
- FIG. 4A also shows the correct phrase "AUDIO COEFFICIENT" for the displayed phrase.
- FIG. 4A also shows the N-best word list for each of the words. As described above, the N-best word list for each word is not normally shown on the display. As can be seen from FIG. 4A, the correct word for each displayed word is not high on the N-best list. As described above, the correct word may sometimes not appear on the N-best list.
- FIG. 4B shows the N-best character list for the first letter of the word "VIDEO" to be corrected.
- the user then can select the correct letter "A" to replace the letter "V", as can be seen in FIG. 4B.
- with the first letter of the word "VIDEO" corrected, correction system 204 then propagates the correction throughout the word. As described above, correction system 204 does this by using a combination of the acoustic matching score and language model score for each entry to rearrange the N-best word list based on the corrected letter "A", promoting the word "AUDIO" to the top of the list (see FIG. 4C) while maintaining the relative order of the remaining entries.
- correction system 204 then performs the between-word propagation, which may cause the word "COEFFICIENT" to appear at the top of the N-best list for the word "CONFERENCE". As described above, the between-word propagation is performed in accordance with the bigram language model or another long distance language model.
- FIGS. 5A through 5D show the operation of correction system 204 of FIG. 2 for text in Chinese.
- the Chinese words 401 and 402 for "Apple Computer" are the correct words, as spoken, whereas the actual displayed words 403 and 404 are different.
- the Chinese word 401 for "Apple” includes two Chinese characters 401a and 401b and the Chinese word 402 for "Computer” includes two characters 402a and 402b.
- FIG. 5B shows an N-best character list 410 of the first character 403a of the first word 403 displayed. The first character 403a is then replaced by character 401a from the N-best character list 410, as can be seen from FIG. 5B.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/539,342 US5761687A (en) | 1995-10-04 | 1995-10-04 | Character-based correction arrangement with correction propagation |
Publications (1)
Publication Number | Publication Date |
---|---|
US5761687A true US5761687A (en) | 1998-06-02 |
Family
ID=24150811
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/539,342 Expired - Lifetime US5761687A (en) | 1995-10-04 | 1995-10-04 | Character-based correction arrangement with correction propagation |
Country Status (1)
Country | Link |
---|---|
US (1) | US5761687A (en) |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10878088B2 (en) * | 2015-08-18 | 2020-12-29 | Trend Micro Incorporated | Identifying randomly generated character strings |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
US20210350807A1 (en) * | 2019-05-09 | 2021-11-11 | Rovi Guides, Inc. | Word correction using automatic speech recognition (asr) incremental response |
US11455475B2 (en) | 2012-08-31 | 2022-09-27 | Verint Americas Inc. | Human-to-human conversation analysis |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US11822888B2 (en) | 2018-10-05 | 2023-11-21 | Verint Americas Inc. | Identifying relational segments |
Application timeline:
- 1995-10-04: US application US08/539,342 filed; granted as patent US5761687A (status: Expired - Lifetime, not active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4718094A (en) * | 1984-11-19 | 1988-01-05 | International Business Machines Corp. | Speech recognition system |
US4827521A (en) * | 1986-03-27 | 1989-05-02 | International Business Machines Corporation | Training of markov models used in a speech recognition system |
US5027406A (en) * | 1988-12-06 | 1991-06-25 | Dragon Systems, Inc. | Method for interactive speech recognition and training |
US5386494A (en) * | 1991-12-06 | 1995-01-31 | Apple Computer, Inc. | Method and apparatus for controlling a speech recognition function using a cursor control device |
US5384892A (en) * | 1992-12-31 | 1995-01-24 | Apple Computer, Inc. | Dynamic language model for speech recognition |
Non-Patent Citations (3)
Title |
---|
"Matrix Fast Search . . . " by Bahl et al., Publ. 1989 Acc. 02848866 file 8. |
"Microsoft Word, Getting Started" © 1992 pp. 30-32. |
"Speaker-Independent Large Vocabulary Spoken Word . . . " by Sawai, Publ. yr. 1989 vol. 20 n 12 Acc. 02952513 file 8. |
Cited By (97)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6173252B1 (en) * | 1997-03-13 | 2001-01-09 | International Business Machines Corp. | Apparatus and methods for Chinese error check by means of dynamic programming and weighted classes |
US6157905A (en) * | 1997-12-11 | 2000-12-05 | Microsoft Corporation | Identifying language and character set of data representing text |
US6256410B1 (en) | 1998-07-30 | 2001-07-03 | International Business Machines Corp. | Methods and apparatus for customizing handwriting models to individual writers |
US6320985B1 (en) | 1998-07-31 | 2001-11-20 | International Business Machines Corporation | Apparatus and method for augmenting data in handwriting recognition system |
US6314397B1 (en) * | 1999-04-13 | 2001-11-06 | International Business Machines Corp. | Method and apparatus for propagating corrections in speech recognition software |
US6385579B1 (en) * | 1999-04-29 | 2002-05-07 | International Business Machines Corporation | Methods and apparatus for forming compound words for use in a continuous speech recognition system |
US20040167779A1 (en) * | 2000-03-14 | 2004-08-26 | Sony Corporation | Speech recognition apparatus, speech recognition method, and recording medium |
US7149970B1 (en) * | 2000-06-23 | 2006-12-12 | Microsoft Corporation | Method and system for filtering and selecting from a candidate list generated by a stochastic input method |
US7200555B1 (en) * | 2000-07-05 | 2007-04-03 | International Business Machines Corporation | Speech recognition correction for devices having limited or no display |
US7730213B2 (en) * | 2000-12-18 | 2010-06-01 | Oracle America, Inc. | Object-based storage device with improved reliability and fast crash recovery |
US20020078244A1 (en) * | 2000-12-18 | 2002-06-20 | Howard John H. | Object-based storage device with improved reliability and fast crash recovery |
US7444284B1 (en) * | 2001-01-24 | 2008-10-28 | Bevocal, Inc. | System, method and computer program product for large-scale street name speech recognition |
US20020129146A1 (en) * | 2001-02-06 | 2002-09-12 | Eyal Aronoff | Highly available database clusters that move client connections between hosts |
US6999933B2 (en) * | 2001-03-29 | 2006-02-14 | Koninklijke Philips Electronics, N.V | Editing during synchronous playback |
US20020143534A1 (en) * | 2001-03-29 | 2002-10-03 | Koninklijke Philips Electronics N.V. | Editing during synchronous playback |
US7958461B2 (en) * | 2002-07-22 | 2011-06-07 | Samsung Electronics Co., Ltd | Method of inputting characters on a wireless mobile terminal |
US20040012642A1 (en) * | 2002-07-22 | 2004-01-22 | Samsung Electronics Co., Ltd. | Method of inputting characters on a wireless mobile terminal |
US6993482B2 (en) * | 2002-12-18 | 2006-01-31 | Motorola, Inc. | Method and apparatus for displaying speech recognition results |
WO2004061750A3 (en) * | 2002-12-18 | 2004-12-29 | Motorola Inc | Method and apparatus for displaying speech recognition results |
US20040122666A1 (en) * | 2002-12-18 | 2004-06-24 | Ahlenius Mark T. | Method and apparatus for displaying speech recognition results |
CN100345185C (en) * | 2002-12-18 | 2007-10-24 | 摩托罗拉公司 | Method and apparatus for displaying speech recognition results |
US8249873B2 (en) * | 2005-08-12 | 2012-08-21 | Avaya Inc. | Tonal correction of speech |
US20070038452A1 (en) * | 2005-08-12 | 2007-02-15 | Avaya Technology Corp. | Tonal correction of speech |
US20070050188A1 (en) * | 2005-08-26 | 2007-03-01 | Avaya Technology Corp. | Tone contour transformation of speech |
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
US8374847B2 (en) * | 2008-09-09 | 2013-02-12 | Institute For Information Industry | Error-detecting apparatus and methods for a Chinese article |
TWI391832B (en) * | 2008-09-09 | 2013-04-01 | Inst Information Industry | Error detection apparatus and methods for chinese articles, and storage media |
US20100063798A1 (en) * | 2008-09-09 | 2010-03-11 | Tsun Ku | Error-detecting apparatus and methods for a chinese article |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US10283110B2 (en) | 2009-07-02 | 2019-05-07 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US11423886B2 (en) | 2010-01-18 | 2022-08-23 | Apple Inc. | Task flow identification based on user intent |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
US9633660B2 (en) | 2010-02-25 | 2017-04-25 | Apple Inc. | User profiling for voice input processing |
US10522133B2 (en) * | 2011-05-23 | 2019-12-31 | Nuance Communications, Inc. | Methods and apparatus for correcting recognition errors |
US20120304057A1 (en) * | 2011-05-23 | 2012-11-29 | Nuance Communications, Inc. | Methods and apparatus for correcting recognition errors |
US20130301920A1 (en) * | 2012-05-14 | 2013-11-14 | Xerox Corporation | Method for processing optical character recognizer output |
US9953088B2 (en) | 2012-05-14 | 2018-04-24 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US8983211B2 (en) * | 2012-05-14 | 2015-03-17 | Xerox Corporation | Method for processing optical character recognizer output |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US11455475B2 (en) | 2012-08-31 | 2022-09-27 | Verint Americas Inc. | Human-to-human conversation analysis |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US10474761B2 (en) * | 2013-10-23 | 2019-11-12 | Sunflare Co., Ltd. | Translation support system |
US20190065484A1 (en) * | 2013-10-23 | 2019-02-28 | Sunflare Co., Ltd. | Translation support system |
US10503480B2 (en) * | 2014-04-30 | 2019-12-10 | Ent. Services Development Corporation Lp | Correlation based instruments discovery |
US10169329B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Exemplar-based natural language processing |
US9785630B2 (en) * | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US20150347383A1 (en) * | 2014-05-30 | 2015-12-03 | Apple Inc. | Text prediction using combined word n-gram and unigram language models |
US9668024B2 (en) | 2014-06-30 | 2017-05-30 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
US10878088B2 (en) * | 2015-08-18 | 2020-12-29 | Trend Micro Incorporated | Identifying randomly generated character strings |
US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US11037565B2 (en) | 2016-06-10 | 2021-06-15 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US11152002B2 (en) | 2016-06-11 | 2021-10-19 | Apple Inc. | Application integration with a digital assistant |
US10706210B2 (en) * | 2016-08-31 | 2020-07-07 | Nuance Communications, Inc. | User interface for dictation application employing automatic speech recognition |
US20180060282A1 (en) * | 2016-08-31 | 2018-03-01 | Nuance Communications, Inc. | User interface for dictation application employing automatic speech recognition |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10432789B2 (en) * | 2017-02-09 | 2019-10-01 | Verint Systems Ltd. | Classification of transcripts by sentiment |
IL257440B (en) * | 2017-02-09 | 2021-01-31 | Verint Systems Ltd | Classification of transcripts by sentiment |
US20180226071A1 (en) * | 2017-02-09 | 2018-08-09 | Verint Systems Ltd. | Classification of Transcripts by Sentiment |
US10616414B2 (en) * | 2017-02-09 | 2020-04-07 | Verint Systems Ltd. | Classification of transcripts by sentiment |
US11405466B2 (en) | 2017-05-12 | 2022-08-02 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US11822888B2 (en) | 2018-10-05 | 2023-11-21 | Verint Americas Inc. | Identifying relational segments |
US20210350807A1 (en) * | 2019-05-09 | 2021-11-11 | Rovi Guides, Inc. | Word correction using automatic speech recognition (asr) incremental response |
US11651775B2 (en) * | 2019-05-09 | 2023-05-16 | Rovi Guides, Inc. | Word correction using automatic speech recognition (ASR) incremental response |
US20230252997A1 (en) * | 2019-05-09 | 2023-08-10 | Rovi Guides, Inc. | Word correction using automatic speech recognition (asr) incremental response |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5761687A (en) | Character-based correction arrangement with correction propagation | |
US7315818B2 (en) | Error correction in speech recognition | |
US5832428A (en) | Search engine for phrase recognition based on prefix/body/suffix architecture | |
US8719021B2 (en) | Speech recognition dictionary compilation assisting system, speech recognition dictionary compilation assisting method and speech recognition dictionary compilation assisting program | |
US7085716B1 (en) | Speech recognition using word-in-phrase command | |
US6067514A (en) | Method for automatically punctuating a speech utterance in a continuous speech recognition system | |
EP0376501B1 (en) | Speech recognition system | |
US7848926B2 (en) | System, method, and program for correcting misrecognized spoken words by selecting appropriate correction word from one or more competitive words | |
EP0965979B1 (en) | Position manipulation in speech recognition | |
US5852801A (en) | Method and apparatus for automatically invoking a new word module for unrecognized user input | |
EP0840286B1 (en) | Method and system for displaying a variable number of alternative words during speech recognition | |
EP0840289B1 (en) | Method and system for selecting alternative words during speech recognition | |
US5712957A (en) | Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists | |
US7089188B2 (en) | Method to expand inputs for word or document searching | |
US7702512B2 (en) | Natural error handling in speech recognition | |
EP1429313B1 (en) | Language model for use in speech recognition | |
US7251600B2 (en) | Disambiguation language model | |
EP0840288A2 (en) | Method and system for editing phrases during continuous speech recognition | |
US5706397A (en) | Speech recognition system with multi-level pruning for acoustic matching | |
Mérialdo | Multilevel decoding for very-large-size-dictionary speech recognition | |
JP2000056795A (en) | Speech recognition device | |
EP1189203A2 (en) | Homophone selection in speech recognition | |
KR20040008546A (en) | revision method of continuation voice recognition system | |
JP2000010588A (en) | Voice recognition method and apparatus |
Legal Events

Date | Code | Title | Description
---|---|---|---|
| AS | Assignment | Owner name: APPLE COMPUTER, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: HULTEEN, ERIC A.; REEL/FRAME: 007782/0947; Effective date: 19960109 |
| AS | Assignment | Owner name: APPLE COMPUTER, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: BEAUREGARD, GERALD T.; REEL/FRAME: 007782/0944; Effective date: 19960118 |
| AS | Assignment | Owner name: APPLE COMPUTER, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: HON, HSIAO-WUEN; REEL/FRAME: 007785/0381; Effective date: 19960117 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| CC | Certificate of correction | |
| FPAY | Fee payment | Year of fee payment: 4 |
| FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FEPP | Fee payment procedure | Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FPAY | Fee payment | Year of fee payment: 8 |
| AS | Assignment | Owner name: APPLE INC., CALIFORNIA; Free format text: CHANGE OF NAME; ASSIGNOR: APPLE COMPUTER, INC.; REEL/FRAME: 019382/0050; Effective date: 20070109 |
| FPAY | Fee payment | Year of fee payment: 12 |