US8214208B2 - Method and system for sharing portable voice profiles - Google Patents
- Publication number
- US8214208B2 (application US11/536,117)
- Authority
- US
- United States
- Prior art keywords
- voice profile
- speaker
- voice
- speech recognition
- portable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
Definitions
- the present invention relates generally to speaker-dependent speech recognition systems. More particularly, the invention relates to an architecture for sharing customized, portable voice profiles, for use with varied speech recognition engines, amongst different applications and users.
- Speech recognition systems, which can generally be defined as a set of computer-implemented algorithms for converting a speech or voice signal into words, are used in a wide variety of applications and contexts. For instance, speech recognition technology is utilized in dictation applications for converting spoken words into structured, or unstructured, documents. Phones and phone systems, global positioning systems (GPS), and other special-purpose computers often utilize speech recognition technology as a means for inputting commands (e.g., command and control applications).
- Speech recognition systems can be characterized by their attributes (e.g., speaking mode, speaking style, enrollment or vocabulary), which are often determined by the particular context of the speech recognition application. For example, speech recognition systems utilized in dictation applications often require that the user of the application enroll, or provide a speech sample, to train the system. Such systems are generally referred to as speaker-dependent systems. Speaker-dependent speech recognition systems support large vocabularies, but only one user. On the other hand, speaker-independent speech recognition systems support many users without any enrollment, or training, but accurately recognize only a limited vocabulary (e.g., a small list of commands, or ten digits). Speech recognition systems utilized in dictation applications may be designed for spontaneous and continuous speech, whereas speech recognition systems utilized for recognizing voice commands (e.g., voice dialing on mobile phones) may be designed to recognize isolated words.
- any user-specific, location-specific, or customized data e.g., speaker-dependent enrollment/training data, or user-specific settings
- many speech recognition systems in use with dictation applications can be improved over time with active or passive training.
- the speech recognition system can “learn” from the identified errors, and prevent such errors from reoccurring.
- Other speech recognition systems benefit from user specific settings, such as settings that indicate the gender of the speaker, nationality of the speaker, age of the speaker, etc.
- such data are not easily shared amongst speech recognition systems and/or applications.
- speaker-dependent speech recognition systems do not work well in multi-user or conversational settings, where more than one person may contribute to a speech recording or voice signal.
- a typical speaker-dependent speech recognition system is invoked with a particular voice or user profile. Accordingly, that particular user profile is used to analyze the entire voice recording. If a recording includes recorded speech from multiple users, a conventional speaker-dependent speech recognition system uses only one user profile to analyze the recorded speech of all of the speakers.
- Even if a particular speech recognition system is configured to use multiple voice profiles, there does not exist a system or method for easily locating and retrieving the necessary voice profiles. For example, if a particular audio recording includes speech from persons A, B, C, D, E and F (where person A is the user of the speech recognition system) then it is necessary for the speech recognition system to locate and retrieve the voice profiles from persons A, B, C, D, E and F. It may be impractical to expect person A, the user of the speech recognition system, to ask each of persons B, C, D, E and F to provide a voice profile, particularly if person A is not acquainted with one or more persons.
- An embodiment of the present invention provides a speech recognition engine that utilizes portable voice profiles for converting recorded speech to text.
- Each portable voice profile includes speaker-dependent data, and is configured to be accessible to a plurality of speech recognition engines through a common interface.
- a voice profile manager facilitates the exchange of portable voice profiles between members of groups who have agreed to share their voice profiles.
- the speech recognition engine includes speaker identification logic to dynamically select a particular portable voice profile, in real-time, from a group of portable voice profiles.
- the speaker-dependent data included with the portable voice profile enhances the accuracy with which speech recognition engines recognize spoken words in recorded speech from a speaker associated with a portable voice profile.
- each speech recognition device includes a voice profile manager that communicates with other devices to facilitate the exchange of voice profiles.
- FIG. 1 depicts a block diagram of a speech recognition system for use with multiple portable voice profiles, according to an embodiment of the invention
- FIG. 2 depicts several speech recognition systems for use with multiple, portable voice profiles, according to various embodiments of the invention
- FIGS. 3 and 4 illustrate examples of user-interfaces for managing portable voice profiles for use with one or more speech recognition systems, according to embodiments of the invention
- FIGS. 5 and 6 illustrate an example of how portable voice profiles may be shared amongst a hierarchically organized population, according to an embodiment of the invention
- FIG. 7 depicts an example of a single-device, multi-user environment in which an embodiment of the invention may be utilized
- FIG. 8 depicts an example of a multi-device, multi-user, peer-to-peer environment in which an embodiment of the invention may be utilized
- FIGS. 9 and 10 illustrate an example of how a speech recognition system, consistent with an embodiment of the invention, works with a particular application.
- FIG. 11 illustrates an example of a client-server based speech recognition system, consistent with an embodiment of the invention.
- the speech recognition system 10 includes a speech recognition engine 12 and speaker identification logic 14 .
- the speech recognition system 10 includes a group manager 16 and a local voice profile manager 18 .
- the speech recognition system 10 is configured to work with a variety of speech-enabled applications (20-1 through 20-N), such as a dictation application, a telecommunications application (e.g., a voice over Internet Protocol, or VoIP, application), or any of a variety of command and control applications (e.g., voice dialing or GPS control).
- the speech recognition system 10 is configured to work with multiple portable voice profiles ( 22 - 1 through 22 -N).
- one function of the group manager 16 is to facilitate defining or selecting groups and exchanging voice profiles with members within the groups.
- the speech recognition system 10 is invoked to work with an active application 20 - 1 .
- the active application 20 - 1 may be configured to automatically invoke the speech recognition system 10 , or alternatively, a user of the active application may manually select to invoke the speech recognition system 10 .
- the active application 20-1 may pass one or more parameters 24 to the speech recognition engine 12, such that the speech recognition engine 12 can optimally configure itself to provide the highest level of accuracy in recognizing speech (e.g., spoken words) in a voice signal received from or recorded by the active application 20-1.
- the active application parameters 24 may include, for example, values for such application settings as speaking mode (e.g., isolated-word or continuous speech) or speaking style (e.g., read speech or spontaneous speech).
- the active application 20 - 1 may pass one or more output parameters 26 to the speech recognition engine 12 .
- the output parameters 26 may be user-definable, for example, through a user-interface (not shown) of the speech recognition system 10 .
- the output parameters 26 may determine the format of the output created by the speech recognition engine 12 .
- each portable voice profile is associated with a particular user or speaker, where the terms “user” and “speaker” are used synonymously herein.
- a portable voice profile includes speaker-dependent data, such as the acoustic characteristics of a speaker, to improve the accuracy with which a compatible speech recognition engine 12 recognizes speech recorded by a particular speaker that is associated with a particular voice profile.
- each portable voice profile may include enrollment/training data. For example, each time a voice profile is used with a particular speech recognition engine, the voice profile may be manually or automatically updated as errors are identified and corrected by a speaker.
- a voice profile consistent with the invention may include data derived from use with a variety of speech recognition engines and a variety of applications.
- a voice profile may include speaker-dependent parameters or settings that are used by various speech recognition systems to improve recognition accuracy.
- settings may include values to indicate the gender of the speaker, nationality of the speaker, age of the speaker, acoustic characteristics of the speaker, characteristics for different speaking environments, or any other speaker-dependent attributes utilized by speech recognition engines to improve speech recognition accuracy.
- the speaker-dependent parameters may include data used for recognizing text in noisy and/or noiseless environments.
- each voice profile has a common interface (not shown), such that the data included within each voice profile can be easily accessed by a wide variety of speech recognition engines.
- portable voice profiles and associated data may be accessible through common function calls.
- function calls may be part of an application programming interface (API). Accordingly, any speech recognition engine implementing the API will have access to the portable voice profiles.
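The idea of a common interface can be illustrated with a minimal Python sketch. The method names (`speaker_id`, `get_setting`, `acoustic_data`) and the `InMemoryProfile` class are hypothetical; the patent does not specify the function calls that make up the API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Protocol


class PortableVoiceProfile(Protocol):
    """Hypothetical common interface; the method names are assumptions."""

    def speaker_id(self) -> str: ...
    def get_setting(self, name: str, default: Any = None) -> Any: ...
    def acoustic_data(self) -> bytes: ...


@dataclass
class InMemoryProfile:
    """One possible profile implementation that satisfies the interface."""
    owner: str
    settings: Dict[str, Any] = field(default_factory=dict)
    acoustic: bytes = b""

    def speaker_id(self) -> str:
        return self.owner

    def get_setting(self, name: str, default: Any = None) -> Any:
        return self.settings.get(name, default)

    def acoustic_data(self) -> bytes:
        return self.acoustic


def describe(profile: PortableVoiceProfile) -> str:
    """An engine written against the interface can read any conforming profile."""
    environment = profile.get_setting("environment", "unknown")
    return f"{profile.speaker_id()} ({environment})"
```

Because `describe` depends only on the interface, a different engine or storage format could be substituted without changing the calling code.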
- the speaker identification logic 14 samples the input voice signal and then analyzes the voice signal for the purpose of identifying the speaker. For example, the speaker identification logic 14 may compare portions of the voice signal to speaker-dependent data from each accessible voice profile to determine whether the analyzed voice has a corresponding portable voice profile accessible. If a particular voice captured in the voice signal is associated with a portable voice profile accessible to the speech recognition engine 12, then the speech recognition engine 12 dynamically invokes the matching portable voice profile (e.g., the active voice profile 22-2), and utilizes data from the matching portable voice profile to enhance or improve the speech recognition process. In contrast to conventional speaker-dependent speech recognition engines, a speech recognition engine consistent with the invention can dynamically switch between portable voice profiles in real time during the speech recognition process.
- the speaker identification logic 14 can dynamically identify the new voice and match the new voice to a portable voice profile, if one exists and is accessible. Accordingly, different voice profiles are used for different speakers to improve the accuracy of the speech recognition process.
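The matching step could be sketched as follows. This is an illustration only: the patent does not say how voices are compared, so the fixed-length feature vectors, the cosine-similarity score, and the `threshold` value are all assumptions.

```python
import math
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_speaker(
    segment_features: Sequence[float],
    profiles: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the patent
) -> Optional[str]:
    """Return the best-matching accessible profile, or None if none is close enough."""
    best_id, best_score = None, threshold
    for speaker, reference in profiles.items():
        score = cosine(segment_features, reference)
        if score >= best_score:
            best_id, best_score = speaker, score
    return best_id
```

Calling `identify_speaker` again whenever a voice change is detected gives the per-speaker profile switching described above.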
- the group manager 16 facilitates the establishment and management of groups.
- each user may have his or her own portable voice profile (e.g., local voice profile 22 - 1 ) for use with his or her own applications and speech recognition engine or engines.
- Each user by means of the group manager 16 , can establish or select one or more groups, and then add other users to the group. For instance, if a particular user frequently records conversations with a select group of people and the user would like to share his or her voice profile with those people, then the user can establish a group to include those people. Alternatively, the user may select to join the group if the group has already been established.
- group definitions may be defined by some higher level administrative application, such that each user selects from a list of groups that have been predefined. This facilitates common naming of groups, so that members can easily find the correct group without establishing multiple names for the same group.
- when a group is initially established, it may be designated as a secure group. Accordingly, only members who have been authorized to join a secure group may do so.
- a group administrator may use a higher level administrative application (e.g., executing on a secure server, or a specialized peer device) to establish a secure group, and to designate the members who are allowed to join the secure group.
- each member's group manager 16 may exchange information via secure cryptographic communication protocols with the higher level administrative application when establishing membership privileges within the group.
- After joining a particular group, in one embodiment of the invention, the group manager 16 enables a user to establish connections or links with other users.
- a unilateral connection exists when one user has requested a connection with another user, and a mutual or bilateral connection exists when both users have requested, or in some way acknowledged a desire to be connected. Accordingly, access privileges to voice profiles may be dependent upon the type of connection—unilateral or bilateral—that has been established between two users.
- a mutual connection establishes the requisite sharing privileges, such that each user grants the other access to his or her voice profile by establishing a mutual connection.
- a unilateral connection may or may not establish the requisite sharing privileges.
- a user may have to configure access privileges for unilateral connections.
- the group manager 16 may enable a user to grant or deny permission to unilaterally connected users to use, or otherwise access, his or her local portable voice profile 22 - 1 .
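The distinction between unilateral and mutual connections can be sketched as a small registry. The class and method names are hypothetical, and the rule that a unilateral connection grants access only after an explicit grant by the profile owner is one possible reading of the text, not the only one.

```python
from typing import Set, Tuple


class ConnectionRegistry:
    """Tracks connection requests and explicit grants (illustrative names)."""

    def __init__(self) -> None:
        self._requests: Set[Tuple[str, str]] = set()  # (from_user, to_user)
        self._grants: Set[Tuple[str, str]] = set()    # (owner, requester)

    def request(self, from_user: str, to_user: str) -> None:
        self._requests.add((from_user, to_user))

    def is_mutual(self, a: str, b: str) -> bool:
        """A mutual connection exists when both users have requested it."""
        return (a, b) in self._requests and (b, a) in self._requests

    def grant_unilateral(self, owner: str, requester: str) -> None:
        """Owner explicitly permits a unilaterally connected user."""
        self._grants.add((owner, requester))

    def can_access(self, requester: str, owner: str) -> bool:
        """Mutual connections share by default; unilateral ones need a grant."""
        if self.is_mutual(requester, owner):
            return True
        return (requester, owner) in self._requests and (owner, requester) in self._grants
```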
- the group manager 16 may establish the depth to which a user is willing to share his or her voice profile. For example, if user A is connected to user B, and user B is connected to user C, such that user A has only an indirect connection to user C through user B, user A may allow his or her voice profile to be shared with user C by setting access privileges to the proper depth. In this case, a depth level setting of two would allow user C to access user A's voice profile.
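The depth setting can be read as a limit on the number of connections separating the profile owner from the requesting user. A minimal sketch, assuming connections are stored as an adjacency mapping and distance is measured by breadth-first search (both are assumptions; the function names are hypothetical):

```python
from collections import deque
from typing import Dict, Iterable, Optional


def hops_between(
    connections: Dict[str, Iterable[str]], start: str, target: str
) -> Optional[int]:
    """Fewest connections between two users, or None if they are not linked."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, dist = queue.popleft()
        for peer in connections.get(user, ()):
            if peer == target:
                return dist + 1
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, dist + 1))
    return None


def may_access(
    connections: Dict[str, Iterable[str]], owner: str, requester: str, depth: int
) -> bool:
    """True if the requester is within the owner's configured sharing depth."""
    hops = hops_between(connections, owner, requester)
    return hops is not None and hops <= depth
```

With `connections = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}`, user C is two connections from user A, so a depth of two grants access and a depth of one does not.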
- the local voice profile manager 18 includes the necessary networking, communication and routing protocols to discover members of a group. For example, after a user has selected a particular group, the speech recognition system 10 may use a discovery communication protocol to query other speech recognition systems that have been designated as being in that same group. In this manner, the speech recognition system 10 can determine the users that are members of a selected group. Accordingly, the system 10 may present or display (e.g., in a graphical user-interface) each member of a particular group for selection by the user. To avoid overburdening network resources, discovery protocols may only search to an appropriate depth or level.
- a “hop” represents a nodal connection between the user's device and another user's device. In this manner, a user may find all other members of a selected group that are within so many “hops” of the user.
- group membership and/or voice profile sharing privileges may be established based on simple set logic or Boolean logic. For example, a user's voice profile may automatically be shared with another user or entity if the user's existing group membership satisfies a Boolean-valued function. For instance, a user may automatically be designated as a member of a particular group, if the user is a member of one or more other groups. If user A is a member of Group “New York City” and group “over age 25” then user A may automatically be designated as a member of group “Hertz Car Rental.” In one embodiment, a user may be able to control the extent to which he or she is automatically added to groups through access privileges.
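The automatic-membership rule can be expressed as a Boolean-valued function over a user's existing groups. In this sketch the rule table and the function name `auto_groups` are hypothetical; the group names come from the example above.

```python
from typing import Callable, Dict, Iterable, Set

Rule = Callable[[Set[str]], bool]


def auto_groups(memberships: Iterable[str], rules: Dict[str, Rule]) -> Set[str]:
    """Add every group whose rule is satisfied, repeating until nothing changes."""
    result = set(memberships)
    changed = True
    while changed:
        changed = False
        for group, predicate in rules.items():
            if group not in result and predicate(result):
                result.add(group)
                changed = True
    return result


# Rule from the text: New York City AND over age 25 implies Hertz Car Rental.
RULES: Dict[str, Rule] = {
    "Hertz Car Rental": lambda g: "New York City" in g and "over age 25" in g,
}
```

Access privileges that let a user opt out of automatic additions could be modeled by filtering `rules` before calling `auto_groups`.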
- the local voice profile manager 18 automatically arranges for the exchange of the portable voice profiles.
- the local voice profile manager 18 may be configured to communicate with remote voice profile managers for the exchange of voice profiles.
- “local” simply means that the voice profile manager is executing on the user's device, whereas “remote” means that the voice profile manager is executing on another user's device. Such communications may occur automatically, for example, over a network such as the Internet. Accordingly, the group manager 16 , in conjunction with the local voice profile manager 18 , facilitates the automatic sharing of portable voice profiles amongst and between users who have established connections via the group manager.
- user A's profile manager 18 will automatically determine the locations of the other users' (e.g., users B, C, D and E) voice profiles.
- the actual exchange of the voice profiles is facilitated by networking, communication and/or routing protocols, which may be part of the local voice profile manager 18 , or another subsystem (not shown).
- the local portable voice profile 22 - 1 is a portable voice profile that is associated with a user/owner of the speech recognition engine and applications, whereas a remote voice profile is a profile received from another user, who does not own and may or may not regularly use the particular speech recognition engine 12 . Accordingly, each remote voice profile is retrieved from a remote device or system used by a member of a group with whom the user has established a connection.
- group size can be calculated by the larger system (e.g., by taking into consideration engineering estimates and measurements of the load on the entire system) to reflect the availability of computation and network resources.
- the local voice profile manager 18 may monitor the age of each remote portable voice profile ( 22 - 2 , 22 - 3 and 22 -N) so as to ensure that each remote portable voice profile is kept recent and up-to-date. For example, the local voice profile manager 18 may periodically communicate requests to one or more remote voice profile managers, from which a remote voice profile was received, requesting the remote voice profile managers to update remote voice profiles if necessary. Accordingly, if an update is required, a remote voice profile manager may communicate an updated remote portable voice profile to the local voice profile manager 18 .
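The age check could be sketched as below. The patent gives no criterion for when a profile counts as out of date, so the `max_age` limit and the record of receipt times are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_profiles(
    received_at: Dict[str, datetime], now: datetime, max_age: timedelta
) -> List[str]:
    """Owners of remote profiles whose local copy is older than max_age."""
    return sorted(
        owner for owner, stamp in received_at.items() if now - stamp > max_age
    )
```

The local voice profile manager would then send an update request to the remote manager of each owner returned.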
- voice profiles may be exchanged prior to being invoked by an application or speech recognition engine. Accordingly, when a user selects a particular group to be used in connection with recording a conversation through a particular application (e.g., a Voice over Internet Protocol (VoIP) phone), the voice profiles associated with the members of that group are ready to be used.
- voice profiles may be exchanged in real time, for example, when a conversation is initiated through a particular application, or when a recording device is configured and enabled to capture a conversation between two or more users.
- a group may be automatically established when users identify themselves to a call conferencing application, which uses a group manager 16 and voice profile manager 18 to retrieve the appropriate voice profiles in real-time.
- the group manager 16 and/or voice profile manager 18 execute as stand alone applications, thereby enabling the establishment of groups and the exchange of voice profiles independent of any particular user application.
- the group manager 16 and/or voice profile manager may be integrated with and execute as part of a local user's application.
- the group manager 16 and/or voice profile manager 18 may be implemented as, or integrated with, an instant messaging or social networking platform. Accordingly, an instant messaging or social networking platform may provide a foundation for the communication and messaging platform, such that the functions to facilitate exchanging portable voice profiles are an add-on service or component.
- the group manager 16 may be integrated in such a way that group definitions and member selections may be facilitated by means of the instant messaging or social networking platform.
- the speech recognition engine 12 receives an audio signal from an active application 20 - 1 .
- the audio signal may be processed in real-time or near real-time, for example, as the signal is being captured.
- the audio signal may be captured by the active application and then processed by the speech recognition system 10 at a later time.
- the speaker identification logic 14 analyzes the audio signal to identify a speaker. Based on the identification of the speaker, a voice profile is invoked for processing the audio signal.
- the speech recognition engine 12 utilizes active voice profile data 28 to enhance the accuracy of the speech recognition.
- If the speaker identification logic 14 cannot identify the speaker associated with a particular voice, for example, because there is no portable voice profile accessible for the speaker, then default settings or a default voice profile may be used. If, during the recognition process, the speaker identification logic 14 identifies a voice change, then the speaker identification logic 14 attempts to identify the new voice, and invoke the associated portable voice profile. In this manner, the speech recognition engine 12 continues to process the audio signal, and generates an output containing the recognized text of the captured or recorded audio signal.
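The fallback behavior can be sketched as a per-segment profile choice. It is assumed here that the audio has already been split at voice changes and that each segment carries the identified speaker, or `None` when identification failed; the names are hypothetical.

```python
from typing import Iterable, List, Optional, Set

DEFAULT_PROFILE = "default"  # stand-in for default settings or a default profile


def select_profiles(
    segment_speakers: Iterable[Optional[str]], accessible: Set[str]
) -> List[str]:
    """Pick the profile for each segment, falling back to the default profile."""
    chosen = []
    for speaker in segment_speakers:
        if speaker is not None and speaker in accessible:
            chosen.append(speaker)
        else:
            chosen.append(DEFAULT_PROFILE)
    return chosen
```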
- each speech recognition system may be configured to use only one type of application with its own customized speech recognition engine configured to utilize portable voice profiles.
- the speech recognition systems shown in FIG. 2 include a personal computer 30 executing a VoIP phone application 32 , a personal computer 34 executing a dictation application 36 , a mobile phone 38 executing a voice dialing application 40 , a personal digital assistant or PDA 42 executing a voice recorder application 44 , and a voice recorder 46 with a voice recording application 48 .
- Each individual device may have its own customized speech recognition engine.
- each device ( 30 , 34 , 38 , 42 and 46 ), application ( 32 , 36 , 40 , 44 and 48 ) and respective speech recognition engine ( 50 , 52 , 54 , 56 , 58 ) may be compatible with portable voice profiles. Therefore, in contrast to conventional speech recognition systems, a user who has a well developed voice profile, which includes data from extensive training with a dictation application (e.g., application 36 ) can use that voice profile with a voice recorder application 44 on a PDA 42 , or in any other voice-enabled application that is configured to utilize portable voice profiles. Moreover, the voice profile can be shared with other users.
- each device may be configured to synchronize with other devices owned by the same user. For example, if all of the devices illustrated in FIG. 2 are owned by one user, any change made to one device may be automatically distributed to other devices. Accordingly, if the user has established a group and selected the group's members on one device (e.g., personal computer 30 ), and then later adds a member to the group, the change will be distributed to all other devices owned by the user.
- the local voice profile manager 60 of one device may communicate (e.g., upload) device specific settings and information, including group, member, and voice profile information, to another device, such as mobile phone 38 , or PDA 42 .
- the communication between devices may occur over a wired connection (e.g., Ethernet, universal serial bus (USB) or FireWire) or wireless connection (e.g., Bluetooth or Wi-Fi®).
- FIGS. 3 and 4 illustrate examples of user-interfaces for managing groups and associated portable voice profiles for use with one or more speech recognition systems, according to embodiments of the invention.
- a group manager 16 may have a graphical user-interface for establishing and managing groups. Accordingly, the group manager user-interface may enable a user to define or select a group, such as those shown in FIG. 3 (e.g., FRIENDS & FAMILY, COWORKERS, and GOLF CLUB MEMBERS). Once a group is defined or selected, the user may select other users to be members of each particular group.
- by selecting another user (e.g., BILL STANTON) to be a member of a group, the user is, by default, granting permission to that user to access his or her portable voice profile.
- the user-interface may implement an alternative mechanism for granting access privileges to voice profiles, such that a member of a group is not automatically allowed access to a user's voice profile.
- the system 10 automatically polls other users' devices to determine the members of a particular group, thereby enabling the user-interface to display members of a group for selection by the user.
- FIG. 4 illustrates an example of a user-interface, according to an embodiment of the invention, for use with a voice profile service provider.
- a user may select to upload his or her portable voice profile to a voice profile service provider. Once uploaded, the user can select to share his or her portable voice profile.
- the user may select from a list of business entities, those entities with which the user would like to share his or her portable voice profile. Accordingly, if the user is a frequent caller to a particular business entity (e.g., United Airlines), then the user may wish to share his or her voice profile with that particular business entity.
- the automated voice response system of the call center may then utilize the user's voice profile to improve speech recognition accuracy, thereby improving the overall user experience.
- FIG. 5 illustrates a population of people organized into a hierarchical structure.
- the hierarchical structure may represent the structure of a company or other business entity.
- person A is at the top of the hierarchical structure, while persons E through J and N through S are at the lowest level of the structure—three levels from person A.
- One of the problems that exists in any hierarchical structure involves the flow of information. For example, because person A is three levels away from the persons at the lowest level (e.g., persons E through J and N through S), information flowing from person A to person J must pass through two levels, including persons B and D. Information may be lost or changed en route from person A to J. As more and more layers are added to the structure, problems associated with the flow of information increase, as there is insufficient bandwidth available for all the information available at the bottom of the hierarchy to be transmitted to the individuals higher up.
- FIG. 6 illustrates how an embodiment of the present invention enables persons within a hierarchical structure to share portable voice profiles.
- a group consisting of persons I, D, B, A, K, M and Q has been defined.
- a voice profile list (e.g., voice profile list 62 ), listing the voice profiles residing on, or stored at, a speech recognition system 10 owned and operated by person A.
- Each voice profile list may be maintained and/or managed by the voice profile manager 18 of its respective owner's system 10 .
- Although person M is in the group, it is clear from the illustration of FIG. 6 that person A does not have person M's voice profile stored on person A's system, as person M is not listed in person A's voice profile list.
- the profile manager 18 of person A's system can retrieve the voice profile of person M by sending a request to person K, who has person M's voice profile stored locally, as indicated in person K's voice profile list.
- a group of distributed voice profiles can be shared, and routed to the proper system when needed, or when requested.
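The lookup described above, where person A's system obtains person M's profile from person K, might be sketched as follows. This is only an illustration under stated assumptions: the class name `VoiceProfileManager`, its fields, the string placeholders standing in for profile data, the contents of each list, and the choice to cache a retrieved profile locally are all hypothetical and are not specified by the description.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceProfileManager:
    """Hypothetical stand-in for the voice profile manager 18 on one person's system."""
    owner: str
    # The "voice profile list": speaker name -> locally stored profile data.
    profiles: dict = field(default_factory=dict)
    # Other managers this system can send requests to.
    peers: list = field(default_factory=list)

    def get_profile(self, speaker, _visited=None):
        """Return a profile from the local list, else ask peers; None if nobody has it."""
        visited = _visited if _visited is not None else set()
        visited.add(self.owner)
        if speaker in self.profiles:
            return self.profiles[speaker]
        for peer in self.peers:
            if peer.owner in visited:
                continue  # avoid asking a system that is already part of this request chain
            found = peer.get_profile(speaker, visited)
            if found is not None:
                # Assumption: the requester keeps a local copy once retrieved.
                self.profiles[speaker] = found
                return found
        return None


# Person A lacks M's profile; person K lists it (list contents are made up).
a = VoiceProfileManager("A", profiles={"A": "profile-A", "B": "profile-B"})
k = VoiceProfileManager("K", profiles={"K": "profile-K", "M": "profile-M"})
a.peers.append(k)
k.peers.append(a)

result = a.get_profile("M")
print(result)
```

The `_visited` set is one simple way to keep a request from circulating indefinitely among peers that reference each other; other routing schemes would work equally well.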
- each user (e.g., users A, B and C)
- each user generates and develops his or her own voice profile by utilizing the portable voice profile with one or more application and speech recognition engines.
- each user agrees, by means of his or her own local voice profile manager, to exchange portable voice profiles with the other users.
- user A may grant users B and C permission to access user A's portable voice profile
- user B grants users A and C permission, and so on.
- each user's voice profile manager automatically negotiates the exchange of voice profiles between and amongst the users. Consequently, user A may utilize his own portable voice profile, as well as user B's portable voice profile, and user C's portable voice profile on one device, such as PDA 80 , with a voice recording application.
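A minimal sketch of the permission-based exchange described above, assuming a simple in-memory model. The names `ProfileOwner`, `grant`, `request_from` and `negotiate_exchange` are hypothetical, and the profile values are placeholder strings rather than real voice profile data.

```python
class ProfileOwner:
    """Hypothetical model of one user and his or her local voice profile manager."""

    def __init__(self, name):
        self.name = name
        self.own_profile = f"portable-profile-{name}"
        self.granted_to = set()   # names of users allowed to access this user's profile
        self.received = {}        # profiles obtained from other users

    def grant(self, *others):
        """Grant the given users permission to access this user's profile."""
        self.granted_to.update(o.name for o in others)

    def request_from(self, other):
        """Copy other's profile if other has granted this user permission."""
        if self.name in other.granted_to:
            self.received[other.name] = other.own_profile
            return True
        return False


def negotiate_exchange(users):
    """Each user requests every other user's profile; ungranted requests are refused."""
    for requester in users:
        for owner in users:
            if requester is not owner:
                requester.request_from(owner)


a, b, c = ProfileOwner("A"), ProfileOwner("B"), ProfileOwner("C")
a.grant(b, c)
b.grant(a, c)
c.grant(a, b)
negotiate_exchange([a, b, c])

# Profiles usable on user A's device: A's own plus those received from B and C.
available_on_a = sorted([a.name, *a.received])
print(available_on_a)
```

A user who has not been granted permission simply receives nothing, which mirrors the requirement that each user agrees to the exchange.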
- each user may utilize a communication device with recording and speech recognition capabilities.
- each user's device including user A's personal computer 82 , user B's wireless VoIP phone 84 , and user C's mobile VoIP phone 86 , has a locally stored copy of each user's portable voice profile. Therefore, as user B's voice signal is received at user A's personal computer, a speech recognition engine executing at user A's personal computer can process the voice signal using user B's portable voice profile.
- portable voice profiles are exchanged prior to the initiation of a communication session, as described above. Alternatively, voice profiles may be exchanged at the time the communication session is initiated.
- FIGS. 9 and 10 illustrate an example of how a speech recognition system, consistent with an embodiment of the invention, might work in a particular multi-user application.
- In FIG. 9, an example conversation between users A, B and C is shown.
- the conversation is captured by an application, such as the voice recording application on the PDA 80 ( FIG. 7 ), or a VoIP application on the personal computer 82 ( FIG. 8 ).
- the voice recording application processes and passes the recorded voice signal to the speech recognition engine.
- the speech recognition engine can utilize the various portable voice profiles to recognize the words spoken by user A, user B and user C. For example, as illustrated in FIG.
- user A's speech is processed by the speech recognition engine with user A's portable voice profile
- user B's speech is processed with user B's portable voice profile, and so on.
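The per-speaker processing described above might be sketched as below. The sketch assumes the recorded conversation has already been split into segments labelled by speaker, which this passage does not describe; `recognize` is a placeholder and not a real speech recognition engine API, and all names and values are hypothetical.

```python
def recognize(segment_audio, profile):
    """Placeholder for a speech recognition engine call; returns a tagged string."""
    return f"[{profile}] {segment_audio}"


def transcribe_conversation(segments, profiles):
    """Process each speaker's segment with that speaker's portable voice profile.

    segments: list of (speaker, audio) pairs in conversation order.
    profiles: mapping of speaker name -> portable voice profile.
    """
    transcript = []
    for speaker, audio in segments:
        profile = profiles.get(speaker)
        if profile is None:
            # No profile available for this speaker; leave the segment unrecognized.
            transcript.append((speaker, None))
            continue
        transcript.append((speaker, recognize(audio, profile)))
    return transcript


profiles = {"A": "profile-A", "B": "profile-B", "C": "profile-C"}
segments = [("A", "audio-1"), ("B", "audio-2"), ("C", "audio-3"), ("A", "audio-4")]
transcript = transcribe_conversation(segments, profiles)
for speaker, text in transcript:
    print(speaker, text)
```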
- each device of the overall system need not be aware of the location of every user's profile, as each device can query other users' devices in the system to obtain the location, and thus the voice profile, of another user.
- the system generally has a peer-to-peer architecture.
- each voice profile manager 18 may communicate directly with another voice profile manager to exchange voice profiles.
- no centralized server is required.
- a central server or a special purpose peer device facilitates some special functions, such as establishing groups, enforcing secured group access, and monitoring the overall processing and communication load of the system.
- each user may subscribe to a service, and select users who have also subscribed to the service, to be members of particular groups. Consequently, the particular architecture of the system (e.g., peer-to-peer, or client server, or the careful integration of the two forms of network topology in a larger system) is a design choice, and depends on the particular features that are desired. Those skilled in the art will appreciate the advantages of each type of architecture.
- higher level groups may be established and managed by regional, centralized servers.
- one or more servers may manage high level groups, such as a group including all members residing in a particular city (e.g., New York City, Los Angeles, San Francisco), or all members with a particular voice accent, such as all members with a New York accent.
- Lower level groups based on more precise data may be managed by each user's group manager 16 on a peer-to-peer basis.
- FIG. 11 illustrates an example of a client-server based speech recognition system, consistent with an embodiment of the invention.
- a voice profile service provider 90 operates a server 92 coupled to a storage device 94 for storing portable voice profiles.
- the voice profile service provider's server 92 is connected to a public network 96 , such as the Internet.
- a user, such as user D, may utilize a conventional web browser executing on a computer 98 to access a web server operated by the voice profile service provider 90 .
- user D may establish a new voice profile with the voice profile service provider 90 by requesting a new voice profile, in which case the voice profile service provider 90 may prompt user D to input information and/or provide a voice sample.
- user D may upload an existing portable voice profile that user D has developed with an existing speech recognition system.
- a subscriber may be a business entity that subscribes to a service offered by the voice profile service provider 90 .
- the business entity is granted access to users' voice profiles, which can be used within the business entity's call center to improve speech recognition accuracy when a user (e.g., user D) places a call to the subscriber's call center. As illustrated in FIG.
- a speech recognition engine 104 operated by subscriber 100 can process user D's voice signal using user D's portable voice profile, which was received by the subscriber 100 from the voice profile service provider 90 .
- a subscriber 100 retrieves portable voice profiles when a call is initiated. For example, the subscriber 100 may request the user to enter a telephone number, or some other information suitable for identification, upon which the subscriber can issue a request for the voice profile associated with the identification information. Alternatively, the subscriber 100 may utilize caller identification information communicated to the call center as part of the communication session. In any case, the subscriber 100 may issue a request to the voice profile service provider 90 for the user's voice profile after the user has provided the identifying information.
- a subscriber 100 may store voice profiles locally.
- the subscriber 100 may implement a mechanism for automatically updating a user's portable voice profile as it changes over time. For example, the subscriber 100 may periodically poll the voice profile service provider 90 to determine whether a particular voice profile has been updated or otherwise changed. Alternatively, the voice profile service provider 90 may automatically communicate updates to the subscriber 100 .
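The retrieval-at-call-time and polling behaviour described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the class names, the use of an integer version number to detect changes, the telephone number used as the identifier, and the in-memory dictionaries standing in for the storage device 94 and the subscriber's local store.

```python
class ProfileServiceProvider:
    """Hypothetical stand-in for the voice profile service provider 90."""

    def __init__(self):
        self._store = {}  # caller_id -> (version, profile)

    def upload(self, caller_id, profile):
        """Store a new or updated profile, bumping its version number."""
        version = self._store.get(caller_id, (0, None))[0] + 1
        self._store[caller_id] = (version, profile)

    def version_of(self, caller_id):
        return self._store[caller_id][0]

    def fetch(self, caller_id):
        return self._store[caller_id]


class Subscriber:
    """Hypothetical stand-in for subscriber 100 with a local profile cache."""

    def __init__(self, provider):
        self.provider = provider
        self.cache = {}  # caller_id -> (version, profile)

    def profile_for_call(self, caller_id):
        """Look up the caller's profile when a call starts, fetching it if not cached."""
        if caller_id not in self.cache:
            self.cache[caller_id] = self.provider.fetch(caller_id)
        return self.cache[caller_id][1]

    def poll_for_updates(self):
        """Ask the provider whether any cached profile changed; refresh those that did."""
        refreshed = []
        for caller_id, (version, _) in list(self.cache.items()):
            if self.provider.version_of(caller_id) != version:
                self.cache[caller_id] = self.provider.fetch(caller_id)
                refreshed.append(caller_id)
        return refreshed


provider = ProfileServiceProvider()
provider.upload("555-0100", "profile-D-v1")
subscriber = Subscriber(provider)

first = subscriber.profile_for_call("555-0100")
provider.upload("555-0100", "profile-D-v2")   # user D's profile changes over time
refreshed = subscriber.poll_for_updates()
second = subscriber.profile_for_call("555-0100")
print(first, refreshed, second)
```

Polling keeps the subscriber in control of when network traffic occurs; the alternative in which the provider communicates updates itself would replace `poll_for_updates` with a callback invoked by the provider.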
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/536,117 US8214208B2 (en) | 2006-09-28 | 2006-09-28 | Method and system for sharing portable voice profiles |
US13/523,762 US8990077B2 (en) | 2006-09-28 | 2012-06-14 | Method and system for sharing portable voice profiles |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/536,117 US8214208B2 (en) | 2006-09-28 | 2006-09-28 | Method and system for sharing portable voice profiles |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/523,762 Continuation US8990077B2 (en) | 2006-09-28 | 2012-06-14 | Method and system for sharing portable voice profiles |
Publications (2)
Publication Number | Publication Date |
---|---|
US20080082332A1 (en) | 2008-04-03 |
US8214208B2 (en) | 2012-07-03 |
Family
ID=39262074
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/536,117 Expired - Fee Related US8214208B2 (en) | 2006-09-28 | 2006-09-28 | Method and system for sharing portable voice profiles |
US13/523,762 Expired - Fee Related US8990077B2 (en) | 2006-09-28 | 2012-06-14 | Method and system for sharing portable voice profiles |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/523,762 Expired - Fee Related US8990077B2 (en) | 2006-09-28 | 2012-06-14 | Method and system for sharing portable voice profiles |
Country Status (1)
Country | Link |
---|---|
US (2) | US8214208B2 (en) |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10438594B2 (en) * | 2017-09-08 | 2019-10-08 | Amazon Technologies, Inc. | Administration of privileges by speech for voice assistant system |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10567515B1 (en) * | 2017-10-26 | 2020-02-18 | Amazon Technologies, Inc. | Speech processing performed with respect to first and second user profiles in a dialog session |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
WO2019112624A1 (en) * | 2017-12-08 | 2019-06-13 | Google Llc | Distributed identification in networked system |
US11087766B2 (en) * | 2018-01-05 | 2021-08-10 | Uniphore Software Systems | System and method for dynamic speech recognition selection based on speech rate or business domain |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK179822B1 (en) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | Virtual assistant operation in multi-device environments |
US11076039B2 (en) | 2018-06-03 | 2021-07-27 | Apple Inc. | Accelerated task performance |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11393478B2 (en) * | 2018-12-12 | 2022-07-19 | Sonos, Inc. | User specific context switching |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
DK201970510A1 (en) | 2019-05-31 | 2021-02-11 | Apple Inc | Voice identification in digital assistant systems |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
KR102737990B1 (en) * | 2020-01-23 | 2024-12-05 | 삼성전자주식회사 | Electronic device and method for training an artificial intelligence model related to a chatbot using voice data |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11183193B1 (en) | 2020-05-11 | 2021-11-23 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
US11961524B2 (en) | 2021-05-27 | 2024-04-16 | Honeywell International Inc. | System and method for extracting and displaying speaker information in an ATC transcription |
Citations (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6195641B1 (en) * | 1998-03-27 | 2001-02-27 | International Business Machines Corp. | Network universal spoken language vocabulary |
US6263308B1 (en) * | 2000-03-20 | 2001-07-17 | Microsoft Corporation | Methods and apparatus for performing speech recognition using acoustic models which are improved through an interactive process |
US20020065652A1 (en) * | 2000-11-27 | 2002-05-30 | Akihiro Kushida | Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory |
US20020156626A1 (en) * | 2001-04-20 | 2002-10-24 | Hutchison William R. | Speech recognition system |
US20030023431A1 (en) * | 2001-07-26 | 2003-01-30 | Marc Neuberger | Method and system for augmenting grammars in distributed voice browsing |
US20030065721A1 (en) * | 2001-09-28 | 2003-04-03 | Roskind James A. | Passive personalization of buddy lists |
US20030120486A1 (en) * | 2001-12-20 | 2003-06-26 | Hewlett Packard Company | Speech recognition system and method |
US20030163308A1 (en) * | 2002-02-28 | 2003-08-28 | Fujitsu Limited | Speech recognition system and speech file recording system |
US20030182113A1 (en) * | 1999-11-22 | 2003-09-25 | Xuedong Huang | Distributed speech recognition for mobile communication devices |
US20030220975A1 (en) * | 2002-05-21 | 2003-11-27 | Malik Dale W. | Group access management system |
US20040186714A1 (en) * | 2003-03-18 | 2004-09-23 | Aurilab, Llc | Speech recognition improvement through post-processsing |
US20040260701A1 (en) * | 2003-05-27 | 2004-12-23 | Juha Lehikoinen | System and method for weblog and sharing in a peer-to-peer environment |
US20050097440A1 (en) * | 2003-11-04 | 2005-05-05 | Richard Lusk | Method and system for collaboration |
US20050119894A1 (en) * | 2003-10-20 | 2005-06-02 | Cutler Ann R. | System and process for feedback speech instruction |
US20050171926A1 (en) * | 2004-02-02 | 2005-08-04 | Thione Giovanni L. | Systems and methods for collaborative note-taking |
US20050249196A1 (en) * | 2004-05-05 | 2005-11-10 | Amir Ansari | Multimedia access device and system employing the same |
US20050286546A1 (en) * | 2004-06-21 | 2005-12-29 | Arianna Bassoli | Synchronized media streaming between distributed peers |
US20060053014A1 (en) * | 2002-11-21 | 2006-03-09 | Shinichi Yoshizawa | Standard model creating device and standard model creating method |
US7035788B1 (en) * | 2000-04-25 | 2006-04-25 | Microsoft Corporation | Language model sharing |
US7099825B1 (en) * | 2002-03-15 | 2006-08-29 | Sprint Communications Company L.P. | User mobility in a voice recognition environment |
US7174298B2 (en) * | 2002-06-24 | 2007-02-06 | Intel Corporation | Method and apparatus to improve accuracy of mobile speech-enabled services |
US20070038436A1 (en) * | 2005-08-10 | 2007-02-15 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
US7224981B2 (en) * | 2002-06-20 | 2007-05-29 | Intel Corporation | Speech recognition of mobile devices |
US20070174399A1 (en) * | 2006-01-26 | 2007-07-26 | Ogle David M | Offline IM chat to avoid server connections |
US20070266079A1 (en) * | 2006-04-10 | 2007-11-15 | Microsoft Corporation | Content Upload Safety Tool |
US7302391B2 (en) * | 2000-11-30 | 2007-11-27 | Telesector Resources Group, Inc. | Methods and apparatus for performing speech recognition over a network and using speech recognition results |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5897616A (en) * | 1997-06-11 | 1999-04-27 | International Business Machines Corporation | Apparatus and methods for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases |
US7003463B1 (en) * | 1998-10-02 | 2006-02-21 | International Business Machines Corporation | System and method for providing network coordinated conversational services |
US6332122B1 (en) * | 1999-06-23 | 2001-12-18 | International Business Machines Corporation | Transcription system for multiple speakers, using and establishing identification |
US6587818B2 (en) * | 1999-10-28 | 2003-07-01 | International Business Machines Corporation | System and method for resolving decoding ambiguity via dialog |
JP3728177B2 (en) * | 2000-05-24 | 2005-12-21 | キヤノン株式会社 | Audio processing system, apparatus, method, and storage medium |
JP2003316387A (en) * | 2002-02-19 | 2003-11-07 | Ntt Docomo Inc | Learning device, mobile communication terminal, information recognition system, and learning method |
US7133828B2 (en) * | 2002-10-18 | 2006-11-07 | Ser Solutions, Inc. | Methods and apparatus for audio data analysis and data mining using speech recognition |
US7222072B2 (en) * | 2003-02-13 | 2007-05-22 | Sbc Properties, L.P. | Bio-phonetic multi-phrase speaker identity verification |
US7464031B2 (en) * | 2003-11-28 | 2008-12-09 | International Business Machines Corporation | Speech recognition utilizing multitude of speech features |
US20050239511A1 (en) * | 2004-04-22 | 2005-10-27 | Motorola, Inc. | Speaker identification using a mobile communications device |
US20050246762A1 (en) * | 2004-04-29 | 2005-11-03 | International Business Machines Corporation | Changing access permission based on usage of a computer resource |
US20060282265A1 (en) * | 2005-06-10 | 2006-12-14 | Steve Grobman | Methods and apparatus to perform enhanced speech to text processing |
US8683333B2 (en) * | 2005-12-08 | 2014-03-25 | International Business Machines Corporation | Brokering of personalized rulesets for use in digital media character replacement |
US20070220113A1 (en) * | 2006-03-15 | 2007-09-20 | Jerry Knight | Rich presence in a personal communications client for enterprise communications |
US7720681B2 (en) * | 2006-03-23 | 2010-05-18 | Microsoft Corporation | Digital voice profiles |
US8130679B2 (en) * | 2006-05-25 | 2012-03-06 | Microsoft Corporation | Individual processing of VoIP contextual information |
US7508310B1 (en) * | 2008-04-17 | 2009-03-24 | Robelight, Llc | System and method for secure networking in a virtual space |
- 2006-09-28: US application US11/536,117, patent US8214208B2, not active (Expired - Fee Related)
- 2012-06-14: US application US13/523,762, patent US8990077B2, not active (Expired - Fee Related)
Patent Citations (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6195641B1 (en) * | 1998-03-27 | 2001-02-27 | International Business Machines Corp. | Network universal spoken language vocabulary |
US20030182113A1 (en) * | 1999-11-22 | 2003-09-25 | Xuedong Huang | Distributed speech recognition for mobile communication devices |
US6263308B1 (en) * | 2000-03-20 | 2001-07-17 | Microsoft Corporation | Methods and apparatus for performing speech recognition using acoustic models which are improved through an interactive process |
US7035788B1 (en) * | 2000-04-25 | 2006-04-25 | Microsoft Corporation | Language model sharing |
US20020065652A1 (en) * | 2000-11-27 | 2002-05-30 | Akihiro Kushida | Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory |
US7099824B2 (en) * | 2000-11-27 | 2006-08-29 | Canon Kabushiki Kaisha | Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory |
US7302391B2 (en) * | 2000-11-30 | 2007-11-27 | Telesector Resources Group, Inc. | Methods and apparatus for performing speech recognition over a network and using speech recognition results |
US20020156626A1 (en) * | 2001-04-20 | 2002-10-24 | Hutchison William R. | Speech recognition system |
US6785647B2 (en) * | 2001-04-20 | 2004-08-31 | William R. Hutchison | Speech recognition system with network accessible speech processing resources |
US20030023431A1 (en) * | 2001-07-26 | 2003-01-30 | Marc Neuberger | Method and system for augmenting grammars in distributed voice browsing |
US20030065721A1 (en) * | 2001-09-28 | 2003-04-03 | Roskind James A. | Passive personalization of buddy lists |
US20030120486A1 (en) * | 2001-12-20 | 2003-06-26 | Hewlett Packard Company | Speech recognition system and method |
US20030163308A1 (en) * | 2002-02-28 | 2003-08-28 | Fujitsu Limited | Speech recognition system and speech file recording system |
US7099825B1 (en) * | 2002-03-15 | 2006-08-29 | Sprint Communications Company L.P. | User mobility in a voice recognition environment |
US20030220975A1 (en) * | 2002-05-21 | 2003-11-27 | Malik Dale W. | Group access management system |
US7224981B2 (en) * | 2002-06-20 | 2007-05-29 | Intel Corporation | Speech recognition of mobile devices |
US7174298B2 (en) * | 2002-06-24 | 2007-02-06 | Intel Corporation | Method and apparatus to improve accuracy of mobile speech-enabled services |
US20060053014A1 (en) * | 2002-11-21 | 2006-03-09 | Shinichi Yoshizawa | Standard model creating device and standard model creating method |
US20040186714A1 (en) * | 2003-03-18 | 2004-09-23 | Aurilab, Llc | Speech recognition improvement through post-processsing |
US20040260701A1 (en) * | 2003-05-27 | 2004-12-23 | Juha Lehikoinen | System and method for weblog and sharing in a peer-to-peer environment |
US20050119894A1 (en) * | 2003-10-20 | 2005-06-02 | Cutler Ann R. | System and process for feedback speech instruction |
US20050097440A1 (en) * | 2003-11-04 | 2005-05-05 | Richard Lusk | Method and system for collaboration |
US20050171926A1 (en) * | 2004-02-02 | 2005-08-04 | Thione Giovanni L. | Systems and methods for collaborative note-taking |
US20050249196A1 (en) * | 2004-05-05 | 2005-11-10 | Amir Ansari | Multimedia access device and system employing the same |
US20050286546A1 (en) * | 2004-06-21 | 2005-12-29 | Arianna Bassoli | Synchronized media streaming between distributed peers |
US20070038436A1 (en) * | 2005-08-10 | 2007-02-15 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
US20070174399A1 (en) * | 2006-01-26 | 2007-07-26 | Ogle David M | Offline IM chat to avoid server connections |
US20070266079A1 (en) * | 2006-04-10 | 2007-11-15 | Microsoft Corporation | Content Upload Safety Tool |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9787830B1 (en) | 2000-11-30 | 2017-10-10 | Google Inc. | Performing speech recognition over a network and using speech recognition results based on determining that a network connection exists |
US8682663B2 (en) | 2000-11-30 | 2014-03-25 | Google Inc. | Performing speech recognition over a network and using speech recognition results based on determining that a network connection exists |
US8731937B1 (en) * | 2000-11-30 | 2014-05-20 | Google Inc. | Updating speech recognition models for contacts |
US8520810B1 (en) | 2000-11-30 | 2013-08-27 | Google Inc. | Performing speech recognition over a network and using speech recognition results |
US9380155B1 (en) | 2000-11-30 | 2016-06-28 | Google Inc. | Forming speech recognition over a network and using speech recognition results based on determining that a network connection exists |
US9818399B1 (en) | 2000-11-30 | 2017-11-14 | Google Inc. | Performing speech recognition over a network and using speech recognition results based on determining that a network connection exists |
US20140026075A1 (en) * | 2011-01-16 | 2014-01-23 | Google Inc. | Method and System for Sharing Speech Recognition Program Profiles for an Application |
US10218770B2 (en) * | 2011-01-16 | 2019-02-26 | Google Llc | Method and system for sharing speech recognition program profiles for an application |
US10114932B2 (en) * | 2011-11-03 | 2018-10-30 | Mobile Iron, Inc. | Adapting a mobile application to a partitioned environment |
US9430641B1 (en) * | 2011-11-03 | 2016-08-30 | Mobile Iron, Inc. | Adapting a mobile application to a partitioned environment |
US20170011206A1 (en) * | 2011-11-03 | 2017-01-12 | Mobile Iron, Inc. | Adapting a mobile application to a partitioned environment |
US9318104B1 (en) | 2013-02-20 | 2016-04-19 | Google Inc. | Methods and systems for sharing of adapted voice profiles |
CN106847258B (en) * | 2013-02-20 | 2020-09-29 | 谷歌有限责任公司 | Method and apparatus for sharing an adapted voice profile |
US9117451B2 (en) | 2013-02-20 | 2015-08-25 | Google Inc. | Methods and systems for sharing of adapted voice profiles |
CN106847258A (en) * | 2013-02-20 | 2017-06-13 | 谷歌公司 | Method and apparatus for sharing adjustment speech profiles |
US9282096B2 (en) | 2013-08-31 | 2016-03-08 | Steven Goldstein | Methods and systems for voice authentication service leveraging networking |
US11570601B2 (en) | 2013-10-06 | 2023-01-31 | Staton Techiya, Llc | Methods and systems for establishing and maintaining presence information of neighboring bluetooth devices |
US10869177B2 (en) | 2013-10-06 | 2020-12-15 | Staton Techiya, Llc | Methods and systems for establishing and maintaining presence information of neighboring bluetooth devices |
US10405163B2 (en) | 2013-10-06 | 2019-09-03 | Staton Techiya, Llc | Methods and systems for establishing and maintaining presence information of neighboring bluetooth devices |
US11729596B2 (en) | 2013-10-06 | 2023-08-15 | Staton Techiya Llc | Methods and systems for establishing and maintaining presence information of neighboring Bluetooth devices |
US20230370827A1 (en) * | 2013-10-06 | 2023-11-16 | Staton Techiya Llc | Methods and systems for establishing and maintaining presence information of neighboring bluetooth devices |
US12170941B2 (en) * | 2013-10-06 | 2024-12-17 | The Diablo Canyon Collective Llc | Methods and systems for establishing and maintaining presence information of neighboring bluetooth devices |
US11074365B2 (en) | 2015-07-22 | 2021-07-27 | Ginko LLC | Event-based directory and contact management |
US11163905B2 (en) * | 2015-07-22 | 2021-11-02 | Ginko LLC | Contact management |
US11597519B2 (en) | 2017-10-17 | 2023-03-07 | The Boeing Company | Artificially intelligent flight crew systems and methods |
US11552966B2 (en) | 2020-09-25 | 2023-01-10 | International Business Machines Corporation | Generating and mutually maturing a knowledge corpus |
US11875798B2 (en) | 2021-05-03 | 2024-01-16 | International Business Machines Corporation | Profiles for enhanced speech recognition training |
Also Published As
Publication number | Publication date |
---|---|
US8990077B2 (en) | 2015-03-24 |
US20080082332A1 (en) | 2008-04-03 |
US20120284027A1 (en) | 2012-11-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8214208B2 (en) | | Method and system for sharing portable voice profiles |
US8050255B2 (en) | | Routing a VoIP call with contextual information |
US9703943B2 (en) | | Pre-authenticated calling for voice applications |
US7400575B2 (en) | | Method, system and service for achieving synchronous communication responsive to dynamic status |
US8270937B2 (en) | | Low-threat response service for mobile device users |
CA2555302C (en) | | Method and system of providing personal and business information |
US8483368B2 (en) | | Providing contextual information with a voicemail message |
CN101421728B (en) | | Mining data for services |
US20230275902A1 (en) | | Distributed identification in networked system |
US20070253407A1 (en) | | Enhanced VoIP services |
US8767718B2 (en) | | Conversation data accuracy confirmation |
US7747568B2 (en) | | Integrated user interface |
EP2367334B1 (en) | | Service authorizer |
AU2002347406A1 (en) | | Multi-modal messaging and callback with service authorizer and virtual customer database |
CN101422003B (en) | | Voip client information |
US20050071429A1 (en) | | System and method for mapping identity context to device context |
KR20090062436A (en) | | Apparatus and method for practicing foreign languages using telephone |
EP1708470A2 (en) | | Multi-modal callback system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: QTECH, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MALLET, JACQUELINE; VEMURI, SUNIL; MACHIRAJU, N. RAO; REEL/FRAME: 018793/0019. Effective date: 2007-01-04 |
| AS | Assignment | Owner name: REQALL, INC., CALIFORNIA. Free format text: CHANGE OF NAME; ASSIGNOR: QTECH, INC.; REEL/FRAME: 024131/0624. Effective date: 2008-10-17 |
| REMI | Maintenance fee reminder mailed | |
| LAPS | Lapse for failure to pay maintenance fees | |
| STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| FP | Lapsed due to failure to pay maintenance fee | Effective date: 2016-07-03 |