CN107077609B - Non-parametric model for detecting spatially distinct temporal patterns - Google Patents
- Publication number
- CN107077609B (application CN201580060090.6A)
- Authority
- CN
- China
- Prior art keywords
- spatio
- cluster
- temporal pattern
- clusters
- data points
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/28—Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/84—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
- G06V10/85—Markov-related models; Markov random fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/1914—Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries, e.g. user dictionaries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Physics (AREA)
- Operations Research (AREA)
- Algebra (AREA)
- Image Analysis (AREA)
Abstract
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Patent Application No. 62/076,319, filed on November 6, 2014, and entitled "NONPARAMETRIC MODEL FOR DETECTION OF SPATIALLY DIVERSE TEMPORAL PATTERNS," the disclosure of which is expressly incorporated herein by reference in its entirety.
BACKGROUND
Field
Certain aspects of the present disclosure relate generally to machine learning and, more particularly, to improved systems and methods for detecting spatially distinct temporal patterns.
Background
Mobile devices, such as cellular telephones or personal digital assistants (PDAs), have several functions, each of which can be activated by the user selecting a unique key sequence or using an on-screen menu. Given the limited number of controls that can be provided on a mobile device, accessing all of the features can become increasingly complex as the device's feature set grows.
More recently, some mobile devices have been designed to receive user input by recognizing user-controlled gestures. Some devices receive such gestures through a touch screen interface, while others are configured to receive them by acquiring images and applying computer vision approaches to track the user input. An important aspect of gesture recognition is the ability to recognize known patterns in the resulting trajectory data. However, the manner in which an input gesture is drawn or performed typically varies from user to user, or even from one drawing to the next by the same user. For example, slight differences may exist in the way different users draw a particular character (e.g., the number "2"). Because of these differences, recognizing patterns in trajectory data remains a significant challenge.
SUMMARY
In one aspect of the present disclosure, a computer-implemented method of generating a spatiotemporal pattern model for spatiotemporal pattern recognition is presented. The method includes receiving a plurality of training trajectories. Each of the training trajectories includes a plurality of distinct data points representing a spatiotemporal pattern. The received training trajectories define a region. The method also includes partitioning the region into a plurality of observed clusters and a non-observed complementary cluster. The method further includes generating the spatiotemporal pattern model to include the observed clusters and the non-observed complementary cluster.
In another aspect of the present disclosure, an apparatus for generating a spatiotemporal pattern model for spatiotemporal pattern recognition is presented. The apparatus includes a memory and at least one processor coupled to the memory. The processor(s) are configured to receive training trajectories. Each of the training trajectories includes distinct data points representing a spatiotemporal pattern. The received training trajectories define a region. The processor(s) are further configured to partition the region into observed clusters and a non-observed complementary cluster. The processor(s) are further configured to generate the spatiotemporal pattern model to include the observed clusters and the non-observed complementary cluster.
In yet another aspect of the present disclosure, an apparatus for generating a spatiotemporal pattern model for spatiotemporal pattern recognition is presented. The apparatus includes means for receiving training trajectories. Each of the training trajectories includes distinct data points representing a spatiotemporal pattern. The received training trajectories define a region. The apparatus also includes means for partitioning the region into observed clusters and a non-observed complementary cluster. The apparatus further includes means for generating the spatiotemporal pattern model to include the observed clusters and the non-observed complementary cluster.
According to yet another aspect of the present disclosure, a non-transitory computer-readable medium is presented. The non-transitory computer-readable medium has encoded thereon program code for generating a spatiotemporal pattern model for spatiotemporal pattern recognition. The program code is executed by a processor and includes program code to receive training trajectories. Each of the training trajectories includes distinct data points representing a spatiotemporal pattern. The received training trajectories define a region. The program code also includes program code to partition the region into observed clusters and a non-observed complementary cluster. The program code further includes program code to generate the spatiotemporal pattern model to include the observed clusters and the non-observed complementary cluster.
Additional features and advantages of the disclosure are described below. Those skilled in the art should appreciate that this disclosure may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present disclosure. Those skilled in the art should also realize that such equivalent constructions do not depart from the teachings of the disclosure as set forth in the appended claims. The novel features that are believed to be characteristic of the disclosure, both as to its organization and method of operation, together with further objects and advantages, will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
The features, nature, and advantages of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout.
FIGS. 1A and 1B illustrate the front and back, respectively, of a mobile platform.
FIG. 2 illustrates a mobile platform receiving alphanumeric user input.
FIG. 3 illustrates an example implementation of designing a neural network using a system-on-a-chip (SOC), including a general-purpose processor, in accordance with certain aspects of the present disclosure.
FIG. 4 illustrates an example implementation of a system in accordance with aspects of the present disclosure.
FIG. 5 is a diagram illustrating a spatial region partitioned according to a Dirichlet process in accordance with aspects of the present disclosure.
FIG. 6 is a diagram illustrating a spatial region partitioned according to a Pitman-Yor process in accordance with aspects of the present disclosure.
FIG. 7A is a diagram illustrating a set of training trajectories for the alphanumeric character "2" presented upside down.
FIG. 7B is a diagram of a Gaussian process covariance of the training trajectories of FIG. 7A.
FIG. 7C is a diagram illustrating another Gaussian process covariance of the training trajectories of FIG. 7A, with an increased length-scale compared to the length-scale used for FIG. 7B.
FIG. 7D is a three-dimensional (3D) representation of the Gaussian process covariance of FIG. 7C.
FIGS. 8A-C are diagrams illustrating spatial regions partitioned according to a Pitman-Yor process applied to the training trajectories of FIG. 7A in accordance with aspects of the present disclosure.
FIG. 9 is a graphical illustration of a method for recognizing spatiotemporal patterns.
FIG. 10 is a functional block diagram illustrating a mobile platform capable of receiving user input via a front-facing camera.
FIG. 11 is a flowchart illustrating a method for generating a spatiotemporal pattern model for spatiotemporal pattern recognition in accordance with aspects of the present disclosure.
FIG. 12 is a flowchart illustrating a process for generating a spatiotemporal pattern model in accordance with aspects of the present disclosure.
FIG. 13 is a flowchart illustrating a method for spatiotemporal pattern recognition in accordance with aspects of the present disclosure.
DETAILED DESCRIPTION
The detailed description set forth below, in connection with the appended drawings, is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details for the purpose of providing a thorough understanding of the various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring such concepts.
Based on the teachings, one skilled in the art should appreciate that the scope of the disclosure is intended to cover any aspect of the disclosure, whether implemented independently of or combined with any other aspect of the disclosure. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth. In addition, the scope of the disclosure is intended to cover such an apparatus or method practiced using other structure, functionality, or structure and functionality in addition to or other than the various aspects of the disclosure set forth. It should be understood that any aspect of the disclosure disclosed may be embodied by one or more elements of a claim.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any aspect described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects.
Although particular aspects are described herein, many variations and permutations of these aspects fall within the scope of the disclosure. Although some benefits and advantages of the preferred aspects are mentioned, the scope of the disclosure is not intended to be limited to particular benefits, uses, or objectives. Rather, aspects of the disclosure are intended to be broadly applicable to different technologies, system configurations, networks, and protocols, some of which are illustrated by way of example in the figures and in the following description of the preferred aspects. The detailed description and drawings are merely illustrative of the disclosure rather than limiting, the scope of the disclosure being defined by the appended claims and equivalents thereof.
Aspects of the present disclosure are directed to generating a model for spatiotemporal pattern recognition. That is, when detecting a temporal pattern within a continuously observed multidimensional variable, it is desirable to know whether, and by how much, the variable deviates from the pattern being evaluated. Temporal patterns may include alphanumeric characters, objects, speech, gestures, stock market activity, weather patterns, and other temporal or spatiotemporal patterns. The amount of variation may be considered before a temporal pattern is accepted or rejected. In other words, the spatial variation in a temporal pattern may be significant enough that an otherwise acceptable pattern is rejected, or vice versa. Accordingly, aspects of the present disclosure provide control, through the modeling process, over the amount of variation that is acceptable.
In accordance with aspects of the present disclosure, a non-parametric model may be used to detect learned temporal manifolds that exhibit spatial diversity. In some aspects, the model may be based on a stochastic process, such as the two-parameter Dirichlet process known as the Pitman-Yor process. For example, spatial diversity may be modeled by applying a covariance regression process (e.g., Gaussian process covariance regression) to the second parameter of the Pitman-Yor process. Given the sequence of components of the mixture model, a hidden Markov model may be applied to model the temporal dynamics of each manifold. This allows a given manifold to be evaluated and arbitrary sequences to be rejected, which is very important in applications where patterns are to be found within a continuous sequence of observations (e.g., temporal patterns within continuous hand movement).
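As a rough illustration of the two-parameter process named above, the following sketch samples cluster assignments from the Chinese restaurant construction of a Pitman-Yor process. It is a minimal sketch, not the model of the disclosure: the function name `pitman_yor_crp` and the parameter names `alpha` (concentration) and `discount` are illustrative assumptions, the discount is a fixed scalar, and no covariance regression is applied to it.

```python
import random


def pitman_yor_crp(n_points, alpha, discount, seed=0):
    """Sample cluster labels from a Pitman-Yor Chinese restaurant process.

    alpha    -- concentration parameter, must satisfy alpha > -discount
    discount -- discount parameter in [0, 1); 0 recovers the Dirichlet process
    (Both names are hypothetical; the source text does not name them.)
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if alpha <= -discount:
        raise ValueError("alpha must be greater than -discount")
    rng = random.Random(seed)
    counts = []  # counts[j] = number of points already in cluster j
    labels = []  # labels[i] = cluster index of point i
    for _ in range(n_points):
        k = len(counts)
        if k == 0:
            j = 0  # the first point always opens the first cluster
        else:
            # Existing cluster j has weight (counts[j] - discount);
            # opening a new cluster has weight (alpha + discount * k).
            weights = [c - discount for c in counts] + [alpha + discount * k]
            j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            counts.append(1)
        else:
            counts[j] += 1
        labels.append(j)
    return labels, counts


if __name__ == "__main__":
    labels, counts = pitman_yor_crp(200, alpha=1.0, discount=0.5)
    print(len(counts), "clusters for", len(labels), "points")
```

A larger discount tends to produce more, smaller clusters, which is one way to picture how the second parameter influences the partition.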
Temporal manifolds are used in many applications, including hand tracking, gesture recognition, and human action recognition. Conventional hidden Markov models (HMMs) have been used to detect temporal events in many applications, including speech and gesture recognition. Different versions of the HMM exist, including HMMs with discrete emissions and HMMs with continuous density emissions, the latter being used in applications where observations occur with spatial diversity.
Spatial data plays an important role in many applications, such as the recognition of trajectories for flow detection and gesture recognition. Some conventional methods for modeling spatial data include K-means and Gaussian mixture models (GMMs), in which data points are grouped together to define a set of clusters, each cluster representing one symbol of an alphabet. In a vocabulary of words (e.g., English words, spoken words, trajectories, or gestures), different sequences of symbols from the alphabet create different meaningful words (manifolds, gestures, and so on).
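The idea of clusters acting as alphabet symbols can be sketched with a plain K-means quantizer. This is a minimal illustration under stated assumptions: points are 2D tuples, and the helper names `kmeans`, `nearest`, and `to_symbols` are hypothetical and not taken from the source.

```python
import math
import random


def nearest(centers, point):
    """Return the index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: math.dist(centers[j], point))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on a list of (x, y) tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster becomes empty
                centers[j] = (
                    sum(x for x, _ in group) / len(group),
                    sum(y for _, y in group) / len(group),
                )
    return centers


def to_symbols(trajectory, centers):
    """Map each trajectory point to the index (symbol) of its cluster."""
    return [nearest(centers, p) for p in trajectory]


if __name__ == "__main__":
    training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers = kmeans(training, k=2)
    print(to_symbols([(0, 0), (1, 0), (10, 10), (11, 10)], centers))
```

The resulting symbol sequence is the kind of discrete "word" that a downstream sequence model can score.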
One problem in detecting spatiotemporal patterns involves rejecting movements that do not resemble any member of the vocabulary of learned patterns. This is especially important in speech and gesture recognition. For applications such as gesture and action recognition, a gesture or action is defined as a movement within a multidimensional space, and every part of the movement should be completed for it to be considered one of the learned patterns in the vocabulary. This means, for example, that if a trained pattern or movement looks like a circle in space, a curve that only partially resembles a circle (e.g., 60% of a circle) is not acceptable and is rejected. This is particularly important when detecting patterns within continuous movement, because arbitrary movements occur frequently and many of them may partially resemble some of the trained patterns.
Accordingly, aspects of the present disclosure provide a non-parametric model for localizing patterns and for rejecting observations and trajectories of movement that do not resemble any pattern in the vocabulary. The temporal dynamics of the patterns may be modeled with a hidden Markov model, and their spatial variation may be modeled with a Dirichlet process mixture (DPM) model with Gaussian emissions. The mixture of Gaussians accommodates an infinite set of observations, which suits spatially diverse patterns. The DPM may be used to cluster the observations. The component labels of the mixture may in turn be used in the HMM to model the temporal dynamics of each pattern.
Furthermore, the configurations herein are used to detect or reject a sequence. A clear and strong separation gap between the acceptance and rejection regions is therefore desirable. For example, it is desirable for data from the acceptance region to produce a likelihood that is significantly greater than the likelihood produced by data from the rejection region. For example, if accepted sequences have likelihoods of -100 and above and rejected sequences have likelihoods of -300 and below, there is enough separation to avoid confusion. However, if accepted sequences have likelihoods of -100 and above and rejected sequences have likelihoods of -115 and below, the gap is so small that some acceptable inputs may produce a likelihood slightly below -115 and thus be rejected.
Because a clear and strong separation gap is desirable, the model may be configured without using a continuous density emission HMM (CDHMM). In a CDHMM, the emission probabilities are represented by a mixture of probability density functions (such as Gaussians) and a vector of mixing coefficients. Observations far from the center of a density therefore produce a small likelihood, which reduces the HMM likelihood. The CDHMM makes a smooth transition from the acceptance region to the rejection region, so the separation between the two regions is vague and unclear. In contrast, in accordance with aspects of the present disclosure, the gap between the acceptance region and the rejection region may be widened by treating observations from the complementary cluster as observations that cause the HMM to produce a very small likelihood. Sequences that contain no data points from the complementary cluster have a larger likelihood.
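The separation gap described above can be illustrated with a small discrete-emission HMM in which one extra symbol stands for the complementary cluster and receives a very small emission probability in every state. This is only a sketch under assumptions: the three-state left-to-right topology, the value `EPS = 1e-12`, the 0.9 self-emission mass, and the names `forward_loglik` and `emission_row` are all hypothetical choices made for illustration, and scores are computed as log-likelihoods.

```python
import math

EPS = 1e-12  # assumed emission probability of the complementary-cluster symbol


def emission_row(own, n_observed, eps=EPS):
    """Emission distribution of one state over observed symbols plus the
    complementary symbol (last index). Most mass goes to the state's own symbol."""
    other = (1.0 - 0.9 - eps) / (n_observed - 1)
    row = [other] * n_observed + [eps]
    row[own] = 0.9
    return row


def forward_loglik(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs) for a discrete-emission HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)  # positive because every emission probability is positive
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Three observed clusters (symbols 0, 1, 2) and one complementary symbol (3).
COMPLEMENT = 3
start = [1.0, 0.0, 0.0]
trans = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
emit = [emission_row(i, 3) for i in range(3)]

if __name__ == "__main__":
    inside = [0, 0, 1, 1, 2, 2]            # stays within observed clusters
    outside = [0, 0, COMPLEMENT, 1, 2, 2]  # one point falls in the complement
    print(forward_loglik(inside, start, trans, emit))
    print(forward_loglik(outside, start, trans, emit))
```

A single point in the complementary cluster multiplies the sequence likelihood by roughly `EPS`, so the two log-likelihoods differ by a wide margin and a threshold placed between them separates accepted from rejected sequences.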
FIGS. 1A and 1B illustrate the front and back, respectively, of a mobile platform 100 configured to receive user input via a front-facing camera 110. The mobile platform 100 is illustrated as including a front-facing display 102, a speaker 104, and a microphone 106. The mobile platform 100 further includes a rear-facing camera 108 and the front-facing camera 110 for capturing images of the environment. The mobile platform 100 may further include a sensor system that includes sensors such as a proximity sensor, an accelerometer, a gyroscope, and a touch sensor/touch screen, which may be used to assist in determining the position and/or relative motion of the mobile platform 100 or the position of a touching finger on the screen.
As used herein, a mobile platform refers to any portable electronic device, such as a cellular or other wireless communication device, a personal communication system (PCS) device, a personal navigation device (PND), a personal information manager (PIM), a personal digital assistant (PDA), or another suitable mobile device. The mobile platform may be configured to receive wireless communication and/or navigation signals, such as navigation positioning signals. The mobile platform may include devices that communicate with a personal navigation device (PND), such as by short-range wireless, infrared, wireline, or other connection, regardless of whether satellite signal reception, assistance data reception, and/or position-related processing occurs at the device or at the PND. In some aspects, the mobile platform may also include electronic devices, including wireless communication devices, computers, laptops, tablet computers, head-mounted devices, wearable computers, and the like, that are capable of tracking a user-guided object, optically via a front-facing camera or by touch via a touch sensor, in order to recognize user input.
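FIG. 2 illustrates a top view of an exemplary mobile platform 100 receiving alphanumeric user input via a camera (see, e.g., the front-facing camera 110 of FIG. 1A). The mobile platform 100 uses its camera to capture a sequence of images of a user-guided object. In this configuration, the user-guided object is a fingertip 204 of a user 202. In other aspects, however, the user-guided object may be a writing implement, such as the user's entire finger, a stylus, a pen, a pencil, a brush, or another writing implement.
The mobile platform 100 captures a series or sequence of images and, in response, tracks the user-guided object (e.g., the fingertip 204) as the user 202 moves the fingertip 204 about a surface 200. In one configuration, the surface 200 is a planar surface that is separate from and external to the mobile platform 100. For example, the surface 200 may be a tabletop or desktop. In another configuration, the user 202 may simply move the fingertip 204 within the field of view of the mobile platform 100 without touching a surface (e.g., in open space) for tracking by the mobile platform 100. In this configuration, the input sequence may, for example, track movement of the user's fingertip 204 relative to the surface of the display 102. In yet another configuration, the surface 200 may be a touch screen, such as a touch-sensitive display 102, where input is indicated based on contact with the surface of the display. In this configuration, the input sequence may, for example, track the movement of the user's fingertip 204 along, and/or its contact with, the surface of the display 102.
The data from tracking the user-guided object may be analyzed by the mobile platform 100 to generate trajectory data. In one example, the trajectory data is a set of data points that are ordered in time and distinct in space. The mobile platform 100 may analyze all or part of the trajectory data to recognize various types of user input. For example, the trajectory data may indicate user input such as alphanumeric characters (e.g., letters, numbers, and symbols), gestures, and/or mouse or touch control input. In the example of FIG. 2, the user 202 is shown completing one or more strokes of an alphanumeric character 206 (e.g., the number "2") by guiding the fingertip 204 across the surface 200. By capturing a series of images, or by recording movement across the touch-sensitive display 102, as the user 202 draws the virtual number "2", the mobile platform 100 may track the fingertip 204 and then analyze the trajectory data to recognize the character input.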
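One possible in-memory representation of such trajectory data is sketched below. It is a minimal sketch under assumptions: the `TrajectoryPoint` type, the `build_trajectory` helper, and the `(t, x, y)` sample layout are hypothetical, and "spatially distinct" is interpreted here simply as dropping a sample whose position repeats that of the previous point.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class TrajectoryPoint:
    t: float  # timestamp of the tracked sample
    x: float  # horizontal position
    y: float  # vertical position


def build_trajectory(samples: Iterable[Tuple[float, float, float]]) -> List[TrajectoryPoint]:
    """Order raw (t, x, y) samples by time and drop consecutive samples
    that repeat the previous position."""
    trajectory: List[TrajectoryPoint] = []
    for t, x, y in sorted(samples, key=lambda s: s[0]):
        if trajectory and (trajectory[-1].x, trajectory[-1].y) == (x, y):
            continue  # same position as the previous point: not spatially distinct
        trajectory.append(TrajectoryPoint(t, x, y))
    return trajectory


if __name__ == "__main__":
    raw = [(2, 1, 1), (0, 0, 0), (1, 0, 0), (3, 2, 1)]
    for point in build_trajectory(raw):
        print(point)
```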
图3解说根据本公开的某些方面使用片上系统(SOC)300进行前述的生成用于时空模式识别的时空模式模型的示例实现,SOC 300可包括通用处理器(CPU)或多核通用处理器(CPU)302。变量(例如,神经信号和突触权重)、与计算设备(例如,带有权重的神经网络)相关联的系统参数、延迟、频率槽信息、以及任务信息可被存储在与神经处理单元(NPU)308相关联的存储器块中、与CPU 302相关联的存储器块中、与图形处理单元(GPU)104相关联的存储器块中、与数字信号处理器(DSP)306相关联的存储器块中、专用存储器块318中,或可跨多个块分布。在通用处理器302处执行的指令可从与CPU302相关联的程序存储器加载或可从专用存储器块318加载。3 illustrates an example implementation of the aforementioned generation of spatiotemporal pattern models for spatiotemporal pattern recognition using a system-on-chip (SOC) 300, which may include a general-purpose processor (CPU) or a multi-core general-purpose processor ( CPU) 302. Variables (eg, neural signals and synaptic weights), system parameters associated with a computing device (eg, a neural network with weights), delays, frequency bin information, and task information can be stored in conjunction with the neural processing unit (NPU). ) 308, in the memory block associated with the
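The SOC 300 may also include additional processing blocks tailored to specific functions, such as the GPU 304, the DSP 306, a connectivity block 310 (which may include fourth generation long term evolution (4G LTE) connectivity, unlicensed Wi-Fi connectivity, USB connectivity, Bluetooth connectivity, and the like), and a multimedia processor 312 that may, for example, detect and recognize gestures. In one implementation, the NPU is implemented in the CPU, DSP, and/or GPU. The SOC 300 may also include a sensor processor 314, image signal processors (ISPs), and/or navigation 320, which may include a global positioning system.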
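The SOC 300 may be based on an ARM instruction set. In an aspect of the present disclosure, the instructions loaded into the general-purpose processor 302 may include code for receiving training trajectories. Each of the training trajectories includes distinct data points representing a spatio-temporal pattern, and the received training trajectories define a region. The instructions loaded into the general-purpose processor 302 may also include code for partitioning the region into observed clusters and an unobserved complementary cluster. In addition, the instructions loaded into the general-purpose processor 302 may include code for generating the spatio-temporal pattern model to include the observed clusters and the unobserved complementary cluster.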
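FIG. 4 illustrates an example implementation of a system 400 in accordance with certain aspects of the present disclosure. As illustrated in FIG. 4, the system 400 may have multiple local processing units 402 that may perform various operations of the methods described herein. Each local processing unit 402 may include a local state memory 404 and a local parameter memory 406 that may store parameters of a machine learning model. In addition, the local processing unit 402 may have a local (e.g., neuron) model program (LMP) memory 408 for storing a local model program, a local learning program (LLP) memory 410 for storing a local learning program, and a local connection memory 412. Furthermore, as illustrated in FIG. 4, each local processing unit 402 may interface with a configuration processor unit 414 that provides configurations for the local memories of the local processing unit, and with a routing connection processing unit 416 that provides routing between the local processing units 402.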
FIG. 5 is a diagram illustrating a spatial region 502 partitioned according to a Dirichlet process. As shown in FIG. 5, the spatial region 502 has been partitioned into eight regions A1-A8.
According to aspects of the present disclosure, undesirable observations and sequences may be rejected. In some aspects, a Dirichlet process mixture model may be imposed over the infinite space of observations to define acceptance and rejection regions. A Dirichlet process may be used to define partitions of the space of observations by a space $\theta$ and a positive real number $\alpha$, such that for any finite measurable partition $A_1, A_2, \dots, A_K$ of $\theta$, $A_1 \cup A_2 \cup \dots \cup A_K = \theta$, and $G$ is a random probability measure over $\theta$ with $(G(A_1), G(A_2), \dots, G(A_K)) \sim \mathrm{Dirichlet}(\alpha H(A_1), \alpha H(A_2), \dots, \alpha H(A_K))$. Under this definition, the space of observations (e.g., a spatial region) may be divided into a number of regions, for example as shown in FIG. 5. If $G$ is distributed according to a Dirichlet process (DP) (e.g., $G \sim \mathrm{DP}(\alpha, H)$), then the draws from $G$ are $\theta_i$, where $\theta_i \mid G \sim G$, $i = 1, 2, \dots, N$, and the posterior of the Dirichlet process is given by:
By marginalizing out $G$, the predictive distribution is given by:
where $|\Theta|$ is the current number of partitions, $N_k$ is the number of observations at partition $k$, $N$ is the total number of observations, $\delta(\theta, \theta_k)$ is the delta function (e.g., the Kronecker delta function), and $\alpha$ is the parameter of the symmetric Dirichlet distribution.
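According to Equation 2, a new observation is assigned with some probability to any currently populated (non-empty) partition or cluster $k$. Alternatively, the new observation may be assigned with the remaining probability to a new, unpopulated (empty) partition. Thus, if $\alpha$ is small compared with the number of observations at the current partitions, it is more likely that the new observation is assigned to one of the current partitions rather than to a new partition.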
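For reference, the textbook forms of the Dirichlet process posterior and predictive distribution can be written with the symbols defined above. This is a sketch of the standard result, which Equations 1 and 2 are assumed (not confirmed) to match:

```latex
% Standard Dirichlet process posterior (assumed to correspond to Equation 1)
G \mid \theta_1, \dots, \theta_N \sim
  \mathrm{DP}\!\left(\alpha + N,\;
  \frac{\alpha H + \sum_{i=1}^{N} \delta_{\theta_i}}{\alpha + N}\right)

% Standard predictive distribution after marginalizing G
% (assumed to correspond to Equation 2)
\theta_{N+1} \mid \theta_1, \dots, \theta_N \sim
  \frac{1}{\alpha + N}
  \left(\alpha H + \sum_{k=1}^{|\Theta|} N_k \, \delta(\theta, \theta_k)\right)
```

Under this form, a new observation would join populated partition $k$ with probability $N_k / (\alpha + N)$ and open a new partition with probability $\alpha / (\alpha + N)$, which agrees with the remark that a small $\alpha$ favors the existing partitions.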
Applying a stochastic process (e.g., a Pitman-Yor process), the predictive probability distribution may be given by:
where $d$ is a parameter that controls the area of the regions or clusters.
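For orientation, the standard two-parameter Pitman-Yor predictive distribution, written with the same symbols as the Dirichlet process case, is sketched below. It is the textbook form and is only assumed to match Equation 3:

```latex
% Standard Pitman-Yor predictive distribution (assumed form of Equation 3)
\theta_{N+1} \mid \theta_1, \dots, \theta_N \sim
  \frac{\alpha + d\,|\Theta|}{\alpha + N}\, H
  + \sum_{k=1}^{|\Theta|} \frac{N_k - d}{\alpha + N}\, \delta(\theta, \theta_k)
```

Setting $d = 0$ recovers the Dirichlet process predictive distribution.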
In Equation 3, the parameter $d$ may, for example, be defined such that $0 \le d < 1$ and $\alpha > -d$. In this example, the parameter $d$ may control the number of regions or clusters that have one or very few observations (trajectories). That is, the larger the value of $d$, the more regions or clusters there are with a small number of observations (trajectories). In addition, the larger $d$, the fewer regions or clusters there are with a large number of observations, and the greater the number of observations (trajectories) within each such region.
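The effect of $d$ can be illustrated with a minimal sketch, assuming the standard Pitman-Yor predictive probabilities rather than the exact expression of Equation 3. The function name `pitman_yor_predictive` and its arguments are hypothetical and chosen only for this illustration.

```python
def pitman_yor_predictive(counts, alpha, d=0.0):
    """Assignment probabilities for the next observation.

    counts: observations per occupied cluster (each entry >= 1).
    Returns (probabilities of joining each occupied cluster,
             probability of opening a new cluster).
    Assumes the textbook Pitman-Yor form; d = 0 gives the Dirichlet process.
    """
    if not 0.0 <= d < 1.0:
        raise ValueError("d must satisfy 0 <= d < 1")
    if alpha <= -d:
        raise ValueError("alpha must satisfy alpha > -d")
    total = sum(counts)       # N, the total number of observations
    occupied = len(counts)    # |Theta|, the current number of clusters
    existing = [(n_k - d) / (alpha + total) for n_k in counts]
    new_cluster = (alpha + d * occupied) / (alpha + total)
    return existing, new_cluster
```

With the same cluster sizes, raising `d` moves probability mass from the occupied clusters to a new cluster, which is the behavior described above.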
In one exemplary aspect, spatially distributed data points may be modeled by Gaussian clusters. A stochastic process such as the Pitman-Yor process may be used to limit the extent of the Gaussian clusters. The set of training data points for a word in the vocabulary may be clustered with a Gaussian mixture model. The Pitman-Yor process (PYP) may be used to cluster the space into a finite number of clusters or regions. As such, the space of observations may be divided into a finite number of regions or clusters, some of which have training data points assigned to them, with the potential or ability to grow more clusters. Treating the infinite set of clusters with no assigned data points collectively as a single cluster, the Pitman-Yor process may be used to cluster the space into the following set:
where the members of the set are the trained Gaussian partitions (e.g., regions or clusters).
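FIG. 6 is a diagram illustrating a spatial region 602 partitioned according to a stochastic process such as the Pitman-Yor process. Referring to FIG. 6, the spatial region 602 has been partitioned into four observed regions A1, A2, A3, and A4 and one unobserved complementary region Acomplement. Although four observed regions are shown in FIG. 6, this is merely for ease of explanation and the present disclosure is not so limited. Any number of regions may be used to partition the observation space. In one configuration, the complementary region Acomplement collectively represents all unobserved partitions that may be initiated by the PYP of Equation 4.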
When data point $i+1$ is quite unlikely to have been generated by one of the trained components, the PYP of Equation 4 may be used to limit the extent of each Gaussian cluster and to evaluate the creation of a new cluster. The location of a new cluster initiated by the Pitman-Yor process may therefore be anywhere. In some aspects, the base distribution of Equation 4 may be of the Gaussian family because the mixture model is Gaussian.
Both the mean and the covariance of the Gaussians in the mixture model may be unknown and sampled from conjugate priors. Because the covariance matrix is positive definite (that is, $z^{T} \Lambda z$ is positive for every non-zero column vector $z$), its conjugate prior for the fixed-mean case has the inverse Wishart distribution $\Lambda \sim \mathcal{IW}(v, \Delta)$, which is the multidimensional analog of the inverse gamma-normal conjugate prior for one-dimensional Gaussian sampling. In some aspects, both the multidimensional mean and the covariance matrix are unknown. The appropriate prior for this case is therefore the normal-inverse-Wishart distribution, whose density is expressed as $\Lambda \sim \mathcal{IW}(v, \Delta)$ and a normal distribution on the mean conditioned on $\Lambda$:
where $v$ denotes the degrees of freedom and is generally chosen to be larger than the dimension of the data, $\Delta$ is a pseudo-covariance matrix whose data set has size $v$, and $k$ is the size of the pseudo-data set for the prior with expected mean $v$. The predictive likelihood of an observation $x^*$ may be distributed according to a Student-t distribution. The predictive likelihood may then be approximated by a normal distribution with the corresponding mean and covariance.
Accordingly, with the Pitman-Yor process and the appropriate distribution determined for the conjugate prior of the base normal distribution, the distribution may be sampled for inference. In one exemplary aspect, a Gibbs sampler procedure may be used for training and for inference from the PYP. The PYP likelihood models the emissions of a hidden Markov model (HMM), where the partition label of each cluster is treated as an observation. The PYP likelihood of an observed data point that does not belong to any partition with assigned data points allows the observations to be extended to the infinite set of partitions without any observations, which may be referred to as the complementary partition or region (e.g., Acomplement). Observations that are unlikely to come from an occupied partition may therefore be given the label of the complementary partition. The HMM is modified to accommodate these observations. Because there are no instances of such observations during the clustering of the vocabulary or the training of the HMM, the complementary partition (e.g., Acomplement) is added to the observation table of each HMM with a very small probability that is subtracted collectively from the other observations (this ensures that the emission matrix remains stochastic). For example, if $B_w$ is the emission matrix of the HMM for word $w$, the probability of an observation from the complementary partition is given by:
where the left-hand side is the adjusted emission probability of observation $O_k$ in state $s$ for word $w$, the complementary symbol denotes all observations from the complementary partition, and $|K_w|$ denotes the number of occupied partitions in the mixture model for word $w$. Although partitions may overlap between words, the PYP of each word is inferred separately for an observation sequence, and the partitions of each word's PYP are therefore the highest-value representation of the observations for that word according to the training data. Thus, the spatial variation of the data points at each partition is represented by the associated training data for that word.
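One way to picture the adjustment is the following minimal sketch. It is not the source's formula: the exact expression is not reproduced here, so the sketch swaps in proportional rescaling, which removes the small probability `epsilon` collectively from the occupied-partition columns while guaranteeing that no entry goes negative. The function name and the value of `epsilon` are hypothetical.

```python
def add_complement_column(emission, epsilon=1e-3):
    """Append a complement-partition column to a row-stochastic emission matrix.

    emission: list of rows, one per HMM state, each summing to 1 over the
    occupied partitions. Each row is scaled by (1 - epsilon) and the
    complement symbol receives probability epsilon, so rows still sum to 1.
    Assumption: proportional rescaling, not the source's exact adjustment.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    adjusted = []
    for row in emission:
        scaled = [p * (1.0 - epsilon) for p in row]
        adjusted.append(scaled + [epsilon])  # last column is the complement
    return adjusted
```

Because every row still sums to one, the adjusted matrix remains stochastic, which is the property the text requires.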
In some aspects, a Dirichlet process may probabilistically attract new members to occupied or populated clusters or regions. Therefore, when inferring on a new observation, the likelihood that a new partition is initiated and occupied by that observation may be very low. In other words, a Dirichlet process tends to produce many large partitions. However, it is desirable to exclude a data point from the set of occupied partitions if it is more likely to come from the complementary partition. Furthermore, because the data may vary from region to region, limiting all components of the mixture with the same limiting factor may be unreasonable. Thus, rather than limiting each mixture component equally, in some aspects the second parameter of the Pitman-Yor process (e.g., the parameter $d$) may be set according to the data, allowing the component covariances to control the extent of each component.
Because of the spatial nature of the data points, a spatial model may be used to provide the second parameter of the PYP (e.g., the parameter $d$). To this end, a model in which the spatial variation of the data is modeled by non-parametric covariance regression may be employed. Consider the conditional distribution of a multidimensional Gaussian variable given a set of Gaussian variables of the same dimension. If $x^*$ is a $d$-dimensional variable and $X$ denotes the set of Gaussian variables, the mean and covariance of the conditional distribution $p(x^* \mid X)$ are given by:
To avoid the computational burden of regression on high-dimensional data, in some aspects the mean and covariance of the data may be modeled by functions sampled from certain prior distributions. A Gaussian model may thus be created for the data in a potentially infinite-dimensional Gaussian space (e.g., $\mu(x_1), \dots, \mu(x_n) \sim \mathcal{N}\bigl((m(x_1), \dots, m(x_n)), K(x_1, \dots, x_n)\bigr)$), which is a Gaussian process (GP). It is reasonable to treat the data as stationary because the pattern is independent of the location of the observations. A stationary covariance function, such as the squared exponential, may therefore be used:
where $\tau$ is the magnitude and $l$ is the smoothness of the covariance function. In some aspects, only the covariance of the conditional distribution given the data points may be considered:
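For reference, a commonly used squared exponential covariance function is sketched below in terms of $\tau$ and $l$. Whether Equation 9 scales by $\tau$ or by $\tau^2$ is not recoverable from the text, so the $\tau^2$ convention here is an assumption:

```latex
% A common squared exponential kernel (assumed form of Equation 9)
K(x, x') = \tau^{2} \exp\!\left(-\frac{\lVert x - x' \rVert^{2}}{2\, l^{2}}\right)
```

In this form, $K(x, x) = \tau^2$, so choosing $\tau < 1$ bounds the kernel value below 1.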
$$\operatorname{cov}(x^* \mid X) = K(x^*, x^*) - K(x^*, X)^{T} \bigl(K(X, X) + \sigma^{2} I\bigr)^{-1} K(x^*, X) \tag{10}$$
In some configurations, there may be no outputs associated with the data points when the data is treated as representing a Gaussian process. In that case, the expectation of the GP at a data point $x^*$ is not meaningful. The covariance regression, however, may be dominated by the cost of inverting the term $(K(X, X) + \sigma^{2} I)$. It can nevertheless be computed reasonably fast, because this term is independent of the observation $x^*$ and can be stored. As such, the covariance regression procedure can also be fast.
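The point about storing the inverse can be sketched in a few lines of standard-library Python. This is an illustrative sketch of Equation 10, not the patented implementation: the kernel uses the $\tau^2$ squared exponential convention as an assumption, and the class name, helper names, and default values (including the noise term standing in for $\sigma^2$) are hypothetical.

```python
import math


def sq_exp_kernel(a, b, tau, length):
    """Squared exponential kernel (tau**2 convention is an assumption)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return tau ** 2 * math.exp(-sq_dist / (2.0 * length ** 2))


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


class CovarianceRegressor:
    """Evaluates cov(x*|X) of Equation 10 with a cached inverse."""

    def __init__(self, points, tau=0.99, length=0.0244, noise=1e-6):
        self.points = [tuple(p) for p in points]
        self.tau, self.length = tau, length
        gram = [[sq_exp_kernel(p, q, tau, length) + (noise if i == j else 0.0)
                 for j, q in enumerate(self.points)]
                for i, p in enumerate(self.points)]
        # (K(X, X) + sigma^2 I)^-1 does not depend on x*, so compute it once.
        self.gram_inv = invert(gram)

    def covariance(self, x_star):
        k_star = [sq_exp_kernel(x_star, p, self.tau, self.length)
                  for p in self.points]
        n = len(k_star)
        quad = sum(k_star[i] * self.gram_inv[i][j] * k_star[j]
                   for i in range(n) for j in range(n))
        return sq_exp_kernel(x_star, x_star, self.tau, self.length) - quad
```

Each call to `covariance` then costs only a kernel evaluation against the stored points and one quadratic form, rather than a fresh matrix inversion.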
In some aspects, the hyperparameters ($\tau$ and $l$) of the covariance function of Equation 9 may be set such that the regression is useful for the Pitman-Yor process. Because there are no outputs for the data points, the marginal likelihood of the outputs, given the data $X$ and the hyperparameters of the Gaussian process, cannot be maximized. Thus, in some aspects, heuristics may be used instead.
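For example, to make the parameter $d$ in Equation 3 obey the constraint $0 \le d < 1$, the magnitude $\tau$ of the covariance function may be set to a value less than but close to 1 (e.g., $\tau = 0.99$). In practice, this may not be sufficient to ensure that the regressed covariance is less than 1 everywhere. Alternatively, in some aspects, the magnitude $\tau$ may be set to a smaller value. The regressed values may also be scaled down equally for all data points.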
The smoothness parameter $l$ (also referred to herein as the length scale), on the other hand, may be set to an appropriate value representing how smoothly the data varies.
The covariance regression function in Equation 10 may be interpreted as the conditional likelihood of a mixture of basis functions, each centered on a data point from the given data set $X$. The variance of each basis function is then controlled by the length scale parameter $l$. To set $l$ appropriately, so that the mixture of basis functions neither overfits nor underfits the data, in some aspects the length scale parameter $l$ may be set equal to the average minimum distance between observations multiplied by a coefficient, for example as given by:
taken over all observations in $X$, where the coefficient $\eta$ may be used to adjust (e.g., expand or shrink) the area of the clusters.
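The length-scale heuristic described above can be sketched as follows. The sketch assumes Euclidean distance and takes "average minimum distance" to mean the mean, over all observations, of the distance to the nearest other observation; the function name `length_scale` is hypothetical.

```python
import math


def length_scale(points, eta=1.0):
    """Length scale l = eta * (mean nearest-neighbor distance in points).

    Assumes Euclidean distance between observations; eta expands (> 1)
    or shrinks (< 1) the resulting cluster area.
    """
    if len(points) < 2:
        raise ValueError("at least two observations are required")
    nearest = []
    for i, p in enumerate(points):
        nearest.append(min(math.dist(p, q)
                           for j, q in enumerate(points) if j != i))
    return eta * sum(nearest) / len(nearest)
```

Doubling `eta` doubles the length scale, which matches the two settings compared in FIGS. 7B and 7C (l = 0.0244 and l = 0.0488).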
The regression procedure produces values between 0 and 1 (excluding 1). For points close to or within the given data set $X$, the regressed covariance is very small, whereas for points far from the items in that data set it is large.
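FIGS. 7A-7D illustrate examples of the covariance regression of trajectories with different values of $\eta$. FIG. 7A is a diagram illustrating a set 702 of training trajectories (e.g., 704A-C) for the alphanumeric character "2," presented upside down. The training trajectories may be normalized to provide a training pattern that may be used for recognition. FIG. 7B is a plot of the Gaussian process covariance of the training trajectories of FIG. 7A. FIG. 7C is a diagram illustrating another Gaussian process covariance of the training trajectories of FIG. 7A, with an increased length scale (e.g., l = 0.0488) compared with the length scale used for FIG. 7B (e.g., l = 0.0244). FIG. 7D is a three-dimensional (3D) representation of the Gaussian process covariance of FIG. 7C.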
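FIGS. 7B and 7C show the value of $1 - \operatorname{cov}(x^* \mid X)$ for a given set $X$. From these results, it can be seen that the larger the coefficient $\eta$, the larger $l$ becomes (because the average minimum distance it multiplies is positive) and the smoother the coverage of the mixture over the set $X$. In effect, the edges of the mixture coverage are also controlled by the hyperparameter $l$. Referring to FIG. 7B, at the beginning of the trajectory of the digit "2," the model has two disjoint branches 708 and 710 with a gap between them. FIG. 7B also shows empty regions 712 between separate samples. However, a Dirichlet process with a Gaussian base distribution may be used to anticipate and account for future variations. In some aspects, the second parameter (the parameter $d$) may be set to the GP covariance of a sample multiplied by an adjustable coefficient. This may give the PYP a good hint for including and anticipating variations while the extent of each distribution remains controlled. As indicated above, a small parameter $d$ in Equation 3 may result in a higher probability that a given data point is assigned to an occupied component (e.g., a region or cluster). Conversely, a higher $d$ may allow the process to generate a new component if the data point is not sufficiently likely to have been generated by one of the occupied components. Accordingly, in some aspects, the GP covariance at a given data point $x^*$ may be used as $d$ for the Pitman-Yor process. As such, if $x^*$ is far enough from the occupied components, the process may assign a new component sampled from the base distribution $h(\theta)$.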
FIGS. 8A-8C show examples of a Pitman-Yor process applied to trained models based on the trajectories of FIG. 7A. FIG. 8A is a diagram illustrating a spatial region partitioned according to a Pitman-Yor process applied to a trained model based on the trajectories of FIG. 7A, where the parameter $d$ is set to 0. Referring to FIG. 8A, a Dirichlet process is applied based on the set of training trajectories of FIG. 7A to produce a clustered space. The clusters A1-A7 are shown along with Acomplement. FIG. 8B is a diagram illustrating a spatial region partitioned according to a Pitman-Yor process applied to a trained model based on the trajectories of FIG. 7A, where the parameter $d$ is set based on the Gaussian process covariance (e.g., a GP covariance with length scale l = 0.0244). As shown in FIG. 8B, the areas of the regions A1-A7 are relaxed, such that the pattern is still detectable but false positives may result. FIG. 8C is a diagram illustrating a spatial region partitioned according to a Pitman-Yor process applied to a trained model based on the training trajectories of FIG. 7A, where the parameter $d$ is multiplied by a coefficient greater than 1. As shown in FIG. 8C, the areas of the clusters defining the pattern are relaxed so much that the digit 2 is no longer discernible.
FIG. 9 is a graphical representation of a model for spatio-temporal pattern recognition. In the model of FIG. 9, a Dirichlet process mixture model (DPM) is shown along with the parameters $\alpha$ and $d$ and the base distribution $H$. A hidden Markov model receives a sequence of observations from the DPM, which is the sequence of component labels extracted by the DPM.
In some aspects, the correctness of a recognized pattern may be verified. Verifying the correctness of a recognized pattern in a given sequence of data points includes verifying that all regions of the pattern are properly satisfied. For verification, most of the partitions in the trained model are to be satisfied along the given trajectory. That is, a pattern is satisfied if it has data points along the trajectory that the PYP assigns to the clusters of the given pattern. In some aspects, the HMM may further identify that the sequence in which the trajectory satisfies the partitions is in the correct order.
Because the PYP assigns the data points of a given sequence to clusters, the number of data points assigned to each cluster is known. It is also known exactly how many data points the PYP gives to new clusters, meaning that they belong to the complementary partition, which collectively represents the area not covered by the trained clusters. A counting scheme is used to ensure that each partition receives some minimum number of assigned data points and that the number of points assigned to the complementary partition stays below a tolerable threshold. The number of partitions meeting these criteria is taken as a measure of whether the given sequence correctly satisfies all regions of the trained model.
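The counting scheme can be illustrated with a short sketch. The label for the complementary partition, the two thresholds, and the function name are all hypothetical; the source does not specify concrete values.

```python
from collections import Counter

COMPLEMENT = "complement"  # hypothetical label for the complementary partition


def count_satisfied(labels, trained_clusters, min_points=2, max_complement=3):
    """Count trained clusters that received enough data points.

    labels: cluster label assigned (e.g., by the PYP) to each data point
    of the sequence, in order.
    Returns (number of satisfied clusters, whether the complement count
    stays within the tolerated limit). Thresholds are illustrative only.
    """
    counts = Counter(labels)
    satisfied = sum(1 for c in trained_clusters if counts[c] >= min_points)
    complement_ok = counts[COMPLEMENT] <= max_complement
    return satisfied, complement_ok
```

A caller might accept a sequence only when `complement_ok` is true and the satisfied count covers most of the trained clusters, in line with the criterion described above.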
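FIG. 10 is a functional block diagram illustrating a mobile platform 1000 configured to recognize temporal patterns. The mobile platform 1000 is one possible implementation of the mobile platform 100 of FIGS. 1A and 1B. The mobile platform 1000 includes a camera 1002 and a user interface 1006. The user interface 1006 includes a display 1026, which may be configured to display preview images captured by the camera 1002 as well as alphanumeric characters, as described above. The user interface 1006 may also include a keypad 1028 through which the user can input information into the mobile platform 1000. If desired, the keypad 1028 may be eliminated by utilizing the camera 1002, as described above. In addition, to give the user multiple ways to provide spatio-temporal patterns, in some aspects the mobile platform 1000 may include a touch sensor for receiving touch gesture input via the display 1026. The user interface 1006 may also include a microphone 1030 and a speaker 1032 (e.g., if the mobile platform is a cellular telephone).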
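The mobile platform 1000 includes a tracking unit 1018 configured to perform object-guided tracking. In one example, the tracking unit 1018 is configured to track the movement of an object (e.g., a fingertip, stylus, writing implement, or other object), as discussed above, to generate trajectory data.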
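The mobile platform 1000 also includes a control unit 1004 that is connected to and communicates with the camera 1002 and the user interface 1006, along with other components such as the tracking unit 1018 and a gesture recognition unit 1022. The gesture recognition unit 1022 accepts and processes the trajectory data received from the tracking unit 1018 to recognize user input as symbols and/or gestures. The control unit 1004 may be provided by a processor 1008 and associated memory 1014, hardware 1010, software 1016, and firmware 1012.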
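If desired, the control unit 1004 may also include a graphics engine 1024, which may be, for example, a gaming engine, for rendering desired AR data in the display 1026. The tracking unit 1018 and the gesture recognition unit 1022 are illustrated separately, and separately from the processor 1008, for clarity, but they may be a single unit and/or may be implemented in the processor 1008 based on instructions in the software 1016 running in the processor 1008. The processor 1008, as well as one or more of the tracking unit 1018, the gesture recognition unit 1022, and the graphics engine 1024, can, but need not necessarily, include one or more microprocessors, embedded processors, controllers, application specific integrated circuits (ASICs), advanced digital signal processors (ADSPs), and the like. The term processor describes the functions implemented by the system rather than specific hardware. Moreover, as used herein, the term "memory" refers to any type of computer storage medium, including long term, short term, or other memory associated with the mobile platform 1000, and is not limited to any particular type of memory, number of memories, or type of media upon which memory is stored.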
In one configuration, a machine learning model is configured for receiving training trajectories, where the received training trajectories define a region. The model is also configured for partitioning the region into observed clusters and an unobserved complementary cluster. The model is further configured for generating a spatio-temporal pattern model to include the observed clusters and the unobserved complementary cluster. The model includes receiving means, partitioning means, and/or generating means. In one aspect, the receiving means, partitioning means, and/or generating means may be the general-purpose processor 302, program memory associated with the general-purpose processor 302, the memory block 318, the local processing units 402, and/or the routing connection processing units 416 configured to perform the functions recited. In another configuration, the receiving means, partitioning means, and/or generating means may be implemented via the processor 1008, hardware 1010, firmware 1012, and/or software 1016. In another configuration, the aforementioned means may be any module or any apparatus configured to perform the functions recited by the aforementioned means.
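The processes described herein may be implemented by various means depending on the application. For example, these processes may be implemented in hardware 1010, firmware 1012, software 1016, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof.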
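According to certain aspects of the present disclosure, each local processing unit 402 may be configured to determine parameters of the model based on one or more desired functional features of the model, and to develop the one or more functional features toward the desired functional features as the determined parameters are further adapted, tuned, and updated.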
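FIG. 11 illustrates a method 1100 for generating a spatio-temporal pattern model for spatio-temporal pattern recognition in accordance with aspects of the present disclosure. At block 1102, the process receives training trajectories. Each of the training trajectories may include distinct data points representing a spatio-temporal pattern. The spatio-temporal pattern may, for example, include camera input or an input gesture representing at least one of an alphanumeric character, a symbol, or a mouse/touch control. The received training trajectories define a region.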
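At block 1104, the process partitions the region into observed clusters and an unobserved complementary cluster. Furthermore, at block 1106, the process generates the spatio-temporal pattern model to include the observed clusters and the unobserved complementary cluster.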
The region may be partitioned by applying a stochastic process such as, for example, a two-parameter Pitman-Yor process. In some aspects, a covariance regression (such as a Gaussian process covariance regression) may be performed on two or more of the data points included in the training trajectories and may, in turn, be used to determine one or more stochastic process parameters.
The stochastic process may be used to determine which cluster (including the unobserved complementary cluster) corresponds to each data point of a given trajectory. The extent of each of the observed clusters may also be determined based on the one or more stochastic process parameters. In addition, in some aspects, the spatio-temporal pattern model may be generated by creating a hidden Markov model (HMM) based on the stochastic process.
Furthermore, the process may modify the observation table of the HMM to include the unobserved complementary cluster. In still other aspects, the process may identify a received trajectory as a spatio-temporal match when the likelihood of the hidden Markov model is above a predetermined threshold.
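FIG. 12 is a flowchart illustrating an exemplary process 1200 for generating a model for spatio-temporal pattern recognition in accordance with aspects of the present disclosure. At block 1202, the process receives training trajectories (e.g., the training trajectories 704A-C of FIG. 7A). Each of the training trajectories includes spatially distinct data points representing an input gesture. At block 1204, a Gaussian process (e.g., FIG. 7C) is then applied to the training trajectories, and at optional process block 1206, the Gaussian process is used to determine a stochastic process parameter (e.g., the parameter $d$ of the Pitman-Yor process). As noted, process block 1206 is optional and may be omitted when training the model. Thus, in one example, the Gaussian process covariance regression is used only for recognition and not for training. A stochastic process (e.g., the Pitman-Yor process) is then applied in process block 1208 to partition the spatial region of the trajectories into observed regions and an unobserved complementary region. The extent (e.g., size) of each of the observed regions may be based on the stochastic process parameter described above. At block 1210, the process generates a model for spatio-temporal pattern recognition that uses the observed regions and the unobserved complementary region to determine a pattern match.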
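FIG. 13 is a flowchart illustrating a process 1300 for spatio-temporal pattern recognition. At block 1302, the process receives a trajectory. The received trajectory may include data points. The data points of the trajectory may relate to an input gesture, stock market related data, speech, weather data, or other spatio-temporal data.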
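At block 1304, the process evaluates the received trajectory to determine which cluster of the trained spatio-temporal pattern model (e.g., an observed cluster or the complementary cluster) each data point of the received trajectory falls into. When a data point falls into an observed cluster, the process assigns a label. When a data point is in the complementary cluster, the data point may not be acceptable. According to aspects of the present disclosure, however, some discrepancy may be tolerated.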
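At block 1306, the process finds the data points in each cluster and in the complementary cluster. The assigned labels for each cluster, including the complementary cluster, may be provided to the corresponding hidden Markov model to determine a likelihood. By taking the data points of the complementary cluster into account, the likelihood produced by the HMM may be reduced.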
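Each HMM may output a likelihood value that may be compared with a threshold at block 1308. If the output is above the threshold, the received trajectory may be identified as a spatio-temporal match at block 1310. Otherwise, at block 1312, the received trajectory is not considered a match.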
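The likelihood-and-threshold step can be sketched with a scaled forward algorithm for a discrete HMM. This is a generic illustration rather than the patented implementation: the function names, the use of log-likelihoods, and the threshold value are assumptions, and the observation symbols are taken to be integer cluster labels with the last index reserved for the complementary cluster.

```python
import math


def forward_log_likelihood(observations, initial, transition, emission):
    """Log-likelihood of a label sequence under a discrete HMM.

    observations: non-empty sequence of integer cluster labels.
    initial[s], transition[r][s], emission[s][o]: HMM parameters.
    Uses the scaled forward recursion to avoid numerical underflow.
    """
    n = len(initial)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n)]
    log_like = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [sum(alpha[r] * transition[r][s] for r in range(n))
                     * emission[s][observations[t]] for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this HMM
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_like


def is_match(observations, initial, transition, emission, log_threshold):
    """Declare a match when the log-likelihood exceeds the threshold."""
    return forward_log_likelihood(
        observations, initial, transition, emission) > log_threshold
```

With a left-to-right HMM trained on one stroke order, a sequence that visits the clusters in the reverse order, or that contains many complement labels, yields a much lower likelihood and falls below the threshold.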
In some aspects, the order of the observations may also be evaluated. That is, in some aspects, an HMM may be trained with the correct order of observations and may be used to evaluate spatio-temporal patterns. For example, if an HMM is trained using the correct order for drawing the digit 2 and the digit 2 is then drawn in reverse order, the likelihood generated by that HMM may be very small and may thus indicate that the input is not a match.
The various operations of the methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to, a circuit, an application specific integrated circuit (ASIC), or a processor. Generally, where there are operations illustrated in the figures, those operations may have corresponding counterpart means-plus-function components with similar numbering.
As used herein, the term "determining" encompasses a wide variety of actions. For example, "determining" may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database, or another data structure), ascertaining, and the like. Additionally, "determining" may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory), and the like. Furthermore, "determining" may include resolving, selecting, choosing, establishing, and the like.
As used herein, a phrase referring to "at least one of" a list of items refers to any combination of those items, including single members. As an example, "at least one of: a, b, or c" is intended to cover: a, b, c, a-b, a-c, b-c, and a-b-c.
The various illustrative logical blocks, modules, and circuits described in connection with the present disclosure may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any commercially available processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the present disclosure may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in any form of storage medium that is known in the art. Some examples of storage media that may be used include random access memory (RAM), read only memory (ROM), flash memory, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, and so forth. A software module may comprise a single instruction or many instructions, and may be distributed over several different code segments, among different programs, and across multiple storage media. A storage medium may be coupled to a processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor.
The methods disclosed herein comprise one or more steps or actions for achieving the described method. The method steps and/or actions may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is specified, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims.
The functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in hardware, an example hardware configuration may comprise a processing system in a device. The processing system may be implemented with a bus architecture. The bus may include any number of interconnecting buses and bridges depending on the specific application of the processing system and the overall design constraints. The bus may link together various circuits including a processor, machine-readable media, and a bus interface. The bus interface may be used to connect, among other things, a network adapter to the processing system via the bus. The network adapter may be used to implement signal processing functions. For certain aspects, a user interface (e.g., keypad, display, mouse, joystick, etc.) may also be connected to the bus. The bus may also link various other circuits, such as timing sources, peripherals, voltage regulators, power management circuits, and the like, which are well known in the art and therefore will not be described any further.
The processor may be responsible for managing the bus and general processing, including the execution of software stored on the machine-readable media. The processor may be implemented with one or more general-purpose and/or special-purpose processors. Examples include microprocessors, microcontrollers, DSP processors, and other circuitry that can execute software. Software shall be construed broadly to mean instructions, data, or any combination thereof, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Machine-readable media may include, by way of example, random access memory (RAM), flash memory, read only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, magnetic disks, optical disks, hard drives, or any other suitable storage medium, or any combination thereof. The machine-readable media may be embodied in a computer program product. The computer program product may comprise packaging materials.
In a hardware implementation, the machine-readable media may be a part of the processing system separate from the processor. However, as those skilled in the art will readily appreciate, the machine-readable media, or any portion thereof, may be external to the processing system. By way of example, the machine-readable media may include a transmission line, a carrier wave modulated by data, and/or a computer product separate from the device, all of which may be accessed by the processor through the bus interface. Alternatively or additionally, the machine-readable media, or any portion thereof, may be integrated into the processor, as may be the case with a cache and/or a general register file. Although the various components discussed may be described as having a specific location, such as a local component, they may also be configured in various ways, such as certain components being configured as part of a distributed computing system.
The processing system may be configured as a general-purpose processing system with one or more microprocessors providing the processor functionality and external memory providing at least a portion of the machine-readable media, all linked together with other supporting circuitry through an external bus architecture. Alternatively, the processing system may include one or more neuromorphic processors for implementing the neuron models and models of neural systems described herein. As another alternative, the processing system may be implemented with an application-specific integrated circuit (ASIC) that has the processor, the bus interface, the user interface, supporting circuitry, and at least a portion of the machine-readable media integrated into a single chip, or with one or more field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), controllers, state machines, gated logic, discrete hardware components, or any other suitable circuitry, or any combination of circuits that can perform the various functionality described throughout this disclosure. Those skilled in the art will recognize how best to implement the functionality described for the processing system, depending on the particular application and the overall design constraints imposed on the overall system.
The machine-readable media may include a number of software modules. The software modules include instructions that, when executed by the processor, cause the processing system to perform various functions. The software modules may include a transmission module and a receiving module. Each software module may reside in a single storage device or be distributed across multiple storage devices. By way of example, a software module may be loaded into RAM from a hard drive when a triggering event occurs. During execution of the software module, the processor may load some of the instructions into cache to increase access speed. One or more cache lines may then be loaded into a general register file for execution by the processor. When the functionality of a software module is referred to below, it will be understood that such functionality is implemented by the processor when it executes instructions from that software module. Furthermore, it should be appreciated that aspects of the present disclosure result in improvements to the functioning of the processor, computer, machine, or other system implementing such aspects.
If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared (IR), radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), and floppy disk, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Thus, in some aspects, computer-readable media may include non-transitory computer-readable media (e.g., tangible media). In addition, for other aspects, computer-readable media may include transitory computer-readable media (e.g., a signal). Combinations of the above should also be included within the scope of computer-readable media.
Thus, certain aspects may include a computer program product for performing the operations presented herein. For example, such a computer program product may include a computer-readable medium having instructions stored (and/or encoded) thereon, the instructions being executable by one or more processors to perform the operations described herein. For certain aspects, the computer program product may include packaging materials.
Further, it should be appreciated that modules and/or other appropriate means for performing the methods and techniques described herein can be downloaded and/or otherwise obtained by a user terminal and/or base station, as applicable. For example, such a device can be coupled to a server to facilitate the transfer of means for performing the methods described herein. Alternatively, the various methods described herein can be provided via storage means (e.g., RAM, ROM, or a physical storage medium such as a compact disc (CD) or floppy disk), such that a user terminal and/or base station can obtain the various methods upon coupling or providing the storage means to the device. Moreover, any other suitable technique for providing the methods and techniques described herein to a device can be utilized.
It is to be understood that the claims are not limited to the precise configuration and components illustrated above. Various modifications, changes, and variations may be made in the arrangement, operation, and details of the methods and apparatus described above without departing from the scope of the claims.
Claims (24)
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462076319P | 2014-11-06 | 2014-11-06 | |
US62/076,319 | 2014-11-06 | ||
US14/933,976 | 2015-11-05 | ||
US14/933,976 US9898689B2 (en) | 2014-11-06 | 2015-11-05 | Nonparametric model for detection of spatially diverse temporal patterns |
PCT/US2015/059475 WO2016073856A1 (en) | 2014-11-06 | 2015-11-06 | Nonparametric model for detection of spatially diverse temporal patterns |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107077609A CN107077609A (en) | 2017-08-18 |
CN107077609B true CN107077609B (en) | 2020-08-07 |
Family
ID=54548296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580060090.6A Active CN107077609B (en) | 2014-11-06 | 2015-11-06 | Non-parametric model for detecting spatially distinct temporal patterns |
Country Status (4)
Country | Link |
---|---|
US (1) | US9898689B2 (en) |
EP (1) | EP3215981B1 (en) |
CN (1) | CN107077609B (en) |
WO (1) | WO2016073856A1 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10599988B2 (en) | 2016-03-02 | 2020-03-24 | D-Wave Systems Inc. | Systems and methods for analog processing of problem graphs having arbitrary size and/or connectivity |
CN108334897B (en) * | 2018-01-22 | 2023-04-07 | Shanghai Maritime University | Offshore floater track prediction method based on self-adaptive Gaussian mixture model |
US11593174B2 (en) | 2018-10-16 | 2023-02-28 | D-Wave Systems Inc. | Systems and methods for scheduling programs for dedicated execution on a quantum processor |
CN113544711B (en) | 2019-01-17 | 2024-08-02 | D-Wave Systems Inc. | Hybrid algorithm system and method for using cluster shrinkage |
US11593695B2 (en) | 2019-03-26 | 2023-02-28 | D-Wave Systems Inc. | Systems and methods for hybrid analog and digital processing of a computational problem using mean fields |
US11416553B2 (en) * | 2019-03-28 | 2022-08-16 | Amazon Technologies, Inc. | Spatial indexing |
US11436217B2 (en) | 2019-03-28 | 2022-09-06 | Amazon Technologies, Inc. | Ordered append-only log based data storage |
JP2020190940A (en) * | 2019-05-22 | 2020-11-26 | Lenovo (Singapore) Pte. Ltd. | Information processor, control method, and program |
US11714730B2 (en) | 2019-08-20 | 2023-08-01 | D-Wave Systems Inc. | Systems and methods for high availability, failover and load balancing of heterogeneous resources |
CN112988527A (en) * | 2019-12-13 | 2021-06-18 | China Telecom Corporation Limited | GPU management platform anomaly detection method and device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101437124A (en) * | 2008-12-17 | 2009-05-20 | Samsung Electronics (China) R&D Center | Method for processing dynamic gesture identification signal facing (to)television set control |
CN101763515A (en) * | 2009-09-23 | 2010-06-30 | Institute of Automation, Chinese Academy of Sciences | Real-time gesture interaction method based on computer vision |
CN102592112A (en) * | 2011-12-20 | 2012-07-18 | Sichuan Changhong Electric Co., Ltd. | Method for determining gesture moving direction based on hidden Markov model |
CN102971701A (en) * | 2010-06-17 | 2013-03-13 | Qualcomm Incorporated | Methods and apparatus for contactless gesture recognition and power reduction |
CN103902984A (en) * | 2014-04-15 | 2014-07-02 | Fuzhou University | Improved HMM training algorithm for dynamic hand gesture recognition |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050285937A1 (en) | 2004-06-28 | 2005-12-29 | Porikli Fatih M | Unusual event detection in a video using object and frame features |
US7263472B2 (en) | 2004-06-28 | 2007-08-28 | Mitsubishi Electric Research Laboratories, Inc. | Hidden markov model based object tracking and similarity metrics |
US8904312B2 (en) | 2006-11-09 | 2014-12-02 | Navisense | Method and device for touchless signing and recognition |
EP2469496A1 (en) | 2010-12-23 | 2012-06-27 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Concept for encoding data defining coded positions representing a trajectory of an object |
US9122931B2 (en) | 2013-10-25 | 2015-09-01 | TCL Research America Inc. | Object identification system and method |
US9818203B2 (en) | 2014-04-08 | 2017-11-14 | Alcatel-Lucent Usa Inc. | Methods and apparatuses for monitoring objects of interest in area with activity maps |
-
2015
- 2015-11-05 US US14/933,976 patent/US9898689B2/en active Active
- 2015-11-06 EP EP15795312.6A patent/EP3215981B1/en active Active
- 2015-11-06 WO PCT/US2015/059475 patent/WO2016073856A1/en active Application Filing
- 2015-11-06 CN CN201580060090.6A patent/CN107077609B/en active Active
Non-Patent Citations (1)
Title |
---|
"Improving Hand Gesture Recognition Using 3D Combined Features"; Mahmoud Elmezain et al.; 2009 Second International Conference on Machine Vision; 20091228; pp. 128-132 * |
Also Published As
Publication number | Publication date |
---|---|
EP3215981B1 (en) | 2024-03-13 |
US9898689B2 (en) | 2018-02-20 |
CN107077609A (en) | 2017-08-18 |
WO2016073856A1 (en) | 2016-05-12 |
EP3215981A1 (en) | 2017-09-13 |
US20160132753A1 (en) | 2016-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107077609B (en) | Non-parametric model for detecting spatially distinct temporal patterns | |
CN102985897B (en) | Efficient gesture processes | |
US9746929B2 (en) | Gesture recognition using gesture elements | |
CN107209861A (en) | Use the data-optimized multi-class multimedia data classification of negative | |
JP7290730B2 (en) | Sentence generation method and device, electronic device and program | |
CN106575150A (en) | Identifying gestures using motion data | |
Yin et al. | A high-performance training-free approach for hand gesture recognition with accelerometer | |
Huynh-The et al. | Hierarchical topic modeling with pose-transition feature for action recognition using 3D skeleton data | |
Vatavu | The impact of motion dimensionality and bit cardinality on the design of 3D gesture recognizers | |
CN105549885A (en) | Method and device for recognizing user emotion during screen sliding operation | |
Misra et al. | Development of a hierarchical dynamic keyboard character recognition system using trajectory features and scale-invariant holistic modeling of characters | |
Choudhury et al. | A CNN-LSTM based ensemble framework for in-air handwritten Assamese character recognition | |
Wu et al. | A visual attention-based method to address the midas touch problem existing in gesture-based interaction | |
Singh et al. | A Temporal Convolutional Network for modeling raw 3D sequences and air-writing recognition | |
Kratz et al. | Making gestural input from arm-worn inertial sensors more practical | |
Li et al. | Cross-people mobile-phone based airwriting character recognition | |
Bai et al. | Dynamic hand gesture recognition based on depth information | |
Jian et al. | RD-Hand: a real-time regression-based detector for dynamic hand gesture | |
Jian et al. | Mobile terminal trajectory recognition based on improved LSTM model | |
Krishnaveni et al. | Classifier fusion based on Bayes aggregation method for Indian sign language datasets | |
Al-Khamees et al. | An evolving fuzzy model to determine an optimal number of data stream clusters | |
Glodek et al. | Hidden Markov models with graph densities for action recognition | |
Kasaei | Interactive Open-Ended Learning for 3D Object Recognition | |
Lee et al. | A recurrent neural network with non-gesture rejection model for recognizing gestures with smartphone sensors | |
Ghodke et al. | Air Writing using Deep Learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||