US9275353B2 - Event-processing operators - Google Patents
- Publication number: US9275353B2
- Authority
- US
- United States
- Prior art keywords
- event
- stream
- events
- processor
- classifications
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Definitions
- the present disclosure relates generally to event processing systems.
- embodiments of inventive matter disclosed herein provide operators useful in implementing event processors.
- an event processor collects streams of events, processes the events, and generates output related to the processing of the events.
- the generated output can include, for example, notifications to interested parties or a stream of events.
- An event is typically defined as “a change in state.” For example, when a consumer purchases an automobile, the automobile's state changes from “for sale” to “sold.” A system architecture for an automobile dealership may treat this state change as an event to be detected, produced, published, and consumed by various applications within the architecture.
- Event processors are frequently implemented as a network of operators and are designed to operate in real time.
- Designers of event processors have designed and implemented event processors in a variety of different programming languages.
- event processors can be implemented as a network of CQL operators.
- CQL is a Continuous Query Language for registering continuous queries against streams and updateable relations.
- Embodiments disclosed herein provide for novel event processing operators that can be advantageously utilized in implementing event processors and event processing systems.
- Event processing operators disclosed herein may be implemented in software only, hardware only, or a combination of hardware and software.
- an operator for event filtering by clustering receives a stream of events. For each event in the received stream of events, the operator determines whether the event is associated with one of a plurality of event clusters. For each event determined to not be associated with any of the plurality of event clusters, the operator places an alert event into an output stream. The operator periodically redefines the plurality of event clusters based on a subset of the events in the received stream. In this manner, the operator can compute clusters from historical data (i.e., events previously received) and then filter out events that fall outside of all the computed clusters.
- an operator for partitioning events by classification classifies received events in accordance with a generalized linear model comprising a predetermined distribution function, a linear predictor, and a predetermined link function.
- the operator periodically receives classified events from a stream of model input and, based on these already classified events, estimates unknown parameters in the linear predictor.
- the operator determines, based on the generalized linear model, whether the event has a probability equal to or greater than a predetermined threshold of being associated with one of a plurality of classifications.
- the operator places the event in an output stream along with a mark indicating the classification to which the event is associated.
- an operator for event abstraction by hypothesis testing is provided. This operator periodically identifies a subset of events in a received stream of events. The operator determines whether the subset of events fails a hypothesis about a statistical distribution of events in the received stream. Whenever the operator determines that the subset of events fails the hypothesis, the operator places an alert event into an output stream.
- an operator for event filtering by point estimation receives a stream of events. For each event in the received stream of events, the operator evaluates a predicate comprising predetermined parameters. For each event for which the predicate evaluates to true, the operator places an alert event into an output stream. Periodically, the operator estimates values for the predetermined parameters based on a subset of the events in the received stream.
- an implementer of an event processing system can include, in the event processing system, event processors that include one or more of the operators described herein.
- systems described herein can execute event processors defined as a network of operators including one or more of the operators described herein.
- inventive matter disclosed herein may be embodied strictly as a software program, as software and hardware, or as hardware alone.
- the features disclosed herein may be employed in workstations and other computerized devices and software systems for such devices such as those manufactured by SUN Microsystems, Inc., of Santa Clara, Calif.
- inventive matter disclosed herein can be advantageously utilized in developing JBI components such as an Intelligent Event Processor (“IEP”).
- FIG. 1 is a block diagram of a computer environment illustrating an example architecture of a respective computer system useful for implementing an event processor according to embodiments disclosed herein.
- FIG. 2 illustrates procedures performable by an event processor to implement an operator for event filtering by clustering in accordance with embodiments disclosed herein.
- FIG. 3 illustrates a flowchart of a particular embodiment of the operator for event filtering by clustering implemented by the procedures of FIG. 2 .
- FIG. 4 illustrates procedures performable by an event processor to implement an operator for partitioning events by classification in accordance with embodiments disclosed herein.
- FIG. 5 illustrates a flowchart of a particular embodiment of the operator for partitioning events by classification implemented by the procedures of FIG. 4 .
- FIG. 6 illustrates procedures performable by an event processor to implement an operator for event abstraction by hypothesis testing in accordance with embodiments disclosed herein.
- FIG. 7 illustrates a flowchart of a particular embodiment of the operator for event abstraction by hypothesis testing implemented by the procedures of FIG. 6 .
- FIG. 8 illustrates procedures performable by an event processor to implement an operator for event filtering by point estimation in accordance with embodiments disclosed herein.
- FIG. 9 illustrates a flowchart of a particular embodiment of the operator for event filtering by point estimation implemented by the procedures of FIG. 8 .
- FIG. 1 is a block diagram of a computing environment 100 illustrating an example architecture of a respective computer system 110 useful for implementing an Event-Processor application 120 according to embodiments disclosed herein.
- Computer system 110 can be a computerized device such as a personal computer, workstation, portable computing device, console, network terminal, processing device, etc.
- computer system 110 of the present example includes an interconnect 111 , such as a data bus or other circuitry, that couples a memory system 112 , a processor 113 , I/O interface 114 , and a communications interface 115 .
- An input device 130 (e.g., one or more user/developer-controlled devices such as a keyboard, mouse, touchpad, trackball, etc.) couples to the processor 113 through the I/O interface 114 and enables a user 140 , such as a developer of an event processor or an administrator of an event processor, to provide input commands and generally interact with a graphical user interface that the Event-Processor application 120 and the Event-Processor process 122 provide on a display 150 .
- I/O interface 114 potentially provides connectivity to peripheral devices such as the input device 130 , display screen 150 , storage device 160 , etc.
- the computer environment 100 includes a storage device 160 that can be used for storing one or more files 162 .
- the files 162 may contain, for example, programming constructs such as those constructs embodying previously designed operators or previously designed event processors. Additionally, the files 162 may contain, for example, data such as event streams comprising input to or output from executing event processors.
- Communications interface 115 enables computer system 110 to communicate with network 170 over the communication link 180 to retrieve and transmit information from remotely located sources if necessary.
- the computer system 110 may be communicatively connected via the communication link 180 to other computer systems on the network 170 that can also execute Event-Processor applications.
- Event-Processors performing operators, such as those described herein, can communicate with each other.
- the Event-Processor process 122 may produce an output stream that is sent via the communication link 180 and becomes an input stream to a second Event-Processor process executing on the network 170 .
- an output stream of an Event-Processor process executing on the network may be sent via the communications link 180 to the Event-Processor process 122 and become an input stream to the Event-Processor process 122 .
- memory system 112 can be any type of computer-readable medium and in this example is encoded with Event-Processor application 120 that supports functionality as herein described, such as operators described herein.
- Event-Processor application 120 can be embodied as computer software code such as data and/or logic instructions (e.g., code stored in the memory or on another computer-readable medium such as a disk) that supports processing functionality according to different embodiments described herein.
- processor 113 accesses the memory system 112 via the interconnect 111 in order to launch, run, execute, interpret, or otherwise perform the logic instructions of the Event-Processor application 120 . Execution of the Event-Processor application 120 produces processing functionality in an Event-Processor process 122 .
- Event-Processor process 122 represents one or more portions of the Event-Processor application 120 performing within or upon the processor 113 in the computer system 110 .
- the computer system 110 can include other processes and/or software and hardware components, such as an operating system that controls allocation and use of hardware resources.
- embodiments herein also include the Event-Processor application 120 itself (i.e., the un-executed or non-performing logic instructions and/or data).
- the Event-Processor application 120 may be stored on a computer-readable medium such as a floppy disk, hard disk, or in an optical medium.
- the Event-Processor application 120 can also be stored in a memory type system such as in firmware, read-only memory (ROM), or, as in this example, as executable code within the memory system 112 (e.g., within Random Access Memory or RAM).
- embodiments disclosed herein include logic encoded in one or more tangible media for execution and, when executed, operable to perform methods and processes disclosed herein.
- Such logic may be embodied strictly as computer software (e.g., a set of computer programming instructions), as computer software and hardware, or as hardware alone.
- computer system 110 (e.g., Event-Processor application 120 and/or Event-Processor process 122) generally performs the methods and procedures shown in FIGS. 2-9.
- other systems can be configured to provide similar functionality.
- FIG. 2 illustrates procedures 200 performable by an event processor, such as implemented by the Event-Processor application 120 , in accordance with embodiments disclosed herein.
- the procedures 200 implement an operator for event filtering by clustering.
- the procedures 200 comprise Step 210 , Step 220 , Step 230 , and Step 240 , which the event processor is not required to execute sequentially.
- the event processor may execute Step 210 continuously throughout the execution of the procedures 200 and execute Step 220 each time an event is received in the stream of events.
- the event processor receives a stream of events.
- the stream of events is an input stream to the operator for event filtering by clustering.
- the event processor determines whether an event is associated with one of a plurality of event clusters.
- the event processor performs this determination for each event in the received stream of events.
- the event processor can employ techniques known in the art of clustering for determining whether an event is associated with one of a plurality of clusters.
- the event processor determines whether an event is within the radius of one of the clusters according to a predetermined distance function for determining the distance between an event and the centroid of a cluster. That is, if the distance between the event and the centroid of a cluster, as determined by the predetermined distance function, is less than or equal to the radius of the cluster, then the event is determined to be associated with that cluster.
- a user may choose the distance function to be employed by this operator for event filtering by clustering.
- a user may create an event processor that implements both a first operator for event filtering by clustering that employs a first distance function and a second operator for event filtering by clustering that employs a second distance function.
- In Step 230, the event processor places an alert event into an output stream for each event determined to not be associated with any of the plurality of clusters.
- the event processor compares an event with each cluster to determine whether the event is associated with any cluster. If the event processor determines that the event is not associated with any cluster, then the event processor places an alert event in an output stream.
- An alert event as used in embodiments described herein may be, for example, an actual event that caused an event processor to put the alert event in an output stream.
- the event processor may place in an output stream an event that the event processor has determined is not associated with any cluster.
- the event processor may place an alert event in an output stream that causes a notification or alert to be sent to an administrator of the event processor or a user of the event processor.
- a user of an event processor may be a person or the user may be another event processor. That is, the output stream of one event processor containing alert events may be an input stream to a second event processor that processes alert events.
- In Step 240, the event processor redefines the plurality of clusters based on a subset of the events in the received stream of events.
- the event processor performs Step 240 periodically. By periodically it is meant herein that the event processor will process a specific number of events from the input stream before performing Step 240 . Once Step 240 is performed, the event processor will not perform Step 240 until another specific number of events has been processed. Since the event processor performs Step 220 for each event in the received stream of events, the event processor performs Step 220 the same specific number of times (i.e., once for each event received) before redefining the plurality of clusters in Step 240 . This specific number is referred to herein as a sample interval. In particular embodiments, an implementer of an operator for event filtering by clustering may specify the sample interval.
- the event processor may employ clustering algorithms known in the art to perform the redefining of clusters in Step 240 .
- the event processor may employ a K-means algorithm to redefine the plurality of clusters.
- the K-means algorithm is one of the simplest unsupervised learning algorithms that solve the clustering problem.
- the K-means algorithm comprises the following steps:
- the cluster redefining is based on a subset of the events in the received stream.
- the number of events in the subset is referred to herein as the sample size. Similar to the sample interval, in particular embodiments, an implementer of an operator for event filtering by clustering may specify the sample size.
- the subset of events comprises the most recently received events. For example, if a user has specified a sample interval of 500 and a sample size of 250, then each time 500 events have been received the event processor can redefine the clusters using the 250 most recently received events.
- a user has implemented an operator for event filtering by clustering to have an input stream of events and an output stream of events, wherein each event in the input stream has two attributes X and Y.
- the user has chosen a sample interval of 10 and a sample size of 4.
- the user has chosen a distance function $d = \sqrt{d_1^2 + d_2^2}$, wherein $d_1$ is a function for determining the distance between the X attribute of a first event or centroid and the X attribute of a second event or centroid, and $d_2$ is the corresponding function for the Y attribute.
- for two events or centroids A and B, the distance between A and B is $d(A, B) = \sqrt{d_1(A, B)^2 + d_2(A, B)^2}$.
- the event processor redefines the plurality of event clusters based on a subset of the events in the received stream of events.
- the event processor will also redefine the plurality of event clusters after processing the 20th event, the 30th event, the 40th event, etc.
- Table 1 shows the subset of the events in the input stream to be used in redefining the plurality of event clusters after the event processor processes the 10th event.
- the 7th event is designated as A, the 8th event as B, the 9th event as C, and the 10th event as D.
- Table 1:
  Event   Attributes (X, Y)
  A       (5, 3)
  B       (−1, 1)
  C       (1, −2)
  D       (−3, −2)
- the event processor initializes the clusters.
- the event processor may initialize the clusters using techniques known in the art.
- the event processor creates two initial clusters: (AB) and (CD). That is, A and B are both assigned to the cluster (AB) and both C and D are assigned to the cluster (CD).
- the centroid of (AB) has coordinates (2, 2).
- the centroid of (CD) has coordinates (−1, −2).
- the centroid of the new cluster (A) is (5, 3) and the centroid of the new cluster (BCD) is (−1, −1). Therefore, the new distance evaluations are as follows:
- the new cluster (A) is centered at (5, 3) with radius mean of 0 and radius standard deviation of 0.
- the new cluster (BCD) is centered at (−1, −1) with radius mean (“r-mean”) of 2.2 and radius standard deviation (“r-std”) of 0.17. It should be noted here that this example has been simplified in that each event in the subset is assigned to a cluster after the redefining of the clusters is performed. However, other algorithms may allow for one or more events in the subset of events to be an outlier and not be assigned to any particular cluster.
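The cluster redefinition in this example can be reproduced with a short sketch. It assumes the common textbook form of K-means (assign each point to the nearest centroid, recompute centroids, repeat until the assignment stops changing) and the Euclidean distance over the X and Y attributes; the function names `dist`, `centroid`, `kmeans`, and `radius_mean` are illustrative and do not come from the source. The radius standard deviation is left out because the text does not say which estimator produces the value 0.17.

```python
import math

def dist(a, b):
    # Euclidean distance over the X and Y attributes
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans(points, initial_clusters, max_iter=100):
    """Refine an initial partition until the assignment no longer changes."""
    clusters = [list(c) for c in initial_clusters]
    for _ in range(max_iter):
        centers = [centroid(c) for c in clusters]
        reassigned = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            reassigned[nearest].append(p)
        reassigned = [c for c in reassigned if c]  # drop clusters that lost all members
        if reassigned == clusters:
            break
        clusters = reassigned
    return clusters

def radius_mean(cluster):
    # Mean distance from the cluster centroid to its members
    c = centroid(cluster)
    return sum(dist(p, c) for p in cluster) / len(cluster)

A, B, C, D = (5, 3), (-1, 1), (1, -2), (-3, -2)
clusters = kmeans([A, B, C, D], [[A, B], [C, D]])
```

Starting from the initial clusters (AB) and (CD), the sketch ends with (A) and (BCD), and the radius mean of (BCD) rounds to 2.2.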
- the event processor receives the 11th event and it has coordinates of (10, 6).
- the event processor compares this event with each cluster to determine whether the event is associated with any of the clusters.
- the method of determining whether an event is associated with a cluster is generally predetermined. For example, a user, when creating the operator, can specify a function to apply to determine whether an event is associated with a cluster. In this example, an event is associated with a cluster if the distance from the event to the centroid of the cluster is within ((3 × r-std) + r-mean). In this example, the 11th event is too distant from each of the two clusters to be associated with either cluster, so the event processor places an alert event indicating this result into an output stream.
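The association test for the 11th event can be checked numerically. The sketch below is only an illustration: it takes the cluster statistics stated in the example (centroid (−1, −1) for cluster (BCD), r-mean 2.2, r-std 0.17), reads "within" as less than or equal to, and uses the hypothetical names `is_associated` and `filter_event`.

```python
import math

# Cluster statistics as stated in the worked example
clusters = [
    {"name": "A", "centroid": (5.0, 3.0), "r_mean": 0.0, "r_std": 0.0},
    {"name": "BCD", "centroid": (-1.0, -1.0), "r_mean": 2.2, "r_std": 0.17},
]

def is_associated(event, cluster, k=3.0):
    # Associated if the distance to the centroid is within (k * r-std) + r-mean
    d = math.dist(event, cluster["centroid"])
    return d <= k * cluster["r_std"] + cluster["r_mean"]

def filter_event(event, clusters):
    """Return the event as an alert if it belongs to no cluster, else None."""
    if any(is_associated(event, c) for c in clusters):
        return None
    return event

alert = filter_event((10, 6), clusters)      # far from both clusters
no_alert = filter_event((-1, 0), clusters)   # distance 1.0 from the (BCD) centroid
```

The event (10, 6) lies about 5.8 from the centroid of (A) and about 13.0 from the centroid of (BCD), while the thresholds are 0 and 2.71, so it is returned as an alert.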
- FIG. 3 illustrates a flowchart 300 of a particular embodiment of the operator implemented by the procedures 200 of FIG. 2 .
- In Step 302, an event processor embodying the operator receives an event.
- the event processor increments an event counter that keeps track of how many events have been received since the last time the event counter was reset.
- In Step 306, the event processor determines whether the received event is associated with any one of a plurality of clusters. If the received event is not associated with any cluster, then the event processor proceeds to Step 308.
- In Step 308, the event processor places the received event (or other alert event) into an output stream.
- After Step 308, the event processor proceeds to Step 310. If the event processor determines, in Step 306, that the received event is associated with a cluster, then the event processor proceeds directly to Step 310.
- In Step 310, the event processor determines whether the event counter is equal to the sample interval. If the event counter is not equal to the sample interval, the event processor proceeds to Step 302 and receives another event. If the event counter is equal to the sample interval, the event processor proceeds to Step 312.
- In Step 312, the event processor identifies a subset of received events. In particular embodiments, the event processor will identify a sample-size number of the most recently received events as the subset. In Step 314, the event processor redefines the event clusters based on the subset of events identified in Step 312. In Step 316, the event processor resets the event counter. After completing Step 316, the event processor returns to Step 302.
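The control flow of flowchart 300 can be sketched as a generator. This is one possible reading, not the patented implementation: the association test and the redefinition routine are passed in as callables, and the one-dimensional helpers `in_cluster` and `one_cluster` are hypothetical stand-ins chosen only to keep the example small.

```python
from collections import deque

def filter_by_clustering(events, clusters, is_associated, redefine,
                         sample_interval, sample_size):
    """Yield alert events; redefine the clusters every sample_interval events."""
    recent = deque(maxlen=sample_size)   # most recently received events
    counter = 0
    for event in events:                                         # Step 302
        counter += 1                                             # event counter
        recent.append(event)
        if not any(is_associated(event, c) for c in clusters):   # Step 306
            yield event                                          # Step 308
        if counter == sample_interval:                           # Step 310
            sample = list(recent)                                # Step 312
            clusters = redefine(sample)                          # Step 314
            counter = 0                                          # Step 316

# Hypothetical 1-D stand-ins: a cluster is a (center, radius) pair
def in_cluster(value, cluster):
    center, radius = cluster
    return abs(value - center) <= radius

def one_cluster(sample):
    center = sum(sample) / len(sample)
    radius = max(abs(v - center) for v in sample)
    return [(center, radius)]

alerts = list(filter_by_clustering(
    [1, 2, 9, 10, 9, 11, 2], [(2.0, 2.0)], in_cluster, one_cluster,
    sample_interval=4, sample_size=2))
```

With a sample interval of 4 and a sample size of 2, the cluster is rebuilt from the events 9 and 10 after the fourth event, so the later 9 passes while 11 and 2 raise alerts.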
- FIG. 4 illustrates procedures 400 performable by an event processor, such as implemented by the Event-Processor application 120 , in accordance with embodiments disclosed herein.
- the procedures 400 implement an operator for event partitioning by classification.
- the procedures 400 comprise Step 410 , Step 420 , Step 430 , Step 440 , Step 450 , and Step 460 , which the event processor is not required to execute sequentially.
- the event processor may execute Step 410 continuously throughout the execution of the procedures 400 and execute Step 420 each time an event is received in the stream of events.
- the event processor receives a first stream of events.
- this first stream of events is an input stream to the operator for event partitioning by classification.
- the event processor determines whether an event has a probability equal to or greater than a predetermined threshold of being associated with one of a plurality of classifications in accordance with a generalized linear model comprising a predetermined distribution function, a linear predictor, and a predetermined link function. The event processor performs this determination for each received event in the first stream of events.
- For each event determined, in Step 420, to have a probability equal to or greater than the predetermined threshold of being associated with one of the plurality of classifications, the event processor, in Step 430, places the event in an output stream along with a mark indicating the classification to which the event is associated. For each event determined, in Step 420, to not have a probability equal to or greater than the predetermined threshold of being associated with one of the plurality of classifications, the event processor, in Step 440, places the event into an output stream along with a mark indicating that the event is not associated with one of the plurality of classifications.
- the term mark as used herein should be broadly interpreted to include any kind of indication of which classification the event is associated with or whether the event could not be classified. For example, if the event could not be classified, the event processor may mark the event as “unassociated” or “unassigned.”
- By “places the event into an output stream” it is not herein meant that the event processor must necessarily place the exact event or even an identical copy of the event into the output stream. It is sufficient that the event processor include information in the placed event sufficient to identify the processed event that caused the placed event to be placed in the output stream. For example, the event processor may “place the event into an output stream” by placing an alert event into the output stream.
- In Step 450, the event processor receives a second stream of events. Each event in this second stream of events is marked as being associated with one of the plurality of classifications.
- In Step 460, the event processor estimates unknown parameters in the linear predictor.
- the event processor performs this estimating based on a sample-size number of events in the second stream of events.
- the event processor performs Step 460 periodically based on a sample interval.
- the sample interval can be chosen by a user at the time the event processor is set up to implement an operator for event partitioning by classification.
- a generalized linear model (“GLM”) is a known generalization of ordinary least squares regression.
- a GLM relates the random distribution of a measured variable of the experiment (i.e., the distribution function) to the systematic (i.e., non-random) portion of the experiment (i.e., the linear predictor) through a function called the link function.
- each outcome of the dependent variable Y is assumed to be generated from a particular distribution function in the exponential family (i.e., a large range of probability distributions).
- the unknown parameters β are typically estimated with maximum likelihood, quasi-maximum likelihood, or Bayesian techniques.
- a GLM consists of three elements: a distribution function, a linear predictor, and a link function.
- the linear predictor is the quantity which incorporates the information about the independent variables into the model.
- the linear predictor is related to the expected value of the data through the link function.
- the linear predictor η is expressed as a linear combination of the unknown parameters β.
- the coefficients of the linear combination are represented as the matrix of independent variables X.
- η can be expressed as Xβ.
- the elements of X are either measured or stipulated in the modeling design process.
- the link function provides the relationship between the linear predictor and the mean of the distribution function.
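For reference, the relationship among the three elements can be written out in conventional GLM notation. Here η denotes the linear predictor and β the unknown parameters; the symbols μ (mean of the distribution function) and g (link function), and the choice of the logit link as the illustration, are standard textbook conventions assumed here rather than notation taken from the source.

```latex
\begin{align}
  \eta &= X\beta, \\
  g(\mu) &= \eta, \qquad \mu = \mathbb{E}[Y] = g^{-1}(\eta), \\
  \text{logit link:}\quad g(\mu) &= \ln\frac{\mu}{1-\mu}
  \;\;\Longrightarrow\;\; \mu = \frac{1}{1 + e^{-X\beta}}.
\end{align}
```

With a binary distribution function and the logit link, μ is the probability that an event belongs to a given classification, which is the quantity compared against the threshold.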
- the designer may choose a list of classifications or categories, a distribution function, a link function, a sample interval, a sample size, and a threshold.
- Example distribution functions include binary, binomial, multinomial, and Poisson.
- the designer of the event processor may choose a custom designed distribution function.
- Example link functions include log, logit, probit, comploglog, and logloglink.
- the designer of the event processor may choose a custom designed link function.
- the operator implemented by the event processor has two input streams: a data input stream and a model input stream.
- a sample-size number of events from the model input stream is used to estimate unknown parameters in the linear predictor. This estimation can be performed using maximum likelihood or other estimation techniques, such as quasi-maximum likelihood, or Bayesian techniques. These estimates are then used in classifying events from the data input stream that can be classified with a probability greater than the predetermined threshold. After classifying a sample-interval number of events from the data input, the operator once again uses the next sample-size number of events from the model input to re-estimate the unknown parameters in the linear predictor.
- Let S1 be a stream of vehicle events.
- An event e in S1 has the form (Weight, TS), where TS is the timestamp when the event is collected.
- Let S3 be a stream of previously processed (not necessarily by this event processor) vehicle events.
- An event e in S3 has the form (Gas-performance, Weight, TS), where Gas-performance can have a value of either “good” or “poor.”
- Let F be the event partitioning operator.
- S2 consists of those events from S1 that are marked either as “good,” “poor,” or “unassigned.”
- The classifications are (c1, c2) = (“good”, “poor”).
- An example configuration of F is as follows:
- the operator F computes a regression model from this set of data. For this example, we will use Maximum Likelihood Estimation (MLE) to estimate the unknown parameters in the model.
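A minimal sketch of this estimation and classification step follows, assuming a binary distribution function with a logit link and a single predictor (Weight). The model events, the weights in tonnes, the threshold of 0.8, and the names `fit_logit` and `classify` are all hypothetical; the maximum likelihood estimate is computed with Newton-Raphson iterations, which is one common way to do it and not necessarily the method used in the source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(weights, labels, iterations=25):
    """Newton-Raphson MLE for P(good | weight) = sigmoid(b0 + b1 * weight)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(weights, labels):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # negative Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def classify(weight, b0, b1, threshold=0.8):
    """Mark an event 'good', 'poor', or 'unassigned' against the threshold."""
    p_good = sigmoid(b0 + b1 * weight)
    if p_good >= threshold:
        return "good"
    if 1.0 - p_good >= threshold:
        return "poor"
    return "unassigned"

# Hypothetical model-input events (Gas-performance, Weight in tonnes)
model_events = [("good", 1.0), ("good", 1.2), ("good", 1.4), ("poor", 1.6),
                ("good", 1.8), ("poor", 2.0), ("poor", 2.2), ("poor", 2.4)]
b0, b1 = fit_logit([w for _, w in model_events],
                   [1 if g == "good" else 0 for g, _ in model_events])
marks = [classify(w, b0, b1) for w in (1.0, 1.7, 2.4)]
```

The sample data deliberately overlaps (a poor vehicle at 1.6 and a good one at 1.8) so that the maximum likelihood estimate is finite. An event near the middle of the weight range falls below the threshold for both classifications and is marked "unassigned".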
- FIG. 5 illustrates a flowchart 500 of a particular embodiment of the operator implemented by the procedures 400 of FIG. 4 .
- an event processor embodying the operator receives an event from a first stream (i.e., the data input stream).
- the event processor classifies the event according to a GLM, marks the event accordingly, and places the marked event (e.g., an alert event) into the output stream.
- the event processor increments an event counter that keeps track of how many events have been received since the last time the event counter was reset.
- In Step 508, the event processor determines whether the event counter is equal to the sample interval. If the event counter is not equal to the sample interval, the event processor proceeds to Step 502 and receives another event. If the event counter is equal to the sample interval, the event processor proceeds to Step 510.
- In Step 510, the event processor receives a sample-size number of events from a second stream (i.e., the model input stream). In Step 512, the event processor uses these events from the second stream to estimate unknown parameters in the GLM. In Step 514, the event processor resets the event counter. After completing Step 514, the event processor returns to Step 502.
- FIG. 6 illustrates procedures 600 performable by an event processor, such as implemented by the Event-Processor application 120 , in accordance with embodiments disclosed herein.
- the procedures 600 implement an operator for event abstraction by hypothesis testing. Generally, this operator takes a sample of events from an input stream, tests the sample of events against a statistical hypothesis, and generates a higher-level event if the sample fails the hypothesis.
- the designer may choose an input stream, an output stream, a statistical distribution D of events in the input stream, a hypothesis about the parameters in D, a sample interval, and a sample size. Once this operator is deployed, the operator takes samples of events according to the sample interval and sample size and tests the hypothesis about the statistical distribution D with each sample. The operator places a new event into the output stream if a sample of events fails (rejects) the hypothesis about D.
- the procedures 600 comprise Step 610 , Step 620 , Step 630 , and Step 640 , which the event processor is not required to execute sequentially.
- the event processor may execute Step 610 continuously throughout the execution of the procedures 600 .
- the event processor receives a stream of events.
- the stream of events is an input stream to the operator for event abstraction by hypothesis testing.
- the event processor identifies a subset of events (i.e., the sample) in the received stream.
- the number of identified events in the subset of events is equal to the sample size.
- the sample will consist of the most recently received events.
- the operator does not require the sample to consist of only the most recently received events.
- the event processor will typically identify the 500 most recently received events as the sample.
- the sample size is not required to be smaller than the sample interval. For example, if the sample interval is 200 and the sample size is 500, the event processor may test the most recent 500 events after every group of 200 events is received.
- the event processor will test a sample consisting of the 301 st event through the 800 th event after the 800 th event is received.
- the event processor will perform the next test after the 1000 th event is received and the events in the sample will be the 501 st event through the 1000 th event.
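The overlapping-window arithmetic in this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `sample_windows` is hypothetical, and it assumes that no test is run until a full sample of `sample_size` events has been received, which the text does not state.

```python
def sample_windows(sample_interval, sample_size, total_events):
    """Yield (first, last) 1-based positions of the events in each tested sample.

    Assumption (not from the source): tests are skipped until at least
    sample_size events have arrived.
    """
    for last in range(sample_interval, total_events + 1, sample_interval):
        if last >= sample_size:
            # The sample is the sample_size most recently received events.
            yield (last - sample_size + 1, last)


# Sample interval 200, sample size 500, first 1000 events of the stream.
windows = list(sample_windows(200, 500, 1000))
print(windows)
```

With these settings the last two windows are events 301 through 800 and events 501 through 1000, matching the example above.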
- In Step 630, the event processor determines whether the subset of events fails a hypothesis about a statistical distribution of events in the received stream.
- In Step 640, the event processor places an alert event into an output stream when the subset of events fails the hypothesis.
- S 2 be a stream of alerts.
- the configuration of this example operator F is as follows:
- FIG. 7 illustrates a flowchart 700 of a particular embodiment of the operator implemented by the procedures 600 of FIG. 6 .
- an event processor embodying the operator receives an event from a first stream (i.e., the data input stream).
- the event processor increments an event counter.
- the event processor determines whether the event counter is equal to the sample interval. If the event counter is not equal to the sample interval, the event processor returns to Step 702 and receives another event. If the event counter is equal to the sample interval, the event processor proceeds to Step 708.
- In Step 708, the event processor identifies a subset of received events to be used as the sample. Upon completion of Step 708, the event processor proceeds to Step 710.
- In Step 710, the event processor tests the identified sample against the hypothesis. If the sample does not fail the hypothesis, the event processor proceeds to Step 714. If the sample does fail the hypothesis, the event processor proceeds to Step 712.
- In Step 712, the event processor places an alert event into an output stream.
- the event processor may include in the alert event information indicating how or why the sample failed the hypothesis. The alert event may also provide for notifying an administrator of the event processor that the current sample failed the hypothesis.
- the event processor proceeds to Step 714 .
- In Step 714, the event processor resets the event counter. Upon completion of Step 714, the event processor proceeds to Step 702.
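The control flow of flowchart 700 can be sketched as a generator. Everything specific here is an assumption made for illustration: events are plain numbers, the hypothesis is "the mean of D equals `mu0`" for a Normal distribution with known standard deviation `sigma`, and the test is a two-sided z-test at significance level `alpha`. The source leaves the distribution and the test to the designer.

```python
from collections import deque
from math import sqrt
from statistics import NormalDist, fmean


def hypothesis_test_operator(events, mu0, sigma, sample_interval, sample_size,
                             alpha=0.05):
    """Yield an alert whenever a sample rejects H0: mean == mu0 (hypothetical test)."""
    window = deque(maxlen=sample_size)       # most recently received events
    counter = 0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    for value in events:                     # Step 702: receive an event
        window.append(value)
        counter += 1                         # increment the event counter
        if counter != sample_interval:       # compare counter with sample interval
            continue
        sample = list(window)                # Step 708: identify the sample
        mean = fmean(sample)
        z = (mean - mu0) / (sigma / sqrt(len(sample)))   # Step 710: test
        if abs(z) > z_crit:
            # Step 712: the alert carries information on why the sample failed.
            yield {"alert": "hypothesis rejected", "z": z, "sample_mean": mean}
        counter = 0                          # Step 714: reset the counter


# Four events near the hypothesized mean, then four far from it.
stream = [10.0] * 4 + [20.0] * 4
alerts = list(hypothesis_test_operator(stream, mu0=10.0, sigma=1.0,
                                       sample_interval=4, sample_size=4))
```

A bounded `deque` keeps only the most recent `sample_size` events, so the sample may overlap earlier samples when the sample size exceeds the sample interval.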
- FIG. 8 illustrates procedures 800 performable by an event processor, such as implemented by the Event-Processor application 120 , in accordance with embodiments disclosed herein.
- the procedures 800 implement an operator for event filtering by point estimation.
- a conventional stream filter is an operator that receives a stream of events, finds those events considered as exceptional by evaluating some predicate, and puts those events into another stream of events if the predicate evaluates to true.
- the operator for event filtering by point estimation as disclosed herein differs from conventional technology. Generally, the operator as disclosed herein allows a user to specify a predicate using information that is unavailable when the filter is defined. This information must be estimated at runtime using statistical methods. In particular embodiments, a user may specify the following:
- the operator takes samples of events according to the sample interval and the sample size specified to compute estimates of the unknown parameters in the statistical model.
- the operator takes each event in its input stream, evaluates the predicate with estimated parameter values, and places an alert event into the output stream if the predicate evaluates to true.
- the procedures 800 comprise Step 810 , Step 820 , Step 830 , and Step 840 , which the event processor is not required to execute sequentially.
- the event processor may execute Step 810 continuously throughout the execution of the procedures 800 and execute Step 820 each time an event is received in the stream of events.
- the event processor receives a stream of events.
- the stream of events is an input stream to the operator for event filtering by point estimation.
- In Step 820, the event processor evaluates a predicate comprising predetermined parameters.
- the event processor performs this evaluation for each event in the received stream of events.
- In Step 830, the event processor places an alert event into an output stream for each event for which the predicate evaluates to true.
- In Step 840, the event processor estimates values for the predetermined parameters based on a subset of the events in the received stream.
- the event processor performs Step 840 periodically based on a sample interval.
- the sample interval can be chosen by a user at the time the event processor is set up to implement an operator for event filtering by point estimation.
- S 1 be a stream of human body measurements.
- An event e in S 1 has the form (weight, height, ts), where ts is a timestamp for when the measurements for weight and height were taken.
- an event filter F that computes a new stream S 2 , where S 2 consists of those measurements e from S 1 where weight is considered too large relative to the height.
- weight − (α + β × height) is Normally distributed with mean 0 and standard deviation σ, where the parameters α, β, and σ have unknown values. That is, weight − (α + β × height) ~ N(0, σ).
- weight is considered too large relative to height if weight − (α + β × height) > 3σ.
- the event processor may use methods similar to those described above in relation to operators for event partitioning by classification to estimate the values of α, β, and σ.
- α, β, and σ can be calculated by the following formula (i.e., Maximum Likelihood Estimation or MLE).
- FIG. 9 illustrates a flowchart 900 of a particular embodiment of the operator implemented by the procedures 800 of FIG. 8 .
- an event processor embodying the operator receives an event from a stream of events.
- the event processor increments an event counter that keeps track of how many events have been received since the last time the event count was reset.
- the event processor evaluates a predicate comprising predetermined parameters. If the event processor determines that the predicate does not evaluate to true, then the event processor proceeds to Step 910 . If the event processor determines the predicate evaluates to true, then the event processor proceeds to Step 908 .
- the event processor places an alert event into an output stream. Upon completion of Step 908 , the event processor proceeds to Step 910 .
- In Step 910, the event processor determines whether the event counter is equal to the sample interval. If the event counter is not equal to the sample interval, then the event processor returns to Step 902. If the event counter is equal to the sample interval, then the event processor proceeds to Step 912.
- the event processor identifies a subset of the received events.
- the identified subset will consist of a number of the most recently received events equal to a predetermined sample size.
- In Step 914, the event processor estimates parameter values for the predicate.
- the event processor performs these estimates based on the identified subset of the received events.
- the event processor will evaluate the predicate using these estimates for parameter values the next time the event processor performs Step 906 .
- the event processor will continue to use these estimates for parameter values when evaluating the predicate until the next time Step 914 is performed and new estimates are computed.
- Upon completion of Step 914, the event processor proceeds to Step 916.
- In Step 916, the event processor resets the event counter.
- the event processor returns to Step 902 .
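The loop of flowchart 900 can be sketched as a generator that takes the predicate and the estimator as arguments. The names `point_estimation_filter`, `predicate`, and `estimate` are hypothetical. The sketch also assumes that no event is flagged before the first estimate exists unless initial parameter values are supplied, a case the text does not address.

```python
from collections import deque
from statistics import mean, stdev


def point_estimation_filter(events, predicate, estimate, sample_interval,
                            sample_size, params=None):
    """Yield each event for which predicate(event, params) is true."""
    window = deque(maxlen=sample_size)     # most recently received events
    counter = 0
    for event in events:                   # Step 902: receive an event
        window.append(event)
        counter += 1                       # events since the last reset
        # Step 906: evaluate the predicate with the current estimates.
        if params is not None and predicate(event, params):
            yield event                    # Step 908: place an alert event
        if counter == sample_interval:     # Step 910
            params = estimate(list(window))    # Steps 912 and 914
            counter = 0                    # Step 916


# Illustrative use: flag values more than 3 standard deviations above the mean.
def estimate_mean_sd(sample):
    return mean(sample), stdev(sample)


def too_large(value, params):
    m, sd = params
    return value > m + 3 * sd


alerts = list(point_estimation_filter([10, 11, 9, 10, 50, 10], too_large,
                                      estimate_mean_sd, sample_interval=4,
                                      sample_size=4))
```

Passing the estimator as a function keeps the loop independent of the statistical model, so the same skeleton would serve a regression-based predicate such as the weight and height example.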
- novel event-processing operators are provided. These novel operators can be advantageously utilized in implementing event processors and event-processing systems. While inventive matter has been shown and described herein with reference to specific embodiments thereof, it should be understood by those skilled in the art that variations, alterations, changes in form and detail, and equivalents may be made or conceived of without departing from the spirit and scope of the inventive matter. The foregoing description of the inventive matter is not intended to limit the present invention. Rather, the scope of the present invention should be assessed as that of the appended claims and by equivalents thereof.
Abstract
Description
-
- 1. Place K points into the space represented by the objects (e.g., events) that are being clustered. These points represent initial cluster centroids.
- 2. Assign each object to the cluster that has the closest centroid.
- 3. When all objects have been assigned, recalculate the positions of the K centroids.
- 4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into clusters.
TABLE 1
Event ID | Event Attributes |
---|---|
A | (5, 3) | ||
B | (−1, 1) | ||
C | (1, −2) | ||
D | (−3, −2) | ||
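The four clustering steps listed above can be sketched on the events of Table 1. The choice of K = 2 and of events A and D as the initial centroids is an assumption made for illustration; the resulting clusters are those of this sketch, not a result stated in the source.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """points: dict of id -> coordinate tuple; centroids: list of tuples (Step 1)."""
    assignment = {}
    for _ in range(max_iter):
        # Step 2: assign each object to the cluster with the closest centroid.
        assignment = {
            pid: min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for pid, p in points.items()
        }
        # Step 3: recalculate the position of each centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [points[pid] for pid, c in assignment.items() if c == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)   # keep an empty cluster's centroid
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignment, centroids


events = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}
assignment, centroids = k_means(events, centroids=[(5, 3), (-3, -2)])
```

With these starting points the sketch converges after two passes, leaving A alone in one cluster and B, C, and D in the other.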
$$E(Y) = \mu = g^{-1}(X\beta)$$
where Xβ is the linear predictor, a linear combination of unknown parameters β, and g is called the link function. In this framework, the variance is typically a function V of the mean:
$$\operatorname{Var}(Y) = V(\mu) = V\left(g^{-1}(X\beta)\right).$$
The unknown parameters β are typically estimated with maximum likelihood, quasi-maximum likelihood, or Bayesian techniques.
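As a concrete instance of this framework, consider a binary response with the logit link. This is the standard logistic-regression case, written out here as an illustration rather than quoted from the source:

```latex
g(\mu) = \ln\frac{\mu}{1-\mu}, \qquad
\mu = g^{-1}(X\beta) = \frac{1}{1 + e^{-X\beta}}, \qquad
V(\mu) = \mu\,(1-\mu).
```

Here μ is the probability that the response equals 1, so the variance function is the Bernoulli variance.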
Event ID | Gas-performance | Weight (lb) |
---|---|---|
1 | poor | 4300 |
2 | good | 2600 |
. . . | . . . | . . . |
500 | poor | 3600 |
Since the link function is logit, the regression model is:
where x_i is the actual value of the i-th event's Weight attribute, and 1 ≤ i ≤ 500. Hence
Plugging equation (2) into equation (1), gives:
Note that x_i and y_i (1 ≤ i ≤ 500) are known values in the model input stream S3, and α and β are unknown values.
where p is the probability of a vehicle being evaluated as a poor gas-performer.
That is, the chance that this vehicle is a poor gas-performer is 0.9781. Since 0.9781>0.7 (i.e., the threshold), this vehicle is marked as “poor,” and an event (4100, “poor”) is generated and put into the output stream S2.
That is, the chance that this vehicle is a poor gas-performer is 0.1151. Hence, the vehicle's chance of being a good gas-performer is 1−0.1151=0.8849. Since 0.8849>0.7, this vehicle is marked as “good,” and an event (2700, “good”) is generated and put into the output stream S2.
That is, the chance that this vehicle is a poor gas-performer is 0.515. Hence, the vehicle's chance of being a good gas-performer is 1−0.515=0.485. Since both chances are less than 0.7, this vehicle is marked as “unassigned,” and an event (3200, “unassigned”) is generated and put into the output stream S2.
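The three cases above follow one thresholding rule, sketched below. The function names are hypothetical. `poor_probability` assumes the model has the form logit(p) = α + β × weight, which is consistent with the logit link and the unknowns α and β but is not spelled out in this text, and no coefficient values are assumed.

```python
from math import exp


def poor_probability(weight, alpha, beta):
    """Inverse logit of the linear predictor (assumed form: alpha + beta * weight)."""
    return 1.0 / (1.0 + exp(-(alpha + beta * weight)))


def classify(p_poor, threshold=0.7):
    """Label a vehicle from its probability of being a poor gas-performer."""
    if p_poor > threshold:
        return "poor"
    if 1.0 - p_poor > threshold:
        return "good"
    return "unassigned"        # neither class reaches the threshold


# The three probabilities reported in the text.
labels = [classify(p) for p in (0.9781, 0.1151, 0.515)]
```

Applied to the probabilities in the text, the rule yields "poor", "good", and "unassigned" respectively.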
TABLE 2
Event ID | Weight (kg) | Height (cm) |
---|---|---|
1 | 38.74 | 156 |
2 | 35.98 | 151 |
3 | 79.46 | 179 |
4 | 29.43 | 136 |
5 | 58.11 | 174 |
6 | 76.88 | 184 |
7 | 74.69 | 185 |
8 | 74.60 | 175 |
9 | 51.72 | 157 |
10 | 57.16 | 160 |
11 | 44.38 | 146 |
12 | 56.14 | 168 |
13 | 47.66 | 152 |
14 | 78.22 | 184 |
15 | 71.60 | 172 |
16 | 63.56 | 156 |
17 | 70.17 | 173 |
18 | 43.36 | 149 |
19 | 52.21 | 157 |
20 | 72.90 | 182 |
Then α, β, and σ can be calculated by the following formula (i.e., Maximum Likelihood Estimation or MLE).
It should be noted that:
- 1) 20 is the sample size, and 2 is the number of unknown coefficients in the model: weight−(α+β×height) (i.e., α and β).
- 2) For model: weight−(α+β×height)˜N(0, σ), there is a closed form formula for MLE. However, when weight−(α+β*height)˜F where F is some distribution other than Normal distribution, there may not be any closed form formula, and the event processor may have to compute the unknown parameters in the model using algorithms known in nonlinear optimization and numerical analysis.
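The closed-form expressions themselves are not reproduced in this text. One standard form consistent with note 1 is sketched below, assuming least-squares estimates of α and β and a residual sum of squares divided by n − 2 = 18. A strict maximum-likelihood estimate of σ would divide by n instead, so the n − 2 divisor is an inference from note 1, not a quotation:

```latex
\hat{\beta} = \frac{\sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w})}{\sum_{i=1}^{n} (h_i - \bar{h})^2},
\qquad
\hat{\alpha} = \bar{w} - \hat{\beta}\,\bar{h},
\qquad
\hat{\sigma} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \bigl(w_i - \hat{\alpha} - \hat{\beta}\,h_i\bigr)^2},
```

where w_i and h_i are the weight and height of the i-th event in Table 2, n = 20, and bars denote sample means.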
Weight−(−103.06+0.98*Height)˜N(0,6.24).
Assuming that the 21st event is (Weight, Height)=(87, 172), then 87−(−103.06+0.98×172)=21.5>3×6.24. Hence the 21st event is considered an exceptional event and is placed into the output stream S2.
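The estimates and the test of the 21st event can be checked numerically from Table 2. The sketch below assumes least-squares estimates of α and β and a residual variance divided by n − 2, an assumption drawn from note 1 above; the names `fit` and `is_exceptional` are hypothetical.

```python
from math import sqrt

# Table 2: weight in kg and height in cm for events 1 to 20.
WEIGHTS = [38.74, 35.98, 79.46, 29.43, 58.11, 76.88, 74.69, 74.60, 51.72, 57.16,
           44.38, 56.14, 47.66, 78.22, 71.60, 63.56, 70.17, 43.36, 52.21, 72.90]
HEIGHTS = [156, 151, 179, 136, 174, 184, 185, 175, 157, 160,
           146, 168, 152, 184, 172, 156, 173, 149, 157, 182]


def fit(heights, weights):
    """Estimate alpha, beta, sigma in weight - (alpha + beta*height) ~ N(0, sigma)."""
    n = len(heights)
    h_bar = sum(heights) / n
    w_bar = sum(weights) / n
    sxx = sum((h - h_bar) ** 2 for h in heights)
    sxy = sum((h - h_bar) * (w - w_bar) for h, w in zip(heights, weights))
    beta = sxy / sxx
    alpha = w_bar - beta * h_bar
    ssr = sum((w - alpha - beta * h) ** 2 for h, w in zip(heights, weights))
    sigma = sqrt(ssr / (n - 2))    # assumed divisor: n minus 2 coefficients
    return alpha, beta, sigma


alpha, beta, sigma = fit(HEIGHTS, WEIGHTS)


def is_exceptional(weight, height):
    """Predicate from the text: residual greater than 3 * sigma."""
    return weight - (alpha + beta * height) > 3 * sigma


print(round(alpha, 2), round(beta, 2), round(sigma, 2))
print(is_exceptional(87, 172))
```

With unrounded estimates the residual for the 21st event is about 21.1 rather than 21.5, because the text evaluates the predicate with the rounded coefficients; the event exceeds 3σ in either case.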
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/938,013 US9275353B2 (en) | 2007-11-09 | 2007-11-09 | Event-processing operators |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090125916A1 US20090125916A1 (en) | 2009-05-14 |
US9275353B2 true US9275353B2 (en) | 2016-03-01 |
Family
ID=40624973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/938,013 Active 2033-11-08 US9275353B2 (en) | 2007-11-09 | 2007-11-09 | Event-processing operators |
Country Status (1)
Country | Link |
---|---|
US (1) | US9275353B2 (en) |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9305057B2 (en) | 2009-12-28 | 2016-04-05 | Oracle International Corporation | Extensible indexing framework using data cartridges |
US9430494B2 (en) | 2009-12-28 | 2016-08-30 | Oracle International Corporation | Spatial data cartridge for event processing systems |
US9189280B2 (en) | 2010-11-18 | 2015-11-17 | Oracle International Corporation | Tracking large numbers of moving objects in an event processing system |
US8990416B2 (en) | 2011-05-06 | 2015-03-24 | Oracle International Corporation | Support for a new insert stream (ISTREAM) operation in complex event processing (CEP) |
US9329975B2 (en) | 2011-07-07 | 2016-05-03 | Oracle International Corporation | Continuous query language (CQL) debugger in complex event processing (CEP) |
US9336302B1 (en) | 2012-07-20 | 2016-05-10 | Zuci Realty Llc | Insight and algorithmic clustering for automated synthesis |
US20140208217A1 (en) | 2013-01-22 | 2014-07-24 | Splunk Inc. | Interface for managing splittable timestamps across event records |
US9753909B2 (en) | 2012-09-07 | 2017-09-05 | Splunk, Inc. | Advanced field extractor with multiple positive examples |
US9594814B2 (en) | 2012-09-07 | 2017-03-14 | Splunk Inc. | Advanced field extractor with modification of an extracted field |
US8751499B1 (en) * | 2013-01-22 | 2014-06-10 | Splunk Inc. | Variable representative sampling under resource constraints |
US8682906B1 (en) | 2013-01-23 | 2014-03-25 | Splunk Inc. | Real time display of data field values based on manual editing of regular expressions |
US8751963B1 (en) | 2013-01-23 | 2014-06-10 | Splunk Inc. | Real time indication of previously extracted data fields for regular expressions |
US10394946B2 (en) | 2012-09-07 | 2019-08-27 | Splunk Inc. | Refining extraction rules based on selected text within events |
US9563663B2 (en) | 2012-09-28 | 2017-02-07 | Oracle International Corporation | Fast path evaluation of Boolean predicates |
US9262479B2 (en) | 2012-09-28 | 2016-02-16 | Oracle International Corporation | Join operations for continuous queries over archived views |
US10956422B2 (en) | 2012-12-05 | 2021-03-23 | Oracle International Corporation | Integrating event processing with map-reduce |
US10298444B2 (en) | 2013-01-15 | 2019-05-21 | Oracle International Corporation | Variable duration windows on continuous data streams |
US9152929B2 (en) | 2013-01-23 | 2015-10-06 | Splunk Inc. | Real time display of statistics and values for selected regular expressions |
US9390135B2 (en) | 2013-02-19 | 2016-07-12 | Oracle International Corporation | Executing continuous event processing (CEP) queries in parallel |
US9047249B2 (en) | 2013-02-19 | 2015-06-02 | Oracle International Corporation | Handling faults in a continuous event processing (CEP) system |
US10169122B2 (en) | 2013-04-29 | 2019-01-01 | Moogsoft, Inc. | Methods for decomposing events from managed infrastructures |
US9607074B2 (en) | 2013-04-29 | 2017-03-28 | Moogsoft, Inc. | Alert dashboard system and method from event clustering |
US11010220B2 (en) | 2013-04-29 | 2021-05-18 | Moogsoft, Inc. | System and methods for decomposing events from managed infrastructures that includes a feedback signalizer functor |
US10554479B2 (en) | 2013-04-29 | 2020-02-04 | Moogsoft Inc. | Alert dashboard system with situation room |
US11888709B2 (en) | 2013-04-29 | 2024-01-30 | Dell Products L.P. | Alert dashboard system with situation room |
US10803133B2 (en) | 2013-04-29 | 2020-10-13 | Moogsoft Inc. | System for decomposing events from managed infrastructures that includes a reference tool signalizer |
US10013476B2 (en) | 2014-04-28 | 2018-07-03 | Moogsoft, Inc. | System for decomposing clustering events from managed infrastructures |
US10572277B2 (en) | 2013-04-29 | 2020-02-25 | Moogsoft, Inc. | Alert dashboard system with situation room |
US10700920B2 (en) | 2013-04-29 | 2020-06-30 | Moogsoft, Inc. | System and methods for decomposing events from managed infrastructures that includes a floating point unit |
US10007716B2 (en) | 2014-04-28 | 2018-06-26 | Moogsoft, Inc. | System for decomposing clustering events from managed infrastructures coupled to a data extraction device |
US12047340B2 (en) | 2013-04-29 | 2024-07-23 | Dell Products L.P. | System for managing an instructure with security |
US9418113B2 (en) | 2013-05-30 | 2016-08-16 | Oracle International Corporation | Value based windows on relations in continuous data streams |
US9934279B2 (en) | 2013-12-05 | 2018-04-03 | Oracle International Corporation | Pattern matching across multiple input data streams |
US9244978B2 (en) * | 2014-06-11 | 2016-01-26 | Oracle International Corporation | Custom partitioning of a data stream |
US9712645B2 (en) | 2014-06-26 | 2017-07-18 | Oracle International Corporation | Embedded event processing |
US9886486B2 (en) | 2014-09-24 | 2018-02-06 | Oracle International Corporation | Enriching events with dynamically typed big data for event processing |
US10120907B2 (en) | 2014-09-24 | 2018-11-06 | Oracle International Corporation | Scaling event processing using distributed flows and map-reduce operations |
US10425291B2 (en) | 2015-01-27 | 2019-09-24 | Moogsoft Inc. | System for decomposing events from managed infrastructures with prediction of a networks topology |
US11924018B2 (en) | 2015-01-27 | 2024-03-05 | Dell Products L.P. | System for decomposing events and unstructured data |
US10979304B2 (en) | 2015-01-27 | 2021-04-13 | Moogsoft Inc. | Agent technology system with monitoring policy |
US10873508B2 (en) | 2015-01-27 | 2020-12-22 | Moogsoft Inc. | Modularity and similarity graphics system with monitoring policy |
US11817993B2 (en) | 2015-01-27 | 2023-11-14 | Dell Products L.P. | System for decomposing events and unstructured data |
JP2016138816A (en) * | 2015-01-28 | 2016-08-04 | アルパイン株式会社 | Navigation device and computer program |
WO2017018901A1 (en) | 2015-07-24 | 2017-02-02 | Oracle International Corporation | Visually exploring and analyzing event streams |
US11615088B2 (en) | 2016-09-15 | 2023-03-28 | Oracle International Corporation | Complex event processing for micro-batch streaming |
US11977549B2 (en) | 2016-09-15 | 2024-05-07 | Oracle International Corporation | Clustering event processing engines |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
WO2018169430A1 (en) | 2017-03-17 | 2018-09-20 | Oracle International Corporation | Integrating logic in micro batch based event processing systems |
WO2018169429A1 (en) | 2017-03-17 | 2018-09-20 | Oracle International Corporation | Framework for the deployment of event-based applications |
GB202002192D0 (en) * | 2020-02-18 | 2020-04-01 | Echobox Ltd | Topic clustering and Event Detection |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030065409A1 (en) * | 2001-09-28 | 2003-04-03 | Raeth Peter G. | Adaptively detecting an event of interest |
US20050171923A1 (en) * | 2001-10-17 | 2005-08-04 | Harri Kiiveri | Method and apparatus for identifying diagnostic components of a system |
US20060282301A1 (en) | 2005-06-13 | 2006-12-14 | Avaya Technology Corp. | Real time estimation of rolling averages of cumulative data |
US20070150585A1 (en) | 2005-12-28 | 2007-06-28 | Microsoft Corporation | Multi-dimensional aggregation on event streams |
US7257611B1 (en) | 2000-04-12 | 2007-08-14 | Oracle International Corporation | Distributed nonstop architecture for an event processing system |
Non-Patent Citations (1)
Title |
---|
Arvind Arasu and Shivanth Babu and Jennifer Widom, Stanford University, "The CQL Continuous Query Language: Semantic Foundations and Query Execution", pp. 1-32, Published 2003. |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150120758A1 (en) * | 2013-10-29 | 2015-04-30 | International Business Machines Corporation | Distributed processing of data records |
US10482154B2 (en) * | 2013-10-29 | 2019-11-19 | International Business Machines Corporation | Distributed processing of data records |
CN109643307A (en) * | 2017-05-24 | 2019-04-16 | 华为技术有限公司 | Stream processing system and method |
CN109643307B (en) * | 2017-05-24 | 2021-08-20 | 华为技术有限公司 | Stream processing system and method |
US11216742B2 (en) | 2019-03-04 | 2022-01-04 | Iocurrents, Inc. | Data compression and communication using machine learning |
US11468355B2 (en) | 2019-03-04 | 2022-10-11 | Iocurrents, Inc. | Data compression and communication using machine learning |
US20230055677A1 (en) * | 2021-08-17 | 2023-02-23 | Citrix Systems, Inc. | Systems and methods for data linkage and entity resolution of continuous and un-synchronized data streams |
US11711255B2 (en) * | 2021-08-17 | 2023-07-25 | Citrix Systems, Inc. | Systems and methods for data linkage and entity resolution of continuous and un-synchronized data streams |
Also Published As
Publication number | Publication date |
---|---|
US20090125916A1 (en) | 2009-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: SUN MICROSYSTEMS, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: LU, YANBING; WALDORF, JERRY; REEL/FRAME: 020093/0442. Effective date: 20071108 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | CC | Certificate of correction | |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 8 |