US11074272B1 - System and method for managing streaming calculations - Google Patents
System and method for managing streaming calculations
- Publication number
- US11074272B1 (application US 16/224,619)
- Authority
- US
- United States
- Prior art keywords
- data set
- transformation
- data
- reference object
- partition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2477—Temporal data queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
Definitions
- Aspects of the present disclosure relate to performing transformations on streaming data.
- Sensor measurement data is commonly saved as a historical record of a physical process being monitored by the sensors, often along with other observational, manual, calculated, simulated, or related data and metadata associated with the process. While the data in the historical record may be perceived as certain, or “settled,” there is often a need to modify that record. For example, there may be a need to update the record with verified data in order to more accurately represent the historical facts. In some instances, portions of a record may not be settled due to issues arising from varying data latency (delay of arrival), timestamp error, signal noise, or the replacement of erroneous or missing data.
- For example, a valve may be intended to maintain an optimal flow of a contained gas by automatically opening and closing in response to changing heat and pressure values. Opening or closing the valve can have a certain and immutable impact that cannot be undone.
- However, the actions (e.g., opening or closing the valve) may have been taken based on data that is subject to change, and this fact can have far-reaching ramifications for systems of record, issues of human responsibility, and work processes. It may be useful to differentiate actions based on the likelihood of the data changing, or it may be important (e.g., for assigning responsibility or detecting errors) to have some record of that likelihood.
- Calculations involving multiple data sources are common.
- For example, the manufacturing field often involves multiple sensors, each providing live, or “streaming,” data to a processor for calculations.
- In one approach, the calculation is run periodically without regard to variable “streaming” data rates or variable latency, and data is always made available for the calculation by extrapolating or interpolating missing values; however, subsequent data from the data stream may make previous results incorrect.
- In another approach, the calculation is run only at a point in time when all streams have provided data for it, which yields accurate and invariant results but delays them.
- The latter approach is also incapable of providing provisional or “in-progress” results.
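- The trade-off between the two approaches can be illustrated with a minimal Python sketch. The function names, the hold-last-value extrapolation, and the sample values are assumptions made for illustration and are not taken from the disclosure.

```python
from bisect import bisect_right

def value_at(samples, t):
    """Return the value at time t, holding the last known sample (a simple extrapolation)."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)
    if i == 0:
        raise ValueError("no sample at or before t")
    return samples[i - 1][1]

def eager_sum(a, b, t):
    """First approach: always answer, filling in missing values by extrapolation."""
    return value_at(a, t) + value_at(b, t)

def patient_sum(a, b, t):
    """Second approach: answer only once both streams have reached time t."""
    if a[-1][0] < t or b[-1][0] < t:
        return None
    return value_at(a, t) + value_at(b, t)

# (time, value) samples; the temperature sample for t=2 has not arrived yet.
pressure = [(0, 10.0), (1, 11.0), (2, 12.0)]
temperature = [(0, 20.0), (1, 21.0)]

eager_before = eager_sum(pressure, temperature, 2)      # uses the stale t=1 temperature
patient_before = patient_sum(pressure, temperature, 2)  # no result yet

temperature.append((2, 25.0))                           # the late sample arrives
eager_after = eager_sum(pressure, temperature, 2)
patient_after = patient_sum(pressure, temperature, 2)
```

- In this sketch the eager result changes once the late sample arrives, while the patient calculation returns nothing until both streams have caught up.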
- Embodiments of the invention concern systems and methods for generating a demarcation between settled and unsettled data.
- a method can include accessing, by a hardware processor, a data set and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled, dividing, by the hardware processor, the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set, optimizing, by the hardware processor, one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation, applying, by the hardware processor
- the method further includes providing a first copy of the reference object to the first partition and a second copy of the reference object to the second partition, and wherein optimizing one of the first transformation or the second transformation is based on a relative position along the dimension of one of the first copy of the reference object and the second copy of the reference object.
- the method further includes outputting a reference object demarcating a first portion of the aggregated data set from a second portion of the aggregated data set.
- the one of the first partition or the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions.
- each partition of the plurality of partitions receives a copy of the reference object.
- the entirety of the data set is contained across an aggregation of the partitions.
- one of the optimized first transformation and the optimized second transformation includes fewer calculations than one of the first transformation or the second transformation, and one of the first transformation or the second transformation is optimized based on the reference object being positioned along the dimension after one of the first portion of the data set or the second portion of the data set.
- the reference object is positioned along the dimension at an interim point of one of the first portion of the data set and the second portion of the data set, the interim point between a first point and a second point of one of the first portion of the data set and the second portion of the data set, and one of the optimized first transformation or the optimized second transformation includes a first set of calculations applied to points preceding the interim point and a second set of calculations applied to points following the interim point.
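- One way to picture a partition-level optimization driven by the reference object is the following Python sketch. It assumes that "fewer calculations" is achieved by caching results for settled points; the cache, the `double` stand-in transformation, and the partition size are hypothetical choices, not details from the disclosure.

```python
def double(value):
    # Stand-in for an expensive per-point transformation (hypothetical).
    return value * 2

def transform_partition(partition, cursor, cache):
    """Transform one partition using its own copy of the cursor.

    Points at or before the cursor are treated as settled, so a cached
    result can be reused; points after the cursor are always recomputed.
    Returns (output_points, number_of_calculations_performed).
    """
    output, calculations = [], 0
    for key, value in partition:
        settled = key <= cursor
        if settled and key in cache:
            output.append((key, cache[key]))
            continue
        result = double(value)
        calculations += 1
        if settled:
            cache[key] = result
        output.append((key, result))
    return output, calculations

def run(data, cursor, cache, size=3):
    """Split data into partitions, transform each, and reassemble in order."""
    output, per_partition = [], []
    for start in range(0, len(data), size):
        part_out, n = transform_partition(data[start:start + size], cursor, cache)
        output.extend(part_out)
        per_partition.append(n)
    return output, per_partition

cache = {}
data = [(k, float(k)) for k in range(6)]
first_out, first_counts = run(data, cursor=3, cache=cache)

# Later, the unsettled values (keys 4 and 5) are revised; settled ones are not.
revised = data[:4] + [(4, 40.0), (5, 50.0)]
second_out, second_counts = run(revised, cursor=3, cache=cache)
```

- On the second pass, the partition lying entirely before the cursor performs no calculations, and the partition that straddles the cursor recomputes only the points after it.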
- a method for generating a demarcation for a data set can include generating, by a hardware processor, a reference object associated with an accessed data set, the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, labeling, by the hardware processor, values of the first ordered portion as settled by changing metadata associated with each respective value of the first ordered portion; and labeling, by the hardware processor, values of the second ordered portion as unsettled by changing metadata associated with each respective value of the second ordered portion.
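- The labeling step can be sketched in Python as follows. The `Point` class, the `status` metadata key, and the choice to treat a point located exactly at the cursor as settled are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    key: float                      # position along the ordering dimension (e.g., time)
    value: float
    meta: dict = field(default_factory=dict)

def label_by_cursor(points, cursor):
    """Mark each point as settled or unsettled by updating its metadata."""
    for p in points:
        # Assumption: a point exactly at the cursor counts as settled.
        p.meta["status"] = "settled" if p.key <= cursor else "unsettled"
    return points

points = [Point(key=float(k), value=k * 1.5) for k in range(5)]
label_by_cursor(points, cursor=2.0)
statuses = [p.meta["status"] for p in points]
```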
- the method may further include receiving, by the hardware processor, a transformation to apply to the accessed data set, applying, by the hardware processor, the transformation to the accessed data set, wherein the applied transformation factors in the reference object for the values of the second ordered portion to produce a transformed data set; and generating, by the hardware processor, an output data set comprising the transformed data set and one of the reference object or a transformed reference object.
- labeling the values of the second ordered portion comprises changing the metadata associated with each respective value of second ordered portion according to a continuous value of uncertainty, and wherein applying the transformation further factors in the continuous value of uncertainty for a plurality of values of the second ordered portion.
- the dimension is time.
- the reference object comprises an input cursor associated with a last accessed data point within the accessed data set and the second ordered portion comprises predicted values based on the first ordered portion.
- the accessed data set comprises a first input data stream and a second input data stream, the first input data stream associated with a first reference object that demarcates a first ordered portion of the first input data stream from a second ordered portion of the first input data stream, the second input data stream associated with a second reference object that demarcates a first ordered portion of the second input data stream from a second ordered portion of the second input data stream, and further comprising applying the transformation to the first input data stream and the second input data stream with reference to the respective first reference object and the second reference object to generate the result in the form of a transformed output data stream, the transformed output data stream including a third reference object that demarcates between a settled output portion of the output data stream and an unsettled output portion.
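- A minimal Python sketch of a two-stream transformation that emits a third cursor follows. The disclosure does not state how the output cursor is derived; taking the earlier of the two input cursors is a conservative assumption used here, and the function and variable names are hypothetical.

```python
def add_streams(stream_a, cursor_a, stream_b, cursor_b):
    """Add two (key, value) streams point by point and derive an output cursor."""
    b_by_key = dict(stream_b)
    output = [(k, v + b_by_key[k]) for k, v in stream_a if k in b_by_key]
    # Assumption: an output point is settled only if both of its inputs are
    # settled, so the output cursor is the earlier of the two input cursors.
    output_cursor = min(cursor_a, cursor_b)
    return output, output_cursor

pressure = [(0, 10.0), (1, 11.0), (2, 12.0), (3, 13.0)]
temperature = [(0, 20.0), (1, 21.0), (2, 22.0), (3, 23.0)]

# Pressure is settled through key 2, temperature only through key 1.
combined, combined_cursor = add_streams(pressure, 2, temperature, 1)
```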
- the accessed data set comprises a first input data stream and a second input data stream, the first input data stream associated with a first reference object that demarcates a first ordered portion of the first input data stream from a second ordered portion of the first input data stream, the second input data stream associated with a second reference object that demarcates a first ordered portion of the second input data stream from a second ordered portion of the second input data stream.
- the reference object is based on a variable received with the accessed data set. In one embodiment, the reference object is computed in real time as the data set is accessed and by applying calculations as data is settled. In one embodiment, the reference object is modified.
- a non-transitory computer-readable medium may store computer-executable instructions that cause one or more processors to access a data set and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled, divide the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set, optimize one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation, apply one of the first transformation or the optimized first transformation to the first portion of the data set, and yielding a transformed first portion of the data set, apply one of
- one of the first partition and the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a distinct portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions.
- FIG. 1 is a diagram illustrating a data latency management system, in accordance with various embodiments of the subject technology.
- FIG. 2 is a flowchart illustrating a method for managing data latency, in accordance with various embodiments of the subject technology.
- FIG. 3 is an illustration of a process for applying transformations to streaming data, in accordance with various embodiments of the subject technology.
- FIG. 4 is an illustration of processes for applying transformations to streaming data, in accordance with various embodiments of the subject technology.
- FIG. 5 is a system diagram of an example computing system that may implement various systems and methods discussed herein, in accordance with various embodiments of the subject technology.
- Ordered input data can be processed by algorithms including, for example, transformations and calculations, to generate correspondingly ordered output data.
- the ordered output data can include a reference object to a demarcation point along a dimension upon which the data is ordered.
- the demarcation point can differentiate, for example, data that is settled from data that is unsettled or merely provisional. Users of the ordered output data can then use the demarcation point, referred to herein as a “cursor,” in determining how to record, manage, calculate, and/or communicate the data, or provide the output data to other systems and processes that may use it as a respective input.
- the cursor may be based on time or a timestamp, which may or may not be an attribute of the data or data stream itself, as well as other attributes of the data which can be used to connote a demarcation point between settled and unsettled data.
- Ordered input data may constitute a set of data points.
- Data points or sets of points may be recorded with an associated timestamp or timestamps, representing a time of measurement, a time of recording, a time of arrival to the recording device, or other meaningful time references.
- Metadata can be included or associated with the ordered input data and may represent measurement accuracy, precision, tolerance, confidence, variability, source, handling, processing information, or other information to inform proper use and interpretation of the associated ordered input data. Even when correctly arranged in order, data points have varying levels of certainty distinguishable by a demarcation along an ordered sequence. In other words, a timestamp alone may fail to inform a reviewer whether the data associated with the timestamp, before the timestamp, or after the timestamp have any shared values or qualities of stability.
- Unsettled data may change but does not always or necessarily change.
- In some cases the unsettled data itself may not change, but related data or metadata on which its settled nature depends may change or may not yet be available; the data therefore cannot be considered settled until the related data or metadata is received, correlated, or considered in a computation.
- Recording, managing, calculating, and communicating data can include, without limitation, decisions or algorithms that determine when to store, cache, or discard data; when to calculate or recalculate dependent assessments or calculations; visual representations of certainty; logical and mathematical operations; propagation of certainty through dependent calculations; and optimizing the methodology in support of partitioning and distributing calculations for scale.
- Calculation environments may deal with a static data set or with a data set that is continuously or periodically refreshed with new data, with or without the loss of previous data.
- the non-static environment is commonly referred to as “streaming” and may include transformations being updated to address new inputs.
- Latency relates to the notion that newly arriving data inherently lags behind some actual measurement, due to physical, transport, processing, or network delays, among other things.
- each input may have a different latency from other inputs, and each of the input latencies may itself be variable.
- Another characteristic of calculation environments is “partitionability,” where an algorithm, calculation, search, or other transformation may be broken into smaller parts that are apportioned to other processing systems, with the results reassembled in a manner that precisely reconstructs the end result as if no such partitioning had occurred. Partitioning provides speed advantages over non-partitioning systems. Ideally, partitioning should operate without a priori knowledge of the nature of the data and transformation being applied while still guaranteeing that the reassembled result is idempotent (both reproducible and traceable) and the same as if the calculation were performed without partitioning.
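- The requirement that a reassembled result match the unpartitioned result can be checked with a short Python sketch. It uses a trailing moving average as the transformation and gives each partition a few preceding points as context; that overlap relies on knowing the window size, which is a simplifying assumption of this example rather than something the disclosure prescribes.

```python
def trailing_mean(values, window):
    """Mean of each point and up to window-1 points before it."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def partitioned_trailing_mean(values, window, size):
    """Compute the same result partition by partition, then reassemble."""
    out = []
    for start in range(0, len(values), size):
        # Each partition also receives the window-1 preceding points as context.
        lo = max(0, start - (window - 1))
        piece = values[lo:start + size]
        result = trailing_mean(piece, window)
        out.extend(result[start - lo:])   # drop outputs for the context points
    return out

values = list(range(10))
whole = trailing_mean(values, window=3)
pieces = partitioned_trailing_mean(values, window=3, size=4)
```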
- The key dimension is not limited to time and may be any dimension along which data can be ordered. For example, any monotonically increasing physical or quantitative value may serve as a key. As discussed herein, and for non-limiting explanatory purposes only, the key is portrayed in the context of time.
- the data streams may be characterized by multiple traits and behaviors. They may be time-value pairs, or samples, having a numeric, string, arbitrary scalar, or other static value and may be of the sort typically used to represent continuous signals. Alternatively, they may also be events, meaning instants or moments in time, or they may be states, which span a range in time (having a start time and an end time).
- the streams may be stored in and accessed through databases and can be continuously updated as time progresses.
- the streams may have multiple and various sources such as temperature sensors, light sensors, and pressure sensors.
- The streams may have different or variable sample frequencies.
- For example, a pressure sensor and a temperature sensor may transmit reading updates every second and every half-second, respectively.
- The streams may have different and variable latencies, as discussed above.
- For example, the pressure sensor and temperature sensor may each transmit their signals over a different medium, such as a wireless link and a cable, respectively. Whereas a signal over a cable may have consistent latency, a wirelessly transmitted signal may have higher latency than the cable and also highly variable latency as arbitrary changes occur in the intervening space between transmitter and receiver.
- a device having a processor to perform calculations may receive one or more of the streams as input.
- a server may monitor or otherwise receive or access a stream of pressure measurement data and a stream of temperature measurement data.
- the calculation may include a time range over which to execute an algorithm using the pressure measurement data and the temperature measurement data.
- Results of a calculation may be in the form of a stream or in the form of a non-streaming output value, such as a single number, which can be labeled with a settled status (e.g., “settled” or “unsettled”) associated with an output cursor so as to denote the point in the streaming input up to which the corresponding non-streaming output value may be considered settled.
- a time and corresponding data value may be associated with the output cursor and inform a requesting user that the output value is settled up to that time.
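- A non-streaming result with a settled status might look like the following Python sketch. The dictionary layout, the inclusive time range, and the rule that the value is settled only when the input cursor has reached the end of the range are assumptions for illustration.

```python
def mean_over_range(samples, cursor, start, end):
    """Average the (time, value) samples in [start, end] and label the result."""
    in_range = [v for t, v in samples if start <= t <= end]
    value = sum(in_range) / len(in_range)
    # Assumption: the single output value is settled only if every input
    # point in the requested range lies at or before the input cursor.
    status = "settled" if cursor >= end else "unsettled"
    return {"value": value, "status": status, "cursor": min(cursor, end)}

samples = [(0, 2.0), (1, 4.0), (2, 6.0), (3, 8.0)]
inside = mean_over_range(samples, cursor=2, start=0, end=2)
beyond = mean_over_range(samples, cursor=2, start=0, end=3)
```

- Here the first result is labeled settled because the whole range lies at or before the cursor, while the second is labeled unsettled and reports that it is settled only up to time 2.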
- A streaming calculation system 100 can compute calculation results for any time range across 1 to N streams of input data 102 A-B.
- The system 100 computes the calculation results within an unsettledness window, such as that discussed above in reference to the example temperature and pressure sensor combination, and identifies results that are settled and invariant as well as results that are provisional and subject to change, in one example with a minimum or optimal use of computation resources.
- the streaming input data 102 A-B is transmitted to a device 116 which is capable of receiving data and contains a processor able to perform calculations and algorithms included in data transformations.
- the device 116 can be a personal computer or other computing device receiving streaming data over, for example, a network connection, accessing data from a database, or directly receiving data streams from a sensor, in various possible examples.
- the device 116 may be a server or other device.
- device 116 can be a specialized server capable of receiving and processing large volume and high frequency streaming data.
- Each input stream is associated with a cursor, whether the stream is provided with a cursor or the cursor is computed separately. In either case, each input stream to a transformation has a cursor before the transformation executes. As noted elsewhere, the cursor may be implicitly or explicitly defined. Similarly, each output stream from the transformation includes an associated output cursor.
- Although the discussion refers to streaming data, any ordered (e.g., time series) data may be used.
- For example, where a data set includes longitudinal data covering a span of years, as in the case of academic research studies, the longitudinal data may be processed in sequence based on the time value associated with each data record. Such time values may be included directly with the data or may instead be associated with the data as metadata.
- The ordered data may then be processed in the appropriate sequence, in which case it can be treated as streaming data.
- With recorded data, the “stream” may be paused, restarted, or otherwise manipulated in ways that may be unavailable for a “live” stream or that would otherwise require additional resources such as buffers. Nevertheless, both streaming and recorded data may be processed similarly by the systems and methods discussed in this disclosure.
- FIG. 2 depicts a method 200 for providing a transformed data stream and cursor to a requester.
- the requester may be, without limitation, a requesting service such as a calling function.
- the requester can also be a listening service which passively receives a streaming output data and cursor 110 (discussed below).
- the requester can further process the streaming output data and cursor 110 to determine a response to the received stream.
- a valve control system may use the streaming output data and cursor 110 to determine whether and for how long to open a valve.
- the streaming output data and cursor 110 can be saved and stored for later review or decision validation.
- the data can be stored in a record and later used to validate or troubleshoot past decisions, such as in the above example of a valve control system.
- the cursor may be based on some time attribute but not necessarily a timestamp within the stream or streams themselves, such as a separate time stream, metadata related to a stream where the metadata includes a time attribute, relative time attributes related to receipt of the stream at whatever device is receiving the stream or streams, etc.
- the cursor does not have to be bound to a key in the data stream such as an actual time value.
- The cursor may be based on timestamp values of the stream (timestamps being one example of a key) without being tied to a specific timestamp value of the stream.
- the cursor may also be based on some other attribute of the data besides time or timestamps whether the attribute is a key of the stream or not.
- multiple cursors may be generated along multiple dimensions of data or a single cursor may be complex (e.g., cursors reflecting splines or surfaces rather than single numbers).
- For example, a cave mapping system may generate streaming mapping data from one or more probes and sensors. As the three-dimensional structure is mapped and data is streamed to a receiver, a cursor may demarcate the spatial coordinates of the certain, or fully mapped, portions of the cave from the in-process or still-rendering portions (e.g., those still awaiting complete data from every sensor and probe).
- a substantially identical cursor may, for example, refer to the same relative data point within the sequence of data points comprising the stream.
- an input stream having a first, second, and third value may be received with a cursor referring to the second value.
- An output stream having correspondingly transformed first, second, third values may be generated with an output cursor.
- substantially similar input and output cursors may respectively refer to the second and transformed second values of the input and output streams.
- the output cursor may be substantially transformed due to the nature of the transformation 108 applied.
- the transformation 108 may apply a variety of calculations accounting for variable latency and update frequency across all the streams, and thus generate a completely distinct output cursor that refers to a relative data point in the output sequence that is different than any single input cursor reference point.
- Transformations 308 A and 308 B receive input streams 301 A and 301 B, respectively.
- the input stream 301 A can include an input cursor 304 A which demarcates a stream portion 302 A that is settled and a stream portion 306 A that is unsettled or provisional and subject to change.
- An output stream 311 A and an output cursor 305 A are generated together. Where the output cursor 305 A is coupled to the output stream 311 A, the stream is demarcated into a settled stream portion 310 A and an unsettled stream portion 312 A, similarly to the input stream.
- the input stream 301 A may be recorded in, for example, a database and thus the settled values 302 A may be catalogued as being settled in the record, and updated accordingly as the stream continues to provide updates.
- the output stream 311 A may be recorded and the associated settled values 310 A may also be catalogued as being settled in the record. Accordingly, the cursors 304 A and 305 A can be used to update this record on a rolling or streaming basis.
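- Keeping such a record up to date on a rolling basis could be sketched in Python as below. The `SettledRecord` class and its policy of never overwriting entries at or before the stored cursor are hypothetical design choices, not details from the disclosure.

```python
class SettledRecord:
    """Stores the latest value per timestamp plus the cursor seen so far."""

    def __init__(self):
        self.values = {}
        self.cursor = float("-inf")

    def update(self, batch, cursor):
        """Merge a batch of (time, value) points and advance the cursor."""
        for t, v in batch:
            # Entries at or before the stored cursor are settled and kept as is.
            if t > self.cursor:
                self.values[t] = v
        self.cursor = max(self.cursor, cursor)

    def settled(self):
        return {t: v for t, v in self.values.items() if t <= self.cursor}

record = SettledRecord()
record.update([(0, 1), (1, 2), (2, 9)], cursor=1)   # value at t=2 is provisional
record.update([(1, 99), (2, 3), (3, 4)], cursor=2)  # t=2 is revised, then settles
```

- After the second update, the attempted change to the already settled value at t=1 is ignored, while the provisional value at t=2 is replaced and then catalogued as settled.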
- data may be received in batches spanning multiple data points.
- a pressure sensor may transmit a minute of sensor data which is measured every one second.
- the streamed data consists of batches of 60 sequential data points every minute.
- the input cursor 304 A may be included with the batch and denote the most recent settled data point in the batch.
- the pressure sensor readings may only be settled for the first 30 seconds of the batch update, while the second 30 seconds of readings may be of varying levels of unsettledness.
- the input cursor 304 A demarcates the 30 seconds of settled pressure sensor readings 302 A from the second thirty seconds of pressure sensor readings 306 A.
- the transformation 308 A may, for example, estimate a temperature to the nearest whole number from the pressure sensor readings.
- the generated output stream 311 A of batched data can include an output cursor 305 A demarcating the first 35 seconds of the transformed batch update 310 A as being settled values from the remaining 25 seconds of the transformed batch update 312 A which are unsettled.
- the transformation 308 A was able to generate settled data from unsettled data because, for example, the conversion from pressure to temperature, measured to a low significant figure, only needed a pressure value being settled to a certain degree rather than being completely settled.
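One way to picture how a coarse transformation may settle an output before its input is fully settled is sketched below. The sketch assumes, purely for illustration, that each reading carries a hypothetical error bound (zero for settled readings) and that the pressure-to-temperature conversion is a hypothetical linear function `to_temperature`; neither is specified by the disclosure.

```python
def to_temperature(pressure):
    # Hypothetical linear conversion, for illustration only.
    return 0.5 * pressure


def settled_output_count(readings, bounds):
    """Count the leading outputs whose rounded value can no longer change.

    readings[i] is a pressure value and bounds[i] is an assumed maximum
    error for that reading (0.0 when the reading is already settled).
    """
    count = 0
    for value, bound in zip(readings, bounds):
        low = round(to_temperature(value - bound))
        high = round(to_temperature(value + bound))
        if low != high:
            break  # the rounded temperature could still change
        count += 1
    return count


# Two settled readings followed by two unsettled readings.
readings = [40.2, 40.4, 40.6, 41.0]
bounds = [0.0, 0.0, 0.2, 1.0]
output_cursor = settled_output_count(readings, bounds)
```

Here the third reading is still unsettled, but its assumed error bound is too small to change the rounded temperature, so the output cursor can sit one position past the input cursor.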
- An input cursor 304 B is provided with the input stream 301 B and, similarly to the above, demarcates the input stream 301 B into a stream portion 302 B that is settled and a stream portion 306 B that is unsettled.
- an output stream 311 B and output cursor 305 B are generated.
- the output cursor 305 B demarcates a settled output stream portion 310 B and an unsettled output stream portion 312 B.
- the input stream 301 B is provided such that the cursor 304 B and settled stream portion 302 B are identical to input stream 301 A's cursor ( 304 A) and settled stream portion ( 302 A).
- the unsettled stream portions 306 A and 306 B, however, are quite different. If two input streams with identical settled portions and cursors undergo the same transformation, they will produce the same output cursors ( 305 A and 305 B) and settled stream results ( 310 A and 310 B). To put it another way, unsettled input data does not influence the position of the output cursor or the settled stream results; unsettled input data can only influence the unsettled output. This is seen in 312 A and 312 B: the unsettled output stream results differ in accordance with their unsettled inputs.
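The invariance described for streams 301 A and 301 B can be illustrated with a minimal sketch. It assumes a hypothetical causal transformation (a running mean) and represents a cursor as the count of leading settled values; both choices are illustrative and not taken from the disclosure.

```python
def running_mean(values, cursor):
    """Return (outputs, output_cursor) for a causal running mean.

    cursor is the number of leading settled input values. Each output
    depends only on inputs at or before its own position, so in this
    sketch the output cursor equals the input cursor.
    """
    outputs, total = [], 0.0
    for position, value in enumerate(values, start=1):
        total += value
        outputs.append(total / position)
    return outputs, cursor


# Same settled prefix (first three values), different unsettled tails.
stream_a = [2.0, 4.0, 6.0, 9.0, 1.0]
stream_b = [2.0, 4.0, 6.0, 0.0, 8.0]
out_a, cursor_a = running_mean(stream_a, 3)
out_b, cursor_b = running_mean(stream_b, 3)
```

The settled outputs and cursors agree for both streams, while the outputs after the cursor differ along with the unsettled inputs.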
- FIG. 4 depicts a partitioned transformation 400 A alongside a corresponding non-partitioned transformation 400 B. Both transformations 400 A and 400 B receive an input stream 402 .
- the input stream 402 is received with an input cursor 404 C which refers to a demarcation data point separating the settled values of the stream 404 A from the unsettled values of the stream 404 B.
- the transformation process 400 A first generates a set of partitions 406 .
- Each partition 408 , 410 , 412 , and 414 can perform a calculation on respective portions of the input stream 402 .
- Some partitions may receive portions of the stream 404 A(i) and 404 A(ii) preceding the input cursor 404 C. These partitions may ignore the input cursor 404 C and perform respective calculations accordingly because their respective data points are guaranteed to be settled values as they precede the cursor.
- a set of partitions 416 includes the partitions 408 , 410 , 412 , and 414 along with respective cursor copies 418 , 420 , 422 , and 424 . While the discussed partitions receive individual copies of a cursor, it is understood that other implementations may serve to embody the disclosed technology. For example, each partition may receive a reference to a shared cursor.
- the input cursor copy 418 indicates a point outside a respective span of consideration of the partition 408 and so the partition 408 may be able to optimize the transformation. In other words, the cursor copy 418 will not cause a change in the method and/or state related to the transformation process performed by the partition 408 on the portion 404 A(i) of the data stream. In such an example, the cursor is positioned such that the transformation is entirely settled or unsettled.
- a smoothing transformation might smooth an air temperature feed consisting of a local outdoors air temperature sensor data stream and a weather forecasting data stream.
- the air temperature feed may further be associated with a cursor as elsewhere discussed in this disclosure.
- the smoothing transformation might smooth only the local outdoor air temperature sensor feed up until a point on the air temperature feed associated with the cursor. For feed data after the cursor, the smoothing transformation may switch to smoothing the weather forecasting data stream.
- the partitions containing data preceding the cursor would smooth the local outdoor air temperature sensor feed component of the data stream
- the partitions containing data following the cursor would smooth the weather forecasting data stream component
- the partition (or partitions) containing data spanning both sides of the cursor would accordingly smooth portions of both the local outdoor air temperature sensor feed component and the weather forecasting data stream component based on the location of the cursor within the partition.
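The cursor-dependent smoothing example may be sketched as follows. The sketch assumes a hypothetical trailing moving average as the smoothing function, assumes both feed components have a value at every position, and treats the cursor as the index of the first position that uses forecast data; these are illustrative choices only.

```python
def smooth(series, window=3):
    """Trailing moving average over up to `window` most recent points."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def smooth_feed(sensor, forecast, cursor):
    """Smooth the sensor component before the cursor and the forecast after it."""
    return smooth(sensor)[:cursor] + smooth(forecast)[cursor:]


sensor = [10.0, 12.0, 14.0, 14.0, 14.0]
forecast = [20.0, 20.0, 20.0, 20.0, 20.0]
result = smooth_feed(sensor, forecast, cursor=3)
```

A partition lying wholly before the cursor would only ever evaluate the sensor branch, and one lying wholly after it only the forecast branch.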
- Partitions 412 and 414 may factor the respective input cursor copies 422 and 424 into their transformations because the respective portions of the data stream 404 B(i) and 404 B(ii) partially or entirely follow a demarcation point contained in the cursor copies (a point indicated by the cursor 404 C).
- the partition 412 receives a portion of the stream 404 B(i) straddling the input cursor copy 422 and may process the stream in a similar manner to the non-partitioned transformation 400 B (processing together the portions preceding a cursor and portions following a cursor).
- Partitions containing portions of the stream 404 B(ii) entirely following the cursor 404 C can perform the entirety of their respective calculations factoring in the cursor 404 C.
- a partition can receive a cursor copy that contains a demarcation point located outside the span of the stream portion held by the partition, and the cursor copy may still be factored into the transformation due to relative proximity or for other purposes.
- partition 410 receives the input cursor copy 420 along with the stream portion 404 A(ii). Though the received stream portion precedes the original input cursor 404 C, the cursor is still factored into the calculations of partition 410 due to proximity. As a result, an output cursor 421 is generated that indicates a demarcation point along the output stream 430 produced by the partition 410 .
- the partitions 408 , 410 , 412 , and 414 produce a set 426 of respective output streams 428 , 430 , 436 , and 438 .
- Each output stream contains a respective output cursor 419 , 421 , 423 , and 425 which incorporates the transformation applied to the input stream to provide a demarcation point. In some cases, the demarcation point will be located within the stream corresponding to the output cursor.
- the output stream 430 is coupled to the output cursor 421 which demarcates the output stream 430 into a preceding output portion 432 and a following output portion 434 (indicating settled and unsettled values, respectively).
- a consolidated output stream 440 is generated and represents a transformation fully applied to the original input stream 402 .
- the stitching operation may be a simple concatenating function applied to the output stream segments in appropriate sequence, or it may involve more complex algorithms incorporating, for example, averaging functions and the like.
- the consolidated output stream 440 can be output with an output cursor 443 which demarcates a preceding, settled portion of the stream 442 and a following, unsettled portion of the stream 444 .
- the consolidated output stream 440 is identical to an output stream 450 produced by the non-partitioned transformation 400 B, which is also generated with an output cursor 453 , which demarcates a preceding, settled portion of the stream 452 and a following, unsettled portion of the stream 454 .
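The equivalence between the partitioned transformation 400 A and the non-partitioned transformation 400 B can be sketched under simplifying assumptions. The sketch uses a hypothetical element-wise transformation, fixed-size partitions, a cursor expressed as a global index, and stitching by plain concatenation; the function names are illustrative and do not appear in the disclosure.

```python
def transform(values, start, cursor):
    """Scale each value and tag it as settled if its global index precedes the cursor."""
    return [(2 * value, start + i < cursor) for i, value in enumerate(values)]


def partitioned_transform(values, cursor, size):
    """Transform fixed-size partitions independently, then stitch the results."""
    pieces = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        # Each partition receives its own copy of the cursor.
        pieces.append(transform(chunk, start, cursor))
    # Stitch by simple concatenation in sequence order.
    return [item for piece in pieces for item in piece]


stream = [1, 2, 3, 4, 5, 6, 7, 8]
stitched = partitioned_transform(stream, cursor=5, size=2)
direct = transform(stream, start=0, cursor=5)
# The consolidated output cursor is the count of outputs tagged as settled.
output_cursor = sum(1 for _, settled in stitched if settled)
```

In this sketch the stitched result matches the direct result exactly because the transformation is element-wise; transformations with overlapping windows would need a more involved stitching step.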
- a requester 112 can then receive the streaming output data and output cursor 110 (operation 208 ).
- the requester 112 can be a computer 114 .
- the output stream may be produced by either a partitioned or non-partitioned transformation as discussed above.
- the output cursor of the streaming output 110 can be used by the requester 112 to identify results of the transformation that are either settled and invariant or provisional and subject to change.
- FIG. 5 is an example computing system 500 that may implement various systems and methods discussed herein.
- the computer system 500 includes one or more computing components in communication via a bus 502 .
- the computing system 500 includes one or more processors 504 .
- the processor 504 can include one or more internal levels of cache (not depicted) and a bus controller or bus interface unit to direct interaction with the bus 502 .
- the processor 504 can perform calculations on data, including transformations 108 , 308 A-B, and 400 A-B, and can specifically implement the various methods discussed herein.
- Main memory 506 may include one or more memory cards and a control circuit (not depicted), or other forms of removable memory, and may store various software applications including computer executable instructions, that when run on the processor 504 , implement the methods and systems set out herein.
- Other forms of memory, such as a storage device 508 , may also be included and accessible by the processor (or processors) 504 via the bus 502 .
- the computer system 500 can further include a communications interface 518 by way of which the computer system 500 can connect to networks and receive data useful in executing the methods and system set out herein as well as transmitting information to other devices.
- the communications interface 518 may receive streaming input data 102 A-B, 302 A-B and/or 402 via, for example, the internet.
- the computer system 500 can include an output device 516 by which information is displayed, such as a display (not depicted).
- the computer system 500 can also include an input device 520 by which information, such as streaming input data 102 A-B, is input.
- Input device 520 can also be a scanner, keyboard, and/or other input devices for human interfacing as will be apparent to a person of ordinary skill in the art.
- FIG. 5 is but one possible example of a computer system that may employ or be configured in accordance with embodiments of the present disclosure. It will be appreciated that other non-transitory tangible computer-readable storage media storing computer-executable instructions for implementing the presently disclosed technology on a computing system may be utilized.
- the methods disclosed may be implemented as sets of instructions or software readable by a device. Further, it is understood that the specific order or hierarchy of steps in the methods disclosed are instances of example approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the methods can be rearranged while remaining within the disclosed subject matter.
- the accompanying method claims present elements of the various steps in a sample order, and are not necessarily meant to be limited to the specific order or hierarchy presented.
- the described disclosure may be provided as a computer program product, or software, that may include a computer-readable storage medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure.
- a computer-readable storage medium includes any mechanism for storing information in a form (e.g., software, processing application) readable by a computer.
- the computer-readable storage medium may include, but is not limited to, optical storage medium (e.g., CD-ROM), magneto-optical storage medium, read only memory (ROM), random access memory (RAM), erasable programmable memory (e.g., EPROM and EEPROM), flash memory, or other types of medium suitable for storing electronic instructions.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
A cursor demarcating a data set between a settled portion and an unsettled portion can be generated. A transformation can be applied to the data set, the transformation accounting for the cursor and transforming the settled portion of the data set differently than the unsettled portion of the data set in order to create a transformed output data set. The transformed output data set may further include a modified cursor based on the applied transformation and demarcating settled and unsettled portions of the transformed output data set.
Description
This application is related to and claims priority under 35 U.S.C. § 119(e) from U.S. Patent Application No. 62/609,180, filed Dec. 21, 2017 entitled “SYSTEM AND METHOD FOR MANAGING STREAMING CALCULATIONS,” the entire contents of which is incorporated herein by reference for all purposes.
Aspects of the present disclosure relate to performing transformations on streaming data.
Sensor measurement data is commonly saved as a historical record of a physical process being monitored by the sensors, often along with other observational, manual, calculated, simulated, or related data and metadata associated with the process. While the data in the historical record may be perceived as certain, or “settled,” there is often a need to modify the historical record. For example, there may be a need to update the record with verified data in order to more accurately represent the historical facts. In some instances, portions of a record may not be settled due to issues arising from varying data latency (delay of arrival), timestamp error, signal noise, or replacement of erroneous or missing data. Supposedly settled data can also be affected by combining multiple data sources to create better derived estimations, to refine or correct metadata, or to reinterpret conclusions or calculations based on data arriving after the calculations have completed and being applied in hindsight. As a result, output generated based in part on data subject to change will also be subject to change.
Stored data is ostensibly kept for future use and presumably affects an ongoing, continuous, or future analysis, decision, or automation, among other actions. For example, a valve may be intended to maintain an optimal flow of a contained gas by automatically opening and closing in response to changing heat and pressure values. Opening and/or closing the valve can have a certain and immutable impact that cannot be undone. In such cases, the actions (e.g., opening or closing the valve) may have been made based on data that is subject to change, and this fact can have far-reaching ramifications for systems of record, issues of human responsibility, and work processes. It may be useful to differentiate action based on the likelihood of the data changing, or it may be important (e.g., for assigning responsibility or detecting errors) to have some record of the likelihood of the data changing.
In some fields, calculations involving multiple data sources are common. For example, the manufacturing field involves multiple sensors each providing live, or “streaming,” data to a processor for calculations. In such cases, the calculation may be run periodically and without concern for variable “streaming” data rates and/or variable latency, and data will always be made available for the calculation through extrapolating or interpolating missing values; however, subsequent data from the data stream may make previous results incorrect. Alternatively, a calculation can be run at only a point in time when all streams have provided data for the calculation, resulting in accurate and invariant results, but consequently delayed. However, the latter approach is also incapable of providing provisional and/or “in-progress” results.
It is with these observations in mind, among others, that various aspects of the present disclosure were conceived.
Embodiments of the invention concern systems and methods for generating a demarcation between settled and unsettled data. In a first embodiment, a method can include accessing, by a hardware processor, a data set and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled; dividing, by the hardware processor, the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set; optimizing, by the hardware processor, one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation; applying, by the hardware processor, one of the first transformation or the optimized first transformation to the first portion of the data set, and yielding a transformed first portion of the data set; applying, by the hardware processor, one of the second transformation or the optimized second transformation to the second portion of the data set, and yielding a transformed second portion of the data set; and generating, by the hardware processor, an aggregated data set having the transformed first portion of the data set and the transformed second portion of the data set.
In one embodiment, the method further includes providing a first copy of the reference object to the first partition and a second copy of the reference object to the second partition, and wherein optimizing one of the first transformation or the second transformation is based on a relative position along the dimension of one of the first copy of the reference object and the second copy of the reference object.
In one embodiment, the method further includes outputting a reference object demarcating a first portion of the aggregated data set from a second portion of the aggregated data set. In one embodiment, one of the first partition or the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions. In one embodiment, each partition of the plurality of partitions receives a copy of the reference object. In one embodiment, the entirety of the data set is contained across an aggregation of the partitions.
In one embodiment, one of the optimized first transformation and the optimized second transformation includes fewer calculations than one of the first transformation or the second transformation, and one of the first transformation or the second transformation is optimized based on the reference object being positioned along the dimension after one of the first portion of the data set or the second portion of the data set.
In one embodiment, the reference object is positioned along the dimension at an interim point of one of the first portion of the data set and the second portion of the data set, the interim point between a first point and a second point of one of the first portion of the data set and the second portion of the data set, and one of the optimized first transformation or the optimized second transformation includes a first set of calculations applied to points preceding the interim point and a second set of calculations applied to points following the interim point.
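The position-based optimization described in the two preceding embodiments may be sketched as below. The sketch assumes a hypothetical partition given by a start index and a list of values, a cursor expressed as a global index, and two caller-supplied calculation functions; the names `plan` and `apply_partition` are illustrative only.

```python
def plan(start, stop, cursor):
    """Choose a strategy from the cursor position relative to [start, stop)."""
    if cursor >= stop:
        return "settled_only"    # cursor lies after the partition
    if cursor <= start:
        return "unsettled_only"  # cursor lies before the partition
    return "split"               # cursor is an interim point


def apply_partition(values, start, cursor, first_calc, second_calc):
    """Apply first_calc before the cursor and second_calc from the cursor onward."""
    mode = plan(start, start + len(values), cursor)
    if mode == "settled_only":
        # Optimized path: no per-point cursor comparison is needed.
        return [first_calc(v) for v in values]
    if mode == "unsettled_only":
        return [second_calc(v) for v in values]
    split = cursor - start
    return ([first_calc(v) for v in values[:split]]
            + [second_calc(v) for v in values[split:]])


result = apply_partition([1, 2, 3, 4], start=10, cursor=12,
                         first_calc=lambda v: v * 10,
                         second_calc=lambda v: -v)
```

Only the straddling partition pays for both sets of calculations; partitions wholly on one side of the cursor run a single, simpler path.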
In another embodiment, a method for generating a demarcation for a data set can include generating, by a hardware processor, a reference object associated with an accessed data set, the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, labeling, by the hardware processor, values of the first ordered portion as settled by changing metadata associated with each respective value of the first ordered portion; and labeling, by the hardware processor, values of the second ordered portion as unsettled by changing metadata associated with each respective value of the second ordered portion.
In one embodiment, the method may further include receiving, by the hardware processor, a transformation to apply to the accessed data set, applying, by the hardware processor, the transformation to the accessed data set, wherein the applied transformation factors in the reference object for the values of the second ordered portion to produce a transformed data set; and generating, by the hardware processor, an output data set comprising the transformed data set and one of the reference object or a transformed reference object.
In one embodiment, labeling the values of the second ordered portion comprises changing the metadata associated with each respective value of the second ordered portion according to a continuous value of uncertainty, and wherein applying the transformation further factors in the continuous value of uncertainty for a plurality of values of the second ordered portion. In one embodiment, the dimension is time. In one embodiment, the reference object comprises an input cursor associated with a last accessed data point within the accessed data set and the second ordered portion comprises predicted values based on the first ordered portion.
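Labeling values by changing their metadata can be sketched as follows. The `Point` class, the `settled` and `uncertainty` metadata keys, and the linear growth of uncertainty with distance from the cursor are all assumptions made for illustration; the disclosure does not prescribe a particular representation or uncertainty model.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    key: float                      # position along the ordering dimension
    value: float
    meta: dict = field(default_factory=dict)


def label(points, cursor_key, uncertainty_per_unit=0.1):
    """Mark points at or before the cursor as settled and the rest as unsettled."""
    for point in points:
        if point.key <= cursor_key:
            point.meta["settled"] = True
            point.meta["uncertainty"] = 0.0
        else:
            point.meta["settled"] = False
            # Assumed model: uncertainty grows with distance past the cursor.
            distance = point.key - cursor_key
            point.meta["uncertainty"] = min(1.0, distance * uncertainty_per_unit)
    return points


points = label([Point(float(k), 100.0 + k) for k in range(5)], cursor_key=2.0)
```

A downstream transformation could then weight or defer each unsettled value according to its `uncertainty` entry.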
In one embodiment, the accessed data set comprises a first input data stream and a second input data stream, the first input data stream associated with a first reference object that demarcates a first ordered portion of the first input data stream from a second ordered portion of the first input data stream, the second input data stream associated with a second reference object that demarcates a first ordered portion of the second input data stream from a second ordered portion of the second input data stream, and further comprising applying the transformation to the first input data stream and the second input data stream with reference to the respective first reference object and the second reference object to generate the result in the form of a transformed output data stream, the transformed output data stream including a third reference object that demarcates between a settled output portion of the output data stream and an unsettled output portion.
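A transformation over two input streams with separate reference objects may be sketched as below. The sketch assumes index-aligned streams, a hypothetical element-wise sum as the transformation, cursors expressed as counts of leading settled values, and a conservative rule that an output is settled only when both of its inputs are settled; other rules are possible.

```python
def combine(stream_a, cursor_a, stream_b, cursor_b):
    """Sum two aligned streams and derive a third, output cursor."""
    output = [a + b for a, b in zip(stream_a, stream_b)]
    # Assumed rule: the output is settled only up to the earlier input cursor.
    output_cursor = min(cursor_a, cursor_b, len(output))
    return output, output_cursor


output, output_cursor = combine([1, 2, 3, 4], 3, [10, 20, 30, 40], 2)
```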
In one embodiment, the accessed data set comprises a first input data stream and a second input data stream, the first input data stream associated with a first reference object that demarcates a first ordered portion of the first input data stream from a second ordered portion of the first input data stream, the second input data stream associated with a second reference object that demarcates a first ordered portion of the second input data stream from a second ordered portion of the second input data stream.
In one embodiment, the reference object is based on a variable received with the accessed data set. In one embodiment, the reference object is computed in real time as the data set is accessed and by applying calculations as data is settled. In one embodiment, the reference object is modified.
In another embodiment, a non-transitory computer-readable medium may store computer-executable instructions that cause one or more processors to access a data set and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled, divide the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set, optimize one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation, apply one of the first transformation or the optimized first transformation to the first portion of the data set, and yielding a transformed first portion of the data set, apply one of the second transformation or the optimized second transformation to the second portion of the data set, and yielding a transformed second portion of the data set, and generate an aggregated data set comprising the transformed first portion of the data set and the transformed second portion of the data set, and an output reference object demarcating a first portion of the aggregated data set from a second portion of the aggregated data set.
In one embodiment, one of the first partition and the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a distinct portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions.
Ordered input data can be processed by algorithms including, for example, transformations and calculations, to generate correspondingly ordered output data. In one embodiment, the ordered output data can include a reference object to a demarcation point along a dimension upon which the data is ordered. The demarcation point can differentiate, for example, data that is settled from data that is unsettled or merely provisional. Users of the ordered output data can then use the demarcation point, referred to herein as a “cursor,” in determining how to record, manage, calculate, and/or communicate the data, or provide the output data to other systems and processes that may use it as a respective input. The cursor may be based on time or a timestamp, which may or may not be an attribute of the data or data stream itself, as well as other attributes of the data which can be used to connote a demarcation point between settled and unsettled data.
Ordered input data may constitute a set of data points. Data points or sets of points may be recorded with an associated timestamp or timestamps, representing a time of measurement, a time of recording, a time of arrival to the recording device, or other meaningful time references. Metadata can be included or associated with the ordered input data and may represent measurement accuracy, precision, tolerance, confidence, variability, source, handling, processing information, or other information to inform proper use and interpretation of the associated ordered input data. Even when correctly arranged in order, data points have varying levels of certainty distinguishable by a demarcation along an ordered sequence. In other words, a timestamp alone may fail to inform a reviewer whether the data associated with the timestamp, before the timestamp, or after the timestamp have any shared values or qualities of stability.
The reinterpretation of events is a particularly challenging situation. For example, early warnings or other indicators of impending failure (or, conversely, an unanticipated opportunity) may justify immediate actions (e.g., further investigation, a change in operations, initiation of repairs, etc.), only to find later on that the events unfolded differently than indicated or anticipated. Along the same lines, subsequent events may unfold so as to make previously uncertain past records certain by providing opportunity for the recorded data and dependent calculations to be assessed as being correct or properly interpreted. The above phenomena may delineate a means for workflows, work processes, and calculations to handle “subject to change,” “unlikely to change,” and other variants differently.
The delineation discussed above is referred to in this disclosure as the “settled” (or, conversely, “unsettled”) nature of the data. For the sake of explanation, and without imputing any limitations, 100% certain data (such as data that has been directly observed) would naturally be conceptually deemed as immutable or settled, whereas 0% certainty (such as data that is entirely speculative and yet to be observed) would imply that the data, related data, and/or metadata may change, is uncertain, provisional or the like and thus the data is unsettled. It is to be understood that other measures, either continuous or discrete, of unsettledness may be used and that unsettled data is such because it represents only one possibility (intended to be the most likely possibility) in a set of potentially infinite future variants. To be clear, unsettled data may change but does not always or necessarily change. Moreover, the unsettled data itself may not change, but some related data or metadata, for example, may change or not yet be available and upon which the settled nature of the data itself depends, and hence the data cannot be considered settled until the related data, metadata, etc., is received, correlated, considered in a computation, etc.
Recording, managing, calculating, and communicating data can include, without limitation, decisions or algorithms that determine when to store, cache, or discard data; when to calculate or recalculate dependent assessments or calculations; visual representations of certainty; logical and mathematical operations; propagation of certainty through dependent calculations; and optimizing the methodology in support of partitioning and distributing calculations for scale.
Calculation environments may deal with a static data set or with a data set that is continuously or periodically refreshed with new data, with or without the loss of previous data. The non-static environment is commonly referred to as “streaming” and may include transformations being updated to address new inputs. Latency relates to the notion that newly arriving data inherently lags behind some actual measurement, due to physical, transport, processing, or network delays, among other things. Furthermore, each input may have a different latency from other inputs, and each of the input latencies may itself be variable.
Another characteristic of calculation environments is “partitionability,” where an algorithm, calculation, search, or other transformation may be broken into smaller parts which are apportioned to other processing systems and the results reassembled in a manner that precisely reconstructs an end result as if no such partitioning has occurred. Partitioning provides speed advantages over non-partitioning systems. Ideally, partitioning should operate without a priori knowledge of the nature of the data and transformation being applied while still guaranteeing that the reassembled result is idempotent (both reproducible and traceable) and the same as if such calculation were performed without partitioning.
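Partitionability can be illustrated with a small sketch in which a mean is computed from per-partition partial results and then reassembled. The chunk size, the use of a thread pool as a stand-in for other processing systems, and the function names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partial(chunk):
    """Per-partition partial result: (sum, count)."""
    return sum(chunk), len(chunk)


def partitioned_mean(values, size):
    """Compute a mean from independently processed partitions."""
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(partial, chunks))
    total = sum(part_sum for part_sum, _ in parts)
    count = sum(part_count for _, part_count in parts)
    return total / count


values = list(range(1, 11))
reassembled = partitioned_mean(values, size=3)
direct = sum(values) / len(values)
```

With integer inputs the reassembled and direct results are identical; with floating-point inputs the order of summation may introduce small differences, which is one reason exact reconstruction needs care.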
It is natural to think of the stream of data as being updated in time, meaning that the key, or continuously updating dimension, is a timestamp associated with incoming values. However, the key dimension is not limited to time and may be any such dimensions along which data can be ordered. For example, any monotonically increasing physical or quantitative value may serve as a key. As discussed herein, and for non-limiting explanatory purposes only, the key is portrayed in the context of time.
The data streams may be characterized by multiple traits and behaviors. They may be time-value pairs, or samples, having a numeric, string, arbitrary scalar, or other static value and may be of the sort typically used to represent continuous signals. Alternatively, they may also be events, meaning instants or moments in time, or they may be states, which span a range in time (having a start time and an end time).
The streams may be stored in and accessed through databases and can be continuously updated as time progresses. The streams may have multiple and various sources such as temperature sensors, light sensors, and pressure sensors. The streams may have different or variable sample frequencies. For example, and without imputing limitation, a pressure sensor and a temperature sensor may transmit reading updates every second and every half-second, respectively. Further, the streams may have different and variable latencies, as discussed above. The example pressure sensor and temperature sensor may each transmit respective signals over different media such as wireless signal and cable, respectively. Whereas a signal over a cable may have consistent latency, a wirelessly transmitted signal may have a higher latency than that over the cable and also a highly variable latency as arbitrary changes in the intervening space between transmitter and receiver occur.
A device having a processor to perform calculations, such as a computer or a server, may receive one or more of the streams as input. For example, a server may monitor or otherwise receive or access a stream of pressure measurement data and a stream of temperature measurement data. Continuing the example, the calculation may include a time range over which to execute an algorithm using the pressure measurement data and the temperature measurement data. Results of a calculation may be in the form of a stream and/or can also be in the form of a non-streaming output value, such as a single number, which can be labeled with settled status (e.g., “settled” or “unsettled”) associated with an output cursor so as to denote a point in the streaming input up to which the corresponding non-streaming output value may be considered settled. For example, a time and corresponding data value may be associated with the output cursor and inform a requesting user that the output value is settled up to that time.
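For illustration only, and without imputing limitation, the labeling of a non-streaming output value with a settled status and an output cursor might be sketched in Python as follows. The names `ScalarResult` and `mean_over_range` are hypothetical and do not appear in this disclosure; the sketch assumes integer time keys, a non-empty time range, and an averaging calculation chosen purely as an example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScalarResult:
    """A single output value labeled with its settled status (hypothetical type)."""
    value: float
    status: str          # "settled" or "unsettled"
    output_cursor: int   # key up to which the value may be considered settled


def mean_over_range(stream: List[Tuple[int, float]], start: int, end: int,
                    cursor: int) -> ScalarResult:
    """Average the values whose keys fall in [start, end] and label the result.

    Assumes at least one sample lies inside the requested range.
    """
    values = [v for k, v in stream if start <= k <= end]
    # The result is settled only if the whole range lies at or before the cursor.
    status = "settled" if end <= cursor else "unsettled"
    return ScalarResult(sum(values) / len(values), status, min(end, cursor))


stream = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
r1 = mean_over_range(stream, start=1, end=3, cursor=3)  # range fully settled
r2 = mean_over_range(stream, start=1, end=4, cursor=3)  # range reaches past the cursor
```

In this sketch, a requesting user could read `r2.output_cursor` to learn that the value is settled only up to key 3, even though the requested range extends to key 4.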
Further, variable sample frequencies and variable latencies may cause the results of the calculations to be within a window of unsettledness for any time range that includes a stream, at the time of calculation, lacking a data value. For example, where a temperature sensor and a pressure sensor stream data at unaligned timestamps, the most recent calculation may always be within a window of unsettledness because the temperature sensor streams a temperature value when the pressure sensor stream is in between updates, and vice versa.
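As a non-limiting sketch of the window of unsettledness described above, the following Python fragment computes the span between the oldest and newest of the most recent keys received on each stream. The function name `unsettled_window` and the rule that a combined result is settled only up to the oldest per-stream latest key are assumptions made for illustration, not requirements of the disclosed system.

```python
from typing import Dict, Tuple


def unsettled_window(latest_key: Dict[str, float]) -> Tuple[float, float]:
    """Return (start, end) of the span in which at least one stream lacks data.

    Assumption for illustration: a combined result is settled only up to the
    oldest of the per-stream latest keys, and is provisional beyond it.
    """
    start = min(latest_key.values())
    end = max(latest_key.values())
    return start, end


# Pressure last reported at t = 10.0 s, temperature at t = 10.5 s (unaligned updates).
window = unsettled_window({"pressure": 10.0, "temperature": 10.5})
```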
As depicted by FIG. 1, a streaming calculation system 100 can compute calculation results for any time range of 1 to N streaming input data 102A-B. The system 100 will compute the calculation results within an unsettledness window, such as that discussed above in reference to the example temperature and pressure sensor combination, and identify results that are settled and invariant as well as results that are provisional and subject to change with, in one example, a minimum or optimal use of computation resources. The streaming input data 102A-B is transmitted to a device 116 which is capable of receiving data and contains a processor able to perform calculations and algorithms included in data transformations. In some embodiments, the device 116 can be a personal computer or other computing device receiving streaming data over, for example, a network connection, accessing data from a database, or directly receiving data streams from a sensor, in various possible examples. In some embodiments, the device 116 may be a server or other device. For example, without imputing limitation, device 116 can be a specialized server capable of receiving and processing large volume and high frequency streaming data.
In one specific example, there is a one-to-one correspondence between input streams and cursors. In such an example, each input stream is associated with a cursor, whether the stream is provided with a cursor or the cursor is computed separately. But ultimately, each input stream to a transformation has a cursor prior to the execution of the transformation. As noted elsewhere, the cursor may be implicitly or explicitly defined. Similarly, each output stream from the transformation includes an associated output cursor.
While the examples herein provided are of streaming data, it is to be understood that any ordered (e.g., time series) data may be used. For example, and without imputing limitation, where a data set includes longitudinal data covering a span of years such as in the case of academic research studies, the longitudinal data may be processed in sequence based on associated time values of each data record. Such time values may be included directly with the data or may instead be associated to the data as metadata. The ordered data may then be processed in appropriate sequence, in which case it can be treated as streaming data. Unlike actual streaming data, however, the “stream” may be paused and restarted or otherwise manipulated in ways that may be unavailable for a “live” stream or otherwise require additional resources such as buffers and the like. Nevertheless, both streaming and recorded data may be similarly processed by the systems and methods discussed in this disclosure.
Referring to FIGS. 1 and 2, a cursor calculator 104 receives the N input data streams 102A-B (operation 202). The cursor calculator 104 can generate a cursor for each received streaming input data 102A-B (operation 204A). In various possible implementations, the system may also use a supplied input cursor, or may override or modify a supplied input cursor. The cursor marks a deterministic boundary where results at, or older than, the cursor are declared settled, and results newer than the cursor are unsettled. The cursor calculator 104 generates a cursor for each calculation execution and thus maintains a demarcation point which determines the amount of unsettled results. Where no calculations are executed, the cursor calculator 104 may still generate cursors for the streaming input data 102A-B. For example, a monitoring service may require only a cursor for streaming input data and therefore no calculations need be performed as only a cursor and associated stream are output from the system 100.
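The boundary semantics of the cursor can be illustrated with a minimal Python sketch. The `Sample` type and the `split_at_cursor` function are hypothetical names chosen for illustration; the sketch assumes a numeric time key and is not intended to represent the cursor calculator 104 itself.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One time-value pair of a stream (hypothetical illustration type)."""
    key: float    # position along the ordered dimension, e.g. a timestamp
    value: float


def split_at_cursor(stream: List[Sample],
                    cursor: float) -> Tuple[List[Sample], List[Sample]]:
    """Split a stream into (settled, unsettled) portions.

    Samples at, or older than, the cursor are treated as settled;
    samples newer than the cursor are treated as unsettled.
    """
    settled = [s for s in stream if s.key <= cursor]
    unsettled = [s for s in stream if s.key > cursor]
    return settled, unsettled


stream = [Sample(1.0, 20.1), Sample(2.0, 20.3), Sample(3.0, 20.2), Sample(4.0, 20.6)]
settled, unsettled = split_at_cursor(stream, cursor=2.0)
```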
Further, any or all of streaming input data 102A-B can be composite streams each including multiple ultimate sources of data themselves. For example, the streaming input data 102A may be a composite stream including both streaming temperature data and streaming pressure data. The N streaming input data 102A-B may each be distinct composite pressure and temperature data streams to be processed by the system 100 in order to, for example, discern average rate of change across all monitored gas processing units.
In some embodiments, the cursor may be provided with the streaming input data (not depicted). The provided cursor may be included explicitly or be included implicitly by means of, for example, a timestamp associated with a value of the streaming input, before which the data is settled and after which the data is unsettled. Where an implicit input cursor is provided, an output cursor can be generated based on the implicit cursor by recognizing the change of values from settled to unsettled. For example, where multiple input data streams each include respective timestamps, an output cursor may be generated based on a most recent timestamp value that is identical between the multiple data streams. In contrast, an explicit cursor may be provided as a well-defined data object or reference value which unambiguously refers to a specific data point in the input stream, or timestamp between data points in the input stream, as a demarcation point between settled and unsettled data (e.g., a pointer, index, etc.).
Alternatively or supplementally, rather than computing a cursor, a cursor may instead be obtained or received in some other way (operation 204B). For example, a cursor may be attributed to a data stream through a user interface or as noted immediately above, the input data itself may include a cursor.
In another example, a pressure sensor and a temperature sensor may both stream respective timestamped measurements to the cursor calculator 104. Further, the temperature sensor may update once every minute while the pressure sensor updates once every second. However, neither stream may provide an explicit input cursor. In such a case, the cursor calculator 104 may identify the last shared timestamp between the pressure sensor stream and the temperature sensor stream, and generate a cursor associated with that location in the composite input stream. However, basing the cursor on a timestamp generally, and on the last shared timestamp of the streams themselves specifically, are merely examples, and the cursor may be based on other information. For example, the cursor may be based on some time attribute but not necessarily a timestamp within the stream or streams themselves, such as a separate time stream, metadata related to a stream where the metadata includes a time attribute, relative time attributes related to receipt of the stream at whatever device is receiving the stream or streams, etc. Similarly, the cursor does not have to be bound to a key in the data stream such as an actual time value. In another example, the cursor may be based on timestamp values of the stream, which are examples of keys, but not tied to a specific timestamp value of the stream. The cursor may also be based on some other attribute of the data besides time or timestamps, whether the attribute is a key of the stream or not.
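The last-shared-timestamp approach described above can be sketched as follows, for explanatory purposes only. The function name `last_shared_timestamp` is hypothetical, and the sketch assumes integer timestamps in seconds that align exactly across streams, which real sensor feeds may not do.

```python
from typing import Iterable, Optional, Sequence


def last_shared_timestamp(streams: Sequence[Iterable[int]]) -> Optional[int]:
    """Return the most recent timestamp present in every stream, or None."""
    if not streams:
        return None
    shared = set(streams[0])
    for timestamps in streams[1:]:
        shared &= set(timestamps)   # keep only timestamps seen in all streams so far
    return max(shared) if shared else None


# Pressure updates every second; temperature updates every minute (keys in seconds).
pressure_ts = range(0, 151)          # 0, 1, ..., 150
temperature_ts = range(0, 151, 60)   # 0, 60, 120
cursor = last_shared_timestamp([pressure_ts, temperature_ts])
```

Under these assumptions the cursor lands on the most recent minute boundary at which both sensors have reported, and the pressure readings after that boundary remain unsettled.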
To be clear, multiple cursors may be generated along multiple dimensions of data or a single cursor may be complex (e.g., cursors reflecting splines or surfaces rather than single numbers). For example, and without imputing limitation, a cave mapping system may generate streaming mapping data from one or more probes and sensors. As the three-dimensional structure is mapped and data is accordingly streamed to a receiver, a cursor may demarcate the spatial coordinates of certain, or fully mapped, portions of the cave as compared to the in process or still rendering portions (e.g., still awaiting complete data from every sensor and probe).
Once generated, the cursor may then be combined with the corresponding streaming input data to generate streams 106A-B, which may be provided to a transformation 108. Transformation 108 can then be applied to the composite streams 106A-B to generate a streaming output data and an output cursor 110 (operation 206). The output cursor may be generated as part of the transformation. In some embodiments, the output cursor may be substantially similar to the cursor generated by the cursor calculator 104. For example, where a single, non-composite stream of input data is processed by the system 100 and includes an explicit input cursor as part of the metadata of the stream, the output cursor can be substantially identical to the received explicit input cursor. A substantially identical cursor may, for example, refer to the same relative data point within the sequence of data points comprising the stream. For example, an input stream having a first, second, and third value may be received with a cursor referring to the second value. An output stream having correspondingly transformed first, second, third values may be generated with an output cursor. In such a case, substantially similar input and output cursors may respectively refer to the second and transformed second values of the input and output streams.
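The three-value example above can be sketched in Python, under the assumption that the transformation acts on each value independently and that the cursor is expressed as an index into the sequence. The name `apply_pointwise` and the multiply-by-ten transformation are hypothetical and serve only to illustrate a substantially identical output cursor.

```python
from typing import Callable, List, Tuple


def apply_pointwise(values: List[float], cursor_index: int,
                    fn: Callable[[float], float]) -> Tuple[List[float], int]:
    """Transform each value independently; the cursor keeps its relative position."""
    return [fn(v) for v in values], cursor_index


# Input cursor refers to the second value (index 1).
outputs, output_cursor = apply_pointwise([1.0, 2.0, 3.0], cursor_index=1,
                                         fn=lambda v: v * 10)
```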
In some embodiments, the output cursor may be substantially transformed due to the nature of the transformation 108 applied. For example, where the system 100 receives a large plurality of composite heat and pressure sensor streams having respective input cursors, the generated output data and output cursor 110 can constitute an average rate of change across all sensor inputs. In such a case, the transformation 108 may apply a variety of calculations accounting for variable latency and update frequency across all the streams, and thus generate a completely distinct output cursor that refers to a relative data point in the output sequence that is different than any single input cursor reference point.
The transformation 108 can be performed in either a partitioned process or a non-partitioned process, as further discussed below. Further, the transformation 108 can include multiple, different transformations applied to different input streams, or a singular transformation applied to multiple input streams collectively or individually.
Turning to FIG. 3, a process 300 for applying two transformations to two data streams is depicted. Transformations 308A and 308B receive input streams 301A and 301B, respectively. The input stream 301A can include an input cursor 304A which demarcates a stream portion 302A that is settled and a stream portion 306A that is unsettled or provisional and subject to change. An output stream 311A and an output cursor 305A are generated together. Where the output cursor 305A is coupled to the output stream 311A, the stream is demarcated into a settled stream portion 310A and an unsettled stream portion 312A, similarly to the input stream.
In some embodiments, the input stream 301A may be recorded in, for example, a database and thus the settled values 302A may be catalogued as being settled in the record, and updated accordingly as the stream continues to provide updates. Similarly, the output stream 311A may be recorded and the associated settled values 310A may also be catalogued as being settled in the record. Accordingly, the cursors 304A and 305A can be used to update this record on a rolling or streaming basis.
In some embodiments, data may be received in batches spanning multiple data points. For example, a pressure sensor may transmit a minute of sensor data which is measured every one second. As a result, the streamed data consists of batches of 60 sequential data points every minute. In the case of streamed batches, the input cursor 304A may be included with the batch and denote the most recent settled data point in the batch. Continuing the example, the pressure sensor readings may only be settled for the first 30 seconds of the batch update, while the second 30 seconds of readings may be of varying levels of unsettledness. In such a case, the input cursor 304A demarcates the 30 seconds of settled pressure sensor readings 302A from the second thirty seconds of pressure sensor readings 306A.
Further, the transformation 308A may, for example, estimate a temperature to the nearest whole number from the pressure sensor readings. In which case, the generated output stream 311A of batched data can include an output cursor 305A demarcating the first 35 seconds of the transformed batch update 310A as being settled values from the remaining 25 seconds of the transformed batch update 312A which are unsettled. In such a case, the transformation 308A was able to generate settled data from unsettled data because, for example, the conversion from pressure to temperature, measured to a low significant figure, only needed a pressure value being settled to a certain degree rather than being completely settled.
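One way to picture how a coarse transformation may yield settled output from partially settled input is sketched below. The sketch assumes, purely for illustration, that each unsettled pressure reading carries lower and upper bounds and that the pressure-to-temperature conversion is a simple linear function; the names `estimate_temperature` and `output_is_settled` and the constant 0.25 are hypothetical and are not taken from this disclosure.

```python
def estimate_temperature(pressure: float) -> float:
    """Hypothetical linear pressure-to-temperature conversion (illustrative constant)."""
    return 0.25 * pressure


def output_is_settled(pressure_low: float, pressure_high: float) -> bool:
    """True if every pressure within the bounds rounds to the same whole temperature.

    Because the conversion and rounding are both non-decreasing, comparing the
    rounded results at the two bounds is sufficient.
    """
    return round(estimate_temperature(pressure_low)) == round(
        estimate_temperature(pressure_high))


fully_settled = output_is_settled(100.0, 100.0)      # no remaining uncertainty
settled_enough = output_is_settled(100.4, 101.2)     # both bounds round to 25
still_unsettled = output_is_settled(100.0, 104.0)    # bounds round to 25 and 26
```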
An input cursor 304B is provided with the input stream 301B and, similarly to the above, demarcates the input stream 301B into a stream portion 302B that is settled and a stream portion 306B that is unsettled. When the transformation 308B is applied to the input stream 301B, an output stream 311B and output cursor 305B are generated. The output cursor 305B demarcates a settled output stream portion 310B and an unsettled output stream portion 312B. In this example, the input stream 301B is provided such that the cursor 304B and settled stream portion 302B are identical to input stream 301A's cursor (304A) and settled stream portion (302A). Note that the unsettled stream portions (306A and 306B) are quite different. If both input streams, which are identical in their settled portions, undergo the same transformation, they will produce the same output cursors (305A and 305B) and settled stream results (310A and 310B). To put it another way, unsettled input data does not influence the position of the output cursor or settled stream results; unsettled input data can only influence the output of unsettled data. This can be seen in 312A and 312B, where the unsettled output stream results differ in accordance with their unsettled inputs.
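The property that unsettled input data can only influence unsettled output data can be demonstrated with a short sketch. The trailing (causal) moving average used here is an assumption chosen because each output depends only on current and earlier inputs; the function names are hypothetical, and transformations 308A and 308B are not limited to this form.

```python
from typing import List, Tuple

Stream = List[Tuple[int, float]]  # (timestamp, value) pairs


def trailing_mean(stream: Stream, cursor: int, window: int = 2) -> Tuple[Stream, int]:
    """Causal moving average over the current and previous samples.

    No output looks ahead, so an output at or before the cursor is computed
    only from settled inputs and the input cursor can be reused for the output.
    """
    out: Stream = []
    for i, (ts, _) in enumerate(stream):
        recent = [v for _, v in stream[max(0, i - window + 1): i + 1]]
        out.append((ts, sum(recent) / len(recent)))
    return out, cursor


def settled_part(stream: Stream, cursor: int) -> Stream:
    """Return the samples at or before the cursor."""
    return [(ts, v) for ts, v in stream if ts <= cursor]


# Identical settled portions (timestamps 1 to 3), different unsettled tails.
stream_a = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0), (5, 18.0)]
stream_b = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 99.0), (5, -5.0)]

out_a, cur_a = trailing_mean(stream_a, cursor=3)
out_b, cur_b = trailing_mean(stream_b, cursor=3)
```

With these inputs the output cursors and the settled output portions match, while the outputs after the cursor differ along with their unsettled inputs.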
Whereas the transformation process 400B transforms the entirety of the stream in a single process, the transformation process 400A first generates a set of partitions 406. Each partition 408, 410, 412, and 414 can perform a calculation on respective portions of the input stream 402. Some partitions may receive portions of the stream 404A(i) and 404A(ii) preceding the input cursor 404C. These partitions may ignore the input cursor 404C and perform respective calculations accordingly because their respective data points are guaranteed to be settled values as they precede the cursor.
A set of partitions 416 includes the partitions 408, 410, 412, and 414 along with respective cursor copies 418, 420, 422, and 424. While the discussed partitions receive individual copies of a cursor, it is understood that other implementations may serve to embody the disclosed technology. For example, each partition may receive a reference to a shared cursor. The input cursor copy 418 indicates a point outside a respective span of consideration of the partition 408 and so the partition 408 may be able to optimize the transformation. In other words, the cursor copy 418 will not cause a change in the method and/or state related to the transformation process performed by the partition 408 on the portion 404A(i) of the data stream. In such an example, the cursor is positioned such that the transformation is entirely settled or unsettled.
As an example of a change in the method and/or state related to the transformation process stated above, a smoothing transformation might smooth an air temperature feed consisting of a local outdoors air temperature sensor data stream and a weather forecasting data stream. The air temperature feed may further be associated with a cursor as elsewhere discussed in this disclosure. The smoothing transformation might smooth only the local outdoor air temperature sensor feed up until a point on the air temperature feed associated with the cursor. For feed data after the cursor, the smoothing transformation may switch to smoothing the weather forecasting data stream. In the partitioned system, the partitions containing data preceding the cursor would smooth the local outdoor air temperature sensor feed component of the data stream, the partitions containing data following the cursor would smooth the weather forecasting data stream component, and the partition (or partitions) containing data spanning both sides of the cursor would accordingly smooth portions of both the local outdoor air temperature sensor feed component and the weather forecasting data stream component based on the location of the cursor within the partition.
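The per-partition behavior described above might be sketched as follows, without imputing limitation. The `Row` type, the `select_source` function, and the use of integer keys are assumptions for illustration; the sketch shows only how each partition chooses which component of the air temperature feed to smooth, and it leaves the smoothing step itself out.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    key: int
    sensor: float     # local outdoor air temperature sensor component
    forecast: float   # weather forecasting component


def select_source(partition: List[Row], cursor: int) -> List[float]:
    """Pick the component to be smoothed for each row of one partition."""
    if partition[-1].key <= cursor:       # whole partition precedes the cursor
        return [r.sensor for r in partition]
    if partition[0].key > cursor:         # whole partition follows the cursor
        return [r.forecast for r in partition]
    # The partition spans the cursor, so the choice is made row by row.
    return [r.sensor if r.key <= cursor else r.forecast for r in partition]


rows = [Row(k, 10.0 + k, 20.0 + k) for k in range(1, 7)]   # keys 1..6
partitions = [rows[0:2], rows[2:4], rows[4:6]]
cursor = 3
selected = [select_source(p, cursor) for p in partitions]
```

The first two branches correspond to partitions for which the cursor lies outside the span of consideration, so the row-by-row comparison can be skipped entirely for them.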
In some cases, a partition can receive a cursor copy whose demarcation point is located outside the span of the partition's stream portion, but the cursor may still be factored into the transformation due to relative proximity or for other purposes. Here, partition 410 receives the input cursor copy 420 along with the stream portion 404A(ii). Though the received stream portion precedes the original input cursor 404C, the cursor is still factored into the calculations of partition 410 due to proximity. As a result, an output cursor 421 is generated that indicates a demarcation point along the output stream 430 produced by the partition 410.
The partitions 408, 410, 412, and 414 produce a set 426 of respective output streams 428, 430, 436, and 438. Each output stream contains a respective output cursor 419, 421, 423, and 425 which incorporates the transformation applied to the input stream to provide a demarcation point. In some cases, the demarcation point will be located within the stream corresponding to the output cursor. Here, the output stream 430 is coupled to the output cursor 421 which demarcates the output stream 430 into a preceding output portion 432 and a following output portion 434 (indicating settled and unsettled values, respectively). Upon the partitioned output streams 428, 430, 436, and 438 being “stitched” together, a consolidated output stream 440 is generated and represents a transformation fully applied to the original input stream 402. The stitching operation may be a simple concatenating function applied to the output stream segments in appropriate sequence, or it may involve more complex algorithms incorporating, for example, averaging functions and the like.
The consolidated output stream 440 can be output with an output cursor 443 which demarcates a preceding, settled portion of the stream 442 and a following, unsettled portion of the stream 444. The consolidated output stream 440 is identical to an output stream 450 produced by the non-partitioned transformation 400B, which is also generated with an output cursor 453, which demarcates a preceding, settled portion of the stream 452 and a following, unsettled portion of the stream 454.
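The equivalence between the partitioned and non-partitioned results can be sketched in Python under simplifying assumptions: the transformation is pointwise (a Celsius-to-Fahrenheit conversion chosen only as an example), each partition receives a copy of the cursor, and stitching is a simple concatenation. All function names below are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, float]            # (key, value)
Labeled = Tuple[int, float, bool]    # (key, transformed value, settled?)


def transform(points: List[Point], cursor: int) -> List[Labeled]:
    """Pointwise example transformation that also labels each result as settled or not."""
    return [(k, v * 9 / 5 + 32, k <= cursor) for k, v in points]


def partition(points: List[Point], size: int) -> List[List[Point]]:
    """Break the stream into consecutive partitions of at most `size` points."""
    return [points[i:i + size] for i in range(0, len(points), size)]


def stitch(parts: List[List[Labeled]]) -> List[Labeled]:
    """Concatenate the partition outputs in their original sequence."""
    return [p for part in parts for p in part]


stream = [(k, float(k)) for k in range(1, 9)]   # keys 1..8
cursor = 5

whole = transform(stream, cursor)                                    # non-partitioned
pieces = [transform(part, cursor) for part in partition(stream, 2)]  # cursor copy per partition
stitched = stitch(pieces)
```

A transformation whose outputs depend on neighboring points would generally need overlapping partitions or a more involved stitching step than the concatenation assumed here.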
Turning again to FIGS. 1 and 2, a requester 112 can then receive the streaming output data and output cursor 110 (operation 208). In some embodiments, the requester 112 can be a computer 114. The output stream may be produced by either a partitioned or non-partitioned transformation as discussed above. The output cursor of the streaming output 110 can be used by the requester 112 to identify results of the transformation that are either settled and invariant or provisional and subject to change.
The computer system 500 can further include a communications interface 518 by way of which the computer system 500 can connect to networks and receive data useful in executing the methods and system set out herein as well as transmitting information to other devices. In some embodiments, the communications interface 518 may receive streaming input data 102A-B, 302A-B and/or 402 via, for example, the internet. The computer system 500 can include an output device 516 by which information is displayed, such as a display (not depicted). The computer system 500 can also include an input device 520 by which information, such as streaming input data 102A-B, is input. Input device 520 can also be a scanner, keyboard, and/or other input devices for human interfacing as will be apparent to a person of ordinary skill in the art. The system set forth in FIG. 5 is but one possible example of a computer system that may employ or be configured in accordance with embodiments of the present disclosure. It will be appreciated that other non-transitory tangible computer-readable storage media storing computer-executable instructions for implementing the presently disclosed technology on a computing system may be utilized.
In the present disclosure, the methods disclosed may be implemented as sets of instructions or software readable by a device. Further, it is understood that the specific order or hierarchy of steps in the methods disclosed are instances of example approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the methods can be rearranged while remaining within the disclosed subject matter. The accompanying method claims present elements of the various steps in a sample order, and are not necessarily meant to be limited to the specific order or hierarchy presented.
The described disclosure may be provided as a computer program product, or software, that may include a computer-readable storage medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A computer-readable storage medium includes any mechanism for storing information in a form (e.g., software, processing application) readable by a computer. The computer-readable storage medium may include, but is not limited to, optical storage medium (e.g., CD-ROM), magneto-optical storage medium, read only memory (ROM), random access memory (RAM), erasable programmable memory (e.g., EPROM and EEPROM), flash memory, or other types of medium suitable for storing electronic instructions.
The description above includes example systems, methods, techniques, instruction sequences, and/or computer program products that embody techniques of the present disclosure. However, it is understood that the described disclosure may be practiced without these specific details.
While the present disclosure has been described with references to various implementations, it will be understood that these implementations are illustrative and that the scope of the disclosure is not limited to them. Many variations, modifications, additions, and improvements are possible. More generally, implementations in accordance with the present disclosure have been described in the context of particular implementations. Functionality may be separated or combined in blocks differently in various embodiments of the disclosure or described with different terminology. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure as defined in the claims that follow.
Claims (10)
1. A method for generating a data set, the method comprising:
accessing, by a hardware processor, a data set of streamed sensor data and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled;
dividing, by the hardware processor, the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set;
optimizing, by the hardware processor, one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation;
applying, by the hardware processor, one of the first transformation or the optimized first transformation to the first portion of the data set, and yielding a transformed first portion of the data set;
applying, by the hardware processor, one of the second transformation or the optimized second transformation to the second portion of the data set, and yielding a transformed second portion of the data set; and
generating, by the hardware processor, an aggregated data set comprising the reference object, the transformed first portion of the data set and the transformed second portion of the data set.
2. The method of claim 1, further comprising dividing, by the hardware processor, the data set into a first partition and a second partition including the second portion of the data set; and providing a first copy of the reference object to the first partition and a second copy of the reference object to the second partition, and wherein optimizing one of the first transformation or the second transformation is based on a relative position along the dimension of one of the first copy of the reference object and the second copy of the reference object.
3. The method of claim 1, further comprising outputting a reference object demarcating a first portion of the aggregated data set from a second portion of the aggregated data set.
4. The method of claim 2, wherein one of the first partition or the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions.
5. The method of claim 4, wherein each partition of the plurality of partitions receives a copy of the reference object.
6. The method of claim 2, wherein the entirety of the data set is contained across an aggregation of the partitions.
7. The method of claim 1, wherein one of the optimized first transformation and the optimized second transformation includes fewer calculations than one of the first transformation or the second transformation, and wherein one of the first transformation or the second transformation is optimized based on the reference object being positioned along the dimension after one of the first portion of the data set or the second portion of the data set.
8. The method of claim 1, wherein the reference object is positioned along the dimension at an interim point of one of the first portion of the data set and the second portion of the data set, the interim point between a first point and a second point of one of the first portion of the data set and the second portion of the data set, and wherein one of the optimized first transformation or the optimized second transformation includes a first set of calculations applied to points preceding the interim point and a second set of calculations applied to points following the interim point.
9. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to:
access a data set of streamed sensor data and a reference object, the data set ordered along a dimension and the reference object demarcating a first ordered portion of the data set from a second ordered portion of the data set, and the first ordered portion preceding the reference object along the dimension and the second ordered portion following the reference object along the dimension, wherein values of the first ordered portion are settled and values of the second ordered portion are unsettled;
divide the data set into a first partition including a first portion of the data set and a second partition including a second portion of the data set;
optimize one of a first transformation or a second transformation based on a relative position along the dimension of the reference object to one of the first portion of the data set and the second portion of the data set, and yielding one of an optimized first transformation or an optimized second transformation;
apply one of the first transformation or the optimized first transformation to the first portion of the data set, and yielding a transformed first portion of the data set;
apply one of the second transformation or the optimized second transformation to the second portion of the data set, and yielding a transformed second portion of the data set; and
generate an aggregated data set comprising the transformed first portion of the data set and the transformed second portion of the data set, and an output reference object demarcating a first portion of the aggregated data set from a second portion of the aggregated data set.
10. The non-transitory computer-readable medium of claim 9 , further comprising instructions, that when executed by one or more processors, cause the one or more processors to
wherein one of the first partition and the second partition comprises a plurality of partitions and each partition of the plurality of partitions comprises a distinct portion of the data set, and wherein one of a transformation or an optimized transformation is applied to each of the plurality of partitions.
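The steps recited in claims 8 and 9 can be illustrated with a minimal sketch. It is not the patented implementation. All names (`split_at`, `settled_calc`, `unsettled_calc`, `choose_transformation`, `transform_stream`) are hypothetical, the reference object is modeled as a single position along the ordered dimension, and points at or before that position are assumed to be settled. The two calculations are placeholders that double each value and tag it. The output reference object is assumed to equal the input reference, which holds only because these placeholder calculations are pointwise.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]        # (position along the ordered dimension, value)
Result = Tuple[float, float, str]  # (position, transformed value, "final" or "provisional")
Transform = Callable[[List[Point]], List[Result]]


def split_at(points: List[Point], position: float) -> Tuple[List[Point], List[Point]]:
    """Split ordered points into those at or before `position` and those after it."""
    before = [p for p in points if p[0] <= position]
    after = [p for p in points if p[0] > position]
    return before, after


def settled_calc(points: List[Point]) -> List[Result]:
    """Placeholder calculation for settled values: results are marked final."""
    return [(t, v * 2.0, "final") for t, v in points]


def unsettled_calc(points: List[Point]) -> List[Result]:
    """Placeholder calculation for unsettled values: results are marked provisional."""
    return [(t, v * 2.0, "provisional") for t, v in points]


def choose_transformation(portion: List[Point], reference: float) -> Transform:
    """Select calculations from the position of the reference relative to the portion."""
    if not portion or portion[-1][0] <= reference:
        return settled_calc    # whole portion precedes the reference
    if portion[0][0] > reference:
        return unsettled_calc  # whole portion follows the reference

    # The reference sits at an interim point of the portion (as in claim 8):
    # one set of calculations before it, another set after it.
    def mixed(points: List[Point]) -> List[Result]:
        head, tail = split_at(points, reference)
        return settled_calc(head) + unsettled_calc(tail)

    return mixed


def transform_stream(
    points: List[Point], reference: float, boundary: float
) -> Tuple[List[Result], float]:
    """Partition, transform each portion, and aggregate with an output reference."""
    first_portion, second_portion = split_at(points, boundary)
    aggregated: List[Result] = []
    for portion in (first_portion, second_portion):
        aggregated.extend(choose_transformation(portion, reference)(portion))
    # Assumption: pointwise calculations leave the settled/unsettled boundary unchanged.
    return aggregated, reference


stream = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]
aggregated, output_reference = transform_stream(stream, reference=3.0, boundary=2.0)
```

In this example the partition boundary (2.0) and the reference (3.0) differ, so the first partition is entirely settled while the reference falls inside the second partition and splits its calculations. A windowed calculation such as smoothing would generally need to move the output reference rather than pass it through unchanged.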
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/224,619 US11074272B1 (en) | 2017-12-21 | 2018-12-18 | System and method for managing streaming calculations |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762609180P | 2017-12-21 | 2017-12-21 | |
US16/224,619 US11074272B1 (en) | 2017-12-21 | 2018-12-18 | System and method for managing streaming calculations |
Publications (1)
Publication Number | Publication Date |
---|---|
US11074272B1 (en) | 2021-07-27 |
Family
ID=76971427
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/224,619 Active 2039-06-09 US11074272B1 (en) | 2017-12-21 | 2018-12-18 | System and method for managing streaming calculations |
Country Status (1)
Country | Link |
---|---|
US (1) | US11074272B1 (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Schulz et al. | An enhanced visualization process model for incremental visualization | |
WO2018052015A1 (en) | Analysis support device for system, analysis support method and program for system | |
EP3376326A1 (en) | Control apparatus, data structure, and information processing method | |
US20200104737A1 (en) | Self-intelligent improvement in predictive data models | |
Cai et al. | Optimal stochastic scheduling | |
JP6403420B2 (en) | Server system for providing current and past data to clients | |
JP2019185751A (en) | Method of feature quantity preparation, system, and program | |
CN111639798A (en) | Intelligent prediction model selection method and device | |
US10331672B2 (en) | Stream data processing method with time adjustment | |
CN115130065B (en) | Method, device, device and computer-readable medium for processing characteristic information of supply side | |
CN111401940A (en) | Feature prediction method, feature prediction device, electronic device, and storage medium | |
CN106919706A (en) | Data updating method and device | |
US11004007B2 (en) | Predictor management system, predictor management method, and predictor management program | |
JP2008158748A (en) | Variable selection device and method, and program | |
US20230066703A1 (en) | Method for estimating structural vibration in real time | |
CN112632052A (en) | Heterogeneous data sharing method and intelligent sharing system | |
US11074272B1 (en) | System and method for managing streaming calculations | |
JP2015184818A (en) | Server, model application propriety determination method and computer program | |
JPWO2014188524A1 (en) | Work time estimation device | |
Rasmussen et al. | The Restricted Stochastic User Equilibrium with Threshold model: Large-scale application and parameter testing | |
CN116743637B (en) | Abnormal flow detection method and device, electronic equipment and storage medium | |
CN109388385B (en) | Method and apparatus for application development | |
CN113157716B (en) | Data processing method, device, equipment and medium | |
US8589360B2 (en) | Verifying consistency levels | |
US12271806B2 (en) | Artificial neural network training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
 | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY; Year of fee payment: 4 |