US20140164431A1 - Database and data bus architecture and systems for efficient data distribution - Google Patents
- Publication number
- US20140164431A1 (application US13/709,579)
- Authority
- US
- United States
- Prior art keywords
- data
- database
- data model
- request
- neutral
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F17/30864
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
Definitions
- the present application relates generally to database and data bus architectures.
- the present application relates generally to a database and data bus architecture arrangement providing for systems for efficient data distribution.
- an operating system executes on computing hardware, and can host a particular database management system and database storage arrangement.
- selected computer hardware having a particular system architecture (e.g., compliant with the x86, x86-64, IA64, PowerPC, ARM, or other system architectures)
- That operating system (e.g., Windows, Linux, etc.)
- relational databases have been developed, in which data requests, such as queries, can be submitted in a relational query structure (e.g., using SQL or some similar language).
- data in such relational databases are stored in records, with interrelationships across table entries in one or more tables, with query results returned in terms of row and table references.
- hierarchical databases have also been developed which store data in records, but generally query results are returned in record and set references.
- Still other database architectures are implemented using different access procedures, such as storage in columns, records, streams, or other structures.
- a computer-implemented method for managing distributed data using any of a plurality of data models includes determining a data request from one of a plurality of database interfaces, each database interface associated with a different data model type.
- the method further includes translating the data request to a second data request based at least in part on a data model neutral description of a data model that is associated with data and the database interface, wherein the data model neutral description is included in a plurality of descriptions of each of a plurality of different data models corresponding to the different data model types.
- the method also includes executing the second data request, thereby reflecting the data request in data storage such that data is managed consistently across each of the plurality of database interfaces.
- in a second aspect, a data storage system includes a plurality of database interfaces each associated with a different data model type and having a different set of database commands associated therewith.
- the data storage system further includes a data model neutral data layer including data storage distributed across a plurality of computing systems.
- the data model neutral data layer is configured to translate data requests from each of the plurality of database interfaces, based at least in part on database commands received at the plurality of database interfaces, to data model neutral data requests.
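As a minimal sketch of the translation step described above, the following Python shows two hypothetical database interfaces (one relational, one hierarchical) whose differently shaped commands are both translated into the same data model neutral request. The command shapes, the `NeutralRequest` type, and the `DESCRIPTIONS` mapping are all assumptions made for illustration; the source does not specify any of them.

```python
from dataclasses import dataclass

# Hypothetical data model neutral descriptions, one per data model.
# Each maps an interface-specific container name to a neutral entity name.
DESCRIPTIONS = {
    "relational": {"customers": "customer"},      # table -> entity
    "hierarchical": {"CUSTOMER-DS": "customer"},  # data set -> entity
}


@dataclass(frozen=True)
class NeutralRequest:
    """A request that does not depend on any one data model type."""
    operation: str  # "read" or "write"
    entity: str     # neutral entity name
    key: str        # neutral identifier of the requested item


def translate(model_type: str, command: dict) -> NeutralRequest:
    """Translate an interface-specific command into a neutral request."""
    description = DESCRIPTIONS[model_type]
    if model_type == "relational":
        # Assumed shape: {"verb": "SELECT", "table": ..., "where_id": ...}
        operation = "read" if command["verb"] == "SELECT" else "write"
        return NeutralRequest(operation, description[command["table"]], command["where_id"])
    if model_type == "hierarchical":
        # Assumed shape: {"verb": "FIND", "dataset": ..., "record_key": ...}
        operation = "read" if command["verb"] == "FIND" else "write"
        return NeutralRequest(operation, description[command["dataset"]], command["record_key"])
    raise ValueError(f"unsupported data model type: {model_type}")


relational_request = translate(
    "relational", {"verb": "SELECT", "table": "customers", "where_id": "42"}
)
hierarchical_request = translate(
    "hierarchical", {"verb": "FIND", "dataset": "CUSTOMER-DS", "record_key": "42"}
)
```

In this sketch both commands resolve to the same neutral request, which is what would let a shared data layer serve either interface from one set of data.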
- a computer-implemented method for managing distributed data using any of a plurality of data models includes receiving a query at a database interface selected from a group of database interfaces, each of the database interfaces associated with a different data model type and having a different set of supported database commands.
- the method also includes transmitting a data request from the database interface to a common data storage layer, the data request based on the query, and translating the data request to a data model neutral data request within the common data storage layer based at least in part on a description of a data model stored within a plurality of metadata atoms describing each of a plurality of different data models.
- Each of the plurality of different data models has one of the plurality of different data model types.
- the method further includes communicating the data model neutral data request to data storage systems within the common data storage layer model, the common data storage layer including data storage distributed across a plurality of computing systems.
- the method also includes receiving data representing a set of data model neutral results received from the plurality of computing systems in response to the data request, and translating the data to a format consistent with the data model and expected by the database interface responsive to the query.
- FIG. 1 is a logical diagram of a data storage system according to an example embodiment of the present disclosure;
- FIG. 2 is a logical diagram of a data storage system according to a second possible embodiment;
- FIG. 3 is a logical diagram of aspects of a data storage system of FIGS. 1-2;
- FIG. 4 is an example logical diagram illustrating a layout of computing resources in an environment implementing the data storage systems of FIGS. 1-3;
- FIG. 5 is a block diagram of an electronic computer system useable within the data storage systems disclosed herein;
- FIG. 6 is a flowchart of a method for managing distributed data across a plurality of data model types, according to an example embodiment;
- FIG. 7 is a flowchart of a method for managing distributed data using any of a plurality of data models, according to an example embodiment;
- FIG. 8 is a flowchart of a method for handling a data request based on a database command received from a database interface, according to an example embodiment.
- the present disclosure relates to database and data bus architectures.
- the database and data bus architectures disclosed herein represent systems in which a unified, data model neutral data storage arrangement can be used as a data layer, with existing database management systems operating to provide different views into a unified, data model neutral data layer.
- the data model neutral layer can maintain descriptions of the data models associated with each database interface to provide a definition that allows replication of data across different data models of different data model types.
- the data model neutral layer can maintain both descriptions of the data models associated with each database interface and a data model neutral data layer, thereby avoiding replication of data but rather maintaining a single data model neutral set of data, upon which various views can be generated for each of a plurality of database interfaces having different data model types.
- a data model corresponds to a particular arrangement of data for use in a database.
- the data model can correspond to a particular database structure or schema that is specific to the data stored in a database.
- a data model type corresponds to a particular type of arrangement of data, whether it be a relational, hierarchical, multidimensional, object oriented, columnar, network, record, or stream arrangements for data, or any other data model type.
- data model neutral data corresponds to data that is not stored in a manner that relies upon a particular data structure, but rather can be described across a variety of such structures. Examples of each of these concepts are generally provided in further detail below in conjunction with the various embodiments of the present disclosure.
- the data storage system 100 corresponds to an implementation of a data storage system in which data models are described in a data model neutral arrangement, but in which data is maintained associated with existing database systems. Accordingly, the data storage system 100 represents an arrangement in which a data model neutral software layer operates as a data bus for exchanging data across various databases each managed by separate database management systems, or database interfaces, having different data model types.
- the data storage system 100 includes a virtualization space 101 executable on a hardware layer 102 .
- the hardware layer 102 supports secure partition services 104 .
- the hardware layer 102 generally corresponds to a large, multiprocessor, networked arrangement including a plurality of computing systems.
- the hardware layer 102 can be assigned to and affiliated with particular portions of the data storage system 100 in a variety of ways, but generally provides processing and memory resources useable to implement a database and database application architecture.
- the hardware layer can be constructed from one or more server computers, an example of which is discussed below in connection with FIG. 5 .
- the secure partition services 104 provides a low-level software layer above the hardware layer 102 , and generally corresponds to a virtualization layer useable to host various types of operating systems that may or may not be compatible with the hardware layer 102 .
- the secure partition services 104 can correspond to a hypervisor software layer installed on one or more computing systems, capable of collectively partitioning hardware resources available within a computing system into a plurality of partitions.
- each of the partitions represents a defined collection of hardware resources capable of being allocated to a hosted operating system, such that the hosted system views the allocated resources, via the hypervisor, as a computing system itself.
- the secure partition services 104 correspond to S-Par secure partitioning hypervisor software from Unisys Corporation of Blue Bell, Pa. Of course, other secure partition services could be used as well.
- the secure partition services 104 host a set of architecture attributes 106 and a common data bus 108 .
- the architecture attributes 106 reside in a layer above the secure partition services 104 , in that they are published to various partitions 110 (shown as partitions 110 a - d ).
- the architecture attributes 106 can include, for example, emulated processing, memory, networking and/or other attributes made available to the partitions 110 .
- the common data bus 108 hosts and supports data exchange across the plurality of partitions 110 , to allow for cross-pollination of data between the partitions, for use by the operating systems and software installed thereon.
- the common data bus 108 stores metadata describing, for example, a particular file system and/or database structure or schema used in a particular partition, such that when data is stored or altered in that partition, the common data bus 108 detects the data change and replicates that change of data across the other partitions.
- the common data bus 108 can be configured to detect changes in data in virtual file systems or virtual database files in the various partitions 110 , and replicate data between those systems based on known interrelationships between those data structures.
- the common data bus 108 can be implemented using one or more transforms developed between source and target computing system file systems and/or database systems, and includes the software necessary to support export of data from each partition (e.g., from the file system within a particular partition, or within a database having a schema hosted within the partition).
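The replication behavior of the common data bus can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the `CommonDataBus` class, its method names, the partition names, and the field-renaming transforms are all hypothetical, and real change detection in virtual file systems or database files is replaced here by an explicit `store` call.

```python
from typing import Callable, Dict, Tuple

Record = Dict[str, str]
Transform = Callable[[Record], Record]


class CommonDataBus:
    """Toy bus: holds per-partition data and transforms between partition formats."""

    def __init__(self) -> None:
        self.partitions: Dict[str, Dict[str, Record]] = {}
        # (source partition, target partition) -> transform function
        self.transforms: Dict[Tuple[str, str], Transform] = {}

    def add_partition(self, name: str) -> None:
        self.partitions[name] = {}

    def add_transform(self, source: str, target: str, fn: Transform) -> None:
        self.transforms[(source, target)] = fn

    def store(self, source: str, key: str, record: Record) -> None:
        """Store a record in one partition, then replicate it to the others."""
        self.partitions[source][key] = record
        for (src, target), fn in self.transforms.items():
            if src == source:
                # Write the transformed copy directly so replication does not loop.
                self.partitions[target][key] = fn(record)


bus = CommonDataBus()
bus.add_partition("mcp")
bus.add_partition("os2200")
# Assumed schema difference: each partition names the same field differently.
bus.add_transform("mcp", "os2200", lambda r: {"CUST_NAME": r["name"]})
bus.add_transform("os2200", "mcp", lambda r: {"name": r["CUST_NAME"]})

bus.store("mcp", "42", {"name": "Ada"})
```

After the `store` call, the `os2200` partition holds the same logical record in its own field naming, which mirrors the idea of one transform per source/target pair.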
- each of the partitions 110 supported by the secure partition services 104 and common data bus 108 is configured to support any of a variety of operating systems and/or database management systems and database architectures.
- a first partition 110 a hosts a first operating system, depicted as an MCP operating system provided by Unisys Corporation of Blue Bell, Pa.
- other partitions within the system may host different types of systems; in the embodiment shown, a second partition 110 b hosts a second operating system, shown as the OS2200 operating system, also from Unisys Corporation of Blue Bell, Pa.
- a third operating system, simply illustrated as a coprocessor, or “CP”, is also illustrated as associated with a third partition 110 c.
- Other partitions, such as partitions maintaining third party operating systems (e.g., Linux, Windows-based, or other operating systems)
- each partition may include one or more data personalities 112 .
- Data personalities 112 generally refer to structures or arrangements by which data is accessed and understood.
- data personalities may correspond to a data model type of a database, such as a relational, hierarchical, multidimensional, columnar, network, record, stream or object oriented data model type.
- Data personalities generally describe the expected operation of an interface to data, rather than the specific structure of a given data set.
- Such a specific structure, or data model corresponds to a particular schema of that data set as may be designed within the data model type.
- the first partition 110 a, including the MCP operating system, hosts two data personalities: a relational data personality 112 a (such as would be expected of a SQL or other relational database) and a DMSII personality 112 b, useable with the DMSII database management system from Unisys Corporation of Blue Bell, Pa.
- the second partition 110 b is illustrated as supporting an RDMS personality 112 c, a DMS personality 112 d, and indexed files in a file system (i.e., a file-based data personality 112 e ).
- each of the partitions 110 a - c can be made available to a further partition or application executing within one of those partitions, illustrated as a data access application 114 .
- the application 114 can access one or more APIs 116 , shown as traditional APIs 116 a and third party APIs 116 b for accessing data stored using nonstandard third party data personalities.
- the APIs 116 are published for use with each of the variety of data personalities 112 , for accessing data in the various partitions.
- the application can access data as needed from each of the various data personalities, e.g., in a relational format from a relational database personality such as personality 112 a, or hierarchical data from a hierarchical database personality (e.g., the DMSII personality 112 b ), or other data access arrangements.
- use of a common data bus 108 to provide data synchronization across partitions, in particular in an example arrangement such as that depicted in FIG. 1 , provides a number of advantages over existing hypervisor systems or even existing data replication systems.
- because the application can access data from each of the various data personalities, the application can be designed to access data according to different personalities (rather than being written to interface with a particular data model type), and can request and receive data from a selected personality based on the suitability of the data model type associated with that data personality.
- an application could both store data according to a DMSII data personality 112 b, and could retrieve data in a reporting format from a relational data personality 112 a, or a multidimensional data personality, or some other convenient format.
- each of the data personalities is kept up-to-date via transformations of the data at the time it is stored in each personality, thereby providing convenient retrieval of data in a convenient format, from a supported API, at the application level regardless of whether the data was originally stored in a database having the particular personality from which retrieval is desired.
- data is available from each of the data personalities 112 at essentially data retrieval speeds, since each data personality would not be required to communicate across to other data personalities to retrieve such data (assuming sufficient time between data storage in one data personality and retrieval in another data personality to allow for replication of the data in each of the data models and data model types associated with each of the personalities supported within a particular system).
- an application development environment 118 could be included as well which allows a designer to create applications designed to interface with various data personalities via the APIs 116 a - b.
- the data personalities 112 allow applications to be written using the application development environment 118 that are capable of accessing data from any of the personalities.
- a remote system 120 , such as a client system or other remote server, can be communicatively connected to the virtual system 101 , e.g., for communication with the application 114 , or application development environment 118 .
- the application 114 or application development environment can have a web interface, either directly supported within one of the partitions in which the application or application development environment reside, or in a separate partition, managing access to that system.
- third party systems can be incorporated into the overall system 100 .
- one such third party system 122 can be included within the overall virtualized system 101 , hosted by secure partitioning services 104 , and a further third party system 124 is remote from the overall system 100 , and communicatively connected to the system by the common data bus 108 .
- These third party systems are shown to illustrate example interoperability of the common data bus 108 with third party systems.
- the common data bus 108 can be extended, on a case-by-case basis, to such third party systems by establishing a relationship between known data personalities of the supported systems and those developed by third parties.
- both third party systems 122 , 124 operate third party operating systems 126 , 128 , respectively, and have specific third party data personalities 130 , 132 . These may be the same, or different, operating systems and/or data personalities. Further, as illustrated in FIG. 1 , third party operating system 128 can be communicatively connected to the system despite running on incompatible third party hardware 134 .
- referring to FIG. 2 , an alternative embodiment of a database and data bus architecture is contemplated, in which a system 200 reduces the amount of data replication involved.
- a common data store 202 takes the place of the common data bus 108 for at least a supported portion of the system 200 , namely one or more partitions 110 having known data personalities.
- each of the partitions that are capable of connection to the common data store 202 no longer are required to independently maintain storage of data associated with the particular data personalities to which they relate, but instead request data from a common data store that stores data in a data model neutral format.
- although examples of a data model neutral format are discussed in further detail below, it is noted here that any of a variety of formats that do not specifically rely on positional interrelationships among data elements (e.g., within a common table or data record) to define relationships can be used.
- unstructured data, such as key-value pairs or other types of data labeling, could be used.
- the common data store 202 is configured to provide an interface between each of a plurality of data personalities 112 and the underlying data by providing a conduit for data storage from each of the supported partitions 110 .
- the common data store 202 is interfaced to partitions 110 a - c, and provides data to data personalities 112 a - f .
- data personalities 112 a - f , rather than representing database systems as in FIG. 1 , effectively act as data views on data in the common data store 202 .
- the common data store 202 can be interfaced to a common data bus 204 , which acts analogously to the common data bus 108 of FIG. 1 , but for only unsupported data structures, i.e., data personalities for which the common data bus 204 may have some knowledge of the data format type, but the common data store 202 lacks knowledge of the data format of the data personality itself.
- the common data store acts as a structure-independent database capable of being maintained in synchronization with external data personalities, such as data personalities 112 g, 112 h, using the common data bus 204 .
- the common data bus 204 would not be required to directly interface with data personalities 112 a - f , since those data personalities would not directly store data; rather, the common data store 202 would manage that data, and would be maintained in synchronization with the common data bus 204 .
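The division of labor between the common data store 202 and the common data bus 204 can be illustrated with a small sketch. Everything here is assumed for illustration: the class names, the listener mechanism, and the flat key format are hypothetical stand-ins. Supported personalities are modeled as views that hold no data of their own, while an external personality keeps its own copy that the bus keeps in step with the store.

```python
from typing import Callable, Dict, List


class CommonDataStore:
    """Single data model neutral copy of the data."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.listeners: List[Callable[[str, str], None]] = []

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        for notify in self.listeners:
            notify(key, value)


class CommonDataBus:
    """Replicates store changes to external personalities that keep their own copies."""

    def __init__(self, store: CommonDataStore) -> None:
        self.external_copies: Dict[str, Dict[str, str]] = {}
        store.listeners.append(self.replicate)

    def register_external(self, name: str) -> None:
        self.external_copies[name] = {}

    def replicate(self, key: str, value: str) -> None:
        for copy in self.external_copies.values():
            copy[key] = value


class ViewPersonality:
    """A supported personality: stores nothing itself, reads and writes through the store."""

    def __init__(self, store: CommonDataStore) -> None:
        self.store = store

    def put(self, key: str, value: str) -> None:
        self.store.write(key, value)

    def get(self, key: str) -> str:
        return self.store.data[key]


store = CommonDataStore()
bus = CommonDataBus(store)
bus.register_external("external_personality")  # stands in for, e.g., 112 g
relational = ViewPersonality(store)
dmsii = ViewPersonality(store)

relational.put("customer/42/name", "Ada")
```

A value written through one view is immediately readable through the other without any copy between them; only the external personality receives a replicated copy.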
- the system 300 generally includes an application layer 302 capable of accessing various data personalities 304 , examples of which, 304 a - b, represent a relational database and a DMSII database.
- a separate environment hosts an MCP system 306 , which can be located on a different partition from either of the data personalities, and is configured to host aspects of the common data store.
- the MCP system 306 acts as a service engine supporting data retrieval according to the data personalities 304 a - b, as would be dictated by query and storage commands received at those data personalities from the application layer.
- a data layer 308 resides beneath the data personalities 304 a - b, and can be executed across a plurality of partitions within a virtual environment.
- the data layer 308 includes data atoms 310 and metadata atoms 312 .
- the data atoms 310 generally include data stored via any of the data personalities 304 , but separated from the format or structure in which that data is stored. In other words, the data atoms 310 have a data model neutral format in which the structure of the data (i.e. its position on disk relative to other data) does not define interrelationships of the data (e.g., in a table/row format such as in a relational database, or in a hierarchical dataset/record arrangement).
- the data atoms can be implemented in key-value pairs, where the metadata atoms 312 associate keys with the specific logical format of that data. In other embodiments, other data model neutral data formats could be used, such as a triple, or some other type of data arrangement.
- the data is stored in a resource description framework (RDF). In such embodiments, the data is stored in records that include a number of data atoms, and associated metadata describing the interrelationships among the data, but which can be stored separately from the data.
- the metadata atoms 312 can be maintained in key-value pairs or other analogous structures, and define databases based on a description of the database schema, which may be received at the data layer 308 in, for example, an XMI or other markup language format, thereby allowing decoupling of structure (in the metadata) from the data values themselves.
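One way to picture the split between data atoms and metadata atoms is the sketch below, which breaks a single relational-style row into key-value data atoms and rebuilds it using only the schema held in the metadata. The key formats (`model:<name>:...` and `<model>/<id>/<column>`), the `orders` model, and both helper functions are assumptions for illustration; the source does not define a concrete encoding.

```python
# Metadata atoms: key-value pairs describing structure, kept apart from the values.
metadata_atoms = {
    "model:orders:type": "relational",
    "model:orders:columns": "order_id,customer,total",
    "model:orders:primary_key": "order_id",
}


def row_to_atoms(model: str, row: dict) -> dict:
    """Split one row into data atoms keyed by model, primary key value, and column."""
    pk_column = metadata_atoms[f"model:{model}:primary_key"]
    pk_value = row[pk_column]
    return {f"{model}/{pk_value}/{column}": str(value) for column, value in row.items()}


def atoms_to_row(model: str, pk_value: str, atoms: dict) -> dict:
    """Rebuild a row from data atoms, using column names taken from the metadata."""
    columns = metadata_atoms[f"model:{model}:columns"].split(",")
    return {column: atoms[f"{model}/{pk_value}/{column}"] for column in columns}


data_atoms = row_to_atoms(
    "orders", {"order_id": "A1", "customer": "Ada", "total": "19.50"}
)
restored = atoms_to_row("orders", "A1", data_atoms)
```

Because the column list lives only in the metadata, the stored values carry no positional structure of their own, which is the decoupling the passage describes.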
- each of the data personalities 304 has associated therewith a set of one or more agents useable to format data received from the data atoms 310 into an arrangement that is expected by that data personality 304 .
- the structure corresponds to a data block that contains data responsive to a query, formatted in an arrangement as expected by the data personality.
- the structure could be a block of data containing records in a format normally returned from a portion of a table or tables of a relational database, or dataset and record entries including one or more entries responsive to a query of a hierarchical database.
- the data returned to a data personality is returned in a way that is consistent with the data model associated with that data personality.
- the data personality representing the database management system (albeit without managing the underlying data) will receive the data block having a recognizable structure to that data personality, and will extract the responsive data from that data block for return to the application from which a query or other data request was received.
- a data agent generator 314 manages metadata describing data formats and/or data format types associated with data collections defined using each of the associated data personalities 304 .
- the data agent generator 314 maintains the collection of metadata atoms 312 that describe each of the data formats of databases, and generates data agents 316 associated with each data personality 304 that can format the data stored in a data model neutral format.
- the data agent generator 314 generates a row agent 316 a and a table agent 316 b for response to data inquiries and storing data associated with a relational personality 304 a.
- the data agent generator 314 also generates a set agent 316 c, as well as a data set agent 316 d and a record agent 316 e associated with a DMSII data personality.
- the set agent 316 c includes sub-agents, such as DMSII key agents 318 a - b , which can be used to interrelate records based on keys provided within the DMSII database architecture, and which are tracked in the metadata atoms 312 .
- based on the personality to which the data agent interfaces, different types of data agents could be generated by the data agent generator 314 , incorporating metadata as defined in the metadata atoms 312 .
- as the data agents receive requests for data from the various data personalities 304 , those data agents can manage requests for and receipt of data from the underlying data atoms 310 .
- the data agents 316 can also manage the various tasks typically performed in database management systems but not intrinsically tied to the structure of the data, such as transaction management, recovery, backup, and other data functions.
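The role of the data agent generator can be sketched as a factory that reads metadata for a personality and returns a formatting function for it. This is a simplified illustration under stated assumptions: the `generate_agent` function, the shape of `DATA_ATOMS` and `METADATA_ATOMS`, and the `EMPLOYEE` data set are hypothetical, and the agents here only format data rather than handling transactions, recovery, or backup.

```python
from typing import Callable, Dict, List

# Hypothetical neutral data: item id -> field -> value.
DATA_ATOMS: Dict[str, Dict[str, str]] = {
    "1": {"name": "Ada", "dept": "R&D"},
    "2": {"name": "Grace", "dept": "Ops"},
}

# Hypothetical metadata describing what each personality expects.
METADATA_ATOMS = {
    "relational": {"columns": ["name", "dept"]},
    "dmsii": {"data_set": "EMPLOYEE", "key_field": "name"},
}


def generate_agent(personality: str) -> Callable[[List[str]], object]:
    """Build an agent that formats neutral data for one data personality."""
    meta = METADATA_ATOMS[personality]
    if personality == "relational":
        def row_agent(item_ids: List[str]) -> List[tuple]:
            # One tuple per item, in the column order the personality expects.
            return [tuple(DATA_ATOMS[i][c] for c in meta["columns"]) for i in item_ids]
        return row_agent
    if personality == "dmsii":
        def record_agent(item_ids: List[str]) -> dict:
            # Records grouped under a data set name and indexed by a key field.
            records = {DATA_ATOMS[i][meta["key_field"]]: DATA_ATOMS[i] for i in item_ids}
            return {meta["data_set"]: records}
        return record_agent
    raise ValueError(f"no agent defined for personality: {personality}")


rows = generate_agent("relational")(["1", "2"])
records = generate_agent("dmsii")(["2"])
```

The same underlying items come back as ordered tuples for the relational personality and as keyed records for the DMSII-style personality.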
- the data atoms 310 generally represent data stored in a plurality of databases or other data structures; as such, in typical arrangements the data atoms are stored across a plurality of computing systems. In such embodiments, the data atoms 310 are generally distributed across a number of computing systems, or partitions. As such, in typical implementations of such systems requested data is to be retrieved from more than one computing system or partition. Accordingly, within the common data store 202 , an implementation for data model neutral data retrieval is implemented in which massively-parallel queries can be processed and query results compiled and returned. In one example embodiment, the common data store implements a map-reduce technique for query processing and/or storage, such as the Hadoop Map-Reduce algorithm. Other data processing implementations could be used as well.
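A map-reduce style query over distributed data atoms can be sketched in plain Python as below. This is not Hadoop and does not use any Hadoop API; it is a single-process illustration in which a thread pool stands in for separate computing systems, and the partition contents and the "total per region" query are invented for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data, already split across three partitions.
PARTITIONS = [
    [{"region": "east", "total": 10}, {"region": "west", "total": 5}],
    [{"region": "east", "total": 7}],
    [{"region": "west", "total": 1}, {"region": "north", "total": 4}],
]


def map_partition(items):
    """Map phase: emit (key, value) pairs from one partition's items."""
    return [(item["region"], item["total"]) for item in items]


def shuffle(mapped_outputs):
    """Group values from all partitions by key."""
    grouped = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_groups(grouped):
    """Reduce phase: combine each key's values into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(partitions):
    """Run the map phase on each partition in parallel, then shuffle and reduce."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_partition, partitions))
    return reduce_groups(shuffle(mapped))


totals = run_query(PARTITIONS)
```

Each partition is scanned independently and only the small intermediate pairs are combined, which is the property that makes this pattern suit data spread over many systems.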
- referring to FIGS. 1-3 overall, it is noted that these overall systems allow for use of data personalities by application programs in the same manner as is traditionally provided by database management systems. Accordingly, since such an arrangement is typically located in a large-scale multi-server environment, applications have a choice regarding the specific data personality from which data is requested, despite the fact that data may not have originally been stored using that data personality, and in implementations of FIGS. 2-3 , the data is maintained in a common data store in a data model neutral format.
- the arrangement 400 includes a plurality of logical computing systems 402 a - d , or partitions.
- Each of the logical computing systems 402 a - d can include a collection of computing resources, such as a processor, memory resources, disk resource, network or communications resources, and other resources typically present on a computing system.
- An example of a collection of physical computing resources, formed as a typical discrete electronic computing system is described below in connection with FIG. 5 .
- each of the logical computing systems 402 a - d hosts secure partition services 404 , which define the set of physical computing resources available to higher-layer software, as well as providing an interface between that higher-layer software and the physical computing resources allocated to the particular logical computing system 402 .
- the partition services 404 provide virtualization and security services, as well as backup and recovery services, for each partition.
- the arrangement 400 includes a control partition 406 , guest partitions 408 a - b, and a services partition 410 .
- the control partition 406 schedules allocation of additional partitions to various guest processes as desired.
- the control partition 406 can execute a console application configured to allow reservation of resources for various guest partitions and/or service partitions.
- the guest partitions 408 a - b can execute any of a variety of guest applications.
- the guest partitions 408 a - b can host separate database management systems or data personalities on different hosted operating systems (e.g., the relational and DMSII database management systems of FIG. 3 ).
- guest partitions could host data storage partitions, or an implementation of the common data bus or common data store, a map-reduce service operation useable by the common data store, or other types of services discussed above.
- a services partition 410 hosts one or more services useable by the guest partitions, such as for remote systems communications, data management/replication, or other services.
- When implementing a system such as those shown in FIGS. 1-3 above in a virtualized computing arrangement such as is illustrated in FIG. 4, it is noted that although an example set of hosted, virtualized partitions is shown, other partitions could be included in such a system for hosting additional data personalities, applications, data nodes, data processing software, networking operations, or specialty processes. Furthermore, in some embodiments, at least some of the computing arrangements of FIGS. 1-3 can be implemented natively on a local system, rather than on a virtualized system.
- the computing system 500 can represent, for example, a native computing system within which one or more of the computing systems 402 a - d could be implemented, or multiple of which could be used to implement any of systems 100 - 300.
- the computing device 500 includes a memory 502 , a processing system 504 , a secondary storage device 506 , a network interface card 508 , a video interface 510 , a display unit 512 , an external component interface 514 , and a communication medium 516 .
- the memory 502 includes one or more computer storage media capable of storing data and/or instructions.
- the memory 502 is implemented in different ways.
- the memory 502 can be implemented using various types of computer storage media.
- the processing system 504 includes one or more processing units.
- a processing unit is a physical device or article of manufacture comprising one or more integrated circuits that selectively execute software instructions.
- the processing system 504 is implemented in various ways.
- the processing system 504 can be implemented as one or more processing cores.
- the processing system 504 can include one or more separate microprocessors.
- the processing system 504 can include an application-specific integrated circuit (ASIC) that provides specific functionality.
- the processing system 504 provides specific functionality by using an ASIC and by executing computer-executable instructions.
- the secondary storage device 506 includes one or more computer storage media.
- the secondary storage device 506 stores data and software instructions not directly accessible by the processing system 504 .
- the processing system 504 performs an I/O operation to retrieve data and/or software instructions from the secondary storage device 506 .
- the secondary storage device 506 includes various types of computer storage media.
- the secondary storage device 506 can include one or more magnetic disks, magnetic tape drives, optical discs, solid state memory devices, and/or other types of computer storage media.
- the network interface card 508 enables the computing device 500 to send data to and receive data from a communication network.
- the network interface card 508 is implemented in different ways.
- the network interface card 508 can be implemented as an Ethernet interface, a token-ring network interface, a fiber optic network interface, a wireless network interface (e.g., WiFi, WiMax, etc.), or another type of network interface.
- the video interface 510 enables the computing device 500 to output video information to the display unit 512 .
- the display unit 512 can be various types of devices for displaying video information, such as a cathode-ray tube display, an LCD display panel, a plasma screen display panel, a touch-sensitive display panel, an LED screen, or a projector.
- the video interface 510 can communicate with the display unit 512 in various ways, such as via a Universal Serial Bus (USB) connector, a VGA connector, a digital visual interface (DVI) connector, an S-Video connector, a High-Definition Multimedia Interface (HDMI) interface, or a DisplayPort connector.
- the external component interface 514 enables the computing device 500 to communicate with external devices.
- the external component interface 514 can be a USB interface, a FireWire interface, a serial port interface, a parallel port interface, a PS/2 interface, and/or another type of interface that enables the computing device 500 to communicate with external devices.
- the external component interface 514 enables the computing device 500 to communicate with various external components, such as external storage devices, input devices, speakers, modems, media player docks, other computing devices, scanners, digital cameras, and fingerprint readers.
- the communications medium 516 facilitates communication among the hardware components of the computing device 500 .
- the communications medium 516 facilitates communication among the memory 502 , the processing system 504 , the secondary storage device 506 , the network interface card 508 , the video interface 510 , and the external component interface 514 .
- the communications medium 516 can be implemented in various ways.
- the communications medium 516 can include a PCI bus, a PCI Express bus, an accelerated graphics port (AGP) bus, a serial Advanced Technology Attachment (ATA) interconnect, a parallel ATA interconnect, a Fibre Channel interconnect, a USB bus, a Small Computer System Interface (SCSI) interface, or another type of communications medium.
- the memory 502 stores various types of data and/or software instructions.
- the memory 502 stores a Basic Input/Output System (BIOS) 518 and an operating system 520 .
- BIOS 518 includes a set of computer-executable instructions that, when executed by the processing system 504 , cause the computing device 500 to boot up.
- the operating system 520 includes a set of computer-executable instructions that, when executed by the processing system 504 , cause the computing device 500 to provide an operating system that coordinates the activities and sharing of resources of the computing device 500 .
- the memory 502 stores application software 522 .
- the application software 522 includes computer-executable instructions, that when executed by the processing system 504 , cause the computing device 500 to provide one or more applications.
- the memory 502 also stores program data 524 .
- the program data 524 is data used by programs that execute on the computing device 500 .
- computer readable media may include computer storage media and communication media.
- a computer storage medium is a device or article of manufacture that stores data and/or computer-executable instructions.
- Computer storage media may include volatile and nonvolatile, removable and non-removable devices or articles of manufacture implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
- computer storage media may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), reduced latency DRAM, DDR2 SDRAM, DDR3 SDRAM, solid state memory, read-only memory (ROM), electrically-erasable programmable ROM, optical discs (e.g., CD-ROMs, DVDs, etc.), magnetic disks (e.g., hard disks, floppy disks, etc.), magnetic tapes, and other types of devices and/or articles of manufacture that store data.
- Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media.
- modulated data signal may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal.
- communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
- FIG. 6 illustrates a flowchart representing a method 600 for managing distributed data across a plurality of data model types, according to an example embodiment.
- FIG. 6 therefore represents a method 600 that can be performed by a common data bus, such as the common data bus 108 of FIG. 1 or the common data bus 204 of FIG. 2 .
- the method 600 of FIG. 6 begins when a data personality receives a database command (step 602 ).
- the database command can be, for example, a query or a data storage command, or another type of command expected to be received by a particular type of database management system analogous to that data personality.
- the database command can, in various embodiments, effect a change on data managed at the database associated with that data personality.
- In response to a detected change in the data managed by the data personality receiving the database command, the common data bus will detect a data request (i.e., a request to provide or a request to change data in a particular database), and an analogous data request will be formed by the common data bus.
- the data request can be to form a data model neutral change in data that would be analogous to the data change reflected by the data request.
- the common data bus (or alternatively, the data personality issuing the original data request) will issue a data request at the common data bus. That request will then be translated to a second type of data request (step 606 ).
- the second type of data request can take any of a number of forms, but generally is configured to replicate a change of data from the data personality receiving the request in a second data store having a different format from that data personality receiving an original database command.
- the second data request can be a data model neutral data request, or can be a data request in a different data model (i.e., at a different data personality) as compared to the original request.
- the second data request, if executed in a common data bus, causes synchronization of the data personality that is the target of the second data request with the data personality originally receiving the database command (step 608 ).
- the translation and execution of the first data request to different types of data requests may occur many times, such that each data personality maintains a synchronized set of data with each of the other data personalities.
- the specific data personalities to be synchronized for each database or data collection can be user selectable, thereby controlling the number of data personalities requiring synchronization.
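As a rough sketch of the flow of method 600, the following shows a common data bus that forms a data model neutral change from a command received at one data personality and replays it at each selected target personality. The class names, the tuple formats used for native commands, and the neutral change dictionary are all hypothetical choices for this example; the disclosure does not prescribe them.

```python
class Personality:
    """Hypothetical stand-in for a data personality holding its own copy of the data."""

    def __init__(self, name, to_neutral, from_neutral):
        self.name = name
        self.to_neutral = to_neutral      # native change -> neutral change
        self.from_neutral = from_neutral  # neutral change -> native change
        self.applied = []                 # native changes applied so far

    def apply(self, native_change):
        self.applied.append(native_change)


class CommonDataBus:
    def __init__(self):
        self.personalities = {}
        self.sync_targets = {}  # source name -> user-selected target names

    def register(self, personality):
        self.personalities[personality.name] = personality

    def select_sync(self, source, targets):
        self.sync_targets[source] = set(targets)

    def on_command(self, source, native_change):
        src = self.personalities[source]
        src.apply(native_change)                  # command received (cf. step 602)
        neutral = src.to_neutral(native_change)   # form the neutral request
        for name in sorted(self.sync_targets.get(source, ())):
            target = self.personalities[name]
            # Translate and execute at each target (cf. steps 606-608).
            target.apply(target.from_neutral(neutral))


def rel_to_neutral(change):
    _, table, row = change
    fields = {k: v for k, v in row.items() if k != "id"}
    return {"op": "put", "entity": table, "key": row["id"], "fields": fields}


def rel_from_neutral(n):
    return ("INSERT", n["entity"], {"id": n["key"], **n["fields"]})


def hier_to_neutral(change):
    _, path, fields = change
    entity, key = path.split("/")
    return {"op": "put", "entity": entity, "key": int(key), "fields": dict(fields)}


def hier_from_neutral(n):
    return ("STORE", f'{n["entity"]}/{n["key"]}', dict(n["fields"]))


bus = CommonDataBus()
relational = Personality("relational", rel_to_neutral, rel_from_neutral)
hierarchical = Personality("hierarchical", hier_to_neutral, hier_from_neutral)
bus.register(relational)
bus.register(hierarchical)
bus.select_sync("relational", ["hierarchical"])
bus.on_command("relational", ("INSERT", "customer", {"id": 42, "name": "Ada"}))
```

After the final call, the hierarchical personality has received an equivalent `STORE` change even though the command was issued in relational form. Because `select_sync` controls the targets, only the chosen personalities incur synchronization work.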
- the method 700 is generally executable within a common data store, such as data store 202 of FIG. 2 , with portions of the method 700 performed by agents and/or an agent generator within the common data store.
- the method 700 begins when a database command is received at a data personality, also known in this instance as a database interface (step 702 ).
- the data personality can be referred to as a database interface in this case because each of the data personalities, rather than storing data, represents an interface to data stored in an underlying common data store.
- the data personality performs a first data request based on the database command received by the data personality (step 704 ).
- the data request is generally a request for data from an underlying data collection, which would normally be issued from a database management system to an underlying database file system; however in the present disclosure, the data request is passed to a common data store. This can be, for example, issued to one or more data agents, such as the agents illustrated in FIG. 3 , as generated by a data agent generator specific to each data personality.
- the common data bus will receive the data request, and translate that data request to a second data request in a data model neutral format (step 706 ). For example, one or more data atoms will receive the data request and translate that data request to one or more data model neutral search or data operations, for example using a map-reduce operation across data distributed on a large number of physical systems in data model neutral data atoms. That data model neutral request will then be executed (step 708 ), managed by the data agents, and data will be returned via the data agents to the data personality from which the data request is received (step 710 ).
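The steps of method 700 can be pictured with a small sketch in which a relational-style interface stores nothing itself and instead passes each request to a common store holding data model neutral atoms. The `NeutralRequest` type, the atom layout, and the two-system split are assumptions for illustration; the query is given in already-parsed form rather than as SQL text to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class NeutralRequest:
    """Hypothetical data model neutral request: an entity plus attribute filters."""
    entity: str
    predicate: dict


# Hypothetical neutral atoms distributed across two computing systems.
SYSTEMS = [
    [{"entity": "customer", "key": 1, "attrs": {"name": "Ada", "city": "Oslo"}}],
    [
        {"entity": "customer", "key": 2, "attrs": {"name": "Bob", "city": "Rome"}},
        {"entity": "customer", "key": 3, "attrs": {"name": "Cy", "city": "Oslo"}},
    ],
]


def translate(table, where):
    """Translate a relational-style request to a neutral request (cf. step 706)."""
    return NeutralRequest(entity=table, predicate=dict(where))


def execute(request, systems):
    """Execute the neutral request against every system (cf. step 708)."""
    hits = []
    for atoms in systems:
        for atom in atoms:
            matches = all(
                atom["attrs"].get(k) == v for k, v in request.predicate.items()
            )
            if atom["entity"] == request.entity and matches:
                hits.append(atom)
    return hits


def relational_select(table, where):
    """Database interface: receive, translate, execute, return (cf. steps 702-710)."""
    request = translate(table, where)
    atoms = execute(request, SYSTEMS)
    # Present the neutral results as rows, the form a relational caller expects.
    return [{"id": a["key"], **a["attrs"]} for a in sorted(atoms, key=lambda a: a["key"])]


rows = relational_select("customer", {"city": "Oslo"})
```

The caller sees ordinary rows; the fact that the matching atoms came from two different systems is hidden behind the interface.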
- a further method 800 for handling a data request is illustrated, based on a database command received from a database interface, according to an example embodiment.
- the method 800 may be performed, for example, at a common data store, such as data store 202 as illustrated in FIG. 2 , using agents and associated data and metadata atoms as illustrated in connection with FIG. 3 .
- the method 800 is performed using one or more data personalities, or database interfaces, that have been preconfigured with the common data store (i.e., for which the common data store has metadata regarding the structure of databases managed by that data personality).
- the method 800 includes obtaining, from a metadata agent, metadata describing a logical structure of the database associated with that particular data personality (step 802 ). This can include, for example, obtaining metadata from a metadata store that was extracted from or otherwise separated from data that is stored in the common data store in a data model neutral format.
- the metadata agent can generate one or more database interface agents based on that metadata (step 804 ).
- the database interface agents are generated to be capable of parsing data and data requests received from a data personality, as well as to collect and logically arrange data to be returned to the data personality in response to a data request from that data personality.
- the data agents are generated based on the metadata describing the personality to be interfaced to the common data store.
- the method 800 will continue upon receipt of a data request at the common data store, for example from a data personality (step 806 ).
- the data request is received at one or more agents interfaced to the data personality, to determine the type of data request that is being made.
- the data request can be to store data in a particular logical location within a database, to retrieve data, to obtain a record count, or other types of database actions.
- the agent receiving the data request will parse the request to determine one or more actions to be taken across the distributed data storage systems associated with the common data store, and distribute that data request across the storage systems to obtain or modify data as required (step 808 ).
- results are formatted by the agent(s) associated with the data personality to be in a form understandable by the data personality (step 810 ).
- the results can then be passed back to the data personality, as if coming from an underlying data storage having a logical organization dictated by that data personality.
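A minimal sketch of method 800, under stated assumptions, is given below: an interface agent is generated from metadata describing one personality's logical structure, and that agent parses requests, distributes them across storage systems, and formats results. The metadata fields, the request dictionaries, and the modulo placement rule are invented for this example and are not part of the disclosure.

```python
# Hypothetical metadata describing one relational personality's logical structure.
METADATA = {"personality": "relational", "table": "customer", "columns": ["id", "name"]}

# Hypothetical distributed storage: one dict per system, keyed by (entity, key).
STORAGE_SYSTEMS = [
    {("customer", 2): {"name": "Bob"}},
    {("customer", 1): {"name": "Ada"}},
]


def generate_agent(metadata, systems):
    """Build an interface agent from metadata (cf. steps 802-804)."""
    table, columns = metadata["table"], metadata["columns"]

    def agent(request):
        action = request["action"]  # parse the request type (cf. step 806)
        if action == "store":
            row = request["row"]
            # Assumed placement rule: pick a system by key modulo system count.
            system = systems[row["id"] % len(systems)]
            system[(table, row["id"])] = {c: row[c] for c in columns if c != "id"}
            return 1
        if action == "count":
            return sum(1 for s in systems for (t, _key) in s if t == table)
        if action == "retrieve":
            # Gather from every system (cf. step 808).
            rows = [
                {"id": key, **fields}
                for s in systems
                for (t, key), fields in s.items()
                if t == table
            ]
            rows.sort(key=lambda r: r["id"])
            # Format in the column order the personality expects (cf. step 810).
            return [tuple(r[c] for c in columns) for r in rows]
        raise ValueError(f"unsupported action: {action}")

    return agent


agent = generate_agent(METADATA, STORAGE_SYSTEMS)
agent({"action": "store", "row": {"id": 3, "name": "Cy"}})
record_count = agent({"action": "count"})
records = agent({"action": "retrieve"})
```

Generating the agent from metadata, rather than hard-coding it, means that supporting another personality in this sketch would only require another metadata description.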
- Referring to FIGS. 1-8 generally, it is recognized that the various systems and methods described herein provide a number of advantages over existing database systems, and in particular for large-scale, large fanout databases requiring many physical computing systems for implementation.
- the various virtualization services on which the systems are provided allow for customized workload assignment, by placing the common data bus or common data store on entirely separate hardware resources as compared to the various data personalities which they serve.
Description
- The present application relates generally to database and data bus architectures. In particular, the present application relates generally to a database and data bus architecture arrangement providing for systems for efficient data distribution.
- In traditional system architectures, an operating system executes on computing hardware, and can host a particular database management system and database storage arrangement. For example, selected computer hardware having a particular system architecture (e.g., compliant with the x86, x86-64, IA64, PowerPC, ARM, or other system architectures) can host an operating system specifically written for or compiled for that architecture. That operating system (e.g., Windows, Linux, etc.) can then host a corresponding database and associated database management system.
- Within this construct, various database architectures have emerged. For example, relational databases have been developed, in which data requests, such as queries, can be submitted in a relational query structure (e.g., using SQL or some similar language). Generally, data in such relational databases are stored in records, with interrelationships across table entries in one or more tables, with query results returned in terms of row and table references. In other examples, hierarchical databases have also been developed which store data in records, but generally query results are returned in record and set references. Still other database architectures are implemented using different access procedures, such as storage in columns, records, streams, or other structures.
- Increasingly, a number of limitations of computing infrastructure have begun to affect these database arrangements. For example, some relational and hierarchical database management systems assume all data is to be stored on a particular partition or computing system, and as such are either unable to or are inefficient at obtaining data stored in separate memories or memory partitions. Furthermore, existing application level programs may be written for use with a relational system when data is stored in a hierarchical database, or vice versa, thereby complicating data access issues. In such situations, it may be the case that separate transactional and relational database instances must be maintained, leading to data consistency and replication difficulties. Or, hierarchical database commands must be translated to a relational database language, accounting for the difference between such data models. In both circumstances, inefficiencies exist in storage and retrieval of data, and limitations as to methods (i.e., database commands and query languages) persist.
- For these and other reasons, improvements are desirable.
- In accordance with the following disclosure, the above and other issues are addressed by the following:
- In a first aspect, a computer-implemented method for managing distributed data using any of a plurality of data models is disclosed. The method includes determining a data request from one of a plurality of database interfaces, each database interface associated with a different data model type. The method further includes translating the data request to a second data request based at least in part on a data model neutral description of a data model that is associated with data and the database interface, wherein the data model neutral description is included in a plurality of descriptions of each of a plurality of different data models corresponding to the different data model types. The method also includes executing the second data request, thereby reflecting the data request in data storage such that data is managed consistently across each of the plurality of database interfaces.
- In a second aspect, a data storage system is disclosed. The data storage system includes a plurality of database interfaces each associated with a different data model type and having a different set of database commands associated therewith. The data storage system further includes a data model neutral data layer including data storage distributed across a plurality of computing systems. The data model neutral data layer is configured to translate data requests from each of the plurality of database interfaces, based at least in part on database commands received at the plurality of database interfaces, to data model neutral data requests.
- In a third aspect, a computer-implemented method for managing distributed data using any of a plurality of data models is disclosed. The method includes receiving a query at a database interface selected from a group of database interfaces, each of the database interfaces associated with a different data model type and having a different set of supported database commands. The method also includes transmitting a data request from the database interface to a common data storage layer, the data request based on the query, and translating the data request to a data model neutral data request within the common data storage layer based at least in part on a description of a data model stored within a plurality of metadata atoms describing each of a plurality of different data models. Each of the plurality of different data models has one of the plurality of different data model types. The method further includes communicating the data model neutral data request to data storage systems within the common data storage layer model, the common data storage layer including data storage distributed across a plurality of computing systems. The method also includes receiving data representing a set of data model neutral results received from the plurality of computing systems in response to the data request, and translating the data to a format consistent with the data model and expected by the database interface responsive to the query.
- FIG. 1 is a logical diagram of a data storage system according to an example embodiment of the present disclosure;
- FIG. 2 is a logical diagram of a data storage system according to a second possible embodiment;
- FIG. 3 is a logical diagram of aspects of a data storage system of FIGS. 1-2;
- FIG. 4 is an example logical diagram illustrating a layout of computing resources in an environment implementing the data storage systems of FIGS. 1-3;
- FIG. 5 is a block diagram of an electronic computer system useable within the data storage systems disclosed herein;
- FIG. 6 is a flowchart of a method for managing distributed data across a plurality of data model types, according to an example embodiment;
- FIG. 7 is a flowchart of a method for managing distributed data using any of a plurality of data models, according to an example embodiment; and
- FIG. 8 is a flowchart of a method for handling a data request based on a database command received from a database interface, according to an example embodiment.
- Various embodiments of the present invention will be described in detail with reference to the drawings, wherein like reference numerals represent like parts and assemblies throughout the several views. Reference to various embodiments does not limit the scope of the invention, which is limited only by the scope of the claims attached hereto. Additionally, any examples set forth in this specification are not intended to be limiting and merely set forth some of the many possible embodiments for the claimed invention.
- The logical operations of the various embodiments of the disclosure described herein are implemented as: (1) a sequence of computer implemented steps, operations, or procedures running on a programmable circuit within a computer, and/or (2) a sequence of computer implemented steps, operations, or procedures running on a programmable circuit within a directory system, database, or compiler.
- In general the present disclosure relates to database and data bus architectures. In particular, the present application relates generally to a database and data bus architecture arrangement providing for systems for efficient data distribution. The database and data bus architectures disclosed herein represent systems in which a unified, data model neutral data storage arrangement can be used as a data layer, with existing database management systems operating to provide different views into a unified, data model neutral data layer. In example embodiments, the data model neutral layer can maintain descriptions of the data models associated with each database interface to provide a definition that allows replication of data across different data models of different data model types. In other example embodiments, the data model neutral layer can maintain both descriptions of the data models associated with each database interface and a data model neutral data layer, thereby avoiding replication of data but rather maintaining a single data model neutral set of data, upon which various views can be generated for each of a plurality of database interfaces having different data model types.
- In general, and as discussed herein, a data model corresponds to a particular arrangement of data for use in a database. For example, the data model can correspond to a particular database structure or schema that is specific to the data stored in a database. Analogously, a data model type, as referred to herein, corresponds to a particular type of arrangement of data, whether it be a relational, hierarchical, multidimensional, object oriented, columnar, network, record, or stream arrangements for data, or any other data model type. Accordingly, data model neutral data corresponds to data that is not stored in a manner that relies upon a particular data structure, but rather can be described across a variety of such structures. Examples of each of these concepts are generally provided in further detail below in conjunction with the various embodiments of the present disclosure.
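To make the distinction between a data model type and data model neutral data more concrete, the following sketch projects one set of neutral atoms into two views: flat rows with a parent reference (a relational style) and nested records with member sets (a hierarchical style). The atom fields and function names are illustrative assumptions only.

```python
# Hypothetical data model neutral atoms: no table or tree layout is implied.
ATOMS = [
    {"entity": "department", "key": "D1", "attrs": {"name": "Research"}, "parent": None},
    {"entity": "employee", "key": "E1", "attrs": {"name": "Ada"}, "parent": "D1"},
    {"entity": "employee", "key": "E2", "attrs": {"name": "Bob"}, "parent": "D1"},
]


def relational_view(atoms, entity):
    """Relational data model type: flat rows with a foreign-key style column."""
    return [
        {"id": a["key"], **a["attrs"], "parent_id": a["parent"]}
        for a in atoms
        if a["entity"] == entity
    ]


def hierarchical_view(atoms, root_key):
    """Hierarchical data model type: a record that owns a set of member records."""
    by_key = {a["key"]: a for a in atoms}
    members = [
        hierarchical_view(atoms, a["key"]) for a in atoms if a["parent"] == root_key
    ]
    return {"key": root_key, **by_key[root_key]["attrs"], "members": members}


employee_rows = relational_view(ATOMS, "employee")
department_tree = hierarchical_view(ATOMS, "D1")
```

Both views are derived from the same atoms, so neither arrangement is privileged in storage; the particular columns or record sets chosen for a given database would constitute its data model.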
- Referring now to
FIG. 1 , a logical diagram of adata storage system 100 is shown, according to an example embodiment of the present disclosure. In general, thedata storage system 100 corresponds to an implementation of a data storage system in which data models are described in a data model neutral arrangement, but in which data is maintained associated with existing database systems. Accordingly, thedata storage system 100 represents an arrangement in which a data model neutral software layer operates as a data bus for exchanging data across various databases each managed by separate database management systems, or database interfaces, having different data model types. - In the embodiment shown, the
data storage system 100 includes avirtualization space 101 executable on ahardware layer 102. Thehardware layer 102 supportssecure partition services 104. Thehardware layer 102 generally corresponds to a large, multiprocessor, networked arrangement including a plurality of computing systems. As further discussed below in connection withFIGS. 4-5 , thehardware layer 102 can be assigned to and affiliated with particular portions of thedata storage system 100 in a variety of ways, but generally provides processing and memory resources useable to implement a database and database application architecture. The hardware layer can be constructed from one or more server computers, an example of which is discussed below in connection withFIG. 5 . - The
secure partition services 104 provides a low-level software layer above thehardware layer 102, and generally corresponds to a virtualization layer useable to host various types of operating systems that may or may not be compatible with thehardware layer 102. For example, thesecure partition services 104 can correspond to a hypervisor software layer installed on one or more computing systems, capable of collectively partitioning available hardware resources available within a computing system into a plurality of partitions. As discussed below in connection withFIG. 4 , each of the partitions represent a defined collection of hardware resources capable of being allocated to a hosted operating system, such that the hosted system views the allocated resources, via the hypervisor, as a computing system itself. In one example embodiment, thesecure partition services 104 correspond to S-Par secure partitioning hypervisor software from Unisys Corporation of Blue Bell, Pa. Of course, other secure partition services could be used as well. - In the embodiment shown, the
secure partition services 104 host a set of architecture attributes 106 and a common data bus 108. The architecture attributes 106 reside in a layer above thesecure partition services 104, in that they are published to various partitions 110 (shown as partitions 110 a-d). In various embodiments, the architecture attributes 106 can include, for example, emulated processing, memory, networking and/or other attributes made available to the partitions 110. - The common data bus 108 hosts and supports data exchange across the plurality of partitions 110, to allow for cross-pollination of data between the partitions, for use by the operating systems and software installed thereon. In particular, the common data bus 108 stores metadata describing, for example, a particular file system and/or database structure or schema used in a particular partition, such that when data is stored or altered in that partition, the common data bus 108 detects the data change and replicates that change of data across the other partitions. In various embodiments, the common data bus 108 can be configured to detect changes in data in virtual file systems or virtual database files in the various partitions 110, and replicate data between those systems based on known interrelationships between those data structures. For example, the common data bus 108 can be implemented using one or more transforms developed between source and target computing system file systems and/or database systems, and includes the software necessary to support export of data from each partition (e.g., from the file system within a particular partition, or within a database having a schema hosted within the partition).
- In the embodiment shown, each of the partitions 110 supported by the
secure partition services 104 and common data bus 108 are configured to support any of a variety of operating systems and/or database management systems and database architectures. In the example depicted, afirst partition 110 a hosts a first operating system, depicted as an MCP operating system provided by Unisys Corporation of Blue Bell, Pa. Similarly, other partitions within the system may host different types of systems; in the embodiment shown, asecond partition 110 b hosts a second operating system, shown as the OS2200 operating system, also from Unisys Corporation of Blue Bell, Pa. A third operating system simply illustrated as a coprocessor, or “CP” is also illustrated as associated with athird partition 110 c. Other partitions, such as partitions maintaining third party operating systems (e.g., Linux, Windows-based, or other operating systems) could be incorporated as well. - Within each of the partitions 110 a-c, each partition may include one or more data personalities 112. Data personalities 112 generally refer to structures or arrangements by which data is accessed and understood. For example, data personalities may correspond to a data model type of a database, such as a relational, hierarchical, multidimensional, columnar, network, record, stream or object oriented data model type. Data personalities generally describe the expected operation of an interface to data, rather than the specific structure of a given data set. Such a specific structure, or data model, corresponds to a particular schema of that data set as may be designed within the data model type.
- In the example embodiment shown, the
first partition 110 a including the MCP operating system hosts two data personalities, a relational data personality 112 a (such as would be expected of a SQL or other relational database) and a DMSII personality 112 b, useable with the DMSII database management system from Unisys Corporation of Blue Bell, Pa. Similarly, the second partition 110 b is illustrated as supporting an RDMS personality 112 c, a DMS personality 112 d, and indexed files in a file system (i.e., a file-based data personality 112 e).
- In the arrangement shown, each of the partitions 110 a-c can be made available to a further partition or application executing within one of those partitions, illustrated as a
data access application 114. The application 114 can access one or more APIs 116, shown as traditional APIs 116 a and third party APIs 116 b for accessing data stored using nonstandard third party data personalities. The APIs 116 are published for use with each of the variety of data personalities 112, for accessing data in the various partitions. As such, the application can access data as needed from each of the various data personalities, e.g., in a relational format from a relational database personality such as personality 112 a, or hierarchical data from a hierarchical database personality (e.g., the DMSII personality 112 b), or other data access arrangements.
- Use of a common data bus 108 to provide data synchronization across partitions, in particular in an example arrangement such as that depicted in
FIG. 1, provides a number of advantages over existing hypervisor systems or even existing data replication systems. Because an application can access data from each of the various data personalities, the application can be designed to access data according to different personalities (rather than being written to interface with a particular data model type), and can request and receive data from a selected personality based on the suitability of the data model type associated with that data personality. For example, an application could both store data according to a DMSII data personality 112 b and retrieve data in a reporting format from a relational data personality 112 a, or a multidimensional data personality, or some other convenient format. Using the common data bus 108, each of the data personalities is kept up-to-date via transformations of the data at the time it is stored in each personality, thereby providing convenient retrieval of data in a convenient format, from a supported API, at the application level regardless of whether the data was originally stored in a database having the particular personality from which retrieval is desired. As such, data is available from each of the data personalities 112 at essentially data retrieval speeds, since each data personality would not be required to communicate across to other data personalities to retrieve such data (assuming sufficient time between data storage in one data personality and retrieval in another data personality to allow for replication of the data in each of the data models and data model types associated with each of the personalities supported within a particular system). Optionally, an application development environment 118 could be included as well, which allows a designer to create applications designed to interface with various data personalities via the APIs 116 a-b.
The data personalities 112 allow applications to be written using the application development environment 118 that are capable of accessing data from any of the personalities.
- As illustrated in
system 100, a remote system 120, such as a client system or other remote server, can be communicatively connected to the virtual system 101, e.g., for communication with the application 114, or application development environment 118. For example, the application 114 or application development environment 118 can have a web interface, either directly supported within one of the partitions in which the application or application development environment reside, or in a separate partition, managing access to that system.
- It is noted that, as illustrated, other third party systems can be incorporated into the
overall system 100. In the embodiment shown, one such third party system 122 can be included within the overall virtualized system 101, hosted by secure partitioning services 104, and a further third party system 124 is remote from the overall system 100, and communicatively connected to the system by the common data bus 108. These third party systems are shown to illustrate example interoperability of the common data bus 108 with third party systems. In connection with third party system 122, the common data bus 108 can be extended, on a case-by-case basis, to such third party systems by establishing a relationship between known data personalities of the supported systems and those developed by third parties. In the example shown, both third party systems 122, 124 operate third party operating systems and third party data personalities. As shown in FIG. 1, third party operating system 128 can be communicatively connected to the system despite running on incompatible third party hardware 134.
- Although the
system 100 of FIG. 1 has numerous advantages, it is noted that, in particular for large data collections, some inefficiencies may exist, for example due to the requirement that data be replicated as many times as there are different data personalities. Accordingly, and as illustrated in FIG. 2, an alternative embodiment of a database and data bus architecture is contemplated, in which a system 200 reduces the amount of data replication involved. In connection with the system 200, a common data store 202 takes the place of the common data bus 108 for at least a supported portion of the system 200, namely one or more partitions 110 having known data personalities. In this embodiment, each of the partitions capable of connection to the common data store 202 is no longer required to independently maintain storage of data associated with the particular data personalities to which it relates, but instead requests data from a common data store that stores data in a data model neutral format. Although examples of such a format are discussed in further detail below, it is noted here that any of a variety of formats that do not specifically rely on positional interrelationships among data elements (e.g., within a common table or data record) to define relationships can be used. For example, unstructured data, such as key-value pairs or other types of data labeling, could be used.
- In the particular embodiment shown, the
common data store 202 is configured to provide an interface between each of a plurality of data personalities 112 and the underlying data by providing a conduit for data storage from each of the supported partitions 110. In the embodiment shown, the common data store 202 is interfaced to partitions 110 a-c, and provides data to data personalities 112 a-f. As such, data personalities 112 a-f, rather than representing database systems as in FIG. 1, effectively act as data views on data in the common data store 202.
- The
common data store 202 can be interfaced to a common data bus 204, which acts analogously to the common data bus 108 of FIG. 1, but for only unsupported data structures, i.e., data personalities for which the common data bus 204 may have some knowledge of the data format type, but the common data store 202 lacks knowledge of the data format of the data personality itself. In other words, the common data store acts as a structure-independent database capable of being maintained in synchronization with external data personalities, such as data personalities 112 g, 112 h, using the common data bus 204. In this arrangement, the common data bus 204 would not be required to directly interface with data personalities 112 a-f, since those data personalities would not directly store data; rather, the common data store 202 would manage that data, and would be maintained in synchronization with the common data bus 204.
- In the embodiment shown, it is noted that additional features can be incorporated in the
common data store 202, in addition to those managed in the common data bus 204. For example, functionalities that are related to database functions but are not part of a particular data model, such as transaction management, recovery, backup, and other data functions, can be managed entirely within the common data store 202. Other functionalities typically associated with database management systems could be incorporated into a common data store as well.
- Now referring to
FIG. 3, a general implementation of an example embodiment of a system 300 incorporating a common data store, such as the common data store 202 of FIG. 2, is shown. The system 300 generally includes an application layer 302 capable of accessing various data personalities 304, examples of which, 304 a-b, represent a relational database and a DMSII database. In the embodiment shown, a separate environment hosts an MCP system 306, which can be located on a different partition from either of the data personalities, and is configured to host aspects of the common data store. In other words, the MCP system 306 acts as a service engine supporting data retrieval according to the data personalities 304 a-b, as would be dictated by query and storage commands received at those data personalities from the application layer.
- A
data layer 308 resides beneath the data personalities 304 a-b, and can be executed across a plurality of partitions within a virtual environment. The data layer 308 includes data atoms 310 and metadata atoms 312. The data atoms 310 generally include data stored via any of the data personalities 304, but separated from the format or structure in which that data is stored. In other words, the data atoms 310 have a data model neutral format in which the structure of the data (i.e., its position on disk relative to other data) does not define interrelationships of the data (as it would, e.g., in a table/row format such as in a relational database, or in a hierarchical dataset/record arrangement).
- In contrast to the data records of a DMSII database, or tuples stored by a SQL database, in the example embodiments of the
data atoms 310, the data atoms can be implemented in key-value pairs, where the metadata atoms 312 associate keys with the specific logical format of that data. In other embodiments, other data model neutral data formats could be used, such as a triple, or some other type of data arrangement. In some embodiments, the data is stored in a Resource Description Framework (RDF) format. In such embodiments, the data is stored in records that include a number of data atoms, together with associated metadata that describes the interrelationships among the data but can be stored separately from the data. Similarly, the metadata atoms 312 can be maintained in key-value pairs or other analogous structures, and define databases based on a description of the database schema, which may be received at the data layer 308, for example, in an XMI or other markup language format, thereby allowing decoupling of structure (in the metadata) from the data values themselves.
- In the embodiment shown, each of the data personalities 304 has associated therewith a set of one or more agents useable to format data received from the
data atoms 310 into an arrangement that is expected by that data personality 304. Although the particular format of the data to be returned to the data personality may vary, in some embodiments the structure corresponds to a data block that contains data responsive to a query, formatted in an arrangement as expected by the data personality. For example, the structure could be a block of data containing records in a format normally returned from a portion of a table or tables of a relational database, or dataset and record entries including one or more entries responsive to a query of a hierarchical database. In other words, the data returned to a data personality is returned in a way that is consistent with the data model associated with that data personality. The data personality, representing the database management system (albeit without managing the underlying data), will receive the data block having a structure recognizable to that data personality, and will extract the responsive data from that data block for return to the application from which a query or other data request was received.
- To implement the above arrangement, in the particular example embodiment shown, a
data agent generator 314 manages metadata describing data formats and/or data format types associated with data collections defined using each of the associated data personalities 304. The data agent generator 314 maintains the collection of metadata atoms 312 that describe each of the data formats of databases, and generates data agents 316 associated with each data personality 304 that can format the data stored in a data model neutral format. In the embodiment shown, the data agent generator 314 generates a row agent 316 a and a table agent 316 b for responding to data inquiries and storing data associated with a relational personality 304 a. The data agent generator 314 also generates a set agent 316 c, as well as a data set agent 316 d and a record agent 316 e associated with a DMSII data personality. In the embodiment shown, the set agent 316 c includes sub-agents, such as DMSII key agents 318 a-b, which can be used to interrelate records based on keys provided within the DMSII database architecture, and which are tracked in the metadata atoms 312.
- Based on the personality to which the data agent interfaces, different types of data agents could be generated by the
data agent generator 314, incorporating metadata as defined in the metadata atoms 312. When the data agents receive requests for data from the various data personalities 304, those data agents can manage requests for, and receipt of, data from the underlying data atoms 310. The data agents 316 can also manage the various tasks typically performed in database management systems but not intrinsically tied to the structure of the data, such as transaction management, recovery, backup, and other data functions.
- In connection with both
FIGS. 2-3, it is noted that the data atoms 310 generally represent data stored in a plurality of databases or other data structures; as such, in typical arrangements the data atoms 310 are distributed across a number of computing systems, or partitions, and requested data is to be retrieved from more than one computing system or partition. Accordingly, within the common data store 202, data model neutral data retrieval is implemented such that massively-parallel queries can be processed and query results compiled and returned. In one example embodiment, the common data store implements a map-reduce technique for query processing and/or storage, such as the Hadoop MapReduce implementation. Other data processing implementations could be used as well.
- Referring to
FIGS. 1-3 overall, it is noted that these overall systems allow for use of data personalities by application programs in the same manner as is traditionally provided by database management systems. Accordingly, since such an arrangement is typically located in a large-scale multi-server environment, applications have a choice regarding the specific data personality from which data is requested, despite the fact that data may not have originally been stored using that data personality, and in implementations of FIGS. 2-3, the data is maintained in a common data store in a data model neutral format.
- Referring now to
FIG. 4, an example arrangement 400 of systems is illustrated, on which the systems of FIGS. 1-3 can be implemented. In the embodiment shown, the arrangement 400 includes a plurality of logical computing systems 402 a-d, or partitions. Each of the logical computing systems 402 a-d can include a collection of computing resources, such as a processor, memory resources, disk resources, network or communications resources, and other resources typically present on a computing system. An example of a collection of physical computing resources, formed as a typical discrete electronic computing system, is described below in connection with FIG. 5.
- In general, each of the logical computing systems 402 a-d hosts
secure partition services 404, which define the set of physical computing resources available to higher-layer software, as well as providing an interface between that higher-layer software and the physical computing resources allocated to the particular logical computing system 402. Furthermore, the partition services 404 provide virtualization and security services, as well as backup and recovery services, for each partition.
- In the embodiment shown, the
arrangement 400 includes a control partition 406, guest partitions 408 a-b, and a services partition 410. The control partition 406 schedules allocation of additional partitions to various guest processes as desired. For example, the control partition 406 can execute a console application configured to allow reservation of resources for various guest partitions and/or service partitions. The guest partitions 408 a-b can execute any of a variety of guest applications. For example, the guest partitions 408 a-b can host separate database management systems or data personalities on different hosted operating systems (e.g., the relational and DMSII database management systems of FIG. 3). Still further guest partitions (not shown) could host data storage partitions, or an implementation of the common data bus or common data store, a map-reduce service operation useable by the common data store, or other types of services discussed above. A services partition 410 hosts one or more services useable by the guest partitions, such as for remote systems communications, data management/replication, or other services.
- When implementing a system such as those shown in
FIGS. 1-3 above in a virtualized computing arrangement such as is illustrated in FIG. 4, it is noted that although an example set of hosted, virtualized partitions is shown, other partitions could be included in such a system for hosting additional data personalities, applications, data nodes, data processing software, networking operations, or specialty processes. Furthermore, in some embodiments, at least some of the computing arrangements of FIGS. 1-3 can be implemented natively on a local system, rather than on a virtualized system.
- Referring now to
FIG. 5, a schematic illustration of an example computing system in which aspects of the present disclosure can be implemented is shown. The computing system 500 can represent, for example, a native computing system within which one or more of the logical computing systems 402 a-d could be implemented, or multiple of which could be used to implement any of systems 100-300.
- In the example of
FIG. 5, the computing device 500 includes a memory 502, a processing system 504, a secondary storage device 506, a network interface card 508, a video interface 510, a display unit 512, an external component interface 514, and a communication medium 516. The memory 502 includes one or more computer storage media capable of storing data and/or instructions. In different embodiments, the memory 502 is implemented in different ways. For example, the memory 502 can be implemented using various types of computer storage media.
- The
processing system 504 includes one or more processing units. A processing unit is a physical device or article of manufacture comprising one or more integrated circuits that selectively execute software instructions. In various embodiments, the processing system 504 is implemented in various ways. For example, the processing system 504 can be implemented as one or more processing cores. In another example, the processing system 504 can include one or more separate microprocessors. In yet another example embodiment, the processing system 504 can include an application-specific integrated circuit (ASIC) that provides specific functionality. In yet another example, the processing system 504 provides specific functionality by using an ASIC and by executing computer-executable instructions.
- The
secondary storage device 506 includes one or more computer storage media. The secondary storage device 506 stores data and software instructions not directly accessible by the processing system 504. In other words, the processing system 504 performs an I/O operation to retrieve data and/or software instructions from the secondary storage device 506. In various embodiments, the secondary storage device 506 includes various types of computer storage media. For example, the secondary storage device 506 can include one or more magnetic disks, magnetic tape drives, optical discs, solid state memory devices, and/or other types of computer storage media.
- The
network interface card 508 enables the computing device 500 to send data to and receive data from a communication network. In different embodiments, the network interface card 508 is implemented in different ways. For example, the network interface card 508 can be implemented as an Ethernet interface, a token-ring network interface, a fiber optic network interface, a wireless network interface (e.g., Wi-Fi, WiMAX, etc.), or another type of network interface.
- The
video interface 510 enables the computing device 500 to output video information to the display unit 512. The display unit 512 can be various types of devices for displaying video information, such as a cathode-ray tube display, an LCD display panel, a plasma screen display panel, a touch-sensitive display panel, an LED screen, or a projector. The video interface 510 can communicate with the display unit 512 in various ways, such as via a Universal Serial Bus (USB) connector, a VGA connector, a digital visual interface (DVI) connector, an S-Video connector, a High-Definition Multimedia Interface (HDMI) connector, or a DisplayPort connector.
- The
external component interface 514 enables the computing device 500 to communicate with external devices. For example, the external component interface 514 can be a USB interface, a FireWire interface, a serial port interface, a parallel port interface, a PS/2 interface, and/or another type of interface that enables the computing device 500 to communicate with external devices. In various embodiments, the external component interface 514 enables the computing device 500 to communicate with various external components, such as external storage devices, input devices, speakers, modems, media player docks, other computing devices, scanners, digital cameras, and fingerprint readers.
- The
communications medium 516 facilitates communication among the hardware components of the computing device 500. In the example of FIG. 5, the communications medium 516 facilitates communication among the memory 502, the processing system 504, the secondary storage device 506, the network interface card 508, the video interface 510, and the external component interface 514. The communications medium 516 can be implemented in various ways. For example, the communications medium 516 can include a PCI bus, a PCI Express bus, an accelerated graphics port (AGP) bus, a serial Advanced Technology Attachment (ATA) interconnect, a parallel ATA interconnect, a Fibre Channel interconnect, a USB bus, a Small Computer System Interface (SCSI) interface, or another type of communications medium.
- The
memory 502 stores various types of data and/or software instructions. For instance, in the example of FIG. 5, the memory 502 stores a Basic Input/Output System (BIOS) 518 and an operating system 520. The BIOS 518 includes a set of computer-executable instructions that, when executed by the processing system 504, cause the computing device 500 to boot up. The operating system 520 includes a set of computer-executable instructions that, when executed by the processing system 504, cause the computing device 500 to provide an operating system that coordinates the activities and sharing of resources of the computing device 500. Furthermore, the memory 502 stores application software 522. The application software 522 includes computer-executable instructions that, when executed by the processing system 504, cause the computing device 500 to provide one or more applications. The memory 502 also stores program data 524. The program data 524 is data used by programs that execute on the computing device 500.
- Although particular features are discussed herein as included within an
electronic computing device 500, it is recognized that in certain embodiments not all such components or features may be included within a computing device executing according to the methods and systems of the present disclosure. Furthermore, different types of hardware and/or software systems could be incorporated into such an electronic computing device.
- In accordance with the present disclosure, the term computer readable media as used herein may include computer storage media and communication media. As used in this document, a computer storage medium is a device or article of manufacture that stores data and/or computer-executable instructions. Computer storage media may include volatile and nonvolatile, removable and non-removable devices or articles of manufacture implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. By way of example, and not limitation, computer storage media may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), reduced latency DRAM, DDR2 SDRAM, DDR3 SDRAM, solid state memory, read-only memory (ROM), electrically-erasable programmable ROM, optical discs (e.g., CD-ROMs, DVDs, etc.), magnetic disks (e.g., hard disks, floppy disks, etc.), magnetic tapes, and other types of devices and/or articles of manufacture that store data. However, such computer readable media, and in particular computer readable storage media, are generally implemented via systems that include at least some non-transitory storage of instructions and data that implements the subject matter disclosed herein.
- Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
- Referring now to
FIGS. 6-8, example methods for managing distributed data according to the various embodiments described above in connection with FIGS. 1-5 are illustrated. FIG. 6 illustrates a flowchart representing a method 600 for managing distributed data across a plurality of data model types, according to an example embodiment. The method 600 can be performed by a common data bus, such as the common data bus 108 of FIG. 1 or the common data bus 204 of FIG. 2.
- The
method 600 of FIG. 6 begins when a data personality receives a database command (step 602). The database command can be, for example, a query or a data storage command, or another type of command expected to be received by a particular type of database management system analogous to that data personality. The database command can, in various embodiments, effect a change on data managed at the database associated with that data personality.
- In response to a detected change in the data managed by the data personality receiving the database command, the common data bus will detect a data request (i.e., a request to provide or request to change data in a particular database), and an analogous data request will be formed by the common data bus. In the event the common data bus is interfaced to a common data store, the data request can be to form a data model neutral change in data that would be analogous to the data change reflected by the data request. In the event of a data change, the common data bus (or alternatively, the data personality issuing the original data request) will issue a data request at the common data bus. That request will then be translated to a second type of data request (step 606). The second type of data request can take any of a number of forms, but generally is configured to replicate a change of data from the data personality receiving the request in a second data store having a different format from that of the data personality receiving the original database command. For example, the second data request can be a data model neutral data request, or can be a data request in a different data model (i.e., at a different data personality) as compared to the original request. The second data request, if executed in a common data bus, causes synchronization of the data personality that is the target of the second data request with the data personality originally receiving the database command (step 608).
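
By way of illustration only, the translation of a first data request into a second, data model neutral data request can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary used to model the database command, and the key layout `<table>/<primary key>/<column>` are hypothetical and are not prescribed by the method 600.

```python
from typing import Any, Dict, List, Tuple

# A "put" in the hypothetical neutral format: (key, value).
NeutralPut = Tuple[str, Any]


def translate_insert(table: str, key_column: str, row: Dict[str, Any]) -> List[NeutralPut]:
    """Translate a relational-style insert into neutral puts (compare step 606).

    The key layout "<table>/<primary key>/<column>" is an assumption made
    for this sketch only.
    """
    pk = row[key_column]
    return [(f"{table}/{pk}/{column}", value) for column, value in row.items()]


def execute(puts: List[NeutralPut], store: Dict[str, Any]) -> None:
    """Apply the translated request to the target store (compare step 608)."""
    for key, value in puts:
        store[key] = value


# Compare step 602: a personality receives a command, modeled here as a dict.
command = {"table": "orders", "key_column": "id", "row": {"id": 7, "item": "bolt", "qty": 3}}
neutral_store: Dict[str, Any] = {}
puts = translate_insert(command["table"], command["key_column"], command["row"])
execute(puts, neutral_store)
```

The same pattern would apply when the second data request targets a different data personality rather than a neutral store; only the translation function would differ.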
- It is noted that, depending upon the number of different data personalities, the translation and execution of the first data request to different types of data requests may occur many times, such that each data personality maintains a synchronized set of data with each of the other data personalities. Additionally, in some cases, the specific data personalities to be synchronized for each database or data collection can be user selectable, thereby controlling the number of data personalities requiring synchronization.
- Referring now to
FIG. 7, a flowchart of a method 700 for managing distributed data using any of a plurality of data models is shown, according to an example embodiment. The method 700 is generally executable within a common data store, such as data store 202 of FIG. 2, with portions of the method 700 performed by agents and/or an agent generator within the common data store.
- In the embodiment shown, the
method 700 begins when a database command is received at a data personality, also known in this instance as a database interface (step 702). The data personality can be referred to as a database interface in this case because each of the data personalities, rather than storing data, represents an interface to data stored in an underlying common data store. The data personality performs a first data request based on the database command received by the data personality (step 704). The data request is generally a request for data from an underlying data collection, which would normally be issued from a database management system to an underlying database file system; however, in the present disclosure, the data request is passed to a common data store. This request can be, for example, issued to one or more data agents, such as the agents illustrated in FIG. 3, as generated by a data agent generator specific to each data personality.
- The common data store will receive the data request, and translate that data request to a second data request in a data model neutral format (step 706). For example, one or more data agents will receive the data request and translate that data request to one or more data model neutral search or data operations, for example using a map-reduce operation across data distributed on a large number of physical systems in data model neutral data atoms. That data model neutral request will then be executed (step 708), managed by the data agents, and data will be returned via the data agents to the data personality from which the data request was received (step 710).
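
By way of illustration only, a map-reduce style retrieval across distributed, data model neutral data atoms can be sketched as follows. This is a minimal single-process sketch under stated assumptions: the nodes are modeled as in-memory dictionaries, the key layout `<table>/<primary key>/<column>` and all function names are hypothetical, and no particular map-reduce framework is used.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

# Each "node" holds a shard of key-value data atoms. The key layout
# "<table>/<primary key>/<column>" is an assumption made for this sketch.
nodes: List[Dict[str, Any]] = [
    {"orders/7/id": 7, "orders/7/item": "bolt", "orders/8/qty": 1},
    {"orders/7/qty": 3, "orders/8/id": 8, "orders/8/item": "nut"},
]

Pair = Tuple[str, Tuple[str, Any]]


def map_node(node: Dict[str, Any], table: str) -> Iterable[Pair]:
    """Map phase: emit (primary key, (column, value)) for atoms of one table."""
    for key, value in node.items():
        tbl, pk, column = key.split("/")
        if tbl == table:
            yield pk, (column, value)


def reduce_rows(pairs: Iterable[Pair]) -> Dict[str, Dict[str, Any]]:
    """Reduce phase: group the emitted pairs into one row per primary key."""
    rows: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for pk, (column, value) in pairs:
        rows[pk][column] = value
    return dict(rows)


def select(table: str, column: str, wanted: Any) -> List[Dict[str, Any]]:
    """Translate a simple equality query, run it over all nodes, return rows."""
    pairs = [pair for node in nodes for pair in map_node(node, table)]
    rows = reduce_rows(pairs)
    return [row for _, row in sorted(rows.items()) if row.get(column) == wanted]


result = select("orders", "item", "bolt")
```

Note that the atoms of a single logical row are spread over both nodes in this example; the row is reassembled only in the reduce phase, which is what allows the stored data to remain independent of any one data model.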
- Referring now to FIG. 8, a further method 800 for handling a data request is illustrated, based on a database command received from a database interface, according to an example embodiment. The method 800 may be performed, for example, at a common data store, such as data store 202 as illustrated in FIG. 2, using agents and associated data and metadata atoms as illustrated in connection with FIG. 3.
- In general, the method 800 is performed using one or more data personalities, or database interfaces, that have been preconfigured with the common data store (i.e., for which the common data store has metadata regarding the structure of databases managed by that data personality). In the embodiment shown, the method 800 includes obtaining, from a metadata agent, metadata describing a logical structure of the database associated with that particular data personality (step 802). This can include, for example, obtaining metadata from a metadata store that was extracted from or otherwise separated from data that is stored in the common data store in a data model neutral format.
- Once the metadata is obtained, the metadata agent can generate one or more database interface agents based on that metadata (step 804). The database interface agents are generated to be capable of parsing data and data requests received from a data personality, as well as collecting and logically arranging data to be returned to the data personality in response to a data request from that data personality. In some embodiments, the data agents are generated based on the metadata describing the personality to be interfaced to the common data store.
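As a rough illustration of steps 802-804, the following Python sketch generates interface agents from stored metadata. The shape of the metadata (`TableMetadata` with a table name and ordered columns) and the names `InterfaceAgent` and `generate_agents` are hypothetical; the disclosure does not specify how metadata or agents are represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableMetadata:
    """Hypothetical metadata describing one table's logical structure (step 802)."""
    table: str
    columns: tuple


class InterfaceAgent:
    """Agent generated from metadata; parses requests and arranges results (step 804)."""

    def __init__(self, meta: TableMetadata):
        self.meta = meta

    def parse(self, request: dict) -> dict:
        """Check a personality request against the known logical structure."""
        unknown = [c for c in request["columns"] if c not in self.meta.columns]
        if unknown:
            raise ValueError(f"unknown columns for {self.meta.table}: {unknown}")
        return {"entity_type": self.meta.table, "attributes": list(request["columns"])}

    def arrange(self, neutral_rows: list) -> list:
        """Order neutral attribute maps into rows matching the table's column order."""
        return [tuple(row.get(c) for c in self.meta.columns) for row in neutral_rows]


def generate_agents(metadata_store: dict) -> dict:
    """Build one interface agent per table described in the metadata store."""
    return {name: InterfaceAgent(meta) for name, meta in metadata_store.items()}
```

A caller might build a store such as `{"orders": TableMetadata("orders", ("id", "region", "total"))}`, pass it to `generate_agents`, and then use the resulting agent both to parse incoming requests and to arrange neutral results back into the column order the personality expects.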
- In the embodiment shown, the method 800 will continue upon receipt of a data request at the common data store, for example from a data personality (step 806). The data request is received at one or more agents interfaced to the data personality, which determine the type of data request being made. For example, the data request can be to store data in a particular logical location within a database, to retrieve data, to obtain a record count, or to perform other types of database actions. Based on that data request, the agent receiving the data request will parse the request to determine one or more actions to be taken across the distributed data storage systems associated with the common data store, and distribute that data request across the storage systems to obtain or modify data as required (step 808). To the extent any results are required (e.g., either acknowledgement of completed storage of data, or receipt of data in response to a query or record count operation), those results are formatted by the agent(s) associated with the data personality to be in a form understandable by the data personality (step 810). The results can then be passed back to the data personality, as if coming from an underlying data storage having a logical organization dictated by that data personality.
- Referring to FIGS. 1-8 generally, it is recognized that the various systems and methods described herein provide a number of advantages over existing database systems, and in particular for large-scale, large fanout databases requiring many physical computing systems for implementation. For example, the various virtualization services on which the systems are provided allow for customized workload assignment, by placing the common data bus or common data store on entirely separate hardware resources as compared to the various data personalities which they serve. Additionally, because data is stored in a common format that is easily and quickly searched regardless of the interface from which a query or other database command is received, data retrieval times can be reduced: no data replication is required when a common data store is used, and query tasks are distributed across many partitions, which avoids bogging down one particular hardware system with many complicated data requests.
- The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.
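The request handling of steps 806-810 of method 800 can be sketched in Python as follows. This is an illustrative assumption rather than the disclosed implementation: the request dictionary layout, the `NODES` list standing in for distributed storage systems, the modulo-based node choice, and the function name `handle_request` are all hypothetical.

```python
# Hypothetical storage nodes, each holding a slice of model-neutral records.
NODES = [
    [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}],
    [{"id": 3, "region": "EU"}],
]


def handle_request(request: dict, nodes=NODES) -> dict:
    """Steps 806-810: classify the request, fan it out, format the reply."""
    action = request["action"]  # step 806: determine the request type
    if action == "store":
        # Step 808: pick a node from the record key (simple modulo placement).
        node = nodes[request["record"]["id"] % len(nodes)]
        node.append(request["record"])
        return {"status": "ok"}  # step 810: acknowledgement of completed storage
    # Step 808: query every node and collect records matching all conditions.
    matches = [
        rec
        for node in nodes
        for rec in node
        if all(rec.get(k) == v for k, v in request.get("where", {}).items())
    ]
    # Step 810: shape the result for the requesting personality.
    if action == "count":
        return {"status": "ok", "count": len(matches)}
    if action == "retrieve":
        return {"status": "ok", "rows": sorted(matches, key=lambda r: r["id"])}
    raise ValueError(f"unsupported action: {action}")
```

The three branches correspond to the store, retrieve, and record count request types named in the description; a real agent would also translate the reply into the personality's own logical organization.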
Claims (26)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/709,579 US9330147B2 (en) | 2012-12-10 | 2012-12-10 | Database and data bus architecture and systems for efficient data distribution |
PCT/US2013/073961 WO2014093262A1 (en) | 2012-12-10 | 2013-12-10 | Database and data bus architecture and systems for efficient data distribution |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/709,579 US9330147B2 (en) | 2012-12-10 | 2012-12-10 | Database and data bus architecture and systems for efficient data distribution |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140164431A1 (en) | 2014-06-12 |
US9330147B2 (en) | 2016-05-03 |
Family
ID=50882163
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/709,579 Active 2034-08-02 US9330147B2 (en) | 2012-12-10 | 2012-12-10 | Database and data bus architecture and systems for efficient data distribution |
Country Status (2)
Country | Link |
---|---|
US (1) | US9330147B2 (en) |
WO (1) | WO2014093262A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6609123B1 (en) * | 1999-09-03 | 2003-08-19 | Cognos Incorporated | Query engine and method for querying data using metadata model |
US20060047780A1 (en) * | 2005-11-08 | 2006-03-02 | Gregory Patnude | Method and apparatus for web-based, schema-driven application-server and client-interface package using a generalized, data-object format and asynchronous communication methods without the use of a markup language. |
US20060274727A1 (en) * | 2005-06-06 | 2006-12-07 | Microsoft Corporation | Transport-neutral in-order delivery in a distributed system |
US8027349B2 (en) * | 2003-09-25 | 2011-09-27 | Roy-G-Biv Corporation | Database event driven motion systems |
US8418072B1 (en) * | 2007-12-24 | 2013-04-09 | Emc Corporation | UI data model abstraction |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8005849B2 (en) * | 2006-08-31 | 2011-08-23 | Red Hat, Inc. | Database access server with reformatting |
US7693900B2 (en) * | 2006-09-27 | 2010-04-06 | The Boeing Company | Querying of distributed databases using neutral ontology model for query front end |
US20100049694A1 (en) * | 2008-08-20 | 2010-02-25 | Ca, Inc. | Method and system for extending a relational schema |
US8250113B2 (en) * | 2010-04-30 | 2012-08-21 | International Business Machines Corporation | Web service discovery via data abstraction model |
US20120011134A1 (en) * | 2010-07-08 | 2012-01-12 | Travnik Jakub | Systems and methods for database query translation |
- 2012-12-10: US application US13/709,579, patent US9330147B2, status: Active
- 2013-12-10: WO application PCT/US2013/073961, publication WO2014093262A1, status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
US9330147B2 (en) | 2016-05-03 |
WO2014093262A1 (en) | 2014-06-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TOLBERT, DOUGLAS M;AFFELDT, STEVEN J;REEL/FRAME:037492/0585 Effective date: 20130207 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: WELLS FARGO BANK, NATIONAL ASSOCIATION, AS AGENT, PENNSYLVANIA Free format text: SECURITY INTEREST;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:039960/0057 Effective date: 20160906 |
|
AS | Assignment |
Owner name: WELLS FARGO BANK, NATIONAL ASSOCIATION, AS COLLATERAL TRUSTEE, NEW YORK Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:042354/0001 Effective date: 20170417 |
|
AS | Assignment |
Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, ILLINOIS Free format text: SECURITY INTEREST;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:044144/0081 Effective date: 20171005 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WELLS FARGO BANK, NATIONAL ASSOCIATION;REEL/FRAME:044416/0114 Effective date: 20171005 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WELLS FARGO BANK, NATIONAL ASSOCIATION;REEL/FRAME:054231/0496 Effective date: 20200319 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |