US12026185B2 - Document search and analysis tool - Google Patents
Document search and analysis tool
- Publication number
- US12026185B2 (application US17/188,774)
- Authority
- US
- United States
- Prior art keywords
- keywords
- documents
- search query
- knowledge graph
- document
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
- G06F16/3326—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
Definitions
- a set of keywords may be obtained for use in searching for documents.
- a set of potential keywords may be generated based on the set of keywords and/or other information.
- One or more potential keywords may be selected from the set of potential keywords for use in searching for the documents.
- the set of keywords and the potential keyword(s) selected from the set of potential keywords may form a set of selected keywords.
- the documents may be identified based on the set of selected keywords and/or other information.
- a knowledge graph model that represents the identified documents may be generated.
- the knowledge graph model may include document nodes representing the identified documents and a first search node representing the set of selected keywords. Relative position of the individual document nodes with respect to the first search node may represent similarity between corresponding identified documents and the set of selected keywords.
- a system for document searching and analysis may include one or more electronic storage, one or more processors and/or other components.
- the electronic storage may store information relating to documents, information relating to keywords, information relating to potential keywords, information relating to selected keywords, information relating to knowledge graph models, and/or other information.
- the keyword component may be configured to obtain one or more sets of keywords.
- the set(s) of keywords may be obtained for use in searching for documents.
- a set of keywords may include one or more keywords inputted by a user.
- a set of keywords may include one or more keywords extracted from a source document.
- the document component may be configured to identify the documents.
- the documents may be identified based on the set of selected keywords and/or other information.
- the documents may be identified based on searching within one or more databases.
- the knowledge graph model may further include a second search node representing a subset of the set of selected keywords. Relative position of the individual document nodes with respect to the second search node may represent similarity between the corresponding identified documents and the subset of the set of selected keywords.
- FIG. 4A illustrates an example knowledge graph model
- FIG. 4B illustrates an example knowledge graph model
- the processor 11 may be configured to provide information processing capabilities in the system 10.
- the processor 11 may comprise one or more of a digital processor, an analog processor, a digital circuit designed to process information, a central processing unit, a graphics processing unit, a microcontroller, an analog circuit designed to process information, a state machine, and/or other mechanisms for electronically processing information.
- the processor 11 may be configured to execute one or more machine-readable instructions 100 to facilitate document searching and analysis.
- the machine-readable instructions 100 may include one or more computer program components.
- the machine-readable instructions 100 may include a keyword component 102, a potential keyword component 104, a keyword selection component 106, a document component 108, a model component 110, and/or other computer program components.
- the set(s) of potential keywords may be generated further based on multiple domain-specific models.
- a domain-specific model may refer to a model that is trained for a specific area of knowledge/information.
- An individual domain-specific model may include an unsupervised vector model trained based on words corresponding to a specific domain.
- An individual domain-specific model may represent the vocabulary of the specific domain.
- An individual domain-specific model may represent the concepts of the specific domain.
- An individual domain-specific model may provide domain-specific word/sentence representation and/or classification.
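The generation of potential keywords from a domain-specific vector model can be illustrated with a minimal sketch. It assumes the model is available as a plain mapping from words to vectors and that candidates are ranked by cosine similarity to the input keywords; the source does not specify either. The vectors, the function names, and the `top_k` parameter are hypothetical.

```python
import math

# Toy word vectors standing in for a trained domain-specific model (hypothetical values).
DOMAIN_VECTORS = {
    "drilling": [0.9, 0.1, 0.0],
    "wellbore": [0.8, 0.2, 0.1],
    "borehole": [0.85, 0.15, 0.05],
    "contract": [0.0, 0.9, 0.3],
    "invoice": [0.1, 0.8, 0.4],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def potential_keywords(keywords, vectors, top_k=2):
    """Return up to top_k vocabulary words closest to any of the input keywords."""
    scores = {}
    for word, vec in vectors.items():
        if word in keywords:
            continue  # do not suggest a keyword the user already supplied
        scores[word] = max(
            (cosine(vec, vectors[k]) for k in keywords if k in vectors),
            default=0.0,
        )
    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]

suggestions = potential_keywords({"drilling"}, DOMAIN_VECTORS)
```

With these toy vectors, the words nearest to "drilling" are the other drilling-related terms rather than the contract-related ones.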
- the keyword selection component 106 may be configured to select one or more potential keywords from the set(s) of potential keywords. Selecting a potential keyword from a set of potential keywords may include ascertaining, choosing, determining, establishing, finding, identifying, obtaining, and/or otherwise selecting the potential keyword from the set of potential keywords.
- the potential keyword(s) may be selected for use in searching for documents. That is, the keyword selection component 106 may select, from the set(s) of potential keywords, the potential keyword(s) that will be used to search for documents.
- the set(s) of keywords obtained by the keyword component 102 and the potential keyword(s) selected from the set(s) of potential keywords by the keyword selection component 106 may form a set of selected keywords. That is, the original keywords and the selected potential keywords may form the whole set of search keywords.
- the potential keyword(s) selected by the keyword selection component 106 may be added to the keywords to be used in searching for documents.
- the graphical user interface may display original keywords and potential keywords generated from the original keywords.
- the graphical user interface may enable a user to determine which ones of the original keywords and potential keywords will be used in searching for documents.
- the graphical user interface may enable a user to determine which ones of the original keywords and potential keywords will be removed and excluded from use in searching for documents.
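How the original keywords and the user-selected potential keywords combine into one set of selected keywords can be sketched as follows. This is an illustrative assumption about the bookkeeping, not the patented implementation; `build_selected_keywords` and its parameters are hypothetical names.

```python
def build_selected_keywords(original, selected_potential, excluded=()):
    """Combine original and chosen potential keywords, dropping any the user excluded.

    Order is preserved and duplicates are removed.
    """
    combined = []
    for keyword in list(original) + list(selected_potential):
        if keyword not in excluded and keyword not in combined:
            combined.append(keyword)
    return combined

selected = build_selected_keywords(
    original=["pump", "valve"],
    selected_potential=["compressor", "valve"],
    excluded={"pump"},  # the user removed this original keyword in the interface
)
```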
- the document component 108 may be configured to identify documents. Identifying documents may include discovering, finding, pinpointing, selecting, and/or otherwise identifying documents.
- a document may refer to one or more collections of information. Information may be included in a document as one or more words (text).
- a document may include an electronic document.
- a document may be stored within one or more files.
- Information within a document may be stored in one or more formats and/or containers.
- a format may refer to one or more ways in which the information within a document is arranged/laid out (e.g., file format).
- a container may refer to one or more ways in which information within a document is arranged/laid out in association with other information (e.g., zip format).
- information within a document may be stored in one or more of a text file (e.g., TXT file, DOC file, PDF file), a communication file (e.g., email file, text message file), a spreadsheet file (e.g., XLS file), a presentation file (e.g., PPT file), a visual file (e.g., image file, video file) and/or other files.
- the documents may be identified based on the set of selected keywords and/or other information.
- the documents may be identified based on the set(s) of keywords obtained by the keyword component 102 and the potential keyword(s) selected from the set(s) of potential keywords by the keyword selection component 106.
- a document may be identified by the document component 108 based on the document including some or all of the selected keywords.
- the document component 108 may identify documents that include content that matches some or all of the selected keywords.
- search parameters used by the document component 108 may specify how many and/or which of the selected keywords must appear within a document for the document to be identified by the document component 108.
- the search parameters used by the document component 108 may use different weights for different ones of the selected keywords.
- the search parameters used by the document component 108 may include use of filters and/or weights for different types of documents.
- the search parameters used by the document component 108 may include use of filters and/or weights for documents in different domains/fields.
- the search parameters used by the document component 108 may include use of filters and/or weights for different document contributors (e.g., authors, organizations).
- the search parameters used by the document component 108 may use relationships between two or more of the selected keywords. For instance, the search parameters may require a document to include a particular keyword within a certain number of words/sentences/paragraphs of another keyword to be identified as a matching document. Use of other criteria within the search parameters is contemplated.
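The search parameters described above (a minimum number of matching keywords, required keywords, per-keyword weights, and a proximity relationship between two keywords) can be sketched as a small matcher. This is a minimal sketch under stated assumptions: keywords are single words, matching is exact on lowercased tokens, and proximity is measured in words. The `SearchParams` class and `score_document` function are hypothetical names, not part of the source.

```python
import re
from dataclasses import dataclass, field

@dataclass
class SearchParams:
    min_matches: int = 1                          # how many selected keywords must appear
    required: set = field(default_factory=set)    # which keywords must appear
    weights: dict = field(default_factory=dict)   # per-keyword weight, default 1.0
    proximity: tuple = None                       # (kw_a, kw_b, max_words_apart) or None

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_document(text, keywords, params):
    """Return a weighted match score, or None if the document fails the parameters."""
    tokens = tokenize(text)
    present = {k for k in keywords if k in tokens}
    if len(present) < params.min_matches or not params.required <= present:
        return None
    if params.proximity:
        kw_a, kw_b, max_gap = params.proximity
        pos_a = [i for i, t in enumerate(tokens) if t == kw_a]
        pos_b = [i for i, t in enumerate(tokens) if t == kw_b]
        # Require at least one occurrence pair within max_gap words of each other.
        if not any(abs(i - j) <= max_gap for i in pos_a for j in pos_b):
            return None
    return sum(params.weights.get(k, 1.0) for k in present)

params = SearchParams(
    min_matches=2,
    required={"pump"},
    weights={"pump": 2.0},
    proximity=("pump", "seal", 5),
)
score = score_document("The pump uses a mechanical seal.", ["pump", "seal", "valve"], params)
```

Filters or weights for document types, domains, or contributors could be added as further fields on the same parameter object.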
- the documents may be identified based on searching within one or more databases.
- Individual databases may include different types of documents.
- Individual databases may include different collections of documents.
- information within the documents may be converted from non-textual form to textual form to allow for searching. For instance, images and/or other unsearchable documents may be processed to extract text contained within the documents. The extracted text may be used to search through the document for the selected keywords.
- the model component 110 may be configured to generate a knowledge graph model that represents the identified documents.
- the knowledge graph model may refer to a model that represents the identified documents and information relating to the identified documents using one or more graphs.
- the knowledge graph model may represent the identified documents and information relating to the identified documents using nodes.
- the knowledge graph model may include document nodes representing the identified documents.
- the knowledge graph model may include one or more search nodes representing the selected keywords.
- the knowledge graph model may include a general search node representing the set of selected keywords (all of the search query used to find the documents).
- the knowledge graph model may include one or more subset search nodes representing subset(s) of the set of selected keywords (portion(s) of the search query used to find the documents).
- the relative position of individual document nodes with respect to the search nodes may represent similarity between corresponding identified documents and the selected keywords represented by the search nodes.
- the relative position of a document node with respect to a search node may include the distance between the document node and the search node and/or the direction from the search node to the document node (or vice versa).
- the relative position of the individual document nodes with respect to the general search node may represent similarity between corresponding identified documents and the set of selected keywords.
- the relative position of the individual document nodes with respect to a subset search node may represent similarity between the corresponding identified documents and a subset of the set of selected keywords.
- Edges may exist between the document nodes and the search node(s).
- individual edges may be labeled with a similarity score reflecting the similarity between the corresponding identified documents and the search query represented by a search node (entirety or portion of the search query).
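The node and edge structure described above can be sketched as a plain dictionary: one general search node for the whole query, optional subset search nodes, and an edge from every search node to every document node labeled with a similarity score. The similarity measure used here (the fraction of a search node's keywords found in the document) and the conversion of similarity to distance are assumptions made for illustration; the source does not fix either. All function and key names are hypothetical.

```python
def similarity(doc_tokens, keywords):
    """Fraction of the keywords found in the document (an assumed, simple measure)."""
    keywords = set(keywords)
    return len(keywords & set(doc_tokens)) / len(keywords) if keywords else 0.0

def build_graph(documents, selected_keywords, subsets=None):
    """Build document nodes, search nodes, and similarity-labeled edges between them."""
    search_nodes = {"general": list(selected_keywords)}  # whole search query
    search_nodes.update(subsets or {})                   # portions of the search query
    graph = {
        "document_nodes": list(documents),
        "search_nodes": search_nodes,
        "edges": [],
    }
    for search_id, keywords in search_nodes.items():
        for doc_id, text in documents.items():
            score = similarity(text.lower().split(), keywords)
            graph["edges"].append({
                "search": search_id,
                "document": doc_id,
                "similarity": score,
                "distance": 1.0 - score,  # more similar documents sit closer
            })
    return graph

docs = {"d1": "pump seal failure", "d2": "valve corrosion"}
graph = build_graph(docs, ["pump", "seal", "valve"], {"subset": ["pump", "seal"]})
```

In this toy example, document `d1` contains both keywords of the subset search node, so its edge to that node has distance zero, while its edge to the general search node is longer.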
- other information relating to the documents may be displayed within the knowledge graph model.
- the knowledge graph model may display information relating to dates of the documents, sizes of the documents, authors of the documents, owners of the documents, publications of the documents, and/or other information relating to the documents.
- the knowledge graph model may display an eigenvalue calculation from a graph of document word vectors.
- the knowledge graph model may be an interactive model.
- a user may interact with the knowledge graph model to obtain one or more of the identified documents. For example, individual document nodes may operate as a shortcut to the corresponding document.
- a user may prompt retrieval of (e.g., open, download) a document by clicking the corresponding document node.
- FIGS. 4A and 4B illustrate example knowledge graph models 400, 450.
- the knowledge graph model 400 may include document nodes 412, 414, 418, 420 and a search node 402.
- the document nodes 412, 414, 418, 420 may represent different documents identified based on a set of selected keywords.
- the search node 402 may represent the entirety of selected keywords (whole search query).
- the relative position of the document nodes 412, 414, 418, 420 with respect to the search node 402 may represent similarity between the corresponding identified documents and the entirety of the selected keywords.
- Shorter distances between the document nodes 412, 414, 418, 420 and the search node 402 may indicate greater similarity between the corresponding identified documents and the entirety of the selected keywords. Longer distances between the document nodes 412, 414, 418, 420 and the search node 402 may indicate less similarity between the corresponding identified documents and the entirety of the selected keywords.
- the knowledge graph model 400 may normalize the distances between the document nodes 412, 414, 418, 420 and the search node 402 so that the entire knowledge graph model 400 fits within a given area (e.g., fits on a screen).
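One way to normalize distances so the model fits within a given area is to scale every distance by the same factor, so that the farthest document node lands on the edge of the available radius. The sketch below assumes the search node sits at the origin and spreads document nodes evenly by angle; the angular placement is purely an illustrative choice, since the source does not say how direction is assigned. The `layout` function and its parameters are hypothetical.

```python
import math

def layout(distances, radius=100.0):
    """Place document nodes around a search node at the origin, scaled to fit radius.

    distances maps a document id to its unnormalized distance from the search node.
    """
    longest = max(distances.values(), default=0.0)
    scale = radius / longest if longest > 0 else 0.0
    positions = {}
    for i, (doc_id, dist) in enumerate(sorted(distances.items())):
        angle = 2 * math.pi * i / len(distances)  # spread nodes evenly by angle
        r = dist * scale                          # relative distances are preserved
        positions[doc_id] = (r * math.cos(angle), r * math.sin(angle))
    return positions

positions = layout({"a": 0.2, "b": 0.5, "c": 1.0, "d": 2.0}, radius=100.0)
```

Because a single scale factor is applied, the ordering of documents by closeness to the search node is unchanged by the normalization.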
- the knowledge graph model 450 may include the document nodes 412, 414, 418, 420 and a search node 404.
- the search node 404 may represent a subset of the selected keywords (a portion of the search query).
- the relative position of the document nodes 412, 414, 418, 420 with respect to the search node 404 may represent similarity between the corresponding identified documents and the subset of the selected keywords.
- the document that most closely matches the subset of selected keywords may be the document represented by the document node 412.
- While the knowledge graph models 400, 450 are shown with one search node each, this is merely an example and is not meant to be limiting. In some implementations, a knowledge graph model may include multiple search nodes representing different sets of selected keywords.
- the knowledge graph model may assist users in determining which identified documents are most relevant to the search query.
- the knowledge graph model may allow the users to determine which of the documents will be reviewed and/or the order in which the documents will be reviewed based on the proximity of the corresponding document nodes to the search node(s). For instance, a user may start by reviewing documents with document nodes closest to the search node(s). A user may target documents with document nodes near a particular search node to target documents that meet a particular portion of the search query.
- the knowledge graph model may provide visualization of similarity among the identified documents through clustering. Documents that are similar to each other may be placed near each other within the knowledge graph model. Different clusters of document nodes may indicate different clusters of similar documents. The density of document nodes within the knowledge graph model may indicate how unique/similar the identified documents are to each other.
- one or more analytics may be performed on the identified documents to generate one or more categorical reports for the identified documents.
- a categorical report may include division of the identified documents into multiple categories.
- a categorical report may include division of the identified documents into different categories, and one or more subdivision of documents in different categories into multiple subcategories.
- the documents may be divided based on dates of the documents, sizes of the documents, authors of the documents, owners of the documents, publications of the documents, and/or other information relating to the documents.
- the division of the identified documents into categories may be presented in one or more visual forms, such as in charts, graphs, maps, and/or other visual forms. Other analytics and/or analysis of the documents is contemplated.
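A categorical report of the kind described above can be sketched as a grouping of documents by one metadata field, optionally subdivided by a second. The document records, their field names (`id`, `year`, `author`), and the `categorical_report` function are hypothetical and serve only to illustrate the division into categories and subcategories.

```python
def categorical_report(documents, category, subcategory=None):
    """Group document ids by one metadata field, optionally subdivided by a second."""
    report = {}
    for doc in documents:
        bucket = report.setdefault(doc[category], {} if subcategory else [])
        if subcategory:
            bucket.setdefault(doc[subcategory], []).append(doc["id"])
        else:
            bucket.append(doc["id"])
    return report

# Hypothetical metadata for three identified documents.
found = [
    {"id": "a", "year": 2020, "author": "Lee"},
    {"id": "b", "year": 2020, "author": "Kim"},
    {"id": "c", "year": 2021, "author": "Lee"},
]
by_year = categorical_report(found, "year")
by_year_and_author = categorical_report(found, "year", "author")
```

The resulting dictionaries could then feed a chart or graph, with each key becoming one category in the visual form.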
- the current disclosure may be utilized to facilitate technology search, technology landscape review, and/or other analysis of technology.
- the current disclosure may be utilized to facilitate searching of information (e.g., prior art, publication) relating to particular technology (e.g., invention, research area). Such information may be used to assess novelty of technology.
- the current disclosure may be utilized to facilitate prior art searching for a particular invention and/or to structure disclosure of the invention.
- the current disclosure may be utilized to provide guidance on filing protection for the invention (e.g., patent application filing). Additionally, high risk portions of the disclosure (e.g., high risk claims) may be identified and resolutions to such may be provided.
- the current disclosure may be utilized to identify prior art relating to an invention and assess novelty of the invention.
- a user may provide keywords to be used in searching for prior art.
- the keywords provided by the user may be utilized to generate other words that may be used in searching for prior art.
- a user may select keywords that will be used in the search.
- the prior art found in the search may be presented using a knowledge graph model.
- the knowledge graph model may enable the user to determine similarity of the prior art to the search query and to review the prior art. Other analysis may be performed to assist the user in determining whether to proceed with filing protection for the invention (e.g., patent application filing).
- a user may provide (e.g., upload) information about an invention (e.g., invention disclosure, patent claims) and analysis may be performed to identify and classify high risk portions.
- A high risk portion may refer to a portion of the invention information that may be duplicative of prior art and/or may have a high probability (e.g., greater than a certain percentage, such as greater than 50%) of being rejected as being disclosed/suggested by the prior art.
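The threshold test in the definition of a high risk portion can be sketched as a simple filter. The sketch assumes some estimator returns a rejection probability for each portion; the estimator itself is not shown, and the probabilities below are made-up placeholder values. `flag_high_risk` is a hypothetical name.

```python
def flag_high_risk(portions, rejection_probability, threshold=0.5):
    """Return the portions whose estimated rejection probability exceeds threshold."""
    return [p for p in portions if rejection_probability(p) > threshold]

# Hypothetical probabilities, standing in for the output of a trained classifier.
estimates = {"claim 1": 0.72, "claim 2": 0.31, "claim 3": 0.50}
high_risk = flag_high_risk(estimates, estimates.get)
```

The comparison is strict, matching the "greater than 50%" wording, so a portion estimated at exactly 0.50 is not flagged.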
- Information relating to obtaining protection for the invention may be generated and provided to the user.
- the user may be provided with information on likelihood of acceptance of the patent claims for the invention, examiner information, industry categorization, and/or other information that may assist the user in making a decision on whether to attempt to protect the invention.
- the accuracy of the classification by the text classification models may be increased by training the text classification models with art unit information, examiner information, and/or other information relating to the claims/patent office(s).
- the text classification models may take into account not just whether particular claim language has been previously rejected/accepted, but also the art unit in which the claims were processed and/or the examiner that reviewed the claims.
- the text classification models may be used to provide suggested changes to the claim language to increase the probability of the claims being accepted at the patent office(s).
- information about likely art unit and/or potential examiner for the claims may be provided. For instance, focus and/or rejection rates of particular art unit/examiner may be provided.
- the current disclosure may be utilized to identify and analyze documents in other fields.
- the current disclosure may be used in supply, trading, and procurement to automate the categorization/classification of supply chain contracts, autocomplete and summarize contracts, detect ambiguities within contracts, identify strengths/weaknesses in contracts, determine differences between same types of contracts, and/or suggest improvements to contracts.
- the current disclosure may be used in Health, Safety, and Environment (HSE) in the oil and gas industry to automate the categorization/classification of HSE reports, autocomplete and summarize HSE reports, perform risk assessment, assist with legal and other operations, find the most similar scope clause or compensation clause, and/or enable conceptual similarity searches.
- the current disclosure may be used in engineering fields to improve engineering requirements per INCOSE standards (e.g., ambiguity, conciseness, etc.), and/or facility requirements packaging (e.g., associate requirements with a particular effort/projects/scope). Other usage of the current disclosure is contemplated.
- External resources may include hosts/sources of information, computing, and/or processing and/or other providers of information, computing, and/or processing outside of the system 10.
- any communication medium may be used to facilitate interaction between any components of the system 10.
- One or more components of the system 10 may communicate with each other through hard-wired communication, wireless communication, or both.
- one or more components of the system 10 may communicate with each other through a network.
- the processor 11 may wirelessly communicate with the electronic storage 13.
- wireless communication may include one or more of radio communication, Bluetooth communication, Wi-Fi communication, cellular communication, infrared communication, or other wireless communication. Other types of communications are contemplated by the present disclosure.
- Although the processor 11 and the electronic storage 13 are shown in FIG. 1 as single entities, this is for illustrative purposes only.
- One or more of the components of the system 10 may be contained within a single device or across multiple devices.
- the processor 11 may comprise a plurality of processing units. These processing units may be physically located within the same device, or the processor 11 may represent processing functionality of a plurality of devices operating in coordination.
- the processor 11 may be separate from and/or be part of one or more components of the system 10.
- the processor 11 may be configured to execute one or more components by software; hardware; firmware; some combination of software, hardware, and/or firmware; and/or other mechanisms for configuring processing capabilities on the processor 11.
- the processor 11 may be configured to execute one or more additional computer program components that may perform some or all of the functionality attributed to one or more of the computer program components described herein.
- a set of keywords may be obtained for use in searching for documents.
- operation 202 may be performed by a processor component the same as or similar to the keyword component 102 (shown in FIG. 1 and described herein).
- a knowledge graph model that represents the identified documents may be generated.
- the knowledge graph model may include document nodes representing the identified documents and a first search node representing the set of selected keywords. Relative position of the individual document nodes with respect to the first search node may represent similarity between corresponding identified documents and the set of selected keywords.
- operation 210 may be performed by a processor component the same as or similar to the model component 110 (shown in FIG. 1 and described herein).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Document Processing Apparatus (AREA)
Abstract
Description
Claims (20)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/188,774 US12026185B2 (en) | 2021-03-01 | 2021-03-01 | Document search and analysis tool |
CA3211088A CA3211088A1 (en) | 2021-03-01 | 2022-02-28 | Document search and analysis tool |
EP22763824.4A EP4302204A4 (en) | 2021-03-01 | 2022-02-28 | DOCUMENT SEARCH AND ANALYSIS TOOL |
CN202280018080.6A CN116917885A (en) | 2021-03-01 | 2022-02-28 | Document searching and analyzing tool |
PCT/US2022/018086 WO2022187120A1 (en) | 2021-03-01 | 2022-02-28 | Document search and analysis tool |
AU2022228415A AU2022228415B2 (en) | 2021-03-01 | 2022-02-28 | Document search and analysis tool |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/188,774 US12026185B2 (en) | 2021-03-01 | 2021-03-01 | Document search and analysis tool |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220277032A1 (en) | 2022-09-01 |
US12026185B2 (en) | 2024-07-02 |
Family
ID=83006438
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/188,774 Active 2041-09-01 US12026185B2 (en) | 2021-03-01 | 2021-03-01 | Document search and analysis tool |
Country Status (6)
Country | Link |
---|---|
US (1) | US12026185B2 (en) |
EP (1) | EP4302204A4 (en) |
CN (1) | CN116917885A (en) |
AU (1) | AU2022228415B2 (en) |
CA (1) | CA3211088A1 (en) |
WO (1) | WO2022187120A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12106192B2 (en) | 2021-03-01 | 2024-10-01 | Chevron U.S.A. Inc. | White space analysis |
US12008054B2 (en) * | 2022-01-31 | 2024-06-11 | Walmart Apollo, Llc | Systems and methods for determining and utilizing search token importance using machine learning architectures |
JP2024065828A (en) * | 2022-10-31 | 2024-05-15 | 株式会社東芝 | Information processing device, information processing method, and information processing program |
Citations (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070003166A1 (en) * | 2005-06-30 | 2007-01-04 | Kathrin Berkner | White space graphs and trees for content-adaptive scaling of document images |
US20080033741A1 (en) * | 2006-08-04 | 2008-02-07 | Leviathan Entertainment, Llc | Automated Prior Art Search Tool |
US20100125566A1 (en) * | 2008-11-18 | 2010-05-20 | Patentcafe.Com, Inc. | System and method for conducting a patent search |
US20110302172A1 (en) * | 2010-06-02 | 2011-12-08 | Microsoft Corporation | Topical search engines and query context models |
US20110320186A1 (en) | 2010-06-23 | 2011-12-29 | Rolls-Royce Plc | Entity recognition |
US20130013603A1 (en) | 2011-05-24 | 2013-01-10 | Namesforlife, Llc | Semiotic indexing of digital resources |
US8819451B2 (en) | 2009-05-28 | 2014-08-26 | Microsoft Corporation | Techniques for representing keywords in an encrypted search index to prevent histogram-based attacks |
US8843821B2 (en) | 2000-02-29 | 2014-09-23 | Bao Q. Tran | Patent development system |
US20150169582A1 (en) | 2013-12-14 | 2015-06-18 | Mirosoft Corporation | Query techniques and ranking results for knowledge-based matching |
WO2016036760A1 (en) | 2014-09-03 | 2016-03-10 | Atigeo Corporation | Method and system for searching and analyzing large numbers of electronic documents |
US20160378805A1 (en) | 2015-06-23 | 2016-12-29 | Microsoft Technology Licensing, Llc | Matching documents using a bit vector search index |
US20180039696A1 (en) * | 2016-08-08 | 2018-02-08 | Baidu Usa Llc | Knowledge graph entity reconciler |
US20180060733A1 (en) * | 2016-08-31 | 2018-03-01 | International Business Machines Corporation | Techniques for assigning confidence scores to relationship entries in a knowledge graph |
US10248718B2 (en) | 2015-07-04 | 2019-04-02 | Accenture Global Solutions Limited | Generating a domain ontology using word embeddings |
US20190114304A1 (en) * | 2016-05-27 | 2019-04-18 | Koninklijke Philips N.V. | Systems and methods for modeling free-text clinical documents into a hierarchical graph-like data structure based on semantic relationships among clinical concepts present in the documents |
US20190354636A1 (en) | 2018-05-18 | 2019-11-21 | Xcential Corporation | Methods and Systems for Comparison of Structured Documents |
US10497366B2 (en) | 2018-03-23 | 2019-12-03 | Servicenow, Inc. | Hybrid learning system for natural language understanding |
US20190370397A1 (en) * | 2018-06-01 | 2019-12-05 | Accenture Global Solutions Limited | Artificial intelligence based-document processing |
US20210013103A1 (en) * | 2018-07-31 | 2021-01-14 | Taiwan Semiconductor Manufacturing Co., Ltd. | Semiconductor device with elongated pattern |
US20210012103A1 (en) | 2019-07-09 | 2021-01-14 | American International Group, Inc., | Systems and methods for information extraction from text documents with spatial context |
US20210056261A1 (en) * | 2019-08-20 | 2021-02-25 | Daystrom Information Systems, Llc | Hybrid artificial intelligence system for semi-automatic patent pinfringement analysis |
US20210133390A1 (en) * | 2019-11-01 | 2021-05-06 | Fuji Xerox Co., Ltd. | Conceptual graph processing apparatus and non-transitory computer readable medium |
US20220253477A1 (en) * | 2021-02-08 | 2022-08-11 | Adobe Inc. | Knowledge-derived search suggestion |
US20220277220A1 (en) | 2021-03-01 | 2022-09-01 | Chevron U.S.A. Inc. | White space analysis |
2021
- 2021-03-01 US US17/188,774 (US12026185B2): Active
2022
- 2022-02-28 CN CN202280018080.6A (CN116917885A): Pending
- 2022-02-28 EP EP22763824.4A (EP4302204A4): Pending
- 2022-02-28 CA CA3211088A (CA3211088A1): Pending
- 2022-02-28 WO PCT/US2022/018086 (WO2022187120A1): Application Filing
- 2022-02-28 AU AU2022228415A (AU2022228415B2): Active
Patent Citations (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8843821B2 (en) | 2000-02-29 | 2014-09-23 | Bao Q. Tran | Patent development system |
US20070003166A1 (en) * | 2005-06-30 | 2007-01-04 | Kathrin Berkner | White space graphs and trees for content-adaptive scaling of document images |
US20080033741A1 (en) * | 2006-08-04 | 2008-02-07 | Leviathan Entertainment, Llc | Automated Prior Art Search Tool |
US20100125566A1 (en) * | 2008-11-18 | 2010-05-20 | Patentcafe.Com, Inc. | System and method for conducting a patent search |
US8819451B2 (en) | 2009-05-28 | 2014-08-26 | Microsoft Corporation | Techniques for representing keywords in an encrypted search index to prevent histogram-based attacks |
US20110302172A1 (en) * | 2010-06-02 | 2011-12-08 | Microsoft Corporation | Topical search engines and query context models |
US20110320186A1 (en) | 2010-06-23 | 2011-12-29 | Rolls-Royce Plc | Entity recognition |
US20130013603A1 (en) | 2011-05-24 | 2013-01-10 | Namesforlife, Llc | Semiotic indexing of digital resources |
US20150169582A1 (en) | 2013-12-14 | 2015-06-18 | Microsoft Corporation | Query techniques and ranking results for knowledge-based matching |
WO2016036760A1 (en) | 2014-09-03 | 2016-03-10 | Atigeo Corporation | Method and system for searching and analyzing large numbers of electronic documents |
US20160378805A1 (en) | 2015-06-23 | 2016-12-29 | Microsoft Technology Licensing, Llc | Matching documents using a bit vector search index |
US10248718B2 (en) | 2015-07-04 | 2019-04-02 | Accenture Global Solutions Limited | Generating a domain ontology using word embeddings |
US20190114304A1 (en) * | 2016-05-27 | 2019-04-18 | Koninklijke Philips N.V. | Systems and methods for modeling free-text clinical documents into a hierarchical graph-like data structure based on semantic relationships among clinical concepts present in the documents |
US20180039696A1 (en) * | 2016-08-08 | 2018-02-08 | Baidu Usa Llc | Knowledge graph entity reconciler |
US20180060733A1 (en) * | 2016-08-31 | 2018-03-01 | International Business Machines Corporation | Techniques for assigning confidence scores to relationship entries in a knowledge graph |
US10497366B2 (en) | 2018-03-23 | 2019-12-03 | Servicenow, Inc. | Hybrid learning system for natural language understanding |
US20190354636A1 (en) | 2018-05-18 | 2019-11-21 | Xcential Corporation | Methods and Systems for Comparison of Structured Documents |
US20190370397A1 (en) * | 2018-06-01 | 2019-12-05 | Accenture Global Solutions Limited | Artificial intelligence based-document processing |
US20210013103A1 (en) * | 2018-07-31 | 2021-01-14 | Taiwan Semiconductor Manufacturing Co., Ltd. | Semiconductor device with elongated pattern |
US20210012103A1 (en) | 2019-07-09 | 2021-01-14 | American International Group, Inc. | Systems and methods for information extraction from text documents with spatial context |
US20210056261A1 (en) * | 2019-08-20 | 2021-02-25 | Daystrom Information Systems, Llc | Hybrid artificial intelligence system for semi-automatic patent infringement analysis |
US20210133390A1 (en) * | 2019-11-01 | 2021-05-06 | Fuji Xerox Co., Ltd. | Conceptual graph processing apparatus and non-transitory computer readable medium |
US20220253477A1 (en) * | 2021-02-08 | 2022-08-11 | Adobe Inc. | Knowledge-derived search suggestion |
US20220277220A1 (en) | 2021-03-01 | 2022-09-01 | Chevron U.S.A. Inc. | White space analysis |
Non-Patent Citations (4)
Title |
---|
AU Examination Report from AU Application No. 2022228414, mailed Oct. 23, 2023 (3 pages). |
AU Examination Report from AU Application No. 2022228415, mailed Oct. 18, 2023 (3 pages). |
Bi et al., PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation, Sep. 20, 2020 [online], [retrieved May 8, 2022]. Retrieved from the Internet <URL: https://arxiv.org/pdf/2004.07159.pdf?fbclid=IwAR0BNI1lzR5bhcuEbyfNw2UN7MApHFoFP3BN40FKkW8x3bqolK_HilU293>, entire document, especially p. 4, col. 2, para 3; p. 6, col. 2, para 2; p. 7, col. 1, para 1; and p. 8, col. 2, para 3. |
International Search Report and Written Opinion for PCT Application No. PCT/US2022/018082, mailed Jun. 9, 2022 (8 pages). |
Also Published As
Publication number | Publication date |
---|---|
EP4302204A1 (en) | 2024-01-10 |
WO2022187120A1 (en) | 2022-09-09 |
CA3211088A1 (en) | 2022-09-09 |
EP4302204A4 (en) | 2025-02-12 |
CN116917885A (en) | 2023-10-20 |
AU2022228415A1 (en) | 2023-08-24 |
US20220277032A1 (en) | 2022-09-01 |
AU2022228415B2 (en) | 2024-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2022228415B2 (en) | Document search and analysis tool | |
US12259930B2 (en) | System and method for automated file reporting | |
US9305083B2 (en) | Author disambiguation | |
US10489439B2 (en) | System and method for entity extraction from semi-structured text documents | |
US10049148B1 (en) | Enhanced text clustering based on topic clusters | |
CN107958014B (en) | Search engine | |
US8738635B2 (en) | Detection of junk in search result ranking | |
US9996504B2 (en) | System and method for classifying text sentiment classes based on past examples | |
US9858330B2 (en) | Content categorization system | |
US11232114B1 (en) | System and method for automated classification of structured property description extracted from data source using numeric representation and keyword search | |
US11263209B2 (en) | Context-sensitive feature score generation | |
CN108228612B (en) | Method and device for extracting network event keywords and emotional tendency | |
Short | Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels | |
US10671810B2 (en) | Citation explanations | |
US12265567B2 (en) | Artificial intelligence assisted originality evaluator | |
Silva | Parts that add up to a whole: a framework for the analysis of tables | |
Tong et al. | Mining and analyzing user feedback from app reviews: An econometric approach | |
US20220277220A1 (en) | White space analysis | |
US20210089541A1 (en) | Intellectual property support device, intellectual property support method, and intellectual property support program | |
Romero-Córdoba et al. | A comparative study of soft computing software for enhancing the capabilities of business document management systems | |
Menéndez | Damegender: Towards an International and Free Dataset about Name, Gender and Frequency | |
Araszkiewicz et al. | Similarity and Relevance of Court Decisions: A Computational Study on | |
Tran et al. | “Gute Arbeit”: Topic Exploration and Analysis Challenges for Corpora of German Qualitative Studies | |
Taylor | Reduced Geographic Scope as a Strategy for Toponym Resolution | |
Caliò | Advancing Keyword Clustering Techniques: A Comparative Exploration of Supervised and Unsupervised Methods |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | AS | Assignment | Owner name: CHEVRON U.S.A. INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MOHAMED, MOHAMED IBRAHIM;BOWDEN, LARRY A.;FENG, XIN;SIGNING DATES FROM 20210226 TO 20211022;REEL/FRAME:057906/0228 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | AS | Assignment | Owner name: CHEVRON U.S.A. INC., CALIFORNIA Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE THE SPELLING OF THE 2ND INVENTOR'S NAME TO INCLUDE THE SUFFIX PREVIOUSLY RECORDED ON REEL 57906 FRAME 228. ASSIGNOR(S) HEREBY CONFIRMS THE THE ASSIGNMENT;ASSIGNORS:MOHAMED, MOHAMED IBRAHIM;BOWDEN, LARRY A., JR.;FENG, XIN;SIGNING DATES FROM 20210226 TO 20211022;REEL/FRAME:067433/0929 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |