US8548995B1 - Ranking of documents based on analysis of related documents - Google Patents
- Publication number
- US8548995B1
- Authority
- US
- United States
- Prior art keywords
- document
- documents
- relevance score
- score
- instructions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Definitions
- the present invention relates generally to the ranking of documents and, more particularly, to techniques for refining the ranking of an initial set of documents.
- the World Wide Web (“web”) contains a vast amount of information. Locating a desired portion of the information, however, can be challenging. This problem is compounded because the amount of information on the web and the number of new users inexperienced at web searching are growing rapidly.
- Search engines attempt to return hyperlinks to web pages in which a user is interested.
- search engines base their determination of the user's interest on search terms (called a search query) entered by the user.
- the goal of the search engine is to provide links to high quality, relevant results to the user based on the search query.
- the search engine accomplishes this by matching the terms in the search query to a corpus of pre-stored web pages. Web pages that contain the user's search terms are “hits” and are returned to the user.
- a search engine may attempt to sort the list of hits so that the most relevant and/or highest quality pages are at the top of the list of hits returned to the user. For example, the search engine may assign a rank or score to each hit, where the score is designed to correspond to the relevance or importance of the web page. Determining appropriate scores can be a difficult task. For one thing, the importance of a web page to the user is inherently subjective and depends on the user's interests, knowledge, and attitudes. There is, however, much that can be determined objectively about the relative importance of a web page. Conventional methods of determining relevance are based on the contents of the web page.
- More advanced techniques determine the importance of a web page based on more than the content of the web page. For example, one known method, described in the article entitled “The Anatomy of a Large-Scale Hypertextual Search Engine,” by Sergey Brin and Lawrence Page, assigns a degree of importance to a web page based on the link structure of the web page. In other words, the Brin and Page algorithm attempts to quantify the importance of a web page based on more than just the content of the web page.
- a returned set of news articles about a particular news topic may be ranked.
- Postings gathered from message groups, such as Usenet groups, may also be ranked when returned to the user.
- the present invention is directed to a document ranking technique in which, for a given document, a set of related documents is determined. A related set score is calculated based on the related documents. The score may then be used to modify an original ranking of the given document.
- One aspect of the invention is directed to a method for scoring documents.
- the method comprises obtaining an initial set of documents and generating a set of related documents for at least one document in the initial set of documents using a similarity criterion.
- the method further includes generating a related set score by applying a related set criterion to the set of related documents corresponding to the at least one document and scoring the at least one document using the related set score.
- Another aspect of the invention is directed to a method for refining a ranking associated with a document.
- the method includes obtaining a set of documents related to the document, calculating a relevance ranking for the set of documents using a criterion, and modifying the ranking associated with the document based on the relevance ranking for the set of documents.
- FIG. 1 is an exemplary diagram of a network in which systems and methods consistent with the principles of the invention may be implemented;
- FIG. 2 is an exemplary diagram of a client or server according to an implementation consistent with the principles of the invention;
- FIG. 3 is a block diagram illustrating an implementation of an exemplary search engine;
- FIG. 4 is a flow chart illustrating methods consistent with the present invention for implementing the re-ranking component shown in FIG. 3;
- FIG. 5 is a diagram illustrating operations consistent with aspects of the invention for computing a related set score; and
- FIG. 6 is a diagram of an alternate implementation consistent with aspects of the invention for computing the related set score.
- a ranking component refines the initial rankings of documents that are initially ranked via a scoring criterion.
- the ranking component may generate or receive a set of documents that are related to a given document.
- the ranking component may then apply a scoring criterion to the set of related documents to generate a related set score.
- the related set score may be used to refine the initial rankings.
- FIG. 1 is an exemplary diagram of a network 100 in which systems and methods consistent with the principles of the invention may be implemented.
- Network 100 may include multiple clients 110 connected to one or more servers 120 via a network 140 .
- Network 140 may include a local area network (LAN), a wide area network (WAN), a telephone network, such as the Public Switched Telephone Network (PSTN), an intranet, the Internet, or a combination of networks.
- Two clients 110 and one server 120 have been illustrated as connected to network 140 for simplicity. In practice, there may be more or fewer clients and servers. Also, in some instances, a client may perform the functions of a server and a server may perform the functions of a client.
- Clients 110 may include client entities.
- An entity may be defined as a device, such as a wireless telephone, a personal computer, a personal digital assistant (PDA), a laptop, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices.
- Server 120 may include server entities that process, search, and/or maintain documents in a manner consistent with the principles of the invention.
- Clients 110 and server 120 may connect to network 140 via wired, wireless, or optical connections.
- server 120 may include a search engine 125 usable by clients 110 .
- Search engine 125 may be a search engine such as a query-based web page search engine, a news server, or a Usenet message server or archiving source.
- search engine 125, in response to a client request, returns sets of documents to the client. These documents may be ranked and displayed in a ranking order determined consistent with aspects of the invention.
- a document is to be broadly interpreted to include any machine-readable and machine-storable work product.
- a document may be an email, a file, a combination of files, one or more files with embedded links to other files, a news group posting, etc.
- a common document is a Web page. Web pages often include content and may include embedded information (such as meta information, hyperlinks, etc.) and/or embedded instructions (such as Javascript, etc.).
- FIG. 2 is an exemplary diagram of a client 110 or server 120 according to an implementation consistent with the principles of the invention.
- Client/server 110 / 120 may include a bus 210 , a processor 220 , a main memory 230 , a read only memory (ROM) 240 , a storage device 250 , one or more input devices 260 , one or more output devices 270 , and a communication interface 280 .
- Bus 210 may include one or more conductors that permit communication among the components of client/server 110 / 120 .
- Processor 220 may include any type of conventional processor or microprocessor that interprets and executes instructions.
- Main memory 230 may include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220 .
- ROM 240 may include a conventional ROM device or another type of static storage device that stores static information and instructions for use by processor 220 .
- Storage device 250 may include a magnetic and/or optical recording medium and its corresponding drive.
- Input device(s) 260 may include one or more conventional mechanisms that permit a user to input information to client/server 110 / 120 , such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc.
- Output device(s) 270 may include one or more conventional mechanisms that output information to the user, including a display, a printer, a speaker, etc.
- Communication interface 280 may include any transceiver-like mechanism that enables client 110 to communicate with other devices and/or systems.
- communication interface 280 may include mechanisms for communicating with another device or system via a network, such as network 140 .
- server 120 performs certain searching or document retrieval related operations through search engine 125 .
- Search engine 125 may be stored in a computer-readable medium such as memory 230 .
- a computer-readable medium may be defined as one or more physical or logical memory devices and/or carrier waves.
- the software instructions defining search engine 125 may be read into memory 230 from another computer-readable medium, such as data storage device 250 , or from another device via communication interface 280 .
- the software instructions contained in memory 230 cause processor 220 to perform processes that will be described later.
- hardwired circuitry may be used in place of or in combination with software instructions to implement processes consistent with the present invention.
- implementations consistent with the principles of the invention are not limited to any specific combination of hardware circuitry and software.
- FIG. 3 is a block diagram illustrating an implementation of search engine 125 in additional detail.
- search engine 125 is described as a traditional search engine that returns a ranked or ordered set of documents related to a user query.
- search engine 125 may be thought of as any of a number of services or applications that rank or order a set of input documents. For example, a set of documents that are classified by topic or a set of postings gathered from message groups, such as Usenet groups, may also be ranked when returned to the user.
- Search engine 125 may also be a specialized search engine, such as a news search engine.
- Search engine 125 may include a document locator 330 and a ranking component 340 .
- document locator 330 finds a set of documents whose contents match a user search query.
- Ranking component 340 further ranks the located set of documents based on relevance.
- Document locator 330 may initially locate documents from a document corpus by comparing the terms in the user's search query to the documents in the corpus. In general, processes for indexing documents and searching the indexed corpus of documents to return a set of documents containing the searched terms are well known in the art. Accordingly, this functionality of document locator 330 will not be described further herein.
- Ranking component 340 assists search engine 125 in returning relevant documents to the user by ranking the set of documents identified by document locator 330 .
- This ranking may take the form of assigning a numerical value corresponding to the calculated relevance of each document identified by document locator 330 .
- Ranking component 340 includes main ranking component 345 and re-ranking component 347 .
- Main ranking component 345 assigns an initial rank (a score) to each document received from document locator 330 .
- the initial rank value corresponds to a calculated relevance of the document.
- a number of suitable ranking algorithms are known in the art, one of which is described in the article by Brin and Page, as mentioned in the Background of the Invention section of this disclosure.
- main ranking component 345 and document locator 330 may be combined so that document locator 330 produces a set of relevant documents each having rank values.
- the rank values may be generated based on the relative position of the user's search terms in the returned documents. For example, documents may have their rank value based on the proximity of the search terms in the document (documents with the search terms close together are given higher rank values) or on the number of occurrences of the search term (e.g., a document that repeatedly uses a search term is given a higher rank value).
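The two signals mentioned above (number of occurrences and proximity of the search terms) can be illustrated with a minimal sketch. The function names, the whitespace tokenization, and the way proximity is turned into a number are assumptions made for illustration; the source does not specify a formula.

```python
def occurrence_score(query_terms, doc_tokens):
    """Count how many tokens in the document are query terms."""
    wanted = {t.lower() for t in query_terms}
    return sum(1 for tok in doc_tokens if tok.lower() in wanted)


def proximity_score(query_terms, doc_tokens):
    """Assumed proximity measure: (number of distinct query terms) divided by
    the width of the smallest token window containing all of them.
    Returns 1.0 when the terms are adjacent and 0.0 when a term is missing."""
    wanted = {t.lower() for t in query_terms}
    tokens = [t.lower() for t in doc_tokens]
    best = None
    for start in range(len(tokens)):
        if tokens[start] not in wanted:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in wanted:
                seen.add(tokens[end])
            if seen == wanted:
                width = end - start + 1
                if best is None or width < best:
                    best = width
                break
    return 0.0 if best is None else len(wanted) / best
```

Under this sketch, a document in which "new" and "york" are adjacent scores higher on proximity than one in which they are separated by other words.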
- the initial ranking scores assigned by main ranking component 345 may be refined by re-ranking component 347 to improve the relevance scores.
- FIG. 4 is a flow chart illustrating methods consistent with the present invention for implementing re-ranking component 347 .
- the functions performed by main ranking component 345 and re-ranking component 347 may be combined as a single ranking component 340 .
- document locator 330 may return an initial set of documents, such as a set of documents generated in response to a user search query (act 401 ).
- Each of the documents in the initial set of documents may be initially ranked based on some scoring criterion that may generate a rank score or value for each document in the initial set of documents.
- the criterion may be based on, for example, a user search query, a topic (e.g., sports), a list of keywords, a geographical area, or similarity to another document or set of documents.
- re-ranking component 347 may then generate one or more sets of related documents (act 402 ).
- a set of related documents may be generated for each document, d, in the initial set of documents.
- the related sets of documents will be referred to herein as a related set D, which includes documents d_1, d_2, ..., d_n, where n is a positive integer greater than or equal to one.
- the related set D may be drawn from all documents known to search engine 125 , or may be drawn from any desired subset of documents.
- re-ranking component 347 may compute a related set score for D (act 403 ).
- the related set score may be based on a matching procedure between the document d and the related set D, or a subset of the related set D.
- the initial ranking score for document d may then be modified based on the related set score (act 404 ).
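Acts 401 through 404 can be read as a simple loop over the initial set. The sketch below is one possible arrangement, not the patented implementation: the three callables (`related_set_for`, `related_set_score`, `modify`) are hypothetical hooks standing in for whichever similarity criterion, scoring procedure, and combining rule are chosen.

```python
from typing import Callable, Dict, List


def rerank(initial_scores: Dict[str, float],
           related_set_for: Callable[[str], List[str]],
           related_set_score: Callable[[str, List[str]], float],
           modify: Callable[[float, float], float]) -> List[str]:
    """Return document ids ordered by their modified ranking score."""
    final = {}
    for doc_id, initial in initial_scores.items():   # act 401: initial set with scores
        related = related_set_for(doc_id)            # act 402: related set D for d
        rs_score = related_set_score(doc_id, related)  # act 403: related set score
        final[doc_id] = modify(initial, rs_score)    # act 404: modify initial score
    return sorted(final, key=final.get, reverse=True)
```

A caller supplies the initial scores and the three hooks; a document with a lower initial score can overtake another if its related set scores well.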
- a number of techniques may be used to compute the set of related documents D for a particular document d in the initial set.
- the documents in D are determined as documents that re-ranking component 347 determines to be somehow similar to document d.
- similarity between document d and another document may be based on authorship or publication information.
- This type of similarity criterion can be particularly useful in the context of news articles. For example, if news articles are being ranked, one similarity criterion may be defined from the news source that published the news article d. The related set D may then be defined as a set of documents published by the same source as news article d. Thus, if d is an article from the New York Times, then D may be the set of previous articles published by the New York Times.
- Another news article based similarity criterion may be defined as news articles having the same author as the news article d. Under this criterion, other articles written by the same journalist may form related set D. For example, if document d is an article by Thomas Friedman, the related set D may be the set of previous articles by Thomas Friedman.
- Yet another possible news article based similarity criterion may be defined as news articles from the same or similar publication sections.
- the similarity criterion may be the same newspaper section.
- Other articles from the same section (e.g., “sports” or “business”) could be used to form related set D. If document d is an article in the “business” section, then related set D could be defined as the set of other articles in the “business” sections of any newspaper.
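The news-oriented similarity criteria described above (same source, same author, same section) amount to filtering a corpus on one metadata field. The following is a minimal sketch; the `Article` record and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Article:
    doc_id: str
    source: str   # e.g. the publishing newspaper
    author: str   # e.g. the journalist
    section: str  # e.g. "sports" or "business"


def related_set(d: Article, corpus: List[Article], criterion: str) -> List[Article]:
    """Return the related set D for article d under one criterion.

    `criterion` is one of "source", "author", or "section" (hypothetical names).
    The article d itself is excluded from D.
    """
    if criterion not in ("source", "author", "section"):
        raise ValueError(f"unknown criterion: {criterion}")
    key = getattr(d, criterion)
    return [a for a in corpus
            if a.doc_id != d.doc_id and getattr(a, criterion) == key]
```

The same pattern would apply to message-group postings by swapping the fields for poster address, group, and thread.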
- Similarity criteria may be defined that are more specific to message groups, such as Usenet postings.
- a similarity criterion may be based on the author, where the author is defined by the email address of the poster. Thus, other postings by the same email address may form the related set D.
- Search engine 125 may be used to search postings from multiple message groups.
- a similarity criterion may be based on the group in which the posting appears. Other articles in the same message group would be considered related by this criterion. For example, if document d is a posting in the news group “soc.culture.Ethiopia,” the related set D may be defined as all the other postings from the news group “soc.culture.Ethiopia.”
- Another similarity criterion that can be used for message groups may be based on the thread in which the posting occurred. Thus, the related set D may be defined as all the other postings in the same thread as posting d. This similarity criterion will tend to return smaller sets than one based on the news group.
- search engine 125 may implement a general web page search engine.
- the similarity criterion may define the related set D as other web pages from the same web site as document d, as web pages that link to document d, or as web pages to which document d contains a link.
- the same web site can broadly refer to documents on the same host or in the same domain.
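For general web search, the three candidate related sets (same site, pages linking to d, pages d links to) can be derived from a link map. This is a sketch under stated assumptions: the `links` dictionary is a hypothetical representation of the crawl, and "same web site" is approximated by comparing hostnames only, although the text also allows a same-domain reading.

```python
from urllib.parse import urlparse


def same_site(url_a: str, url_b: str) -> bool:
    """Treat two URLs as the same site when their hostnames match."""
    return urlparse(url_a).hostname == urlparse(url_b).hostname


def related_web_pages(d: str, links: dict) -> dict:
    """Build candidate related sets for page d.

    `links` maps each page URL to the list of URLs it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    same = {p for p in pages if p != d and same_site(p, d)}
    inbound = {p for p, targets in links.items() if d in targets and p != d}
    outbound = set(links.get(d, [])) - {d}
    return {"same_site": same, "links_to_d": inbound, "linked_from_d": outbound}
```

Any one of the three sets, or a union of them, could then serve as related set D.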
- the similarity criteria may be defined in many different ways, depending on the particular situation.
- numerous other classification or clustering techniques can be used to identify a set of related documents to a given document.
- the related documents in related set D can be pre-computed or generated when necessary.
- combinations of the above-discussed similarity criteria or other similarity criteria may be used to define related set D.
- re-ranking component 347 computes the related set score for document d.
- the search query may be classified and the classification may be compared to the pseudo-document. For example, assume the original search query is “New York Yankees.” This query may be classified as a “Sports” query, and the pseudo-document may be compared based on how well it matches the topic of sports.
- FIG. 5 is a diagram illustrating exemplary operations consistent with aspects of the invention for computing the related set score.
- Re-ranking component 347 may combine the documents in the related set D to produce a single “pseudo-document” (act 501 ).
- the documents in D may be combined via straightforward concatenation.
- a match between the pseudo-document and the initial scoring criterion may then be performed (act 502). If the initial scoring criterion is based on a search query, for example, the search query can be compared to the pseudo-document in the same manner that the search query was compared to the main document corpus from which the initial set of documents was generated.
- a different scoring criterion, such as one related to the original scoring criterion, may be used to compute the related set score.
- the result of this comparison may be a ranking score.
- the ranking score may be returned as the related set score (act 503).
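The FIG. 5 flow (acts 501 to 503) can be sketched as follows. Concatenation follows the text; the scorer is only a stand-in, since the source says the query is compared to the pseudo-document in the same manner as to the main corpus without fixing a formula. `match_score` and its term-fraction measure are assumptions.

```python
from typing import List


def build_pseudo_document(related_docs: List[str]) -> str:
    """Act 501: combine the documents in D by straightforward concatenation."""
    return "\n".join(related_docs)


def match_score(query: str, text: str) -> float:
    """Stand-in scorer (assumed): fraction of tokens in the text that are query terms."""
    terms = set(query.lower().split())
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(tok in terms for tok in tokens) / len(tokens)


def related_set_score_fig5(query: str, related_docs: List[str]) -> float:
    """Acts 502 and 503: match the query against the pseudo-document, return the score."""
    return match_score(query, build_pseudo_document(related_docs))
```

In practice, `match_score` would be replaced by whatever scorer produced the initial ranking.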
- FIG. 6 is a diagram of an exemplary alternate implementation consistent with aspects of the invention for computing the related set score. This implementation is similar to that shown in FIG. 5, except that instead of combining the documents in the related set D to form a pseudo-document, each document in D is individually matched to the initial scoring criterion in a manner similar to the matching performed in act 502 (act 601). When all the documents in D have been evaluated to obtain a ranking value (acts 602 and 603), the ranking values may be combined to obtain the related set score (act 604). The ranking values may be combined via a number of possible functions, such as an average value, a weighted average value, a sum, etc.
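The combining step of FIG. 6 (act 604) can be illustrated with the three functions the text names: average, weighted average, and sum. The function name, the `how` parameter, and the choice to return 0.0 for an empty related set are assumptions of this sketch.

```python
from typing import List, Optional


def combine(values: List[float], how: str = "average",
            weights: Optional[List[float]] = None) -> float:
    """Combine per-document ranking values into one related set score."""
    if not values:
        return 0.0  # assumed convention for an empty related set
    if how == "sum":
        return float(sum(values))
    if how == "average":
        return sum(values) / len(values)
    if how == "weighted_average":
        if weights is None or len(weights) != len(values):
            raise ValueError("weights must match values in length")
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        return sum(w * v for w, v in zip(weights, values)) / total
    raise ValueError(f"unknown combining function: {how}")
```

A weighted average would let, for example, more recent related documents count for more than older ones.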
- another possible technique for determining the related set score can include determining a geographical relevance vector of each document in related set D. For example, based on terms in the documents in D, a vector may be generated over all the documents in D, in which the vector defines a set of geographic scores. The geographic scores may represent a confidence level that the documents in related set D are relevant to a particular geographic region. For example, an exemplary vector for a particular related set D may include three non-zero confidence scores, such as (USA, 0.5), (Europe, 0.4), (Asia, 0.05). This vector may be matched with a geographic ranking criterion, such as the geographic search query “USA,” to produce the related set score. As a possible modification to this technique, the vector can be generated based on the source of each of the documents in D instead of on the documents themselves.
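The geographic variant can be sketched with the example vector from the text. How per-document confidences are aggregated and how the vector is "matched" with the query are not specified in the source; averaging across documents and a direct lookup of the queried region are both assumptions here.

```python
from typing import Dict, List


def aggregate_geo(doc_vectors: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-document region confidences into one vector for D (assumed rule)."""
    if not doc_vectors:
        return {}
    totals: Dict[str, float] = {}
    for vec in doc_vectors:
        for region, conf in vec.items():
            totals[region] = totals.get(region, 0.0) + conf
    return {region: total / len(doc_vectors) for region, total in totals.items()}


def geo_related_set_score(vector: Dict[str, float], query_region: str) -> float:
    """Match the vector with a geographic query by reading off its confidence."""
    return vector.get(query_region, 0.0)
```

With two hypothetical documents whose vectors are `{"USA": 0.6, "Europe": 0.4}` and `{"USA": 0.4, "Europe": 0.4, "Asia": 0.1}`, the aggregate is approximately (USA, 0.5), (Europe, 0.4), (Asia, 0.05), and the query "USA" yields a related set score of about 0.5.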
- Another possible technique for determining the related set score can include determining a topic of each document in related set D.
- the topics may be generated using automated classification techniques or drawn from either manually or automatically pre-generated classification information (e.g., a hierarchical web directory tree).
- the topics for the documents in related set D may be combined to produce a vector of topic scores that defines confidence levels in the topics.
- the vector of topic scores can then be matched with the ranking criterion, in a manner similar to that described above, to produce the related set score.
- a set of terms that are “strong” in each document in the related set D may be determined and combined to produce a vector of final strong terms for the documents in D.
- the determination of whether a term is “strong” can be based on, for example, a pre-determined list of terms or terms that are determined to have an inverse document frequency (idf) above a threshold level.
- the idf of each term in a document may be defined based on a ratio of the number of occurrences of the term in the document to the relative frequency of the term in the language or over the entire document corpus. Thus, terms that are generally less common in the corpus but that occur frequently in the document will have a high idf and may be classified as being strong.
- the final vector of strong terms can then be matched with the given ranking criterion (e.g., a search query) to produce the related set score.
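The strong-term variant can be sketched using the ratio as the text defines it: occurrences of the term in the document divided by the term's relative frequency in the corpus. The function names, the token-list inputs, the decision to skip terms absent from the corpus sample, and the query-overlap matching rule are all assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List


def strong_terms(doc_tokens: List[str], corpus_tokens: List[str],
                 threshold: float) -> Dict[str, float]:
    """Return terms whose ratio (count in doc / relative corpus frequency) exceeds threshold."""
    doc_counts = Counter(doc_tokens)
    corpus_counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    strong = {}
    for term, count in doc_counts.items():
        rel_freq = corpus_counts.get(term, 0) / total if total else 0.0
        if rel_freq == 0.0:
            continue  # term unseen in the corpus sample; skipped in this sketch
        score = count / rel_freq
        if score > threshold:
            strong[term] = score
    return strong


def match_strong_terms(query: str, strong: Dict[str, float]) -> float:
    """Assumed matching rule: fraction of query terms that are strong terms."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    return sum(t in strong for t in terms) / len(terms)
```

A common word such as "the" has a high relative frequency in the corpus and therefore a low ratio, so it falls below the threshold, while a rarer term that recurs in the document passes it.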
- the related set score can be defined in many different ways, depending on the particular situation.
- the related set score and the initial ranking score for each document d may be combined to produce a modified (final) ranking score for document d.
- the initial set of documents returned in act 401 may then be re-ranked by re-ranking component 347 based on the modified ranking scores.
- the modified ranking score may be calculated as a weighted sum as follows: α(Initial_Score)+β(Related_Set_Score), (1) where α and β are predetermined constants.
- the values to use for α and β may be determined by one of ordinary skill in the art through empirical trial-and-error techniques. Exemplary values for α and β may be 0.8 and 0.2, respectively.
- α could be set to zero and β could be set to one.
- in this case, the modified ranking score is equal to the related set score.
- the initial ranking scores may not even need to be calculated.
- the modified ranking score may be calculated by using the related set score to boost the initial score as follows: Initial_Score×(1+β×Related_Set_Score). (2) As in equation (1), β may be a suitable predetermined constant.
- combinations of the discussed similarity criteria may be used to define related set D.
- the source, the journalist, and the section can all be used to define a related set D.
- more than one related set score can be computed for each document.
- the multiple related set scores can then be combined by, for example, a summing or averaging function, and then used in formula (1) or (2).
- the related set score can be computed independently of the ranking criteria.
- the related document set D can be scored based on the length of the included documents, on the timeliness of the documents, on the quality of the documents (determined by, for example, human evaluation or automated techniques based on grammar, spelling, or writing style), or based on popularity or usage characteristics of the documents.
- the ranking component described above improves the ranking of documents initially ranked through a number of possible existing scoring criteria.
- the ranking component can be applied, for example, to the ranking of news articles, Usenet postings, or general web searches.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
α(Initial_Score)+β(Related_Set_Score), (1)
where α and β are predetermined constants. The values to use for α and β may be determined by one of ordinary skill in the art through empirical trial-and-error techniques. Exemplary values for α and β may be 0.8 and 0.2, respectively. As a special case of formula (1), α could be set to zero and β could be set to one. In this case, the modified ranking score is equal to the related set score. Thus, in this case, the initial ranking scores may not even need to be calculated.
Initial_Score×(1+β×Related_Set_Score). (2)
As in equation (1), β may be a suitable predetermined constant.
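Formulas (1) and (2) can be checked numerically with a short sketch. The function names are hypothetical; the defaults for α and β in `weighted_sum` are the exemplary 0.8 and 0.2 from the text, while β in `boost` is left as a required argument because the source gives no exemplary value for formula (2).

```python
def weighted_sum(initial_score: float, related_set_score: float,
                 alpha: float = 0.8, beta: float = 0.2) -> float:
    """Formula (1): alpha * Initial_Score + beta * Related_Set_Score."""
    return alpha * initial_score + beta * related_set_score


def boost(initial_score: float, related_set_score: float, beta: float) -> float:
    """Formula (2): Initial_Score * (1 + beta * Related_Set_Score)."""
    return initial_score * (1 + beta * related_set_score)
```

For an initial score of 0.5 and a related set score of 0.9, formula (1) with the exemplary constants gives 0.8 × 0.5 + 0.2 × 0.9 = 0.58. With α = 0 and β = 1 it reduces to the related set score, 0.9. Formula (2) with β = 0.2 (a value chosen here only for illustration) gives 0.5 × (1 + 0.2 × 0.9) = 0.59.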
Claims (26)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/658,452 US8548995B1 (en) | 2003-09-10 | 2003-09-10 | Ranking of documents based on analysis of related documents |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/658,452 US8548995B1 (en) | 2003-09-10 | 2003-09-10 | Ranking of documents based on analysis of related documents |
Publications (1)
Publication Number | Publication Date |
---|---|
US8548995B1 true US8548995B1 (en) | 2013-10-01 |
Family
ID=49229976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/658,452 Expired - Lifetime US8548995B1 (en) | 2003-09-10 | 2003-09-10 | Ranking of documents based on analysis of related documents |
Country Status (1)
Country | Link |
---|---|
US (1) | US8548995B1 (en) |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100138445A1 (en) * | 2007-04-26 | 2010-06-03 | Kristian Luoma | method and apparatus |
US20130282707A1 (en) * | 2012-04-24 | 2013-10-24 | Discovery Engine Corporation | Two-step combiner for search result scores |
US20140114642A1 (en) * | 2012-10-19 | 2014-04-24 | Laurens van den Oever | Statistical linguistic analysis of source content |
US8762373B1 (en) * | 2006-09-29 | 2014-06-24 | Google Inc. | Personalized search result ranking |
US8838619B1 (en) * | 2009-08-12 | 2014-09-16 | Google Inc. | Ranking authors and their content in the same framework |
US9037967B1 (en) * | 2014-02-18 | 2015-05-19 | King Fahd University Of Petroleum And Minerals | Arabic spell checking technique |
US20150154180A1 (en) * | 2011-02-28 | 2015-06-04 | Sdl Structured Content Management | Systems, Methods and Media for Translating Informational Content |
US20150193440A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for content processing |
US20150193495A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for quote extraction |
WO2015103540A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for content processing |
USD760791S1 (en) | 2014-01-03 | 2016-07-05 | Yahoo! Inc. | Animated graphical user interface for a display screen or portion thereof |
USD760792S1 (en) | 2014-01-03 | 2016-07-05 | Yahoo! Inc. | Animated graphical user interface for a display screen or portion thereof |
USD761833S1 (en) | 2014-09-11 | 2016-07-19 | Yahoo! Inc. | Display screen with graphical user interface of a menu for a news digest |
US9465793B2 (en) | 2010-05-13 | 2016-10-11 | Grammarly, Inc. | Systems and methods for advanced grammar checking |
USD775183S1 (en) | 2014-01-03 | 2016-12-27 | Yahoo! Inc. | Display screen with transitional graphical user interface for a content digest |
US9742836B2 (en) | 2014-01-03 | 2017-08-22 | Yahoo Holdings, Inc. | Systems and methods for content delivery |
US9971756B2 (en) | 2014-01-03 | 2018-05-15 | Oath Inc. | Systems and methods for delivering task-oriented content |
US9984054B2 (en) | 2011-08-24 | 2018-05-29 | Sdl Inc. | Web interface including the review and manipulation of a web document and utilizing permission based control |
US10140320B2 (en) | 2011-02-28 | 2018-11-27 | Sdl Inc. | Systems, methods, and media for generating analytical data |
US10296167B2 (en) | 2014-01-03 | 2019-05-21 | Oath Inc. | Systems and methods for displaying an expanding menu via a user interface |
US10606914B2 (en) * | 2017-10-25 | 2020-03-31 | International Business Machines Corporation | Apparatus for webpage scoring |
US10936687B1 (en) | 2010-04-21 | 2021-03-02 | Richard Paiz | Codex search patterns virtual maestro |
US11048765B1 (en) | 2008-06-25 | 2021-06-29 | Richard Paiz | Search engine optimizer |
US11244011B2 (en) * | 2015-10-23 | 2022-02-08 | International Business Machines Corporation | Ingestion planning for complex tables |
US11379473B1 (en) | 2010-04-21 | 2022-07-05 | Richard Paiz | Site rank codex search patterns |
US11416534B2 (en) * | 2018-12-03 | 2022-08-16 | Fujitsu Limited | Classification of electronic documents |
US11423018B1 (en) | 2010-04-21 | 2022-08-23 | Richard Paiz | Multivariate analysis replica intelligent ambience evolving system |
US11741090B1 (en) | 2013-02-26 | 2023-08-29 | Richard Paiz | Site rank codex search patterns |
US11809506B1 (en) | 2013-02-26 | 2023-11-07 | Richard Paiz | Multivariant analyzing replicating intelligent ambience evolving system |
Citations (63)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4823306A (en) * | 1987-08-14 | 1989-04-18 | International Business Machines Corporation | Text search system |
US5855015A (en) * | 1995-03-20 | 1998-12-29 | Interval Research Corporation | System and method for retrieval of hyperlinked information resources |
US5864846A (en) * | 1996-06-28 | 1999-01-26 | Siemens Corporate Research, Inc. | Method for facilitating world wide web searches utilizing a document distribution fusion strategy |
US5918236A (en) * | 1996-06-28 | 1999-06-29 | Oracle Corporation | Point of view gists and generic gists in a document browsing system |
US5920854A (en) * | 1996-08-14 | 1999-07-06 | Infoseek Corporation | Real-time document collection search engine with phrase indexing |
US5963940A (en) * | 1995-08-16 | 1999-10-05 | Syracuse University | Natural language information retrieval system and method |
US5983221A (en) * | 1998-01-13 | 1999-11-09 | Wordstream, Inc. | Method and apparatus for improved document searching |
US6038560A (en) * | 1997-05-21 | 2000-03-14 | Oracle Corporation | Concept knowledge base search and retrieval system |
US6067552A (en) * | 1995-08-21 | 2000-05-23 | Cnet, Inc. | User interface system and method for browsing a hypertext database |
US6070176A (en) * | 1997-01-30 | 2000-05-30 | Intel Corporation | Method and apparatus for graphically representing portions of the world wide web |
US6112203A (en) * | 1998-04-09 | 2000-08-29 | Altavista Company | Method for ranking documents in a hyperlinked environment using connectivity and selective content analysis |
US6119114A (en) * | 1996-09-17 | 2000-09-12 | Smadja; Frank | Method and apparatus for dynamic relevance ranking |
US6138113A (en) * | 1998-08-10 | 2000-10-24 | Altavista Company | Method for identifying near duplicate pages in a hyperlinked database |
US6185550B1 (en) * | 1997-06-13 | 2001-02-06 | Sun Microsystems, Inc. | Method and apparatus for classifying documents within a class hierarchy creating term vector, term file and relevance ranking |
US6189002B1 (en) * | 1998-12-14 | 2001-02-13 | Dolphin Search | Process and system for retrieval of documents using context-relevant semantic profiles |
US6285999B1 (en) * | 1997-01-10 | 2001-09-04 | The Board Of Trustees Of The Leland Stanford Junior University | Method for node ranking in a linked database |
US6292830B1 (en) * | 1997-08-08 | 2001-09-18 | Iterations Llc | System for optimizing interaction among agents acting on multiple levels |
US6321228B1 (en) * | 1999-08-31 | 2001-11-20 | Powercast Media, Inc. | Internet search system for retrieving selected results from a previous search |
US6502081B1 (en) * | 1999-08-06 | 2002-12-31 | Lexis Nexis | System and method for classifying legal concepts using legal topic scheme |
US6526440B1 (en) * | 2001-01-30 | 2003-02-25 | Google, Inc. | Ranking search results by reranking the results based on local inter-connectivity |
US6581057B1 (en) * | 2000-05-09 | 2003-06-17 | Justsystem Corporation | Method and apparatus for rapidly producing document summaries and document browsing aids |
US20030115188A1 (en) * | 2001-12-19 | 2003-06-19 | Narayan Srinivasa | Method and apparatus for electronically extracting application specific multidimensional information from a library of searchable documents and for providing the application specific information to a user application |
US6591261B1 (en) * | 1999-06-21 | 2003-07-08 | Zerx, Llc | Network search engine and navigation tool and method of determining search results in accordance with search criteria and/or associated sites |
US20030217052A1 (en) * | 2000-08-24 | 2003-11-20 | Celebros Ltd. | Search engine method and apparatus |
US20030220913A1 (en) * | 2002-05-24 | 2003-11-27 | International Business Machines Corporation | Techniques for personalized and adaptive search services |
US20040059726A1 (en) * | 2002-09-09 | 2004-03-25 | Jeff Hunter | Context-sensitive wordless search |
US20040128273A1 (en) * | 2002-12-31 | 2004-07-01 | International Business Machines Corporation | Temporal link analysis of linked entities |
US6766316B2 (en) * | 2001-01-18 | 2004-07-20 | Science Applications International Corporation | Method and system of ranking and clustering for document indexing and retrieval |
US6778997B2 (en) * | 2001-01-05 | 2004-08-17 | International Business Machines Corporation | XML: finding authoritative pages for mining communities based on page structure criteria |
US20040236725A1 (en) * | 2003-05-19 | 2004-11-25 | Einat Amitay | Disambiguation of term occurrences |
US6829599B2 (en) * | 2002-10-02 | 2004-12-07 | Xerox Corporation | System and method for improving answer relevance in meta-search engines |
US6842876B2 (en) * | 1998-04-14 | 2005-01-11 | Fuji Xerox Co., Ltd. | Document cache replacement policy for automatically generating groups of documents based on similarity of content |
US20050010605A1 (en) * | 2002-12-23 | 2005-01-13 | West Publishing Company | Information retrieval systems with database-selection aids |
US6848077B1 (en) * | 2000-07-13 | 2005-01-25 | International Business Machines Corporation | Dynamically creating hyperlinks to other web documents in received world wide web documents based on text terms in the received document defined as of interest to user |
US20050060311A1 (en) * | 2003-09-12 | 2005-03-17 | Simon Tong | Methods and systems for improving a search ranking using related queries |
US6901399B1 (en) * | 1997-07-22 | 2005-05-31 | Microsoft Corporation | System for processing textual inputs using natural language processing techniques |
US20050154761A1 (en) * | 2004-01-12 | 2005-07-14 | International Business Machines Corporation | Method and apparatus for determining relative relevance between portions of large electronic documents |
US6947920B2 (en) * | 2001-06-20 | 2005-09-20 | Oracle International Corporation | Method and system for response time optimization of data query rankings and retrieval |
US6965900B2 (en) * | 2001-12-19 | 2005-11-15 | X-Labs Holdings, Llc | Method and apparatus for electronically extracting application specific multidimensional information from documents selected from a set of documents electronically extracted from a library of electronically searchable documents |
US7003513B2 (en) * | 2000-07-04 | 2006-02-21 | International Business Machines Corporation | Method and system of weighted context feedback for result improvement in information retrieval |
US20060271524A1 (en) * | 2005-02-28 | 2006-11-30 | Michael Tanne | Methods of and systems for searching by incorporating user-entered information |
US7146409B1 (en) * | 2001-07-24 | 2006-12-05 | Brightplanet Corporation | System and method for efficient control and capture of dynamic database content |
US7197497B2 (en) * | 2003-04-25 | 2007-03-27 | Overture Services, Inc. | Method and apparatus for machine learning a document relevance function |
US20070174340A1 (en) * | 2005-11-30 | 2007-07-26 | Gross John N | System & Method of Delivering RSS Content Based Advertising |
US20070192369A1 (en) * | 2005-11-30 | 2007-08-16 | Gross John N | System & Method of Evaluating Content Based Advertising |
US20080010270A1 (en) * | 2005-11-30 | 2008-01-10 | Gross John N | System & Method of Delivering Content Based Advertising |
US20080016050A1 (en) * | 2001-05-09 | 2008-01-17 | International Business Machines Corporation | System and method of finding documents related to other documents and of finding related words in response to a query to refine a search |
US20080114721A1 (en) * | 2006-11-15 | 2008-05-15 | Rosie Jones | System and method for generating substitutable queries on the basis of one or more features |
US20080126335A1 (en) * | 2006-11-29 | 2008-05-29 | Oracle International Corporation | Efficient computation of document similarity |
US20080140699A1 (en) * | 2005-11-09 | 2008-06-12 | Rosie Jones | System and method for generating substitutable queries |
US20090216696A1 (en) * | 2008-02-25 | 2009-08-27 | Downs Oliver B | Determining relevant information for domains of interest |
US20100131523A1 (en) * | 2008-11-25 | 2010-05-27 | Leo Chi-Lok Yu | Mechanism for associating document with email based on relevant context |
US7836391B2 (en) * | 2003-06-10 | 2010-11-16 | Google Inc. | Document search engine including highlighting of confident results |
US20110029517A1 (en) * | 2009-07-31 | 2011-02-03 | Shihao Ji | Global and topical ranking of search results using user clicks |
US20110040787A1 (en) * | 2009-08-12 | 2011-02-17 | Google Inc. | Presenting comments from various sources |
US20110191310A1 (en) * | 2010-02-03 | 2011-08-04 | Wenhui Liao | Method and system for ranking intellectual property documents using claim analysis |
US7996396B2 (en) * | 2006-03-28 | 2011-08-09 | A9.Com, Inc. | Identifying the items most relevant to a current query based on user activity with respect to the results of similar queries |
US8024327B2 (en) * | 2007-06-26 | 2011-09-20 | Endeca Technologies, Inc. | System and method for measuring the quality of document sets |
US8086631B2 (en) * | 2008-12-12 | 2011-12-27 | Microsoft Corporation | Search result diversification |
US8086594B1 (en) * | 2007-03-30 | 2011-12-27 | Google Inc. | Bifurcated document relevance scoring |
US8086619B2 (en) * | 2003-09-05 | 2011-12-27 | Google Inc. | System and method for providing search query refinements |
US20120016888A1 (en) * | 2003-09-30 | 2012-01-19 | Google Inc. | Document scoring based on query analysis |
US20120290926A1 (en) * | 2011-05-12 | 2012-11-15 | Infinote Corporation | Efficient document management and search |
- 2003-09-10: US application US10/658,452, patent US8548995B1 (en); status: not active, Expired - Lifetime
Patent Citations (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4823306A (en) * | 1987-08-14 | 1989-04-18 | International Business Machines Corporation | Text search system |
US5855015A (en) * | 1995-03-20 | 1998-12-29 | Interval Research Corporation | System and method for retrieval of hyperlinked information resources |
US5963940A (en) * | 1995-08-16 | 1999-10-05 | Syracuse University | Natural language information retrieval system and method |
US6067552A (en) * | 1995-08-21 | 2000-05-23 | Cnet, Inc. | User interface system and method for browsing a hypertext database |
US5864846A (en) * | 1996-06-28 | 1999-01-26 | Siemens Corporate Research, Inc. | Method for facilitating world wide web searches utilizing a document distribution fusion strategy |
US5918236A (en) * | 1996-06-28 | 1999-06-29 | Oracle Corporation | Point of view gists and generic gists in a document browsing system |
US6070158A (en) * | 1996-08-14 | 2000-05-30 | Infoseek Corporation | Real-time document collection search engine with phrase indexing |
US5920854A (en) * | 1996-08-14 | 1999-07-06 | Infoseek Corporation | Real-time document collection search engine with phrase indexing |
US6119114A (en) * | 1996-09-17 | 2000-09-12 | Smadja; Frank | Method and apparatus for dynamic relevance ranking |
US6285999B1 (en) * | 1997-01-10 | 2001-09-04 | The Board Of Trustees Of The Leland Stanford Junior University | Method for node ranking in a linked database |
US6070176A (en) * | 1997-01-30 | 2000-05-30 | Intel Corporation | Method and apparatus for graphically representing portions of the world wide web |
US6038560A (en) * | 1997-05-21 | 2000-03-14 | Oracle Corporation | Concept knowledge base search and retrieval system |
US6185550B1 (en) * | 1997-06-13 | 2001-02-06 | Sun Microsystems, Inc. | Method and apparatus for classifying documents within a class hierarchy creating term vector, term file and relevance ranking |
US6901399B1 (en) * | 1997-07-22 | 2005-05-31 | Microsoft Corporation | System for processing textual inputs using natural language processing techniques |
US6292830B1 (en) * | 1997-08-08 | 2001-09-18 | Iterations Llc | System for optimizing interaction among agents acting on multiple levels |
US5983221A (en) * | 1998-01-13 | 1999-11-09 | Wordstream, Inc. | Method and apparatus for improved document searching |
US6112203A (en) * | 1998-04-09 | 2000-08-29 | Altavista Company | Method for ranking documents in a hyperlinked environment using connectivity and selective content analysis |
US6842876B2 (en) * | 1998-04-14 | 2005-01-11 | Fuji Xerox Co., Ltd. | Document cache replacement policy for automatically generating groups of documents based on similarity of content |
US6138113A (en) * | 1998-08-10 | 2000-10-24 | Altavista Company | Method for identifying near duplicate pages in a hyperlinked database |
US6189002B1 (en) * | 1998-12-14 | 2001-02-13 | Dolphin Search | Process and system for retrieval of documents using context-relevant semantic profiles |
US6591261B1 (en) * | 1999-06-21 | 2003-07-08 | Zerx, Llc | Network search engine and navigation tool and method of determining search results in accordance with search criteria and/or associated sites |
US6502081B1 (en) * | 1999-08-06 | 2002-12-31 | Lexis Nexis | System and method for classifying legal concepts using legal topic scheme |
US6321228B1 (en) * | 1999-08-31 | 2001-11-20 | Powercast Media, Inc. | Internet search system for retrieving selected results from a previous search |
US6581057B1 (en) * | 2000-05-09 | 2003-06-17 | Justsystem Corporation | Method and apparatus for rapidly producing document summaries and document browsing aids |
US7003513B2 (en) * | 2000-07-04 | 2006-02-21 | International Business Machines Corporation | Method and system of weighted context feedback for result improvement in information retrieval |
US6848077B1 (en) * | 2000-07-13 | 2005-01-25 | International Business Machines Corporation | Dynamically creating hyperlinks to other web documents in received world wide web documents based on text terms in the received document defined as of interest to user |
US20030217052A1 (en) * | 2000-08-24 | 2003-11-20 | Celebros Ltd. | Search engine method and apparatus |
US6778997B2 (en) * | 2001-01-05 | 2004-08-17 | International Business Machines Corporation | XML: finding authoritative pages for mining communities based on page structure criteria |
US6766316B2 (en) * | 2001-01-18 | 2004-07-20 | Science Applications International Corporation | Method and system of ranking and clustering for document indexing and retrieval |
US6526440B1 (en) * | 2001-01-30 | 2003-02-25 | Google, Inc. | Ranking search results by reranking the results based on local inter-connectivity |
US20080016050A1 (en) * | 2001-05-09 | 2008-01-17 | International Business Machines Corporation | System and method of finding documents related to other documents and of finding related words in response to a query to refine a search |
US6947920B2 (en) * | 2001-06-20 | 2005-09-20 | Oracle International Corporation | Method and system for response time optimization of data query rankings and retrieval |
US7146409B1 (en) * | 2001-07-24 | 2006-12-05 | Brightplanet Corporation | System and method for efficient control and capture of dynamic database content |
US6965900B2 (en) * | 2001-12-19 | 2005-11-15 | X-Labs Holdings, Llc | Method and apparatus for electronically extracting application specific multidimensional information from documents selected from a set of documents electronically extracted from a library of electronically searchable documents |
US20030115188A1 (en) * | 2001-12-19 | 2003-06-19 | Narayan Srinivasa | Method and apparatus for electronically extracting application specific multidimensional information from a library of searchable documents and for providing the application specific information to a user application |
US20030220913A1 (en) * | 2002-05-24 | 2003-11-27 | International Business Machines Corporation | Techniques for personalized and adaptive search services |
US20040059726A1 (en) * | 2002-09-09 | 2004-03-25 | Jeff Hunter | Context-sensitive wordless search |
US6829599B2 (en) * | 2002-10-02 | 2004-12-07 | Xerox Corporation | System and method for improving answer relevance in meta-search engines |
US20050010605A1 (en) * | 2002-12-23 | 2005-01-13 | West Publishing Company | Information retrieval systems with database-selection aids |
US20040128273A1 (en) * | 2002-12-31 | 2004-07-01 | International Business Machines Corporation | Temporal link analysis of linked entities |
US7197497B2 (en) * | 2003-04-25 | 2007-03-27 | Overture Services, Inc. | Method and apparatus for machine learning a document relevance function |
US20040236725A1 (en) * | 2003-05-19 | 2004-11-25 | Einat Amitay | Disambiguation of term occurrences |
US7260571B2 (en) * | 2003-05-19 | 2007-08-21 | International Business Machines Corporation | Disambiguation of term occurrences |
US7836391B2 (en) * | 2003-06-10 | 2010-11-16 | Google Inc. | Document search engine including highlighting of confident results |
US8086619B2 (en) * | 2003-09-05 | 2011-12-27 | Google Inc. | System and method for providing search query refinements |
US7505964B2 (en) * | 2003-09-12 | 2009-03-17 | Google Inc. | Methods and systems for improving a search ranking using related queries |
US8024326B2 (en) * | 2003-09-12 | 2011-09-20 | Google Inc. | Methods and systems for improving a search ranking using related queries |
US20050060311A1 (en) * | 2003-09-12 | 2005-03-17 | Simon Tong | Methods and systems for improving a search ranking using related queries |
US8380705B2 (en) * | 2003-09-12 | 2013-02-19 | Google Inc. | Methods and systems for improving a search ranking using related queries |
US20120016888A1 (en) * | 2003-09-30 | 2012-01-19 | Google Inc. | Document scoring based on query analysis |
US8224827B2 (en) * | 2003-09-30 | 2012-07-17 | Google Inc. | Document ranking based on document classification |
US20120209838A1 (en) * | 2003-09-30 | 2012-08-16 | Google Inc. | Document scoring based on query analysis |
US8266143B2 (en) * | 2003-09-30 | 2012-09-11 | Google Inc. | Document scoring based on query analysis |
US20070234140A1 (en) * | 2004-01-12 | 2007-10-04 | Lee Chris G | Method and apparatus for determining relative relevance between portions of large electronic documents |
US7254587B2 (en) * | 2004-01-12 | 2007-08-07 | International Business Machines Corporation | Method and apparatus for determining relative relevance between portions of large electronic documents |
US20050154761A1 (en) * | 2004-01-12 | 2005-07-14 | International Business Machines Corporation | Method and apparatus for determining relative relevance between portions of large electronic documents |
US20060271524A1 (en) * | 2005-02-28 | 2006-11-30 | Michael Tanne | Methods of and systems for searching by incorporating user-entered information |
US20080140699A1 (en) * | 2005-11-09 | 2008-06-12 | Rosie Jones | System and method for generating substitutable queries |
US7962479B2 (en) * | 2005-11-09 | 2011-06-14 | Yahoo! Inc. | System and method for generating substitutable queries |
US20080010270A1 (en) * | 2005-11-30 | 2008-01-10 | Gross John N | System & Method of Delivering Content Based Advertising |
US20070192369A1 (en) * | 2005-11-30 | 2007-08-16 | Gross John N | System & Method of Evaluating Content Based Advertising |
US20070174340A1 (en) * | 2005-11-30 | 2007-07-26 | Gross John N | System & Method of Delivering RSS Content Based Advertising |
US7996396B2 (en) * | 2006-03-28 | 2011-08-09 | A9.Com, Inc. | Identifying the items most relevant to a current query based on user activity with respect to the results of similar queries |
US20080114721A1 (en) * | 2006-11-15 | 2008-05-15 | Rosie Jones | System and method for generating substitutable queries on the basis of one or more features |
US7739264B2 (en) * | 2006-11-15 | 2010-06-15 | Yahoo! Inc. | System and method for generating substitutable queries on the basis of one or more features |
US7610281B2 (en) * | 2006-11-29 | 2009-10-27 | Oracle International Corp. | Efficient computation of document similarity |
US20080126335A1 (en) * | 2006-11-29 | 2008-05-29 | Oracle International Corporation | Efficient computation of document similarity |
US8086594B1 (en) * | 2007-03-30 | 2011-12-27 | Google Inc. | Bifurcated document relevance scoring |
US8024327B2 (en) * | 2007-06-26 | 2011-09-20 | Endeca Technologies, Inc. | System and method for measuring the quality of document sets |
US20090216696A1 (en) * | 2008-02-25 | 2009-08-27 | Downs Oliver B | Determining relevant information for domains of interest |
US20100131523A1 (en) * | 2008-11-25 | 2010-05-27 | Leo Chi-Lok Yu | Mechanism for associating document with email based on relevant context |
US8086631B2 (en) * | 2008-12-12 | 2011-12-27 | Microsoft Corporation | Search result diversification |
US20120089588A1 (en) * | 2008-12-12 | 2012-04-12 | Microsoft Corporation | Search result diversification |
US8250092B2 (en) * | 2008-12-12 | 2012-08-21 | Microsoft Corporation | Search result diversification |
US20110029517A1 (en) * | 2009-07-31 | 2011-02-03 | Shihao Ji | Global and topical ranking of search results using user clicks |
US20110040787A1 (en) * | 2009-08-12 | 2011-02-17 | Google Inc. | Presenting comments from various sources |
US20110191310A1 (en) * | 2010-02-03 | 2011-08-04 | Wenhui Liao | Method and system for ranking intellectual property documents using claim analysis |
US20120290926A1 (en) * | 2011-05-12 | 2012-11-15 | Infinote Corporation | Efficient document management and search |
Non-Patent Citations (3)
Title |
---|
Bharat, K. and Henzinger, M., "Improved Algorithms for Topic Distillation in a Hyperlinked Environment," Aug. 1998, ACM SIGIR '98, Melbourne, Australia, pp. 104-111. * |
Brian Amento, Loren Terveen, and Will Hill; "Does 'Authority' Mean Quality? Predicting Expert Quality Rating of Web Documents"; ACM-SIGIR; Jul. 2000, pp. 296-303. * |
Osinski, Stanislaw, "An Algorithm for Clustering of Web Search Results," Jun. 2003, Master Thesis, Poznan University of Technology, Poland, pp. 1-101. * |
Cited By (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9037581B1 (en) | 2006-09-29 | 2015-05-19 | Google Inc. | Personalized search result ranking |
US8762373B1 (en) * | 2006-09-29 | 2014-06-24 | Google Inc. | Personalized search result ranking |
US20100138445A1 (en) * | 2007-04-26 | 2010-06-03 | Kristian Luoma | method and apparatus |
US8799254B2 (en) * | 2007-04-26 | 2014-08-05 | Nokia Corporation | Method and apparatus for improved searching of database content |
US11941058B1 (en) | 2008-06-25 | 2024-03-26 | Richard Paiz | Search engine optimizer |
US11048765B1 (en) | 2008-06-25 | 2021-06-29 | Richard Paiz | Search engine optimizer |
US11675841B1 (en) | 2008-06-25 | 2023-06-13 | Richard Paiz | Search engine optimizer |
US9875313B1 (en) | 2009-08-12 | 2018-01-23 | Google Llc | Ranking authors and their content in the same framework |
US8838619B1 (en) * | 2009-08-12 | 2014-09-16 | Google Inc. | Ranking authors and their content in the same framework |
US10936687B1 (en) | 2010-04-21 | 2021-03-02 | Richard Paiz | Codex search patterns virtual maestro |
US11379473B1 (en) | 2010-04-21 | 2022-07-05 | Richard Paiz | Site rank codex search patterns |
US11423018B1 (en) | 2010-04-21 | 2022-08-23 | Richard Paiz | Multivariate analysis replica intelligent ambience evolving system |
US9465793B2 (en) | 2010-05-13 | 2016-10-11 | Grammarly, Inc. | Systems and methods for advanced grammar checking |
US10387565B2 (en) | 2010-05-13 | 2019-08-20 | Grammarly, Inc. | Systems and methods for advanced grammar checking |
US20150154180A1 (en) * | 2011-02-28 | 2015-06-04 | Sdl Structured Content Management | Systems, Methods and Media for Translating Informational Content |
US12222912B2 (en) | 2011-02-28 | 2025-02-11 | Sdl Inc. | Systems and methods of generating analytical data based on captured audit trails |
US9471563B2 (en) * | 2011-02-28 | 2016-10-18 | Sdl Inc. | Systems, methods and media for translating informational content |
US10140320B2 (en) | 2011-02-28 | 2018-11-27 | Sdl Inc. | Systems, methods, and media for generating analytical data |
US11886402B2 (en) | 2011-02-28 | 2024-01-30 | Sdl Inc. | Systems, methods, and media for dynamically generating informational content |
US11366792B2 (en) | 2011-02-28 | 2022-06-21 | Sdl Inc. | Systems, methods, and media for generating analytical data |
US11775738B2 (en) | 2011-08-24 | 2023-10-03 | Sdl Inc. | Systems and methods for document review, display and validation within a collaborative environment |
US11263390B2 (en) | 2011-08-24 | 2022-03-01 | Sdl Inc. | Systems and methods for informational document review, display and validation |
US9984054B2 (en) | 2011-08-24 | 2018-05-29 | Sdl Inc. | Web interface including the review and manipulation of a web document and utilizing permission based control |
US20130282707A1 (en) * | 2012-04-24 | 2013-10-24 | Discovery Engine Corporation | Two-step combiner for search result scores |
US9916306B2 (en) * | 2012-10-19 | 2018-03-13 | Sdl Inc. | Statistical linguistic analysis of source content |
US20140114642A1 (en) * | 2012-10-19 | 2014-04-24 | Laurens van den Oever | Statistical linguistic analysis of source content |
US11809506B1 (en) | 2013-02-26 | 2023-11-07 | Richard Paiz | Multivariant analyzing replicating intelligent ambience evolving system |
US11741090B1 (en) | 2013-02-26 | 2023-08-29 | Richard Paiz | Site rank codex search patterns |
US9971756B2 (en) | 2014-01-03 | 2018-05-15 | Oath Inc. | Systems and methods for delivering task-oriented content |
US20170199932A1 (en) * | 2014-01-03 | 2017-07-13 | Yahoo! Inc. | Systems and methods for quote extraction |
US10242095B2 (en) * | 2014-01-03 | 2019-03-26 | Oath Inc. | Systems and methods for quote extraction |
US20150193440A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for content processing |
US20150193495A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for quote extraction |
WO2015103540A1 (en) * | 2014-01-03 | 2015-07-09 | Yahoo! Inc. | Systems and methods for content processing |
US10037318B2 (en) | 2014-01-03 | 2018-07-31 | Oath Inc. | Systems and methods for image processing |
US9940099B2 (en) * | 2014-01-03 | 2018-04-10 | Oath Inc. | Systems and methods for content processing |
US11144281B2 (en) | 2014-01-03 | 2021-10-12 | Verizon Media Inc. | Systems and methods for content processing |
USD760791S1 (en) | 2014-01-03 | 2016-07-05 | Yahoo! Inc. | Animated graphical user interface for a display screen or portion thereof |
US9742836B2 (en) | 2014-01-03 | 2017-08-22 | Yahoo Holdings, Inc. | Systems and methods for content delivery |
US10296167B2 (en) | 2014-01-03 | 2019-05-21 | Oath Inc. | Systems and methods for displaying an expanding menu via a user interface |
US9558180B2 (en) * | 2014-01-03 | 2017-01-31 | Yahoo! Inc. | Systems and methods for quote extraction |
USD760792S1 (en) | 2014-01-03 | 2016-07-05 | Yahoo! Inc. | Animated graphical user interface for a display screen or portion thereof |
USD775183S1 (en) | 2014-01-03 | 2016-12-27 | Yahoo! Inc. | Display screen with transitional graphical user interface for a content digest |
US9465849B2 (en) | 2014-01-03 | 2016-10-11 | Yahoo! Inc. | Systems and methods for content processing |
US9037967B1 (en) * | 2014-02-18 | 2015-05-19 | King Fahd University Of Petroleum And Minerals | Arabic spell checking technique |
US10503357B2 (en) | 2014-04-03 | 2019-12-10 | Oath Inc. | Systems and methods for delivering task-oriented content using a desktop widget |
USD761833S1 (en) | 2014-09-11 | 2016-07-19 | Yahoo! Inc. | Display screen with graphical user interface of a menu for a news digest |
US11244011B2 (en) * | 2015-10-23 | 2022-02-08 | International Business Machines Corporation | Ingestion planning for complex tables |
US10606914B2 (en) * | 2017-10-25 | 2020-03-31 | International Business Machines Corporation | Apparatus for webpage scoring |
US11416534B2 (en) * | 2018-12-03 | 2022-08-16 | Fujitsu Limited | Classification of electronic documents |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8548995B1 (en) | Ranking of documents based on analysis of related documents | |
US10452718B1 (en) | Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems | |
CA2845194C (en) | Classification of ambiguous geographic references | |
US8082244B2 (en) | Systems and methods for determining document freshness | |
US6615209B1 (en) | Detecting query-specific duplicate documents | |
US8346757B1 (en) | Determining query terms of little significance | |
US8631026B1 (en) | Methods and systems for efficient query rewriting | |
US8977630B1 (en) | Personalizing search results | |
US8332426B2 (en) | Indentifying referring expressions for concepts | |
US9569504B1 (en) | Deriving and using document and site quality signals from search query streams | |
US9177057B2 (en) | Re-ranking search results based on lexical and ontological concepts | |
US20150172299A1 (en) | Indexing and retrieval of blogs | |
US7296016B1 (en) | Systems and methods for performing point-of-view searching | |
US8364672B2 (en) | Concept disambiguation via search engine search results | |
US9977816B1 (en) | Link-based ranking of objects that do not include explicitly defined links | |
WO2011019297A1 (en) | Spreading comments to other documents | |
Zhuang et al. | Re-ranking search results using query logs | |
US8762225B1 (en) | Systems and methods for scoring documents | |
US8122013B1 (en) | Title based local search ranking | |
US8595225B1 (en) | Systems and methods for correlating document topicality and popularity |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: GOOGLE TECHNOLOGY INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: CURTISS, MICHAEL; REEL/FRAME: 016075/0253; Effective date: 20030909 |
AS | Assignment | Owner name: GOOGLE INC., CALIFORNIA; Free format text: MERGER; ASSIGNOR: GOOGLE TECHNOLOGY INC.; REEL/FRAME: 016081/0053; Effective date: 20030827 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
AS | Assignment | Owner name: GOOGLE LLC, CALIFORNIA; Free format text: CHANGE OF NAME; ASSIGNOR: GOOGLE INC.; REEL/FRAME: 044101/0299; Effective date: 20170929 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 8 |