Searching on The Lens is as easy as typing in a search query and hitting search. From there, you have numerous options to sort and filter your results, narrowing them to a single document or a collection of documents. Alternatively, you can use the Structured Search or Biological Search features to start your search with more constrained parameters.
When you search on The Lens, we run your query against our vast database of patent documents from jurisdictions around the world. By default, results are sorted by their generated rank. This rank is computed by algorithms that estimate which documents are the most relevant given your search terms, parameters and filters. A patent’s rank is not an indicator of the document’s quality or importance; it is a measure of how well that patent matches your search.
We also apply some modifications to the natural rank of these documents, boosting higher-quality documents over lower-quality ones so that the best documents appear at the top of the results. For an example, see this link.
Limitations and Caveats to Patent Search
While we strive to produce the highest quality patent database, the user should be aware that there are limitations that may affect the outcome of any search. Some of these limitations are inherent in the data provided by the Patent Offices, while others result from the processing of these data. In the interest of full disclosure, below is a list of known issues with the data and their causes.
- Spelling errors
  - can be inherent in the original data, in which case they will also appear in the PDF document (where available);
  - can arise from OCR (optical character recognition) processing, which converts page images into a full-text searchable format. In this case the correct spelling will appear in the PDF document. OCR errors arise in two ways:
    - (i) the OCR process is generally only 99% accurate;
    - (ii) words that are split over two lines by hyphenation in the original patent document are currently indexed by the OCR process as two separate parts. For example, if the word “magnetism” is split over two lines as “magnet-ism”, then the OCR process indexes it as two separate words, “magnet” and “ism”. Where the error is noted, the affected documents will be re-processed to correct this problem.
- Alternate spellings
  - many words in English can be spelled differently, depending on the preference of the writer (e.g. harbor/harbour; center/centre; labeled/labelled);
  - spelling is usually, but not always, consistent within a document;
  - in US patent documents, the spelling is mostly American even if the writer is not from the U.S., while in EP patent documents the spelling is mostly British;
  - in WO documents, the spelling preference may depend upon the country of origin or the receiving office.
- Names (inventors, assignees, etc.)
  - names in the inventor, applicant/assignee or agent fields are indexed just like any other word. The various collections format names in different ways, e.g. “John Smith” may appear in any of the following forms: “J. Smith”; “John Smith”; “Smith, John”; “Smith, J.”, etc.;
  - the best approach to searching for a particular person’s name is to use just the last name, surname or family name (e.g. “Smith”). If too many documents are returned, refine the search with one or more additional criteria, such as an organisation name (e.g. “university AND Cornell”).
- Inconsistency of presentation among data sets (e.g. Greek letters, layouts, fields present, order of fields)
  - these inconsistencies will affect your search strategy and search results;
  - Greek letters: it is now possible to search for such characters by entering the Unicode character, e.g. beta = β. Please refer to the manual for your computer’s operating system for instructions on entering non-Roman characters;
  - layouts: unlike the other data sets, the U.S. patents generally have a fixed set of headings (e.g. Field of the Invention; Summary of the Invention);
  - fields: not all information on the front page is common among the data sets. For example, U.S. documents may contain fields (e.g. U.S. classification codes) not present in EP and WO documents;
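Several of the caveats above (hyphenation splits, alternate spellings, name formats and Greek letters) can be worked around by searching for more than one form of a term. The following Python sketch shows one way to build such query strings before pasting them into the search box. It is an illustration only: the function names are hypothetical, and it assumes that the search box accepts an OR operator and double-quoted phrases, which this page does not confirm (only AND is shown above).

```python
import unicodedata


def any_of(terms):
    """Join terms into one parenthesised group.

    Assumption: the search box accepts OR and double-quoted phrases.
    """
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def name_forms(given, family):
    """Return the four name layouts listed in the Names caveat."""
    initial = given[0] + "."
    return [
        f"{given} {family}",
        f"{initial} {family}",
        f"{family}, {given}",
        f"{family}, {initial}",
    ]


def hyphen_split_forms(word, split_at):
    """Return the whole word plus the two fragments an OCR line break
    at position split_at would have indexed separately."""
    return [word, word[:split_at], word[split_at:]]


# Alternate spellings: cover both the American and British forms.
spelling_query = any_of(["harbor", "harbour"])  # (harbor OR harbour)

# Names: the surname alone is the recommended starting point; the other
# layouts are shown only to illustrate what the collections may contain.
smith_forms = name_forms("John", "Smith")

# Hyphenation: "magnet-ism" split across lines is indexed as two words.
fragments = hyphen_split_forms("magnetism", 6)  # ['magnetism', 'magnet', 'ism']

# Greek letters: look the character up by its Unicode name.
beta = unicodedata.lookup("GREEK SMALL LETTER BETA")  # β

if __name__ == "__main__":
    print(spelling_query)
    print(smith_forms)
    print(fragments)
    print(beta)
```

In practice the surname-only search recommended above, refined with an additional criterion such as “university AND Cornell”, is usually simpler than enumerating every name layout. The variant lists are most useful for spelling differences and suspected OCR splits.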