PatSeq Bulk Download

We provide the entire PatSeq data collection for download in FASTA format.

Data files are provided both for individual jurisdictions and for the entire PatSeq database. Each set consists of multiple files, grouped by:

  • document type (grants, applications),
  • sequence type (nucleotides, peptides), and
  • document location (in claims, all locations). Please note that “in claims” is a subset of the “all” dataset. For example, “Grants: Nucleotides (all)” refers to nucleotide sequences disclosed in granted patent documents regardless of where they are referenced in the documents, whereas “Grants: Nucleotides (in Claims)” refers to nucleotide sequences referenced in the claims of the granted patent documents.

Here you can download some sample data (1k sequences, FASTA format, 920KB).
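
As a rough illustration of working with FASTA data such as the sample file, the sketch below reads records from a FASTA text stream. It assumes only the generic FASTA layout (a header line starting with ">" followed by one or more sequence lines). The layout of PatSeq header lines is not described on this page, so headers are returned unparsed. The function names read_fasta and open_fasta are hypothetical helpers, not part of any Lens tooling.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]  # keep the header text without the ">"
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        yield header, "".join(chunks)


def open_fasta(path: str) -> TextIO:
    """Open a FASTA file as text; files ending in .gz are decompressed."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


# Small in-memory demonstration with made-up records.
demo = io.StringIO(">seq1 example\nACGT\nTTGA\n>seq2\nMKV\n")
records = list(read_fasta(demo))
```

Because read_fasta is a generator, a large data file can be processed record by record without loading the whole file into memory.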

How to download

  1. Before you can download, you need to register and/or sign in so that you can request permission to download the monthly updates of the PatSeq database.
  2. Once you are signed in, request access by clicking on the “Sequence bulk download” tab in your work area.
  3. In the “Sequence bulk download” tab, review the licensing options. Licensing fees are structured based on the size of the commercial entity. They will be used to expand PatSeq functionality, for example by developing flat files with sequence and patent metadata that can be downloaded together with the sequence bulk download in the near future. Your support will help enrich and improve our services to you in the long term.
  4. Read our Terms of Use.
  5. Send the request online.
  6. Once bulk download access is granted, it will be enabled for your Lens account and you can then download the sequence files
    • using your web browser and the PatSeq Data app, or
    • programmatically, using the download API.

See further explanations for both download options below.


PatSeq Data

Within the PatSeq Data app you can use the provided “Sequence download” buttons to download data files for individual jurisdictions or the entire PatSeq database.

Figure: Bulk Download in “PatSeq Data”

The data is provided in multiple files, grouped by:

  • document type (grants, applications),
  • sequence type (nucleotides, peptides), and
  • document location (in claims, all locations). Please note that “in claims” is a subset of the “all” dataset. For example, “Grants: Nucleotides (all)” refers to nucleotide sequences disclosed in granted patent documents regardless of where they are referenced in the documents, whereas “Grants: Nucleotides (in Claims)” refers to nucleotide sequences referenced in the claims of the granted patent documents.
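
The same grouping appears to be reflected in the relative file paths used by the download API. The only path shown on this page is us/grant/na-claims.fa.gz, so the sketch below simply splits a path of that shape into its parts. It assumes that every path follows the layout jurisdiction/document type/sequence type-location.fa.gz, which is an inference from that single example and not a documented guarantee. The names DataFile and parse_data_file_path are hypothetical.

```python
from typing import NamedTuple


class DataFile(NamedTuple):
    jurisdiction: str
    document_type: str
    sequence_type: str
    location: str


def parse_data_file_path(path: str) -> DataFile:
    """Split a path shaped like 'us/grant/na-claims.fa.gz' into its fields."""
    # Assumed layout: exactly three path segments (raises ValueError otherwise).
    jurisdiction, document_type, filename = path.split("/")
    suffix = ".fa.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"unexpected file name: {filename!r}")
    # The part before the first hyphen is taken as the sequence type,
    # the remainder as the document location.
    sequence_type, _, location = filename[: -len(suffix)].partition("-")
    return DataFile(jurisdiction, document_type, sequence_type, location)


example = parse_data_file_path("us/grant/na-claims.fa.gz")
```

The helper returns the raw path components and does not try to translate them into display labels, since the full set of codes is not listed here.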

Download API

To download sequences programmatically (e.g. in automated scheduled scripts), you need to authenticate your application. This is achieved through a unique API access token.

Create your personal access token

You can manage your API access tokens in the Sequence Bulk download tab of your work area.

Figure: Bulk Download API Tokens

Click on the “Generate a new access token” link below the table. You can generate multiple tokens and label each one using the description field so that you can tell them apart.

Note: When generating a new access token, make sure to copy it somewhere safe. For security reasons, it won’t be displayed again. An API access token can be revoked by deleting it.
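
Since a token is displayed only once, one common way to keep it out of script source code is to read it from an environment variable. The following is a minimal sketch of that approach. The variable name PATSEQ_API_TOKEN and the function load_token are assumptions made for illustration and are not defined by the Lens.

```python
import os


def load_token(env_var: str = "PATSEQ_API_TOKEN") -> str:
    """Return the API access token stored in the given environment variable."""
    # The variable name is a hypothetical choice; use whatever fits your setup.
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API access token."
        )
    return token
```

A scheduled script can then call load_token() at start-up and fail early with a clear message if the token is missing.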

Using the API

The API base URL is https://www.lens.org/lens/bio/psd/api

The Bulk Download API provides two endpoints:

  • /files – to retrieve a list of available files
    Method: GET
    URL: https://www.lens.org/lens/bio/psd/api/files
    Request parameters:
      • access_token : (String) API access token
  • /download – to download specific files
    Method: GET
    URL: https://www.lens.org/lens/bio/psd/api/download
    Request parameters:
      • access_token : (String) API access token
      • file : (String) relative file path as returned by the /files API call

The access token can be submitted either in the HTTP Authorization header field or as the access_token URI request parameter. See RFC 6750 (Bearer Token Usage) for more details.
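
The two ways of submitting the token can be sketched with Python's standard urllib module as follows. This is an illustration only: the helper names request_with_header and url_with_token are hypothetical, "YOUR_TOKEN" is a placeholder, and the format of the /files response is not described on this page, so the sketch builds the requests without sending or parsing anything.

```python
import urllib.parse
import urllib.request

API_BASE = "https://www.lens.org/lens/bio/psd/api"


def request_with_header(endpoint: str, token: str, **params: str) -> urllib.request.Request:
    """Build a GET request that carries the token in the Authorization header."""
    url = f"{API_BASE}/{endpoint}"
    if params:
        # safe="/" keeps slashes in file paths readable instead of %2F.
        url += "?" + urllib.parse.urlencode(params, safe="/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def url_with_token(endpoint: str, token: str, **params: str) -> str:
    """Build a URL that carries the token in the access_token query parameter."""
    query = urllib.parse.urlencode({"access_token": token, **params}, safe="/")
    return f"{API_BASE}/{endpoint}?{query}"


# Option 1: token in the HTTP Authorization header.
files_request = request_with_header("files", "YOUR_TOKEN")

# Option 2: token in the request URL.
files_url = url_with_token("files", "YOUR_TOKEN")
```

Passing files_request or files_url to urllib.request.urlopen would send the actual request. The header variant keeps the token out of URLs, which may otherwise end up in logs or shell history.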

We also provide technical documentation with direct access to the API.

Example

To download all the latest nucleotide sequences extracted from US grants and referenced in the claims, use the following link:

https://www.lens.org/lens/bio/psd/api/download?access_token={your_personal_token}&file=us/grant/na-claims.fa.gz

For example, using wget:

wget "https://www.lens.org/lens/bio/psd/api/download?access_token={your_personal_token}&file=us/grant/na-claims.fa.gz" -O us-grants-na-claims.fa.gz

Alternatively, specify the access token in the HTTP Authorization header ("Authorization: Bearer {your_personal_token}") and omit it from the request URL:

Request URL: https://www.lens.org/lens/bio/psd/api/download?file=us/grant/na-claims.fa.gz

Please note that whichever HTTP client you use must be able to follow 302 redirects.
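
For scripted downloads, a Python equivalent of the wget command might look like the sketch below. It relies on urllib.request.urlopen, which follows 302 redirects by default, and streams the response to disk so that large files are not held in memory. The function names build_download_request and download_file are hypothetical, and "YOUR_TOKEN" is a placeholder for your own access token.

```python
import shutil
import urllib.parse
import urllib.request

DOWNLOAD_URL = "https://www.lens.org/lens/bio/psd/api/download"


def build_download_request(token: str, file_path: str) -> urllib.request.Request:
    """Build the GET request for one data file, with the token in the header."""
    query = urllib.parse.urlencode({"file": file_path}, safe="/")
    return urllib.request.Request(
        f"{DOWNLOAD_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def download_file(token: str, file_path: str, destination: str) -> None:
    """Stream one data file to disk; urlopen follows 302 redirects by default."""
    request = build_download_request(token, file_path)
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all at once


# Example call (requires network access and a valid token):
# download_file("YOUR_TOKEN", "us/grant/na-claims.fa.gz", "us-grants-na-claims.fa.gz")
```

urlopen raises urllib.error.HTTPError for error responses such as a rejected token, so a scheduled script may want to catch that exception and report it.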