Open Targets

Note

Important Update 9/19/2024: All URLs are changing. We are enabling public access to all Genomics Data Lake containers. The existing “signed URLs” (shared access signatures) will be retired at: 2024-11-04T00:00:00Z. After this time, the URLs without a query string will continue to work, however the “signed URLs” will no longer work and will return a 403 HTTP status code. Please plan accordingly to access the public URLs without a query string after this date (remove the ‘?’ and trailing characters).

The Open Targets Platform is a data resource to facilitate the systematic identification and prioritization of potential therapeutic drug targets. This resource integrates publicly available datasets, including those datasets that are generated by the Open Targets consortium, to build and score target-disease associations, aiding in the identification and prioritization of drug targets. Additionally, it incorporates pertinent annotation information about targets, diseases, phenotypes, drugs, and their key relationships.

The Open Targets Genetics highlights variant-centric statistical evidence to allow both prioritization of candidate causal variants at trait-associated loci and identification of potential drug targets. It collects and combines genetic associations gathered from published literature as well as newly derived data from sources like UK Biobank and FinnGen. Additionally, it includes functional genomics information such as chromatin conformation and interactions, along with quantitative trait loci (eQTLs, pQTLs, and sQTLs). Large-scale pipelines apply statistical fine-mapping across thousands of trait-associated loci to resolve association signals and link each variant to its proximal and distal target genes using a 'Locus2Gene' assessment. Integrated cross-trait colocalisation analyses and linking to detailed pharmaceutical compounds extend the capacity of Open Targets Genetics to explore drug repositioning opportunities and shared genetic architecture.

Note

Microsoft provides Azure Open Datasets on an “as is” basis. Microsoft makes no warranties, express or implied, guarantees or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms that Microsoft received source data. The dataset may include data sourced from Microsoft.

Data source

This dataset is a mirror of http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/latest and http://ftp.ebi.ac.uk/pub/databases/opentargets/genetics/latest/

Data volumes and update frequency

This dataset contains approximately 350 GB of data and is updated daily.

Storage location

This dataset is stored in the West US 2 Azure region. Allocating compute resources in West US 2 is recommended for affinity.

Data access

West US 2: https://datasetopentargets.blob.core.windows.net/dataset

SAS Token: sv=2019-10-10&si=prod&sr=c&sig=9nzcxaQn0NprMPlSh4RhFQHcXedLQIcFgbERiooHEqM%3D

Use terms

Please refer to the data use terms as described here.

Contact

https://www.internationalgenome.org/contact

View the rest of the datasets in the Open Datasets catalog.