Linking Linked Data CSHALS2013

Published on: Mar 3, 2016
Source: www.slideshare.net


Transcripts - Linking Linked Data CSHALS2013

  • 1. Linking Linked Data: Linked Data to Integrated Data. Expert Bioinformatics from Bioinformatics Experts
  • 2. Put your data on the web; make a pretty web site later.
  • 3. (no text transcribed)
  • 4. Now we can ask questions like this... What members of a target pathway are already targeted in other diseases? Diagram labels: Target, Pathway, Disease; Chembl, Uniprot, Reactome, OMIM; Protein, Target, Compound, Pathway, Disease.
  • 5. Because we have lots of data exposed as RDF: Uniprot:Protein, BioPAX:Protein, Mim:Phenotype
  • 6. What do you do when you have to add data...
  • 7. Or connect SPARQL endpoints? RDF != Linked Data
  • 8. Is your data 5*? Linked data is essential to actually connect the semantic web. It is quite easy to do with a little thought, and becomes second nature. Various common sense considerations determine when to make a link and when not to.
  • 9. Example: openflydata to BioCyc. What genes are differentially expressed in the hindgut, and are there any pathways associated with those genes?
       ● Use FlyAtlas at openflydata.org for tissue-specific expression profiles.
       ● Use FlyCyc from BioCyc.
       ● Then SPARQL
  • 10. Problem: Node URIs
       <http://openflydata.org/id/flyatlas/affyid/1616608_a_at> <http://purl.org/NET/flyatlas/schema#gene> <http://openflydata.org/id/flybase/feature/FBgn0001128> .
       <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#xref> <http://biocyc.org/biopax/biopax-level3#Protein202210> .
       <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#db> "FlyCyc" .
       <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#id> "FBGN0001128" .
  • 11. Integration Level 1: Use Identifiers.org
       CONSTRUCT {
         ?x rdfs:seeAlso `bif:sprintf_iri ("http://identifiers.org/flybase/%s", ?id)`
       }
       WHERE {
         ?x BP:unificationxref ?xref .
         ?xref BP:id ?id .
         ?xref BP:db "FlyCyc"^^xsd:string
       }
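The Level 1 step can be sketched outside a triple store as well. The following is a minimal Python illustration, not the presenter's implementation: it represents triples as plain tuples, reuses the BioCyc triples from the "Problem: Node URIs" slide (including their direction, where the xref node points at the protein), and mints one `rdfs:seeAlso` link per FlyCyc identifier. The helper `normalize_flybase_id` is hypothetical; it assumes the upper-cased `FBGN` prefix should be rewritten to the `FBgn` casing seen in the openflydata IRI so that both sources end up on the same identifiers.org IRI.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
BP = "http://www.biopax.org/release/biopax-level3.owl#"
BIOCYC = "http://biocyc.org/biopax/biopax-level3#"
IDENTIFIERS_FLYBASE = "http://identifiers.org/flybase/%s"

# Triples taken from the "Problem: Node URIs" slide, as (subject, predicate, object).
triples = {
    (BIOCYC + "UnificationXref202209", BP + "xref", BIOCYC + "Protein202210"),
    (BIOCYC + "UnificationXref202209", BP + "db", "FlyCyc"),
    (BIOCYC + "UnificationXref202209", BP + "id", "FBGN0001128"),
}


def normalize_flybase_id(raw_id):
    """Hypothetical helper (an assumption, not from the slides):
    rewrite an upper-cased 'FBGN' prefix to the 'FBgn' casing."""
    if raw_id.upper().startswith("FBGN"):
        return "FBgn" + raw_id[4:]
    return raw_id


def mint_see_also(triples, db="FlyCyc"):
    """Emit (entity, rdfs:seeAlso, identifiers.org IRI) for every xref node tagged with `db`."""
    xref_nodes = {s for (s, p, o) in triples if p == BP + "db" and o == db}
    links = set()
    for xref in xref_nodes:
        ids = [o for (s, p, o) in triples if s == xref and p == BP + "id"]
        # Direction follows the slide's data: the xref node is the subject of BP:xref.
        entities = [o for (s, p, o) in triples if s == xref and p == BP + "xref"]
        for entity in entities:
            for raw_id in ids:
                iri = IDENTIFIERS_FLYBASE % normalize_flybase_id(raw_id)
                links.add((entity, RDFS_SEE_ALSO, iri))
    return links


links = mint_see_also(triples)
```

Applying the same minting rule to the FlyBase feature IRI on the openflydata side would give both datasets a shared node IRI to join on, which is the point of this integration level.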
  • 12. Integration Level 2: adding property characteristics
       BP = <http://www.biopax.org/release/biopax-level3.owl#>
       BP:Protein BP:controls BP:Catalysis
       BP:Catalysis BP:controls BP:BiochemicalReaction
       BP:Protein BP:controls BP:BiochemicalReaction
       CONSTRUCT { ?y GB:controlledBy ?x }
       WHERE {
         ?x BP:controls ?catalysis .
         ?catalysis BP:controls ?y
       }
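As a rough illustration of this level, the two-hop `BP:controls` chain can be collapsed into a single shortcut property. The Python sketch below works on tuples rather than a SPARQL endpoint; the `ex:` node names are invented for the example, and the shortcut is oriented the way the later join query uses it (`?reaction GB:controlledBy ?protein`).

```python
def compose_controls(triples, controls="BP:controls", derived="GB:controlledBy"):
    """Collapse x -controls-> mid -controls-> y into (y, derived, x)."""
    targets_by_subject = {}
    for s, p, o in triples:
        if p == controls:
            targets_by_subject.setdefault(s, set()).add(o)

    shortcut = set()
    for x, mids in targets_by_subject.items():
        for mid in mids:
            # Only two-hop paths produce a shortcut; a lone hop is ignored.
            for y in targets_by_subject.get(mid, ()):
                shortcut.add((y, derived, x))
    return shortcut


# Hypothetical instance data: a protein controls a catalysis, which controls a reaction.
triples = {
    ("ex:protein1", "BP:controls", "ex:catalysis1"),
    ("ex:catalysis1", "BP:controls", "ex:reaction1"),
}
shortcut = compose_controls(triples)
```

Materializing the shortcut once means later queries need a single triple pattern instead of walking through the intermediate catalysis node.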
  • 13. Integration Level 3: class subsumption
       FlyA = <http://purl.org/NET/flyatlas/schema#>
       flywebflyatlas:1616608_a_at a flyatlas:ProbeData
       BP = <http://www.biopax.org/release/biopax-level3.owl#>
       flyatlas:ProbeData rdfs:subClassOf BP:DNARegion
       CONSTRUCT { ?x a BP:DNARegion }
       WHERE { ?x a flyatlas:ProbeData }
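The class subsumption step amounts to materializing one extra `rdf:type` triple per instance. A minimal Python sketch follows, assuming triples are held as tuples and the subclass axioms as a simple dictionary (both are illustrative choices, not the presenter's tooling):

```python
def materialize_types(triples, subclass_of):
    """Add (x, rdf:type, Super) for every (x, rdf:type, Sub) where Sub maps to Super."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat so that chains of subclass axioms are followed too
        changed = False
        for s, p, o in list(inferred):
            if p == "rdf:type" and o in subclass_of:
                new_triple = (s, "rdf:type", subclass_of[o])
                if new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


# The single axiom from the slide: flyatlas:ProbeData rdfs:subClassOf BP:DNARegion
subclass_of = {"flyatlas:ProbeData": "BP:DNARegion"}
data = {("flyatlas:1616608_a_at", "rdf:type", "flyatlas:ProbeData")}
result = materialize_types(data, subclass_of)
```

After this step a FlyAtlas probe can be matched by any query pattern written against `BP:DNARegion`, without a reasoner running at query time.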
  • 14. Connect BiochemicalReactions to Expression Values
       SELECT ?name ?id ?mean
       WHERE {
         ?reaction a BP:BiochemicalReaction .
         ?reaction BP:standardName ?name .
         ?reaction GB:controlledBy ?protein .
         ?protein a BP:Protein .
         ?protein BP:xref ?id .
         ?probe a BP:DNARegion .
         ?probe BP:xref ?id .
         ?probe flyatlas:l_fatbody ?blank .
         ?blank flyatlas:mean ?mean
       }
       LIMIT 5
       No Reasoner – just a few SPARQL CONSTRUCTs
  • 15. (no text transcribed)
  • 16. Client Architecture
  • 17. Vocabularies in Linked Data: What does the linked data cloud know about Drugs....
       SELECT distinct ?class
       WHERE {
         ?s a ?class .
         ?s ?p ?o
       }
       Classes returned (>100): chembl:Activity, chembl:Assay, chembl:AssayCategory, chembl:AssayTargetLink, chembl:ChemicalCompound, chembl:DrugTarget, chembl:LiteratureCitation, dailymed:drugs, drugbank:Drug, drugbank:DrugInteraction, drugbank:EnzymeLink, drugbank:ExternalIdentifier, drugbank:ExternalLink, drugbank:LiteratureCitation, drugbank:Molecule, drugbank:OrganismSpecies, drugbank:Patent, drugbank:ProteinSequence, drugbank:TargetLink, entrez:EnsemblReference, entrez:Gene, pdb:Molecule, pdb:Structure, pubmed:Chemical, pubmed:Citation, pubmed:DatabankReference
  • 18. Create a tighter, more unified “view” under one schema
  • 19. Unified Vocabulary: What does the linked data cloud know about Drugs....
  • 20. Map Classes and Properties into a single instantiated view
  • 21. Before Query
       SELECT *
       WHERE {
         ?s drugb:calculatedInChIKey ?inchiD .
         ?s a drugb:Drug .
         ?c a chembl:ChemicalCompound .
         ?c chembl:standardInChIKey ?inchiC .
         FILTER regex(?inchiD, ?inchiC)
       }
  • 22. After Query
       SELECT *
       WHERE {
         ?s a GB:Drug .
         ?s GB:inchiKey ?inchi .
       }
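One way to picture the step between the "before" and "after" queries is a pair of lookup tables that rewrite source classes and properties into the unified `GB:` vocabulary. The Python sketch below is an assumption-laden illustration: the mapping tables (in particular treating `chembl:ChemicalCompound` as `GB:Drug`), the `ex:` nodes, and the placeholder key `KEY-A` are all invented for the example and do not come from the slides.

```python
# Assumed mappings into the unified vocabulary (illustrative only).
CLASS_MAP = {
    "drugb:Drug": "GB:Drug",
    "chembl:ChemicalCompound": "GB:Drug",
}
PROPERTY_MAP = {
    "drugb:calculatedInChIKey": "GB:inchiKey",
    "chembl:standardInChIKey": "GB:inchiKey",
}


def build_view(triples):
    """Rewrite mapped classes and properties; drop everything the view does not cover."""
    view = set()
    for s, p, o in triples:
        if p == "rdf:type":
            if o in CLASS_MAP:
                view.add((s, p, CLASS_MAP[o]))
        elif p in PROPERTY_MAP:
            view.add((s, PROPERTY_MAP[p], o))
    return view


def drugs_with_inchikey(view):
    """Python equivalent of the 'after' query: every GB:Drug with its GB:inchiKey."""
    drugs = {s for s, p, o in view if p == "rdf:type" and o == "GB:Drug"}
    return sorted((s, o) for s, p, o in view if p == "GB:inchiKey" and s in drugs)


# Hypothetical source data from two vocabularies; "KEY-A" is a placeholder, not a real InChIKey.
source = {
    ("ex:drugbank1", "rdf:type", "drugb:Drug"),
    ("ex:drugbank1", "drugb:calculatedInChIKey", "KEY-A"),
    ("ex:chembl1", "rdf:type", "chembl:ChemicalCompound"),
    ("ex:chembl1", "chembl:standardInChIKey", "KEY-A"),
    ("ex:chembl1", "chembl:unmappedProperty", "ignored"),
}
rows = drugs_with_inchikey(build_view(source))
```

With the view in place, the client query no longer needs to know either source schema or compare keys with a regex filter; records from both sources share one class and one property.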
  • 23. Linked Data Architecture
  • 24. Creating fixed “views” of Linked Data
       When the use of integrated data is fixed, e.g. an API or application, Linked Data can be expensive:
       – Changes to data require significant recoding
       – Multiple schemas make queries long and inefficient
       ● A view or middle layer of data is used by the API; changes to data are managed by the view and the API is minimally disturbed
       – Views are easier to query
       – Views are faster to query
       ● Client gets the best of both worlds: a tight view of data for API queries while still having all the advantages of a linked data strategy.
  • 25. Summary
       ● Exposing data as RDF does not equal Linked Data
       ● Making data linked is not hard
       – Node IRIs
       – Unifying Classes
       – Transitive closure of Properties
       ● A little semantics goes a long way (no reasoner required)
       ● Creating “Views” from one schema to another is not hard.
       – But should be easier
  • 26. www.generalbioinformatics.com/science.html
