http://info216.wiki.uib.no/api.php?action=feedcontributions&user=Tkr048&feedformat=atom
info216 - User contributions [en]
2024-03-28T17:56:00Z
User contributions
MediaWiki 1.39.6
http://info216.wiki.uib.no/index.php?title=Lab:_Using_Graph_Embeddings&diff=1867
Lab: Using Graph Embeddings
2022-05-06T11:49:12Z
<p>Tkr048: </p>
<hr />
<div><br />
=Lab 13: Using Graph Embeddings=<br />
<br />
==Topics==<br />
Using knowledge graph embeddings with TorchKGE.<br />
<br />
<!-- ==Tutorial== --><br />
<br />
<br />
==Classes and methods==<br />
The following TorchKGE classes are central:<br />
* '''KnowledgeGraph''' - contains the knowledge graph (KG)<br />
* '''Model''' - contains the embeddings (entity and relation vectors) for some KG<br />
<br />
<!--<br />
<syntaxhighlight><br />
</syntaxhighlight><br />
--><br />
<br />
==Tasks==<br />
<br />
'''Knowledge Graph''':<br />
* Use a [https://torchkge.readthedocs.io/en/latest/reference/utils.html#pre-trained-models dataset loader] to load a KG you want to work with. Freebase FB15k is a good choice. (You will need a pre-trained model for your KG later, so choose one of FB15k, FB15k237, WDV5, WN18RR, or Yago3-10. This lab has mostly been tested on FB15k.)<br />
* Use the methods provided by the [https://torchkge.readthedocs.io/en/latest/reference/data.html#knowledge-graph KnowledgeGraph class] to inspect the KG.<br />
** Print out the numbers of entities, relations, and facts in the training, validation, and testing sets. <br />
** Print the identifiers for the first 10 entities and relations (''tip:'' ent2ix and rel2ix).<br />
<br />
'''External identifiers''':<br />
* Download a dataset that provides more understandable labels for the entities (and perhaps relations) in your KnowledgeGraph.<br />
** If you use FB15k, the relation names are not so bad, but the entity identifiers do not give much meaning. Same with WordNet. [https://github.com/villmow/datasets_knowledge_embedding This repository] contains mappings for the Freebase and WordNet datasets.<br />
** If you use a Wikidata graph, the entities and relations are all P- and Q-codes. To get labels, you can try a combination of [https://query.wikidata.org/ SPARQL queries] and [https://pypi.org/project/Wikidata/ this API].<br />
* Create mappings from external label to entity (and perhaps relation) ids in the KnowledgeGraph. Also create the inverse mappings.<br />
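The mapping step can be sketched with plain dictionaries. The ids and labels below are made up for illustration; in the lab, `ent2lbl` would come from the downloaded label file and `ent2ix` from the KnowledgeGraph:

```python
# Hypothetical data: ent2ix plays the role of TorchKGE's KnowledgeGraph.ent2ix,
# ent2lbl the role of the mapping loaded from entity2wikidata.json.
ent2ix = {'/m/019_rr': 0, '/m/02h40lc': 1}                          # entity id -> KG index
ent2lbl = {'/m/019_rr': 'J. K. Rowling', '/m/02h40lc': 'English'}   # entity id -> label

# Compose the two mappings: label -> KG index, plus the inverse.
lbl2ix = {lbl: ent2ix[ent] for ent, lbl in ent2lbl.items() if ent in ent2ix}
ix2lbl = {ix: lbl for lbl, ix in lbl2ix.items()}

print(lbl2ix['J. K. Rowling'])  # -> 0
print(ix2lbl[1])                # -> 'English'
```

The `if ent in ent2lbl`-style guard matters because labels are missing for some entities.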
<br />
'''Test entities and relations''':<br />
* Get the KG indexes for a few entities and relations. If you use the Freebase or Wikidata graphs, you can try 'J. K. Rowling' and 'WALL·E' as entities (''note'' that the dot in 'WALL·E' is not a hyphen or a usual period). For relations you can try 'influenced by' and 'genre'.<br />
<br />
'''Model''':<br />
* Load a [https://torchkge.readthedocs.io/en/latest/reference/utils.html#pre-trained-models pre-trained TransE model] that matches your KnowledgeGraph.<br />
** Print out the numbers of entities, relations, and the dimensions of the entity and relation vectors. Do they match your KnowledgeGraph?<br />
* Get the vectors for your test entities and relations (for example, 'J. K. Rowling' and 'influenced by').<br />
* Find vectors for a few more entities (both unrelated and related ones, e.g., 'J. R. R. Tolkien', 'C. S. Lewis', ...). Use the [https://torchkge.readthedocs.io/en/latest/reference/models.html#translationalmodels model.dissimilarity()-method] to estimate how semantically close your entities are. Do the distances make sense?<br />
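As a rough illustration of what model.dissimilarity() computes for TransE (an L1 or L2 distance between vectors), here is a toy sketch with random vectors standing in for real pre-trained embeddings; all names and dimensions are invented:

```python
import numpy as np

# Toy entity embeddings standing in for the model's entity vectors (names hypothetical).
rng = np.random.default_rng(42)
emb = {name: rng.normal(size=50) for name in
       ['J. K. Rowling', 'J. R. R. Tolkien', 'C. S. Lewis']}
# Make Tolkien's vector deliberately close to Rowling's for the demonstration.
emb['J. R. R. Tolkien'] = emb['J. K. Rowling'] + 0.1 * rng.normal(size=50)

def l2_dissimilarity(a, b):
    # L2 distance, the kind of dissimilarity used by TransE-style models
    return float(np.linalg.norm(a - b))

d_close = l2_dissimilarity(emb['J. K. Rowling'], emb['J. R. R. Tolkien'])
d_far = l2_dissimilarity(emb['J. K. Rowling'], emb['C. S. Lewis'])
print(d_close < d_far)  # the engineered neighbour has the smaller distance
```

With real embeddings the distances reflect how the entities co-occur in facts, which is what the task asks you to sanity-check.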
<br />
'''K-nearest neighbours''':<br />
* Find the indexes of the 10 entity vectors that are nearest neighbours to your entity of choice. You can use [https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html scikit-learn's sklearn.neighbors.NearestNeighbors.kneighbors()-method] for this.<br />
* Map the indexes of the 10-nearest neighbouring entities back into human-understandable labels. Does this make sense? Try the same thing with another entity (e.g., 'WALL·E').<br />
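The nearest-neighbour search can also be sketched with plain numpy over a toy embedding matrix (sizes and indexes below are made up; in the lab the matrix would come from the pre-trained model, and sklearn's NearestNeighbors does the same job at scale):

```python
import numpy as np

# Toy embedding matrix standing in for the model's entity embeddings (rows = KG indexes).
rng = np.random.default_rng(0)
ent_vectors = rng.normal(size=(100, 16))

def k_nearest(query_vec, vectors, k=10):
    # Indexes of the k rows closest to query_vec by L2 distance
    dists = np.linalg.norm(vectors - query_vec, axis=1)
    return np.argsort(dists)[:k]

query_ix = 3
neighbours = k_nearest(ent_vectors[query_ix], ent_vectors, k=10)
print(neighbours[0] == query_ix)  # an entity is always its own nearest neighbour
```

Mapping `neighbours` back through your index-to-label dictionary gives the human-readable answer the task asks for.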
<br />
'''Translation''':<br />
* Add together the vectors for an entity and a relation that gives meaning for the entity (e.g., 'J. K. Rowling' + 'influenced by', 'WALL·E' + 'genre'). Find the 10-nearest neighbouring entities for the vector sum. Does this make sense? Try more entities and relations. Try to find examples that work well and that do not.<br />
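The translation idea (head + relation ≈ tail in TransE) can be illustrated with toy vectors; the entity and relation indexes here are invented for the sketch:

```python
import numpy as np

# Toy vectors standing in for pre-trained TransE embeddings (all indexes hypothetical).
rng = np.random.default_rng(1)
ent_vectors = rng.normal(size=(50, 16))
genre_vec = rng.normal(size=16)   # a relation vector, e.g. 'genre'

# Construct a 'tail' entity at index 20 that satisfies the translation for entity 7.
ent_vectors[20] = ent_vectors[7] + genre_vec + 0.01 * rng.normal(size=16)

# TransE translation: the sum head + relation should land near the tail entity.
target = ent_vectors[7] + genre_vec
dists = np.linalg.norm(ent_vectors - target, axis=1)
print(int(np.argsort(dists)[0]))  # -> 20, the engineered tail
```

With real embeddings the translation only lands near a plausible tail when the model learned the relation well, which is why the task asks for both working and failing examples.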
<br />
==Code to get started==<br />
<br />
<syntaxhighlight><br />
!pip install torchkge<br />
!pip install scikit-learn<br />
!git clone https://github.com/villmow/datasets_knowledge_embedding.git<br />
<br />
from torchkge.utils.datasets import load_fb15k237<br />
<br />
kg_train, kg_val, kg_test = load_fb15k237()<br />
<br />
print(list(kg_train.ent2ix.keys())[-10:])<br />
print(list(kg_train.rel2ix.keys())[-10:])<br />
<br />
<br />
<br />
"""Download files with human-readable labels for (most) Freebase entities used in the dataset. <br />
Labels seem to be missing for some entities used in FB15k-237."""<br />
<br />
import json<br />
<br />
TEXT_TRIPLES_DIR = 'datasets_knowledge_embedding/FB15k-237/'<br />
with open(TEXT_TRIPLES_DIR + 'entity2wikidata.json') as file:<br />
    _entity2wikidata = json.load(file)<br />
<br />
ent2lbl = {<br />
    ent: wd['label']<br />
    for ent, wd in _entity2wikidata.items()<br />
}<br />
lbl2ent = {lbl: ent for ent, lbl in ent2lbl.items()}<br />
<br />
<br />
print([<br />
    ent2lbl[ent]<br />
    for ent in kg_train.ent2ix.keys()<br />
    if ent in ent2lbl][-10:])<br />
<br />
<br />
</syntaxhighlight><br />
<br />
<br />
<br />
==If You Have More Time==<br />
<br />
* Try it out with different datasets, for example one you create yourself using SPARQL queries on an open KG.<br />
<br />
==Useful readings==<br />
* [https://torchkge.readthedocs.io/en/latest/ TorchKGE documentation]</div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Exercises&diff=1863
Lab Exercises
2022-05-03T06:12:10Z
<p>Tkr048: </p>
<hr />
<div>Here we will present new Python-based lab exercises each week. The old [[Java Labs | Java-based exercises]] are still available if you prefer.<br />
<br />
# [[Lab: Getting started with VSCode, Python and RDFlib]] (week 4, from 25/1) <!-- After S1: KG & S2: RDF --><br />
# [[Lab: RDF programming with RDFlib]] (week 5, from 1/2) <!-- After S3: SPARQL --><br />
# [[Lab: SPARQL]] (week 6, from 8/2) <!-- After S4: Storing and sharing KGs --><br />
# [[Lab: SPARQL Programming]] (week 7, from 15/2) <!-- After S5: Open KGs--><br />
# [[Lab: Web APIs and JSON-LD]] (week 8, from 22/2) <!-- After S6: Enterprise KGs --><br />
# [[Lab: Semantic Lifting - CSV]] (week 9, from 28/2) <!-- After S8: Vocabularies --><br />
# [[Lab: RDFS]] (week 10, from 6/3) <!-- After S7: RDFS --><br />
# [[Lab: OWL 1]] (week 11, from 15/3) <!-- After S9: OWL --><br />
# [[Lab: Semantic Lifting - XML]] (week 12, from 22/3) <!-- After S10: DL --><br />
# [[Lab: More OWL | Lab: OWL 2]] (week 13, from 29/4) <!-- After S11: OWL-DL --><br />
# [[Lab: Semantic Lifting - HTML]] (week 14, from 5/4) <!-- After S12: KG embeddings --><br />
# Week 15: Easter<br />
# Week 16: No labs<br />
# [[Lab: Even More OWL | Lab: owlready2]] (week 17, from 26/4)<br />
# [[Lab: Using Graph Embeddings]] (week 18, from 3/5)<br />
# [[Lab: Training Graph Embeddings]] (week 19, from 10/5)</div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1841
Lab Solutions
2022-04-16T10:40:15Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
=Example lab solutions=<br />
<br />
==Getting started==<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib.collection import Collection<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
RL = Namespace('http://purl.org/vocab/relationship/')<br />
DBO = Namespace('https://dbpedia.org/ontology/')<br />
DBR = Namespace('https://dbpedia.org/page/')<br />
<br />
g.namespace_manager.bind('exampleURI', EX)<br />
g.namespace_manager.bind('relationship', RL)<br />
g.namespace_manager.bind('dbpediaOntology', DBO)<br />
g.namespace_manager.bind('dbpediaPage', DBR)<br />
<br />
g.add((EX.Cade, RDF.type, FOAF.Person)) <br />
g.add((EX.Mary, RDF.type, FOAF.Person))<br />
g.add((EX.Cade, RL.spouseOf, EX.Mary)) # a symmetrical relation from an established namespace<br />
g.add((DBR.France, DBO.capital, DBR.Paris)) <br />
g.add((EX.Cade, FOAF.age, Literal(27)))<br />
g.add((EX.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
Collection(g, EX.MaryInterests, [EX.hiking, EX.chocolate, EX.biology])<br />
g.add((EX.Mary, EX.hasInterest, EX.MaryInterests))<br />
g.add((EX.Mary, RDF.type, EX.student))<br />
g.add((DBO.capital, EX.range, EX.city))<br />
g.add((EX.Mary, RDF.type, EX.kind))<br />
g.add((EX.Cade, RDF.type, EX.kindPerson))<br />
<br />
#hobbies = ['hiking', 'choclate', 'biology']<br />
#for i in hobbies:<br />
# g.add((EX.Mary, FOAF.interest, EX[i]))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==RDFlib==<br />
<syntaxhighlight><br />
<br />
from rdflib.namespace import RDF, XSD, FOAF<br />
from rdflib import Graph, Namespace, Literal, BNode<br />
from rdflib.collection import Collection<br />
<br />
<br />
g = Graph()<br />
ex = Namespace('http://example.org/')<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
<br />
address = BNode()<br />
degree = BNode()<br />
<br />
# from lab 1<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.Characterostic, ex.kind))<br />
g.add((ex.Cade, ex.Characterostic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
g.add((ex.Cade, schema.address, address))<br />
<br />
# BNode address<br />
g.add((address, RDF.type, schema.PostalAddress))<br />
g.add((address, schema.streetAddress, Literal('1516 Henry Street')))<br />
g.add((address, schema.addressLocality, dbp.Berkeley))<br />
g.add((address, schema.addressRegion, dbp.California))<br />
g.add((address, schema.postalCode, Literal('94709')))<br />
g.add((address, schema.addressCountry, dbp.United_States))<br />
<br />
# More info about Cade<br />
g.add((ex.Cade, ex.Degree, degree))<br />
g.add((degree, ex.Field, dbp.Biology))<br />
g.add((degree, RDF.type, dbp.Bachelors_degree))<br />
g.add((degree, ex.Universety, dbp.University_of_California))<br />
g.add((degree, ex.year, Literal('2001', datatype=XSD.gYear)))<br />
<br />
# Emma<br />
emma_degree = BNode()<br />
g.add((ex.Emma, FOAF.name, Literal("Emma Dominguez", datatype=XSD.string)))<br />
g.add((ex.Emma, RDF.type, FOAF.Person))<br />
g.add((ex.Emma, ex.Degree, emma_degree))<br />
g.add((emma_degree, ex.Field, dbp.Chemistry))<br />
g.add((emma_degree, RDF.type, dbp.Masters_degree))<br />
g.add((emma_degree, ex.Universety, dbp.University_of_Valencia))<br />
g.add((emma_degree, ex.year, Literal('2015', datatype=XSD.gYear)))<br />
<br />
# Address<br />
emma_address = BNode()<br />
g.add((ex.Emma, schema.address, emma_address))<br />
g.add((emma_address, RDF.type, schema.PostalAddress))<br />
g.add((emma_address, schema.streetAddress,<br />
Literal('Carrer de la Guardia Civil 20')))<br />
g.add((emma_address, schema.addressRegion, dbp.Valencia))<br />
g.add((emma_address, schema.postalCode, Literal('46020')))<br />
g.add((emma_address, schema.addressCountry, dbp.Spain))<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[dbp.Portugal, dbp.Italy, dbp.France, dbp.Germany, dbp.Denmark, dbp.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL - Blazegraph==<br />
<syntaxhighlight><br />
PREFIX ex: <http://example.org/> <br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/> <br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> <br />
PREFIX xml: <http://www.w3.org/XML/1998/namespace> <br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> <br />
<br />
<br />
#select all triplets in graph<br />
SELECT ?s ?p ?o<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
#select the interests of Cade<br />
SELECT ?cadeInterest<br />
WHERE {<br />
ex:Cade ex:interest ?cadeInterest .<br />
} <br />
#select the country and city where Emma lives<br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
#select the people who are over 26 years old<br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
} <br />
#select people who graduated with Bachelor<br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
<br />
} <br />
# delete cades photography interest<br />
DELETE DATA<br />
{<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
<br />
# delete and insert university of valencia<br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
<br />
#check if the deletion worked<br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
} <br />
#describe sergio<br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
</syntaxhighlight><br />
<br />
==SPARQL - RDFlib==<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON, POST, GET, TURTLE<br />
<br />
namespace = "lab4"<br />
sparql = SPARQLWrapper("http://10.111.21.183:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
# Print out Cades interests<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
<br />
# Print Emmas city and country<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("Emma's city is " + result["emmaCity"]["value"] + " and Emma's country is " + result["emmaCountry"]["value"])<br />
<br />
#Select the people who are over 26 years old<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("All people who are over 26 years old: " + result["person"]["value"])<br />
<br />
#Select people who graduated with Bachelor<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("People who graduated with Bachelor: " + result["person"]["value"])<br />
<br />
#Delete cades photography interest<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE DATA {<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Print out Cades interests again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
<br />
# Check university names<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["o2"]["value"])<br />
<br />
<br />
#Delete and insert university of valencia<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Check university names again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["o2"]["value"])<br />
<br />
#Insert Sergio<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#><br />
INSERT DATA {<br />
ex:Sergio a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valenciay ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46021"^^xsd:string ;<br />
ex:state ex:California ;<br />
ex:street "4_Carrer_del_Serpis"^^xsd:string ] ;<br />
ex:degree [ ex:degreeField ex:Computer_science ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2008"^^xsd:gYear ] ;<br />
ex:expertise ex:Big_data,<br />
ex:Semantic_technologies,<br />
ex:Machine_learning;<br />
foaf:name "Sergio_Pastor"^^xsd:string .<br />
}<br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
sparql.setMethod(GET)<br />
<br />
# Describe Sergio<br />
sparql.setReturnFormat(TURTLE)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
""")<br />
results = sparql.query().convert()<br />
print(results.serialize(format='turtle'))<br />
<br />
# CONSTRUCT: every city in an address is located in that address's country<br />
sparql.setQuery("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />
PREFIX ex: <http://example.org/><br />
CONSTRUCT {?city ex:locatedIn ?country}<br />
WHERE {<br />
?s rdf:type ex:Address .<br />
?s ex:city ?city .<br />
?s ex:country ?country.<br />
}<br />
""")<br />
sparql.setReturnFormat(TURTLE)<br />
results = sparql.query().convert()<br />
print(results.serialize(format='turtle'))<br />
<br />
</syntaxhighlight><br />
==Web APIs and JSON-LD==<br />
<br />
<syntaxhighlight><br />
import requests<br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, RDF<br />
<br />
r = requests.get('http://api.open-notify.org/astros.json').json()<br />
g = Graph()<br />
EX = Namespace('http://example.org/')<br />
g.bind("ex", EX)<br />
g.bind("foaf", FOAF)  # bound prefixes can be used in queries below<br />
<br />
for item in r['people']:<br />
craft = item['craft'].replace(" ","_")<br />
person = item['name'].replace(" ","_")<br />
g.add((EX[person], EX.onCraft, EX[craft]))<br />
g.add((EX[person], RDF.type, FOAF.Person))<br />
g.add((EX[person], FOAF.name, Literal(item['name'])))<br />
g.add((EX[craft], FOAF.name, Literal(item['craft'])))<br />
res = g.query("""<br />
CONSTRUCT {?person1 foaf:knows ?person2}<br />
WHERE {<br />
?person1 ex:onCraft ?craft .<br />
?person2 ex:onCraft ?craft .<br />
}<br />
""")<br />
<br />
for triplet in res:<br />
# (we don't need to add that they know themselves)<br />
if (triplet[0] != triplet[2]):<br />
g.add(triplet)<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==Semantic lifting - CSV==<br />
<syntaxhighlight><br />
import pandas as pd<br />
from rdflib import Graph, Namespace, URIRef, Literal<br />
from rdflib.namespace import RDF, XSD<br />
<br />
ex = Namespace("http://example.org/")<br />
dbr = Namespace("http://dbpedia.org/resource/")<br />
dbp = Namespace("https://dbpedia.org/property/")<br />
dbpage = Namespace("https://dbpedia.org/page/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
tl = Namespace("http://purl.org/NET/c4dm/timeline.owl#")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("dbr", dbr)<br />
g.bind("dbp", dbp)<br />
g.bind("dbpage", dbpage)<br />
g.bind("sem", sem)<br />
g.bind("tl", tl)<br />
<br />
df = pd.read_csv("russia-investigations.csv")<br />
# Pandas may infer the wrong dtype for some columns when it reads the file,<br />
# so we explicitly convert their contents to strings with .astype("str").<br />
df["name"] = df["name"].astype("str")<br />
df["type"] = df["type"].astype("str")<br />
<br />
# iterrows creates an iterable object (list of rows)<br />
for index, row in df.iterrows():<br />
investigation = URIRef(ex + row['investigation'])<br />
investigation_start = Literal(row['investigation-start'], datatype=XSD.date)<br />
investigation_end = Literal(row['investigation-end'], datatype=XSD.date)<br />
investigation_days = Literal(row['investigation-days'], datatype=XSD.integer)<br />
<br />
name = Literal(row['name'], datatype=XSD.string)<br />
name_underscore = URIRef(dbpage + row['name'].replace(" ","_"))<br />
investigation_result = URIRef(ex + row['investigation']+ "_investigation_" + row['name'].replace(" ","_"))<br />
indictment_days = Literal(row['indictment-days'], datatype=XSD.integer)<br />
type = URIRef(dbr + row['type'].replace(" ","_"))<br />
cp_date = Literal(row['cp-date'], datatype=XSD.date)<br />
cp_days = Literal(row['cp-days'], datatype=XSD.integer)<br />
overturned = Literal(row['overturned'], datatype=XSD.boolean)<br />
pardoned = Literal(row['pardoned'], datatype=XSD.boolean)<br />
american = Literal(row['american'], datatype=XSD.boolean)<br />
president = Literal(row['president'], datatype=XSD.string)<br />
president_underscore = URIRef(dbr + row['president'].replace(" ","_"))<br />
<br />
g.add((investigation, RDF.type, sem.Event))<br />
g.add((investigation, sem.hasBeginTimeStamp, investigation_start))<br />
g.add((investigation, sem.hasEndTimeStamp, investigation_end))<br />
g.add((investigation, tl.duration, investigation_days))<br />
g.add((investigation, dbp.president, president_underscore))<br />
g.add((investigation, sem.hasSubEvent, investigation_result))<br />
<br />
g.add((investigation_result, ex.resultType, type))<br />
g.add((investigation_result, ex.objectOfInvestigation, name_underscore))<br />
g.add((investigation_result, ex.isAmerican, american))<br />
g.add((investigation_result, ex.indictmentDuration, indictment_days))<br />
g.add((investigation_result, ex.caseSolved, cp_date))<br />
g.add((investigation_result, ex.daysBeforeCaseSolved, cp_days))<br />
g.add((investigation_result, ex.overturned, overturned))<br />
g.add((investigation_result, ex.pardoned, pardoned))<br />
<br />
g.serialize("output.ttl",format="ttl")<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, RDFS, XSD, OWL, FOAF<br />
import owlrl<br />
<br />
ex = Namespace("http://example.org/")<br />
dbr = Namespace("http://dbpedia.org/resource/")<br />
dbp = Namespace("https://dbpedia.org/property/")<br />
dbpage = Namespace("https://dbpedia.org/page/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
tl = Namespace("http://purl.org/NET/c4dm/timeline.owl#")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("dbr", dbr)<br />
g.bind("dbp", dbp)<br />
g.bind("dbpage", dbpage)<br />
g.bind("sem", sem)<br />
g.bind("tl", tl)<br />
<br />
g.parse(location="exampleTTL.ttl", format="turtle")<br />
<br />
# University of California and University of Valencia are both Universities. <br />
g.add((ex.University_of_California, RDF.type, ex.University))<br />
g.add((ex.University_of_Valencia, RDF.type, ex.University))<br />
# All universities are higher education institutions (HEIs). <br />
g.add((ex.University, RDFS.subClassOf, ex.Higher_education))<br />
# Only persons can have an expertise, and what they have expertise in is always a subject. <br />
g.add((ex.expertise, RDFS.domain, FOAF.Person))<br />
g.add((ex.expertise, RDFS.range, ex.subject))<br />
# Only persons can graduate from a HEI. <br />
g.add((ex.graduatedFromHEI, RDFS.domain, FOAF.Person))<br />
g.add((ex.graduatedFromHEI, RDFS.range, ex.Higher_education))<br />
# If you are a student, you are in fact a person as well. <br />
g.add((ex.Student, RDFS.subClassOf, FOAF.Person))<br />
# That a person is married to someone, means that they know them. <br />
g.add((ex.married, RDFS.subPropertyOf, FOAF.knows))<br />
# Finally, if a person has a name, that name is also the label of that entity."<br />
g.add((FOAF.name, RDFS.subPropertyOf, RDFS.label))<br />
<br />
# Having a degree from a HEI means that you have also graduated from that HEI. <br />
g.add((ex.degree, RDFS.subPropertyOf, ex.graduatedFromHEI))<br />
# That a city is a capital of a country means that this city is located in that country. <br />
g.add((ex.capital, RDFS.domain, ex.Country))<br />
g.add((ex.capital, RDFS.range, ex.City))<br />
# Note: ex:capital runs Country -> City, so making it a subproperty of a<br />
# "located in"-style property would infer the backwards direction. Plain RDFS<br />
# cannot express the inverse; that needs owl:inverseOf or a SPARQL rule.<br />
# That someone was involved in a meeting, means that they have met the other participants. <br />
# This question was bad for the RDFS lab because we need complex OWL or easy sparql.<br />
res = g.query("""<br />
CONSTRUCT {?person1 ex:haveMet ?person2}<br />
WHERE {<br />
?person1 ex:meeting ?Meeting .<br />
?Meeting ex:involved ?person2 .<br />
}<br />
""")<br />
for triplet in res:<br />
#we don't need to add that people have met themselves<br />
if (triplet[0] != triplet[2]):<br />
g.add(triplet)<br />
# If someone partook in a meeting somewhere, means that they have visited that place"<br />
# This question was bad for the RDFS lab for the same reason.<br />
res = g.query("""<br />
CONSTRUCT {?person ex:hasVisited ?place}<br />
WHERE {<br />
?person1 ex:meeting ?Meeting .<br />
?Meeting ex:location ?place .<br />
}<br />
""")<br />
for triplet in res:<br />
g.add(triplet)<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
g.serialize("output.ttl",format="ttl")<br />
</syntaxhighlight><br />
<br />
==OWL 1==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, RDFS, XSD, FOAF, OWL<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
# Namespaces<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
geo = Namespace("http://sws.geonames.org/")<br />
schema = Namespace("https://schema.org/")<br />
akt = Namespace("http://www.aktors.org/ontology/portal#")<br />
vcard = Namespace("http://www.w3.org/2006/vcard/ns#")<br />
<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
g.parse(location="lab8turtle.txt", format="turtle")<br />
<br />
# Cade and Emma are two different persons. <br />
g.add((ex.Cade, OWL.differentFrom, ex.Emma))<br />
# The country USA above is the same as the DBpedia resource http://dbpedia.org/resource/United_States (dbr:United_States) and the GeoNames resource http://sws.geonames.org/6252001/ (gn:6252001). <br />
g.add((ex.USA, OWL.sameAs, dbp.United_States))<br />
g.add((ex.USA, OWL.sameAs, geo["6252001"]))<br />
# The person class (the RDF type the Cade and Emma resources) in your graph is the same as FOAF's, schema.org's and AKT's person classes <br />
# (they are http://xmlns.com/foaf/0.1/Person, http://schema.org/Person, and http://www.aktors.org/ontology/portal#Person, respectively. <br />
g.add((FOAF.Person, OWL.equivalentClass, schema.Person))<br />
g.add((FOAF.Person, OWL.equivalentClass, akt.Person))<br />
# Nothing can be any two of a person, a university, or a city at the same time. <br />
# (owl:AllDifferent with owl:distinctMembers is for individuals; disjoint<br />
# classes need an owl:AllDisjointClasses node with an owl:members list.)<br />
Collection(g, ex.DisjointClassList, [FOAF.Person, ex.University, ex.City])<br />
g.add((ex.DisjointClasses, RDF.type, OWL.AllDisjointClasses))<br />
g.add((ex.DisjointClasses, OWL.members, ex.DisjointClassList))<br />
# The property you have used in your RDF/RDFS graph to represent that 94709 is the US zip code of Berkeley, California in US <br />
# is a subproperty of VCard's postal code-property (http://www.w3.org/2006/vcard/ns#postal-code). <br />
g.add((ex.postalCode, RDFS.subPropertyOf, vcard["postal-code"]))<br />
# No two US cities can have the same postal code. <br />
# We have to add a relation from city to postal code first<br />
res = g.query("""<br />
PREFIX RDF: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX ex: <http://example.org/><br />
CONSTRUCT {?usa_city ex:us_city_postal_code ?postalcode}<br />
WHERE {<br />
?address RDF:type ex:Address .<br />
?address ex:country ex:USA .<br />
?address ex:city ?usa_city .<br />
?address ex:postalCode ?postalcode<br />
}<br />
""")<br />
for triplet in res:<br />
g.add((triplet))<br />
# Now we can make us cities have distinct postal codes<br />
g.add((ex.us_city_postal_code, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.us_city_postal_code, RDF.type, OWL.InverseFunctionalProperty))<br />
g.add((ex.us_city_postal_code, RDFS.subPropertyOf, ex.postalCode))<br />
<br />
# The property you have used for Emma living in Valencia is the same property as FOAF's based-near property <br />
# (http://xmlns.com/foaf/0.1/based_near), and it is the inverse of DBpedia's hometown property (http://dbpedia.org/ontology/hometown, dbo:hometown). <br />
g.add((ex.city, OWL.equivalentProperty, FOAF.based_near))<br />
# (dbp above is the resource namespace, so spell out dbo:hometown directly)<br />
g.add((ex.city, OWL.inverseOf, URIRef("http://dbpedia.org/ontology/hometown")))<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, ex.livesWith, ex.Mary))<br />
g.add((ex.Cade, ex.sibling, ex.Andrew))<br />
g.add((ex.Cade, ex.hasFather, ex.Bob))<br />
g.add((ex.Bob, ex.fatherOf, ex.Cade))<br />
<br />
<br />
#Look through the predicates(properties) above and add new triples for each one that describes them as any of the following: <br />
# a reflexive , irreflexive, symmetric, asymmetric, transitive, functional, or an Inverse Functional Property.<br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.married, RDF.type, OWL.InverseFunctionalProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.TransitiveProperty))<br />
<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.InverseFunctionalProperty))<br />
<br />
# These three lines add inferred triples to the graph.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
<br />
g.serialize("lab8output.xml",format="xml")<br />
</syntaxhighlight><br />
<br />
==Lab 11: Semantic Lifting - HTML==<br />
<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs<br />
from rdflib import Graph, Literal, URIRef, Namespace<br />
from rdflib.namespace import RDF, SKOS, XSD<br />
import requests<br />
<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Download html from URL and parse it with BeautifulSoup.<br />
url = "https://www.semanticscholar.org/topic/Knowledge-Graph/159858"<br />
page = requests.get(url)<br />
html = bs(page.content, features="html.parser")<br />
# print(html.prettify())<br />
<br />
# Find the html that surrounds all the papers<br />
papers = html.find_all('div', attrs={'class': 'flex-container'})<br />
# Find the html that surrounds the info box<br />
topic = html.find_all(<br />
'div', attrs={'class': 'flex-item__left-column entity-header'})<br />
<br />
<br />
# Iterate through each paper to make triples:<br />
for paper in papers:<br />
# e.g selecting title.<br />
title = paper.find('div', attrs={'class': 'timeline-paper-title'}).text<br />
author = paper.find('span', attrs={'class': 'author-list'}).text<br />
paper_year = paper.find(<br />
'li', attrs={'data-selenium-selector': "paper-year"}).text<br />
corpus_ID = paper.find(<br />
'li', attrs={'data-selenium-selector': "corpus-id"}).text<br />
corpus_ID = corpus_ID.replace(" ", "_")<br />
c_id = corpus_ID.replace("Corpus_ID:_", "")<br />
<br />
article = URIRef(ex + c_id)<br />
<br />
# Adding triples<br />
g.add((article, RDF.type, ex.paper))<br />
g.add((article, ex.HasID, Literal(c_id, datatype=XSD.int)))<br />
g.add((article, ex.HasTitle, Literal(title, datatype=XSD.string)))<br />
g.add((article, ex.Publisher_year, Literal(paper_year, datatype=XSD.gYear)))<br />
<br />
author = author.split(", ")<br />
for x in author:<br />
name = x.replace(" ", "_")<br />
name = URIRef(ex + name)<br />
<br />
g.add((article, ex.hasAuthor, name))<br />
<br />
# Iterate through the info box to make triples:<br />
for items in topic:<br />
main_topic = items.find('h1', attrs={'class': 'entity-name'}).text<br />
related_topic = items.find(<br />
'div', attrs={'class': 'entity-aliases'}).text<br />
related_topic = related_topic.replace("Known as: ", "")<br />
related_topic = related_topic.replace(f'\xa0Expand', "")<br />
related_topic = related_topic.replace(" ", "")<br />
main_topic = main_topic.replace(" ", "_")<br />
<br />
main_topic = URIRef(ex + main_topic)<br />
<br />
# Use main_topic as the subject here; the article variable would still point<br />
# at the last paper from the loop above.<br />
g.add((main_topic, RDF.type, SKOS.Concept))<br />
<br />
related_topic = related_topic.split(',')<br />
<br />
for related_label in related_topic:<br />
related_uri = URIRef(ex + related_label)<br />
g.add((main_topic, SKOS.broader, related_uri))<br />
<br />
<br />
print(g.serialize(format='turtle'))<br />
</syntaxhighlight><br />
<br />
=More miscellaneous examples=<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it. <br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the expected prefixes<br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
from rdflib import Graph, Namespace<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
country = URIRef(ex + country)<br />
q = prepareQuery(<br />
"""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person WHERE { <br />
?person ex:livesIn ?country.<br />
}<br />
""")<br />
<br />
capital_result = g.query(q, initBindings={'country': country})<br />
<br />
for row in capital_result:<br />
print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that serves the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query (a CONSTRUCT that copies out all triples)<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
=Coding Tasks Lab 6=<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a name written like ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.date)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.date)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.integer)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.Actor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child AND child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using sparql.update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result:', bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        else:<br />
            actor = actor.text.replace(" ", "_")<br />
            g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
            g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
    "foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and the Parent requirement.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1795
Lab Solutions
2022-03-24T12:41:44Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
=Example lab solutions=<br />
<br />
==Getting started==<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib.collection import Collection<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
RL = Namespace('http://purl.org/vocab/relationship/')<br />
DBO = Namespace('https://dbpedia.org/ontology/')<br />
DBR = Namespace('https://dbpedia.org/page/')<br />
<br />
g.namespace_manager.bind('exampleURI', EX)<br />
g.namespace_manager.bind('relationship', RL)<br />
g.namespace_manager.bind('dbpediaOntology', DBO)<br />
g.namespace_manager.bind('dbpediaPage', DBR)<br />
<br />
g.add((EX.Cade, RDF.type, FOAF.Person)) <br />
g.add((EX.Mary, RDF.type, FOAF.Person))<br />
g.add((EX.Cade, RL.spouseOf, EX.Mary)) # a symmetrical relation from an established namespace<br />
g.add((DBR.France, DBO.capital, DBR.Paris)) <br />
g.add((EX.Cade, FOAF.age, Literal(27)))<br />
g.add((EX.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
Collection(g, EX.MaryInterests, [EX.hiking, EX.chocolate, EX.biology])<br />
g.add((EX.Mary, EX.hasInterest, EX.MaryInterests))<br />
g.add((EX.Mary, RDF.type, EX.student))<br />
g.add((DBO.capital, EX.range, EX.city))<br />
g.add((EX.Mary, RDF.type, EX.kind))<br />
g.add((EX.Cade, RDF.type, EX.kindPerson))<br />
<br />
#hobbies = ['hiking', 'chocolate', 'biology']<br />
#for i in hobbies:<br />
#    g.add((EX.Mary, FOAF.interest, EX[i]))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==RDFlib==<br />
<syntaxhighlight><br />
<br />
from rdflib.namespace import RDF, XSD, FOAF<br />
from rdflib import Graph, Namespace, Literal, BNode<br />
from rdflib.collection import Collection<br />
<br />
<br />
g = Graph()<br />
ex = Namespace('http://example.org/')<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
<br />
address = BNode()<br />
degree = BNode()<br />
<br />
# from lab 1<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.Characterostic, ex.kind))<br />
g.add((ex.Cade, ex.Characterostic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
g.add((ex.Cade, schema.address, address))<br />
<br />
# BNode address<br />
g.add((address, RDF.type, schema.PostalAdress))<br />
g.add((address, schema.streetAddress, Literal('1516 Henry Street')))<br />
g.add((address, schema.addresCity, dbp.Berkeley))<br />
g.add((address, schema.addressRegion, dbp.California))<br />
g.add((address, schema.postalCode, Literal('94709')))<br />
g.add((address, schema.addressCountry, dbp.United_States))<br />
<br />
# More info about Cade<br />
g.add((ex.Cade, ex.Degree, degree))<br />
g.add((degree, ex.Field, dbp.Biology))<br />
g.add((degree, RDF.type, dbp.Bachelors_degree))<br />
g.add((degree, ex.Universety, dbp.University_of_California))<br />
g.add((degree, ex.year, Literal('2001', datatype=XSD.gYear)))<br />
<br />
# Emma<br />
emma_degree = BNode()<br />
g.add((ex.Emma, FOAF.name, Literal("Emma Dominguez", datatype=XSD.string)))<br />
g.add((ex.Emma, RDF.type, FOAF.Person))<br />
g.add((ex.Emma, ex.Degree, emma_degree))<br />
g.add((emma_degree, ex.Field, dbp.Chemistry))<br />
g.add((emma_degree, RDF.type, dbp.Masters_degree))<br />
g.add((emma_degree, ex.Universety, dbp.University_of_Valencia))<br />
g.add((emma_degree, ex.year, Literal('2015', datatype=XSD.gYear)))<br />
<br />
# Address<br />
emma_address = BNode()<br />
g.add((ex.Emma, schema.address, emma_address))<br />
g.add((emma_address, RDF.type, schema.PostalAdress))<br />
g.add((emma_address, schema.streetAddress,<br />
       Literal('Carrer de la Guardia Civil 20')))<br />
g.add((emma_address, schema.addressRegion, dbp.Valencia))<br />
g.add((emma_address, schema.postalCode, Literal('46020')))<br />
g.add((emma_address, schema.addressCountry, dbp.Spain))<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
           [dbp.Portugal, dbp.Italy, dbp.France, dbp.Germany, dbp.Denmark, dbp.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL - Blazegraph==<br />
<syntaxhighlight><br />
PREFIX ex: <http://example.org/> <br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/> <br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> <br />
PREFIX xml: <http://www.w3.org/XML/1998/namespace> <br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> <br />
<br />
<br />
#select all triples in the graph<br />
SELECT ?s ?p ?o<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
#select the interests of Cade<br />
SELECT ?cadeInterest<br />
WHERE {<br />
ex:Cade ex:interest ?cadeInterest .<br />
} <br />
#select the country and city where Emma lives<br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
#select the people who are over 26 years old<br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
} <br />
#select people who graduated with Bachelor<br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
<br />
} <br />
# delete cades photography interest<br />
DELETE DATA<br />
{<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
<br />
# delete and insert university of valencia<br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
<br />
#check if the deletion worked<br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
} <br />
#describe sergio<br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
</syntaxhighlight><br />
<br />
==SPARQL - RDFlib==<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON, POST, GET, TURTLE<br />
<br />
namespace = "lab4"<br />
sparql = SPARQLWrapper("http://10.111.21.183:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
# Print out Cades interests<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
<br />
# Print Emmas city and country<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("Emma's city is "+result["emmaCity"]["value"]+" and Emma's country is " + result["emmaCountry"]["value"])<br />
<br />
#Select the people who are over 26 years old<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("All people who are over 26 years old: "+result["person"]["value"])<br />
<br />
#Select people who graduated with Bachelor<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("People who graduated with Bachelor: "+result["person"]["value"])<br />
<br />
#Delete cades photography interest<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE DATA {<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Print out Cades interests again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
<br />
# Check university names<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["o2"]["value"])<br />
<br />
<br />
#Delete and insert university of valencia<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Check university names again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["o2"]["value"])<br />
<br />
#Insert Sergio<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#><br />
INSERT DATA {<br />
ex:Sergio a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valenciay ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46021"^^xsd:string ;<br />
ex:state ex:California ;<br />
ex:street "4_Carrer_del_Serpis"^^xsd:string ] ;<br />
ex:degree [ ex:degreeField ex:Computer_science ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2008"^^xsd:gYear ] ;<br />
ex:expertise ex:Big_data,<br />
ex:Semantic_technologies,<br />
ex:Machine_learning;<br />
foaf:name "Sergio_Pastor"^^xsd:string .<br />
}<br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
sparql.setMethod(GET)<br />
<br />
# Describe Sergio<br />
sparql.setReturnFormat(TURTLE)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
""")<br />
results = sparql.query().convert()<br />
print(results.decode())  # SPARQLWrapper returns Turtle results as raw bytes<br />
<br />
# Construct that any city is in the country in an address<br />
sparql.setQuery("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />
PREFIX ex: <http://example.org/><br />
CONSTRUCT {?city ex:locatedIn ?country}<br />
Where {<br />
?s rdf:type ex:Address .<br />
?s ex:city ?city .<br />
?s ex:country ?country.<br />
}<br />
""")<br />
sparql.setReturnFormat(TURTLE)<br />
results = sparql.query().convert()<br />
print(results.decode())  # SPARQLWrapper returns Turtle results as raw bytes<br />
<br />
</syntaxhighlight><br />
==Web APIs and JSON-LD==<br />
<br />
<syntaxhighlight><br />
import requests<br />
from rdflib import FOAF, Namespace, Literal, RDF, Graph<br />
<br />
r = requests.get('http://api.open-notify.org/astros.json').json()<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
g.bind("ex", EX)<br />
<br />
for item in r['people']:<br />
craft = item['craft'].replace(" ","_")<br />
person = item['name'].replace(" ","_")<br />
g.add((EX[person], EX.onCraft, EX[craft]))<br />
g.add((EX[person], RDF.type, FOAF.Person))<br />
g.add((EX[person], FOAF.name, Literal(item['name'])))<br />
g.add((EX[craft], FOAF.name, Literal(item['craft'])))<br />
res = g.query("""<br />
PREFIX ex: <http://example.org/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
CONSTRUCT {?person1 foaf:knows ?person2}<br />
WHERE {<br />
?person1 ex:onCraft ?craft .<br />
?person2 ex:onCraft ?craft .<br />
}<br />
""")<br />
<br />
for triplet in res:<br />
# (we don't need to add that they know themselves)<br />
if (triplet[0] != triplet[2]):<br />
g.add((triplet))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==Semantic lifting - CSV==<br />
<syntaxhighlight><br />
import pandas as pd<br />
from rdflib import Graph, Namespace, URIRef, Literal<br />
from rdflib.namespace import RDF, XSD<br />
<br />
ex = Namespace("http://example.org/")<br />
dbr = Namespace("http://dbpedia.org/resource/")<br />
dbp = Namespace("https://dbpedia.org/property/")<br />
dbpage = Namespace("https://dbpedia.org/page/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
tl = Namespace("http://purl.org/NET/c4dm/timeline.owl#")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("dbr", dbr)<br />
g.bind("dbp", dbp)<br />
g.bind("dbpage", dbpage)<br />
g.bind("sem", sem)<br />
g.bind("tl", tl)<br />
<br />
df = pd.read_csv("russia-investigations.csv")<br />
# We need to correct the type of the columns in the DataFrame, as Pandas assigns an incorrect type when it reads the file (for me at least). We use .astype("str") to convert the content of the columns to a string.<br />
df["name"] = df["name"].astype("str")<br />
df["type"] = df["type"].astype("str")<br />
<br />
# iterrows creates an iterable object (list of rows)<br />
for index, row in df.iterrows():<br />
investigation = URIRef(ex + row['investigation'])<br />
investigation_start = Literal(row['investigation-start'], datatype=XSD.date)<br />
investigation_end = Literal(row['investigation-end'], datatype=XSD.date)<br />
investigation_days = Literal(row['investigation-days'], datatype=XSD.integer)<br />
<br />
name = Literal(row['name'], datatype=XSD.string)<br />
name_underscore = URIRef(dbpage + row['name'].replace(" ","_"))<br />
investigation_result = URIRef(ex + row['investigation']+ "_investigation_" + row['name'].replace(" ","_"))<br />
indictment_days = Literal(row['indictment-days'], datatype=XSD.integer)<br />
type = URIRef(dbr + row['type'].replace(" ","_"))<br />
cp_date = Literal(row['cp-date'], datatype=XSD.date)<br />
cp_days = Literal(row['cp-days'], datatype=XSD.duration)<br />
overturned = Literal(row['overturned'], datatype=XSD.boolean)<br />
pardoned = Literal(row['pardoned'], datatype=XSD.boolean)<br />
american = Literal(row['american'], datatype=XSD.boolean)<br />
president = Literal(row['president'], datatype=XSD.string)<br />
president_underscore = URIRef(dbr + row['president'].replace(" ","_"))<br />
<br />
g.add((investigation, RDF.type, sem.Event))<br />
g.add((investigation, sem.hasBeginTimeStamp, investigation_start))<br />
g.add((investigation, sem.hasEndTimeStamp, investigation_end))<br />
g.add((investigation, tl.duration, investigation_days))<br />
g.add((investigation, dbp.president, president_underscore))<br />
g.add((investigation, sem.hasSubEvent, investigation_result))<br />
<br />
g.add((investigation_result, ex.resultType, type))<br />
g.add((investigation_result, ex.objectOfInvestigation, name_underscore))<br />
g.add((investigation_result, ex.isAmerican, american))<br />
g.add((investigation_result, ex.indictmentDuration, indictment_days))<br />
g.add((investigation_result, ex.caseSolved, cp_date))<br />
g.add((investigation_result, ex.daysBeforeCaseSolved, cp_days))<br />
g.add((investigation_result, ex.overturned, overturned))<br />
g.add((investigation_result, ex.pardoned, pardoned))<br />
<br />
g.serialize("output.ttl",format="ttl")<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, RDFS, FOAF, XSD, OWL<br />
import owlrl<br />
<br />
ex = Namespace("http://example.org/")<br />
dbr = Namespace("http://dbpedia.org/resource/")<br />
dbp = Namespace("https://dbpedia.org/property/")<br />
dbpage = Namespace("https://dbpedia.org/page/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
tl = Namespace("http://purl.org/NET/c4dm/timeline.owl#")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("dbr", dbr)<br />
g.bind("dbp", dbp)<br />
g.bind("dbpage", dbpage)<br />
g.bind("sem", sem)<br />
g.bind("tl", tl)<br />
<br />
g.parse(location="exampleTTL.ttl", format="turtle")<br />
<br />
# University of California and University of Valencia are both Universities. <br />
g.add((ex.University_of_California, RDF.type, ex.University))<br />
g.add((ex.University_of_Valencia, RDF.type, ex.University))<br />
# All universities are higher education institutions (HEIs). <br />
g.add((ex.University, RDFS.subClassOf, ex.Higher_education))<br />
# Only persons can have an expertise, and what they have expertise in is always a subject. <br />
g.add((ex.expertise, RDFS.domain, FOAF.Person))<br />
g.add((ex.expertise, RDFS.range, ex.subject))<br />
# Only persons can graduate from a HEI. <br />
g.add((ex.graduatedFromHEI, RDFS.domain, FOAF.Person))<br />
g.add((ex.graduatedFromHEI, RDFS.range, ex.Higher_education))<br />
# If you are a student, you are in fact a person as well. <br />
g.add((ex.Student, RDFS.subClassOf, FOAF.Person))<br />
# That a person is married to someone, means that they know them. <br />
g.add((ex.married, RDFS.subPropertyOf, FOAF.knows))<br />
# Finally, if a person has a name, that name is also the label of that entity.<br />
g.add((FOAF.name, RDFS.subPropertyOf, RDFS.label))<br />
<br />
# Having a degree from a HEI means that you have also graduated from that HEI. <br />
g.add((ex.degree, RDFS.subPropertyOf, ex.graduatedFromHEI))<br />
# That a city is a capital of a country means that this city is located in that country. <br />
g.add((ex.capital, RDFS.domain, ex.Country))<br />
g.add((ex.capital, RDFS.range, ex.City))<br />
g.add((ex.capital, RDFS.subPropertyOf, ex.hasLocation))<br />
# That someone was involved in a meeting, means that they have met the other participants. <br />
# This rule cannot be expressed in RDFS alone (it needs more expressive OWL constructs or a simple SPARQL query), so we use SPARQL here.<br />
res = g.query("""<br />
CONSTRUCT {?person1 ex:haveMet ?person2}<br />
WHERE {<br />
?person1 ex:meeting ?Meeting .<br />
?Meeting ex:involved ?person2 .<br />
}<br />
""")<br />
for triplet in res:<br />
#we don't need to add that people have met themselves<br />
if (triplet[0] != triplet[2]):<br />
g.add((triplet))<br />
# If someone partook in a meeting somewhere, they have also visited that place.<br />
# This rule also goes beyond RDFS, so we again use SPARQL.<br />
res = g.query("""<br />
CONSTRUCT {?person ex:hasVisited ?place}<br />
WHERE {<br />
?person ex:meeting ?Meeting .<br />
?Meeting ex:location ?place .<br />
}<br />
""")<br />
for triplet in res:<br />
g.add((triplet))<br />
<br />
rdfs = owlrl.OWLRL.OWLRL_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
g.serialize("output.ttl",format="ttl")<br />
</syntaxhighlight><br />
<br />
=More miscellaneous examples=<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be easily readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, we generally want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the address can make more sense as resources than as Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping of the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address. <br />
# Solution 4 or 5 is how I would recommend modelling addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
import rdflib.plugins.sparql<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
country = URIRef(ex + country)<br />
q = prepareQuery(<br />
"""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person WHERE { <br />
?person ex:livesIn ?country.<br />
}<br />
""")<br />
<br />
capital_result = g.query(q, initBindings={'country': country})<br />
<br />
for row in capital_result:<br />
print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that hosts the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data in Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In this query I select everything that is a subclass (wdt:P279) of vitamin (wd:Q34956) in Wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL update<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
(len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
# If we want, we can add additional RDF/RDFS/OWL information, e.g.:<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
=Coding Tasks Lab 6=<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a nickname in quotation marks, like ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
name = row['investigation']<br />
investigation = URIRef(ex + name)<br />
g.add((investigation, RDF.type, sem.Event))<br />
investigation_start = row["investigation-start"]<br />
g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
investigation_start, datatype=XSD.date)))<br />
investigation_end = row["investigation-end"]<br />
g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
investigation_end, datatype=XSD.date)))<br />
investigation_days = row["investigation-days"]<br />
g.add((investigation, sem.hasXSDDuration, Literal(<br />
investigation_days, datatype=XSD.integer)))<br />
person = row["name"]<br />
person = URIRef(ex + person)<br />
g.add((investigation, sem.Actor, person))<br />
result = row['type']<br />
g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
overturned = row["overturned"]<br />
g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
pardoned = row["pardoned"]<br />
g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
<br />
</syntaxhighlight><br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply mother livesWith father, which makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + show_id)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using sparql.update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX ex: <http://example.org#><br />
INSERT DATA {<br />
    ex:Socrates rdf:type ex:Man .<br />
    ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDF_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: ' + str(bool(b)))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again, as in Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-Dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        actor = actor.text.replace(" ", "_")<br />
        g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
        g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
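The returned JSON can then be navigated with ordinary dictionary and list indexing. The sample below is a hypothetical, trimmed-down version of a GeoNames response (the real service returns more fields):<br />
<br />

```python
import json

# Hypothetical, trimmed-down GeoNames postalCodeLookupJSON response.
sample = """
{"postalcodes": [{"postalcode": "46020",
                  "placeName": "Valencia",
                  "countryCode": "ES"}]}
"""

data = json.loads(sample)
for entry in data["postalcodes"]:
    print(entry["placeName"], entry["countryCode"])  # Valencia ES
```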
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...an OWL reasoner can infer that Bob is a WorriedParent, even though we did not state it ourselves, because Bob fulfills both the restriction and the Parent requirement.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab:_Semantic_Lifting_-_XML&diff=1787
Lab: Semantic Lifting - XML
2022-03-17T12:10:50Z
<p>Tkr048: </p>
<hr />
<div><br />
=Lab 9: Semantic Lifting - XML=<br />
<br />
==Topics==<br />
The first task for today is to finish [https://wiki.uib.no/info216/index.php/Lab:_Semantic_Lifting_-_CSV the CSV lab]: both parsing and lifting the file, and implementing DBpedia Spotlight on at least one of the columns.<br />
<br />
If you have completed that, you can start working with lifting XML data in the task below.<br />
XML stands for Extensible Markup Language and is still widely used for data storage/transfer, especially by websites.<br />
<br />
XML has a tree structure similar to HTML, consisting of a root element, child and parent elements, attributes, and so on.<br />
The goal is for you to learn an example of how we can convert non-semantic data into RDF.<br />
<br />
<br />
==Relevant Libraries/Functions==<br />
<br />
import requests<br />
<br />
import xml.etree.ElementTree as ET<br />
<br />
* ET.ElementTree()<br />
* ET.parse('xmlfile.xml')<br />
* ET.fromstring("XML_data_as_string")<br />
<br />
All parts of the XML tree are considered '''Elements'''.<br />
<br />
* ElementTree.getroot()<br />
* Element.findall("path_in_tree")<br />
* Element.find("name_of_tag")<br />
* Element.text<br />
* Element.attrib["name_of_attribute"]<br />
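As a minimal illustration of these methods, applied to a small made-up document:<br />
<br />

```python
import xml.etree.ElementTree as ET

# A minimal, made-up XML document to try the methods on.
xml_data = """
<data>
    <tv_show id="1050">
        <title>The_Sopranos</title>
    </tv_show>
</data>
"""

root = ET.fromstring(xml_data)           # parses straight to the root Element
for show in root.findall("tv_show"):     # all direct <tv_show> children
    print(show.attrib["id"])             # attribute lookup: 1050
    print(show.find("title").text)       # tag lookup + text: The_Sopranos
```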
<br />
<br />
<br />
==Tasks==<br />
<br />
'''Task 1'''<br />
<br />
'''Lift the XML data from http://feeds.bbci.co.uk/news/rss.xml about news articles by BBC_News into RDF triples.'''<br />
<br />
You can look at the actual XML structure of the data by pressing Ctrl+U when you have opened the link in your browser.<br />
<br />
The actual data about the news articles is stored under the <item></item> tags.<br />
<br />
For instance, a triple could be of the form: news_article_id - hasTitle - titleValue.<br />
<br />
Do this by parsing the XML using ElementTree (see import above).<br />
<br />
I recommend starting with the code at the bottom of the page and continuing from it. This code retrieves the XML with an HTTP request and saves it to an XML file, so that you can view and parse it easily.<br />
<br />
You can use this regex (string matcher) to get only the IDs from the full URL in the <guid> data:<br />
<syntaxhighlight><br />
news_id = re.findall(r'\d+$', news_id)[0]<br />
</syntaxhighlight><br />
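For example, applied to a hypothetical <guid> value from the feed:<br />
<br />

```python
import re

# Hypothetical <guid> value; only the trailing digits are used as a local ID.
news_id = "https://www.bbc.co.uk/news/world-europe-60656186"
news_id = re.findall(r'\d+$', news_id)[0]
print(news_id)  # 60656186
```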
<br />
<br />
'''Task 2'''<br />
<br />
Parse the fictional XML data below and add the correct journalists as the writers of the news articles from earlier.<br />
This means that if, e.g., a news article was written on a Tuesday, Thomas Smith is the one who wrote it.<br />
One way to do this is to check whether any of the days in the "whenWriting" attribute is contained in the news article's "pubDate".<br />
<br />
<syntaxhighlight><br />
<data><br />
<news_publisher name="BBC News"><br />
<journalist whenWriting="Mon, Tue, Wed" ><br />
<firstname>Thomas</firstname><br />
<lastname>Smith</lastname><br />
</journalist><br />
<journalist whenWriting="Thu, Fri" ><br />
<firstname>Joseph</firstname><br />
<lastname>Olson</lastname><br />
</journalist><br />
<journalist whenWriting="Sat, Sun" ><br />
<firstname>Sophia</firstname><br />
<lastname>Cruise</lastname><br />
</journalist><br />
</news_publisher><br />
</data><br />
</syntaxhighlight><br />
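One possible sketch of the day-matching logic, using only the standard library (the journalist_for helper and the inline copy of the XML are assumptions for illustration, not part of the lab hand-out):<br />
<br />

```python
import xml.etree.ElementTree as ET

# Inline copy of the fictional journalist data above.
journalist_xml = """
<data>
    <news_publisher name="BBC News">
        <journalist whenWriting="Mon, Tue, Wed">
            <firstname>Thomas</firstname>
            <lastname>Smith</lastname>
        </journalist>
        <journalist whenWriting="Thu, Fri">
            <firstname>Joseph</firstname>
            <lastname>Olson</lastname>
        </journalist>
        <journalist whenWriting="Sat, Sun">
            <firstname>Sophia</firstname>
            <lastname>Cruise</lastname>
        </journalist>
    </news_publisher>
</data>
"""

def journalist_for(pub_date, root):
    """Return Firstname_Lastname of the journalist whose writing days
    match the article's pubDate (e.g. 'Tue, 15 Mar 2022 12:00:00 GMT')."""
    for journalist in root.findall("news_publisher/journalist"):
        days = [d.strip() for d in journalist.attrib["whenWriting"].split(",")]
        if any(day in pub_date for day in days):
            first = journalist.find("firstname").text
            last = journalist.find("lastname").text
            return first + "_" + last
    return None

root = ET.fromstring(journalist_xml)
print(journalist_for("Tue, 15 Mar 2022 12:00:00 GMT", root))  # Thomas_Smith
```

The returned name can then be used to build a URIRef and add a writer triple for each article.<br />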
<br />
<br />
==If You have more Time==<br />
Extend the graph using the PROV vocabulary to describe Agents and Entities.<br />
For instance, we want to say that the news articles originate from BBC,<br />
and that the journalists act on behalf of BBC.<br />
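The result could look something like the following Turtle (ex:BBC_News, ex:Thomas_Smith and ex:news_60656186 are hypothetical resource names):<br />
<syntaxhighlight><br />
@prefix ex: <http://example.org/> .<br />
@prefix prov: <http://www.w3.org/ns/prov#> .<br />
<br />
ex:BBC_News a prov:Agent, prov:Organization .<br />
<br />
ex:Thomas_Smith a prov:Agent, prov:Person ;<br />
    prov:actedOnBehalfOf ex:BBC_News .<br />
<br />
ex:news_60656186 a prov:Entity ;<br />
    prov:wasAttributedTo ex:Thomas_Smith .<br />
</syntaxhighlight><br />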
<br />
<br />
==Code to Get Started==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD<br />
import xml.etree.ElementTree as ET<br />
import requests<br />
import re<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
<br />
# URL of xml data<br />
url = 'http://feeds.bbci.co.uk/news/rss.xml'<br />
<br />
# Retrieve the xml data from the web-url.<br />
resp = requests.get(url)<br />
<br />
# Creating an ElementTree from the response content<br />
tree = ET.ElementTree(ET.fromstring(resp.content))<br />
<br />
# Or saving the xml data to a .xml file and creating a tree from this<br />
with open('news.xml', 'wb') as f:<br />
    f.write(resp.content)<br />
<br />
</syntaxhighlight><br />
<br />
<br />
{| role="presentation" class="wikitable mw-collapsible mw-collapsed"<br />
| <strong>Hints</strong><br />
|-<br />
| <br />
|}<br />
<br />
<br />
==Useful Reading==<br />
* [https://www.geeksforgeeks.org/xml-parsing-python/ XML-parsing-python by geeksforgeeks.org]<br />
* [https://www.w3schools.com/xml/xml_whatis.asp XML information by w3schools.com]<br />
* [https://www.w3.org/TR/prov-o/#description PROV vocabulary]</div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab:_OWL_1&diff=1786
Lab: OWL 1
2022-03-16T08:45:52Z
<p>Tkr048: </p>
<hr />
<div>=Lab 8: OWL 1 ("RDFS Plus / Basic OWL")=<br />
<br />
==Topics==<br />
Basic OWL ontology programming with RDFlib and owlrl.<br />
<br />
WebVOWL visualisation.<br />
<br />
RDF and RDFS might be relevant too.<br />
<br />
==Classes/Vocabularies==<br />
<br />
Vocabulary:<br />
* OWL (sameAs, equivalentClass, equivalentProperty, differentFrom, disjointWith, inverseOf)<br />
* OWL (SymmetricProperty, AsymmetricProperty, ReflexiveProperty, IrreflexiveProperty, TransitiveProperty, FunctionalProperty, InverseFunctionalProperty, AllDifferent)<br />
<br />
==Tasks==<br />
'''Task 1'''<br />
<br />
'''Write OWL triples that correspond to the following text.''' If you can, try to build on your example from labs 2 and 7, or extend the triples at the bottom of the page. OWL can be imported from rdflib.namespace.<br />
<br />
Cade and Emma are two different persons. <!-- All the countries mentioned above are different. --> The country USA above is<br />
the same as the DBpedia resource http://dbpedia.org/resource/United_States (dbr:United_States) and the GeoNames<br />
resource http://sws.geonames.org/6252001/ (gn:6252001). The person class (the RDF type of the Cade and Emma resources)<br />
in your graph is the same as FOAF's, schema.org's and AKT's person classes (they are http://xmlns.com/foaf/0.1/Person,<br />
http://schema.org/Person, and http://www.aktors.org/ontology/portal#Person, respectively). Nothing can be any two of a person, a university, or a city at the same time. The property<br />
you have used in your RDF/RDFS graph to represent that 94709 is the US zip code of Berkeley, California is<br />
a subproperty of VCard's postal code-property (http://www.w3.org/2006/vcard/ns#postal-code). No two US cities can<br />
have the same postal code. The property you have used for Emma living in Valencia is the same property as FOAF's<br />
based near-property (http://xmlns.com/foaf/0.1/based_near), and it is the inverse of DBpedia's hometown property<br />
(http://dbpedia.org/ontology/hometown, dbo:hometown). (This is not completely precise, but "hometown" is perhaps the<br />
inverse of a subproperty of "based near".)<br />
<br />
<br />
1. Cade and Emma are two different persons. <br />
<br />
2. The resource for USA is the same as the DBpedia resource http://dbpedia.org/resource/United_States (dbr:United_States) and the GeoNames resource http://sws.geonames.org/6252001/ (gn:6252001).<br />
<br />
3. The Person class is the same as FOAF's, schema.org's, and AKT's person classes.<br />
<br />
4. Nothing can be any two of a person, a university, or a city at the same time.<br />
<br />
5. The property you have used for the postal code is a subproperty of VCard's postal code-property. No two US cities can have the same postal code.<br />
<br />
6. The property you have used for living is the same as FOAF's based_near property, and is the inverse of DBpedia's hometown property.<br />
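As a starting point, the first two statements could be expressed like this in Turtle (the ex: names are the hypothetical ones from the earlier labs):<br />
<syntaxhighlight><br />
@prefix ex: <http://example.org/> .<br />
@prefix owl: <http://www.w3.org/2002/07/owl#> .<br />
@prefix dbr: <http://dbpedia.org/resource/> .<br />
<br />
ex:Cade owl:differentFrom ex:Emma .<br />
ex:USA owl:sameAs dbr:United_States ,<br />
    <http://sws.geonames.org/6252001/> .<br />
</syntaxhighlight><br />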
<br />
<br />
<br />
'''Task 2'''<br />
<syntaxhighlight><br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, ex.livesWith, ex.Mary))<br />
g.add((ex.Cade, ex.sibling, ex.Andrew))<br />
g.add((ex.Cade, ex.hasFather, ex.Bob))<br />
g.add((ex.Bob, ex.fatherOf, ex.Cade))<br />
</syntaxhighlight><br />
<br />
Look through the predicates (properties) above and add new triples for each one that describe it as any of the following: a reflexive, irreflexive, symmetric, asymmetric, transitive, functional, or inverse-functional property.<br />
For example:<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
</syntaxhighlight><br />
<br />
'''Task 3'''<br />
<br />
Print/Serialize the ontology. Then use owlrl like seen below to infer additional triples. Can you spot the many inferences?<br />
<br />
<syntaxhighlight><br />
# These three lines add inferred triples to the graph.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Finally, write the ontology to an XML file, and try to visualise it using https://service.tib.eu/webvowl/. WebVOWL is oriented towards visualising classes and their properties, so the individuals may not show.<br />
<br />
<br />
==Useful Reading==<br />
* [https://wiki.uib.no/info216/index.php/File:S06-OWL-1.pdf Lecture Notes]<br />
<br />
* [https://wiki.uib.no/info216/index.php/Python_Examples#RDFS_Plus_.2F_OWL_inference_with_RDFLib Example page]<br />
<br />
==Triples you can extend for the tasks==<br />
'''Python'''<br />
<syntaxhighlight><br />
<br />
import owlrl<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, RDFS, XSD, FOAF, OWL<br />
<br />
g = Graph()<br />
<br />
# Namespaces<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
geo = Namespace("http://sws.geonames.org/")<br />
schema = Namespace("https://schema.org/")<br />
akt = Namespace("http://www.aktors.org/ontology/portal#")<br />
vcard = Namespace("http://www.w3.org/2006/vcard/ns#")<br />
<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
<br />
# RDFS Tasks from last time.<br />
g.add((ex.Cade, ex.degreeFrom, ex.University_of_California))<br />
g.add((ex.Emma, ex.degreeFrom, ex.University_of_Valencia))<br />
g.add((ex.Cade, ex.degreeSubject, ex.Biology))<br />
g.add((ex.Emma, ex.degreeSubject, ex.Chemistry))<br />
g.add((ex.University_of_California, RDF.type, ex.University))<br />
g.add((ex.University_of_Valencia, RDF.type, ex.University))<br />
g.add((ex.University, RDFS.subClassOf, ex.Higher_Education_Institution))<br />
g.add((ex.expertise, RDFS.range, ex.Subject))<br />
g.add((ex.expertise, RDFS.domain, FOAF.Person))<br />
g.add((ex.degreeSubject, RDFS.subPropertyOf, ex.expertise))<br />
g.add((ex.graduated, RDFS.range, ex.Higher_Education_Institution))<br />
g.add((ex.graduated, RDFS.domain, FOAF.Person))<br />
g.add((ex.degreeFrom, RDFS.subPropertyOf, ex.graduated))<br />
g.add((ex.Biology, RDFS.label, Literal("Biology", lang="en")))<br />
g.add((ex.Biology, RDFS.label, Literal("La Biologie", lang="fr")))<br />
g.add((ex.Biology, RDFS.comment, Literal("Biology is a natural science concerned with the study of life and living organisms, including their structure, function, growth, evolution, distribution, identification and taxonomy.")))<br />
g.add((ex.Chemistry, RDFS.label, Literal("Chemistry", lang="en")))<br />
g.add((ex.Chemistry, RDFS.label, Literal("La Chimie", lang="fr")))<br />
g.add((ex.Chemistry, RDFS.comment, Literal("Chemistry is a branch of physical science that studies the composition, structure, properties and change of matter.", lang="en")))<br />
<br />
# Write OWL triples here<br />
</syntaxhighlight><br />
'''Turtle'''<br />
<syntaxhighlight><br />
@prefix ex: <http://example.org/> .<br />
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .<br />
@prefix foaf: <http://xmlns.com/foaf/0.1/> .<br />
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .<br />
<br />
ex:Cade ex:degreeFrom ex:University_of_California ;<br />
ex:degreeSubject ex:Biology .<br />
<br />
ex:Emma ex:degreeFrom ex:University_of_Valencia ;<br />
ex:degreeSubject ex:Chemistry .<br />
<br />
ex:degreeFrom rdfs:subPropertyOf ex:graduated .<br />
<br />
ex:degreeSubject rdfs:subPropertyOf ex:expertise .<br />
<br />
ex:Biology rdfs:label "Biology"@en,<br />
"La Biologie"@fr ;<br />
rdfs:comment "Biology is a natural science concerned with the study of life and living organisms, including their structure, function, growth, evolution, distribution, identification and taxonomy." .<br />
<br />
ex:Chemistry rdfs:label "Chemistry"@en,<br />
"La Chimie"@fr ;<br />
rdfs:comment "Chemistry is a branch of physical science that studies the composition, structure, properties and change of matter."@en .<br />
<br />
ex:University_of_California a ex:University .<br />
<br />
ex:University_of_Valencia a ex:University .<br />
<br />
ex:expertise rdfs:domain <http://xmlns.com/foaf/0.1/Person> ;<br />
rdfs:range ex:Subject .<br />
<br />
ex:graduated rdfs:domain <http://xmlns.com/foaf/0.1/Person> ;<br />
rdfs:range ex:Higher_Education_Institution .<br />
<br />
ex:University rdfs:subClassOf ex:Higher_Education_Institution .<br />
<br />
ex:Cade a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Berkeley ;<br />
ex:country ex:USA ;<br />
ex:postalCode "94709"^^xsd:string ;<br />
ex:state ex:California ;<br />
ex:street "1516_Henry_Street"^^xsd:string ] ;<br />
ex:age 27 ;<br />
ex:characteristic ex:Kind ;<br />
ex:degree [ ex:degreeField ex:Biology ;<br />
ex:degreeLevel "Bachelor"^^xsd:string ;<br />
ex:degreeSource ex:University_of_California ;<br />
ex:year "2011-01-01"^^xsd:gYear ] ;<br />
ex:interest ex:Bird,<br />
ex:Ecology,<br />
ex:Environmentalism,<br />
ex:Photography,<br />
ex:Travelling ;<br />
ex:married ex:Mary ;<br />
ex:meeting ex:Meeting1 ;<br />
ex:visit ex:Canada,<br />
ex:France,<br />
ex:Germany ;<br />
foaf:knows ex:Emma ;<br />
ex:name "Cade_Tracey"^^xsd:string .<br />
<br />
ex:Mary a ex:Student;<br />
ex:age 26 ;<br />
ex:characteristic ex:Kind ;<br />
ex:interest ex:Biology,<br />
ex:Chocolate,<br />
ex:Hiking .<br />
<br />
ex:Emma a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valencia ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46020"^^xsd:string ;<br />
ex:street "Carrer_de_la_Guardia_Civil_20"^^xsd:string ] ;<br />
ex:age 26 ;<br />
ex:degree [ ex:degreeField ex:Chemistry ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2015-01-01"^^xsd:gYear ] ;<br />
ex:expertise ex:Air_Pollution,<br />
ex:Toxic_Waste,<br />
ex:Waste_Management,<br />
ex:Bananas ;<br />
ex:interest ex:Bike_Riding,<br />
ex:Music,<br />
ex:Travelling ;<br />
ex:meeting ex:Meeting1 ;<br />
ex:visit ( ex:Portugal ex:Italy ex:France ex:Germany ex:Denmark ex:Sweden ) ;<br />
ex:name "Emma_Dominguez"^^xsd:string .<br />
<br />
ex:Meeting1 a ex:Meeting ;<br />
ex:date "August, 2014"^^xsd:string ;<br />
ex:involved ex:Cade,<br />
ex:Emma ;<br />
ex:location ex:Paris .<br />
<br />
ex:Paris a ex:City ;<br />
ex:capitalOf ex:France ;<br />
ex:locatedIn ex:France .<br />
<br />
ex:France ex:capital ex:Paris .<br />
</syntaxhighlight></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1761
Lab Solutions
2022-03-10T13:16:55Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
==Getting started==<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib.collection import Collection<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
EX = Namespace('http://example.org/')<br />
RL = Namespace('http://purl.org/vocab/relationship/')<br />
DBO = Namespace('http://dbpedia.org/ontology/')<br />
DBR = Namespace('http://dbpedia.org/resource/')<br />
<br />
g.namespace_manager.bind('exampleURI', EX)<br />
g.namespace_manager.bind('relationship', RL)<br />
g.namespace_manager.bind('dbpediaOntology', DBO)<br />
g.namespace_manager.bind('dbpediaPage', DBR)<br />
<br />
g.add((EX.Cade, RDF.type, FOAF.Person)) <br />
g.add((EX.Mary, RDF.type, FOAF.Person))<br />
g.add((EX.Cade, RL.spouseOf, EX.Mary)) # a symmetrical relation from an established namespace<br />
g.add((DBR.France, DBO.capital, DBR.Paris)) <br />
g.add((EX.Cade, FOAF.age, Literal(27)))<br />
g.add((EX.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
Collection(g, EX.MaryInterests, [EX.hiking, EX.chocolate, EX.biology])<br />
g.add((EX.Mary, EX.hasInterest, EX.MaryInterests))<br />
g.add((EX.Mary, RDF.type, EX.student))<br />
g.add((DBO.capital, EX.range, EX.city))<br />
g.add((EX.Mary, RDF.type, EX.kind))<br />
g.add((EX.Cade, RDF.type, EX.kindPerson))<br />
<br />
#hobbies = ['hiking', 'choclate', 'biology']<br />
#for i in hobbies:<br />
# g.add((EX.Mary, FOAF.interest, EX[i]))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==RDFlib==<br />
<syntaxhighlight><br />
<br />
from rdflib.namespace import RDF, XSD, FOAF<br />
from rdflib import Graph, Namespace, Literal, BNode<br />
from rdflib.collection import Collection<br />
<br />
<br />
g = Graph()<br />
ex = Namespace('http://example.org/')<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
<br />
address = BNode()<br />
degree = BNode()<br />
<br />
# from lab 1<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
g.add((ex.Cade, schema.address, address))<br />
<br />
# BNode address<br />
g.add((address, RDF.type, schema.PostalAddress))<br />
g.add((address, schema.streetAddress, Literal('1516 Henry Street')))<br />
g.add((address, schema.addressLocality, dbp.Berkeley))<br />
g.add((address, schema.addressRegion, dbp.California))<br />
g.add((address, schema.postalCode, Literal('94709')))<br />
g.add((address, schema.addressCountry, dbp.United_States))<br />
<br />
# More info about Cade<br />
g.add((ex.Cade, ex.Degree, degree))<br />
g.add((degree, ex.Field, dbp.Biology))<br />
g.add((degree, RDF.type, dbp.Bachelors_degree))<br />
g.add((degree, ex.University, dbp.University_of_California))<br />
g.add((degree, ex.year, Literal('2001', datatype=XSD.gYear)))<br />
<br />
# Emma<br />
emma_degree = BNode()<br />
g.add((ex.Emma, FOAF.name, Literal("Emma Dominguez", datatype=XSD.string)))<br />
g.add((ex.Emma, RDF.type, FOAF.Person))<br />
g.add((ex.Emma, ex.Degree, emma_degree))<br />
g.add((emma_degree, ex.Field, dbp.Chemistry))<br />
g.add((emma_degree, RDF.type, dbp.Masters_degree))<br />
g.add((emma_degree, ex.University, dbp.University_of_Valencia))<br />
g.add((emma_degree, ex.year, Literal('2015', datatype=XSD.gYear)))<br />
<br />
# Address<br />
emma_address = BNode()<br />
g.add((ex.Emma, schema.address, emma_address))<br />
g.add((emma_address, RDF.type, schema.PostalAddress))<br />
g.add((emma_address, schema.streetAddress,<br />
       Literal('Carrer de la Guardia Civil 20')))<br />
g.add((emma_address, schema.addressRegion, dbp.Valencia))<br />
g.add((emma_address, schema.postalCode, Literal('46020')))<br />
g.add((emma_address, schema.addressCountry, dbp.Spain))<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[dbp.Portugal, dbp.Italy, dbp.France, dbp.Germany, dbp.Denmark, dbp.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL - Blazegraph==<br />
<br />
<br />
==SPARQL - RDFlib==<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON, POST, GET, TURTLE<br />
<br />
namespace = "lab4"<br />
sparql = SPARQLWrapper("http://10.111.21.183:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
# Print out Cades interests<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
<br />
# Print Emmas city and country<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
    print("Emma's city is "+result["emmaCity"]["value"]+" and Emma's country is " + result["emmaCountry"]["value"])<br />
<br />
#Select the people who are over 26 years old<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
print("All people who are over 26 years old: "+result["person"]["value"])<br />
<br />
#Select people who graduated with Bachelor<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
} <br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
print("People who graduated with Bachelor: "+result["person"]["value"])<br />
<br />
# Delete Cade's photography interest<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE DATA {<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Print out Cade's interests again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT * WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
<br />
# Check university names<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
print(result["o2"]["value"])<br />
<br />
<br />
# Rename University_of_Valencia to Universidad_de_Valencia<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
# Check university names again<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
sparql.setMethod(GET)<br />
results = sparql.query().convert()<br />
for result in results["results"]["bindings"]:<br />
print(result["o2"]["value"])<br />
<br />
#Insert Sergio<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#><br />
INSERT DATA {<br />
ex:Sergio a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valencia ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46021"^^xsd:string ;<br />
ex:street "4_Carrer_del_Serpis"^^xsd:string ] ;<br />
ex:degree [ ex:degreeField ex:Computer_science ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2008"^^xsd:gYear ] ;<br />
ex:expertise ex:Big_data,<br />
ex:Semantic_technologies,<br />
ex:Machine_learning;<br />
foaf:name "Sergio_Pastor"^^xsd:string .<br />
}<br />
""")<br />
sparql.setMethod(POST)<br />
results = sparql.query()<br />
print(results.response.read())<br />
sparql.setMethod(GET)<br />
<br />
# Describe Sergio<br />
sparql.setReturnFormat(TURTLE)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
""")<br />
results = sparql.query().convert()<br />
print(results.serialize(format='turtle'))<br />
<br />
# Construct triples stating that each city in an address is located in that address's country<br />
sparql.setQuery("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />
PREFIX ex: <http://example.org/><br />
CONSTRUCT {?city ex:locatedIn ?country}<br />
WHERE {<br />
?s rdf:type ex:Address .<br />
?s ex:city ?city .<br />
?s ex:country ?country.<br />
}<br />
""")<br />
sparql.setReturnFormat(TURTLE)<br />
results = sparql.query().convert()<br />
print(results.serialize(format='turtle'))<br />
<br />
</syntaxhighlight><br />
==Web APIs and JSON-LD==<br />
<br />
<syntaxhighlight><br />
import requests<br />
from rdflib import FOAF, Namespace, Literal, RDF, Graph<br />
<br />
r = requests.get('http://api.open-notify.org/astros.json').json()<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
g.bind("ex", EX)<br />
<br />
for item in r['people']:<br />
craft = item['craft'].replace(" ","_")<br />
person = item['name'].replace(" ","_")<br />
g.add((EX[person], EX.onCraft, EX[craft]))<br />
g.add((EX[person], RDF.type, FOAF.Person))<br />
g.add((EX[person], FOAF.name, Literal(item['name'])))<br />
g.add((EX[craft], FOAF.name, Literal(item['craft'])))<br />
res = g.query("""<br />
PREFIX ex: <http://EXample.org/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
CONSTRUCT {?person1 foaf:knows ?person2}<br />
WHERE {<br />
?person1 ex:onCraft ?craft .<br />
?person2 ex:onCraft ?craft .<br />
}<br />
""")<br />
<br />
for triplet in res:<br />
if (triplet[0] != triplet[2]):<br />
g.add(triplet)<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==Semantic lifting - CSV==<br />
<br />
<br />
==RDFS==<br />
<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, we generally want to separate each part of the address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address. <br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 - using existing vocabularies for the address<br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org),<br />
# and existing resources for places like California (e.g. http://dbpedia.org/resource/California from dbpedia.org).<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes with SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
import rdflib.plugins.sparql<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
country = URIRef(ex + country)<br />
q = prepareQuery(<br />
"""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person WHERE { <br />
?person ex:livesIn ?country.<br />
}<br />
""")<br />
<br />
people_result = g.query(q, initBindings={'country': country})<br />
<br />
for row in people_result:<br />
print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to end of the URL like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
(len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
# If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
==Coding Tasks Lab 6==<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a nickname in quotation marks ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
name = row['investigation']<br />
investigation = URIRef(ex + name)<br />
g.add((investigation, RDF.type, sem.Event))<br />
investigation_start = row["investigation-start"]<br />
g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
investigation_start, datatype=XSD.dateTime)))<br />
investigation_end = row["investigation-end"]<br />
g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
investigation_end, datatype=XSD.dateTime)))<br />
investigation_days = row["investigation-days"]<br />
g.add((investigation, sem.hasXSDDuration, Literal(<br />
investigation_days, datatype=XSD.duration)))<br />
person = row["name"]<br />
person = URIRef(ex + person)<br />
g.add((investigation, sem.Actor, person))<br />
result = row['type']<br />
g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
overturned = row["overturned"]<br />
g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
pardoned = row["pardoned"]<br />
g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + show_id)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we create the graph using a SPARQL update (g.update), but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclassed only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse a owl:SymmetricProperty .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML data for the XML lifting example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        actor = actor.text.replace(" ", "_")<br />
        g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
        g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and the Parent requirement.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1759
Lab Solutions
2022-03-10T13:13:16Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
==Getting started==<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib.collection import Collection<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
RL = Namespace('http://purl.org/vocab/relationship/')<br />
DBO = Namespace('https://dbpedia.org/ontology/')<br />
DBR = Namespace('https://dbpedia.org/page/')<br />
<br />
g.namespace_manager.bind('exampleURI', EX)<br />
g.namespace_manager.bind('relationship', RL)<br />
g.namespace_manager.bind('dbpediaOntology', DBO)<br />
g.namespace_manager.bind('dbpediaPage', DBR)<br />
<br />
g.add((EX.Cade, RDF.type, FOAF.Person)) <br />
g.add((EX.Mary, RDF.type, FOAF.Person))<br />
g.add((EX.Cade, RL.spouseOf, EX.Mary)) # a symmetrical relation from an established namespace<br />
g.add((DBR.France, DBO.capital, DBR.Paris)) <br />
g.add((EX.Cade, FOAF.age, Literal(27)))<br />
g.add((EX.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
Collection(g, EX.MaryInterests, [EX.hiking, EX.chocolate, EX.biology])<br />
g.add((EX.Mary, EX.hasInterest, EX.MaryInterests))<br />
g.add((EX.Mary, RDF.type, EX.student))<br />
g.add((DBO.capital, EX.range, EX.city))<br />
g.add((EX.Mary, RDF.type, EX.kind))<br />
g.add((EX.Cade, RDF.type, EX.kindPerson))<br />
<br />
#hobbies = ['hiking', 'choclate', 'biology']<br />
#for i in hobbies:<br />
# g.add((EX.Mary, FOAF.interest, EX[i]))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
==RDFlib==<br />
<br />
<br />
==SPARQL - Blazegraph==<br />
<br />
<br />
==SPARQL - RDFlib==<br />
<syntaxhighlight><br />
import requests<br />
from rdflib import FOAF, Namespace, Literal, RDF, Graph<br />
<br />
r = requests.get('http://api.open-notify.org/astros.json').json()<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
g.bind("ex", EX)<br />
<br />
for item in r['people']:<br />
    craft = item['craft'].replace(" ","_")<br />
    person = item['name'].replace(" ","_")<br />
    g.add((EX[person], EX.onCraft, EX[craft]))<br />
    g.add((EX[person], RDF.type, FOAF.Person))<br />
    g.add((EX[person], FOAF.name, Literal(item['name'])))<br />
    g.add((EX[craft], FOAF.name, Literal(item['craft'])))<br />
res = g.query("""<br />
CONSTRUCT {?person1 foaf:knows ?person2}<br />
WHERE {<br />
?person1 ex:onCraft ?craft .<br />
?person2 ex:onCraft ?craft .<br />
}<br />
""")<br />
<br />
for triplet in res:<br />
    if (triplet[0] != triplet[2]):<br />
        g.add(triplet)<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
==Web APIs and JSON-LD==<br />
<br />
<br />
==Semantic lifting - CSV==<br />
<br />
<br />
==RDFS==<br />
<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the address can make more sense as resources than as Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it. <br />
# Solutions 4 and 5 are how I would recommend representing addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote RDF resource, like DBpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
 for row in qres:<br />
     print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
 import rdflib<br />
 from rdflib.plugins.sparql import prepareQuery<br />
 <br />
 g = rdflib.Graph()<br />
 g.parse("family.ttl", format='ttl')<br />
 <br />
 q = prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
 for row in g.query(q, initBindings={'child': sm}):<br />
     print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE { <br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphic interface for Blazegraph. <br />
# You also need to add "sparql" to end of the URL like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
==Coding Tasks Lab 6==<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a name like ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.dateTime)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.dateTime)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.integer)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.Actor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardoned, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g., is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply (mother livesWith father). That makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example, we create the graph with a SPARQL update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters:<br />
# - graph (rdflib.Graph) – the RDF graph to be extended<br />
# - axioms (bool) – whether (non-datatype) axiomatic triples should be added or not<br />
# - daxioms (bool) – whether datatype axiomatic triples should be added or not<br />
# - rdfs (bool) – whether RDFS inference is also done (used in subclasses only)<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again, as in Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        else:<br />
            actor = actor.text.replace(" ", "_")<br />
            g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
            g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who cannot eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a WorriedParent, even though we didn't state it ourselves, because Bob fulfills the restriction and the Parent requirement.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1758
Lab Solutions
2022-03-10T13:12:17Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
==Getting started==<br />
<syntaxhighlight><br />
<br />
from rdflib.collection import Collection<br />
from rdflib import Graph, Namespace, Literal, URIRef<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
RL = Namespace('http://purl.org/vocab/relationship/')<br />
DBO = Namespace('https://dbpedia.org/ontology/')<br />
DBR = Namespace('https://dbpedia.org/page/')<br />
<br />
g.namespace_manager.bind('exampleURI', EX)<br />
g.namespace_manager.bind('relationship', RL)<br />
g.namespace_manager.bind('dbpediaOntology', DBO)<br />
g.namespace_manager.bind('dbpediaPage', DBR)<br />
<br />
g.add((EX.Cade, RDF.type, FOAF.Person)) <br />
g.add((EX.Mary, RDF.type, FOAF.Person))<br />
g.add((EX.Cade, RL.spouseOf, EX.Mary)) # a symmetrical relation from an established namespace<br />
g.add((DBR.France, DBO.capital, DBR.Paris)) <br />
g.add((EX.Cade, FOAF.age, Literal(27)))<br />
g.add((EX.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
Collection(g, EX.MaryInterests, [EX.hiking, EX.chocolate, EX.biology])<br />
g.add((EX.Mary, EX.hasInterest, EX.MaryInterests))<br />
g.add((EX.Mary, RDF.type, EX.student))<br />
g.add((DBO.capital, EX.range, EX.city))<br />
g.add((EX.Mary, RDF.type, EX.kind))<br />
g.add((EX.Cade, RDF.type, EX.kindPerson))<br />
<br />
#hobbies = ['hiking', 'chocolate', 'biology']<br />
#for i in hobbies:<br />
# g.add((EX.Mary, FOAF.interest, EX[i]))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
==RDFlib==<br />
<br />
<br />
==SPARQL - Blazegraph==<br />
<br />
<br />
==SPARQL - RDFlib==<br />
<syntaxhighlight><br />
import requests<br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import FOAF, RDF<br />
<br />
r = requests.get('http://api.open-notify.org/astros.json').json()<br />
g = Graph()<br />
EX = Namespace('http://EXample.org/')<br />
g.bind("ex", EX)<br />
g.bind("foaf", FOAF)<br />
<br />
for item in r['people']:<br />
    craft = item['craft'].replace(" ", "_")<br />
    person = item['name'].replace(" ", "_")<br />
    g.add((EX[person], EX.onCraft, EX[craft]))<br />
    g.add((EX[person], RDF.type, FOAF.Person))<br />
    g.add((EX[person], FOAF.name, Literal(item['name'])))<br />
    g.add((EX[craft], FOAF.name, Literal(item['craft'])))<br />
<br />
res = g.query("""<br />
CONSTRUCT {?person1 foaf:knows ?person2}<br />
WHERE {<br />
    ?person1 ex:onCraft ?craft .<br />
    ?person2 ex:onCraft ?craft .<br />
}<br />
""")<br />
<br />
for triplet in res:<br />
    if triplet[0] != triplet[2]:<br />
        g.add(triplet)<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
==Web APIs and JSON-LD==<br />
<br />
<br />
==Semantic lifting - CSV==<br />
<br />
<br />
==RDFS==<br />
<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be more readable for humans.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live.<br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the address can make more sense as resources than as Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them.<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a blank node.<br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it.<br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID.<br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
# Graph binding is useful for at least two reasons:<br />
# (1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
# (2) When serializing the graph, the serialization will show the expected prefixes<br />
# instead of default namespace names ns1, ns2, etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that serves the graphical interface for Blazegraph.<br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL update<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph<br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='xml')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.:<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
=Coding Tasks Lab 6=<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# The data set contains a nickname in quotation marks ("Scooter"), so we strip the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.dateTime)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.dateTime)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.integer)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.hasActor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not entail (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we create the graph using a SPARQL update (g.update), but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
    PREFIX ex: <http://example.org#><br />
    INSERT DATA {<br />
        ex:Socrates rdf:type ex:Man .<br />
        ex:Man rdfs:subClassOf ex:Mortal .<br />
    }""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – the RDF graph to be extended<br />
# - axioms (bool) – whether (non-datatype) axiomatic triples should be added or not<br />
# - daxioms (bool) – whether datatype axiomatic triples should be added or not<br />
# - rdfs (bool) – whether RDFS inference is also done (used in subclasses only)<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: ' + str(bool(b)))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again, as in Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        else:<br />
            actor = actor.text.replace(" ", "_")<br />
            g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
            g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and Parent requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=SPARQL_Examples&diff=1751
SPARQL Examples
2022-03-10T12:53:05Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with SPARQL examples as the course progresses.<br />
<br />
<br />
== SPARQL Examples from Session 3: SPARQL==<br />
<br />
<br />
<br />
===Prefixes used===<br />
<br />
The examples below will assume that these are in place (some examples aren't yet visible).<br />
<br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX dc: <http://purl.org/dc/terms/><br />
PREFIX bibo: <http://purl.org/ontology/bibo/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX skos: <http://www.w3.org/2004/02/skos/core#><br />
PREFIX ss: <http://semanticscholar.org/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
PREFIX sp: <http://i2s.uib.no/kg4news/science-parse/><br />
PREFIX th: <http://i2s.uib.no/kg4news/theme/><br />
PREFIX xml: <http://www.w3.org/XML/1998/namespace> <br />
PREFIX ex: <http://example.org/> <br />
<br />
===Select all triples in the graph===<br />
<br />
SELECT ?s ?p ?o<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
<br />
or <br />
<br />
SELECT *<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
<br />
===Select the interests of Cade===<br />
<br />
SELECT ?cadeInterest<br />
WHERE {<br />
ex:Cade ex:interest ?cadeInterest .<br />
} <br />
===Select the country and city where Emma lives===<br />
<br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
<br />
All address info for Emma:<br />
<br />
SELECT ?person ?city ?country ?postalcode ?street WHERE{<br />
ex:Emma ex:address ?address. <br />
?person ex:address ?address.<br />
?address ex:city ?city.<br />
?address ex:country ?country.<br />
?address ex:postalCode ?postalcode.<br />
?address ex:street ?street.<br />
}<br />
<br />
Selecting all address info, for everyone:<br />
<br />
SELECT ?name ?city ?country ?postalcode ?street WHERE{ <br />
?person foaf:name ?name. <br />
?person ex:address ?address.<br />
?address ex:city ?city.<br />
?address ex:country ?country.<br />
?address ex:postalCode ?postalcode.<br />
?address ex:street ?street.<br />
}<br />
<br />
===Select the people who are over 26 years old===<br />
<br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
} <br />
===Select people who graduated with Bachelor===<br />
<br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
<br />
} <br />
===Delete Cade's photography interest===<br />
<br />
DELETE DATA<br />
{<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
===Delete and insert University of Valencia===<br />
<br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
<br />
===Check if the deletion worked===<br />
<br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
<br />
===Insert Sergio===<br />
<br />
INSERT DATA {<br />
ex:Sergio a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valencia ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46021"^^xsd:string ;<br />
ex:state ex:California ;<br />
ex:street "4_Carrer_del_Serpis"^^xsd:string ] ;<br />
ex:degree [ ex:degreeField ex:Computer_science ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2008"^^xsd:gYear ] ;<br />
ex:expertise ex:Big_data,<br />
ex:Semantic_technologies,<br />
ex:Machine_learning;<br />
foaf:name "Sergio_Pastor"^^xsd:string .<br />
}<br />
<br />
===Describe Sergio===<br />
<br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
<br />
===Construct that any city is in the country in an address=== <br />
<br />
CONSTRUCT {?city ex:locatedIn ?country}<br />
WHERE {<br />
?s rdf:type ex:Address .<br />
?s ex:city ?city .<br />
?s ex:country ?country.<br />
}<br />
<br />
<br />
==The data are available in this Blazegraph triple store:==<br />
[http://sandbox.i2s.uib.no http://sandbox.i2s.uib.no], but you may need to be inside the UiB network (or on VPN).<br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s rdf:type ss:Paper .<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
===Explain all types and properties===<br />
<br />
SELECT ?pt ?e WHERE {<br />
?pt rdfs:comment ?e .<br />
} LIMIT 100<br />
<br />
===List main papers===<br />
<br />
SELECT * WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
<br />
}<br />
<br />
===List properties===<br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
===List types===<br />
<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT DISTINCT ?t WHERE {<br />
?s rdf:type ?t .<br />
} LIMIT 100<br />
<br />
===List authors===<br />
<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s rdf:type foaf:Person .<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
Add this to show datatypes!<br />
BIND ( DATATYPE(?year) AS ?type )<br />
<br />
Add this to only show years with the right type.<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
===Group and count main papers by year===<br />
<br />
SELECT ?year (COUNT(?paper) AS ?count) WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
}<br />
GROUP BY ?year<br />
<br />
Add this to order the results<br />
ORDER BY ?year<br />
<br />
Add this to order and only show years with more than 5 papers.<br />
HAVING (?count > 5)<br />
ORDER BY DESC(?count)<br />
<br />
===Show papers===<br />
<br />
SELECT ?paper ?year WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
}<br />
<br />
Change last lines to show papers without an xsd:gYear too.<br />
OPTIONAL {<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
}<br />
<br />
===Alternative values for variables===<br />
<br />
SELECT ?p ?n ?year WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor ?a .<br />
?a foaf:name ?n .<br />
?p dc:date ?year .<br />
FILTER ( CONTAINS( ?n, ?str ) )<br />
FILTER ( CONTAINS( STR(?year), ?yr) )<br />
<br />
VALUES ?str { "Andreas" "David" }<br />
VALUES ?yr { "2020" "2019" }<br />
}<br />
<br />
===Property paths (composite properties)===<br />
<br />
This query:<br />
SELECT ?p ?n WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor ?c .<br />
?c foaf:name ?n .<br />
}<br />
<br />
Can be simplified by eliminating ?c:<br />
SELECT ?p ?n WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor / foaf:name ?n .<br />
}<br />
<br />
Can be further simplified by first reversing rdf:type:<br />
SELECT ?p ?n WHERE {<br />
kg:MainPaper ^rdf:type ?p .<br />
?p dc:contributor / foaf:name ?n .<br />
}<br />
<br />
...and then eliminating ?p:<br />
SELECT ?n WHERE {<br />
kg:MainPaper ^rdf:type / dc:contributor / foaf:name ?n .<br />
}<br />
<br />
===Retrieve titles of papers that mention SPARQL===<br />
<br />
Get papers with topics labelled "SPARQL":<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
Some labels also go via a theme:<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / th:theme / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
We can get both using a path with an optional element (the '?'):<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / th:theme? / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
===Using an external SPARQL endpoint===<br />
<br />
We limit to a single label to avoid time-outs and rate limitations:<br />
SELECT ?a ?n ?r WHERE {<br />
?a rdf:type ss:Topic .<br />
?a skos:prefLabel ?n .<br />
FILTER ( ?n = "SPARQL" )<br />
BIND ( STRLANG( ?n, "en" ) AS ?n2 )<br />
SERVICE <https://dbpedia.org/sparql> {<br />
?r rdfs:label ?n2 .<br />
}<br />
} LIMIT 1<br />
<br />
===Insert 4-digit years for all main papers===<br />
<br />
Main papers that do not have an xsd:gYear:<br />
SELECT * WHERE {<br />
<br />
?p rdf:type kg:MainPaper .<br />
?p dc:date ?d .<br />
FILTER ( DATATYPE(?d) != xsd:gYear )<br />
<br />
}<br />
<br />
Show the datatypes:<br />
SELECT * WHERE {<br />
<br />
?p rdf:type kg:MainPaper .<br />
?p dc:date ?d .<br />
FILTER ( DATATYPE(?d) = xsd:dateTime )<br />
BIND ( year( ?d ) AS ?dt )<br />
<br />
}<br />
<br />
Insert 4-digit years:<br />
INSERT { ?paper dc:date ?year } <br />
WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?date .<br />
FILTER( DATATYPE(?date) != xsd:gYear )<br />
BIND ( YEAR(?date) AS ?year ) <br />
<br />
}<br />
<br />
(Actually, these years are xsd:integer-s, not quite xsd:gYear-s.)<br />
<br />
<!--<br />
<br />
== SPARQL Examples from Session 7: RDFS==<br />
<br />
===Turn on inference!===<br />
<br />
Make sure inference is on in your triple store, or that you compute closures if you run this in Python with rdflib and OWL-RL.<br />
<br />
In '''Blazegraph''', create a new "Namespace" with the "Inference" box checked. <br />
Remember to "Use" the new namespace.<br />
<br />
In '''Python''', install the [https://owl-rl.readthedocs.io/en/latest/ OWL-RL package] (''pip install owlrl''). <br />
Explicitly compute RDFS closure like this:<br />
import owlrl.RDFSClosure<br />
<br />
...<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
<br />
===rdfs:subClassOf entailment===<br />
<br />
Update:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
INSERT DATA {<br />
kg:TimBernersLee rdf:type kg:Author .<br />
kg:Author rdfs:subClassOf foaf:Person . <br />
}<br />
<br />
Query:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
ASK {<br />
kg:TimBernersLee rdf:type foaf:Person . <br />
}<br />
<br />
===rdfs:domain entailment===<br />
<br />
Update:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX dcterm: <http://purl.org/dc/terms/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
INSERT DATA {<br />
kg:TimBernersLee rdf:type kg:Author .<br />
kg:TheSemanticWeb dcterm:contributor kg:TimBernersLee .<br />
dcterm:contributor rdfs:domain kg:Paper .<br />
}<br />
<br />
Query:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX dcterm: <http://purl.org/dc/terms/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
ASK {<br />
kg:TheSemanticWeb rdf:type kg:Paper .<br />
}<br />
<br />
==OWL Ontologies==<br />
<br />
The following files contain an ontology for the knowledge graph used in this page:<br />
* Small: [[:File:small-kg4news-ontology.txt]]<br />
* Full: [[:File:kg4news-ontology.txt]]<br />
<br />
Rename them from '.txt.' to '.ttl' after you download them.<br />
<br />
You can<br />
* view the ontologies online using [http://www.visualdataweb.de/webvowl/ WebVOWL] or <br />
* download the [https://protege.stanford.edu/products.php#desktop-protege Protegé-OWL] ontology editor.<br />
<!--<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=SPARQL_Examples&diff=1750
SPARQL Examples
2022-03-10T12:50:40Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with SPARQL examples as the course progresses.<br />
<br />
<br />
== SPARQL Examples from Session 3: SPARQL==<br />
<br />
<br />
<br />
===Prefixes used===<br />
<br />
The examples below will assume that these are in place (some examples aren't yet visible).<br />
<br />
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX dc: <http://purl.org/dc/terms/><br />
PREFIX bibo: <http://purl.org/ontology/bibo/><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX skos: <http://www.w3.org/2004/02/skos/core#><br />
PREFIX ss: <http://semanticscholar.org/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
PREFIX sp: <http://i2s.uib.no/kg4news/science-parse/><br />
PREFIX th: <http://i2s.uib.no/kg4news/theme/><br />
PREFIX xml: <http://www.w3.org/XML/1998/namespace> <br />
PREFIX ex: <http://example.org/> <br />
<br />
===Select all triples in the graph===<br />
<br />
SELECT ?s ?p ?o<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
<br />
or <br />
<br />
SELECT *<br />
WHERE {<br />
?s ?p ?o .<br />
} <br />
<br />
===Select the interests of Cade===<br />
<br />
SELECT ?cadeInterest<br />
WHERE {<br />
ex:Cade ex:interest ?cadeInterest .<br />
} <br />
===Select the country and city where Emma lives===<br />
<br />
SELECT ?emmaCity ?emmaCountry<br />
WHERE {<br />
ex:Emma ex:address ?address .<br />
?address ex:city ?emmaCity .<br />
?address ex:country ?emmaCountry .<br />
} <br />
<br />
All address info for Emma.<br />
<br />
SELECT ?person ?city ?country ?postalcode ?street WHERE{<br />
ex:Emma ex:address ?address. <br />
?person ex:address ?address.<br />
?address ex:city ?city.<br />
?address ex:country ?country.<br />
?address ex:postalCode ?postalcode.<br />
?address ex:street ?street.<br />
}<br />
<br />
Selecting all address info, for everyone.<br />
<br />
SELECT ?name ?city ?country ?postalcode ?street WHERE{ <br />
?person foaf:name ?name. <br />
?person ex:address ?address.<br />
?address ex:city ?city.<br />
?address ex:country ?country.<br />
?address ex:postalCode ?postalcode.<br />
?address ex:street ?street.<br />
}<br />
<br />
===Select the people who are over 26 years old===<br />
<br />
SELECT ?person ?age<br />
WHERE {<br />
?person ex:age ?age .<br />
FILTER(?age > 26) . <br />
} <br />
===Select people who graduated with a Bachelor's degree===<br />
<br />
SELECT ?person ?degree<br />
WHERE {<br />
?person ex:degree ?degree .<br />
?degree ex:degreeLevel "Bachelor" .<br />
<br />
} <br />
===Delete Cade's photography interest===<br />
<br />
DELETE DATA<br />
{<br />
ex:Cade ex:interest ex:Photography .<br />
} <br />
===Delete and insert University of Valencia===<br />
<br />
DELETE { ?s ?p ex:University_of_Valencia }<br />
INSERT { ?s ?p ex:Universidad_de_Valencia }<br />
WHERE { ?s ?p ex:University_of_Valencia } <br />
<br />
===Check if the deletion worked===<br />
<br />
SELECT ?s ?o2<br />
WHERE { <br />
?s ex:degree ?o .<br />
?o ex:degreeSource ?o2 .<br />
}<br />
<br />
===Insert Sergio===<br />
<br />
INSERT DATA {<br />
ex:Sergio a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valencia ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46021"^^xsd:string ;<br />
ex:street "4_Carrer_del_Serpis"^^xsd:string ] ;<br />
ex:degree [ ex:degreeField ex:Computer_science ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2008"^^xsd:gYear ] ;<br />
ex:expertise ex:Big_data,<br />
ex:Semantic_technologies,<br />
ex:Machine_learning;<br />
foaf:name "Sergio_Pastor"^^xsd:string .<br />
}<br />
<br />
===Describe Sergio===<br />
<br />
DESCRIBE ex:Sergio ?o<br />
WHERE {<br />
ex:Sergio ?p ?o .<br />
?o ?p2 ?o2 .<br />
}<br />
<br />
===Construct locatedIn triples from the city and country of each address===<br />
<br />
CONSTRUCT {?city ex:locatedIn ?country}<br />
WHERE {<br />
?s rdf:type ex:Address .<br />
?s ex:city ?city .<br />
?s ex:country ?country.<br />
}<br />
<br />
<br />
==The data are available in this Blazegraph triple store==<br />
[http://sandbox.i2s.uib.no http://sandbox.i2s.uib.no], but you may need to be inside the UiB network (or on VPN).<br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s rdf:type ss:Paper .<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
===Explain all types and properties===<br />
<br />
SELECT ?pt ?e WHERE {<br />
?pt rdfs:comment ?e .<br />
} LIMIT 100<br />
<br />
===List main papers===<br />
<br />
SELECT * WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
<br />
}<br />
<br />
===List properties===<br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
===List types===<br />
<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT DISTINCT ?t WHERE {<br />
?s rdf:type ?t .<br />
} LIMIT 100<br />
<br />
===List authors===<br />
<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
<br />
SELECT DISTINCT ?p WHERE {<br />
?s rdf:type foaf:Person .<br />
?s ?p ?o .<br />
} LIMIT 100<br />
<br />
Add this to show datatypes!<br />
BIND ( DATATYPE(?year) AS ?type )<br />
<br />
Add this to only show years with the right type.<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
===Group and count main papers by year===<br />
<br />
SELECT ?year (COUNT(?paper) AS ?count) WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
}<br />
GROUP BY ?year<br />
<br />
Add this to order the results<br />
ORDER BY ?year<br />
<br />
Add this to order and only show years with more than 5 papers.<br />
HAVING (?count > 5)<br />
ORDER BY DESC(?count)<br />
<br />
===Show papers===<br />
<br />
SELECT ?paper ?year WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
<br />
}<br />
<br />
Change last lines to show papers without an xsd:gYear too.<br />
OPTIONAL {<br />
?paper dc:date ?year .<br />
FILTER ( DATATYPE(?year) = xsd:gYear )<br />
}<br />
<br />
===Alternative values for variables===<br />
<br />
SELECT ?p ?n ?year WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor ?a .<br />
?a foaf:name ?n .<br />
?p dc:date ?year .<br />
FILTER ( CONTAINS( ?n, ?str ) )<br />
FILTER ( CONTAINS( STR(?year), ?yr) )<br />
<br />
VALUES ?str { "Andreas" "David" }<br />
VALUES ?yr { "2020" "2019" }<br />
}<br />
<br />
===Property paths (composite properties)===<br />
<br />
This query:<br />
SELECT ?p ?n WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor ?c .<br />
?c foaf:name ?n .<br />
}<br />
<br />
Can be simplified by eliminating ?c:<br />
SELECT ?p ?n WHERE {<br />
?p rdf:type kg:MainPaper .<br />
?p dc:contributor / foaf:name ?n .<br />
}<br />
<br />
Can be further simplified by first reversing rdf:type:<br />
SELECT ?p ?n WHERE {<br />
kg:MainPaper ^rdf:type ?p .<br />
?p dc:contributor / foaf:name ?n .<br />
}<br />
<br />
...and then eliminating ?p:<br />
SELECT ?n WHERE {<br />
kg:MainPaper ^rdf:type / dc:contributor / foaf:name ?n .<br />
}<br />
<br />
===Retrieve titles of papers that mention SPARQL===<br />
<br />
Get papers with topics labelled "SPARQL":<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
Some labels also go via a theme:<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / th:theme / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
We can get both using a path with an optional element (the '?'):<br />
SELECT ?t WHERE {<br />
?t ^dc:title / dc:subject / th:theme? / skos:prefLabel "SPARQL" .<br />
}<br />
<br />
===Using an external SPARQL endpoint===<br />
<br />
We limit to a single label to avoid time-outs and rate limitations:<br />
SELECT ?a ?n ?r WHERE {<br />
?a rdf:type ss:Topic .<br />
?a skos:prefLabel ?n .<br />
FILTER ( ?n = "SPARQL" )<br />
BIND ( STRLANG( ?n, "en" ) AS ?n2 )<br />
SERVICE <https://dbpedia.org/sparql> {<br />
?r rdfs:label ?n2 .<br />
}<br />
} LIMIT 1<br />
<br />
===Insert 4-digit years for all main papers===<br />
<br />
Main papers that do not have an xsd:gYear:<br />
SELECT * WHERE {<br />
<br />
?p rdf:type kg:MainPaper .<br />
?p dc:date ?d .<br />
FILTER ( DATATYPE(?d) != xsd:gYear )<br />
<br />
}<br />
<br />
Show the datatypes and extract 4-digit years from the xsd:dateTime dates:<br />
SELECT * WHERE {<br />
<br />
?p rdf:type kg:MainPaper .<br />
?p dc:date ?d .<br />
FILTER ( DATATYPE(?d) = xsd:dateTime )<br />
BIND ( year( ?d ) AS ?dt )<br />
<br />
}<br />
<br />
Insert 4-digit years:<br />
INSERT { ?paper dc:date ?year } <br />
WHERE {<br />
<br />
?paper rdf:type kg:MainPaper .<br />
?paper dc:date ?date .<br />
FILTER( DATATYPE(?date) != xsd:gYear )<br />
BIND ( YEAR(?date) AS ?year ) <br />
<br />
}<br />
<br />
(Actually, these years are xsd:integer-s, not quite xsd:gYear-s.)<br />
<br />
<!--<br />
<br />
== SPARQL Examples from Session 7: RDFS==<br />
<br />
===Turn on inference!===<br />
<br />
Make sure inference is on in your triple store, or that you compute closures if you run this in Python with rdflib and OWL-RL.<br />
<br />
In '''Blazegraph''', create a new "Namespace" with the "Inference" box checked. <br />
Remember to "Use" the new namespace.<br />
<br />
In '''Python''', install the [https://owl-rl.readthedocs.io/en/latest/ OWL-RL package] (''pip install owlrl''). <br />
Explicitly compute RDFS closure like this:<br />
import owlrl.RDFSClosure<br />
<br />
...<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
<br />
===rdfs:subClassOf entailment===<br />
<br />
Update:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
INSERT DATA {<br />
kg:TimBernersLee rdf:type kg:Author .<br />
kg:Author rdfs:subClassOf foaf:Person . <br />
}<br />
<br />
Query:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX foaf: <http://xmlns.com/foaf/0.1/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
ASK {<br />
kg:TimBernersLee rdf:type foaf:Person . <br />
}<br />
<br />
===rdfs:domain entailment===<br />
<br />
Update:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX dcterm: <http://purl.org/dc/terms/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
INSERT DATA {<br />
kg:TimBernersLee rdf:type kg:Author .<br />
kg:TheSemanticWeb dcterm:contributor kg:TimBernersLee .<br />
dcterm:contributor rdfs:domain kg:Paper .<br />
}<br />
<br />
Query:<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX dcterm: <http://purl.org/dc/terms/><br />
PREFIX kg: <http://i2s.uib.no/kg4news/><br />
<br />
ASK {<br />
kg:TheSemanticWeb rdf:type kg:Paper .<br />
}<br />
<br />
==OWL Ontologies==<br />
<br />
The following files contain an ontology for the knowledge graph used in this page:<br />
* Small: [[:File:small-kg4news-ontology.txt]]<br />
* Full: [[:File:kg4news-ontology.txt]]<br />
<br />
Rename them from '.txt' to '.ttl' after you download them.<br />
<br />
You can<br />
* view the ontologies online using [http://www.visualdataweb.de/webvowl/ WebVOWL] or <br />
* download the [https://protege.stanford.edu/products.php#desktop-protege Protégé-OWL] ontology editor.<br />
<!--<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1748
Lab Solutions
2022-03-10T12:29:03Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFLib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping of the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address. <br />
# Solutions 4 and 5 are how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
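An RDF collection is stored as a linked list of rdf:first/rdf:rest pairs, which is why the query above walks rdf:rest*/rdf:first. A minimal, library-free Python sketch of that traversal (the dict below is a toy stand-in for the graph, with made-up blank-node names):<br />

```python
RDF_NIL = "rdf:nil"

# Toy stand-in for the graph: each list node has an rdf:first value
# and an rdf:rest pointer to the next node (or rdf:nil at the end).
graph = {
    "_:b0": {"rdf:first": "ex:Portugal", "rdf:rest": "_:b1"},
    "_:b1": {"rdf:first": "ex:Italy",    "rdf:rest": "_:b2"},
    "_:b2": {"rdf:first": "ex:France",   "rdf:rest": RDF_NIL},
}

def collection_members(graph, head):
    # This loop is what the property path rdf:rest*/rdf:first expresses.
    members = []
    while head != RDF_NIL:
        members.append(graph[head]["rdf:first"])
        head = graph[head]["rdf:rest"]
    return members

print(collection_members(graph, "_:b0"))  # → ['ex:Portugal', 'ex:Italy', 'ex:France']
```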
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
country = URIRef(ex + country)<br />
q = prepareQuery(<br />
"""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person WHERE { <br />
?person ex:livesIn ?country.<br />
}<br />
""")<br />
<br />
capital_result = g.query(q, initBindings={'country': country})<br />
<br />
for row in capital_result:<br />
print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphic interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
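The JSON that sparql.query().convert() returns follows the W3C SPARQL 1.1 Query Results JSON Format. A stdlib-only sketch of that shape, using a hand-written sample response (the values are made up for illustration):<br />

```python
import json

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format --
# the same shape that sparql.query().convert() returns above.
sample = json.loads("""
{ "head": { "vars": ["interest"] },
  "results": { "bindings": [
    { "interest": { "type": "uri", "value": "http://example.org/Hiking" } },
    { "interest": { "type": "uri", "value": "http://example.org/Chocolate" } }
  ] } }
""")

# Each binding maps a variable name to a dict with "type" and "value" keys.
interests = [b["interest"]["value"] for b in sample["results"]["bindings"]]
print(interests)
```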
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SPARQLWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query DBpedia with SPARQLWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
# If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
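Replacing spaces with "_" works for this simple data, but percent-encoding is a more robust way to make arbitrary strings URI-safe. A stdlib sketch (an aside, not part of the lab solution):<br />

```python
from urllib.parse import quote

# quote() percent-encodes characters that are not allowed in a URI path.
name = "Cade Tracey"
uri = "http://example.org/" + quote(name)
print(uri)  # → http://example.org/Cade%20Tracey
```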
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
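For comparison, the lifting loop above can be sketched with only the standard library's csv module, producing plain (subject, predicate, object) tuples instead of an rdflib graph. This is an illustrative sketch under the same http://example.org/ prefix; it skips empty cells instead of marking them "unknown":<br />

```python
import csv
import io

EX = "http://example.org/"

# A subset of the CSV content above, inlined so the sketch is self-contained.
data = io.StringIO(
    '"Name","Age","Spouse","Country"\n'
    '"Cade Tracey","26","Mary Jackson","US"\n'
    '"Bob Johnson","21","","Canada"\n'
)

triples = []
for row in csv.DictReader(data):
    subject = EX + row["Name"].replace(" ", "_")
    triples.append((subject, EX + "age", row["Age"]))
    if row["Spouse"]:  # skip empty cells instead of marking them "unknown"
        triples.append((subject, EX + "married", EX + row["Spouse"].replace(" ", "_")))
    triples.append((subject, EX + "country", EX + row["Country"]))

print(len(triples))  # → 5
```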
<br />
<br />
==Coding Tasks Lab 6==<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Reading the data and removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a name like ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing it later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.dateTime)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.dateTime)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.duration)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.Actor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + show_id)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer; it comes down to how we want to model our properties.<br />
# E.g., is livesWith a transitive property? Usually yes, but we may also want to state that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using sparql.update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDF_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclassed only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        else:<br />
            actor = actor.text.replace(" ", "_")<br />
            g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
            g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# A Season is either Autumn, Winter, Spring, or Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't state it ourselves, because Bob fulfills the Parent and restriction requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1747
Lab Solutions
2022-03-10T12:26:36Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
# OR<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live.<br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the address can make more sense as resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them.<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a blank node.<br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it.<br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID.<br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("https://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
<syntaxhighlight><br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
    PREFIX fam: <http://example.org/family#><br />
    SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
</syntaxhighlight><br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
<syntaxhighlight><br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
    """SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""",<br />
    initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
</syntaxhighlight><br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphical interface for Blazegraph.<br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL update<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL update and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='xml')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.:<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
===Coding Tasks Lab 6===<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URI's become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# The data set contains a nickname in quotation marks ("Scooter"), so remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Mark all missing/empty data as "unknown". This makes it easy to delete triples containing it later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.date)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.date)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.duration)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.hasActor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardoned, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child && child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we create the graph using a SPARQL update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX ex: <http://example.org#><br />
INSERT DATA {<br />
    ex:Socrates rdf:type ex:Man .<br />
    ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters:<br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % b.askAnswer)<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        actor = actor.text.replace(" ", "_")<br />
        g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
        g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills both the Parent requirement and the restriction.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1746
Lab Solutions
2022-03-10T12:24:49Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFLib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be more readable for humans.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
<br />
# OR<br />
<br />
<br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
<br />
g = Graph()<br />
<br />
ex = Namespace('http://example.org/')<br />
<br />
g.add((ex.Cade, FOAF.name, Literal("Cade", datatype=XSD.string)))<br />
g.add((ex.Mary, FOAF.name, Literal("Mary", datatype=XSD.string)))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Cade, ex.Married, ex.Mary))<br />
g.add((ex.Cade, FOAF.age, Literal('27', datatype=XSD.int)))<br />
g.add((ex.Mary, FOAF.age, Literal('26', datatype=XSD.int)))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.France, ex.Capital, ex.Paris))<br />
g.add((ex.Mary, FOAF.interest, ex.hiking))<br />
g.add((ex.Mary, FOAF.interest, ex.Chocolate))<br />
g.add((ex.Mary, FOAF.interest, ex.biology))<br />
g.add((ex.France, ex.City, ex.Paris))<br />
g.add((ex.Mary, ex.characteristic, ex.kind))<br />
g.add((ex.Cade, ex.characteristic, ex.kind))<br />
g.add((ex.France, RDF.type, ex.Country))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live.<br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 -<br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them.<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4<br />
# Grouping of the information into an Address. We can represent the address concept with its own URI OR with a Blank Node.<br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address.<br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID.<br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
# Graph binding is useful for at least two reasons:<br />
# (1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
# (2) When serializing the graph, the serialization will show the expected prefixes<br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once.<br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
           [ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
           [ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
import rdflib.plugins.sparql<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphic interface for Blazegraph. <br />
# You also need to add "sparql" to end of the URL like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
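Deleting data works the same way; only the update string changes. A minimal sketch, assuming the same endpoint and POST setup as above:<br />
<syntaxhighlight><br />
sparql.setQuery("""<br />
    PREFIX ex: <http://example.org/><br />
    DELETE DATA{<br />
        ex:Cade ex:interest ex:Mathematics.<br />
    }<br />
""")<br />
<br />
results = sparql.query()<br />
</syntaxhighlight><br />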
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL update<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL update and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
# If we want, we can add additional RDF/RDFS/OWL information, e.g.:<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
===Coding Tasks Lab 6===<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URI's become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a name with quotation marks ("Scooter"), so we remove them.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
name = row['investigation']<br />
investigation = URIRef(ex + name)<br />
g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.date)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.date)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        "P%sD" % investigation_days, datatype=XSD.duration)))<br />
person = row["name"]<br />
person = URIRef(ex + person)<br />
    g.add((investigation, sem.hasActor, person))<br />
result = row['type']<br />
g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
pardoned = row["pardoned"]<br />
g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
g.add((URIRef(ex + full_name), ex.starsIn, URIRef(title)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer; it comes down to how we want to model our properties.<br />
# E.g., is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child AND child livesWith father) does not imply (mother livesWith father), which makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using sparql.update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
    PREFIX ex: <http://example.org#><br />
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
    INSERT DATA {<br />
        ex:Socrates rdf:type ex:Man .<br />
        ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters:<br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclassed only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
title = show.find("h3").text<br />
actors = show.find('ul', attrs={'class': 'actor_list'})<br />
for actor in actors:<br />
if isinstance(actor, NavigableString):<br />
continue<br />
else:<br />
actor = actor.text.replace(" ", "_")<br />
g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a WorriedParent, even though we didn't specify it ourselves, because Bob fulfills the restriction and the Parent requirement.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1745
Lab Solutions
2022-03-08T07:41:26Z
<p>Tkr048: /* Coding Tasks Lab 6 */</p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be more readable for humans.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful for instance if we want to find only the streets where people live.<br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them.<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a blank node.<br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address.<br />
# Solutions 4 and 5 are how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID.<br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdfllib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country .<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In this query I want to select all the vitamins in Wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL CONSTRUCT query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query DBpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
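The space-replacement step matters because URIs cannot contain spaces. A tiny standalone illustration of the same idea with plain Python strings (the values here are hypothetical):<br />

```python
# "Cade Tracey" is not usable directly in a URI, so spaces become underscores.
name = "Cade Tracey"
local_name = name.replace(" ", "_")
uri = "http://example.org/" + local_name
print(uri)  # http://example.org/Cade_Tracey
```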
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
===Coding Tasks Lab 6===<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but the data set contains a name like ("Scooter"), so we have to remove the quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
name = row['investigation']<br />
investigation = URIRef(ex + name)<br />
g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.dateTime)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.dateTime)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.duration)))<br />
person = row["name"]<br />
person = URIRef(ex + person)<br />
g.add((investigation, sem.Actor, person))<br />
result = row['type']<br />
g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
pardoned = row["pardoned"]<br />
g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) != mother livesWith father, which makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using a SPARQL update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
    PREFIX ex: <http://example.org#><br />
    PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
    INSERT DATA {<br />
        ex:Socrates rdf:type ex:Man .<br />
        ex:Man rdfs:subClassOf ex:Mortal .<br />
    }""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – the RDF graph to be extended<br />
# - axioms (bool) – whether (non-datatype) axiomatic triples should be added or not<br />
# - daxioms (bool) – whether datatype axiomatic triples should be added or not<br />
# - rdfs (bool) – whether RDFS inference is also done (used in subclasses only)<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again, as in Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
title = show.find("h3").text<br />
actors = show.find('ul', attrs={'class': 'actor_list'})<br />
for actor in actors:<br />
if isinstance(actor, NavigableString):<br />
continue<br />
else:<br />
actor = actor.text.replace(" ", "_")<br />
g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and Parent requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1744
Lab Solutions
2022-03-08T07:40:32Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful for instance if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address. <br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
# Graph binding is useful for at least two reasons:<br />
# (1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
# (2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
<syntaxhighlight><br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
    ?child fam:hasParent ?parent .<br />
    ?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
</syntaxhighlight><br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
<syntaxhighlight><br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
    """SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""",<br />
    initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
</syntaxhighlight><br />
<br />
===Select all contents of lists (rdfllib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE { <br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In this query I select all subclasses of vitamin (wd:Q34956) in Wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL CONSTRUCT query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to a turtle file<br />
graph.serialize(destination=destfile, format='turtle')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.:<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph (in rdflib 6+, serialize() returns a str directly).<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
<br />
===Coding Tasks Lab 6===<br />
<syntaxhighlight><br />
import pandas as pd<br />
<br />
<br />
from rdflib import Graph, Namespace, URIRef, Literal, BNode<br />
from rdflib.namespace import RDF, XSD<br />
<br />
<br />
ex = Namespace("http://example.org/")<br />
sem = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")<br />
<br />
g = Graph()<br />
g.bind("ex", ex)<br />
g.bind("sem", sem)<br />
<br />
<br />
# Removing unwanted characters<br />
df = pd.read_csv('Snippet/Lab/data/russia-investigation.csv')<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that URIs become valid.<br />
df = df.replace(to_replace=" ", value="_", regex=True)<br />
# This may seem odd, but in the data set we have a name like this: ("Scooter"). So we have to remove quotation marks.<br />
df = df.replace(to_replace='"', value="", regex=True)<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
df = df.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in df.iterrows():<br />
    name = row['investigation']<br />
    investigation = URIRef(ex + name)<br />
    g.add((investigation, RDF.type, sem.Event))<br />
    investigation_start = row["investigation-start"]<br />
    g.add((investigation, sem.hasBeginTimeStamp, Literal(<br />
        investigation_start, datatype=XSD.dateTime)))<br />
    investigation_end = row["investigation-end"]<br />
    g.add((investigation, sem.hasEndTimeStamp, Literal(<br />
        investigation_end, datatype=XSD.dateTime)))<br />
    investigation_days = row["investigation-days"]<br />
    g.add((investigation, sem.hasXSDDuration, Literal(<br />
        investigation_days, datatype=XSD.duration)))<br />
    person = row["name"]<br />
    person = URIRef(ex + person)<br />
    g.add((investigation, sem.hasActor, person))<br />
    result = row['type']<br />
    g.add((investigation, sem.hasSubEvent, Literal(result, datatype=XSD.string)))<br />
    overturned = row["overturned"]<br />
    g.add((investigation, ex.overturned, Literal(overturned, datatype=XSD.boolean)))<br />
    pardoned = row["pardoned"]<br />
    g.add((investigation, ex.pardon, Literal(pardoned, datatype=XSD.boolean)))<br />
<br />
g.serialize("output.ttl", format="ttl")<br />
print(g.serialize(format="turtle"))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g., is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child AND child livesWith father) does not imply (mother livesWith father). That makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using a SPARQL update (g.update), but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % bool(b))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
title = show.find("h3").text<br />
actors = show.find('ul', attrs={'class': 'actor_list'})<br />
for actor in actors:<br />
if isinstance(actor, NavigableString):<br />
continue<br />
else:<br />
actor = actor.text.replace(" ", "_")<br />
g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a Worried Parent even though we didn't specify it ourselves, because Bob fulfills the restriction and Parent requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1743
Lab Solutions
2022-03-08T07:11:58Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format is designed to be more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful for instance if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Grouping of the information into an Address. We can represent the address concept with its own URI OR with a Blank Node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of the address. <br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD, FOAF<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
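A small sketch of the effect: after binding, the serialization uses the ''ex:'' prefix instead of an auto-generated ''ns1:''.<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
<br />
# Because of the binding, the Turtle output contains "ex:Cade ex:livesIn ex:France ."<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />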
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
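The query above can be run with rdflib like any other query. A self-contained sketch, building Emma's list in code instead of assuming an existing graph:<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b, [ex.Portugal, ex.Italy, ex.France])<br />
<br />
# The property path ex:visit/rdf:rest*/rdf:first walks the linked-list structure.<br />
qres = g.query("""<br />
    PREFIX ex: <http://example.org/><br />
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
    SELECT ?visit WHERE {<br />
        ex:Emma ex:visit/rdf:rest*/rdf:first ?visit .<br />
    }""")<br />
for row in qres:<br />
    print(row.visit)<br />
</syntaxhighlight><br />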
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country .<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a connection to the same URL that serves Blazegraph's graphical interface.<br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SPARQLWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In this query I want to select all the vitamins in Wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph<br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file (as RDF/XML, matching the .rdf extension)<br />
graph.serialize(destination=destfile, format='xml')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query DBpedia with SPARQLWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<!--<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + show_id)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child AND child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using sparql.update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX ex: <http://example.org#><br />
INSERT DATA {<br />
    ex:Socrates rdf:type ex:Man .<br />
    ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters:<br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % b.askAnswer)<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        actor = actor.text.replace(" ", "_")<br />
        g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
        g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and Parent requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1712
Lab Solutions
2022-02-21T08:15:21Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. However, generally we want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live.<br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_Tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_Tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_Tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_Tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_Tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the address can make more sense as resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them.<br />
<br />
g.add((ex.Cade_Tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_Tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_Tracey, ex.state, ex.California))<br />
g.add((ex.Cade_Tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_Tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 -<br />
# Grouping the information into an Address. We can represent the address concept with its own URI OR with a blank node.<br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it.<br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID.<br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote endpoint like Dbpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes with SPARQL queries if they are already binded to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once.<br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
<syntaxhighlight><br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
    PREFIX fam: <http://example.org/family#><br />
    SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
</syntaxhighlight><br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
<syntaxhighlight><br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
    """SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""",<br />
    initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
</syntaxhighlight><br />
<br />
===Select all contents of lists (rdfllib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE {<br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    capital_result = g.query(q, initBindings={'country': country})<br />
<br />
    for row in capital_result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that contains the graphic interface for Blazegraph. <br />
# You also need to add "sparql" to the end of the URL, as shown below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all triples in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SparqlWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
<!--<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL update<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL update and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query Dbpedia with SparqlWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child, and child livesWith father) does not imply that the mother lives with the father. That makes it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph using a SPARQL update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclassed only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: ' + str(bool(b)))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again, as in Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        else:<br />
            actor = actor.text.replace(" ", "_")<br />
            g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
            g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# using the restriction above, If we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the restriction and Parent requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab:_SPARQL&diff=1687
Lab: SPARQL
2022-02-09T14:35:37Z
<p>Tkr048: </p>
<hr />
<div>=Lab 3: SPARQL / Blazegraph=<br />
<br />
==Topics==<br />
* Setting up the Blazegraph graph database. Previously we have only stored our triples in memory, which is not persistent. <br />
* SPARQL queries and updates. We use SPARQL to retrieve or update triples in our databases/graphs of triples.<br />
<br />
==Installing the Blazegraph database on your own computer==<br />
Download Blazegraph (blazegraph.jar) from here: [https://blazegraph.com/ https://blazegraph.com/]<br />
I recommend placing blazegraph.jar in the same folder as your Python project for the labs. <br />
Navigate to the folder of blazegraph.jar in your commandline/terminal using cd. (cd C:\Users\marti\info216 for me as an example). Now run this command:<br />
<syntaxhighlight><br />
java -server -Xmx4g -jar blazegraph.jar<br />
</syntaxhighlight><br />
You might have to install java 64-bit JDK if you have problems running blazegraph. You can do it from this link: <br />
"https://www.oracle.com/technetwork/java/javase/downloads/"<br />
If you get an "Address already in use" error, this is likely because blazegraph has been terminated improperly. Either restart the terminal-session or try to run this command instead: <br />
<syntaxhighlight><br />
java -server -Xmx4g -Djetty.port=19999 -jar blazegraph.jar <br />
</syntaxhighlight><br />
This changes the port of the blazegraph server.<br />
<br />
If you have trouble installing Blazegraph you can use this link for now: "http://sandbox.i2s.uib.no/bigdata/".<br />
This is the same Blazegraph interface, but it is stored in the cloud and can only be used on the UiB network. You may be able to access it without connecting to the UiB network, but if you are unable to access the endpoint, try connecting via the VPN. Instructions [https://hjelp.uib.no/tas/public/ssp/content/detail/service?unid=a566dafec92a4d35bba974f0733f3663 here].<br />
<br />
<br />
If it works, it should now display a URL like "http://10.0.0.13:9999/blazegraph/". Open this in a browser. <br />
You can now run SPARQL queries and updates and load RDF graphs from your file into Blazegraph.<br />
In the update tab, load RDF data (select type below) and then paste the contents of your turtle/.txt file to add them all at once to the database. If you have not serialized your graph from lab 2 yet, you can use the triples on the bottom of the page instead. Just copy and paste them into the Update section.<br />
<br />
==Tasks==<br />
<br />
Write the following SPARQL queries: <br />
<br />
* SELECT all triples in your graph. <br />
* SELECT all the interests of Cade.<br />
* SELECT the city and country of where Emma lives.<br />
* SELECT only people who are older than 26.<br />
* SELECT Everyone who graduated with a Bachelor Degree. <br />
<br />
Use SPARQL Update's DELETE DATA to delete the fact that Cade is interested in Photography. Run your SPARQL query again to check that the graph has changed.<br />
<br />
Use INSERT DATA to add information about Sergio Pastor, who lives in 4 Carrer del Serpis, 46021 Valencia, Spain. He has an M.Sc. in computer science from the University of Valencia from 2008. His areas of expertise include big data, semantic technologies and machine learning.<br />
<br />
Write a SPARQL DELETE/INSERT update to change the name of "University of Valencia" to "Universidad de Valencia" wherever it occurs.<br />
<br />
Write a SPARQL DESCRIBE query to get basic information about Sergio.<br />
<br />
Write a SPARQL CONSTRUCT query that returns the following: any city in an address is a cityOf the country in the same address.<br />
<br />
==If you have more time==<br />
Redo all the above steps, this time writing a Python/RDFlib program. This will be the topic of lab 6.<br />
You can look at the python example page to see how to connect to your blazegraph endpoint in Python and how to perform some basic queries.<br />
<br />
<br />
==Useful Links==<br />
[https://wiki.uib.no/info216/index.php/File:S03-SPARQL-13.pdf Lecture Notes]<br />
<br />
[https://www.w3.org/TR/sparql11-query/ SPARQL Query Documentation]<br />
<br />
[http://www.w3.org/TR/sparql11-update/ SPARQL Update Documentation]<br />
<br />
If you want to explore more, try out Wikidata Query Service<br />
<br />
[https://query.wikidata.org/ Wikidata Query Service]<br />
<br />
Tutorials <br />
<br />
[https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial Tutorials]<br />
<br />
[https://wdqs-tutorial.toolforge.org/ Interactive tutorial]<br />
<br />
==Triples that you can base your queries on: (turtle format)==<br />
<syntaxhighlight><br />
@prefix ex: <http://example.org/> .<br />
@prefix foaf: <http://xmlns.com/foaf/0.1/> .<br />
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .<br />
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .<br />
@prefix xml: <http://www.w3.org/XML/1998/namespace> .<br />
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .<br />
<br />
ex:Cade a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Berkeley ;<br />
ex:country ex:USA ;<br />
ex:postalCode "94709"^^xsd:string ;<br />
ex:state ex:California ;<br />
ex:street "1516_Henry_Street"^^xsd:string ] ;<br />
ex:age 27 ;<br />
ex:characteristic ex:Kind ;<br />
ex:degree [ ex:degreeField ex:Biology ;<br />
ex:degreeLevel "Bachelor"^^xsd:string ;<br />
ex:degreeSource ex:University_of_California ;<br />
ex:year "2011-01-01"^^xsd:gYear ] ;<br />
ex:interest ex:Bird,<br />
ex:Ecology,<br />
ex:Environmentalism,<br />
ex:Photography,<br />
ex:Travelling ;<br />
ex:married ex:Mary ;<br />
ex:meeting ex:Meeting1 ;<br />
ex:visit ex:Canada,<br />
ex:France,<br />
ex:Germany ;<br />
foaf:knows ex:Emma ;<br />
foaf:name "Cade_Tracey"^^xsd:string .<br />
<br />
ex:Mary a ex:Student,<br />
foaf:Person ;<br />
ex:age 26 ;<br />
ex:characteristic ex:Kind ;<br />
ex:interest ex:Biology,<br />
ex:Chocolate,<br />
ex:Hiking .<br />
<br />
ex:Emma a foaf:Person ;<br />
ex:address [ a ex:Address ;<br />
ex:city ex:Valencia ;<br />
ex:country ex:Spain ;<br />
ex:postalCode "46020"^^xsd:string ;<br />
ex:street "Carrer_de_la_Guardia_Civil_20"^^xsd:string ] ;<br />
ex:age 26 ;<br />
ex:degree [ ex:degreeField ex:Chemistry ;<br />
ex:degreeLevel "Master"^^xsd:string ;<br />
ex:degreeSource ex:University_of_Valencia ;<br />
ex:year "2015-01-01"^^xsd:gYear ] ;<br />
ex:expertise ex:Air_Pollution,<br />
ex:Toxic_Waste,<br />
ex:Waste_Management ;<br />
ex:interest ex:Bike_Riding,<br />
ex:Music,<br />
ex:Travelling ;<br />
ex:meeting ex:Meeting1 ;<br />
ex:visit ( ex:Portugal ex:Italy ex:France ex:Germany ex:Denmark ex:Sweden ) ;<br />
foaf:name "Emma_Dominguez"^^xsd:string .<br />
<br />
ex:Meeting1 a ex:Meeting ;<br />
ex:date "August, 2014"^^xsd:string ;<br />
ex:involved ex:Cade,<br />
ex:Emma ;<br />
ex:location ex:Paris .<br />
<br />
ex:Paris a ex:City ;<br />
ex:capitalOf ex:France ;<br />
ex:locatedIn ex:France .<br />
<br />
ex:France ex:capital ex:Paris .<br />
<br />
<br />
</syntaxhighlight></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1681
Lab Solutions
2022-02-06T17:26:59Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The Turtle format is designed to be easy for humans to read.<br />
# (Assumes g is an rdflib Graph that has already been populated.)<br />
print(g.serialize(format="turtle"))<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address into one Literal. Generally, however, we want to separate each part of the address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of an address make more sense as resources than as Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily a requirement if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Group the information into an Address. We can represent the address with its own URI OR with a blank node. <br />
# One advantage of this is that we can easily remove the entire address instead of removing each individual part of it. <br />
# Solution 4 or 5 is the recommended way to model addresses. Here, ex.CadeAddress could also be called something like ex.address1, and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAddress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 - using existing vocabularies for the address<br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing resources for places like California (e.g. http://dbpedia.org/resource/California from dbpedia.org).<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))  # schema.org uses addressLocality for the city<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
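One advantage of grouping the address (Solutions 4 and 5) is that the whole thing can be removed in one step. A minimal sketch, with names as in the examples above; the None fields in g.remove act as wildcards:<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode, Literal
from rdflib.namespace import RDF

g = Graph()
ex = Namespace("http://example.org/")

# Build the grouped address from Solution 4.
address = BNode()
g.add((ex.Cade_Tracey, ex.address, address))
g.add((address, RDF.type, ex.Address))
g.add((address, ex.street, Literal("1516 Henry Street")))
g.add((address, ex.city, ex.Berkeley))

# Remove the whole address: first the link from Cade, then every
# triple that has the address node as its subject (None = wildcard).
g.remove((ex.Cade_Tracey, ex.address, address))
g.remove((address, None, None))

print(len(g))  # 0 - all the address triples are gone
</syntaxhighlight><br />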
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
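The datatype decides which Python value rdflib hands back when a Literal is converted. A quick check with Literal.toPython(), reusing the values above:<br />
<syntaxhighlight><br />
from rdflib import Literal
from rdflib.namespace import XSD

age = Literal("27", datatype=XSD.integer)
gpa = Literal("3.3", datatype=XSD.float)

print(age.toPython())  # 27, as a Python int
print(gpa.toPython())  # 3.3, as a Python float
print(age.datatype)    # http://www.w3.org/2001/XMLSchema#integer
</syntaxhighlight><br />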
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote source like DBpedia<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph Binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the correct expected prefix <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
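To see the effect of binding, add a triple and serialize; the bound prefixes show up in the Turtle output instead of generated names like ns1. A small sketch (the decode step keeps it working on older rdflib versions that return bytes):<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace

g = Graph()
ex = Namespace("http://example.org/")
g.bind("ex", ex)

g.add((ex.Cade, ex.interest, ex.Photography))

turtle = g.serialize(format="turtle")
if isinstance(turtle, bytes):  # rdflib < 6 returned bytes
    turtle = turtle.decode()

print(turtle)  # starts with "@prefix ex: <http://example.org/> ."
</syntaxhighlight><br />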
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
# Sometimes we want to add a whole list of objects for the same predicate at once. <br />
# In these cases we can use Collection() to build the RDF list (rdf:first/rdf:rest structure) for us.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<!--<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent . <br />
?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
print("%s has aunt %s" % row)<br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = rdflib.plugins.sparql.prepareQuery(<br />
"""SELECT ?child ?sister WHERE {<br />
?child fam:hasParent ?parent .<br />
?parent fam:hasSister ?sister .<br />
}""",<br />
initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
print(row)<br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
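Run from Python, the same property path query looks like this. A sketch; the graph is built with rdflib.collection.Collection as in the Collection example above:<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode
from rdflib.collection import Collection

g = Graph()
ex = Namespace("http://example.org/")
g.bind("ex", ex)

# Build the visit list as in the Collection example.
b = BNode()
g.add((ex.Emma, ex.visit, b))
Collection(g, b, [ex.Portugal, ex.Italy, ex.France])

# The property path rdf:rest*/rdf:first walks the list structure.
qres = g.query("""
    PREFIX ex: <http://example.org/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?visit WHERE {
        ex:Emma ex:visit/rdf:rest*/rdf:first ?visit .
    }""")

visits = sorted(str(row.visit) for row in qres)
for v in visits:
    print(v)  # one line per country in the list
</syntaxhighlight><br />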
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
country = URIRef(ex + country)<br />
q = prepareQuery(<br />
"""<br />
PREFIX ex: <http://example.org/><br />
SELECT ?person WHERE { <br />
?person ex:livesIn ?country.<br />
}<br />
""")<br />
<br />
capital_result = g.query(q, initBindings={'country': country})<br />
<br />
for row in capital_result:<br />
print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that hosts Blazegraph's graphical interface. <br />
# You also need to add "sparql" to the end of the URL, like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SPARQLWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In the query I want to select all the Vitamins in wikidata.<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL CONSTRUCT query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.ttl'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='ttl')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
(len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query DBpedia with SPARQLWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we are creating the graph with a SPARQL update (g.update), but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDFS_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclassed only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: ' + str(b.askAnswer))<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
# The names of the people act as subjects.<br />
subject = row['Name']<br />
# Create triples: e.g. "Cade_Tracey - age - 27"<br />
g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
# If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
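To try the lifting code without a file on disk, the same CSV can be fed to pandas from a string. A minimal, self-contained sketch; io.StringIO stands in for task1.csv, and only the age and married columns are lifted here:<br />
<syntaxhighlight><br />
import io

import pandas as pd
from rdflib import Graph, Literal, Namespace, URIRef

g = Graph()
ex = Namespace("http://example.org/")

# The same kind of CSV as above, inlined so the example runs without a file.
csv_text = """Name,Age,Spouse,Country
Cade Tracey,26,Mary Jackson,US
Bob Johnson,21,,Canada"""

csv_data = pd.read_csv(io.StringIO(csv_text))
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)
csv_data = csv_data.fillna("unknown")

for index, row in csv_data.iterrows():
    subject = row["Name"]
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))

# Drop the triples for missing values, as in the full example.
g.remove((None, None, URIRef("http://example.org/unknown")))

print(len(g))  # 3 triples: two ages, one marriage
</syntaxhighlight><br />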
<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
show_id = tv_show.attrib["id"]<br />
title = tv_show.find("title").text<br />
<br />
g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
for actor in tv_show.findall("actor"):<br />
first_name = actor.find("firstname").text<br />
last_name = actor.find("lastname").text<br />
full_name = first_name + "_" + last_name<br />
<br />
g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + title)))<br />
g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
title = show.find("h3").text<br />
actors = show.find('ul', attrs={'class': 'actor_list'})<br />
for actor in actors:<br />
if isinstance(actor, NavigableString):<br />
continue<br />
else:<br />
actor = actor.text.replace(" ", "_")<br />
g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
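Once the JSON is loaded, lifting it to RDF is a matter of walking the parsed structure and adding triples. A minimal sketch with hard-coded sample data shaped like the GeoNames lookup response (the keys postalcodes, placeName, postalcode and countryCode come from that response; the ex.postalCode and ex.countryCode properties are made up for illustration):<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef

g = Graph()
ex = Namespace("http://example.org/")
g.bind("ex", ex)

# Sample data shaped like the GeoNames postalCodeLookupJSON response,
# hard-coded here so the example runs without network access.
data = {"postalcodes": [
    {"placeName": "Valencia", "postalcode": "46020", "countryCode": "ES"}
]}

for entry in data["postalcodes"]:
    place = URIRef(ex + entry["placeName"].replace(" ", "_"))
    g.add((place, ex.postalCode, Literal(entry["postalcode"])))
    g.add((place, ex.countryCode, Literal(entry["countryCode"])))

print(g.serialize(format="turtle"))
</syntaxhighlight><br />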
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who eats only vegetarian food (modelled here as the class ex.VeganFood)<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# Alternative modelling: a vegetarian is a Person who eats no meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...: <br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a worried parent, even though we didn't specify it ourselves, because Bob fulfills the Parent requirement and the restriction.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048
http://info216.wiki.uib.no/index.php?title=Lab_Solutions&diff=1672
Lab Solutions
2022-01-30T14:43:21Z
<p>Tkr048: </p>
<hr />
<div>This page will be updated with Python examples related to the lectures and labs. We will add more examples after each lab has ended. The first examples will use Python's RDFlib. We will introduce other relevant libraries later.<br />
<br />
<br />
==Getting started==<br />
<br />
<br />
===Printing the triples of the Graph in a readable way===<br />
<syntaxhighlight><br />
# The turtle format has the purpose of being more readable for humans. <br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===Coding Tasks Lab 1===<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.married, ex.Mary))<br />
g.add((ex.France, ex.capital, ex.Paris))<br />
g.add((ex.Cade, ex.age, Literal("27", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.age, Literal("26", datatype=XSD.integer)))<br />
g.add((ex.Mary, ex.interest, ex.Hiking))<br />
g.add((ex.Mary, ex.interest, ex.Chocolate))<br />
g.add((ex.Mary, ex.interest, ex.Biology))<br />
g.add((ex.Mary, RDF.type, ex.Student))<br />
g.add((ex.Paris, RDF.type, ex.City))<br />
g.add((ex.Paris, ex.locatedIn, ex.France))<br />
g.add((ex.Cade, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, ex.characteristic, ex.Kind))<br />
g.add((ex.Mary, RDF.type, FOAF.Person))<br />
g.add((ex.Cade, RDF.type, FOAF.Person))<br />
<br />
</syntaxhighlight><br />
<!--<br />
==Basic RDF programming==<br />
<br />
===Different ways to create an address===<br />
<br />
<syntaxhighlight><br />
<br />
from rdflib import Graph, Namespace, URIRef, BNode, Literal<br />
from rdflib.namespace import RDF, FOAF, XSD<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# How to represent the address of Cade Tracey. From probably the worst solution to the best.<br />
<br />
# Solution 1 -<br />
# Make the entire address one Literal. However, we generally want to separate each part of an address into its own triple. This is useful, for instance, if we want to find only the streets where people live. <br />
<br />
g.add((ex.Cade_Tracey, ex.livesIn, Literal("1516_Henry_Street, Berkeley, California 94709, USA")))<br />
<br />
<br />
# Solution 2 - <br />
# Separate the different pieces of information into their own triples<br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, Literal("Berkeley")))<br />
g.add((ex.Cade_tracey, ex.state, Literal("California")))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, Literal("USA")))<br />
<br />
<br />
# Solution 3 - Some parts of the addresses can make more sense to be resources than Literals.<br />
# Larger concepts like a city or state are typically represented as resources rather than Literals, but this is not necessarily required if you don't intend to say more about them. <br />
<br />
g.add((ex.Cade_tracey, ex.street, Literal("1516_Henry_Street")))<br />
g.add((ex.Cade_tracey, ex.city, ex.Berkeley))<br />
g.add((ex.Cade_tracey, ex.state, ex.California))<br />
g.add((ex.Cade_tracey, ex.zipcode, Literal("94709")))<br />
g.add((ex.Cade_tracey, ex.country, ex.USA))<br />
<br />
<br />
# Solution 4 <br />
# Group the information into an Address. We can represent the address concept with its own URI OR with a blank node. <br />
# One advantage of this is that we can easily remove the entire address, instead of removing each individual part of it. <br />
# Solution 4 or 5 is how I would recommend making addresses. Here, ex.CadeAddress could also be called something like ex.address1 and so on, if you want to give each address a unique ID. <br />
<br />
# Address URI - CadeAdress<br />
<br />
g.add((ex.Cade_Tracey, ex.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, ex.Address))<br />
g.add((ex.CadeAddress, ex.street, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, ex.city, ex.Berkeley))<br />
g.add((ex.CadeAddress, ex.state, ex.California))<br />
g.add((ex.CadeAddress, ex.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, ex.country, ex.USA))<br />
<br />
# OR<br />
<br />
# Blank node for Address. <br />
address = BNode()<br />
g.add((ex.Cade_Tracey, ex.address, address))<br />
g.add((address, RDF.type, ex.Address))<br />
g.add((address, ex.street, Literal("1516 Henry Street", datatype=XSD.string)))<br />
g.add((address, ex.city, ex.Berkeley))<br />
g.add((address, ex.state, ex.California))<br />
g.add((address, ex.postalCode, Literal("94709", datatype=XSD.string)))<br />
g.add((address, ex.country, ex.USA))<br />
<br />
<br />
# Solution 5 using existing vocabularies for address <br />
<br />
# (in this case https://schema.org/PostalAddress from schema.org). <br />
# Also using existing ontology for places like California. (like http://dbpedia.org/resource/California from dbpedia.org)<br />
<br />
schema = Namespace("https://schema.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
<br />
g.add((ex.Cade_Tracey, schema.address, ex.CadeAddress))<br />
g.add((ex.CadeAddress, RDF.type, schema.PostalAddress))<br />
g.add((ex.CadeAddress, schema.streetAddress, Literal("1516 Henry Street")))<br />
g.add((ex.CadeAddress, schema.addressLocality, dbp.Berkeley))<br />
g.add((ex.CadeAddress, schema.addressRegion, dbp.California))<br />
g.add((ex.CadeAddress, schema.postalCode, Literal("94709")))<br />
g.add((ex.CadeAddress, schema.addressCountry, dbp.United_States))<br />
<br />
</syntaxhighlight><br />
<br />
===Typed Literals===<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace<br />
from rdflib.namespace import FOAF, XSD<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.Cade, ex.age, Literal(27, datatype=XSD.integer)))<br />
g.add((ex.Cade, ex.gpa, Literal(3.3, datatype=XSD.float)))<br />
g.add((ex.Cade, FOAF.name, Literal("Cade Tracey", datatype=XSD.string)))<br />
g.add((ex.Cade, ex.birthday, Literal("2006-01-01", datatype=XSD.date)))<br />
</syntaxhighlight><br />
<br />
<br />
===Writing and reading graphs/files===<br />
<br />
<syntaxhighlight><br />
# Writing the graph to a file on your system. Possible formats = turtle, n3, xml, nt.<br />
g.serialize(destination="triples.txt", format="turtle")<br />
<br />
# Parsing a local file<br />
parsed_graph = g.parse(location="triples.txt", format="turtle")<br />
<br />
# Parsing a remote RDF resource, here a DBpedia page<br />
dbpedia_graph = g.parse("http://dbpedia.org/resource/Pluto")<br />
</syntaxhighlight><br />
<br />
===Graph Binding===<br />
<syntaxhighlight><br />
#Graph binding is useful for at least two reasons:<br />
#(1) We no longer need to specify prefixes in SPARQL queries if they are already bound to the graph.<br />
#(2) When serializing the graph, the serialization will show the expected prefixes <br />
# instead of default namespace names ns1, ns2 etc.<br />
<br />
g = Graph()<br />
<br />
ex = Namespace("http://example.org/")<br />
dbp = Namespace("http://dbpedia.org/resource/")<br />
schema = Namespace("https://schema.org/")<br />
<br />
g.bind("ex", ex)<br />
g.bind("dbp", dbp)<br />
g.bind("schema", schema)<br />
</syntaxhighlight><br />
<br />
===Collection Example===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, BNode<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
<br />
# Sometimes we want to add many objects or subjects for the same predicate at once. <br />
# In these cases we can use Collection() to save some time.<br />
# In this case I want to add all countries that Emma has visited at once.<br />
<br />
b = BNode()<br />
g.add((ex.Emma, ex.visit, b))<br />
Collection(g, b,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
# OR<br />
<br />
g.add((ex.Emma, ex.visit, ex.EmmaVisits))<br />
Collection(g, ex.EmmaVisits,<br />
[ex.Portugal, ex.Italy, ex.France, ex.Germany, ex.Denmark, ex.Sweden])<br />
<br />
</syntaxhighlight><br />
<br />
==SPARQL==<br />
<br />
Also see the [[SPARQL Examples]] page!<br />
<br />
===Querying a local ("in memory") graph===<br />
<br />
Example contents of the file family.ttl:<br />
<syntaxhighlight><br />
@prefix rex: <http://example.org/royal#> .<br />
@prefix fam: <http://example.org/family#> .<br />
<br />
rex:IngridAlexandra fam:hasParent rex:HaakonMagnus .<br />
rex:SverreMagnus fam:hasParent rex:HaakonMagnus .<br />
rex:HaakonMagnus fam:hasParent rex:Harald .<br />
rex:MarthaLouise fam:hasParent rex:Harald .<br />
rex:HaakonMagnus fam:hasSister rex:MarthaLouise .<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
qres = g.query("""<br />
PREFIX fam: <http://example.org/family#><br />
SELECT ?child ?sister WHERE {<br />
    ?child fam:hasParent ?parent .<br />
    ?parent fam:hasSister ?sister .<br />
}""")<br />
for row in qres:<br />
    print("%s has aunt %s" % row)<br />
</syntaxhighlight><br />
<br />
With a prepared query, you can write the query once, and then bind some of the variables each time you use it:<br />
<syntaxhighlight><br />
import rdflib<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = rdflib.Graph()<br />
g.parse("family.ttl", format='ttl')<br />
<br />
q = prepareQuery(<br />
    """SELECT ?child ?sister WHERE {<br />
        ?child fam:hasParent ?parent .<br />
        ?parent fam:hasSister ?sister .<br />
    }""",<br />
    initNs = { "fam": "http://example.org/family#"})<br />
<br />
sm = rdflib.URIRef("http://example.org/royal#SverreMagnus")<br />
<br />
for row in g.query(q, initBindings={'child': sm}):<br />
    print(row)<br />
</syntaxhighlight><br />
<br />
===Select all contents of lists (rdflib.Collection)===<br />
<syntaxhighlight><br />
<br />
# rdflib.Collection has a different internal structure, so it requires a slightly more advanced query. Here I am selecting all places that Emma has visited.<br />
<br />
PREFIX ex: <http://example.org/><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
<br />
SELECT ?visit<br />
WHERE {<br />
ex:Emma ex:visit/rdf:rest*/rdf:first ?visit<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
===Using parameters/variables in rdflib queries===<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, URIRef<br />
from rdflib.plugins.sparql import prepareQuery<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
g.add((ex.Cade, ex.livesIn, ex.France))<br />
g.add((ex.Anne, ex.livesIn, ex.Norway))<br />
g.add((ex.Sofie, ex.livesIn, ex.Sweden))<br />
g.add((ex.Per, ex.livesIn, ex.Norway))<br />
g.add((ex.John, ex.livesIn, ex.USA))<br />
<br />
<br />
def find_people_from_country(country):<br />
    country = URIRef(ex + country)<br />
    q = prepareQuery(<br />
        """<br />
        PREFIX ex: <http://example.org/><br />
        SELECT ?person WHERE { <br />
            ?person ex:livesIn ?country.<br />
        }<br />
        """)<br />
<br />
    result = g.query(q, initBindings={'country': country})<br />
    for row in result:<br />
        print(row)<br />
<br />
find_people_from_country("Norway")<br />
</syntaxhighlight><br />
<br />
===SELECTING data from Blazegraph via Python===<br />
<syntaxhighlight><br />
<br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
# This creates a server connection to the same URL that provides the graphical interface for Blazegraph. <br />
# You also need to add "sparql" to end of the URL like below.<br />
<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/sparql")<br />
<br />
# SELECT all distinct predicates in the database.<br />
<br />
sparql.setQuery("""<br />
SELECT DISTINCT ?p WHERE {<br />
?s ?p ?o.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["p"]["value"])<br />
<br />
# SELECT all interests of Cade<br />
<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
SELECT DISTINCT ?interest WHERE {<br />
ex:Cade ex:interest ?interest.<br />
}<br />
""")<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["interest"]["value"])<br />
</syntaxhighlight><br />
<br />
===Updating data from Blazegraph via Python===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, POST, DIGEST<br />
<br />
namespace = "kb"<br />
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")<br />
<br />
sparql.setMethod(POST)<br />
sparql.setQuery("""<br />
PREFIX ex: <http://example.org/><br />
INSERT DATA{<br />
ex:Cade ex:interest ex:Mathematics.<br />
}<br />
""")<br />
<br />
results = sparql.query()<br />
print(results.response.read())<br />
<br />
<br />
</syntaxhighlight><br />
===Retrieving data from Wikidata with SPARQLWrapper===<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")<br />
# In this query I select all the vitamins in Wikidata (subclasses of wd:Q34956).<br />
<br />
sparql.setQuery("""<br />
SELECT ?nutrient ?nutrientLabel WHERE<br />
{<br />
?nutrient wdt:P279 wd:Q34956.<br />
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["nutrient"]["value"], " ", result["nutrientLabel"]["value"])<br />
</syntaxhighlight><br />
<br />
More examples can be found in the example section on the official query service here: https://query.wikidata.org/.<br />
<br />
===Download from BlazeGraph===<br />
<br />
<syntaxhighlight><br />
"""<br />
Dumps a database to a local RDF file.<br />
You need to install the SPARQLWrapper package first...<br />
"""<br />
<br />
import datetime<br />
from SPARQLWrapper import SPARQLWrapper, RDFXML<br />
<br />
# your namespace, the default is 'kb'<br />
ns = 'kb'<br />
<br />
# the SPARQL endpoint<br />
endpoint = 'http://info216.i2s.uib.no/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# - the endpoint just moved, the old one was:<br />
# endpoint = 'http://i2s.uib.no:8888/bigdata/namespace/' + ns + '/sparql'<br />
<br />
# create wrapper<br />
wrapper = SPARQLWrapper(endpoint)<br />
<br />
# prepare the SPARQL query<br />
wrapper.setQuery('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')<br />
wrapper.setReturnFormat(RDFXML)<br />
<br />
# execute the SPARQL query and convert the result to an rdflib.Graph <br />
graph = wrapper.query().convert()<br />
<br />
# the destination file, with code to make it timestamped<br />
destfile = 'rdf_dumps/slr-kg4news-' + datetime.datetime.now().strftime('%Y%m%d-%H%M') + '.rdf'<br />
<br />
# serialize the result to file<br />
graph.serialize(destination=destfile, format='xml')<br />
<br />
# report and quit<br />
print('Wrote %u triples to file %s .' %<br />
      (len(graph), destfile))<br />
</syntaxhighlight><br />
<br />
===Query DBpedia with SPARQLWrapper===<br />
<br />
<syntaxhighlight><br />
from SPARQLWrapper import SPARQLWrapper, JSON<br />
<br />
sparql = SPARQLWrapper("http://dbpedia.org/sparql")<br />
<br />
sparql.setQuery("""<br />
PREFIX dbr: <http://dbpedia.org/resource/><br />
PREFIX dbo: <http://dbpedia.org/ontology/><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
SELECT ?comment<br />
WHERE {<br />
dbr:Barack_Obama rdfs:comment ?comment.<br />
FILTER (langMatches(lang(?comment),"en"))<br />
}<br />
""")<br />
<br />
sparql.setReturnFormat(JSON)<br />
results = sparql.query().convert()<br />
<br />
for result in results["results"]["bindings"]:<br />
    print(result["comment"]["value"])<br />
</syntaxhighlight><br />
<br />
==RDFS==<br />
<br />
===RDFS-plus (OWL) Properties===<br />
<syntaxhighlight><br />
g.add((ex.married, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.married, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.ReflexiveProperty))<br />
g.add((ex.livesWith, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.TransitiveProperty))<br />
g.add((ex.sibling, RDF.type, OWL.SymmetricProperty))<br />
g.add((ex.sibling, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.FunctionalProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.hasFather, RDF.type, OWL.IrreflexiveProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.AsymmetricProperty))<br />
g.add((ex.fatherOf, RDF.type, OWL.IrreflexiveProperty))<br />
<br />
# Sometimes there is no definite answer, and it comes down to how we want to model our properties.<br />
# E.g. is livesWith a transitive property? Usually yes, but we may also want to specify that a child lives with both of her divorced parents,<br />
# which means that (mother livesWith child & child livesWith father) does not imply (mother livesWith father), making it non-transitive.<br />
</syntaxhighlight><br />
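To illustrate what symmetry and transitivity mean operationally, here is a plain-Python sketch (not OWL reasoning, just set operations) that computes the pairs a reasoner would infer. Note that combining the two axioms also produces reflexive pairs, which is why adding irreflexivity on top, as for ex.sibling above, can clash with instance data:

```python
# Illustration only: the extra pairs a reasoner would infer for a
# property declared symmetric and transitive (names are made up).
def symmetric_closure(pairs):
    return set(pairs) | {(b, a) for (a, b) in pairs}

def transitive_closure(pairs):
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

siblings = {("Anna", "Bob"), ("Bob", "Carl")}
inferred = transitive_closure(symmetric_closure(siblings))
print(sorted(inferred))  # includes ("Carl", "Anna") and also ("Anna", "Anna")
```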
<br />
===RDFS inference with RDFLib===<br />
You can use the OWL-RL package to add inference capabilities to RDFLib. It can be installed using the pip install command:<br />
<syntaxhighlight><br />
pip install owlrl<br />
</syntaxhighlight><br />
Or download it from [https://github.com/RDFLib/OWL-RL GitHub] and copy the ''owlrl'' subfolder into your project folder next to your Python files.<br />
<br />
[https://owl-rl.readthedocs.io/en/latest/owlrl.html OWL-RL documentation.]<br />
<br />
Example program to get you started. In this example we create the graph with a SPARQL update, but it is also possible to parse the data from a file.<br />
<syntaxhighlight><br />
import rdflib.plugins.sparql.update<br />
import owlrl.RDFSClosure<br />
<br />
g = rdflib.Graph()<br />
<br />
ex = rdflib.Namespace('http://example.org#')<br />
g.bind('', ex)<br />
<br />
g.update("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
INSERT DATA {<br />
ex:Socrates rdf:type ex:Man .<br />
ex:Man rdfs:subClassOf ex:Mortal .<br />
}""")<br />
<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
# RDF_Semantics parameters: <br />
# - graph (rdflib.Graph) – The RDF graph to be extended.<br />
# - axioms (bool) – Whether (non-datatype) axiomatic triples should be added or not.<br />
# - daxioms (bool) – Whether datatype axiomatic triples should be added or not.<br />
# - rdfs (bool) – Whether RDFS inference is also done (used in subclasses only).<br />
# For now, you will in most cases use all False in RDFS_Semantics.<br />
<br />
# Generates the closure of the graph - generates the new entailed triples, but does not add them to the graph.<br />
rdfs.closure()<br />
# Adds the new triples to the graph and empties the RDFS triple-container.<br />
rdfs.flush_stored_triples()<br />
<br />
# Ask-query to check whether a new triple has been generated from the entailment.<br />
b = g.query("""<br />
PREFIX ex: <http://example.org#><br />
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#><br />
ASK {<br />
ex:Socrates rdf:type ex:Mortal .<br />
} <br />
""")<br />
print('Result: %s' % b.askAnswer)<br />
</syntaxhighlight><br />
<br />
===Language tagged RDFS labels=== <br />
<syntaxhighlight><br />
from rdflib import Graph, Namespace, Literal<br />
from rdflib.namespace import RDFS<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
<br />
g.add((ex.France, RDFS.label, Literal("Frankrike", lang="no")))<br />
g.add((ex.France, RDFS.label, Literal("France", lang="en")))<br />
g.add((ex.France, RDFS.label, Literal("Francia", lang="es")))<br />
<br />
<br />
</syntaxhighlight><br />
<br />
==OWL== <br />
===Basic inference with RDFLib=== <br />
<br />
You can use the OWL-RL package again as for Lecture 5.<br />
<br />
Instead of: <br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
rdfs = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)<br />
rdfs.closure()<br />
rdfs.flush_stored_triples()<br />
</syntaxhighlight><br />
you can write this to get both RDFS and basic RDFS Plus / OWL inference:<br />
<syntaxhighlight><br />
# The next three lines add inferred triples to g.<br />
owl = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)<br />
owl.closure()<br />
owl.flush_stored_triples()<br />
</syntaxhighlight><br />
<br />
Example updates and queries:<br />
<syntaxhighlight><br />
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#><br />
PREFIX owl: <http://www.w3.org/2002/07/owl#><br />
PREFIX ex: <http://example.org#><br />
<br />
INSERT DATA {<br />
ex:Socrates ex:hasWife ex:Xanthippe .<br />
ex:hasHusband owl:inverseOf ex:hasWife .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Xanthippe ex:hasHusband ex:Socrates .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasHusband ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
INSERT DATA {<br />
ex:hasWife rdfs:subPropertyOf ex:hasSpouse .<br />
ex:hasSpouse rdf:type owl:SymmetricProperty . <br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight><br />
ASK {<br />
ex:Socrates ^ex:hasSpouse ex:Xanthippe .<br />
}<br />
</syntaxhighlight><br />
<br />
<br />
<br />
==Lifting CSV to RDF==<br />
<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, FOAF, RDFS, OWL<br />
import pandas as pd<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
# Load the CSV data as a pandas Dataframe.<br />
csv_data = pd.read_csv("task1.csv")<br />
<br />
# Here I deal with spaces (" ") in the data. I replace them with "_" so that the URIs become valid.<br />
csv_data = csv_data.replace(to_replace=" ", value="_", regex=True)<br />
<br />
# Here I mark all missing/empty data as "unknown". This makes it easy to delete triples containing this later.<br />
csv_data = csv_data.fillna("unknown")<br />
<br />
# Loop through the CSV data, and then make RDF triples.<br />
for index, row in csv_data.iterrows():<br />
    # The names of the people act as subjects.<br />
    subject = row['Name']<br />
    # Create triples: e.g. "Cade_Tracey - age - 27"<br />
    g.add((URIRef(ex + subject), URIRef(ex + "age"), Literal(row["Age"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "married"), URIRef(ex + row["Spouse"])))<br />
    g.add((URIRef(ex + subject), URIRef(ex + "country"), URIRef(ex + row["Country"])))<br />
<br />
    # If we want, we can add additional RDF/RDFS/OWL information, e.g.<br />
    g.add((URIRef(ex + subject), RDF.type, FOAF.Person))<br />
<br />
# I remove triples that I marked as unknown earlier.<br />
g.remove((None, None, URIRef("http://example.org/unknown")))<br />
<br />
# Clean printing of the graph.<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===CSV file for above example===<br />
<br />
<syntaxhighlight><br />
"Name","Age","Spouse","Country"<br />
"Cade Tracey","26","Mary Jackson","US"<br />
"Bob Johnson","21","","Canada"<br />
"Mary Jackson","25","","France"<br />
"Phil Philips","32","Catherine Smith","Japan"<br />
</syntaxhighlight><br />
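The same lifting idea can also be sketched with only the standard library, in case pandas is not available. This is a minimal sketch: the column names match the CSV above (inlined here so the example is self-contained), and missing values are simply skipped instead of being marked "unknown":

```python
import csv
import io

# The CSV content from the example above, inlined to avoid reading a file.
data = '''"Name","Age","Spouse","Country"
"Cade Tracey","26","Mary Jackson","US"
"Bob Johnson","21","","Canada"
"Mary Jackson","25","","France"
"Phil Philips","32","Catherine Smith","Japan"
'''

triples = []
for row in csv.DictReader(io.StringIO(data)):
    # Replace spaces so the values can be used in URIs.
    name = row["Name"].replace(" ", "_")
    triples.append((name, "age", row["Age"]))
    if row["Spouse"]:  # skip missing values instead of marking them
        triples.append((name, "married", row["Spouse"].replace(" ", "_")))
    triples.append((name, "country", row["Country"]))

for t in triples:
    print(t)
```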
<br />
==Lifting XML to RDF==<br />
<syntaxhighlight><br />
from rdflib import Graph, Literal, Namespace, URIRef<br />
from rdflib.namespace import RDF, XSD, RDFS<br />
import xml.etree.ElementTree as ET<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/TV/")<br />
prov = Namespace("http://www.w3.org/ns/prov#")<br />
g.bind("ex", ex)<br />
g.bind("prov", prov)<br />
<br />
tree = ET.parse("tv_shows.xml")<br />
root = tree.getroot()<br />
<br />
for tv_show in root.findall('tv_show'):<br />
    show_id = tv_show.attrib["id"]<br />
    title = tv_show.find("title").text<br />
<br />
    g.add((URIRef(ex + show_id), ex.title, Literal(title, datatype=XSD.string)))<br />
    g.add((URIRef(ex + show_id), RDF.type, ex.TV_Show))<br />
<br />
    for actor in tv_show.findall("actor"):<br />
        first_name = actor.find("firstname").text<br />
        last_name = actor.find("lastname").text<br />
        full_name = first_name + "_" + last_name<br />
<br />
        g.add((URIRef(ex + show_id), ex.stars, URIRef(ex + full_name)))<br />
        g.add((URIRef(ex + full_name), ex.starsIn, URIRef(ex + show_id)))<br />
        g.add((URIRef(ex + full_name), RDF.type, ex.Actor))<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
<br />
<br />
===XML Data for above example===<br />
<syntaxhighlight><br />
<data><br />
<tv_show id="1050"><br />
<title>The_Sopranos</title><br />
<actor><br />
<firstname>James</firstname><br />
<lastname>Gandolfini</lastname><br />
</actor><br />
</tv_show><br />
<tv_show id="1066"><br />
<title>Seinfeld</title><br />
<actor><br />
<firstname>Jerry</firstname><br />
<lastname>Seinfeld</lastname><br />
</actor><br />
<actor><br />
<firstname>Julia</firstname><br />
<lastname>Louis-dreyfus</lastname><br />
</actor><br />
<actor><br />
<firstname>Jason</firstname><br />
<lastname>Alexander</lastname><br />
</actor><br />
</tv_show><br />
</data><br />
</syntaxhighlight><br />
<br />
==Lifting HTML to RDF==<br />
<syntaxhighlight><br />
from bs4 import BeautifulSoup as bs, NavigableString<br />
from rdflib import Graph, URIRef, Namespace<br />
from rdflib.namespace import RDF<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
<br />
html = open("tv_shows.html").read()<br />
html = bs(html, features="html.parser")<br />
<br />
shows = html.find_all('li', attrs={'class': 'show'})<br />
for show in shows:<br />
    title = show.find("h3").text<br />
    actors = show.find('ul', attrs={'class': 'actor_list'})<br />
    for actor in actors:<br />
        if isinstance(actor, NavigableString):<br />
            continue<br />
        actor = actor.text.replace(" ", "_")<br />
        g.add((URIRef(ex + title), ex.stars, URIRef(ex + actor)))<br />
        g.add((URIRef(ex + actor), RDF.type, ex.Actor))<br />
<br />
    g.add((URIRef(ex + title), RDF.type, ex.TV_Show))<br />
<br />
<br />
print(g.serialize(format="turtle").decode())<br />
</syntaxhighlight><br />
<br />
===HTML code for the example above===<br />
<syntaxhighlight><br />
<!DOCTYPE html><br />
<html><br />
<head><br />
<meta charset="utf-8"><br />
<title></title><br />
</head><br />
<body><br />
<div class="tv_shows"><br />
<ul><br />
<li class="show"><br />
<h3>The_Sopranos</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li>James Gandolfini</li><br />
</ul><br />
</li><br />
<li class="show"><br />
<h3>Seinfeld</h3><br />
<div class="irrelevant_data"></div><br />
<ul class="actor_list"><br />
<li >Jerry Seinfeld</li><br />
<li>Jason Alexander</li><br />
<li>Julia Louis-Dreyfus</li><br />
</ul><br />
</li><br />
</ul><br />
</div><br />
</body><br />
</html><br />
</syntaxhighlight><br />
<br />
==Web APIs with JSON==<br />
<syntaxhighlight><br />
import requests<br />
import json<br />
import pprint<br />
<br />
# Retrieve JSON data from API service URL. Then load it with the json library as a json object.<br />
url = "http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo"<br />
data = requests.get(url).content.decode("utf-8")<br />
data = json.loads(data)<br />
pprint.pprint(data)<br />
</syntaxhighlight><br />
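To show what working with the returned JSON looks like without making a network call, here is a sketch that parses a response of the general shape the GeoNames service returns. The field names and values here are assumptions for illustration, not actual API output:

```python
import json

# A response of roughly the shape returned by the GeoNames service above,
# inlined so the example runs offline (keys and values are illustrative).
raw = '''{"postalcodes": [
    {"postalcode": "46020", "placeName": "Valencia", "countryCode": "ES"}
]}'''

data = json.loads(raw)
for entry in data["postalcodes"]:
    print(entry["placeName"], entry["countryCode"])
```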
<br />
<br />
==JSON-LD==<br />
<br />
<syntaxhighlight><br />
import rdflib<br />
<br />
g = rdflib.Graph()<br />
<br />
example = """<br />
{<br />
"@context": {<br />
"name": "http://xmlns.com/foaf/0.1/name",<br />
"homepage": {<br />
"@id": "http://xmlns.com/foaf/0.1/homepage",<br />
"@type": "@id"<br />
}<br />
},<br />
"@id": "http://me.markus-lanthaler.com/",<br />
"name": "Markus Lanthaler",<br />
"homepage": "http://www.markus-lanthaler.com/"<br />
}<br />
"""<br />
<br />
# json-ld parsing automatically deals with @contexts<br />
g.parse(data=example, format='json-ld')<br />
<br />
# serialisation does expansion by default<br />
for line in g.serialize(format='json-ld').decode().splitlines():<br />
    print(line)<br />
<br />
# by supplying a context object, serialisation can do compaction<br />
context = {<br />
"foaf": "http://xmlns.com/foaf/0.1/"<br />
}<br />
for line in g.serialize(format='json-ld', context=context).decode().splitlines():<br />
    print(line)<br />
</syntaxhighlight><br />
<br />
<br />
<div class="credits" style="text-align: right; direction: ltr; margin-left: 1em;">''INFO216, UiB, 2017-2020. All code examples are [https://creativecommons.org/choose/zero/ CC0].'' </div><br />
<br />
==OWL - Complex Classes and Restrictions==<br />
<syntaxhighlight><br />
import owlrl<br />
from rdflib import Graph, Literal, Namespace, BNode<br />
from rdflib.namespace import RDF, OWL, RDFS<br />
from rdflib.collection import Collection<br />
<br />
g = Graph()<br />
ex = Namespace("http://example.org/")<br />
g.bind("ex", ex)<br />
g.bind("owl", OWL)<br />
<br />
# a Season is either Autumn, Winter, Spring, or Summer<br />
seasons = BNode()<br />
Collection(g, seasons, [ex.Winter, ex.Autumn, ex.Spring, ex.Summer])<br />
g.add((ex.Season, OWL.oneOf, seasons))<br />
<br />
# A Parent is a Father or Mother<br />
b = BNode()<br />
Collection(g, b, [ex.Father, ex.Mother])<br />
g.add((ex.Parent, OWL.unionOf, b))<br />
<br />
# A Woman is a person who has the "female" gender<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.gender))<br />
g.add((br, OWL.hasValue, ex.Female))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Woman, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who only eats vegetarian food<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.allValuesFrom, ex.VeganFood))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A vegetarian is a Person who can not eat meat.<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.eats))<br />
g.add((br, OWL.qualifiedCardinality, Literal(0)))<br />
g.add((br, OWL.onClass, ex.Meat))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Person, br])<br />
g.add((ex.Vegetarian, OWL.intersectionOf, bi))<br />
<br />
# A Worried Parent is a parent who has at least one sick child<br />
br = BNode()<br />
g.add((br, RDF.type, OWL.Restriction))<br />
g.add((br, OWL.onProperty, ex.hasChild))<br />
g.add((br, OWL.minQualifiedCardinality, Literal(1)))<br />
g.add((br, OWL.onClass, ex.Sick))<br />
bi = BNode()<br />
Collection(g, bi, [ex.Parent, br])<br />
g.add((ex.WorriedParent, OWL.intersectionOf, bi))<br />
<br />
# Using the restriction above, if we now write...:<br />
g.add((ex.Bob, RDF.type, ex.Parent))<br />
g.add((ex.Bob, ex.hasChild, ex.John))<br />
g.add((ex.John, RDF.type, ex.Sick))<br />
# ...we can infer with OWL reasoning that Bob is a WorriedParent, even though we did not say so explicitly, because Bob fulfills both the Parent and the restriction requirements.<br />
<br />
</syntaxhighlight><br />
<br />
==Protege-OWL reasoning with HermiT==<br />
<br />
[[:File:DL-reasoning-RoyalFamily-final.owl.txt | Example file]] from Lecture 13 about OWL-DL, rules and reasoning.<br />
<br />
--></div>
Tkr048