Lab: RDFS

Latest revision as of 14:14, 15 March 2023

Topics

  • Simple RDFS statements/triples
  • Basic RDFS programming in RDFlib
  • Basic RDFS reasoning with OWL-RL

Useful materials

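rdflib classes/interfaces and attributes:

  • RDF (RDF.type)
  • RDFS (RDFS.domain, RDFS.range, RDFS.subClassOf, RDFS.subPropertyOf)
  • Lab Presentation of RDFS: https://docs.google.com/presentation/d/13fkzg7eM2pnKGqYlKpPMFIJwnOLKBkbT0A62s7OcnOs

OWL-RL:

  • OWL-RL at PyPi: https://pypi.org/project/owlrl/
  • OWL-RL Documentation: https://owl-rl.readthedocs.io/en/latest/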

OWL-RL classes/interfaces:

  • RDFSClosure, RDFS_Semantics

Tasks

Task: Install OWL-RL into your virtual environment:

pip install owlrl

Task: We will use simple RDF statements from the Mueller investigation RDF graph you created in Exercise 1. Create a new rdflib graph and add triples to represent that:

  • Rick Gates was charged with money laundering and tax evasion.

Use RDFS terms to add these rules as triples:

  • When one thing is charged with another thing,
    • the first thing is a person under investigation and
    • the second thing is an offense.

To add triples, you can use either:

  • simple graph.add((s, p, o)) statements or
  • INSERT DATA {...} SPARQL updates.

If you use SPARQL updates, you can define a namespace dictionary like this:

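from rdflib import Namespace
from rdflib.namespace import FOAF, RDF, RDFS

# Prefix-to-namespace mapping, reusable across updates and queries
EX = Namespace('http://example.org#')
NS = {
    'ex': EX,
    'rdf': RDF,
    'rdfs': RDFS,
    'foaf': FOAF,
}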

You can then give NS as an optional argument to graph.update() - or to graph.query() - like this:

g.update("""
    # when you provide an initNs-argument, you do not have 
    # to define PREFIX-es as part of the update (or query)

    INSERT DATA {
        # the triples you want to add go here,
        # you can use the prefixes defined in the NS-dict
    }
""", initNs=NS)

Task:

  • Write a SPARQL query that checks the RDF type(s) of Rick Gates in your RDF graph.
  • Write a similar SPARQL query that checks the RDF type(s) of money laundering in your RDF graph.
  • Write a small function that computes the RDFS closure on your graph.
  • Re-run the SPARQL queries to check the types of Rick Gates and of money laundering again: have they changed?

You can compute the RDFS closure on a graph g like this:

import owlrl

owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

Task: Use RDFS terms to add this rule as a triple:

  • A person under investigation is a FOAF person.
  • Like earlier, check the RDF types of Rick Gates before and after running RDFS reasoning. Do they change?

Task: Add in "plain RDF" as in Exercise 1:

  • Paul Manafort was convicted for tax evasion.

Use RDFS terms to add these rules as triples:

  • When one thing is convicted for another thing,
    • the first thing is also charged with the second thing.

Note: we are dealing with a "timeless" graph here, one that represents facts that have held at some point in time, but not necessarily all at the same time.

  • What are the RDF types of Paul Manafort and of tax evasion before and after RDFS reasoning?
  • Do the RDFS domain and range of the convicted for property change?
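To see what the reasoner does with the rules in this lab, it can help to hand-roll a few RDFS entailment rules over plain Python tuples. The sketch below is not how OWL-RL is implemented, and it covers only four rules (rdfs2 domain, rdfs3 range, rdfs7 subPropertyOf, rdfs9 subClassOf); the function name and the prefixed strings are assumptions for illustration.

```python
TYPE, DOMAIN, RANGE = 'rdf:type', 'rdfs:domain', 'rdfs:range'
SUBPROP, SUBCLASS = 'rdfs:subPropertyOf', 'rdfs:subClassOf'


def naive_rdfs_closure(triples):
    """Apply four RDFS rules repeatedly until no new triple appears."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p2 == DOMAIN and s2 == p:       # rdfs2: type the subject
                    new.add((s, TYPE, o2))
                if p2 == RANGE and s2 == p:        # rdfs3: type the object
                    new.add((o, TYPE, o2))
                if p2 == SUBPROP and s2 == p:      # rdfs7: copy to superproperty
                    new.add((s, o2, o))
                if p == TYPE and p2 == SUBCLASS and s2 == o:  # rdfs9
                    new.add((s, TYPE, o2))
        if new <= closed:                          # fixpoint reached
            return closed
        closed |= new


facts = {
    ('ex:Paul_Manafort', 'ex:convictedFor', 'ex:TaxEvasion'),
    ('ex:convictedFor', SUBPROP, 'ex:chargedWith'),
    ('ex:chargedWith', DOMAIN, 'ex:PersonUnderInvestigation'),
    ('ex:chargedWith', RANGE, 'ex:Offense'),
    ('ex:PersonUnderInvestigation', SUBCLASS, 'foaf:Person'),
}

closure = naive_rdfs_closure(facts)
for triple in sorted(closure - facts):
    print(triple)  # only the triples added by the four rules
```

Comparing this output with what OWL-RL adds to your rdflib graph is a useful check on your answers to the questions above.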

If you have more time...

Task:

  • Create a Turtle file with all the RDF and RDFS triples from the earlier tasks.
  • Go to a local or online Blazegraph. Create a new Blazegraph namespace and Use it. (Remember that namespaces in Blazegraph and rdflib have nothing to do with one another.)
  • Go to the Upload tab to add the triples into your new Blazegraph namespace.
  • Go through each of the above queries to confirm that Blazegraph has not performed any RDFS reasoning.

You can also list all the triples in the namespace to ensure that nothing has been added:

SELECT * WHERE { ?s ?p ?o }

Task:

  • Create another Blazegraph namespace, but this time with the inference box checked. Use the new namespace.
  • List all the triples in the new inferencing namespace. (What you see are the axioms in RDFS.) How many are there?
  • Upload the triples into your new inferencing Blazegraph namespace.
  • List all the triples again. How many have been added?
  • Go through each of the above queries to ensure that Blazegraph (with inference) has reasoned as you expect.