Difference between revisions of "Lab: RDF programming with RDFlib"

From Info216
(41 intermediate revisions by 3 users not shown)

Revision as of 15:53, 4 February 2022

Lab 2: RDF programming with RDFlib

Topics

  • Basic RDF graph programming with RDFlib.
  • Simple reading/writing from/to file.
  • Simple looping through a graph.

Classes/interfaces

from rdflib import Graph, Namespace, URIRef, BNode, Literal

from rdflib.namespace import RDF, FOAF, XSD

from rdflib.collection import Collection


Methods: Graph - add(), remove(), triples(), serialize(), parse(), bind()

Tasks

Consider the following situation: "Cade lives in 1516 Henry Street, Berkeley, California 94709, USA. He has a B.Sc. in biology from the University of California, Berkeley from 2011. His interests include birds, ecology, the environment, photography and travelling. He has visited Canada and France. Emma Dominguez lives in Carrer de la Guardia Civil 20, 46020 Valencia, Spain. She has a M.Sc. in chemistry from the University of Valencia from 2015. Her areas of expertise include waste management, toxic waste, air pollution. Her interests include bike riding, music and travelling. She has visited Portugal, Italy, France, Germany, Denmark and Sweden. Cade knows Emma. They met in Paris in August 2014."


Create a graph in RDFlib with triples corresponding to the text above. Build on the graph from lab 1. Use your own URIs when you need to (like "http://example.org/"), but try to use terms from vocabularies such as FOAF, RDF, XSD, and others.

Write out your graph to the console. A clean way to print it is print(g.serialize(format="turtle")). Then try each of the following formats: "turtle", "n3", "nt", "json-ld", "xml". How do they differ? What is the default?

Write your graph to a file. To do this, pass the destination parameter to serialize(), e.g. g.serialize(destination="triples.txt", format="turtle").

Look at the file and edit it so that Cade has also visited Germany and so that Emma is 26 years old.

Create a new program that reads your graph in again from the file and writes it to the console, e.g. with g.parse(location="triples.txt", format="turtle"). Check that your new data is there!

Continuing with either your first or second program, write a loop that goes through all the triples in the graph and prints them to the console.

Change the loop so that (a) it only loops through triples about Emma, and (b) it only loops through triples involving the names of people.

Remove all triples about Mary using graph.remove(). (The triples about Mary are from lab 1.)

Useful Links

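rdflib documentation (https://rdflib.readthedocs.io/en/stable/index.html):

  • Creating Triples (https://rdflib.readthedocs.io/en/stable/intro_to_creating_rdf.html)
  • Navigating Graphs (https://rdflib.readthedocs.io/en/stable/intro_to_graphs.html)
  • Parsing (https://rdflib.readthedocs.io/en/stable/intro_to_parsing.html)

Lecture Notes (https://wiki.uib.no/info216/index.php/File:S02-RDF-9.pdf)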