Lab: RDF programming with RDFlib

From info216
Revision as of 17:24, 20 January 2023

Topics

  • RDF graph programming with RDFlib

Useful materials

RDFLib:

RDFlib classes/interfaces:

  • from rdflib import Graph, Namespace, URIRef, BNode, Literal
  • from rdflib.namespace import RDF, FOAF, XSD
  • from rdflib.collection import Collection

RDFlib methods:

  • Graph: add(), remove(), triples(), serialize(), parse(), bind()

Tasks

Continue with the graph you created in Exercise 1.

Task: Continue to extend your graph:

  • Michael Cohen was Donald Trump's attorney.
    • He pleaded guilty to lying to Congress.
  • Michael Flynn was an adviser to Donald Trump.
    • He pleaded guilty to lying to the FBI.
    • He negotiated a plea agreement.

Task: According to this FRONTLINE article, Gates', Cohen's and Flynn's lies were different and are described in different levels of detail.

  • How can you represent "different instances of lying" as triples?
  • How can you modify your knowledge graph to account for this?

Task: Save (serialize) your graph to a Turtle file. Add a few triples with more information about Donald Trump. Visualise the result if you want. Read (parse) the Turtle file back into a Python program, and check that the new triples are there.

If you have more time...

Task: Write a method (function) that starts with Donald Trump and prints out the graph depth-first to show how the other graph nodes are connected to him. For example, the output could be:

ex:Donald_Trump
    <-- ex:campaignManager ex:Paul_Manafort
        --> ex:convictedFor ex:BankAndTaxFraud
        ...
    <-- ex:attorneyFor ex:Michael_Cohen
        --> ex:pleadedGuilty ex:LyingToCongress

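Here, the <-- and --> arrows indicate the direction in which a property was followed: --> marks a property followed forwards (from subject to object), and <-- marks one followed in reverse (from object to subject). The arrows are printed with a print() statement in Python; they do not come from rdflib.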

Note: Because you must follow triples in both the subject-to-object and the object-to-subject direction, you must keep a list (or set) of already visited nodes, and never return to a previously visited one.

Note: If you want a neat solution, it may be best to combine two graph traversals: first traverse the model breadth-first to create a new tree-shaped model, and then traverse the tree-shaped model depth-first to print it out with indentation. (The point of the first breadth-first step is to find the shortest path to each node.)