Lab: Getting started with VSCode, Python and RDFlib

From info216
Revision as of 15:37, 13 January 2020 by Say004

Lab 1: Getting started with PyCharm, Python and RDFLib

Topics

Installation of and introduction to Python, PyCharm and RDFLib.

PyCharm is a Python Integrated Development Environment (IDE) that can be used for writing Python code.

RDFLib is a Python library for working with RDF. With this we can create RDF graphs, parse and serialize RDF, perform SPARQL queries on graphs and more.

Preparations

If needed, here is an introduction to creating projects and Python files in PyCharm: Tutorial by jetbrains.com.

RDFLib documentation. Today's focus is on the following:

Additional recommended in-depth documentation:

Useful code examples can be found on this wiki.

Lecture Notes

Relevant RDFlib classes/interfaces (and methods)

import rdflib:

  • Graph (add, remove, serialize)
  • URIRef
  • Literal
  • Namespace
  • RDF, FOAF

All RDFLib modules can be found here. Browser search (often Ctrl-F) is useful for finding the module that you want.

Tasks

You likely already have Python installed on your computer (it comes with Mac, for instance). To find out, type 'python' in the command line on Windows, or type 'python --version' in the terminal on your Mac. If you don't already have Python, download and install Python for your operating system here. We recommend downloading the most recent version.
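Once Python starts, the version can also be checked from inside the interpreter. The following is a small sketch; the function name describe_python is made up for this example.

```python
import sys


def describe_python():
    """Return the running interpreter's version as readable text."""
    v = sys.version_info
    return "Python {}.{}.{}".format(v.major, v.minor, v.micro)


print(describe_python())
# Tells you whether this interpreter is Python 3 or later.
print("Python 3 or later:", sys.version_info.major >= 3)
```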

If you are already comfortable with a Python IDE other than PyCharm, you are free to use that program instead.

Otherwise, download and install the free and open-source PyCharm Community IDE here.


Pip On Windows: Download pip here: https://bootstrap.pypa.io/get-pip.py. (Right-click on this link and click "Save link as". You can save it to your desktop or somewhere else if you like.)

Open a command prompt and navigate to the folder containing get-pip.py (e.g. enter 'cd Desktop' to get to the desktop).

Now run the following command in the prompt: 'python get-pip.py'

Pip On Mac:

Open the terminal and first run: 'curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py'

Now run this: 'python get-pip.py'
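After running get-pip.py, one way to confirm that pip is visible to your Python interpreter is sketched below. The helper name pip_available is made up for this example. It only checks that the pip package can be found, not that the 'pip' command is on your path.

```python
import importlib.util


def pip_available():
    """Return True if the pip package can be found by this interpreter."""
    return importlib.util.find_spec("pip") is not None


print("pip available:", pip_available())
```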


RDFlib

Start PyCharm and create a new project with a new .py file.

Then, using the PyCharm terminal, install RDFLib by simply entering: pip install rdflib
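To check that the installation worked, the new .py file could start with something like the following sketch. It only reports whether RDFLib can be imported and, if so, which version was found.

```python
try:
    import rdflib
except ImportError:
    # RDFLib is not visible to this interpreter.
    rdflib = None

if rdflib is None:
    print("RDFLib is not installed; run 'pip install rdflib' first.")
else:
    print("RDFLib version:", rdflib.__version__)
```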




Here is a smart thing to do from the start: Decide on a folder that you always use for downloading, unpacking and installing your own software. Different software packages you install - like Java, Jena and Eclipse - will go into separate subfolders of this folder. This way you will always know where you have them. On my Windows computer, I put everything under a folder called "C:\Programs\Java". On Linux, I use the folder "/opt/Java". It is good practice to use folder names without blank characters (spaces) in them, because they can cause problems for some software packages.

Download and unpack the Eclipse Oxygen IDE for Java Developers: go to https://eclipse.org/downloads/ and Get Eclipse Oxygen for your computer and operating system. This will download an installer program.

Start the installer program (on Linux, you go to the folder where you installed it and start the "eclipse-inst" program). Choose "Eclipse IDE for Java Developers" and choose an installation folder. (I use folders called "C:\Programs\eclipse-oxygen" and "/opt/eclipse-oxygen" for this.)

Start Eclipse from the start menu (on Linux, go to the folder where you installed it and start the "eclipse" program). If you want to, you can make a shortcut to the "eclipse" program from your desktop or task bar.

Tell Eclipse that you are going to use the JDK 8 you just downloaded: "Window" -> "Preferences" -> "Java" -> "Installed JREs" -> "Add..."

Download and unpack Jena 3.6.0: https://jena.apache.org/download/ . (I use folders called "C:\Programs\apache-jena" and "/opt/apache-jena" for this.) You can download and unpack the source files too.

While you are at it, you can download and unpack Apache Jena Fuseki too. You will need it later.

Re-start Eclipse and create an Eclipse project for Java (File -> New -> Java Project). You can call it INFO216Lab1 for example. Right click on the new project in the "Package Explorer" and choose "Build Path" -> "Add external Archives..." Now you can copy or link the Jena JAR-files into the project. You find them in: .../apache-jena/lib/*.jar .

Create a Java program that creates a single triple and writes it out. Use the Classes/interfaces and methods listed above. The simplest way to run the program is using the green "play" button in the top Eclipse toolbar.

Go to the "INFO216 Wiki" page in this wiki and download the file log4j.properties . You can get rid of all the "log4j warnings" by copying the log4j.properties file into exactly the same directory as your Java-files.

If you have more time...

If you have more time you can continue extending your graph with some of the stuff from lab 2: "Cade Tracy lives in 1516 Henry Street, Berkeley, California 94709, USA. He has a B.Sc. in biology from the University of California, Berkeley from 2011. His interests include birds, ecology, the environment, photography and travelling. He has visited Canada and France." Try to use as many different methods as possible to create the triples.