Difference between revisions of "Lab: Getting started with VSCode, Python and RDFlib"

From Info216
 
(90 intermediate revisions by 4 users not shown)
Line 1: Line 1:
=Lab 1: Getting started with Pycharm, Python and RDFlib=
+
=Lab 1: Getting started with VSCode, Python and RDFlib=
  
 
==Topics==  
 
==Topics==  
Installation of and introduction to Python, Pycharm and RDFlib.
+
Today we have 2 goals.
 +
1. To setup a programming environment that you will use for the lab excercises (if you don't already have one).
 +
2. To start some basic programming of RDF.  
  
PyCharm is a Python Integrated Development Environment (IDE) that can be used for writing python code.
+
Installation of and introduction to Python, VSCode, Pip and RDFlib.
  
RDFLib is a Python library for working with RDF. With this we can create RDF graphs, parse and serialize RDF, perform SPARQL queries on graphs and more.
+
'''VSCode''' is an Integrated Development Environment (IDE) that can be used for writing Python code.
  
==Preparations==
+
'''Conda''' is a package and virtual environment manager that will help you avoid dependency conflicts and clutter.
If needed here is an introduction of how to create projects and python files in Pycharm [https://www.jetbrains.com/help/pycharm/creating-and-running-your-first-python-project.html Eclipse IDE - Tutorial by jetbrains.com].
 
  
[https://rdflib.readthedocs.io/en/stable/index.html# RDFLib Documentation]- Todays focus are the following:
+
The '''Pip''' command is used to easily install additional python packages for your coding environment.
* [https://rdflib.readthedocs.io/en/stable/intro_to_creating_rdf.html  Intro to creating triples]
 
* [https://rdflib.readthedocs.io/en/stable/intro_to_parsing.html Loading and saving RDF]
 
  
Additional recommended in depth documentation:
+
'''RDFLib''' is a Python package for working with RDF. With this we can create RDF graphs, parse and serialize RDF, perform SPARQL queries on graphs and more.
* [https://rdflib.readthedocs.io/en/stable/rdf_terms.html RDF Terms]
 
* [https://rdflib.readthedocs.io/en/stable/namespaces_and_bindings.html Namespaces and Bindings]
 
  
Useful code [[examples]] codes can be found of this wiki.
+
==Tasks==
  
[[Lecture Notes]]
+
You likely already have '''Python''' installed on your computer (it comes with MAC for instance). To find out, type 'python' in the commandline on windows or type 'python --version' in the terminal on your MAC.
 +
If you don't already have Python, one alternative is to download and install the Python for your operative system [https://www.python.org/downloads/ here.] Per today (January 2022) RDFlib officially supports python version 3.7, 3.8 and 3.9. Or..
  
==Relevant RDFlib classes/interfaces (and methods)==
+
If you already have python 3.10 installed (not officially supported), or if you just want to install python in a most hygienic way, use  [https://docs.conda.io/en/latest/miniconda.html '''Conda'''] to create a virtual python environment for this course.
import [https://rdflib.readthedocs.io/en/stable/py-modindex.html rdflib]:
+
* Install 'Miniconda' using the link above
*Graph (add, remove, serialize)
+
* Windows search for 'anaconda' and open a prompt, or open a terminal in mac and linux.
*URIRef
+
* Enter 'conda update conda' to make sure you get the latest.
*Literal
+
* Enter 'conda create --name info216 python=3.9' to create a python 3.9 environment and name it after this course.
*NameSpace
+
* Enter 'conda activate info216' to use the new python environment.
*RDF, FOAF
 
  
All RDFlib modules can be found [https://rdflib.readthedocs.io/en/stable/py-modindex.html here]
 
Browser search (often Ctrl-F) is useful gere to find the module that you want.
 
  
==Tasks==
+
If you are already using a different Python IDE than VSCode that you are comfortable with, then you are free to use that program instead.
  
You likely already have Python installed on your computer (it comes with MAC for instance). To find out, type 'python' in  the commandline on windows or type 'python --version' in the terminal on your MAC.
+
Otherwise Download and Install the free and open source '''Visual Studio Code''' IDE [https://code.visualstudio.com/Download here.]
If you don't already have Python, download and install the Python for your operative system [https://www.python.org/downloads/ here.]
 
We recommend downloading the most recent version.
 
  
If you are already using a different Python IDE than Pycharm that you are comfortable with, then you are free to use that program instead.
+
Start VSCode and create a new project by opening a folder with a new .py file. You can create the folder and file in the terminal or in windows explorer. Install the Python extension by Microsoft in the vscode extension manager. When the Python extension is installed you can use the 'select interpreter' field on the bottom left to use the virtual environment you made, or make sure you are using a supported version of Python.  
 +
  
Otherwise Download and Install the free and open source Pycharm Community IDE [https://www.jetbrains.com/pycharm/download/ here.]
+
'''RDFlib'''
  
Download pip here: https://bootstrap.pypa.io/get-pip.py. (Right click on this link and click "Save link as". You can save it to you desktop or somewhere else if you would like)
+
Using the VSCode terminal, or activated conda terminal if you are using a conda environment, install RDFlib by simply entering: 'pip install rdflib'
  
 +
To import rdflib to your .py file, write 'import rdflib' at the top of the file.
 +
Alternatively to import certain modules of rdflib, you can write e.g 'from rdflib import Graph'.
  
Using the pycharm commandline install RDF lib by simply entering: pip install rdflib
 
  
 +
'''Coding Tasks'''
  
Start Pycharm and create a new project with a new .py file.
+
When solving the coding tasks you can look at the readings below if needed.
 +
Use the Classes/interfaces and methods listed at the bottom of the page. ()
  
  
 +
'''1.''' Write a program that creates an RDF graph containing the triples from the following sentences. Note that one sentence could result in more than one triple.
 +
When solving these particular tasks I would try to avoid using Blank Nodes, and instead use full URI's instead. For URIs you can just use an example URI like "http://example.org/".
 +
This means that if you talk about a person called Cade, the URI could be  "http://example.org/Cade". Remember the Namespaces can be used so that you don't have to write the full URI everytime.
  
 +
* Cade is married to Mary
 +
* The capital of France is Paris
 +
* Cade is 27 years old
 +
* 26 years is the age of Mary
 +
* Marys interests include hiking, chocolate and biology
 +
* Mary is a student
 +
* Paris is a City in France
 +
* Cade and Mary are kind people
  
  
 +
==If you have more time...==
 +
If you have more time you can continue extending your graph with some of the stuff from lab 2:
 +
''"Cade has the full name Cade Tracey. He lives in 1516 Henry Street, Berkeley, California 94709, USA. He has a B.Sc. in biology from the University of California, Berkeley from 2011. His interests include birds, ecology, the environment, photography and travelling. He has visited Canada and France."''
 +
Try to use as many different methods as possible to create the triples.
  
'''Here is a smart thing to do from the start:'''
+
==Documentation and other useful reading==
Decide on a folder that you always use for downloading, unpacking and installing your own software. Different software packages you install - like Java, Jena and Eclipse - will go into separate subfolders of this folder. This way you will always know where you have them. On my Windows computer, I put everything under a folder called "C:\Programs\Java". On Linux, I use the folder "/opt/Java". ''It is good practice to use folder names without blank characters (spaces) in them, because they can cause problems for some software packages.''
 
  
Download and unpack the Eclipse Oxygen IDE for Java Developers: go to https://eclipse.org/downloads/ and Get Eclipse Oxygen for your computer and operating system. This will download an installer program.  
+
Conda documentation for what we did today
 +
* [https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html# Getting started with Conda]
 +
* [https://code.visualstudio.com/docs/python/environments# Using Python environments in VS Code]
  
Start the installer program (on Linux, you go to the folder where you installed it and start the "eclipse-inst" program). Choose "Eclipse IDE for Java Developers" and choose an installation folder.
+
If needed here is an introduction of how to create projects and python files in VSCode [https://code.visualstudio.com/docs/python/python-tutorial Getting Started with Python in VS Code].
(I use folders called "C:\Programs\eclipse-oxygen" and "/opt/eclipse-oxygen" for this.)
 
  
Start Eclipse from the start menu (on Linux, go to the folder where you installed it and start the "eclipse" program). If you want to, you can make a shortcut to the "eclipse" program from your desktop or task bar.
+
[https://rdflib.readthedocs.io/en/stable/index.html# RDFLib Documentation]- Todays focus are the following:
 +
* [https://rdflib.readthedocs.io/en/stable/gettingstarted.html Intro to RDFlib]
 +
* [https://rdflib.readthedocs.io/en/stable/intro_to_creating_rdf.html  Intro to creating triples]
  
Tell Eclipse that you are going to use the JDK 8 you just downloaded: "Window" -> "Preferences" -> "Java" -> "Installed JREs" -> "Add..."
+
Additional recommended in depth documentation:
 +
* [https://rdflib.readthedocs.io/en/stable/rdf_terms.html RDF Terms]
 +
* [https://rdflib.readthedocs.io/en/stable/namespaces_and_bindings.html Namespaces and Bindings]
  
Download and unpack Jena 3.6.0: https://jena.apache.org/download/ . (I use folders called "C:\Programs\apache-jena" and "/opt/apache-jena" for this.) You can download and unpack the source files too.
+
[[:File:S01-KnowledgeGraphs.pdf | Slides from the lecture]]
  
While you are at it, you can download and unpack Apache Jena Fuseki too. You will need it later.
+
==Relevant RDFlib interfaces (and methods)==
 +
import [https://rdflib.readthedocs.io/en/stable/py-modindex.html rdflib]:
 +
*Graph (add)
 +
*URIRef
 +
*Literal
 +
*NameSpace
 +
*Bnode
 +
*Collection
 +
*RDF, FOAF
  
Re-start Eclipse and create an Eclipse project for Java (File -> New -> Java Project). You can call it INFO216Lab1 for example. Right click on the new project in the "Package Explorer" and choose "Build Path" -> "Add external Archives..." Now you can copy or link the Jena JAR-files into the project. You find them in: .../apache-jena/lib/*.jar .  
+
All RDFlib modules can be found [https://rdflib.readthedocs.io/en/stable/py-modindex.html here]
 
+
Browser search (often Ctrl-F) is useful here to find the module that you want. Look at the different serializations that are available and try a few of them out. Which serialization is the most comprehensible to you?
Create a Java program that creates a single triple and writes it out. Use the Classes/interfaces and methods listed above. The simplest way to run the program is using the green "play" button in the top Eclipse toolbar.
 
 
 
Go to the "INFO216 Wiki" page in this wiki and download the file log4j.properties . You can get rid of all the "log4j warnings" by copying the log4j.properties file into exactly the same directory as your Java-files.
 
 
 
==If you have more time...==
 
If you have more time you can continue extending your graph with some of the stuff from lab 2:
 
''"Cade Tracy lives in 1516 Henry Street, Berkeley, California 94709, USA. He has a B.Sc. in biology from the University of California, Berkeley from 2011. His interests include birds, ecology, the environment, photography and travelling. He has visited Canada and France."''
 
Try to use as many different methods as possible to create the triples.
 

Latest revision as of 14:05, 26 January 2022

Lab 1: Getting started with VSCode, Python and RDFlib

Topics

Today we have two goals:

  1. To set up a programming environment that you will use for the lab exercises (if you don't already have one).
  2. To start some basic programming of RDF.

Installation of and introduction to Python, VSCode, Pip and RDFlib.

VSCode is an Integrated Development Environment (IDE) that can be used for writing Python code.

Conda is a package and virtual environment manager that will help you avoid dependency conflicts and clutter.

The Pip command is used to install additional Python packages into your coding environment.

RDFLib is a Python package for working with RDF. With this we can create RDF graphs, parse and serialize RDF, perform SPARQL queries on graphs and more.

Tasks

You likely already have Python installed on your computer (it comes with macOS, for instance). To find out, type 'python' in the command line on Windows or type 'python --version' in the terminal on macOS. If you don't already have Python, one alternative is to download and install Python for your operating system from https://www.python.org/downloads/. As of today (January 2022), RDFlib officially supports Python versions 3.7, 3.8 and 3.9.

Alternatively, if you already have Python 3.10 installed (not officially supported), or if you just want to install Python in the cleanest way, use Conda (Miniconda: https://docs.conda.io/en/latest/miniconda.html) to create a virtual Python environment for this course.

  • Install 'Miniconda' using the link above.
  • On Windows, search for 'anaconda' and open an Anaconda prompt; on macOS and Linux, open a terminal.
  • Enter 'conda update conda' to make sure you have the latest version.
  • Enter 'conda create --name info216 python=3.9' to create a Python 3.9 environment named after this course.
  • Enter 'conda activate info216' to use the new Python environment.
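
Once the environment is activated, a quick way to confirm which interpreter you are actually running is to ask Python itself. The following is a minimal sketch, not part of the official lab material; the helper name is_supported and the SUPPORTED set are illustrative and simply encode the versions listed above.

```python
import sys

# Versions the lab text lists as officially supported by RDFlib (January 2022).
SUPPORTED = {(3, 7), (3, 8), (3, 9)}


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version is in the SUPPORTED set."""
    return (version_info[0], version_info[1]) in SUPPORTED


# Print the running interpreter's version and whether it is on the list.
print("Python", sys.version.split()[0])
print("On the supported list:", is_supported())
```

If the second line prints False, check that the info216 environment is active in the terminal you are using.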


If you are already using a different Python IDE than VSCode that you are comfortable with, then you are free to use that program instead.

Otherwise, download and install the free and open-source Visual Studio Code IDE from https://code.visualstudio.com/Download.

Start VSCode and create a new project by opening a folder that contains a new .py file. You can create the folder and file in the terminal or in Windows Explorer. Install the Python extension by Microsoft from the VSCode extension manager. Once the Python extension is installed, you can use the 'select interpreter' field on the bottom left to choose the virtual environment you made, or to make sure you are using a supported version of Python.


RDFlib

Using the VSCode terminal, or an activated conda terminal if you are using a conda environment, install RDFlib by entering: 'pip install rdflib'

To import rdflib into your .py file, write 'import rdflib' at the top of the file. Alternatively, to import only certain classes from rdflib, you can write e.g. 'from rdflib import Graph'.
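
To check that the installation ended up in the interpreter you are running, you can test whether the package is importable before using it. This is a small sketch under the assumption that you run it with the same interpreter VSCode has selected; the function name rdflib_installed is made up for illustration.

```python
import importlib.util


def rdflib_installed():
    """Return True if the rdflib package can be found by this interpreter."""
    return importlib.util.find_spec("rdflib") is not None


if rdflib_installed():
    import rdflib

    # rdflib exposes its version string as rdflib.__version__.
    print("rdflib version:", rdflib.__version__)
else:
    print("rdflib not found; run 'pip install rdflib' in the active environment")
```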


Coding Tasks

When solving the coding tasks you can look at the readings below if needed. Use the classes/interfaces and methods listed at the bottom of the page.


1. Write a program that creates an RDF graph containing the triples from the following sentences. Note that one sentence could result in more than one triple. When solving these particular tasks I would try to avoid using Blank Nodes and use full URIs instead. For the URIs you can just use an example base URI like "http://example.org/". This means that if you talk about a person called Cade, the URI could be "http://example.org/Cade". Remember that Namespaces can be used so that you don't have to write the full URI every time.

  • Cade is married to Mary
  • The capital of France is Paris
  • Cade is 27 years old
  • 26 years is the age of Mary
  • Mary's interests include hiking, chocolate and biology
  • Mary is a student
  • Paris is a City in France
  • Cade and Mary are kind people


If you have more time...

If you have more time you can continue extending your graph with some of the stuff from lab 2: "Cade has the full name Cade Tracey. He lives in 1516 Henry Street, Berkeley, California 94709, USA. He has a B.Sc. in biology from the University of California, Berkeley from 2011. His interests include birds, ecology, the environment, photography and travelling. He has visited Canada and France." Try to use as many different methods as possible to create the triples.

Documentation and other useful reading

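Conda documentation for what we did today:

  • Getting started with Conda: https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html
  • Using Python environments in VS Code: https://code.visualstudio.com/docs/python/environments

If needed, here is an introduction to creating projects and Python files in VSCode: Getting Started with Python in VS Code (https://code.visualstudio.com/docs/python/python-tutorial).

RDFLib Documentation (https://rdflib.readthedocs.io/en/stable/index.html). Today's focus is the following:

  • Intro to RDFlib: https://rdflib.readthedocs.io/en/stable/gettingstarted.html
  • Intro to creating triples: https://rdflib.readthedocs.io/en/stable/intro_to_creating_rdf.html

Additional recommended in-depth documentation:

  • RDF Terms: https://rdflib.readthedocs.io/en/stable/rdf_terms.html
  • Namespaces and Bindings: https://rdflib.readthedocs.io/en/stable/namespaces_and_bindings.html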

Slides from the lecture

Relevant RDFlib interfaces (and methods)

import rdflib:

  • Graph (add)
  • URIRef
  • Literal
  • Namespace
  • BNode
  • Collection
  • RDF, FOAF

All RDFlib modules can be found at https://rdflib.readthedocs.io/en/stable/py-modindex.html. Browser search (often Ctrl-F) is useful there to find the module that you want. Look at the different serializations that are available and try a few of them out. Which serialization is the most comprehensible to you?