About the group project
Here is more information about the final submission of your programming project at the end of the semester.
The programming project shall develop a semantic/linked-data application. The development and runtime platform is a free choice, as is the programming language. The project will be graded based on the produced code, effort, originality, and development process.
- Show that you can program with semantic technologies such as RDF, RDFS, SPARQL, etc.
- Use existing data sets (linked open data resources)
- Use existing vocabularies (and perhaps extend them)
- Simple presentation interface / dashboard
- Make the program run
- Shortcuts can be ok (some manual steps, artificial data)
- Progress throughout semester
- Take at least one or two data sets (they can be semantic or not)
- Read and/or lift them
- Use existing vocabularies
- Combine the data sets
- Show and implement some use cases that were impossible, harder, or less flexible before
- Focus on maintainability, making the application easy to run over time
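The lifting steps above can be sketched in a few lines. The snippet below is a minimal illustration that emits Turtle by hand from a small, made-up CSV data set, reusing the schema.org and W3C wgs84_pos vocabularies. The `ex:` base IRI and the data are hypothetical, and a real project would more likely use an RDF library such as rdflib rather than string formatting.

```python
import csv
import io

# A tiny non-semantic input data set (made-up example data).
CSV_DATA = """name,lat,long
Bergen,60.39,5.32
Oslo,59.91,10.75
"""

# Reuse existing vocabularies instead of inventing new terms:
# schema.org for names, W3C wgs84_pos for coordinates.
PREFIXES = (
    "@prefix schema: <https://schema.org/> .\n"
    "@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .\n"
    "@prefix ex: <http://example.org/city/> .\n\n"
)

def lift_to_turtle(csv_text):
    """Lift CSV rows into Turtle, one resource per row."""
    out = [PREFIXES]
    for row in csv.DictReader(io.StringIO(csv_text)):
        out.append(
            f'ex:{row["name"]} schema:name "{row["name"]}" ;\n'
            f'    geo:lat "{row["lat"]}"^^<http://www.w3.org/2001/XMLSchema#decimal> ;\n'
            f'    geo:long "{row["long"]}"^^<http://www.w3.org/2001/XMLSchema#decimal> .\n'
        )
    return "\n".join(out)

print(lift_to_turtle(CSV_DATA))
```

Once the data is in RDF, combining several such lifted data sets is just loading them into the same graph or triple store.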
Examples of previous projects
- Make your own municipalities
- Map of party financing
- Reasoning over toll roads
- Social assessment network
- Tracking IT infrastructure
- Music concert assistant
- Quiz generator
- Live semantic flight data
- Semantic security service
- Recommender system based on knowledge graphs
- Semantic crawlers and spiders
- Presentation / visualisation of graphs
- Community knowledge graph enrichment
- Semantic query visualization
- Content generation/enrichment using different data sets
Lifting news data
News aggregators such as webhose.io collect news items from many different sources and offer them through a web interface (Web API), typically based on JSON. Some aggregators also offer a degree of semantic annotation, but it is not always complete. The task is to lift data from a semantic news aggregator into small RDF graphs that represent as much as possible of the content of each news item. (We have some project funds to subscribe to such news streams, by agreement.)
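As a sketch of such lifting, the snippet below maps one hypothetical JSON news item to a small Turtle graph using schema.org terms. The field names and the base IRI are assumptions, not the actual API schema of any particular aggregator, and a real project would use an RDF library and the aggregator's real response format.

```python
import json

# One hypothetical news item, shaped loosely like a JSON news-API response.
ITEM = json.loads("""{
  "uuid": "abc123",
  "title": "New bridge opens",
  "published": "2020-03-01T12:00:00Z",
  "site": "example.com"
}""")

def lift_news_item(item):
    """Turn one JSON news item into a small Turtle graph using schema.org terms."""
    iri = f"<http://example.org/news/{item['uuid']}>"
    return (
        "@prefix schema: <https://schema.org/> .\n\n"
        f"{iri} a schema:NewsArticle ;\n"
        f'    schema:headline "{item["title"]}" ;\n'
        f'    schema:datePublished "{item["published"]}" ;\n'
        f'    schema:publisher "{item["site"]}" .\n'
    )

print(lift_news_item(ITEM))
```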
Linking driving data to electronic maps
In the Transfeed project, we have logs of driving data recorded every 5 seconds. Among other things, the data contains geocoordinates. Electronic road maps such as OpenStreetMap also contain geocoordinates. The task is to link driving logs and road data, so that we know which roads each car has driven on. BlazeGraph has an extension for geocoordinates that can perhaps be used for this.
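One simple baseline for this kind of matching, before reaching for BlazeGraph's geo extension, is nearest-neighbour matching with the haversine distance. The road points and coordinates below are made up for illustration; real map matching would consider whole road geometries, not single points.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

# Hypothetical road points extracted from a map data set: (road id, lat, long).
ROAD_POINTS = [
    ("E39", 60.3913, 5.3221),
    ("E16", 60.6300, 6.4200),
]

def nearest_road(lat, lon, roads=ROAD_POINTS):
    """Return the id of the road point closest to a logged GPS position."""
    return min(roads, key=lambda r: haversine_m(lat, lon, r[1], r[2]))[0]

print(nearest_road(60.3920, 5.3230))  # a log point close to the first road
```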
Topic identification in news texts
There are standard tools that extract central topics or concepts from texts such as news articles. The task is to lift the results from such analysis tools, so that you obtain IRIs that represent what each text is about. The IRIs can, for example, link to Wikidata or DBpedia.
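A minimal sketch of this lifting step: map topic labels from a (hypothetical) analysis tool to Wikidata IRIs through a small hand-made lookup table, and emit dcterms:subject triples. A real solution could query the Wikidata or DBpedia lookup services instead of a fixed table.

```python
# Hypothetical mapping from topic labels (as returned by an analysis tool)
# to Wikidata IRIs; in practice this lookup would use a search/lookup API.
TOPIC_IRIS = {
    "climate change": "http://www.wikidata.org/entity/Q125928",
    "football": "http://www.wikidata.org/entity/Q2736",
}

def subject_triples(article_iri, topic_labels):
    """Emit dcterms:subject triples linking an article to topic IRIs."""
    triples = []
    for label in topic_labels:
        iri = TOPIC_IRIS.get(label.lower())
        if iri:  # skip labels we cannot link to an IRI
            triples.append(
                f"<{article_iri}> <http://purl.org/dc/terms/subject> <{iri}> ."
            )
    return triples

for t in subject_triples("http://example.org/news/abc123", ["Climate change"]):
    print(t)
```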
Visualising graphs
Services such as Google Charts offer many ways to visualise structured data. Many of them are generic, but some visualisations depend on the semantics of the data. For example, you need geocoordinates or place names to use maps, you need timestamped data to use timelines, motion charts, and Gantt charts, and you need organisations to draw organisation charts. The task is to suggest suitable visualisations of RDF graphs based on the semantic terms (properties, classes, and datatypes) that are used.
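The idea can be sketched as a simple rule table from semantic terms to chart types. The property-to-visualisation pairings and the triples below are illustrative assumptions; a real system would inspect classes and datatypes as well, not just properties.

```python
# Assumed pairing of semantic terms with visualisation types they enable.
HINTS = {
    "http://www.w3.org/2003/01/geo/wgs84_pos#lat": "map",
    "http://www.w3.org/2003/01/geo/wgs84_pos#long": "map",
    "https://schema.org/datePublished": "timeline",
    "https://schema.org/memberOf": "organisation chart",
}

def suggest_visualisations(triples):
    """Scan (s, p, o) triples and suggest chart types based on the properties used."""
    return sorted({HINTS[p] for _, p, _ in triples if p in HINTS})

# Toy graph: one resource with a latitude and a publication date.
graph = [
    ("ex:a", "http://www.w3.org/2003/01/geo/wgs84_pos#lat", "60.39"),
    ("ex:a", "https://schema.org/datePublished", "2020-03-01"),
]
print(suggest_visualisations(graph))  # ['map', 'timeline']
```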
The project will be carried out by groups of up to 3 people. All group members have to be registered in the same lab group.
Some data sources
- Open semantic data sets: http://lod-cloud.net
- Vocabularies: http://lov.okfn.org/dataset/lov/
- Statistics and overviews: http://lod2.stat.gov.rs/lod2statworkbench
You will have a meeting with the course responsible to discuss your project idea. Then there will be two oral project presentations in front of the class in the lab, where you will receive feedback.
You can show your progress with a short slide presentation (e.g., PowerPoint): go over your idea, describe how the project has gone so far, explain which technologies and datasets you are using, and perhaps show some working code.
- Your project should be submitted electronically as a Zip file or other standard compressed archive through Inspera (eksamen.uib.no) before the given deadline.
- The Zip file should contain a PDF file with your project report (max 2 pages of text, diagrams come in addition). The PDF file name should contain the exam numbers of everyone in the group (format: INFO216ProjectSpringYY_XXX_XXX.pdf), for example INFO216ProjectSpring20_108_116.pdf .
- The Zip file should also contain a folder with all your code, along with a README file. This README file should include instructions about: how to run your code and whether it requires additional packages or datasets before it can run. Do not include large semantic data sets like DBpedia in the Zip file, but you can include small example data sets to run the code on, if you want. The folder name should contain the exam numbers of everyone in the group (format: INFO216ProjectSpringYY_XXX_XXX/), for example INFO216ProjectSpring20_108_116/.
Previously, you were asked to submit the project code through GitHub, but we now use Inspera instead because GitHub is not fully anonymous.
You should comment the code lightly in-line. This does not mean one comment line for each code line but, e.g., one brief comment for each logical group of code lines. We do not require full Javadoc. The README file in your code folder should also explain where to find the different components of your project/code.
The project report file should be anonymous and contain the exam numbers of all group members, BOTH on the first page and in the file name (format as above, e.g., INFO216ProjectSpring20_108_116.pdf). It must be in English (STRICT REQUIREMENT).
The length of the project report is 1500 words MAX (2 A4 pages with 11pt font and 2.5 cm margins). This is a HARD limit. The quality of your code and idea is more important than the quality of the report.
You should briefly explain the purpose of your system. Why have you made this? Why is it a good idea to do this using semantic technologies? What can you do now that wasn't possible before?
You should probably list the technologies/tools/standards/vocabularies you have used and explain briefly why you chose each of them. Did you consider alternatives? Why were the ones you chose better?
If you are reading/converting/lifting data from multiple sources and/or using existing tools in addition to your own program, you should probably include a flow chart or architecture sketch (which is different from a class diagram).
If your code is complex, you may include a class diagram of your system, such as a UML class diagram.
You should mention any particular problems you have had and/or things you want to do differently next time. You should describe the contributions each person in your group has made to the project.
If you want to briefly describe how to run the code you have submitted, you can do that separately in a README.TXT file.
Here are some guidelines from an earlier year about what should be in your report. Take these as suggestions only; they are not strict requirements, in particular the mention of Git guidelines. By "TBOX Ontology Model" we mean the ontology/vocabulary you have developed or adapted.