Difference between revisions of "About the group project"

From Info216
m (Sinoa moved page About the programming project to About the group project: Programming has become slightly less central)
Line 1: Line 1:
Here is some more information about the final submission of your programming projects at the end of the semester.
+
The mandatory group project in INFO216 counts for 40% of the final grade.
 +
You must have submitted the group project by the deadline to be allowed to take the final exam (which counts for the remaining 60% of the final grade).
  
<!--
+
=Group size=
=Project Definition=
+
The project shall be carried out in groups of three ''and not more''.
The programming project shall develop a semantic/linked data application. The development and run-time platform is a free choice, as is the programming language. The project will be graded based on the produced code, effort, originality, and development process.
+
* Working individually / in pairs is possible, but not optimal
 +
* Groups of more than three will ''never'' be accepted. Sorry.
  
==Success factors==
+
=Task=
* Show that you can program with semantic technologies such as RDF, RDFS, SPARQL, etc.
+
The programming project shall develop a '''semantic KG-based (RDF, SPARQL, OWL...) dataset, application, or service'''.
* Use existing data sets (linked open data resources)
+
You can define your own task or choose a given assignment.
* Use existing vocabularies (and perhaps extend them)
 
* ''Simple'' presentation interface / dashboard
 
* Make the program run
 
* Shortcuts can be ok (some manual steps, artificial data)
 
* Progress throughout semester
 
  
==Requirements==
+
==Self-defined tasks==
 +
A good self-defined task will satisfy most of these requirements:
 
* Take at least one or two data sets (can be semantic or not)
 
* Take at least one or two data sets (can be semantic or not)
 
* Read and/or lift them
 
* Read and/or lift them
 
* Use existing vocabularies
 
* Use existing vocabularies
 
* Combine data sets
 
* Combine data sets
* Show and implement some use cases that were impossible or harder or less flexible before
+
* Show and implement some use cases that were either
* Focus on maintainability, making it easy to run over time
+
** harder to achieve,
 +
** less flexible,  or even
 +
** impossible
 +
before
  
==Examples of previous projects==
+
Here are some examples of self-defined tasks from earlier semesters:
 
* Make your own municipalities
 
* Make your own municipalities
 
* Map of party financing
 
* Map of party financing
Line 40: Line 41:
 
* Content generation/enrichment using different data sets
 
* Content generation/enrichment using different data sets
  
==Possible tasks==
+
==Given assignments==
 +
The given assignments are related to ongoing research projects in information science at the University of Bergen.
 +
They can be a bit challenging, but you will contribute to solving real research challenges.
 +
If you make your project code open source, it is even possible that we can use some of it in our projects, but this is not a requirement.
  
+
Here is a list of [[Given group assignments]] for Spring 2021.
  
===Lifting news data===
+
=Technology=
 +
The programming language and the development and run-time platform are free of choice. The project will be graded based on the produced code, effort, originality, and development process.
  
News aggregators such as webhose.io collect news items from many different sources and offer them through a web interface (Web API), typically based on JSON. Some aggregators also offer some semantic annotation, but it is not always complete. The task is to lift data from a semantic news aggregator into small RDF graphs that represent as much as possible of the content of each news item. (We have some project funds for subscribing to such news streams by agreement.)
+
==Success factors==
 
+
* Show that you can program with semantic technologies such as RDF, RDFS, SPARQL, etc.
===Linking driving data to electronic maps===
+
* Use existing data sets (linked open data resources)
 
+
* Use existing vocabularies (and perhaps extend them)
In the Transfeed project we have logs of driving data recorded every 5 seconds. The data include, among other things, geo-coordinates. Electronic road maps such as those in OpenStreetMap also contain geo-coordinates. The task is to link driving logs and road data, so that we know which roads each car has driven on. BlazeGraph has an extension for geo-coordinates that can perhaps be used for this.
+
* ''Simple'' presentation interface / dashboard
 
+
* Make the program run
===Topic identification in news texts===
+
* Shortcuts can be ok (some manual steps, artificial data)
 
+
* Progress throughout semester
There are standard tools that extract central topics or concepts from various texts, such as news texts. The task is to lift the results from such analysis tools, so that one gets IRIs that represent what the text is about. The IRIs can, for example, link to Wikidata or DBpedia.
 
 
 
===Visualisation of graphs===
 
 
 
Services such as Google Charts offer many ways to visualise structured data. Many of them are generic, but some visualisations depend on the semantics of the data. For example, geo-coordinates or place names are needed to use maps. Time-stamped data are needed to use timelines, motion charts and Gantt charts, and organisations are needed to draw organisation charts. The task is to propose suitable visualisations of RDF graphs on the basis of the semantic terms (properties, classes and datatypes) that are used.
 
  
  

Revision as of 18:51, 24 January 2021

The mandatory group project in INFO216 counts for 40% of the final grade. You must have submitted the group project by the deadline to be allowed to take the final exam (which counts for the remaining 60% of the final grade).

Group size

The project shall be carried out in groups of three and not more.

  • Working individually / in pairs is possible, but not optimal
  • Groups of more than three will never be accepted. Sorry.

Task

The programming project shall develop a semantic KG-based (RDF, SPARQL, OWL...) dataset, application, or service. You can define your own task or choose a given assignment.

Self-defined tasks

A good self-defined task will satisfy most of these requirements:

  • Take at least one or two data sets (can be semantic or not)
  • Read and/or lift them
  • Use existing vocabularies
  • Combine data sets
  • Show and implement some use cases that were previously either
    • harder to achieve,
    • less flexible, or even
    • impossible
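
As a rough illustration of the "read and/or lift" and "use existing vocabularies" requirements, here is a minimal sketch that lifts a small CSV table into RDF triples in N-Triples syntax using only the Python standard library. The namespace http://example.org/municipality/, the column names and the sample rows are made up for the example. In a real project you would probably use an RDF library instead of building the strings by hand.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace for the lifted resources (an assumption for this sketch).
EX = "http://example.org/municipality/"
# Terms from existing vocabularies (RDF, RDFS, XSD, schema.org, DBpedia ontology).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
SCHEMA_AREA = "https://schema.org/AdministrativeArea"
DBO_POPULATION = "http://dbpedia.org/ontology/populationTotal"


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def lift_rows(csv_text: str) -> list[str]:
    """Lift CSV rows with columns id, name, population into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Mint an IRI for the row from its (percent-encoded) id.
        subject = "<" + EX + quote(row["id"], safe="") + ">"
        label = escape_literal(row["name"])
        population = int(row["population"])  # fails early on non-numeric input

        # One triple per fact, reusing existing vocabulary terms.
        triples.append(f"{subject} <{RDF_TYPE}> <{SCHEMA_AREA}> .")
        triples.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
        triples.append(f'{subject} <{DBO_POPULATION}> "{population}"^^<{XSD_INTEGER}> .')
    return triples


if __name__ == "__main__":
    # Artificial sample data, not real statistics.
    sample = "id,name,population\nm1,Exampletown,1200\nm2,Sampleville,340\n"
    print("\n".join(lift_rows(sample)))
```

The resulting lines can be loaded into a triple store and combined with other data sets that use the same vocabularies.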

Here are some examples of self-defined tasks from earlier semesters:

  • Make your own municipalities
  • Map of party financing
  • Reasoning over toll roads
  • Social assessment network
  • LinkedMDB-portal
  • Tracking IT infrastructure
  • Music concert assistant
  • Quiz generator
  • Live semantic flight data
  • Semantic security service
  • Recommender system based on knowledge graphs
  • Semantic crawlers and spiders
  • Presentation / visualisation of graphs
  • Community knowledge graph enrichment
  • Semantic query visualization
  • Content generation/enrichment using different data sets

Given assignments

The given assignments are related to ongoing research projects in information science at the University of Bergen. They can be a bit challenging, but you will contribute to solving real research challenges. If you make your project code open source, it is even possible that we can use some of it in our projects, but this is not a requirement.

Here is a list of Given group assignments for Spring 2021.

Technology

The programming language and the development and run-time platform are free of choice. The project will be graded based on the produced code, effort, originality, and development process.

Success factors

  • Show that you can program with semantic technologies such as RDF, RDFS, SPARQL, etc.
  • Use existing data sets (linked open data resources)
  • Use existing vocabularies (and perhaps extend them)
  • Simple presentation interface / dashboard
  • Make the program run
  • Shortcuts can be ok (some manual steps, artificial data)
  • Progress throughout semester


Groups

The project will be carried out by groups of up to 3 people. All group members have to be registered in the same lab group.

Some data sources

Oral Presentation

You will have a meeting with the course responsible to discuss your project idea. Then there will be two oral project presentations in front of the class in the lab, where you will receive feedback.

You can show your progress with a PowerPoint presentation: go over your idea, describe how the project has gone so far, explain which technologies and datasets you are using, and perhaps show some working code.

Submission

  • Your project should be submitted electronically as a Zip file or other standard compressed archive through Inspera (eksamen.uib.no) before the given deadline.
  • The Zip file should contain a PDF file with your project report (max 2 pages of text; diagrams come in addition). The PDF file name should contain the exam numbers of everyone in the group (format: INFO216ProjectSpringYY_XXX_XXX.pdf), for example INFO216ProjectSpring20_108_116.pdf.
  • The Zip file should also contain a folder with all your code, along with a README file. The README file should explain how to run your code and whether it requires additional packages or datasets before it can run. Do not include large semantic data sets like DBpedia in the Zip file, but you can include small example data sets to run the code on, if you want. The folder name should contain the exam numbers of everyone in the group (format: INFO216ProjectSpringYY_XXX_XXX/), for example INFO216ProjectSpring20_108_116/.
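
The naming scheme above can be sketched as a small helper, shown here only as an illustration. The function names, the ascending order of the exam numbers and the name of the Zip file itself are assumptions; the page only specifies the names of the PDF file and the code folder.

```python
import zipfile
from pathlib import Path


def submission_name(exam_numbers, year, term="Spring"):
    """Build the common name stem, e.g. INFO216ProjectSpring20_108_116."""
    # Assumption: exam numbers are listed in ascending order, as in the example.
    numbers = "_".join(str(n) for n in sorted(exam_numbers))
    return f"INFO216Project{term}{year % 100:02d}_{numbers}"


def build_archive(report_pdf, code_dir, exam_numbers, year, out_dir):
    """Pack the report and the code folder into one Zip file in out_dir."""
    name = submission_name(exam_numbers, year)
    # Assumption: the archive reuses the same stem; the page does not say.
    archive = Path(out_dir) / f"{name}.zip"
    code_dir = Path(code_dir)

    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # The report goes in the archive root under the required PDF name.
        zf.write(report_pdf, f"{name}.pdf")
        # All code files (including the README) go under the required folder name.
        for path in sorted(code_dir.rglob("*")):
            if path.is_file():
                zf.write(path, f"{name}/{path.relative_to(code_dir).as_posix()}")
    return archive
```

For example, submission_name([116, 108], 2020) gives INFO216ProjectSpring20_108_116, which matches the example names above.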

Previously, you were asked to submit the project code through GitHub, but we now use Inspera instead because GitHub is not fully anonymous.

Code

You should comment the code lightly in-line. This does not mean one comment line for each code line but, e.g., one brief comment for each logical group of code lines.
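
As an illustration of this level of commenting, here is a made-up example (the function and its data are not from the course): one brief comment per logical group of lines, not one per line.

```python
from collections import Counter


def most_used_predicates(triples, top=3):
    """Return the `top` most frequent predicates in a list of (s, p, o) tuples."""
    # Count how often each predicate occurs.
    counts = Counter(p for _, p, _ in triples)

    # Order by frequency (highest first), breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

    # Keep only the requested number of entries.
    return ranked[:top]
```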

Project report

The project report file should be anonymous, and contain the exam numbers of all group members, BOTH on the first page and in the file name (e.g., INFO216ProjectReport_108_116.pdf). It must be in English (STRICT REQUIREMENT).

The length of the project report is 1500 words MAX (2 A4 pages with 11pt font and 2.5 cm margins). This is a HARD limit. The quality of your code and idea is more important than the quality of the report.

You should briefly explain the purpose of your system. Why have you made this? Why is it a good idea to do this using semantic technologies? What can you do now that wasn't possible before?

You should probably list the technologies/tools/standards/vocabularies you have used and explain briefly why you chose each of them. Did you consider alternatives? Why were the ones you chose better?

If you are reading/converting/lifting data from multiple sources and/or using existing tools in addition to your own program, you should probably include a flow chart or architecture sketch (which is different from a class diagram).

If your code is complex, you may include a class diagram of your system, such as a UML class diagram.

You should mention any particular problems you have had and/or things you want to do differently next time. You should describe the contributions each person in your group has made to the project.

If you want to briefly describe how to run the code you have submitted, you can do that separately in a README.TXT file.

Here are some Report Guidelines from an earlier year about what should be in your report, but take these as suggestions only - they are not strict requirements, in particular the mentioning of Git. By "TBOX Ontology Model" is meant the ontology/vocabulary you have developed or adapted. -->

INFO216, UiB, 2017-2020, Prof. Andreas L. Opdahl (c)