Lab: Training Graph Embeddings: Difference between revisions

From info216
Revision as of 12:34, 14 April 2022

Lab 14: Training Graph Embeddings

Topics

Training knowledge graph embeddings with TorchKGE.


Classes and methods

The following TorchKGE classes from the previous lab remain central:

  • KG - contains the knowledge graph (KG)
  • Model - contains the embeddings (entity and relation vectors) for the KG

More classes will be suggested below.


Tasks

Pre-trained models:

  • Choose a KG and TransE model you want to work with. It should have a pre-trained model available (https://torchkge.readthedocs.io/en/latest/reference/utils.html#pre-trained-models). Freebase FB15k is still a good choice; see the note below if you want to use Wikidata.
  • Load the pre-trained model (you do not need the KG yet) and evaluate it using the examples given here: https://torchkge.readthedocs.io/en/latest/tutorials/evaluation.html
  • Extra: You can also evaluate the model on relation prediction (https://torchkge.readthedocs.io/en/latest/reference/evaluation.html#relation-prediction) but, the way TransE is pre-trained, it performs very poorly on this task.

Note: The Wikidata dataset returns two graphs. They are not train/test, but the dataset with and without additional attributes. Start with the one without attributes. You need to split it into train/validation/test yourself using KG.split_kg().
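To make sense of the numbers a link-prediction evaluation reports, it can help to compute them by hand on a toy example. The sketch below is plain Python and does not use TorchKGE. The one-dimensional embeddings, the squared-distance score (lower is better) and the helper names distance, tail_rank, hit_at_k and mrr are all assumptions made for illustration.

```python
def distance(ent, rel, h, r, t):
    """TransE-style dissimilarity: squared L2 norm of h + r - t (lower = more plausible)."""
    return sum((a + b - c) ** 2 for a, b, c in zip(ent[h], rel[r], ent[t]))


def tail_rank(ent, rel, h, r, t, known_triples=frozenset()):
    """Rank of the true tail t among all candidate tails (1 = best).

    Candidates that form another known true triple are skipped,
    which corresponds to the 'filtered' evaluation setting.
    """
    true_d = distance(ent, rel, h, r, t)
    rank = 1
    for cand in range(len(ent)):
        if cand == t or (h, r, cand) in known_triples:
            continue
        if distance(ent, rel, h, r, cand) < true_d:
            rank += 1
    return rank


def hit_at_k(ranks, k):
    """Share of test triples whose true tail is ranked in the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


def mrr(ranks):
    """Mean reciprocal rank."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical toy embeddings: four entities on a line, one relation "+1".
toy_ent = [[0.0], [1.0], [2.0], [3.0]]
toy_rel = [[1.0]]
toy_test = [(0, 0, 1), (1, 0, 3), (0, 0, 3)]

toy_ranks = [tail_rank(toy_ent, toy_rel, h, r, t) for h, r, t in toy_test]
print("ranks:", toy_ranks)
print("Hit@3:", hit_at_k(toy_ranks, 3), "MRR:", mrr(toy_ranks))
```

A full evaluation usually also ranks candidate heads for each test triple; the sketch only ranks tails to stay short.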

Train your own:

  • Load the corresponding KG using a dataset loader.
  • Run the Shortest training example, but use a much lower number of epochs (for example 200).
  • Take note of the evaluation metrics and final loss, and re-run the example using different numbers of epochs. What happens when you increase the number?
  • Also run the Simplest training example. Use the documentation to make sure you have an idea of what the different parts of the algorithm do.
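To get a feel for what the parts of such a training loop do, the following is a minimal sketch of TransE training in plain Python. It is not the TorchKGE implementation. It assumes a squared-L2 dissimilarity, a margin ranking loss, negative sampling by corrupting the tail, and plain SGD with hand-written gradients. The function name train_transe, its parameters and the toy chain graph are hypothetical.

```python
import random


def train_transe(triples, n_ent, n_rel, n_epochs, dim=4, lr=0.05, margin=1.0, seed=0):
    """Train toy TransE embeddings; returns (entities, relations, mean loss per epoch)."""
    rng = random.Random(seed)
    ent = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_ent)]
    rel = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_rel)]
    history = []
    for _ in range(n_epochs):
        total = 0.0
        for h, r, t in triples:
            # Negative sample: replace the true tail with another random entity.
            t_neg = rng.choice([e for e in range(n_ent) if e != t])
            # Translation residuals for the positive and the negative triple.
            p = [a + b - c for a, b, c in zip(ent[h], rel[r], ent[t])]
            n = [a + b - c for a, b, c in zip(ent[h], rel[r], ent[t_neg])]
            # Margin ranking loss: the positive should be closer by at least `margin`.
            loss = margin + sum(x * x for x in p) - sum(x * x for x in n)
            if loss > 0:
                total += loss
                for i in range(dim):
                    g = 2 * (p[i] - n[i])            # gradient w.r.t. head and relation
                    ent[h][i] -= lr * g
                    rel[r][i] -= lr * g
                    ent[t][i] += lr * 2 * p[i]        # pull the true tail closer
                    ent[t_neg][i] -= lr * 2 * n[i]    # push the corrupted tail away
        history.append(total / len(triples))
    return ent, rel, history


# Hypothetical toy KG: a chain 0 -> 1 -> 2 -> 3 -> 4 with a single relation.
chain = [(0, 0, 1), (1, 0, 2), (2, 0, 3), (3, 0, 4)]

for n_epochs in (5, 50, 200):
    _, _, history = train_transe(chain, n_ent=5, n_rel=1, n_epochs=n_epochs)
    print(n_epochs, "epochs, final mean loss:", round(history[-1], 3))
```

Varying n_epochs here mirrors the experiment in the tasks above, although a four-triple graph says nothing about how a real dataset behaves.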

Train with early stopping:

  • Run the Training with Ignite example. Use the documentation to make sure you have an idea of what the different parts of the algorithm do. How do the results compare with your exploration of different epoch values?
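The idea behind early stopping can be sketched independently of Ignite: track a validation score after each epoch and stop once it has failed to improve for a fixed number of evaluations (the patience). The class EarlyStopper, the function run_with_early_stopping and the validation scores below are hypothetical and only illustrate the logic; Ignite's own handler has a different interface.

```python
class EarlyStopper:
    """Signals a stop after `patience` evaluations without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_rounds = 0

    def should_stop(self, score):
        """Record a new validation score (higher = better); return True to stop."""
        if score > self.best + self.min_delta:
            self.best = score
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience


def run_with_early_stopping(val_scores, patience=3):
    """Walk through per-epoch validation scores; return (last epoch run, best score)."""
    stopper = EarlyStopper(patience)
    for epoch, score in enumerate(val_scores, start=1):
        if stopper.should_stop(score):
            return epoch, stopper.best
    return len(val_scores), stopper.best


# Made-up validation MRR values, one per epoch.
fake_val_mrr = [0.10, 0.15, 0.18, 0.17, 0.18, 0.16, 0.15]
stopped_at, best = run_with_early_stopping(fake_val_mrr, patience=3)
print("stopped after epoch", stopped_at, "with best score", best)
```

In a real run the scores would come from evaluating the model on the validation split, which is why that split has to be kept separate from the test data.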


If You Have More Time

  • Try this out on the other models supported by TorchKGE, both other TransX models and a deep model (ConvKB).
  • Try it out with different datasets, for example one you create yourself using SPARQL queries on an open KG.
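If you build your own dataset, the query results typically arrive as labelled (subject, predicate, object) triples that need integer indices and a train/validation/test split before training. The sketch below shows one way to do this in plain Python. The helper names index_triples and split_triples, the 80/10/10 shares and the hand-written sample triples are assumptions, and loading the result into a TorchKGE KG object is left out.

```python
import random


def index_triples(labelled_triples):
    """Map entity and relation labels to integer ids; return both maps and indexed triples."""
    ent2ix, rel2ix, indexed = {}, {}, []
    for s, p, o in labelled_triples:
        h = ent2ix.setdefault(s, len(ent2ix))
        t = ent2ix.setdefault(o, len(ent2ix))
        r = rel2ix.setdefault(p, len(rel2ix))
        indexed.append((h, r, t))
    return ent2ix, rel2ix, indexed


def split_triples(triples, shares=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and cut into train/validation/test lists; the remainder goes to test."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(shares[0] * len(shuffled))
    n_val = int(shares[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Hand-written sample standing in for SPARQL query results.
sample = [
    ("Bergen", "country", "Norway"),
    ("Oslo", "country", "Norway"),
    ("Oslo", "capital_of", "Norway"),
    ("Norway", "continent", "Europe"),
    ("Stockholm", "country", "Sweden"),
]
ent2ix, rel2ix, indexed = index_triples(sample)
train, val, test = split_triples(indexed)
print(len(ent2ix), "entities,", len(rel2ix), "relations")
print(len(train), "train /", len(val), "validation /", len(test), "test triples")
```

A naive random split like this can leave some entities without any training triple, so for real experiments a library routine such as KG.split_kg() is the safer choice.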

Useful readings