Submit to our SemEval Task until the 12th of March

Our SemEval 2020 Task 3: Predicting the (Graded) Effect of Context in Word Similarity is open for submissions. For this task, we ask participants to build systems that predict the effect context has on human perception of the similarity of words. Participants can submit their results until the 12th of March.

To study these effects, we built several datasets in which annotators scored how similar two words are after reading a short paragraph that contains both words. Each pair is scored within two such paragraphs, which lets us look at changes in similarity ratings due to context. The datasets containing these contextual similarity ratings cover four languages:

  • Croatian: HR
  • English: EN
  • Finnish: FI
  • Slovenian: SL

The pairs of words come from the well-known SimLex-999 dataset. The contexts are chosen so as to encourage different perceptions of similarity. Polysemy plays a role; however, we are especially interested in more subtle, graded changes in meaning. All data and examples are available at https://competitions.codalab.org/competitions/20905 and more details at https://arxiv.org/abs/1912.05320
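To make the idea of a context-dependent change in similarity concrete, here is a minimal Python sketch. It assumes each word pair has one averaged annotator rating per context and treats the effect of context as the difference between the two. The class name, field names, word pairs, and scores are all invented for illustration and are not taken from the task data or its official format.

```python
from dataclasses import dataclass


@dataclass
class ContextualPair:
    """One word pair rated in two different contexts (hypothetical layout)."""
    word1: str
    word2: str
    rating_context1: float  # assumed: mean annotator score in the first paragraph
    rating_context2: float  # assumed: mean annotator score in the second paragraph

    def change(self) -> float:
        """Shift in perceived similarity when moving from context 1 to context 2."""
        return self.rating_context2 - self.rating_context1


# Invented example values, not real annotations.
pairs = [
    ContextualPair("room", "cell", 2.5, 6.0),
    ContextualPair("bank", "money", 7.0, 4.5),
]

changes = [p.change() for p in pairs]
for p, delta in zip(pairs, changes):
    print(f"{p.word1}-{p.word2}: change in similarity = {delta:+.1f}")
```

A positive value means the second context made the two words seem more similar, and a negative value means it made them seem less similar.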

EMBEDDIA at IFAM 2020

The EMBEDDIA project was promoted at the International Trade Fair for Automation and Mechatronics (IFAM) 2020, which took place on February 11-13 in Ljubljana, Slovenia. EMBEDDIA was featured at the fair stand of the Jožef Stefan Institute and at the seminar for Artificial Intelligence in Industry and Society, where members of the EMBEDDIA consortium, prof. dr. Nada Lavrač and prof. dr. Marko Robnik-Šikonja, gave speeches.

Written by: Martin Marzidovšek (JSI)