Workshop on modern NLP through large pre-trained language models

EMBEDDIA partners from the Faculty of Computer and Information Science (University of Ljubljana) organized a workshop on modern NLP through large pre-trained language models on September 29th, 2020, in Ljubljana, Slovenia.

The workshop was primarily aimed at data scientists (academics, professionals, or students) who know some programming in Python and want to learn the basics of modern natural language processing. It was taught by EMBEDDIA’s technical manager Marko Robnik-Šikonja and covered the following subjects:

  • text preprocessing,
  • text representations,
  • basics of neural networks for text processing,
  • neural language models,
  • BERT and transformers,
  • hands-on (a downstream task with transformers): sentiment analysis, named entity recognition, text generation, etc.

EMBEDDIA tools standing out on international challenges

The EMBEDDIA team is glad to announce that our tools are performing very well in international challenges.

Our semantic enrichment tools for multilingual and social information recently outperformed all other participants in the official rankings, in all languages, in the following challenges:

HIPE (Identifying Historical People, Places and other Entities) is an evaluation campaign on named entity processing in historical newspapers in French, German and English, organized in the context of the impresso project and run as a CLEF 2020 Evaluation Lab.

FinNum is a task for fine-grained numeral understanding in financial social media data, where the goal is to identify the link between the target cashtag and the target numeral.

In addition, our multilingual fake news spreader model (in English and Spanish) placed third (out of 66 participants) at this year’s PAN. You can find out more about the fake news spreader model on this link.