EMBEDDIA Conference: AI Technology for the Media Industry

On Wednesday, 8 December, we will host a half-day online industry conference (9 AM to 2 PM CET) where participants will be the first to learn about the tools developed within the EMBEDDIA project. We will demonstrate the EMBEDDIA Media Assistant, a collection of tools for AI-based text processing.

We would like to have you as a guest at our conference. If you are interested, please register here.

More details of the conference and its schedule are available here.

EMBEDDIA Workshop – presenting EMBEDDIA tools

On June 21st, we held an engaging workshop on Texta Toolkit, a part of our EMBEDDIA Assistant Tool. Around 20 corpus linguists, computational linguists, NLP researchers, and journalists from across Europe got an overview of Texta Toolkit and a demo of its functions. Attendees were able to test the Toolkit themselves on the Aylien Covid news dataset and answer research questions such as “What are the main risk factors for developing a severe case of coronavirus?” and “How has the frequency of China- or Wuhan-related news changed over time?”. The workshop was an excellent opportunity to introduce the EMBEDDIA Assistant Tool and obtain feedback from potential users on its functionality and accessibility.

EMBEDDIA Hackashop, a recapitulation

On April 19, we wrapped up the EACL Hackashop on News Media Content Analysis and Automated Report Generation. The aim of Hackashop 2021 was to foster discussion and research on the combination of language technology and news media content. It provided a forum for discussing scientific advances in the analysis of news stories and their reader comments and in the automated generation of reports, as well as for experimental work on identifying interesting phenomena in reader comments and reporting on them.

The hackashop was run in a dual format. A traditional track consisted of the submission of scientific papers, their review, and finally paper presentations. It was complemented by an active, experimentation-based track consisting of an online hackathon preceding the workshop, with the results presented in the joint workshop event. Both tracks shared the same topic, news media analysis and generation, and their participants overlapped considerably.

In the workshop track, we encouraged submissions of long and short papers. Based on three expert reviews for each submission, weighing the contributions of the submission against its length, 13 papers were selected for presentation in the workshop event.

The online hackathon was held over a three-week period in February 2021, with six participating teams. The challenges they addressed covered a broad range, as each team had the freedom to define its own aims. In the spirit of providing a joint forum for discussing both scientific advances and experimental work, five hackathon teams submitted short reports to be included in the proceedings.

We were very happy to see several cross-disciplinary and cross-sector collaborations involving, e.g., computer scientists, social scientists, and the media industry, both in workshop papers and hackathon contributions. We were also happy to have numerous contributions that address multilingual settings and low-resource languages.

The workshop event on 19 April 2021 brought both tracks together, with presentations of both scientific workshop papers and empirical hackathon reports. We concluded the Hackashop with an excellent presentation by our keynote speaker, Professor Neil Maiden.

We would once again like to thank all workshop paper authors and hackathon participants for their contributions to the hackashop! We are thankful to the programme committee members for their insightful reviews of the workshop papers. We are equally thankful to the large number of experts who made tools, models, data, and challenges available for the hackathon and provided support for the participants.

Authors: Hannu Toivonen and Michele Boggia

EMBEDDIA Hackathon wrap-up

On Friday, February 19, we wrapped up the EMBEDDIA hackathon. In an online event, the hackathon participants presented their results and their views on the EMBEDDIA tools and identified challenges.

We would like to extend our gratitude to the hackathon participants and the EMBEDDIA staff for making the hackathon a success. It was a welcome opportunity for the EMBEDDIA consortium to see the tools we developed being used outside the consortium for similar or newly identified NLP challenges.

Below are snapshots of the wrap-up meeting.

EMBEDDIA Hackashop: Hackathon halfway get-together

On February 10, the EMBEDDIA consortium organized a get-together for hackathon participants and EMBEDDIA staff. We used this event to check in with the teams and present the expectations and challenges of our media partner, the Finnish News Agency (STT).

The interaction with hackathon teams took the form of a tool/model/data/challenge support session held in the Gather.town application, which we chose to make the event less formal and more social. Participants were able to wander around, meet other participants and see what they were working on, or chat with other researchers from EMBEDDIA!

Below are snapshots of today’s event.

Kick-off of the Hackashop on news media content analysis and automated report generation

Today the EMBEDDIA consortium officially kicked off the Hackashop on news media content analysis and automated report generation. Project partners presented the projects, challenges, and data to be used in the course of the hackashop. Due to the pandemic, the hackashop will be an online event. The hackathon part of the hackashop will run from February 1 to 21, 2021.

Below are some snapshots from today’s event.

Hackashop on news media content analysis and automated report generation – Call for workshop papers

The EMBEDDIA consortium is proud to announce the organization of the Hackashop on news media content analysis and automated report generation in conjunction with EACL 2021.

The Call for workshop papers is now published — more details are available here.

We welcome work broadly in the area of natural language processing of news media that addresses a range of needs, from readers who consume news of personal interest to journalists who keep track of what is going on in the world, try to understand what their readers think of various topics, or want to automate routine reporting.

Workshop on modern NLP through large pre-trained language models

EMBEDDIA partners from the Faculty of Computer and Information Science (University of Ljubljana) organized a workshop on modern NLP through large pre-trained language models on September 29th, 2020 in Ljubljana, Slovenia.

The workshop was primarily aimed at data scientists (academics, professionals, or students) who know some programming in Python and want to learn the basics of modern natural language processing. It was taught by EMBEDDIA’s technical manager Marko Robnik-Šikonja and touched on the following subjects:

  • text preprocessing,
  • text representations,
  • basics of neural networks for text processing,
  • neural language models,
  • BERT and transformers,
  • hands-on (a downstream task with transformers): sentiment analysis, named entity recognition, text generation, etc.