Hackashop on news media content analysis and automated report generation
1 – 21 February (online hackathon) + 19 April, 2021 (presentations)
Hackashop website: http://embeddia.eu/hackashop2021/
Hackathon registration: https://forms.gle/queqwRHmpNUpPyna7
Hackathon in conjunction with EACL 2021
EACL website: https://2021.eacl.org
News media content, including news articles and the comments that readers post, is a rich potential source of insight into current events and opinions. This hackashop, a combination of hackathon and workshop, aims to bring together an audience from computer science, social science and industry to target multi-disciplinary challenges in news and comment analysis and reporting.
In order to encourage scientific and technological advances, the hackashop has a dual format: (1) a traditional workshop format for discussing scientific advances is combined with (2) a preceding hackathon-type event in which participants are provided with news media data and a suite of tools to experiment with novel approaches and solutions.
Hackashop participants are welcome to participate in one or both activities. The workshop day on 19 April will then bring participants together to share new insights from both research- and experimentation-based work. This call for participation deals with the hackathon track only (see the hackashop website for the call for workshop papers).
Why participate in the hackathon
Would you like to tackle exciting challenges in news media analysis and generation? Or do you like developing and experimenting with solutions in a multidisciplinary setting? Or are you interested in multilingual settings, smaller languages, or low-resource languages? Or would you like to learn about NLP hands-on, by working with others?
We will provide access to relevant tools and models from ongoing research, datasets you can experiment with, as well as support from technical experts who know the tools and the models, from social scientists who have insight into media research, and from media professionals who know the practice and needs of the media industry.
The hackathon targets anyone interested in Natural Language Processing and Machine Learning (doctoral and graduate students, researchers and professionals). We especially welcome media researchers from social sciences/media studies and private sector representatives to join the cross-disciplinary teams.
If you are a student, you may be eligible to earn 3 ECTS for your participation. (University of Helsinki, the organizer, will grant its students 3 ECTS. We will issue other students a certificate of participation, which you can use to negotiate credit at your own university.)
Hackathon format
Due to Covid-19, the hackathon is organised as a virtual event over a period of three weeks, Feb 1 – Feb 21, 2021. We do not expect full-time participation during this period, but regular and substantial effort will likely be required for a satisfying experience for you and your collaborators.
We will organise structured activities to support participants: matchmaking before the event; tutorials on the tools, datasets and challenges at the beginning; expert consultancy and coaching throughout the hackathon; forums for communication and peer support; and joint events for sharing experiences and hints and for seeing what others are working on.
A special feature of the hackashop is that teams with completed hackathon projects are invited to submit a brief report (2-4 pages + references) to the hackashop workshop proceedings, to be published by EACL, and to present their project briefly in the workshop event on 19 April in conjunction with EACL. This gives hackathon participants the opportunity both to publish their project and to present it to a wider audience. Participation in the workshop proceedings and event is strongly encouraged!
Rules of the hackathon
Do cool stuff related to news media analysis and generation. More specifically:
- identify a relevant challenge (feel free to use our example challenges as seeds),
- develop a solution to the challenge using at least some of the tools or models we provide,
- experiment, e.g., with the data that we provide,
- write a brief report and present/demo your results in the wrap-up event,
- optionally: participate in the EACL workshop with your report and your presentation.
Link: Instructions for completing the hackathon
Challenges
Link: Descriptions of example challenges to address in the hackathon
Automated content analysis of news media, including news articles and users’ comments on them, can provide unparalleled insight into current events, interests and opinions, as well as trends and changes in them. The needs are varied, from the readers who consume news of their personal interest to journalists who keep track of what is going on in the world, try to understand what their readers think of various topics, or want to automate routine reporting. You and your team are free to choose a specific problem. To help you get started, we provide example challenges (see the link above).
Tools and models
Link: Tools and models provided to hackathon participants
We provide a collection of tools that you can use to attack the above challenges, especially for some smaller languages and in multilingual settings. Most of the tools are individual components and pre-trained models for various languages, but the collection also includes selected integrated toolkits and workflows. The tools and models typically cover some subset of Croatian, Estonian, Slovene, Finnish, Swedish, Lithuanian, Russian, Latvian and English; some tools are more general, and some allow training for new languages. We expect participants to make use of at least some of the tools/models provided by the organisers (see the link above). You are of course welcome to use any additional tools you find useful.
Datasets
Link: Datasets for possible use in the hackathon
We also provide news datasets that can be used in experimentation. Some of the datasets are from our partner media companies, some are from public sources. You are also welcome to use your own data!
Important dates for the hackathon
- Jan 18 – Jan 29: Match-making period for those seeking teammates
- Jan 29: Registration deadline
- Feb 1, at 13:00-15:00 CET: Hackathon kickoff (in Zoom, see Slack for link)
  Teams work on solving challenges using the tools and models provided
- Feb 10, at 13:00-15:00 CET: Hackathon halfway get-together (Zoom + Gather.town)
  Teams work on solving challenges using the tools and models provided
- Feb 19, at 13:00-15:00 CET: Hackathon wrap-up (Zoom) (see instructions)
- Feb 21: Hackathon reports due (see instructions)
For those who participate in the workshop:
- Mar 1: Camera-ready versions of reports due (see instructions)
- Apr 19: Workshop event with brief presentations by hackathon teams
Quick feedback will be provided on hackathon reports for preparing the camera-ready versions.
Note that a separate call for peer-reviewed workshop papers is available at
http://embeddia.eu/hackashop2021-call-for-workshop-papers/
The paper submission deadline is Jan 31.
Register at https://forms.gle/queqwRHmpNUpPyna7.
The registration deadline is Jan 29, 2021. Registration is free of charge.
You are free to register as a team or as an individual (but we ask all members of teams to register themselves). We will help in match-making between participants who are looking for teammates.
Organizing committee
Hannu Toivonen (University of Helsinki, Finland), Hackashop Chair
Michele Boggia (University of Helsinki, Finland), Interaction Chair
Marko Robnik-Šikonja (University of Ljubljana, Slovenia), Tool Chair
Matthew Purver (Queen Mary University of London, UK), Data Chair
Carl-Gustav Linden (University of Bergen, Norway), Challenge Chair
Senja Pollak (Jozef Stefan Institute, Slovenia)
Contact
Hannu Toivonen or Michele Boggia
Support
The hackashop is supported by the Horizon2020 project EMBEDDIA (“Cross-Lingual Embeddings for Less-Represented Languages in European News Media”, project number 825153, 2020-2022), http://embeddia.eu.