Hackashop on news media content analysis and automated report generation

Workshop proposal in preparation

Brief description: News media content, including news articles and the comments that readers post, is a rich potential source of insight into current events and opinions – but many challenges remain if we are to develop automated tools for analysis and reporting. This hackashop, a novel combination of workshop and hackathon, will bring together an audience from computer science, social science and industry to target these multi-disciplinary challenges. In order to encourage scientific and technological advances, we propose a dual format: a traditional workshop format will be combined with a preceding active event in which participants will be provided with news media data and a suite of component tools to experiment with novel approaches and solutions. The workshop day will then bring participants together to share new insights from both research- and experimentation-based work.

Motivation and relevance: Automated content analysis of news media, including both news articles and users’ comments on them, can provide unparalleled insight into current events, interests and opinions, as well as trends and changes in them. The needs are varied: readers want to follow news that matches their personal interests, while journalists need to keep track of what is going on in the world, understand what their readers think about various topics, or automate routine reporting. The hackashop specifically welcomes cross-disciplinary collaborations of computer scientists with media researchers and other social scientists in order to reach richer insights into the needs and opportunities in news media analysis and generation.

Objectives: The aim of the workshop is to foster discussion and research on the combination of language technology and news media content. We aim to establish a forum for (1) discussing scientific advances in the analysis of news stories and their reader comments and in the automated generation of reports, and (2) experimental work on identifying interesting phenomena in reader comments and reporting on them. We encourage contributions that address multilingual settings, including (but not limited to) low-resource languages. The workshop proceedings will contain contributed manuscripts, accepted through peer review, and a summary of the experimental activities.

Format: The workshop experiments with a dual format: (1) a traditional track with paper submissions, reviews and paper presentations, and (2) an active, experimentation-based track in which hackathon-type online activities precede the workshop and hackathon teams or individuals present their work at the workshop. Datasets and a suite of tools for optional use in the hackathon will be provided by the organizers. Participants can choose to take part in one track or in both. The final balance between the two tracks at the workshop will depend on the quantity and quality of submissions to each track. Depending on the COVID-19 situation, we are prepared to carry out all activities online. The hackathon part is designed as an online activity in any case. The workshop part will, if needed, be implemented using tools such as underline.io (formal sessions) and gather.town (informal meeting space) to ensure a smooth and interactive event.

Tentative schedule:

Paper track:
Nov: Prepare and issue call for papers
Feb 15: Paper submission deadline
Mar 15: Acceptance notification
Mar 31: Camera-ready copies
Apr 19/20: Workshop

Hackathon track:
Nov: Prepare and issue call for hackathon participation
Nov – Jan: Preparation of hackathon datasets and tools
Mar 10-15: Optional team matchmaking
Mar 15: Individual and team registration; hackathon activities start
Apr 15: Hackathon activities end and reports due
Apr 19/20: Workshop

In preparation – details to follow.