TETYS (NGI Search European project)


Topics Evolution That You See (TETYS) has been selected for funding (150K euros) in the NGI Search 2nd Call 2023. NGI Search is part of the EU programme HORIZON-CL4-2021-HUMAN-01; it has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement 101069364 and is framed under the Next Generation Internet initiative.

Duration of the program: August 2023-September 2024

TETYS proposes a next-generation open-source Web topic explorer that inspects a large textual corpus and projects the results onto a dashboard of topic trends with easy-to-drive statistical testing. It is composed of:

1) a pipeline for ingesting huge data corpora, extracting highly relevant topics, clustered along orthogonal dimensions

2) an interactive dashboard, supporting topic visualization as word clouds and exploration of temporal series.

The first prototype, CORToViz, explores the CORD-19 dataset (COVID-19 / SARS-CoV-2 virus research abstracts). Many other domains will be explored using TETYS (e.g., climate change and controversial debates on social media).

What is our starting point?

CORToViz is the first research demonstrator showcasing the TETYS approach

How far are we aiming this year?

Consolidate the first demonstrator: 1) making the pipeline applicable to any corpus of Web textual documents; 2) validating the dashboard experience with solid and broad user studies

Build a pilot working on other domains of interest for Web users: preparing for full trials and commercialization.

Objectives with NGI Search

1) develop the testbed into a solid architecture (reaching a mature TRL 5)
2) understand crucial aspects of approaching TRL 6 (being ready for addressing the market)

Excellence

TETYS will use data-driven statistical and visualization-based strategies to help Web users: 1) become aware of the semantic content of the analyzed datasets; 2) understand the temporal evolution of the topics.
We are building a general full-stack process to explain topic evolution: 1) applicable to any corpus of short textual documents, 2) using any topic model of choice, and 3) showing results as a time-series dashboard.
BERTopic will be exploited as a core component, allowing lightweight analytics at the service of Web searchers.
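The last step of the process described above, turning per-document topic assignments into a time series, can be illustrated with a minimal sketch. It is not taken from the TETYS code base: the function names, the monthly granularity, and the sample data are assumptions made for illustration, and the topic identifiers are assumed to come from whatever topic model is run upstream (BERTopic or any other).

```python
from collections import Counter, defaultdict
from datetime import date


def topic_time_series(assignments):
    """Count documents per topic for each calendar month.

    assignments: iterable of (publication_date, topic_id) pairs, where
    topic_id is produced by the upstream topic model (assumed here).
    Returns {topic_id: {"YYYY-MM": count}} with months in ascending order.
    """
    counts = defaultdict(Counter)
    for pub_date, topic_id in assignments:
        counts[topic_id][pub_date.strftime("%Y-%m")] += 1
    return {topic: dict(sorted(months.items())) for topic, months in counts.items()}


def relative_frequency(series, topic_id):
    """Share of each month's documents that belong to the given topic."""
    totals = Counter()
    for months in series.values():
        totals.update(months)
    return {month: n / totals[month] for month, n in series[topic_id].items()}


# Hypothetical sample: six documents assigned to two topics.
assignments = [
    (date(2020, 3, 2), 0),
    (date(2020, 3, 15), 0),
    (date(2020, 3, 20), 1),
    (date(2020, 4, 1), 1),
    (date(2020, 4, 9), 1),
    (date(2020, 4, 30), 0),
]

series = topic_time_series(assignments)
print(series[0])                      # monthly counts for topic 0
print(relative_frequency(series, 1))  # monthly share of topic 1
```

Relative frequencies are often easier to compare across time than raw counts, because the number of documents published per month can vary widely within a corpus.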

Impact

We are going to explore the possibility of making TETYS a browser extension that improves search on well-known engines, as no extensions that analyze topics and their evolution currently exist for common browsers.
In terms of open-source contributions, we aim to provide improved access to technical and general texts (scientific papers, e-commerce reviews, event feedback tweets, public engagement threads, etc.) so that users can quickly grasp temporal trends.
A prototype on climate change-related documents, studying how interest in the topic evolves, will be built, ideally stimulating public debate and awareness of environmental changes.


Project activities track

October 5th, 2023: Our preprint on CORToViz is on arXiv! Check it out at https://arxiv.org/abs/2310.03928


November 1st, 2023: A first prototype on climate change scientific articles is available at http://gmql.eu/climviz


November 10th, 2023: The project has been presented by Francesco Invernici at the SFSCON conference (https://www.sfscon.it/programs/2023/), held at NOI TechPark, Bolzano, Italy, on November 10th-11th, 2023. See the video and slides at https://www.sfscon.it/talks/the-cord-19-topic-visualizer/.


February 2nd, 2024: I have been interviewed by NGI Search on our TETYS Project. See the full interview on The Next Generation Internet (NGI) Community on Funding Box.


March 15th, 2024: Our GitHub repository, with our code for the data pipeline and the first prototype dashboard, is published at https://github.com/FrInve/TETYS/. All the code is open source under the BSD-3-Clause license.