TETYS (NGI Search European project)
Topics Evolution That You See (TETYS) has been selected for funding (150K Euros) in the NGI Search 2nd Call 2023. NGI Search is part of the EU programme HORIZON-CL4-2021-HUMAN-01; it has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement 101069364 and is framed under the Next Generation Internet initiative.
Duration of the funded program: September 2023-August 2024
TETYS proposes a next-generation open-source Web topic explorer that inspects a large textual corpus and projects the results onto a dashboard of topic trends with easy-to-drive statistical testing. It is composed of:
1) a pipeline for ingesting huge data corpora and extracting highly relevant topics, clustered along orthogonal dimensions;
2) an interactive dashboard supporting topic visualization as word clouds and exploration of time series.
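As an illustration of the kind of statistical testing a topic-trend dashboard could offer, the sketch below compares the share of documents assigned to one topic in two time windows using a two-sided two-proportion z-test. This is only an assumption made for the example: the text above does not say which tests TETYS implements, and the function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    hits_a: int, total_a: int, hits_b: int, total_b: int
) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two proportions.

    hits_*  : documents assigned to the topic in each time window
    total_* : all documents published in each time window
    Returns (z statistic, p-value). Illustrative only, not the TETYS code.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both windows must contain at least one document")
    share_a = hits_a / total_a
    share_b = hits_b / total_b
    # Pooled share under the null hypothesis that both windows are alike.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # The topic is absent (or universal) in both windows: no evidence.
        return 0.0, 1.0
    z = (share_b - share_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 40 of 400 documents in window A, 90 of 450 in window B.
z_stat, p_val = two_proportion_z_test(40, 400, 90, 450)
print(f"z = {z_stat:.2f}, p = {p_val:.5f}")
```

A positive z statistic with a small p-value would suggest that the topic's share grew between the two windows. With small counts, an exact or permutation test may be more appropriate.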
The first prototype, CORToViz, explores the CORD-19 dataset (COVID-19 / SARS-CoV-2 virus research abstracts). Many different domains will be explored using TETYS (e.g., climate change and controversial debates on social media).

What is our starting point?
CORToViz is the first research demonstrator showcasing the TETYS approach.
How far are we aiming this year?
Consolidate the first demonstrator: 1) making the pipeline applicable to any corpus of Web textual documents; 2) validating the dashboard experience with solid and broad user studies.
Build a pilot working on other domains of interest for Web users, preparing for full trials and commercialization.
Objectives with NGI Search
1) develop the testbed into a solid architecture (reaching a mature TRL 5)
2) understand crucial aspects of approaching TRL 6 (being ready for addressing the market)
Excellence
TETYS will use data-driven statistical and visualization-based strategies to help Web users 1) become aware of the semantic content of the analyzed datasets and 2) understand the temporal evolution of their topics.
We are building a general full-stack process to explain topic evolution: 1) applicable to any corpus of short textual documents, 2) using any topic model of choice, and 3) showing results as a time-series dashboard.
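A minimal sketch of this process, under stated assumptions: the topic model is passed in as a plain function, so any model can be swapped in, and the output is the monthly share of documents per topic, which is the kind of series a dashboard could plot. The names `Document`, `TopicModel`, `keyword_topic_model` and `topic_time_series` are hypothetical and do not come from the TETYS code base; the keyword matcher is a toy stand-in for a real topic model.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only.
Document = Tuple[date, str]        # (publication date, short text)
TopicModel = Callable[[str], str]  # maps a text to a topic label


def keyword_topic_model(text: str) -> str:
    """Toy stand-in for a real topic model; matches a few keywords."""
    lowered = text.lower()
    if "vaccine" in lowered:
        return "vaccines"
    if "mask" in lowered:
        return "masks"
    return "other"


def topic_time_series(
    docs: List[Document], model: TopicModel
) -> Dict[str, Dict[str, float]]:
    """Return, per topic, the share of documents in each month (YYYY-MM)."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    totals: Counter = Counter()
    for published, text in docs:
        month = published.strftime("%Y-%m")
        counts[model(text)][month] += 1
        totals[month] += 1
    # Normalize by the monthly total so corpus growth does not look like a trend.
    return {
        topic: {month: n / totals[month] for month, n in sorted(per_month.items())}
        for topic, per_month in counts.items()
    }


# Made-up example corpus.
corpus = [
    (date(2020, 3, 2), "Mask usage in hospitals"),
    (date(2020, 3, 9), "Early vaccine candidates"),
    (date(2020, 4, 1), "Vaccine trial design"),
    (date(2020, 4, 20), "Vaccine immunogenicity results"),
]
series = topic_time_series(corpus, keyword_topic_model)
print(series["vaccines"])
```

Because the model is just a callable, replacing the toy matcher with a wrapper around another topic model would leave the binning step unchanged.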
BERTopic will be used as a core component, enabling lightweight analytics at the service of Web searchers.
Impact
We are going to explore the possibility of making TETYS a browser extension that improves search on well-known engines, as no extensions that analyze topics and their evolution currently exist for common browsers.
In terms of open-source contributions, we aim to provide improved access to technical and general texts (scientific papers, e-commerce reviews, event feedback tweets, public engagement threads, etc.) to quickly grasp temporal trends.
A prototype on climate change-related documents, studying how interest in the topic evolves, will be built, ideally stimulating public debate and awareness of environmental changes.
Project activities track
October 5th, 2023: Our preprint on CORToViz is on arXiv! Check it out at https://arxiv.org/abs/2310.03928

November 1st, 2023: A first prototype on climate change scientific articles is available at http://gmql.eu/climviz

November 10th, 2023: The project was presented by Francesco Invernici at the SFSCON conference (NOI TechPark, Bolzano, Italy, November 10th-11th, 2023), https://www.sfscon.it/programs/2023/. See the video and slides at https://www.sfscon.it/talks/the-cord-19-topic-visualizer/.

February 2nd, 2024: I was interviewed by NGI Search about our TETYS project. See the full interview on The Next Generation Internet (NGI) Community on Funding Box.
March 4th, 2024: Our Master's student Jelena has joined the team to help us improve the data extraction phase and compare BERTopic stages with more up-to-date technologies.
March 15th, 2024: Our GitHub repository, with our code for the data pipeline and the first prototype dashboard, is published at https://github.com/FrInve/TETYS/! All the code is open source under the BSD-3-Clause license.
April 8th, 2024: TETYS was presented at the Department of Electronics, Information and Bioengineering at Politecnico di Milano: https://www.deib.polimi.it/eng/european-projects/details/499
May 2nd, 2024: Our Master's students Francesca and Amir have joined the team to help us improve the user experience and statistical tests.
June 7th, 2024: Anna presented the TETYS project in the context of the Research Project Exhibition at the CAiSE Conference (Limassol, Cyprus). Check out our poster here!
June 25th, 2024: We had a fruitful discussion with Linknovate and explored our potential market landscape with their tool.
August 31st, 2024: Our mature solution for exploring Sustainable Development Goals in scientific literature has been released and presented to the Data Science Group at Politecnico di Milano. Check out our repository at https://github.com/FrInve/TETYS and the working prototype at https://gmql.eu/tetys/

September 27th, 2024: Anna was interviewed by a researcher at the Department of Business Development and Technology, Aarhus University, exploring several aspects of our NGI Search project.
November 4th, 2024: We presented TETYS at the Open Web Search standup meeting. See news.
November 11th, 2024: Our extended preprint on the TETYS system for exploring Sustainable Development Goals is now online on arXiv. Check it out here!

March 26th, 2025: Our demo paper “TETYS: Configurable Topic Modeling Exploration for Big Corpora of Text Documents” has been accepted and will be presented by Francesco Invernici at the 28th International Conference on Extending Database Technology (EDBT) in Barcelona, Spain! Check out the paper here and the poster here!