Thesis projects

If you want to request a thesis project in collaboration with the RSTLess group, fill out the request form.

Bachelor Thesis

Master Thesis

Chunking in Retrieval Augmented Generation
Advisor: Fabrizio Silvestri

Co-Advisor: Giovanni Trappolini
Difficulty: ●○○

NLP/RAG
Retrieval Augmented Generation (RAG) systems enhance LLM performance by providing external knowledge, but their effectiveness depends heavily on how documents are chunked. Existing work suggests that chunk size significantly affects retrieval quality, but there is no systematic analysis of how chunk size influences the distracting effect of irrelevant information. This thesis aims to quantify how different chunking strategies affect the LLM's ability to focus on relevant information while ignoring distractors in the retrieved context.
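To make the notion of a chunking strategy concrete, here is a minimal sketch of the simplest one: fixed-size word windows with optional overlap. The function name `chunk_words` and its parameters are illustrative assumptions, not part of the project description. A real study would likely also compare sentence-based or semantic chunking.

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words. This is a hypothetical
    baseline for illustration, not a method prescribed by the project.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap words is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Varying `chunk_size` and `overlap` on a fixed corpus gives one axis along which retrieval quality and the amount of irrelevant text per retrieved chunk could be compared.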

BackTracing
Advisor: Fabrizio Silvestri

Difficulty: ●○○

NLP/RAG
In many real-world applications (e.g., customer service, education, legal research), understanding the underlying reason behind a question is crucial. This thesis proposes a novel approach to a largely unexplored task in information retrieval: tracing a user's query back to its possible cause or underlying motivation — a process we refer to as "backtracing." Unlike traditional IR tasks that focus on retrieving relevant documents to answer or expand upon a query, backtracing inverts this perspective by asking: "Why was this query asked?"
