MoRAGBench: a benchmarking framework for RAG pipelines on mobile devices / by Huzaifa Shaaban Kabakibo ; first reviewer: Prof. Dr. Lin Wang, Second reviewer: Prof. Dr. Marco Platzner

Kabakibo, Huzaifa Shaaban

Title record

Title
MoRAGBench : a benchmarking framework for RAG pipelines on mobile devices / by Huzaifa Shaaban Kabakibo ; first reviewer: Prof. Dr. Lin Wang, Second reviewer: Prof. Dr. Marco Platzner
Author
Kabakibo, Huzaifa Shaaban
Reviewers
Wang, Lin ; Platzner, Marco
Published
Paderborn, 2026
Extent
1 online resource (63 pages) : illustrations, diagrams
Thesis
Universität Paderborn, master's thesis, 2026
Note
Date of submission: 19.03.2026
Submission date
19.3.2026
Language
English
Document type
Master's thesis
Keywords (GND)
Paderborn
URN
urn:nbn:de:hbz:466:2-57677
DOI
https://doi.org/10.17619/UNIPB/1-2533

Links

Catalogue record
Universitätsbibliothek Paderborn

Files

MoRAGBench [PDF, 0.93 MB]

Classification

Special collections → University publications → Faculty of Electrical Engineering, Computer Science and Mathematics
Classification (DDC) → Computer science, information and general works

Abstract

Retrieval-Augmented Generation (RAG) has emerged as an effective approach for improving the factual grounding and contextual relevance of Large Language Models (LLMs) by combining neural generation with external knowledge retrieval. While most existing RAG systems are designed for server or cloud environments, executing complete pipelines directly on mobile devices introduces significant challenges due to limited computational resources, memory constraints, hardware heterogeneity, and immature software stacks and optimizations. Despite growing interest in on-device intelligence, there remains limited systematic understanding of how individual RAG components behave and contribute to the overall performance under realistic mobile conditions. This thesis presents MoRAGBench, a modular benchmarking framework for evaluating RAG pipelines on Android smartphones. The framework enables configurable experimentation across all stages of the pipeline, including document chunking, embedding generation, indexing and retrieval, augmentation, and LLM inference. To provide comprehensive analysis, MoRAGBench introduces two complementary evaluation modes: an approximate nearest neighbor benchmark that isolates retrieval performance, and an end-to-end task benchmark that measures downstream question answering quality and overall system efficiency. Extensive experiments conducted on a modern smartphone reveal fundamental trade-offs between retrieval accuracy, latency, throughput, and memory consumption. The results show that efficient on-device RAG deployment is primarily a systems-level challenge in which performance improvement emerges from the interaction between pipeline components and hardware execution backends rather than from individual model improvements alone.
The study further highlights limitations of current mobile inference-acceleration frameworks and demonstrates the importance of balanced RAG pipeline configurations and approximate similarity search methods for achieving practical performance. By enabling systematic, reproducible evaluation of RAG systems under mobile constraints, MoRAGBench provides practical insights and a foundation for future research toward efficient, privacy-preserving, fully on-device intelligent assistants. MoRAGBench is fully open-sourced at https://github.com/upb-cn/MoRAGBench.
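
Illustrative note (not part of the thesis): the abstract mentions an approximate nearest neighbor benchmark that isolates retrieval performance. A common way to score such a benchmark is recall@k, which compares the neighbors returned by an approximate search against the exact brute-force result. The Python sketch below shows the idea under stated assumptions: all function names are hypothetical, the "approximate" search is simulated by restricting the search to a candidate subset, and none of this is taken from the MoRAGBench code base.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, vectors, k):
    """Brute-force ground truth: score every vector, keep the k most similar."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:k]

def restricted_top_k(query, vectors, k, candidate_ids):
    """Hypothetical stand-in for an ANN index: search only a candidate subset."""
    ranked = sorted(candidate_ids,
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbors found by the approximate search."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

if __name__ == "__main__":
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    query = [1.0, 0.0]
    truth = exact_top_k(query, vectors, k=2)
    approx = restricted_top_k(query, vectors, k=2, candidate_ids=[1, 2])
    print("exact:", truth, "approx:", approx,
          "recall@2:", recall_at_k(approx, truth))
```

In a real benchmark the restricted search would be replaced by an actual approximate index, and recall@k would be reported alongside latency and memory use, since the abstract describes trade-offs between exactly these quantities.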

Statistics

The PDF document has been downloaded 163 times.

License / rights notice

Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License
