Serving Deep Learning Models with Deduplication from Relational Databases

Lixi Zhou, Jiaqing Chen, Amitabh Das, Hong Min, Lei Yu, Ming Zhao, Jia Zou

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

Serving deep learning models from relational databases brings significant benefits. First, features extracted from databases do not need to be transferred to a decoupled deep learning system for inference, so the system management overhead can be significantly reduced. Second, in a relational database, data management along the storage hierarchy is fully integrated with query processing, so the database can continue serving models even when the working set size exceeds the available memory. Applying model deduplication can greatly reduce the storage space, memory footprint, cache misses, and inference latency. However, existing data deduplication techniques are not applicable to deep learning model serving in relational databases: they consider neither the impact on model inference accuracy nor the inconsistency between tensor blocks and database pages. This work proposes synergistic storage optimization techniques for duplication detection, page packing, and caching to enhance database systems for model serving. Evaluation results show that the proposed techniques significantly improve storage efficiency and model inference latency, and outperform existing deep learning frameworks in the targeted scenarios.

Original language: English (US)
Pages (from-to): 2230-2243
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 15
Issue number: 10
State: Published - 2022
Event: 48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia
Duration: Sep 5 2022 - Sep 9 2022

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • General Computer Science
