TY - JOUR
T1 - Tensor relational algebra for distributed machine learning system design
AU - Yuan, Binhang
AU - Jankov, Dimitrije
AU - Zou, Jia
AU - Tang, Yuxin
AU - Bourgeois, Daniel
AU - Jermaine, Chris
N1 - Funding Information:
This work was supported by an NIH CTSA, award no. UL1TR003167 and by the NSF under grant nos. 1918651, 1910803, 2008240 and 1842494. We also thank the anonymous reviewers for their insightful comments on earlier versions of the paper.
Publisher Copyright:
© 2021, VLDB Endowment. All rights reserved.
PY - 2021
Y1 - 2021
N2 - We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and is amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
AB - We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and is amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
UR - http://www.scopus.com/inward/record.url?scp=85115288826&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115288826&partnerID=8YFLogxK
U2 - 10.14778/3457390.3457399
DO - 10.14778/3457390.3457399
M3 - Conference article
AN - SCOPUS:85115288826
SN - 2150-8097
VL - 14
SP - 1338
EP - 1350
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 8
T2 - 47th International Conference on Very Large Data Bases, VLDB 2021
Y2 - 16 August 2021 through 20 August 2021
ER -