TY - CONF
T1 - TUX2: Distributed graph computation for machine learning
T2 - 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017
AU - Xiao, Wencong
AU - Xue, Jilong
AU - Miao, Youshan
AU - Li, Zhen
AU - Chen, Cheng
AU - Wu, Ming
AU - Li, Wei
AU - Zhou, Lidong
PY - 2017
Y1 - 2017
AB - TUX2 is a new distributed graph engine that bridges graph computation and distributed machine learning. TUX2 inherits the benefits of an elegant graph computation model, efficient graph layout, and balanced parallelism to scale to billion-edge graphs; we extend and optimize it for distributed machine learning to support heterogeneity, a Stale Synchronous Parallel model, and a new MEGA (Mini-batch, Exchange, GlobalSync, and Apply) model. We have developed a set of representative distributed machine learning algorithms in TUX2, covering both supervised and unsupervised learning. Compared to implementations on distributed machine learning platforms, writing these algorithms in TUX2 takes only about 25% of the code: Our graph computation model hides the detailed management of data layout, partitioning, and parallelism from developers. Our extensive evaluation of TUX2, using large data sets with up to 64 billion edges, shows that TUX2 outperforms state-of-the-art distributed graph engines PowerGraph and PowerLyra by an order of magnitude, while beating two state-of-the-art distributed machine learning systems by at least 48%.
UR - https://www.scopus.com/pages/publications/85046730668
M3 - Conference contribution
AN - SCOPUS:85046730668
T3 - Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017
SP - 669
EP - 682
BT - Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017
PB - USENIX Association
Y2 - 27 March 2017 through 29 March 2017
ER -