南開科技大學
Data management in machine learning systems /
Record Type:
Language materials, printed : Monograph/item
Title/Author:
Data management in machine learning systems / Matthias Boehm, Arun Kumar, Jun Yang.
Author:
Boehm, Matthias.
other author:
Kumar, Arun.
Published:
[San Rafael, California] : Morgan & Claypool Publishers, c2019.
Description:
xv, 157 p. : ill. (some col.) ; 24 cm.
Subject:
Machine learning.
ISBN:
1681734966 (pbk.)
ISBN:
9781681734965 (pbk.)
ISBN:
1681734982 (bound)
ISBN:
9781681734989 (bound)
ISBN:
9781681734972 (ebk.)
Boehm, Matthias.
Data management in machine learning systems / Matthias Boehm, Arun Kumar, Jun Yang. - [San Rafael, California] : Morgan & Claypool Publishers, c2019. - xv, 157 p. : ill. (some col.) ; 24 cm. - Synthesis lectures on data management ; #57.
Includes bibliographical references (p. 127-156).
1. Introduction: Overview of ML lifecycle and ML users -- Motivation -- Outline and scope -- 2. ML through database queries and UDFs: Linear algebra -- Iterative algorithms -- Sampling-based methods -- Discussion -- Summary -- 3. Multi-table ML and deep systems integration: Learning over joins -- Statistical relational learning and non-IID models -- Deeper integration and specialized DBMSs -- Summary -- 4. Rewrites and optimization: Optimization scope -- Logical rewrites and planning -- Physical rewrites and operators -- Automatic operator fusion -- Runtime adaptation -- Summary -- 5. Execution strategies: Data-parallel execution -- Task-parallel execution -- Parameter servers (model-parallel execution) -- Hybrid execution strategies -- Accelerators (GPUs, FPGAs, ASICs) -- Summary -- 6. Data access methods: Caching and buffer pool management -- Compression -- NUMA-aware partitioning and replication -- Index structures -- Summary -- 7. Resource heterogeneity and elasticity: Provisioning, configuration, and scheduling -- Handling failures -- Working with markets of transient resources -- Summary -- 8. Systems for ML lifecycle tasks: Data sourcing and cleaning for ML -- Feature engineering and deep learning -- Model selection and model management -- Interaction, visualization, debugging, and inspection -- Model deployment and serving -- Benchmarking ML systems -- Summary -- 9. Conclusions -- Bibliography -- Authors' biographies.
Large-scale data analytics using machine learning (ML) underpins many modern data-driven applications. ML systems provide means of specifying and executing these ML workloads in an efficient and scalable manner. Data management is at the heart of many ML systems due to data-driven application characteristics, data-centric workload characteristics, and system architectures inspired by classical data management techniques. In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems, (2) DB-inspired ML systems, and (3) ML lifecycle systems. Covered topics include: in-database analytics via query generation and user-defined functions, factorized and statistical-relational learning; optimizing compilers for ML workloads; execution strategies and hardware accelerators; data access methods such as compression, partitioning and indexing; resource elasticity and cloud markets; as well as systems for data preparation for ML, model selection, model management, model debugging, and model serving. Given the rapidly evolving field, we strive for a balance between an up-to-date survey of ML systems, an overview of the underlying concepts and techniques, as well as pointers to open research questions. Hence, this book might serve as a starting point for both systems researchers and developers. -- Provided by publisher.
ISBN: 1681734966 (pbk.)
Subjects--Topical Terms:
Machine learning.
LC Class. No.: Q325.5 / .B643 2019
Dewey Class. No.: 006.31
LDR 03862cam a2200253 a 4500
001 1000103578
005 20200730105807.0
008 190309s2019 caua b 000 0 eng d
020 $a 1681734966 (pbk.)
020 $a 9781681734965 (pbk.) : $c NT1423
020 $a 1681734982 (bound)
020 $a 9781681734989 (bound)
020 $a 9781681734972 (ebk.)
035 $a (OCoLC)1089489956 $z (OCoLC)1089574816
035 $a on1089489956
040 $a YDX $b eng $c YDX $d HF9 $d OCLCO $d OCLCF $d BNG $d LIP
050 # 4 $a Q325.5 $b .B643 2019
082 0 4 $a 006.31 $2 23
100 1 $a Boehm, Matthias. $3 1000128492
245 1 0 $a Data management in machine learning systems / $c Matthias Boehm, Arun Kumar, Jun Yang.
260 # $a [San Rafael, California] : $b Morgan & Claypool Publishers, $c c2019.
300 $a xv, 157 p. : $b ill. (some col.) ; $c 24 cm.
490 1 $a Synthesis lectures on data management ; $v #57
504 $a Includes bibliographical references (p. 127-156).
505 0 # $a 1. Introduction: Overview of ML lifecycle and ML users -- Motivation -- Outline and scope -- 2. ML through database queries and UDFs: Linear algebra -- Iterative algorithms -- Sampling-based methods -- Discussion -- Summary -- 3. Multi-table ML and deep systems integration: Learning over joins -- Statistical relational learning and non-IID models -- Deeper integration and specialized DBMSs -- Summary -- 4. Rewrites and optimization: Optimization scope -- Logical rewrites and planning -- Physical rewrites and operators -- Automatic operator fusion -- Runtime adaptation -- Summary -- 5. Execution strategies: Data-parallel execution -- Task-parallel execution -- Parameter servers (model-parallel execution) -- Hybrid execution strategies -- Accelerators (GPUs, FPGAs, ASICs) -- Summary -- 6. Data access methods: Caching and buffer pool management -- Compression -- NUMA-aware partitioning and replication -- Index structures -- Summary -- 7. Resource heterogeneity and elasticity: Provisioning, configuration, and scheduling -- Handling failures -- Working with markets of transient resources -- Summary -- 8. Systems for ML lifecycle tasks: Data sourcing and cleaning for ML -- Feature engineering and deep learning -- Model selection and model management -- Interaction, visualization, debugging, and inspection -- Model deployment and serving -- Benchmarking ML systems -- Summary -- 9. Conclusions -- Bibliography -- Authors' biographies.
520 # $a Large-scale data analytics using machine learning (ML) underpins many modern data-driven applications. ML systems provide means of specifying and executing these ML workloads in an efficient and scalable manner. Data management is at the heart of many ML systems due to data-driven application characteristics, data-centric workload characteristics, and system architectures inspired by classical data management techniques. In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems, (2) DB-inspired ML systems, and (3) ML lifecycle systems. Covered topics include: in-database analytics via query generation and user-defined functions, factorized and statistical-relational learning; optimizing compilers for ML workloads; execution strategies and hardware accelerators; data access methods such as compression, partitioning and indexing; resource elasticity and cloud markets; as well as systems for data preparation for ML, model selection, model management, model debugging, and model serving. Given the rapidly evolving field, we strive for a balance between an up-to-date survey of ML systems, an overview of the underlying concepts and techniques, as well as pointers to open research questions. Hence, this book might serve as a starting point for both systems researchers and developers. -- Provided by publisher.
650 # 0 $a Machine learning. $3 147118
650 # 0 $a Database management. $3 146657
700 1 # $a Kumar, Arun. $3 1000128493
700 1 # $a Yang, Jun. $3 1000128494
830 0 $a Synthesis lectures on data management ; $v #57. $3 1000128495
Items
Barcode Number: E19536
Location Name: 6th Floor, Western Books
Item Class: General circulation
Material type: Foreign-language book
Call number: * 006.31 B671 2019
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0
Opac note: 5030000-1080011