Authors
Parimarjan Negi, Ziniu Wu, Andreas Kipf, Nesime Tatbul, Ryan Marcus, Sam Madden, Tim Kraska, Mohammad Alizadeh
Publication date
2023/2/1
Journal
Proceedings of the VLDB Endowment
Volume
16
Issue
6
Pages
1520-1533
Publisher
VLDB Endowment
Description
Query-driven cardinality estimation models learn from a historical log of queries. They are lightweight, having low storage requirements, fast inference and training, and are easily adaptable for any kind of query. Unfortunately, such models can suffer unpredictably bad performance under workload drift, i.e., if the query pattern or data changes. This makes them unreliable and hard to deploy. We analyze the reasons why models become unpredictable due to workload drift, and introduce modifications to the query representation and neural network training techniques to make query-driven models robust to the effects of workload drift. First, we emulate workload drift in queries involving some unseen tables or columns by randomly masking out some table or column features during training. This forces the model to make predictions with missing query information, relying more on robust features based on up-to-date …
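The masking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mask_feature_groups`, the flat feature layout, the `groups` mapping, and the choice of zero as the "missing" value are all assumptions made for illustration. It shows one way to hide whole table or column feature groups at random for each training query.

```python
import numpy as np

def mask_feature_groups(features, groups, drop_prob, rng):
    """Return a copy of `features` with randomly chosen feature groups zeroed.

    features  : (n_queries, n_features) array of featurized queries (assumed layout)
    groups    : dict mapping a group name to the feature indices it covers,
                e.g. the slots encoding one table or one column (hypothetical)
    drop_prob : probability of hiding a given group for a given query
    rng       : numpy.random.Generator used for the random decisions
    """
    masked = features.copy()  # leave the caller's array untouched
    n_rows = features.shape[0]
    for cols in groups.values():
        # One independent keep/drop decision per query for this group.
        rows = np.flatnonzero(rng.random(n_rows) < drop_prob)
        # Zero the whole group for the selected queries (assumed mask value).
        masked[np.ix_(rows, cols)] = 0.0
    return masked

# Toy batch: 4 queries, 5 features; indices 0-1 stand in for table features,
# 2-3 for column features, and 4 for a feature that is never masked.
rng = np.random.default_rng(0)
batch = np.ones((4, 5))
groups = {"table": [0, 1], "column": [2, 3]}
masked_batch = mask_feature_groups(batch, groups, drop_prob=0.3, rng=rng)
```

In a training loop, a function like this would be applied to each mini-batch before the forward pass, so the model sees a different random subset of query features every epoch.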
Scholar articles
P Negi, Z Wu, A Kipf, N Tatbul, R Marcus, S Madden… - Proceedings of the VLDB Endowment, 2023