Pictured from left to right: PhD students Xiaoying Wang, Weiyuan Wu and Changbo Qu.

Data science researchers win Best Experiments, Analysis & Benchmark Paper Award at VLDB Conference

March 21, 2022

By: Andrew Ringer

Data science researchers in the School of Computing Science recently received the Best Experiments, Analysis & Benchmark Paper Award at the Very Large Data Bases (VLDB) Conference for their paper, “Are We Ready for Learned Cardinality Estimation?”

PhD students Xiaoying Wang, Changbo Qu and Weiyuan Wu, and computing science professor Jiannan Wang collaborated on the paper. 

With the rise of machine learning for databases, many research studies have explored how existing database components could be replaced with learned models to improve the processing speeds of database systems. By studying cardinality estimation, the process of estimating the size of a query’s result (the number of rows it will return) before executing the query, the researchers hope to advance this research area.

“Applying learned cardinality estimators into a database would greatly improve the cardinality estimation accuracy, thus can greatly increase the speed of a database for some queries, especially analytic queries,” says Wu. 

While the researchers concluded that learned cardinality estimation methods are more accurate than traditional methods, challenges remain before learned methods can be deployed in a real database system. For example, slow training and inference, illogical behaviors and difficulty handling frequent data updates are still unresolved. The researchers hope that this paper will encourage other data scientists to work on these problems.

“Learned methods have shown great potential in accuracy,” says Qu. “We believe that if future works could address the issues that we identified, these methods will be deployed in real production systems and make real database systems better.”

“I hope our work will encourage people to put more focus on improving the efficiency and trustworthiness of learned cardinality estimation models,” says Xiaoying Wang.

Improving the speed of databases has been a long-term goal in data science. The SFU Data Science Research Group has over 30 years of experience in data mining and database systems. This best paper award is evidence that the research group is achieving its goal to “train next-generation leaders and researchers in data science.”