The online world fills databases with immense amounts of data. Your local grocery stores, your financial institutions, your streaming services and even your medical providers all maintain vast arrays of information across multiple databases.
Managing all this data is a significant challenge, and applying artificial intelligence to it, whether to make inferences, apply logical rules or interpret information, is often time-sensitive, especially when delays, known as latency, are a major concern. Applications such as supply chain prediction, credit card fraud detection, customer service chatbots, emergency service response and health care consulting all require real-time inferences from data managed in a database.
Because existing databases lack built-in support for machine learning inference, a separate system and process are needed, a limitation that is especially critical for applications like the ones mentioned above. Transferring data between the two systems significantly increases latency, and that delay makes it difficult to meet the time constraints of interactive applications that require real-time results.
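To make the contrast concrete, here is a minimal sketch, not the team's actual system, of what in-database inference can look like: a toy fraud-scoring model is registered as a SQL function inside SQLite (using only the Python standard library), so rows are scored where they live instead of being shipped to a separate inference service. The model, its weights and the table are all invented for illustration.

```python
import sqlite3

# Hypothetical toy model: a linear "fraud score" with made-up weights.
# A score above 0 flags the transaction as suspicious.
WEIGHTS = {"amount": 0.002, "hour": -0.01}
BIAS = -0.5

def fraud_score(amount, hour):
    """Score one transaction row inside the database engine."""
    return BIAS + WEIGHTS["amount"] * amount + WEIGHTS["hour"] * hour

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (id INTEGER, amount REAL, hour INTEGER)")
conn.executemany("INSERT INTO txns VALUES (?, ?, ?)",
                 [(1, 25.0, 14), (2, 900.0, 3)])

# Register the model as a SQL function: inference now runs inside the
# query engine, with no per-row round trip to an external ML system.
conn.create_function("fraud_score", 2, fraud_score)

flagged = conn.execute(
    "SELECT id FROM txns WHERE fraud_score(amount, hour) > 0"
).fetchall()
print(flagged)
```

The design point is the data movement: in the conventional two-system setup, every candidate row must be serialized, transferred and deserialized before the model sees it, while here the query engine invokes the model in place and only the final answer leaves the database.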
Jia Zou, an assistant professor of computer science and engineering in the Ira A. Fulton Schools of Engineering at Arizona State University, and her team of researchers are proposing a solution that, if successful, will greatly reduce the end-to-end latency of all-scale model serving on data managed by a relational database.