Microsoft has unveiled a new deep learning acceleration platform for high-speed, low-latency serving of machine learning models. Dubbed Project Brainwave, the system lets developers deploy trained models onto programmable silicon and achieve performance beyond what a CPU or GPU can deliver. Microsoft demonstrated the system ported to Intel's new 14 nm Stratix 10 FPGA to show the kind of performance that is possible.
…Project Brainwave achieves a major leap forward in both performance and flexibility for cloud-based serving of deep learning models. We designed the system for real-time AI, which means the system processes requests as fast as it receives them, with ultra-low latency. Real-time AI is becoming increasingly important as cloud infrastructures process live data streams, whether they be search queries, videos, sensor streams, or interactions with users.
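To make the "real-time" distinction in the quote concrete, here is a minimal, hypothetical Python sketch (not from the article or Microsoft's system) contrasting per-request serving with batched serving of a single dense layer. Batching improves throughput, but each request must wait for the batch to fill before any work starts, which is the latency that batch-free, per-request serving avoids. All names and sizes here are illustrative assumptions.

```python
import time
import numpy as np

# Hypothetical dense layer standing in for a deployed model.
rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 1024)).astype(np.float32)

def serve_one(request_vec):
    """Serve a single request immediately (batch size 1)."""
    return request_vec @ weights

def serve_batched(request_vecs):
    """Serve requests only after a full batch has been collected."""
    return np.stack(request_vecs) @ weights

# A single request served the moment it arrives:
x = rng.standard_normal(1024).astype(np.float32)
t0 = time.perf_counter()
serve_one(x)
print(f"per-request latency: {(time.perf_counter() - t0) * 1e3:.2f} ms")

# The same request arriving first in a batch of 32 also has to wait for
# the other 31 requests before serve_batched() can even begin.
```

The point of the sketch is only the scheduling difference: in a real-time serving path, latency is bounded by a single inference, not by batch assembly plus a batched inference.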
Discussion
Source: [H]ardOCP – Microsoft Unveils Project Brainwave for Real-Time AI