With the advancement of deep convolutional neural networks, speech recognition systems have achieved strong performance on tasks in the natural language processing field. Despite this performance, resource-constrained environments limit their enterprise-level application. In this paper, we use two binarized neural networks, Bi-real Net and PCNN (Projection Convolutional Neural Networks), to study the problem of compressing WaveNet, a generative model for raw audio waveforms, in the speech recognition setting. In particular, Bi-real Net and PCNN are applied to minimize the computational cost gap between the real-valued and binarized WaveNet models, which leads to a new 1-bit dilated causal convolution. We collected a dataset of over 950,000 clean, noise-free keyword utterances. On this dataset, 1-bit WaveNet models trained with these two binarization methods achieve satisfactory performance.
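To make the notion of a 1-bit dilated causal convolution concrete, the following is a minimal sketch under stated assumptions, not the method of this paper. It binarizes both inputs and weights with the sign function and rescales the output by the mean absolute weight, a common generic binarization scheme; Bi-real Net and PCNN add further mechanisms that are not reproduced here. All function names are hypothetical and chosen only for illustration.

```python
def sign(v):
    """Map a real value to +1.0 or -1.0 (zero maps to +1.0)."""
    return 1.0 if v >= 0 else -1.0


def binarize(values):
    """Return the sign pattern and a scale equal to the mean absolute value.

    Assumption: mean-absolute-value scaling, not the Bi-real Net or PCNN rule.
    """
    scale = sum(abs(v) for v in values) / len(values)
    return [sign(v) for v in values], scale


def dilated_causal_conv1d(x, w, dilation):
    """Compute y[t] = sum_k w[k] * x[t - (K - 1 - k) * dilation].

    The output at time t depends only on x[t] and earlier samples;
    samples before the start of the sequence are treated as zeros.
    """
    k_size = len(w)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(k_size):
            idx = t - (k_size - 1 - k) * dilation
            if idx >= 0:  # implicit zero left-padding
                acc += w[k] * x[idx]
        out.append(acc)
    return out


def one_bit_dilated_causal_conv1d(x, w, dilation):
    """Binarize inputs and weights, convolve, then rescale the result."""
    xb = [sign(v) for v in x]
    wb, alpha = binarize(w)
    return [alpha * y for y in dilated_causal_conv1d(xb, wb, dilation)]
```

Because the binarized operands take only the values +1 and -1, the inner products in such a layer could in principle be computed with bitwise operations instead of floating-point multiplications, which is the usual source of the computational savings in binarized networks.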
1-bit WaveNet: Compressing a Generative Neural Network in Speech Recognition with Two Binarized Methods
Sicheng Gao, Runqi Wang, Liuyang Jiang, Baochang Zhang