Publications
See Google Scholar or publications.
"_" denotes that the student was under the supervision of Prof. Wang while conducting the work.
Preprints
- K. Li, G. Chen, W. Sang, Y. Luo, Z. Chen, S. Wang, S. He, Z.-Q. Wang, A. Li, Z. Wu, and X. Hu, "Advances in Speech Separation: Techniques, Challenges, and Future Trends", in arXiv preprint arXiv:2508.10830, 2025.
- S. Cornell, C. Boeddeker, T. Park, H. Huang, D. Raj, M. Wiesner, Y. Masuyama, X. Chang, Z.-Q. Wang, S. Squartini, P. Garcia, and S. Watanabe, "Recent Trends in Distant Conversational Speech Recognition: A Review of CHiME-7 and 8 DASR Challenges", in arXiv preprint arXiv:2507.18161, 2025.
- Z.-Q. Wang and R. Pang, "Mixture to Beamformed Mixture: Leveraging Beamformed Mixture as Weak-Supervision for Speech Enhancement and Noise-Robust ASR", in arXiv preprint arXiv:2507.15229, 2025.
- P. Shen, K. Chen, S. He, P. Chen, S. Yuan, H. Kong, X. Zhang, and Z.-Q. Wang, "Listen to Extract: Onset-Prompted Target Speaker Extraction", in arXiv preprint arXiv:2505.05114, 2025. [Sound Demo]
- Z.-Q. Wang, "ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement", in arXiv preprint arXiv:2407.19485, 2024. [Sound Demo]
- Z.-Q. Wang, G. Wichern, and J. Le Roux, "Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement", in arXiv preprint arXiv:2110.00570, 2021. [Code]
2026
- Y. Masuyama, X. Chang, W. Zhang, S. Cornell, Z.-Q. Wang, N. Ono, Y. Qian, and S. Watanabe, "An End-to-End Integration of Speech Separation and Recognition with Self-Supervised Learning Representation", in Computer Speech & Language (CSL), vol. 95, issue 101813, pp. 1-18, 2026.
2025
- Z.-Q. Wang, "SuperM2M: Supervised and Mixture-to-Mixture Co-Learning for Speech Enhancement and Noise-Robust ASR", in Neural Networks (NN), vol. 188, issue 107408, pp. 1-16, 2025. [Sound Demo]
- Z. Xu, X. Fu, Z.-Q. Wang, X. Jiang, and R. Roy Choudhury, "Unsupervised Blind Speech Separation with A Diffusion Prior", in International Conference on Machine Learning (ICML), 2025. [Sound Demo] [Code]
- Y. Wu, Z. Xu, J. Chen, Z.-Q. Wang, and R. Roy Choudhury, "Unsupervised Multi-Channel Speech Dereverberation via Diffusion", in ICML Workshop on Machine Learning for Audio (ICML Workshop), also in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2025. [ICML Workshop version, WASPAA version]
- S. Araki, N. Ito, R. Haeb-Umbach, G. Wichern, Z.-Q. Wang, and Y. Mitsufuji, "30+ Years of Source Separation Research: Achievements and Future Challenges", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025.
- P. Shen, X. Zhang, and Z.-Q. Wang, "ARiSE: Auto-Regressive Multi-Channel Speech Enhancement", in Interspeech, pp. 1183-1187, 2025.
- F. Zhao, X. Zhang, and Z.-Q. Wang, "Multi-Channel Acoustic Echo Cancellation Based on Direction-of-Arrival Estimation", in Interspeech, pp. 629-633, 2025.
- L. Fu, Y. Liu, Z. Liu, Z. Yang, Z.-Q. Wang, Y. Li, and H. Kong, "AuralNet: Hierarchical Attention-based 3D Binaural Localization of Overlapping Speakers", in Interspeech, pp. 938-942, 2025.
- R. Sachdev, Z.-Q. Wang, and C.-H. H. Yang, "Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction", in IEEE Workshop on Signal Processing Systems (SiPS), 2025.
- F. Wu and Z.-Q. Wang, "TS-TFGridNet: Extending TF-GridNet for Label-Queried Target Sound Extraction via Embedding Concatenation", in DCASE Challenge, technical report, 2025. [Ranked 3rd in DCASE2025 Challenge Task 4 - Spatial Semantic Segmentation of Sound Scenes]
2024
- Z.-Q. Wang, "USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 32, pp. 3882-3895, 2024. [Sound Demo]
- Z.-Q. Wang, "Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation", in IEEE Signal Processing Letters (IEEE SPL), vol. 31, pp. 1715-1719, 2024. [Sound Demo]
- Z.-Q. Wang, A. Kumar, and S. Watanabe, "Cross-Talk Reduction", in International Joint Conference on Artificial Intelligence (IJCAI), pp. 5171-5180, 2024. [Sound Demo] [Poster] [Slide]