Publications

"_" denotes that the student was under the supervision of Prof. Wang while conducting the work.

2026

[29] X. Li, J. Wang, Z. Li, S. Ali Rakhshan, Z. Chen, Z.-Q. Wang, and H. Kong, "Enhanced Extrinsic Calibration of Acoustic Cameras via Closed-Form Initialization and Batch Optimization", in IEEE Sensors Journal, 2026.
[28] S. Cornell, C. Boeddeker, T. Park, H. Huang, D. Raj, M. Wiesner, Y. Masuyama, X. Chang, Z.-Q. Wang, S. Squartini, P. Garcia, and S. Watanabe, "Recent Trends in Distant Conversational Speech Recognition: A Review of CHiME-7 and 8 DASR Challenges", in Computer Speech & Language (CSL), vol. 97, issue 101901, pp. 1-36, 2026.
[27] Y. Masuyama, X. Chang, W. Zhang, S. Cornell, Z.-Q. Wang, N. Ono, Y. Qian, and S. Watanabe, "An End-to-End Integration of Speech Separation and Recognition with Self-Supervised Learning Representation", in CSL, vol. 95, issue 101813, pp. 1-18, 2026.
[26] T. Ling, S. He, and Z.-Q. Wang, "Token-Guided Target Speaker Extraction via Cross-Modality Alignment", in Interspeech, 2026.
[25] P. Shen, X. Zhang, and Z.-Q. Wang, "Adaptive Hard-Pair Sampling via Curriculum Learning for Speech Separation", in Interspeech, 2026.
[24] S. Song, F. Wu, and Z.-Q. Wang, "ARTT: Augmented Reverberant-Target Training for Unsupervised Monaural Speech Dereverberation", in Interspeech, 2026.
[23] R. Pang, S. He, J. Sun, and Z.-Q. Wang, "USDnet++: Improving Unsupervised Neural Speech Dereverberation by Leveraging Signal Processing Based Dereverberation", in Interspeech, 2026.
[22] Z.-Q. Wang and R. Pang, "Mixture to Beamformed Mixture: Leveraging Beamformed Mixture as Weak-Supervision for Speech Enhancement and Noise-Robust ASR", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 16337-16341, 2026.
[21] S. He and Z.-Q. Wang, "VM-UNSSOR: Unsupervised Neural Speech Separation Enhanced by Higher-SNR Virtual Microphone Arrays", in ICASSP, pp. 15642-15646, 2026.
[20] J. Sun, S. He, R. Pang, and Z.-Q. Wang, "Neural Forward Filtering for Speaker-Image Separation", in ICASSP, pp. 14407-14411, 2026.
[19] P. Shen, S. He, X. Zhang, and Z.-Q. Wang, "LExTra: Folded Prompt and Split-Role Attention for Target Speaker Extraction", in ICASSP, pp. 18887-18891, 2026.
[18] T. Ling, S. He, P. Shen, and Z.-Q. Wang, "MC-LExt: Multi-Channel Target Speaker Extraction with Onset-Prompted Speaker Conditioning Mechanism", in ICASSP, pp. 18967-18971, 2026.
[17] P. Lu, P. Zhou, X. Chen, J. Wang, and Z.-Q. Wang, "UJCodec: An End-to-End UNet-Style Codec for Joint Speech Compression and Enhancement", in ICASSP, pp. 15802-15806, 2026.
[16] Y. Zhu, J. Jin, X. Luo, W. Yang, Z.-Q. Wang, G. Huang, J. Chen, and J. Benesty, "Forward Convolutive Prediction for Frame Online Monaural Speech Dereverberation Based on Kronecker Product Decomposition", in ICASSP, pp. 15987-15991, 2026.
[15] T. Ling, P. Shen, and Z.-Q. Wang, "The SUSTech AILab System Description for CHiME-9 MCoRec Challenge", in ICASSP Workshop on HSCMA and CHiME, 2026.

2025

[14] P. Shen, K. Chen, S. He, P. Chen, S. Yuan, H. Kong, X. Zhang, and Z.-Q. Wang, "Listen to Extract: Onset-Prompted Target Speaker Extraction", in IEEE Transactions on Audio, Speech and Language Processing (TASLPRO), vol. 33, pp. 4832-4843, 2025. [Sound Demo]
[13] Z.-Q. Wang, "ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement", in Journal of the Acoustical Society of America (JASA), vol. 158, issue 4, pp. 2849-2862, 2025. [Sound Demo]
[12] Z.-Q. Wang, "SuperM2M: Supervised and Mixture-to-Mixture Co-Learning for Speech Enhancement and Noise-Robust ASR", in Neural Networks (NN), vol. 188, issue 107408, pp. 1-16, 2025. [Sound Demo]
[11] Z. Xu, X. Fu, Z.-Q. Wang, X. Jiang, and R. Roy Choudhury, "Unsupervised Blind Speech Separation with A Diffusion Prior", in International Conference on Machine Learning (ICML), pp. 69160-69188, 2025. [Sound Demo] [Code]
[10] Y. Wu, Z. Xu, J. Chen, Z.-Q. Wang, and R. Roy Choudhury, "Unsupervised Multi-Channel Speech Dereverberation via Diffusion", in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2025.
[9] S. Araki, N. Ito, R. Haeb-Umbach, G. Wichern, Z.-Q. Wang, and Y. Mitsufuji, "30+ Years of Source Separation Research: Achievements and Future Challenges", in ICASSP, 2025.
[8] P. Shen, X. Zhang, and Z.-Q. Wang, "ARiSE: Auto-Regressive Multi-Channel Speech Enhancement", in Interspeech, pp. 1183-1187, 2025.
[7] F. Zhao, X. Zhang, and Z.-Q. Wang, "Multi-Channel Acoustic Echo Cancellation Based on Direction-of-Arrival Estimation", in Interspeech, pp. 629-633, 2025.
[6] L. Fu, Y. Liu, Z. Liu, Z. Yang, Z.-Q. Wang, Y. Li, and H. Kong, "AuralNet: Hierarchical Attention-based 3D Binaural Localization of Overlapping Speakers", in Interspeech, pp. 938-942, 2025.
[5] R. Sachdev, Z.-Q. Wang, and C.-H. H. Yang, "Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction", in IEEE Workshop on Signal Processing Systems (SiPS), pp. 131-135, 2025.
[4] F. Wu and Z.-Q. Wang, "TS-TFGridNet: Extending TF-GridNet for Label-Queried Target Sound Extraction via Embedding Concatenation", in DCASE Challenge, technical report, 2025. [Ranked 3rd in DCASE2025 Challenge Task 4 - Spatial Semantic Segmentation of Sound Scenes]

2024

[3] Z.-Q. Wang, "USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 32, pp. 3882-3895, 2024. [Sound Demo]
[2] Z.-Q. Wang, "Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation", in IEEE Signal Processing Letters (IEEE SPL), vol. 31, pp. 1715-1719, 2024. [Sound Demo]
[1] Z.-Q. Wang, A. Kumar, and S. Watanabe, "Cross-Talk Reduction", in International Joint Conference on Artificial Intelligence (IJCAI), pp. 5171-5180, 2024. [Sound Demo] [Poster] [Slide]

Preprints

Z.-Q. Wang and S. Cornell, "Cross-Talk Speech Reduction, by Separation, for Separation", in arXiv preprint arXiv:2605.19695, 2026. [Sound Demo]

F. Zhao and Z.-Q. Wang, "Why Not Put a Microphone Near the Loudspeaker? A New Paradigm for Acoustic Echo Cancellation", in arXiv preprint arXiv:2511.03244, 2025.

K. Li, G. Chen, W. Sang, Y. Luo, Z. Chen, S. Wang, S. He, Z.-Q. Wang, A. Li, Z. Wu, and X. Hu, "Advances in Speech Separation: Techniques, Challenges, and Future Trends", in arXiv preprint arXiv:2508.10830, 2025.

Z.-Q. Wang, G. Wichern, and J. Le Roux, "Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement", in arXiv preprint arXiv:2110.00570, 2021. [Code]