Li, H., Wang, X., Feng, C., Zuo, C., Wang, Y., Lo, H., Cui, Y., Wang, B., Cui, D., Jing, S., Shan, Y., Xiong, Y., Wang, J., Zhang, Y., and Fan, Z. 2026. ReviveMoE: Fast recovery for hardware failures in large-scale MoE LLM inference deployments. arXiv preprint arXiv:2602.21140. [Link]
Yousefijamarani, Z., Wang, X., Wang, Q., Heisler, M., Shabani, T., Gholipour, N., Yassini, P., Chang, H., Chen, K., Zhang, Q., Bai, X., Wang, J., Xiong, Y., Zhang, Y., and Fan, Z. 2025. HyperFlexis: Joint design of algorithms and systems for multi-SLO serving and fast scaling. arXiv preprint arXiv:2508.15919. [Link]
Fan, Z., Ghaddar, B., Wang, X., Xing, L., Zhang, Y., and Zhou, Z. 2025. Artificial intelligence for optimization: Unleashing the potential of parameter generation, model formulation, and solution methods. European Journal of Operational Research. [Link]
Singh, G., Wang, X., Hu, Y., Yu, T., Xing, L., Jiang, W., Wang, Z., Bai, X., Li, Y., Xiong, Y., Zhang, Y., and Fan, Z. 2025. Efficiently serving large multimodal models using EPD disaggregation. In Proceedings of the 42nd International Conference on Machine Learning (ICML). [Link]
Heisler, M., Yousefijamarani, Z., Wang, X., Wang, Q., et al. 2025. LLM inference scheduling: A survey of techniques, frameworks, and trade-offs. TechRxiv preprint. [Link]
Xing, L., Wang, X., Feng, Y., Fan, Z., Xiong, J., Guo, Z., Fu, X., Ramamonjison, R., Mostajabdaveh, M., Han, X., Zhou, Z., and Zhang, Y. 2024. Towards human-aligned evaluation for linear programming word problems. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING). [Link]
Fan, Z., Huang, F., Wang, X., Zhou, Z., Pei, J., Friedlander, M. P., and Zhang, Y. 2024. Fair and efficient contribution valuation for vertical federated learning. In Proceedings of the International Conference on Learning Representations (ICLR). [Link]
Fan, Z., Wang, X., Yakovenko, O., Sivas, A. A., Ren, O., Zhang, Y., and Zhou, Z. 2023. Smart initial basis selection for linear programs. In Proceedings of the 40th International Conference on Machine Learning (ICML). [Link]
Gholami, M., Akbari, M., Wang, X., Kamranian, B., and Zhang, Y. 2023. ETran: Energy-based transferability estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). [Link]
Qiao, C., Xiang, Z., Wang, X., Chen, S., Fan, Y., and Zhao, X. 2023. Objects matter: Learning object relation graph for robust absolute pose regression. Neurocomputing, 521, 11-26. [Link]
Wang, X. and Li, Y. 2021. Harmonized dense knowledge distillation training for multi-exit architectures. In Proceedings of the AAAI Conference on Artificial Intelligence. [Link]
Ouyang, S., Wang, X., Lyu, K., and Li, Y. 2021. Pseudo-label generation-evaluation framework for cross-domain weakly supervised object detection. In Proceedings of the IEEE International Conference on Image Processing (ICIP). [Link]
Wang, X. and Li, Y. 2020. Gradient deconfliction-based training for multi-exit architectures. In Proceedings of the IEEE International Conference on Image Processing (ICIP). [Link]