Jiacheng Zhao (赵家程)
Associate Professor, ICT, CAS
zhaojiacheng@ict.ac.cn

I am an Associate Professor and Master Advisor at the State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences (SKLP, ICT-CAS). I obtained my Ph.D. from ICT, CAS in July 2017 under the supervision of Professor Xiaobing Feng, and received my bachelor’s degree in computer science from Tianjin University in 2012.

My recent research focuses on building next-generation compiler infrastructure for domain-specific accelerators, e.g., GPUs, AI chips, and network processors.

I’m looking for self-motivated students (Ph.D., M.S., and/or undergraduate) to collaborate with me in the general area of compilers, ranging from heterogeneous compiler construction and machine learning systems to compiler-assisted runtime systems. Send me an email if you are interested.

Interests

  • Compiler Construction
  • Heterogeneous Programming
  • Machine Learning Systems
  • Compiler Infrastructure for Domain-Specific Architectures

Academia

Institute of Computing Technology, Chinese Academy of Sciences
2012 - 2017
Ph.D. Computer Architecture
Supervised by Prof. Xiaobing Feng. Awarded the National Scholarship for Graduate Students of China (2013 and 2016), the Special Prize of the President Scholarship (Xia Peisu Scholarship), and Pacemaker to Merit Student of the Chinese Academy of Sciences.
Tianjin University
2008 - 2012
B.Sc. Computer Science and Technology

News

Where good things happen
  • Our work on structured output as a control-plane vulnerability in LLM jailbreaking accepted to CCS 2026. May 2026.
  • Two papers on re-architecting GPU memory management through driver-runtime co-design accepted to OSDI 2026 and ICML 2026, with the ICML paper selected as a Spotlight (top 2.2%). May 2026.
  • Symbiotic MLLM Serving accepted to ISCA 2026. May 2026.
  • T2T accepted to CGO 2026 and won the Distinguished Paper Award (3 out of 56 papers). Feb 2026.
  • MikPoly accepted to ASPLOS 2024. Cheers! Nov 2023.

Recent Publications

Contact me if an attached link fails
Scale-Up Networks for Large Language Models: Measurement, Implications and Optimization, 2026, ACM SIGCOMM 2026, to appear
Zhiyi Yao, Leyi Ye, Ziang Ren, Boliang Liu, Xinyu Yuan, Yuedong Xu, Jiacheng Zhao, Heng Pan
A GPU Memory Allocator with Device-Side Page Table Materialization and Deferred TLB Coherence, 2026, USENIX Symposium on Operating Systems Design and Implementation (OSDI 2026), to appear
Yangyu Zhang, Lei Chen, Chunwei Xia, Shuaijiang Li, Shuoming Zhang, Zhicheng Li, Qianqi Sun, Jiawei Xiao, Ruiyuan Xu, Ao Chen, Guangli Li, Xiaobing Feng, Huimin Cui, Chenxi Wang, Jiacheng Zhao*
CONTINUUM: Restoring the Contiguous Tensor Abstraction Efficiently for Dynamic AI Workloads via Hardware Virtualization, 2026, International Conference on Machine Learning (ICML 2026), Spotlight (top 2.2%), to appear
Yangyu Zhang, Shuoming Zhang, Chunwei Xia, Shuaijiang Li, Zhicheng Li, Ruiyuan Xu, Zheming Yang, Lei Chen, Yuan Wen, Guangli Li, Xiaobing Feng, Huimin Cui, Jiacheng Zhao
LEGO: An LLM-Enabled Hierarchical Optimizer for Tensor Computation Graphs with Structure-Aware Search and Compositional Synthesis, 2026, International Conference on Machine Learning (ICML 2026), to appear
Ruiyuan Xu, Shuoming Zhang, Guangli Li, Qiuchu Yu, Rui Zhang, Yangyu Zhang, Hao Qian, Chunwei Xia, Jiacheng Zhao, Chenxi Wang, Xiaobing Feng, Jingling Xue, Huimin Cui
Symbiotic MLLM Serving: Dynamically Balancing Parallelism Across GPUs and Resources Within GPUs, 2026, The 53rd Annual International Symposium on Computer Architecture (ISCA 2026), to appear
Zhicheng Li, Jiacheng Zhao*, Yangyu Zhang, Zhaolin Duan, Xinyu Liu, Siqi Li, Shuoming Zhang, Shuaijiang Li, Donglin Yu, Yuan Wen, Chunwei Xia, Xiyu Shi, Huimin Cui
When Grammar Guides the Attack: Uncovering Control-Plane Vulnerabilities in LLMs with Structured Output, 2026, The ACM Conference on Computer and Communications Security (CCS 2026), to appear
Shuoming Zhang, Jiacheng Zhao*, Hanyuan Dong, Ruiyuan Xu, Zhicheng Li, Yangyu Zhang, Shuaijiang Li, Yuan Wen, Chunwei Xia, Zheng Wang, Xiaobing Feng, Huimin Cui
From Threads to Tiles: T2T, a Compiler for CUDA-to-NPU Translation via 2D Vectorization, 2026, The 24th ACM/IEEE International Symposium on Code Generation and Optimization (CGO 2026), Distinguished Paper Award
Shuaijiang Li, Jiacheng Zhao*, Ying Liu, Shuoming Zhang, Lei Chen, Yijin Li, Yangyu Zhang, Zhicheng Li, Runyu Zhou, Xiyu Shi, Chunwei Xia, Yuan Wen, Xiaobing Feng, Huimin Cui
SpaceServe: Spatial Multiplexing of Complementary Encoders and Decoders for Multimodal LLMs, 2025, The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Zhicheng Li, Shuoming Zhang, Jiacheng Zhao*, Siqi Li, Xiyu Shi, Yangyu Zhang, Shuaijiang Li, Donglin Yu, Zheming Yang, Yuan Wen, Huimin Cui
Beyond Prompts: Space-Time Decoupling Control-Plane Jailbreaks in LLM Structured Output, 2025, arXiv preprint arXiv:2503.24191 (2025)
Shuoming Zhang, Jiacheng Zhao, Hanyuan Dong, Ruiyuan Xu, Zhicheng Li, Yangyu Zhang, Shuaijiang Li, et al.
Fast and scalable neural network quantum states method for molecular potential energy surfaces, 2025, IEEE Transactions on Parallel and Distributed Systems (2025)
Yangjun Wu, Wanlu Cao, Jiacheng Zhao, Honghui Shang
Qiwu: Exploiting Ciphertext-Level SIMD Parallelism in Homomorphic Encryption Programs, 2025, Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization (CGO 2025), pp. 523-537
Zhongcheng Zhang, Ying Liu, Yuyang Zhang, Zhenchuan Chen, Jiacheng Zhao, Xiaobing Feng, Huimin Cui, Jingling Xue
TopServe: Task-Operator Co-scheduling for Efficient Multi-DNN Inference Serving on GPUs, 2025, European Conference on Parallel Processing, pp. 292-305. Cham: Springer Nature Switzerland, 2025
Ao Chen, Guangli Li, Feng Yu, Xueying Wang, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue
Introducing compiler semantics into large language models as programming language translators: A case study of C to x86 assembly, 2024, Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 996-1011
Shuoming Zhang, Jiacheng Zhao, Chunwei Xia, Zheng Wang, Yunji Chen, Huimin Cui
Optimizing Deep Learning Inference via Global Analysis and Tensor Expressions, 2024, 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024)
Chunwei Xia, Jiacheng Zhao*, Qianqi Sun, Zheng Wang, Yuan Wen, Teng Yu, Xiaobing Feng, Huimin Cui
Optimizing Dynamic-Shape Neural Networks on Accelerators via On-the-Fly Micro-Kernel Polymerization, 2024, 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024)
Feng Yu, Guangli Li, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue
Enabling Large Dynamic Neural Network Training with Learning-based Memory Management, 2024, 30th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2024)
Jie Ren, Dong Xu, Shuangyan Yang, Jiacheng Zhao, Zhicheng Li, Christian Navasca, Chenxi Wang, Harry Xu, Dong Li
Honeycomb: Secure and Efficient GPU Executions via Static Validation, 2023, 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)
Haohui Mai, Jiacheng Zhao*, Hongren Zheng, Yiyang Zhao, Zibin Liu, Mingyu Gao, Cong Wang, Huimin Cui, Xiaobing Feng, Christos Kozyrakis
Sirius: Harvesting Whole-Program Optimization Opportunities for DNNs, 2023, Sixth Conference on Machine Learning and Systems (MLSys)
Yijin Li, Jiacheng Zhao, Qianqi Sun, Haohui Mai, Lei Chen, Wanlu Cao, Yanfan Chen, Zhicheng Li, Ying Liu, Xinyuan Zhang, Xiyu Shi, Jie Zhao, Jingling Xue, Huimin Cui, Xiaobing Feng
PosFuzz: Augmenting Greybox Fuzzing with Effective Position Distribution, 2023, Cybersecurity
Yanyan Zou, Wei Huo, Jiacheng Zhao, Yu Zhang, Ji Shi, Wei Zou
Unified holistic memory management supporting multiple big data processing frameworks over hybrid memories, 2022, ACM Transactions on Computer Systems
Lei Chen, Jiacheng Zhao*, Chenxi Wang, Ting Cao, John Zigman, Haris Volos, Onur Mutlu, Fang Lv, Xiaobing Feng, Guoqing Harry Xu, Huimin Cui
VTensor: Using Virtual Tensors to Build a Layout-Oblivious AI Programming Framework (Full Version), 2022, Journal of Computer Science and Technology
Feng Yu, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue
VTensor: Using Virtual Tensors to Build a Layout-Oblivious AI Programming Framework (Poster Version), 2020, PACT '20: Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques
Feng Yu, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue
DNNTune: Automatic benchmarking DNN models for mobile-cloud computing, 2019, ACM Transactions on Architecture and Code Optimization
Chunwei Xia, Jiacheng Zhao*, Huimin Cui, Xiaobing Feng, Jingling Xue
On retargeting the AI programming framework to new hardwares, 2018, Network and Parallel Computing: 15th IFIP WG 10.3 International Conference, NPC 2018
Jiacheng Zhao, Yisong Chang, Denghui Li, Chunwei Xia, Huimin Cui, Ke Zhang, Xiaobing Feng
Characterizing DNN models for edge-cloud computing (Poster), 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC)
Chunwei Xia, Jiacheng Zhao, Huimin Cui, Xiaobing Feng
Revisiting loop tiling for datacenters: live and let live, 2018, ICS '18: Proceedings of the 2018 International Conference on Supercomputing
Jiacheng Zhao, Huimin Cui, Yalin Zhang, Jingling Xue, Xiaobing Feng
Predicting cross-core performance interference on multicore processors with regression analysis, 2016, IEEE Transactions on Parallel and Distributed Systems
Jiacheng Zhao, Huimin Cui, Jingling Xue, Xiaobing Feng
Hadoop+: Modeling and Evaluating the Heterogeneity for MapReduce Applications in Heterogeneous Clusters, 2015, Proceedings of the 29th ACM on International Conference on Supercomputing (ICS)
Wenting He, Huimin Cui, Binbin Lu, Jiacheng Zhao, Shengmei Li, Gong Ruan, Jingling Xue, Xiaobing Feng, Wensen Yang, Youliang Yan
An empirical model for predicting cross-core performance interference on multicore processors, 2013, Proceedings of the 22nd international conference on Parallel architectures and compilation techniques (PACT)
Jiacheng Zhao, Xiaobing Feng, Huimin Cui, Youliang Yan, Jingling Xue, Wensen Yang

Projects

Engineering real compiler systems
CVM: Building a unified compiler infrastructure for domain-specific architectures, sponsored by NSFC
Redesigning the programming language for Ascend AI chips, sponsored by Huawei
Building an AI-native programming language and multi-level optimizations, sponsored by the China National Key R&D Program