Kaiwei Che

About Me

I am Kaiwei Che (Richard), a third-year Ph.D. student at Peking University (2023–2027), advised by Prof. Yonghong Tian and Prof. Li Yuan. My research focuses on AI Infra and Efficient AI, including model sparsification and quantization, training and inference acceleration, and brain-inspired algorithms. I have published 3 CCF-A first-author papers at ICML (Spotlight), NeurIPS (Spotlight), and AAAI, along with 2 SCI journal papers, with 360+ total citations.

Before this, I received my Master’s degree from Southern University of Science and Technology (SUSTech) in 2023, advised by Prof. Qinghu Meng. I received my Bachelor’s degree from Shenzhen University in 2020.

I have interned at Huawei (2012 Labs, Research Intern, 2021–2023) and DJI (RoboMaster, 2018), gaining extensive experience in both algorithm research and engineering practice.

Research Interests

AI Infra & Efficient AI: model sparsification and quantization, training/inference acceleration, brain-inspired algorithms
Spiking Neural Networks (SNNs): energy-efficient deep learning, neuromorphic computing
LLM/VLM: post-training alignment (SFT, RLHF), RAG, agent systems
Event-based Vision: neuromorphic sensing, efficient event processing

Publications

First Author

ICML 2026, Spotlight Efficiently Training Time-to-First-Spike Spiking Neural Networks from Scratch
Kaiwei Che, Wei Fang, Zhengyu Ma, Yifan Huang, Peng Xue, Li Yuan, Timothée Masquelier, Yonghong Tian
AAAI 2026 Parallel Training Time-to-First-Spike Spiking Neural Networks
Kaiwei Che, Wei Fang, Peng Xue, Yifan Huang, Zhengyu Ma, Yonghong Tian
NeurIPS 2022, Spotlight Differentiable Hierarchical and Surrogate Gradient Search for Spiking Neural Networks
Kaiwei Che, Luziwei Leng, Kaixuan Zhang, Jianguo Zhang, Qinghu Meng, Jie Cheng, Qinghai Guo, Jianxing Liao
MM 2026, under review Deep-TTFS: Scaling Time-to-First-Spike Neural Networks to ImageNet
Kaiwei Che, Wei Fang, Zhengyu Ma, Peng Xue, Li Yuan, Yonghong Tian
Frontiers in Neuroscience 2024 Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che, Zhaokun Zhou, Jun Niu, Zhengyu Ma, Wei Fang, Yanqi Chen, Shuaijie Shen, Li Yuan, Yonghong Tian
Transactions on Artificial Intelligence 2024 Spatial-Temporal Search for Spiking Neural Networks
Kaiwei Che, Zhaokun Zhou, Li Yuan, Jianguo Zhang, Yonghong Tian, Luziwei Leng
Intelligence & Robotics 2024 A Deep Learning-based System for Accurate Detection of Anatomical Landmarks in Colon Environment
Kaiwei Che, Chengwei Ye, Yibing Yao, Nachuan Ma, Ruo Zhang, Jiankun Wang, Max Q-H Meng

Co-Author

Neural Networks 2025 Spatially-enhanced Spiking Neural Network for Efficient Point Cloud Analysis
Yijie Lu, Zhiyi Pan, Renrui Zhang, Yanhao Jia, Kaiwei Che, Zhaokun Zhou
NeurIPS 2024 Spiking Transformer with Experts Mixture
Zhaokun Zhou, Yijie Lu, Yanhao Jia, Kaiwei Che, Jun Niu, Liwei Huang, Xinyu Shi, Yuesheng Zhu, Guoqi Li, Zhaofei Yu
TIP, under review Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket
Zhaokun Zhou, Kaiwei Che, Wei Fang, Keyu Tian, Yuesheng Zhu, Shuicheng Yan, Yonghong Tian, Li Yuan
TNNLS 2024 Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network
Rui Zhang, Luziwei Leng, Kaiwei Che, Hu Zhang, Jie Cheng, Qinghai Guo, Jianxing Liao, Ran Cheng
IEEE TCDS 2024 Automotive Object Detection via Learning Sparse Events by Spiking Neurons
Hu Zhang, Yanchen Li, Luziwei Leng, Kaiwei Che, Qian Liu, Qinghai Guo, Jianxing Liao, Ran Cheng
IEEE TCDS 2024 Spatial-Temporal Spiking Feature Pruning in Spiking Transformer
Zhaokun Zhou, Kaiwei Che, Jun Niu, Man Yao, Guoqi Li, Li Yuan, Guibo Luo, Yuesheng Zhu
ICANN 2024 A Multi-modal Spiking Meta-learner with Brain-Inspired Task-Aware Modulation Scheme
Jun Niu, Zhaokun Zhou, Kaiwei Che, Li Yuan
CVPR 2022 Discrete Time Convolution for Fast Event-Based Stereo
Kaixuan Zhang, Kaiwei Che, Jianguo Zhang, Jie Cheng, Ziyang Zhang, Qinghai Guo, Luziwei Leng
ROBIO 2021 Motion Planning for Hexapod Robots in Dynamic Rough Terrain Environments
Bingyi Xia, Kaiwei Che, Zhilong Tang, Jiankun Wang, Max Q-H Meng

Research Projects & Experience

AI Infra & Efficient AI

NPUSlim: LLM Quantization, Sparsity & Acceleration Framework on Huawei NPU (Provincial Key Project)
A framework for quantization, sparsity, and inference acceleration on Huawei NPUs. Supports integration and deployment of mainstream models via vLLM / vLLM-ascend. Adapts 2:4 structured sparse operators end to end on the NPU and integrates GPTQ, SparseGPT, and other post-training compression algorithms. Features a layer-block streaming quantization pipeline that can quantize trillion-parameter-scale models offline on a single machine.
Winner-Take-All Spiking Transformer for Language Modeling
Proposes WTA-based attention to replace softmax for fully spike-driven, floating-point-free computation. Develops both Decoder-only and Encoder-only SNN Transformers, achieving SNN SOTA across multiple datasets with only 1/14 the energy of Qwen-1.5B.
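
As a small illustration of the 2:4 structured sparsity pattern mentioned for NPUSlim, the sketch below keeps at most two non-zero values in every group of four consecutive weights. It is a minimal sketch, not NPUSlim's implementation: the function name `prune_2_of_4` is hypothetical, and choosing survivors by magnitude is an assumption made for illustration (methods such as SparseGPT use more elaborate selection criteria).

```python
def prune_2_of_4(weights):
    """Zero out the two smallest-magnitude values in each group of four.

    Hypothetical helper for illustration only; magnitude-based selection
    is an assumption, not the criterion used by any specific framework.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.1, -0.9, 0.4, 0.05, 2.0, -0.3, 0.0, 1.5]
print(prune_2_of_4(row))
```

Because exactly two positions per group survive, the result can be stored as two values plus their positions within the group, which is what lets hardware with 2:4 support skip the zeroed multiplications.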

Agent

SF-RAG: Selective Filtering for Retrieval-Augmented Generation
Addresses context overload and intermediate information issues in traditional RAG. Proposes a lightweight Reasoner for parallel inference and CoT chunk-level utility scoring, enabling dynamic filtering of retrieved documents. Outperforms SELF-RAG, SURE, and other baselines on 5 open-domain QA benchmarks. The Reasoner is a LLaMA-8B model fine-tuned via SFT on GPT-4o synthetic data.
Personal Time Management Agent
A multi-workflow agent built on LangGraph that routes requests by intent to weekly/daily planning and review workflows. Uses interrupt/resume for user confirmation and for filling in missing information, preventing the LLM from modifying plans without authorization and improving controllability.

Technical Blogs

LLM/VLM Technical Blog
Systematically covers VLM technical directions, including discriminative models (CLIP, BLIP-2, Flamingo, Qwen-VL) and generative models (DDPM, LDM, DALL-E, DiT). Includes multimodal fine-tuning experiments using LLaMA-Factory with Qwen3-VL SFT.
RLHF Technical Blog
Covers RL alignment methods (PG, TRPO, PPO, GRPO) from both mathematical principles and engineering implementation perspectives. Includes multimodal RL alignment experiments on Geo3K using verl with Qwen2.5-VL and LoRA GRPO.
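
For readers unfamiliar with GRPO, its central idea is to score each sampled response relative to the other responses drawn for the same prompt, instead of using a learned value function. The sketch below shows only that group-relative advantage step. It is a minimal sketch, not code from the blog or from verl: the function name `group_relative_advantages`, the `eps` stabilizer, and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    the group standard deviation; eps (an assumption) avoids division by
    zero when all rewards in the group are equal.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled answers to one prompt, rewarded 1.0 if correct, else 0.0.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Correct answers receive positive advantages and incorrect ones negative advantages, and a group whose answers all earn the same reward contributes no learning signal.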

Academic Service

I have served as a reviewer for high-impact AI conferences and journals, including:

Conferences: NeurIPS, ICML, ICLR, CVPR, AAAI

Journals: TIP, TNNLS, Pattern Recognition, Scientific Reports

Skills

Programming Languages & Frameworks: Python, C++, Triton; PyTorch
Post-training & Alignment: SFT, RLHF (PPO, GRPO); Frameworks: LLaMA-Factory, verl
Agent Systems: RAG, Tool use; Framework: LangGraph
Inference Optimization: Quantization, sparsity, inference acceleration, NPU deployment; Framework: vLLM
Open Source: SpikingJelly (2k+ stars), NPUSlim, RoboRTS (800+ stars)
Languages: English (CET-6)

Contact

E-mail: chekaiwei@stu.pku.edu.cn
Google Scholar: Kaiwei Che