Zhenliang Zhang | BIGAI
Large Language Model
[AAAI 2026] Reasoning with Exploration: An Entropy Perspective
In this work, we revisit entropy, a signal of exploration in RL, and examine its relationship to exploratory reasoning in LMs.
Daixuan Cheng, Shaohan Huang, Xuekai Zhu, Bo Dai, Xin Zhao, Zhenliang Zhang, Furu Wei
[NeurIPS 2025 Workshop] ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making
We propose ValuePilot, a two-phase framework for value-driven decision-making. It comprises a dataset generation toolkit (DGT) and a decision-making module (DMM) trained on the generated data.
Yitong Luo, Hou Hei Lam, Ziang Chen, Zhenliang Zhang, Xue Feng
[EMNLP 2025] On Domain-Adaptive Post-Training for Multimodal Large Language Models
This paper systematically investigates domain adaptation of MLLMs via post-training, focusing on data synthesis, training pipeline, and task evaluation.
Daixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Wayne Xin Zhao, Zhongzhi Luan, Bo Dai, Zhenliang Zhang