What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking

Large language models (LLMs) struggle with decision-making in high-stakes environments such as multiplayer online battle arena (MOBA) games, primarily because they lack proactive reasoning and have a limited understanding of complex game dynamics. To address this, we propose What-if Analysis LLM (WiA-LLM), a framework that trains an LLM as an explicit, language-based world model.

Instead of representing the environment in latent vectors, WiA-LLM uses natural language to simulate how the game state evolves over time in response to candidate actions, and provides textual justifications for these predicted outcomes. WiA-LLM is trained in two stages: supervised fine-tuning on human-like reasoning traces, followed by reinforcement learning with outcome-based rewards that measure the alignment between predicted and actual future states.
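As an illustration of what an outcome-based reward of this kind could look like, the sketch below scores a predicted future state against the observed one as the fraction of matching fields. The abstract does not specify the actual reward; the function name `outcome_reward`, the dictionary state representation, and the field names are all hypothetical and chosen only for this example.

```python
from typing import Dict


def outcome_reward(predicted: Dict[str, str], actual: Dict[str, str]) -> float:
    """Return the fraction of fields in the actual state that the prediction matches.

    Hypothetical reward for illustration only; not the reward used by WiA-LLM.
    """
    if not actual:
        # No ground-truth fields to compare against, so give no reward.
        return 0.0
    matches = sum(
        1 for key, value in actual.items() if predicted.get(key) == value
    )
    return matches / len(actual)


# Hypothetical game-state fields, assumed to be parsed from the model's text output.
predicted = {"tower_status": "destroyed", "gold_lead": "increase", "dragon": "taken"}
actual = {"tower_status": "destroyed", "gold_lead": "increase", "dragon": "not_taken"}

reward = outcome_reward(predicted, actual)  # two of three fields match
```

A field-wise match like this is only one possible design; it assumes the model's natural-language prediction has already been parsed into discrete fields.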

In the Honor of Kings (HoK) environment, WiA-LLM attains 74.2% accuracy in forecasting game-state changes, a 27% improvement over the base model.

In addition, WiA-LLM demonstrates strategic behavior more closely aligned with expert players than purely reactive LLMs do, indicating enhanced foresight and expert-like decision-making.
