LLM Framework Adapts Dialogue Policies to User Personalities

Hui Wang, Fafa Zhang, Meng Liu, Xiangyu Chen, Chaoxu Mu · June 15, 2026

Summary

This paper introduces UP-NRPA, a User Portrait based Nested Rollout Policy Adaptation framework that lets Large Language Models customize dialogue strategies dynamically. It adapts to diverse user characteristics in real time using feedback and user portraits, achieving high success rates in goal-oriented dialogue tasks without offline reinforcement learning.

This research addresses the challenge of building dialogue systems that adjust their strategies to individual user characteristics. The paper proposes UP-NRPA (User Portrait based Nested Rollout Policy Adaptation), an online framework that uses Large Language Models (LLMs) for this purpose. Traditional methods rely on extensive offline training for different user groups; UP-NRPA instead adapts in real time. It combines immediate user feedback with a "user portrait" (the user's personality, preferences, and objectives) to customize the dialogue policy on the fly, with no pre-trained reinforcement learning model required. In evaluations on collaborative and non-collaborative dialogue benchmarks, UP-NRPA achieved a 100% success rate in several dialogue tasks and increased the sale-to-list ratio by 56.41% in negotiation scenarios. These results indicate that the framework can adapt to diverse user needs without a dedicated training mechanism, making dialogue systems more responsive and effective.
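The summary above does not spell out the algorithm, so the following is only a minimal sketch of generic Nested Rollout Policy Adaptation applied to a toy dialogue setting, not the authors' implementation. The strategy names, the `portrait` dictionary, and `toy_reward` are all hypothetical; in particular, `toy_reward` is a deterministic stand-in for the user or LLM feedback that the framework would use in practice.

```python
import math
import random

# Hypothetical dialogue strategies; the paper's actual action set is not given here.
STRATEGIES = ["build_rapport", "ask_needs", "offer_discount", "stress_quality"]
HORIZON = 3   # number of system turns in one simulated dialogue
ALPHA = 1.0   # step size used when adapting the policy


def toy_reward(portrait, seq):
    """Stand-in for real feedback: count turns whose strategy the portrait favors."""
    return sum(1.0 for s in seq if s in portrait["likes"])


def rollout(policy, portrait, rng):
    """Sample one strategy per turn with probability proportional to exp(weight)."""
    seq = []
    for turn in range(HORIZON):
        weights = [math.exp(policy.get((turn, s), 0.0)) for s in STRATEGIES]
        seq.append(rng.choices(STRATEGIES, weights=weights)[0])
    return toy_reward(portrait, seq), seq


def adapt(policy, seq):
    """Return a copy of the policy shifted toward the given sequence."""
    new = dict(policy)
    for turn, chosen in enumerate(seq):
        z = sum(math.exp(policy.get((turn, s), 0.0)) for s in STRATEGIES)
        for s in STRATEGIES:
            p = math.exp(policy.get((turn, s), 0.0)) / z
            new[(turn, s)] = new.get((turn, s), 0.0) - ALPHA * p
        new[(turn, chosen)] = new.get((turn, chosen), 0.0) + ALPHA
    return new


def nrpa(level, policy, portrait, rng, iterations=20):
    """Nested search: each level adapts its own policy copy toward its best sequence."""
    if level == 0:
        return rollout(policy, portrait, rng)
    best_score, best_seq = -math.inf, []
    for _ in range(iterations):
        score, seq = nrpa(level - 1, policy, portrait, rng, iterations)
        if score >= best_score:
            best_score, best_seq = score, seq
        policy = adapt(policy, best_seq)  # rebinds locally; the caller's policy is untouched
    return best_score, best_seq


if __name__ == "__main__":
    portrait = {"likes": {"offer_discount"}}  # hypothetical price-sensitive user
    score, seq = nrpa(2, {}, portrait, random.Random(0))
    print(score, seq)
```

Because the search starts from an empty policy and adapts only from rollout scores, nothing is trained offline, which is the property the summary emphasizes. Replacing `toy_reward` with feedback from a live conversation or an LLM-based user simulator is where a real system would differ most from this sketch.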

Why it matters

Personalizing user interactions is key to improving customer satisfaction and achieving business goals in dialogue systems. This framework offers a powerful, adaptive solution for professionals developing chatbots, virtual assistants, and conversational AI, enabling more effective and user-centric communication.

How to implement this in your domain

  1. Integrate UP-NRPA principles into existing conversational AI platforms to enhance user personalization.
  2. Develop robust user profiling mechanisms to create accurate "user portraits" for dynamic adaptation.
  3. Apply the framework to customer service chatbots to improve resolution rates and user satisfaction.
  4. Experiment with UP-NRPA in sales or negotiation AI to optimize outcomes based on individual user behavior.
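As a starting point for the profiling step in the list above, a user portrait can be kept as a small structured record and rendered into text for an LLM prompt. The sketch below assumes Python 3.9+ and is illustrative only: the `UserPortrait` class, its field names, and `build_policy_prompt` are hypothetical, chosen to mirror the three portrait components the summary names (personality, preferences, and objectives).

```python
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """Hypothetical container for the three portrait components named in the summary."""
    personality: str
    preferences: list[str] = field(default_factory=list)
    objective: str = ""

    def to_prompt(self) -> str:
        """Render the portrait as plain text suitable for inclusion in a prompt."""
        prefs = ", ".join(self.preferences) if self.preferences else "unknown"
        return (
            f"User personality: {self.personality}\n"
            f"User preferences: {prefs}\n"
            f"User objective: {self.objective or 'unknown'}"
        )


def build_policy_prompt(portrait: UserPortrait, last_user_turn: str,
                        strategies: list[str]) -> str:
    """Assemble a prompt asking a model to pick the next dialogue strategy."""
    options = "\n".join(f"- {s}" for s in strategies)
    return (
        f"{portrait.to_prompt()}\n\n"
        f"Latest user message: {last_user_turn}\n\n"
        f"Choose the single best next strategy from:\n{options}"
    )


if __name__ == "__main__":
    portrait = UserPortrait(
        personality="cautious",
        preferences=["low price", "fast delivery"],
        objective="buy a used bike under budget",
    )
    print(build_policy_prompt(portrait, "Is the price negotiable?",
                              ["offer_discount", "stress_quality"]))
```

Keeping the portrait as structured data rather than free text makes it easy to update individual fields as new feedback arrives during a conversation.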

Who benefits

Customer Service, E-commerce, Sales, Marketing, Conversational AI

Key takeaways

  • UP-NRPA enables dynamic adaptation of dialogue policies using user portraits and LLMs.
  • It customizes strategies in real time without requiring offline reinforcement learning.
  • The framework achieved a 100% success rate in several dialogue tasks and increased the sale-to-list ratio by 56.41% in negotiation scenarios.
  • This approach makes dialogue systems more responsive to diverse user characteristics.

Original post by Hui Wang, Fafa Zhang, Meng Liu, Xiangyu Chen, Chaoxu Mu

"arXiv:2606.13683v1 Announce Type: new Abstract: To address the challenge that current dialogue policy planning methods struggle to dynamically adapt to diverse user characteristics, this paper proposes a User Portrait based Nested Rollout Policy Adaptation (UP-NRPA) online framew…"

