PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Harbin Institute of Technology, Shenzhen
Corresponding author

Building more intelligent personal agents that align with user intents.

PersonalAlign

Personal agents are valuable because they can adapt to individual users over time, reducing repetitive interactions, minimizing manual specification, and proactively supporting daily tasks based on personal habits. Instead of requiring users to repeatedly state the same preferences or issue fully explicit instructions, a personal agent should remember what a user consistently prefers, recognize recurring routines, and act in alignment with the user’s long-term behavior. This capability is essential for real-world usability, where efficiency, convenience, and continuity across interactions matter more than strict instruction completeness.
We introduce PersonalAlign, a new agent task that requires GUI agents to ground personalization in users' historical interaction records. Under this setting, an agent should not only identify and apply long-term user preferences to resolve underspecified instructions, but also recognize recurring routines and offer appropriate assistance at the right time. This demands a hierarchical personalization capability in which preference-level and routine-level intents are explicitly distinguished and jointly leveraged. PersonalAlign therefore challenges both an agent's long-term memory modeling and its personalized GUI execution.

AndroidIntent

We introduce AndroidIntent, a large-scale GUI dataset designed to support research on implicit user intents. Based on real intent records, the dataset contains 20k trajectories collected from different users over two months of mobile phone usage. We further annotate 775 preference intents and 215 routine intents to support both execution-level and proactive evaluation. In addition, we carefully select 100 false routine intents to rigorously assess the identification accuracy of proactive suggestions.
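To make the role of the false routine intents concrete, the sketch below scores proactive suggestions over a mix of annotated routine intents and false routines. It is only an illustration under stated assumptions: the `ProactiveCase` record, the `score_proactive` function, and the choice of precision, recall, and false-alarm rate are hypothetical and are not taken from the AndroidIntent evaluation protocol.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ProactiveCase:
    """One evaluation case (hypothetical schema, not the dataset's format)."""
    is_true_routine: bool  # True for an annotated routine intent, False for a false routine
    agent_suggested: bool  # whether the agent decided to make a proactive suggestion


def score_proactive(cases: Iterable[ProactiveCase]) -> Dict[str, float]:
    """Compute precision, recall, and false-alarm rate of proactive suggestions."""
    cases = list(cases)
    tp = sum(c.is_true_routine and c.agent_suggested for c in cases)
    fp = sum((not c.is_true_routine) and c.agent_suggested for c in cases)
    fn = sum(c.is_true_routine and (not c.agent_suggested) for c in cases)
    tn = sum((not c.is_true_routine) and (not c.agent_suggested) for c in cases)
    return {
        # Of all suggestions made, how many targeted a real routine.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of all real routines, how many received a suggestion.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Of all false routines, how many wrongly triggered a suggestion.
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }
```

Without negative cases such as the false routines, an agent that always suggests would look perfect on recall; including them lets the false-alarm rate expose that behavior.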


HIM-Agent

To support PersonalAlign, an agent's memory should generalize stable user representations while filtering out one-off behaviors, separate preferences from routines, and continuously evolve with new user interactions.
We propose HIM-Agent, a foundational personal memory framework that enables GUI agents to leverage long-term user records without interfering with original task execution. HIM-Agent maintains a streaming update memory and hierarchically organizes memory prototypes into Preference Intent Memory and Routine Intent Memory through execution-based and state-based filters, enabling effective hierarchical intent alignment.
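The following toy sketch illustrates the general idea of a streaming memory that filters out one-off behaviors and separates preferences from routines. It is not the HIM-Agent implementation: the class `ToyHierarchicalMemory`, the `min_support` and `consistency` thresholds, and the simple counting rules standing in for the execution-based and state-based filters are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Prototype:
    """Aggregated statistics for one recurring task (hypothetical structure)."""
    count: int = 0
    choices: Counter = field(default_factory=Counter)  # how the task was executed, e.g. which app
    states: Counter = field(default_factory=Counter)   # context when it happened, e.g. a time bucket


class ToyHierarchicalMemory:
    def __init__(self, min_support: int = 3, consistency: float = 0.8):
        self.min_support = min_support  # assumed threshold for dropping one-off behaviors
        self.consistency = consistency  # assumed share a single value must reach to count as stable
        self.prototypes: Dict[str, Prototype] = defaultdict(Prototype)

    def update(self, task: str, choice: str, state: str) -> None:
        """Streaming update: fold one new user record into its prototype."""
        proto = self.prototypes[task]
        proto.count += 1
        proto.choices[choice] += 1
        proto.states[state] += 1

    def _dominant(self, counter: Counter, total: int) -> Optional[str]:
        """Return the most frequent value if it is consistent enough, else None."""
        value, n = counter.most_common(1)[0]
        return value if n / total >= self.consistency else None

    def preference_memory(self) -> Dict[str, str]:
        """Stand-in for an execution-based filter: tasks executed in a consistent way."""
        result = {}
        for task, proto in self.prototypes.items():
            if proto.count < self.min_support:
                continue  # too few observations, treat as a one-off
            choice = self._dominant(proto.choices, proto.count)
            if choice is not None:
                result[task] = choice
        return result

    def routine_memory(self) -> Dict[str, str]:
        """Stand-in for a state-based filter: tasks that recur in a consistent context."""
        result = {}
        for task, proto in self.prototypes.items():
            if proto.count < self.min_support:
                continue
            state = self._dominant(proto.states, proto.count)
            if state is not None:
                result[task] = state
        return result
```

In this sketch, a task that is always done with the same app but at scattered times would appear only in `preference_memory()`, while a task that also recurs in the same context would additionally appear in `routine_memory()` and could trigger a proactive suggestion when that context comes up again.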


Experiment

We evaluate a range of GUI agents on AndroidIntent, including GPT-5, Qwen3-VL, and UI-TARS. Despite recent progress, current GUI agents still exhibit limitations in both vague instruction execution and proactive recommendation, highlighting substantial room for improvement toward building more intelligent personalized agents.

Conclusion

We introduce PersonalAlign, a new and critical challenge for agents that requires hierarchical personalization to resolve implicit intents in daily interactions. We also present AndroidIntent, a new user-centric GUI benchmark curated with a filter-verify strategy, and propose HIM-Agent, a memory framework that enables hierarchical personalization based on long-term user records. Experiments on AndroidIntent reveal new challenges for current GUI agents and demonstrate the effectiveness of HIM-Agent.