Personal agents are valuable because they can adapt to individual users over time, reducing repetitive
interactions, minimizing manual specification, and proactively supporting daily tasks based on personal
habits. Instead of requiring users to repeatedly state the same preferences or issue fully explicit
instructions, a personal agent should remember what a user consistently prefers, recognize recurring
routines, and act in alignment with the user’s long-term behavior. This capability is essential for
real-world usability, where efficiency, convenience, and continuity across interactions matter more than
strict instruction completeness.
We highlight PersonalAlign, a new agent task that requires GUI agents to ground personalization in
users' historical interaction records. Under this setting, an agent should not only identify and apply
long-term user preferences to resolve underspecified instructions, but also recognize recurring routines
and provide appropriate assistance at the correct time. This demands hierarchical personalization
capability, where preference-level and routine-level intents are explicitly distinguished and jointly
leveraged. PersonalAlign simultaneously challenges an agent’s long-term memory modeling and its
personalized GUI execution capability.
We introduce AndroidIntent, a large-scale GUI dataset designed to support research on implicit user intents. Based on real intent records, the dataset contains 20k trajectories collected from different users over two months of mobile phone usage. We further annotate 775 preference intents and 215 routine intents to support both execution-level and proactive evaluation. In addition, we carefully select 100 false routine intents to rigorously assess the identification accuracy of proactive suggestions.
To support PersonalAlign, an agent's memory should generalize stable user representations while filtering
out one-off behaviors, separate preferences from routines, and continuously evolve with new user
interactions.
We propose HIM-Agent, a foundational personal memory framework that enables GUI agents to leverage
long-term user records without interfering with original task execution. HIM-Agent maintains a
streaming-update memory and hierarchically organizes memory prototypes into Preference Intent Memory and
Routine Intent Memory through execution-based and state-based filters, enabling effective hierarchical
intent alignment.
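To make the memory requirements above concrete, the following is a minimal toy sketch of a streaming memory that drops one-off behaviors and separates preference-like from routine-like intents. It is not the HIM-Agent implementation: the text above does not specify how the execution-based and state-based filters work, so the class ToyIntentMemory, the Prototype structure, the thresholds min_support and max_hour_spread, and the time-of-day heuristic are all hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import pstdev


@dataclass
class Prototype:
    """Running summary of one recurring intent (hypothetical structure)."""
    count: int = 0
    hours: list = field(default_factory=list)  # hour of day of each occurrence


class ToyIntentMemory:
    """Illustrative only; both thresholds are assumptions, not values from the paper."""

    def __init__(self, min_support=3, max_hour_spread=1.0):
        self.min_support = min_support          # fewer occurrences count as one-off
        self.max_hour_spread = max_hour_spread  # max std. dev. (hours) for a routine
        self.prototypes = defaultdict(Prototype)

    def update(self, app, choice, hour):
        """Streaming update: fold one new interaction record into its prototype."""
        proto = self.prototypes[(app, choice)]
        proto.count += 1
        proto.hours.append(hour)

    def split(self):
        """Return (preferences, routines) as lists of (app, choice) keys."""
        preferences, routines = [], []
        for key, proto in self.prototypes.items():
            if proto.count < self.min_support:
                continue  # filter out one-off behaviors
            # Stand-in heuristic: a stable choice that also recurs at a consistent
            # hour is treated as a routine; otherwise it is a preference.
            # This ignores wrap-around at midnight for simplicity.
            if pstdev(proto.hours) <= self.max_hour_spread:
                routines.append(key)
            else:
                preferences.append(key)
        return preferences, routines


# Usage with made-up records.
memory = ToyIntentMemory()
for hour in (8, 8, 9):        # same order around the same time each day
    memory.update("coffee_app", "oat latte", hour)
for hour in (10, 15, 21):     # same choice, no consistent time
    memory.update("music_app", "jazz playlist", hour)
memory.update("maps_app", "airport", 5)  # seen once

preferences, routines = memory.split()
print("preferences:", preferences)
print("routines:", routines)
```

In this toy setup the coffee order would be a candidate for a proactive suggestion, the playlist would only be used to resolve an underspecified instruction such as "play some music", and the single airport trip is ignored.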
We evaluate a range of GUI agents on AndroidIntent, including GPT-5, Qwen3-VL, and UI-TARS. Despite recent progress, current GUI agents still exhibit limitations in both vague instruction execution and proactive recommendation, highlighting substantial room for improvement toward building more intelligent personalized agents.
We introduce a critical new challenge for agents, PersonalAlign, which requires hierarchical personalization to resolve implicit intents in daily interactions. We present AndroidIntent, a new user-centric GUI benchmark curated with a filter-verify strategy, and propose HIM-Agent, a memory framework that enables hierarchical personalization based on long-term user records. Experiments on AndroidIntent reveal new challenges for current GUI agents and demonstrate the effectiveness of HIM-Agent.