Awesome GUI Agent

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

About

Awesome GUI Agent [Awesome](https://github.com/sindresorhus/awesome)

A curated list of papers, projects, and resources for multi-modal Graphical User Interface (GUI) agents.

Build a digital assistant on your screen. Generated by DALL-E-3.

CONTRIBUTIONS WELCOME!

🔥 This project is actively maintained, and we welcome your contributions. If you have any suggestions, such as missing papers or information, please feel free to open an issue or submit a pull request.

🤖 Try our Awesome-Paper-Agent. Provide an arXiv URL, and it will automatically return the paper's information formatted for this list, like this:

User:
https://arxiv.org/abs/2312.13108

GPT:
+ [AssistGUI: Task-Oriented Desktop Graphical User Interface Automation](https://arxiv.org/abs/2312.13108) (Dec. 2023)

  [GitHub](https://github.com/showlab/assistgui)
  [arXiv](https://arxiv.org/abs/2312.13108)
  [Website](https://showlab.github.io/assistgui/)

You can then copy this output directly into your pull request.
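
For contributors who prefer to build an entry by hand, the format shown above can be approximated with a short script. This is a minimal sketch, not the Awesome-Paper-Agent itself: `format_entry` is a hypothetical helper, the title is passed in rather than fetched, the month is read from the new-style arXiv identifier (`YYMM.NNNNN`), and plain link labels (`GitHub`, `arXiv`, `Website`) are an assumption standing in for whatever badge markup the list actually uses.

```python
import re

# Month labels in the style the list uses, e.g. "(Dec. 2023)".
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "Jun.",
          "Jul.", "Aug.", "Sep.", "Oct.", "Nov.", "Dec."]

# New-style arXiv abstract URLs end in YYMM.NNNN or YYMM.NNNNN.
ARXIV_ID = re.compile(r"arxiv\.org/abs/(\d{2})(\d{2})\.\d{4,5}")


def format_entry(title, arxiv_url, code_url=None, site_url=None):
    """Return a list entry for a paper (hypothetical helper, not the agent)."""
    match = ARXIV_ID.search(arxiv_url)
    if match is None:
        raise ValueError(f"not a new-style arXiv abstract URL: {arxiv_url}")
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in arXiv identifier: {arxiv_url}")

    header = f"+ [{title}]({arxiv_url}) ({MONTHS[month - 1]} {year})"
    # Link labels are an assumption; optional links are skipped when absent.
    links = [("GitHub", code_url), ("arXiv", arxiv_url), ("Website", site_url)]
    link_lines = [f"  [{label}]({url})" for label, url in links if url]
    return "\n".join([header, ""] + link_lines)


print(format_entry(
    "AssistGUI: Task-Oriented Desktop Graphical User Interface Automation",
    "https://arxiv.org/abs/2312.13108",
    code_url="https://github.com/showlab/assistgui",
    site_url="https://showlab.github.io/assistgui/",
))
```

The date comes from the identifier, so it reflects the month the paper was first submitted to arXiv. Venue information such as "ICML 2017" still has to be added manually.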

⭐ If you find this repository useful, please give it a star.

Quick Navigation: [Datasets / Benchmarks] [Models / Agents] [Surveys] [Projects]

Datasets / Benchmarks

+ World of Bits: An Open-Domain Platform for Web-Based Agents (Aug. 2017, ICML 2017)

[Paper](https://proceedings.mlr.press/v70/shi17a/shi17a.pdf)

+ A Unified Solution for Structured Web Data Extraction (Jul. 2011, SIGIR 2011)

[Paper](https://dl.acm.org/doi/10.1145/2009916.2010020)

+ Rico: A Mobile App Dataset for Building Data-Driven Design Applications (Oct. 2017)

[Paper](https://dl.acm.org/doi/10.1145/3126594.3126651)

+ Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration (Feb. 2018, ICLR 2018)

[GitHub](https://github.com/stanfordnlp/wge) [arXiv](https://arxiv.org/abs/1802.08802)

+ Mapping Natural Language Instructions to Mobile UI Action Sequences (May 2020, ACL 2020)

[GitHub](https://github.com/deepneuralmachine/seq2act-tensorflow) [arXiv](https://arxiv.org/abs/2005.03776)

+ WebSRC: A Dataset for Web-Based Structural Reading Comprehension (Jan. 2021, EMNLP 2021)

[arXiv](https://arxiv.org/abs/2101.09465) [Website](https://x-lance.github.io/WebSRC/)

+ AndroidEnv: A Reinforcement Learning Platform for Android (May 2021)

[GitHub](https://github.com/deepmind/android_env) [arXiv](https://arxiv.org/abs/2105.13231)

+ A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility (Feb. 2022)

[arXiv](https://arxiv.org/abs/2202.02312)
