
© 2026 YIKZERO

New Paradigms for Visual Interaction in the Age of AI

Design · April 24, 2024
This article was translated from Chinese with GPT-5.5. If there are any discrepancies, please refer to the original Chinese version.

Preface

I’ve been thinking about what form the “tools” we take for granted will take in the age of artificial intelligence.

Today, most AI applications are confined to chat interfaces, where interaction follows a text-based question-and-answer model. As the technology advances, Vercel has introduced Generative UI, which dynamically generates customized interactive interfaces based on the content and needs of a conversation. What form will future UI take? Will it evolve into what we see in movies, with holographic displays and intuitive interactive HUDs, or will we enter a new era that no longer relies on traditional GUIs?

Natural language becoming the fundamental interface for every software application might mean the end of GUIs

Random Thoughts

In the future, could I simply say, “Siri, book me a plane ticket,” or, as in Musk’s brain-computer interface experiments, transmit information with only a thought? The information to be presented would appear directly before the eyes (in that case, would blind people also be able to see?), with no need for any external device. Anything that needs to be controlled could be operated with finger and eye movements, as with VR glasses. Perhaps the future really will be like Black Mirror.

Research

Ai Pin

Humane’s Ai Pin can be considered a pioneer. The device attaches magnetically to clothing, has no screen, and is controlled by voice and a small touch surface. A laser projects images onto your palm, and a camera recognizes gestures for switching and other operations.

Features it supports now or may support in the future: laser projection display, gesture and voice interaction, AI assistant, real-time translation, health monitoring, music playback, message notifications, and more.

AI Pin product screenshot

Rabbit

Rabbit R1 is an AI device launched by Rabbit, positioned as your pocket companion. It mainly consists of a screen, camera, scroll wheel, and button.

Features it supports now or may support in the future: understanding and executing actions (hailing rides, ordering food, playing music), real-time translation, teaching mode, etc. You can watch the review video.

Rabbit R1 front render

Dot

Dot is a chat app on iOS. Users can send it text, voice memos, images, and PDF files, and it can also search the web. Currently, Dot communicates through text and aims to become an always-available companion. Unlike most AI chat products, Dot remembers what you’ve said before and almost never forgets it.

Dot product interface

Amazon Lex

Amazon Lex V2 is an AWS service that lets developers build conversational interfaces for applications, with support for voice and text input. Users interact in natural language, for example to place an order. Lex V2 can also be combined with web elements (buttons, select boxes, etc.) to solve users’ problems or help them complete specific tasks through Semi-Guided Conversations.
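
As a rough illustration of a semi-guided turn, the sketch below builds a reply that pairs a text prompt with button options, so the user can either tap a choice or type freely. The dictionary layout is loosely modeled on the response-card messages in the Lex V2 Lambda response format and should be checked against the AWS documentation before use. The function name `build_guided_reply`, the `OrderPizza` intent, the `Size` slot, and the options are all hypothetical.

```python
import json


def build_guided_reply(slot_name, intent_name, prompt, options):
    """Build a reply that asks for one slot and offers buttons as shortcuts.

    Assumption: the layout (sessionState + messages) mirrors the Lex V2
    Lambda response format; verify field names against the AWS docs.
    """
    # Each option is a (label shown to the user, value sent back) pair.
    buttons = [{"text": label, "value": value} for label, value in options]
    return {
        "sessionState": {
            "dialogAction": {"type": "ElicitSlot", "slotToElicit": slot_name},
            "intent": {"name": intent_name, "state": "InProgress"},
        },
        "messages": [
            # Plain-text prompt for voice or text-only channels.
            {"contentType": "PlainText", "content": prompt},
            # Card with buttons for channels that can render them.
            {
                "contentType": "ImageResponseCard",
                "imageResponseCard": {"title": prompt, "buttons": buttons},
            },
        ],
    }


# Hypothetical example: ask for a pizza size, offering two buttons.
reply = build_guided_reply(
    slot_name="Size",
    intent_name="OrderPizza",
    prompt="What size would you like?",
    options=[("Small", "small"), ("Large", "large")],
)
print(json.dumps(reply, indent=2))
```

The point of the design is that the same turn carries both a free-text prompt and a fixed set of choices, so the conversation stays open while the common paths are one tap away.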

Summary

These examples are not necessarily representative, but they offer a glimpse of the overall trend:

  • The hardware form factor has changed. Beyond phones and computers, people are trying to present content on more portable devices, such as necklaces, badges, and pagers.
  • Visual content is being presented in more diverse ways, such as projection and laser displays. Because of size constraints, it is mostly limited to text, simple charts, and buttons.
  • Natural language interaction will be used more and more widely.

My Thoughts

Generative UI Is a Future Trend

At present, interface design must satisfy as many people as possible. Any experienced designer knows the main drawback of this approach: no one ends up completely satisfied. Personalization and customization also play a relatively small role.

Generative UI addresses this. Its application scenario is somewhat similar to the dynamic personalized cards on Alipay’s home screen, which change with location (for example, showing airport-related information when you are at the airport).

iOS Smarter Siri by @upintheozone

Moreover, generative UI is not limited to dynamic cards; it can also dynamically present an entire system for you based on personal data. The article Generative UI and Outcome-Oriented Design gives examples of what true personalization for every individual looks like:

  • Users with dyslexia: the application displays special fonts and color contrast
  • Users who care about cost and time: the application displays this information explicitly and automatically gives matching flights more weight
  • Although it is the same application, the interface, features, and more are all customized for each person
Comparison between traditional UI and generative UI

Designers will also shift from universal design to personalized design, defining the boundaries of each personalized scenario so that the AI stays within set constraints.
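
To make the idea of designer-defined boundaries concrete, here is a minimal sketch of how an interface configuration might be derived from personal data while staying inside limits a designer has set. Everything in it is an assumption for illustration: the `UserProfile` fields, the font size range, the font names, and the weighting factor are hypothetical and do not come from any particular product or from the cited article.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    dyslexia: bool = False
    priorities: list = field(default_factory=list)  # e.g. ["cost", "time"]


# Designer-defined boundaries (hypothetical values): whatever is generated
# must stay within these.
FONT_SIZE_RANGE = (14, 22)
ALLOWED_FONTS = {"default": "System Sans", "dyslexia": "OpenDyslexic"}


def clamp(value, low, high):
    """Keep a generated value inside the designer's allowed range."""
    return max(low, min(high, value))


def generate_ui(profile, requested_font_size=16):
    """Derive a per-user UI config without leaving the designer's limits."""
    return {
        "font": ALLOWED_FONTS["dyslexia" if profile.dyslexia else "default"],
        "font_size": clamp(requested_font_size, *FONT_SIZE_RANGE),
        "high_contrast": profile.dyslexia,
        # Surface only the fields this user said they care about.
        "highlighted_fields": [p for p in profile.priorities if p in ("cost", "time")],
    }


def rank_flights(flights, profile):
    """Sort flights so that the user's priorities weigh more heavily."""
    weights = {"cost": 1.0, "time": 1.0}
    for p in profile.priorities:
        if p in weights:
            weights[p] = 3.0  # hypothetical boost factor
    # Lower score is better: cost is scaled to hundreds, time is in hours.
    return sorted(
        flights,
        key=lambda f: weights["cost"] * f["cost"] / 100 + weights["time"] * f["hours"],
    )


flights = [
    {"id": "A", "cost": 400, "hours": 3},
    {"id": "B", "cost": 200, "hours": 6},
]
cost_first = rank_flights(flights, UserProfile(priorities=["cost"]))
time_first = rank_flights(flights, UserProfile(priorities=["time"]))
print([f["id"] for f in cost_first])  # ['B', 'A']
print([f["id"] for f in time_first])  # ['A', 'B']
print(generate_ui(UserProfile(dyslexia=True), requested_font_size=40)["font_size"])  # 22
```

The same flight list produces a different order for each user, and a requested font size of 40 is pulled back to the designer's maximum of 22, which is the kind of constraint the paragraph above describes.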

Traditional Interaction Will Not Be Replaced, but It Will Be Greatly Simplified

Although natural language interaction has advantages, in scenarios that require fast and precise input, visual elements such as buttons, icons, and gestures are still necessary. Pure voice input may reduce efficiency. Also, not all tasks can be completed well through voice conversations alone.

Different interaction methods suit different situations. Offering both visual and voice input/output best follows accessibility design principles, letting people with disabilities choose the method that suits them. Once LLMs can complete specific tasks, interfaces will be greatly simplified.

In addition, natural language interaction does not suit every setting. In offices, libraries, and similar places, for example, speaking aloud may raise concerns such as privacy leaks.

Final Notes

I wanted to range as widely as possible, but as I kept writing I still ended up back at the GUI itself; I was merely imagining the future forms of existing products...

References

  • ScreenAI: A visual language model for UI and visually-situated language understanding
  • The AI Device Revolution Isn’t Going to Kill the Smartphone
  • Meet Dot, an AI companion designed by an Apple alum, here to help you live your best life
  • Malleable software in the age of LLMs
  • Generative UI and Outcome-Oriented Design
  • UFO: A UI-Focused Agent for Windows OS Interaction
  • Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs