Generative UI

Generative UI: LLMs are Effective UI Generators

Yaniv Leviathan, Dani Valevski, Matan Kalman, Danny Lumen,
Eyal Segalis, Eyal Molad, Shlomi Pasternak, Vishnu Natchu, Valerie Nygaard, Srinivasan (Cheenu) Venkatachar, James Manyika, Yossi Matias

Google Research

Abstract

AI models excel at creating content, but typically render it with static, predefined interfaces. Specifically, the output of LLMs is often a markdown “wall of text”. Generative UI is a long-standing promise in which the model generates not just the content, but the interface itself. Until now, Generative UI was not possible in a robust fashion. We demonstrate that when properly prompted and equipped with the right set of tools, a modern LLM can robustly produce high-quality custom UIs for virtually any prompt. When generation speed is ignored, results generated by our implementation are overwhelmingly preferred by humans over the standard LLM markdown output. In fact, while the results generated by our implementation are worse than those crafted by human experts, they are at least comparable in 44% of cases. We show that this ability for robust Generative UI is emergent, with substantial improvements over previous models. We also create and release PAGEN, a novel dataset of expert-crafted results to aid in evaluating Generative UI implementations, as well as the results of our system for future comparisons.

High Level Method Overview

Our Generative UI implementation produces a fully generated web page that is rendered as-is in the user's browser. To that end, we employ three main components:

A server which exposes several endpoints enabling access to key tools, such as image generation and search.
Carefully crafted system instructions for Gemini, which include the goal, planning guidelines, examples and more.
A set of post-processors which fix a set of common issues that couldn't be fixed with the system instructions alone.
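
To make the third component concrete, the following is a minimal sketch of how a chain of post-processors might be applied to a generated page. It is an illustration only, not our actual implementation: the function names and the two example fixes (removing a stray markdown fence around the HTML and adding a missing doctype) are hypothetical and were chosen only because they are easy to verify.

```python
import re
from typing import Callable, List

# A post-processor takes the generated page as a string and returns a fixed page.
PostProcessor = Callable[[str], str]

# Built programmatically so the fence characters never appear literally here.
FENCE = "`" * 3
FENCE_PATTERN = re.compile(
    r"^\s*" + FENCE + r"(?:html)?\s*\n(.*?)\n" + FENCE + r"\s*$",
    flags=re.DOTALL,
)


def strip_markdown_fence(html: str) -> str:
    """Hypothetical fix: unwrap the page if it arrived inside a markdown fence."""
    match = FENCE_PATTERN.match(html)
    return match.group(1) if match else html


def ensure_doctype(html: str) -> str:
    """Hypothetical fix: prepend a doctype when the page does not start with one."""
    if html.lstrip().lower().startswith("<!doctype"):
        return html
    return "<!DOCTYPE html>\n" + html


def run_post_processors(html: str, processors: List[PostProcessor]) -> str:
    """Apply each post-processor in order, feeding the output of one to the next."""
    for process in processors:
        html = process(html)
    return html


raw_page = FENCE + "html\n<html><body>hi</body></html>\n" + FENCE
fixed_page = run_post_processors(raw_page, [strip_markdown_fence, ensure_doctype])
```

Keeping each fix as a small independent function means a new fix can be added or removed without touching the others, which fits the idea of patching issues that the system instructions alone do not resolve.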

Results

We evaluate user preference across five result formats: a custom website crafted for the prompt by a human expert, the top Google Search result for the query, text (LLM output without markdown), standard LLM output (in markdown format), and our Generative UI implementation. We randomly sampled 100 prompts from LMArena and collected pairwise preferences from human raters, sending each result to two raters. The following tables show the resulting Elo scores and side-by-side user preference for each of the UI modalities. Generative UI obtains an Elo score of 1710.7, indicating a strong user preference over all other formats except the websites crafted by human experts. See more details in the paper.
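
For readers unfamiliar with Elo scores, the sketch below shows one common way to turn pairwise preferences into ratings, using the standard sequential Elo update. It is not the exact procedure behind the numbers above (see the paper for that): the K-factor of 32, the initial rating of 1500, and the format labels in the usage example are assumptions made for illustration, and ties are not handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_from_pairwise(
    comparisons: Iterable[Tuple[str, str]],
    k: float = 32.0,         # assumed K-factor
    initial: float = 1500.0,  # assumed starting rating
) -> Dict[str, float]:
    """Compute ratings from (winner, loser) pairs, processed in order."""
    ratings: Dict[str, float] = defaultdict(lambda: initial)
    for winner, loser in comparisons:
        expected_win = expected_score(ratings[winner], ratings[loser])
        # The winner gains exactly what the loser gives up.
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical example: a single preference between two equally rated formats.
example = elo_from_pairwise([("generative_ui", "markdown")])
```

With two formats starting at the same rating, the expected score is 0.5, so a single preference moves the winner to 1516 and the loser to 1484. Because sequential Elo depends on the order of comparisons, a practical evaluation would typically shuffle and average over several passes or fit the ratings jointly.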

Generative UI is an emerging capability of newer models: we see strong user preference and drastically fewer errors for results produced with the new Gemini models.
