Our research direction: Designing for accessibility
In our early research, we discovered that an important barrier to digital equity is the “accessibility gap”, i.e., the delay between the release of a new feature and the creation of an assistive layer for it. To close this gap, we’re shifting from reactive tools to agentic systems that are native to the interface.
Research pillar: Using multi-system agents to improve accessibility
Multimodal AI tools provide one of the most promising paths to building accessible interfaces. In specific prototypes, such as our work on web readability, we’ve tested a model in which a central Orchestrator acts as a strategic reading manager.
Instead of the user navigating a complex maze of menus, the Orchestrator maintains shared context: it understands the document and makes it more accessible by delegating tasks to expert sub-agents.
- The Summarization agent: Masters complex documents by breaking down information and delegating key tasks to expert sub-agents, making even the deepest insights clear and accessible.
- The Settings agent: Handles UI adjustments, such as scaling text, dynamically.
By testing this modular approach, our research shows that users can interact with systems more intuitively: specialized tasks are always handled by the right expert, without the user needing to hunt for the “correct” button.
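To make the delegation pattern concrete, here is a minimal sketch of an orchestrator that keeps shared context and routes each request to a sub-agent. It is an illustration only, not the prototype's implementation: the names `SharedContext`, `Orchestrator`, `summarization_agent`, and `settings_agent` are hypothetical, the keyword-based routing stands in for whatever intent detection a real system would use, and the summarizer is a placeholder that returns the first sentence instead of calling a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SharedContext:
    """State held by the orchestrator so every sub-agent sees the same document and settings."""
    document: str
    text_scale: float = 1.0
    history: List[Tuple[str, str]] = field(default_factory=list)


def summarization_agent(ctx: SharedContext, request: str) -> str:
    # Placeholder (assumption): return the first sentence of the document.
    # A real agent would call a language model here.
    first_sentence = ctx.document.split(". ")[0].rstrip(".")
    return f"Summary: {first_sentence}."


def settings_agent(ctx: SharedContext, request: str) -> str:
    # Adjust the text scale in the shared context based on simple keywords.
    if "larger" in request or "bigger" in request:
        ctx.text_scale = round(ctx.text_scale * 1.25, 2)
    elif "smaller" in request:
        ctx.text_scale = round(ctx.text_scale / 1.25, 2)
    return f"Text scale set to {ctx.text_scale}"


class Orchestrator:
    """Routes each user request to one sub-agent and records the decision."""

    def __init__(self, document: str) -> None:
        self.ctx = SharedContext(document=document)
        self.agents: Dict[str, Callable[[SharedContext, str], str]] = {
            "summarize": summarization_agent,
            "settings": settings_agent,
        }

    def route(self, request: str) -> str:
        # Keyword matching is a stand-in (assumption) for model-based intent detection.
        keywords = ("text", "font", "larger", "smaller", "bigger")
        if any(word in request for word in keywords):
            return "settings"
        return "summarize"

    def handle(self, request: str) -> str:
        normalized = request.lower()
        agent_name = self.route(normalized)
        reply = self.agents[agent_name](self.ctx, normalized)
        self.ctx.history.append((request, agent_name))
        return reply


if __name__ == "__main__":
    orchestrator = Orchestrator("Agents route tasks. They share context.")
    print(orchestrator.handle("Make the text larger"))
    print(orchestrator.handle("What is this page about?"))
```

The user only ever talks to the orchestrator. Which sub-agent handles the request is an internal decision, which is the property that removes the need to hunt through menus.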
Towards multimodal fluency
Our research also focuses on moving beyond basic text-to-speech toward multimodal fluency. By leveraging Gemini’s ability to process voice, vision, and text simultaneously, we’ve built prototypes that can turn live video into immediate, interactive audio descriptions.
This isn’t just about describing a scene; it’s about situational awareness. In our co-design sessions, we’ve observed how allowing users to interactively query their environment, asking for specific visual details as they happen, can reduce cognitive load and transform a passive experience into an active, conversational exploration.
