Chant
Chant is a React SDK that lets web apps built with React automate their tasks and workflows using voice commands.
Why?
Modern web apps are increasingly complex - be it productivity tools, work tools (think Jira, Slack, Notion), data-heavy ecommerce websites, etc. And they are constantly changing to serve their users better. As users, we spend a lot of our time understanding and getting better at these tools - just to do our main work. I've always found this pattern in software absurd and a huge waste of time. Think about how much time is wasted in Jira. Need I say more?
The idea for Chant originated from this frustration, which I have faced for I don't know how long. And this is me saying it as a dev. Think about people in roles that have to work with these tools all day. With the incredible capabilities of voice AI, the time is ripe for such a thing to exist.
How does it work?
The SDK provides React hooks to register app actions and the DOM elements that need to be interacted with to perform those actions. An action is a series of steps written in natural language. These actions and steps are registered as soon as the root component is mounted.
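To make the registration idea concrete, here is a minimal sketch of the kind of registry such hooks might write to. It is plain TypeScript with no React in it, and every name here (`ActionRegistry`, `RegisteredAction`, `RegisteredElement`, the field names, the example action) is an assumption made for illustration, not Chant's actual API.

```typescript
// Hypothetical shapes: this README does not show Chant's real hook API.
interface RegisteredElement {
  id: string;          // stable identifier that steps can refer to
  selector: string;    // CSS selector used to locate the DOM node
  description: string; // natural-language hint about what the element is for
}

interface RegisteredAction {
  name: string;
  steps: string[];      // steps written in natural language
  elementIds: string[]; // elements the action needs to interact with
}

class ActionRegistry {
  private actions = new Map<string, RegisteredAction>();
  private elements = new Map<string, RegisteredElement>();

  registerElement(element: RegisteredElement): void {
    this.elements.set(element.id, element);
  }

  registerAction(action: RegisteredAction): void {
    // Reject actions that reference elements nobody has registered.
    const missing = action.elementIds.filter((id) => !this.elements.has(id));
    if (missing.length > 0) {
      throw new Error(`Unknown element ids: ${missing.join(", ")}`);
    }
    this.actions.set(action.name, action);
  }

  getAction(name: string): RegisteredAction | undefined {
    return this.actions.get(name);
  }

  listActionNames(): string[] {
    return Array.from(this.actions.keys());
  }
}

// Example registration, roughly what would happen when the root component mounts.
const registry = new ActionRegistry();
registry.registerElement({
  id: "title-input",
  selector: "#ticket-title",
  description: "Text field for the ticket title",
});
registry.registerElement({
  id: "submit-button",
  selector: "#ticket-submit",
  description: "Button that creates the ticket",
});
registry.registerAction({
  name: "create-ticket",
  steps: ["Type the ticket title into the title field", "Click the submit button"],
  elementIds: ["title-input", "submit-button"],
});
```

In a real React app, the two `register*` calls would presumably sit inside hooks that run on mount; the sketch keeps them as plain method calls so the data model is easy to see.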
When the user issues a voice command to perform an action, the SDK generates the transcript and validates it against the registered actions using an LLM. If a matching action is found, the SDK asks the LLM to produce the series of steps to execute it. The steps are a JSON array of commands mapped to the DOM elements, with some metadata.
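As an illustration of what "a JSON array of commands mapped to the DOM elements" could look like, the sketch below parses an LLM response and checks each step against the registered element ids before anything is executed. The step shape (`command`, `elementId`, `value`), the command names, and the `parseSteps` helper are all hypothetical; the actual format Chant uses is not specified here.

```typescript
// Hypothetical step format: the real schema and metadata are not shown in this README.
type StepCommand = "click" | "type" | "select";

interface ChantStep {
  command: StepCommand;
  elementId: string; // must match an element registered with the SDK
  value?: string;    // e.g. the text to type
}

const STEP_COMMANDS: readonly string[] = ["click", "type", "select"];

// Parse the raw LLM output and validate every step before executing anything.
function parseSteps(raw: string, knownElementIds: ReadonlySet<string>): ChantStep[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of steps");
  }
  return parsed.map((item: unknown, index: number): ChantStep => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Step ${index} is not an object`);
    }
    const { command, elementId, value } = item as Record<string, unknown>;
    if (typeof command !== "string" || STEP_COMMANDS.indexOf(command) === -1) {
      throw new Error(`Step ${index} has an unsupported command`);
    }
    if (typeof elementId !== "string" || !knownElementIds.has(elementId)) {
      throw new Error(`Step ${index} targets an unregistered element`);
    }
    if (value !== undefined && typeof value !== "string") {
      throw new Error(`Step ${index} has a non-string value`);
    }
    const step: ChantStep = { command: command as StepCommand, elementId };
    if (value !== undefined) {
      step.value = value;
    }
    return step;
  });
}

// Example: a response the LLM might return for a "create a ticket" command.
const llmOutput =
  '[{"command":"type","elementId":"title-input","value":"Fix login bug"},' +
  '{"command":"click","elementId":"submit-button"}]';
const parsedSteps = parseSteps(llmOutput, new Set(["title-input", "submit-button"]));
```

Validating before executing means a hallucinated element id or command fails fast instead of clicking on the wrong thing.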
The key is that LLMs have a rich understanding of the DOM. Given the right context and the DOM element schema for an action, an LLM can produce the right series of steps to execute it. The possibilities are endless.
Desired Vision
Right now, the goal is to figure out a way to test the SDK in different kinds of web apps, focusing mainly on productivity and work tools. Testing is limited to apps built with React only, but that should be a non-problem. The core challenge is to make it work with projects that are set up with different package managers and build tools like npm, yarn, bun, etc., each of which has its own set of conventions and quirks.