The product
Captions started as the go-to video captions editor for beginners. Powered by AI, the app can automatically split the frames to catch blank spots and remove filler words such as "Mmh" and "Euh". Just press the button and ta-da, it's done!
When the team first reached out to us, the goal was to redesign the app and enrich the product with basic video editing features such as split, trim, picture-in-picture, and transitions. We also built on the captioning features and enabled more options such as highlighting words, editing styles, and matching stickers to words.
Opportunity
Soon the company started riding the wave of the new generation of AI language models and pivoted towards artificial intelligence as the center of its offering. From voice assistants to text-to-image generators, AI opens a universe of possibilities, and through the power of this technology, Captions decided to take speech recognition to the next level.
Lip dub was the first feature we designed, a powerful one that lets you change what you're saying. Use case examples:
* You recorded a new video but made a mistake. You can overdub a word or an entire part, and the AI will make your lips move to match the new script.
* You want to reach a whole new audience in another part of the world. Just press the button and the AI will translate the entire video into another language.
User engagement and response were so positive that we decided to bring it to the next level and create Captions' very own script-writing assistant. Imagine: you describe what you want to talk about, and the app writes the entire script for you.
Process
The first iteration was a simple idea generator that gives users daily ideas so they can create content faster. It scans the user's project content, sees what they talk about, and suggests ideas related to their interests.
We imagined an experience where the user needs at least 5 projects to unlock the idea generator. That would encourage users to engage with the app to unlock the feature.
In this experience, the user gets one idea (or a few) at a time and can shuffle, like, or control how random it is. It was already a step forward in injecting more AI power into the product, but why not push it even further?
What if the AI could take that idea and write the script for the next TikTok video or tech review you want to record? Meet Geniio, your script-writing assistant :)
The flow:
* Tap on an idea and generate a script from it
* Go through the settings first to adjust that idea and narrow down the result: Title, description, tone of voice, keywords, and length
* Get a script with a layout that's made for video (a hook, an intro, a paragraph for each outline item, and a conclusion)
* On the script, see the word count, save it for later, and go back to settings to adjust the input
Optimization
We tested the feature and heard from users that some aspects of the AI were not clear. We also prioritized the script-writer AI and moved the idea generator to the next phase.
* Templates for customization
People record videos for different purposes, and the format of an ad for a local hair salon is very different from that of an influencer talking about tech, so we introduced a template section in the editor.
* Talk to a human
We made it clearer that users can describe exactly what they want, as they would to a real human, by adding explicit hint text and real examples.
We wanted to teach the users that filling in the inputs enables them to 'Generate' the script output. Once on the output screen (the script), the user can easily 'Edit' and go back to editor mode. The input from users and the output from the AI are like a conversation.
* Walk-through
We set a first-time-use event (scroll after 5 seconds) to trigger a bottom drawer that encourages users to go back and forth between the script and the editor and adjust until they get the best result.
* Navigation
At the top level of navigation, we placed 'Version history' so users can access previous versions. From the script mode, the user can 'Record', 'Shuffle', or 'Delete'.
* Teleprompter
From the script, the user should be able to go straight to teleprompter mode and record it.
* Dynamic island
While in another app, the user should be able to see their script and record a video.
Iterations
This feature is the result of a continuous conversation with the client to follow his vision, refined with user feedback. From ideas-only to ideas-to-script to editor-script mode, it took a few tries to get the flow and the layout right while keeping the design smooth and the vibe right.