
One of the most useful applications of AI right now, in my view, is note-taking. I love being able to speak out loud and have something reliably listening to every word. It’s almost therapeutic, a moment to empty my head and lay everything out in one place.
What would make it even better is adding some structure to that brain dump: pulling out insights, spotting patterns, and understanding where I’m making sense… and where I’m not. This is especially helpful when I’m exploring new ideas.
That’s the thinking behind this Note Taker app. I want to record, transcribe in real time, and instantly receive a clear summary and next steps. When I first started building this, real-time speech-to-text wasn’t available through the OpenAI APIs, so I had to rely on the browser’s built-in capabilities. I’ve now updated the code so the app uses one of OpenAI’s models directly, which makes the experience far more accurate and reliable.
One lesson I’ve learned while building this is that new ideas tend to appear as soon as I start experimenting. For example, I thought it would be brilliant to use this during online meetings. Ideally, it would detect different speakers automatically so the notes become far more structured and useful. Most note-taking apps today make us overly reliant on their transcription and summarisation. But what if I could jot down key points myself and have real-time AI collaboration enhancing those notes as I go?
The closest app I’ve tried that does a decent job of this is Granola. The main issue I’ve found, though, is that if someone in a meeting room speaks without being logged in, the system can’t tell who it is. It simply treats all sound picked up by the device’s microphone as coming from the user. This is where diarisation becomes essential: the ability to identify each person’s voice as a unique signature and automatically link their words to the correct speaker in the notes. The technology isn’t quite there yet, but I wouldn’t be surprised if it arrives sooner rather than later.
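To make the "voice as a unique signature" idea a bit more concrete, here is a minimal Python sketch. It is not how Granola or any real diarisation system works, and it is not code from the Note Taker app. It assumes that some upstream model has already turned each stretch of audio into a small vector of numbers (a hypothetical voice embedding) and that a reference vector is stored for each known speaker. The `Segment` class, `label_segments` function, and the 0.8 threshold are all made up for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Segment:
    text: str
    voice: list[float]  # hypothetical voice embedding for this stretch of audio


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_segments(segments, known_voices, threshold=0.8):
    """Pair each segment's text with the closest known speaker, or 'Unknown'."""
    labelled = []
    for seg in segments:
        # Start at the threshold so weak matches fall through to "Unknown".
        best_name, best_score = "Unknown", threshold
        for name, signature in known_voices.items():
            score = cosine(seg.voice, signature)
            if score >= best_score:
                best_name, best_score = name, score
        labelled.append((best_name, seg.text))
    return labelled


# Illustrative data only: two known speakers and three transcribed segments.
known = {"Ana": [1.0, 0.0], "Ben": [0.0, 1.0]}
segments = [
    Segment("Let's ship on Friday.", [0.9, 0.1]),
    Segment("I need more time.", [0.1, 0.95]),
    Segment("Hello?", [0.7, 0.7]),
]
notes = label_segments(segments, known)
for speaker, text in notes:
    print(f"{speaker}: {text}")
```

The third segment sits halfway between both signatures, so it is labelled "Unknown" rather than guessed. That mirrors the meeting-room problem: a voice that doesn't match anyone the system knows should be flagged, not silently attributed to the user.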
And for full transparency, I’m vibe-coding this all the way.
Feel free to take a look at it here. You will need your own OpenAI API key. Please share any thoughts. Enjoy!