> I was wondering whether someone considered integrating Galene with
> automatic speech recognition software like Whisper for transcriptions.
I think it would be a good idea.
> It would be interesting to have realtime transcriptions per stream, but
> I don't know yet how challenging this is technically.
Galene doesn't decode the audio data, so this would need to be done on the
client side. You could either do it in the client itself, or write
a specialised client that receives the audio, transcribes it, and sends
the transcription to the other participants. (I'd start by posting the
transcription in the chat; we could later design a protocol extension that
allows a client to publish captions.)
The advantage of a separate transcription client is that it can be
written in whatever language is convenient, as long as that language has
a WebRTC library.
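
To make the shape of such a client a bit more concrete, here is a rough
sketch of the glue between the pieces. It is only an illustration: every
name in it (TranscriptionRelay, on_audio, send_chat, flush_bytes) is made
up for this example and is not part of Galene or Whisper. It assumes the
WebRTC library hands you decoded audio per stream, that the speech
recogniser is wrapped in a function from bytes to text, and that you have
some function that posts a line to the group chat.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical interfaces, not part of Galene or Whisper.
Transcriber = Callable[[bytes], str]   # decoded audio in, text out
ChatSender = Callable[[str], None]     # posts one line to the group chat


@dataclass
class TranscriptionRelay:
    transcribe: Transcriber
    send_chat: ChatSender
    # Flush a stream once it holds this many bytes of decoded audio
    # (32000 bytes is 1 s of 16 kHz, 16-bit mono; an arbitrary default).
    flush_bytes: int = 32000
    _buffers: Dict[str, bytearray] = field(default_factory=dict)

    def on_audio(self, username: str, pcm: bytes) -> None:
        """Called by the WebRTC layer with decoded audio for one stream."""
        buf = self._buffers.setdefault(username, bytearray())
        buf.extend(pcm)
        if len(buf) >= self.flush_bytes:
            self._flush(username)

    def _flush(self, username: str) -> None:
        buf = self._buffers.pop(username, None)
        if not buf:
            return
        text = self.transcribe(bytes(buf)).strip()
        if text:
            # One chat line per chunk, labelled with the speaker.
            self.send_chat(f"[{username}] {text}")

    def close(self) -> None:
        """Transcribe whatever is left in every stream's buffer."""
        for username in list(self._buffers):
            self._flush(username)
```

Keeping one buffer per stream is what gives you per-stream transcriptions;
the WebRTC and chat plumbing stay outside this class, so they can be
swapped for whatever library you end up using.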
-- Juliusz