Galène videoconferencing server discussion list archives
* [Galene] Whisper transcriptions?
@ 2022-11-19 20:35 Jeroen van Veen
  2022-11-24 19:52 ` [Galene] " Juliusz Chroboczek
  0 siblings, 1 reply; 2+ messages in thread
From: Jeroen van Veen @ 2022-11-19 20:35 UTC (permalink / raw)
  To: galene

Hi,

I was wondering whether anyone has considered integrating Galene with automatic speech recognition software like Whisper for
transcriptions. It would be interesting to have real-time transcriptions per stream, but I don't know yet how challenging this is technically.
There seem to be two implementations of Whisper:
https://github.com/ggerganov/whisper.cpp
https://github.com/openai/whisper

kind regards,

Jeroen


* [Galene] Re: Whisper transcriptions?
  2022-11-19 20:35 [Galene] Whisper transcriptions? Jeroen van Veen
@ 2022-11-24 19:52 ` Juliusz Chroboczek
  0 siblings, 0 replies; 2+ messages in thread
From: Juliusz Chroboczek @ 2022-11-24 19:52 UTC (permalink / raw)
  To: Jeroen van Veen; +Cc: galene

> I was wondering whether someone considered integrating Galene with
> automatic speech recognition software like Whisper for transcriptions.

I think it would be a good idea.

> It would be interesting to have realtime transcriptions per stream, but
> I don't know yet how challenging this is technically.

Galene doesn't decode the audio data, so this would need to be done on the
client side.  You could either do it in the client itself, or write
a specialised client that receives the audio, transcribes it, and sends
the transcription to the other participants.  (I'd start by having the
transcription in the chat; we could later design a protocol extension that
allows a client to publish captions.)
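
A minimal sketch of the specialised-client approach, under stated
assumptions: the client has already received and decoded each audio stream
to 16 kHz mono samples, and it transcribes in fixed 5-second windows (both
values are illustrative choices, not something Galene prescribes).  The
`transcribe` and `send_chat` hooks are hypothetical placeholders for a
Whisper wrapper and for whatever code posts a chat message; the WebRTC
plumbing is left out entirely.

```python
from collections import defaultdict
from typing import Callable, Dict, List

SAMPLE_RATE = 16_000   # assumption: decoded mono PCM at 16 kHz
WINDOW_SECONDS = 5     # assumption: window length is a tuning choice


class StreamTranscriber:
    """Buffers decoded audio per stream and transcribes it window by window."""

    def __init__(self,
                 transcribe: Callable[[List[float]], str],
                 send_chat: Callable[[str], None]) -> None:
        self.transcribe = transcribe  # hypothetical hook, e.g. wraps Whisper
        self.send_chat = send_chat    # hypothetical hook, posts to the chat
        self.buffers: Dict[str, List[float]] = defaultdict(list)
        self.window = SAMPLE_RATE * WINDOW_SECONDS

    def feed(self, stream_id: str, samples: List[float]) -> None:
        """Append samples for one stream and flush every complete window."""
        buf = self.buffers[stream_id]
        buf.extend(samples)
        while len(buf) >= self.window:
            chunk = buf[:self.window]
            del buf[:self.window]
            text = self.transcribe(chunk).strip()
            if text:
                # Prefix with the stream id so readers can tell speakers apart.
                self.send_chat(f"[{stream_id}] {text}")


if __name__ == "__main__":
    # Stub hooks so the sketch runs without Whisper or a Galene connection.
    demo = StreamTranscriber(lambda chunk: f"{len(chunk)} samples heard", print)
    demo.feed("alice", [0.0] * (SAMPLE_RATE * WINDOW_SECONDS))
```

Keeping one buffer per stream is what gives per-stream transcriptions; a
real client would probably want overlapping windows or silence detection
so that words are not cut at window boundaries.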

The advantage of doing this in a separate, specialised client is that the
transcription client can be written in whatever language is convenient, as
long as it has a WebRTC library.

-- Juliusz
