From: Juliusz Chroboczek <jch@irif.fr>
To: galene@lists.galene.org
Subject: [Galene] What I've learnt about background blur
Date: Mon, 16 Dec 2024 13:53:12 +0100
Message-ID: <87seqnhjrb.wl-jch@irif.fr>
What I've learned while implementing background blur.
There are multiple segmentation techniques
==========================================
The first step is to segment the image into background and foreground.
There are multiple techniques to do that, and they yield different
results.
In addition to the traditional techniques, such as chroma keying, there
are at least two ML-based approaches: semantic segmentation and depth
estimation. Semantic segmentation is trained to recognise human figures,
while depth estimation guesses the distance of every pixel from the
camera.
Both approaches work surprisingly well (computer vision is really close to
magic). They have different privacy issues: semantic segmentation will
unblur the background when a human walks by, and depth estimation will
unblur objects lying on your desk. (The obvious solution would be to do
both and combine the results, but that seems a little overkill.)
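As a rough illustration of what "do both and combine" could mean, here is
a minimal sketch. It assumes each model yields a per-pixel value in [0, 1]
(1 meaning "person" for segmentation, 1 meaning "near the camera" for
a thresholded depth map) and keeps a pixel sharp only if both agree. The
function name and the sample values are hypothetical; this is not
something Galene does.

```python
def combine_foreground(person_mask, near_mask):
    """Keep a pixel in the foreground only if both masks say so.

    Taking the per-pixel minimum is a soft intersection: a passer-by is
    a person but far away, a mug on the desk is near but not a person,
    so both end up in the (blurred) background.
    """
    return [min(p, n) for p, n in zip(person_mask, near_mask)]


# Four sample pixels: speaker, mug on the desk, passer-by, back wall.
person = [1.0, 0.0, 1.0, 0.0]  # hypothetical semantic segmentation output
near = [1.0, 1.0, 0.0, 0.0]    # hypothetical thresholded depth output
combined = combine_foreground(person, near)
```

Only the first pixel (the speaker) stays in the foreground here.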
I've used semantic segmentation in Galene, since that's what MediaPipe
implements. I haven't experimented with depth estimation.
Blending is not that obvious
============================
The obvious postprocessing technique is to blur the original image in
order to construct the blurred background. That's what both Jitsi Meet [1]
and LiveKit [2] do, and that's what I did initially. It turns out that it
doesn't yield very good results: the foreground bleeds into the
background, causing a halo around the foreground figure.
Thankfully, the problem is well known in the Photoshop community, and the
simple solution is to mask before blurring: this way, you're blurring the
background only, and avoid bleeding the foreground into the background.
That's what Galene does now, and I find the results satisfactory, at least
on 640x400 videos. (I think this can be improved still, but I'm currently
waiting for my mana bar to replenish.)
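The difference between the two orderings can be seen on a single row of
pixels. The sketch below is an illustration only, not Galene's code: it
uses a box filter instead of a Gaussian, works on a 1-D list of grey
values, and renormalises the masked blur by the blurred mask so that the
background does not darken near the edge (that renormalisation is my
assumption about one reasonable way to do it). All function names are
hypothetical.

```python
def box_blur(xs, radius):
    """Blur a 1-D list with a box filter, shrinking the window at the ends."""
    n = len(xs)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out


def composite(image, mask, background):
    """Foreground where mask is 1, the given background where mask is 0."""
    return [m * p + (1 - m) * b for p, m, b in zip(image, mask, background)]


def blur_then_composite(image, mask, radius):
    """Naive order: blur the whole frame, foreground included."""
    return composite(image, mask, box_blur(image, radius))


def mask_then_blur(image, mask, radius):
    """Mask first: only background pixels contribute to the blur."""
    weight = [1 - m for m in mask]
    bg_only = [p * w for p, w in zip(image, weight)]
    num = box_blur(bg_only, radius)
    den = box_blur(weight, radius)
    # Divide by the blurred weights so masked-out pixels do not count as black.
    blurred = [a / d if d > 0 else 0.0 for a, d in zip(num, den)]
    return composite(image, mask, blurred)


# A bright figure (1.0) in front of a uniform grey background (0.5).
image = [0.5] * 4 + [1.0] * 4 + [0.5] * 4
mask = [0.0] * 4 + [1.0] * 4 + [0.0] * 4

naive = blur_then_composite(image, mask, 2)
masked = mask_then_blur(image, mask, 2)
```

With the naive order, the background pixel just outside the figure
(index 3) comes out brighter than the true background, which is the halo;
with masking first it stays at the background value.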
Interestingly, Google Meet apparently does something more complicated:
they use a WebGL shader to vary the amount of blurring depending on
something or something else, and use a bilateral filter instead of
a Gaussian filter. Unfortunately, their posting on the subject [3] does
not make a lot of sense to me (they're applying a bilateral filter to
a monochrome mask? where does the CoC come from?); I suspect that the
technical writer did not fully understand what the engineers told them.
[1]: https://github.com/jitsi/jitsi-meet/blob/master/react/features/stream-effects/virtual-background/JitsiStreamBackgroundEffect.ts#L102
[2]: https://github.com/livekit/track-processors-js/blob/main/src/transformers/BackgroundTransformer.ts#L143
[3]: https://research.google/blog/background-features-in-google-meet-powered-by-web-ml/
-- Juliusz