From: Juliusz Chroboczek
To: galene@lists.galene.org
Date: Mon, 16 Dec 2024 13:53:12 +0100
Message-ID: <87seqnhjrb.wl-jch@irif.fr>
Subject: [Galene] What I've learnt about background blur

What I've learned while implementing background blur.

There are multiple segmentation techniques
==========================================

The first step is to segment the image into background and foreground.
There are multiple techniques to do that, and they yield different
results.

In addition to the traditional techniques, such as chroma keying, there
are at least two ML-based approaches: semantic segmentation and depth
estimation.  Semantic segmentation is trained to recognise human
figures, while depth estimation guesses the distance of every pixel
from the camera.

Both approaches work surprisingly well (computer vision is really close
to magic).  They have different privacy issues: semantic segmentation
will unblur the background when a human walks by, and depth estimation
will unblur objects lying on your desk.  (The obvious solution would be
to do both and combine the results, but that seems a little overkill.)

I've used semantic segmentation in Galene, since that's what MediaPipe
implements.  I haven't experimented with depth estimation.

Blending is not that obvious
============================

The obvious postprocessing technique is to blur the original image in
order to construct the blurred background.
That's what both Jitsi Meet [1] and LiveKit [2] do, and that's what I
did initially.  It turns out that it doesn't yield very good results:
the foreground bleeds into the background, which produces a halo around
the foreground figure.

Thankfully, the problem is well known in the Photoshop community, and
the simple solution is to mask before blurring: this way, you're
blurring the background only, and avoid bleeding the foreground into
the background.  That's what Galene does now, and I find the results
satisfactory, at least on 640x400 videos.  (I think this can be
improved still, but I'm currently waiting for my mana bar to
replenish.)

Interestingly, Google Meet apparently does something more complicated:
they use a WebGL shader to vary the amount of blurring depending on
something or something else, and use a bilateral filter instead of a
Gaussian filter.  Unfortunately, their posting on the subject [3] does
not make a lot of sense to me (they're applying a bilateral filter to a
monochrome mask?  where does the CoC, the circle of confusion, come
from?).  I suspect that the technical writer did not fully understand
what the engineers told them.

[1]: https://github.com/jitsi/jitsi-meet/blob/master/react/features/stream-effects/virtual-background/JitsiStreamBackgroundEffect.ts#L102
[2]: https://github.com/livekit/track-processors-js/blob/main/src/transformers/BackgroundTransformer.ts#L143
[3]: https://research.google/blog/background-features-in-google-meet-powered-by-web-ml/

-- 
Juliusz
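
The difference between blurring the whole image and masking before
blurring, as described in the blending section, can be illustrated on a
single row of pixels.  This is only a toy sketch in plain Python, not
Galene's code: the function names, the box filter (standing in for a
Gaussian), and the renormalisation by the blurred mask are all
assumptions made for the illustration.

```python
def box_blur(row, radius):
    """Blur a 1-D list of floats with a box filter, shrinking the window at the edges."""
    n = len(row)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out


def naive_background(row, radius):
    """Blur everything: foreground pixels leak into the blurred background."""
    return box_blur(row, radius)


def masked_background(row, fg_mask, radius):
    """Remove the foreground first, then blur.

    Dividing by the blurred background mask (an assumption of this sketch)
    keeps pixels near the figure from being darkened by the zeroed-out
    foreground, so only real background pixels contribute.
    """
    bg_mask = [1.0 - m for m in fg_mask]
    weighted = box_blur([p * w for p, w in zip(row, bg_mask)], radius)
    weight = box_blur(bg_mask, radius)
    # Where no background pixel is within reach, use 0.0; such pixels are
    # fully covered by the foreground in the final composite anyway.
    return [c / w if w > 0 else 0.0 for c, w in zip(weighted, weight)]


def composite(row, fg_mask, background):
    """Draw the sharp foreground over the blurred background."""
    return [m * p + (1.0 - m) * b for p, m, b in zip(row, fg_mask, background)]


# Dim background (0.2) with a bright foreground figure (1.0) in the middle.
row = [0.2] * 4 + [1.0] * 4 + [0.2] * 4
fg_mask = [0.0] * 4 + [1.0] * 4 + [0.0] * 4

naive = composite(row, fg_mask, naive_background(row, 2))
masked = composite(row, fg_mask, masked_background(row, fg_mask, 2))

print("naive :", [round(v, 2) for v in naive])
print("masked:", [round(v, 2) for v in masked])
```

On this toy row, the background pixels next to the figure come out
brighter than the true background with the naive blur (the halo), and
stay at the background value when the foreground is masked out first.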