Galène videoconferencing server discussion list archives
* [Galene]  New Galène protocol
@ 2021-02-03 19:04 Juliusz Chroboczek
  2021-02-03 21:01 ` [Galene] " Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-03 19:04 UTC (permalink / raw)
  To: galene

Dear all,

I've just pushed a new revision of the protocol, that does renegotiation
correctly.  Please let me know if you see any issues.

-- Juliusz



* [Galene] Re: New Galène protocol
  2021-02-03 19:04 [Galene] New Galène protocol Juliusz Chroboczek
@ 2021-02-03 21:01 ` Michael Ströder
  2021-02-03 21:11   ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-03 21:01 UTC (permalink / raw)
  To: Juliusz Chroboczek, galene

On 2/3/21 8:04 PM, Juliusz Chroboczek wrote:
> I've just pushed a new revision of the protocol, that does renegotiation
> correctly.  Please let me know if you see any issues.

Anything in particular you want to be tested?

Ciao, Michael.


* [Galene]  Re: New Galène protocol
  2021-02-03 21:01 ` [Galene] " Michael Ströder
@ 2021-02-03 21:11   ` Juliusz Chroboczek
  2021-02-04 10:49     ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-03 21:11 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

>> I've just pushed a new revision of the protocol, that does renegotiation
>> correctly.  Please let me know if you see any issues.

> Anything in particular you want to be tested?

Behaviour against anything other than Chrome.

For example, we used to destroy the stream and create it anew when the
user changed the value in the "Receive" menu.  We now renegotiate, and
while I've hacked around issues with Chrome, I'm not sure how other
browsers will react to the renegotiation.

The other thing to test is behaviour when large numbers of users arrive
and leave simultaneously, which will happen at the next lecture.

-- Juliusz



* [Galene] Re: New Galène protocol
  2021-02-03 21:11   ` Juliusz Chroboczek
@ 2021-02-04 10:49     ` Michael Ströder
  2021-02-04 18:27       ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-04 10:49 UTC (permalink / raw)
  To: galene

On 2/3/21 10:11 PM, Juliusz Chroboczek wrote:
>>> I've just pushed a new revision of the protocol, that does renegotiation
>>> correctly.  Please let me know if you see any issues.
> 
>> Anything in particular you want to be tested?
> 
> Behaviour against anything other than Chrome.

I now see *lots* of these messages:

ice WARNING: 2021/02/04 11:07:24 failed to send packet: write udp
10.23.45.67:53560->10.23.45.89:50995: use of closed network connection

Is this something to worry about?

Also my screen sharing was frozen in a 2-user conference several times.
Unsharing and sharing again made it work again.

I'm using these PION env vars:

PION_LOG_TRACE=""
PION_LOG_DEBUG=""
PIONS_LOG_INFO="ice"
PIONS_LOG_WARNING="all"
PIONS_LOG_ERROR="all"

Ciao, Michael.


* [Galene] Re: New Galène protocol
  2021-02-04 10:49     ` Michael Ströder
@ 2021-02-04 18:27       ` Michael Ströder
  2021-02-04 20:04         ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-04 18:27 UTC (permalink / raw)
  To: galene

On 2/4/21 11:49 AM, Michael Ströder wrote:
> On 2/3/21 10:11 PM, Juliusz Chroboczek wrote:
>>>> I've just pushed a new revision of the protocol, that does renegotiation
>>>> correctly.  Please let me know if you see any issues.
>>
>>> Anything in particular you want to be tested?
>>
>> Behaviour against anything other than Chrome.
> 
> I now see *lots* of these messages:
> 
> ice WARNING: 2021/02/04 11:07:24 failed to send packet: write udp
> 10.23.45.67:53560->10.23.45.89:50995: use of closed network connection
> 
> Is this something to worry about?

Hmm, this definitely did not happen before yesterday's commits. And it
is really filling my log at quite a high rate.

Ciao, Michael.


* [Galene]  Re: New Galène protocol
  2021-02-04 18:27       ` Michael Ströder
@ 2021-02-04 20:04         ` Juliusz Chroboczek
  2021-02-04 20:47           ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-04 20:04 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

Could you pull and see if it's better?


* [Galene] Re: New Galène protocol
  2021-02-04 20:04         ` Juliusz Chroboczek
@ 2021-02-04 20:47           ` Michael Ströder
  2021-02-04 21:26             ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-04 20:47 UTC (permalink / raw)
  To: galene

On 2/4/21 9:04 PM, Juliusz Chroboczek wrote:
> Could you pull and see if it's better?

I cannot claim to have a real test plan.

But I've managed to reproduce the issue with git revision
6054ae6cc6b25e92f3fc482eb60d2ec0e7ee9ff6 and two Firefox 85 instances with
different profiles by wildly switching receive options, leaving the session
and re-entering it.

I did not manage to reproduce with git revision
b4240c45059d9b4d8c8119fae8a3ccc1a5969377.

So at first glance the latter is an improvement.

But decent testing is something else. ;-)

Ciao, Michael.


* [Galene] Re: New Galène protocol
  2021-02-04 20:47           ` Michael Ströder
@ 2021-02-04 21:26             ` Michael Ströder
  2021-02-04 21:57               ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-04 21:26 UTC (permalink / raw)
  To: galene

On 2/4/21 9:47 PM, Michael Ströder wrote:
> On 2/4/21 9:04 PM, Juliusz Chroboczek wrote:
>> Could you pull and see if it's better?
> 
> I cannot claim to have a real test plan.
> 
> But I've managed to reproduce the issue with git revision
> 6054ae6cc6b25e92f3fc482eb60d2ec0e7ee9ff6 and two Firefox 85 instances with
> different profiles by wildly switching receive options, leaving the session
> and re-entering it.
> 
> I did not manage to reproduce with git revision
> b4240c45059d9b4d8c8119fae8a3ccc1a5969377.
> 
> So at first glance the latter is an improvement.
> 
> But decent testing is something else. ;-)

Hmm, *after* writing the above, the log messages started to appear again.

Ciao, Michael.


* [Galene]  Re: New Galène protocol
  2021-02-04 21:26             ` Michael Ströder
@ 2021-02-04 21:57               ` Juliusz Chroboczek
  2021-02-04 22:06                 ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-04 21:57 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

> Hmm, *after* writing the above, the log messages started to appear again.

Are there any user-visible issues when this happens?  I suspect it's
nothing to worry about, just one of the threads that send data racing with
the thread that shuts the other threads down.


* [Galene] Re: New Galène protocol
  2021-02-04 21:57               ` Juliusz Chroboczek
@ 2021-02-04 22:06                 ` Michael Ströder
  2021-02-04 22:53                   ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-04 22:06 UTC (permalink / raw)
  To: galene

On 2/4/21 10:57 PM, Juliusz Chroboczek wrote:
>> Hmm, *after* writing the above, the log messages started to appear again.
> 
> Are there any user-visible issues when this happens?

Not really.

> I suspect it's nothing to worry about, just one of the threads that
> send data racing with the thread that shuts the other threads down.

Hmm, but pion ice logs this as a warning. Really ignore hundreds/thousands
of messages per minute? And why didn't I see this before?

Ciao, Michael.


* [Galene]  Re: New Galène protocol
  2021-02-04 22:06                 ` Michael Ströder
@ 2021-02-04 22:53                   ` Juliusz Chroboczek
  2021-02-05  9:59                     ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-04 22:53 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

>> I suspect it's nothing to worry about, just one of the threads that
>> send data racing with the thread that shuts the other threads down.

> Hmm, but pion ice logs this as a warning. Really ignore hundreds/thousands
> of messages per minute? And why didn't I see this before?

It looks like I wasn't always closing connections correctly.  Could you
please check if 66de0d solves the issue?


* [Galene] Re: New Galène protocol
  2021-02-04 22:53                   ` Juliusz Chroboczek
@ 2021-02-05  9:59                     ` Michael Ströder
  2021-02-05 10:05                       ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-05  9:59 UTC (permalink / raw)
  To: galene

On 2/4/21 11:53 PM, Juliusz Chroboczek wrote:
>>> I suspect it's nothing to worry about, just one of the threads that
>>> send data racing with the thread that shuts the other threads down.
> 
>> Hmm, but pion ice logs this as a warning. Really ignore hundreds/thousands
>> of messages per minute? And why didn't I see this before?
> 
> It looks like I wasn't always closing connections correctly.  Could you
> please check if 66de0d solves the issue?

Sorry, I still get these warnings.

Ciao, Michael.


* [Galene] Re: New Galène protocol
  2021-02-05  9:59                     ` Michael Ströder
@ 2021-02-05 10:05                       ` Michael Ströder
  2021-02-05 11:03                         ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-05 10:05 UTC (permalink / raw)
  To: galene

On 2/5/21 10:59 AM, Michael Ströder wrote:
> On 2/4/21 11:53 PM, Juliusz Chroboczek wrote:
>>>> I suspect it's nothing to worry about, just one of the threads that
>>>> send data racing with the thread that shuts the other threads down.
>>
>>> Hmm, but pion ice logs this as a warning. Really ignore hundreds/thousands
>>> of messages per minute? And why didn't I see this before?
>>
>> It looks like I wasn't always closing connections correctly.  Could you
>> please check if 66de0d solves the issue?
> 
> Sorry, I still get these warnings.

Well, my impression is that fewer warning messages are logged, but that is
not something I can really quantify. And they keep being logged after both
test users (one operator, one normal user) have left the session, until I
restart Galène.

Ciao, Michael.


* [Galene]  Re: New Galène protocol
  2021-02-05 10:05                       ` Michael Ströder
@ 2021-02-05 11:03                         ` Juliusz Chroboczek
  2021-02-05 11:22                           ` Juliusz Chroboczek
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-05 11:03 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

>> Sorry, I still get these warnings.

> And they are logged after both test users (one operator, one normal
> user) left the session until I restart Galène.

I'm unable to reproduce the warnings, but I am seeing something wrong with
the system call activity which indicates something is not being closed
correctly.


* [Galene]  Re: New Galène protocol
  2021-02-05 11:03                         ` Juliusz Chroboczek
@ 2021-02-05 11:22                           ` Juliusz Chroboczek
  2021-02-05 14:19                             ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-05 11:22 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

commit c3a19c9128f2922d2285cc840dc9b633e0f625ce (HEAD -> master, origin/master)
Author: Juliusz Chroboczek <jch@irif.fr>
Date:   Fri Feb 5 12:20:33 2021 +0100

    Avoid race between closing connections and terminating client.
    
    We need to terminate all down connections synchronously, otherwise
    we risk leaving open connections lying around.
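As an illustration of what "synchronously" could mean here, the sketch below closes a set of connections and returns only once every one of them is closed. It is not the code from this commit: `downConn`, `closeAll` and `openCount` are hypothetical names, and the use of `sync.WaitGroup` is an assumption about one reasonable way to wait.

```go
package main

import (
	"fmt"
	"sync"
)

// downConn is a hypothetical stand-in for a server-to-client connection.
type downConn struct {
	mu     sync.Mutex
	closed bool
}

// Close marks the connection as closed.
func (c *downConn) Close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closed = true
}

// Closed reports whether Close has been called.
func (c *downConn) Closed() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.closed
}

// closeAll closes all connections concurrently, but does not return
// until every Close has finished, so the caller can safely terminate
// the client afterwards.
func closeAll(conns []*downConn) {
	var wg sync.WaitGroup
	for _, c := range conns {
		wg.Add(1)
		go func(c *downConn) {
			defer wg.Done()
			c.Close()
		}(c)
	}
	wg.Wait()
}

// openCount returns the number of connections that are still open.
func openCount(conns []*downConn) int {
	n := 0
	for _, c := range conns {
		if !c.Closed() {
			n++
		}
	}
	return n
}

func main() {
	conns := make([]*downConn, 8)
	for i := range conns {
		conns[i] = &downConn{}
	}
	closeAll(conns)
	fmt.Println("still open:", openCount(conns))
}
```

Without the final `wg.Wait()`, the caller could proceed while some connections are still open, which is the situation the commit message warns about.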


* [Galene] Re: New Galène protocol
  2021-02-05 11:22                           ` Juliusz Chroboczek
@ 2021-02-05 14:19                             ` Michael Ströder
  2021-02-05 20:40                               ` Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-05 14:19 UTC (permalink / raw)
  To: galene

On 2/5/21 12:22 PM, Juliusz Chroboczek wrote:
> commit c3a19c9128f2922d2285cc840dc9b633e0f625ce (HEAD -> master, origin/master)
> Author: Juliusz Chroboczek <jch@irif.fr>
> Date:   Fri Feb 5 12:20:33 2021 +0100
> 
>     Avoid race between closing connections and terminating client.
>     
>     We need to terminate all down connections synchronously, otherwise
>     we risk leaving open connections lying around.

So far this looks good. The harder test will be this evening with more
users.

Ciao, Michael.


* [Galene] Re: New Galène protocol
  2021-02-05 14:19                             ` Michael Ströder
@ 2021-02-05 20:40                               ` Michael Ströder
  2021-02-10 18:23                                 ` [Galene] use of closed network connection Michael Ströder
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Ströder @ 2021-02-05 20:40 UTC (permalink / raw)
  To: galene

On 2/5/21 3:19 PM, Michael Ströder wrote:
> On 2/5/21 12:22 PM, Juliusz Chroboczek wrote:
>> commit c3a19c9128f2922d2285cc840dc9b633e0f625ce (HEAD -> master, origin/master)
>> Author: Juliusz Chroboczek <jch@irif.fr>
>> Date:   Fri Feb 5 12:20:33 2021 +0100
>>
>>     Avoid race between closing connections and terminating client.
>>     
>>     We need to terminate all down connections synchronously, otherwise
>>     we risk leaving open connections lying around.
> 
> So far this looks good. The harder test will be this evening with more
> users.

Worked just fine with 7 users and no warning messages at all.

Ciao, Michael.


* [Galene] use of closed network connection
  2021-02-05 20:40                               ` Michael Ströder
@ 2021-02-10 18:23                                 ` Michael Ströder
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
  2021-02-11 15:02                                   ` [Galene] Re: use of closed network connection Juliusz Chroboczek
  0 siblings, 2 replies; 28+ messages in thread
From: Michael Ströder @ 2021-02-10 18:23 UTC (permalink / raw)
  To: galene

On 2/5/21 9:40 PM, Michael Ströder wrote:
> On 2/5/21 3:19 PM, Michael Ströder wrote:
>> On 2/5/21 12:22 PM, Juliusz Chroboczek wrote:
>>> commit c3a19c9128f2922d2285cc840dc9b633e0f625ce (HEAD -> master, origin/master)
>>> Author: Juliusz Chroboczek <jch@irif.fr>
>>> Date:   Fri Feb 5 12:20:33 2021 +0100
>>>
>>>     Avoid race between closing connections and terminating client.
>>>     
>>>     We need to terminate all down connections synchronously, otherwise
>>>     we risk leaving open connections lying around.
>>
>> So far this looks good. The harder test will be this evening with more
>> users.
> 
> Worked just fine with 7 users and no warning messages at all.

Unfortunately I get these warning messages again with git revision 8f89ac0.

Prior to that the user wasn't able to use the chat and pion/ice logged
this several times but without further info:

ice WARNING: 2021/02/10 18:59:02 pingAllCandidates called with no
candidate pairs. Connection is not possible yet.

The user left the session and re-entered. This resulted in the user
being listed twice and ~70 messages/sec saying:

failed to send packet: write udp 10.1.1.13:52736->10.1.1.28:49213: use
of closed network connection

Note that this seems to be a connection relayed via TURN server. I can't
tell why relaying was enforced though.

Ciao, Michael.


* [Galene] Re: use of closed network connection
  2021-02-10 18:23                                 ` [Galene] use of closed network connection Michael Ströder
@ 2021-02-10 20:30                                   ` Toke Høiland-Jørgensen
  2021-02-11 14:59                                     ` Juliusz Chroboczek
                                                       ` (3 more replies)
  2021-02-11 15:02                                   ` [Galene] Re: use of closed network connection Juliusz Chroboczek
  1 sibling, 4 replies; 28+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-10 20:30 UTC (permalink / raw)
  To: Michael Ströder, galene

Michael Ströder <michael@stroeder.com> writes:

> On 2/5/21 9:40 PM, Michael Ströder wrote:
>> On 2/5/21 3:19 PM, Michael Ströder wrote:
>>> On 2/5/21 12:22 PM, Juliusz Chroboczek wrote:
>>>> commit c3a19c9128f2922d2285cc840dc9b633e0f625ce (HEAD -> master, origin/master)
>>>> Author: Juliusz Chroboczek <jch@irif.fr>
>>>> Date:   Fri Feb 5 12:20:33 2021 +0100
>>>>
>>>>     Avoid race between closing connections and terminating client.
>>>>     
>>>>     We need to terminate all down connections synchronously, otherwise
>>>>     we risk leaving open connections lying around.
>>>
>>> So far this looks good. The harder test will be this evening with more
>>> users.
>> 
>> Worked just fine with 7 users and no warning messages at all.
>
> Unfortunately I get these warning messages again with git revision
> 8f89ac0.

I'm also seeing warnings on this version:

Feb 10 18:58:37 video galene[1251625]: 2021/02/10 18:58:37 client: client is dead
Feb 10 19:08:16 video galene[1251625]: 2021/02/10 19:08:16 client: websocket: close 1006 (abnormal closure): unexpected EOF
Feb 10 19:15:29 video galene[1251625]: 2021/02/10 19:15:29 Read RTCP: io: read/write on closed pipe
Feb 10 19:15:29 video galene[1251625]: 2021/02/10 19:15:29 Read RTCP: io: read/write on closed pipe
Feb 10 19:15:29 video galene[1251625]: 2021/02/10 19:15:29 Read RTCP: io: read/write on closed pipe
Feb 10 19:15:29 video galene[1251625]: 2021/02/10 19:15:29 client: client is dead
Feb 10 19:24:24 video galene[1251625]: 2021/02/10 19:24:24 Read RTCP: io: read/write on closed pipe
Feb 10 19:24:24 video galene[1251625]: 2021/02/10 19:24:24 Read RTCP: io: read/write on closed pipe
Feb 10 19:24:24 video galene[1251625]: 2021/02/10 19:24:24 client: read tcp 45.145.95.8:443->192.38.142.147:1128: read: connection reset by pe
Feb 10 20:25:23 video galene[1251625]: 2021/02/10 20:25:23 sendPLI: io: read/write on closed pipe
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 sendPLI: io: read/write on closed pipe
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist


I also have a frozen video frame of a user who left, stuck in the group
view now (but he's not in the user list).

-Toke


* [Galene] Re: use of closed network connection
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
@ 2021-02-11 14:59                                     ` Juliusz Chroboczek
  2021-02-11 15:17                                     ` Juliusz Chroboczek
                                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-11 14:59 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Michael Ströder, galene

> Feb 10 19:24:24 video galene[1251625]: 2021/02/10 19:24:24 Read RTCP: io: read/write on closed pipe

This one is harmless.  Still looking at the other ones.

commit a4cd27988f49454b2947d65a7df58143a1c9d9f0 (HEAD -> master, origin/master)
Author: Juliusz Chroboczek <jch@irif.fr>
Date:   Thu Feb 11 15:56:47 2021 +0100

    Don't complain about ErrClosedPipe in RTCP receiver.
    
    This simply indicates that the server closed the connection
    before we received the close indication from the client.


* [Galene] Re: use of closed network connection
  2021-02-10 18:23                                 ` [Galene] use of closed network connection Michael Ströder
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
@ 2021-02-11 15:02                                   ` Juliusz Chroboczek
  1 sibling, 0 replies; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-11 15:02 UTC (permalink / raw)
  To: Michael Ströder; +Cc: galene

> The user left the session and re-entered. This resulted in the user
> being listed twice and ~70 messages/sec saying:

This one is worrying.  It indicates that the per-client automaton is
stuck, which might lead to the whole group getting deadlocked later on.

If you manage to reproduce it, and if nobody important is using the
server, it would be helpful if you could send a SIGABRT to the server
after that happens and send me the backtrace by private mail.

-- Juliusz


* [Galene] Re: use of closed network connection
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
  2021-02-11 14:59                                     ` Juliusz Chroboczek
@ 2021-02-11 15:17                                     ` Juliusz Chroboczek
  2021-02-11 15:56                                       ` Toke Høiland-Jørgensen
  2021-02-11 18:08                                     ` Juliusz Chroboczek
  2021-02-14 16:32                                     ` [Galene] Re: Frozen videos [was: use of closed network connection] Juliusz Chroboczek
  3 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-11 15:17 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Michael Ströder, galene

> Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist

> I also have a frozen video frame of a user who left, stuck in the group
> view now (but he's not in the user list).

Did these two things happen at roughly the same time, or are they
unrelated events?


* [Galene] Re: use of closed network connection
  2021-02-11 15:17                                     ` Juliusz Chroboczek
@ 2021-02-11 15:56                                       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 28+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-11 15:56 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Michael Ströder, galene

Juliusz Chroboczek <jch@irif.fr> writes:

>> Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
>
>> I also have a frozen video frame of a user who left, stuck in the group
>> view now (but he's not in the user list).
>
> Did these two things happen at roughly the same time, or are they
> unrelated events?

Hmm, the log timestamps are UTC, and seeing as I sent that email right
after it happened, I suppose they could have been related? :)

-Toke


* [Galene] Re: use of closed network connection
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
  2021-02-11 14:59                                     ` Juliusz Chroboczek
  2021-02-11 15:17                                     ` Juliusz Chroboczek
@ 2021-02-11 18:08                                     ` Juliusz Chroboczek
  2021-02-11 18:20                                       ` Toke Høiland-Jørgensen
  2021-02-14 16:32                                     ` [Galene] Re: Frozen videos [was: use of closed network connection] Juliusz Chroboczek
  3 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-11 18:08 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: galene

> I also have a frozen video frame of a user who left, stuck in the group
> view now (but he's not in the user list).

Thanks Toke, I've managed to reproduce it.  It happens if the client shuts
down before negotiation is complete.  Probably some subtle ordering issue.

-- Juliusz


* [Galene] Re: use of closed network connection
  2021-02-11 18:08                                     ` Juliusz Chroboczek
@ 2021-02-11 18:20                                       ` Toke Høiland-Jørgensen
  2021-03-02  1:02                                         ` Dave Taht
  0 siblings, 1 reply; 28+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-11 18:20 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: galene

Juliusz Chroboczek <jch@irif.fr> writes:

>> I also have a frozen video frame of a user who left, stuck in the group
>> view now (but he's not in the user list).
>
> Thanks Toke, I've managed to reproduce it.  It happens if the client shuts
> down before negotiation is complete.  Probably some subtle ordering
> issue.

Great! And yeah, the person this happened to did have a bit of a flaky
connection, so I guess it could have been because he got disconnected
while negotiation was still ongoing...

-Toke


* [Galene] Re: Frozen videos [was: use of closed network connection]
  2021-02-10 20:30                                   ` [Galene] " Toke Høiland-Jørgensen
                                                       ` (2 preceding siblings ...)
  2021-02-11 18:08                                     ` Juliusz Chroboczek
@ 2021-02-14 16:32                                     ` Juliusz Chroboczek
  2021-02-14 17:21                                       ` Toke Høiland-Jørgensen
  3 siblings, 1 reply; 28+ messages in thread
From: Juliusz Chroboczek @ 2021-02-14 16:32 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: galene

> Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist

> I also have a frozen video frame of a user who left, stuck in the group
> view now (but he's not in the user list).

This should be fixed now.

We were pushing connections asynchronously, which could cause the push
actions to arrive in the wrong order: if a connection was created and then
deleted, the deletion could arrive before the creation, in which case the
deletion would log a warning (file does not exist) and the creation would
succeed, leaving a frozen video.

Of course, we cannot just replace asynchronous with synchronous, as that
might cause deadlocks.  So I've had to implement an explicit queue of
pending actions, and make communication between threads use unbounded
amounts of buffering.

Personal opinion -- I much prefer the semantics of Erlang's mailboxes
(which implement unbounded buffering) to that of Go's channels (which use
bounded amounts of buffering, and are therefore susceptible to deadlocks).

-- Juliusz


* [Galene] Re: Frozen videos [was: use of closed network connection]
  2021-02-14 16:32                                     ` [Galene] Re: Frozen videos [was: use of closed network connection] Juliusz Chroboczek
@ 2021-02-14 17:21                                       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 28+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-14 17:21 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: galene

Juliusz Chroboczek <jch@irif.fr> writes:

>> Feb 10 20:25:25 video galene[1251625]: 2021/02/10 20:25:25 Replace: file does not exist
>
>> I also have a frozen video frame of a user who left, stuck in the group
>> view now (but he's not in the user list).
>
> This should be fixed now.
>
> We were pushing connections asynchronously, which could cause the push
> actions to arrive in the wrong order: if a connection was created and then
> deleted, the deletion could arrive before the creation, in which case the
> deletion would log a warning (file does not exist) and the creation would
> succeed, leaving a frozen video.
>
> Of course, we cannot just replace asynchronous with synchronous, as that
> might cause deadlocks.  So I've had to implement an explicit queue of
> pending actions, and make communication between threads use unbounded
> amounts of buffering.

Right, deployed the latest revision; will let you know if I experience
any more problems with this :)

-Toke


* [Galene] Re: use of closed network connection
  2021-02-11 18:20                                       ` Toke Høiland-Jørgensen
@ 2021-03-02  1:02                                         ` Dave Taht
  0 siblings, 0 replies; 28+ messages in thread
From: Dave Taht @ 2021-03-02  1:02 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Juliusz Chroboczek, galene

Having WebRTC tests that actually blew up the network (e.g. netem with
20% loss, blackholing ports, mistreating ECN, mistreating DSCP)
would help long term... and not just in Galène's case. Perhaps tests like
this exist somewhere?

