Opened 5 years ago

Closed 4 years ago

#474 closed defect (fixed)

When stopping the GPRS connection, frameworkd stops responding to most queries and blocks

Reported by: zoff99
Owned by: charlie
Priority: major
Milestone:
Component: framework/ocontextd
Version: milestone5.5
Keywords:
Cc: mail@…

Description

When stopping the GPRS connection, frameworkd stops responding to most queries and blocks.

This is what I do:

root@om-gta02:~# mdbus -s org.freesmartphone.ogsmd /org/freesmartphone/GSM/Device org.freesmartphone.GSM.PDP.DeactivateContext 2> /dev/null

root@om-gta02:~# mdbus -s org.freesmartphone.ogsmd  2> /dev/null
/
    



In another shell I look at both processes with strace:

root@om-gta02:~# ps aux|grep -e mdbus -e frame
root      1631  6.3 16.2  60516 19652 ?        Ssl  17:16   8:26 python /usr/bin/frameworkd
root      2774 11.9  4.5   9340  5452 pts/4    S+   19:28   0:02 python /usr/bin/mdbus -s org.freesmartphone.ogsmd

root@om-gta02:~# strace -p 2774
Process 2774 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>

root@om-gta02:~# strace -p 1631
Process 1631 attached - interrupt to quit
futex(0x1bca90, FUTEX_WAIT, 0, NULL^C <unfinished ...>

Everything just blocks and freezes up.

The only fix is to restart mdbus and frameworkd.
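While debugging a hang like this, it can help to wrap the query in a timeout so the shell is not blocked as well. The following is only a sketch: `query_with_timeout` is a hypothetical helper name, and the command you pass in (for example the mdbus call above) is up to you.

```python
import subprocess
import sys


def query_with_timeout(cmd, seconds=10.0):
    """Run cmd and return its stdout, or None if it did not finish in time.

    Hypothetical helper for illustration. subprocess.run() kills the child
    and collects its exit status when the timeout expires.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout.decode(errors="replace")


if __name__ == "__main__":
    # Harmless stand-in command; replace it with the real query.
    print(query_with_timeout([sys.executable, "-c", "print('pong')"]))
```

A return value of `None` then means the query itself hung, which is easier to script around than a blocked terminal.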

Attachments (7)

hang1.txt (2.3 KB) - added by tdilo 5 years ago.
frameworkd.log excerpt up to the point where it freezes
hang2.txt (3.2 KB) - added by tdilo 5 years ago.
another hang
hang3.txt (3.1 KB) - added by tdilo 5 years ago.
normal1.txt (2.7 KB) - added by tdilo 5 years ago.
frameworkd.log where it continues normally
normal2.txt (2.9 KB) - added by tdilo 5 years ago.
normal3.txt (1.9 KB) - added by tdilo 5 years ago.
normal4.txt (3.6 KB) - added by tdilo 5 years ago.

Change History (11)

comment:1 Changed 5 years ago by avanc

  • Cc mail@… added

comment:2 Changed 5 years ago by tdilo

Since updating two Debian installations (on two machines) to the latest (packaged) fsousaged, fso-abyss and openmoko-panel-plugin, I am also experiencing this bug. It does not happen every time the GPRS connection is disabled, but it is very rare that I need more than three tries to get frameworkd and dbus (I think) to freeze up, either by using mdbus or o-p-p to control the GPRS connection.

I had been using the latest packaged frameworkd with o-p-p 0.9, the old muxer and ousaged for quite some time, and the freeze did not occur in that scenario. I did some quick tests while on the road, and the hang still happens when using the old muxer, ousaged and no panel plugin. It seems that none of the new components alone is causing the bug directly.

As my knowledge of the inner workings of the whole system (and my spare time) is rather limited, I attached some log excerpts of both successful and unsuccessful GPRS connection shutdowns. In one case, a normal GPRS disconnect (normal4.txt) is followed directly by a hang on the next connect-disconnect cycle (hang3.txt). The other logs are in no particular order.

The hang only seems to occur when calypso answers with both OK and NO CARRIER to whatever is sent to it before - the normal connection shutdowns don't contain NO CARRIER in the response:

<MiscChannel via /dev/pts/1>: sending 21 bytes: 'AT+CGACT=0;+CGATT=0\r\n'

"bad" calypso response:
<MiscChannel via /dev/pts/1>: got 20 bytes: '\r\nOK\r\n\r\nNO CARRIER\r\n'
"good" calypso response:
<MiscChannel via /dev/pts/1>: got 6 bytes: '\r\nOK\r\n'
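To make the difference between the two responses easier to see, here is a small sketch that splits a raw reply into the lines before the final result, the final result itself, and anything that arrives after it. This is not ogsmd's actual parser; `split_at_response` and `FINAL_RESULTS` are hypothetical names, and treating only OK and ERROR as final results is an assumption made for this example.

```python
# Assumption for this sketch: only these two lines terminate a command.
FINAL_RESULTS = ("OK", "ERROR")


def split_at_response(raw):
    """Split a raw modem reply into (body, final, trailing).

    body     -- lines seen before the final result
    final    -- the first line found in FINAL_RESULTS, or None
    trailing -- lines that arrived after the final result
    """
    lines = [ln for ln in raw.decode("ascii", "replace").split("\r\n") if ln]
    body, final, trailing = [], None, []
    for ln in lines:
        if final is None:
            if ln in FINAL_RESULTS:
                final = ln
            else:
                body.append(ln)
        else:
            trailing.append(ln)
    return body, final, trailing


if __name__ == "__main__":
    print(split_at_response(b"\r\nOK\r\n\r\nNO CARRIER\r\n"))
    print(split_at_response(b"\r\nOK\r\n"))
```

With the two replies from the logs, the "bad" one leaves NO CARRIER in the trailing list, while the "good" one leaves it empty.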

I couldn't reproduce the bug with both frameworkd and dbus running through strace. Maybe just bad luck.

comment:4 Changed 4 years ago by mickey

  • Resolution set to fixed
  • Status changed from new to closed

Unfortunately we did not call waitpid(), hence there was a certain likelihood of ppp being left as a zombie.

commit 93673aa09cafc8fb5cfc3cb4055a73e25e595b70
Author: Michael 'Mickey' Lauer <mickey@…>
Date: Tue May 11 00:40:37 2010 +0200

processguard: attempt to fix zombie at process shutdown
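For readers unfamiliar with the problem: a child process that has exited stays in the process table as a zombie until its parent collects the exit status. The sketch below shows the general idea in Python and is not the code from the commit above; `stop_and_reap` is a hypothetical name. On POSIX, `Popen.wait()` is what ends up calling waitpid().

```python
import signal
import subprocess
import sys


def stop_and_reap(proc, grace=5.0):
    """Ask a child to exit and always collect its exit status.

    Hypothetical helper for illustration. Sends SIGTERM, waits up to
    `grace` seconds, then falls back to SIGKILL. The wait() calls are
    what reap the child so it does not linger as a zombie.
    """
    if proc.poll() is None:
        proc.terminate()
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a long-running child such as ppp.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit status:", stop_and_reap(child))
```

A negative return value means the child was terminated by that signal number.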