
Ticket #435 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

After missed call, which causes resume, the phone and suspend do not work anymore

Reported by: khiraly
Owned by: mickey
Priority: blocker
Milestone: milestone5.5
Component: framework/ousaged
Version:
Keywords:
Cc: jeremy.mcnaughton@…, reply@…, ayers@…, Chaosspawn23@…, seba.dos1@…

Description

Summary:

After receiving a call or SMS, the phone no longer suspends. It happens rather sporadically, and I have not (yet) found a reliable way to reproduce it.

There are some findings along with log files in the original bugreport.

The original bugreport: https://docs.openmoko.org/trac/ticket/2284#comment:11

Attachments

frameworkd-no-suspend-without-accepting-the-call.log.gz (143.2 KB) - added by khiraly 4 years ago.
No suspend after receiving a call (without accepting it)
paroli-no-suspend-without-accepting-the-call.log (55.3 KB) - added by khiraly 4 years ago.
Paroli.log no suspend after receiving a call (without accepting it)
fso_triggers.py (8.0 KB) - added by khiraly 4 years ago.
/usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/fso_triggers.py
rule.py (4.4 KB) - added by khiraly 4 years ago.
/usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/rule.py
session.conf (2.6 KB) - added by khiraly 4 years ago.
/etc/dbus-1/session.conf after "correctly" installing the r1 .ipk file
system.conf (2.5 KB) - added by khiraly 4 years ago.
/etc/dbus-1/system.conf after "correctly" installing the r1 .ipk file
rules.yaml (4.0 KB) - added by MadHatter 4 years ago.
OM2009T4 rules.yaml

Change History

comment:1 Changed 4 years ago by khiraly

When it does not suspend, nothing happens when I press the power button. The screen does not dim, nothing. It stays in the same state (usually full brightness). What is worse, the screen does switch off after the timeout, so power saving itself is working.

It occurred today without my even accepting the call. All it took was an incoming call (I ended the call at the other end).

Changed 4 years ago by khiraly

No suspend after receiving a call (without accepting it)

Changed 4 years ago by khiraly

Paroli.log no suspend after receiving a call (without accepting it)

comment:2 Changed 4 years ago by khiraly

The above frameworkd.log was created after modifying two files in frameworkd:

    /usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/fso_triggers.py
    /usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/rule.py

Changed 4 years ago by khiraly

/usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/fso_triggers.py

Changed 4 years ago by khiraly

/usr/lib/python2.6/site-packages/framework/subsystems/oeventsd/rule.py

comment:3 Changed 4 years ago by rhk

Confirm this - I lost the log when restarting the phone :(

comment:4 Changed 4 years ago by khiraly

What I would like to know (to advance with debugging) is how the CallListContains class (fso_triggers.py:58) is initialized, how many objects of it exist, what they are supposed to do exactly, and when they are destroyed during the call process.

I read the following rule in the rules.yaml file (/etc/freesmartphone/oevents/rules.yaml:45):

    while: CallListContains("incoming")
    filters: Not(CallListContains("active"))
    actions:
             - RingTone()
             - SetDisplayBrightness("0", 90)

I understand from this rule that CallListContains("active") is the inverse of CallListContains("incoming"): CallListContains("incoming") = not CallListContains("active")

I'm asking because I see this in the log:

    2009.05.13 15:03:55.929 oeventsd.rule WARNING Untrigger for 'CallListContains(active)' called, but not yet triggered. Not untriggering

I tracked this warning down to fso_triggers.py:60:

    def trigger(self, id=None, status=None, properties=None, **kargs):
        logger.debug("Trigger %s", self)
        self.calls[id] = status
        if self.status in self.calls.values():
            super(CallListContains, self).trigger()
        else:
            super(CallListContains, self).untrigger()

When an incoming call arrives, CallListContains.status is "active" and the status parameter is "incoming", so the else branch gets executed (because we put the "incoming" string into self.calls while self.status is still "active").
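To make that behaviour concrete, here is a minimal stand-alone sketch of the logic. It assumes the rule creates two separate CallListContains objects, one built with "incoming" and one with "active", and it replaces the real oeventsd base class with a hypothetical `Trigger` stub that only records state and mimics the warning; the real classes in rule.py and fso_triggers.py may well differ.

```python
class Trigger(object):
    """Hypothetical stand-in for the oeventsd trigger base class."""

    def __init__(self):
        self.triggered = False
        self.log = []

    def trigger(self):
        self.triggered = True
        self.log.append("trigger")

    def untrigger(self):
        if not self.triggered:
            # Mimics the warning seen in frameworkd.log
            self.log.append("untrigger called, but not yet triggered")
            return
        self.triggered = False
        self.log.append("untrigger")


class CallListContains(Trigger):
    def __init__(self, status):
        super(CallListContains, self).__init__()
        self.status = status   # the status this instance watches for
        self.calls = {}        # call id -> last reported status

    def trigger(self, id=None, status=None, properties=None, **kargs):
        self.calls[id] = status
        if self.status in self.calls.values():
            super(CallListContains, self).trigger()
        else:
            super(CallListContains, self).untrigger()


incoming = CallListContains("incoming")
active = CallListContains("active")

# A call comes in and the caller hangs up before it is answered.
for call_status in ("incoming", "release"):
    incoming.trigger(id=1, status=call_status)
    active.trigger(id=1, status=call_status)

print(incoming.log)
print(active.log)
```

In this sketch only the "active" instance produces the warning, because it never sees a matching status. Under these assumptions the message alone would be expected for any unanswered call and would not by itself point to the failure.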

So I'm lacking an overview of the overall process, and can't really identify what is going wrong in the call handling.

Could somebody (mickey?) help me a bit, point me to docs, or explain the process?

comment:5 Changed 4 years ago by jmcnaught

  • Cc jeremy.mcnaughton@… added

This is also happening for me with SHR-unstable downloaded May 23/09.

comment:6 Changed 4 years ago by MadHatter

  • Cc reply@… added

in case it helps shed any light, i got my phone into this state (easy: receive an incoming call, and don't answer it; for me, this is close-to 100% reliable) and tried mickeyterm while ssh'ed in:

    root@tom-gta02:~# mickeyterm
    Traceback (most recent call last):
      File "/usr/bin/mickeyterm", line 520, in <module>
        port = iMuxer.AllocChannel( "mickeyterm.%d" % os.getpid() )
      File "/usr/lib/python2.6/site-packages/dbus/proxies.py", line 68, in __call__
        return self._proxy_method(*args, **keywords)
      File "/usr/lib/python2.6/site-packages/dbus/proxies.py", line 140, in __call__
        **keywords)
      File "/usr/lib/python2.6/site-packages/dbus/connection.py", line 622, in call_blocking
        message, timeout)
    dbus.exceptions.DBusException: org.freesmartphone.GSM.MUX.NoChannel: All channels are used

when i ring the phone, paroli doesn't show an incoming call, although local RFI makes it clear that the modem's talking to the network.

if anyone can think of any more testing i can do, i'm probably a good candidate as i can reproduce this very, very easily.

comment:7 Changed 4 years ago by ayers

  • Cc ayers@… added

comment:8 Changed 4 years ago by jluebbe

  • Milestone set to milestone5.5

comment:9 Changed 4 years ago by Chaosspawn23

  • Cc Chaosspawn23@… added

comment:10 Changed 4 years ago by khiraly

I don't know if it was obvious or not, but the thing is:

Once FSO no longer suspends, it is NOT ABLE TO RECEIVE or INITIATE a call.

On the callee part there is ringing, and there is RF going in and out (I can hear it if I put my Freerunner near a radio). If I restart frameworkd, the phone can be called again. But I'm not always able to restart frameworkd, because it simply does not start (no such process).

In summary: when I make or receive a call, there is a 50% chance that FSO crashes. If it crashes, there is a 50% chance that I can relaunch FSO; otherwise I need to reboot.

So the chance of having 10 successful calls in a row is only about 0.1%. It's quite sad.
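As a quick check of that estimate: assuming each call fails independently with the 50% probability quoted above (an assumption, the real failures may well be correlated), the chance of 10 clean calls in a row is 0.5^10, which works out to roughly 0.1%.

```python
def chance_of_clean_streak(per_call_failure, calls):
    """Probability that `calls` consecutive calls all succeed,
    assuming each call fails independently with the same probability."""
    return (1.0 - per_call_failure) ** calls


streak = chance_of_clean_streak(0.5, 10)
print("10 good calls in a row: %.2f%%" % (streak * 100))
```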

Any comment? Or nobody is interested in this bug?

comment:11 follow-up: ↓ 12 Changed 4 years ago by MadHatter

that's not always true for me. about 2/3 of the time it gets into "won't suspend" state, the phone still works and i can still send SMSs; the remaining 1/3 it's as khiraly says.

but after an incoming call the chance that it gets into a "won't suspend" state is more than 9/10.

comment:12 in reply to: ↑ 11 Changed 4 years ago by khiraly

Replying to MadHatter:

that's not always true for me. about 2/3 of the time it gets into "won't suspend" state, the phone still works and i can still send SMSs;

I don't know about SMS; I was only talking about receiving or initiating calls. (I have only sent an SMS once, to try out paroli some weeks ago.)

So I can't add any specifics on the SMS part.

But calling (receiving or initiating) is impossible when FSO can't suspend. And what is worse, on the other phone everything looks normal. I can hear ringing (on the line), so it simply looks as if I don't want to pick up the phone.

comment:13 follow-up: ↓ 14 Changed 4 years ago by khiraly

I was told on IRC that this bug is _maybe_ caused by an upstream dbus bug.

There is a hunch on IRC that the bug is this: https://bugs.freedesktop.org/show_bug.cgi?id=19796

Can anyone confirm it? (Can it be caused by dbus, and if so, is this the upstream bug ID?)

comment:14 in reply to: ↑ 13 Changed 4 years ago by khiraly

Replying to khiraly:

I was told on IRC that this bug is _maybe_ caused by an upstream dbus bug.

Ok, I got more info: this bug may be the same as, or depend on, bug #416: http://trac.freesmartphone.org/ticket/416

Which depends on upstream bug #19796: https://bugs.freedesktop.org/show_bug.cgi?id=19796

where a patch is already available: dbus-connection-bug-19796-v2.patch

Mrmoku built new dbus bindings and put the .ipk files online:

    http://build.shr-project.org/tests/mrmoku/dbus_1.2.14-r0_armv4t.ipk
    http://build.shr-project.org/tests/mrmoku/libdbus-1-3_1.2.14-r0_armv4t.ipk

Dos1 has already installed them on his SHR unstable. I have installed them on a two-day-old om2009 unstable.

Here is a short tutorial on how to install them:

  1. opkg install libdbus-1-3_1.2.14-r0_armv4t.ipk
  2. opkg install dbus_1.2.14-r0_armv4t.ipk

    * system.conf (Y/I/N/O/D) [default=N] ?N
    * session.conf (Y/I/N/O/D) [default=N] ?Y

So answer N to the first question and Y to the second, just like above.

I will report whether this fixes our problem. Any other takers?

comment:15 Changed 4 years ago by MadHatter

thanks, khiraly. despite your excellent writeup i stupidly overwrote my system.conf, and this has caused GSM and all other startup to fail. could you email me directly (madhatter -at- teaparty.net) your system.conf so i can restore the default one and test this patch?

THANK YOU for the research!

comment:16 follow-ups: ↓ 17 ↓ 20 Changed 4 years ago by MadHatter

i got my old system.conf back by uninstalling dbus, then installing the ipkg from the testing distro, then installing the new one above on top of it.

results: no problems suspending after my first inbound call, which i did not pick up. problems after my second inbound test call, which i did pick up; no outbound phone or SMS functionality after that.

more reports later.

comment:17 in reply to: ↑ 16 Changed 4 years ago by khiraly

Replying to MadHatter:

problems after my second inbound test call

Looks like the new dbus does not fix this bug ;-(

I (probably) received a call, because my phone does not suspend anymore. Paroli can't start (GSM service not available, and none of the other services either).

I saved my logs (paroli, frameworkd); I can clean them up (remove all my phone numbers and personal info) and attach them to this bug report. But is anyone interested in them?

Khiraly

comment:19 Changed 4 years ago by dos

  • Status changed from new to in_testing

Changed 4 years ago by khiraly

/etc/dbus-1/session.conf after "correctly" installing the r1 .ipk file

Changed 4 years ago by khiraly

/etc/dbus-1/system.conf after "correctly" installing the r1 .ipk file

comment:20 in reply to: ↑ 16 ; follow-up: ↓ 21 Changed 4 years ago by khiraly

Replying to MadHatter:

i got my old system.conf back by uninstalling dbus, then installing the ipkg from the testing distro, then installing the new one above on top of it.

As dos1 said, we were testing the wrong .ipk files. Please retest it with the new ones (r1).

For convenience I have attached my system.conf and session.conf files as they are after installing the .ipk files with the method I posted (system.conf: no; session.conf: yes). Please check your system against these files too if you have any (startup) problems.

I hope it will fix our problem this time. I will report back in the next few days on how it worked. I hope you will try it out too.

Khiraly

comment:21 in reply to: ↑ 20 ; follow-up: ↓ 24 Changed 4 years ago by MadHatter

new ipks installed and in testing (no startup problems any more, but THANK YOU for posting the two .conf files)...

after first inbound test call (not answered), i can't suspend and can't make any more calls. so i'm not sure the problem is completely fixed.

dos, can i produce any logs to help with this?

comment:22 Changed 4 years ago by khiraly

I just talked on my phone (the new r1 .ipk files are installed). Now the phone does not suspend anymore (and can't initiate or receive any call). I saved the logs (paroli.log and frameworkd.log), and restarted frameworkd.

After I restarted frameworkd suspend works again, and I can call and receive calls too.

Summary:

  • This new r1 dbus does not fix this bug report.
  • This is clearly a software bug (not a Calypso bug), because restarting frameworkd often solves the problem. (But restarting frameworkd is not always reliable. I don't know how this r1 dbus affects the reliability of a frameworkd restart, but I suspect it's the same.)

So is the frameworkd.log of interest?

Khiraly

comment:23 Changed 4 years ago by khiraly

With the new dbus bindings frameworkd is even more likely to fail.

It has failed on every incoming call so far (6 out of 6).

Khiraly

comment:24 in reply to: ↑ 21 Changed 4 years ago by khiraly

Replying to MadHatter:

new ipks installed and in testing

Seems like you are using om2009 just like me. SHR's frameworkd does not suffer from this problem (or only more rarely).

So I suggest replacing om2009's fso with SHR's. I did so, and so far my impression is positive.

But installing SHR's frameworkd has some caveats, so here is a short tutorial:

  1. Remove frameworkd and frameworkd-config from your phone:

    opkg remove -force-depends frameworkd
    opkg remove -force-depends frameworkd-config

  2. Download SHR's frameworkd (note: -config comes from the om-gta02 dir, frameworkd from the armv4t dir):

    http://build.shr-project.org/shr-unstable/ipk/armv4t/frameworkd_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-r4_armv4t.ipk
    http://build.shr-project.org/shr-unstable/ipk/om-gta02/shr_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-50+60bf60a9c8ca9b13eaa2f80084ce06fbdf1558a7-r6_om-gta02.ipk

  3. Copy them to the phone:

    scp frameworkd-config-shr_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-50+60bf60a9c8ca9b13eaa2f80084ce06fbdf1558a7-r6_om-gta02.ipk root@192.168.0.202:/home/root
    scp frameworkd_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-r4_armv4t.ipk root@192.168.0.202:/home/root

  4. Install them.
  5. Make a symlink to ALSA's scenarios dir, because SHR messes with the ALSA files (placing them in a random dir without fixing frameworkd's config files):

    ln -s /usr/share/shr/scenarii/ /usr/share/openmoko/scenarios

If you don't do this, you will have a ringtone but no other sound (so neither the caller nor the callee hears anything).

  6. Overwrite the oeventsd config file; if you don't, you will have two simultaneous ringtones (one from paroli, one from oeventsd).

Paroli's method is more advanced (less response time).

    cd /etc/freesmartphone/oeventsd/
    mv rules.yaml rules-backshr.yaml
    cp paroli_rules.yaml rules.yaml

Restart the phone (restarting frameworkd caused me some trouble, such as callers to the Freerunner always getting a busy line).

Hope it helps you too. Please report back if you try it out.

(For reference, this is the current version of om2009's fso:

    frameworkd - 0.8.5.1+gitr1+d939c48142d9db9c2ff4da0489b14b0e7387e037-r5

And this is what I installed from SHR:

    frameworkd - 0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-r4
    frameworkd-config-shr - 0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-50+60bf60a9c8ca9b13eaa2f80084ce06fbdf1558a7-r6

)

Best regards,

Khiraly

comment:25 Changed 4 years ago by rhk

Thanks Khiraly for your work!

Some errors in your HOWTO:

  • 2nd downloadable package name is wrong (shr instead of frameworkd-config-shr)
  • no need to scp, you can wget straight to freerunner
  • it's /etc/freesmartphone/oevents, not oeventsd

So here it is again:

1) Remove old packages

opkg remove -force-depends frameworkd;opkg remove -force-depends frameworkd-config

2) Download new packages (I guess we could combine 2) and 3) with opkg install http://...)

wget http://build.shr-project.org/shr-unstable/ipk/armv4t/frameworkd_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-r4_armv4t.ipk

wget http://build.shr-project.org/shr-unstable/ipk/om-gta02/frameworkd-config-shr_0.8.5.1+gitr1408+2d056edd375523947f99bce0bb3d21fe354ca2e0-50+60bf60a9c8ca9b13eaa2f80084ce06fbdf1558a7-r6_om-gta02.ipk

3) Install the new packages

opkg install *.ipk

4) symlink alsastate files

ln -s /usr/share/shr/scenarii/ /usr/share/openmoko/scenarios

5) Overwrite oeventsd config file

cd /etc/freesmartphone/oevents/

mv rules.yaml rules-backshr.yaml

cp paroli_rules.yaml rules.yaml

6) restart the phone

I'll let you know if it Works For Me (TM)

comment:26 Changed 4 years ago by khiraly

Thanks rhkfin for trying it out!

Ok. The bug just happens more rarely. I was able to make about 10 successful calls before frameworkd failed silently (no calls, no suspend).

comment:27 follow-up: ↓ 28 Changed 4 years ago by khiraly

Ok the bug is reproducible:

  1. Suspend the phone manually (using the power button)
  2. Call the freerunner
  3. Don't pick up the call (on the Freerunner)
  4. Hang up on the other end
  5. Try to suspend (it fails)

A failing suspend also means being unable to initiate or receive calls.

It looks similar to #416

Khiraly

comment:28 in reply to: ↑ 27 Changed 4 years ago by ykstortnilats

I can also confirm this bug. I'm using SHR-unstable 06/06/2009. After upgrading the dbus packages mentioned by dos, I can still reproduce the bug by the procedure above, so it seems that upgrading the dbus package does not solve the bug.

comment:29 Changed 4 years ago by dos

The new dbus fixed that issue. You're just hitting another one: with the old dbus there were two different issues which blocked phone functionality. With the new dbus one still remains, and that is the one you're hitting.

comment:30 Changed 4 years ago by dos

Looks like this bug can be closed (as no one has reported THIS bug with the new dbus and patch), and you should test the new frameworkd to check whether the fix for #416 really fixes that issue (I will do it soon).

comment:31 Changed 4 years ago by dos

  • Priority changed from major to blocker
  • Component changed from framework/general to framework/oeventsd
  • Summary changed from After some received call or sms, the phone does not suspend anymore to After some received call or sms, the phone and suspend do not work anymore

Well, for me it's still not fixed, so it looks like it has nothing to do with #416.

And sorry, I was wrong - that new dbus fixes ANOTHER issue, this issue isn't fixed :)

From the logs it looks like the oeventsd call (RequestResource) fails with a timeout, so I'm setting the component to framework/oeventsd. I'm also bumping the priority to blocker, as ms5.5 CAN'T ship with this issue: when it happens, the user doesn't know that they will miss calls and messages, and the battery will discharge rapidly. Feel free to revert the properties of this ticket if you disagree.

BTW. Could that in_testing be removed?

comment:32 Changed 4 years ago by dos

  • Cc seba.dos1@… added
  • Summary changed from After some received call or sms, the phone and suspend do not work anymore to After missed call, which causes resume, the phone and suspend do not work anymore

comment:33 Changed 4 years ago by dos

After some debugging I finally know what causes this problem: it's OccupyResource in rules.yaml, which is called too early (when ousaged hasn't finished resuming all resources). Looks like fixing #381 should fix this ticket too.

As a workaround for users I propose removing the OccupyResource('CPU') line from the call-related rules in rules.yaml.
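For anyone who would rather script the workaround than edit by hand, here is a rough sketch in Python 3. It only assumes that the action appears on its own line as a YAML list item starting with `- OccupyResource(` and mentioning CPU; the function name `drop_occupy_cpu` and the sample rule are made up for illustration, and the rules.yaml on a given image will look different.

```python
import difflib


def drop_occupy_cpu(rules_text):
    """Return rules_text without action lines that call OccupyResource on CPU."""
    kept = []
    for line in rules_text.splitlines(True):
        stripped = line.strip()
        is_cpu_action = (stripped.startswith("- OccupyResource(")
                         and "CPU" in stripped)
        if not is_cpu_action:
            kept.append(line)
    return "".join(kept)


# Hypothetical excerpt, not copied from any real image.
SAMPLE = """-
    while: CallListContains("incoming")
    filters: Not(CallListContains("active"))
    actions:
             - RingTone()
             - OccupyResource('CPU')
"""

patched = drop_occupy_cpu(SAMPLE)

# Show the change as a unified diff for review before touching the phone.
diff = "".join(difflib.unified_diff(
    SAMPLE.splitlines(True), patched.splitlines(True),
    "rules.yaml.orig", "rules.yaml"))
print(diff)
```

The sketch does not handle a rule whose only action is OccupyResource; in that case the whole rule would have to be removed by hand, otherwise an empty actions list is left behind.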

comment:34 Changed 4 years ago by MadHatter

dos, that's brilliant. i'm happy to test; could you post some diffs against the existing rules.yaml file so i can remove only the right lines in the right way?

comment:35 Changed 4 years ago by dos

On SHR you'll have:

    while: CallListContains("incoming")
    filters: Not(CallListContains("active"))
    actions:

    -

    while: CallStatus()
    filters: Or(HasAttr(status, "outgoing"), HasAttr(status, "active"))
    actions:

Edit it to just:

    while: CallListContains("incoming")
    filters: Not(CallListContains("active"))
    actions:
             - RingTone()
             - Command('xset -display localhost:0 s reset')

On plain FSO just remove that:

    while: CallStatus()
    filters: Not(HasAttr(status, "release"))
    actions:

Dunno how it's handled in Om2009.

comment:36 Changed 4 years ago by MadHatter

sadly, i'm 2009. would it help if i posted my rules.yaml?

also, i have two such files: /etc/freesmartphone/opreferences/schema/rules.yaml and /etc/freesmartphone/oevents/rules.yaml .

i presume it's the latter?

comment:37 Changed 4 years ago by dos

Yes, it's /etc/freesmartphone/oevents/rules.yaml

Changed 4 years ago by MadHatter

OM2009T4 rules.yaml

comment:38 Changed 4 years ago by khiraly

After some debugging I finally know, what causes this problem: it's OccupyResource in

I'm really sorry for not updating this bug report, but nytowl had already found that bug. There was a rules.yaml update in paroli git too: http://git.paroli-project.org/?p=paroli.git;a=commit;h=f9ead522912d25f3bd7efdd7820f92208b631a69

dos1: I told you on IRC, maybe you missed it in the logs. :-( Sorry you spent time on it and duplicated effort. ;-\

Btw, I now have another frameworkd failure, but I will open a new bug report and attach a patch. (Everything looks normal and it can be suspended, there are just no incoming/outgoing calls.)

comment:39 Changed 4 years ago by dos

  • Component changed from framework/oeventsd to framework/ousaged

Well, it's only a workaround, the issue is still there: if something calls RequestResource while resources are resuming from suspend, frameworkd ends up in an unusable state. So it has to be fixed in framework/ousaged, not in rules.yaml.
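One way to picture the kind of guard this would need is sketched below. It is purely illustrative: `UsageController`, its methods and the queueing strategy are hypothetical and not taken from frameworkd. It only shows a request that arrives in the middle of a resume being deferred until every resource has been resumed.

```python
from collections import deque


class UsageController(object):
    """Hypothetical model of a usage daemon; not frameworkd's code."""

    def __init__(self, resources):
        self.resources = list(resources)
        self.users = dict((name, set()) for name in self.resources)
        self.resuming = False
        self.pending = deque()   # requests that arrived during resume

    def request_resource(self, user, name):
        if self.resuming:
            # Too early: remember the request instead of touching
            # a resource that may still be resuming.
            self.pending.append((user, name))
            return "deferred"
        self.users[name].add(user)
        return "granted"

    def resume(self, on_resource_resumed=None):
        self.resuming = True
        for name in self.resources:
            # The callback stands in for a rule firing mid-resume.
            if on_resource_resumed is not None:
                on_resource_resumed(name)
        self.resuming = False
        # Only now replay what was asked for during resume.
        while self.pending:
            user, name = self.pending.popleft()
            self.users[name].add(user)


controller = UsageController(["GSM", "CPU", "Display"])
answers = []


def rule_fires(resumed_name):
    # Simulates a call rule asking for the CPU as soon as GSM is back.
    if resumed_name == "GSM":
        answers.append(controller.request_resource("oeventsd", "CPU"))


controller.resume(on_resource_resumed=rule_fires)
print(answers)
print(sorted(controller.users["CPU"]))
```

In the sketch the early request is answered with "deferred" and only honoured once the resume loop has finished, which is the ordering the workaround in rules.yaml merely sidesteps.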

comment:40 Changed 4 years ago by bam

  • Cc mybigspam@… added

comment:41 follow-up: ↓ 42 Changed 4 years ago by mickey

Dos' analysis seems to be correct. I have now used the current version of the FSO-ms5.5 with Zhone for 20 hours without any problems. What I did is

  • disable ousaged in frameworkd.conf,
  • add the entries for fsousage and fsousage.controller in frameworkd.conf,
  • add starting fsousaged (from cornucopia) right before the framework in /etc/init.d/frameworkd.

Because fsousaged implements the usage API in a separate process, and does so synchronously (since Vala has no async server support atm.), it is not vulnerable to the reentrancy problems in frameworkd's implementation of the usage API.

Please reproduce my changes and do some tests.

comment:42 in reply to: ↑ 41 Changed 4 years ago by bam

Replying to mickey:

Dos' analysis seems to be correct. I have now used the current version of the FSO-ms5.5 with Zhone for 20 hours without any problems. What I did is

  • disable ousaged in frameworkd.conf,
  • add the entries for fsousage and fsousage.controller in frameworkd.conf,
  • add starting fsousaged (from cornucopia) right before the framework in /etc/init.d/frameworkd.

... Please reproduce my changes and do some tests.

Could you show your frameworkd.conf and /etc/init.d/frameworkd please?

comment:43 Changed 4 years ago by mickey

I don't have them offhand. Here's how to recreate them:

  • Use a stock frameworkd.conf as base. Add disable = 1 in [ousaged]. Create two new (empty) sections: [fsousage] and [fsousage.controller].
  • In /etc/init.d/frameworkd, there's the line to start 'start-stop-daemon --start --pidfile /var/run/${NAME}.pid --make-pidfile --background -x /usr/bin/frameworkd'. Before this line, add 'start-stop-daemon --start --pidfile /var/run/fsousaged.pid --make-pidfile --background -x /usr/bin/fsousaged'. Do a corresponding change in the stop section.
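The frameworkd.conf part of this can also be scripted. The following is a minimal Python 3 sketch using the standard configparser module; the helper name `patch_frameworkd_conf` and the stock excerpt are assumptions for illustration, and a real frameworkd.conf has more sections. Note that configparser drops comments when it rewrites a file, so keep a backup of the original.

```python
import configparser
import io


def patch_frameworkd_conf(text):
    """Disable ousaged and add empty fsousage sections to frameworkd.conf text."""
    conf = configparser.ConfigParser(interpolation=None)
    conf.optionxform = str          # keep option names exactly as written
    conf.read_string(text)
    if not conf.has_section("ousaged"):
        conf.add_section("ousaged")
    conf.set("ousaged", "disable", "1")
    for section in ("fsousage", "fsousage.controller"):
        if not conf.has_section(section):
            conf.add_section(section)
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


# Hypothetical stock excerpt, only for demonstration.
STOCK = """[frameworkd]
log_level = INFO

[ousaged]
debug = 1
"""

patched_conf = patch_frameworkd_conf(STOCK)
print(patched_conf)
```

The init script change is left out here on purpose, since it depends on the exact layout of /etc/init.d/frameworkd on the image in use.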

comment:44 Changed 4 years ago by bam

  • Cc mybigspam@… removed

comment:45 Changed 4 years ago by janvlug

comment:46 Changed 4 years ago by stefan

  • Status changed from in_testing to closed
  • Resolution set to fixed

We can no longer reproduce this. Please reopen if you can reproduce this with a ms5.5 based system.

comment:47 Changed 3 years ago by bascorp
