Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pause/resume HDMI playback will cause IPC error and kernel call trace #238

Closed
keqiaozhang opened this issue Nov 1, 2018 · 18 comments
Closed
Assignees
Labels
APL Applies to ApolloLake platform bug Something isn't working GLK Applies to Gemini Lake P1 Blocker bugs or important features

Comments

@keqiaozhang
Copy link
Collaborator

keqiaozhang commented Nov 1, 2018

This issue happens when keep doing pause/resume HDMI playback.
How to reproduce:

  1. aplay -Dhw:0,5 -f dat -c 2 audio.wav -i
  2. press the space key to do pause/resume(need time)

error message:

[  102.381952] Hardware name: Google Yorp/Yorp, BIOS Google_Yorp.10641.0.2018_05_03_1421 05/03/2018
[  102.381958] Workqueue: pm pm_runtime_work
[  102.381961] task: ffffa1cfac108e00 task.stack: ffffb46603878000
[  102.381966] RIP: 0010:snd_hdac_display_power+0x9c/0xa7 [snd_hda_core]
[  102.381968] RSP: 0018:ffffb4660387bcf0 EFLAGS: 00010246
[  102.381970] RAX: ffffa1cfdcafd6e8 RBX: ffffffffffffffed RCX: 0000000000000000
[  102.381972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa1cfcf48d028
[  102.381974] RBP: ffffb4660387bd00 R08: 0000000000000000 R09: 0000001d58aff612
[  102.381976] R10: 071c71c71c71c71c R11: ffffffff98465a4a R12: ffffffffc040cf45
[  102.381978] R13: 0000000000000000 R14: ffffa1cfcf48d028 R15: ffffa1cfcd975028
[  102.381980] FS:  0000000000000000(0000) GS:ffffa1cfffc00000(0000) knlGS:0000000000000000
[  102.381983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  102.381985] CR2: 0000386ee96fc000 CR3: 000000010a412000 CR4: 00000000003406f0
[  102.381986] Call Trace:
[  102.381997]  hdac_hdmi_runtime_suspend+0x41/0xc3 [snd_soc_hdac_hdmi]
[  102.382000]  __rpm_callback+0xcd/0x14d
[  102.382005]  ? hdmi_codec_complete+0x58/0x58 [snd_soc_hdac_hdmi]
[  102.382008]  rpm_callback+0x27/0x7e
[  102.382011]  ? hdmi_codec_complete+0x58/0x58 [snd_soc_hdac_hdmi]
[  102.382015]  rpm_suspend+0x206/0x403
[  102.382019]  rpm_idle+0x1a4/0x23f
[  102.382022]  pm_runtime_work+0x67/0x92
[  102.382027]  process_one_work+0x18d/0x2fc
[  102.382031]  worker_thread+0x1a2/0x269
[  102.382033]  ? pr_cont_work+0x58/0x58
[  102.382036]  kthread+0xfa/0x10a
[  102.382039]  ? pr_cont_work+0x58/0x58
[  102.382042]  ? rcu_read_unlock_sched_notrace+0x48/0x48
[  102.382045]  ret_from_fork+0x35/0x40
[  102.382049] Code: 74 29 31 db ff c9 41 89 8e d0 04 00 00 75 15 48 8b 48 08 4c 8b 59 10 4d 85 db 74 08 48 8b 38 e8 75 c3 48 d8 89 d8 5b 41 5e 5d c3 <0f> 0b 41 8b 8e d0 04 00 00 eb cc 0f 1f 44 00 00 55 48 89 e5 41
[  102.382085] ---[ end trace ae986683af1866ab ]---
[  102.382221] sof-audio sof-audio: tag 1 not opened!

dmesg.log
logger-t.log

Tested with:
soft-master:4ac9248f
sof-master:43da16
kernel-debug/cros-v4.14-rebase: 90eee4

@keqiaozhang keqiaozhang added bug Something isn't working P1 Blocker bugs or important features APL Applies to ApolloLake platform br-cros GLK Applies to Gemini Lake labels Nov 1, 2018
@mengdonglin
Copy link
Collaborator

@keqiaozhang What's the possibility or any BKM to reproduce this issue?

@keqiaozhang
Copy link
Collaborator Author

@mengdonglin
To reproduce this issue, we need to frequent pause/resume the HDMI playback. I press the space key at a high frequency. The possibility of this issue is 100%.

@plbossart
Copy link
Member

do you see this with topic/sof-dev on an Up2 or is this GLK-specific?

@keqiaozhang
Copy link
Collaborator Author

@plbossart
We will test this issue with sof-dev on UP2 tomorrow.

@mengdonglin mengdonglin removed the APL Applies to ApolloLake platform label Nov 1, 2018
@ranj063
Copy link
Collaborator

ranj063 commented Nov 1, 2018

@keqiaozhang I wonder if this issue #239 and thesofproject/sof#522 have a common root cause.

We need to fix the hdac_hdmi_runtime_suspend() routine to turn display power off only once. This commit in debug/cros-sof-v4.14-rebase f931c90

@markyang
Copy link

markyang commented Nov 2, 2018

Summary:
This issue can be reproduced on UP2.
pcm_write:2011: write error: Input/output error
aplay -Dhw:0,2 -c2 -fS16_LE -vv -i ~/Music/18_WAV_48000Hz_stereo_16bit.wav
Playing WAVE '/home/mark/Music/18_WAV_48000Hz_stereo_16bit.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
...
=== PAUSE === #############+ | 24% aplay: pcm_write:2011: write error: Input/output error

dmesg:
[ 3695.052461] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
[ 3695.052484] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG
[ 3695.552422] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
[ 3695.552446] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG
[ 3696.556865] sof-audio sof-audio: tag 1 not opened!
[ 3696.556907] sof-audio sof-audio: ipc tx: 0x40010000: GLB_PM_MSG: CTX_SAVE
[ 3696.557198] sof-audio sof-audio: ipc tx succeeded: 0x40010000: GLB_PM_MSG: CTX_SAVE
[ 3696.584842] sof-audio sof-audio: DSP core(s) enabled? 0 : core_mask 3
[ 3696.584879] sof-audio sof-audio: Debug PCIR: 00000000 at 00000044
[ 3926.054931] HDMI HDA Codec ehdaudio0D2: hdac_hdmi_present_sense: disconnect for pin:port 6:0
[ 5001.786504] perf: interrupt took too long (3957 > 3928), lowering kernel.perf_event_max_sample_rate to 50500
[ 5418.445072] sof-audio sof-audio: Debug PCIR: 00000000 at 00000044
[ 5418.445084] sof-audio sof-audio: Debug PCIR: 00000000 at 00000048
[ 5418.445092] sof-audio sof-audio: Debug PCIW: 00000000 at 00000048
[ 5418.449046] sof-audio sof-audio: Debug PCIR: 00000000 at 00000048
[ 5418.449057] sof-audio sof-audio: Debug PCIW: 00000040 at 00000048
[ 5418.449067] sof-audio sof-audio: Debug PCIR: 00000000 at 00000048
[ 5418.449074] sof-audio sof-audio: Debug PCIW: 00000000 at 00000048
[ 5418.449084] sof-audio sof-audio: Debug PCIR: 00000000 at 00000048
[ 5418.449091] sof-audio sof-audio: Debug PCIW: 00000040 at 00000048
[ 5418.449188] sof-audio sof-audio: loading firmware
[ 5418.449198] sof-audio sof-audio: booting DSP firmware
[ 5418.449635] sof-audio sof-audio: period_bytes:0xfc0
[ 5418.449641] sof-audio sof-audio: periods:42
[ 5418.449738] sof-audio sof-audio: unstall/run core: core_mask = 1
[ 5418.449746] sof-audio sof-audio: DSP core(s) enabled? 1 : core_mask 1
[ 5418.492227] sof-audio sof-audio: ipc rx: 0x70000000: FW_READY
[ 5418.492238] sof-audio sof-audio: ipc: DSP is ready 0x70000000 offset 0x81000
[ 5418.492299] sof-audio sof-audio: Firmware info: version 1.1-43da1 build 43 on Nov 2 2018:18:01:12
[ 5418.492448] sof-audio sof-audio: found ext header type 1 size 0x9c
[ 5418.492469] sof-audio sof-audio: booting DSP firmware completed
[ 5418.492473] sof-audio sof-audio: ipc rx done: 0x70000000: FW_READY
[ 5418.504672] sof-audio sof-audio: Firmware download successful, booting...
[ 5418.504676] sof-audio sof-audio: firmware boot complete
[ 5418.504694] sof-audio sof-audio: period_bytes:0x0
[ 5418.504696] sof-audio sof-audio: periods:1
[ 5418.504703] sof-audio sof-audio: stream_tag: 1
[ 5418.504722] sof-audio sof-audio: ipc tx: 0x90010000: GLB_TRACE_MSG

reproduce steps:

  1. aplay -Dhw:0,5 -f dat -c 2 audio.wav -i
  2. press the space key to do pause/resume (need try some times, 30%)

Test env:
sof master: 43da169
sof tool: 937032f
kernel sof-dev: a5ddc3d
tplg: sof-apl-pcm512x.tplg-937032f

Log:
logger-t-apl.txt
dmesg-apl.txt

@markyang markyang added the APL Applies to ApolloLake platform label Nov 2, 2018
@ranj063
Copy link
Collaborator

ranj063 commented Nov 5, 2018

@markyang it seems like the error is different on the UP2. On GLK, we're seeing ipc error whereas on the up2, we're seeing i/o error. I think these should be separated

@ranj063
Copy link
Collaborator

ranj063 commented Nov 5, 2018

@keqiaozhang @keyonjie could you please revert f931c90 on the debug/cros-sof-v4.14-rebase and check if the error still exists on glk. I could not reproduce it after I reverted that commit.

@keyonjie
Copy link

keyonjie commented Nov 5, 2018

@ranj063 I can't see it on my glk after reverted.

@keyonjie
Copy link

keyonjie commented Nov 5, 2018

@plbossart maybe we need redo this patch.

@plbossart
Copy link
Member

Yes, I don't know why that patch was merged in the chrome kernel. There was a bit of pushback on it and it seemed quite broken. I don't even see why on alsa-devel it has the label [RESEND] when I see no trace of the initial discussion, so not sure if this was legit or not.

The only thing I see after the initial contribution is bf27026 ('ASoC: hdac_hdmi: Add documentation for power management').

We should double-check with the Chrome team why they believe this patch was necessary, what problem it fixes. In the mean time let's make sure SOF complies with the documented power management expectations.

@plbossart
Copy link
Member

Now that I think of this, there was something odd when I worked on cros-v4.14-intel-audio,
21dfdd3 (see skl.c change)

At the time I didn't flag a conflict but remember thinking something was off in the reference counting.

@ranj063
Copy link
Collaborator

ranj063 commented Nov 5, 2018

@plbossart let me check with the chrome team today and get back.

@ranj063
Copy link
Collaborator

ranj063 commented Nov 5, 2018

@plbossart I checked with the chrome team and this patch is necessary but we need to revert it and re-apply it properly so that there is only one display power off and it is after the links are powered down.

@plbossart
Copy link
Member

@ranj063 the right way to do this is submit it through sof-dev, then apply on chrome. Also if it's necessary we should submit this patch upstream, not sure what happened here last Summer.

@keqiaozhang
Copy link
Collaborator Author

This issue is basically fixed on your kernel branch, but in one corner case like keep pressing the space key for few seconds(less than 5 seconds). it will run to ipc timed out error.

kernel driver- ranja063-cros-sof-v4.14-rebase:a38231
sof-master:1607737
soft-master:2cd668c

@ranj063
Copy link
Collaborator

ranj063 commented Nov 14, 2018

@keqiaozhang can you please try it with glk-004-drop-stable? I didnt see the issue with pressing space key continuously

@keqiaozhang
Copy link
Collaborator Author

Confirmed that this issue is fixed.
Kernel-topic/cros-sof-v4.14-rebase: a38231
DSP FW-origin/glk-004-drop-stable: 1f79920
Topology-origin/master: 5894ed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
APL Applies to ApolloLake platform bug Something isn't working GLK Applies to Gemini Lake P1 Blocker bugs or important features
Projects
None yet
Development

No branches or pull requests

6 participants