Commits
Thomas Gleixner committed 677fe555cbf
serial: imx: Fix recursive locking bug commit 9ec1882df2 (tty: serial: imx: console write routing is unsafe on SMP) introduced a recursive locking bug in imx_console_write(). The callchain is: imx_rxint() spin_lock_irqsave(&sport->port.lock,flags); ... uart_handle_sysrq_char(); sysrq_function(); printk(); imx_console_write(); spin_lock_irqsave(&sport->port.lock,flags); <--- DEAD The bad news is that the kernel debugging facilities can dectect the problem, but the printks never surface on the serial console for obvious reasons. There is a similar issue with oops_in_progress. If the kernel crashes we really don't want to be stuck on the lock and unable to tell what happened. In general most UP originated drivers miss these checks and nobody ever notices because CONFIG_PROVE_LOCKING seems to be still ignored by a large number of developers. The solution is to avoid locking in the sysrq case and trylock in the oops_in_progress case. This scheme is used in other drivers as well and it would be nice if we could move this to a common place, so the usual copy/paste/modify bugs can be avoided. Now there is another issue with this scheme: CPU0 CPU1 printk() rxint() sysrq_detection() -> sets port->sysrq return from interrupt console_write() if (port->sysrq) avoid locking port->sysrq is reset with the next receive character. So as long as the port->sysrq is not reset and this can take an endless amount of time if after the break no futher receive character follows, all console writes happen unlocked. While the current writer is protected against other console writers by the console sem, it's unprotected against open/close or other operations which fiddle with the port. That's what the above mentioned commit tried to solve. That's an issue in all drivers which use that scheme and unfortunately there is no easy workaround. The only solution is to have a separate indicator port->sysrq_cpu. uart_handle_sysrq_char() then sets it to smp_processor_id() before calling into handle_sysrq() and resets it to -1 after that. Then change the locking check to: if (port->sysrq_cpu == smp_processor_id()) locked = 0; else if (oops_in_progress) locked = spin_trylock_irqsave(port->lock, flags); else spin_lock_irqsave(port->lock, flags); That would force all other cpus into the spin_lock path. Problem solved, but that's way beyond the scope of this fix and really wants to be implemented in a common function which calls the uart specific write function to avoid another gazillion of hard to debug copy/paste/modify bugs. Reported-and-tested-by: Tim Sander <tim@krieglstein.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable <stable@vger.kernel.org> # 3.6+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>