========================
SoundWire Error Handling
========================

The SoundWire PHY was designed with care, so errors on the bus should be
very unlikely and, if they do happen, should be limited to single-bit
errors. Examples of this design can be found in the synchronization
mechanism (sync loss after two errors) and the short CRCs used for Bulk
Register Access.

The errors can be detected with multiple mechanisms:

1. Bus clash or parity errors: This mechanism relies on low-level detectors
   that are independent of the payload and usages, and they cover both control
   and audio data. The current implementation only logs such errors.
   A possible improvement would be to invalidate an entire programming
   sequence and restart from a known position. When such errors occur outside
   of a control/command sequence, the SoundWire protocol provides no
   concealment or recovery for audio data. The location of the error also
   affects its audibility (most-significant bits are more impacted in PCM),
   and the bus might be reset after a number of such errors are detected. Note
   that bus clashes due to programming errors (two streams using the same bit
   slots) cannot be distinguished from those due to electrical issues during
   the transmit/receive transition, although a recurring bus clash when audio
   is enabled is an indication of a bus allocation issue. The interrupt
   mechanism can also help identify Slaves which detected a Bus Clash or a
   Parity Error, but they may not be responsible for the errors, so resetting
   them individually is not a viable recovery strategy.

2. Command status: Each command is associated with a status, which only
   covers transmission of the data between devices. The ACK status indicates
   that the command was received and will be executed by the end of the
   current frame. A NAK indicates that the command was in error and will not
   be applied. In case of bad programming (command sent to a non-existent
   Slave or to a non-implemented register) or an electrical issue, the
   absence of a response signals that the command was ignored. Some Master
   implementations allow for a command to be retransmitted several times. If
   the retransmission fails, backtracking and restarting the entire
   programming sequence might be a solution. Alternatively, some
   implementations might directly issue a bus reset and re-enumerate all
   devices.

3. Timeouts: In a number of cases such as ChannelPrepare or
   ClockStopPrepare, the bus driver is supposed to poll a register field until
   it transitions to a NotFinished value of zero. The MIPI SoundWire spec 1.1
   does not define timeouts, but the MIPI SoundWire DisCo document adds
   recommendations on timeouts. If such configurations do not complete, the
   driver will return -ETIMEDOUT. Such timeouts are symptoms of a faulty
   Slave device and are likely impossible to recover from.

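
The retransmission behavior described under "Command status" above can be
sketched as a bounded retry loop. This is only an illustration, not the bus
implementation: the names ``cmd_status``, ``send_with_retry``, ``fake_link``
and ``fake_xfer`` are hypothetical, and the sketch assumes that only a NAK is
worth retransmitting while an ignored command is reported to the caller
immediately.

```c
#include <stddef.h>

/* Hypothetical status codes for the three outcomes described above. */
enum cmd_status {
    CMD_OK,      /* ACK: command received, applied by end of frame */
    CMD_FAILED,  /* NAK: command was in error and is not applied */
    CMD_IGNORED  /* no response: command was ignored */
};

/* Hypothetical callback that transmits one command and returns its status. */
typedef enum cmd_status (*xfer_fn)(void *ctx);

/*
 * Transmit a command, retransmitting after a NAK up to max_retries times.
 * Returns the status of the last attempt. If attempts is non-NULL, it
 * receives the number of transmissions that were made. On CMD_FAILED the
 * caller would restart the programming sequence or reset the bus.
 */
enum cmd_status send_with_retry(xfer_fn xfer, void *ctx,
                                unsigned int max_retries,
                                unsigned int *attempts)
{
    enum cmd_status st = CMD_FAILED;
    unsigned int n = 0;
    unsigned int i;

    for (i = 0; i <= max_retries; i++) {
        st = xfer(ctx);
        n++;
        if (st != CMD_FAILED)
            break; /* ACK or ignored: retransmitting is not attempted */
    }

    if (attempts != NULL)
        *attempts = n;
    return st;
}

/* Fake link used for illustration: NAKs naks_left times, then ACKs. */
struct fake_link {
    unsigned int naks_left;
};

enum cmd_status fake_xfer(void *ctx)
{
    struct fake_link *link = ctx;

    if (link->naks_left > 0) {
        link->naks_left--;
        return CMD_FAILED;
    }
    return CMD_OK;
}
```

With a fake link that NAKs twice, ``send_with_retry(fake_xfer, &link, 3, &n)``
succeeds on the third transmission. A link that keeps NAKing exhausts the
retries and returns ``CMD_FAILED``.
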
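
The polling described under "Timeouts" above can be sketched as follows. This
is a minimal illustration under stated assumptions: ``read_field_fn``,
``poll_not_finished``, ``fake_slave`` and ``fake_read`` are hypothetical
names, the timeout is expressed as a maximum number of reads rather than a
duration, and the error code is the standard errno ``ETIMEDOUT``.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical accessor returning the current NotFinished field value. */
typedef uint8_t (*read_field_fn)(void *ctx);

/*
 * Poll the field until it reads zero or max_polls reads have been made.
 * Returns 0 when the field reached zero, -ETIMEDOUT otherwise.
 */
int poll_not_finished(read_field_fn read_field, void *ctx,
                      unsigned int max_polls)
{
    unsigned int i;

    for (i = 0; i < max_polls; i++) {
        if (read_field(ctx) == 0)
            return 0;
        /* A real driver would wait between reads; omitted in this sketch. */
    }
    return -ETIMEDOUT;
}

/* Fake device for illustration: reports NotFinished for busy_reads reads. */
struct fake_slave {
    unsigned int busy_reads;
};

uint8_t fake_read(void *ctx)
{
    struct fake_slave *slave = ctx;

    if (slave->busy_reads > 0) {
        slave->busy_reads--;
        return 1;
    }
    return 0;
}
```

A device that stays busy for three reads completes when four reads are
allowed and times out when only three are allowed.
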
Errors during global reconfiguration sequences are extremely difficult to
handle:

1. BankSwitch: An error during the last command issuing a BankSwitch is
   difficult to backtrack from. Retransmitting the BankSwitch command may be
   possible in a single segment setup, but this can lead to synchronization
   problems when enabling multiple bus segments (a command with side effects
   such as frame reconfiguration would be handled at different times). A global
   hard-reset might be the best solution.

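
The trade-off described for BankSwitch errors can be written down as a small
policy function. This is only a sketch of the reasoning above, not an existing
API: the enum and function names are hypothetical, and the segment count is
assumed to be known to the caller.

```c
/* Hypothetical recovery choices after a failed BankSwitch command. */
enum bank_switch_recovery {
    RETRANSMIT_BANK_SWITCH, /* retry the BankSwitch command */
    GLOBAL_HARD_RESET       /* reset all segments and start over */
};

/*
 * Pick a recovery action from the number of bus segments involved.
 * Retransmission is only considered for a single segment, because with
 * several segments the side effects would be handled at different times.
 */
enum bank_switch_recovery pick_bank_switch_recovery(unsigned int num_segments)
{
    if (num_segments <= 1)
        return RETRANSMIT_BANK_SWITCH;
    return GLOBAL_HARD_RESET;
}
```
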
Note that SoundWire does not provide a mechanism to detect illegal values
written in valid registers. In a number of cases the standard even mentions
that the Slave might behave in implementation-defined ways. The bus
implementation does not provide a recovery mechanism for such errors. Slave
or Master driver implementers are responsible for writing valid values in
valid registers and for implementing additional range checking if needed.
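
The additional range checking mentioned above could take the form of a lookup
table consulted before each write. This is a minimal sketch, assuming a driver
that knows the legal range of every register it programs. The structure, the
function name and the register addresses below are hypothetical and are not
taken from the SoundWire specification.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one writable register and its legal values. */
struct reg_range {
    uint32_t addr; /* register address */
    uint8_t min;   /* smallest legal value */
    uint8_t max;   /* largest legal value */
};

/* Example table; addresses and ranges are made up for illustration. */
const struct reg_range example_ranges[] = {
    { 0x0100, 0, 3 },
    { 0x0101, 1, 8 },
};

/*
 * Check a write against the table before it is sent on the bus.
 * Returns 0 if the register is known and the value is in range,
 * -EINVAL otherwise.
 */
int check_reg_write(const struct reg_range *table, size_t count,
                    uint32_t addr, uint8_t value)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (table[i].addr != addr)
            continue;
        if (value < table[i].min || value > table[i].max)
            return -EINVAL; /* valid register, illegal value */
        return 0;
    }
    return -EINVAL; /* register not described in the table */
}
```

For example, ``check_reg_write(example_ranges, 2, 0x0100, 2)`` returns 0,
while a value of 4 for the same register is rejected with ``-EINVAL``.
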