Skip to content
  • Vladimir Oltean's avatar
    c92cbaea
    net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles · c92cbaea
    Vladimir Oltean authored
    
    
    It isn't actually described clearly at all in UM10944.pdf, but on TX of
    a management frame (such as PTP), this needs to happen:
    
    - The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the
      desired destination port, need to be installed in one of the 4
      management slots of the switch, over SPI.
    - The host can poll over SPI for that management slot's ENFPORT field.
      That gets unset when the switch has matched the slot to the frame.
    
    And therein lies the problem. ENFPORT does not mean that the packet has
    been transmitted. Just that it has been received over the CPU port, and
    that the mgmt slot is yet again available.
    
    This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb,
    which is called right after sja1105_mgmt_xmit. We are in a hard
    real-time deadline, since the hardware only gives us 24 bits of TX
    timestamp, so we need to read the full PTP clock to reconstruct it.
    Because we're in a hurry (in an attempt to make sure that we have a full
    64-bit PTP time which is as close as possible to the actual transmission
    time of the frame, to avoid 24-bit wraparounds), first we read the PTP
    clock, then we poll for the TX timestamp to become available.
    
    But of course, we don't know for sure that the frame has been
    transmitted when we read the full PTP clock. We had assumed that ENFPORT
    means it has, but the assumption is incorrect. And while in most
    real-life scenarios this has never been caught due to software delays,
    nowhere is this fact more obvious than with a tc-taprio offload, where
    PTP traffic gets a small timeslot very rarely (example: 1 packet per 10
    ms). In that case, we will be reading the PTP clock for timestamp
    reconstruction too early (before the packet has been transmitted), and
    this renders the reconstruction procedure incorrect (see the assumptions
    described in the comments found on function sja1105_tstamp_reconstruct).
    So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms
    (1 tick is 8 ns).
    
    So fix this case of premature optimization by simply reordering the
    sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It
    turns out that in practice, the 135 ms hard deadline for PTP timestamp
    wraparound is not so hard, since even the most bandwidth-intensive PTP
    profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms.
    So if we couldn't deliver a timestamp in 135 ms (which we can), we're
    toast and have much bigger problems anyway.
    
    Fixes: 47ed985e ("net: dsa: sja1105: Add logic for TX timestamping")
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Acked-by: default avatarRichard Cochran <richardcochran@gmail.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    c92cbaea
    net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles
    Vladimir Oltean authored
    
    
    It isn't actually described clearly at all in UM10944.pdf, but on TX of
    a management frame (such as PTP), this needs to happen:
    
    - The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the
      desired destination port, need to be installed in one of the 4
      management slots of the switch, over SPI.
    - The host can poll over SPI for that management slot's ENFPORT field.
      That gets unset when the switch has matched the slot to the frame.
    
    And therein lies the problem. ENFPORT does not mean that the packet has
    been transmitted. Just that it has been received over the CPU port, and
    that the mgmt slot is yet again available.
    
    This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb,
    which is called right after sja1105_mgmt_xmit. We are in a hard
    real-time deadline, since the hardware only gives us 24 bits of TX
    timestamp, so we need to read the full PTP clock to reconstruct it.
    Because we're in a hurry (in an attempt to make sure that we have a full
    64-bit PTP time which is as close as possible to the actual transmission
    time of the frame, to avoid 24-bit wraparounds), first we read the PTP
    clock, then we poll for the TX timestamp to become available.
    
    But of course, we don't know for sure that the frame has been
    transmitted when we read the full PTP clock. We had assumed that ENFPORT
    means it has, but the assumption is incorrect. And while in most
    real-life scenarios this has never been caught due to software delays,
    nowhere is this fact more obvious than with a tc-taprio offload, where
    PTP traffic gets a small timeslot very rarely (example: 1 packet per 10
    ms). In that case, we will be reading the PTP clock for timestamp
    reconstruction too early (before the packet has been transmitted), and
    this renders the reconstruction procedure incorrect (see the assumptions
    described in the comments found on function sja1105_tstamp_reconstruct).
    So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms
    (1 tick is 8 ns).
    
    So fix this case of premature optimization by simply reordering the
    sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It
    turns out that in practice, the 135 ms hard deadline for PTP timestamp
    wraparound is not so hard, since even the most bandwidth-intensive PTP
    profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms.
    So if we couldn't deliver a timestamp in 135 ms (which we can), we're
    toast and have much bigger problems anyway.
    
    Fixes: 47ed985e ("net: dsa: sja1105: Add logic for TX timestamping")
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Acked-by: default avatarRichard Cochran <richardcochran@gmail.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
Loading