Comware

  • 1.  5140 spurious overtemp traps - hotspot 6 on both stack members (not simultaneously)

    Posted Aug 08, 2024 09:25 AM
    Edited by HZ55 Aug 08, 2024 10:07 AM

    We are seeing spurious overtemperature traps on a 5140, always for sensor hotspot 6.

    %Aug  8 13:20:03:301 2024 sw5140 DEV/4/TEMPERATURE_ALARM: -Slot=2; Temperature is greater than the high-temperature alarming threshold on slot 2 sensor hotspot 6.
    %Aug  8 13:20:43:300 2024 sw5140 DEV/5/TEMPERATURE_NORMAL: -Slot=2; Temperature changed to normal on slot 2 sensor hotspot 6.

    I happened to be able to log in at 13:20:55, and display environment showed that the temperature had already dropped to 51 degrees, from 96 or more.

    I wonder whether these alarms are realistic or just noise (EDIT: in the latter case it may be a hardware issue).

    This short timeframe suggests I would have to take SNMP samples quite often if I wanted to get a clearer picture.

    Or maybe I should have the trap trigger some intensive polling for a minute or so.
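
    A minimal sketch of the trap-triggered burst polling idea, in Python. The read_temperature callable is hypothetical: it stands in for whatever SNMP GET (or CLI scrape) returns the sensor value in your setup, and the 60 s window and 1 s interval are only illustrative defaults.

```python
import time


def burst_poll(read_temperature, duration_s=60.0, interval_s=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Sample a temperature source at a fixed interval for a limited time.

    read_temperature is a hypothetical zero-argument callable supplied by
    the caller (for example a wrapper around an SNMP GET). clock and sleep
    are injectable so the loop can be exercised without real waiting.
    Returns a list of (elapsed_seconds, value) tuples.
    """
    start = clock()
    samples = []
    while True:
        elapsed = clock() - start
        if elapsed >= duration_s:
            break
        samples.append((elapsed, read_temperature()))
        sleep(interval_s)
    return samples
```

    The trap receiver would call burst_poll once per incoming temperature trap and store the samples. Whether this shows more than the log does depends on how often the switch itself refreshes the sensor reading.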

    This IRF stack of two shows this maybe once or twice a week. The room is air-conditioned at 18°C and nothing is blocking the fans. Only hotspot 6 is affected on each member, but it does not happen on both at the same time. The devices are still fairly new.

    There are no features enabled that would suggest a high CPU load. I have no idea whether hotspot 6 is CPU related.

    And we do not see this in the same way elsewhere on 5140.

    Any comments or opinions?



  • 2.  RE: 5140 spurious overtemp traps - hotspot 6 on both stack members (not simultaneously)

    Posted Sep 24, 2024 09:20 AM

    We still have this. I am now gathering data on it across all our 5140s. It is not that rare among the 5140s we have deployed, including those at customer sites. The 5130 never had this.

    The hotspot number can only be taken from the log message, not from the trap information, so I am trying to get a bigger picture from our syslog receiver.

    The alarm clears after 40 s. That may simply be a standard polling interval, I am not sure. So the actual temperature spike could in theory be over even sooner.
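
    A rough Python sketch of how the syslog lines could be paired up to see which sensor alarmed and how long each alarm lasted. The regular expression is written only against the two message formats quoted in this thread, so treat it as an assumption that other firmware versions log the same wording. All function names are made up for illustration.

```python
import re
from datetime import datetime

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches the TEMPERATURE_ALARM / TEMPERATURE_NORMAL lines quoted above.
LINE_RE = re.compile(
    r"%(?P<mon>\w{3})\s+(?P<day>\d{1,2}) (?P<hms>\d{2}:\d{2}:\d{2}):(?P<ms>\d{3}) "
    r"(?P<year>\d{4}) (?P<host>\S+) DEV/\d/TEMPERATURE_(?P<event>ALARM|NORMAL): "
    r".*slot (?P<slot>\d+) sensor (?P<sensor>\w+ \d+)\."
)


def parse_line(line):
    """Return (timestamp, (host, slot, sensor), event) or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    hour, minute, second = map(int, m["hms"].split(":"))
    ts = datetime(int(m["year"]), MONTHS[m["mon"]], int(m["day"]),
                  hour, minute, second, int(m["ms"]) * 1000)
    return ts, (m["host"], int(m["slot"]), m["sensor"]), m["event"]


def alarm_durations(lines):
    """Pair each ALARM with the next NORMAL for the same host/slot/sensor."""
    open_alarms = {}
    durations = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, key, event = parsed
        if event == "ALARM":
            open_alarms[key] = ts
        elif key in open_alarms:
            seconds = (ts - open_alarms.pop(key)).total_seconds()
            durations.append((key, seconds))
    return durations


SAMPLE = [
    "%Aug  8 13:20:03:301 2024 sw5140 DEV/4/TEMPERATURE_ALARM: -Slot=2; Temperature is greater than the high-temperature alarming threshold on slot 2 sensor hotspot 6.",
    "%Aug  8 13:20:43:300 2024 sw5140 DEV/5/TEMPERATURE_NORMAL: -Slot=2; Temperature changed to normal on slot 2 sensor hotspot 6.",
]

for key, seconds in alarm_durations(SAMPLE):
    print(key, round(seconds, 3))
```

    If every pair across all devices comes out at almost exactly 40 s, that would point to a fixed polling interval rather than to the real length of the spike.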




  • 3.  RE: 5140 spurious overtemp traps - hotspot 6 on both stack members (not simultaneously)

    Posted Sep 25, 2024 05:20 AM
    Edited by HZ55 Sep 25, 2024 06:29 AM

    I have been given a hint that a fix for mistaken temperature alerts is coming in an upcoming version that is not on the portal yet.

    So my theory seems to be correct, and we can just wait until that version shows up.