Weird NVMe boot event


    A few minutes ago I booted up and was immediately presented with a warning message, shown in the top half of the graphic below. It reported an NVMe Composite temperature of 84C and a substantially shortened remaining life for the drive. The drive didn't feel hot, certainly not 84C hot, so I powered down and then rebooted. Everything came back up as it normally does. The drive showed no evidence of the previous temperature spike. Strange indeed. Has anyone else experienced this?
    No doubt I will be watching it like a hawk for the foreseeable future.

    [Image: drive_going_bad.png]
    Last edited by GreyGeek; Nov 08, 2023, 10:07 PM.
    "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
    – John F. Kennedy, February 26, 1962.

    #2
    Never have had something like this before.
    Perhaps a faulty contact or general miscommunication of data?

    I assume you have a good backup strategy, so if it does not happen again over the next few days of use, I would not be that concerned…
    Last edited by Schwarzer Kater; Nov 08, 2023, 08:26 AM.
    Debian KDE & LXQt • Kubuntu & Lubuntu • openSUSE KDE • Windows • macOS X
    Desktop: Lenovo ThinkCentre M75s • Laptop: Apple MacBook Pro 13" • and others

    get rid of Snap script (20.04 +) • reinstall Snap for release-upgrade script (20.04 +)
    install traditional Firefox script (22.04 +) • install traditional Thunderbird script (24.04)



      #3
      It's an M.2 form factor Samsung SSD 980 1 TB. One end is slotted and the other end is screwed down. It's wrapped in blue heat-conducting silicone. I doubt there is a mechanical problem or anything related to vibration. My suspicion is on the firmware. I've rebooted several times throughout the day and the problem hasn't reappeared.

      The BTRFS fs is on nvme0n1p1

      *-nvme0
           description: NVMe device
           product: Samsung SSD 980 1TB
           physical id: 2
           logical name: /dev/nvme0
           version: 1B4QFXO7
           serial: S64ANXXXXXXXXXXX
           configuration: nqn=nqn.1994-11.com.samsung:nvme:980M.2:S64ANJ0R954589R state=live
         *-namespace:0
              description: NVMe disk
              physical id: 0
              logical name: hwmon2
         *-namespace:1
              description: NVMe disk
              physical id: 2
              logical name: /dev/ng0n1
         *-namespace:2
              description: NVMe disk
              physical id: 1
              bus info: nvme@0:1
              logical name: /dev/nvme0n1
              size: 931GiB (1TB)
              capabilities: gpt-1.00 partitioned partitioned:gpt
              configuration: guid=66a5de34-f6cd-6847-b1cb-96def5923537 logicalsectorsize=512 sectorsize=512 wwid=eui.002538d9999994fe
            *-volume
                 description: EFI partition
                 physical id: 1
                 bus info: nvme@0:1,1
                 logical name: /dev/nvme0n1p1
                 serial: 34f2f072-dbc0-7845-bb1c-ff836d0e0e36
                 capacity: 931GiB



        #4
        If your suspicion is on the firmware, did you already look for a new one with e.g. sudo fwupdmgr get-updates?


          #5
          Seems like a glitch. Which firmware version is the drive running?

          My 2 Samsung nvme drives:
          Code:
          - S73VNJ0TA06050V -
          
          Capacity : 1.00 TB (1,000,204,886,016 bytes)
          DevicePath : /dev/nvme1n1
          DeviceStatus : Unknown
          Firmware : 0B2QJXD7
          FirmwareUpdateAvailable : The selected drive contains current firmware as of this tool release.
          Index : 1
          MaximumLBA : 1953525167
          ModelNumber : Samsung SSD 990 PRO 1TB
          PercentOverProvisioned : 100.00
          SectorDataSize : 512
          SerialNumber : S73VNJ0TA06050V
          
          - S73VNJ0TA06033K -
          
          Capacity : 1.00 TB (1,000,204,886,016 bytes)
          DevicePath : /dev/nvme2n1
          DeviceStatus : Unknown
          Firmware : 0B2QJXD7
          FirmwareUpdateAvailable : The selected drive contains current firmware as of this tool release.
          Index : 2
          MaximumLBA : 1953525167
          ModelNumber : Samsung SSD 990 PRO 1TB
          PercentOverProvisioned : 100.00
          SectorDataSize : 512
          SerialNumber : S73VNJ0TA06033K
          This output came from: https://www.solidigm.com/content/sol.../ka-00085.html
          Last edited by oshunluvr; Nov 09, 2023, 05:34 PM.
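For anyone who wants to sanity-check a listing like the one above, here is a minimal Python sketch that recomputes the capacity from the MaximumLBA and SectorDataSize fields. It assumes MaximumLBA is the last addressable block, counted from zero, which is a guess about how the tool defines that field. The function name is made up for illustration.

```python
# Values copied from the 990 PRO listing in this post.
SECTOR_DATA_SIZE = 512                      # "SectorDataSize"
MAXIMUM_LBA = 1_953_525_167                 # "MaximumLBA"
REPORTED_CAPACITY_BYTES = 1_000_204_886_016  # "Capacity"


def capacity_from_max_lba(max_lba: int, sector_size: int) -> int:
    """Return the capacity in bytes.

    Assumption: LBAs are numbered from 0, so the sector count is max_lba + 1.
    """
    return (max_lba + 1) * sector_size


computed = capacity_from_max_lba(MAXIMUM_LBA, SECTOR_DATA_SIZE)
print(f"computed: {computed:,} bytes")
print(f"matches reported capacity: {computed == REPORTED_CAPACITY_BYTES}")
print(f"decimal TB: {computed / 1e12:.2f}, whole GiB: {computed // 2**30}")
```

Under that assumption the computed figure agrees with the Capacity line byte for byte. It also explains why the same kind of drive shows up as roughly 1 TB in decimal units and 931 GiB in binary units.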



            #6
            Originally posted by Schwarzer Kater
            If your suspicion is on the firmware, did you already look for a new one with e.g. sudo fwupdmgr get-updates?
            Never heard of that command before. Thanks for the heads up.
            But, no joy.
            $ sudo fwupdmgr get-updates
            [sudo] password for jerry:
            Devices with no available firmware updates:
            • HP TrueVision HD Camera
            • SSD 870 EVO 1TB
            • SSD 980 1TB
            • System Firmware
            Devices with the latest available firmware version:
            • Unifying Receiver
            No updates available




              #7
              System Information
              OS Version : Linux : 6.1.0-13-amd64 (#1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29))
              Process ID : 2123
              Uptime : 265 sec (0 days, 0 hours, 4 min, 25 sec)

              Physical Disk Information - Disk: #0: Samsung SSD 980 1TB

              Hard Disk Summary
              Hard Disk Number : 0
              Hard Disk Device : /dev/nvme0
              Interface : NVMe
              Hard Disk Model ID : Samsung SSD 980 1TB
              Firmware Revision : 1B4QFXO7
              Hard Disk Serial Number : S64ANJ0R954589R
              Total Size : 953869 MB
              Current Temperature : 29 °C (84 °F)
              Maximum Temperature (during Entire Lifespan) : 29 °C (84 °F)
              Power On Time : 3 days, 5 hours
              Estimated Remaining Lifetime : more than 1000 days
              Lifetime Writes : 4.56 TB
              Health :
              100 % (Excellent)
              Performance :
              100 % (Excellent)
              The status of the solid state disk is PERFECT. Problematic or weak sectors were not found.
              The health is determined by SSD specific S.M.A.R.T. attribute(s): Available Spare (Percent), Percentage Used

              No actions needed.
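The "Lifetime Writes" and "Power On Time" figures in this summary can be cross-checked against the raw counters in the S.M.A.R.T. table of the same report. The sketch below is only a rough check. It assumes each data unit is 512,000 bytes, as the table's column label says, and that the tool's "TB" is really a binary unit (1024^4 bytes), which is inferred only from the numbers lining up. The helper name is hypothetical.

```python
# Raw counters copied from the S.M.A.R.T. table of this report.
DATA_UNIT_BYTES = 512_000        # from the "(512000 b)" column label
DATA_UNITS_WRITTEN = 9_788_867   # "Data Units Written"
POWER_ON_HOURS = 77              # "Power On Hours"

TIB = 1024 ** 4  # assumption: the tool's "TB" is binary


def data_units_to_tib(units: int, unit_bytes: int = DATA_UNIT_BYTES) -> float:
    """Convert an NVMe data-unit counter to TiB (hypothetical helper)."""
    return units * unit_bytes / TIB


written_tib = data_units_to_tib(DATA_UNITS_WRITTEN)
print(f"lifetime writes: {written_tib:.2f} TiB")

# "Power On Time : 3 days, 5 hours" expressed in hours.
summary_hours = 3 * 24 + 5
print(f"power-on hours from summary: {summary_hours}, from counter: {POWER_ON_HOURS}")
```

Both values agree with the summary (4.56 and 77), so the summary looks like a straight conversion of the drive's own counters.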
              Properties
              NVMe Standard Version : 1.4
              PCI Vendor ID (VID) : 0x144D (Samsung)
              PCI Subsystem Vendor ID (SSVID) : 0x144D (Samsung)
              IEEE OUI Identifier : 38-25-00
              Recommended Arbitration Burst (RAB) : 2
              Multi-Interface Capabilities : 0
              Maximum Data Transfer Size : 512 (9)
              Abort Command Limit : 8
              Asynchronous Event Request Limit : 4
              Number FW Slots Support : 3
              Maximum Error Log Page Entries : 64
              Total Number Of Power States : 5
              Admin Vendor Specific CMD Format : 1
              Submission Queue Entry Size : Max: 64, Min: 64
              Completion Queue Entry Size : Max: 16, Min: 16
              Number Of Namespaces : 1
              Stripe Size : 0
              Maximum Power : 524
              NVMe Features
              Device Self-test : Supported
              Extended Self-test Estimated Time : 35 minutes
              Only One Device Self-test : No
              Namespace Management : Not supported
              First Firmware Slot Read Only : No
              Command Effects Log Page : Supported
              SMART Information Per Namespace : Supported
              Save / Select Fields : Supported
              Dataset Management Command : Supported
              Compare Command : Supported
              Cryptographic Erase : Supported
              Format All Namespaces : Supported
              Volatile Write Cache Present : Supported
              Autonomous Power State Transitions : Supported
              Host Controlled Thermal Management : Supported
              Warning Composite Temperature Threshold : 355 °K (82 °C)
              Critical Composite Temperature Threshold : 358 °K (85 °C)
              Sanitize Block Erase : Supported
              Sanitize Crypto Erase : Supported
              Sanitize Status : Never sanitized [0]
              NVMe Namespace Information
              NS 1 Total Sectors : 1953525168
              NS 1 Sector Size : 512 bytes
              NS 1 Active LBA Format Index : 0
              NS 1 LBA Formats Supported : 1
              NS 1 LBA Format List (Performance) : 512 bytes (Best)
              S.M.A.R.T.
              Attribute Threshold Value
              Critical Warning 0
              Composite Temperature (Kelvin) 302
              Available Spare (Percent) 100
              Available Spare Threshold 10
              Percentage Used 0
              Data Units Read (512000 b) 4,548,010
              Data Units Written (512000 b) 9,788,867
              Host Read Commands 21,446,400
              Host Write Commands 32,036,872
              Controller Busy Time (minutes) 91
              Power Cycles 1,013
              Power On Hours 77
              Unsafe Shutdowns 147
              Media and Data Integrity Errors 0
              Number of Error Information Log Entries 0
              Warning Composite Temperature Time (minutes) 1
              Critical Composite Temperature Time (minutes) 0
              Temperature Sensor 1 302
              Temperature Sensor 2 308
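The temperatures in the S.M.A.R.T. table are in Kelvin, so here is a small Python sketch that converts them and compares a reading against the two thresholds listed under NVMe Features. The classify function and its "at or above the threshold" rule are assumptions made for illustration, not how the drive or any monitoring tool is known to decide.

```python
# Values copied from this report (all in Kelvin).
COMPOSITE_K = 302   # "Composite Temperature (Kelvin)"
WARNING_K = 355     # "Warning Composite Temperature Threshold"
CRITICAL_K = 358    # "Critical Composite Temperature Threshold"


def kelvin_to_celsius(kelvin: float) -> float:
    return kelvin - 273.15


def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32


def classify(temp_k: float, warning_k: float = WARNING_K,
             critical_k: float = CRITICAL_K) -> str:
    """Hypothetical rule: at or above a threshold counts as that level."""
    if temp_k >= critical_k:
        return "critical"
    if temp_k >= warning_k:
        return "warning"
    return "normal"


celsius = kelvin_to_celsius(COMPOSITE_K)
print(f"composite: {celsius:.0f} C / {celsius_to_fahrenheit(celsius):.0f} F "
      f"-> {classify(COMPOSITE_K)}")

# The 84 C reading from the boot warning, converted back to Kelvin.
boot_warning_k = 84 + 273.15
print(f"84 C reading -> {classify(boot_warning_k)}")
```

With these numbers the current 302 K reading is about 29 C and comes out as normal. An 84 C reading would land between the warning (82 C) and critical (85 C) thresholds, which at least fits the single minute logged under "Warning Composite Temperature Time".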
              "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
              – John F. Kennedy, February 26, 1962.
