Thanks for these thoughts. I am very conscious of how little I understand of how flash storage operates, since the onboard firmware plays such a large and opaque role. When a certain logical portion of the capacity is mapped out as a partition and the rest is left free, do the physical cells that are actually used (and reused) even map to a distinct physical region? I had thought (though I can't say based on what source) that when you left space free on an SSD, it could be employed by internal wear-levelling.

I must confess, right from the outset, that I'm not a fan of ssds, especially MLC flash ssds.
Flash storage has very good IO, provided it has sufficiently large buffers and a fast enough controller to keep up, but it has poor write endurance, and for this reason most ssds are, in effect, heavily over-provisioned, i.e. there's a lot more flash on the drive than its stated capacity. This extra flash is there purely to replace flash cells that have failed, which is all handled inside the ssd by its on-board controller; it can't be accessed or even interrogated, except by drive utilities designed specifically for the purpose.
Because of the poor write endurance of flash memory, all ssds incorporate wear-levelling. What this means is that if you delete something from flash storage, the flash cells that originally held the deleted data are simply marked as deleted but are not immediately made available for re-writing; at a block level those cells are still regarded as in use, and this is why fstrim is necessary. When you manually ran fstrim and it reported 64GB, that's how much data had been written to the ssd and subsequently deleted. The purpose of fstrim is to mark those 'deleted' cells as available for re-use so they can be written to again. This is all going on inside the ssd and is not reflected in the OS's view of the filesystem sitting on those partitions.
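As a minimal sketch of how to inspect and clear that deleted-but-still-reserved state from the command line (the commands are standard util-linux tools, but whether your device supports discard, and the need for root, are assumptions about your setup):

```shell
# Show whether each block device advertises discard (TRIM) support:
# non-zero DISC-GRAN / DISC-MAX columns mean the device accepts TRIM.
lsblk --discard

# An actual trim needs root, so it is shown here commented out;
# --verbose makes fstrim report how many bytes were handed back to
# the drive for re-use (the 64GB figure mentioned above):
# sudo fstrim --all --verbose
```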
Now this is a bit of guesswork: because fstrim hadn't been run, it's possible that the partition had effectively become full, even though, as far as the OS was concerned, the filesystem was still half empty. Take the 128GB partition size, subtract the 64GB marked as deleted by the drive, then subtract the amount of 'live' data on the drive, and we could be looking at 0.
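The arithmetic of that hypothesis, treating the numbers from this thread as rough assumptions, works out like this:

```shell
partition_gb=128    # partition size
untrimmed_gb=64     # deleted but never trimmed (what fstrim later reported)
live_gb=64          # live data: filesystem roughly half full, per the OS
# Blocks the drive still considers writable without an erase first:
writable_gb=$((partition_gb - untrimmed_gb - live_gb))
echo "writable: ${writable_gb}GB"   # → writable: 0GB
```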
If this is the case (all of this is based on research I had to do in a weird situation where the solution was to run fstrim on LVM-mirrored HDDs!), the good news is that you probably haven't hammered the drive/partition: it's only been written once (apart from subsequent use), but all of it.
Given the degree of over-provisioning to allow for cell failure, the typical MTBF of >100k hours and several tens of full-drive re-writes seems reasonable for consumer-grade ssds, so the drive should be ok.
As far as I recall, when I've installed fstrim on a system it has scheduled a periodic run, although you might want to tune the frequency; running it more often than necessary will increase wear on the drive whereas, if my hypothesis is correct, not running it often enough will result in the failure you got.
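On systemd-based distributions that schedule usually comes from util-linux's fstrim.timer, which defaults to weekly; the frequency can be tuned with a drop-in override (created via sudo systemctl edit fstrim.timer; the monthly value below is just an illustrative choice, not a recommendation):

```
# /etc/systemd/system/fstrim.timer.d/override.conf
[Timer]
# The empty assignment first clears the packaged weekly schedule,
# then the new value takes its place:
OnCalendar=
OnCalendar=monthly
```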
Fwiw, I have two Proxmox VE hosts that use non-premium 128GB ssds for their root filesystem and the disk images for the four VMs that they run - one each for pihole and zabbix-proxy (SQLite) and the other two for 60 sec data capture, so small data but quite a lot of it: fstrim runs once/week and recovers ~1.6GB.
Not really applicable to this thread, but a general word of advice: don't try to run ZFS on consumer-grade ssds; you really need to fork out for data-centre-grade drives.
If a partition does map consistently to a specific batch of physical cells, then it would make sense to me to clone the partition into the free space and then leave the original "quarantined". (I could even do that 6 more times with this disk.) But if the cells haven't actually taken real punishment yet, and instead just hit the point of being unable to keep writing because no TRIM had been done, I assume that would be wasteful. But as I say, I'm not sufficiently conversant with the tech to know which is accurate.
Statistics: Posted by Havinit — Sat Dec 13, 2025 1:19 pm