Why do hard drives fail so often? And 8 ways to avoid it.

Hard drives seem to be one of the computer components most likely to fail on you at some point, leaving you wondering why, and what you can do about it.

Hard drives are mechanical devices with moving parts, which gives them more points of failure than other computer components. External hard drives have a higher potential for failure than internal drives because of poor cooling, being dropped, bad cable management, and so on.

In this article, we will explore the common failure points and how to avoid them where possible.

Why do hard drives seem to fail so often?

The keyword in this question is “seem”. Depending on your perspective, you could fairly perceive that hard drives fail often. One friend of mine is continually attributing data loss to hard drive failure, while another uses the same amount of storage and has never had so much as a corrupted file. The two of them have different views on this topic, but there is an objective answer out there.

Hard drives do fail. They are electro-mechanical devices that are subject to wear and tear, and they get a lot of use. Not only that, but they are highly sensitive to their surroundings: something as seemingly harmless as smoke in the air can damage a disk beyond repair.

If you consider the sheer number of points of failure compared to RAM, for example, you end up wondering how they ever work in the first place. Apply that logic to an external hard drive, which usually sits in a badly designed case with no cooling capacity and a cheap USB-to-SATA interface, and the points of failure increase again, especially when you shove it in and out of a backpack.

So why does one friend have a hard time of it and the other not? My guess is a little bit of luck, but the friend whose drives keep failing should also take a look at his computer case and cooling system. Maybe there is something in his set-up that significantly shortens the life expectancy of a hard drive.

Hard drive failure and the bathtub curve. 

Hard drives mostly fail from old age. The parts wear out and degrade over time like anything else. But an interesting pattern appears when you plot the failure rate of almost anything against its age: a graph known as the bathtub curve.

This graph demonstrates the likelihood of failure during the lifetime of hard drives but interestingly is applicable to almost anything you could attribute a lifespan to. In this context we’re looking at hard drives, of course, so let’s break it down.

The early life failures show an initial spike that gradually falls to a steady failure rate, which holds until the drive reaches “old age” and starts to give way to the effects of entropy. The initial high failure rate is effectively a testing phase: every component of the device has to function properly for the device to make it through to the stable failure rate. As each component proves itself reliable, the likelihood of the device simply not working falls until it levels off, and the most likely cause of failure becomes time.

During that stable period, a device operating in a vacuum would simply run until it wore out, but in the real world there are too many variables for the baseline to be zero. These are random failures. For a hard drive, this is anything from a component going bad to a cup of coffee being spilled. The possibilities are endless.
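To make the shape of the curve concrete, here is a minimal Python sketch that adds up the three parts described above: early life failures that fade away, a constant floor of random failures, and wear-out that grows with age. The function name and every number in it are made up purely for illustration and are not measured hard drive statistics.

```python
import math


def bathtub_failure_rate(age_years, early=0.08, early_decay=1.5,
                         random_floor=0.01, wear=0.0005, wear_power=3.0):
    """Illustrative yearly failure rate for a drive of the given age.

    All default values are invented for this sketch, not real drive data.
    """
    early_life = early * math.exp(-early_decay * age_years)  # fades quickly
    random_failures = random_floor                           # never reaches zero
    wear_out = wear * age_years ** wear_power                # grows with age
    return early_life + random_failures + wear_out


if __name__ == "__main__":
    # Print a rough text version of the curve, one row per year of age.
    for age in range(0, 11):
        rate = bathtub_failure_rate(age)
        print(f"year {age:2d}: {rate:.3f} {'#' * int(rate * 100)}")
```

Run it and the bars start high, drop to a low flat stretch, then climb again as the drive ages, which is the bathtub shape.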

8 steps to reduce random hard drive failures.

With the bathtub curve in mind, we can make an 8-step plan for getting the absolute most out of a hard drive and preventing data loss.

1. Mitigate risk.

The other steps might seem like overkill to some. Certainly my partner thinks I’m being ridiculous when I advise her on how to look after her hard drive. So, for a simple way of thinking about it, remember these three tips and you will be fine.

1. Back up! – Back up your hard drives, and consider a redundant RAID setup or something similar for any vital data so you don’t lose any of those precious files! Keep in mind that RAID protects you when a single drive dies, but it is not a replacement for a separate backup.

2. Outsource – Back up to a cloud drive and let the provider worry about the life of their drives.

3. Consolidate data – Transfer any data that isn’t regularly used to a storage drive that you can back up and keep in a safe place.

2. Use it after a trial period.

When a hard drive is new, it is a good idea to use it only as a backup drive, or, if you do use it for primary storage, to keep it backed up until it reaches the random failure portion of its lifespan (and even then, keep it backed up). I’m sure there is a scientific method to figure out exactly how long that should be, but I don’t know it, so I go by 5% of the expected lifespan, which I take from the warranty period.
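As a worked example of that 5% rule of thumb, here is a small Python sketch. It assumes the warranty length stands in for the expected lifespan, as described above; the function names are hypothetical and the 5% figure is a personal guideline, not a manufacturer recommendation.

```python
from datetime import date, timedelta


def trial_period_days(warranty_years, fraction=0.05):
    """Length of the trial period in days: a fraction of the warranty period."""
    return round(warranty_years * 365 * fraction)


def trial_end_date(purchase_date, warranty_years, fraction=0.05):
    """Date after which the drive is treated as past its early-life phase."""
    return purchase_date + timedelta(days=trial_period_days(warranty_years, fraction))


if __name__ == "__main__":
    # Example: a drive with a 3-year warranty bought on 1 January 2024.
    bought = date(2024, 1, 1)
    print(trial_period_days(3), "days, until", trial_end_date(bought, 3))
```

For a drive with a 3-year warranty, that works out to a trial period of about 55 days.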

3. Consider the designated use for the drive.

Is the drive internal or external? For an internal drive, think about why you need it. Is it replacing a failed drive? If so, why did its predecessor fail? Could improvements be made to your setup to give the new hard drive a longer lifespan?

For an external drive, is it for extreme locations or for your desktop? Can you make improvements to the place where the drive will be used to better protect it from the elements? Have you bought an external drive suitable for the intended use case?

4. Improve the working environment for the drive.

Does your PC need new fans? Is there somewhere cooler you could place the tower? Would a cooling pad help your laptop? Do you need a decent cable for an external drive, or extra padding in your camera bag?

5. Keep it cool.

Overheating is considered one of the most likely reasons a hard drive might fail sooner than you expect.

HDDs store data as binary on spinning platters coated with a ferromagnetic film. Each bit is represented by the direction of magnetization in a tiny region of that film. Heat can disturb this through thermally induced magnetic instability, the effect behind what is known as the superparamagnetic limit. Manufacturers take steps to guard against it, for example by coating the platters with two opposed magnetic layers that reinforce each other.

So, in short, the cooler your hard drive is, the better it will operate. Cool, not cold. Cold conditions have their own physical effects, so running your drive in a freezer won’t help. Keep ventilation good and keep your PC away from heat sources.

6. Use it.

If you have a drive that is just for storage or backups, then putting it away in a drawer for months on end might actually contribute to data loss. The magnetic polarization on the platters can weaken over time, and it is only refreshed when the data is rewritten, which never happens while the drive sits on a shelf. Plugging the drive in now and then, checking that the files still open, and re-copying anything important gives you a chance to catch problems early. A drive that hasn’t been used for a long time is more likely to turn up corrupted files.

7. Store your storage safely.

When your drive is not in use, it needs to be stored safely. Keep it dry, cool, and out of the way of any potential harm. For me, that means a drawer with a lock so my toddler can’t pry his way in there.

8. Monitor it.

Keep an eye on the health of your hard drive throughout its lifespan. If it begins to take a while to load, or files are showing up as corrupted, these could be the telltale signs that it is heading into the upward trend of the bathtub curve, and it may be time to back it up and find a replacement.

A well-looked-after hard drive that makes its way through its early life failures is likely to serve you well until it wears out. I have a functioning drive that is 14 years old. I don’t use it, but I do plug it in from time to time just to see how it’s doing, and it still fires up. Just remember, back it up!
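One simple way to spot corrupted files before you stumble across them is to record a checksum for every file on an archive drive and re-check those checksums each time you plug it in. The Python sketch below uses only the standard library. The function names are hypothetical, and it assumes the files are not meant to change between checks, so it suits archive or backup drives rather than a working drive.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_problems(root, manifest):
    """Compare the drive against an earlier manifest and list what differs."""
    root = Path(root)
    problems = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file():
            problems.append((relative, "missing"))
            continue
        try:
            if file_digest(path) != expected:
                problems.append((relative, "changed"))
        except OSError:
            # A read error is itself a warning sign worth recording.
            problems.append((relative, "unreadable"))
    return problems
```

Build the manifest once while the drive is known to be healthy, keep a copy of it somewhere other than the drive itself, and run find_problems on later visits. Anything it reports as changed or unreadable on a drive you have not edited is a good reason to restore from backup and think about a replacement.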