Backup Basics for Photographers
Remember that time on Sex and the City when Carrie went to boot up her laptop and got Sad Mac? If you’ve never blown on a Nintendo cartridge or used a pencil to wind a tape cassette, you’re probably too young to remember Sad Mac (actually, you might also be too young to remember Sex and the City, in which case, thanks for making me feel ancient). Ah, Sad Mac. That cute but ominous little anthropomorphic CPU used to be Apple’s way of telling you that something was very wrong and, ostensibly, preparing you for the traumatic realization that your computer might be hosed. When poor Carrie took her ailing PowerBook to be repaired, the first question the support rep asked her was, “When was the last time you backed up?” To which she replied, “Umm, I don’t do that.” Suddenly, her entire library of manuscripts hung in the balance, along with her sanity.
Nostalgia for the days of Zip Drives and the upside-down Apple logo aside, this tale of woe provides an excellent lesson in the importance of a solid backup workflow. Apple may have done away with Sad Mac, but if your computer suddenly stops working, I suspect many of you will be quite inconsolable. Carrie’s manuscripts represented her entire career as a writer. As photographers, our image libraries not only allow us to earn a living through licensing opportunities; they comprise a body of work potentially spanning decades, demonstrating our growth as artists. Losing access to our images would be undeniably catastrophic.
However, the loss of images barely scratches the surface of the potential damage to which even the average household is susceptible, from something as common as a failed hard drive or power surge. Financial records, legal documents, family keepsakes, address books…the list goes on. While the specific backup needs of professional photographers and typical households will obviously vary, the need to have a backup workflow in place remains a common denominator.
Why Should We Back Up?
There are two types of people in this world: those who have lost data, and those who will lose data. Data loss has been a rite of passage for decades. For some reason, all of us have to learn this lesson the hard way (hopefully with only one occurrence). 100 percent of hard drives fail at some point. Hacking and ransomware have far eclipsed similar crimes within the physical realm. Severe weather creates power surges and outages that can damage sensitive electronics and their data. No matter the cause, you are all but guaranteed to be exposed to a data loss situation at some point. The question is whether or not you will have a backup in place to render that data loss moot.
Here’s a brief list of the horrible things that can happen to your computer and data. Are you prepared for one or more of these perils?
- Hardware failure (including media-related problems like disc rot and demagnetization)
- Hackers, viruses, ransomware, or other malicious activity
- Drops, liquid spills, or other accidents
- Fire, flood, power surge, or other disasters (natural or otherwise)
- Theft/burglary
- User error, such as accidental deletion or overwrite
I am solidly in the category of those who have lost data, having experienced everything from total system meltdowns to old-school viruses to failures of single drives that were not backed up. If you are also in the “I’ve lost data” camp, note that if you’ve failed to learn from that experience, you will inevitably repeat your initiation—an unfortunate, potentially expensive, and completely unnecessary occurrence. Keep in mind that even if you have additional copies of your work, if they are all in one place, you are still not prepared for some of the above events.
Misconceptions About Backups
There are many misconceptions about what constitutes a backup, some of which I’ve been guilty of myself. It’s important to recognize that, while some of these items might be incorporated into your backup strategy, they are not themselves a backup.
Fault tolerance
A simple example of fault tolerance would be dual memory card slots on a camera. Assuming you set the camera to record the exact same data to both cards, you are protected against the failure (fault) of either memory card. However, this merely provides “tolerance” to a single drive (card) failure. If your camera is dropped, immersed in water, stolen, or otherwise incapacitated with both memory cards still inside, you have no protection against data loss.
The same logic applies in a desktop computing environment. One of the most common mistakes people make in choosing a backup strategy is confusing “fault tolerance” with a true “backup.” Formerly discussed only in serious computer enthusiast or professional circles, RAID has become commonplace in the market these days. Short for “Redundant Array of Independent Disks,” RAID essentially combines multiple physical drives to create one logical drive. You might have four hard drives grouped together in a RAID configuration, but on your desktop, you only see one drive. Which type of configuration makes sense for your needs depends on whether your purpose in setting up the RAID is mainly to improve read/write speed or to provide fault tolerance in the event of a drive failure. It’s important to note that not all RAID types provide fault tolerance. Even with a five-drive RAID setup, you could experience catastrophic data loss from a single drive failure if the RAID is configured purely for speed (striping, as in RAID 0) rather than for redundancy.
However, even with one- or two-drive fault tolerance, RAID is not itself a backup. First of all, if your RAID is managed by a hardware controller (as opposed to being managed by your operating system), your ability to actually access your data depends entirely on having a working hardware controller. Controllers are often specific to each type of enclosure (the box in which the drives are housed), so if your controller fails and you can’t get a replacement, your data has just become unreadable. Secondly, a power surge, fire, flood, or other issue could still easily wipe out all the drives in your RAID. Fault tolerance is meant to preserve uptime or act as a “first line of defense,” not to be the sole determinant of whether or not you actually lose data. Fault tolerance is still an important consideration in a backup strategy, but it is not on its own a backup strategy. For more information on the different types of RAID configurations, check out this video.
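To make the difference between RAID levels concrete, here is a minimal Python sketch of how many simultaneous drive failures the common levels are guaranteed to survive. The function name and the table of levels are my own illustration; they are not tied to any particular enclosure or controller.

```python
def guaranteed_failures_tolerated(level: str, drives: int) -> int:
    """Return how many drive failures an array is guaranteed to survive.

    Illustrative only: covers the standard RAID levels and ignores
    vendor-specific variants.
    """
    rules = {
        "RAID 0": 0,           # striping only: speed, no redundancy
        "RAID 1": drives - 1,  # every drive holds a full mirror
        "RAID 5": 1,           # single parity
        "RAID 6": 2,           # double parity
        "RAID 10": 1,          # may survive more, but only one is guaranteed
    }
    if level not in rules:
        raise ValueError(f"unknown RAID level: {level}")
    return rules[level]


for level in ("RAID 0", "RAID 5", "RAID 6"):
    count = guaranteed_failures_tolerated(level, 5)
    print(f"{level} with 5 drives tolerates {count} drive failure(s)")
```

Whatever number comes back, it only describes drive failures. It says nothing about fire, theft, or a dead controller, which is why the array still needs a backup.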
Time Machine (Mac users only)
On the surface, Time Machine seems like the perfect backup solution for Mac users (and make no mistake—I do use it, it is wonderful, and it has saved me more than once). However, there are a couple of things to know about Time Machine that, in my opinion, make it not a true backup solution, at least not on its own.
Time Machine is, honestly, awesome. Adhering to Apple’s recently-strained-if-not-completely-abandoned claim that “it just works,” Time Machine automatically takes hourly snapshots that represent the exact state of your computer system at the moment of backup. The current day’s backups are kept for 24 hours, after which they are automatically pruned to include only one per day for a month, and thereafter, one per week until the backup disk is full. From that point, the oldest backup versions are automatically purged to make room for new backups. Time Machine is beautiful in its simplicity. Backups can be encrypted, stored on network volumes (looking at you, fellow NAS users), and can exclude specific drives (volumes), files, or folders. Restoring previous versions of files is as simple as clicking and dragging, and even a complete system restore is, for the most part, automatic.
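The pruning schedule described above can be sketched in a few lines of Python. This is only my approximation of the behavior, not Apple’s actual implementation: the function name, the 30-day stand-in for “a month,” and the use of ISO calendar weeks are all assumptions, and the final step of purging the oldest backups when the disk fills up is left out.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def snapshots_to_keep(snapshots: Iterable[datetime], now: datetime) -> List[datetime]:
    """Apply an hourly / daily / weekly thinning rule to backup timestamps."""
    keep = []
    seen_days = set()
    seen_weeks = set()
    # Walk from newest to oldest so the newest snapshot in each bucket wins.
    for snap in sorted(snapshots, reverse=True):
        age = now - snap
        if age <= timedelta(hours=24):
            keep.append(snap)                 # last 24 hours: keep everything
        elif age <= timedelta(days=30):
            if snap.date() not in seen_days:  # past month: one per calendar day
                seen_days.add(snap.date())
                keep.append(snap)
        else:
            week = tuple(snap.isocalendar())[:2]  # (ISO year, ISO week number)
            if week not in seen_weeks:            # older: one per week
                seen_weeks.add(week)
                keep.append(snap)
    return sorted(keep)


now = datetime(2024, 3, 1, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(24 * 60)]  # 60 days of backups
kept = snapshots_to_keep(hourly, now)
print(f"{len(hourly)} snapshots taken, {len(kept)} retained")
```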
However, just like that voucher for a free lunch you received from those nice people at the vacation time-share company, some conditions apply. As I mentioned, Time Machine captures snapshots, not complete copies of the file and folder structure of your system. Using a UNIX-based technology called “hard links,” Time Machine stores only shortcut references to files that remain unchanged over the course of time. With each backup instance, anything unchanged since the last backup only has that shortcut stored, not another copy of the same file. It takes a minuscule amount of space to do that, and that’s exactly how Time Machine is able to retain so many “versions” of your system with relatively little additional disk space for each. This is not necessarily a problem, as long as you understand that what is being stored on your Time Machine drive is natively accessible only by machines running contemporary versions of macOS and must be “restored” to another drive; you cannot boot directly from Time Machine without the involvement of another OS installation. Technically, it is possible to recover most of your files with a Windows machine; however, this assumes you have the expertise to do so and accept that it is not at all a perfect process. To the same end, while you could theoretically back up to Time Machine remotely, I don’t recommend it as an off-site backup solution, in part for technical reasons that go beyond the scope of this article. The key takeaway is that Time Machine’s strength lies in its ability to simply, selectively, and organically recover files (potentially in multiple versions) within a macOS environment. If you’re looking for an off-site backup, go with a service that is specifically designed for that purpose and offers a more transparent experience.
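If hard links sound abstract, the following Python sketch shows the general idea on a small scale: a new snapshot folder gets a real copy of anything that changed and a hard link to the previous snapshot for anything that did not. This is my own toy illustration of the technique, not how Time Machine is actually implemented, and `take_snapshot` and the folder names are hypothetical.

```python
import filecmp
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def take_snapshot(source: Path, target: Path, previous: Optional[Path] = None) -> None:
    """Write a snapshot of source into target, hard-linking unchanged files."""
    target.mkdir(parents=True, exist_ok=True)
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)       # unchanged: a second name for the same data
        else:
            shutil.copy2(src, dst)  # new or changed: store a real copy


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        photos = root / "photos"
        photos.mkdir()
        (photos / "raw.dng").write_text("untouched original")
        (photos / "edit.psd").write_text("first edit")
        take_snapshot(photos, root / "snap-1")

        (photos / "edit.psd").write_text("second edit")
        take_snapshot(photos, root / "snap-2", previous=root / "snap-1")

        same = os.stat(root / "snap-1" / "raw.dng").st_ino == os.stat(root / "snap-2" / "raw.dng").st_ino
        print("raw.dng shares storage across snapshots:", same)
```

Both snapshot folders look complete when you browse them, but the unchanged file occupies disk space only once.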
Windows users—I apologize but it’s been almost twenty years since I’ve used Windows for anything but basic emails and word processing at office jobs prior to becoming a photographer. The last Windows version I used seriously was Windows 98, so I have no idea if Microsoft has introduced any features comparable to Time Machine. If you know of any comparable solutions, please feel free to mention them in the comments.
File-sharing platforms (like Dropbox)
A frightening number of photographers I know seem to sleep better at night because their files are “backed up” in Dropbox. Folks, Dropbox is super cool and I do use it as part of my overall workflow, but I don’t recommend using it, or services like it, for backup purposes. For one thing, when you actually “share” a folder with another user (not just the link), they have the ability to edit that folder, which means that accidental deletion or overwriting can be a major issue. While the Dropbox platform does provide some restore functionality for restoring accidentally deleted items (how much depends on your plan tier), the entire focus of the platform is much more about file sharing and collaboration than backup. The disadvantages to Dropbox as a backup are similar to those of having a cloned drive, but without the same advantages (notwithstanding the various advantages of using Dropbox for purposes other than backup).
As Easy as 3-2-1
We’ve reviewed some common practices that can be part of a backup solution, but are not on their own a backup solution. What are the best practices, then? The generally accepted formula for a solid backup strategy is fairly simple: three total copies of your data, two of them on different media, and one kept off-site.
Three total copies
With three copies, you can experience a loss of up to two copies and still have the third preserved. The statistical likelihood of experiencing data loss both locally and off-site is minuscule, provided, by way of example, not limitation, that you haven’t simply stashed a hard drive at your friend’s house down the street when both of your homes are in a wildfire or flood zone.
Two copies on different media
Making a copy of your data doesn’t do much good if the copy resides on the same drive or physical media as the original. At that point, it’s simply a copy, not a backup. It won’t provide much protection. Having your second copy on an entirely separate drive or physical media (such as an external hard drive) reduces the risk that hardware failure or other local accidents would affect both copies at the same time. In the past, DVDs or similar optical media were used for backup, but aside from disc rot and the recent scarcity of optical drives and media, such a solution requires proactive and time-consuming interaction from the end user, not to mention the fact that even dual-layer Blu-ray discs only store about 50 GB apiece, a paltry amount compared to the contemporary photographer’s needs. Cloned drives could technically serve the purpose of a second copy, allowing for continuous uptime in the event of a drive failure; however, having cloned drives means any data corruption or malware on the main operating drive may be easily passed on to the cloned copy as well.
One copy off-site
This doesn’t have to be complicated. While most opt for a cloud backup service (more on those later), your off-site copy could be something as simple as an extra hard drive you keep at a friend’s house and update once every month with your most important files. That said, as I mentioned above, if your friend lives close to your primary storage location, a single disaster affecting both residences could destroy both your local and “off-site” copies. This is why cloud-based solutions are most favored for off-site backups. Cloud solutions often maintain multiple redundant copies across wide geographical regions, ensuring the availability of your data even in catastrophic conditions. However, even off-site backup services are not impenetrable. Two entirely plausible scenarios would be infiltration of the network by hackers or the simple act of the service suddenly going out of business and deleting users’ data. The latter is less likely than the former, but it’s still a possibility, which is the main reason so many people are still wary of trusting cloud-based solutions. However, that is the whole point of having a local copy as well: the off-site copy would typically be your last resort.
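The 3-2-1 rule is simple enough to express as a quick self-check. The sketch below is a hypothetical helper of my own (the `Copy` class, its fields, and the example devices are all made up for illustration) that lists which parts of the rule a given setup fails to meet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Copy:
    label: str     # what this copy is, e.g. "working files"
    device: str    # the physical drive or service holding it
    offsite: bool  # True if it lives away from your primary location


def check_321(copies: List[Copy]) -> List[str]:
    """Return the parts of the 3-2-1 rule that this set of copies violates."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three total copies")
    if len({c.device for c in copies}) < 2:
        problems.append("all copies share one drive or medium")
    if not any(c.offsite for c in copies):
        problems.append("no copy is kept off-site")
    return problems


setup = [
    Copy("working files", "internal SSD", offsite=False),
    Copy("local backup", "external HDD", offsite=False),
    Copy("cloud backup", "cloud service", offsite=True),
]
print(check_321(setup) or "3-2-1 satisfied")
```

Note that the check cannot tell whether your “off-site” location is actually far enough away; that judgment is still yours.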
There are many options available for cloud-based backups, and they vary in complexity and cost based on the use cases for which they were designed. Backblaze’s personal backup service has flat-rate pricing and does not charge for re-downloading data; however, it’s only for desktops/laptops (not NAS devices) and any external drives you want to back up must remain connected on a continuous basis. On the other hand, you have pay-as-you-go services like Amazon Glacier, which charge a very small amount for each gigabyte you back up, but will charge you whenever a file changes (requiring a new upload) or whenever you need to download data. This can get very expensive if you’re not aware of what you might be doing to incur charges; however, for photographers with very large storage needs, services like Glacier can be very economical compared with other options. Bear in mind that with proper on-site data practices, you will hopefully never have to retrieve data from the off-site backups except in the event of a total catastrophe, in which case you may even be able to have the cost covered under “data recovery” as part of an insurance claim (but please check with your agent about that).
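Before committing to a pricing model, it helps to run the numbers for your own library. Here is a rough Python sketch of that comparison. Every rate in it is a made-up placeholder, not a real quote from any provider, so substitute the current figures from each service’s pricing page. It also ignores per-request and re-upload fees, which metered services may charge.

```python
def flat_rate_total(monthly_fee: float, months: int) -> float:
    """Total cost of a flat-rate plan over a number of months."""
    return monthly_fee * months


def metered_total(
    stored_gb: float,
    months: int,
    storage_rate: float,          # price per GB per month (placeholder)
    retrieved_gb: float = 0.0,
    retrieval_rate: float = 0.0,  # price per GB downloaded (placeholder)
) -> float:
    """Total cost of a pay-as-you-go plan: storage plus any retrieval."""
    storage = stored_gb * storage_rate * months
    retrieval = retrieved_gb * retrieval_rate
    return storage + retrieval


# Hypothetical scenario: a 4 TB library kept for one year.
no_restore = metered_total(4000, 12, storage_rate=0.004)
full_restore = metered_total(4000, 12, storage_rate=0.004,
                             retrieved_gb=4000, retrieval_rate=0.01)
flat = flat_rate_total(5.00, 12)
print(f"metered, never restored: ${no_restore:.2f}")
print(f"metered, one full restore: ${full_restore:.2f}")
print(f"flat rate: ${flat:.2f}")
```

The useful part is not the specific totals but seeing how quickly retrieval changes the picture for a metered service.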
Determine Your Individual Needs
When it comes to backup strategy, one size definitely does not fit all. Among other individual factors, the strategy that’s right for you will depend on the following:
Amount of acceptable downtime
This refers to both the time your computer is inoperable due to a failure and the amount of time you are unable to access data. If you have both a desktop and a laptop, and your desktop fails, you could in theory use the laptop while the desktop is repaired; your main priority would be maintaining access to your working files. In such a case, it’s worth considering cloning your main system drive to a drive that is reasonably close in speed. With the proper setup, you can plug the clone drive into your backup machine and actually boot from it for a virtually seamless experience (again, Windows folks, I imagine this is the case for you as well, but I’m not sure).
Note that this will not prevent any issues relative to the failure of your system as a whole if you do not have a backup machine; a clone is only functional if either the problem is localized to your main drive itself, or you have an alternate system you can use by booting directly from the external drive. Bear in mind that when cloning a drive, any files deleted from the source drive will also be deleted from the destination drive. While this solution protects against downtime, it does not protect against all perils associated with data loss.
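To see why a clone offers no protection against accidental deletion, consider this bare-bones mirroring routine in Python. It is a simplified stand-in of my own for what cloning software does (the `mirror` function is hypothetical and skips things like permissions, symlinks, and boot data): anything missing from the source is removed from the clone on the next run.

```python
import shutil
import tempfile
from pathlib import Path


def mirror(source: Path, clone: Path) -> None:
    """Make clone an exact replica of source, including deletions."""
    clone.mkdir(parents=True, exist_ok=True)
    # Step 1: copy everything that exists in the source.
    for src in source.rglob("*"):
        dst = clone / src.relative_to(source)
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
    # Step 2: remove whatever no longer exists in the source.
    # Reverse-sorted order visits children before their parent folders.
    for dst in sorted(clone.rglob("*"), reverse=True):
        if not (source / dst.relative_to(clone)).exists():
            if dst.is_dir():
                dst.rmdir()
            else:
                dst.unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        main_drive = Path(tmp) / "main"
        clone_drive = Path(tmp) / "clone"
        main_drive.mkdir()
        (main_drive / "portfolio.jpg").write_text("image data")
        mirror(main_drive, clone_drive)

        (main_drive / "portfolio.jpg").unlink()  # the accidental deletion
        mirror(main_drive, clone_drive)
        print("still on clone:", (clone_drive / "portfolio.jpg").exists())
```

After the second run the file is gone from both places, which is exactly the behavior a versioned backup is meant to guard against.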
Downtime is a consideration for any workflow, not just situations where cloning is involved. If a certain amount of downtime is acceptable to you, you can generally restore a backup from an external hard drive fairly quickly (once the proper repairs are made to your system). However, cloud backups can take many hours to restore, depending in part on the speed of your Internet connection. Backblaze’s solution to this is to provide the option to purchase a hard drive with your data already on it for faster restoration; however, you still have to wait for the drive to arrive before you can even get started. This is where fault tolerance can make all the difference. Instead of waiting for a lengthy restoration process, you can simply swap a bad drive for a new one and have it rebuild in the background.
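A quick back-of-the-envelope calculation shows how long a cloud restore might take. The Python sketch below assumes your connection sustains its advertised download speed for the whole transfer, which is optimistic, and uses decimal units (1 GB = 1,000 megabytes = 8,000 megabits). The function name is just for illustration.

```python
def restore_hours(data_gb: float, download_mbps: float) -> float:
    """Estimate hours needed to download data_gb at download_mbps."""
    if download_mbps <= 0:
        raise ValueError("download speed must be positive")
    megabits = data_gb * 1000 * 8      # gigabytes -> megabits
    seconds = megabits / download_mbps
    return seconds / 3600


for size_gb in (250, 1000, 4000):
    print(f"{size_gb} GB at 100 Mbps: about {restore_hours(size_gb, 100):.1f} hours")
```

Treat the result as a best case; real-world throughput, throttling, and the service’s own preparation time will all add to it.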
Versioning capabilities
How many times have you accidentally saved over the previous version of an image when you meant to save a new copy? Or perhaps you made a destructive change to a document that you later regretted? Assuming you’ve been working on the image for a while, you may be able to access and recover prior versions of files stored in Time Machine or other services. Versioning isn’t just about accidental changes, either. If a file becomes corrupted, you can generally recover the last intact version from an earlier backup. This feature has saved me more than once, and it’s worth considering when determining your backup strategy. If the Time Machine volume is properly encrypted, it should be resistant to viruses and other nefarious activity to some degree as well.
Budget and time considerations
Apart from a slim group of folks who simply don’t think they’d be upset at losing all their data, most people I talk to who do not have a solid backup workflow in place believe the process is either too expensive or too time-consuming (or both). But backup workflows are not a one-size-fits-all situation. My mother recently asked me for advice on backing up the modest amount of data she keeps on her laptop, but recoiled at the idea of paying for a monthly subscription or even investing in a decent external drive. My recommended solution for her was to buy two decently sized USB flash drives and to periodically copy all her files to them. She sees one of her friends about once a month, so her friend keeps one USB drive at her house and my mother trades her for the less updated one when they see each other. Strictly speaking, this satisfies the 3-2-1 requirement, given that her friend lives far enough away that they would not likely be subject to simultaneous natural disasters. For whatever reason, my mother is more comfortable with this method than paying anything for a cloud backup service, and more importantly, is willing to commit to keeping up with it. For me, this solution would be incredibly annoying, never mind woefully inadequate. But at the end of the day, the backup solution that is right for you is the one you will keep up with. There are trade-offs at each step. We tell our clients, “Good, fast, cheap: pick two.” Something similar could be said for backups: “Robust, automatic, cheap: find a balance among the three.”
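Even a manual routine like my mother’s can be made a little more trustworthy with a short script. The Python sketch below copies a folder to a dated folder on a removable drive and then compares checksums to confirm each file arrived intact. The function names, the `backup-<date>` folder naming, and the paths in the usage comment are all hypothetical.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_drive(source: Path, drive: Path, stamp: str) -> List[Path]:
    """Copy source to drive/backup-<stamp>; return files that failed to verify."""
    target = drive / f"backup-{stamp}"
    shutil.copytree(source, target)  # raises if this dated folder already exists
    mismatched = []
    for src in source.rglob("*"):
        if src.is_file():
            dst = target / src.relative_to(source)
            if not dst.is_file() or sha256_of(src) != sha256_of(dst):
                mismatched.append(src)
    return mismatched


# Example usage (paths are placeholders for your own folders):
# failed = copy_to_drive(Path.home() / "Documents", Path("/Volumes/USB-A"),
#                        date.today().isoformat())
# print("all files verified" if not failed else f"problems: {failed}")
```

Because each run writes to a new dated folder, older copies are left alone until you delete them yourself, which gives you a crude form of versioning for free.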
A Simple Backup Workflow Scenario
If you’re not already backing up, don’t let your current situation stop you from at least getting something in place to protect your most critical files. To get started at a basic level, all you need is a blank external hard drive, an Internet connection, and a subscription to a cloud backup service (like Backblaze). The drive will likely cost about $100 USD and Backblaze is less than $5 USD per month if you’re willing to pre-pay your subscription.
Today, most of us work with an SSD (solid-state drive) as our main drive. These drives are generally much faster than the older mechanical hard disk drives (HDDs), but they come at a higher price and often lower storage capacity than HDDs. Since backup drives don’t really need to be fast, HDDs are a great option, especially if you plan to store multiple versions of files on the backup drive, or if you have an additional external drive you want to back up. As to how to manage the backup process, if you’re on a Mac, I’d recommend Time Machine as the simplest solution. I haven’t used Windows in years, but I believe it also has a built-in backup manager that will at least automate the process. Meanwhile, your computer (and any consistently connected external drives) will back up to the cloud. This whole process should be, for the most part, transparent as well as easy on your system resources.
The biggest limitation you will experience with this scenario is if you have a lot of data already stored on external drives. If this is the case, you’ll likely run out of space on your backup drive. Further, while you can set Backblaze to back up external drives (local drives only; backing up NAS devices is not included), Backblaze will purge the backed-up data for any drive that has been disconnected for more than 30 days, meaning that any external drives you want to back up to Backblaze must remain plugged into your computer on an essentially continuous basis. You can see that this simple workflow becomes problematic or falls apart completely when a lot of data is introduced, although if you are willing to make the investment, you could technically use Backblaze to back up even a very large RAID setup, so long as it’s consistently connected locally (not as a network volume). The other main consideration is that this is not a “bootable” solution in the event your main system drive fails. You will have to replace the system drive before you can get back to work.
However simple and somewhat limited, this solution is the most cost-effective, satisfies the 3-2-1 requirements, and provides at least some protection against every peril mentioned at the beginning of this article.
What is your current backup strategy? Will you be making any changes going forward? Any questions on what’s covered here? Let us know in the comments below.