Saving some of your data to a cloud storage system like SkyDrive, Google Drive, or Dropbox is a good way to have that data easily available from multiple systems. Free cloud storage that lets you back up your data at no charge is an awesome and generous service. Yet there are risks that we do well to consider. Those risks determine what you store on the cloud and when you should or should not use the cloud as a replacement for your hard drive. This is a continuation of my other post, Use The Cloud But Keep A Local Copy, in which I talk more briefly about a real-world scenario involving the cloud.
Time can seem to pause and your heart can pound when you realize you have lost data. Co-creator of the Internet, Vint Cerf, has let everyone know that even though you have the data, you may not be able to read that data. You back up your files, year after year, and one day you try to access them and are surprised to find that you cannot read those files anymore. The reason is that the computer programs or apps you used to create and view the files have evolved in a way that makes your files obsolete. As Vint Cerf shows, it happens.
Read-only Archival File Format
If you never need to change the file again, save it as a PDF. The Adobe PDF format is enough of a de facto standard that PDF files hold up very, very well. Even if the information changes, you can save snapshots of the data as PDF. This will stand you in good stead. Make sure the PDF is saved in a generic format like PDF/A with no custom settings.
PDF is good for snapshots, but what about the source files for those PDFs, and what about things that are not written but visual? Use the most open formats you can find that are supported everywhere. Open source software can live forever, and the method it uses for reading and writing files can be inspected in the source code. With open source, someone out there can always recover the method for deciphering an open format. Great formats to use are:
Documents, Spreadsheets, Presentations
LibreOffice using the Open Document Format
Photos from Camera or Post Processed
Special Mention about Plain-Text
A plain-text file is the most universal file format you can use but only if you take care. A plain-text file can be the most long-lived format you can use. They can also get corrupted so that they are not immediately readable. Even when this happens, you can still recover most or all of the information from plain-text files where it would be less feasible with other formats.
- Try to use the ASCII character set for plain-text files.
- ASCII is among the oldest and surest character sets to recover.
- Most recovery tools, even advanced ones, may become confused by Unicode where ASCII will work.
- Verify ASCII files whenever backed up.
- A second copy of ASCII files into PDF can improve recovery efforts in the future.
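One way to act on the "verify ASCII files" advice above is a small check that runs before each backup. This is only a sketch: the function names `find_non_ascii` and `is_ascii_file` are made up for illustration, and the check treats any byte above 0x7F as non-ASCII.

```python
def find_non_ascii(path):
    """Return (line_number, byte_offset_in_line) for every non-ASCII byte."""
    problems = []
    # Read raw bytes so that no decoding step can hide a bad character.
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            for offset, byte in enumerate(line):
                if byte > 0x7F:  # ASCII only uses the values 0 through 127
                    problems.append((line_number, offset))
    return problems


def is_ascii_file(path):
    """True when the file contains nothing but ASCII bytes."""
    return not find_non_ascii(path)
```

Calling `is_ascii_file("notes.txt")` on a hypothetical file before copying it to the backup drive tells you whether it is safe; `find_non_ascii` points at the lines that need attention.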
Although databases seem complex, they can always be exported to plain text. Put those ASCII files in a standardized, compressed file format. Accompany the compressed data with a document on how to extract the files back into a database structure and all is well. A delimited file can always be refitted into a high-speed binary format. The reverse is less feasible, and it becomes a real problem if you have grown too complacent with binary formats.
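As a rough illustration of that export, here is a sketch that assumes the database is SQLite (my assumption, chosen only because Python ships with a SQLite driver). The function name `export_database` and the archive layout (one CSV per table, plus `schema.sql` and `README.txt`) are hypothetical choices, not a standard.

```python
import csv
import io
import sqlite3
import zipfile


def export_database(db_path, archive_path):
    """Write each table to a CSV file inside one ZIP archive."""
    connection = sqlite3.connect(db_path)
    try:
        # List user tables with the CREATE statements SQLite stored for them.
        tables = connection.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        ).fetchall()
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for name, _ in tables:
                # Assumes table names contain no double quotes.
                cursor = connection.execute(f'SELECT * FROM "{name}"')
                buffer = io.StringIO(newline="")
                writer = csv.writer(buffer)
                writer.writerow([column[0] for column in cursor.description])
                writer.writerows(cursor)
                # Text is stored as UTF-8, which is identical to ASCII
                # whenever the data itself is pure ASCII.
                archive.writestr(f"{name}.csv", buffer.getvalue())
            # The schema doubles as the instructions for rebuilding the tables.
            archive.writestr(
                "schema.sql", "\n".join(f"{sql};" for _, sql in tables) + "\n"
            )
            archive.writestr(
                "README.txt",
                "Run schema.sql to recreate the tables, then load each\n"
                "<table>.csv file (first row is the column names).\n",
            )
    finally:
        connection.close()
    return [name for name, _ in tables]
```

ZIP is used here because it is widely supported; any other standardized compression format would serve the same purpose.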
You can store your data permanently on GitHub or Amazon. If the data is not sensitive, those are good options for long-term storage. But something can happen that cuts off your access to those services, so local storage will be more dependable. The most reliable storage technology is actual hard drives. As long as you make duplicate backups onto multiple hard drives and swap those drives out every few years, you will have durable access to the data. Make sure you power down the drives properly and that the data has been fully written to the drives.
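A minimal sketch of the "duplicate backups, fully written" idea: copy a file to several destinations, ask the operating system to flush each copy to the drive, then compare checksums. The names `sha256_of` and `backup_and_verify` are hypothetical, and the read-back check may be served from the operating system's cache, so it is a sanity check rather than proof that the drive itself is healthy.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, destinations):
    """Copy source to each destination path and report whether each copy matches."""
    expected = sha256_of(source)
    results = {}
    for destination in destinations:
        with open(source, "rb") as src, open(destination, "wb") as dst:
            shutil.copyfileobj(src, dst)
            dst.flush()             # push Python's buffer to the OS
            os.fsync(dst.fileno())  # ask the OS to write it to the device
        results[destination] = sha256_of(destination) == expected
    return results
```

In practice each destination would be a path on a different physical drive. Any entry that comes back `False` means that copy should be redone before the drive is powered down.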
It may not seem like it now, but Google does fail. Many depend on Google for data storage and services and they are usually accessible without difficulty. One day, Google may discover a way to operate without error but Google did fail in the past. They still do very, very well.
Latency and Presence
Today, no company can afford to have computers in every backyard. If they did, they wouldn’t really be centralized. Part of the appeal of the cloud is its centralized nature. That is also its Achilles’ heel.
The central servers are likely many, many miles from where you are. Any hiccup along the way means problems accessing data and information. Access to the information will either be too slow or blocked altogether.
Hard drives are the most reliable way to store information available to most people. Tape drives may have even greater reliability but may vary more widely in cost and standardization. Either way, direct storage is the most sound way to keep information close by and accessible. Cloud storage can be more convenient when communication technology is running at its prime but at other times it can become unproductive and unsuitable for sensitive information. There is a place for cloud storage but local storage, properly used, is the safer bet.
It has been mentioned that the State of California may pursue encryption requirements for computer technology. I believe this is intended for the protection of consumer information. The benefit of this occurring in California is that it is where most of the high-profile Internet companies are based. Any guidance that applies to California in terms of technology will have the most significant impact on computers and the Internet. Under the effective leadership of California Attorney General Kamala Harris, the nation’s users of computer services over the Web stand to benefit greatly.