Having stored information in a particular data format, how do we get it back out again? How easy is it to access the data? The answer naturally depends on which data format we are dealing with.
For data stored in plain text files, it is very easy to find software that can read the files, although the software may need additional information about the structure of the files, that is, where the data values reside within each file.
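As a minimal sketch of what "additional information about the structure" can mean in practice, the Python example below reads a small piece of comma-separated text. The data values, the column names, and the choice of a comma as the delimiter are all assumptions made for illustration; nothing in the text itself says what the columns mean, so we have to supply that information ourselves.

```python
import csv
import io

# Hypothetical file contents: the text alone does not say what each column means.
raw_text = "1,Auckland,15.2\n2,Wellington,12.9\n"

# The structure is supplied by us, not by the file:
# the delimiter and the column names are assumptions for this example.
column_names = ["id", "city", "temperature"]
reader = csv.DictReader(io.StringIO(raw_text),
                        fieldnames=column_names,
                        delimiter=",")
rows = list(reader)

# Every value arrives as text, so numeric columns must be converted explicitly.
temperatures = [float(row["temperature"]) for row in rows]
print(temperatures)
```

Reading the text is easy with standard tools, but the column names, the delimiter, and the numeric conversion all had to come from outside the file.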
XML, a more formal standard, contains explicit information about structure (XML is self-describing), so it does not have this problem. However, XML carries an additional burden because we need software that understands XML syntax.
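The following sketch illustrates both points, using the XML parser in Python's standard library as the "software that understands XML syntax". The element name `reading`, the attribute name `city`, and the data values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the labels travel with the data.
xml_text = """
<temperatures>
  <reading city="Auckland">15.2</reading>
  <reading city="Wellington">12.9</reading>
</temperatures>
""".strip()

# The parser handles the XML syntax; we only name the elements we want.
root = ET.fromstring(xml_text)
readings = {element.get("city"): float(element.text)
            for element in root.findall("reading")}
print(readings)
```

Here the structure is stated inside the document itself, so no separate description of column positions is needed, but extracting the values does depend on having an XML parser available.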
For data stored in binary files, the main problem is finding software that is designed to read the specific binary format. Once we have that software, it does all of the work of extracting the appropriate data values. This is an all-or-nothing scenario: either we have software to read the file, in which case data extraction is trivial, or we do not, in which case we can do nothing. This scenario includes data stored in spreadsheets, though in that case the likelihood of having appropriate software is much higher.
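A small sketch can make the all-or-nothing nature of binary data concrete. The record layout used here (a 4-byte integer followed by an 8-byte floating-point value, both little-endian) is a hypothetical format invented for this example, not a real file format.

```python
import struct

# Hypothetical binary layout: little-endian 4-byte int, then 8-byte double.
record_format = "<id"

# Create one binary record so the example is self-contained.
raw_bytes = struct.pack(record_format, 1, 15.2)
record_size = len(raw_bytes)

# With knowledge of the format, extracting the values is a single call.
station_id, temperature = struct.unpack(record_format, raw_bytes)
print(station_id, temperature)

# Without that knowledge, all we have is an opaque sequence of bytes.
print(raw_bytes.hex())
```

The same bytes are either trivially readable or effectively meaningless, depending on whether the reader knows the format.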
When information is stored in a database, we again require specific software to access the data. However, the situation is not as bad as it is for binary formats because there is a common interface shared by all database management systems: a common language that allows us to extract data from a database, no matter which database management system is being used. This language, the Structured Query Language (SQL), is the focus of this chapter.
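To give a flavor of what such a query looks like, here is a minimal sketch that uses SQLite through Python's built-in `sqlite3` module. The choice of SQLite, the table name `readings`, its columns, and the data values are assumptions made purely for illustration. A query this simple would be expected to look the same on other database management systems, although in practice each system also has its own dialect differences.

```python
import sqlite3

# An in-memory SQLite database stands in for any database management system.
connection = sqlite3.connect(":memory:")

# Hypothetical table and data, created only so that there is something to query.
connection.execute("CREATE TABLE readings (city TEXT, temperature REAL)")
connection.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("Auckland", 15.2), ("Wellington", 12.9)],
)

# The query itself is plain SQL: it says which data we want,
# not how the database should go about finding it.
query = "SELECT city, temperature FROM readings WHERE temperature > 13"
result = connection.execute(query).fetchall()
print(result)

connection.close()
```

Only the connection step is specific to SQLite; the `SELECT` statement is the part that carries over from one system to another.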
Paul Murrell
This document is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.