YBC 7289.
Photo by Bill Casselman.^{7.1}
Some of the earliest known examples of recorded information come from Mesopotamia, which roughly corresponds to modern-day Iraq, and date from around the middle of the fourth millennium BC, which is quite a long time ago. The writing is called cuneiform, which refers to the fact that marks were made in wet clay with a wedge-shaped stylus.
A particularly famous mathematical example of cuneiform is the clay tablet known as YBC 7289.
This tablet is inscribed with a set of numbers using the Babylonian sexagesimal (base-60) system. In this system, a symbol resembling a less-than sign (we will use <) represents the value 10 and a symbol resembling a tall narrow triangle, with the tip pointing down, represents the value 1 (we will use |). For example, the value 30 is written (roughly) like this: <<<. This value can be seen in the top-left “corner” of YBC 7289 (see Figure 7.1).
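The reading rule for a single sexagesimal digit can be sketched in a few lines of Python. This is only an illustration: the function name `decode_digit` is hypothetical, and the use of `|` as the stand-in for the 1 mark is an assumption of this sketch.

```python
# Stand-in characters for the two cuneiform marks.
# '<' is the 10 mark; '|' is an assumed stand-in for the 1 mark.
MARK_VALUES = {"<": 10, "|": 1}

def decode_digit(marks):
    """Return the value of one sexagesimal digit written with stand-in marks."""
    return sum(MARK_VALUES[mark] for mark in marks)

print(decode_digit("<<<"))     # 30
print(decode_digit("<<||||"))  # 24
```

Each digit is simply the sum of its marks, so a digit can take any value from 1 to 59.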

The number along the central horizontal line on YBC 7289 has four digits: |, which is 1; <<||||, which is 24; <<<<<|, which is 51; and <, which is 10. Historians have determined that there is an unwritten “decimal place” after the 1, and this means that the decimal value of this number is 1.41421296 (to 9 significant decimal digits), which, for around 1600 BC, is ridiculously close to the true value of the length of the diagonal of a unit square ($\sqrt{2} \approx 1.41421356$).
The value at the bottom of the tablet has three digits (42, 25, and 35) and corresponds to the value 42.4263889, which approximates $30\sqrt{2}$ (i.e., the length of the diagonal for a square with sides of length 30).
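The arithmetic behind these decimal values can be checked with a short Python sketch. The function name `sexagesimal_to_decimal` is hypothetical, and the sketch assumes, as described above, that the sexagesimal point falls after the first digit.

```python
import math

def sexagesimal_to_decimal(digits):
    """Convert base-60 digits to a decimal value.

    Assumes the sexagesimal point falls after the first digit, so the
    digits are weighted by 1, 1/60, 1/3600, 1/216000, and so on.
    """
    return sum(digit / 60**place for place, digit in enumerate(digits))

diagonal = sexagesimal_to_decimal([1, 24, 51, 10])
bottom = sexagesimal_to_decimal([42, 25, 35])

print(round(diagonal, 8))      # 1.41421296
print(round(math.sqrt(2), 8))  # 1.41421356
print(round(bottom, 7))        # 42.4263889
```

The tablet's value for the diagonal differs from the square root of 2 by less than one part in a million.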
What we are going to do with this remarkable piece of mathematical history is to treat it as information that needs to be stored electronically.
The choice of a clay tablet for recording the information on YBC 7289 was obviously a good one in terms of the durability of the storage medium. Very few electronic media today have an expected lifetime of several thousand years. However, electronic media do have many other advantages.
The most obvious advantage of an electronic medium is that it is very easy to make copies. The curators in charge of YBC 7289 would no doubt love to be able to make identical copies of such a precious artifact, but truly identical copies are only possible for electronic information.
This leads us to the problem of how we produce an electronic record of the tablet YBC 7289. A straightforward approach would be to write a simple textual description of the tablet.
YBC 7289 is a clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
This approach has the advantages that it is easy to create and it is easy for a human to access the information. However, when we store electronic information, we should also be concerned about whether the information is easily accessible for computer software. This essentially means that we should supply clear labels so that individual pieces of information can be retrieved easily. For example, the label of the tablet is something that might be used to identify this tablet from all other cuneiform artifacts, so the label information should be clearly identified.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
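To see why labels help software, here is a minimal Python sketch that retrieves a piece of information by its label. The function name `parse_record` is hypothetical, and the sketch assumes each labelled field sits on its own line with the label ending at the first colon.

```python
record_text = """\
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
"""

def parse_record(text):
    """Split 'label: value' lines into a dictionary keyed by label."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields

record = parse_record(record_text)
print(record["label"])  # YBC 7289
```

Without the labels, a program would have to guess which part of the free text identifies the tablet.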
Thinking about what sorts of questions will be asked of the data is a good way to guide the design of data storage. Another sort of information that people might go looking for is the set of cuneiform markings that occur on the tablet.
The markings on the tablet are numbers, but they are also symbols, so it would probably be best to record both numeric and textual representations. There are three sets of markings, and three values to record for each set; a common way to record this sort of information is with a row of information per set of markings, with three columns of values on each row.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.

<<<                        30            30
| <<|||| <<<<<| <          1 24 51 10    1.41421296
<<<<|| <<||||| <<<|||||    42 25 35      42.4263889
When storing the lines of symbols and numbers, we have spaced out the information so that it is easy, for a human, to see where one sort of value ends and another begins. Again, this information is even more important for the computer. Another option is to use a special character, such as a comma, to indicate the start/end of separate values.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
values:
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
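A comma-delimited layout like this is straightforward for software to read. The following Python sketch uses the standard `csv` module. It assumes the values block has one row per line, uses `|` as a stand-in for the 1 mark, and takes the decimal value of the second row to be 1.41421296, the value implied by the digits 1 24 51 10.

```python
import csv
import io

# The values block, one row per line (an assumed layout for this sketch).
values_text = """\
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
"""

# DictReader uses the first line as column names.
rows = list(csv.DictReader(io.StringIO(values_text)))

for row in rows:
    digits = [int(d) for d in row["sexagesimal"].split()]
    print(digits, float(row["decimal"]))
```

Because the comma separates columns, the spaces inside the cuneiform and sexagesimal columns can safely separate digits within a single value.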
Something else we should add is information about how the values relate to each other. Someone who is unfamiliar with Babylonian history may have difficulty realising that the three values on each line actually correspond to each other. This sort of encoding information is essential metadata: information about the data values.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
encoding: In cuneiform, a '<' stands for 10 and a '|' stands for 1. Sexagesimal values are base 60, with a sexagesimal point after the first digit; the first digit represents ones, the second digit is sixtieths, the third is three-thousand six-hundredths, and the fourth is two-hundred-and-sixteen-thousandths.
values:
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
The position of the markings on the tablet, and the fact that there is also a square, with its diagonals inscribed, are all important information that contribute to a full understanding of the tablet. The best way to capture this information is with a photograph. In many fields, data consist not just of numbers, but also pictures, sounds, and video. This sort of information creates additional files that are not easily incorporated together with textual or numerical data. The problem becomes not only how to store each individual representation of the information, but also how to organise the information in a sensible way.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
photo: ybc7289.png
encoding: In cuneiform, a '<' stands for 10 and a '|' stands for 1. Sexagesimal values are base 60, with a sexagesimal point after the first digit; the first digit represents ones, the second digit is sixtieths, the third is three-thousand six-hundredths, and the fourth is two-hundred-and-sixteen-thousandths.
values:
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
Information about the source of the data may also be of interest. For example, the tablet has been dated to sometime between 1800 BC and 1600 BC. Little is known of its rediscovery, except that it was acquired in 1912 AD by an agent of J. P. Morgan, who subsequently bequeathed it to Yale University. This sort of metadata is easy to record as a textual description.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
photo: ybc7289.png
medium: clay tablet
history: Created between 1800 BC and 1600 BC, purchased by J.P. Morgan 1912, bequeathed to Yale University.
encoding: In cuneiform, a '<' stands for 10 and a '|' stands for 1. Sexagesimal values are base 60, with a sexagesimal point after the first digit; the first digit represents ones, the second digit is sixtieths, the third is three-thousand six-hundredths, and the fourth is two-hundred-and-sixteen-thousandths.
values:
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
The YBC in the tablet's label stands for the Yale Babylonian Collection. This tablet is just one item within one of the largest collections of cuneiforms in the world. In other words, there are a lot of other sources of data very similar to this one. This has several implications for how we should store information about YBC 7289. First of all, we should store the same sort of information as is stored for other tablets in the collection so that, for example, a researcher can search for all tablets created in a certain time period. We should also think about the fact that some of the information that we have stored for YBC 7289 is very likely to be in common with all items in the collection. For example, the explanation of the sexagesimal system will be the same for other tablets from the same era. With this in mind, it does not make sense to record the encoding information for every single tablet. It would make sense to record the encoding information once, perhaps in a separate file, and just refer to the appropriate encoding information within the record for an individual tablet.
label: YBC 7289
description: A clay tablet with various cuneiform marks on it that describe the relationship between the length of the diagonal and the length of the sides of a square.
photo: ybc7289.png
medium: clay tablet
history: Created between 1800 BC and 1600 BC, purchased by J.P. Morgan 1912, bequeathed to Yale University.
encoding: sexagesimal.txt
values:
cuneiform,sexagesimal,decimal
<<<,30,30
| <<|||| <<<<<| <,1 24 51 10,1.41421296
<<<<|| <<||||| <<<|||||,42 25 35,42.4263889
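One way software might follow such a reference is sketched below in Python. The function name `load_encoding`, the temporary directory, and the shortened encoding text are all assumptions made for illustration; only the file name sexagesimal.txt comes from the record above.

```python
import tempfile
from pathlib import Path

# Shortened encoding text, used here only to populate the shared file.
ENCODING_TEXT = "In cuneiform, a '<' stands for 10 and a '|' stands for 1."

def load_encoding(record, base_dir):
    """Read the shared encoding file named in a tablet record."""
    return (Path(base_dir) / record["encoding"]).read_text(encoding="utf-8")

with tempfile.TemporaryDirectory() as tmp:
    # Store the encoding information once, in a separate file.
    (Path(tmp) / "sexagesimal.txt").write_text(ENCODING_TEXT, encoding="utf-8")

    # A tablet record only needs to name the file it depends on.
    record = {"label": "YBC 7289", "encoding": "sexagesimal.txt"}
    encoding = load_encoding(record, tmp)

print(encoding)
```

If the explanation of the sexagesimal system ever needs correcting, it is corrected in one file rather than in every tablet record.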
These are just a couple of the possible text representations of the data. Another whole set of options to consider is binary formats; for example, the photograph and the text and numeric information could all be included in a single file. The most likely solution in practice is that this information resides in a database that describes the entire Yale Babylonian Collection.
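As a rough illustration of the database option, the following Python sketch uses the standard `sqlite3` module with an in-memory database. The table names, column names, and layout are hypothetical; nothing here describes how the Yale Babylonian Collection is actually catalogued.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical schema: encoding information is stored once and
# each tablet refers to it by id.
con.executescript("""
CREATE TABLE encoding (
    id          INTEGER PRIMARY KEY,
    name        TEXT,
    description TEXT
);
CREATE TABLE tablet (
    label       TEXT PRIMARY KEY,
    medium      TEXT,
    photo       TEXT,
    encoding_id INTEGER REFERENCES encoding(id)
);
""")

con.execute(
    "INSERT INTO encoding VALUES (?, ?, ?)",
    (1, "sexagesimal", "Base 60, with a sexagesimal point after the first digit."),
)
con.execute(
    "INSERT INTO tablet VALUES (?, ?, ?, ?)",
    ("YBC 7289", "clay tablet", "ybc7289.png", 1),
)

# Look up a tablet together with the name of its encoding.
row = con.execute(
    """SELECT t.label, e.name
       FROM tablet AS t JOIN encoding AS e ON e.id = t.encoding_id
       WHERE t.label = ?""",
    ("YBC 7289",),
).fetchone()
print(row)
```

The photograph is still stored as a separate file in this sketch; the database only records its file name.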
This chapter will look at the decisions involved in choosing a format for storing information. We will discuss a number of standard data storage formats and acquire the technical knowledge to be able to work with the different formats.
Paul Murrell
This document is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.