Numerical copyediting follies

A short time ago, I was reading an otherwise pretty good article when I encountered a sentence that was confusing at first:

Large numbers are written in a variety of formats: In English, numbers may be represented as numerals (5,000), as numbers words (5,000), or in what we might call the hybrid system (5,000).

It took me a moment to realize, with a profound sadness that only a professional number guy like myself can appreciate, that what was meant was:

Large numbers are written in a variety of formats: In English, numbers may be represented as numerals (5,000), as numbers words (five thousand), or in what we might call the hybrid system (5 thousand).

No doubt the sentence read properly at first, but then an overeager copyeditor (or perhaps an automated copyediting system?) got hold of the sentence and converted it into house style for the journal, which rendered the entire thing completely meaningless.  Probably the authors should have caught it at proofs, but they didn’t, and there you go.  (I also think numbers words should be number words, but who am I to complain?  Oh, right, almost forgot there for a moment.)

One of the perils of working in a field like numerals is that every journal and publisher has a house style, and numerals are one of those things that authors often throw about casually, thus requiring some attention from editorial staff.  The problem is that when the subject of your research is numerals, you can’t rely on a house style or intuition to figure out what to do.   So, for instance, just to take a very basic example, it’s weird to say that V is the Roman numeral equivalent of five; it is the equivalent of 5.  But almost no style guide permits free-standing numerals less than 10 to be written in numerical notation.

Now, I should say that I have had very positive experiences with copyeditors in general.  The copyeditors who worked on Numerical Notation, in particular, were absolutely superb.   I did, however, write an extensive page-long memo to my main copyeditor, complete with acceptable and unacceptable sample sentences, with all of the little exceptions and pedantries that would have taken hours to undo.   I like to think that they appreciated it but I leave open the possibility that they thought I was an entitled prick.  I also had to contend with around 30 fonts, most of my own creation, with various numerical signs.  I could write a long post about my process for creating numerical fonts, but I fear it would be even more boring than a post about copyediting goofs.  Anyway, the result is fantastic and I have found very, very few editorial issues in the book.

But the authors above should take heart: it can happen to anyone. Around 2002, when I was finishing up my PhD, I worked as a research assistant for my supervisor, Bruce Trigger. One day Bruce recounted to me the most remarkable thing about the book he was working on at the time. Wherever he had written ‘one million’, he got back a version that said ‘ten lacks’, and wherever he had ‘two million’, it said ‘twenty lacks’, and so on. At first he was extremely confused, and when he got to the bottom of it, it turned out that, like so many things, the copyediting for the book had been outsourced to an Indian firm. In Indian English, the word ‘lack’ or ‘lakh’ (borrowed from Hindi) is frequently employed to mean ‘100,000’, because, as in Hindi, Indian English has a special power term for every other power of ten above 1,000 (thousand, lack = 100K, crore = 10 million), where American English has one for every third power (thousand, million, billion), and of course British English is even more irregular. In this case the error was caught and fixed. It serves, though, as an object lesson that I tell to students today, about mixed languages and Global English and weird numerical anomalies. It’s a reminder that in the world-system, sometimes the periphery strikes back.
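For the programmers in the audience, here is a minimal sketch of the two grouping conventions just described. The function names are my own invention for illustration, and it assumes non-negative whole numbers only; it is not meant as a full localization routine.

```python
def group_western(n: int) -> str:
    """Group digits in threes from the right: 1000000 -> '1,000,000'."""
    return f"{n:,}"


def group_indian(n: int) -> str:
    """Group the last three digits, then pairs: 1000000 -> '10,00,000'."""
    digits = str(n)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    pairs = []
    # Peel off two digits at a time from the right of what remains.
    while len(head) > 2:
        pairs.insert(0, head[-2:])
        head = head[:-2]
    pairs.insert(0, head)
    return ",".join(pairs + [tail])


def in_power_terms(n: int) -> str:
    """Use the largest of crore / lakh / thousand that divides n evenly;
    otherwise fall back to Indian-style digit grouping."""
    for name, value in (("crore", 10**7), ("lakh", 10**5), ("thousand", 10**3)):
        if n >= value and n % value == 0:
            return f"{n // value} {name}"
    return group_indian(n)


print(group_western(1_000_000))   # 1,000,000
print(group_indian(1_000_000))    # 10,00,000
print(in_power_terms(1_000_000))  # 10 lakh
print(in_power_terms(2_000_000))  # 20 lakh
```

Seen this way, ‘ten lacks’ for ‘one million’ is not an error at all, just a different set of power terms applied to the same quantity.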


Octothorpe, quadrathorpe, bithorpe

Since I am, both by vocation and avocation, a word guy, it’s pretty rare for me to learn new English words.  Since I am, in particular, a number words guy, it is especially rare for me to learn new English numerical words (my personal all-time favourites are tolfraedic and zenzizenzizenzic, for the record).  So imagine my surprise upon reading the latest post from the fantastic Shady Characters blog on punctuation to encounter the word bithorpe, and then after some searching, its cousin quadrathorpe, both of which were new to me.

You won’t find either of these in any dictionary, but you will find them in dark corners of the Internet. You will find octothorpe (also spelled octalthorpe and octothorp), however, a word that emerged from the folks at Bell Labs in the late 60s / early 70s to refer to the sign #, known to most as pound or number sign or hash(tag). No one is really clear on its etymology, as there are a number of unconvincing competing theories, but it’s reasonably clear that the ‘octo’ is supposed to represent the eight points on the ends of the four lines. And thus, by jocular extension, a quadrathorpe is an equals sign (half an octothorpe) and a bithorpe is a hyphen, with four and two endpoints respectively.

Hoping to procrastinate from other, more important things, I spent some time this afternoon poking around on the origin of these strange terms, and the earliest I could find is this Usenet post from the group misc.misc from April 1989 (i.e., several years before most of us even had email and two years before Al Gore created the internet).      Since this list was composed from the results of a survey, someone obviously coined them (in jest) before that time, but probably not much before.    This list appears to have spawned many copies (some exact, others less so), almost all of which reproduce the rhetorical (possibly unanswerable) parenthetical question, “So what’s a monothorpe?”

Coexistence and variation in numerals and writing systems

Well, it only took about 20 minutes for Dan Milton to solve the mystery of the Egyptian stamp: it has four distinct numerical notation systems on it: Western (Hindu-Arabic) numerals, Arabic numerals, Roman numerals, and most prominently but obscurely, the ‘Eye of Horus’ which served, in some instances, as fractional values in the Egyptian hieroglyphs:

At the time I posted it this morning, it was the only postage stamp I knew of to contain four numerical notation systems. (As Frédéric Grosshans quickly noted, however, a few of the stamps of the Indian state of Hyderabad from the late 19th century contain Western, Arabic, Devanagari, and Telugu numerals, and also meet that criterion, although all four of those systems are closely related to one another, whereas the Roman numerals and the Egyptian fractional numerals are not closely related to the Western or Arabic systems.) So that’s kind of neat. I have a little collection of stamps with weird numerical systems (like Ethiopic or Brahmi), multiple numeral systems (like the above), unorthodox Roman numerals (Pot 1999), etc., and am looking to expand it, since it is a fairly delimited set and, as a pretty odd basis for a collection, isn’t going to break the bank. In case I have any fans who are looking for a cheap present for me. Just sayin’ …

We in the West tend to take for granted, today, that really there is only one numerical system worthy of attention, the Western or Hindu-Arabic system, which is normatively universal and standardized throughout the world.  We also tend to feel the same way about, for instance, the Gregorian calendar.   That’s a little sad but not that surprising.   But we also take it for granted that, in general, throughout history, each speech community has only one set of number words, one script, and one associated numerical notation system.    Of course, a moment’s reflection shows us this isn’t true: virtually any academic book still has its prefatory material paginated in Roman numerals, not to mention that we use Roman numerals for enumerating things we consider important or prestigious, like kings, popes, Super Bowls, and ophthalmological congresses.   And this is not to mention other systems like binary, hexadecimal, or the fascinating colour-based system for indicating the resistance value of resistors.    I’ve complained elsewhere that we put too much emphasis on comparing one system’s structure negatively against another, but to turn it around, we should ask what positive social, cognitive, or technical values are served by having multiple systems available for use.

We need to be more aware that the simultaneous use of multiple scripts and multiple numeral systems in a given society is not particularly anomalous. In Numerical Notation (Chrisomalis 2010), I structured the book system by system, rather than society by society, which helps to outline the structure and history of each individual representational tradition, and to organize them into phylogenies or families. But one of the potential pitfalls of this approach is that it de-emphasizes the coexistence of systems and their use by the same individuals at the same time, under-stressing how these are actually used and how often they overlap. Just as sociolinguists have increasingly recognized the value of register choice within speech communities, we ought to think about script choice (Sebba 2009) in the same way. With numerals, we also have the choice to not use number symbols at all but instead to write them out lexically, which then raises further questions (is it two thousand thirteen or twenty thirteen?); many languages also have parallel numeral systems (Ahlers 2012; Bender and Beller 2007). We need to get over the idea that it is natural or good or even typical for a society to have a single language with a single script and a single numerical system, because in fact that’s the exception rather than the norm.

The stamp above is a quadrilingual text (French, Arabic, Latin, Egyptian) in three scripts (Roman, Arabic, hieroglyphic) and four numerical notations (Western, Arabic, Roman, Egyptian).    We should think about the difficulty of composing and designing such a linguistically complex text – it really is impressive in its own right.  We should also reflect on the social context in which the language of a colonizer (French), the language of the populace (Arabic), and two consciously archaic languages (Latin and Egyptian), and their corresponding notations, evoke a complex history in a single text.  Once we start to become aware of the frequency of multiple languages, scripts, and numeral systems within a single social context, we have taken an important step towards analyzing social and linguistic variation in these traditions.


Ahlers, Jocelyn C. 2012. “Two Eights Make Sixteen Beads: Historical and Contemporary Ethnography in Language Revitalization.” International Journal of American Linguistics no. 78 (4):533-555.

Bender, Andrea, and Sieghard Beller. 2007. “Counting in Tongan: The traditional number systems and their cognitive implications.” Journal of Cognition and Culture no. 7 (3-4):3-4.

Chrisomalis, Stephen. 2010. Numerical Notation: A Comparative History.  New York: Cambridge University Press.

Pot, Hessel. 1999. “Roman numerals.” The Mathematical Intelligencer no. 21 (3):80.

Sebba, Mark. 2009. “Sociolinguistic approaches to writing systems research.” Writing Systems Research no. 1 (1):35-49.

What makes this stamp unique?: a contest

To the best of my knowledge, this Egyptian postage stamp, along with the two other denominations in the same 1937 series (5 mills and 15 mills), is unique in a very specific way. My puzzle to you is: what makes these stamps so special?


Place your guess by commenting below (one guess per person).  If you are the respondent with the correct answer, your ‘prize’ is that you may ask me any question relating to the themes of this blog and I will write a separate post on that subject.    Happy hunting!

Edit: Well, that didn’t take long.  In just over 20 minutes, Dan Milton successfully determined the answer.  In case you still want to figure it out on your own, I won’t post the answer here in the main post, but you can find it in the comments if you’re stumped.  I will follow up with some analysis later.

Plants (humans?) are incredibly cool, but don’t do math

There’s a fascinating article on BBC News today, about a really interesting study that proposes that an internal mechanism in the Arabidopsis thaliana plant (which is used widely in scientific experiments as a model organism) regulates starch consumption in the absence of sunlight in a way that requires the plants to be able to mathematically “divide” the numbers of two different types of cells.  Now I’m not a botanist and I can’t say whether the result is correct, but I do take issue with the claim that “They’re actually doing maths in a simple, chemical way”.  The last quote from the article is more accurate: “This is not evidence for plant intelligence. It simply suggests that plants have a mechanism designed to automatically regulate how fast they burn carbohydrates at night. Plants don’t do maths voluntarily and with a purpose in mind like we do.”

All sorts of natural processes can be modelled using mathematics; for instance, Fibonacci patterns appear in a variety of plants in the operation of phyllotaxis (the arrangement of leaves on stems). We don’t say that these plants ‘do math’. And the same principle applies to the new finding above. It’s incredibly cool that these mathematical patterns emerge, and it’s a very interesting question why they emerge biochemically. But that raises an even more interesting question: what do we mean when we say that humans ‘do math’?

Humans are organisms and thus part of the physical world, and so lots of the things they do unconsciously or without explicit reflection can thus be modelled mathematically.    But this is not the same as saying that all humans do mathematics.  This seems to be what is being suggested in the last quotation: that ‘doing math’ involves conscious, explicit, purposeful reflection on the mathematical aspects of reality.    Being able to throw a curveball is not ‘doing mathematics’; being able to model the trajectory of a curveball is.  And the overlap between the sets of humans able to do each task is minimal.

Let me give another example related to the plant study above. A child has a pile of 23 candies and wants to divide it among some gathered group of five kids including herself. She starts to her right giving one candy to each friend, continuing to pass them out until they’re all gone. When the process is complete, each child will have at least 4 candies, and the three to the right of the distributor will have 5 each. We could, if we wished to, define ‘division’ as ‘the process of dividing up a group of objects among another group’ and then say ‘thus, the kids are dividing 23 by 5 and getting 4 with a remainder of 3’. But I think most of us would be reluctant to argue that the child handing out the candy understands division, or knows how to divide. Even though distributing the candy is a conscious decision, and even though it requires some general process (one candy to one child), it does not require that the child be able to do mathematics.
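To make the contrast concrete, here is a small sketch (the function name and the seating convention are my own assumptions, purely for illustration). The first part does what the child does: it hands out candies one at a time and never divides anything. The last line does what someone who knows division would do.

```python
def deal_out(candies: int, kids: int) -> list[int]:
    """Hand out candies one at a time around the circle until none remain."""
    piles = [0] * kids
    seat = 0  # assumption: seat 0 is the first child to the distributor's right
    for _ in range(candies):
        piles[seat] += 1
        seat = (seat + 1) % kids  # move on to the next child in the circle
    return piles


piles = deal_out(23, 5)
print(piles)  # [5, 5, 5, 4, 4]

# The explicit arithmetic description of the same outcome:
quotient, remainder = divmod(23, 5)
print(quotient, remainder)  # 4 3
```

The two agree, of course: the smallest pile equals the quotient and the number of larger piles equals the remainder. But only the second one involves anything we would want to call knowing how to divide.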

For the same reason, I sometimes have some skepticism when my colleagues in ethnomathematics describe the mathematics of some human activity in terms of fractal geometry or the Fibonacci series. It is, of course, possible that people have some awareness of the processes behind their activities, and ethnographically, when they can talk about that, it is very interesting. For instance, if the child above says “Well, I know I have 23 candies and so they won’t go evenly, so there are going to be some left over at the end,” then we do indeed know that the child has some explicit knowledge of division. I worry, in fact, that because so many natural processes result in such sequences, we confuse the result with the conscious awareness of the process. In doing so, we fail to investigate the explicit mathematical knowledge that humans do actually encode in all sorts of things they do, and we falsely attribute a sort of explicit consciousness to activities that have no explicitness underlying them (in humans, animals, plants, and even in nonliving things).

Screws, hammers, and Roman numerals: An allegorical complaint

Let’s imagine that you have a toolbox in your garage, full of all sorts of different useful things, and I’m your annoying neighbor.  One day I drop by while you’re working.  I rummage around, pick up a screwdriver, and say to you, “Gosh, that’s not a very good hammer, is it?”  Naturally, you protest that it isn’t a hammer at all.  Next, I hold the screwdriver by the head instead of the handle and say, “Well, of course, you could use it like this to bang in nails, but it would be very cumbersome.”  You look at me, wondering whether I didn’t hear you properly, and say, “No, really.  It’s not a hammer. I have a hammer, but it’s in the trunk of my car, and that’s not it.”  I turn to you and say, “Well, I’ve never seen your hammer, and it would really be a lot easier if you just used the handle of a screwdriver to bang in nails.  Except that it’s no good for that.”

Now let’s turn from this surreal Pythonesque world to another scenario.

You’re an epigrapher and you find some inscriptions with some Roman numerals.  You look at them and say, “Gosh, those things aren’t very good for math, are they?”  Of course, the writer is dead, so he/she doesn’t say anything.  Next, you fiddle around with the numerals and think to yourself, “Well, look at that!  You could use those for arithmetic if you wanted to, but it would be very cumbersome.”  Again, the writer is not around to protest, although as it turns out, someone else dug up an abacus a few kilometers away.   You think of that, though, and say, “Well, it would really be a lot easier if they had just used numerals to do arithmetic, except that their numerals are no good for that.”

So this is the world I live in, and this is the battle I fight.

The problem is a cognitive and ideological one. We are so attached to the idea that numerals are for arithmetic that it’s very hard to stop and ask whether number symbols were actually used for doing calculations in a given society.  There’s essentially no evidence that Romans or anyone else ever lined up or computed with Roman numerals on papyrus or slate or sand or anything else, while there’s abundant evidence that they used an abacus along with finger-computation.  This should give us pause, but our cognitive bias in favour of the numeral/math functional association overpowers it.    For almost all numerical notation systems used over the past 5000 years, there’s precious little evidence that numerals were manipulated arithmetically.  You might have a multiplication table, or you might write results, but you wouldn’t line up numbers, break long numerals into powers to work with them, or anything of the sort.   And since we don’t know that much about abaci and other arithmetic technologies, even though they were obviously used for arithmetic, we assume (wrongly) that they certainly could never be equally good as written numbers.  And thus we conclude (finally, wrongly, again) that Romans were hopeless at arithmetic.   We might even blame their (purported) lack of mathematical proficiency on their lack of a ‘good’, ‘efficient’ numeral system.

It’s a casual, all-too-easy ethnocentrism, and hard to detect.  It’s not the nativistic, “our ways are good, your ways are bad” ethnocentrism that we mostly know to avoid.     Because arithmetic as it is presently taught almost everywhere relies on the structure of the positional decimal numerals, lined up and manipulated as needed, it takes on a naturalness that is deceptively difficult to untangle.   Yes, the Roman numerals are quite difficult to use if you presume that the way to use them is to break them apart, line them up, and do arithmetic in something like the way we were taught.   This isn’t to say that the functions of technologies aren’t relevant, but if we decide in advance what their functions must be, we are likely to miss out on what they actually were, and our judgements will be compromised.

To hammer the point home: if we do that, we’re screwed.

Not the earliest zero, rediscovered

A rather unfortunate effort in Discover by Amir Aczel, ‘How I Rediscovered the Oldest Zero in History’ more or less effaces his solid legwork with shoddy theorizing and ahistorical claims.  Supported by the Sloan Foundation, Aczel (a popular science writer) went to Cambodia and tracked down the location of the Old Khmer inscription from Sambor, which is dated 605 in the Saka era (equivalent to 683 CE), which obviously contains a zero.    While the Hindu-Arabic-Western numerical tradition is seen to emanate from India, all of our earliest unquestioned examples (the late 7th century ones) of the zero are from Southeast Asia, and Sambor is the earliest one.  Because things have been rough in Cambodia for a long time, his work tracking it down and ensuring that it would be protected deserves a lot of credit.
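For anyone who wants to check the date conversion, here is a rough sketch. The constant and function name are mine, and the offset of 78 is an assumption based on the conventional correspondence between the two eras; depending on the time of year, the Common Era year can be one higher, so treat the result as approximate.

```python
# Assumption: the conventional offset between Saka and Common Era years.
# Near the turn of the year the true difference can be 79 rather than 78.
SAKA_TO_CE_OFFSET = 78


def saka_to_ce(saka_year: int) -> int:
    """Approximate Common Era year for a given Saka era year."""
    return saka_year + SAKA_TO_CE_OFFSET


print(saka_to_ce(605))  # 683
```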

If he had stopped there it would have been fine. Unfortunately, in an effort to bolster the importance of his claim, Aczel spends quite a lot of time justifying this as the first zero anywhere, ever, neglecting Babylonian and Maya zeroes from many centuries earlier.  To do that he needs to whip out all sorts of after-the-fact justifications of why those zeroes don’t really count, because Babylonians didn’t use their zero as a pure placeholder, or because Maya zeroes, well actually he just ignores those until the comments (but don’t read the comments – really, folks, that is the first rule of the internet).   Just for kicks, and regardless of the fact that it has nothing to do with zero, he starts off with a lengthy diatribe about how the Roman numerals are ‘clunky’ and ‘cumbersome’ and ‘inefficient’, which as long-time readers of this blog, or anyone who has read Numerical Notation, will know, is an utterly ridiculous, ahistorical claim that is divorced from how such numerals were actually used over two millennia.

I have come to terms with the fact that I will probably be spending the rest of my career pointing out that absolute judgements of the efficiency of numeral systems run the gamut from ‘missing the point’ to ‘completely ahistorical’ to ‘rabidly ethnocentric’.  While Aczel’s piece is not the worst of the sort, it certainly doesn’t deserve much praise.  Which is a shame, since that Sambor inscription really is the first known zero in the Indian tradition (to which our own Western numerals owe their origin) and it’s great that he’s been able to reconfirm its location in a politically perilous part of the world.

It’s just ones and zeroes: the representational power of binary notation

This recent Saturday Morning Breakfast Cereal strip illustrates a ridiculous, but ultimately profound, issue around how we think about numbers and computers:

Most of us who use computers, regardless of age, do not actually think that there are little physical tokens that look like ‘1’ and ‘0’ physically bouncing around inside the CPU or residing on the hard drive. We know that that can’t be true. In some sense, we (hopefully) understand that ‘1’ and ‘0’ are symbols of ‘on-ness’ and ‘off-ness’, conventional representations using binary (a two-state numerical system) of the foundation of modern electronic circuitry. And yet, when we talk about how computers ‘think’, we inevitably end up talking about 1s and 0s. Which is why we chuckle when the same idea is used in the Onion article ‘Microsoft Patents Ones, Zeroes’ or in the Futurama movie Bender’s Big Score, which relies on the conceit of a series of ones and zeroes tattooed upon someone’s butt that, when read aloud, opens up a time-travelling portal.

We laugh because, at some level, we know that computers are not really reading ones and zeroes off a page.  But if not, what do we think they’re doing? I think it would be fascinating to figure out what the cultural model is that underlies this – that it would be a nice ethnographic question to ask, “What does it mean when people say that computers use 1s and 0s?”   You would surely get a lot of responses from computer scientists that talk about switches and logic gates, and some blank stares, but it would be very interesting to see how ordinary, average, computer-literate users talk about binary as a language that computers understand.

Like any good geek dad, I spend a lot of time trying to stop my son from spending all day watching YouTube videos of video games. The solution, fortunately, seems to be that he also likes watching a lot of YouTube videos about science and technology, and so he introduces me to some cool ones, and we watch them together. So take a few minutes to check out this recent video from the fantastic Computerphile channel, where James Clewett talks about the importance of abstraction as a means of allowing us to talk about what’s going on in everyday computing in an understandable way:

Let’s focus on the segment starting at around 0:59: “Look, a transistor is just a switch, and that switch can be open or closed, and the electrons travelling down the wire, they’re either there or they’re not there, which is a 1 or a 0,  and in Numberphile we talk about 1s and 0s a lot, so we won’t go back into that, but it’s just numbers travelling down a wire.”

Clewett, who obviously does understand exactly what is going on, starts with a discussion of switches (real objects) which can be in one of two states, on or off, and then moves to electrons (real objects) either being present or absent, then makes an abstracting discursive move to talking about 1s and 0s, which are not real physical objects, but an abstract representation of the states of switches or the presence/absence of electrons.  And then, within twenty seconds, he’s moved to ‘just numbers travelling down a wire’, which is a highly concrete representation indeed, but clearly not a literal one.  And even though we and he know that numbers are abstractions of the properties of the world – that the numbers are not actually little objects moving down a wire – this seems to be a very central way of thinking about how computers think.  We can’t seem to do without it for very long.

I wonder whether this is tied in to the metalinguistic idea that entities need language to communicate or to think – that we need a metaphorical, language-like understanding of how computers process information, and so we build up this understanding that is close to how we imagine a thinking entity must process information, even though we understand at some other level that it cannot actually work this way.    It may be the most apt metaphor for understanding off/on switches (or digital information generally) but it is still a metaphorical understanding constrained by how we think entities that process information analogously to humans must work.
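One way to see that ‘1’ and ‘0’ are a notational convention rather than the thing itself is to write the same two-state pattern with different symbols. The sketch below is only an illustration: the function names are made up, and the choice of UTF-8 bytes as the underlying pattern is my own assumption.

```python
def to_bits(text: str) -> str:
    """Write each byte of the UTF-8 encoding as eight '0'/'1' characters."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def relabel(bits: str, on: str, off: str) -> str:
    """Swap the conventional '1'/'0' symbols for any other pair of labels."""
    return bits.translate(str.maketrans({"1": on, "0": off}))


def from_bits(bits: str) -> str:
    """Recover the text from space-separated groups of eight binary digits."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")


bits = to_bits("Hi")
print(bits)                     # 01001000 01101001
print(relabel(bits, "#", "."))  # .#..#... .##.#..#
print(from_bits(bits))          # Hi
```

Nothing about the pattern changes when the labels do; the ones and zeroes are just the labels we happen to be most comfortable talking about.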

Numerals in webcomics

Over the past few years, I have been informally collecting and curating a set of comics (mostly online webcomics) relating to my main research interest in numerals, number systems, and numeracy.     While I am led to understand that not everyone in the world appreciates my particularly nerdesque sense of humour, it seems reasonable to suppose that if you’re reading this blog, then you might be like me and find these to be hilarious and/or thought-provoking.    Here are some of my favourites; reader contributions are very welcome (along with suggestions of other comics where I might find good material in the future).


Married to the Sea


Mortgage industry:

Number Two Number Four:


Saturday Morning Breakfast Cereal

Balls constants:

Conversation Trick #57721: Self-referential phrases:


Polish hand magic:

Too many zeroes:


Toothpaste for Dinner

10 types of people:

Happy New Year 2008:

Swedish binary:

Synaesthesia emergency:


xkcd




1 to 10:

Binary sudoku:

Code Talkers:

ISO 8601:

License Plate:


Number line:

Numerical sex positions:

One two:

Words for small sets:

Numerals inside the Great Pyramid

A couple of weeks ago all the news was about some new red ochre markings found in a shaft on the interior of the Great Pyramid at Giza (a.k.a. the Pyramid of Khufu), identified using an exploratory robot. That was pretty cool. But if you’re a professional numbers guy (as I am) you’ll be doubly excited to learn that it is probable that those marks are hieratic numerals. If this interpretation is correct, these are almost certainly mason’s marks used to indicate some quantity involved in the construction. Other than the fact that I would like all news outlets to stop calling them hieroglyphs (they aren’t – the hieratic script is a cursive Egyptian script that differs significantly from the hieroglyphs, and the numerals look nothing alike), this is really cool. I do want to urge caution, however: this does not imply that the Great Pyramid was designed along some sort of mystical pattern or using some numerological precepts. It actually doesn’t tell us even that the marks indicate the length of the shaft (as Luca Miatello suggests in the new article) – it could just as easily be 121 bricks in a pile used to make a portion of the pyramid. I am also not 100% convinced of the ‘121’ interpretation – the 100 could be a 200, very easily, or even some other sign altogether, for instance. But the idea that numerical marks using hieratic script would be made by the pyramid-makers is entirely plausible and helps show the role of hieratic script in the Old Kingdom. Although it’s hardly going to revolutionize our understanding of Egyptian mathematics, it may well help outline the functional contexts of the use of numerals in Old Kingdom Egypt.