Tomorrow night (Wed. Nov 10, 6:30pm EST) the Planet Word museum in Washington, DC will be hosting an online interview and Q&A with me, entitled “The Language of Numbers”. There’s still time to sign up at the link to preregister for the free Zoom event where I’ll be answering questions about linguistics, number systems, and my book, Reckonings: Numerals, Cognition, and History. Sign up soon – registrations will close shortly. Hope to see folks there!
Once again, I am having the undergraduates in my intermediate linguistic anthro class at Wayne State pick from a curated list of fun, interesting, socially relevant, or just plain wacky words for their original research papers this term. The Lexiculture project teaches a dollop of research methods, a touch of discourse analysis, a dab of corpus linguistics, and a soupçon of linguistic anthropology as the student-researchers investigate the sociocultural context and relevance of a single English word.
Earlier this year I edited 62 student papers written from 2013-2020 into an open-access ebook, The Lexiculture Papers: English Words and Culture. Check it out – I’m exceptionally proud of this collection of student scholarship.
And for those interested, here’s the list my students are choosing from this year:
| all out | djent | man cave | shoo-in |
| car phone | Information Superhighway | psychobabble | unmentionables |
| card-carrying | jailbait | rando | upside the head |
Any favourites you’d really like to see picked this year?
All right, if you follow me over on Twitter, you’ll have seen, over the past few weeks, a puzzle I presented there (with hints and historical digressions) that ended with the successful decipherment of what I can now tell you is called the Serpentine Cipher – the word in this particular puzzle is just SERPENTINE. And you will see that each sign certainly is serpentine-looking:
This text is super short and decipherment is certainly a challenge without hints and without some additional information. It starts with the numerical notation used by Johann Joachim Becher in his 1661 Character pro notitia linguarum universali. This was, as the Latin name suggests, one of many 17th century ‘universal language’ schemes, meant to encode concepts rather than words tied to any specific language. Becher’s system used a different number for each of 10,000 concepts, distinguished with lines and dots around a frame:
Becher’s notation wasn’t completely original to him, though. It’s a variant of the Cistercian numerals described in David King’s magisterial 2001 book, Ciphers of the Monks. The system became better known in 2020 via the Numberphile YouTube channel:
King’s book shows how this local development, in parallel to Indo-Arabic / Western ciphered-positional numerals (the digits 0-9), spread throughout European intellectual life into strange places, from volume markings on Belgian wine barrels to modern German nationalist runology. But among the more notable places you find this kind of numeration is in various ciphers, universal language schemes, and other sorts of semi-cryptic efforts to encode language in the 16th and 17th centuries. Although we now know, very firmly, that the Cistercian numerals were a medieval European invention, they were often described as ‘Chaldean’ and/or assigned considerable antiquity / mysticism.
My own contribution to this reception literature was in a post here a few years ago, Cistercian number magic of the Boy Scouts, showing how it ended up in 20th century Scouting literature:
Anyway, the Serpentine Cipher isn’t based on any of that, but is taken directly from Becher. But you can’t just use Becher’s universal cipher at this point, because a ‘universal language’ of 10,000 individual concepts is pretty damn useless. Instead, to solve it, you needed to convert the five glyphs to numbers, and then those to specific pairs of letters – so that five glyphs produce a plaintext of ten letters.
So if you got that far, you found that the five glyphs were five numerals written quasi-positionally, without a zero, in a mixture of base 5 and 10: 737, 3233, 473, 1633, and 473. The fact that the third and fifth glyphs are identical is important, but also potentially misleading. By the way, the reason you don’t need a zero is that the ‘place values’ aren’t linear, but oriented on the same frame, so you can simply leave one blank to indicate an empty space. It’s a kind of ‘orientational’ or ‘rotational’ zero-less place-value. The downside is that unlike a linear phrase it isn’t infinitely extendable.
Next, you needed to notice that each number is the product of exactly two prime factors. By the Fundamental Theorem of Arithmetic, every integer greater than 1 is the product of a unique set of prime factors (counting repeats). So there’s no ambiguity: 737 is *only* 11 x 67. And by chance, there are 25 primes below 100, so, borrowing Z = 101, we can associate each prime with a letter:
- A = 2
- B = 3
- C = 5
- D = 7
- E = 11
- F = 13
- G = 17
- H = 19
- I = 23
- J = 29
- K = 31
- L = 37
- M = 41
- N = 43
- O = 47
- P = 53
- Q = 59
- R = 61
- S = 67
- T = 71
- U = 73
- V = 79
- W = 83
- X = 89
- Y = 97
- Z = 101
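The table above can be generated rather than typed out. The following is a minimal Python sketch, not part of the original puzzle: the names `first_primes`, `LETTER_TO_PRIME`, and `encode_pairs` are made up for illustration, and the sketch only handles words with an even number of letters.

```python
import string


def first_primes(count):
    """Return the first `count` primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# A = 2, B = 3, ..., Y = 97, Z = 101 (the 26th prime)
LETTER_TO_PRIME = dict(zip(string.ascii_uppercase, first_primes(26)))


def encode_pairs(word):
    """Multiply the primes of each consecutive letter pair (even-length words only)."""
    letters = [c for c in word.upper() if c in LETTER_TO_PRIME]
    if len(letters) % 2:
        raise ValueError("this sketch handles even-length words only")
    return [
        LETTER_TO_PRIME[a] * LETTER_TO_PRIME[b]
        for a, b in zip(letters[0::2], letters[1::2])
    ]


print(encode_pairs("SERPENTINE"))  # [737, 3233, 473, 1633, 473]
```

Drawing each number as a Becher-style glyph is a separate step that this sketch does not attempt.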
Thus, each glyph can be treated as a product, and thus as a two letter sequence. 737 = 11 (E) x 67 (S), the 5th and 19th primes. (For words like PIZZA that would use the ZZ glyph (101 x 101 = 10201) you have some different options for that fifth place-value, but these are rare enough to ignore for now). Then all you have to do is ‘serpentine’ between the two letter-pair combinations for each number to figure out which pairs lead to the solution. Voila!:
An added bonus of using the word SERPENTINE is that it illustrates one of the key (mildly) confounding properties of the cipher, namely that a glyph built from two different letters, like 473, always has two readings (here EN and NE), both of which occur in this one word.
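To make the ambiguity concrete, here is a rough Python sketch of the decoding step. It assumes the glyphs have already been read off as numbers, and the function names `readings` and `candidates` are invented for this example. It lists every letter sequence consistent with the five values and leaves the choice of the real word to the reader.

```python
from itertools import product

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_TO_LETTER = dict(zip(PRIMES, LETTERS))


def readings(value):
    """Return the possible letter pairs for one glyph value.

    A product of two different primes has two readings (AB and BA);
    a square such as 121 (EE) has only one.
    """
    for p in PRIMES:
        q, remainder = divmod(value, p)
        if remainder == 0 and q in PRIME_TO_LETTER:
            a, b = PRIME_TO_LETTER[p], PRIME_TO_LETTER[q]
            return sorted({a + b, b + a})
    raise ValueError(f"{value} is not a product of two primes in the table")


def candidates(values):
    """Every letter string obtainable by picking one reading per glyph."""
    return ["".join(choice) for choice in product(*(readings(v) for v in values))]


glyphs = [737, 3233, 473, 1633, 473]
words = candidates(glyphs)
print(readings(473))          # ['EN', 'NE']
print(len(words))             # 32
print("SERPENTINE" in words)  # True
```

Five glyphs with two readings each give 2^5 = 32 candidate strings, which is small enough to scan by eye for the one that is a word.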
Now, note that the only glyphs that will have even values are ones that use A=2, because the product of odd numbers is always odd. This would have provided a hint – if I’d given you a word with any As in it. (You can also use A=3 … Z=103 if you like, but there will be more products >10000 then.)
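Both claims in that paragraph are easy to check by brute force. The sketch below is only an illustration, with made-up names (`primes_up_to`, `oversized_pairs`): it counts the unordered letter pairs whose product exceeds 10000 under each alphabet, and confirms that even products only arise when one of the letters is worth 2.

```python
from itertools import combinations_with_replacement


def primes_up_to(limit):
    """All primes <= limit, by trial division (fine for tiny limits)."""
    return [n for n in range(2, limit + 1)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]


ALPHABET_FROM_2 = primes_up_to(101)      # A = 2 ... Z = 101
ALPHABET_FROM_3 = primes_up_to(103)[1:]  # A = 3 ... Z = 103


def oversized_pairs(primes, limit=10_000):
    """Unordered prime pairs (repeats allowed) whose product exceeds `limit`."""
    return [(p, q) for p, q in combinations_with_replacement(primes, 2)
            if p * q > limit]


print(oversized_pairs(ALPHABET_FROM_2))  # [(101, 101)]
print(oversized_pairs(ALPHABET_FROM_3))  # [(101, 101), (101, 103), (103, 103)]

# Every even product in the A = 2 alphabet involves the letter A.
even_pairs = [(p, q) for p, q in combinations_with_replacement(ALPHABET_FROM_2, 2)
              if (p * q) % 2 == 0]
print(all(2 in pair for pair in even_pairs))  # True
```

So the A = 2 alphabet has a single problem pair (ZZ), while shifting to A = 3 produces three.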
Really, once you see all those 11s, it’s not a bad guess that those 11s are Es – but of course, without knowing exactly what their position is, it makes deciphering such a short text tricky. But I don’t pretend that this would stand up to serious cryptanalysis as-is.
Finally, if you have a ‘straggler’ letter left over at the end of a word or phrase with an odd number of letters, you can either multiply three letters into a product (though that gets unwieldy, e.g., WRY = 83 x 61 x 97 = 491,111) or just have a single number (a prime) at the end. Either one of these might tip you off as to a word boundary. Of course, you don’t have to stop at word boundaries, so you can SP LI TU PT HE WO RD SI NT OP AI RS LI KE TH IS.
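Here is a small Python sketch of the two straggler options and of pairing across word boundaries. The helper names `split_pairs` and `encode_chunk` are hypothetical, and spaces and punctuation are simply dropped, which is one possible convention rather than a rule of the cipher.

```python
import math

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
LETTER_TO_PRIME = dict(zip(LETTERS, PRIMES))


def split_pairs(text):
    """Drop non-letters and cut the rest into pairs; a final straggler stays alone."""
    letters = [c for c in text.upper() if c in LETTER_TO_PRIME]
    return ["".join(letters[i:i + 2]) for i in range(0, len(letters), 2)]


def encode_chunk(chunk):
    """Product of the primes for all letters in a chunk of one, two, or three letters."""
    return math.prod(LETTER_TO_PRIME[c] for c in chunk)


print(split_pairs("split up the words into pairs like this"))
# ['SP', 'LI', 'TU', 'PT', 'HE', 'WO', 'RD', 'SI', 'NT', 'OP', 'AI', 'RS', 'LI', 'KE', 'TH', 'IS']

print(encode_chunk("WRY"))                            # 491111 (three-letter product)
print([encode_chunk(c) for c in split_pairs("WRY")])  # [5063, 97] (pair plus a lone prime)
```

Either way, a decipherer who spots a six-digit value or a bare prime has a strong hint that a word or phrase ends there.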
Anyway, thanks to all who played along. I think this is a bunch of fun, doesn’t need much more than basic arithmetic, and provides a neat digression into the history of number systems and early modern cryptography. Paul Leyland was the first correct decipherer and is thus a winner of a copy of my book, Reckonings: Numerals, Cognition, and History, which, while it is not really about ciphers at all, does have a lot of stuff relevant to number systems and early modern history.
Finally, this cipher is presented in memory of my dear friend Victor Henri Napoleon, who was one of the original decipherers of an early/experimental version of the Serpentine in 2017, and who passed away suddenly last week at the age of N (43). You will be missed, Vic!
- the biases and blind spots that lead folks to conclude wrongly that the Roman numerals were replaced because they were awkward for arithmetic;
- the various relationships among words for counting, thinking, talking, and cutting;
- our unexpected choices and constraints when selecting how to say and write numbers;
- the history of the comparative, historical linguistic disciplines including linguistic anthropology, classics, and philology;
- and a lot more!
For those of you who don’t know the podcast, it’s a gem that focuses on etymology, classics, English, history, and more. Strongly recommended! – ok, I grant that I may be biased when it comes to today’s episode, but there’s a ton of great other content to be found on the podcast, as well as the affiliated website and YouTube channel.
How many number systems are out there? When I finished my dissertation in 2003, I described my work as analyzing “over 100” structurally distinct numerical notations. Counting them is really impossible, because no one knows what ‘structurally distinct’ means. Does it ‘count’ as a distinct system when, in Western Europe, folks started to use numeral delimiter commas (26,000 vs. 26000) or decimal points? I was hopelessly trying to give a number, without necessarily counting the dozens of decimal, positional systems of the broader Indo-Arabic family. All those systems descended from the positional variants of the Brahmi numerals that originated in early medieval India, in which all sorts of script traditions use ten signs for 0-9 but substitute local signs. We can call those all different systems, or we can not, depending on our perspective.
But then by the time my dissertation became a full-fledged book, Numerical Notation: A Comparative History, in 2010, having been poked and prodded by no fewer than 14 peer reviewers (yes, really!!!), more systems were added. I stuck with “over 100” because, well, that’s technically true, but by that point it was many more than that. And I keep finding more. There’s so much out there that hasn’t been accounted for. I was going over some notes earlier this week and there are at least 25 notations on my ‘to add’ list not described anywhere in the synthetic / comparative literature. Probably closer to 50, and counting. Part of the challenge is that these are notations that are peripheral to the concerns of the major traditions of philology, epigraphy, and the history of science. I don’t think I missed any well-known ones! Some of them may have been used by only a handful of individuals, or for a short time. But there are a lot of them – far more than I would have guessed when I started on this wild path.
In a single article (cited only four times since publication), M.A. Jaspan (1967) described not one but two numerical notation systems used by speakers and writers of Rejang, a language of southwestern Sumatra. Other than technical reports by Miller 2011 and Pandey 2018 for Unicode encoding, basically no one has ever acknowledged or discussed them:
This first system may look unusual, but it is part of a broad tradition of aksharapallî systems, which use the alphasyllabaries (abugidas) of South and Southeast Asia, in their customary order, to assign numerical values to specific syllables (Chrisomalis 2010: 212-213). Here, the 23 signs (with the implied vowel ‘a’) correspond to 1-9, 10-90, and 100-500, and then for the higher hundreds, two signs combine additively. This system doesn’t have a zero – each multiple of each power of the base (10) gets its own sign, so it’s what I’ve classified as ciphered-additive – like Greek, Hebrew, and Arabic alphabetic numerals, or Cherokee, Jurchen, or Sinhalese, among others. Jaspan is dead wrong in writing (1967: 512) that “It has, as far as I know, no parallel or similarity to, other known systems either in South-East Asia or elsewhere.” Aksharapallî systems were once widespread throughout South and Southeast Asia, and are used for various purposes, including pagination, which is exactly what Jaspan reports that at least some Rejang writers used them for during his fieldwork in the early 1960s.
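The ciphered-additive structure can be sketched in a few lines of Python. This models only the arithmetic, not the Rejang signs themselves, and it makes one loud assumption: that 600-900 are written as the sign for 500 plus the sign for the remaining hundreds. The text above says only that two signs combine additively, so the actual pairing may differ. The function name `ciphered_additive` is invented for the example.

```python
def ciphered_additive(n):
    """Break 1..999 into the values of the signs that would be written.

    One sign per nonzero decimal place, drawn from 1-9, 10-90, 100-500.
    ASSUMPTION: hundreds above 500 are written as 500 plus the remainder.
    """
    if not 1 <= n <= 999:
        raise ValueError("this sketch covers 1..999 only")
    hundreds, rest = divmod(n, 100)
    tens, units = divmod(rest, 10)
    parts = []
    if hundreds > 5:
        parts += [500, (hundreds - 5) * 100]  # assumed pairing, see docstring
    elif hundreds:
        parts.append(hundreds * 100)
    if tens:
        parts.append(tens * 10)
    if units:
        parts.append(units)
    return parts


print(ciphered_additive(347))  # [300, 40, 7]
print(ciphered_additive(805))  # [500, 300, 5]

# The distinct sign values needed for 1..999 number 23, matching the sign count.
print(len({v for n in range(1, 1000) for v in ciphered_additive(n)}))  # 23
```

Note that 805 needs no placeholder for the empty tens place: the sign values themselves carry the magnitude, which is why no zero is required.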
The second system is, in some ways, even more striking. The system is structurally almost identical to the Roman numerals – there are signs for each power of 10, as well as the quinary halves 5 and 50. The hundreds are still additive but have some more complexities, and then the thousands don’t have a quinary component at all. These sorts of systems that rely on repeated signs within each power, and don’t use place-value, are called cumulative-additive and are very common throughout the Near East and the Mediterranean but relatively rare in East and Southeast Asia (though there are systems like the Ryukyuan suchuma that have this structure). I have absolutely no idea where it came from – unlike the first system, it doesn’t have any obvious relatives. At least for Jaspan’s consultants, it was used for keeping business accounts in the 1960s, though not widely.
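For contrast, a cumulative-additive system repeats signs rather than giving each multiple its own. The Python sketch below illustrates the principle for numbers under 100 only, using sign values 50, 10, 5, and 1 as described above. It deliberately leaves out the hundreds and thousands, whose details are not given here, and it assumes purely additive repetition with larger values written first. The name `cumulative_additive` is made up for the example.

```python
# Sign values for the range modeled here; shapes and writing order are not modeled.
SIGN_VALUES = [50, 10, 5, 1]


def cumulative_additive(n):
    """Break 1..99 into repeated sign values, largest first."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch covers 1..99 only")
    parts = []
    for value in SIGN_VALUES:
        count, n = divmod(n, value)
        parts += [value] * count  # repeat the sign `count` times
    return parts


print(cumulative_additive(78))  # [50, 10, 10, 5, 1, 1, 1]
print(cumulative_additive(40))  # [10, 10, 10, 10]
```

The quinary signs keep repetition in check: with 5 and 50 available, no sign in this range is written more than four times.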
The standard history of numerical notation is one where all systems gave way to a single, universalizing notation, the digits 0123456789, which spread globally without competition. And there’s certainly a point to be made there. But there is a countervailing factor, the inventive impetus under which we can expect all sorts of notations to be invented, perhaps not with global reach, but of critical importance for understanding the comparative scope of the world’s numerical systems. In my new book, Reckonings: Numerals, Cognition, and History (Chrisomalis 2020), I make the case that we are not at the ‘end of history’ of numeration – that innovation continues apace in this domain, and that focusing only on the well-known systems produces a very barren history. Cases like the Rejang numerals help produce a richer narrative – one of constant and ongoing numerical innovation.
Chrisomalis, Stephen. Numerical notation: A comparative history. Cambridge University Press, 2010.
Chrisomalis, Stephen. Reckonings: Numerals, cognition, and history. MIT Press, 2020.
Jaspan, Mervyn Aubrey. “Symbols at work: Aspects of kinetic and mnemonic representation in Redjang ritual.” Bijdragen tot de Taal-, Land- en Volkenkunde 4de Afl (1967): 476-516.
Miller, Christopher. “Indonesian and Philippine Scripts and extensions not yet encoded or proposed for encoding in Unicode as of version 6.0.” (2011).
Pandey, Anshuman. “Preliminary proposal to encode Rejang Numbers in Unicode.” (2018).