MTAAC and Teaching Computers to Read Sumerian

us4-he2-gal2
Administrator

Posts: 1,714


Post by us4-he2-gal2 on Nov 5, 2018 4:36:19 GMT -5

MTAAC

Hey everyone - I thought I would mention a few things about a fairly new project, MTAAC, which I have been working for over the last 5-6 months. It is the brainchild of a fellow student of mine, Emilie Page-Perron. I first heard of Emilie when she posted on our 'So you want to be an Assyriologist' thread ten years ago as Sohnyrin. Subsequently, we both ended up in Toronto. She has an interest in both the Sumerian language and computer programming, and has worked for the CDLI while completing her Ph.D. studies. In fact, she has become one of the principal investigators at CDLI, coordinating its development with that team. Since MTAAC is her side project, she is now my boss, I suppose.

The goal of MTAAC (Machine Translation and Automated Analysis of Cuneiform Languages) is to develop computer software which is able to 'read' cuneiform languages. At the moment, the focus is on Ur III administrative texts in Sumerian, and specifically on transliterations (transcriptions of the cuneiform into our Roman alphabet); the computer is not expected to recognize the cuneiform signs themselves. Initially, my reaction to this project goal was 'so if the computers are going to be reading and translating Sumerian, what am I going to be doing in my prospective career?' While I am not unconcerned about this, Emilie points out that automated reading of texts would solve a major problem in the field - that is, even after one hundred years of scholarship, there are still hundreds of thousands of cuneiform texts which are not available in printed translation. And these texts are unlikely to be published in translation in our lifetime. After working on the Ur III administrative corpus for almost 6 months, I have witnessed the extent of that problem with this corpus of texts: while most are available in transliteration, only 1 in 10 (I would estimate) are available in translation. This is due, in part, to the general practice of Sumerologists, who tend to publish the Ur III corpus in transliteration but do not publish their translations for various reasons (either the texts seem too obvious, with their lists of sheep for X purpose, or there are lingering uncertainties about how the abbreviated grammar should be rendered in English, and so forth).

The work which Jinyan (a fellow student) and I do for the project is really the 'grunt work': we annotate texts in minute detail so that the computer will have material with which to learn the language. This involves labelling each morphological element of a Sumerian word so as to explain its function and meaning, using a sort of code that the computer can 'understand'. The project website is here: cdli-gh.github.io/mtaac/ . You will see that Jinyan and I are not mentioned currently, owing to the fact that the website has not been updated in the last 6 months. One contribution I have made is the development of an extensive theoretical treatment of the morphology of the Ur III administrative grammar. The relevant documents should be available on the website soon. Also, MTAAC follows the interpretation of Sumerian grammar laid out by Gábor Zólyomi. Prof. Zólyomi generously makes his new Sumerian grammar available for free download:

elte.academia.edu/G%C3%A1borZ%C3%B3lyomi
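
To give a rough idea of what labelling the elements of a word can look like as data, here is a minimal Python sketch. It is only an illustration and not MTAAC's actual annotation format: the `Morpheme` class, the `annotate` and `gloss` helpers, and the tag `DAT` (for the dative case marker) are all hypothetical names chosen for the example. It also makes the simplifying assumption that the hyphens in a transliteration line up with morpheme boundaries, which is often not the case in real texts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str   # transliterated segment, e.g. "lugal"
    label: str  # hypothetical tag: a meaning ("king") or a function ("DAT")


def annotate(word: str, labels: list[str]) -> list[Morpheme]:
    """Pair each hyphen-separated segment of a word with one label.

    Assumption for this sketch: one label per hyphen-separated segment.
    """
    segments = word.split("-")
    if len(segments) != len(labels):
        raise ValueError(
            f"{word!r} has {len(segments)} segments but {len(labels)} labels"
        )
    return [Morpheme(form, label) for form, label in zip(segments, labels)]


def gloss(morphemes: list[Morpheme]) -> str:
    """Join the labels into a single machine-readable gloss string."""
    return "-".join(m.label for m in morphemes)


# lugal "king" followed by the dative case marker -ra: "for the king"
analysis = annotate("lugal-ra", ["king", "DAT"])
print(gloss(analysis))
```

The point of a structure like this is that every segment carries an explicit label, so a program never has to guess which part of the word is the noun and which part is the case marker.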

Last Edit: Nov 5, 2018 16:57:10 GMT -5 by us4-he2-gal2

Defying Sleep, which like a fog breaths upon him.

Post by us4-he2-gal2 on Dec 11, 2018 13:13:53 GMT -5

Interestingly, MTAAC was recently featured in a bbc.com story. The article appeared yesterday, Dec. 10th, and as of today it still appears on the 'front page' of the online edition.

Link to the full article: www.bbc.com/future/story/20181207-how-ai-could-help-us-with-ancient-languages-like-sumerian

I would disagree with the headline "the key to unlocking ancient languages?" and also the second headline "the key to cracking long-dead languages?" Of course, the idea with headlines is to grab attention and perhaps to fire the imagination: under the subsection "future" the reader sees this unexpected mixing of future technology and ancient language, ancient information technology. This creates an intriguing contrast, and forces one to question what the connection could be. This curiosity may prompt one to click onto the page and investigate. What I object to, in particular, is the specific phrasing that implies that the MTAAC project is "unlocking ancient languages". Well, 100% of that credit should go to the brilliant philologists and lexicographers who worked hard over the last 100+ years to deliver Assyriology as a science.

However, as the article does explain, machine reading of cuneiform texts may solve another problem: even after 100+ years, the field has not been able to publish translations of the large majority of cuneiform documents, especially the 'everyday' documents. Again, this is not because cuneiform has not been 'cracked' (we do have the knowledge to read Sumerian and Akkadian), but because of shortages of manpower. In any case, Emilie was able to discuss some of the aims of the project with the BBC journalist, and a lot of the article is focused on this.

Last Edit: Dec 11, 2018 13:29:28 GMT -5 by us4-he2-gal2

hukkana
e n e n u r i a n

Posts: 300

Post by hukkana on Dec 16, 2018 18:49:51 GMT -5

I find it rather hard to imagine something as context sensitive as cuneiform getting translated by a machine.

Post by us4-he2-gal2 on Dec 19, 2018 16:01:16 GMT -5

I suppose I only deal with the human understanding part of the project. Then the computer folks take this typical philological analysis and do their work, so that, by means of algorithms, analogy, and fancy code and other computer stuff I don't really comprehend, the computer translates the text. In other words, I have no personal comprehension of how the computer side of the project works or its chances of success.

Emilie seems to be convincing the world that it will work, however, and she is becoming quasi-famous - she has had another interview, this time with the Canadian news outlet CBC.

Last Edit: Dec 19, 2018 16:03:03 GMT -5 by us4-he2-gal2
