Decoding ancient texts: AI supports research on ancient history

2025-01-14   

From finance to medicine, artificial intelligence (AI) is profoundly changing modern life. Now it is entering the study of ancient texts: from Greek and Latin classics to Chinese oracle bone inscriptions, artificial neural networks are becoming a key tool for interpreting ancient writing. They can not only process vast archives and fill in missing characters, but also help decode rare or extinct languages that have left almost no trace, bringing ancient wisdom back to light with modern technology.
In October 2023, Federica Nicolardi received an email with a picture that completely changed her research. The picture showed a fragment of a papyrus scroll that survived the eruption of Mount Vesuvius in 79 AD and was discovered in the 18th century in the ruins of a luxurious villa in the ancient city of Herculaneum. These scrolls, hundreds in number, have become so fragile over time that most of them cannot be unrolled. Nicolardi, a papyrus scholar at the University of Naples in Italy, is taking part in a project that uses AI to read text that is otherwise illegible. Now she was witnessing a miracle: in the picture, a strip of papyrus was densely covered with Greek letters, brought back to life out of the darkness.
This project, called the "Vesuvius Challenge," is just the tip of the iceberg of how AI is reshaping the study of ancient history. Computers have been used for decades to classify and analyze digitized texts, but the most exciting development today is the use of neural networks. Neural networks consist of layers of interconnected nodes; "deep" neural networks in particular have multiple internal layers. Convolutional neural networks (CNNs) are well suited to capturing grid-like data structures such as images. While CNNs have excelled at optical character recognition, they have also opened up a range of other applications. 
For example, Chinese research teams studying oracle bone inscriptions use these models to restore severely eroded characters, analyze how the script evolved over time, and reassemble broken fragments of artifacts to recover their original appearance. Meanwhile, recurrent neural networks (RNNs), which are designed to process linear sequences of data, are beginning to show enormous potential for searching, translating, and filling in missing content in transcribed ancient texts. RNNs have been used, for instance, to suggest missing characters in hundreds of highly formulaic administrative and legal texts from the Babylonian period.
So, can neural networks find connections among the fragments of history that human experts struggle to spot? In 2017, a collaboration began at the University of Oxford in the UK, when two researchers were struggling to decipher Greek inscriptions from Sicily. Classical scholars usually rely on their knowledge of existing texts to interpret new material, but no one can keep all the relevant information in mind. The Oxford researchers believed that this was precisely where machine learning could help. They used an RNN-based model called Pythia, trained on tens of thousands of Greek inscriptions, which successfully predicted missing words and characters in the texts. In 2022, they launched the Ithaca model, which not only predicts missing content but also suggests dates and places of origin for texts of unknown provenance. Ithaca takes advantage of the Transformer architecture to capture more complex language patterns. Today's popular chatbots, such as OpenAI's ChatGPT, are also based on the Transformer architecture. 
Korean researchers face a daunting task in translating and restoring historical records: working through one of the largest historical archives in the world. The archive records in detail the daily affairs of 27 Korean kings from the 14th century to the early 20th century and comprises hundreds of thousands of entries. Jin Hengjun, a machine translation expert at New York University in the United States, said that the volume of text is enormous and that translating it manually into modern Korean would be expected to take several decades. Working with his Korean counterparts, Jin Hengjun trained an automatic translation system based on Transformer networks. The results show that the AI translations far exceed the older translations into archaic Korean in accuracy and readability, and sometimes even outperform the existing modern Korean translations.
Researchers also use neural networks to tackle ancient languages for which only a small amount of text survives. Katerina Papavassileiou of the University of Patras in Greece and her team used an RNN to restore missing Linear B text on Mycenaean clay tablets from Knossos, Crete. Tests showed that the model predicts missing text with high accuracy and often matches the suggestions of human experts.
Using AI to decipher ancient texts still faces the dual challenges of verification and utilization. AI technology gives non-specialists access to large amounts of ancient literature, so ensuring the accuracy of research results has become the primary challenge. Powerful as neural networks are, they occasionally produce misleading results known as "hallucinations," which raises concerns about reliability. The British journal Nature pointed out that solving this problem requires humanities scholars and computer scientists to work together to examine and verify AI's interpretations. 
At the same time, researchers advocate open-sourcing all relevant data (including raw texts, scans, trained models, and algorithms) to make research more transparent and verifiable. This approach, called a "digital source chain," aims to build a complete chain from raw data to final conclusions so that anyone can trace and verify the research process. In addition, as the number of digitized texts grows rapidly, how to make effective use of this mass of data and extract important information about ancient societies from it is a new challenge for researchers. It requires them to shift from analyzing individual texts to understanding a culture as a whole, and to try to correlate textual data from different regions and periods to build a more comprehensive picture. (New Society)

Editor: He Chuanning  Responsible editor: Su Suiyue

Source: Sci-Tech Daily
