The European Union now has 23 official languages. That means documents written in one language may need to be translated into any of 22 others, a total of 253 possible language pairs (23 × 22 / 2). Small wonder that the institutions of the European Union, and organisations dealing with international commerce, among others, have a keen interest in automating the process where they can.
Efforts to use computers to translate languages, known as machine translation, date from the 1950s, yet computers still cannot compete with human translators for the quality of the results. Machine translation works best for formal texts in specialised areas where vocabulary is unambiguous and sentence patterns are limited. Aircraft manufacturers, for example, have devised their own systems for quickly translating technical manuals into many languages.
The EU has been active in promoting research in this field since the large Eurotra project of the 1980s. In common with other projects of the time, Eurotra used a ‘rules-based’ approach where the computer is taught the rules of syntax and applies them to translate a text from one language to another. This is also the basis of most commercial translation software.
But since the early 1990s the new concept of ‘statistical’ translation has gained ground in the machine translation community, arising out of research into speech recognition. This dispenses with rules in favour of using statistical methods based on a text ‘corpus’.
A corpus is a large body of written material, amounting to tens of millions of words, intended to be representative of a language. Parallel corpora contain the same material in two or more languages and the computer compares the corpora to learn how words and expressions in one language correspond to those in another. An important example is a parallel corpus of 11 languages based on the proceedings of the European Parliament.
Pattern matching
To use a single corpus you need a dictionary for the vocabulary and a way to understand the syntax. In the original METIS project, completed in 2003, the corpus was processed to analyse sentence patterns and the text to be translated was then matched against the patterns.
In Greek, for example, the verb can precede the subject of a sentence. “So if you come in with a Greek sentence, ‘Eats Mary a cake’, you would like the machine to be able to translate it into English and rearrange the words to make ‘Mary eats a cake’,” explains Dr Markantonatou. “Pattern matching is a good way of doing that because it is able to take patterns from the source language and make them like the target language.”
METIS II takes the principle further by matching patterns at the level of the ‘chunk’, a phrase or fragment of a sentence, rather than the sentence as a whole, which makes the pattern matching more efficient.
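As a rough illustration of pattern matching at the chunk level, the Greek word-order example can be sketched in a few lines of Python. This is a toy under stated assumptions, not the METIS II implementation: the chunk labels (`V`, `NP`), the `REORDER_PATTERNS` table and the helper names are all hypothetical, and the words are already given as English glosses, as in the quotation above.

```python
# Toy illustration only: the labels, chunks and the single pattern below are
# hypothetical and are not taken from METIS or METIS II.

# Each entry maps a source-language sequence of chunk labels to the order the
# chunks should take in the target language (indices into the source sequence).
REORDER_PATTERNS = {
    ("V", "NP", "NP"): (1, 0, 2),  # verb-subject-object -> subject-verb-object
}


def reorder(chunks):
    """Rearrange (label, words) chunks if their labels match a known pattern."""
    labels = tuple(label for label, _ in chunks)
    order = REORDER_PATTERNS.get(labels)
    if order is None:
        return chunks  # no pattern matched: leave the order unchanged
    return [chunks[i] for i in order]


def to_sentence(chunks):
    """Flatten a list of chunks back into a space-separated string."""
    return " ".join(word for _, words in chunks for word in words)


# 'Eats Mary a cake', split into three chunks: verb, subject, object.
source = [("V", ["eats"]), ("NP", ["Mary"]), ("NP", ["a", "cake"])]
result = to_sentence(reorder(source))
print(result)
```

Because the pattern refers to whole chunks rather than individual words, the two-word object ‘a cake’ moves as a unit, and one pattern covers objects of any length.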
It can also use grammar rules to generate alternative possibilities for the translation and then use the corpus to identify which is the more probable. For example, where English would say ‘I like cakes’, some European languages might use the form ‘cakes please me’. So in translating into English, METIS II can test alternative interpretations against the English language corpus. In this example, ‘cakes please me’ would get a very low score while the closest match ‘I like cakes’ would score highly.
Four languages
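The idea described above of testing alternative translations against a target-language corpus can be sketched as follows. The scoring used here, the fraction of a candidate's word pairs (bigrams) that appear in the corpus, is a deliberately simple stand-in chosen for illustration; the article does not say how METIS II computes its scores. The four-sentence `CORPUS` and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical miniature English corpus; a real corpus has millions of words.
CORPUS = [
    "i like cakes",
    "i like tea",
    "they like cakes",
    "music pleases me",
]


def bigrams(sentence):
    """Return adjacent word pairs, with sentence start and end markers."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    return list(zip(words, words[1:]))


# Count every bigram that occurs anywhere in the corpus.
BIGRAM_COUNTS = Counter(bg for line in CORPUS for bg in bigrams(line))


def score(candidate):
    """Fraction of the candidate's bigrams that occur in the corpus."""
    grams = bigrams(candidate)
    seen = sum(1 for bg in grams if BIGRAM_COUNTS[bg] > 0)
    return seen / len(grams)


candidates = ["cakes please me", "I like cakes"]
best = max(candidates, key=score)
for candidate in candidates:
    print(f"{candidate!r}: {score(candidate):.2f}")
```

With this toy corpus, ‘I like cakes’ is selected because every one of its word pairs has been seen before, while ‘cakes please me’ shares only its sentence ending with the corpus.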
The problem now is what to do next. Results from METIS II are being followed up in national research programmes in Spain and Belgium, but there are no plans as yet to further develop the whole system. Some of the components created in the project, such as dictionaries and associated language tools, could be marketable in their own right, but would need an industrial partner to provide the investment needed to turn the prototype into a commercial product.
“For Greek, it would be an excellent opportunity because there is nothing really good for [translating it] at present,” Dr Markantonatou tells ICT Results. “With a better lexicon, fixing bugs and making algorithms more efficient, this kind of thing could work. In another two or three years, METIS could be a very serious competitor to SYSTRAN. It’s a matter of funding.”
Christian Nielsen | alfa