Antoni Oliver is an associate professor at the Open University of Catalonia (UOC – Barcelona – Spain) and the director of the Master's Degree in Translation and Technologies. He holds a degree in Telecommunications Engineering, a degree in Slavonic Philology, a master's degree in Free Software and a PhD in Linguistics.
He coordinates and teaches several subjects in the degree in Translation, Interpreting and Applied Languages (UOC-UVic) and in the master's degree in Translation and Technologies (UOC). He also collaborates in the Tradumatica master's degree at the Autonomous University of Barcelona (UAB). His research is centred on the field of Natural Language Processing, mainly in areas related to machine translation, computer-assisted translation tools and the automatic creation of lexical and terminological resources. He has participated in several European and Spanish national projects related to his research areas: Metis II, about statistical machine translation using monolingual corpora; Etitle, about automatic multilingual subtitling; Know, Know 2, Skater and Tuner, where he has mainly participated in the creation of lexical and semantic resources. He is currently a member of the project “MOMENT: Metaphors of severe mental disorder,” where the discourse of affected people and mental health professionals is analysed.
He is the author of several papers in academic journals and presentations at international conferences in the area of Natural Language Processing. He is the author of the book “Technological Tools for Translators” (in Spanish, English translation available but unpublished), which offers a panoramic view of translation technologies.
1. You have studied both in the language field (Linguistics, Slavonic Philology) and in technical domains (Telecommunication Engineering, Free Software). What inspired you to study in these fields?
First I studied engineering, finished my degree and started to work as an engineer. While I was a student, I travelled to Eastern Europe several times in the summers, and every summer I stopped for a few days in Zagreb (Croatia), where I have some friends. So I started to learn Croatian and enrolled in the Russian courses at the Official School of Languages (as Russian was the only Slavonic language offered at the school). Some years later, when I was still studying Russian at the School of Languages, the University of Barcelona started to offer Slavonic Philology, so I decided to study it. During my philology studies, I took subjects related to Computational Linguistics and Natural Language Processing, and I discovered that they had some points in common with my previous studies in engineering. After my degree in philology, I enrolled in the doctoral courses and after some years I completed my PhD in Computational Linguistics.
2. You are the director of the MA on Specialised Translation at the Universitat Oberta de Catalunya and a professor there. What are the main challenges in your day-to-day work, and what in particular do you really enjoy doing?
The main challenge is to keep the master's degree up to date, adapting it to new technologies and to the real needs of our students, the future professional translators. We are now transforming this master's degree into a new one, called MA on Translation and Technologies, which will be offered in the next academic year. This new MA will offer two tracks: a professional one, and a research-oriented one for students who want to do a PhD in this area.
I really enjoy testing new techniques and tools and preparing teaching materials explaining these new developments.
3. You are teaching subjects related to translation technologies and natural language processing. What advice would you give to students that want to develop in the fields of translation and terminology?
All translators and terminologists need a good knowledge of the available technologies, which can be of great help in their daily work. Nevertheless, both in translation and in terminology, the main and most important work is the human, manual one. So technology should be one of the skills of translators and terminologists, a very important one indeed, but it is probably not the main one. On the other hand, there is a relatively new profession: the translation technologist, who provides technological services to translators and terminologists in companies and institutions. I think that this new professional profile is very interesting and that it will have very promising job opportunities.
4. Which book, paper or project in the field of terminology would you recommend to terminologists to read/follow?
I think it is always useful to re-read the works on terminology by Wüster, available in several languages. The main ideas expressed in these works are still valid, and they are very useful to keep in mind during terminological work. For a good introduction to the main techniques for terminology extraction, I would recommend the paper by Maria Teresa Pazienza and colleagues (Pazienza M.T., Pennacchiotti M., Zanzotto F.M. (2005) Terminology Extraction: An Analysis of Linguistic and Statistical Approaches. In: Sirmakessis S. (eds) Knowledge Mining. Studies in Fuzziness and Soft Computing, vol 185. Springer, Berlin, Heidelberg).
There are also many aspects of terminology, apart from the theoretical and technical ones, that are covered in the free e-book written by Rodolfo Maslias, “Terminology in the changing world of communication”. I think this e-book is worth reading.
5. You have created TBXTools, an automatic terminology extraction tool. Could you tell us a bit more about it?
Terminology is one of my areas of research and I also participate in several projects for the creation of terminology resources. TBXTools is our platform both for research and production. This tool can perform several tasks related to terminology extraction, so it is very suitable for the creation of resources. It also offers several useful features for research: for example, the ability to automatically evaluate the results of the extractions in terms of precision and recall. We build every new idea or technique we want to test into the tool. As a result, the tool keeps growing and new features are implemented every few months.
The tool is being actively developed and it is released as free software, so everybody can freely download, use and share it. For the moment, it does not have a graphical user interface and it has to be used in the terminal. However, we plan to provide a graphical user interface soon, covering the main functionalities.
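As an illustration of the precision and recall evaluation mentioned above, here is a minimal Python sketch. It is not TBXTools code and does not use its API: the function name `precision_recall` and the sample term lists are assumptions made up for this example. Precision is the share of extracted candidates that appear in a reference term list; recall is the share of reference terms that were extracted.

```python
def precision_recall(candidates, reference_terms):
    """Compare extracted candidates against a reference list of terms.

    Hypothetical helper for illustration only (not part of TBXTools).
    Matching is case-insensitive and ignores surrounding whitespace.
    """
    extracted = {c.lower().strip() for c in candidates}
    reference = {t.lower().strip() for t in reference_terms}
    correct = extracted & reference
    # Guard against empty lists to avoid dividing by zero.
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Invented sample data: four extracted candidates, three reference terms.
candidates = ["machine translation", "neural network", "the results", "term base"]
reference = ["machine translation", "term base", "translation memory"]

precision, recall = precision_recall(candidates, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this invented example two of the four candidates are in the reference list, and two of the three reference terms were found.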
6. For extracting and managing terms from a domain-specific corpus, what are the best practices that you would recommend?
The extraction process needs several steps to be successful:
– The extraction task should be planned carefully, defining the resources and methodologies that we want to use.
– The process should be documented thoroughly.
– The terminology extraction process requires a lot of manual revision. To make it easier, we can use a list of already known terms to avoid revising them again. We should also maintain a database of candidates marked as non-terms, as these can also be skipped in every later extraction process. In this way, we can speed up the process.
– The results of the extraction (the list of new terms and the list of candidates marked as non-terms) should be properly saved, and some version control system should be applied. We should know which is the latest valid version and where it is stored.
So the important thing is to plan what we want to do, document how we have done it, and save the results in a known place with a known name.
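The revision step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow of any particular tool: the function names `candidates_to_review` and `record_decisions` and the sample data are hypothetical. Candidates already in the known-terms list or in the non-terms list are skipped, and each reviewer decision is stored so that the candidate is not revised again in the next extraction.

```python
def candidates_to_review(candidates, known_terms, non_terms):
    """Return only the candidates that still need manual revision.

    Skips known terms, candidates previously marked as non-terms,
    and duplicates. Matching is case-insensitive.
    """
    known = {t.lower() for t in known_terms}
    rejected = {t.lower() for t in non_terms}
    seen = set()
    to_review = []
    for candidate in candidates:
        key = candidate.lower()
        if key in known or key in rejected or key in seen:
            continue
        seen.add(key)
        to_review.append(candidate)
    return to_review


def record_decisions(decisions, known_terms, non_terms):
    """Store reviewer decisions (candidate -> is it a term?) for later runs."""
    for candidate, is_term in decisions.items():
        target = known_terms if is_term else non_terms
        target.add(candidate.lower())


# Invented sample data for illustration.
known_terms = {"machine translation"}
non_terms = {"the results"}
candidates = ["Machine translation", "term base", "the results",
              "translation memory", "term base"]

first_pass = candidates_to_review(candidates, known_terms, non_terms)
print(first_pass)

# After manual revision, the decisions are added to both lists.
record_decisions({"term base": True, "translation memory": False},
                 known_terms, non_terms)
second_pass = candidates_to_review(candidates, known_terms, non_terms)
print(second_pass)
```

In practice the two lists would be saved to files with known names and kept under version control, so that the latest valid version is always easy to find.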
7. Do you use IATE in the courses at your institution? How do you evaluate the Spanish terminology present in IATE and would you suggest any improvements or any areas to especially focus on?
Yes, IATE is one of the main terminological resources in the world, so we teach how to use it and we actually use it in our courses. Spanish is well represented in IATE and there has been a great effort to revise the Spanish contents, so now we have a lot of quality terms in IATE. As you know, there are several other official languages in Spain, and they are not represented in IATE (as they are not official languages of the European Union). We are currently working on the Catalan version of IATE and I think this is an interesting direction for other languages in Europe that do not have the status of official language of the European Union.
On the other hand, IATE has developed the subjects of interest to the European Union. Many subjects are not represented because they are not used in the European institutions. A very interesting project would be the inclusion of other subjects in IATE, using the same structure and covering all the available languages. I think that universities can play an important role in this task, as they can develop the resource during the training of their students.
8. What is special about terminology work in Spain? Are there specific difficulties you face, or specific advantages you have? And have these changed over the years?
A lot of terminological work is being done by institutions of the governments of autonomous regions that have their own official languages. For example, excellent work in terminology is being done by TERMCAT, the official terminology institution of the Catalan Government. This institution creates terminological resources that include at least Catalan, Spanish and English, and most of them are released under a free licence. Similar work is being done by Euskalterm, from the Basque Government.
9. What other interesting projects have you recently been involved in? Could you elaborate a little on those?
As already mentioned, we are currently working with TERMCAT on the creation of the Catalan version of IATE. Of course, for the moment this is an unofficial version, and it contains the Catalan terms for the IATE terms, sharing the same codes. In this way, we get a multilingual terminological database including Catalan and another 24 European languages. Some students from our master's degree are collaborating in this project, and we think this is an interesting experience for them. We want to share this experience with other institutions and offer our methodologies for the enlargement of IATE for some underrepresented languages, and also for the inclusion of new languages. We hope that in the near future this Catalan version of IATE will be included in the official release.
I am also involved in projects related to Machine Translation, mainly with neural approaches. These approaches are achieving very promising results in terms of quality. There are a lot of neural machine translation toolkits, and we are experimenting with some of them.
I would also like to mention the InLéctor Project (https://inlector.wordpress.com/), where we are publishing free bilingual e-books that aim to help readers who want to read in the original language. In these books, you can access the translated version of the book with a click on the sentence. This feature can be of great help to readers with an intermediate level in the source language. Up to now, the translations have been real published translations, but because it is difficult to find translations in the public domain, we are starting to use neural machine translation to create the translated version. The goal of these bilingual books is not to read the book in the target language, but to give some help when you’re reading the original version.
10. What is your view on the future of translation and terminology?
I’m very optimistic about the future of translation. New technical achievements, for example the very good results of Neural Machine Translation, will have a positive impact on the profession. More and more texts will be translated using machine translation, but at the same time consumers will be more aware of the importance of revising important texts, and will also be sensitive to the texts that should be translated by humans. So I think that more work will be available, both in post-editing and in manual human translation.
These new machine translation systems need to be adapted to the terminology of different domains, so more resources will be developed, giving new opportunities to terminologists.
So I think that we will have a positive interaction between human translators and terminologists, on the one hand, and technology, on the other. The overall effect of this interaction will be positive for the profession.
Interviewed by Olga Vamvaka – Terminology trainee at the Terminology Coordination Unit of the European Parliament (Luxembourg).
She holds a BA in International Relations and Organisations and an MA in Translation and has worked in language teaching. She speaks Greek, English, Czech and French.