In the era of globalization, interaction between people is still limited by an invisible language barrier. Although English is used across borders and spoken by more than a billion people today, it remains a problem in areas where English is not acceptable for a specific use, e.g. business and marketing.
Human translation offers authenticity and better accuracy, but it does not scale to larger business applications. Machine translation is a feasible option for product localization, and that demand motivates the continual improvement of machine translation systems.
Since its earliest development, machine translation has been designed to give quick results with little or no human participation. After a long process of experimentation, active learning algorithms have become the current paradigm that helps a machine translation system do its work automatically.
Continuous development has applied several approaches to refine translation quality. Only a few years ago, significant changes in the way machine translation works produced more reliable results across many language pairs. Active learning technology is a large part of why automated machine translation keeps getting better today.
What is the active learning that underlies advanced translation technology? Several decades ago, the application of statistical machine translation inspired the search for a better process and better results, since languages do not all share grammar rules and linguistic features that can be converted easily.
The corpus-based translation used by the statistical approach requires parallel corpora for each language pair in order to work. This means a large amount of equivalent, human-produced data has to be fed into the process. Producing enough data for the machine to work automatically in every language pair cost a great deal of human effort and money.
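To make the dependence on parallel data concrete, here is a minimal sketch in Python of how word translation probabilities might be estimated from sentence pairs. It uses plain co-occurrence counts, which is a deliberate simplification and not the alignment training that real statistical systems use. The function names and the tiny English-German corpus are assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def cooccurrence_table(parallel_corpus):
    """Count how often each source word appears alongside each target word."""
    counts = defaultdict(Counter)
    for source_sentence, target_sentence in parallel_corpus:
        source_words = source_sentence.lower().split()
        target_words = target_sentence.lower().split()
        for s in source_words:
            for t in target_words:
                counts[s][t] += 1
    return counts


def translation_probs(counts):
    """Turn raw co-occurrence counts into relative frequencies per source word."""
    probs = {}
    for s, target_counts in counts.items():
        total = sum(target_counts.values())
        probs[s] = {t: n / total for t, n in target_counts.items()}
    return probs


# Hypothetical toy corpus: every pair had to be written by a human translator.
corpus = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
]

probs = translation_probs(cooccurrence_table(corpus))
best_for_house = max(probs["house"], key=probs["house"].get)
print(best_for_house)
```

Even in this toy version, the estimates only improve as more human-translated pairs are added, which is the cost described above.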
This method was therefore considered less promising, and active learning technology is applied in current translation technology instead. Active learning translation does not only look for parallel or equivalent corpora; it also draws on crowd-sourced data, which is processed and analyzed to produce a translation. This is a huge computing task, but current supercomputers are capable of handling it, and the system also stores the added information for use in later translations.
The process is loosely similar to how our brains translate, which is why it is also called neural machine translation. It uses a neural network to automate the process. It can make decisions with only a small amount of human intervention, which makes this approach more feasible than its predecessor.
Both statistical MT and neural MT require a large amount of data, that is, big data. However, the newer approach demands less effort from language experts, because it employs an active learning algorithm that can learn languages from non-expert sources and generate parallel data to build a more reliable machine translation system.
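The idea of spending human effort only where it helps most can be sketched as a single active learning round in Python. This is an illustrative sketch under stated assumptions, not how any particular translation system works: the "model" is reduced to a set of known words, and the selection rule (pick the sentences with the highest share of unknown words) is just one simple query strategy. All function names and the sample sentences are hypothetical.

```python
def unknown_ratio(sentence, known_vocab):
    """Share of words in the sentence that the system has not seen yet."""
    tokens = sentence.lower().split()
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in known_vocab)
    return unknown / len(tokens)


def select_for_annotation(pool, known_vocab, budget):
    """Pick the `budget` sentences the system is least prepared to translate."""
    ranked = sorted(
        pool,
        key=lambda sentence: unknown_ratio(sentence, known_vocab),
        reverse=True,
    )
    return ranked[:budget]


def active_learning_round(pool, known_vocab, budget):
    """Run one round: choose sentences, 'learn' their words, shrink the pool."""
    chosen = select_for_annotation(pool, known_vocab, budget)
    updated_vocab = set(known_vocab)
    for sentence in chosen:
        # Assumption: a human (expert or not) translates the chosen sentence,
        # so its words count as known from now on.
        updated_vocab.update(sentence.lower().split())
    remaining = [sentence for sentence in pool if sentence not in chosen]
    return chosen, remaining, updated_vocab


# Hypothetical pool of untranslated sentences and a small starting vocabulary.
known = {"the", "cat", "sat"}
pool = ["the cat sat", "the dog barked loudly", "quantum flux"]

chosen, remaining, known = active_learning_round(pool, known, budget=1)
print(chosen, remaining)
```

Only the selected sentences are sent to a person, so the amount of human work is capped by the budget in each round instead of growing with the size of the whole corpus.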