Concept for Algorithmic Identification of Primary Spoken Language of Second-Language Speaker Using Meta-Analysis of Deviations from Proper Usage Both for Algorithmically-Translated Text and Human-Translated Text
Human beings and machines alike use algorithms to parse and process language. Because simple translation algorithms still lack the range of a human translator, it is fairly easy to tell when a sample of text is the product of an algorithmic translator, provided the sample is large enough. In online propaganda, nation-states use a combination of algorithmically translated text and “expertly” translated text prepared by humans.
For algorithmically translated text, simply having a copy of the Google Translate translation matrix would enable a programmer to create an algorithm that indicates what the speaker's native language might be. (It is not clear whether even this level of analysis has been achieved algorithmically; a cursory search of Google Patents turned up nothing.) When text is translated by a human, one must build one's own system of analysis, one that takes into account the common errors made even by skilled translators, including artifacts of language that do not technically constitute errors and that only a native speaker with great verbal aptitude would perceive as irregular.
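As a rough illustration of what such a system of analysis might look like, the Python sketch below counts occurrences of deviation patterns associated with each candidate first language and ranks the candidates by how often their patterns appear. Everything named here is an assumption made for illustration: the `MARKERS` table, the labels `lang_a` and `lang_b`, the example phrases, and the `min_words` threshold are hypothetical placeholders, not validated linguistic data. A real system would need marker lists compiled by native speakers or derived from a corpus.

```python
import re

# Hypothetical marker table: each candidate first language maps to regex
# patterns for deviations an analyst might associate with it. The labels and
# phrases are illustrative placeholders, not validated linguistic data.
MARKERS = {
    "lang_a": [r"\bmake a photo\b", r"\binformations\b"],
    "lang_b": [r"\bdiscuss about\b", r"\bdo the needful\b"],
}


def score_text(text, markers=MARKERS):
    """Return marker hits per candidate language, per 1,000 words of text."""
    lowered = text.lower()
    n_words = max(len(re.findall(r"\w+", lowered)), 1)
    scores = {}
    for lang, patterns in markers.items():
        hits = sum(len(re.findall(p, lowered)) for p in patterns)
        scores[lang] = 1000.0 * hits / n_words
    return scores


def rank_candidates(text, markers=MARKERS, min_words=50):
    """Rank candidate languages by score, highest first.

    Returns an empty list when the sample is shorter than min_words, since
    the approach only makes sense for a sufficiently large sample.
    The default threshold of 50 is an arbitrary assumption.
    """
    if len(re.findall(r"\w+", text)) < min_words:
        return []
    scores = score_text(text, markers)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    sample = "We discuss about the plan and then discuss about the budget. " * 10
    for lang, score in rank_candidates(sample):
        print(lang, round(score, 1))
```

Normalizing by word count keeps long and short samples comparable, and refusing to rank very short samples reflects the caveat that the sample must be large enough. Plain pattern counting would not catch the subtler artifacts that are not strictly errors; those would likely call for statistical comparison against native-speaker text.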
