Just fifteen years ago, the average user rarely encountered language technology; now anyone with web access can use it. When you command Alexa to turn the lights on, ask Google what year your favorite movie was released or dictate a text message while you’re cooking dinner, you are using human language technology.
With over a quarter of UK homes now owning a smart device – not to mention the powerful computers we carry around in our pockets all day, every day – it’s not surprising that we have become accustomed to a certain way of living.
ALEXA, WHAT IS LANGUAGE TECHNOLOGY?
The ability to use language – to chat, debate, question, discuss – comes naturally to most of us. Language technology, or more commonly human language technology, is simply how computers and devices are programmed to handle that language. The ability of machines to analyse, produce, modify or respond to human language is a step change in artificial intelligence.
The Alexa, Siri and Cortana devices of our world feel like relatively new additions to our daily lives; however, one of the earliest and best-known examples of human language technology was the ELIZA program.
Developed by Joseph Weizenbaum at MIT in 1966, it was designed to respond to written statements and user questions in the style of a Rogerian – or person-centered – psychotherapist.
Such a program could seem genuinely intuitive, as though it understood us and could therefore respond intelligently. The rather more mundane reality is that programs like ELIZA follow a pattern-matching routine that relies on recognising just a few keywords.
Take the phrase “it seems you dislike me”. ELIZA recognises the keywords “you” and “me”, which match the general pattern “you [other words] me”. It then swaps “you” and “me” for “I” and “you” to offer a response: “What makes you think I dislike you?” In this example, ELIZA has no understanding of the word “dislike”, or indeed the concept, but doesn’t need one in order to devise a sound, logical response.
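The “you [other words] me” rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Weizenbaum’s original code: the names PATTERN and respond, and the fallback reply, are assumptions made for this example.

```python
import re

# Hypothetical single rule: "you <other words> me".
# The words in between are captured but never interpreted.
PATTERN = re.compile(r"\byou\s+(.+?)\s+me\b", re.IGNORECASE)


def respond(statement: str) -> str:
    """Return an ELIZA-style reply built purely from keyword matching."""
    match = PATTERN.search(statement)
    if match is None:
        # Assumed fallback for statements the rule does not cover.
        return "Please go on."
    # Swap "you ... me" for "I ... you" and reuse the captured words as-is.
    return f"What makes you think I {match.group(1)} you?"


print(respond("it seems you dislike me"))
# prints: What makes you think I dislike you?
```

Notice that the word “dislike” is simply copied from the input into the reply; the program would respond just as confidently to a made-up verb.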
ALEXA, WHAT IS COMPUTATIONAL LINGUISTICS?
Teaching a program to recognise keywords sounds simple enough. However, in-depth knowledge of both linguistics and computer science is paramount to developing language technology to a more sophisticated level.
The field of “computational linguistics” – a term coined by David Hays, a founding member of both the Association for Computational Linguistics and the International Committee on Computational Linguistics – can be traced back to the United States in the 1950s, when computers were programmed to automatically translate foreign-language texts, such as Russian scientific journals, into English.
Computers could already complete arithmetic calculations much faster and more accurately than their human counterparts, leading many to believe that it was only a matter of time before they could conquer language too.
So, when machine translation (otherwise known as mechanical translation) failed to yield accurate translations right away, it became clear that getting computers to process language was far more complex than initially thought.
Researchers subsequently dedicated their time to developing algorithms and software for intelligently processing language data. In the 1960s, computational linguistics became a sub-division of artificial intelligence, with a focus on human-level comprehension and the accurate production of natural language.
ALEXA, WHAT NEXT FOR LANGUAGE TECH?
Language technology is already more prevalent than we likely realise. Filters in chat rooms, customer service chatbots on websites and parental controls all utilise it to identify and assess keywords. You may also have heard of “social media mining”, where programs automatically group user-generated content, such as tweets, by keyword in order to extract patterns and form conclusions about different demographics. It’s all made possible with language technology.
Such advances became possible once we embraced probabilistic models of language, whether text or speech-based, and trained those programs on vast data collections.
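To give a feel for what a probabilistic model of language means, here is a minimal sketch of one of the simplest kinds, a bigram model, which estimates how likely one word is to follow another by counting word pairs. The three-sentence corpus and the function names train_bigrams and probability are assumptions made for this illustration; real systems learn from vastly larger collections and use far richer models.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def probability(counts, prev, nxt):
    """Estimate P(nxt | prev) as a simple relative frequency."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Tiny made-up corpus, for illustration only.
corpus = [
    "turn the lights on",
    "turn the lights off",
    "turn the music on",
]
model = train_bigrams(corpus)

# "lights" follows "the" in two of the three sentences.
print(probability(model, "the", "lights"))
```

A model like this never stores a rule saying what “lights” means; it only learns that, in the data it has seen, “lights” is a likely word to follow “the”.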
Still, there are many things we cannot yet do…
Understanding the meaning and context behind our language is hard enough. But language is also constantly evolving (remember when ‘sick’ was a bad thing?), so we have to figure out how to keep pace in order to power more applications in the future. I hope we do, because humans and machines can achieve so much together when we’re speaking the same language.