a Deaf Journalism Europe joint production by Wille Felix Zante (Taubenschlag), Jos de Winde (DNieuws/Turkoois) and Charlotte Berthier (Médiapi)
SUMMARY
The rise of Artificial Intelligence (AI) almost immediately saw several startups claiming to have solved the long-standing problem of sign language avatars not being good enough. Sign language avatars are essentially toolkits that aim to translate spoken or written language into signed languages by way of animated computer models resembling low-effort Pixar-style characters. These have been plagued by the “uncanny valley” effect, the phenomenon where computer animations are almost lifelike but still immediately look “off” to the human eye. As a replacement for human translation, sign language avatars are therefore by and large useless, because they need almost as much work, or cost almost as much, as human translation. In the fall of 2023, a company in Germany made a big splash in the media by selling sign language avatar kits to municipalities nationwide, with media outlets reporting that AI could be used to make texts accessible. The very questionable result was rightly criticized by a local Deaf initiative. At the same time, the National Association of the Deaf, the DGB, announced that it explicitly recommends against the use of avatars because the technology is not yet mature enough. Interestingly enough, the DGB spokesperson was involved with one of the Deaf enterprises partnered with the (hearing-led) avatar-producing company. Our research also uncovered that there are potentially millions to be made by selling this relatively “low effort”, questionable solution to municipalities across the nation, since German law requires public institutions to have sign language accessibility baked into their websites. The subject matter is complex, and we have tried our best to condense it in this summary paragraph. For further details, continue reading the full article, which goes in depth and covers the avatar/AI gold rush from a German perspective. We hope you enjoy this experiment in bringing complex investigative journalism to the Deaf communities.
Throughout history, the deaf community has faced technological innovations with a mix of hope and apprehension. From the anxiety-inducing arrival of phones and radios to the silent TV era and fax machines, each advancement brought its own set of challenges and opportunities. Now, as we stand on the cusp of a new technological era, marked by the emergence of avatars and artificial intelligence, the deaf community finds itself at another crossroads. But what does this mean for the community today, and how does it fit into the broader narrative of technological progress? These questions will be explored, and the economic and societal implications within the deaf community unpacked. We had the opportunity to draw on the insights of Maartje De Meulder, an expert at the intersection of “Deaf Studies, language policies, sign language interpreting, and AI sign language technologies”, as well as Ralph Raule, an accessibility entrepreneur in the deaf ecosystem and digitalization advocate at the German national association of the Deaf.
After three years of research and development, late 2023 saw the introduction of sign language avatars on German municipal websites. The toolkit, dubbed “Kommunaler Gebärdensprach-Avatar-Baukasten” (“municipal sign language avatar kit”, “KGA-Baukasten” for short), was based on a research project called AVASAG, spearheaded by Charamel and yomma, two companies based in Cologne and Berlin respectively. The toolkit built on this research is an offering by Charamel. The aim was simple: to create a more cost-efficient way of making websites accessible, based on the idea that most written content, such as the data protection information required by the GDPR, is largely interchangeable between sites.
For this article, Deaf Journalism Europe researched the current situation in Germany, where the National Association of the Deaf (NAD) recently advised against the use of sign language avatars in the wake of semi-automated avatar translation services becoming more and more widespread. We use this as a starting point to evaluate risks and chances for the Deaf community. The expertise of deaf researcher Maartje De Meulder, whom we interviewed over email, covers the academic perspective on this matter, which intersects with technology, ethics and economics.
The history of sign language avatars dates back to 1982 and was born out of the need to translate written information into sign language. However, this undertaking ran up against considerable difficulties. Unlike written languages, sign languages do not have a universally accepted written form, relying instead on various notation systems such as HamNoSys or Stokoe notation, which are not easily translatable. In addition, the lack of exhaustive corpora for signed languages further complicates the process, preventing advances similar to those seen in the transcription and translation of written languages, facilitated by technologies such as DeepL or Google Translate.
The long-term goal with avatars is to replace human interpreters, a topic we discuss with researcher Maartje De Meulder later in this article. In the short term this is not yet possible, but several projects have produced comprehensible, though far from perfect, avatars. While many research and commercial projects have involved Deaf professionals, one project sparked controversy last year by offering ready-made avatars to official institutions and municipalities in search of cost-effective ways to comply with Germany’s otherwise very lenient accessibility laws. It is evident that the main goal here is cost cutting, while also addressing the issue of interpreter and translator availability.
Research by DJE partners uncovered a paper discussing the application of the municipal avatar project in one town. The paper, which is used to inform a decision by said town’s parliament, cites costs of EUR 3,750 per year, a complete steal compared to the claimed EUR 170 per hour that flesh-and-blood sign language interpreters would cost. That number is based on the generally accepted hourly rate in Germany for two interpreters in most settings, which is EUR 85 per hour per person. The paper does not mention that translators for film generally charge even more, based on the result: between 70 and 120 euros per minute of film. Interpreting and translating are of course difficult to compare, and the fact that the paper does so anyway shows how little the people who ultimately decide these matters know about the subject.
A NAD representative in a double role – between commerce and politics
Either way, in these cases the cost savings would be very tangible and a huge incentive for municipalities and other public offices to switch to avatars, which in the long term could become more and more automated. When the research project AVASAG launched three years ago, Charamel and yomma were happy to work together; in the fall of 2023, however, the first cracks started to appear.
The Kompetenzzentrum Gebärdensprache in Bayern (KOGEBA, English: “Sign Language Expertise Center in Bavaria”), a new initiative closely associated with the Munich Deaf Association, protested the use of sign language avatars and criticized the fact that one of the people involved in the Charamel and yomma joint venture held a double role as an NAD representative. When questioned by DJE partners, both company spokespersons along with the NAD representative claimed that everything was above board and that the representative’s expertise and background with the technical solutions were the exact reasons he was involved in the project.
Meanwhile, avatars constructed by Charamel using the toolkit apparently developed in the AVASAG project started appearing on websites nationwide, all identical but adapted to the cities whose content they claimed to translate.
Fast forward to spring of 2024. The Bavarian initiative issued another public statement examining the avatars produced with the toolkit. This time round, it focused on websites that had already implemented the avatar, such as that of the Zeppelin Museum in Friedrichshafen and several other sites, mostly belonging to municipalities, using the municipal avatar toolkit from Charamel. KOGEBA found the avatars lacking, especially when custom words such as the cities’ names are used: the fingerspelling is too slow and too long, and the signs are hard to understand, the initiative says.
Days later, the NAD, in collaboration with the national sign language teachers’ association, the national association for the hearing impaired and the certified deaf interpreters’ association, published another short and, quote, “preliminary” statement on the use of avatars, staunchly advising against it. The presenter was none other than the NAD representative previously criticized for his double role in the AVASAG research project: Ralph Raule. What caused this divide? We reached out to yomma, Charamel and of course Raule. All replied, save for yomma, who apologized profusely but said they were unable to reply just yet; by the time this article was published, they still had not. Nor had the NAD published the final joint statement it had announced. When we interviewed him in spring of 2024, Raule insisted we quote him as the NAD representative and not in his role at yomma.
But first of all, what are avatars?
Avatars are digital representations used in virtual environments. A sign language avatar is an animated character that signs: instead of speaking, it produces sign language. There are various techniques that can be used to create one.
The first is motion capture, where signers wear a suit with markers. Cameras capture the movements from different angles, and the markers make it possible to turn the recorded movement into the animations that drive the digital character.
The second method is through coding. This involves converting elements such as hand shapes, hand location and direction, as well as facial expressions, mouth shapes, eyes, and eyebrows into code. These are then programmed into movement.
Those two methods can also be combined, using motion capture for single gestures and using manual instructions for transitioning and combining gestures.
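To give a rough idea of what “coding” a sign can mean in practice, here is a minimal, hypothetical sketch in Python. It is not based on HamNoSys, Stokoe notation or any real avatar engine; the field names and values are purely illustrative, and a production system would model far more parameters.

```python
# Minimal, hypothetical sketch of how a sign could be "coded" as parameters
# that an animation engine turns into movement. Real systems (e.g. those
# built on HamNoSys or Stokoe notation) are far more detailed; the field
# names and values here are illustrative only.
from dataclasses import dataclass, field


@dataclass
class NonManuals:
    mouth: str = "neutral"      # mouth shape / mouthing
    eyebrows: str = "neutral"   # raised, furrowed, ...
    eye_gaze: str = "forward"


@dataclass
class CodedSign:
    gloss: str                  # label for the sign, e.g. "TRAIN"
    handshape: str              # e.g. "flat-hand"
    location: str               # e.g. "chest", "chin"
    movement: str               # e.g. "arc-right"
    non_manuals: NonManuals = field(default_factory=NonManuals)
    duration_ms: int = 600


def build_timeline(signs: list[CodedSign]) -> list[tuple[int, CodedSign]]:
    """Lay coded signs out on a simple timeline with short transitions,
    the kind of step where motion-captured clips and manually coded
    transitions could be combined."""
    timeline, t = [], 0
    for sign in signs:
        timeline.append((t, sign))
        t += sign.duration_ms + 150  # 150 ms transition between signs
    return timeline


if __name__ == "__main__":
    sentence = [
        CodedSign("TRAIN", "fist", "shoulder", "forward"),
        CodedSign("DELAYED", "flat-hand", "chest", "downward",
                  NonManuals(mouth="puffed", eyebrows="furrowed")),
    ]
    for start, sign in build_timeline(sentence):
        print(f"{start:>5} ms  {sign.gloss}")
```

Even this toy example shows why the approach is labor-intensive: every sign, transition and non-manual feature has to be described explicitly before an avatar can move at all.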
Do Avatars have a Purpose?
Maartje De Meulder, a deaf researcher from Belgium, says: “Absolutely, there are definitely everyday use cases for sign language production/generation in the form of avatars (virtual humans), for example in domains such as the hospitality sector, tourism, and for semi-automated customer interaction (comparable to chat bots).” De Meulder describes herself as working at the intersection of “Deaf Studies, language policies, sign language interpreting, and AI sign language technologies.” She is curious about how these fields “impact the daily lives and rights of deaf people, and the broader context of sign language rights.”
Avatars have a definite advantage over flesh and blood interpreters in some use cases. There are many places where avatars could be deployed instead. On trains, at the airport, on social media, or to make places more accessible, especially if it’s not known whether there will be deaf sign language users there. Avatars should ideally not be intended to replace sign language interpreters but to complement them. “Automated translation may help alleviate some of the current limitations with human sign language interpreting services, but this will require significant technological advancements,” says De Meulder.
In the future, you might be able to deploy an avatar in situations where no interpreter is available, or perhaps in situations where you would prefer not to have an interpreter. Language technology is fundamentally a positive development, but much still needs to be done to make it usable. Avatars can also be thought of in the context of deaf children and their hearing parents: when learning sign language, an avatar can serve as a supplement between lessons, giving parents who want to learn sign language more opportunities to practice. In general, avatars can help raise awareness and visibility of signed languages.
Why are Deaf People so vocal against Avatars?
There is a certain aversion to the use of sign language avatars among the deaf community. De Meulder comments on this:
“Currently, most avatars presented to deaf users are prototypes and may not fully meet user needs. Typically, feedback is solicited by hearing developers in laboratory settings which may not accurately reflect real-world usage. The risk with asking this kind of user feedback is that deaf people will see avatars’ signing as another signing style they’ll have to put up with and learn to ‘understand’ (just as they need to learn to understand interpreters’ signing). This can lead to socially desirable responses.” There is also a possible risk that the respondents in the surveys might be subconsciously biased: “Respondents might say they understood just because they think they are expected to appreciate this technology that is made ‘for’ them.” One issue with research in this field is that there is no reliable practical data: “There is a big difference between watching an avatar from a screen in your own office for a short experiment, and having to watch it during a nerve-wracking medical appointment.”
De Meulder continues to talk about expectations within the Deaf community: “Some deaf people may hold high expectations for signing avatars, hoping for a magical solution that can improve access. It is important to carefully manage expectations and not to oversell capacities of sign language technologies.”
She does, however, see the risk of governments using avatars as a cheap way out on the road to full accessibility: “There is a concern among many deaf people that governments will choose the cost-effective route with mediocre machine translation over human interpreters in some situations, and that deaf people will be forced to accept machine translation in situations where it is not warranted. This concern stems from broader experiences with how governments and other institutions deal with linguistic diversity and multilingualism. We know this is a slippery slope: deaf people have objected against the use of VRI (Video Remote Interpreting, eds.) in hospitals for example, yet it is now largely accepted to use VRI in some medical situations.”
Sign Language Avatars: Navigating Technology, Advocacy, and Ethical Concerns
Criticism of sign language avatars goes back at least to 2018, when the World Federation of the Deaf (WFD) published a statement together with the World Association of Sign Language Interpreters (WASLI) on the use of sign language avatars. The main criticism was that computer-generated translations could not reach the quality of human interpreters, particularly regarding sociolinguistic and sociocultural factors. Avatars should never be used in emergencies or other life-threatening situations, as the margin for error is too high and – this being another important point in itself – there is no two-way communication with avatars. Information can be translated, but Deaf people cannot talk back or even ask for clarification. Much like sign language gloves, avatar research largely seems to ignore that Deaf people are, in fact, able and eager to talk back. This is something avatars cannot deliver even in 2024, six years after the WFD/WASLI joint statement.
De Meulder differentiates between application domains, including “emergency situations”: “It is important to prioritize application domains and to identify those research agendas which are problematic while leaving space for those that are not. There is a distinction between for example an avatar presenting information on a government webpage, or an avatar used to mediate communication during a life-threatening healthcare situation. Prioritizing and distinguishing different application domains will advance the state of the art in such a way that it is more likely technologies will be adopted by end users. It is also important to prioritize within application domains and make distinctions per different uses per domain. For example, not all applications in the medical domain are the same. Some might be useful, for example check-in at a hospital, while others may remain a no-go, e.g. life-threatening surgery or other critical cases.”
In 2019, three Austrian associations followed up with further statements advising against the use of avatars. The linguistics association Verbal warned against the use of avatars because they would make sign language look like an “artificial” language, as well as depriving Deaf people, especially children, of role models. It refutes the main pro-avatar argument that avatars would provide a cheaper translation solution in the long run by stating that human rights, like the right of access to communication, cannot be weighed up in money. It worries that increased funding for avatar research will result in less funding for actual sign language research and support for Austrian Sign Language, which is recognized as United Nations (UNESCO) cultural heritage. The Austrian National Association of the Deaf (ÖGLB), like the WFD and WASLI before it, agrees that research and production of sign language avatars should always be led, monitored and evaluated by Deaf professionals. The ÖGLB states that it is worried about Deaf people becoming increasingly irrelevant to the representation of sign language.
All associations, even the European Union of the Deaf (EUD), agree that sign language avatars can be permissible for non-vital information such as delayed trains and weather forecasts. They are, however, also unified on the opposite point: avatars should never be used for vital information, such as in emergencies or disasters.
For De Meulder, it is necessary for advocates to back their arguments against avatars up with facts and research of their own: “Deaf organizations need to prepare for potential disruptive changes caused by advancements in language technologies, especially machine translation. I would say: look at the horizon. Deaf associations can help proactively shape policies that anticipate future developments, ensuring that deaf people continue to have choices in how they access information. Another issue deaf associations (and sign language interpreter associations and training programmes!) need to think about is how this will affect the sign language interpreting profession.”
It is now clear that the emergence of avatars, like a technological tsunami, cannot be stopped, and that the deaf community must follow and become involved in this development if it is not, once again, to miss out on accessibility advancements. However, we can ask ourselves about the ethics of this involvement, a problem that arises in the world of deaf academics as well as in the world of deaf politics.
The Deaf community and its involvement in avatar development: a tricky position
The German Association of the Deaf (DGB) had long been quiet on the subject of sign language avatars, recalls Ralph Raule, the current representative for Media and Digitalization at the German NAD. We talked to him about the preliminary statement, hastily published in 2024 just days after the second statement by the Bavarian KOGEBA initiative appeared. KOGEBA had already protested the AVASAG project in November 2023, claiming that the project, started in 2020, did not involve Deaf professionals at the relevant stages of producing sign language avatars.
More importantly, it criticized Ralph Raule’s involvement with AVASAG, a commercial project, while he also held the representative role with the NAD. Raule, who founded and holds a stake in a company producing sign language films, denied those claims, as did all project partners of AVASAG. They also maintained that, contrary to KOGEBA’s criticism, deaf professionals had indeed been involved. That was November 2023.
Since then, documents from municipalities have surfaced, dating from that same fall, in which pricing for the sign language avatars supplied by Charamel was discussed. Specifically, the Landkreis Tuttlingen discussed funding and realization of avatar usage on its websites. A price of €3,750 was quoted for two years, with an additional two “free” years of coverage, which means a total cost of €3,750 for four years. It is unclear whether pricing will remain the same going forward. “We’re convinced that we can reduce the use of ‘real’ sign language interpreters through the use of digital avatars”, says the paper. “Real” (sic!) sign language interpreters are quoted at €170 per hour for two persons. Going by this rate, which excludes other fees such as travel costs, the amount Charamel would bill for four years would cover only about 22 hours of interpreting. The paper goes on to compare the digital avatar to text-to-speech functions already implemented on the websites. It also quietly announced the introduction of these avatars to municipal websites starting at the turn of the year 2023/2024.
The Watchdog of the South strikes again
KOGEBA, ever so watchful, did not miss this. In its second statement, released on Valentine’s Day 2024, it lists 23 municipalities using sign language avatars. Among them are Munich, Duisburg, Würzburg and Regensburg, major German cities. Assuming pricing is the same across municipalities, this suggests a sales volume of just over €86,000. (When we checked with Charamel, they quoted us an estimated minimum licensing fee of €1,000 per year, which is roughly in line with the €3,750 for what is effectively four years’ worth of avatar usage.) Germany has slightly more than 10,000 municipalities, allowing for a potential sales volume of around 37 million euros across four years, using a concept that is considerably cheaper than actual sign language interpreters. There is a lot of money to be made with developing, testing and licensing avatars.
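To make the arithmetic behind these figures transparent, here is a back-of-the-envelope sketch that uses only the prices and counts cited above; the derived numbers are our own illustrative estimates, not figures provided by Charamel or the municipalities.

```python
# Back-of-the-envelope check of the figures cited above; the inputs come
# from the article and the municipal paper, everything derived is our own
# illustrative arithmetic.
avatar_price_eur = 3_750               # quoted for four years of avatar usage
interpreter_rate_eur_per_hour = 170    # quoted rate for two interpreters
municipalities_listed = 23             # municipalities in KOGEBA's second statement
municipalities_germany = 10_000        # approximate total number in Germany

# How many hours of human interpreting the same budget would buy (~22 hours)
hours_covered = avatar_price_eur / interpreter_rate_eur_per_hour
print(f"Interpreting hours for the avatar budget: {hours_covered:.0f}")

# Sales volume implied by the 23 listed municipalities (EUR 86,250)
print(f"Listed municipalities: EUR {municipalities_listed * avatar_price_eur:,}")

# Hypothetical ceiling if every German municipality bought the kit (EUR 37,500,000)
print(f"Nationwide ceiling: EUR {municipalities_germany * avatar_price_eur:,}")
```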
After KOGEBA’s second statement, the German national association of the Deaf, the DGB, reacted quickly, and together with the national association of sign language teachers, national association of hard of hearing people and the certified deaf interpreter’s association, published a statement clearly advising against the use of sign language avatars.
This statement was presented by Ralph Raule, the same representative whose involvement with the avatar project AVASAG through his company, yomma, was criticized previously. How did this change of mind happen?
Right away, Raule clarifies a few things in an email interview. The avatars that were criticized in the second statement, namely those used by Zeppelin Museum in Friedrichshafen, were experimental, he says. “The general idea was to test the waters: See what we need and how far advanced the technology is, in order to develop a good avatar system later.” What’s seen today on the Zeppelin website is purely experimental and the people used for motion capture were not trained or certified as interpreters. “It had been clear from the start that this was only supposed to be used after qualified professionals had verified that it meets production standards”, he says. “It’s clear to anyone who knows sign language that the avatar in this state is not fit for publication.” He stresses again that he had been adamant about not using the experimental videos on public facing websites. „At yomma, we were surprised because the test footage we recorded was published for the Zeppelin Museum. Charamel acted like it was perfectly reasonable and a mature product.“
Alexander Stricker, owner of Charamel, the avatar company currently selling their avatar to municipalities nationwide, tells us in an email that “it is very important to us to involve the end-users in an early state”. The product, he continues, “is being developed in a series of iterative, that means continuative processes”. He says he is “explicitly grateful” for the second KOGEBA statement which he says helps Charamel to improve on their tech.
When questioned about why the NAD’s preliminary statement followed so quickly after the second KOGEBA statement, Raule admits that the DGB had had a statement in the works as early as 2020, but was never happy with it because the technology kept advancing so fast. There were talks with KOGEBA in December 2023. Raule says KOGEBA asked why the statement had not yet been published, and whether this was because of his double involvement with yomma/AVASAG and the DGB. “This question made me reconsider and finally finish that statement.” He clarifies that the statement is the product of four major associations and as such involves some very specific wording. The statement is closely related to their attempts at creating a unified guideline for sign language quality standards in media and translation. The advancement of avatar technology put pressure on getting the statement out the door, says Raule. “Many officials seem to assume that avatars are a silver bullet for accessibility for Deaf people without asking the Deaf community what they think of avatars.”
Since the municipal avatar project became public, more and more newspapers have reported that it will soon be possible to use AI to translate written or spoken (auditory) language into sign language. When questioned about this, the journalists involved in those articles (hearing, without any sign language background) revealed that they were referring specifically to the municipal avatar project, the same project that has come under fire from KOGEBA and whose output has been described as not fit for production. We asked Alexander Stricker of Charamel whether it was made clear to yomma that the experimental material would be used in real-life scenarios. Stricker replied that the research results from the AVASAG project, which Charamel initiated and in which yomma, among other partners, took part, could be used by any partner in the project. This, he says, has been clarified in a “thorough” contract between all partners.
Summary and Outlook: Where Do We Go From Here?
One only needs to look at the progress from the first outings of AI-generated video in 2023 to 2024’s Sora video generation to see that the advance of technology has been extremely fast. Written-language translation services like DeepL have, thanks to advances in AI, become very reliable too. It is not unfathomable that AI will be used to improve automated sign language interpreting through avatars, even though most people seem skeptical about it and the uncanny valley effect many associate with sign language avatars still needs to be taken into account.
Ever since the debut of Memoji “avatars” with Apple’s iPhone X, it’s become clear that motion tracking without full motion capture suits and specialized technology is possible for regular users. The Californian company’s recent introduction of “personas”, which are basically individually designed avatars for use in video chat when the new Vision Pro headset is used, shows that it is possible to copy most movement and hand shapes very accurately, almost circumventing the uncanny valley effect.
Looking beyond the interpreting aspect, there are other uses for this general technology as well, as shown by RWTH Aachen’s research into anonymization through sign language avatars. Anonymous sign language is nigh impossible, but avatars – or customizable “personas”, as Apple terms them – could provide a layer of secrecy or neutrality for sign language presenters. Seeing as avatars are mostly pursued as a cost-cutting measure, however, it is unlikely that this use is going to get funding and gain traction. Again, as the Austrian association Verbal said, human rights cannot be weighed up with money – this should be the key takeaway from the issue of avatars, and one can see how there is a lot of money to be made from providing “interpreting services” at a much lower cost.
De Meulder sees a special challenge for Deaf researchers. On the one hand, they are forced to show their best side in settings with hearing co-researchers while struggling with power and communication imbalances: “Many deaf researchers have experienced being asked to collaborate on sign language AI projects often well after the idea has been conceived, the team built, the research conducted or even near the project write-up. This often creates a catch-22 for deaf researchers because the technologies will be developed and the research will be published regardless of our involvement, while being involved may imply endorsement.”
De Meulder counts 10 research projects since 2015, all of which have received funding from the European Union, to the tune of almost 26 million euros in total. As with most technologies, development is economically motivated: “This is no different for sign language technologies, although the economic incentives might differ due to the smaller size of the market. But they are still prominent, and do not only have to do with generating profit but also with reducing costs.” She does not necessarily see this as at odds with human rights and accessibility; instead, the question should be how to “leverage these technologies to improve access for deaf people, and not just weighing them against their economic incentives.”
The debate on the ethical aspects of sign language avatars continues. The main danger for the deaf community lies in the risk of compromising the integrity of sign language and deaf culture. Sign language avatars could degrade the quality and authenticity of sign language, making it appear artificial or stripping out important cultural and sociolinguistic aspects. In addition, there are concerns about the commercial use of this technology, which could be driven by economic interests rather than by ethical and social considerations aimed at improving accessibility for the deaf community. Ultimately, if sign language avatars are not developed in a responsible manner, with meaningful involvement of the deaf community itself and with particular attention to the preservation of sign language and deaf culture, they could cause more harm than benefit.