Panjaya, led by a founder who sold a video startup to Apple, has ventured into video dubbing using deepfake technology.

There is a huge opportunity for generative AI in the world of translation, and a startup called Panjaya is taking this concept to the next level: a hyper-realistic generative AI-powered dubbing tool for videos that recreates a person's original voice speaking a new language, with the video and the speaker’s physical movements automatically modified to naturally match the new speech pattern.

The startup, which has operated in stealth for the past three years, is unveiling BodyTalk, the first version of its product, along with its first round of external funding of $9.5 million.

Panjaya is the brainchild of deep learning experts Hilik Shani and Ariel Shalom. They quietly worked on deep learning technology for the Israeli government for most of their professional lives, and are now the startup's general manager and CTO, respectively. They hung up their G-man hats in 2021 when the startup itch struck, and Guy Piekarz joined them as CEO a year and a half ago.

Piekarz is not a founder of Panjaya, but he is a notable name to have on board. In 2013, he sold a startup that he did found to Apple. That startup, Matcha, was an early player in streaming video discovery and recommendation, and it was acquired during the early days of Apple’s TV and streaming strategy, when there were more rumors than actual products. Matcha was bootstrapped and sold for a song: $10 million to $15 million. That’s modest considering the significant inroads Apple ultimately made into streaming media.

Piekarz stayed with Apple for nearly a decade, building Apple TV and then its sports vertical. He was then introduced to Panjaya through Viola Ventures, one of its backers (others include R-Squared Ventures, JFrog co-founder and CEO Shlomi Ben Haim, Chris Rice, Guy Schory, Ryan Floyd of Storm Ventures, Ali Behnam of Riviera Partners, and Oded Bardi).

“By then, I had left Apple and was planning to do something completely different,” Piekarz said. “But when I saw the technology demo, it blew my mind, and the rest is history.”

BodyTalk is interesting for the way it brings several technologies, each working on a different aspect of synthetic media, into the frame simultaneously.

It starts with audio-based translation, which currently covers 29 languages. The translation is then spoken in a voice that mimics the original speaker and set to a version of the original video in which the speaker’s lips and other movements are modified to fit the new words and phrases. All of this is created automatically after the user uploads a video to the platform, which also comes with a dashboard with additional editing tools. Future plans include an API as well as getting closer to real-time processing. (Currently, BodyTalk is “near real-time,” taking minutes to process a video, Piekarz said.)

“We’re using the best products where we need them,” Piekarz said of the company’s use of large language models and other tools from third parties. “And we’re building our own AI models where there really isn’t a solution on the market.”

An example of this, he continued, is the company’s lip sync. “Our entire lip-sync engine was developed in-house by our AI research team because we couldn’t find anything that reached this level and quality for multiple speakers, angles, and all of the business use cases we wanted to support.”

The current focus is only on B2B. Clients include JFrog and the TED media organization. The company has plans for further expansion in media, especially in areas such as sports, education, marketing, healthcare and medicine.

The resulting translated videos are quite uncanny, not unlike what you get from a deepfake. But Piekarz winces at the term. Over the years it has picked up negative connotations that are the opposite of the market the startup is targeting.

“‘Deepfakes’ are not what we are interested in,” he said. “We’re trying to avoid that whole name thing.” Instead, he said, he thought of Panjaya as part of a “deeply real category.”

He added that by targeting only the B2B market and controlling who can access the tools, the company is creating “guardrails” to prevent misuse of the technology. He also believes that in the long term, more tools, including watermarking, will be built to help detect when videos have been modified to create synthetic media, whether legitimate or nefarious. “We definitely want to be a part of that and we don’t want to allow misinformation,” he said.

The not-so-fine print

A number of companies compete with Panjaya in the broader AI-based video translation space, including big names like Vimeo and Eleven Labs, as well as smaller players like Speechify and Synesis. For all of them, building ways to improve how dubbing works feels like swimming against a strong current. That’s because captions have become a very standard part of how we consume videos these days.

On TV, there are numerous reasons for this, including poor speakers, background noise in our busy lives, mumbling actors, limited production budgets, and more sound effects. CBS polled American TV viewers and found that more than half keep captions on “some (21%) or all (34%) of the time.”

But some people like captions simply because they are fun to read, and a whole cult following has formed around that.

On social media and other apps, captions are simply baked into the experience. TikTok, for example, began turning on captions by default for all videos in November 2023.

Nonetheless, the international market for dubbed content is still huge, and even though English is often thought of as the lingua franca of the internet, research groups such as CSA have evidence that content delivered in native languages drives better engagement, especially in a B2B context. Panjaya’s argument is that more natural native-language content can yield even better results.

Some customers appear to support this theory. According to TED, using Panjaya’s tools, views for dubbed talks increased by 115% and completion rates for translated videos doubled.
