Speech Recognition, AI, Machine Learning | MarTech Series
https://martechseries.com/category/predictive-ai/ai-platforms-machine-learning/speech-recognition/
Marketing Technology Insights
Wed, 29 Apr 2026 13:28:04 +0000

Deepgram Launches Flux Multilingual: The World’s First Multilingual Conversational Speech Recognition Model
https://martechseries.com/predictive-ai/ai-platforms-machine-learning/speech-recognition/deepgram-launches-flux-multilingual-the-worlds-first-multilingual-conversational-speech-recognition-model/
Wed, 29 Apr 2026 13:28:04 +0000


One model, ten languages, and monolingual-grade accuracy for voice agents worldwide

Deepgram, the real-time AI infrastructure company underpinning the Voice AI economy, announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition model beyond English to support 10 languages, with the ability to automatically detect, understand, and switch languages dynamically within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first real-time conversational speech recognition model, delivering accurate turn-taking, interruption handling, low latency, and natural human-like conversations at global scale.


Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition. Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.


With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.

“Voice AI agents will soon become the default for how global enterprises interact with customers,” said Scott Stephenson, CEO and Co-Founder, Deepgram. “Today is a major step forward towards that future. Flux Multilingual gives developers a single perception model to build global voice agents, with the ability to switch language mid-call. Now, enterprises can deliver the same seamless experience to any customer, in any market. Deepgram is the leader in real-time AI infrastructure, and Flux Multilingual is the latest in our suite of capabilities that enables developers to deliver real-time products across the globe.”

“Customers told us that Flux transformed what’s possible for real-time voice AI agents in English,” said Omar Paul, Vice President of Products, Twilio. “It stood to reason that Deepgram would solve this globally too. Our customers’ teams no longer need to sacrifice accuracy with legacy multilingual systems, nor stitch multiple models with complex routing themselves. With Flux Multilingual, teams take the exact conversational experience they built for English and extend it across languages with a single system.”


Flux Multilingual Capabilities

Supported Languages
English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch

Ultra-low latency conversational speech recognition, now global
Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.
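To make the distinction concrete, here is a toy Python sketch. It is not Deepgram's implementation and does not use its API. It assumes each audio frame carries a voice-activity flag and a hypothetical `eot_prob` score, a stand-in for whatever a turn-detection model might output. The names `Frame`, `silence_end_of_turn`, and `model_end_of_turn`, the 700 ms timeout, and the 0.8 threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    t_ms: int        # timestamp at the end of the frame, in milliseconds
    is_speech: bool  # voice-activity flag for the frame
    eot_prob: float  # hypothetical model score that the speaker's turn has ended


def silence_end_of_turn(frames: List[Frame], silence_ms: int = 700) -> Optional[int]:
    """Fixed silence timeout: fire once silence has lasted `silence_ms`."""
    silent_since = None
    for f in frames:
        if f.is_speech:
            silent_since = None          # any speech resets the timer
            continue
        if silent_since is None:
            silent_since = f.t_ms        # first silent frame of this pause
        if f.t_ms - silent_since >= silence_ms:
            return f.t_ms
    return None


def model_end_of_turn(frames: List[Frame], threshold: float = 0.8) -> Optional[int]:
    """Score-based rule: fire at the first frame whose score reaches the threshold."""
    for f in frames:
        if f.eot_prob >= threshold:
            return f.t_ms
    return None


def build_example() -> List[Frame]:
    """Made-up trace: speech, a long mid-sentence pause, more speech, then the real end."""
    frames = []
    frames += [Frame(t, True, 0.05) for t in range(100, 600, 100)]     # 100..500 ms speech
    frames += [Frame(t, False, 0.10) for t in range(600, 1500, 100)]   # 600..1400 ms pause
    frames += [Frame(t, True, 0.05) for t in range(1500, 2100, 100)]   # 1500..2000 ms speech
    frames += [Frame(t, False, 0.90) for t in range(2100, 2400, 100)]  # 2100..2300 ms turn over
    return frames


if __name__ == "__main__":
    trace = build_example()
    print("silence timeout fires at:", silence_end_of_turn(trace), "ms")
    print("score-based rule fires at:", model_end_of_turn(trace), "ms")
```

In this made-up trace the speaker pauses mid-sentence for roughly 900 ms. The fixed timeout fires during that pause, at 1300 ms, and would cut the speaker off. The score-based rule waits until the frame at 2100 ms, where the hypothetical model signals that the turn is over.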

Monolingual-grade accuracy with real-time language control
Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.

Build and scale global voice agents with one model
Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration. One model, ten languages, one API, with no additional infrastructure or model orchestration required.

Key Features

  • Native turn detection and interruption handling for natural dialogue flow
  • Low-latency streaming transcription for real-time responsiveness
  • Automatic language detection and language hint support for accuracy control
  • Mid-session configurability for dynamic language adaptation
  • Native code-switching within a single conversation
  • Fully compatible with existing Flux API integrations

Flux Multilingual is now generally available (GA). As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including Flux Multilingual and Nova-3 models.

Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.


Sanas Broadens Real‑Time Speech AI Platform for Global Enterprises
https://martechseries.com/predictive-ai/ai-platforms-machine-learning/sanas-broadens-real%e2%80%91time-speech-ai-platform-for-global-enterprises/
Thu, 26 Mar 2026 14:46:06 +0000

Real‑Time Language Translation and Speech Enhancement unlock clearer communication across languages, accents and environments

Sanas, the Speech AI Platform built for enterprise communication, announced the launch of Real-Time Language Translation alongside a major upgrade to Speech Enhancement. Together, these capabilities advance Sanas’ evolution from an accent tool into a full-scale speech ecosystem helping enterprises communicate without barriers.

Already powering millions of conversations across more than 100 countries, Sanas supports leading global enterprises across healthcare, financial services, retail and hospitality. The platform is trusted by organizations including Carelon, Cigna, Comcast, Huntington, Robinhood, UnitedHealth Group, Vanguard and Wyndham, who rely on Sanas to support high-volume, real-time communication across diverse teams and environments.

The new capabilities further unify the platform to deliver clearer communication at enterprise scale.


The Enterprise Communication Gap
Sanas was founded in 2021 at Stanford’s Artificial Intelligence Laboratory after a co-founder’s classmate lost a job not because of skill or performance, but because his accent was not easily understood. That moment revealed a deeper systemic challenge across global enterprises. Every global enterprise runs on voice, but the infrastructure powering real-time communication has not meaningfully evolved in decades.

Modern enterprise communication continues to be constrained by accent barriers, language fragmentation, inconsistent audio quality and limited real-time intelligence. As work increasingly happens through live audio in hybrid meetings, distributed teams and AI-assisted workflows, these gaps influence who is heard, how decisions are made and how effectively teams operate.

Sanas is building the real-time speech infrastructure layer that sits between humans, machines and customers in every live interaction. With on-device AI processing as the company’s core advantage, Sanas believes that all communication will eventually flow through this layer, where speech is enhanced, translated and understood in real time.

“Everyone deserves to be understood. We did not start Sanas to change how people speak. We started Sanas to remove barriers so communication can happen clearly and naturally, whether it is between humans or between humans and machines,” said Sharath Keshava Narayana, CEO and Co-founder of Sanas. “We are building the real-time speech infrastructure layer that enterprises will rely on for every interaction. These new capabilities move us closer to a world where communication is instant, clear and effortless.”


Two Innovations Expanding Real-Time Communication

Real-Time Language Translation
Sanas now enables real-time, speech-to-speech translation across more than 13 languages with minimal latency. The system automatically detects languages and preserves each speaker’s vocal identity, tone and intent so conversations feel authentic. For example, a person can speak in French while the listener hears the conversation in English in real time.

Speech Enhancement
Speech Enhancement upgrades low-fidelity audio into high- and ultra-fidelity speech in real time, marking a major advancement beyond traditional noise cancellation. While noise cancellation removes the background, speech enhancement reconstructs the foreground, restoring clarity, intelligibility and presence while preserving vocal detail and expression. For enterprises, the result is more efficient interactions, improved customer satisfaction, and lower hardware costs.

“These innovations bring us closer to a world where language, accent and audio quality are no longer barriers to understanding,” said Shawn Zhang, CTO and Co-founder of Sanas. “More than just translating words, we’re preserving meaning, emotion and identity, ensuring every conversation is heard as it was meant to be.”

Enterprise Security, Privacy and Performance
Built with enterprise-grade privacy in mind, Real-Time Language Translation supports containerized deployment options, ensuring data control and compliance for highly regulated industries.

Sanas’ secure-by-design architecture includes on-device and self-hosted deployment options, reducing reliance on cloud infrastructure while ensuring low latency and data sovereignty. The platform is reinforced by HITRUST i1 certification and is designed to support organizations with strict security and compliance requirements. Sanas does not monitor, record or store call data, underscoring its commitment to privacy and trust.


Noty.ai Offers Free Unlimited Meeting Transcriptions for Google Meet and Zoom to All Users
https://martechseries.com/sales-marketing/sales-enablement/unified-communications/noty-ai-offers-free-unlimited-meeting-transcriptions-for-google-meet-and-zoom-to-all-users/
Wed, 28 Jun 2023 07:40:29 +0000

Noty.ai, the developer of Noty Workplace AI Assistant, has announced the removal of all limitations on meeting transcriptions in its product.

Noty.ai, developer of the Noty Workplace AI Assistant for team productivity, today announced that it is removing all limits on meeting transcriptions in its product. From now on, all users, regardless of their plan, can transcribe all their calls and store meeting notes in the Noty app and Google Docs.

The modern market offers countless solutions for corporate data retention and loss prevention. However, there’s one source of business-critical information that is overlooked. We can safely say that companies lose up to 80% of their meeting data and attempts to address this issue aren’t sufficient. This is even more striking given that management spends 50% of its work time in meetings. Previously, businesses would record their meetings. However, this format of data retention is storage-intensive and inconvenient for data review and search.

With the development of AI speech recognition technologies, we’ve seen the emergence of a new category of tools.


Meeting transcription applications enable teams to store their calls in text format, which improves data discovery and review and takes up less space than audio recordings. Unfortunately, these tools limit the number of transcribed meetings, their duration, and the storage period. Users who want to preserve more data have to pay.

At first, Noty.ai followed a similar logic limiting the number of meetings and their duration depending on the plan.

However, it quickly became clear that it put too much stress on the users who had to make hard choices between the meetings they want to store.

Every meeting counts. Every call has the potential to deliver out-of-the-box ideas, unexpected insights, and milestone decisions for business development. This critical information shouldn’t fall through the cracks just because people tend to forget 50% of a conversation within one hour after it’s over.

Today, businesses need to store 100% of their meeting data. That’s why Noty.ai made a turning-point decision to remove all the limitations on the meeting transcriptions. This transition will not impact the accuracy of transcriptions, and it will remain at 95% for Google Meet and 94% for Zoom. Additionally, Noty users can highlight the most important parts of conversations in 1 click and create unlimited follow-ups.


“The decision to offer free and unlimited meeting transcriptions was a natural progression for Noty, driven by our commitment to improving productivity and dedication to understanding our customers’ needs. Since day one, our primary goal has been to enhance productivity and revolutionize the way we work by leveraging AI capabilities for remote work,” said Natalie Marina, CEO of Noty.ai. “In 2023, transcriptions have become essential in all communication tools, transcending the realm of luxury and becoming a ‘basic feature’. However, transcription alone does not directly contribute to productivity. At Noty, we are taking digital workflows and communication to the next level with an AI workplace assistant designed to facilitate task completion and ensure nothing falls through the cracks. By utilizing transcription, we can gather and store information, but our ultimate goal is getting the work done, ensuring tasks are diligently tracked until completion, and for the first time in your life, enabling you to follow up on 100% of your connections and meetings effortlessly.”

With meeting transcription made free, Noty.ai focuses on monetization of its premium functionality, Workplace AI Assistant. It is an AI-based productivity tool that pulls out business-critical data from call transcripts in seconds.

If a team made important decisions during the call, assigned tasks, or agreed on action items for a project, Noty Workplace AI Assistant will find this data and create a list of items. In the future, the app will enable users to create custom prompts for AI, increasing the capabilities of its users to automate meeting routines.

To try free unlimited meeting transcriptions for Google Meet and Zoom go to the Noty app.


VIQ Solutions Submits Two Patent Applications for Methods Related to Training and Automated Selection of Domain Specific Language Models (DSLM)
https://martechseries.com/predictive-ai/ai-platforms-machine-learning/speech-recognition/viq-solutions-submits-two-patent-applications-for-methods-related-to-training-and-automated-selection-of-domain-specific-language-models-dlsm/
Thu, 01 Jun 2023 13:28:14 +0000

VIQ Solutions Inc., a global provider of secure, AI-driven, digital voice and video capture technology and transcription services, announces it has filed two provisional patent applications with the United States Patent and Trademark Office (USPTO). The patent applications augment the Company’s speech engine agnostic workflows, improving the accuracy and usability of documentation for courts, insurance, law enforcement and media organizations across the globe.

The first patent application is directed to the systems and methods for training domain-specific Automated Speech Recognition (ASR) language models. The novel methods are intended to produce speech recognition that accurately handles domain-specific words, accented speech, multiple languages (including mixed-language audio files), and document formatting requirements.



The second patent application is directed to the automated selection of domain-specific ASR models. It aims to protect the various mechanisms used to load static or dynamic DSLMs. The static method loads a model based on a pre-assigned setting, while the dynamic methods use the Goodness Score or Artificial Intelligence Natural Language Understanding (NLU) to determine the context and domain of the audio, followed by a second pass using a DSLM, creating profound ASR improvements in domain-specific verticals.

Both workflows use DSLMs to improve the quality of the draft transcript. A DSLM starts from a baseline model built on a large database and is then trained as a new, task-specific model using the knowledge learned by the large language model. This reuse of what the original model learned results in increased efficiency, improved performance, and enhanced interpretability.
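The static and dynamic selection paths described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not VIQ's patented method: the release does not define the Goodness Score or the NLU step, so a simple keyword count (`pick_domain`) stands in for the domain classifier, and the transcriber functions, domain names, and keyword lists are all hypothetical.

```python
from typing import Callable, Dict, Optional, Set

# A transcriber takes raw audio bytes and returns text (hypothetical interface).
Transcriber = Callable[[bytes], str]


def pick_domain(draft: str, keywords: Dict[str, Set[str]]) -> str:
    """Stand-in domain classifier: count keyword hits in the first-pass draft."""
    words = set(draft.lower().split())
    scores = {domain: len(words & kw) for domain, kw in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def transcribe(
    audio: bytes,
    general: Transcriber,
    dslms: Dict[str, Transcriber],
    keywords: Dict[str, Set[str]],
    preset: Optional[str] = None,
) -> str:
    # Static path: a pre-assigned setting names the domain model directly.
    if preset is not None:
        return dslms[preset](audio)
    # Dynamic path: first pass with a general model to get a rough draft.
    draft = general(audio)
    domain = pick_domain(draft, keywords)
    if domain == "general":
        return draft                     # no domain detected, keep the draft
    # Second pass with the domain-specific model chosen from the draft.
    return dslms[domain](audio)


if __name__ == "__main__":
    # Fake transcribers so the sketch runs without any speech engine.
    general_model = lambda audio: "the defendant raised an objection"
    domain_models = {
        "courts": lambda audio: "[courts] The defendant raised an objection.",
        "insurance": lambda audio: "[insurance] The claimant filed a policy claim.",
    }
    domain_keywords = {
        "courts": {"defendant", "objection", "plaintiff"},
        "insurance": {"claim", "policy", "premium"},
    }
    print(transcribe(b"", general_model, domain_models, domain_keywords))
    print(transcribe(b"", general_model, domain_models, domain_keywords, preset="insurance"))
```

The first call takes the dynamic path: the draft mentions court vocabulary, so the audio is re-run through the hypothetical courts model. The second call takes the static path and goes straight to the preset insurance model without a first pass.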


“DSLMs are one of many layers of aiAssist™ that supplies NetScribe with workflow automation and high-quality draft transcripts,” said Vahram Sukyas, Chief Technology Officer, VIQ Solutions. “We anticipate ASR improvements in domain-specific verticals by taking advantage of our large volume of production data to fine tune the DSLMs.”

“The accuracy and usability of draft transcripts, FirstDraft™, is expected to increase and become even more competitive when enhanced with a series of ‘extra loops’ in aiAssist workflow through VIQ-trained DSLMs,” said Susan Sumner, President and Chief Operating Officer, VIQ Solutions. “Fewer edits equal greater efficiency and higher productivity as well as more capacity for faster turnaround orders, time savings, and cost reduction.”

Interprefy Unveils Next-Generation AI Event Translator Aivia
https://martechseries.com/predictive-ai/ai-platforms-machine-learning/interprefy-unveils-next-generation-ai-event-translator-aivia/
Wed, 26 Apr 2023 13:25:35 +0000

Real-time speech translation solution aims to completely remove global event language barriers

Multilingual meeting technology and services provider Interprefy has unveiled Interprefy Aivia, the world’s first advanced automated speech translation service for online and live events, at Event Tech Live in Las Vegas.

Able to pick up and translate speech in real-time, the artificial intelligence (AI) based real-time translation solution is the first of its kind to be made available to the public, bringing accurate, AI-translated audio and captions to global audiences at the touch of a button.

The solution will be available in 24 languages and regional accents initially, with many more to come in the near future, and is immediately available for in-person audiences as well as for major platforms such as Microsoft Teams, Zoom, and ON24.


Interprefy’s approach to achieving real-time audio translation excellence is unique: “We have spent the last few years developing an AI-benchmarking platform that assesses the best engine for each language combination. By bringing the best AI technology together with our years of experience, thousands of events supported, and our technical expertise, we believe we have produced the most accurate and flexible AI speech translation tool available on the market today,” Oddmund Braaten, CEO at Interprefy, said.

Aivia completes Interprefy’s world-renowned end-to-end remote simultaneous and sign language interpretation offering by providing a cost-effective AI-powered real-time translation solution for scenarios like webinars, training, town halls, or conferences.

According to Braaten, AI is no substitute for professional interpreters in certain scenarios: “Especially in diplomatic conversations, or regulatory sessions, a skilled linguist will continue to outperform AI, as they are able to read the audience and provide nuanced localisation to accommodate sarcasm, humor, or idioms.”

While remote interpretation technology continues to grow the demand for simultaneous interpretation, AI can provide a high-quality solution for market segments where nuance is rare: “Aivia can provide language access where the support from professional interpreters is considered impractical or unaffordable. Simultaneous interpretation is still a premium service today, and AI can help make language access available for a wide range of organisations and events with smaller budgets”, Braaten concludes.


Dialpad Named a Visionary in the 2022 Gartner Magic Quadrant Unified Communications as a Service, Worldwide
https://martechseries.com/sales-marketing/sales-enablement/unified-communications/dialpad-named-a-visionary-in-the-2022-gartner-magic-quadrant-unified-communications-as-a-service-worldwide/
Thu, 01 Dec 2022 09:34:55 +0000

Building on global momentum and widespread customer satisfaction, Dialpad is positioned as a Visionary for the second time, and in the Gartner Magic Quadrant for the fourth consecutive year

Dialpad, the industry-leading Ai-Powered Customer Intelligence Platform, announced it has been positioned as a Visionary by Gartner, Inc. in the 2022 Magic Quadrant for Unified Communications as a Service (UCaaS), Worldwide.

In the past year, Dialpad has seen exponential momentum and continues to see growth globally driven by demand for integrated Unified Communications and Contact Center services that are easy to use, quick to scale and, most critically, Ai powered.



“We’re excited to be recognized as a Visionary in the Gartner Magic Quadrant for UCaaS,” said Craig Walker, CEO of Dialpad. “At a time when businesses and contact centers alike are grappling with hybrid work environments, our end-to-end Ai-powered Customer Intelligence Platform for contact centers, voice, and video, keeps everyone connected no matter where they’re based while providing real-time insights into customer satisfaction and team performance. We believe our position in this year’s Magic Quadrant validates the growing market need for Ai-powered, integrated unified communications and we look forward to continuing our leadership in the industry.”

With the integration of omnichannel and digital self-service, Dialpad’s Ai-powered Customer Intelligence Platform provides an all-in-one solution built in the cloud that is simple for users to set up and deploy. With integrated real-time speech recognition and natural language processing, Dialpad’s platform not only powers voice, video, and messaging on any device, but also seamlessly captures and analyzes conversations to automate workflows and deliver predictive insights.


Global CX Solutions Provider Startek Deploys Verint Speech Analytics to Enable World-Class Experiences for Customers
https://martechseries.com/mts-insights/global-cx-solutions-provider-startek-deploys-verint-speech-analytics-to-enable-world-class-experiences-for-customers/
Tue, 29 Nov 2022 14:08:55 +0000

Verint’s AI-Powered Speech Analytics Solution Helps Startek Derive Insights from 500 Million Customer Interactions Annually for Improved Customer Relationships

Verint, The Customer Engagement Company, announced that Startek is leveraging Verint Speech Analytics to derive insights from the company’s more than 500 million annual customer interactions.

Startek is a global provider of tech-enabled customer experience management solutions, digital transformation, and technology services to leading brands. More than 43,000 Startek agents serve a broad range of customers across a variety of languages, industries, and regions.

Deploying Verint Speech Analytics, which is part of Verint’s Interaction Analytics solution powered by Da Vinci™ AI and Analytics, has enabled Startek to move away from call sampling to a more holistic approach comprised of identifying call drivers and root causes through advanced speech analytics, transcribing more than 80 percent of recorded calls to analyze emerging trends, identify areas of opportunity, and pinpoint customer concerns.


As a result, Startek has improved first contact resolution by four percent and kept quality and customer satisfaction levels consistently above target. Use of artificial intelligence (AI) and analytics has also eliminated the potential for inconsistency or bias in evaluation.

Advanced dashboards give Startek and its customers heightened visibility and insights into category trends and term trends along with their respective call volumes. Using these dashboards, Startek can compare performance metrics from an organizational level and drill down to an agent level. Additionally, agents can better recognize their roles and performance.


“World-class customer experience relies on the right combination of people, data, and technology,” says Abhinandan Jain, chief digital officer at Startek. “Speech analytics supports agents, enabling them to focus more attention on the customer and deliver a more empathetic experience.”

“Startek leverages Verint Speech Analytics to provide a deep understanding of the drivers behind customer calls and derive actionable insights that enable agents to deliver the best experience,” Jain concludes. “In this way, the solution helps us to keep pace with changing customer needs and ensure that every call is a coaching moment that enables our agents to continue to learn and grow.”

“We are thrilled to partner with Startek and help them provide exceptional service to their customers. Verint Interaction Analytics powered by market-leading speech and text analytics provides mission-critical insights on voice and digital customer interactions, helping top brands around the globe deliver a better customer experience while improving the efficiency of their customer engagement operations,” says Verint’s Ady Meretz, president, Asia Pacific.


Deepgram Raises $47M to Define the Future of AI Speech Understanding
https://martechseries.com/predictive-ai/ai-platforms-machine-learning/deepgram-raises-47m-to-define-the-future-of-ai-speech-understanding/
Tue, 29 Nov 2022 12:43:32 +0000

Deepgram announced that it has raised $47 million in equity funding led by Madrona, with Alkeon participating as a new investor. Prior investors also participated in the deal. This transaction completes Deepgram’s Series B round in which the company raised a total of $72 million, the largest known Series B round in the automatic speech recognition category.

Founded in 2015 by CEO Scott Stephenson—a particle physicist turned deep learning entrepreneur—Deepgram is the leader in the field of AI speech recognition. “We’re building the foundational AI models for understanding human speech,” Stephenson said. “Understanding speech at scale starts with accurate transcriptions, but it doesn’t stop there. At Deepgram, we view accurate transcription as an increasingly solved problem for many of the more than 100 languages we work with. That’s the groundwork upon which we’re building the future of speech understanding: to give our customers insight into not just what was said, but how it was said, which can result in an actionable understanding of why it was said.”

Deepgram has ambitious plans for the future of AI voice understanding. Whereas Natural Language Processing has traditionally taken a text-based approach, Deepgram leans into its audio-first foundation to incorporate nuances of intonation and inflection into its sentiment analysis, speaker labeling, summarization and more. The company recently expanded its feature set to include topic detection, language detection, and translation to deliver conversational context for enterprise analytics and expand internationally.


“Audio is among the largest and most data-rich pools of value that companies have left mostly untouched because the tools to analyze and understand that data simply weren’t there, weren’t fast enough, or were prohibitively expensive to implement,” said Madrona managing director Karan Mehandru. “Deepgram enables any developer at any company to unlock the value of voice by delivering new speech intelligence capabilities on top of already world-class transcriptions, all through a straightforward developer experience and at an affordable price. We’re excited by Deepgram’s traction over the past 12 months and are thrilled to back the team and technology behind the voice revolution.”

“Having Madrona onboard validates our developer-first approach to fulfilling huge pent-up demand for speech transcription and understanding,” Stephenson added. “Deepgram democratizes access to the world’s best speech AI with just an API call.”

Since launch, Deepgram has transcribed over 1 trillion words from over 10,000 years of raw audio data. The company counts NASA, Spotify, and Citi among its flagship customers. Deepgram’s speech models are the most accurate automatic speech recognition solution across many languages, and the company offers additional, custom model training as an option to narrow the gap between human and AI transcription. The company’s new language understanding capabilities bring unparalleled context and dimensionality to the future of speech AI technology.


The First-Ever Remarkable Tech Summit Provides Greater Inclusivity and Equity with GoVoBo and AppTek’s AI-Enabled Live Captioning https://martechseries.com/predictive-ai/ai-platforms-machine-learning/the-first-ever-remarkable-tech-summit-provides-greater-inclusivity-and-equity-with-govobo-and-appteks-ai-enabled-live-captioning/ Wed, 16 Nov 2022 15:05:45 +0000 https://martechseries.com/?p=311718

The GoVoBo live automatic captioning platform was used to power captions and translations at the Remarkable Technology Summit, made possible by Cerebral Palsy Alliance and Cerebral Palsy Alliance Research Foundation. The summit was designed to connect, celebrate, and grow the emerging Disability Tech ecosystem. GoVoBo utilizes AppTek’s advanced AI-enabled automatic speech recognition and machine translation technologies to support inclusive and accessible experiences for deaf and hard of hearing audiences.

GoVoBo, an inclusive technologies company for deaf and hard of hearing users, and AppTek, a global leader in artificial intelligence (AI) for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing and understanding (NLP/U), and text-to-speech (TTS) technologies, are proud to announce their support of Cerebral Palsy Alliance Research Foundation (CPARF) and its Remarkable Technology Summit, recently held in San Francisco, California.

The GoVoBo/AppTek team collaborated with CPARF to provide accurate, real-time captioning throughout the two days of presentations and panels in response to a request to support the event’s deaf and hard of hearing attendees. “Our team was proud to respond immediately to provide our service to help make the event a success and a great experience for all attendees equally,” said Mike Veronis, Co-founder of GoVoBo. “We are thrilled to provide support to CPARF and any future events like this that promote innovations in accessibility and inclusivity.”


According to CPARF, the first-ever Remarkable Technology Summit event was held to promote product innovations for accessibility by connecting innovative corporate investors with startups. These conversations had many goals, including encouraging companies to incorporate accessible technologies inside their next generation of products.

The event brought together key members of the technology and accessibility sectors to discuss critical topics in product development, such as inclusive design for people with disabilities. One of the invited speakers was Varun Chandak, Founder and President of Access to Success and a noted expert on accessibility innovation and product design methods. Varun, who is hard of hearing, commended how well the GoVoBo service performed without any customization or training and said, “The service performed beautifully and improved my experience at the event, as well as the experience of other participants I met.”


The summit also resonated with attendees because of the conversations it sparked. “The most impressive aspect of the summit is that we convened such an incredible community with diverse experience,” said CPARF Executive Director Michael Pearlmutter. “Embracing their strengths, everyone came together to thoughtfully advance disability technology and start the conversation.”

The GoVoBo service gives users the freedom to use any online meeting service without the need to configure caption services for each meeting individually. GoVoBo automatically transcribes spoken content in real time, using the latest in AI-enabled speech recognition technology from AppTek embedded in a customizable, easy-to-use application. At the end of each session, users are presented with an editable and exportable transcript, so they can focus on the meeting attendees and interaction, rather than looking away to take notes. The live translation feature provided by AppTek’s neural machine translation technology allows users to instantly translate their conversation and the conversations of foreign speakers, enabling all users to communicate with others around the world.

Interactions Continues to Reshape the Customer Experience Marketplace in Collaboration with NVIDIA Riva https://martechseries.com/predictive-ai/ai-platforms-machine-learning/interactions-continues-to-reshape-the-customer-experience-marketplace-in-collaboration-with-nvidia-riva/ Fri, 23 Sep 2022 07:09:29 +0000 https://martechseries.com/?p=303715

The collaboration combines two best-in-class conversational AI solutions to deliver unprecedented accuracy, productivity, and customer satisfaction in the contact center

Interactions, the world leader in conversational artificial intelligence (AI), announced a go-to-market collaboration with NVIDIA Riva to reshape the customer experience in the contact center and beyond.

The combination of the Interactions Intelligent Virtual Assistant (IVA) with the NVIDIA Riva technology and its expanded language libraries delivers, in certain cases, the most human-like customer experience solution on the market.

This integration means end customers can get the great service they need, instantly and painlessly. For new businesses working with Interactions, this technology option means satisfied, loyal customers, improved operational efficiency, and an increased bottom line.

“By integrating Riva with its expanded language libraries into Interactions, we’re empowering the world’s most customer-centric brands to reshape their customer-support ecosystems using a combined best-in-class AI and human experience,” said Michael Iacobucci, CEO of Interactions. “Interactions uplevels contact center performance in a fundamental way by reducing wait and handle times, significantly increasing self-service efficiency, and ultimately delivering greater customer satisfaction.”

“Riva enables customization at every stage of speech AI pipelines, empowering customers to build unique solutions that can easily meet the requirements of even the world’s highest performing enterprises,” said Hemant Dhulla, GM of Conversational AI at NVIDIA. “By choosing NVIDIA Riva, Interactions is creating solutions that provide the real-time customer experience that today’s consumers demand.”


Unique Best-in-Class Solution for Businesses
Interactions has a nearly 20-year track record of delivering exceptional customer experiences for the world’s most well-known brands. The company annually facilitates more than 1 billion real-time conversations through its patented combination of AI and human understanding.

NVIDIA Riva provides best-in-class speech AI through its accelerated software development kit (SDK) running on NVIDIA GPUs. Speech AI applications built with Riva provide better services and improved customer experiences and engagement.

Integrating Riva into the Interactions IVA architecture delivers highly customizable automatic speech recognition (ASR) and text-to-speech (TTS) technology that can be tailored to each customer’s problems. The unique value Interactions brings is its patented Human Assisted Understanding (HAU) model, which provides real-time understanding to keep conversations moving forward without disruption. Interactions has also developed a capability for redacting personal or payment information to maintain the highest level of privacy. Customers have the flexibility to deploy the solution wherever their data is: in the cloud, on-prem, or hybrid. Interactions works with NVIDIA to provide businesses with the most accurate and human-like solution.

