Build Generative Voice AI apps with our ASR/Speech-to-Text and LLM-powered NLU APIs. Record & Transcribe meetings, contact center calls, videos, etc. Get LLM-powered Summary, Sentiment and more. Build Conversational Voice Assistants that integrate with your Contact Center platform. Get started with our developer-first platform today.
Voicegain’s deep learning ASR offers an unbeatable combination of accuracy, price and flexibility. Voicegain ASR can be deployed on-premise, in your VPC or invoked as a cloud service. We integrate out-of-the-box with leading contact center, video meeting and bot platforms.
Voicegain’s out-of-the-box accuracy – for both batch and streaming speech recognition – is on par with the very best. And you can achieve accuracy in the high 90s when you train the models with your own data.
Voicegain is priced 50%-75% lower than the large Cloud Speech-to-Text players. Our Edge pricing is also very affordable compared to competing options.
Access Voicegain on our multi-tenant Cloud. Or deploy it in your Datacenter or VPC. Use your existing audio infrastructure and integrate with a protocol of your choice.
Our ASR is built on the most recent advances in deep learning. We utilize end-to-end transformer-based deep neural networks trained on several tens of thousands of hours of diverse audio data.
APIs to embed transcription into your app and build voice bots accessible over telephony. Deploy Voicegain on your infrastructure (VPC, Datacenter) or use our cloud service.
Get your own AI Meeting Assistant to automate note taking. Always know who said what, when, and where! Integrates with video meeting platforms like Zoom, Microsoft Teams and Google Meet. Edge (On-Prem or VPC) options available.
Automate Quality Assurance and extract CX insights from voice interactions in your contact center. White-label or source-code license of the UI available.
Voicegain, the leading Edge Voice AI platform for enterprises and Voice SaaS companies, is thrilled to announce the successful completion of a System and Organization Controls (SOC) 2 Type 1 audit performed by Sensiba LLP.
Nuance just announced that the Nuance Recognizer, its MRCP grammar-based ASR, will reach end of life (EOL) in June 2026. This decision affects a significant number of on-premise speech-enabled IVR systems that rely on the Nuance Recognizer, creating uncertainty for many businesses.
If you are impacted by this decision, this post outlines an immediate fix that also prepares your company for an AI-driven future.
The decision appears to be driven by two primary factors:
Nuance provides two upgrade options, but neither is fully compatible with existing IVRs:
The EOL announcement introduces two major hurdles for businesses:
If your business relies on Nuance’s MRCP-based ASR (as of November 2024), now is the time to plan for a replacement. Below, we outline a solution that allows you to continue using your existing IVR without major disruptions.
Voicegain offers a seamless alternative to Nuance's grammar-based MRCP ASR. Our platform:
This allows you to maintain your current IVR workflow until you're ready to upgrade on your terms.
Over the next few years, many businesses will transition to generative AI-powered phone agents to improve caller experiences and increase automation rates. While this is a promising future, businesses shouldn’t feel forced to move to the cloud just to access these capabilities.
Voicegain’s deep-learning-based large-vocabulary STT engine is designed to evolve with your needs:
To discuss your upgrade options, email us at sales@voicegain.ai. If you'd like to test our solution, sign up for a free developer account (no credit card required) and get 1,500 free hours of usage. Visit the link in the instructions, and once signed up, contact support@voicegain.ai to request MRCP access.
Start future-proofing your IVR system today with Voicegain.
LLMs like ChatGPT and Bard are taking the world by storm! An LLM like ChatGPT is remarkably good at both understanding language and acquiring knowledge from the content it is given. The result is almost eerie: once these LLMs acquire knowledge, they can answer questions very accurately that in the past seemed to require human judgement.
One big use-case for LLMs is the analysis of business meetings - both internal (between employees) and external (e.g., conversations with customers, vendors, etc.).
In the past few years, companies have primarily used multi-tenant Revenue/Sales Intelligence and Meeting AI SaaS offerings to transcribe business conversations and extract insights. With such multi-tenant offerings, transcription and natural language processing take place in the vendor's cloud. Once the transcript is generated, NLU models offered by the Meeting AI vendor are used to extract insights. For example, Revenue Intelligence products like Gong extract questions and sales blockers from sales conversations, while most Meeting AI assistants extract summaries and action items.
Essentially, these NLU models - many of which predate LLMs - could summarize and extract topics, keywords and phrases. Enterprises did not mind storing their transcripts on the vendor's cloud infrastructure because what this NLU could do seemed fairly harmless.
However, LLMs take this to a whole different level. Our team used the OpenAI Embeddings API to generate embeddings of our daily meeting transcripts collected over a one-month period. We stored these embeddings in an open-source vector database (our knowledge base). During testing, for each user question, we generated an embedding of the question and queried the vector database (i.e., the knowledge base) for related/similar embeddings.
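The post does not name the specific embedding model or vector database we used, so here is a minimal sketch of the indexing step under stated assumptions: the OpenAI Python SDK (>= 1.0) with the text-embedding-ada-002 model, and Chroma as a stand-in open-source vector store. The file layout and collection name are illustrative only.

```python
# Indexing sketch: embed daily meeting transcripts and store them in a vector DB.
# Assumptions: openai>=1.0, chromadb installed, OPENAI_API_KEY set in the environment.
from pathlib import Path

import chromadb
from openai import OpenAI

openai_client = OpenAI()
chroma_client = chromadb.PersistentClient(path="./meeting_kb")
collection = chroma_client.get_or_create_collection("meeting_transcripts")

def embed(text: str) -> list[float]:
    """Generate an embedding vector for a chunk of transcript text."""
    resp = openai_client.embeddings.create(model="text-embedding-ada-002", input=text)
    return resp.data[0].embedding

# Index each daily transcript (assumed here to be one plain-text file per meeting).
for path in sorted(Path("transcripts").glob("*.txt")):
    text = path.read_text()
    collection.add(ids=[path.stem], embeddings=[embed(text)], documents=[text])
```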
Then we provided these related documents as context and the user question as a prompt to the GPT-3.5 API so that it could generate the answer. We got really good results.
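Continuing the indexing sketch above, the question-answering step looks roughly like the following. The prompt wording, number of retrieved documents, and model name are assumptions, not details from our actual setup.

```python
# Question-answering sketch: retrieve similar transcripts and ask GPT-3.5.
question = "What is the progress on <Key Initiative>?"

# Embed the question and retrieve the most similar transcript chunks.
hits = collection.query(query_embeddings=[embed(question)], n_results=3)
context = "\n\n".join(hits["documents"][0])

# Provide the retrieved transcripts as context and the user question as the prompt.
resp = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Answer the question using only the meeting transcripts below. "
                    "If the transcripts do not contain the answer, say so.\n\n" + context},
        {"role": "user", "content": question},
    ],
)
print(resp.choices[0].message.content)
```

Instructing the model to answer only from the supplied transcripts is what lets it decline or admit ignorance, which matches the behavior we observed on questions 4 and 9 below.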
We were able to get answers to the following questions:
1. Provide a summary of the contract with <Largest Customer Name>.
2. What is the progress on <Key Initiative>?
3. Did the Company hire new employees?
4. Did the Company discuss any trade secrets?
5. What is the team's opinion on Mongodb Atlas vs Google Firestore?
6. What new products is the Company planning to develop?
7. Which Cloud provider is the Company using?
8. What is the progress on a key initiative?
9. Are employees happy working in the company?
10. Is the team fighting fires?
ChatGPT's responses to the above questions were amazingly and eerily accurate. For Question 4, it indicated that it did not want to answer the question. And when it did not have adequate information (e.g., Question 9), it said so in its response.
At Voicegain, we have always been big proponents of keeping Voice AI on the Edge. We have written about it in the past.
Meeting transcripts in any business are a veritable gold mine of information. With the power of LLMs, they can now be queried very easily to provide amazing insights. But if these transcripts are stored in another vendor's cloud, they have the potential to expose very proprietary and confidential information to third parties.
Hence it is extremely critical for businesses that such transcripts are stored only on private infrastructure (behind the firewall). It is really important for Enterprise IT to make sure this happens in order to safeguard proprietary and confidential information.
If you are looking for such a solution, we can help. At Voicegain, we offer Voicegain Transcribe, an enterprise-ready Meeting AI solution. The entire solution can be deployed either in a datacenter (on bare metal) or in a private cloud. You can read more about it here.
Interested in customizing the ASR or deploying Voicegain on your infrastructure?