The team behind VoiceGain has more than 12 years experience of using Automated Speech Recognition in real wold - developing and hosting complete IVR systems for large enterprises.
We started of as Resolvity, Inc., back in 2005. We built our own IVR Dialog platform, utilizing AI to guide the dialog and to improve the recognition results from commercial ASR engines.
Resolvity Dialog Platform
The Resolvity Dialog platform, had some advanced AI modules. For example:
- It had ontology that could be used to model Dialog Domain . This ontology then could be used to automatically drive the dialog. It would automatically generate follow up questions based on the information that was already acquired. We used this often in IVR applications that required recognition of product names.
- It had an Incremental Case-Based Reasoning (CBR) troubleshooting engine which together with Ontology could be used to diagnose technical problems based on presented symptoms.
- It had a module to correct systematic errors of the ASR engine to improve the accuracy (we received a US Patent for this)
- It had an NLCR module that could automatically handle "How may I help you?" type of interactions. It used a combination on Ontology, Bayesian and Neural Network classifiers.
Hosted IVR
Starting from 2007 we were building complete IVR applications for Customer Support and hosting them on our servers in data centers. We build a Customer Solutions team that interacted with our customers ensuring that the IVR applications were always up to date and an Operations team that ensured that we ran the IVRs 24/7 with very high SLAs.
Resolvity Dialog Platform had a set of tools available that allowed us to analyze speech recognition accuracy in high detail and also allowed us to tune various ASR parameters (thresholds, grammars).
Moreover, because that platform was ASR-engine agnostic, we were able to see how a number of ASR engines from various brands performed in real life.
VoiceGain 1.0 Cloud PBX
In 2012-2013 Resolvity built a complete low-cost Cloud PBX platform on top of Open Source projects. We launched it for the India market under the brand name VoiceGain. The platform was providing complete end-to-end PBX+IVR functionality.
The version that we used in prod supported only DTMF, but we also had a functional ASR version. However, at that time it was built using conventional ASR technologies (GMM+HMM) and we found that training it for new languages presented quite a bit of challenges.
VoiceGain was growing quite fast. We had presence in data centers in Bangalore and Mumbai. We were able to provision both landline and mobile numbers for our PBX+IVR customers. Eventually, although our technology was performing quite well, we found it expensive to run a very hands-on business in India from the USA and sold our India operations.
Augmented Recognition
When the combination of hardware and AI developments made Deep Neural Networks possible, we decided to start working on our own DNN Speech Recognizer, initially with the goal to augment the results from the ASR engines that we used in our IVRs. Very quickly we noticed that with our new customized ASR used for IVR tasks we could achieve results better than with the commercial ASRs. We were able to confirm this by running comparison tests across data sets containing thousands of examples. The key to higher accuracy was ability to customize the ASR Acoustic Models to the specific IVR domain and user population.
Own ASR Platform
Great results with augmented recognition lead us to launch a full scale effort to build a complete ASR platform, again under Voicegain (.ai) brand name, that would allow for easy model customization and be easy to use in IVR applications.
From our IVR experience we knew that large enterprise IVR users are (a) very price sensitive plus (b) require tight security compliance, that is why from day 1 we also worked on making the Voicegain platform deployable on the Edge.