Voice Assistant for Veterinarians Startup Talkatoo Closes Oversubscribed Funding Round

Talkatoo

Dictation for veterinarians startup Talkatoo has closed an oversubscribed funding round for an undisclosed sum led by Klick Ventures. The company provides dictation software for animal health professionals of all stripes, offering a new facet of the larger trend for using voice AI in healthcare.

Read More

Introduction to Voice User Interfaces (Part - 2)

Voice user interface

One of the most popular application areas for voice systems today is conversational AI. Graph-based interaction mainly focuses on asking pointed questions in a prescribed order and only accepting specific terms as responses. We’ve seen this plenty of times before, when we can’t move forward in the system until we provide our user ID, or we can’t specify our destination until we’ve provided the location we’ll be starting from.
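As a rough illustration of graph-based interaction, the sketch below walks a tiny dialog graph in a fixed order and only advances when the reply is one of the accepted terms. The node names, prompts, and city list are hypothetical and chosen only for this example; they are not taken from the article.

```python
# Hypothetical dialog graph: node names, prompts and accepted terms are illustrative.
DIALOG_GRAPH = {
    "origin": {
        "prompt": "Where are you starting from?",
        "accepted": {"london", "paris", "berlin"},
        "next": "destination",
    },
    "destination": {
        "prompt": "Where do you want to go?",
        "accepted": {"london", "paris", "berlin"},
        "next": None,
    },
}


def run_dialog(replies, start="origin"):
    """Walk the graph in order; a reply advances only if it is an accepted term."""
    node = start
    slots = {}
    transcript = []
    for reply in replies:
        if node is None:
            break  # every question has been answered
        step = DIALOG_GRAPH[node]
        transcript.append(step["prompt"])
        answer = reply.strip().lower()
        if answer in step["accepted"]:
            slots[node] = answer
            node = step["next"]
        # Otherwise stay on the same node, so the same question is asked again.
    finished = node is None
    return slots, finished, transcript
```

Calling `run_dialog(["Paris", "somewhere warm", "Berlin"])` fills the origin, repeats the destination question once because "somewhere warm" is not an accepted term, and then completes.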

Read More

Introduction to Voice User Interfaces (Part - 1)

Voice user interface

An overview of VUI systems and an introduction to some current VUI applications.

By Prateek Sawhney

Hello and welcome to this Medium article on voice user interfaces (VUIs). A VUI is a speech platform that enables humans to communicate with machines by voice. VUIs used to be the stuff of science fiction. Movies and TV shows featuring spaceship crews that communicated verbally with their computers seemed fantastic. But that fantastic future is here now. Voice-enabled agents are becoming commonplace on our phones, computers, and cars, to the point that many people may no longer think of these systems as artificial intelligence at all. Under the hood, though, there is a lot going on. Audio sound waves from a voice must be converted into text using machine learning algorithms and probabilistic models.

The resulting text must be reasoned over using AI logic to determine the meaning and formulate a response. Finally, the response text must be converted back into understandable speech, again with machine learning tools.

These three parts constitute a general pipeline for building an end-to-end voice-enabled application. Each part employs some aspect of AI. And that’s why we’re here.

In this article we’ll go through an overview of a VUI system and talk about some current VUI applications. We’ll focus on conversational AI applications, where we’ll learn some VUI best practices and why we need to think differently about user design for voice as compared to other interface mediums. Finally, we will put these ideas into practice by building our own conversational AI application.

VUI Overview

Let’s take a closer look at the basic VUI pipeline we described earlier. To recap, three general pieces were identified: voice to text, text input reasoned to text output, and finally, text to speech.

Speech Recognition

It starts with voice to text. This is speech recognition.
Speech recognition is historically hard for machines but easy for people, and is an important goal of AI. As a person speaks into a microphone, sound vibrations are converted to an audio signal. This signal can be sampled at some rate and those samples converted into vectors of component frequencies. These vectors represent features of sound in a data set, so this step can be thought of as feature extraction.

The next step in speech recognition is to decode or recognize the series of vectors as a word or sentence. In order to do that, we need probabilistic models that work well with time series data for the sound patterns. This is the acoustic model.

Decoding the vectors with an acoustic model will give us a best guess as to what the words are. This might not be enough, though, because some sequences of words are much more likely than others. For example, depending on how the phrase “hello world” was said, the acoustic model might not be sure if the words are “hello world” or “how a word” or something else.

Now you and I know that it was most likely the first choice, “hello world”. But why do we know? We know because we have a language model in our heads, trained from years of experience, and that is something we need to add to our decoder. An accent model may be needed for the same reason. If these models are well trained on lots of representative examples, we have a higher probability of producing the correct text. That’s a lot of models to train. Acoustic, language and accent models are all needed for a robust system, and we haven’t even gone through the whole VUI pipeline yet.

Reasoning Logic

Back to the pipeline: once we have our speech in the form of text, it’s time to do the thinking part of our voice application, the reasoning logic.

If I ask you, a human, a question like “How’s the weather?”, you may respond in many ways, like “I don’t know”, “It’s cold outside”, “The thermometer says 90 degrees”, etc.
In order to come up with a response, you first had to understand what I was asking for, then process the request and formulate a response. This was easy because you’re human. It’s hard for a computer to understand what we want and what we mean when we speak. The field of natural language processing (NLP) is devoted to this quest. To fully implement NLP, large datasets of language must be processed, and there are a great many challenges to overcome. But let’s look at a smaller problem, like getting just a weather report from a VUI device.

Let’s imagine an application that has weather information available in response to some text request. Rather than parsing all the words, we could take a shortcut and just map the most probable request phrases for the weather to a “get weather” process. In that case, the application would in fact understand requests most of the time. This won’t work if the request hasn’t been premapped as a possible choice, but it can be quite effective for limited applications and can be improved over time.

TTS (Text To Speech)

Once we have a text response, the remaining task in our VUI pipeline is to convert that text to speech. This is speech synthesis, or text to speech (TTS). Here again, examples of how words are spoken can be used to train a model to provide the most probable pronunciation components of spoken words. The complexity of the task can vary greatly when we move from, say, a monotone robotic voice to a rich human-sounding voice that includes inflection and warmth. Some of the most realistic-sounding machine voices to date have been produced using deep learning techniques.

VUI Applications

VUI applications are becoming more and more commonplace. There are a few reasons driving this. First of all, voice is natural for humans. It’s effortless for us to converse by voice compared to reading and typing. And secondly, it turns out it’s also fast.
Speaking into a text transcriber is three times faster than typing. In addition, there are times when it is just too distracting to look at a visual interface, like when you’re walking or driving. With the advent of better and more accessible speech recognition and speech synthesis technologies, a number of applications have flourished. For example, voice interfaces can be found in cars: drivers can initiate and answer phone calls, give and receive navigation commands, and even receive texts and e-mail without ever taking their eyes off the road. Other applications in web and mobile have been around for a few years now but are getting better and better. Dictation applications leverage speech recognition technologies to make putting thoughts into words a snap. Translation applications leverage speech recognition and speech synthesis, as well as some reasoning logic in between, to convert speech in one language to speech in another. If you’ve tried any of these, you know it’s not quite a universal translator, but it’s pretty amazing to be able to communicate through one of these apps with someone you couldn’t even speak to before.

One of the most exciting innovations in VUI today is conversational AI technology. We can now carry on a conversation with a cloud-based system that incorporates well-tuned speech recognition, some functionality and speech synthesis into one system or device. Examples include Apple’s Siri, Microsoft’s Cortana, Google Home and Amazon’s Alexa on Echo. Conversational AI really captures our imaginations because it seems to be an early step toward the more general AI we’ve seen in science fiction movies.

The home assistant devices in this category are quite flexible. In addition to running a search or giving you the weather, these devices can interface with other devices on the internet linked with your accounts if you want, fetching saved data, and the list goes on. Even better, development with these technologies is accessible to all of us.
We really only need our computer to get started creating our own application in conversational AI. The heavy lifting of speech recognition and speech synthesis has been done for us and turned into cloud-based APIs. The field is new and just waiting for smart developers to imagine and implement the next big thing. There’s a lot of opportunity out there to come up with any voice-enabled application we can think of.

That’s it for Voice User Interfaces. Thanks for reading and following along. I hope you had a good time reading and learning.
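The phrase-mapping shortcut described under Reasoning Logic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: the phrase table, the function names `normalize`, `match_intent`, `get_weather`, and `respond`, and the canned weather reply are all hypothetical.

```python
from typing import Optional

# Hypothetical table of premapped request phrases (illustrative only).
INTENT_PHRASES = {
    "get_weather": [
        "how's the weather",
        "what's the weather",
        "weather report",
        "is it cold outside",
    ],
}


def normalize(text: str) -> str:
    """Lowercase, straighten apostrophes, and drop other punctuation."""
    text = text.lower().replace("’", "'")
    kept = [ch for ch in text if ch.isalnum() or ch in " '"]
    return " ".join("".join(kept).split())


def match_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose premapped phrase occurs in the utterance."""
    cleaned = normalize(utterance)
    for intent, phrases in INTENT_PHRASES.items():
        if any(phrase in cleaned for phrase in phrases):
            return intent
    return None  # the request was not premapped


def get_weather() -> str:
    # Stand-in for a real weather lookup (assumption for this sketch).
    return "It's cold outside."


def respond(utterance: str) -> str:
    """Map recognized text to response text, ready for a text-to-speech step."""
    if match_intent(utterance) == "get_weather":
        return get_weather()
    return "Sorry, I didn't understand that."
```

A request such as "Hey, how’s the weather today?" matches the premapped phrase and triggers the weather reply, while "Play some music" falls through to the fallback, which is the limitation the article notes for requests that have not been premapped.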

Read More

AI-driven voice assistant PolyAI raises $14M round led by Khosla Ventures

PolyAI

“Conversational AI” startup PolyAI, based out of London, has raised $14 million in a funding round led by Silicon Valley’s Khosla Ventures, with participation from existing investors (Point72 Ventures, Amadeus Capital, Sands Capital Ventures, Passion Capital and Entrepreneur First). This follows their $12m Series A, and will provide resources for further US expansion beyond its existing US team. The startup has now raised $28m to date.

Read More

[Podcast] Professor Jan Sedivy on winning the Alexa Prize SocialBot Challenge and 40 years in Voice Tech

Jan Sedivy is a Researcher at the Institute of Cybernetics and Robotics at the Czech Technical University (CTU) and a member of the faculty of electrical engineering. This is also his alma mater where he earned a PhD in 1983. Jan served as the faculty leader for Team Alquist from CTU which was the 2021 winner of the Amazon Alexa Prize SocialBot Grand Challenge.

Read More

Financial Conversational AI startup Kasisto raises $15.5M

Kasisto

Conversational AI developer Kasisto has closed a $15.5 million Series C funding round led by Naples Technology Vendors and NCR Corporation. Kasisto, best known for creating the KAI platform for integrating digital assistants with banking and financial service providers, will use the new funding to keep up with demand in an increasingly valuable and competitive facet of enterprise AI.

Read More

Voiceflow Closes $20M Funding Round to Extend Conversational AI Platform

Voiceflow

Voice app design and prototyping startup Voiceflow has closed a $20 million Series A funding round led by Felicis Ventures. The company’s platform supports thousands of voice apps across many platforms and is working to expand its enterprise clientele with additional features and tools.

Read More

Kitchen voice AI platform Tinychef acquires meal-planning app Zelish

Tiny Chef

Voice-first culinary AI startup Tinychef (formerly Klovechef) has begun acquiring meal-planning app Zelish for an undisclosed sum. The India-based Tinychef will augment its current features with its Canadian acquisition’s own innovations while scooping up Zelish’s customer base and opening the door to accelerating its growth in North America beyond what was previously feasible.

Read More

Samsung debuts Bixby-Enabled Robot Vacuum Cleaner/Pet Sitter

Robot vacuum cleaner

Samsung has launched a new voice-controlled robotic vacuum cleaner that can also serve as a pet sitter using a combination of advanced sensors and AI. The new Bespoke Jet Bot AI+ connects to the Samsung Bixby voice assistant to interact with owners, accept commands, and serve as a kind of mobile smart speaker that will also share news and weather updates upon request.

Read More

Hotel voice assistant developer Angie Hospitality acquired by long-time partner Nomadix

Angie hospitality

Hotel voice tech developer Angie Hospitality has been acquired by enterprise internet network provider Nomadix for an undisclosed amount. The purchase shifts Angie more fully into the Nomadix ecosystem after being closely intertwined since the voice assistant and smart display creator was founded in 2015.

Read More

Yellow Messenger becomes Yellow.ai and launches Voice AI Platform in the US

Yellow AI

Indian enterprise conversational AI startup Yellow Messenger has rebranded as Yellow.ai and debuted new voice assistants to go with its text-based services as part of an expansion to the United States. Yellow’s automated chat services already run on several major messaging services and company websites but can now include vocal conversations as well.

Read More

[Podcast] Joseph Turow author of Voice Catchers on Voice Tech, Marketing and Privacy

Voicebot podcast

Joseph Turow is a professor at the University of Pennsylvania’s Annenberg School for Communication — the same school where he earned his PhD. Turow is the author of over 150 articles and 10 books including the recently published “The Voice Catchers: How Marketers Listen In to Exploit Your Feelings, Your Privacy, and Your Wallet.”

Read More

European Union report describes unfair competition in Voice Assistant and Smart Home markets

EU - European Union

The European Union has published the first version of a report on competition and potential antitrust issues in the voice assistant, Internet of Things (IoT), and smart home device market. The inquiry began last July and gathered opinions from more than 200 companies on the topic. After a few months of revisions, the European Commission will use the final report to draft any potential policies and rules surrounding the industry.

Read More

Conversational AI startup Hyro raises $10.5M

Hyro

Industry-focused conversational AI developer Hyro has closed a $10.5 million Series A funding round led by Spero Ventures. Hyro develops voice assistants and chatbots for a few verticals, with its conversational AI for healthcare seeing particular growth over the last year, growth it will use the new funding to secure and build upon.

Read More

US Air Force successfully tests Commanding AI-Piloted Jet

Air force

The U.S. Air Force has recently hit a new milestone in developing aircraft operated by artificial intelligence and commanded by a human on the ground. The Skyborg system flew a successful 130-minute test flight out of Florida for the first time a couple of weeks ago, setting the stage for more AI-run military equipment taking orders from humans.

Read More

Peloton acquires Voice Tech startup Aiqudo

Aiqudo

Peloton Interactive has acquired industrial voice tech startup Aiqudo, according to a report in Bloomberg and confirmed by a source with knowledge of the transaction who could not speak on the record. Aiqudo, the developer of a white-label voice assistant for consumer and industrial use, was acquired in February for its technology and its team, many of whom are already listed as working for Peloton on LinkedIn.

Read More