How To Build A Google Home Voice Assistant?
At Opentrends we do not just write code or design for users; we are also prepared to talk to them. In fact, voice user interfaces (VUIs) have revolutionized how audiences interact with their devices. But how do you build a voice assistant?
At Opentrends we have created a concept for Google Home that books meeting rooms in a simple way. Below, we walk you through the real workflow we used to conceptualize and build a voice user interface with this technology.
Before creating a Google Assistant application, it is essential to meet some technical requirements:
- A Google account to access all the services and tools.
- A Google Home, a phone with Google Assistant, or an emulator to test the app (testing is much smoother on an actual Google Home device).
- A server with Node.js that will host part of the business logic.
With these elements in place, we started building the voice assistant for Google Home. The steps we took at Opentrends were the following:
- Bot voice and tone definition
- Conversational tree design
- Setup the environment
- Build with Dialogflow
- Build the server to handle the business logic (Node.js)
- Co-creation to choose the goal of the voice assistant: with the premise of making the office more intelligent, five Opentrends stakeholders participated in an exercise to find the best solution around this concept. In the end, we identified the need to improve the meeting-room booking process.
- Definition of the bot's voice and tone: first, we defined the bot's voice and tone. Through a quick market analysis, we created three possible personalities, each with specific speech traits: keywords and filler words, intonation and rhythm. In this way, we could humanize the bot and, at the same time, keep it consistent for future iterations.
- Conversational tree design: what questions are essential? What conversation flow is most suitable for the usability of the service? Where could it get stuck when it comes to giving adequate answers? The conversational tree anticipates all the points of contact between the user and the bot, as well as the answers to ill-formulated questions or even insults. In this way, we minimized possible errors during the use of the voice assistant.
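A conversational tree like the one above can be represented as a plain data structure before it ever reaches Dialogflow. The sketch below shows the idea with a minimal tree; the node names, prompts and branches are illustrative examples, not the actual Opentrends flow:

```javascript
// A minimal conversational tree: each node has a prompt and named branches.
// Unrecognized input falls through to a "fallback" node, which is how the
// tree absorbs ill-formulated questions instead of breaking the dialog.
const tree = {
  welcome: {
    prompt: "Hi! Do you want to book a room or check availability?",
    next: { book: "askRoom", check: "listRooms" },
  },
  askRoom: {
    prompt: "Which room would you like, and at what time?",
    next: { confirmed: "confirm" },
  },
  listRooms: { prompt: "These rooms are free right now.", next: {} },
  confirm: { prompt: "Done! Your room is booked.", next: {} },
  fallback: { prompt: "Sorry, I didn't get that. Could you rephrase?", next: {} },
};

// Follow one branch of the tree given the user's recognized choice.
function nextNode(current, choice) {
  const node = tree[current];
  return (node && node.next[choice]) || "fallback";
}
```

Walking the tree on paper like this is what lets you spot dead ends and missing fallback answers before implementation.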
If you want to know how we define the personality and voice tone of a bot or how we build conversational trees, in this article you will find all the details.
In the process of analyzing the conversation flow between the user and Google Home to book a meeting room, we decided to create two actions: users could book a room directly or ask which meeting rooms are available. The flow begins when the user wakes up the application: "Ok Google, talk to booking rooms". Dialogflow identifies this as the welcome intent and asks the server for the corresponding response. For its part, Google Assistant is the component that detects the voice and transcribes the spoken message to text, and vice versa.
When we design chatbots or VUIs, we talk about "intents" and "entities". The "intent" is the user's intention. Identifying the "intent" means finding out what the user wants when interacting with a bot. An "entity" acts as a variable that modifies an "intent".
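In Dialogflow's v2 webhook format, the matched intent and its entity values arrive under `queryResult`. The sketch below shows what that looks like for a room booking; the intent name (`book.room`) and entity names (`room`, `time`) are examples of our own, not fixed Dialogflow names:

```javascript
// A Dialogflow v2 webhook request carries the matched intent under
// queryResult.intent and the extracted entity values under
// queryResult.parameters (intent and entity names here are examples).
const webhookRequest = {
  queryResult: {
    queryText: "Book the blue room at 3pm",
    intent: { displayName: "book.room" },
    parameters: { room: "blue", time: "15:00" }, // entities filled by Dialogflow
  },
};

// Pull out what the business logic needs: the intent and its entities.
function parseRequest(body) {
  const { intent, parameters } = body.queryResult;
  return { intent: intent.displayName, entities: parameters };
}
```

So for "Book the blue room at 3pm", the intent is "book a room" and the entities are the room and the time that modify it.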
We use Dialogflow to create the application that receives the message and works out the response for the user. Dialogflow communicates with a Node.js server that makes the application smarter: the server returns the correct message depending on the time, the previous messages and the availability of the meeting rooms.
The Dialogflow process is:
- Dialogflow receives the text and figures out which agent it will send it to.
- Dialogflow's agent identifies the intent of the user and passes the text to the right intent.
- Dialogflow's intent uses entities to store parameter values.
- Dialogflow's intent passes the request along with the entities to fulfillment.
- Fulfillment uses a webhook to call the server.
We created a server with Node.js that holds the business logic. The server receives the user's message, some keywords and the action (book a room or request information). With that information and the context of the conversation, it connects to the data store and extracts the relevant data.
The final part of the project was to test and train the AI within Dialogflow. For this, we asked different Opentrends colleagues to collaborate and added them to the test program. Our colleagues spent some time talking with Google Home (device, phone or test environment). For our part, we worked in Dialogflow, which has a training section where we could review the conversation history. It was very valuable to see how people actually talk to the interface, because they expressed things we could not have imagined when we defined the flow! This allowed us to enrich the application with these new ways of asking about or booking a room.
If you liked this article and you are interested in learning more about our omnichannel service offering, check this link.