Using Amazon Lex, Lambda, & MongoDB Atlas to Build a Voice-Activated Movie Search App - Part 2

Raphael Londner
November 13, 2017 | Updated: June 26, 2020

It's that time of year again! This post is part of our Road to AWS re:Invent 2017 blog series. In the weeks leading up to AWS re:Invent in Las Vegas this November, we'll be posting about a number of topics related to running MongoDB in the public cloud. See all posts here.

Introduction

This is Part 2 of our Road to re:Invent 2017 blog post series. If you haven’t read it yet, take a look at Part 1 for a brief overview of Amazon Lex and instructions to set up our movie database with MongoDB Atlas, our fully managed database service.

As a reminder, this tutorial is divided into 3 parts:

Part 1: Lex overview, demo scenario and data layer setup
Part 2: Set up and test an Amazon Lex bot (this post)
Part 3: Deploy a Lambda function as our Lex bot fulfillment

In this blog post, we will set up our Lex bot in the AWS Console and verify that its basic flow works as expected. We’ll implement the business logic (which leverages MongoDB) in Part 3 of this post series.

Amazon Lex bot setup instructions

In this section, we will go through the whole process of creating our SearchMovies bot while explaining the architectural decisions I made.

After signing in to the AWS Console, select the Lex service (in the Artificial Intelligence section) and press the Create button.

Select the Custom bot option and fill out the form parameters as follows:

Bot name: SearchMoviesBot
Output voice: None
Session timeout: 5
COPPA: No

Press the Create button at the bottom of the form.

A new page appears, where you can create an intent. Press the Create Intent button and in the Add intent pop-up page, click the Create new intent link and enter SearchMovies in the intent name field.

In the Slot types section, add a new slot type with the following properties:

Slot type name: MovieGenre
Description: Genre of the movie (Action, Comedy, Drama…)
Slot Resolution: Restrict to Slot values and Synonyms
Values: All, Action, Adventure, Biography, Comedy, Crime, Drama, Romance, Thriller

You can add synonyms to all these terms (which strictly match the possible values for movie genres in our sample database), but the most important value to configure synonyms for is All. We will use it as a keyword to skip filtering on movie genre when the user cannot name the genre of the movie they’re looking for or wants to retrieve all the movies for a specific cast member. Of course, you can explore the movie database on your own to identify and add other movie genres I haven’t listed above. Once you’re done, press the Save slot type button.
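For readers who prefer to script this step, the same slot type can be described as a request payload. The sketch below only builds a plain dictionary and makes no AWS call. The field names follow the Lex (V1) PutSlotType API as I understand it, with TOP_RESOLUTION standing in for the console's "Restrict to Slot values and Synonyms" option. The synonyms and the helper name are illustrative assumptions, not part of the original setup.

```python
# Genre values exactly as entered in the console form above.
GENRES = ["All", "Action", "Adventure", "Biography", "Comedy",
          "Crime", "Drama", "Romance", "Thriller"]

# Illustrative synonyms (an assumption): pick whatever your users are likely to say.
SYNONYMS = {"All": ["any", "any genre", "whatever"]}


def build_movie_genre_slot_type():
    """Build a PutSlotType-style payload for the MovieGenre slot type (hypothetical helper)."""
    values = []
    for genre in GENRES:
        entry = {"value": genre}
        if genre in SYNONYMS:
            # Only attach synonyms where some were actually defined.
            entry["synonyms"] = SYNONYMS[genre]
        values.append(entry)
    return {
        "name": "MovieGenre",
        "description": "Genre of the movie (Action, Comedy, Drama...)",
        # Resolve user input to a listed value or one of its synonyms.
        "valueSelectionStrategy": "TOP_RESOLUTION",
        "enumerationValues": values,
    }


slot_type = build_movie_genre_slot_type()
print(slot_type["enumerationValues"][0])
```

If you use boto3, a payload like this should be usable as keyword arguments to the `put_slot_type` method of the `lex-models` client, but check the current API reference before relying on it.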

Next, in the Slots section, add the following 3 slots:

genre
1. Type: MovieGenre
2. Prompt: I can help with that. What's the movie genre?
3. Required: Yes
castMember
1. Type: AMAZON.Actor
2. Prompt: Do you know the name of an actor or actress in that movie?
3. Required: Yes
year
1. Type: AMAZON.FOUR_DIGIT_NUMBER
2. Prompt: Do you know the year {castMember}'s movie was released? If not, just type 0
3. Required: Yes

Press the Save Intent button and verify you have the same setup as shown in the screenshot below:

image alt text

The order of the slots is important here: once the user’s first utterance has been detected to match a Lex intent, the Lex bot will (by default) try to collect the slot values from the user in the priority order specified above by using the Prompt texts for each slot. Note that you can use previously collected slot values in subsequent slot prompts, which I demonstrate in the ‘year’ slot. For instance, if the user answered Angelina Jolie to the castMember slot prompt, the year slot prompt will be: ‘Do you know the year Angelina Jolie’s movie was released? If not, just type 0’.
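The priority-order behavior described above can be pictured with a few lines of Python. This is only a simplified model of the default flow, not how Lex is implemented: it walks the slots in the configured order, stops at the first one without a value, and substitutes already-collected values into the prompt. The `next_prompt` function is a hypothetical name used for illustration.

```python
# Slots in the priority order configured in the console, with their prompts.
SLOT_PROMPTS = [
    ("genre", "I can help with that. What's the movie genre?"),
    ("castMember", "Do you know the name of an actor or actress in that movie?"),
    ("year", "Do you know the year {castMember}'s movie was released? If not, just type 0"),
]


def next_prompt(collected):
    """Return (slot_name, prompt) for the first unfilled slot, or None when all are filled."""
    filled = {name: value for name, value in collected.items() if value is not None}
    for name, template in SLOT_PROMPTS:
        if name not in filled:
            # A prompt may reference slots that come earlier in the priority order.
            return name, template.format(**filled)
    return None


# The user named a cast member up front, so the bot asks for the genre first
# and never prompts for castMember.
print(next_prompt({"castMember": "Angelina Jolie"}))
print(next_prompt({"castMember": "Angelina Jolie", "genre": "Comedy"}))
```

Because castMember comes before year in the priority order, the `{castMember}` placeholder always has a value by the time the year prompt is rendered.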

Note that it’s important that all the slots are marked Required. Otherwise, the only opportunity for the user to specify them is to mention them in the original utterance. As you will see below, we will let Lex identify slots right from the start, but what if the user chooses to kick off the process without mentioning any of them? By default, the Lex bot skips slots that aren’t required, so we need to mark them Required to give the user the chance to define them.

But what if the user doesn’t know the answer to those prompts? We’ve handled this case as well by defining "default" values: All for the genre slot and 0 for the year slot. The only mandatory parameter the bot’s user must provide is the cast member’s name; the user can restrict the search further by providing the movie genre and release year.
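To make the role of these "default" values concrete, here is a minimal sketch of how the three slots could translate into a MongoDB query filter. It is not the code from Part 3. The Cast and Genres property names come from our sample database, while the Year property name and the `build_movie_filter` helper are assumptions made for illustration.

```python
def build_movie_filter(cast_member, genre="All", year=0):
    """Turn slot values into a MongoDB filter document (hypothetical helper)."""
    # The cast member is the only mandatory criterion.
    query = {"Cast": cast_member}
    if genre != "All":
        # "All" means: do not filter on genre at all.
        query["Genres"] = genre
    if int(year) != 0:
        # 0 means: the user does not know the release year.
        # "Year" is an assumed property name.
        query["Year"] = int(year)
    return query


print(build_movie_filter("Angelina Jolie"))
print(build_movie_filter("Angelina Jolie", genre="Action", year="2010"))
```

The year is converted with `int()` because slot values may arrive as strings.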

Last, let’s add the following sample utterances that match what we expect the user will type (or say) to launch the bot:

I am looking for a movie
I am looking for a {genre} movie
I am looking for a movie released in {year}
I am looking for a {genre} movie released in {year}
In which movie did {castMember} play
In which movie did {castMember} play in {year}
In which {genre} movie did {castMember} play
In which {genre} movie did {castMember} play in {year}
I would like to find a movie
I would like to find a movie with {castMember}
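A typo in a placeholder name is easy to miss when typing utterances by hand. The short sketch below checks that every `{placeholder}` used in the utterances above refers to one of the three slots we defined. The `unknown_placeholders` helper is a hypothetical name and is not part of Lex.

```python
import re

SLOTS = {"genre", "castMember", "year"}

UTTERANCES = [
    "I am looking for a movie",
    "I am looking for a {genre} movie",
    "I am looking for a movie released in {year}",
    "I am looking for a {genre} movie released in {year}",
    "In which movie did {castMember} play",
    "In which movie did {castMember} play in {year}",
    "In which {genre} movie did {castMember} play",
    "In which {genre} movie did {castMember} play in {year}",
    "I would like to find a movie",
    "I would like to find a movie with {castMember}",
]


def unknown_placeholders(utterances, slots):
    """Map each utterance to the placeholders that match no defined slot."""
    problems = {}
    for utterance in utterances:
        used = set(re.findall(r"\{(\w+)\}", utterance))
        missing = used - slots
        if missing:
            problems[utterance] = missing
    return problems


# An empty result means every placeholder maps to a defined slot.
print(unknown_placeholders(UTTERANCES, SLOTS))
```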

Once the utterances are configured as per the screenshot below, press Save Intent at the bottom of the page and then Build at the top of the page. The process takes a few seconds, as AWS builds the deep learning model Lex will use to power our SearchMovies bot.

image alt text

It’s now time to test the bot we just built!

Testing the bot

Once the build process completes, the test window automatically shows up:

image alt text

Test the bot by typing (or saying) sentences that are close to the sample utterances we previously configured. For instance, you can type ‘Can you help me find a movie with Angelina Jolie?’ and see the bot recognize the sentence as a valid kick-off utterance, along with the {castMember} slot value (in this case, ‘Angelina Jolie’). This can be verified by looking at the Inspect Response panel:

image alt text

At this point, the movie genre hasn’t been specified yet, so Lex prompts for it (since it’s the first required slot). Once you answer that prompt, notice that Lex skips the second slot ({castMember}) since it already has that information.

Conversely, you can test that the ‘Can you help me find a comedy movie with angelina jolie?’ utterance will immediately prompt the user to fill out the {year} slot since both the {castMember} and {genre} values were provided in the original utterance:

image alt text

An important point to note here is that enumeration slot types (such as our MovieGenre type) are not case-sensitive: both “comedy” and “coMeDy” will resolve to “Comedy”. As a result, we will be able to use a regular index on the Genres property of our movies collection (as long as our enumeration values in Lex match the Genres case in our database).

However, the AMAZON.Actor type is case-sensitive: for instance, “angelina jolie” and “Angelina Jolie” are 2 distinct values for Lex. This means that we must define a case-insensitive index on the Cast property (don’t worry, there is already such an index, called ‘Cast_1’, in our sample movie database). Note that in order for queries to use that case-insensitive index, we’ll have to make sure our find() query specifies the same collation as the one used to create the index (locale=’en’ and strength=1). But don’t worry for now: I’ll make sure to point it out again in Part 3 when we review the code of our bot’s business logic (in the Lambda function we’ll deploy).
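As a preview of what "specifying the same collation" looks like, here is a minimal Python sketch. It is not the Part 3 code. It assumes a PyMongo-style collection whose `find()` method accepts a `collation` keyword, and it uses a small stand-in class (an assumption, purely for demonstration) so the example runs without a database.

```python
# Same collation as the one used to create the Cast_1 index.
CAST_COLLATION = {"locale": "en", "strength": 1}


def find_by_cast(collection, cast_member):
    """Query on Cast with the index's collation so 'angelina jolie' matches 'Angelina Jolie'."""
    return collection.find({"Cast": cast_member}, collation=CAST_COLLATION)


class RecordingCollection:
    """Stand-in for a real collection: it only records the arguments it receives."""

    def __init__(self):
        self.calls = []

    def find(self, query, collation=None):
        self.calls.append((query, collation))
        return []


movies = RecordingCollection()
find_by_cast(movies, "angelina jolie")
print(movies.calls[0])
```

With a real PyMongo collection, you would pass the collection object itself instead of the stand-in. If your driver version does not accept a plain dictionary, PyMongo also provides a `Collation` class in `pymongo.collation` that takes the same `locale` and `strength` settings.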

Summary

In this blog post, we created the SearchMovies Lex bot and tested its flow. More specifically, we:

Created a custom Lex slot type (MovieGenre)
Configured intent slots
Defined sample utterances (some of which use our predefined slots)
Tested our utterances and the specific prompt flows each of them starts

We also identified the case sensitivity of a built-in Lex slot type (AMAZON.Actor), which requires a case-insensitive index on our database.

In Part 3, we’ll get to the meat of this Lex blog post series and deploy the Lambda function that will allow us to complete our bot’s intended action (called ‘fulfillment’ in the Lex terminology).

Meanwhile, I suggest the following readings to further your knowledge of Lex and MongoDB:

About the Author - Raphael Londner

Raphael Londner is a Principal Developer Advocate at MongoDB, focused on cloud technologies such as Amazon Web Services, Microsoft Azure and Google Cloud Engine. Previously he was a developer advocate at Okta as well as a startup entrepreneur in the identity management space. You can follow him on Twitter at @rlondner.
