
General Information About Alexa Development


Alexa skill execution overview

This is basically a restatement of the guide at https://moduscreate.com/blog/build-an-alexa-skill-with-python-and-aws-lambda/ with some additional clarification. It traces the path of execution as a user interacts with our skill.

  1. User issues voice command to Echo by saying, "Alexa" followed by a skill name and an intent. The intent may have parameters.

    In this case, something like: "Alexa, ask Boston Info when is trash day?"

    skill name : Boston Info
    intent     : find trash days
    parameters : 1 Main Street apartment 2
    

    We will give this intent a name, TrashDayIntent. You can see this in the intent schema.

  2. Echo sends the request to the Alexa Service Platform.

    The platform handles the speech recognition and translates the voice command above into a JSON document containing the intent and any parameters.

    This JSON is sent to the skill (Boston Info in this example).

    intent    : TrashDayIntent
    parameter : "1 Main Street apartment 2"
    
  3. The skill receives the JSON.

    We're implementing the skill as an AWS Lambda function, so the JSON is sent to the Lambda function at the ARN associated with the skill.

  4. The Lambda contains custom code that parses the JSON to identify the intent and corresponding arguments (in this example, the address).

    The code then gathers data for the response. In this case that means a call to data.boston.gov to get the string of trash days associated with the provided address. Alternatively, this might mean accessing a database or session information.

    This response data is serialized into a JSON response, which is returned to the Alexa Service Platform. It contains the response both as text for Alexa to say and as text/images for the smartphone app to display.

  5. The Alexa Service Platform receives the response and conveys it to the user using text-to-speech or the app display.

This communication paradigm is shown below.
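
As a rough sketch (not the actual Boston Info code), the Lambda handler described in steps 3-5 might look something like the following. The TrashDayIntent name comes from the example above; the Address slot name and the look_up_trash_days stub are placeholders, and the sketch assumes the incoming request is an IntentRequest.

def look_up_trash_days(address):
    """Placeholder for the real lookup against data.boston.gov."""
    return "Monday and Thursday"

def lambda_handler(event, context):
    # Step 3: the JSON request from the Alexa Service Platform arrives as `event`.
    intent = event["request"]["intent"]

    if intent["name"] == "TrashDayIntent":
        # Step 4: read the argument from its slot and gather the response data.
        address = intent["slots"]["Address"]["value"]
        speech = "Trash is collected on {}.".format(look_up_trash_days(address))
    else:
        speech = "Sorry, I didn't understand that."

    # Steps 4-5: the response carries text for Alexa to speak and card text
    # for the companion app to display.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "card": {"type": "Simple", "title": "Trash Day", "content": speech},
            "shouldEndSession": True,
        },
    }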

Deployment Notes

Note: See Deploying Your Skill to learn how we currently push code to our development environments.

Because the Python code in Boston Info's Lambda function uses external libraries, it must be uploaded as a .zip file.

To generate this .zip file, we must install all of the required Python packages in a directory that contains our code. Amazon provides instructions on how to do so: https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html

Once all the requisite libraries are installed, compress the contents of the directory. The instructions note:

Important: Zip the directory content, not the directory. The contents of the Zip file are available as the current working directory of the Lambda function.

Recall that in Part 2 of the installation instructions we set the Handler to lambda_function.lambda_handler. This specifies the function that is executed when a voice command is issued to the Alexa device. If we compress the containing directory instead of its contents, lambda_function.py ends up inside a subdirectory of the package and Lambda cannot find the handler.
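
As a sanity check (assuming the handler lives in lambda_function.py, and using the requests package purely as an example dependency installed with something like pip install -t . per the AWS instructions above), the top level of the .zip should look roughly like this, with no wrapping project directory:

lambda_function.py           # defines lambda_handler
requests/                    # example installed dependency
requests-2.x.x.dist-info/    # metadata installed alongside it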

Miscellaneous Alexa Skill Information

Below we've listed some basic vocabulary that will be useful to know when working on an Alexa skill.

ARN (Amazon Resource Name)

From Amazon's documentation:

Amazon Resource Names (ARNs) uniquely identify AWS resources. We require an ARN when you need to specify a resource unambiguously across all of AWS, such as in IAM policies, Amazon Relational Database Service (Amazon RDS) tags, and API calls. https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html#arn-syntax-lambda

Our skill's code lives in an AWS Lambda function, which we can identify by its ARN.
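
For example, a Lambda function's ARN has the form arn:aws:lambda:[region]:[account id]:function:[function name], e.g. arn:aws:lambda:us-east-1:123456789012:function:BostonInfo (the region, account ID, and function name here are placeholders).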

Slot

Slots are used for intents that require parameters. Each slot must have:

  1. a name (a string describing the slot)
  2. a type (this can be a type preconfigured by Amazon or a custom type)

The preconfigured slot type for a street address is described here.
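
For example, an Address slot for our skill could use the built-in AMAZON.PostalAddress type, which matches spoken street addresses such as "1 Main Street apartment 2".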

Information on defining a custom slot type is available here.

Sample Utterances

Alexa needs a list of phrases that correspond to each of our skill's intents.

We provide this in the settings for our skill in the developer console (see Part 1 of installation).

The format for the list of sample utterances is

[intent] [phrase]

The phrase may contain a reference to a slot if there is one associated with the intent it invokes. The format for this is

{slot_name}

Example of a sample utterance:

SetAddressIntent my address is {Address}
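
In practice each intent is given several sample utterances so that different phrasings map to it, for instance (illustrative only, not our actual list):

SetAddressIntent I live at {Address}
SetAddressIntent set my address to {Address}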

Lambda

Our skill will invoke an AWS Lambda function. This is where the code that produces a response to the Alexa voice command resides.

There are several language options for this code, including JavaScript (Node.js), Java, C#, and Python (2.7 or 3.6).

Selecting Python, we are provided the following template:

def lambda_handler(event, context):
    # TODO implement
    return 'Hello from Lambda'

The event argument is the JSON received from the Alexa platform. It contains the intent and slot information from the voice command.

Structure of the event object:

{
  "session": {
    "sessionId": "[session id]",
    "application": {
      "applicationId": "[application id]"
    },
    "attributes": {},
    "user": {
      "userId": "[user id]"
    },
    "new": true
  },
  "request": {
    "type": "[request type, e.g., IntentRequest]",
    "requestId": "[request id]",
    "timestamp": "[timestamp]",
    "intent": {
      "name": "[name of the invoked intent]",
      "slots": {
        "[slot name]": {
          "name": "[name of the slot]",
          "value": "[value of the slot]"
        }
      }
    },
    "locale": "en-US"
  },
  "version": "1.0"
}

The elements of this event object are discussed in detail at: https://developer.amazon.com/docs/custom-skills/request-and-response-json-reference.html.
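
As a hedged sketch of how a handler might read this structure, the helper below pulls out a named slot's value (the Address name is just an example; a slot can arrive with no value if the user never supplied one):

def get_slot_value(event, slot_name):
    """Return the spoken value of a slot, or None if it wasn't provided."""
    intent = event.get("request", {}).get("intent", {})
    return intent.get("slots", {}).get(slot_name, {}).get("value")

# e.g. inside lambda_handler:
#   intent_name = event["request"]["intent"]["name"]
#   address = get_slot_value(event, "Address")   # "1 Main Street apartment 2"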

Back   |   Next