Using the Rasa Core story format as requirements

Rasa Core Stories as Functional Requirements for your Chatbot

Building a chatbot with Rasa is relatively easy once you get your head around the technology. Scaling it into a product within a multidisciplinary team is often a much bigger challenge. After a year and a half of consulting with organizations of all sizes and corporate cultures, our primary success metric has become how well the team we collaborate with takes ownership of a project and moves it forward without us.

The key is to help those teams adapt their development process and tools to a chatbot project, and that is essentially about writing good requirements. We implement the Rasa stack, and Rasa Core stories have become the most natural way to formulate functional requirements. The format facilitates communication and speeds up development and testing on all the projects we use it on.

I am going to explain how we use Rasa Core stories as requirements and how these requirements are used throughout the development process. But first, I would like to explain how, in our view, chatbots are different from GUI apps, and the drawbacks of the common methods used to formalize requirements.

Why chatbot requirements are different from other apps

A GUI is usually self-explanatory

Whether it's a page, a widget inside a complex web app, or a screen in a mobile app, a UI constrains the set of possible actions. You can only click on clickable items, enter text in text fields, and so on. Mockups offer an implicit description of a functionality, and often only a few words are needed to complement the visuals and present requirements exhaustively.

A conversational interface, however, is not self-explanatory by design. What you can or can't do is revealed through feedback: for example, when a bot asks you to answer yes or no after an unexpected input, you understand that you must say "yes" or "no". As a result, while some requirements can remain implicit in applications with graphical interfaces, all requirements must be written explicitly for a chatbot.

Another aspect is that a GUI always behaves the same, whatever you were doing on the previous screen or two screens before; it's generally insensitive to context. A conversation segment, however, can be very dependent on what happened a few turns before.

Natural Language Understanding vs. Dialogue Management

There are generally two main components in a chatbot. One is Natural Language Understanding (NLU) and the other is the dialogue engine. The former is in charge of understanding what the user is saying; the latter must decide what to do or say. If I say "Hi", the NLU returns the intent greet, and given the intent greet, the dialogue engine responds with "Hi there".

Example of the components of a task-oriented chatbot

You could say this is similar to the design vs. code split in the world of GUI apps, but there is a major difference. When there is a glitch in the display, it's always a bug you can assign to a front-end engineer. NLU, on the other hand, does not require coding skills and can be handled by domain experts, so it's important to keep things separated, for at least two reasons:

  • First, it's not always obvious for engineers to infer an intent from a sentence. "Hi" is clearly greet, but is "Show me EBITDA last 5 yrs" as obvious if your financial bot has, say, 200 intents? That is why describing the conversation in plain text (in a prototyping tool, for example) makes a requirement incomplete.
  • Second, suppose that NLU gets the wrong intent (goodbye instead of greet): the dialogue engine will respond "Bye". That's a bug. But should it be assigned to the domain expert in charge of the NLU or to the engineer in charge of the dialogue management? At this stage, it's clearly NLU related and nothing indicates that there's an error in the dialogue engine (the NLU training data that owns this mapping is sketched after this list). However, because the requirement is loose, a product owner or QA analyst will generally assign the bug to an engineer, who will investigate for 10-15 minutes, find out it is NLU related, and re-assign it to the right person. NLU can be tricky on projects with a large scope, and routing 100 NLU bugs to engineers is very frustrating and results in a considerable waste of time.
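Concretely, the mapping from sentences to intents lives in the NLU training data, which domain experts can maintain without touching any code. A minimal sketch (the example sentences are ours, not a prescribed set):

    ## intent:greet
    - Hi
    - Hello
    - Good morning

    ## intent:goodbye
    - Bye
    - See you later

If "Hi" ends up classified as goodbye, the fix belongs in this data, not in the dialogue engine.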

Good requirements should keep NLU and dialogue management separated.

Common ways of defining chatbot requirements - and their drawbacks

Prototyping tools

It's not uncommon to see requirements provided as prototypes made with tools such as Botsociety (probably the most popular), Botmock, or Botpreview (my favorite for its simplicity and its open source foundations). Prototypes are great for communicating a vision, an impression, a feeling of a product, but they are a poor way to communicate requirements. Look at the simple prototype (made with Botsociety) below. Even if it looks great for demo purposes, there is a major issue (setting aside the discussion above about separating NLU from dialogue): you can't get an exhaustive view of a conversation without playing it. For longer conversations, that's an awful experience for engineers, who will naturally be reluctant to replay the video for the 100th time just to check whether the 19th turn is implemented correctly. Using prototypes as requirements will only increase bugs, QA and development time and costs, and hurt product quality.

Chatbot prototype example (made with Botsociety)

Flowcharts

Flowcharts are great for brainstorming and high-level design. They help you get the big picture. But to be useful they must be easy to read, and to stay easy to read they must remain general and avoid too much detail: big picture, not too many details.

Example of a conversation flow described with a flowchart

Requirements, on the other hand, especially in agile development, must be specific and detailed. Each agile story should describe in full the conversation segment to be implemented, including flows, wording, or the keys used to retrieve the bot responses if they are stored in a CMS. To avoid confusion, they should not contain pieces of other dialogues unless those help establish the context.

That is why flowcharts don't work well as functional requirements. They are either too detailed and impossible to read, or too general and lacking essential information. In both cases, the result is frustration, unnecessary back and forth, delays, and product instability.

The Rasa Core story format

One of Core's key features is its markup language for writing conversation segments. Let's look at a simple example.
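Here is what a short greeting story can look like; the intent and response names (greet, utter_greet, mood_great, utter_happy) are labels we chose for this example, not names imposed by Rasa:

    ## greetings
    * greet
      - utter_greet
    * mood_great
      - utter_happy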
This is called a Story. A line starting with an asterisk (*) means the user is speaking, and a line starting with a dash (-) means the bot is answering.

greet is an intent. An intent is a meaning shared by the different sentences a user can say in that situation. greet would be trained with examples such as "Hi", "Hello", "Good morning", "Good day", ... (You can read this if you want to understand intent training and classification in depth.)

utter_greet is a bot response. The content of this response can be defined later and/or somewhere else, so it is just a key that can be used later to retrieve an actual response written in plain English (or any other language).

That's how you write dialogues with Rasa Core. Yes, it gets more complicated when you need to integrate APIs that influence the bot's decisions (such as validating a transfer depending on the current balance of an account), but you can let engineers deal with that. As a product person, that's all you need to know.
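For the engineers who do deal with it, such an integration is typically a small custom action written in Python. Here is a minimal sketch using rasa_core_sdk; the names (action_validate_transfer, the amount slot, get_balance) are hypothetical:

    # Minimal sketch of a custom action calling an external API (hypothetical names)
    from rasa_core_sdk import Action
    from rasa_core_sdk.events import SlotSet


    def get_balance(account_id):
        # Placeholder for a call to a real banking API
        return 1000.0


    class ActionValidateTransfer(Action):
        def name(self):
            # The name referenced in stories: - action_validate_transfer
            return "action_validate_transfer"

        def run(self, dispatcher, tracker, domain):
            amount = tracker.get_slot("amount")  # filled earlier by NLU
            balance = get_balance(tracker.sender_id)
            if amount is not None and amount <= balance:
                dispatcher.utter_message("Your transfer is confirmed.")
                return [SlotSet("transfer_validated", True)]
            dispatcher.utter_message("Sorry, your balance is too low for this transfer.")
            return [SlotSet("transfer_validated", False)]

In a story, this appears as a plain "- action_validate_transfer" line, so the requirement format stays the same. Now, how is a story different from a conversation?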

An instance of the Rasa Core story could be:

  • (User) Hi
    • (Bot) How are you
  • (User) I am good, thanks!
    • (Bot) Great, I am happy for you

But it could also be:

  • (User) Good morning robot
    • (Bot) Hello Friend, what's up?
  • (User) All good
    • (Bot) 👍

And so on. A Core story describes up to an infinity of conversations: "Hi" and "Good morning robot" are two instances of greet, and "How are you" and "Hello Friend, what's up?" are two possible values of utter_greet. A first reason why this format makes a good requirement is that it's complete: one story describes all the possible conversations given by the combinations of values for intents and utterances. But wait, there's more:

Rasa Core stories abstract domain knowledge

As I said above, it's not hard for an engineer to infer that "Hi" is an instance of greet. Now, how likely are they to infer the correct intent of "ev to ebit ratio in the last 5 years" among the 150+ intents of an advanced financial information chatbot? Providing example-based dialogues (e.g. using prototyping tools) to the engineering team introduces more and more friction as the project grows in complexity, and developers will need more and more clarifications from product owners. The Rasa story format avoids that by providing intent- and entity-based dialogue examples.
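For instance, a requirement covering that sentence can reference the intent and its entities directly in the story; the names below (financial_metric, metric, period, action_show_metric) are hypothetical:

    * financial_metric{"metric": "EV/EBIT", "period": "5 years"}
      - action_show_metric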

Rasa Core stories facilitate collaboration with copywriters

Another point of friction is when the person in charge of writing texts needs to understand which texts they should write and how those texts will be delivered. Stories are self-explanatory, and keys (such as utter_greet) can be used as a reference in a JSON file, an Excel sheet (ugh), or an external CMS.
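In the simplest setup, those keys resolve to templates in the Rasa Core domain file, where a copywriter can list one or several wording variants per response; Rasa then picks one whenever the response is uttered. A sketch:

    templates:
      utter_greet:
        - text: "How are you"
        - text: "Hello Friend, what's up?"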

Rasa Core stories can be used as test cases for test automation and TDD

The cherry on the cake: well-written requirements can be used as-is for test automation and TDD. Setting up test automation takes only a few minutes with the Rasa Add-ons. Test automation improves the general stability of your product, and it also speeds up development enormously, since you can test hundreds of conversations in seconds.
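Even without the Add-ons, plain Rasa Core can replay a file of stories against a trained dialogue model and report where the predicted actions diverge from the expected ones. The exact flags depend on your Rasa Core version, but the command looks roughly like this (the paths are examples):

    python -m rasa_core.evaluate -d models/dialogue -s test_stories.md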

An example of a conversational requirement using Rasa Core stories

Now that I have convinced you that a Rasa Core story is a great format for functional requirements, how does that work in practice? Let's say you are preparing a sprint and want to write an agile story for a greeting feature: "As Robert, I want to greet the chatbot and expect some empathy".
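The three cases below sketch what those requirements can look like; as before, the intent and response names (mood_great, mood_unhappy, out_of_scope, and the corresponding utter_ actions) are labels we chose, not names imposed by Rasa.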

Case 1 - when I feel good
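    ## greetings - happy path
    * greet
      - utter_greet
    * mood_great
      - utter_happy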
Case 2 - when I feel bad

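    ## greetings - unhappy path
    * greet
      - utter_greet
    * mood_unhappy
      - utter_cheer_up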
Case 3 - when I digress, the bot should get me back on track with an error message
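    ## greetings - digression
    * greet
      - utter_greet
    * out_of_scope
      - utter_error
      - utter_greet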
(Note: the Add-ons are super useful for these user validations)

Conclusion

The Rasa Core story format is a great way to express requirements:

  • It is easy to read and to write for all team members, including product owners, business analysts, and copywriters.
  • It abstracts domain knowledge away from engineers, better separates NLU and dialogue management tasks, and helps categorize bugs faster.
  • It makes the work of copywriters easier.
  • It speeds up development and testing time, since requirements can be used as test cases.