Having a borderline unhealthy obsession with making stuff run offline (my “day job” is partly to blame here), an idea
started keeping me up at night - why not a chatbot? The idea
turned into a talk (not English tho) I gave at a local meetup but it
still feels like it deserves a more in-depth treatise on the why and how of building an offline chatbot.
Why?
I mean, why not? 😎 With PWAs becoming more mainstream, we can
expect to see more “edgy” stuff in the browser, working offline and acting like a desktop app. So if we can have
various games, tomato timers,
file sharing and even google drive/docs running
offline, why not a measly chatbot? For whenever you feel like talking to a not-particularly-smart glorified if / else
statement? No? Doesn’t matter, read on 😆.
How?
Take one part create-react-app, one part
compromise.js with a sprinkle of react for UI and “brains” on top, et voila - a chatbot
basis. The create-react-app bit is not particularly interesting here, so we’ll skip that and get into compromise.js
a bit before we get into the actually interesting parts.
1. Compromise.js
The self-described “modest natural-language processing in javascript” library is in fact a very cool piece of tech. It
may not be the cleverest nor the fastest NLP
solution out there, even when it comes to JavaScript land, but it runs in the browser and is perfectly capable of
running offline, as it doesn’t depend on any third-party services. Best of all, it weighs around 200kb (which is around
the same size as jQuery, only cooler 😉). With that size, it still manages to be
84%-86% accurate. Effin’ amazing.
How it achieves that takes a bit of reading and theory to understand fully, but the gist of it is:
80% of the used English language consists of the
top 1000 words.
Statistically, the most common word type is a noun, so it makes sense to assume that any word unknown to the library
is a noun.
With some word stemming / lemmatization, we can reduce the size of the
dictionary needed for the library to run (word suffixes, for example).
And some sentence-level postprocessing on top leads us to the numbers mentioned above.
A thousand-word dictionary is perfectly acceptable for in-browser use, considering modern JS library sizes. A more
in-depth explanation of how compromise works here and
here. Oh, did I mention it also does plugins and
custom lexicons?
2. The “brains” part
Compromise.js sounds great and all, but how does that help us build a chatbot? Well, it doesn’t directly, but we’ll get
to that. Fundamentally, a chatbot can be described as a program that responds to natural language via request/response
cycles. So, a very dumb and basic chatbot would simply be a “reactive1” program that can respond to
given strings. Ergo, we need a function that responds to given user input:
const getReply = (input) => { return 'This is your chatbot responding';};export default { getReply,};
Now, this gives us a one-trick-pony of a chatbot that only knows to return the response above. To make it smarter we’ll
“pirate” a page off of Amazon Alexa - we’ll organize everything the bot knows into skills:
// The structure of a "bare" skillconst ID = 'skill_identifier_here';const lexicon = {};const matchRules = [];const reply = (input, context) => {};export default { ID, lexicon, matchRules, reply,};
The theory here is that we’ll have an index of all of the skills our bot knows, and we’ll have a way of looking up the
most appropriate one by using the magic of compromise.js. Now, before we get any further into making the above-mentioned
lookup work, we need a bit more info on what compromise can do to simplify that otherwise tedious task.
When we give input to compromise.js, it’s nice enough to tag all
Parts of Speech and give us a tool to
match against them, together with regex, plain words
and our own custom tags if we happen to need them. I mean, if you decide “glue” is a preposition, you can tell
compromise to treat it as such. But just so you know, “glue” is not a preposition. So, with that out of the way, we can
put the lookup part together:
import skills from './skills/'; // The index for all of our skillsconst getReply = (input) => { const skillMatch = skills.find((s) => { const ruleMatch = s.matchRules.find((r) => nlp(input).normalize().match(r).found); if (ruleMatch) { return true; } }); nlp(input).debug(); if (skillMatch) { const reply = await skillMatch.reply(input, context); return reply; }};export default { getReply,};
So, we improved our decision-maker to look for whether an input matches against a skill’s match config. The whole
process can be described as:
We import all of our skills into the “brain” (line 1)
We look through each of a skill’s match rules to find what works for the given input (lines 4-12);
If we happen to find a skill, we use its reply function to return a reply (lines 16-19)
Since compromise comes with a nice debugger, we use it to give us more info (line 14)
To make the puzzle complete, let’s take a look at a full skill:
What the above does, following the bare structure we defined previously is:
Define match rules (lines 7-9), giving our “brain” something to match against. So, whenever the brain gets a “hi”,
“hello”, “ahoy” or “greetings” as input, that is going to trigger this skill, because compromise’s .match() matches
it here. As a last-ditch effort to make it work, whenever compromise recognizes something as an “#Expression” we
trigger on that too (not ideal, but works surprisingly well).
In order not to get too boring with repetition, we randomize stuff a little bit with the “replies” array and pick a
random one on each trigger (lines 11-18).
With that done, we have a basic bot that’s not as dumb as its first iteration. This one can reply to greetings with a
greeting, making it at least somewhat context-sensitive. It’s still too dumb for anything more sophisticated, but the
basics are there.
3. Improvements
There are some glaringly obvious shortcomings with our bot right now - it doesn’t do too much right now, it’s not aware
of historical data or context and doesn’t do fallback answers when it doesn’t match any of the given skills. And I plan
on making it better in part 2 of this thing, very soon 😎. For the impatient, have a look at my example bot from the
talk I mentioned above here or see it live
here. That version has a few more tricks up its sleeve, but by the time we’re done with
this version of the bot, it will be a lot smarter than the deployed “beerbot” there.
Thanks for reading, and join me in the next installment when we improve the bot’s “smarts” significantly.
Not reactive as used in programming, but reactive as in
“readily responsive to a stimulus” - although the two feel
very similar here.
Liked this article? Subscribe to the newsletter and get notified of
new articles.