A Custom N-gram Filter for Material-UI Autocomplete
We are big fans of the Material-UI react library. Sure it might not be original--it has 62k stars on GitHub but it has a fantastic API, good support for NFRs, and is familiar to users. Not to mention it looks pretty slick!
A component I really like is <Autocomplete />
. It's still in the "lab" section of the
project, but I think it's just fine for production use. In this post we're going to show
you how you can enhance the <Autocomplete />
component's typeahead ability by
using n-grams.
Colors ๐๐๐
We need some data/options to autocomplete. In this post we'll be autocompleting colors ๐. While it might not seem worthy of
autocompletion (if you're thinking ROYGBIV) there are actually hundreds of named colors in HTML5. A fun, nerd
thing you can do is use them in your CSS by their name instead of their RGB/hex values. Two of my
favorites are coral
and cornflowerblue
:
coral
cornflowerblue
For a larger list of colors, take a look here.
Autocomplete in Material UI
As mentioned previously, we're using the <Autocomplete />
component from Material-UI. Here's
our initial implementation. It's pretty much out of the box--we're using a JSON file to populate
our list of colors. Here's what the component looks like and below you'll find the code.
โ This code is live. Give it a try! โ
Adding Some Flair
We're going to slightly enhance the <Autocomplete />
component by adding a color
swatch preview beside the name of each color. You can do this by using the startAdornment
prop for the selected color and the renderOption prop
for the options.
You can see in the code below that I'm using some additional comoponents from the
Material-UI package such as <Typography />
.
โ This code is live. Give it a try! โ

A little flair goes a long way.
Adding in n-grams
Next we're going to improve the typeahead functionality of <Autocomplete />
component. Out of the box,
it uses a basic text matching filter (i.e. does search string exist in target string). This
is straightforward but it doesn't account for things like typos. We're going to switch
it to use trigrams. Trigrams are 3 character strings
that make up a piece of text. In our case we'll actually be computing trigrams, bigrams, and unigrams.
By using all three, we get a really nice blend of precision and recall in our filtering
algorithm. It strikes the right balance between matching exactly when the user knows what they're
looking for but tolerating spelling mistakes or other uncertainty.

The out of the box implementation returns no results for *crnflower*.
First we'll write a couple of helper functions to make this easier.
ngrams
will convert a string into an array of grams. So "corn".ngrams(3)
becomes ["c", "o", "r", "n", "co", "or", "rn", "cor", "orn"]
.
intersect
will return the shared values of two arrays. So
intersect(['a', 'b'], ['a', 'c'])
will return ['a']
. Yes I agree, truly innovative.
Now for the main course. We're going to define the filterOptions
property of <Autocomplete />
. This
is where we'll be implementing our custom n-grams matching.
filterOptions
takes 2 arguments options
and state
:
options
- list of options, in our case a list of colorsstate
- the input state of<Autocomplete />
My code for filterOptions
is going to convert both the options and the state
value into an array of ngrams then compare them. We'll do that in the following steps:
- compute trigrams for the input state value
- for each option, compute trigrams
- count the number of identical trigrams for a given option and the input state
- rank the options by the number of matches and (in the event of a tie), string length
- keep the top 10 and return them
โ This code is live. Give it a try! โ

Much better.
Upping the ante
Interesting in its own right is the colornames.org project which crowdsources the naming of every RGB color. It sports well over a million named colors which you can download here.
Since working with all 1,965,852 colors would be a blog post in itself, I'm going to limit this example to names that have been voted on at least 6 times--just over 20k colors.
And lo and behold--our <Autocomplete />
is still working quite well with 20k options.
โ This code is live. Give it a try! โ
There are some very creative names like "Light Salmon Pink On A Grill" and "Cheeto Hands".
There's also a lot of variants of "cornflower". All of them pretty similar to the cornflowerblue
we used in the first example.

Turns out there a lot of variants of cornflower.
What is a cornflower anyway?
I'd be remiss if I didn't at least touch on this. Having used cornflower
the color
in many a project, I don't actually know what a cornflower is. For you plant enthusiasts
out there, Centaurea cyanus is an annual
flowering plant native to Europe. It gets it's name because it often pops up as a weed in--you guessed it--corn fields!

I must say it's a most striking color.
Before you go
For the code in its entirety, you can see it here github.com/userbugreport/mui-autocomplete-with-ngrams.
When we're not blogging about European meadow flowers, we're building User Bug Report. User Bug Report automatically generates bug report with detailed, reproducible data. It even comes with a screenshot! Give it a try, your dev and QA teams will love it.