A Custom N-gram Filter for Material-UI Autocomplete

We are big fans of the Material-UI react library. Sure it might not be original--it has 62k stars on GitHub but it has a fantastic API, good support for NFRs, and is familiar to users. Not to mention it looks pretty slick!

A component I really like is <Autocomplete />. It's still in the "lab" section of the project, but I think it's just fine for production use. In this post we're going to show you how you can enhance the <Autocomplete /> component's typeahead ability by using n-grams.

Colors ๐ŸŒˆ๐ŸŒˆ๐ŸŒˆ

We need some data/options to autocomplete. In this post we'll be autocompleting colors ๐ŸŒˆ. While it might not seem worthy of autocompletion (if you're thinking ROYGBIV) there are actually hundreds of named colors in HTML5. A fun, nerd thing you can do is use them in your CSS by their name instead of their RGB/hex values. Two of my favorites are coral and cornflowerblue:

coral

cornflowerblue

.bg-coral {
background-color: coral;
}
.bg-cornflower {
background-color: cornflowerblue;
}

For a larger list of colors, take a look here.

Autocomplete in Material UI

As mentioned previously, we're using the <Autocomplete /> component from Material-UI. Here's our initial implementation. It's pretty much out of the box--we're using a JSON file to populate our list of colors. Here's what the component looks like and below you'll find the code.

โ†‘ This code is live. Give it a try! โ†‘


import colors from "./colors.json";
<Autocomplete
options={colors}
getOptionLabel={(option) => option.displayName}
renderInput={(params) => (
<TextField {...params} label="Colors" variant="outlined" />
)}
style={{ width: 300 }}
/>;

Adding Some Flair

We're going to slightly enhance the <Autocomplete /> component by adding a color swatch preview beside the name of each color. You can do this by using the startAdornment prop for the selected color and the renderOption prop for the options.

You can see in the code below that I'm using some additional comoponents from the Material-UI package such as <Typography />.

// this is a 20x20 colored square
const SampleColor = ({ hex }) => (
<div
style={{
display: "inline-block",
marginBottom: 3,
height: 20,
width: 20,
backgroundColor: hex,
border: "1px solid grey",
}}
/>
);
<Autocomplete
options={colors}
getOptionLabel={(option) => option.displayName}
renderInput={(params) => {
// convert the value in the textbox (params.inputProps.value) to a
// color object
const selectedColor = colors.find(
({ displayName }) => displayName == params.inputProps.value
);
return (
<TextField
{...params}
label="Colors"
variant="outlined"
InputProps={{
...params.InputProps,
// it's important we set startAdornment after we pass the existing
// params.InputProps. this way if there happens to be an existing
// InputProp.startAdornment property, we'll override it
startAdornment: selectedColor && (
<InputAdornment position="start">
<SampleColor hex={selectedColor.hex} />
</InputAdornment>
),
}}
/>
);
}}
renderOption={({ displayName, hex }) => (
<div>
<SampleColor hex={hex} />{" "}
<Typography noWrap style={{ display: "inline-block" }}>
{displayName}
</Typography>
</div>
)}
style={{ width: 300 }}
/>;

โ†‘ This code is live. Give it a try! โ†‘


flair
A little flair goes a long way.

Adding in n-grams

Next we're going to improve the typeahead functionality of <Autocomplete /> component. Out of the box, it uses a basic text matching filter (i.e. does search string exist in target string). This is straightforward but it doesn't account for things like typos. We're going to switch it to use trigrams. Trigrams are 3 character strings that make up a piece of text. In our case we'll actually be computing trigrams, bigrams, and unigrams. By using all three, we get a really nice blend of precision and recall in our filtering algorithm. It strikes the right balance between matching exactly when the user knows what they're looking for but tolerating spelling mistakes or other uncertainty.

cornflower example
The out of the box implementation returns no results for *crnflower*.

First we'll write a couple of helper functions to make this easier.

ngrams will convert a string into an array of grams. So "corn".ngrams(3) becomes ["c", "o", "r", "n", "co", "or", "rn", "cor", "orn"].

intersect will return the shared values of two arrays. So intersect(['a', 'b'], ['a', 'c']) will return ['a']. Yes I agree, truly innovative.

// turn a string into n 'grams'. we also lowercase the grams.
String.prototype.ngrams = function (n) {
var r = [];
for (var j = 1; j <= n; j++) {
for (var i = 0; i <= this.toLowerCase().length - j; i++) {
r.push(this.toLowerCase().substring(i, i + j));
}
}
return r;
};
// shamelessly copied this from stackoverflow
function intersect(array1, array2) {
return array1.filter((value) => array2.includes(value));
}

Now for the main course. We're going to define the filterOptions property of <Autocomplete />. This is where we'll be implementing our custom n-grams matching.

filterOptions takes 2 arguments options and state:

  • options - list of options, in our case a list of colors
  • state - the input state of <Autocomplete />

My code for filterOptions is going to convert both the options and the state value into an array of ngrams then compare them. We'll do that in the following steps:

  1. compute trigrams for the input state value
  2. for each option, compute trigrams
  3. count the number of identical trigrams for a given option and the input state
  4. rank the options by the number of matches and (in the event of a tie), string length
  5. keep the top 10 and return them
filterOptions={(options, state) => {
const { inputValue } = state;
if (!inputValue) {
return options.slice(0, 1000);
}
const inputTrigrams = inputValue.ngrams(3);
return (
options
// iterate over each option and compute intersect(ngram(search), all_color_ngrams)
.map((option) => {
const nMatches = intersect(
inputTrigrams, // ngrams of search input (i.e. "crnflower")
option.displayName.ngrams(3) // ngrams of the option (i.e. "cornflowerblue")
).length;
return {
...option,
nMatches,
};
})
// toss out anything that had no matches
.filter(({ nMatches }) => nMatches > 0)
// for sanity's sake we'll only display the top 10 results. we're going to
// order by `nMatches`. in the event of a tie the shorter word wins.
//
// i.e. if we're searching for "blue" then "Blue" is #1 and "Green Blue" #2
.sort((a, b) => {
const diff = b.nMatches - a.nMatches;
if (diff) {
return diff;
}
// if they have the same number off matching trigrams, shorter one wins
return a.displayName.length - b.displayName.length;
})
// return the top 10
.slice(0, 10)
);
}}

โ†‘ This code is live. Give it a try! โ†‘


cornflower example better
Much better.

Upping the ante

Interesting in its own right is the colornames.org project which crowdsources the naming of every RGB color. It sports well over a million named colors which you can download here.

Since working with all 1,965,852 colors would be a blog post in itself, I'm going to limit this example to names that have been voted on at least 6 times--just over 20k colors.

And lo and behold--our <Autocomplete /> is still working quite well with 20k options.

โ†‘ This code is live. Give it a try! โ†‘


There are some very creative names like "Light Salmon Pink On A Grill" and "Cheeto Hands".

There's also a lot of variants of "cornflower". All of them pretty similar to the cornflowerblue we used in the first example.

cornflower menu
Turns out there a lot of variants of cornflower.

What is a cornflower anyway?

I'd be remiss if I didn't at least touch on this. Having used cornflower the color in many a project, I don't actually know what a cornflower is. For you plant enthusiasts out there, Centaurea cyanus is an annual flowering plant native to Europe. It gets it's name because it often pops up as a weed in--you guessed it--corn fields!

cornflower flower
I must say it's a most striking color.

Before you go

For the code in its entirety, you can see it here github.com/userbugreport/mui-autocomplete-with-ngrams.

When we're not blogging about European meadow flowers, we're building User Bug Report. User Bug Report automatically generates bug report with detailed, reproducible data. It even comes with a screenshot! Give it a try, your dev and QA teams will love it.


Want to read more?
Sign up for our newsletter.