A smarter way to search using query suggestions

We’re all familiar with the concept of query suggestions: we enter terms in a search box, and as if by magic are offered a set of plausible completions to our search string. This saves time, helps to minimise error-prone keystrokes, and often gives us the inspiration we need to form better queries than we’d originally anticipated. Since their introduction over a decade ago, query suggestions have become an essential part of the web search experience:

google1.png

However, for other types of search application, their benefit is somewhat less apparent. Web search is predicated on the notion that most users’ queries are composed of relatively short sequences of keywords, perhaps with some elementary linguistic structure. In this context, appending related (or even serendipitous) terms to the tail of an unstructured query can offer immediate utility. But for many professional search applications, particularly those that employ structured searching methods, this assumption breaks down. For example, recruiters commonly use Boolean strings to source candidates, crafting expressions such as the following:

(“business analyst” or “systems analyst” or “system analyst” or “data analyst” or “requirements analyst” or “functional analyst”) and crystal and report* and analy* and data near analy* and not inventory and not retail and not (ecommerce or “e-commerce” or b2b or b2c)

In this context, it’s no longer sufficient to consider only which terms are offered. We must also consider how and where they are applied, since appending them to the end of the search string is no longer a productive strategy:

google2.png

When solving complex search challenges such as this, the user needs control over not just the choice of search suggestions, but also the location and manner in which they are applied. This requires a different approach.

So what is the alternative? Well, let’s enter this string into 2Dsearch and see what happens:

nested.png

This is how the above example is rendered using what we call the Nested Layout, in which logical expressions are displayed as a set of nested containers. In this form, it becomes apparent that:

  • The overall expression consists of three sub-expressions and a handful of keywords

  • They are all connected by a logical AND

  • Three of the elements have been negated (as indicated here by the black background)
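
To make that structure concrete, here is a minimal sketch of how the Boolean string above could be held as a tree of nested containers. The `Term` and `Group` classes are our own illustrative names, not 2Dsearch's internal data model, and we assume for simplicity that the proximity clause can be treated as a group with a `NEAR` operator.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Term:
    """A leaf keyword or phrase (hypothetical class, for illustration)."""
    text: str
    negated: bool = False


@dataclass
class Group:
    """A container joining its children with one operator (AND, OR, NEAR)."""
    op: str
    children: List["Node"] = field(default_factory=list)
    negated: bool = False


Node = Union[Term, Group]

# The three sub-expressions of the recruiter's string.
roles = Group("OR", [Term(t) for t in (
    "business analyst", "systems analyst", "system analyst",
    "data analyst", "requirements analyst", "functional analyst")])
proximity = Group("NEAR", [Term("data"), Term("analy*")])
excluded = Group(
    "OR",
    [Term(t) for t in ("ecommerce", "e-commerce", "b2b", "b2c")],
    negated=True,
)

# Everything at the top level is connected by a logical AND.
query = Group("AND", [
    roles, Term("crystal"), Term("report*"), Term("analy*"), proximity,
    Term("inventory", negated=True), Term("retail", negated=True), excluded,
])

sub_expressions = [c for c in query.children if isinstance(c, Group)]
keywords = [c for c in query.children if isinstance(c, Term)]
negated = [c for c in query.children if c.negated]

print(len(sub_expressions), len(keywords), len(negated))  # 3 5 3
```

Once the query is a tree rather than a string, each container becomes an addressable target: a suggestion can be dropped into `roles` or `excluded` specifically, rather than onto the tail of the text.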

Transforming logical structure into a visual layout provides a more direct mapping between the underlying semantics and the physical appearance. Moreover, we can now manipulate this expression in ways that were previously not possible: for example, if we find that the term ‘analy*’ is not matching anything useful, we can simply delete it:

delete.png

Alternatively, we might take the view that it’s not useful right now, but it might be later, in which case we can instead temporarily disable it:

disable.png
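
The difference between deleting and disabling can be sketched in a few lines. This is an assumption about how such a feature might work, not a description of 2Dsearch's implementation: the `Node` class, its `enabled` flag and the `render` function are hypothetical names. A disabled element stays in the tree but is skipped when the query is serialised; a deleted one is gone.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A keyword (no children) or a group (children joined by `op`)."""
    text: str = ""
    op: str = ""
    children: List["Node"] = field(default_factory=list)
    enabled: bool = True  # hypothetical flag: False means temporarily disabled


def render(node: Node) -> str:
    """Serialise the enabled part of the tree back to a Boolean string."""
    if not node.children:
        return node.text
    parts = [render(child) for child in node.children if child.enabled]
    parts = [p for p in parts if p]  # drop groups left with nothing enabled
    if len(parts) <= 1:
        return "".join(parts)
    return "(" + f" {node.op} ".join(parts) + ")"


query = Node(op="AND", children=[Node("crystal"), Node("report*"), Node("analy*")])
analy = query.children[2]

analy.enabled = False           # disable: hidden from the query, kept in the tree
print(render(query))            # (crystal AND report*)

analy.enabled = True            # re-enable later with no retyping
print(render(query))            # (crystal AND report* AND analy*)

query.children.remove(analy)    # delete: removed from the tree altogether
print(render(query))            # (crystal AND report*)
```

Both operations produce the same query string in the short term; the difference is that the disabled term can be restored with a single action.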

In each case, the effects of these operations are displayed in real time in the adjacent search results window, as a set of matching social profiles:

SERP.PNG

But more importantly, this is our opportunity to apply automated query suggestions in a more precise and controlled manner. Consider the group of terms shown in the top left. It seems to be a fairly comprehensive list of synonyms for ‘business analyst’ roles. But have we missed anything? We could brainstorm for a while, or even use Google for suggestions. A smarter approach might be to select one of the terms (e.g. ‘business analyst’) and invoke the ‘Suggest terms’ option:

suggest1.png

Now, we can see immediately that ‘BA’ would seem a further useful synonym (among others). Indeed, when we select that term and add it to our group, the count of matching search results increases from 2,820 to 3,070:

afterSuggest.png

But that search result list is still too long to exhaustively review, and scanning it briefly, we also notice a number of false positives. Can anything be done about that? One solution is to increase the number of terms in the negated (‘NOT’) group that is shown top right. So following the same approach, we can explore suggested terms for ‘b2b’:

suggest2.png

‘Business-to-business’ looks like a good additional term, and indeed, adding it to the negated group reduces the result list to 1,940. We could continue in this manner and refine our results down to a handful of highly precise profiles, but hopefully the value of the approach is now apparent.
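
Why the same suggestion mechanism broadens the results in one place and narrows them in another can be shown with a toy example. The sketch below uses a handful of made-up profile snippets and a deliberately naive whole-word matcher; the data, the `mentions` and `count_matches` helpers and the resulting counts are illustrative assumptions only, and bear no relation to the real figures quoted above.

```python
import re
from typing import Iterable, List

# Made-up profile snippets, for illustration only; not real search results.
PROFILES: List[str] = [
    "business analyst, crystal reports, finance",
    "ba, crystal reports, b2b sales",
    "ba, crystal reports, healthcare",
    "data analyst, crystal reports, business-to-business marketing",
    "systems analyst, crystal reports, insurance",
    "business analyst, crystal reports, business-to-business software",
]


def mentions(profile: str, term: str) -> bool:
    """True if the term occurs in the profile as a whole word or phrase."""
    return re.search(r"\b" + re.escape(term) + r"\b", profile) is not None


def count_matches(any_of: Iterable[str], none_of: Iterable[str]) -> int:
    """Count profiles with at least one any_of term and no none_of term."""
    any_of, none_of = list(any_of), list(none_of)
    return sum(
        1 for p in PROFILES
        if any(mentions(p, t) for t in any_of)
        and not any(mentions(p, t) for t in none_of)
    )


roles = ["business analyst", "systems analyst", "data analyst"]
excluded = ["b2b"]

baseline = count_matches(roles, excluded)
# A suggestion added to the OR group can only widen the result set.
with_ba = count_matches(roles + ["ba"], excluded)
# The same kind of suggestion added to the NOT group can only narrow it.
with_exclusion = count_matches(roles + ["ba"], excluded + ["business-to-business"])

print(baseline, with_ba, with_exclusion)  # 4 5 3
```

The suggested term is only half of the decision; the container it lands in determines whether it improves recall or precision.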

Of course, we could also have achieved the same end by modifying the Boolean string itself. After all, it’s not an unduly complex Boolean expression. But faced with the interactive display on the left or the text block on the right, where would you rather start?

leftorright.png

Summary

In this post we’ve reviewed some of the shortcomings of traditional approaches to automated query suggestions, and explored the additional requirements that emerge from complex search challenges. In particular, we have shown how structured searching methodologies require that consideration be given not just to the choice of suggested terms, but also to the location and manner in which they are applied. We have also explored an alternative, visual approach which transforms logical structure into an interactive graphical representation, and shown how this facilitates a more precise and effective application of query suggestions.

In closing, we should acknowledge that we are not the first to offer query suggestions, and we probably won’t be the last. But we are the first, we believe, to offer query suggestions in this manner. We are also among a minority in publishing and sharing our work with the scientific community, and to that end we welcome feedback and scrutiny on the approach.

Stay tuned for our next post, in which we’ll explain a little about the science of how all this is done. In the meantime, head on over to 2Dsearch and try it out for yourself.