Star 0

Abstract

Full-text search systems, such as Elasticsearch and Apache Solr, enable document retrieval based on keyword queries. In many deployments these systems are multi-tenant, meaning distinct users’ documents reside in, and their queries are answered by, one or more shared search indexes. Large deployments may use hundreds of indexes across which user documents are randomly assigned. The results of a search query are filtered to remove documents to which a client should not have access.

We show the existence of exploitable side channels in modem multi-tenant search. The starting point for our attacks is a decade-old observation that the TF-IDF scores used to rank search results can potentially leak information about other users’ documents. To the best of our knowledge, no attacks have been shown that exploit this side channel in practice, and constructing a working side channel requires overcoming numerous challenges in real deployments. We nevertheless develop a new attack, called STRESS (Search Text RElevance Score Side channel), and in so doing show how an attacker can map out the number of indexes used by a service, obtain placement of a document within each index, and then exploit co-tenancy with all other users to (1) discover the terms in other tenants’ documents or (2) determine the number of documents (belonging to other tenants) that contain a term of interest. In controlled experiments, we demonstrate the attacks on popular services such as GitHub and Xen.do. We conclude with a discussion of countermeasures.

Slides