Elasticsearch terms query: excluding a large number of users

Question:

I'm working on a Tinder-like app. To exclude profiles that the user has already swiped, I use a "must_not" query like this:

<blockquote>

"must_not": [{"terms": {"swipedusers": ["userid1", "userid2", "userid3", …]}}]

</blockquote>

What are the limits of this approach? Does it scale, for example when the swipedusers array contains 2000 user ids? If there is a better, more scalable approach, I would be happy to hear about it.

Answer1:

There is a better approach. It is called "terms lookup", and it works somewhat like the traditional join you would do in a relational database.

I could try to explain it here, but everything you need is well documented on the official Elasticsearch site:

<a href="https://www.elastic.co/guide/en/elasticsearch/reference/5.0/query-dsl-terms-query.html#query-dsl-terms-lookup" rel="nofollow">https://www.elastic.co/guide/en/elasticsearch/reference/5.0/query-dsl-terms-query.html#query-dsl-terms-lookup</a>

The final solution uses two indices: one for the registered users and another that tracks the swipes of each user. For each swipe, you update the document that holds the current user's swipes. This means adding elements to an array, which is another problem in Elasticsearch (a big problem if you are using AWS managed Elasticsearch) that can only be solved with scripting. More info at <a href="https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html#_using_scripts_to_make_partial_updates" rel="nofollow">https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html#_using_scripts_to_make_partial_updates</a>
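The scripted append described above could look roughly like the following. This is only a sketch: it builds the request body in Python without sending it, and it assumes the swipe document stores the ids in a `swipedUserId` array (the field used in the query further down). The helper name `build_append_swipe_update` is made up for illustration, and the script key is `inline` as in Elasticsearch 5.x (6.0 and later call it `source`).

```python
import json


def build_append_swipe_update(swiped_user_id):
    """Build an _update body that appends one id to the swipedUserId array."""
    return {
        "script": {
            "lang": "painless",
            # Elasticsearch 5.x uses "inline"; 6.0+ renamed this key to "source".
            "inline": (
                "if (!ctx._source.swipedUserId.contains(params.id)) "
                "{ ctx._source.swipedUserId.add(params.id) }"
            ),
            # Passing the id as a parameter lets the compiled script be reused.
            "params": {"id": swiped_user_id},
        },
        # Applied only when the user's swipe document does not exist yet.
        "upsert": {"swipedUserId": [swiped_user_id]},
    }


body = build_append_swipe_update("userid2")
print(json.dumps(body, indent=2))
```

On 5.x a body like this would be sent to `POST /swipes/users/current-user-id/_update`, assuming the `swipes` index and `users` type from this answer.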

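For your case, with the terms lookup placed inside "must_not" so that already-swiped users are excluded, the query will look something like this:

GET /possible_matches/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "terms": {
            "user": {
              "index": "swipes",
              "type": "users",
              "id": "current-user-id",
              "path": "swipedUserId"
            }
          }
        }
      ]
    }
  }
}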
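If the search request is assembled in application code, the body can be built per user. The Python sketch below wraps the terms lookup in a bool "must_not" clause, so swiped profiles are excluded rather than returned. It assumes the names used in this answer (`swipes` index, `users` type, `swipedUserId` path, `user` field); the helper name `build_unswiped_query` is hypothetical, and nothing is sent to a cluster.

```python
import json


def build_unswiped_query(current_user_id, index="swipes", doc_type="users",
                         path="swipedUserId", field="user"):
    """Build a search body that excludes every id stored in the user's swipe document."""
    terms_lookup = {
        "terms": {
            field: {
                "index": index,         # index that holds the swipe documents
                "type": doc_type,       # mapping type (still required in 5.x)
                "id": current_user_id,  # swipe document of the searching user
                "path": path,           # array field that lists the swiped ids
            }
        }
    }
    # must_not turns the lookup into an exclusion filter.
    return {"query": {"bool": {"must_not": [terms_lookup]}}}


query = build_unswiped_query("current-user-id")
print(json.dumps(query, indent=2))
```

The request stays small however many ids the swipe document holds, because Elasticsearch fetches the list itself instead of receiving it inline.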

Another thing to take into account is the replication configuration of the swipes index. Since every node will perform these "joins" against that index, it is highly recommended to keep a full copy of it on each node. You can achieve this by creating the index with the "auto_expand_replicas" setting set to "0-all".

PUT /swipes
{
  "settings": {
    "auto_expand_replicas": "0-all"
  }
}
