Slow self-join delete query

Question:

Does it get any simpler than this query?

delete a.* from matches a inner join matches b ON (a.uid = b.matcheduid)

Yes, apparently it does... because the performance on the above query is really bad when the matches table is very large.

The matches table has about 220 million records. I am hoping that this DELETE query takes the size down to about 15,000 records. How can I improve the performance of the query? I have indexes on both columns. UID and MatchedUID are the only two columns in this InnoDB table, and both are of type INT(10) unsigned. The query has been running for over 14 hours on my laptop (i7 processor).

Answer1:

Deleting so many records can take a while. I think this is as fast as it can get if you're doing it this way. If you don't want to invest in faster hardware, I suggest another approach:

If you really want to delete 220 million records so that the table has only 15,000 records left, that's about 99.99% of all entries. Why not

1. Create a new table,
2. just insert all the records you want to survive,
3. and replace your old one with the new one?

Something like this might work a little bit faster:

/* creating the new table */
CREATE TABLE matches_new
SELECT a.*
FROM matches a
LEFT JOIN matches b ON (a.uid = b.matcheduid)
WHERE b.matcheduid IS NULL;

/* renaming tables */
RENAME TABLE matches TO matches_old;
RENAME TABLE matches_new TO matches;

After this you just have to check and create your desired indexes, which should be rather fast when dealing with only 15,000 records.
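As a rough sketch of that last step: CREATE TABLE ... SELECT does not copy indexes from the source table, so they have to be added to the new table by hand. The index names idx_uid and idx_matcheduid below are hypothetical, and the sketch assumes the rename above has already been done.

```sql
-- Recreate the two single-column indexes on the (now small) table.
-- The index names are placeholders, not taken from the original schema.
ALTER TABLE matches
    ADD INDEX idx_uid (uid),
    ADD INDEX idx_matcheduid (matcheduid);

-- Sanity check: confirm the row count looks right before discarding anything.
SELECT COUNT(*) FROM matches;

-- Only once the new table has been verified, drop the original data.
DROP TABLE matches_old;
```

Keeping matches_old around until the count has been checked means the swap can still be undone with another RENAME TABLE.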

Answer2:

Running EXPLAIN SELECT a.* FROM matches a INNER JOIN matches b ON (a.uid = b.matcheduid) would show which indexes exist and whether they are being used.
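A minimal sketch of that check, using only the table and column names from the question:

```sql
-- List the indexes that actually exist on the table.
SHOW INDEX FROM matches;

-- Show the plan for the join that drives the delete.
EXPLAIN
SELECT a.*
FROM matches a
INNER JOIN matches b ON (a.uid = b.matcheduid);
```

In the EXPLAIN output, the key column shows which index (if any) is chosen for each side of the join, and the rows column gives the optimizer's estimate of how many rows it expects to examine.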

Answer3:

I might be setting myself up to be roasted here, but when performing a delete operation like this in the midst of a self-join, isn't the query having to recompute the join index after each deletion?

While it is clunky and brute force, you might consider either:

A. Create a temp table to store the uids resulting from the inner join, then join to THAT, and THEN perform the delete.

OR

B. Add a boolean (bit) typed column, use the join to flag each match (this operation should be FAST), and THEN use:

DELETE FROM matches WHERE YourBitFlagColumn = TRUE

Then drop the boolean column.
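Both options could be sketched roughly as follows in MySQL. These are alternatives, not steps to run in sequence. The temporary table name uids_to_delete is hypothetical, and YourBitFlagColumn is the placeholder name used above.

```sql
-- Option A: collect the uids once, then delete by joining to the temp table.
CREATE TEMPORARY TABLE uids_to_delete (
    uid INT UNSIGNED NOT NULL PRIMARY KEY
);

INSERT INTO uids_to_delete (uid)
SELECT DISTINCT a.uid
FROM matches a
INNER JOIN matches b ON (a.uid = b.matcheduid);

DELETE a
FROM matches a
INNER JOIN uids_to_delete d ON (a.uid = d.uid);

DROP TEMPORARY TABLE uids_to_delete;

-- Option B: flag the matching rows first, then delete by the flag.
ALTER TABLE matches
    ADD COLUMN YourBitFlagColumn TINYINT(1) NOT NULL DEFAULT 0;

UPDATE matches a
INNER JOIN matches b ON (a.uid = b.matcheduid)
SET a.YourBitFlagColumn = 1;

DELETE FROM matches WHERE YourBitFlagColumn = 1;

ALTER TABLE matches DROP COLUMN YourBitFlagColumn;
```

In both variants the delete itself no longer reads from the table it is modifying through a self-join. Whether that is actually faster on 220 million rows is untested here. In option B, adding the column may itself require rewriting the whole table, depending on the MySQL version.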
