
Spearman rank correlation with missing values?

<h3>Question</h3>

I have two lists of words, each ordered by number of occurrences.

The orderings were generated by counting each word in two files sampled at different points in time.

I would like to calculate Spearman's rank correlation to see how well the ordering of the first file is preserved in the second file.

For instance:

File a: 1) is 2) went 3) work

File b: 1) is 2) work 3) went

Because the ordering is different, I would not get a score of 1.0, but I would still expect a score suggesting that the two samples are rather similar.

My problem now is missing values. A word in file A might not exist in file B. Can I use Spearman's rank correlation in this case? Or would another correlation measure be better suited?


<h3>Answer1:</h3>

For ranks in your application, you don't need to have missing values at all. When a word occurs in one file but not in the other, you can give it the last rank in the other file (or a tied last rank when several words are missing).
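A minimal sketch of that idea in plain Python, assuming each list contains every word once, in rank order. The function names `average_ranks` and `spearman_with_missing` are made up for illustration, and tied ranks are handled by averaging positions, which is one common convention rather than the only one.

```python
from statistics import mean


def average_ranks(values):
    """Rank values ascending from 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with the entry at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_with_missing(list_a, list_b):
    """Spearman's rho for two ordered word lists; absent words tie for last place."""
    vocab = sorted(set(list_a) | set(list_b))
    pos_a = {w: i for i, w in enumerate(list_a)}
    pos_b = {w: i for i, w in enumerate(list_b)}
    # A missing word gets a position past the end of its list,
    # so all missing words end up tied for the last rank.
    ra = average_ranks([pos_a.get(w, len(list_a)) for w in vocab])
    rb = average_ranks([pos_b.get(w, len(list_b)) for w in vocab])
    # Spearman's rho is the Pearson correlation of the two rank vectors.
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


print(spearman_with_missing(["is", "went", "work"], ["is", "work", "went"]))
```

For the two three-word lists from the question this prints 0.5. The sketch raises `ZeroDivisionError` when either rank vector is constant (for example, a single-word vocabulary), so real input would need a guard for that case.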

However, I am not sure of the effect on the Spearman value of lots of missing values (lots of tied last ranks). You might instead consider using a standard correlation/regression on the raw <em>relative</em> frequencies, instead of the Spearman coefficient.

Example...

Say file x has m=113 words and file y has n=234. We can create a table of relative word frequencies like so:

word          x         y
is            5/113     23/234
the           4/113     45/234
a             4/113     17/234
farnarkling   1/113     0/234
elbow         0/113     2/234
...
===============================
TOTAL         113/113   234/234

You would then calculate:

word          x         y         u=x*y        v=x*x
is            5/113     23/234    115/26442    25/12769
the           4/113     45/234    180/26442    16/12769
a             4/113     17/234    68/26442     16/12769
farnarkling   1/113     0/234     0/26442      1/12769
elbow         0/113     2/234     0/26442      0/12769
...
========================================================
TOTAL         113/113   234/234   s=(sum of u) t=(sum of v)

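Your answer is given by s/t, which is the slope of the least-squares line through the origin for y against x. Because x and y are relative frequencies, a value close to 1 suggests a good correspondence (with raw counts instead, the matching value would be n/m).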
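The s/t calculation can be sketched in Python as follows. This is only an illustration: the function name `slope_through_origin` and its `total_x` / `total_y` parameters are hypothetical, and exact fractions are used so the intermediate values match the hand-computed table.

```python
from fractions import Fraction


def slope_through_origin(counts_x, counts_y, total_x=None, total_y=None):
    """Return s/t, where s = sum(x*y) and t = sum(x*x) over relative frequencies.

    counts_x and counts_y map each word to its raw count in one file.
    total_x and total_y default to the sum of the supplied counts; pass them
    explicitly when the dictionaries hold only part of a file's vocabulary.
    """
    m = sum(counts_x.values()) if total_x is None else total_x
    n = sum(counts_y.values()) if total_y is None else total_y
    s = Fraction(0)
    t = Fraction(0)
    # Words missing from one file simply contribute a frequency of 0 there.
    for word in set(counts_x) | set(counts_y):
        x = Fraction(counts_x.get(word, 0), m)
        y = Fraction(counts_y.get(word, 0), n)
        s += x * y
        t += x * x
    return s / t


# Only the five rows shown in the table; the elided rows are not reproduced,
# so this result is not the value for the full files.
x_counts = {"is": 5, "the": 4, "a": 4, "farnarkling": 1, "elbow": 0}
y_counts = {"is": 23, "the": 45, "a": 17, "farnarkling": 0, "elbow": 2}
print(float(slope_through_origin(x_counts, y_counts, total_x=113, total_y=234)))
```

With relative frequencies as inputs, two files with identical word proportions yield exactly 1 in this sketch, whatever their lengths. The function raises `ZeroDivisionError` if every x frequency is zero.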

Some possibly useful links are:

https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php

http://en.wikipedia.org/wiki/Simple_linear_regression

Source: https://stackoverflow.com/questions/26323239/spearman-rank-correlation-with-missing-values
