
Python error: len() of unsized object while using statsmodels with one row of data

I can use statsmodels' WLS (weighted least squares regression) without problems when I have many data points. However, I run into a problem with NumPy arrays when I try to use WLS on a single sample from the dataset.

What I mean is: if my dataset X is a 2D array with many rows, WLS works fine, but not if I try to run it on a single row. The code below shows what I mean:

    import sys
    from sklearn.externals.six.moves import xrange
    from sklearn.metrics import accuracy_score
    import pylab as pl
    from sklearn.externals.six.moves import zip
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.sandbox.regression.predstd import wls_prediction_std

    # this is my dataset X, with 10 rows
    X = np.array([[1,2,3],[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[1,2,3],[4,5,6],[4,5,6],[1,2,3]])
    # this is my response vector, y, also with 10 rows
    y = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 1])
    # weights, 10 rows
    weights = np.array([ 0.1 , 0.1, 0.1 , 0.1, 0.1 , 0.1, 0.1 , 0.1, 0.1 , 0.1 ])

    # the line below, using all 10 rows of X, gives no errors but is commented out
    # mod_wls = sm.WLS(y, X, weights)

    # and this is the line I need, which is giving errors:
    mod_wls = sm.WLS(np.array(y[0]), np.array([X[0]]),np.array([weights[0]]))

The last line above was initially just mod_wls = sm.WLS(y[0], X[0], weights[0])

But that gave me errors like object of type 'numpy.float64' has no len(), hence I turned them into arrays. But now I keep getting this error:

    Traceback (most recent call last):
      File "C:\Users\app\Documents\Python Scripts\test.py", line 53, in <module>
        mod_wls = sm.WLS(np.array(y[0]), np.array([X[0]]),np.array([weights[0]]))
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 383, in __init__
        weights=weights, hasconst=hasconst)
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 79, in __init__
        super(RegressionModel, self).__init__(endog, exog, **kwargs)
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\model.py", line 136, in __init__
        super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\model.py", line 52, in __init__
        self.data = handle_data(endog, exog, missing, hasconst, **kwargs)
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 401, in handle_data
        return klass(endog, exog=exog, missing=missing, hasconst=hasconst, **kwargs)
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 78, in __init__
        self._check_integrity()
      File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 249, in _check_integrity
        print len(self.endog)
    TypeError: len() of unsized object

So in order to see what was wrong with the lengths, I did this:

    print "y size: "
    print len(np.array([y[0]]))
    print "X size"
    print len (np.array([X[0]]))
    print "weights size"
    print len(np.array([weights[0]]))

And got this output:

    y size:
    1
    X size
    1
    weights size
    1

I then tried this:

    print "x shape"
    print X[0].shape
    print "y shape"
    print y[0].shape

And the output was:

    x shape
    (3L,)
    y shape
    ()

Line 249 in data.py, which the error referred to, has this function, where I added a bunch of "print sizes" in order to see what was happening:

    def _check_integrity(self):
        if self.exog is not None:
            print "exog size: "
            print len(self.exog)
            print "endog size"
            print len(self.endog)  # <-- this, and the line below are causing the error
            if len(self.exog) != len(self.endog):
                raise ValueError("endog and exog matrices are different sizes")

It appears something is wrong with len(self.endog), even though printing len(np.array([y[0]])) simply gave 1. Somehow, when y goes into the _check_integrity function and becomes endog, it doesn't behave the same way. Or is something else going on?
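A likely explanation, sketched below with NumPy only: the failing call passes np.array(y[0]) without the inner brackets, whereas the length check printed above used np.array([y[0]]). The first is a zero-dimensional array with shape (), which has no len(); the second is a one-element 1-D array. The names endog_0d and endog_1d are made up for illustration, and the sketch uses Python 3 print syntax.

```python
import numpy as np

y = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 1])

endog_0d = np.array(y[0])    # what the failing call passes: shape ()
endog_1d = np.array([y[0]])  # what the length check printed: shape (1,)

print(endog_0d.shape, endog_1d.shape)

# len() is undefined for a zero-dimensional array and raises TypeError.
try:
    len(endog_0d)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(len(endog_1d))   # a 1-D array with one element has a length
print(len(y[[0]]))     # indexing with a list also keeps one dimension
```

Indexing with a list, as in y[[0]], X[[0]] and weights[[0]], keeps each argument at least one-dimensional with a leading length of 1, which should get past the length check. Whether a one-observation fit is meaningful is a separate question.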

What should I do? I'm using an algorithm where I really do need to run WLS for each row of X separately.

Answer1:

There's no such thing as WLS for one observation. The single weight would simply become 1 when the weights are normalized to sum to 1. If you want to do this, though I suspect you don't, just use OLS. The solution will be a consequence of the SVD, though, not of any actual relationship in the data.
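To illustrate why a single weight carries no information, here is a minimal NumPy sketch (not statsmodels code). It assumes the usual formulation of weighted least squares, in which each row of X and y is scaled by the square root of its weight before an ordinary least-squares solve. The helper name one_row_wls is hypothetical.

```python
import numpy as np

def one_row_wls(x_row, y_value, weight):
    """Least-squares coefficients for a single weighted observation."""
    scale = np.sqrt(weight)
    x_scaled = scale * np.atleast_2d(x_row)    # shape (1, n_features)
    y_scaled = scale * np.atleast_1d(y_value)  # shape (1,)
    # The pseudoinverse handles the underdetermined one-row system.
    return np.dot(np.linalg.pinv(x_scaled), y_scaled)

x0 = np.array([1.0, 2.0, 3.0])
beta_small = one_row_wls(x0, 1.0, 0.1)
beta_large = one_row_wls(x0, 1.0, 100.0)

# The weight cancels out, so both calls give the same coefficients.
print(np.allclose(beta_small, beta_large))
```

With only one row, the scale factor multiplies both sides of the single equation, so any positive weight leads to the same coefficients.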

OLS solution using pinv/svd

np.dot(np.linalg.pinv(X[[0]]), y[0])

Though you could just make up any answer that works and get the same result. I'm not sure offhand what exactly the properties of the SVD solution are vs. the other non-unique solutions.

    [~/]
    [26]: beta = [-.5, .25, 1/3.]

    [~/]
    [27]: np.dot(beta, X[0])
    [27]: 1.0
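On the open question of what distinguishes the pinv result: the Moore-Penrose pseudoinverse returns the least-squares solution with the smallest Euclidean norm. A short sketch comparing it with the hand-picked beta above, reusing the first row of X and y from the question (the variable names beta_pinv and beta_manual are illustrative):

```python
import numpy as np

x0 = np.array([[1.0, 2.0, 3.0]])  # X[[0]] from the question, shape (1, 3)
y0 = np.array([1.0])              # y[[0]] from the question, shape (1,)

beta_pinv = np.dot(np.linalg.pinv(x0), y0)       # minimum-norm solution
beta_manual = np.array([-0.5, 0.25, 1.0 / 3.0])  # hand-picked solution

# Both coefficient vectors reproduce the single observation.
print(np.dot(x0, beta_pinv), np.dot(x0, beta_manual))

# The pinv solution is the shorter of the two vectors.
print(np.linalg.norm(beta_pinv), np.linalg.norm(beta_manual))
```

For a single row x, the minimum-norm solution works out to x * y / (x . x), which here is [1, 2, 3] / 14.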
