
Receiving random cost output on TensorFlow regression (Python)

Question:

I am relatively new to tensorflow and I have attempted to adapt some code from a tutorial to process my own data.

The data can be found here: <a href="https://github.com/z12332/tensorflow-test-1/blob/master/export.csv" rel="nofollow">https://github.com/z12332/tensorflow-test-1/blob/master/export.csv</a>

Keep in mind that the dataset being fed consists of only columns 9 through 27 (with NaNs converted to 0), with column 30 as the labels.

Here is the link to the tutorial code: <a href="https://github.com/llSourcell/How_to_use_Tensorflow_for_classification-LIVE/blob/master/demo.ipynb" rel="nofollow">https://github.com/llSourcell/How_to_use_Tensorflow_for_classification-LIVE/blob/master/demo.ipynb</a>

I am able to get the program to run without an error message, but for some reason it outputs 200 training steps of seemingly random cost values. As an example, here are the first few steps:

Training step: 0000 cost= 0.039999638
Training step: 0000 cost= 0.159996599
Training step: 0000 cost= 0.000000002
Training step: 0000 cost= 0.000000004
Training step: 0000 cost= 0.000000001
Training step: 0000 cost= 0.039994366
Training step: 0000 cost= 0.000000005
Training step: 0000 cost= 0.039997347
Training step: 0000 cost= 0.359970629
Training step: 0000 cost= 1.959837437
Training step: 0000 cost= 3.239814520
Training step: 0000 cost= 0.000000195
Training step: 0000 cost= 0.000000228
Training step: 0000 cost= 0.000000003
Training step: 0000 cost= 0.000000388
Training step: 0000 cost= 0.039958697
Training step: 0000 cost= 0.159986690
Training step: 0000 cost= 0.159973413
Training cost= 2.70406e-05
W= [ 2.38201610e-05 1.96683395e-05 3.69497479e-06 2.77944509e-05
     2.02058782e-05 3.82550934e-05 3.37507554e-05 2.18498894e-06
     2.92303273e-04 7.17514267e-05 2.34498725e-06 3.40497172e-06
     6.25661269e-05 5.59996465e-07 8.81450160e-06 3.44998034e-06]
b= [ 2.62004360e-05]

Here is my code in full, in case anyone knows why this is happening or how to debug it:

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('/Users/benny/desktop/export.csv')
data_ = df.iloc[1:,9:27]
data_['CRISPR'] = df.iloc[:,30]
data_ = data_.drop(['Diseases'],axis=1)

dim = 16
learning_rate = 0.0000001
display_step = 50

X = tf.placeholder(tf.float32, [None, dim])
Y = tf.placeholder(tf.float32)

train_X = data_.iloc[:200, :-2].as_matrix()
'''
dimensions = [200,16]
array([[ 25.,   2.,   3., ...,  nan,  nan,   2.],
       [  5.,  13.,   3., ...,  nan,  19.,   2.],
       [ 25.,  13.,   3., ...,  nan,  nan,   2.],
       ...,
       [ 25.,  13.,   3., ...,  nan,  nan,   2.],
       [ 25.,  13.,   3., ...,  nan,  nan,   2.],
       [ nan,  13.,   3., ...,  nan,  19.,   3.]])
'''
train_X = train_X.fillna(value=0)
train_Y = data_.iloc[:200, -1].as_matrix()
'''
dimensions = [200]
array([ 1,  2,  0,  0,  0,  1,  0,  1,  3,  7,  9,  0,  0,  0,  0,  1,  2,  2,  0,  0,
        0,  7,  2,  2,  2,  0,  4,  0,  0,  0,  0,  9,  5,  2,  1,  2,  1,  0,  0,  1,
        0,  2,  1,  2,  0,  1,  1,  1,  0,  0,  0,  1,  3,  1,  2,  4,  1,  0,  1,  6,
        2,  1,  0,  0,  1,  0,  1,  1,  1,  7,  7,  4,  1,  1,  6,  4,  0,  0,  1,  1,
        0,  1,  1,  1,  2,  0,  0,  2,  0,  0,  0,  3,  2,  3,  1,  1,  9,  7,  4,  1,
        1,  1,  0,  1,  5,  4,  2,  1,  1,  1,  1,  1,  0,  4,  1,  0,  1,  0,  0,  1,
        2,  1,  4,  0, 10,  2,  0,  1,  2,  3,  0,  0,  0,  0,  0,  0,  3,  1,  1,  2,
        0,  7,  0,  2,  0,  2,  0,  0,  2,  3,  1,  0,  7,  3,  2,  9,  1,  0,  0,  2,
        1,  0,  2,  2,  1,  1,  2,  4,  0,  0,  0,  0,  0,  0,  1,  0,  1,  4,  1,  0,
        0,  1, 15,  1,  0,  1,  2,  0,  0,  1,  0,  2,  0,  0,  0,  2,  1,  0,  1, 11])
'''
test_X = data_.iloc[200:320, :-2].as_matrix()
test_Y = data_.iloc[200:320, -1].as_matrix()

n_samples = train_Y.size

W= tf.Variable(tf.zeros([dim]), name="weight")
b = tf.Variable(tf.zeros([1]), name="bias")

activation = tf.add(tf.mul(X, W), b)
cost = tf.reduce_sum(tf.pow(activation-Y, 2))/(2*n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

hm_epochs = 10
init = tf.initialize_all_variables()
xy = zip(train_X, train_Y)

sess = tf.Session()
sess.run(init)

for (x, y) in xy:
    for i in range(hm_epochs):
        sess.run(optimizer, feed_dict={X: x[np.newaxis, ...], Y: y[np.newaxis, ...]})
        if (i) % display_step == 0:
            cc = sess.run(cost, feed_dict={X: x[np.newaxis, ...], Y: y[np.newaxis, ...]})
            print "Training step:", '%04d' % (i), "cost=", "{:.9f}".format(cc)

print "Optimization Finished!"
training_cost = sess.run(cost, feed_dict={X:x[np.newaxis, ...], Y:y[np.newaxis, ...]})
print "Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n'

Could the random values be from the missing values (changed to 0 in the dataset)? Also, how would I now apply the test_X,Y dataframes to predict?

Either way, I'm here to learn, so thanks for your help!

Answer1:

Here is a working version of the code, but first I'll offer some notes.

1) Do some reading at the <a href="https://www.tensorflow.org/get_started/mnist/beginners" rel="nofollow">TensorFlow MNIST tutorial</a>. In particular, see why your placeholder sizes are not correct and why we are going to use a one-hot-encoded version of the labels for this task.

2) Consider using a cross-entropy cost. It is better suited to this multiclass task.

3) Try not to be too underwhelmed by the performance of this basic model (it does not perform well). Consider exploring the data to look for important features, and also look around for what the state-of-the-art performance on this dataset might be.
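To make notes 1 and 2 a bit more concrete, here is a small NumPy sketch (not TensorFlow) of one-hot labels and a softmax cross-entropy cost. The helper names one_hot, softmax and cross_entropy are made up for illustration, nclasses = 3 is an assumption chosen to keep the example small, and it assumes the labels are integers in 0..nclasses-1 (pandas.get_dummies does not need that assumption).

```python
import numpy as np

def one_hot(labels, nclasses):
    """Return a [n_examples, nclasses] matrix with a single 1 per row."""
    encoded = np.zeros((len(labels), nclasses))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and one-hot targets."""
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=1))

labels = np.array([1, 2, 0, 0])      # first few labels from the question
nclasses = 3                         # assumption: only classes 0, 1, 2 here
targets = one_hot(labels, nclasses)  # shape (4, 3), matches a [None, nclasses] placeholder

logits = np.zeros((4, nclasses))     # what X.dot(W) + b gives with all-zero W and b
probs = softmax(logits)              # uniform probabilities, 1/3 per class
loss = cross_entropy(probs, targets) # equals log(3) for uniform predictions
print(targets)
print(loss)
```

With all-zero weights every class gets the same probability, so the cost starts at log(nclasses) and should drop as training makes the correct class more likely.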

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('/Users/benny/desktop/export.csv')
data_ = df.iloc[1:,9:27]
data_['CRISPR'] = df.iloc[:,30]
data_ = data_.drop(['Diseases'],axis=1)

# We will need the number of classes to be predicted.
# The nunique method gets us the number of unique labels.
nclasses = data_["CRISPR"].nunique()

# Let's collect the labels here. We'll one-hot-encode them
# using pandas.get_dummies().
inputY = pd.get_dummies(data_.iloc[:, -1])

dim = 16
learning_rate = 0.0000001
display_step = 50

X = tf.placeholder(tf.float32, [None, dim])

# Y should define the shape of your labels.
# As discussed, we're going to need one-hot-encoded labels for
# this prediction task. This line does not define the shape of your
# input labels; we'll define it later.
# Y = tf.placeholder(tf.float32)

# Fill the NaNs with 0 before converting to a NumPy array
# (a NumPy array has no fillna method).
train_X = data_.iloc[:200, :-2].fillna(value=0).as_matrix()
train_Y = inputY[:200].as_matrix()
test_X = data_.iloc[200:320, :-2].fillna(value=0).as_matrix()
test_Y = inputY[200:320].as_matrix()

n_samples = train_Y.size

# It's important we get the shapes of the weight and bias matrices
# correct. The version in your code is:
# W = tf.Variable(tf.zeros([dim]), name="weight")
# That won't work, since we want to be able to multiply [X, W]
# to produce an evidence vector for each example.
# The shape of X is [200 x dim]. There should be a weight for each
# feature and each class, so W is [dim, nclasses].
W = tf.Variable(tf.zeros([dim, nclasses]))

# For the bias, there should be one for each class.
# b = tf.Variable(tf.zeros([1]), name="bias")
b = tf.Variable(tf.zeros([nclasses]))

# The correct operation here is tf.matmul. I suspect you introduced
# tf.mul to make your earlier matrix definition work in the graph.
# activation = tf.add(tf.mul(X, W), b)
activation = tf.add(tf.matmul(X, W), b)

# You forgot the actual model! Assuming you want to do
# softmax classification, let's do:
y = tf.nn.softmax(activation)

# Now let's define our input labels (we could have called them Y,
# as you had them). Notice what we are saying here: expect a matrix
# of floats with any number of examples and nclasses columns,
# which is exactly the shape of train_Y.
y_ = tf.placeholder(tf.float32, [None, nclasses])

# We define the cost to reflect the fact that our model output is
# called y, not activation (anymore).
# cost = tf.reduce_sum(tf.pow(activation-Y, 2))/(2*n_samples)
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

hm_epochs = 1000
init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

# I imagine what you want to do here is stochastic gradient descent.
# I am not sure this is the way to do it. To check the code, I
# will train over the entire training data for 1000 repetitions,
# similar to the tutorial code.
# .....
for i in range(hm_epochs):
    sess.run(optimizer, feed_dict={X: train_X, y_: train_Y})
    if i % display_step == 0:
        cc = sess.run(cost, feed_dict={X: train_X, y_: train_Y})
        print("Training step: %04d cost= %.9f" % (i, cc))

# To check the accuracy (this is one way of measuring the performance
# of an algorithm on a classification task) we will do the following.
# (This is adapted from the TensorFlow MNIST example code.)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={X: test_X, y_: test_Y}))
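On the "random" costs in the question: a rough NumPy re-creation (not TensorFlow) suggests they are not random at all. The original loop only prints at step 0 for each example (display_step is 50 but there are only 10 steps per example), and at that point W and b are still essentially zero. tf.mul(X, W) + b is then a [1, 16] row of near-zeros, so the cost is roughly 16 * y**2 / (2 * 200) = 0.04 * y**2. The sketch assumes exactly zero weights and uses the first few labels from the question; per_sample_cost is a made-up helper name.

```python
import numpy as np

dim, n_samples = 16, 200
W = np.zeros(dim)   # assumption: weights have barely moved from zero
b = np.zeros(1)

# First few training labels from the question.
labels = np.array([1, 2, 0, 0, 0, 1, 0, 1, 3, 7, 9], dtype=float)

def per_sample_cost(x, y):
    """Cost of one example, mirroring the elementwise tf.mul(X, W) + b."""
    activation = x * W + b  # shape (1, 16): one value per feature, not one prediction
    return np.sum((activation - y) ** 2) / (2 * n_samples)

rng = np.random.default_rng(0)
# The feature values do not matter while W is zero, so random rows are used.
costs = np.array([per_sample_cost(rng.normal(size=(1, dim)), y) for y in labels])
expected = 0.04 * labels ** 2
print(costs)
```

This gives 0.04, 0.16, 0, 0, 0, 0.04, 0, 0.04, 0.36, 1.96, 3.24, which lines up with the printed values. So each printed cost tracks the label of a different single example, rather than anything to do with the missing values.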
