python - scikit-learn ValueError: dimension mismatch -


This is my first post here. For the past few days I have been trying to teach myself scikit-learn, but recently I ran into an error that has persisted for some time.

My goal is simply to train a naive Bayes classifier clf so that I can feed it an arbitrary list of strings called new_doc and have it predict which class each string is likely to belong to.

This is how my program looks:

    # Imports
    import numpy as np
    import pylab
    import pandas as pd
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import Pipeline
    from sklearn import metrics
    from sklearn.feature_extraction.text import TfidfVectorizer, HashingVectorizer, CountVectorizer

    # Open the csv file
    df = pd.read_csv('data.csv', sep=',')

    # Randomise the rows of the file
    df = df.reindex(np.random.permutation(df.index))

    # Extract features from text, define target y and data X
    vect = CountVectorizer()
    X = vect.fit_transform(df['features'])
    y = df['target']

    # Partition the data into training and test sets
    SPLIT_PERC = 0.75
    split_size = int(len(y) * SPLIT_PERC)
    X_train = X[:split_size]
    X_test = X[split_size:]
    y_train = y[:split_size]
    y_test = y[split_size:]

    # Train the model
    clf = MultinomialNB()
    clf.fit(X_train, y_train)

    # Evaluate the results
    print("Accuracy on training set:")
    print(clf.score(X_train, y_train))
    print("Accuracy on testing set:")
    print(clf.score(X_test, y_test))
    y_pred = clf.predict(X_test)
    print("Classification report:")
    print(metrics.classification_report(y_test, y_pred))

    # Predict new data
    new_doc = ["MacDonalds", "Walmart", "Target", "Starbucks"]
    trans_doc = vect.transform(new_doc)  # extracting features
    y_pred = clf.predict(trans_doc)      # predicting

But when I run the program I get the following error on the last line:

        y_pred = clf.predict(trans_doc)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Python/2.7/site-packages/sklearn/naive_bayes.py", line 62, in predict
        jll = self._joint_log_likelihood(X)
      File "/Library/Python/2.7/site-packages/sklearn/naive_bayes.py", line 441, in _joint_log_likelihood
        return (safe_sparse_dot(X, self.feature_log_prob_.T)
      File "/Library/Python/2.7/site-packages/sklearn/utils/extmath.py", line 175, in safe_sparse_dot
        ret = a * b
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/sparse/base.py", line 334, in __mul__
        raise ValueError('dimension mismatch')
    ValueError: dimension mismatch

So apparently this has something to do with the dimensions of the term-document matrix.
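As far as the traceback shows, the failing step is a matrix product between the input and the transposed per-class feature log-probabilities, so the input needs as many columns as the model has features. A minimal sketch of that constraint, assuming toy shapes (6 features, 2 classes) and a hypothetical stand-in array named `feature_log_prob` rather than a fitted model:

```python
import numpy as np
from scipy.sparse import csr_matrix

n_features, n_classes = 6, 2
# Hypothetical stand-in for clf.feature_log_prob_ (shape: classes x features).
feature_log_prob = np.zeros((n_classes, n_features))


def try_product(X):
    """Return the shape of X times feature_log_prob.T, or the error text."""
    try:
        return X.dot(feature_log_prob.T).shape
    except ValueError as exc:
        return "ValueError: %s" % exc


good = csr_matrix(np.ones((4, n_features)))  # 4 documents, 6 columns
bad = csr_matrix(np.ones((4, 4)))            # 4 documents, only 4 columns

good_result = try_product(good)
bad_result = try_product(bad)
print(good_result)
print(bad_result)
```

In this sketch only the matrix whose column count equals `n_features` can be multiplied; the 4-column one fails with a ValueError, which appears to be the situation in the traceback.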

When I check the dimensions of trans_doc, X_train and X_test:

    >>> trans_doc.shape
    (4, 4)

For y_pred = clf.predict(trans_doc) to work, I need (as far as I understand it) to transform new_doc into a term-document matrix with dimensions (4, 28750). But I do not know of any method within CountVectorizer that lets me do this.
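For comparison, here is a minimal sketch on a made-up toy corpus. The names `train_docs`, `train_labels` and `refit_vect` are illustrative and not part of the program above, and a current scikit-learn is assumed. It compares the column count produced by `transform` on an already fitted `CountVectorizer` with the one produced by fitting a fresh vectorizer on the new documents alone:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data, purely for illustration.
train_docs = ["cheap burgers and fries", "coffee and pastries",
              "groceries and clothing", "burgers and coffee"]
train_labels = ["food", "food", "retail", "food"]

vect = CountVectorizer()
X_train = vect.fit_transform(train_docs)   # learns the vocabulary
clf = MultinomialNB().fit(X_train, train_labels)

new_doc = ["MacDonalds", "Walmart", "Target", "Starbucks"]

# Reuses the fitted vocabulary: same number of columns as X_train.
trans_doc = vect.transform(new_doc)

# Learns a new vocabulary from new_doc alone: one column per new word.
refit_vect = CountVectorizer()
refit_doc = refit_vect.fit_transform(new_doc)

print(X_train.shape, trans_doc.shape, refit_doc.shape)

# Prediction works on the matrix built with the fitted vectorizer.
y_pred = clf.predict(trans_doc)
print(y_pred)
```

Under these assumptions, only the matrix built by the fitted `vect` has the column count the classifier was trained with. A (4, 4) result would be consistent with the vectorizer having been fitted again on new_doc, though that is a guess about the session rather than something the output above confirms.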

