1 2 3 4 5 | vectorizer = CountVectorizer(gmin_df=1) corpus=[] for bp in behavioral_profiles: for pd in bp.product_descriptions: corpus.append(pd.description) |
1 2 | [u'better', u'between', u'beyond', u'biafra', u'big', u'bigger', u'bill', u'billboard', u'bites', u'biting'] |
1 2 3 4 5 6 | data_target_tuples=[ ] for bp in behavioral_profiles: for pd in bp.product_descriptions: data_target_tuples.append((bp.type, pd.description)) shuffle(data_target_tuples) |
1 2 3 4 5 6 7 8 9 | X_data=[ ] y_target=[ ] for t in data_target_tuples: v = vectorizer.transform([t[1]]).toarray()[0] X_data.append(v) y_target.append(t[0]) X_data=np.asarray(X_data) y_target=np.asarray(y_target) |
欢迎光临 电子技术论坛_中国专业的电子工程师学习交流社区-中电网技术论坛 (http://bbs.eccn.com/) | Powered by Discuz! 7.0.0 |