

news 2025/10/14 19:58:33

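I. Background

1. Overview

2. Dataset
row_id: ID of the check-in event
x, y: coordinates of the user's location
accuracy: accuracy of the location fix
time: timestamp
place_id: the place where the user checks in; this is the value to predict

3. Dataset download
https://www.kaggle.com/navoshta/grid-knn/data
The download did not work from mainland China because the verification code could not be received. The fallback was to spend points on CSDN for a copy uploaded by another user.

II. Workflow

1. Load the data
2. Data processing (goal: obtain the feature values and the target values)
a. Reduce the data range by coordinates: 2 < x < 2.5, 1 < y < 1.5
b. Timestamp: convert time into year, month, day, hour, minute, second. A morning check-in is likely at a park or on the commute; a Saturday check-in is likely at a shopping mall, or the user is at home asleep.
c. Filter out places with few check-ins
d. Split the dataset
3. Feature engineering: standardization
4. KNN estimator workflow
5. Model selection and tuning
6. Model evaluation

III. Code

1. day02_facebook_demo

    import pandas as pd

    # 1. Load the data
    data = pd.read_csv("./FBlocation/train.csv")
    data.head()

    # 2. Basic data processing
    # 1) Reduce the data range
    # .copy() lets us add columns below without a SettingWithCopyWarning
    data = data.query("x < 2.5 & x > 2 & y < 1.5 & y > 1").copy()
    data

    # 2) Process the time feature (the timestamp is in seconds)
    time_value = pd.to_datetime(data["time"], unit="s")
    time_value.values
    date = pd.DatetimeIndex(time_value)
    data["day"] = date.day
    data["weekday"] = date.weekday
    data["hour"] = date.hour
    data

    # 3) Filter out places with few check-ins (keep places with more than 3)
    place_count = data.groupby("place_id").count()["row_id"]
    place_count[place_count > 3].head()
    data_final = data[data["place_id"].isin(place_count[place_count > 3].index.values)]
    data_final.head()

    # Select the feature values and the target values
    # Feature values
    x = data_final[["x", "y", "accuracy", "day", "weekday", "hour"]]
    # Target values
    y = data_final["place_id"]
    x.head()
    y.head()

    # Split the dataset
    from sklearn.model_selection import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(x, y)

    from sklearn.preprocessing import StandardScaler
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import GridSearchCV

    # 3. Feature engineering: standardization
    transfer = StandardScaler()
    x_train = transfer.fit_transform(x_train)
    # Standardize the test set with the mean and standard deviation of the training set.
    # Both sets must be scaled with the same statistics. fit() is the step that computes
    # them, and it already ran on the training set, so the test set only needs
    # transform() and must not be fitted again.
    x_test = transfer.transform(x_test)

    # 4. KNN estimator
    estimator = KNeighborsClassifier()
    # Add grid search and cross-validation
    # Prepare the parameter grid
    param_dict = {"n_neighbors": [1, 3, 5, 7, 9, 11]}
    estimator = GridSearchCV(estimator, param_grid=param_dict, cv=10)
    estimator.fit(x_train, y_train)

    # 5. Model evaluation
    # Method 1: compare the true values with the predicted values directly
    y_predict = estimator.predict(x_test)
    print("y_predict:\n", y_predict)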
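    print("Direct comparison of true and predicted values:\n", y_test == y_predict)

    # Method 2: compute the accuracy
    score = estimator.score(x_test, y_test)
    print("Accuracy:\n", score)

    # Best parameters: best_params_
    print("Best parameters:\n", estimator.best_params_)
    # Best cross-validation score: best_score_
    print("Best score:\n", estimator.best_score_)
    # Best estimator: best_estimator_
    print("Best estimator:\n", estimator.best_estimator_)
    # Cross-validation results: cv_results_
    print("Cross-validation results:\n", estimator.cv_results_)

2. Results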
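As a side note on the standardization step (step 3 in the code above): the comment there says the test set must be scaled with the training set's mean and standard deviation. The following is a minimal pure-Python sketch of that idea, not scikit-learn's implementation. The helper names `fit_stats` and `apply_stats` and the hour values are made up for illustration. It uses the population standard deviation, which is also what `StandardScaler` uses.

```python
from statistics import fmean, pstdev


def fit_stats(column):
    """Hypothetical helper: learn mean and population std of one feature column."""
    return fmean(column), pstdev(column)


def apply_stats(column, mean, std):
    """Hypothetical helper: scale a column with previously learned statistics."""
    return [(value - mean) / std for value in column]


# Made-up "hour" feature values, for illustration only.
train_hours = [8.0, 9.0, 18.0, 21.0]
test_hours = [7.0, 22.0]

# "fit" happens once, on the training column only.
mean, std = fit_stats(train_hours)

# Both columns are then transformed with the same training statistics.
train_scaled = apply_stats(train_hours, mean, std)
test_scaled = apply_stats(test_hours, mean, std)

print("train mean/std:", mean, std)
print("train scaled:", train_scaled)
print("test scaled:", test_scaled)
```

After scaling, the training column has mean 0 and standard deviation 1 by construction. The test column generally does not, and that is expected: it is expressed in the same units the model saw during training, which is what makes the distances computed by KNN comparable between the two sets.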

