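Try running the following code in a code cell to learn about the init parameter, which controls how the centroids are initialized:
(1) Import the required modules and libraries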
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
plt.style.use('ggplot')
(2) Build a synthetic dataset
X, y = make_blobs(n_samples=500, n_features=2, centers=4, random_state=1)
(3) Use init to choose the initial centroids
init='k-means++'
cluster_01 = KMeans(n_clusters = 8,init='k-means++').fit(X)
cluster_01.n_iter_
silhouette_score(X,cluster_01.labels_)
init="random"
cluster_02 = KMeans(n_clusters = 8,init="random").fit(X)
cluster_02.n_iter_
silhouette_score(X,cluster_02.labels_)
Note: because the random seed random_state is not set on KMeans, the results differ from one run to the next.
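To make the comparison repeatable, one option is to pass random_state to KMeans as well. The following is a minimal sketch under stated assumptions: the helper name fit_and_score, the seed value 0, and the explicit n_init=10 are illustrative choices, not part of the original exercise. n_init is set explicitly because its default value differs between scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same synthetic data as in step (2).
X, y = make_blobs(n_samples=500, n_features=2, centers=4, random_state=1)


def fit_and_score(init, seed):
    """Fit KMeans with the given init strategy and seed.

    Returns a tuple (number of iterations, silhouette score).
    fit_and_score is a hypothetical helper used only for this illustration.
    """
    # n_init=10 is an assumption: it gives both strategies the same number
    # of restarts regardless of the installed scikit-learn version.
    model = KMeans(n_clusters=8, init=init, n_init=10, random_state=seed).fit(X)
    return model.n_iter_, silhouette_score(X, model.labels_)


for init in ("k-means++", "random"):
    first = fit_and_score(init, seed=0)
    second = fit_and_score(init, seed=0)  # same seed, so the run should repeat
    print(init, first, second)
```

With a fixed seed, the two calls for each strategy should report matching values, so any remaining difference between 'k-means++' and 'random' comes from the initialization strategy rather than from run-to-run randomness.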