Plotting the iris dataset from sklearn.datasets
Loading the data
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']
df.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  species
0                5.1               3.5                1.4               0.2        0
1                4.9               3.0                1.4               0.2        0
2                4.7               3.2                1.3               0.2        0
3                4.6               3.1                1.5               0.2        0
4                5.0               3.6                1.4               0.2        0
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sepal length (cm) 150 non-null float64
1 sepal width (cm) 150 non-null float64
2 petal length (cm) 150 non-null float64
3 petal width (cm) 150 non-null float64
4 species 150 non-null int32
dtypes: float64(4), int32(1)
memory usage: 5.4 KB
df.describe()
       sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)     species
count         150.000000        150.000000         150.000000        150.000000  150.000000
mean            5.843333          3.057333           3.758000          1.199333    1.000000
std             0.828066          0.435866           1.765298          0.762238    0.819232
min             4.300000          2.000000           1.000000          0.100000    0.000000
25%             5.100000          2.800000           1.600000          0.300000    0.000000
50%             5.800000          3.000000           4.350000          1.300000    1.000000
75%             6.400000          3.300000           5.100000          1.800000    2.000000
max             7.900000          4.400000           6.900000          2.500000    2.000000
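The `species` column above holds only the integer codes 0, 1 and 2, so its summary statistics are not very informative. A minimal sketch of one way to attach the readable names from `data['target_names']` and summarize each species separately follows; the names `species_name` and `species_means` are illustrative and not part of the original notebook.

```python
from sklearn.datasets import load_iris
import pandas as pd

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])

# Index the array of names with the integer codes to get one name per row
df['species_name'] = data['target_names'][data['target']]

# Mean of each measurement, computed separately for each species
species_means = df.groupby('species_name').mean()
print(species_means)
```

Grouping by name rather than by code keeps later plot legends and tables readable without a separate lookup.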
Data visualization
df.plot(kind='box')  # box plot
df.plot(kind='line')  # line plot
df.plot(kind='hist', bins=50)  # histogram
df.plot(kind='bar')  # bar chart
df.plot(kind='kde')  # kernel density estimate (KDE) curves
df.plot(kind='scatter', x='sepal length (cm)', y='sepal width (cm)')  # scatter plot
<matplotlib.axes._subplots.AxesSubplot at 0x20a352cbcc8>
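The scatter plot above draws every sample in the same color, so the three species cannot be told apart. One possible way to color the points by species is sketched below, assuming the same `df` layout as earlier; it draws one `ax.scatter` call per group so that each species gets its own legend entry.

```python
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

fig, ax = plt.subplots(figsize=(6, 4))
# One scatter call per species; matplotlib cycles the color automatically
for code, group in df.groupby('species'):
    ax.scatter(group['sepal length (cm)'], group['sepal width (cm)'],
               label=data['target_names'][code], alpha=0.7)
ax.set_xlabel('sepal length (cm)')
ax.set_ylabel('sepal width (cm)')
ax.legend(title='species')
```

In a script, call `plt.show()` afterwards to display the figure; in a notebook it is rendered automatically.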
Adding subplots with matplotlib
fig,axs=plt.subplots(5,1)
df.plot(kind='line',subplots=True,ax=axs)
fig.subplots_adjust(hspace=1)  # the 5 axes are stacked vertically, so hspace (not wspace) controls the gap between them
fig, axs = plt.subplots(1, 5, sharey=True)  # share the y-axis here: pandas ignores sharey when existing axes are passed via ax
df.plot(kind='kde', subplots=True, ax=axs, legend=False)
array([<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3978FE48>,
<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3993AA88>,
<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3994F6C8>,
<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A3998DE88>,
<matplotlib.axes._subplots.AxesSubplot object at 0x0000020A39935288>],
dtype=object)
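As an alternative to creating the axes with `plt.subplots` first, pandas can build the grid itself through the `layout` parameter of `DataFrame.plot`. The following is a rough sketch under that assumption; the 2x3 layout and the figure size are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd

data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['species'] = data['target']

# Let pandas create a 2x3 grid; the sixth, unused cell is hidden by pandas
axes = df.plot(kind='kde', subplots=True, layout=(2, 3),
               sharey=True, legend=False, figsize=(9, 5))

# With legend=False the panels are unlabeled, so title each one with its column
for ax, col in zip(axes.ravel(), df.columns):
    ax.set_title(col)
plt.tight_layout()
```

Because pandas creates the axes in this variant, `sharey=True` takes effect directly in the `df.plot` call.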
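Drawing subplots with matplotlib.pyplot.GridSpec
x = df['sepal length (cm)'].values
y = df['sepal width (cm)'].values
plt.figure(figsize=(6,6))
# 4x4 grid: scatter plot in the upper right, marginal plots on the left and at the bottom
grid = plt.GridSpec(4, 4, wspace=0.5, hspace=0.5)
main_ax = plt.subplot(grid[0:3,1:4])
df.plot(kind='scatter',x='sepal length (cm)',y='sepal width (cm)',ax=main_ax)
# left margin: horizontal histogram of sepal width, sharing the y-axis with the scatter plot
y_hist = plt.subplot(grid[0:3,0],xticklabels=[],sharey=main_ax)
plt.hist(y,60,color='green',orientation='horizontal',alpha=0.5)
y_hist.invert_xaxis()  # bars extend to the left, away from the scatter plot
# bottom margin: KDE curve of sepal length
x_hist = plt.subplot(grid[3,1:4],yticklabels=[])
df['sepal length (cm)'].plot(kind='kde',color='red',alpha=0.5,ax=x_hist)
x_hist.invert_yaxis()  # curve hangs downward, away from the scatter plot
plt.subplots_adjust(hspace=1.2, wspace=1)
plt.show()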
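The same marginal-plot layout can also be written with matplotlib's object-oriented interface, which avoids relying on the "current axes" state used by `plt.subplot` and `plt.hist`. This is only a sketch: it swaps the bottom KDE curve for a histogram, and the axes names `ax_main`, `ax_left` and `ax_bottom` are made up for illustration.

```python
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

data = load_iris()
x = data['data'][:, 0]  # sepal length (cm)
y = data['data'][:, 1]  # sepal width (cm)

fig = plt.figure(figsize=(6, 6))
gs = fig.add_gridspec(4, 4, wspace=0.5, hspace=0.5)

# Main scatter plot in the upper-right 3x3 block of the grid
ax_main = fig.add_subplot(gs[0:3, 1:4])
ax_main.scatter(x, y, s=15)

# Left margin: horizontal histogram of y, sharing the y-axis with the scatter
ax_left = fig.add_subplot(gs[0:3, 0], sharey=ax_main)
ax_left.hist(y, bins=20, orientation='horizontal', color='green', alpha=0.5)
ax_left.invert_xaxis()

# Bottom margin: histogram of x, sharing the x-axis with the scatter
ax_bottom = fig.add_subplot(gs[3, 1:4], sharex=ax_main)
ax_bottom.hist(x, bins=20, color='red', alpha=0.5)
ax_bottom.invert_yaxis()
```

Sharing the x-axis between `ax_main` and `ax_bottom` keeps the marginal histogram aligned with the scatter plot when the view is zoomed or panned.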
Next: using seaborn and related libraries to draw more complex plots