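After doing some research on this, I could not find an answer. I want to create a secondary x-axis for a categorical variable, where each category is shown once per interval instead of being repeated at every point of the plot. A similar example of what I want (made in Excel) can be seen in this image:
Data: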
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data1 = {'Month': list(range(11,35))+list(range(34,42)),
'Checkpoint': ['A','A','A','A','A','A','B','B','B','B','B','B','C','C','C','C','C','C','C','C','C','C','C','D','C','D','D','D','D','D','D','D'],
'Litres':[216545.67,18034.45,25807.83,46136.23,68099.21,55436.35,56412.33,9347.52,3177.29,103.89,333.29,2355.41,
49063.72,113622.80,243639.97,303992.32,255471.55,267022.75,274952.92,619665.39,798969.54,1127476.60,
1563344.98,1051827.75,603167.32,1880605.49,1931002.19,
2970500.68,2362336.66,5311058.83,5071784.10,5325575.47]}
df = pd.DataFrame(data1)
Running the code above gives the following DataFrame:
Month Checkpoint Litres
0 11 A 216545.67
1 12 A 18034.45
2 13 A 25807.83
3 14 A 46136.23
4 15 A 68099.21
5 16 A 55436.35
6 17 B 56412.33
7 18 B 9347.52
8 19 B 3177.29
9 20 B 103.89
10 21 B 333.29
11 22 B 2355.41
12 23 C 49063.72
13 24 C 113622.80
14 25 C 243639.97
15 26 C 303992.32
16 27 C 255471.55
17 28 C 267022.75
18 29 C 274952.92
19 30 C 619665.39
20 31 C 798969.54
21 32 C 1127476.60
22 33 C 1563344.98
23 34 D 1051827.75
24 34 C 603167.32
25 35 D 1880605.49
26 36 D 1931002.19
27 37 D 2970500.68
28 38 D 2362336.66
29 39 D 5311058.83
30 40 D 5071784.10
31 41 D 5325575.47
I want to make a scatter plot of the data (either matplotlib or seaborn is fine), but with a second x-axis that shows df['Checkpoint'].
plt.figure(figsize = (14,7))
plt.scatter(df['Month'], df['Litres'], s=30)
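One possibility is to use the major ticks for the month labels and the minor ticks as separators. Wherever the checkpoint label changes, a longer tick is drawn. Each checkpoint label is then placed midway between two long ticks.
One month (month 34) appears to have two different labels, and it is unclear what should happen there. In the code below, a long major tick is drawn at that month.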
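Before deciding how to draw such a month, it can help to list which months carry more than one checkpoint. The following is a minimal sketch and not part of the plotting code; the helper name `months_with_multiple_checkpoints` is made up for illustration, and it works on plain lists such as `df['Month'].tolist()` and `df['Checkpoint'].tolist()`.

```python
from collections import defaultdict


def months_with_multiple_checkpoints(months, checkpoints):
    """Map each month that has more than one distinct checkpoint to its sorted labels."""
    labels_by_month = defaultdict(set)
    for month, checkpoint in zip(months, checkpoints):
        labels_by_month[month].add(checkpoint)
    return {month: sorted(labels)
            for month, labels in labels_by_month.items()
            if len(labels) > 1}


# Same Month / Checkpoint columns as in the question, written as plain lists.
months = list(range(11, 35)) + list(range(34, 42))
checkpoints = ['A'] * 6 + ['B'] * 6 + ['C'] * 11 + ['D', 'C'] + ['D'] * 7
print(months_with_multiple_checkpoints(months, checkpoints))
```

For the question's data this reports only month 34, with the labels C and D. The plotting code of the answer itself follows.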
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
data1 = {'Month': list(range(11, 35)) + list(range(34, 42)),
'Checkpoint': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C', 'C',
'C', 'C', 'C', 'C', 'D', 'C', 'D', 'D', 'D', 'D', 'D', 'D', 'D'],
'Litres': [216545.67, 18034.45, 25807.83, 46136.23, 68099.21, 55436.35, 56412.33, 9347.52, 3177.29, 103.89,
333.29, 2355.41, 49063.72, 113622.80, 243639.97, 303992.32, 255471.55, 267022.75, 274952.92,
619665.39, 798969.54, 1127476.60, 1563344.98, 1051827.75, 603167.32, 1880605.49, 1931002.19,
2970500.68, 2362336.66, 5311058.83, 5071784.10, 5325575.47]}
df = pd.DataFrame(data1)
fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(df['Month'], df['Litres'], s=30, color='crimson')
# one major tick per month (a preceding FixedLocator call was redundant, as it was overridden here)
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.xaxis.set_minor_locator(ticker.MultipleLocator(0.5))
ax.set_xlim(df['Month'].iloc[0] - 0.5, df['Month'].iloc[-1] + 0.5)
checkpoints = list(df['Checkpoint'])
long_minor_ticks = [df['Month'].iloc[0] - 1] # these minor ticks need to be longer
long_major_ticks = [] # these major ticks need to be longer
for m1, m, c1, c in zip(df['Month'][1:], df['Month'], df['Checkpoint'][1:], df['Checkpoint']):
    if m == m1:
        long_major_ticks.append(m)
    elif c != c1:
        long_minor_ticks.append(m)
long_minor_ticks.append(df['Month'].iloc[-1])
ax.tick_params(which='minor', axis='x', pad=20) # put the minor tick labels at some distance
checkpoint_labels = []
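for tick, month in zip(ax.xaxis.get_minor_ticks(), range(df['Month'].iloc[0] - 1, 100)):
    # the minor tick at month + 0.5 is long only when it separates two checkpoints
    # and does not sit next to a long major tick
    is_separator = (month in long_minor_ticks
                    and month not in long_major_ticks
                    and month + 1 not in long_major_ticks)
    tick.tick1line.set_markersize(35 if is_separator else 18)
    checkpoint_labels.append('')
for tick, month in zip(ax.xaxis.get_major_ticks(), range(df['Month'].iloc[0] - 1, 100)):
    # major ticks are hidden (length 0) except at months that carry two checkpoints
    tick.tick1line.set_markersize(35 if month in long_major_ticks else 0)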
# set the checkpoint letters at the positions between the long minor ticks
for t0, t1 in zip(long_minor_ticks[:-1], long_minor_ticks[1:]):
    if t1 != t0 + 1:
        ind = (t1 + t0) // 2 - long_minor_ticks[0]
        checkpoint_labels[ind] = df['Checkpoint'].iloc[ind]
ax.set_xticklabels(checkpoint_labels, minor=True)
fig.subplots_adjust(bottom=0.15) # we need space to show the large ticks
plt.show()
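As an alternative to resizing individual ticks, the Excel-style layout can also be approximated by plotting against the row position instead of the month, so that the two rows for month 34 each get their own slot. This is only a sketch under that assumption, not the method used in the code above. The helper names `checkpoint_runs`, `run_boundaries_and_centers` and `draw_checkpoint_axis` are hypothetical, and the `depth` value will likely need tuning for a given figure size.

```python
from itertools import groupby


def checkpoint_runs(checkpoints):
    """Return (label, first_row, last_row) for each run of equal consecutive labels."""
    runs = []
    start = 0
    for label, group in groupby(checkpoints):
        length = sum(1 for _ in group)
        runs.append((label, start, start + length - 1))
        start += length
    return runs


def run_boundaries_and_centers(runs):
    """Separators sit half a row outside each run; labels sit at the run centre."""
    boundaries = [runs[0][1] - 0.5] + [last + 0.5 for _, _, last in runs]
    centers = [(label, (first + last) / 2) for label, first, last in runs]
    return boundaries, centers


def draw_checkpoint_axis(ax, runs, depth=0.12):
    """Draw separators and checkpoint letters below the x-axis of a Matplotlib Axes.

    `depth` is the separator length as a fraction of the axes height
    (an assumed default, chosen for illustration).
    """
    boundaries, centers = run_boundaries_and_centers(runs)
    # blended transform: x in data units, y in axes-fraction units
    trans = ax.get_xaxis_transform()
    for x in boundaries:
        ax.plot([x, x], [0, -depth], transform=trans,
                color='black', lw=0.8, clip_on=False)
    for label, x in centers:
        ax.text(x, -depth / 2, label, transform=trans, ha='center', va='top')


# Checkpoint column from the question, written as a plain list.
checkpoints = ['A'] * 6 + ['B'] * 6 + ['C'] * 11 + ['D', 'C'] + ['D'] * 7
runs = checkpoint_runs(checkpoints)
boundaries, centers = run_boundaries_and_centers(runs)
```

With the DataFrame from the question, one would scatter `range(len(df))` against `df['Litres']`, call `ax.set_xticks(range(len(df)))` and `ax.set_xticklabels(df['Month'])` so the ticks still show the months, then call `draw_checkpoint_axis(ax, checkpoint_runs(df['Checkpoint']))` and leave room below the axes with something like `fig.subplots_adjust(bottom=0.2)`. The trade-off is that the x-axis becomes positional, so gaps between months would no longer be visible.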