Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
351 views
in Technique[技术] by (71.8m points)

python - Legend only shows one label when plotting with pandas

I have two Pandas DataFrames that I'm hoping to plot in single figure. I'm using IPython notebook.

I would like the legend to show the label for both of the DataFrames, but so far I've been able to get only the latter one to show. Also any suggestions as to how to go about writing the code in a more sensible way would be appreciated. I'm new to all this and don't really understand object oriented plotting.

%pylab inline
import pandas as pd

#creating data

prng = pd.period_range('1/1/2011', '1/1/2012', freq='M')
var=pd.DataFrame(randn(len(prng)),index=prng,columns=['total'])
shares=pd.DataFrame(randn(len(prng)),index=index,columns=['average'])

#plotting

ax=var.total.plot(label='Variance')
ax=shares.average.plot(secondary_y=True,label='Average Age')
ax.left_ax.set_ylabel('Variance of log wages')
ax.right_ax.set_ylabel('Average age')
plt.legend(loc='upper center')
plt.title('Wage Variance and Mean Age')
plt.show()

Legend is missing one of the labels

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This is indeed a bit confusing. I think it boils down to how Matplotlib handles the secondary axes. Pandas probably calls ax.twinx() somewhere which superimposes a secondary axes on the first one, but this is actually a separate axes. Therefore also with separate lines & labels and a separate legend. Calling plt.legend() only applies to one of the axes (the active one) which in your example is the second axes.

Pandas fortunately does store both axes, so you can grab all line objects from both of them and pass them to the .legend() command yourself. Given your example data:

You can plot exactly as you did:

ax = var.total.plot(label='Variance')
ax = shares.average.plot(secondary_y=True, label='Average Age')

ax.set_ylabel('Variance of log wages')
ax.right_ax.set_ylabel('Average age')

Both axes objects are available with ax (left axe) and ax.right_ax, so you can grab the line objects from them. Matplotlib's .get_lines() return a list so you can merge them by simple addition.

lines = ax.get_lines() + ax.right_ax.get_lines()

The line objects have a label property which can be used to read and pass the label to the .legend() command.

ax.legend(lines, [l.get_label() for l in lines], loc='upper center')

And the rest of the plotting:

ax.set_title('Wage Variance and Mean Age')
plt.show()

enter image description here

edit:

It might be less confusing if you separate the Pandas (data) and the Matplotlib (plotting) parts more strictly, so avoid using the Pandas build-in plotting (which only wraps Matplotlib anyway):

fig, ax = plt.subplots()

ax.plot(var.index.to_datetime(), var.total, 'b', label='Variance')
ax.set_ylabel('Variance of log wages')

ax2 = ax.twinx()
ax2.plot(shares.index.to_datetime(), shares.average, 'g' , label='Average Age')
ax2.set_ylabel('Average age')

lines = ax.get_lines() + ax2.get_lines()
ax.legend(lines, [line.get_label() for line in lines], loc='upper center')

ax.set_title('Wage Variance and Mean Age')
plt.show()

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...