使用 pandas 和 Matplotlib 绘制 CDF 图
核心方法
pandas.Series.hist
和 matplotlib.pyplot.hist
可以帮助我们画出想要的 CDF 图。
参数说明
bins
int
orSequence
, default10
.- Number of histogram bins to be used. If an integer is given,
bins + 1
bin edges are calculated and returned. Ifbins
is a sequence, gives bin edges, including left edge of first bin and right edge of last bin. In this case,bins
is returned unmodified.
density
bool
, defaultNone
.- If
True
, the first element of the return tuple will be the counts normalized to form a probability density, i.e., the area (or integral) under the histogram will sum to1
. This is achieved by dividing the count by the number of observations times the bin width and not dividing by the total number of observations. Ifstacked
is alsoTrue
, the sum of the histograms is normalized to1
.
cumulative
bool
, defaultFalse
.- If
True
, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values. The last bin gives the total number of data points. Ifnormed
ordensity
is alsoTrue
then the histogram is normalized such that the last bin equals1
. If cumulative evaluates to less than0
(e.g.,-1
), the direction of accumulation is reversed. In this case, ifnormed
and/ordensity
is alsoTrue
, then the histogram is normalized such that the first bin equals1
.
histtype
'bar', 'barstacked', 'step', 'stepfilled'
, default'bar'
.- The type of histogram to draw:
'bar'
is a traditional bar-type histogram. If multiple data are given, the bars are arranged side by side.'barstacked'
is a bar-type histogram where multiple data are stacked on top of each other.'step'
generates a lineplot that is by default unfilled.- '
stepfilled'
generates a lineplot that is by default filled.
完整代码
1 | import pandas as pd |
附:测试数据
1 | Series A,Series B |