numpy - Two-sample Kolmogorov-Smirnov Test in Python Scipy

Question

Welcome To Ask or Share your Answers For Others

numpy - Two-sample Kolmogorov-Smirnov Test in Python Scipy

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

numpy - Two-sample Kolmogorov-Smirnov Test in Python Scipy

I can't figure out how to do a Two-sample KS test in Scipy.

After reading the documentation scipy kstest

I can see how to test where a distribution is identical to standard normal distribution

from scipy.stats import kstest
import numpy as np

x = np.random.normal(0,1,1000)
test_stat = kstest(x, 'norm')
#>>> test_stat
#(0.021080234718821145, 0.76584491300591395)

Which means that at p-value of 0.76 we can not reject the null hypothesis that the two distributions are identical.

However, I want to compare two distributions and see if I can reject the null hypothesis that they are identical, something like:

from scipy.stats import kstest
import numpy as np

x = np.random.normal(0,1,1000)
z = np.random.normal(1.1,0.9, 1000)

and test whether x and z are identical

I tried the naive:

test_stat = kstest(x, z)

and got the following error:

TypeError: 'numpy.ndarray' object is not callable

Is there a way to do a two-sample KS test in Python? If so, how should I do it?

Thank You in Advance

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T03:10:19+0000

You are using the one-sample KS test. You probably want the two-sample test ks_2samp:

>>> from scipy.stats import ks_2samp
>>> import numpy as np
>>> 
>>> np.random.seed(12345678)
>>> x = np.random.normal(0, 1, 1000)
>>> y = np.random.normal(0, 1, 1000)
>>> z = np.random.normal(1.1, 0.9, 1000)
>>> 
>>> ks_2samp(x, y)
Ks_2sampResult(statistic=0.022999999999999909, pvalue=0.95189016804849647)
>>> ks_2samp(x, z)
Ks_2sampResult(statistic=0.41800000000000004, pvalue=3.7081494119242173e-77)

Results can be interpreted as following:

You can either compare the statistic value given by python to the KS-test critical value table according to your sample size. When statistic value is higher than the critical value, the two distributions are different.
Or you can compare the p-value to a level of significance a, usually a=0.05 or 0.01 (you decide, the lower a is, the more significant). If p-value is lower than a, then it is very probable that the two distributions are different.

Categories

numpy - Two-sample Kolmogorov-Smirnov Test in Python Scipy

numpy - Two-sample Kolmogorov-Smirnov Test in Python Scipy

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags