Last edited by Menuett on 2013-12-22 15:59
煮酒正熟 posted on 2013-12-20 12:05
It can basically be called significant. In general, the 95% confidence level is the one used most often for statistical analysis in business; when a result is not significant at 95%, people will go ...
This is really a binomial response, so it should be analyzed with a contingency table or with logistic regression (in case there are covariates). Recording only the rates throws away the number-of-trials information (6841 and 1217 customers).

The result is p = 0.5731, far from significant. To detect the difference between 76.42% and 75.62% at an alpha level of 0.05, even with equally sized treatment and control groups, each group would still need 44735 samples (at 80% power). See: Statistical Methods for Rates and Proportions by Joseph L. Fleiss (1981).

R example:
> M <- as.table(rbind(c(1668,5173),c(287,930)))
> chisq.test(M)

        Pearson's Chi-squared test with Yates' continuity correction

data:  M
X-squared = 0.3175, df = 1, p-value = 0.5731

Python example:

>>> from scipy import stats
>>> stats.chi2_contingency([[6841-5173,5173],[1217-930,930]])
(0.31748297614660292, 0.57312422493552839, 1, array([[ 1659.73628692,  5181.26371308],
       [  295.26371308,   921.73628692]]))
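
For anyone who wants to sanity-check the sample-size figure: the post does not say which formula was used, so this is only a sketch. It assumes the usual normal-approximation formula for comparing two proportions, with equal group sizes, a two-sided test, and no continuity correction. The function name n_per_group is made up for illustration. With the two observed rates it should land close to the per-group number quoted above; adding a continuity correction would give a somewhat larger figure.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-proportion test.

    Assumptions (not confirmed by the post): normal approximation,
    equal group sizes, no continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return numerator / (p1 - p2) ** 2


# Observed rates from the two customer groups in the post.
n = n_per_group(930 / 1217, 5173 / 6841)
print(ceil(n))
```

The point is the order of magnitude: a gap of less than one percentage point needs tens of thousands of customers per group, far more than the 1217 in the smaller group here.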