If
<bblatex>\ln Y\sim\mathcal{N}\left(\mu,\,\sigma^{2}\right),\quad\text{i.e.}\quad Y\sim\mathcal{LN}\left(\mu,\,\sigma^{2}\right)</bblatex>
then the mean is
<bblatex>\mathbb{E}\left(Y\right)=\exp\left(\mu+\frac{1}{2}\sigma^{2}\right)</bblatex>
and the variance is
<bblatex>\text{var}\left(Y\right)=\left[\exp\left(\sigma^{2}\right)-1\right]\exp\left(2\mu+\sigma^{2}\right)</bblatex>
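As a sanity check on these two moment formulas, here is a minimal sketch that compares them with a Monte Carlo estimate. The function names and the parameter values (`mu=0.5`, `sigma2=0.25`) are my own illustrative assumptions, not anything taken from a reference.

```python
import math
import random


def lognormal_moments(mu, sigma2):
    """Closed-form mean and variance of Y when ln Y ~ N(mu, sigma2)."""
    mean = math.exp(mu + 0.5 * sigma2)
    var = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    return mean, var


def simulate_moments(mu, sigma2, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the same two moments (illustrative only)."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    # Draw ln Y from the normal distribution, then exponentiate.
    draws = [math.exp(rng.gauss(mu, sd)) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / n_draws
    return mean, var


if __name__ == "__main__":
    print("closed form:", lognormal_moments(0.5, 0.25))
    print("simulated:  ", simulate_moments(0.5, 0.25))
```

The two printed pairs should agree up to simulation noise, which shrinks as `n_draws` grows.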
For the simple log-linear regression
<bblatex>\ln Y_{i}=\alpha+\beta X_{i}+\varepsilon_{i}</bblatex>
with errors satisfying
<bblatex>\varepsilon_{i}\overset{\text{i.i.d.}}{\sim}\mathcal{N}\left(0,\,\sigma^{2}\right)</bblatex>
and
<bblatex>\text{cov}\left(\varepsilon_{i},\,\varepsilon_{j}\right)=0\quad\left(i\neq j\right)</bblatex>
the fitted value at any given $X$ then satisfies
<bblatex>\widehat{\ln Y}\sim\mathcal{N}\left(\alpha+\beta X,\,\sigma^{2}\left(\frac{1}{n}+\frac{\left(X-\bar{X}\right)^{2}}{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}\right)\right)</bblatex>
so the result would seem to be
<bblatex>\mathbb{E}\left(\hat{Y}\right)=\exp\left[\alpha+\beta X+\frac{1}{2}\sigma^{2}\left(\frac{1}{n}+\frac{\left(X-\bar{X}\right)^{2}}{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}\right)\right]</bblatex>
but the commonly accepted result appears to be
<bblatex>\mathbb{E}\left(\hat{Y}\right)=\exp\left[\alpha+\beta X+\frac{1}{2}\sigma^{2}\right]</bblatex>
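To see which of these two expressions describes what, here is a rough simulation sketch. It repeatedly generates data from the model, fits $\ln Y$ on $X$ by least squares, and averages the back-transformed fitted value $\exp(\hat{\alpha}+\hat{\beta}X)$ at one point. The leverage term is taken as $\frac{1}{n}+\frac{(X-\bar{X})^{2}}{\sum_{i}(X_{i}-\bar{X})^{2}}$, and the design points and parameter values are arbitrary assumptions of mine.

```python
import math
import random


def fit_ols(xs, zs):
    """Least-squares fit of z = a + b*x; returns (a_hat, b_hat)."""
    n = len(xs)
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    b_hat = sxz / sxx
    return z_bar - b_hat * x_bar, b_hat


def compare_expectations(alpha, beta, sigma2, xs, x0, n_rep=20_000, seed=1):
    """Average exp(a_hat + b_hat*x0) over simulated data sets and
    return it next to the two candidate closed-form expressions."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    total = 0.0
    for _ in range(n_rep):
        # Simulate ln Y_i = alpha + beta*X_i + eps_i on the fixed design.
        zs = [alpha + beta * x + rng.gauss(0.0, sd) for x in xs]
        a_hat, b_hat = fit_ols(xs, zs)
        total += math.exp(a_hat + b_hat * x0)
    return {
        "simulated": total / n_rep,
        "with_leverage": math.exp(alpha + beta * x0 + 0.5 * sigma2 * leverage),
        "without_leverage": math.exp(alpha + beta * x0 + 0.5 * sigma2),
    }


if __name__ == "__main__":
    design = [float(i) for i in range(1, 11)]  # hypothetical X values 1..10
    print(compare_expectations(1.0, 0.5, 0.5, design, x0=8.0))
```

In this setup the simulated average tracks the expression with the leverage factor, which is the mean of the back-transformed fitted value. The expression with the plain $\frac{1}{2}\sigma^{2}$ is the conditional mean $\mathbb{E}\left(Y\mid X\right)$ of a new observation, so the two formulas seem to answer different questions.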
What I cannot understand is that the paper "Reducing Transformation Bias in Curve Fitting" gives
<bblatex>\mathbb{E}\left(\hat{Y}\right)=\exp\left[\left(\alpha+\beta X+\frac{1}{2}\sigma^{2}\right)\left(\frac{1}{n}+\frac{\left(X-\bar{X}\right)^{2}}{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}\right)\right]</bblatex>
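In addition, the true values of $\alpha$, $\beta$ and $\sigma^{2}$ are unknown. When the least-squares estimates $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\sigma}^{2}$ are plugged in instead, it seems to me that all of the expressions above become questionable.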
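To make the plug-in concern concrete, here is a minimal sketch of two back-transformed predictions computed purely from estimates: the naive $\exp(\hat{\alpha}+\hat{\beta}X)$ and the version with $\frac{1}{2}\hat{\sigma}^{2}$ added inside the exponent. I am assuming $\hat{\sigma}^{2}=\text{RSS}/(n-2)$; the function names and simulated data are hypothetical, and I am not claiming that either estimator is unbiased.

```python
import math
import random


def backtransform_predictions(xs, ys, x0):
    """Fit ln(y) = a + b*x by least squares, then back-transform at x0.

    Returns (naive, corrected, sigma2_hat), where
      naive     = exp(a_hat + b_hat*x0)
      corrected = exp(a_hat + b_hat*x0 + sigma2_hat/2)
    """
    n = len(xs)
    zs = [math.log(y) for y in ys]
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs)) / sxx
    a_hat = z_bar - b_hat * x_bar
    # Residual variance estimate with n - 2 degrees of freedom (assumption).
    rss = sum((z - a_hat - b_hat * x) ** 2 for x, z in zip(xs, zs))
    sigma2_hat = rss / (n - 2)
    fitted = a_hat + b_hat * x0
    return math.exp(fitted), math.exp(fitted + 0.5 * sigma2_hat), sigma2_hat


def make_example_data(n=2000, alpha=1.0, beta=0.3, sigma2=0.4, seed=2):
    """Simulated data from the log-linear model (illustrative values)."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [math.exp(alpha + beta * x + rng.gauss(0.0, sd)) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_example_data()
    naive, corrected, s2 = backtransform_predictions(xs, ys, x0=5.0)
    print("naive:", naive, "corrected:", corrected, "sigma2_hat:", s2)
    print("true E(Y|X=5):", math.exp(1.0 + 0.3 * 5.0 + 0.5 * 0.4))
```

With a large sample the corrected value lands near the true conditional mean while the naive one sits below it. For small $n$ the sampling variability of $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\sigma}^{2}$ also matters, which this sketch does not address.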
Any guidance from the experts here would be much appreciated!
The back-transformation problem in log-normal regression analysis