When you are interested in testing the scale parameter rather than the variance, you will need to use different tests. The appropriate test will depend on the underlying distribution of your data. Here are a few possibilities:
Kolmogorov-Smirnov Goodness-of-Fit Test: This test is a general method for comparing a sample with a reference probability distribution. If you have an idea of what your scaled distribution should look like, you can compare your sample with this distribution.
Mann-Whitney U Test (or Wilcoxon rank-sum test): This non-parametric test is used to compare two independent samples to determine if they come from the same distribution. It can be used to compare the scale parameters of two samples. This test is typically used for comparing medians, but it also makes use of the full distribution of the data and hence provides some information about scale.
Ansari-Bradley Test: This is a non-parametric test for the equality of the variances (or scales) of two populations.
Mood’s Median Test: This is a non-parametric test that determines whether the medians of two groups are different, which indirectly relates to scale.
Bootstrapping: Bootstrapping is a resampling method where one repeatedly samples from the observed data with replacement, and then checks the empirical distribution of the scale parameter to test the hypothesis.
Likelihood Ratio Test: If you are willing to make some parametric assumptions (like assuming a particular family of distributions), then you might use a likelihood ratio test to test whether the scale parameter is equal to a particular value. For instance, if you assume that your data is exponentially distributed, the scale parameter would be the inverse of the rate parameter.
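As a rough illustration of the likelihood ratio idea just described, here is a minimal sketch for the exponential case. It assumes the data really are exponential with scale `theta` and tests a hypothesized value `theta0`; the function name and both parameter names are made up for this example. The statistic is compared against a chi-square distribution with one degree of freedom, which is only an asymptotic approximation.

```python
import math

def exponential_scale_lrt(sample, theta0):
    """Likelihood ratio test of H0: scale == theta0 for exponential data.

    Assumption: the observations are i.i.d. exponential. Returns the
    statistic and an asymptotic chi-square(1) p-value.
    """
    n = len(sample)
    mean = sum(sample) / n            # MLE of the exponential scale
    ratio = mean / theta0
    # 2 * (loglik at MLE - loglik at theta0), simplified algebraically.
    stat = 2.0 * n * (ratio - 1.0 - math.log(ratio))
    stat = max(stat, 0.0)             # guard against tiny negative rounding
    # Survival function of chi-square with 1 df: P(Z^2 > s) = erfc(sqrt(s/2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

stat, p = exponential_scale_lrt([1.0, 2.0, 3.0], theta0=20.0)
print(stat, p)
```

A small p-value here only speaks against `theta0` within the exponential family; if that family is wrong, the result says little about the scale.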
Again, the right choice of test will depend on the specifics of your problem and the nature of your data. The key is to remember that inferences about scale often come with more assumptions and are more sensitive to the exact shape of the underlying distribution compared to inferences about location (like the mean or median).
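The bootstrapping item in the list above can be sketched for the one-sample case as follows. This is only one possible scheme (a percentile bootstrap interval for the variance, with the hypothesized value rejected when it falls outside the interval), and the names `bootstrap_variance_ci`, `rejects_variance`, and `sigma0_sq` are hypothetical. Percentile intervals for a variance can have poor coverage with small samples or heavy tails, so treat this as a sketch rather than a recommended procedure.

```python
import numpy as np

def bootstrap_variance_ci(sample, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the population variance."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    n = x.size
    # Each row of idx selects one resample of size n, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_vars = x[idx].var(axis=1, ddof=1)
    lower, upper = np.quantile(boot_vars, [alpha / 2, 1 - alpha / 2])
    return lower, upper

def rejects_variance(sample, sigma0_sq, **kwargs):
    """Reject H0: variance == sigma0_sq if sigma0_sq lies outside the interval."""
    lower, upper = bootstrap_variance_ci(sample, **kwargs)
    return not (lower <= sigma0_sq <= upper)

# Illustrative data only: a skewed, clearly non-normal sample.
data = np.random.default_rng(1).exponential(scale=2.0, size=200)
print(bootstrap_variance_ci(data))
print(rejects_variance(data, sigma0_sq=100.0))
```

No distributional family is assumed here, which is the attraction in the "population distribution unknown" setting, at the price of relying on the sample being large enough for the resampling distribution to be informative.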
The population distribution is unknown and I want to test the variance of a single sample. Which test should I use?
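nan.xiao It seems GPT-4 is a good deal more reliable than 3.5. GPT-3.5 suggests Levene's test when the data are not normal, but does not warn that Levene's test cannot be used in the one-sample setting; it even recommends Bartlett's test as if it were non-parametric: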
If the population distribution is unknown or not assumed to be normal, and you want to test whether a sample variance equals an expected value, you can use a non-parametric test called the Bartlett's test.
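If you don't double-check it by hand yourself, you get led straight into the ditch. Some of the time, my state of mind is as follows: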
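nan.xiao Thanks. So it seems GPT-4 dodges the question with a perfectly straight face. It could have simply said that it found no relevant test and left it at that, yet it produced a pile of other content that does nothing for the question, though it is enough to fool newcomers. Honestly, I had gone through some material carefully before, trying to seat the many testing methods in their places according to the MECE principle, and the only one without a seat was the testing problem above (testing the scale/variance parameter for a single sample when the population distribution is unknown). So I figured I simply did not know enough and had failed to find it, and came to ask GPT-4, only for it to talk nonsense with a straight face.
Actually, I have a question: setting aside the lag caused by GPT-4's version updates, can GPT-4 be guaranteed to already hold the most cutting-edge knowledge? If it cannot find an answer, does that mean nobody has worked on the problem yet? I feel this question would be quite important to researchers.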
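Cloud2016 Agreed. Even GPT-4 still needs to improve its inference ability and conciseness, as well as the hallucination everyone complains about most. These weaknesses all show up at once when you ask it a creative-writing question that demands hard facts, such as writing a grant proposal: right away it makes up a pile of nonexistent papers and forcibly caters to the premise with nesting-doll text.
Even so, GPT-4's current level is still considerably stronger than all open-source models and GPT-3.5. My personal guess is that it may be two to three years ahead of Google. If the relevant data is not too rare in the training text, and given a reliable framework and context, it currently works fine when positioned as a writing assistant. Of course, I am in the "Adventist" (降临派) camp and look forward to how GPT-5 performs.
As for your question, my feeling is that GPT can basically guarantee knowledge coverage (as long as the problem you care about is not too obscure), so I sometimes use it to validate ideas as well, because many research questions require a lot of context as input, and for such questions a Google search has no good way of finding the answer.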
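There is another explanation for finding no relevant material and nobody having studied it: this testing problem may simply have no practical value and not be worth studying. I was just mechanically applying the MECE principle to partition the methods.
I happened to come across a piece of news describing a similar issue. All large language models are stumped by reasoning problems that demand hard facts. Someone posed a very simple reasoning and calculation problem, as follows: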
Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?
This involves reading comprehension, and not a single LLM got it right: https://benchmarks.llmonitor.com/sally
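For reference, the arithmetic behind the Sally puzzle can be written out in a few lines. The sketch assumes all the siblings share the same parents, so that every brother's sisters are exactly the girls in the family; the function name is made up for this example.

```python
def sisters_of_a_girl(sisters_per_brother):
    """Number of sisters one girl has, given how many sisters each brother has."""
    # Every brother's sisters are all the girls in the family (assumption:
    # one family, no half-siblings), so this count includes the girl herself.
    girls_in_family = sisters_per_brother
    # A girl is not her own sister, so subtract one.
    return girls_in_family - 1

print(sisters_of_a_girl(2))  # 1
```

The number of brothers never enters the calculation, which is what makes the puzzle a reading-comprehension test rather than an arithmetic one.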