95. PKD: Testing
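This post mainly introduces the author's testthat package.
Here is my short summary of the benefits of testing:
1) The more tests you have, the fewer errors you are likely to hit later.
2) Better code is easier to test, and writing tests often pushes us to split code into functional pieces.
3) It is easier to know what to work on: fix bugs first, and add new features when there are no bugs.
4) You can change code with more confidence.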
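1. Structure of testthat
An expectation checks a single basic result. A test, written with test_that(), groups the expectations that check one piece of functionality. A context groups a series of related test_that() calls.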
2. Expectations
Quoting the original, mixed with my own translation:
[quote]•equals() uses all.equal() to check for equality, allowing for small differences in numerical precision.
# Passes
expect_that(10, equals(10))
# Also passes
expect_that(10, equals(10 + 1e-7))
# Fails
expect_that(10, equals(10 + 1e-6))
# Definitely fails!
expect_that(10, equals(11))
•is_identical_to() uses identical() to check for exact equality.
# Passes
expect_that(10, is_identical_to(10))
# Fails
expect_that(10, is_identical_to(10 + 1e-10))
•is_equivalent_to() is a more relaxed version of equals() that ignores attributes:
# Fails
expect_that(c("one" = 1, "two" = 2), equals(1:2))
# Passes
expect_that(c("one" = 1, "two" = 2), is_equivalent_to(1:2))
•is_a() checks that an object inherits() from a specified class.
model <- lm(mpg ~ wt, data = mtcars)
# Passes
expect_that(model, is_a("lm"))
# Fails
expect_that(model, is_a("glm"))
•matches() matches a character vector against a regular expression. The optional all argument controls whether all elements or just one element needs to match. This code is powered by str_detect() from the stringr package.
string <- "Testing is fun!"
# Passes
expect_that(string, matches("Testing"))
# Fails, match is case-sensitive
expect_that(string, matches("testing"))
# Passes, match can be a regular expression
expect_that(string, matches("T.+ting"))
•prints_text() matches the printed output of an expression against a regular expression.
a <- list(1:10, letters)
# Passes
expect_that(str(a), prints_text("List of 2"))
# Passes
expect_that(str(a), prints_text(fixed("int [1:10]")))
•shows_message() checks that an expression produces a particular message:
# Passes
expect_that(library(mgcv), shows_message("This is mgcv"))
•gives_warning() checks that an expression produces a warning.
# Passes
expect_that(log(-1), gives_warning())
expect_that(log(-1), gives_warning("NaNs produced"))
# Fails
expect_that(log(0), gives_warning())
•throws_error() verifies that the expression throws an error. You can also supply a regular expression which is applied to the text of the error.
# Fails
expect_that(1 / 2, throws_error())
# Passes
expect_that(1 / "a", throws_error())
# But better to be explicit
expect_that(1 / "a", throws_error("non-numeric argument"))
•is_true() is a useful catchall if none of the other expectations do what you want - it checks that an expression is true. is_false() is the complement of is_true().
[/quote]
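To get a feel for the tolerances described in the quote, here is a minimal sketch in base R of the comparison functions the expectations are said to wrap (all.equal(), identical(), inherits()). It does not call testthat at all, and the variable names are mine, chosen only for illustration.

```r
# Base R only: the comparisons that equals(), is_identical_to(),
# is_equivalent_to() and is_a() are described as relying on.
x <- 10

# all.equal() tolerates small relative differences (default about 1.5e-8).
close_enough <- isTRUE(all.equal(x, x + 1e-7))
too_far      <- isTRUE(all.equal(x, x + 1e-6))

# identical() demands exact equality.
exact <- identical(x, x + 1e-10)

# Names are attributes: all.equal() objects to them unless told to ignore attributes.
named <- c(one = 1, two = 2)
with_attrs    <- isTRUE(all.equal(named, 1:2))
without_attrs <- isTRUE(all.equal(named, 1:2, check.attributes = FALSE))

# inherits() is the class check behind is_a().
model  <- lm(mpg ~ wt, data = mtcars)
is_lm  <- inherits(model, "lm")
is_glm <- inherits(model, "glm")
```

The isTRUE() wrapper matters because all.equal() returns a description of the difference, not FALSE, when the values differ.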
These 11 expectations can be combined into many tests, each targeting a specific kind of error. The author also provides shortcuts:
[quote]
Full form -> Shortcut
expect_that(x, is_true()) -> expect_true(x)
expect_that(x, is_false()) -> expect_false(x)
expect_that(x, is_a(y)) -> expect_is(x, y)
expect_that(x, equals(y)) -> expect_equal(x, y)
expect_that(x, is_equivalent_to(y)) -> expect_equivalent(x, y)
expect_that(x, is_identical_to(y)) -> expect_identical(x, y)
expect_that(x, matches(y)) -> expect_match(x, y)
expect_that(x, prints_text(y)) -> expect_output(x, y)
expect_that(x, shows_message(y)) -> expect_message(x, y)
expect_that(x, gives_warning(y)) -> expect_warning(x, y)
expect_that(x, throws_error(y)) -> expect_error(x, y)
[/quote]
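A small sketch of the shortcut forms in use, assuming testthat is installed. The test name and values are made up for illustration. As far as I know, recent testthat releases deprecate the expect_that() style, so the shortcuts are the forms more likely to keep working. The warning and error patterns assume an English locale.

```r
library(testthat)

test_that("shortcut expectations cover the same checks", {
  expect_true(is.numeric(10))
  expect_false(is.character(10))
  expect_equal(10, 10 + 1e-7)                         # within the default tolerance
  expect_identical(10, 10)                            # exact equality
  expect_match("Testing is fun!", "T.+ting")          # regular expression match
  expect_output(str(list(1:10, letters)), "List of 2")
  expect_message(message("hello"), "hello")
  expect_warning(log(-1), "NaNs produced")
  expect_error(1 / "a", "non-numeric argument")
})
```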
3. Tests
A test is made up of test_that(name, code block). The name is a label that tells you where, or in which piece of functionality, something went wrong when the test fails. The code block is the set of expectations that check that functionality, so a failure leads you straight to the relevant block. For example, the author tests the floor_date() function from library(lubridate):
library(testthat)
library(lubridate)

test_that("floor_date works for different units", {
  base <- as.POSIXct("2009-08-03 12:01:59.23", tz = "UTC")

  # Helpers: build the expected time, and floor the base time to a unit
  is_time <- function(x) equals(as.POSIXct(x, tz = "UTC"))
  floor_base <- function(unit) floor_date(base, unit)

  expect_that(floor_base("second"), is_time("2009-08-03 12:01:59"))
  expect_that(floor_base("minute"), is_time("2009-08-03 12:01:00"))
  expect_that(floor_base("hour"), is_time("2009-08-03 12:00:00"))
  expect_that(floor_base("day"), is_time("2009-08-03 00:00:00"))
  expect_that(floor_base("week"), is_time("2009-08-02 00:00:00"))
  expect_that(floor_base("month"), is_time("2009-08-01 00:00:00"))
  expect_that(floor_base("year"), is_time("2009-01-01 00:00:00"))
})
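So if something goes wrong and we see Test failed: "floor_date works for different units", Not expected: ..... we can quickly find the code of the test above.
Each test runs in its own environment, but some actions still affect things outside it:
1) The file system: creating and deleting files
2) The search path: loading and detaching packages
3) Global options, e.g. options() and par()
The author says that if a test performs any of these actions, you'll need to clean up after yourself. Some testing packages provide set-up and teardown methods that run automatically before and after each test. With testthat, we can instead create objects outside the tests and rely on the copy-on-modify mechanism so the outer objects stay unchanged. (I have not fully digested this part, so here is the original text.)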
[quote]When you use these actions in tests, you’ll need to clean up after yourself. Many other testing packages have set-up and teardown methods that are run automatically before and after each test. These are not so important with testthat because you can create objects outside of the tests and rely on R’s copy-on-modify semantics to keep them unchanged between test runs. To clean up other actions you can use regular R functions[/quote]
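Since this was the part I found hardest, here is a minimal base R sketch of the two ideas in the quote: copy-on-modify keeps outer objects unchanged, while file and option changes need explicit cleanup. It uses plain functions in place of test blocks, and names such as fixture and with_side_effects are hypothetical. on.exit() is one ordinary R way to clean up, not something the original article prescribes.

```r
# Copy-on-modify: an object created outside a function (or test block)
# is not changed by modifications made to it inside.
fixture <- c(a = 1, b = 2)

modify_inside <- function() {
  fixture["a"] <- 100   # modifies a local copy only
  fixture
}
inside_value <- modify_inside()

# Side effects outside R's copy semantics need explicit cleanup.
with_side_effects <- function() {
  old_opts <- options(digits = 3)          # global option
  on.exit(options(old_opts), add = TRUE)   # restore it on exit

  tmp <- tempfile(fileext = ".txt")        # file system
  on.exit(unlink(tmp), add = TRUE)         # delete the file on exit
  writeLines("scratch", tmp)

  list(path = tmp, existed = file.exists(tmp), digits = getOption("digits"))
}

digits_before <- getOption("digits")
res <- with_side_effects()
```

After the call, the outer fixture still holds its original value, the temporary file is gone, and the digits option is back to what it was.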
4. Contexts
A context groups the tests for related functionality into one block, usually one context per file. Here is an example that tests str_length from the stringr package:
context("String length")

test_that("str_length is number of characters", {
  expect_that(str_length("a"), equals(1))
  expect_that(str_length("ab"), equals(2))
  expect_that(str_length("abc"), equals(3))
})

test_that("str_length of missing is missing", {
  expect_that(str_length(NA), equals(NA_integer_))
  expect_that(str_length(c(NA, 1)), equals(c(NA, 1)))
  expect_that(str_length("NA"), equals(2))
})

test_that("str_length of factor is length of level", {
  expect_that(str_length(factor("a")), equals(1))
  expect_that(str_length(factor("ab")), equals(2))
  expect_that(str_length(factor("abc")), equals(3))
})
We also run the same tests with nchar substituted for str_length.
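5. Running tests
We run tests in two situations: interactively during development, and automatically.
1) Use test_file() and test_dir() to run all the tests in a file or directory
2) Use auto_test() to rerun tests automatically
3) Have R CMD check run the tests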
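The first option can be sketched as follows, assuming testthat is installed. The directory and the test file content are throwaway examples I made up. Only test_file() and test_dir() come from the text.

```r
library(testthat)

# A throwaway directory holding one test file (hypothetical content).
test_path <- file.path(tempdir(), "demo-tests")
dir.create(test_path, showWarnings = FALSE)
test_script <- file.path(test_path, "test-demo.R")
writeLines(c(
  'test_that("addition works", {',
  '  expect_equal(1 + 1, 2)',
  '  expect_true(is.numeric(1 + 1))',
  '})'
), test_script)

# Run a single file, then every test file in the directory.
file_results <- test_file(test_script)
dir_results  <- test_dir(test_path)
```

Both calls print a report and also return the results, which can be inspected later, for example with as.data.frame(file_results).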
6. Test files and directories
The code and output below show the difference between source(path) and test_file(path): test_file reports every result, while source only reports the first failing test.
> source("test-str_length.r")
> test_file("test-str_length.r")
.........

> source("test-nchar.r")
Error: Test failure in 'nchar of missing is missing'
* nchar(NA) not equal to NA_integer_
'is.NA' value mismatch: 0 in current 1 in target
* nchar(c(NA, 1)) not equal to c(NA, 1)
'is.NA' value mismatch: 0 in current 1 in target

> test_file("test-nchar.r")
...12..34

1. Failure: nchar of missing is missing ---------------------------------
nchar(NA) not equal to NA_integer_
'is.NA' value mismatch: 0 in current 1 in target

2. Failure: nchar of missing is missing ---------------------------------
nchar(c(NA, 1)) not equal to c(NA, 1)
'is.NA' value mismatch: 0 in current 1 in target

3. Failure: nchar of factor is length of level --------------------------
nchar(factor("ab")) not equal to 2
Mean relative difference: 0.5

4. Failure: nchar of factor is length of level --------------------------
nchar(factor("abc")) not equal to 3
Mean relative difference: 0.6666667
test_dir() automatically runs every file in a directory whose name starts with test. Below is the result for stringr: 12 contexts, each with between 2 and 25 expectations.
> test_dir("inst/tests/")
String and pattern checks : ......
Detecting patterns : .........
Duplicating strings : ......
Extract patterns : ..
Joining strings : ......
String length : .........
Locations : ............
Matching groups : ..............
Test padding : ....
Splitting strings : .........................
Extracting substrings : ...................
Trimming strings : ........
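Next comes the report format. The test_dir output above uses the summary reporter, which is the default for test_dir and test_file. We can choose the minimal reporter instead (E stands for an error, F for a failure):
> test_dir("inst/tests/", reporter="minimal")
...............................................
The last reporter, stop, calls stop() as soon as a failure occurs. The original text: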
[quote]The stop reporter is the default and stop()s whenever a failure is encountered.[/quote]
My understanding is that the stop reporter produces output like this:
4. Failure: nchar of factor is length of level --------------------------
nchar(factor("abc")) not equal to 3
Mean relative difference: 0.6666667
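To compare the three reporters side by side, here is a hedged sketch, assuming testthat is installed. The test file is a made-up example with one deliberate failure. I wrap the stop reporter in tryCatch() because I expect it to raise an R error on failure, but I have not checked that for every testthat version, so the code records the outcome without assuming it.

```r
library(testthat)

# A hypothetical test file with one passing and one deliberately failing test.
demo_file <- file.path(tempdir(), "test-reporters.R")
writeLines(c(
  'test_that("passes", { expect_equal(1 + 1, 2) })',
  'test_that("fails on purpose", { expect_equal(1 + 1, 3) })'
), demo_file)

# summary: progress characters followed by the details of each failure.
summary_results <- test_file(demo_file, reporter = "summary")

# minimal: progress characters only.
minimal_results <- test_file(demo_file, reporter = "minimal")

# stop: expected to signal an error at the failure; record what happens.
stop_outcome <- tryCatch(
  {
    test_file(demo_file, reporter = "stop")
    "no error"
  },
  error = function(e) "error raised"
)
```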
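7. Autotest
auto_test() takes two arguments, code_path and test_path: the first is the directory holding the code, the second the directory holding the tests. Once running, auto_test() watches both. If a test file changes, it reruns that test file. If a source file changes, it reloads that source file and then reruns all the test files.
As I understand it, once auto_test() is running you just keep editing, and every save triggers the tests automatically.
This replaces the traditional modify -> save -> source -> check cycle.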
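A minimal sketch of starting auto_test(), assuming testthat is installed. The directory names R and tests/testthat are a hypothetical project layout, not something the article specifies. Because auto_test() keeps running until interrupted, the call is guarded so the script does nothing in a non-interactive session or when the directories are missing.

```r
library(testthat)

# Hypothetical layout: source files in R/, tests in tests/testthat/.
code_path <- "R"
test_path <- file.path("tests", "testthat")

# auto_test() keeps watching until interrupted, so only start it in an
# interactive session where both directories exist.
if (interactive() && dir.exists(code_path) && dir.exists(test_path)) {
  auto_test(code_path, test_path)
}
```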
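8. R CMD check
First, we need to add testthat to the DESCRIPTION file so that R CMD check does not warn about unspecified dependencies.
Then we need to put the test code somewhere R CMD check can find it.
Originally, the best practice was to put all the test files in inst/tests and add the following code to tests/test-all.R:
library(testthat)
library(yourpackage)
test_package("yourpackage")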
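The currently recommended approach is to put the tests in tests/testthat and then put the following in tests/test-all.R:
library(testthat)
test_check("yourpackage")
The author says the advantage is that you can choose whether the tests get installed, using the --install-tests option of R CMD INSTALL or the following argument to install.packages():
INSTALL_opts = c("--install-tests")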
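Once again I have to guess why it is done this way:
Because the package code and the test code sit at known relative locations, test_check very likely uses test_dir and test_file internally, and so inherits the advantages of those two functions (for example over source).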
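To make the recommended layout concrete, this base R sketch creates it in a temporary directory. The package name, the placeholder test, and the choice of the Suggests field are my assumptions (the article only says to add testthat to DESCRIPTION), and the DESCRIPTION written here contains only the fields relevant to this point, not a complete file.

```r
# Hypothetical package name and location, for illustration only.
pkg_name <- "yourpackage"
pkg_root <- file.path(tempdir(), pkg_name)

# Tests live in tests/testthat/.
dir.create(file.path(pkg_root, "tests", "testthat"),
           recursive = TRUE, showWarnings = FALSE)

# Declare the dependency so R CMD check does not warn about it.
writeLines(c(
  paste("Package:", pkg_name),
  "Version: 0.0.1",
  "Suggests: testthat"
), file.path(pkg_root, "DESCRIPTION"))

# The runner script that R CMD check executes.
writeLines(c(
  "library(testthat)",
  sprintf('test_check("%s")', pkg_name)
), file.path(pkg_root, "tests", "test-all.R"))

# One test file, named with the "test" prefix so it gets picked up.
writeLines(c(
  'test_that("placeholder", {',
  '  expect_true(TRUE)',
  '})'
), file.path(pkg_root, "tests", "testthat", "test-placeholder.R"))

created <- list.files(pkg_root, recursive = TRUE)
```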
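9. Two styles of development
One is called exploratory programming, the other confirmatory programming.
Confirmatory: when a test fails, start auto_test() and keep modifying the code until it passes.
Exploratory: keep alternating between source and modify.
Once everything works, run document() and update NEWS.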