[Unknown user]
For example, something like this:
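library(RCurl)  # getURL()
library(XML)    # htmlParse(), getNodeSet(), xmlValue()
# Download the raw HTML of the search results page
doc <- getURL("http://scholar.google.com/scholar?q=regression")
# Parse the HTML string into a document tree
parser <- htmlParse(doc, asText = TRUE)
# Select every <div class="gs_a"> node (the author / year / source line)
res <- getNodeSet(parser, "//div[@class='gs_a']")
# Extract the text content of each selected node
sapply(res, xmlValue)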
[data] [1] "NR Draper, H Smith - 1981 - popline.org"
[2] "DM Bates, DG Watts - 1988 - Wiley Online Library"
[3] "J Neter, W Wasserman, MH Kutner - 1989 - Irwin Homewood, IL"
[4] "F Mosteller, JW Tukey - Addison-Wesley Series in Behavioral Science: …, 1977 - popline.org"
[5] "EJ Pedhazur - 1997 - citeulike.org"
[6] "RH Myers - Duxbury Press, Pacific Grove, 2000 - textbooks-to-succeed.info"
[7] "P Allison - 1999 - dl.acm.org"
[8] "DW Hosmer, S Lemeshow, RX Sturdivant - 2000 - Wiley Online Library"
[9] "PCB Phillips, P Perron - Biometrika, 1988 - Biometrika Trust"
[10] "J Neter, MH Kutner, CJ Nachtsheim, W Wasserman - 1996 - weibnc.com"[/data]
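If you want the pieces separately, here is a rough sketch of how strings like these could be split into fields. It assumes each entry is delimited by " - ", that the first part is the author list, and that the last part is the site or publisher; the names `x`, `parts`, `authors`, `year` and `publisher` are just illustrative, and the year pattern is a guess that only works when every entry contains a four-digit year.

```r
# Two sample strings copied from the output above
x <- c("NR Draper, H Smith - 1981 - popline.org",
       "PCB Phillips, P Perron - Biometrika, 1988 - Biometrika Trust")

# Split each entry on the literal separator " - "
parts <- strsplit(x, " - ", fixed = TRUE)

# Assumption: the first part is the author list, the last part is the site/publisher
authors   <- sapply(parts, `[`, 1)
publisher <- sapply(parts, function(p) p[length(p)])

# Assumption: the first number of the form 19xx or 20xx is the publication year.
# regmatches() drops entries without a match, so this only lines up
# with `authors` when every string contains a year.
year <- as.integer(regmatches(x, regexpr("(19|20)[0-9]{2}", x)))

data.frame(authors, year, publisher, stringsAsFactors = FALSE)
```

To run it on the real results, you could replace `x` with `sapply(res, xmlValue)`, but check first that every entry really has a year.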