Regex – Confusion with locale in R

I just answered the question "Removing characters after a EURO symbol in R", but the R code that works for other users on Ubuntu does not work for me.

Here is my code.

  x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
  euro <- "\u20AC"
  gsub(paste(euro, "(\\S+)|."), "\\1", x)
  # ""

I think this comes down to changing the locale, and I don't know how to do that.

I'm running RStudio on Windows 8.

  > sessionInfo()
  R version 3.2.0 (2015-04-16)
  Platform: x86_64-w64-mingw32/x64 (64-bit)
  Running under: Windows 8 x64 (build 9200)

  locale:
  [1] LC_COLLATE=English_United States.1252
  [2] LC_CTYPE=English_United States.1252
  [3] LC_MONETARY=English_United States.1252
  [4] LC_NUMERIC=C
  [5] LC_TIME=English_United States.1252

  attached base packages:
  [1] stats graphics grDevices utils datasets methods
  [7] base

  loaded via a namespace (and not attached):
  [1] tools_3.2.0

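@Anada's answer is good, but it means adding that encoding argument every time we use Unicode characters in a regex. Is there a way to change the default encoding to UTF-8 on Windows?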

This looks like an encoding problem.

Consider:

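  x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
  # Without an encoding mark on x, the euro sign in the pattern never matches
  gsub(paste(euro, "(\\S+)|."), "\\1", x)
  # [1] ""
  # Declaring the encoding of x first lets the pattern match
  gsub(paste(euro, "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF8"))
  # [1] "15,896.80"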
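To avoid repeating the encoding step at every call, one option is a small wrapper that converts both the pattern and the input to UTF-8 before matching. The following is only a sketch: `gsub_utf8` is a hypothetical helper name, not part of base R, and it assumes the input's declared (or native) encoding is correct, because `enc2utf8()` converts from that declaration. The input is built from the `\u20AC` escape so that the example does not depend on how the console or script file encodes a literal euro sign.

```r
euro <- "\u20AC"  # a \u escape yields a string marked as UTF-8
x <- paste0("at a price of ", euro, " 15,896.80 (if executed fro")

# Inspect how R sees the strings and the session
Encoding(euro)          # "UTF-8"
l10n_info()[["UTF-8"]]  # TRUE only when the session locale is UTF-8

# Hypothetical helper: normalise pattern and input to UTF-8, then call gsub()
gsub_utf8 <- function(pattern, replacement, x, ...) {
  gsub(enc2utf8(pattern), replacement, enc2utf8(x), ...)
}

result <- gsub_utf8(paste(euro, "(\\S+)|."), "\\1", x)
result
# [1] "15,896.80"
```

If a string carries the wrong encoding mark, conversion alone will not help, and the mark has to be corrected first with `Encoding<-`. As far as I know, R 4.2.0 and later use UTF-8 as the native encoding on recent Windows 10 builds, which should make this kind of workaround unnecessary there.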
