I just answered the question
"Removing characters after a EURO symbol in R". However, the R code that works for others on Ubuntu does not work for me.
Here is my code.
- x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
- euro <- "\u20AC"
- gsub(paste(euro,"(\\S+)|."),"\\1",x)
- # ""
I think this is all about changing the locale settings, and I don't know how to do that.
I am running RStudio on Windows 8.
- > sessionInfo()
- R version 3.2.0 (2015-04-16)
- Platform: x86_64-w64-mingw32/x64 (64-bit)
- Running under: Windows 8 x64 (build 9200)
- locale:
- [1] LC_COLLATE=English_United States.1252
- [2] LC_CTYPE=English_United States.1252
- [3] LC_MONETARY=English_United States.1252
- [4] LC_NUMERIC=C
- [5] LC_TIME=English_United States.1252
- attached base packages:
- [1] stats graphics grDevices utils datasets methods
- [7] base
- loaded via a namespace (and not attached):
- [1] tools_3.2.0
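@Anada's answer is good, but we would need to add that encoding argument every time we use Unicode characters in a regex. Is there a way to change the default encoding to UTF-8 on Windows?
This seems to be an encoding problem.
Consider: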
- x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
- gsub(paste(euro, "(\\S+)|."), "\\1", x)
- # [1] ""
- gsub(paste(euro, "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF8"))
- # [1] "15,896.80"
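To see why the two calls can differ, it may help to look at the encoding R has declared for each string. The following is only a minimal sketch, not the code from the question: it assumes the euro sign is written with the `\u20AC` escape inside `x` as well (in the question it was typed literally), so that the string is marked as UTF-8 regardless of the session locale. The names `x_utf8`, `pattern`, and `price` are illustrative.

```r
# Sketch under the assumption that both strings use the \u20AC escape,
# so they carry a declared UTF-8 encoding independent of the locale.
euro <- "\u20AC"
x <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

# Inspect the declared encoding of the pattern piece and the input.
print(Encoding(euro))
print(Encoding(x))

# enc2utf8() converts a native or latin1 string to UTF-8 and leaves
# strings already marked as UTF-8 unchanged.
x_utf8 <- enc2utf8(x)

# Same pattern as above: keep the token after the euro sign, drop the rest.
pattern <- paste(euro, "(\\S+)|.")
price <- gsub(pattern, "\\1", x_utf8)
print(price)
```

If the pattern and the input end up with different declared encodings, comparing `Encoding(euro)` with `Encoding(x)` is a quick way to spot the mismatch before running `gsub`.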