I know this question has been asked many times, but the answers all differ and I am confused.
My row is:
1,3.2,BCD,"qwer 47"" ""dfg""",1
Quoting is optional, and double quotes are escaped according to the MS Excel standard. (The data qwer 47" "dfg" is represented as "qwer 47"" ""dfg""".)
I need a regex.
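OK, you have seen from the comments that a regex is not the right tool for this. But if you insist, here goes:
This regex will work in Java (or in .NET and other implementations that support possessive quantifiers and verbose regexes):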
    ^                    # Start of string
    (?:                  # Match the following:
     (?:                 #  Either match
      [^",\n]*+          #   0 or more characters except comma, quote or newline
     |                   #  or
      "                  #   an opening quote
      (?:                #   followed by either
       [^"]*+            #    0 or more non-quote characters
      |                  #   or
       ""                #    an escaped quote ("")
      )*                 #   any number of times
      "                  #   followed by a closing quote
     )                   #  End of alternation
     ,                   #  Match a comma (separating the CSV columns)
    )*                   # Do this zero or more times.
    (?:                  # Then match
     (?:                 #  using the same rules as above
      [^",\n]*+          #   an unquoted CSV field
     |                   #  or a quoted CSV field
      "(?:[^"]*+|"")*"
     )                   #  End of alternation
    )                    # End of non-capturing group
    $                    # End of string
Java code:
    boolean foundMatch = subjectString.matches(
        "(?x)^ # Start of string\n" +
        "(?: # Match the following:\n" +
        " (?: # Either match\n" +
        " [^\",\\n]*+ # 0 or more characters except comma, quote or newline\n" +
        " | # or\n" +
        " \" # an opening quote\n" +
        " (?: # followed by either\n" +
        " [^\"]*+ # 0 or more non-quote characters\n" +
        " | # or\n" +
        " \"\" # an escaped quote (\"\")\n" +
        " )* # any number of times\n" +
        " \" # followed by a closing quote\n" +
        " ) # End of alternation\n" +
        " , # Match a comma (separating the CSV columns)\n" +
        ")* # Do this zero or more times.\n" +
        "(?: # Then match\n" +
        " (?: # using the same rules as above\n" +
        " [^\",\\n]*+ # an unquoted CSV field\n" +
        " | # or a quoted CSV field\n" +
        " \"(?:[^\"]*+|\"\")*\"\n" +
        " ) # End of alternation\n" +
        ") # End of non-capturing group\n" +
        "$ # End of string");
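To try the pattern against the row from the question, a minimal self-contained sketch could look like the following. The class name CsvLineCheck and the method isValidCsvRecord are made up for illustration. The pattern is a compact form of the verbose regex with the comments and whitespace removed, and the anchors are dropped because Matcher.matches() already requires the whole input to match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of the original answer.
final class CsvLineCheck {
    // Compact form of the verbose pattern: zero or more "field," groups,
    // followed by one final field. A field is either unquoted
    // (no comma, quote or newline) or quoted with "" as an escaped quote.
    private static final Pattern CSV_RECORD = Pattern.compile(
        "(?:(?:[^\",\\n]*+|\"(?:[^\"]*+|\"\")*\"),)*"
        + "(?:[^\",\\n]*+|\"(?:[^\"]*+|\"\")*\")");

    // Returns true if the whole string is one well-formed CSV record.
    static boolean isValidCsvRecord(String record) {
        return CSV_RECORD.matcher(record).matches();
    }

    public static void main(String[] args) {
        // The row from the question.
        String sample = "1,3.2,BCD,\"qwer 47\"\" \"\"dfg\"\"\",1";
        System.out.println(isValidCsvRecord(sample));               // prints true

        // A quoted field that is never closed is rejected.
        System.out.println(isValidCsvRecord("1,\"unterminated,2")); // prints false
    }
}
```

This only validates a record. It does not split it into fields or unescape the doubled quotes.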
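Note that you cannot assume that every line in a CSV file is a complete row. A CSV row may contain newlines, as long as the column containing the newline is enclosed in quotes. This regex knows that, but it will fail if you feed it only part of a row. That is another reason why you really need a CSV parser to validate a CSV file. That's what parsers do. If you control your input and know that there will never be a newline inside a CSV field, you might get away with it, but only then.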
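The partial-row problem can be reproduced with a short sketch. The class name PartialRecordDemo, the method isCompleteRecord and the sample data are invented for illustration. The pattern is the compact form of the regex from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample data are assumptions for illustration.
final class PartialRecordDemo {
    // Same rules as the verbose regex: unquoted fields may not contain
    // comma, quote or newline; quoted fields may contain anything, with "" as an escaped quote.
    private static final Pattern CSV_RECORD = Pattern.compile(
        "(?:(?:[^\",\\n]*+|\"(?:[^\"]*+|\"\")*\"),)*"
        + "(?:[^\",\\n]*+|\"(?:[^\"]*+|\"\")*\")");

    static boolean isCompleteRecord(String text) {
        return CSV_RECORD.matcher(text).matches();
    }

    public static void main(String[] args) {
        // One logical CSV row whose second field contains a newline.
        String record = "1,\"first line\nsecond line\",3";
        System.out.println(isCompleteRecord(record)); // prints true

        // Reading the same data line by line yields two fragments,
        // and neither fragment is a valid record on its own.
        for (String physicalLine : record.split("\n")) {
            System.out.println(isCompleteRecord(physicalLine)); // prints false, twice
        }
    }
}
```

So code that reads a file line by line and validates each line with this regex would reject valid input. A real CSV parser avoids this by tracking quote state across physical lines.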