I'm trying to build a regex that captures every line of text matching the following structure:
- Group1: 2-3 upper-case letters
- Group2: A date (day, month, and year) with flexible separators
- Group3: Any block of text up to the end of the line
Any whitespace between the groups should be ignored. The regex I currently have is
(?m)([A-Z]{2,3})[\s]+([0-9]+[-\/\.][0-9][0-9][-\/\.][0-9]+)[\s:,-]+([^\n]+)$
and it works on the few regex testing sites I have tried, using this sample text:
#Rubbish
AA 22/05/2017: First block of text. \n
BB 15/05/2017: Second block of text. \n
AA 01/05/2017: Third block of text \n\n
Rubbish block
To be precise, I tried it on https://regex101.com/. With the global flag enabled, every row is matched; without it, only the first row is, but I at least get a match. When I move the regex into Apex, however, I end up with this code:
string message = '#Rubbish \n' +
'AA 22/05/2017: First block of text. \n' +
'BB 15/05/2017: Second block of text. \n' +
'AA 01/05/2017: Third block of text \n\n' +
'Rubbish block';
System.debug(message);
// Preparing regex
Pattern regex = Pattern.compile('(?m)([A-Z]{2,3})[\s]+([0-9]+[-\/\.][0-9][0-9][-\/\.][0-9]+)[\s:,-]+([^\n]+)$');
Matcher regexMatcher = regex.matcher(message);
if(regexMatcher.matches() == true) {
    System.debug(regexMatcher);
}
else {
    System.debug('no');
}
Initially I got compilation errors. I experimented with the regex string, escaping each \ by adding another one, but even once the compilation errors were gone I still couldn't get any actual matches.
Could anybody have a look and tell me what is wrong? I'm convinced it's a dumb oversight, but I still can't see the issue.
Best Answer
There are two issues here: one is escaping the regex correctly, and the other is the semantics of checking for a match. The following code works:
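A minimal sketch of the corrected snippet, reconstructed from the two fixes described in this answer (doubled backslashes in the Apex string literal, and `find()` in place of `matches()`). The variable names and sample message follow the question's code; looping with `while` to print every matching row, and the ` | ` separator in the debug output, are assumptions made for illustration:

```apex
String message = '#Rubbish \n' +
    'AA 22/05/2017: First block of text. \n' +
    'BB 15/05/2017: Second block of text. \n' +
    'AA 01/05/2017: Third block of text \n\n' +
    'Rubbish block';

// Every regex backslash is doubled for the Apex string literal.
// '/' and '.' need no escaping inside a character class.
Pattern regex = Pattern.compile(
    '(?m)([A-Z]{2,3})\\s+([0-9]+[-/.][0-9][0-9][-/.][0-9]+)[\\s:,-]+([^\\n]+)$'
);
Matcher regexMatcher = regex.matcher(message);

// find() moves to the next place the pattern matches inside the string;
// matches() would require the entire string to match and returns false here.
while (regexMatcher.find()) {
    System.debug(
        regexMatcher.group(1) + ' | ' +
        regexMatcher.group(2) + ' | ' +
        regexMatcher.group(3)
    );
}
```

Replacing the `while` with a single `if (regexMatcher.find())` would be enough to confirm that the pattern matches at least once.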
Note that `Matcher.matches()` returns `true` only for a whole-region (whole-string, in this case) match, which we don't have. `find()` returns `true` when we're able to match anywhere in the region, which we are here.

Additionally, you must escape all backslashes in an Apex string, and there's no need to escape a forward slash or period in a regex character class.