FME 2011 Sneak Peek: New Tester Predicates
Since the Tester is consistently the most-used of all transformers (various figures suggest between 8-12% of all transformers placed are Testers) it’s right that we give it some useful care and attention for each release.
For FME2010 the Tester got a new ability to create composite tests.
In FME2011 the big change is a new set of predicates (test operations) that can be performed.
The New Predicates
There are seven new operations:
- In + Between + Like + Matches Regex + Contains + Begins With + Ends With
You might have noticed that some of these operators bear close relation to SQL Where clause predicates. This is intentional, and the behaviour and syntax of these operators also closely matches SQL.
That’s not to say these are actual SQL statements in FME. There are no Selects, Inserts, Deletes or Updates in the Tester (more another time on where you can find those capabilities).
Each incoming feature in a Tester is evaluated against a set of test clauses, and passed or failed as appropriate. It’s the same as it always has been in that respect. But by having these predicates mimic SQL, we think users are more likely to understand immediately how they work
Of course there are some pitfalls. As with SQL statements, and rather untypically for FME, the tests are NOT case sensitive. Also you’ll find that wildcards (outside of Matches Regex) should be percentage characters (%) and not asterisks (*).
Additionally, regular expressions are evaluated a little more strictly than in other transformers; see the Matches Regex section for more information about that.
The In operation specifies that the Left Value of the test, must be contained in a list of potential values set in the Tester Right Value. It’s pretty much equivalent to the IN keyword in SQL.
An example might look like this:
Notice the syntax here is an attribute for the left hand value, and the right hand side is a comma separated list. I think that’s going to be the most common scenario.
However, there’s nothing to say you couldn’t use an attribute for the right hand value too (provided it contains a comma separated list of values) and a constant value for the left hand value (i.e. Where x In myAttribute), though that’s a less likely scenario.
Note that you can also have a range of numbers included in the comma separated list, for example:
Negative numbers are also permitted when you use a range (for example
-5--10), but that means you can’t have open-ended ranges; i.e. you can’t put -5 (meaning any number <=5) because that would be interpreted as a negative. You would have to do that as something like -999999-5. Similarly you can’t do 5- (meaning any number >=5) you would have to put 5-999999 or similar.
But on the bright side, -5–10 (minus 5 to minus 10) is permitted.
Whichever way you use it, this is great for testing one value within a list without having to use multiple clauses (or multiple Testers)
The Between operation is similar to In, but specifies a range rather than a list of values. In SQL terms it’s almost exactly equivalent to the BETWEEN predicate.
An example might look like this:
Again the right hand value syntax is comma separated, but here it’s just two comma separated values.
Just in case you are wondering:
- It is an inclusive test (i.e. 1 is between 1,4)
- It does correctly work with decimal values (e.g. 2.4 is between 1 and 4)
- It does work with non-numeric characters (e.g. you can test if E is between B and M). In that scenario it’s best to set the Tester comparison mode parameter specifically to String.
Once more the benefit here is to encapsulate the required tests within a single clause, rather than having to use two or more clauses to achieve the same thing.
The Like operation is a string match (like the existing = predicate would do) but with the addition of wildcard support. Again, the SQL keyword is almost identical in functionality to its eponymous FME counterpart.
As an example, this query in FME means a feature passes where the myAttribute field contains the letter ‘A’ followed by any single character followed by the letter ‘E’:
You can see how the SQL-style syntax (with % for a wildcard) differs from FME’s usual Tcl-inspired look and feel.
For the moment I’m going to deviate from the order shown in the transformer, skip Matches Regex, and go on to Contains.
The Contains operation is a substring match. It basically finds the existance of one string inside another. I suspect the SQL equivalent is less like the SQL Server implementation of ‘Contains’, and more representative of a LIKE predicate with multiple wildcards (…WHERE ename LIKE ‘%S%’;)
Either way the FME “Contains” is a user-friendly option for finding text strings. It looks like this:
…the test there being, does myAttribute contain the substring “bcd”? In the 2011 training materials I’m going to use this to test if a park name contains the string “golf” in which case we deduce it is a golf course and handle it separately.
Like other predicates, there’s no reason why you can’t use any combination of constants and attributes for the left and right values (for example, where myAttribute1 contains myAttribute2).
Of course this is similar to the StringSearcher transformer, but has the added advantage of being available in the Tester, and therefore can be used in combination of ANDs or ORs, and in composite tests.
The Begins With operation is a substring match specific to the start of a string. I think you can consider this as more regex-y than SQL
In the Tester it’s like this:
Much nicer that using the StringSearcher with the appropriate regex escape character (See? I can’t even remember what character you should use!)
Do I even need to say what Ends With does?
Just for consistency, here’s a screenshot of what it looks like:
It’s not too difficult to understand what the Matches Regex operation does – it tests for a match against a regular expression – the greater difficulty is (in my opinion) understanding regular expressions. But if that’s your sort of thing then go for it!
Here’s the obligatory screenshot:
The one thing that caught me out when I tried this operator, was that this operator searches for an exact match whereas the StringSearcher transformer just checks for inclusion within a string. For example,
Where myAttribute = abcd
StringSearcher: Regular Expression = bc* (Matched)
Tester: Matches Regex bc* (Fail)
But this does make sense. We’re not just searching for the text anywhere; it needs to be an exact match. As one of our developers said to me:
“There was some indecision on which way to go with this, but the word MATCH connotes equality rather than containment.”
He finished with:
“Remember: Regular expressions are for advanced users only!”
Things to Remember
- As mentioned these tests are NOT case sensitive. So in the Contains example above, it would match “bcd”, “BCD”, “Bcd”, or any other combination of cases for those three characters.
- Because the tests are not case sensitive, I’m not sure how well they will work with non-English strings. The developer is aware of the issue and is looking into how we can resolve it.
- Notice the % for a wildcard, and not a * (again, a SQL thing)
- Don’t forget the same predicates are also usable in the TestFilter transformer.
- Incidentally, did you know that you cannot directly test for a constant value of “TEST” in a Tester. I assume it’s something to do with the FME parser. However, you can perform the test by setting an attribute to the value “TEST” and choosing that attribute instead of setting a constant.
- Hey! What does he mean “advanced users only”? What is he saying about me?
Hope you find this new functionality useful,