About FME | January 14, 2013 | By Mark Ireland

FME 2013 Special: Key Functionality Updates

Dear users,
With the release of any software product, there are inevitably a few updates that grab the headlines and steal attention away from everything else.

With FME2013 – and I’m writing this about 1 month pre-release – I think the Smart Delete (Auto-Heal) function in Workbench will get a lot of attention, as will the GeometryValidator transformer, the Data Inspector table viewer, and new formats such as Socrata, Salesforce, SpatiaLite, and Ingres.

But for users who work with FME on a daily basis – those who know most of the transformers by heart and have used nearly all of them in one combination or another – I have another list of updates. The updates on this list are what I call “Key Functionality” updates. At first glance they won’t seem important, but each one represents an explosion of opportunities to do new things with that particular tool.

Think of these updates like the winglets on an aircraft wing. They’re only a tiny addition, but they radically improve handling and increase performance. You won’t buy a plane just because of these, but they make a great difference to pilots who fly on a daily basis.

So thank you for choosing FME for your journey to Interoperability. At Safe Software we have an entire team of support staff responsible for your well-being, so if you have any questions please don’t hesitate to let one of us know. But in the meantime, we’d appreciate your attention while we review the enhanced features of this FME 2013 software…

ShortestPathFinder: Multiple path support
If you explored the ShortestPathFinder transformer before, you were probably disappointed by one fatal flaw. The original design could calculate routes on multiple networks at once – a great idea – but only one path per network. That meant that to calculate multiple routes on the same network you had to use a Cloner to make duplicate copies of the network, one for each route. That wasn’t good for either usability or performance.

So, in 2013 we changed all that. The transformer can now take a single network and calculate multiple routes on it. It’s way easier to use, and way better for performance.

Now, one other obvious update is that the route definition has changed from a start and end point to a line whose end vertices are the start and end points.

On the one hand, we admit, this is slightly harder to handle. You’ll probably need to put a PointConnector in there, as above. On the other hand, it gives you the HUGE advantage of being able to include intermediate points en route: any vertices on the line are counted as stops that must be passed through. In this example the darker, 3-point line represents the start, end, and an intermediate point, and the red line represents the shortest route through those three points, like so:

We also added a NOPATH output port. If a route through the given start, end, and stops can’t be calculated on the supplied network, then the FROM-TO line is output via this port, rather than just disappearing as it did previously.
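Incidentally, if you want to picture what multi-stop routing involves, here’s a minimal Python sketch of the general technique – chaining shortest-path queries between consecutive stops – using the networkx library. It’s an illustration of the concept only, not FME’s internal implementation, and the toy network is made up:

```python
# A sketch of routing through intermediate stops - an illustration of
# the concept, not FME's internal implementation.
import networkx as nx

def route_via_stops(graph, stops):
    """Return one path visiting every stop in order, by chaining
    shortest-path queries between consecutive stops."""
    full_path = [stops[0]]
    for start, end in zip(stops, stops[1:]):
        try:
            leg = nx.shortest_path(graph, start, end, weight="length")
        except nx.NetworkXNoPath:
            return None  # analogous to the new NOPATH output port
        full_path.extend(leg[1:])  # skip the duplicated junction vertex
    return full_path

# Toy network: three nodes, with a long direct edge from A to C.
G = nx.Graph()
G.add_edge("A", "B", length=1.0)
G.add_edge("B", "C", length=1.0)
G.add_edge("A", "C", length=5.0)

print(route_via_stops(G, ["A", "C"]))       # ['A', 'B', 'C'] - shorter via B
print(route_via_stops(G, ["A", "C", "B"]))  # routes through stop C before B
```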

So, while it might not get a lot of attention, this transformer has a lot more potential than it did prior to 2013. If you passed it over before, you may want to look into it again – perhaps starting with this video demonstration.

Readers/Writers: Zip File and URL Support
When I “announced” this new functionality on my Twitter feed, it got more re-tweets (18) than any other function I’ve tweeted about. That’s when I knew it must be really important to our users, and on reflection I can see why. It really might be a game-changing update.

In short, all file/folder readers and writers (i.e. not databases) can now read their dataset from within a zip file and write their data out to a zip file. In addition, any reader can now read its dataset from a URL.

I’m guessing file management and ease of use are what make this such a big deal. In particular, the combination of zip file and URL reading means you can read datasets that are stored online in zipped form. It even reads from an FTP site.

For example, this is a Shape dataset of community centres (from the City of Vancouver Data Catalogue) and this is me reading it directly in the FME Universal Viewer:

With this reader in my workspace, whenever I run a translation I can be sure I am using the latest version of the data from their catalogue. You could do the same on data stored anywhere on HTTP/FTP (un-authenticated only, at the time of writing).
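Behind the scenes, the idea is simple enough to sketch in a few lines of Python using only the standard library. This is just an illustration of the concept – the URL below is a placeholder, and FME’s readers do all of this for you automatically:

```python
# Fetch a zipped dataset over HTTP and peek inside it - a standard
# library illustration of what the readers now do automatically.
import io
import zipfile
from urllib.request import urlopen

# Placeholder URL - substitute a real zipped dataset.
url = "http://example.com/data/community_centres.zip"

with urlopen(url) as response:
    payload = io.BytesIO(response.read())   # buffer the zip in memory

with zipfile.ZipFile(payload) as archive:
    for name in archive.namelist():         # e.g. .shp/.shx/.dbf members
        print(name)
```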

I can also read data directly from a zip file stored in Google Docs. The trick there is to share the document and then use a URL of the form:

https://docs.google.com/uc?export=download&id=XXXX

…where XXXX is the ID number of your document. For example, here I am reading this zipped MapInfo TAB dataset:

You can even read a CSV dataset like that (preview doesn’t seem to work yet) and, what’s more, you can use it as the source in a SchemaMapper table. So you could put your schema mappings in a Google Doc and always get the current version.

Additionally – and this I really love – you can put in a data streaming URL from FME Server, and it will read that too, like this:

Incidentally, you can stream just about any data you want. If you stream Shape (for example) then it gets zipped up by the streaming service, and – of course – the readers can now handle that. Check out the new FME Server tutorial movies to see how I use the capability to open data downloads directly within FME Viewer.

Overall, the more I use this capability, the more amazed I am at what is possible. It’s definitely a case of the functionality exceeding what we designed it for, or ever considered it could do.

WorkspaceRunner: Maximum Processes Parameter
The WorkspaceRunner is a way for one workspace to run another. Why do that? Usually to start a batch process of some sort. Often I combine it with the “Directory and File Pathnames” reader. For example, I use that reader on a folder full of datasets I want to translate. The result is one feature per dataset, with an attribute storing the file name. I pass that file name on to a workspace with the WorkspaceRunner. The result? I can process a whole set of datasets without having to know their names or quantity in advance.

Up until now, the WorkspaceRunner let you run jobs either consecutively (one after the other) or all at once (concurrently). That sounds good, but it can be limiting. If, in my example, I’m trying to process 50 datasets, I won’t want to run 50 jobs simultaneously, but neither do I really want to run only one at a time.

In FME Server, there is a queue and a set number of engines to handle this. But what do I do in FME Desktop?

Well, in 2013 we’ve solved that problem by adding a new parameter. This parameter simply adds the ability to set the maximum number of jobs to run at any one time.

Now, for example, if I have a quad-core computer I could run my 50 jobs but only ever have a maximum of 8 (or fewer) running at any one time.
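For anyone curious, here’s roughly what that behaviour looks like if you script it by hand in Python: a pool capped at eight workers, each launching the FME command line on a different dataset. The workspace name, folder, and published parameter here are placeholders of my own, not something from FME itself:

```python
# Run one FME job per dataset, but never more than MAX_JOBS at once -
# a hand-rolled version of the new Maximum Processes parameter.
# The workspace, folder, and parameter name are placeholders.
import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_JOBS = 8  # the cap the new parameter lets you set

def run_job(dataset):
    """Launch one FME translation and return its exit code."""
    result = subprocess.run(
        ["fme", "translate.fmw", "--SourceDataset", dataset],
        capture_output=True,
    )
    return result.returncode

datasets = glob.glob("C:/data/incoming/*.shp")
with ThreadPoolExecutor(max_workers=MAX_JOBS) as pool:
    for dataset, code in zip(datasets, pool.map(run_job, datasets)):
        print(dataset, "OK" if code == 0 else f"failed ({code})")
```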

I think this will make a big difference to how many users batch process jobs with FME Desktop. You’ll be able to maximize your system resources without overloading them, and without having to purchase multiple FME Server engines!

Incidentally…

…try not to get confused between this and the use of parallel processing techniques. WorkspaceRunner processes are “master” processes, of which you can run up to 8 regardless of license/edition [NB: The GUI above says 1-32, but it is wrong and has since been fixed].

Parallel processing processes are “worker” processes. They are license dependent, to a maximum of 16 with a Smallworld edition, so any master process can run up to 16 workers.

Of course, be careful using parallel processing with multiple workspaces, because even if you don’t hit FME’s limit, you might still be overloading your computer.

Chopper: Chop by Length and Interior Vertices
Oh, this is a nice update. The Chopper used to let you chop up lines and – did you know – polygons, according to a set number of vertices. But in 2013 it has a new mode that lets you chop by length.

OK, I don’t need to explain what this is or why it’s so useful, but I do have to explain how it works, because until you think about it the output can look a little unintuitive.

When you chop a line by distance, we output a set of two-point sections of equal length. But we always keep existing vertices, and because these vertices might not fall at an exact chop point, we intelligently scale the output to get as close as we can to the desired distance without producing short leftover sections.

Here’s an example:

If this is a single feature where each side is 1.0 units, and the chop distance is 0.9, then – if we interpreted that literally – the output would look like this:

See how there is a mix of long and short lines, because the existing vertices don’t fall exactly on the 0.9 distance? The output would be a mix of lines 0.9 and 0.1 units long! Not what you want at all. So, instead of that, we approximate a distance that gives a consistent output, like so:

So the actual length of each output section here is 1.0 units. Like I said, it’s not fully intuitive, but it is logical when you think about it. You get to choose between this method (Uniform Interval) and the original method (Exact Interval), and the Densifier transformer gets the same treatment.
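If you like to see the arithmetic, here’s my guess at the Uniform Interval idea in a few lines of Python – pick the whole number of sections closest to the requested length, then divide the segment evenly. It’s a sketch of the concept, not FME’s actual code:

```python
# A guess at the Uniform Interval arithmetic: choose a whole number of
# sections close to the requested length, then divide evenly so no
# short leftover pieces are produced. Not FME's actual code.
def uniform_interval(segment_length, desired):
    sections = max(1, round(segment_length / desired))
    return segment_length / sections

# Each 1.0-unit side chopped with a desired length of 0.9:
print(uniform_interval(1.0, 0.9))   # 1.0 -> one even section per side
# A 10.0-unit line chopped with a desired length of 0.9:
print(uniform_interval(10.0, 0.9))  # 0.909... -> 11 even sections
```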

Also notice that the output from chopping polygons can now use interior vertices:

That makes the output look way nicer than it used to, with more regular shapes.

So, give it a try and let us know what you think. Does the approximated “uniform” method work for you?

Parameters: Attribute Name Parameter
This is an update that helps get input from the user at run time. In the past there was a workaround using the AttributeSetter, but that transformer no longer exists, so here we are. I can only think of one use case for this, though I’m sure there are more.

Anyway, the use case is where you want a user to select an attribute, and ONLY an attribute.

Let’s take an example. Here I have a 3DForcer and I want the user to select the attribute from which to get Z values. I can easily create a user (published) parameter like so:

But the problem is this: many of our parameters are not attribute-only. Most of them are <something> OR an attribute; for example, we have what we call “STRING_OR_ATTR”, which can be either a text string or an attribute.

In the case of the 3DForcer, the parameter is “FLOAT_OR_ATTR”, which means that on running the workspace the user is prompted for an attribute, but could instead enter a floating point value like this:

I don’t want them to be able to type a value, only select an attribute, so this is where the new Attribute Name user parameter comes in. I create a new user parameter from scratch (using Right Click > Add Parameter, in the Navigator window) and set it to be an Attribute Name parameter:

Now I can use that parameter in the 3DForcer. However, instead of just selecting it I have to open the editor dialog to make a reference to it:

Do you see why? It’s because that parameter is just the name of an attribute. If I just referenced that parameter, all I’d get would be the attribute name, whereas I want the attribute value. But with the setup above, I wrap an @Value() function around the attribute name to get the value of the selected attribute.
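If that name-versus-value distinction still feels abstract, a loose Python analogy (my own, and not FME syntax) may help: the parameter stores a dictionary key, and @Value() is the lookup that turns that key into data:

```python
# A loose analogy for the Attribute Name parameter - not FME syntax.
# The parameter holds the *name* of an attribute; @Value() is the
# lookup that turns that name into the attribute's *value*.
feature = {"elev_m": 123.4, "name": "Community Centre"}

param = "elev_m"       # what the Attribute Name parameter stores

print(param)           # referencing the parameter alone: 'elev_m'
print(feature[param])  # wrapping it in a lookup (@Value): 123.4
```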

Now when I run the workspace I see:

…there’s a drop-down list of attributes, and that’s it! The user isn’t able to enter fixed values.

Incidentally, in case you notice it, when you have multiple streams of data the list of attributes shown at run time is limited to the streams where you actually use that parameter. Furthermore, it’s an intersection of those streams – in other words, you’ll only see an attribute where it exists in all streams (not a union, where you would see all attributes), as you see below:

So here, only Attribute3 and Attribute4 are available, because they are the only attributes common to all the places my “attribute name” parameter is used. Does that make sense? I think it will when you try it out.
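In other words, the run-time list is a plain set intersection of each stream’s schema. A short Python sketch (with made-up attribute names) shows the difference:

```python
# The run-time list is the intersection of the schemas of every
# stream that uses the parameter - illustrated with plain sets.
stream_a = {"Attribute1", "Attribute3", "Attribute4"}
stream_b = {"Attribute2", "Attribute3", "Attribute4"}

print(stream_a & stream_b)  # intersection: Attribute3, Attribute4 (shown)
print(stream_a | stream_b)  # union: all four (NOT what you're shown)
```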

So, again, this might not sound like a huge update, but I suspect that there are some users who are going to find this very, very useful.

Tiler: Origin Point
The 2013 Tiler transformer shows, I think, how a simple control parameter can be vital to a transformer’s usefulness. The big update is a pair of parameters that define the X/Y coordinate from which tiles are generated.

In previous FME versions you could tile data by defining the size of tile to create, but the position of those tiles varied according to the extents of the data. That was a problem because, for example, with two different datasets you had no reliable way to generate the same tile boundaries for each. Also, the lower-left corner would be some hideously imprecise coordinate, like this:

But the new parameters in FME2013 mean you now have control of the “seed” point for tiling. With this you can match existing tiling and also have a more meaningful lower-left coordinate, like so:

That, I think, really transforms this transformer (pun intended) into something much more usable, as you now have complete control over the tiling operation. Again, it’s not something that will make headlines, but it’s something I can really make use of.
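If it helps to see why an origin point makes tiles predictable, here’s a short sketch of the underlying arithmetic – my reading of the behaviour, not FME’s internal code – where every tile boundary is pinned to a grid anchored at the origin:

```python
# How a seed/origin point pins tiles to a fixed grid - my reading of
# the behaviour, not FME's internal code.
import math

def tile_index(x, y, origin_x, origin_y, tile_w, tile_h):
    """Return the (column, row) of the tile containing point (x, y)."""
    col = math.floor((x - origin_x) / tile_w)
    row = math.floor((y - origin_y) / tile_h)
    return col, row

# With the origin at (0, 0) and 1000 x 1000 tiles, any two datasets
# land in the same predictable tiles, whatever their extents:
print(tile_index(1234.5, 678.9, 0, 0, 1000, 1000))   # (1, 0)
print(tile_index(2500.0, 1500.0, 0, 0, 1000, 1000))  # (2, 1)
```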

Others
There are a few other updates I wanted to mention, although these are more for ease of use than for functionality.

Since its inception, the RasterExpressionEvaluator transformer has had a very basic interface, one that made it difficult to understand and use. This is what it used to look like:

There you had to know the syntax and which bands were available. But in 2013 it’s had a complete renovation and is now a lot more usable:

Not only are the bands listed (so you can select them without typing out the syntax), but the expressions are divided up per band, making each one only a third the size it was and therefore easier to handle.
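To give a feel for the kind of expression the transformer evaluates, here’s a small numpy sketch of typical per-band raster algebra – an analogy of my own, not the transformer’s syntax – computing a vegetation index from two bands:

```python
# The flavour of per-band algebra a raster expression expresses - a
# numpy analogy, not the transformer's own syntax.
import numpy as np

red = np.array([[0.10, 0.20], [0.30, 0.40]])  # band A
nir = np.array([[0.50, 0.60], [0.70, 0.80]])  # band B

ndvi = (nir - red) / (nir + red)  # one expression per output band
print(ndvi)
```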

It’s not something you’ll have in every workspace, but when you do need it you’ll be glad of the updates!

Similarly, the Joiner transformer’s controls have been thoroughly revised from a wizard to a single settings dialog. Again it becomes way simpler to use, and more flexible.

Finally, a quick mention of the AreaBuilder transformer. Firstly, it’s been merged with the PolygonBuilder to form a single transformer; the PolygonBuilder has been deprecated and now exists only as a shortcut. Beyond that, we wanted to add a tolerance setting, and achieved it by incorporating snapping parameters. So now you can snap shut any gaps in your data while area building takes place.

Shortcut Keys
Here’s one final item I wanted to mention. Once you’ve become proficient in any software product, the next step is to become a power user, and you do that by learning shortcuts that let you carry out the same tasks more efficiently. In my opinion, the best applications are those that provide such power tools for their expert users, even if it’s just a set of shortcut keys…

Speaking of which, FME Workbench got a whole new set of shortcuts for 2013. I hope they’ll help take your FME use to the next level!

Ctrl+D: Duplicate selected object
Ctrl+G: Open Generate Workspace dialog
Ctrl+N: Open Create Workspace dialog
Ctrl+K: Attach annotation (Ctrl+Shift+K attaches a Summary Annotation)
Ctrl+T: Create Custom Transformer
Ctrl+W: Close current tab (also Ctrl+F4)
Ctrl+Tab: Switch between tabs in a forwards direction
Ctrl+Shift+Tab: Switch between tabs in a backwards direction
Ctrl+Shift+I: Connect Inspector transformers to an object
Ctrl+Shift+L: Connect Logger transformers to an object

Well, I think that’s enough from me. This post is going to be published on FME 2013 launch day – so by the time you read this the product will be launched and there’s no excuse for not trying it out!

Happy (FME) 2013,

Regards