Wednesday 23 June 2010

Start your WPS Services

Now that I have made contact with the different groups, a couple of things are clear:
  • Everyone is very enthusiastic about having additional clients to talk to (this is the traditional chicken and egg problem with standing up either a client or a service - you need a friend in order to have a conversation - or a chicken I guess)
  • Demo services are available for anything not under active development (this is great news for me)
    • While deegree 3 WPS is still technically under development, they are producing a downloadable WAR, making their service easy to test
    • GeoServer documentation needs some work (sad news for me but I can fix it)
    • PyWPS was recommended
    Today's task is to connect to each service and make sure I can parse the capabilities document and, if things go well, the describe process documents.
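
As a rough illustration of today's task, here is a minimal Python sketch of the two requests and of pulling the process list out of a capabilities document. It assumes the WPS 1.0.0 key-value-pair GET encoding and the WPS 1.0.0 / OWS 1.1 namespaces; the endpoint, the `buffer` process and the sample document are made up for illustration and are not taken from any of the services above.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by WPS 1.0.0 responses
NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}


def capabilities_url(endpoint):
    """Build a GetCapabilities request (KVP encoding)."""
    return endpoint + "?" + urlencode(
        {"service": "WPS", "request": "GetCapabilities"})


def describe_process_url(endpoint, identifier):
    """Build a DescribeProcess request for one process identifier."""
    return endpoint + "?" + urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "DescribeProcess",
        "identifier": identifier,
    })


def list_processes(capabilities_xml):
    """Return (identifier, title) pairs from a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    found = []
    for process in root.findall("wps:ProcessOfferings/wps:Process", NS):
        identifier = process.findtext("ows:Identifier", default="", namespaces=NS)
        title = process.findtext("ows:Title", default="", namespaces=NS)
        found.append((identifier, title))
    return found


# Hypothetical, heavily trimmed capabilities document for illustration
SAMPLE = """<wps:Capabilities
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1"
    service="WPS" version="1.0.0">
  <wps:ProcessOfferings>
    <wps:Process wps:processVersion="1">
      <ows:Identifier>buffer</ows:Identifier>
      <ows:Title>Buffer a geometry</ows:Title>
    </wps:Process>
  </wps:ProcessOfferings>
</wps:Capabilities>"""

for identifier, title in list_processes(SAMPLE):
    print(identifier, "-", title)
    print(describe_process_url("http://example.com/wps", identifier))
```

Against a real service the capabilities text would come from fetching `capabilities_url(...)`; each identifier found can then be fed to `describe_process_url`.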

    52 North
    52 North have been very supportive with both a stable service to test against, and in standing up a service from their development branch. Their development branch makes use of GeoTools 2.6 and I am keen to hear how their transition went. Recently we have made some usability improvements for GeoTools 2.7 which will make those updating older applications even easier.
    Thanks to Bastian for setting up a server using the development branch.

    The ZooWPS mailing list got back to me today and quickly pointed me to both examples and a sample WPS service I can test against. The examples confused me a bit (as they skip straight to the execute requests and rush over the whole capabilities and describe process steps).
    Thanks to Nicolas for the sample server to test against.

    While the examples are confusing (and show a danger of just using links for data), they have a great, simple picture explaining how their WPS functions.

    Deegree makes a number of demonstration services available for testing, but they make use of an older 0.4 version of the WPS specification (and my dedication to standards compatibility has a limit). The new deegree 3 is implementing WPS 1.0 and has a WAR available that fits my needs.
    (can I capitalise deegree if it is used to start a sentence ... or is it like "iPod" and the shape of the word matters more than silly English sentence conventions?)

    GeoServer WPS Community Module
    While I can quickly produce a WAR of the community module, my preferred method of testing is to use a very lightweight application server called "Jetty". Indeed, use of Jetty is rolled into the Maven build system:
    cd web/app
    mvn jetty:run -Pwps
    The only difficulty is that the build tool Maven has grown a bit responsible since I last used it and will no longer install plugins such as Jetty without me modifying a couple of configuration files first. I am going to sort out what is needed and update the GeoServer docs later today.

    PyWPS was a recommendation from yesterday (thanks Tom), and it appears to be a contemporary of deegree in terms of years of experience. Two things really attract my eye when looking at a new project:
    • recent news (showing that the project is alive)
    • documentation (even better if it is called course material)
    And guess what is on the PyWPS home page?

    2010-05-05 New course material added

    New course material added to PyWPS source. See documentation for details.
    I will sign up to the email list and try and introduce myself shortly.

    uDig (ie client)
    The other thing I am working on is user interface ideas to present the idea of external processing. My thinking thus far is to cheat and simply represent external results (using a progress bar as a placeholder if processing is still going).

    If I consider it as a list of results it becomes a more interesting and productive user interface concept:
    • results can be "tagged" to define ad-hoc grouping according to server, process, processing status
    • results that were produced externally (such as to an ftp site) can be listed, and downloaded if needed
    • using a WPS could be considered "adding" a result to the list and handled using a wizard (although a wizard is not the best for interacting with the map - such as selecting a calculation area)
    • I should be able to record the steps that were used to produce each result and "rerun" if needed
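
To make the list-of-results idea a little more concrete, here is a small Python sketch of the kind of model that could sit behind such a view. It is only a thought experiment, not uDig code: `ProcessResult`, `ResultList` and every field name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessResult:
    """One entry in the list of results (all names are hypothetical)."""
    server: str
    process: str
    status: str = "running"   # "running", "done" or "failed"
    progress: int = 0         # percent complete, drives the progress bar
    location: str = ""        # where the output lives, e.g. an FTP URL
    tags: set = field(default_factory=set)
    steps: list = field(default_factory=list)  # recorded inputs, kept for rerun


class ResultList:
    def __init__(self):
        self.results = []

    def add(self, result):
        """Using a WPS is treated as adding a result to the list."""
        self.results.append(result)
        return result

    def group_by(self, attribute):
        """Ad-hoc grouping by "server", "process" or "status"."""
        groups = {}
        for result in self.results:
            groups.setdefault(getattr(result, attribute), []).append(result)
        return groups

    def tagged(self, tag):
        """All results carrying a user-supplied tag."""
        return [r for r in self.results if tag in r.tags]

    def rerun(self, result):
        """Add a fresh entry that replays the recorded steps."""
        again = ProcessResult(result.server, result.process,
                              tags=set(result.tags), steps=list(result.steps))
        return self.add(again)
```

A finished result and its rerun then show up as two entries that share tags and steps but differ in status, which is what the grouping and the progress-bar placeholder would key on.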

    Tuesday 22 June 2010

    Web Processing Service Round Up

    I have a fun bit of work lined up - updating the web processing service client code in uDig.

    It is no secret that I am a huge fan of the idea of Web Processing Service - I am excited about the possibilities in using a WPS as a front to a grid of computers (a strategy 52North seems to be pursuing), and the ability to bundle up processes written in a number of languages (something ZooWPS is really going after).

    The part I am really keen on does not seem to be tackled yet: I am very interested in chaining processes using standard diagrams such as BPEL - this represents a really nice olive branch between GIS and the business analysts (who would love to know what the department is doing). There is some confusion in this area as the diagrams end up looking similar to those provided by BI tools (since GIS is used for decision making) or similar to ETL tools (since chains of processing are required).

    Today I am making contact with the different web processing service implementations, warning them what I am up to, and generally finding out where they live and what is a good contact point for communications.

    Thus far:
    • 52North - 30 mins to respond to email, seems to be very active and able to link to an example WPS service right out of the gate. This is the established open source WPS solution and I am looking forward to seeing how it handles feature collections and raster processing.
    • ZooWPS - no response to email yet, but the IRC channel was well populated (turns out half the members were my LISAsoft co-workers from different offices around Australia). This is the new kid on the block in the WPS space.
    • GeoServer WPS Community Module - no email since I had already been following that email list. The GeoServer WPS community module has been very quiet in its development but has made recent progress in the two areas I am interested in testing.
    • deegree 3 is working on their second generation WPS implementation and is under active development - I may end up building from source in order to have something to test. It is great to see the continued support of WPS here (deegree 2 worked against an earlier version of the specification).
    The two areas I am targeting each have their own special risks.

    Features should be the bread and butter of GIS processing and we are held back in this area by the generally haphazard support for GML. I can see nailing everything to the wall using GML and XML Schema when shuttling data between services; this is really what should be done, since it is a data interchange format. GML allows us to communicate the range and limits of the data and be able to negotiate differences between data models. I could see using this approach in an ETL context or when doing scientific work.

    The expectations of the current crop of implementations are in a slightly different direction: focus on geometry (hey it is spatial!) and have the attributes carried along for the ride. The ZooWPS implementation also supports GeoJSON, which is very good for this style of ad-hoc collaboration. Even for this ad-hoc style we will need to indicate "which" geometry in a feature needs to be acted on ... so it should be fun seeing what the different implementations have provided.
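
The "geometry first, attributes along for the ride" style is easy to picture with GeoJSON. The Python sketch below is only an illustration, not how any of the implementations above work: `process_features` and `map_coords` are hypothetical helpers that apply a function to every coordinate and copy the properties across untouched. It assumes each feature has a single non-null geometry with a `coordinates` member, so a `GeometryCollection` is not handled.

```python
def map_coords(coords, fn):
    """Apply fn(x, y) to every position in a nested coordinate array."""
    # A position is a list of numbers; anything else is a list of positions
    if coords and isinstance(coords[0], (int, float)):
        return list(fn(*coords[:2])) + list(coords[2:])
    return [map_coords(c, fn) for c in coords]


def process_features(collection, fn):
    """Act on the geometry of each feature; attributes ride along as-is."""
    out = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        out.append({
            "type": "Feature",
            "geometry": {
                "type": geometry["type"],
                "coordinates": map_coords(geometry["coordinates"], fn),
            },
            "properties": feature.get("properties"),
        })
    return {"type": "FeatureCollection", "features": out}


# Made-up sample data for illustration
wells = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [10.0, 20.0]},
        "properties": {"name": "well"},
    }],
}

moved = process_features(wells, lambda x, y: (x + 1, y + 2))
```

Because a GeoJSON feature carries exactly one `geometry`, the "which geometry" question answers itself here; it comes back as soon as the features arrive as GML with several geometry properties.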

    Raster data is also interesting/scary. There is an answer in place for the obvious question of data size (the WPS specification accounts for this by allowing long running processes making use of FTP sites for staging results). The other question is the same one encountered by the Web Coverage Service: what does the data mean? Which bands mean what, how is your DEM height measured, etc. I am really not sure if WPS is up to capturing this information: will the file format headers capture this in enough detail, or will each process need to be supplied hints to sort out how to interact with the information?
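
For the long running case, my understanding of WPS 1.0.0 is that a client asks for a stored, status-enabled execute response and then polls the document named by its `statusLocation` attribute. The Python sketch below shows how a client might read one such document; the sample response is made up, and `read_status` is a hypothetical helper rather than part of any existing client.

```python
import xml.etree.ElementTree as ET

WPS = "http://www.opengis.net/wps/1.0.0"

# Hypothetical, trimmed ExecuteResponse for a process that is still running
SAMPLE_RESPONSE = """<wps:ExecuteResponse
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    service="WPS" version="1.0.0"
    statusLocation="http://example.com/wps/status/42.xml">
  <wps:Status creationTime="2010-06-22T10:00:00Z">
    <wps:ProcessStarted percentCompleted="40">Working</wps:ProcessStarted>
  </wps:Status>
</wps:ExecuteResponse>"""


def read_status(execute_response_xml):
    """Summarise the wps:Status element of an ExecuteResponse."""
    root = ET.fromstring(execute_response_xml)
    status = root.find("{%s}Status" % WPS)
    if status is None or len(status) == 0:
        return {"state": None, "percent": None,
                "status_location": root.get("statusLocation")}
    # The single child names the state: ProcessAccepted, ProcessStarted,
    # ProcessPaused, ProcessSucceeded or ProcessFailed
    state = status[0]
    percent = state.get("percentCompleted")
    return {
        "state": state.tag.split("}")[1],
        "percent": int(percent) if percent is not None else None,
        "status_location": root.get("statusLocation"),
    }


print(read_status(SAMPLE_RESPONSE))
```

A client would repeat this against the status location until the state is `ProcessSucceeded` or `ProcessFailed`; the `percent` value is what a progress bar placeholder could display in the meantime. None of this helps with the harder question of what the raster bands mean.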