Wednesday 21 December 2011

Eye on Earth Summit

Posted: Mike Sanderson

I was privileged to be able to attend the Eye on Earth Summit, convened under the patronage of His Highness Sheikh Khalifa Bin Zayed Al Nahyan, President of the United Arab Emirates. AGEDI and UNEP gathered a panoply of world leaders in their fields for the plenary sessions. Jane Goodall talked enthusiastically about the chimps in equatorial Africa, polar explorer Rob Swan was passionate in his defence of Antarctica, Sylvia Earle (formerly Chief Scientist at NOAA) introduced us to blue carbon, and Rebecca Moore from Google Earth Outreach talked about the Google Earth Engine. This was a call-to-arms event for sustainability.

The highlight was the presentation from Hernando de Soto of the Institute for Liberty and Democracy, Peru. He talked about the Tunisian street trader Mohamed Bouazizi, who set himself on fire just over a year ago. Hernando described the feeling of hopelessness that this individual felt as he realised that, to trade legitimately, he had to seek permits from 73 government agencies. Development is key to human progress and cannot be avoided, so the role of government is to allow development whilst maintaining a balance with the environment and other individuals. The initiatives announced at the conference (see: http://www.iisd.ca/ymb/uncsd/eoes/) were purposeful, but for us to return in 2013 and measure their success they need to pass the Bouazizi test: they need to produce a simpler form of interaction between the individual and government.

And for 1Spatial, the summit declaration contained the following words:  ‘we need to develop effective mechanisms for collecting, managing and disseminating necessary environmental information, with the responsibility for quality assurance resting with those who collect or originate the data.’

Friday 2 December 2011

Looking for a new challenge?

It’s a busy time here at 1Spatial and as a consequence we’re eager to fill several roles including Software Developers and Test Analysts in our Cambridge HQ and .NET Developers in our Cork Office.

The company not only offers competitive packages, including great benefits, but also a good working environment that encourages innovation and initiative as well as fun along the way, with plenty of support when you need it.

You can visit the 1Spatial website careers page for more information about the roles above, and you can contact hr@1spatial.com to send in a CV.

Tuesday 13 September 2011

It's ArcGIS Server, but not as we know it...

Posted - John Hartshorn:
For many years we have been automating and solving data quality problems for a range of international customers helping them get more value from their quality-assured data.  So why stop there?  Surely it makes sense to look at how else customers can be helped to get more value out of the data and their investments in the GIS technology that consumes, distributes and presents that data?  Enter....Geocortex

OK, I’ll expand a little....

I started my career 20 years ago this autumn at Esri UK.  Well, ‘Doric’ as it was then.  ArcInfo 5, if I recall.  (Yes, I was very young).  Since then, Esri has continued to dominate many aspects of the GI software sector.  So, it comes as no surprise that many of those we help solve data problems for are then publishing that data into web-GIS applications built on Esri technology, ArcGIS Server to be specific, and very 2011 about it.  So, to mark my 20th anniversary of joining the Esri world, 1Spatial has very kindly decided that it will expand its support for the ArcGIS Server community, by offering the finest tried-and-tested tool there is for getting the best out of ArcGIS Server – Geocortex Essentials.

Yes, we’ve become the UK and Ireland distributor for what Esri itself clearly sees as a useful, complementary toolkit for ArcGIS Server - it has elevated the software team behind Geocortex to become an Esri Platinum Partner.

Geocortex Essentials hugely reduces the amount of grunt work (development) you have to do and, instead, gives you a far more efficient approach to application development by providing a huge range of pre-built (and tried and tested!) capabilities out of the box.  Oh, and a focus on configuration instead of development.  Not to exclude custom coding, of course, but let’s face it – who wouldn’t want to be able to publish applications on top of ArcGIS Server faster, better and with lower risk?  I for one would want to spend less money and time by simply configuring web-GIS sites using an authoring tool and a load of pre-built functions, buttons, widgets, workflows and all sorts of other exciting “geo-thingies”.

We will be attending AGI, so if you are there and want to find out more, please get in touch via geocortex@1spatial.com. We will also be running a series of webinars in the next few months, so watch out for more information!

As you might guess, this is very exciting stuff.  Go on – ask me about Silverlight!

Thursday 25 August 2011

1Spatial announces the Radius Studio Accreditation Programme


After a few months of deliberation with my other consultant colleagues here at 1Spatial, I am pleased to announce the arrival of the Radius Studio Accreditation Programme.  Click here to view the webpage, where you can find full details of the programme and how to get involved.  During its inception I have spent many hours trying to work out the best way to implement some form of official recognition in the Radius Studio arena.  I have been asking myself questions such as:-

  • What levels of accreditation should there be?
  • How should it be decided / applied?
  • Why should somebody want to have it?
  • What is in it for a participant?
  • What is in it for 1Spatial?
  • How does it help a participant?
  • How does it help the participant’s organisation?

I hope to explore and answer these questions in this blog post.

We have decided on two levels of accreditation: Practitioner and Professional. Both are free to join, and participants can be anyone from a student to a CEO. Participants will need to gain a Practitioner accreditation before they can achieve Professional accreditation. There was much deliberation on how a participant might gain accreditation: should it be by exam, or by practical examples? We decided on practical examples.

We took this route because we believe that with practical examples a participant will learn more about the data that their organisation holds and how that data fits into the organisation’s processes.  This, in turn, allows the participant to gain valuable information on the organisation’s processes, adding value for the participant and organisation alike. 

The achievement criteria for each level can be found here. By meeting the requirements for accreditation, a participant will learn about data validation and its role within business processes. This will also help an organisation to understand the levels of quality in their data and improve that quality where applicable. By taking this practical approach to accreditation we believe that all parties involved in the process will be happy. It is not just taking an exam for examination’s sake; you really need to know the organisation’s data and data quality level in order to achieve the necessary criteria.

I recently certified our latest Practitioner, Lee Wells from Staffordshire County Council. Lee has done some excellent work around address matching against the National Land and Property Gazetteer; click here to see my previous blog. We were able to build a case study around Lee’s work, which can be found here. It was this case study that enabled Lee to gain his Accreditation.

Moving forward we hope that Practitioners and Professionals alike will join forces to improve the quality of the datasets worldwide by utilising their combined experience to produce ever more refined and exhaustive rule sets that can be applied across a multitude of data sources.

Watch this space for more developments!

Friday 29 July 2011

Data Sharing: The Quality Dilemma

Posted - Matt Beare: Earlier this year the ESDIN project concluded: a collaborative project, "Underpinning the European Spatial Data Infrastructure with a best practice Network", that has occupied much of my time since 2008.

Throughout this period, the research, the meeting of people, the development of new ideas and the application of best practice have afforded me the opportunity to learn about the spatial data sharing aims of INSPIRE and the needs of the community it seeks to bring together. Importantly, it has taught me to look upon INSPIRE not as an end goal, but as a facilitator, a means to a greater good. A greater good that will be different for everyone, which means everyone needs to work out what it means to them or their business or their nation or our environment.

So at this year's INSPIRE conference I was encouraged to see many seeking to do just this, contemplating "going beyond INSPIRE" (an often-used phrase during the conference). In doing so, the conference stimulated debate around whether too much prescription will stifle innovation or whether specification and compliance are necessary to ensure data is combinable, accessible and usable.

Recent blogs, such as those from Rob Dunfey and Don Murray, have continued the dialogue and offered further observations on this matter and the need to be practical.

I can empathise with both sides of the debate, and in my own presentation on "Driving Government Efficiency with Improved Location-based Data", I condensed the data sharing challenges that INSPIRE encourages us to address into five points:
  1. Facilitate collaboration (bringing people together)
  2. Increase availability (publicising what data exists and how to get it)
  3. Ease accessibility (use the web for what it's good at, accessing information quickly)
  4. Ensure compatibility (ensuring common exchange and knowledge of what the data means)
  5. Improve quality (understanding fitness for purpose)
Ultimately I feel all are important if the full potential of data sharing is to be realised, but I also understand that there are benefits to be had in approaching the challenges in the order listed.

I think most would agree that INSPIRE is succeeding in the first of these, mobilising governments, organisations and individuals to engage with each other and reach out to the opportunity to, quite simply, achieve what they have long needed to achieve: to share data more intelligently in order to better accomplish existing and future business needs.

We now have the prospect to succeed in the second and third of these, but as one side of the debate suggested, only if we don't get too bogged down with the specifics of the fourth. That's not to say that data doesn't need to be compatible and combinable, but in the first instance just get the data out there. This in itself gave rise to an interesting discussion around being swamped in unintelligible data versus having too little intelligible information. In reality we need both, the innovators amongst us will do wonders with the former, whilst decision makers and policy makers need the latter (and more of it).

So on to the fifth challenge – quality – and is this another obstacle to data availability or is it crucial to making good decisions and achieving business objectives? Again the answer is both. The conference plenaries gave insight into the viewpoints this poses, with quotes like "undocumented data quality leads to unmeasured uncertainty" (Leen Hordijk, Joint Research Centre) and "accessibility is more important than data quality" (Ed Parsons, Google).

The concern for many is that the fear of having data categorised as "bad quality" will mean that data providers may withhold data, which runs counter to the aspirations of INSPIRE and of other Directives, like PSI, that seek the effective re-use of public sector information.

But what is quality? ISO 9000 defines it as the "Degree to which a set of inherent characteristics fulfils requirements".

So for the user scenario that no one knows about yet, there are no requirements, therefore quality has no immediate importance. But as soon as the use case becomes known then the question of "is the data fit for that purpose?" becomes prevalent.
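
To make that concrete, here is a minimal sketch (in Python, with entirely hypothetical quality measures, thresholds and use cases) of how documented quality characteristics might be checked against the requirements of a newly identified purpose:

```python
# Hypothetical quality characteristics documented for a dataset.
documented_quality = {
    "completeness_pct": 96.5,        # proportion of expected records present
    "positional_accuracy_m": 2.5,    # RMS positional error in metres
    "currency_days": 45,             # days since last update
}

# Requirements only exist once a use case is known; two illustrative use cases.
use_cases = {
    "regional_planning": {"completeness_pct": 90, "positional_accuracy_m": 10, "currency_days": 365},
    "emergency_response": {"completeness_pct": 99, "positional_accuracy_m": 1, "currency_days": 30},
}

def fitness_failures(quality, requirements):
    """Return the requirements that the documented quality fails to meet."""
    failures = []
    for measure, required in requirements.items():
        value = quality[measure]
        # Higher is better for completeness; lower is better for error and age measures.
        ok = value >= required if measure == "completeness_pct" else value <= required
        if not ok:
            failures.append(f"{measure}: documented {value}, required {required}")
    return failures

for purpose, requirements in use_cases.items():
    failures = fitness_failures(documented_quality, requirements)
    print(purpose, "->", "fit for purpose" if not failures else "; ".join(failures))
```

The point is simply that the same documented quality passes one purpose and fails another; the quality statement itself does not change, only the requirements do.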

Any application, analysis or decision using data, without knowledge of its suitability to the business purpose at hand, will carry an unknown element of risk. This risk will need to be balanced against the value of the application or the decision being made, which relates to the presentation of Roger Longhorn (Compass Informatics) on assessing the value of geo data, where he asks whether the value of data is determined by the value of the services it supports or the decisions being made. Here value and quality become inextricably linked, and one will value, and should demand, quality if the services provided or the decisions being made are of value.

So "quality is important", but it's not until you know what business purpose the data fulfils that it really becomes valuable. It is then that data providers need to be able to react to these new user needs, as fresh customer-supplier relationships are formed, and provide information and assurances on the quality of data for those known purposes.

That's why here at 1Spatial we make quality our business, providing automated data validation services and improvement processes, enabling data providers and custodians of data to rapidly respond to new and changing user requirements.

So, if the services you provide and the business decisions you make are of value, then value the data that underpins the services and informs the decisions, and demand knowledge of its quality (specific to your requirements), enabling you to trust the data, manage the risk and share with confidence.

I'd like to know what you think. How important is data quality to you? Is it seen as just a technical exercise or is your data a corporate asset that is relevant and valued by your business?

Thursday 23 June 2011

Driving Government Efficiency with Improved Location-based Data

Posted - Matt Beare: Those who have recently read John Hartshorn’s blog will know that I will be in Edinburgh at the end of the month, looking forward to participating in and learning from this year’s INSPIRE Conference. For my part, I will be presenting (Thu 30 Jun @ 16:00) on my experiences of how and where INSPIRE can help drive government efficiency through the improved use of location-based data. These efficiencies will enable public sector organisations to readily share good information, make better decisions, plan more effectively and ultimately deliver better services to us, the citizen.

I will have this in poster form too, but with a bit of a twist to your typical poster, so why not come along to the welcome drinks and poster session to find out just what that entails (Tue 28 Jun @ 18:00).
I’ll be drawing on experiences from two recent projects to illustrate my point. The first has provided me with the perfect opportunity over the past two and a half years to really understand what INSPIRE is all about and what is required to make it happen. This is the ESDIN project (European Spatial Data Infrastructure with a best practice Network), where 1Spatial worked as part of a large consortium of National Mapping and Cadastral Agencies, researchers, consultants and technology providers, smartly led by EuroGeographics.  My specific focus was to assist in activities around data quality, edge matching, schema transformation and generalization. Therefore, I was delighted to read the comments from the Commission’s final review report on ESDIN, which included:

“The project offers valuable … applications for generalization and schema transformation and, most importantly, effective data quality measures together with tools for testing the conformity to the INSPIRE technical requirements and to the ESDIN specifications. This will gradually allow NMCAs to be able to provide stable geodata and differential updates with a clear identification of the modified features.”

“ESDIN is one of the first projects that also provide appropriate testing tools that can check the conformity to the ESDIN data specifications as well as conformity to INSPIRE. These results should be forwarded to the INSPIRE community.”

The second project and the main focus of my presentation is one undertaken by 1Spatial last year with Staffordshire County Council (SCC). Seeking to create a safer environment for vulnerable adults, SCC and the local Fire Authority wanted to perform free safety inspections to assess the risk and take remedial action where necessary. But without the appropriate spatial data to support the activity the objective looked unachievable.

They turned to us to help validate and cleanse their client information database and geo-code the data to enable its use in GIS applications to assist in locating the vulnerable and planning visits. Utilising available authoritative address information, the project has proven to be a great success, minimising costs within SCC, enabling the Fire Authority to perform the inspections and, consequently, saving lives! What better incentive to effectively share good quality location-based data do you need than that – invaluable.

Thursday 26 May 2011

Welcome to fabulous Edinburgh…

Posted - John Hartshorn: It's a rare event when I'm allowed to blog (this is the first time), so I'd best make good use of the liberty I've been granted. I'm Edinburgh-bound at the end of June, specifically for the INSPIRE conference, so it seemed like a good time to squeak some thoughts and open the floodgates to the online justice dispensed by those who know more than me. Nothing like the sound of the shallow end jumping in at the deep end, eh?

I'll be honest from the start – I've slunk around in the shadows of INSPIRE so far, desperately trying to avoid admitting that I can't quite get my head around what the problem seems to be, despite everyone talking about it. It all sounds very complicated.

The problem, as I see it, is that some of you out there are obliged (or at least contribute to a data supply chain that is obliged) to take a whole load of data, check it, validate it, transform it across to a new schema and, finally, publish it in a new format. Of course, those who live and breathe INSPIRE will scream that I've simplified the problem a little. Don't be offended – isn't that really what the problem is, in layman's terms?

I thought we'd cracked it, to be honest. If you've ever had much to do with my colleagues here at 1Spatial, you'll know that they spend all of their time solving data problems - and they make it seem easy. They make sure that data can be trusted before it's used, propagated, or shared - to them, it's bread and butter stuff. It's the focus of our business and why we exist.

And that's probably why we were in a consortium project that demonstrated automated schema transformation network services to the JRC. You can read all about it here, but if you want an easier-going overview, check out the video we made.

So what treats does 1Spatial have for you at the Edinburgh conference?

My colleague Matt Beare is going to be presenting on Driving Government Efficiency with Improved Location-based Data - and that's what it's all about, right? Trusting the data which we are using to make critical decisions, and making sure that what is published and shared can be trusted, and demonstrably so. In our day-to-day jobs we are all accountable for our decisions, aren't we? If we trust our data by understanding it better, our decisions are surely better-informed and supported by evidence? This is even more the case if we can then improve data by consistently testing its conformance to strict business-rules. Better-informed decisions lead to more efficient use of resources and, ultimately, either lower our operational costs or improve the services we are providing to the public, our customers, etc.

Anyway, Matt is going to talk about a project he's just completed as part of ESDIN in which he and his team developed procedures and guidelines for data quality evaluation, edge matching across boundaries and data sets, and model generalisation. As part of this, he and the team developed a pilot service to show that automation of data quality assessment and improvement can bring real efficiencies, in this case, in the production of INSPIRE compliant data products.

So with us having successfully ticked off the two projects for JRC and ESDIN, we already have proven solutions to tackle some of the issues around INSPIRE. We can read the schema definitions, we can import data from multiple sources and formats, we can check its readiness and quality and we can then automate the whole process right up to the transformation of the data to the right schemas and the subsequent publishing of it in the right format. It's just another data problem like all the others - the commonality being in the automation and repeatable tasks of gathering up data, validating its readiness and quality, transforming it and publishing it. We already had the tools to do it - it really is just another data problem to solve. Which is what we do.
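
For illustration only, here is a minimal sketch of that kind of repeatable gather-validate-transform-publish pipeline. The function names, rules and schema mapping are all hypothetical placeholders, not 1Spatial's actual tooling:

```python
# Skeleton of a repeatable gather -> validate -> transform -> publish pipeline.
# Everything here is an illustrative placeholder, not a real implementation.

def gather(sources):
    """Read records from multiple sources/formats into one in-memory list."""
    return [record for source in sources for record in source]

def validate(records, rules):
    """Apply business rules; split records into conforming and non-conforming."""
    good, bad = [], []
    for record in records:
        (good if all(rule(record) for rule in rules) else bad).append(record)
    return good, bad

def transform(records, mapping):
    """Map each record onto the target (for example, an INSPIRE-style) schema."""
    return [{target: record.get(source) for target, source in mapping.items()}
            for record in records]

def publish(records):
    """Stand-in for writing GML or loading a download service."""
    print(f"published {len(records)} records")

# A toy run: one source, one rule (a feature must have a name), one mapping.
sources = [[{"id": 1, "name": "River Cam"}, {"id": 2, "name": ""}]]
rules = [lambda record: bool(record["name"])]
mapping = {"gml_id": "id", "geographicalName": "name"}

good, bad = validate(gather(sources), rules)
publish(transform(good, mapping))
print(f"{len(bad)} records need repair before they can be published")
```

The value is not in any one step but in being able to re-run the whole chain automatically whenever the source data or the target schema changes.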

That's why I believe we've cracked it.

Anyway, I'll be in Edinburgh for at least some of the week of the conference and I do hope you get a chance to come along, hear about Matt's work and maybe even say hello to me. In the meantime, to read more about our involvement with INSPIRE, click here.

The critical question remains, though - will they let me blog again?

Wednesday 13 April 2011

Here come the DIPs

Posted - Bob Chell: Three GeoDATA events down (London, Birmingham and Leeds) - review and retrospective time.

To begin with, I am presenting a subject that I am passionate about, but one that, to a certain degree, people have always been happy to ignore or pretend is not there - Data Quality. However, the reaction during my presentation, and more importantly afterwards through conversations with delegates, shows that in the spatial world this is right at the top of their to-do lists.

In the end, two of the most important assets of any organisation are its staff and its data. And as well as the focus on efficiency, people are being encouraged to make their data publicly available. This is a strategic goal. But with it comes a set of responsibilities, because people will be accountable for that data. To become more efficient, organisations are also looking to consolidate systems and streamline how they work.

The spatial data that I deal with joins up many different types of systems, managed by and for all sorts of personas. This means that most of my clients work in data-intensive businesses. More and more, spatial data accuracy, spatial data cleansing and spatial data quality health checks are being mentioned in publications that are not focussed on users of geospatial technology. Take a look in any 2011 issue of Government Computing and you'll see this. Businesses are drawing direct connections between poor quality spatial data and missing revenue or lost efficiency - missing or inaccurate address records resulting in lost income tax or missed New Homes Bonus payments.

As soon as people have made this connection between the data and its business value, they start to see some tangible quick wins. Typically, an organisation might have 100,000 address records in a single system. When you go through the numbers, even if that organisation can make data quality improvements of just 2 or 3 per cent, that still adds up to thousands of improved records, which you can put a financial value on. It then becomes possible to decide whether the benefit of improving the data outweighs the cost of the improvement process itself.
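
As a back-of-the-envelope illustration (all figures below are made up for the example, not taken from a real project):

```python
# Back-of-the-envelope cost/benefit check with illustrative figures only.
records = 100_000              # address records held in a single system
improvement_rate = 0.025       # 2-3 per cent of records improved; take 2.5%
value_per_record = 5.00        # assumed value (in pounds) of each corrected record
improvement_cost = 8_000.00    # assumed cost (in pounds) of the improvement exercise

improved_records = int(records * improvement_rate)   # 2,500 records
benefit = improved_records * value_per_record        # 12,500
net = benefit - improvement_cost                      # 4,500: the case stacks up

print(f"{improved_records:,} improved records, benefit {benefit:,.0f}, net {net:,.0f}")
```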

However, the data-intensive nature of the spatial world means that without some form of automation it could literally take years for businesses to get an understanding of the quality of their data. Without being lean and efficient by automating these tasks, the business case for improving the data is not always there.

I tried to make the GeoDATA events educational, and took the opportunity to talk people through getting organised and setting up a Data Quality Process. It's a data-driven process, rather than product-driven. It shares key concepts with many other Data Quality Processes, but that is because there are key things you must do to ensure everything works efficiently. We encourage people to take a positive approach to this.

The GeoDATA events continue in Dublin and Edinburgh, so now that I know people like what we are saying at 1Spatial, I need to get our marketing team to print out some more A5 flyers that have nothing more than the Data Quality Improvement Cycle on them. People can really relate to it, and it offers some genuine guidance on how they can start their own programmes of work.

Monday 4 April 2011

What’s in an Address?

Posted - Chris Wright: Recently I have been working with one of our customers, Staffordshire County Council (SCC), around address matching.  The area of addressing is one not traditionally ‘addressed’ by 1Spatial, as in some cases there is no geocode and hence the data is deemed to be non-spatial. However, our Radius Studio solution is just as proficient with non-spatial data as it is with spatial data.


SCC needed to provide accurate address locations to the emergency services, enabling them to carry out safety checks on homes within their areas in a co-ordinated manner. This was all part of their Data Quality Mission Statement. However, the problem was that no specific details about levels of data quality were available, so we helped SCC conduct an initial baseline assessment of their data.  This initial assessment confirmed that it was of varying quality - much of the information did not appear to meet any address standard format such as BS7666. The BS7666 standard enabled us to build a useful catalogue of rules for addressing this area.

After the data quality baseline assessment was completed, we helped SCC move to the next stage and supported them in putting a Data Quality Management process in place. This stage of the exercise meant we could:
  • Retain the original address data entries
  • Automate the validation and correction of as many errors in the data as possible
  • Find the exact/best match to trusted national address datasets
  • Add value to the data by adding a Spatial Reference
  • Provide an indication of ‘confidence’ level of match
The whole process validated around 77,000 addresses against 3,500,000 national address records.

Working through the exercise allowed us to realise a number of benefits for the customer:
  • Data Conformance - Using the rule-based methodology we were able to identify errors within the data including: syntax/typing errors, invalid characters, invalid postcodes, redundant records, etc.
  • Data Reconciliation - Using the action capabilities within Studio we were able to fix common problems in the data, such as replacing or removing invalid syntax (a minimal sketch of this kind of rule-based checking and matching follows below).
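
For readers curious what such rules look like in practice, here is a minimal, self-contained sketch in Python. The checks, field names and gazetteer entry are all hypothetical, and the matching is deliberately simplistic; it is not the Radius Studio rule set we built for SCC, just an illustration of the validate-normalise-match-score pattern described above:

```python
import re
from difflib import SequenceMatcher

# A BS7666-flavoured (but hypothetical) postcode format check.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}$")

def validate_address(record):
    """Return a list of rule violations for one source address record."""
    errors = []
    if not record.get("building") and not record.get("street"):
        errors.append("no building or street")
    if not POSTCODE_RE.match(record.get("postcode", "")):
        errors.append("invalid postcode format")
    return errors

def normalise(record):
    """Fix safely-correctable problems (case, repeated whitespace) before matching."""
    return {key: " ".join(str(value).upper().split()) for key, value in record.items()}

def best_match(record, gazetteer):
    """Find the closest gazetteer entry and a 0-1 confidence score."""
    key = f"{record['building']} {record['street']} {record['postcode']}"
    scored = [(SequenceMatcher(None, key, entry["label"]).ratio(), entry) for entry in gazetteer]
    confidence, entry = max(scored, key=lambda pair: pair[0])
    return entry, round(confidence, 2)

# The original record is retained untouched; matching works on a normalised copy.
source = {"building": "12a", "street": "high  street", "postcode": "st16 2lp"}
gazetteer = [{"uprn": "100031234567", "label": "12A HIGH STREET ST16 2LP",
              "easting": 392000, "northing": 323000}]

record = normalise(source)
print(validate_address(record) or "conforms")
entry, confidence = best_match(record, gazetteer)
print(f"matched UPRN {entry['uprn']} with confidence {confidence}; "
      f"spatial reference added: ({entry['easting']}, {entry['northing']})")
```

Scaled up, this same pattern is what makes it practical to score tens of thousands of records against millions of gazetteer entries automatically and attach a confidence level to each match.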
I have just started to look at the new National Address Gazetteer (NAG) which will be replacing the National Land and Property Gazetteer (NLPG) and Address Layer 2 in the fullness of time. I’ll post more on this in the near future.

Wednesday 30 March 2011

GeoDATA London

Posted - Hayley Merrill: We attended our first GeoDATA on Wednesday this week in London at the Emirates Stadium. There was a strong emphasis on helping people meet the current demands around efficiency, accountability and transparency.

Bob delivered a presentation in the afternoon, focusing on efficiency and the benefits of our Data Improvement Process, and he also gave a live demo of our current Validation Service, which certainly raised interest at our exhibition stand. Judging by the many conversations that we had, transparency and accountability are making people more aware than ever of the importance of using high quality data in their services to ensure they offer real value for money.

It was great to see a number of existing customers, and to meet plenty of new faces. If you couldn’t make it to the London event, then you can still come along and see us at the Birmingham or Leeds events taking place on 5th and 7th April. To find out more click here.

Wednesday 9 March 2011

OGC Data Quality at Bonn

Posted - Matt Beare: Last week Bonn hosted the best-attended OGC technical committee meeting to date, with 230 delegates enjoying the hospitality of the United Nations Environment Programme and the UN-SPIDER Bonn Office.

A packed agenda spanning the entire week afforded the community a platform to discuss and drive forward matters on the development of geoprocessing interoperability computing standards.

On a lighter note, we experienced the customs of Women's Carnival, held on the last Thursday before Fastnacht: the locals amongst us (and others) were in festive mood and attire, with several men falling prey to women with scissors and losing the ends of their ties. For my part, on Wednesday I chaired the Data Quality Domain Working Group (DQ DWG).

The mission of the DQ DWG is to provide a forum for describing an interoperable framework or model for OGC Quality Assurance measures and Web Services to enable access to and sharing of high quality geospatial information, improve data analysis and ultimately influence policy decisions.

Data quality is an imperative consideration for many disciplines (as typified by last week's blog), and we gained three perspectives on the role of, and need for, data quality knowledge in the earth observation community. These were followed by an academic perspective on how standards and data quality underpin a GeoInformatics curriculum. We concluded the meeting with an update on the status of the ISO 19157 draft standard for data quality, with a focus on how OGC comments had been received and acted upon by the ISO editorial committee.

May I take this opportunity to thank our five presenters (Joan Masó, Hervé Caumont, Dan Cornford, Ivana Ivánová and Marie-Lise Vautier) for the excellent content of their presentations, which together contributed to positive dialogue and to proposals for future tasks for the DQ group:

  1. Engage with the GeoViQua project to assist in the development of the proposed GeoLabel concept to support the creation, search and visualization of quality information on EO data.
  2. Based on discussions around accuracy and uncertainty, document best practice examples in the application of the guidance offered in the forthcoming ISO 19157 standard.
  3. Develop understanding and guidance on the role of data quality control in data processing tasks such as schema transformation and generalisation, seeking to ensure the preservation of data integrity as it moves through the data supply chain.

One discussion that I found particularly relevant was whether data has to be of high quality or of known quality. As part of any data improvement programme, striving to achieve the highest possible quality data for a particular purpose with the resources available will always be a key objective for data providers. However, data that is fit for one purpose may not be of use for another purpose, and vice versa. Indeed, in the age of the SDI and concepts like Linked Data, the aim is to publish data to be used for purposes that are not yet known (as highlighted by a YouTube video that one OGC participant pointed me at). So for the user community, having knowledge of the quality of the data and the types of purposes it can be used for is important.

There appears to be a growing consensus that known quality is more important than high quality. This is seen in initiatives like GEOSS and INSPIRE, where data quality is presented as being important, but equally critical is that no data should be deemed poor quality, for fear that this may act as an obstacle to an organisation’s willingness to publish and share data. Instead, users want access to as much data as possible, with documented guidance on its limitations of use and on whether an element of caution should be applied to applications or decisions based on it; this, in turn, will enable informed decision making and business planning.
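
To make "known quality" a little more tangible, here is a toy sketch of the kind of quality statement that could be published alongside a dataset. The field names are illustrative only, loosely echoing ISO 19157 quality elements rather than quoting the standard:

```python
# A toy, machine-readable quality statement shipped alongside a dataset.
# Field names are illustrative, loosely echoing ISO 19157 quality elements.
quality_statement = {
    "dataset": "example_watercourses_2011",
    "lineage": "Digitised from 1:10,000 survey sheets; last revised 2010-11",
    "completeness_omission_pct": 3.8,       # features believed to be missing
    "absolute_positional_accuracy_m": 4.0,  # RMS positional error in metres
    "logical_consistency_checked": True,
    "limitations_of_use": [
        "Not suitable for navigation or engineering design",
        "Positional accuracy adequate for regional analysis only",
    ],
}

def needs_caution(statement, required_accuracy_m):
    """Consumer-side check: does this use case demand more accuracy than is documented?"""
    return statement["absolute_positional_accuracy_m"] > required_accuracy_m

print(needs_caution(quality_statement, required_accuracy_m=1.0))   # True: apply caution
print(needs_caution(quality_statement, required_accuracy_m=10.0))  # False: good enough
```

Nothing here labels the data "good" or "bad"; it simply documents what is known and leaves the fitness-for-purpose judgement with the consumer.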

Friday 25 February 2011

Gas Explosion Pinned On Poor Data

Posted - Hayley Merrill: You may recall the news headlines from September last year, in which a Pacific Gas and Electric Co (PG&E Corp.) pipeline exploded in the San Francisco suburb of San Bruno, California, killing four people and destroying 38 homes. The hearings have just recently begun and are examining the circumstances that led to the fatal explosion. It has become immediately apparent that the company is missing critical information that keeps track of its pipelines, including location intelligence, potential failure points and maintenance records. In fact, the company has stated that it has had a real challenge converting paper records to computer files for a number of years now.

A San Francisco Chronicle article suggests that software upgrades and errors in managing the utility’s information system were contributing factors in the lack of maintenance of the pipeline. It appears this is a significant challenge not only for PG&E Corp. but also for many other utility companies worldwide, who find the management of geospatial information complex and grossly underestimate the value of the intelligence behind it. Mary Muse, a PG&E Corp. gas engineer involved in the computerisation of old paper pipeline maps into digitised maps, told a convention of mapping experts, "Validation of individual field values was not performed. This led to incorrect or inconsistent values being populated in the fields."

Whether an organisation manages the capture and validation process itself or outsources it, it is critical that good (and known) data quality underpins business planning and decision making. Surely it has never been more important to recognise what a valuable asset location data is – especially if it helps mitigate disasters like this.

Thursday 24 February 2011

Poor Data Quality a Crime

Posted - Bob Chell: As we have decided to launch a blog for 1Spatial Consultants, there is no better current example for focusing people’s minds on data than the Police Crime Maps. My honest first thoughts, when I heard the site had been down due to demand, echoed those in Adrian Short’s blog: ‘too much traffic to our website is a problem we’d all like to have’.

On the train from London to Cambridge, I then saw that we could also get hold of the raw data itself – fantastic! Then, just as we were arriving in Cambridge, I started to read numerous blog posts about issues people had with it all. These have already been written about, and I found that informative summaries from Steven Feldman and the previously mentioned post from Adrian Short covered everything.

My attention is always drawn to comments about ‘understanding the data’. One of the first articles I came across was on the Guardian website, which contained an eye-opening ranked list of streets and crimes. Interestingly, Cambridge had a hot spot, Peas Hill, in at number eight. Working and living in Cambridge means I already have context about this place - I know this area. And I also know that Jamie Oliver recently opened Jamie’s Italian right in the same place, right on the corner of this street - and it’s not a particularly long street! I’m pretty sure that Jamie Oliver was not advised to open a high profile restaurant in a crime hot spot. So I became a little more curious about this information. What rules were there behind its creation?

Having got a basic glimpse of the data, I needed to investigate further, since the raw data themselves, without context, have limited value. So Step 1 - discover and learn more about this data’s provenance and governance. Without this type of understanding, you won’t really be able to make an informed decision on whether the data is suitable for you or your business. Is it suitable for the mission that you and your company are on?

I share many people’s views that this is a huge improvement on what we had before. I’m also slightly biased, having worked with the likes of British Transport Police, and I empathise with the effort that they are putting into making continued improvements in managing the vast quantities of information that keep passing across their desks.

The data are now open. However, we are simply at the beginning. First we need a baseline to work out where we are with the quality of the data; then we can start to determine how to improve it to make it fit for purpose, and give it context. There appear to be valid thought processes or rules that justify why the information is as it is. So all this needs to be put into context, so that issues and misunderstandings around inaccuracies are not what these transparency initiatives are remembered for.

The data are available, so the government is hoping that others can easily take a different approach to using it, analysing and presenting it in novel ways. I like the ITO World updates to OpenStreetMap that Berners-Lee showed at TED2010.

There are other examples already appearing; I’m sure you can find them through the search engines. One thing is for sure, though: the now monthly snapshots of crime figures for England and Wales should plug straight into all sorts of Business Intelligence tools, and I’m looking forward to seeing how this grows. I might look at Oracle Business Intelligence Enterprise Edition (OBIEE), and others in the Consulting group will no doubt look at BI tools of their own preference. Keep an eye out for future posts.

Thursday 17 February 2011

Data Improvement Blog

Posted - Bob Chell:  I have been consulting at 1Spatial for over 10 years, and have worked with some exceptional colleagues and great clients. We have spent almost all our time solving the difficult, data-intensive problems in the spatial arena, problems that many people do not know how to solve. And along the way, collectively we have picked up a vast range of valuable skills and experiences.

Like the company we work for, despite the GI industry's changes in focus over the years (desktop, web, centralised databases, web services), we have always maintained a constant focus on your data - ensuring its quality and looking to automate as much of the data quality management and improvement process as we can.

We are way overdue in sharing some of our experiences, and this is our first foray into rectifying that; I hope you find the posts interesting.

I plan to use this blog to allow the whole team to keep people up-to-date with the projects and discoveries that we all make along the way. And as this is the first post, I thought you might like to know about the types of things we are currently working on.

Matt Beare co-chairs the OGC Data Quality Domain Working Group and is in the closing stages of the ESDIN project, where we have been contributing to a number of successful work packages. Although he is based in our Cambridge office, this has kept him busy right across Europe.

Tom Spencer supports PSMA on site down in Canberra, Australia. He also has a trip out to Land Information New Zealand (LINZ) in the near future to get them up and running with Radius Studio, as well as supporting our Managing Director for Asia Pacific, Guy Perkins.

Chris Wright is also in our Consulting team and brings over 20 years of experience. He recently helped establish our Australian office (now operated by Tom), originally helping MidCoast Water use Radius Studio to produce geo-schematic data from MidCoast's TopoBase database and to engineer data quality business rules during the migration to TopoBase from Munsys GIS.

From Belgium, Luc Van Linden divides his time between looking after our ventures with INSPIRE and helping Tom with some project work for Queensland Transport and Main Roads (QTMR). He has also recently found some time to look at the Deegree OSGeo project, and he keeps his number 3 status on the Oracle Technology Forum too.

I'm currently working with English Heritage to assist them in future strategic guidance around interoperability. I’m also working on data migration and quality improvement programmes at the Environment Agency. Outside this I have also started to implement solutions using GeoServer and explore OBIEE 11g.

As you can see, it is a talented team that covers a range of tools and technologies. So if you want to find out any more, get in touch. You can email us at: consulting@1spatial.com