The Tandem Consortium
Summary of Work Packages 2-3: A Theoretical Assessment and A Practical Assessment
Marja Tammilehto-Luode, Statistics Finland
Philippe Guiblin, Office for National Statistics, United Kingdom
GISCO Working Party 25 October 2001

Plan for the presentation
-Point of departure
-The two parallel approaches
-Methods
-Tests with empirical data
-Results

Point of departure: A need for more comparable territorial divisions for statistics
Why?
-to visualise data more effectively
-to combine and compare data on different spatial units
-to combine or compare data on different spatial scales
-to make better statistical/spatial analysis of data (to test spatial patterns and trends)

[Maps: Finland, 20 NUTS3 areas; Belgium, 45 NUTS3 areas]

Objectives
-to find alternative comparable building blocks/territorial divisions for a European system of small area statistics
-to summarise studies of theoretical assessments of systems of statistics by regular and irregular tessellation
-to present a practical pilot study
  -to produce grids and blobs
  -to test their capabilities
  -to test both systems with the same user case

The study of two parallel approaches
Regular tessellation approach (Statistics Finland's responsibility)
-study of methods to construct grid-based statistics
-tests of candidate methods with empirical data
-construction of a prototype
-tests on a prototype with a user case
Irregular tessellation approach (responsibility of the Office for National Statistics, UK)
-study of methods of zoning design
-tests with ZDES (Zone Design System) by Leeds University with empirical data
-tests on a prototype with a user case

From diversified input areas to harmonised building blocks and comparable output areas
[Diagram: input areas, building blocks and output areas under regular and irregular tessellation]

Towards a system of grid-based statistics
Methods
-Points to grid cells
  - need for best practices
-Polygons to grid cells
  - different methods give different results
  - need for standardisation
Size of a grid cell depends on
-quality of data
-confidential grid cells
-scale of the study
-compatibility of different kinds of source data
-disk space, processing speed
Map projection
-rectangular
-common geo-reference system
  - with the same origin

Towards a system of polygon-based statistics
Optimum grouping of the source areas
-equal values of a selected variable (population)
-similar degree of heterogeneity (homogeneity)
-weighted accessibility on a certain input variable (shape)
The automated zone design program (Openshaw 1977)
-optimising an objective function
-optimising under constraints
Design functions
-equal population zoning
-shape design function
-homogeneity design function
-correlation, distance and spatial autocorrelation functions

Test areas and data sets
The Finnish data sets cover the Helsinki region
-56 municipalities, 168 postal code areas, 168 grid squares (10 km x 10 km) and 13 003 grid squares (1 km x 1 km)
-1998 census population counts for each area level
The British data set covers Wales and a subset, the county of South Glamorgan
-6376 EDs for Wales and 817 EDs for South Glamorgan
-1991 census population counts by ED
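
The equal population and shape design functions listed for polygon-based statistics can be illustrated with a minimal sketch. The function names and the example numbers are hypothetical and are not taken from ZDES or AZM; the shape measure is the perimeter squared/area ratio mentioned for the tests, and the squared-deviation form of the population score is an assumption.

```python
import math


def equal_population_score(zone_pops, target):
    """Sum of squared deviations of zone populations from a target.

    Assumed form of an equal population design function: 0 means every
    zone hits the target exactly, larger values mean a worse zoning.
    """
    return sum((p - target) ** 2 for p in zone_pops)


def shape_score(perimeter, area):
    """Perimeter squared divided by area; smaller means more compact."""
    return perimeter ** 2 / area


# Hypothetical zones, for illustration only.
balanced = equal_population_score([500, 500], target=500)
unbalanced = equal_population_score([400, 600], target=500)

# Three shapes: a unit square, a long thin strip and a unit circle.
square = shape_score(perimeter=4.0, area=1.0)
strip = shape_score(perimeter=2 * (10 + 0.1), area=10 * 0.1)
circle = shape_score(perimeter=2 * math.pi, area=math.pi)

print(balanced, unbalanced)
print(square, strip, circle)
```

The circle gives the smallest possible ratio (4π), so a zone design program that minimises this score is pushed towards compact zones and away from thin strips.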

Tests: Data from two countries, different types of geo-references, different types of variables
Grids
-tests of candidate algorithms to convert polygon-based data to grids
-construction of a prototype
-visualisation of results
-delineation of urban areas
Blobs (polygons)
-tests of the zone design function with constraints of equal population and a single weighted shape
-construction of prototypes
-visualisation of results
-delineation of urban areas

Building up a system of grids - Results
From points to grids
-Description of best practices
From polygons to grids
-The Regiongrid algorithm maintains the original data structure
  - similar statistics to real grids
-Larger grids keep the structure better than smaller ones
-The Polygrid and Pointgrid algorithms overestimate data in small grids and underestimate data in larger grids
Delineation of urban areas
-The urban area defined by 1 km x 1 km grids is 45% smaller than that defined using data on NUTS5 areas in Finland
-The urban area of the UK test area, South Glamorgan, is slightly larger when defined by 1 km x 1 km building blocks than by NUTS5 areas
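
The step from points to grid cells can be sketched as follows. This is a minimal illustration, not the best practice described in the project reports: it assumes projected coordinates in metres, square cells and a shared origin, and the function names and sample points are hypothetical.

```python
from collections import defaultdict


def points_to_grid(points, cell_size, origin=(0.0, 0.0)):
    """Sum point values into square cells keyed by (column, row).

    points: iterable of (x, y, value) in the same projected system.
    cell_size: cell edge length in the same unit as x and y.
    origin: common grid origin, so that all data sets share cell edges.
    """
    ox, oy = origin
    cells = defaultdict(float)
    for x, y, value in points:
        col = int((x - ox) // cell_size)
        row = int((y - oy) // cell_size)
        cells[(col, row)] += value
    return dict(cells)


def coarsen(cells, factor):
    """Aggregate cells into blocks of factor x factor cells."""
    coarse = defaultdict(float)
    for (col, row), value in cells.items():
        coarse[(col // factor, row // factor)] += value
    return dict(coarse)


# Hypothetical points: (x in m, y in m, persons).
points = [
    (500.0, 500.0, 3),
    (900.0, 100.0, 2),
    (1500.0, 500.0, 4),
    (12000.0, 3000.0, 1),
]
one_km = points_to_grid(points, cell_size=1000.0)
ten_km = coarsen(one_km, factor=10)
print(one_km)
print(ten_km)
```

Because both grids share one origin, the 10 km cells nest exactly over the 1 km cells, so coarsening is a plain sum. Going the other way, from coarse to fine, needs an estimation method.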

Table 2: The Finnish test data
Comparison of real grid squares with estimated grid squares
Real grid squares = data aggregated from point-based data, accurate geo-references
Estimated grid squares = data converted from polygon-based data by different methods

Population density 1998, 10 km x 10 km
                              count    mean      max   min     variance
Real grid squares               168     918   29 045     0    9 357 428
Estimated grid squares
  Polygrid                      168      85    2 287     2       65 562
  Pointgrid                     128      31      208     0        1 787
  Regiongrid (intensive)        168     232   12 987     3    1 227 872

Population density, 1 km x 1 km
                              count    mean      max   min     variance
Real grid squares            12 988     109   19 478     0      329 220
Estimated grid squares
  Polygrid                   12 968     150   31 363     0      735 312
  Pointgrid                     429   3 521   24 334     0   17 288 687
  Regiongrid (intensive)     12 969     151   31 319     0      709 963
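
The table reports count, mean, max, min and variance for each set of grid squares. A minimal sketch of how one such row could be computed is given here. The function name and the toy values are hypothetical, and the use of the population variance is an assumption, since the source does not say which variance was used.

```python
import statistics


def summarise(values):
    """Return the five summary statistics reported per grid set."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "min": min(values),
        # Assumption: population variance (divide by n, not n - 1).
        "variance": statistics.pvariance(values),
    }


# Toy densities for four cells, not taken from the Finnish data.
real_cells = [0, 10, 20, 30]
estimated_cells = [5, 10, 20, 25]

real_row = summarise(real_cells)
estimated_row = summarise(estimated_cells)
print(real_row)
print(estimated_row)
```

In this toy case both rows have the same mean but the estimated cells have a smaller variance, which is the kind of smoothing that a comparison of real and estimated grid squares is meant to reveal.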

Building up a system of blobs
Software
-Visual Basic program, AZM, developed by David Martin (Southampton University)
Constraints
-only equal population zoning and a shape constraint (perimeter squared/area) were used
-no other source of heterogeneity, such as geography and/or social class, was considered
Delineation of urban areas
-Production of an optimised boundary system

AZP Case Study: Cardiff Area
Input areas: 818 EDs (min. pop.: 64, max. pop.: 1030)
Automated Output Area Design (AZP)
Design constraints
-Thresholds: target pop.: 500, min. pop.: 1500
-Homogeneity: off
-Shape: on
Building blocks (min. pop.: 500, max. pop.: 1909)
Delineation of urban / rural areas
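
The idea of moving input areas between zones until a population target is met can be sketched with a simple local search. This is an illustration in the spirit of automated zone design, not the AZP or AZM algorithm itself: the acceptance rule, the contiguity check and all names and numbers are assumptions, and the shape and homogeneity terms are left out.

```python
from collections import deque


def objective(assign, pop, target):
    """Sum of squared deviations of zone populations from the target."""
    totals = {}
    for area, zone in assign.items():
        totals[zone] = totals.get(zone, 0) + pop[area]
    return sum((p - target) ** 2 for p in totals.values())


def is_contiguous(members, adj):
    """True if the areas form one connected, non-empty group."""
    members = set(members)
    if not members:
        return False
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        for nb in adj[area]:
            if nb in members and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == members


def improve(assign, pop, adj, target):
    """Move border areas to a neighbouring zone while the score drops."""
    assign = dict(assign)  # leave the caller's zoning untouched
    improved = True
    while improved:
        improved = False
        for area in sorted(assign):
            src = assign[area]
            rest = [a for a, z in assign.items() if z == src and a != area]
            if not is_contiguous(rest, adj):
                continue  # the move would split or empty the donor zone
            for dst in sorted({assign[nb] for nb in adj[area]} - {src}):
                before = objective(assign, pop, target)
                assign[area] = dst
                if objective(assign, pop, target) < before:
                    improved = True
                    break
                assign[area] = src  # undo a move that does not help
    return assign


# Hypothetical chain of six input areas, 0-1-2-3-4-5.
pop = {0: 400, 1: 100, 2: 100, 3: 100, 4: 100, 5: 400}
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
start = {0: "A", 1: "B", 2: "B", 3: "B", 4: "B", 5: "B"}
result = improve(start, pop, adj, target=600)
print(result, objective(result, pop, 600))
```

Every accepted move lowers the score, so the search always stops, but only at a local optimum. Which optimum it reaches depends on the starting zoning and on the order in which areas are visited.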

[Map: AZP on South Glamorgan Enumeration Districts, input areas]
[Map: AZP on South Glamorgan Enumeration Districts, building blocks]
[Map: AZP on South Glamorgan Enumeration Districts, urban / rural delimitation]
[Map: AZP on Helsinki post-codes, urban / rural delimitation]

Results and deliverables
-Reports on the theoretical and technical assessment of two alternative building blocks for a system of geo-statistics for Europe
-Prototypes of grid-based (regular) and polygon-based (irregular) statistical systems
-Results of the delineation of urban areas with the building blocks
-Recommendations

Conclusions - grid-based statistics
A lot of advantages - a relevant alternative
The critical points
-characteristics of input data
-what is an optimum/minimum size of the grid
-what is the projection system
Harmonised methods for conversion to grids are needed
The resultant grid size should be the same as or coarser than that of the input data
A common geo-referenced system is needed
-UTM is an applicable projection system
Confidentiality problems
-special disclosure control methods are needed

Conclusions - polygon-based statistics
Advantages of the zoning design approach
-Software is available and technically feasible for the whole of Europe
-Theoretically it can be used for any level of geography
Some problems arose during the tests with large numbers of input areas
-the more areas, the greater the number of possible combinations and thus the more time needed for processing
Need to face the definition of optimality (optimisation of an objective function)
-Using multiple objective functions makes it more difficult to find a good optimum
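
The growth in the number of possible combinations can be made concrete with a small calculation. The sketch below counts the ways to split n areas into k non-empty zones when contiguity is ignored (Stirling numbers of the second kind). This is an upper bound used for illustration, not a figure from the tests, since contiguity constraints rule out many of these groupings.

```python
def stirling2(n, k):
    """Number of ways to partition n labelled areas into k non-empty zones.

    Uses the recurrence S(i, j) = j * S(i - 1, j) + S(i - 1, j - 1),
    evaluated row by row to avoid deep recursion.
    """
    row = [1] + [0] * k  # row for i = 0: S(0, 0) = 1
    for i in range(1, n + 1):
        new = [0] * (k + 1)
        for j in range(1, min(i, k) + 1):
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[k]


# Doubling the number of areas multiplies the count many times over.
for n in (10, 20, 40):
    print(n, stirling2(n, 3))
```

Even for three zones the count grows roughly like 3^n / 6, which is why exhaustive search is out of reach for hundreds of input areas and heuristic optimisation is used instead.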

Conclusions - both approaches
Both methods propose their own way to harmonise data production
To perform well, both approaches
-need the provision of a good initial set of areal units
-need to incorporate confidentiality restrictions (for the dissemination of statistics)
-face the definition of optimality (e.g. optimal grid size, optimisation of an objective function)
-need to deal with limitations due to processing speed and disk space
-need a good GIS software environment

Conclusions - future studies
Tests with more/different data sets
-different variables
-different kinds of geo-references
-different scales of analysis
Development of methods
-polygons to grids
-AZM
Case studies
-to compare results with those obtained for administrative areas
-to make analyses that are not possible with administrative areas