As I mentioned in my previous post, chugging through big data can be rather difficult.  So, finding ways to replicate some sort of parallelization is advantageous.  There were some ways to fake PostGIS out, as I previously showed.  This summer I will be working with students to use Hadoop to perform real parallel processing with massive numbers of threads.  However, I have to be honest with you, I am dreading the effort it will require to stand up a Hadoop cluster and then try to manipulate things to work with GIS.  

This week I created a large dataset of 1,352,401 Voronoi areas and 2,704,761 TIN areas.  To do this, I generated 1,352,401 random points and then created Voronoi and TIN areas from those points.  After that, I ran equivalent intersection functions in ArcGIS 10.1, ArcGIS Pro 1.0, Manifold 8, Manifold 9, and PostGRES.  In each case, I made sure to create a spatial index on the data.
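As a quick sanity check on those counts: a triangulation of n points in general position, with h of them on the convex hull, has 2n - 2 - h triangles, so the TIN should hold roughly twice as many areas as there are points. The sketch below applies that identity to the two counts above. The hull size it prints is only what the counts imply, not a measured value, and the function names are hypothetical.

```python
def triangle_count(n_points: int, n_hull: int) -> int:
    """Triangles in a triangulation of n_points with n_hull points on the convex hull."""
    return 2 * n_points - 2 - n_hull


def implied_hull_points(n_points: int, n_triangles: int) -> int:
    """Invert the identity to get the hull size implied by the two counts."""
    return 2 * n_points - 2 - n_triangles


n_points = 1_352_401     # random points (one Voronoi area each)
n_triangles = 2_704_761  # TIN areas reported above

hull = implied_hull_points(n_points, n_triangles)
print(hull)                                            # 39
print(triangle_count(n_points, hull) == n_triangles)   # True
```

A few dozen hull points is a plausible figure for random points scattered over a rectangle, so the two counts are consistent with each other.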

An example of a zoomed in version of TIN and Voronoi polygons looks like this:

Hardware

All testing was performed using a 64-bit Dell Precision T1700 workstation with an i7-4770 CPU running at 3.4 GHz and 16GB of RAM.  The i7 has 4 physical cores, and presents 8 logical cores via hyperthreading.

Software Results

I ran a spatial intersection routine in PostGRES with PostGIS, ArcGIS 10.1, ArcGIS Pro, Manifold 8, and Manifold 9.  The PostGRES and Manifold implementations used SQL, while the ArcGIS implementations used both the GUI and Arcpy.

PostGRES

The intersection function in PostGRES and PostGIS was written as a straight-up SQL query:

SELECT st_intersection(vor.geom,tin.geom) AS g

INTO vortin

FROM tin

RIGHT JOIN vor ON

st_touches(vor.geom,tin.geom)

As my previous post indicated, PostGRES is not a multi-threaded application, so it was somewhat bogged down.  Inserting all the records into the new table took 3502 seconds (about 58 minutes).

In an attempt to improve the solution, I created four separate queries that inserted the data into a single table, for specific “chunks” of the data - like my previous post, I broke the data into 4 separate pieces as: 

INSERT INTO vortin (g)

SELECT st_intersection(vor.geom,tin.geom)

FROM tin

RIGHT JOIN vor ON

st_touches(vor.geom,tin.geom)

AND

 vor.gid BETWEEN 1400000 AND 2100000

(note the BETWEEN clause - this was to obtain chunks of 700,000 records at a time, and run each chunk in its own separate thread).

Inserting all the records into the new table took 1201 seconds (about 20 minutes).  So, by running parts of the query in separate threads and inserting the results into a single table, we improved our speed roughly 3x (3502 seconds down to 1201).
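The chunking idea can be scripted rather than typed out by hand. The following is a minimal sketch, not the exact setup used for the timing above: it splits an id range into contiguous pieces, builds one INSERT statement per piece, and hands each statement to its own worker thread. The id bounds (1 to 2,800,000, i.e. four chunks of 700,000) are an assumption, and `execute` is a hypothetical callable; in real use it would open its own database connection and run the statement. Because BETWEEN is inclusive at both ends, the ranges are built so that neighbouring chunks do not share an endpoint.

```python
from concurrent.futures import ThreadPoolExecutor

# Same statement as above, with the id range left as placeholders.
SQL_TEMPLATE = """INSERT INTO vortin (g)
SELECT st_intersection(vor.geom, tin.geom)
FROM tin
RIGHT JOIN vor ON st_touches(vor.geom, tin.geom)
AND vor.gid BETWEEN {lo} AND {hi}"""


def chunk_ranges(min_id, max_id, n_chunks):
    """Split the inclusive range [min_id, max_id] into n_chunks contiguous,
    non-overlapping inclusive ranges of near-equal size."""
    total = max_id - min_id + 1
    if n_chunks < 1 or n_chunks > total:
        raise ValueError("n_chunks must be between 1 and the number of ids")
    size, extra = divmod(total, n_chunks)
    ranges, lo = [], min_id
    for i in range(n_chunks):
        hi = lo + size - 1 + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def build_queries(min_id, max_id, n_chunks):
    """One INSERT statement per chunk."""
    return [SQL_TEMPLATE.format(lo=lo, hi=hi)
            for lo, hi in chunk_ranges(min_id, max_id, n_chunks)]


def run_parallel(queries, execute):
    """Run every query through `execute`, one worker thread per query."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(execute, queries))


def fake_execute(sql):
    """Stand-in for a real database call: report the id range covered."""
    return sql.splitlines()[-1].strip()


for line in run_parallel(build_queries(1, 2_800_000, 4), fake_execute):
    print(line)
```

Swapping `fake_execute` for a function that actually talks to the database is the only change needed to run the four inserts concurrently.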

ArcGIS 10.1 

For this task I imported all the data into a geodatabase and ran the analysis using the Intersection tool - the intersection process took 1,624 seconds (about 27 minutes).  

Just for fun, I ran the same intersection using Arcpy:

arcpy.Intersect_analysis(["tin", "vor"],"tinvor","ALL",0.001,"INPUT")

The script took 1,529 seconds (about 25 minutes).

So, as expected, the GUI and Arcpy versions perform nearly identically.
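For anyone who wants to reproduce the Arcpy timing, one simple approach is to wrap the call in a wall-clock timer. The sketch below is an assumption about how that could be done, not a record of how the numbers above were captured; `timed` and `describe` are hypothetical helper names, and the Arcpy call is left as a comment because it needs an ArcGIS installation.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def describe(seconds):
    """Format a duration in the style used in this post."""
    return f"{seconds:.0f} seconds (about {round(seconds / 60)} minutes)"


# With ArcGIS installed, the call from the text could be timed like this:
#   import arcpy
#   _, secs = timed(arcpy.Intersect_analysis,
#                   ["tin", "vor"], "tinvor", "ALL", 0.001, "INPUT")
#   print(describe(secs))

# Stand-in workload so the sketch runs anywhere.
total, secs = timed(sum, range(1_000_000))
print(total, f"{secs:.4f} s")
print(describe(1529))
```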

ArcGIS Pro 

ArcGIS Pro 1.0.2 utilizes true 64-bit computing for better performance.  I loaded the data into a new geodatabase and ran the intersection command.  The intersection process took 1607 seconds (about 27 minutes).  This was virtually identical to ArcGIS 10.1.  I also ran the same Arcpy script, and the process took 1722 seconds (about 29 minutes).

A check of the Windows Task Manager showed that only one thread was actually firing with any significant CPU usage.  

As the PostGRES test showed, there are certainly performance benefits when spreading the process over multiple threads.  I expect that ESRI will introduce true parallelization into ArcGIS Pro to achieve better performance. I think their move to 64-bit, and the really modern interface for ArcGIS Pro is really encouraging - I can’t wait to see what comes next.

Manifold 8 

I decided to test the same approach using the 64-bit version of Manifold GIS 8.0.  The SQL code to perform this task was straightforward, and similar to other SQL code I’ve written:

SELECT ClipIntersect([tin].[Geom (I)],[vor].[Geom (I)])

INTO bobo

FROM tin

RIGHT JOIN vor ON

touches(tin.[Geom (I)],vor.[Geom (I)])

A less elegant way, but actually just as fast (because Manifold is doing some optimizations in the background), is:

SELECT ClipIntersect([tin].[Geom (I)],[vor].[Geom (I)])

INTO bobo

FROM tin , vor

WHERE touches(tin.[Geom (I)],vor.[Geom (I)])

The process ran in 1620 seconds (about 27 minutes).  I also used the GUI to run the intersection, and that ran in 1072 seconds (about 18 minutes).

Manifold 9 (1 THREAD)

Recently I began working with some experimental software from Manifold Software, Ltd.  While many are looking forward to a Manifold 9 offering, I am uncertain as to what the next product will be called, or when it will be released.  The software technology that I am experimenting with is part of Manifold's Radian engine.  Implemented in SQL, Radian utilizes parallel processing both on the video card and over multiple CPUs.

The first test I performed was to intersect the layers through the following SQL code:

SELECT GeomClip([tin Table].[Geom (I)], p.[Geom (I)], true, 0.001)

INTO bobo

FROM [tin Table], [vor Table] AS P

WHERE GeomTouches([tin Table].[Geom (I)], p.[Geom (I)], 0.001)

The spatial intersection completed in 570 seconds (about 9 minutes and 30 seconds).  This was nearly three times faster than the Manifold 8 SQL process (1620 seconds) - but this process does not take advantage of multiple CPUs on the computer.  The Radian engine allows users to explicitly assign multiple CPUs to a process.  So, I ran the following query, assigning the process to two threads - which only required a single directive:

Manifold 9 (2 THREADS)

SELECT GeomClip([tin Table].[Geom (I)], p.[Geom (I)], true, 0.001)

INTO bobo

FROM [tin Table], [vor Table] AS P

WHERE GeomTouches([tin Table].[Geom (I)], p.[Geom (I)], 0.001)

THREADS 2

BATCH 64

A check of the Windows Task Manager showed that the two CPUs are saturated (see the first and last CPU):

The intersection ran in 316 seconds (a little over 5 minutes).  This is about 5 times faster than the Manifold 8 process.

Manifold 9 (4 THREADS)

 

I raised the THREAD count to 4, so that the process would run over 4 separate threads using the following query:

SELECT GeomClip([tin Table].[Geom (I)], p.[Geom (I)], true, 0.001)

INTO bobo

FROM [tin Table], [vor Table] AS P

WHERE GeomTouches([tin Table].[Geom (I)], p.[Geom (I)], 0.001)

THREADS 4

BATCH 64

One can see from the Windows Task Manager that four threads are almost completely saturated:

The process completed in 235 seconds (just under 4 minutes), which was almost 7 times faster than the Manifold 8 process, and about 5 times faster than my modified PostGRES process (1201 seconds).  However, unlike the PostGRES process I created, I did not have to run multiple instances of SQL code - I only needed to add an additional line to indicate I wanted to use 4 threads.

Finally, I decided to run the process with 5, 6 and 8 threads.  However, I found that the results did not change very much, and in fact, 8 threads took slightly longer (238 seconds) than 4 threads.  Nonetheless, you can see that all 8 logical cores were mostly saturated.

I created a graph of the processing times using the Radian software, and you can see a definite improvement as more threads are added.  
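The same picture can be put into numbers. The short sketch below takes the four Radian timings reported above (the 5 and 6 thread timings are not listed, so they are left out) and computes the speedup relative to one thread, along with the parallel efficiency (speedup divided by thread count). The function name is hypothetical.

```python
# threads -> seconds, taken from the Radian runs described above
timings = {1: 570, 2: 316, 4: 235, 8: 238}


def speedup_table(timings):
    """Return (threads, seconds, speedup, efficiency) rows, relative to 1 thread."""
    base = timings[1]
    rows = []
    for threads in sorted(timings):
        speedup = base / timings[threads]
        efficiency = speedup / threads
        rows.append((threads, timings[threads],
                     round(speedup, 2), round(efficiency, 2)))
    return rows


for threads, seconds, speedup, efficiency in speedup_table(timings):
    print(f"{threads} threads: {seconds} s, "
          f"speedup {speedup}x, efficiency {efficiency}")
# 1 threads: 570 s, speedup 1.0x, efficiency 1.0
# 2 threads: 316 s, speedup 1.8x, efficiency 0.9
# 4 threads: 235 s, speedup 2.43x, efficiency 0.61
# 8 threads: 238 s, speedup 2.39x, efficiency 0.3
```

Two threads are used almost perfectly, four still help, and going from four to eight adds nothing on this 4-core machine.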

However, it raises the question: why did we not see any improvement with 8 threads?  Well, I am part of a private beta group that is testing the Radian engine.  One of the astute members of the group (Tim Baigent - tjhb for you georeference.org forum participants) offered this suggestion, which I edited for grammatical clarity:

Regarding the different speeds when using 4 versus 6 or 8 threads, I assume you're running this on a CPU with 8 logical cores with hyperthreading.  That means you have 4 physical cores and 4 virtual cores.

As I understand it, hyperthreading is designed to make up for latencies in a parallel workflow, mainly for multitasking but also for a single task having multiple threads.  

Where a parallel workflow is highly optimized, to the extent that it can supply data and instructions as fast as the CPU can schedule them, then hyperthreading does not offer an extra advantage, or very little, and the slight latency due to switching between "hyper" threads becomes a net cost.

I think that's what's happening.

In other words: the Radian engine is so well written that it can fully saturate all physical cores. Therefore, spreading the load over the additional virtual cores only adds overhead.

Just for fun, I decided to write another, more complex piece of code.  In this case, I wanted to buffer the Voronoi polygons by 50 feet, and then intersect them with the TIN triangles.  This is standard SQL stuff that you could easily pick up from my previous posts about spatial SQL.  The code is:

SELECT GeomClip([tin Table].[Geom (I)], p, true, 0.1)

INTO bobo

FROM [tin Table],

     (SELECT geomBuffer([vor Table].[Geom (i)],50,0.1) AS p

      FROM [vor Table])

WHERE GeomTouches([tin Table].[Geom (I)], p, 0.1)

THREADS 4

BATCH 64

The process took 3106 seconds (about 52 minutes).  The point here is that you can simply introduce a second spatial process into the query with little effort, and Radian will do all the parallel processing in the background.

 

Conclusion

This summer I will be working with students on parallel processing with Hadoop - and more specifically spatialHadoop.  This has some really cool potential for really insane, massive data processing - something no GIS could likely approach. However, I am not looking forward to the effort it will take to set up Hadoop or even start writing some code - it just looks like a big scary monster to me.  Also, there are still fewer than a half-dozen true spatial analytical functions in spatialHadoop, so it is quite limited.  The modifications I made to PostGRES give me hope that there are simpler methods for large data processing by using the robust functionality of the PostGIS SQL engine.

The initial results with spatial SQL within the Radian engine are very encouraging news for those who have been waiting to see what the people at Manifold Software, Ltd. will come up with next.  And while going from 27 minutes to 4 minutes is impressive, I want to see even better results.  So, I am going to try some other approaches over the next few days.  But, at the same time, I am going to try to push Manifold Systems to see if they can shave off a few more seconds here or there!

Another important thing to realize in the Radian engine is that we have true parallel processing.  In the case of PostGRES, I was breaking the problem up into smaller chunks and running it as a multi-threaded group of commands.  That, while clever, is not true parallelization.  In the case of the experimental Radian software, there is real parallelization going on - that is, the actual intersection algorithm is being spread over multiple cores.  And, the best part is that I don’t have to even think about it - if I can write SQL code, the Radian engine takes care of all the other intricacies.

The point is, one can have it both ways: the performance of hand-coded parallelism with the convenience of the SQL one already knows.  In this case the engine is doing the parallelism and optimization for you.  Therefore, one can have both speed and convenience with SQL - what I have called A Language for Geographers.

I have a little time off until the summer session starts here at the University - when I start back up next week I will begin to test out other functionality in Radian and let you know what I discover.  Also, expect to see some spatialHadoop results in July.

In the meantime, please consider joining me at the University Consortium GIS Symposium in Alexandria, Virginia on Wednesday, May 27, where we will discuss more of this during the seminar that I am presenting on Spatial SQL.