I'm going to be a little cautious about calling this a bug after my previous spatial embarrassment, but I think something rather odd is happening here. It's a bit complicated but I'll try to explain in stages. I am trying to run queries that identify which of various person records fall inside certain UK administrative areas. The link is based on the persons' postcodes. UK postcodes identify groups of usually 20-50 properties and the Ordnance Survey [OS] provide a OSGB grid reference for the centroid of each postcode area. The OS also provide polygons with the outlines of the administrative areas - in this case counties. I have the person records in a table called Person, there are some 23k records in the table. The table includes various information including postcode; about half the records have a postcode. I have another table, PostcodeLocation, with the centroid of each postcode as an ST_Point. The table contains records for all 1.4m postcode in the UK. Most of the postcodes in the Person table have a matching value in PostcodeLocation but some will be out-of-date or incorrect. A third table, Boundary-line-historic-counties_region, contains the (multi)polygons that represent the county boundaries. When I run a query like: select p.personid, town from person p join postcodelocation l on p.postcode = l.postcode where l.IQXPos.ST_Within((select Geometry from "Boundary-line-historic-counties_region" where "name"='Angus')) =1 I get the expected result set: personid,town 'TI04201631032011000A','Arbroath' 'TI0SJJAN231020060005', 'TI161223240520110F97','Arbroath' 'TI215211160720026362','DUNDEE' 'TI322616240820110001','Arbroath' 'TI531716310320110001','Arbroath' 'TI32536','DUNDEE' 'TI35398','BY FORFAR' ie 8 Person records have postcodes that are in Angus. However, as soon as I add any additional restraints to the WHERE clause eg: select p.personid, town from person p join postcodelocation l on p.postcode = l.postcode where l.IQXPos.ST_Within((select Geometry from "Boundary-line-historic-counties_region" where "name"='Angus')) =1 and p.town like 'D%'; rather than getting a sub-set of my previous results, I get a completely different result: personid,town 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' 'TI1238192502201117E7','Deeside' ....... This continues for hundreds of thousands of rows. Things to note:
I've tried reproducing the problem with a much smaller Person table but I haven't yet been able to recreate the issue, so it seems that something about the Person table is involved in creating the problem. This is hard to isolate as it has many foreign keys and is referenced by many other tables. Am I missing something very obvious here? A full validation on the database showed no issues. An unload & reload did not make any difference. Version 16.0.0.2213 Update: I thought I would take a look at the execution plans. Interestingly the working query plan was returned in a few seconds but it took over 20 minutes for the problem query to return a plan to the plan viewer. The main difference is that the problem one uses Spatial Index Probes which the other doesn't. I'd upload the plans if I could work out how to :) - I've seen them attached to other questions! Further Update: Graphical Plans for both good and bad queries attached using the Volker-Dmitri bodge. http://sqlanywhere-forum.sap.com/upfiles/WithoutExtraCondition(CorrectResultSet).saplan.jpg To retrieve plans, right click on the hyper links above and choose "Save link as.." or equivalent depending on browser, then remove jpg extension from resulting downloaded file. And another update: I've been able to work out what determines whether I get the right result set or the wrong one. It depends on whether or not the engine uses the index on person.postcode (if it does it gets the right answer). I created a new table: CREATE TABLE "pears"."Person3" ( "personid" CHAR(20) NOT NULL, "town" CHAR(30) NULL, "postcode" CHAR(30) NULL, PRIMARY KEY ( "personid" ASC ) ) IN "system";and populated it with a select statement from the original Person table. When I ran the query select p.personid from person3 p inner join postcodelocation l on p.postcode = l.postcode where l.IQXPos.ST_Within((select Geometry from "Boundary-line-historic-counties_region" where "name"='Angus')) =1;I got the wrong result set. Adding an index on person.postcode and running the query again gave the correct results. To check, I dropped the index, re-ran the query and we went back to the wrong results. Reported to SAP as incident ID:002075129400004746052016 |
Fixed by Engineering Case 805460 SAP SQL Anywhere Version 16.0.0 build 2428 SAP SQL Anywhere Version 17.0 build 2816 The problem only happens when a multi-column index that includes a geometry column is used in the plan. |
This is a WAG... That whole business of using a "correlated parenthesized ( subquery ) as a function parameter in a WHERE predicate" may be OK because it appears in a few examples in the Help, but... ...those examples use the "comma join" operator in the outer SELECT, not an INNER JOIN with ON clause. In other words, it's a kind of undocumented LATERAL join using the WHERE clause. In practice, parenthesized ( subqueries ) are very dangerous things. In theory they are allowed as function parameters but not as procedure CALL parameters (go figure!). Personally, I limit parenthesized ( subquery ) usage to a few well-proven uses; e.g., SELECT lists (rare) and IF EXISTS statements (frequent). Perhaps you can move the ST_Within predicate down into a derived table based on postcodelocation... or into a completely separate query to load a temporary table. It's interesting what you say about the dangers of parenthesised (subqueries), I've probably been rather cavalier in my approach to them. However in this case it doesn't seem to be the cause. I tried eliminating the look-up entirely: CREATE VARIABLE @Blah ST_Geometry; select Geometry into @Blah from "Boundary-line-historic-counties_region" where "name"='Angus'; select p.personid, p.town from person p join postcodelocation l on p.postcode = l.postcode and l.IQXPos.ST_Within(@Blah) =1 and p.town like 'D%'; but it doesn't change the outcome. What I'm actually trying to achieve here is a mechanism that can be incorporated into a reporting tool where we won't have complete control over the query construction. I'd hoped we'd just be able to keep ANDing on additional conditions, so I'm trying to get to the bottom of what works and what doesn't. UPDATE The behaviour is a bit different however with the polygon in a variable. While select count(postcode) from postcodelocation l where l.IQXPos.ST_Within(@Blah) =1;works fine, BUT select p.personid, p.town , p.postcode from person p join postcodelocation l on p.postcode = l.postcode and l.IQXPos.ST_Within(@Blah) =1ie without the town condition, returns rows that don't meet the join condition
(30 Sep '16, 08:28)
Justin Willey
Replies hidden
The latter part being quite successful, methinks:( In theory, I'd assume that your approach to simply add ANDed conditions should work - at least I would use that approach, too. However, IMVHO, in practise the spatial function call seems to prevent the correct execution. If that were a STP, I'd suggest to try with a kind of CROSS APPLY / LATERAL call, i.e. to buiild a filter on the persons/towns and to "apply" the result to the spatial call... Hm, another very wild and even vague guess...
(30 Sep '16, 09:25)
Volker Barth
2
Late to the thread (again) but a couple of thoughts . . . I am a bit confused by all the focus on the sub-query part of this. Does the bad plan show that as being anything but a separate entry under the "Subquery:" drop down in the plan viewer? If not, I would not suspect the sub-Q to be a significant part of this. That subquery is just a single-row query and should not (otherwise) significantly contribute to the final plans. Unless I am wrong, that should be able to be replaced with a[n] ST_GEOMETRY (or other spatial-type) variable set to that query and the behaviour differences should remain, pretty much, the same. Of course having a way to investigate the details behind this might shed a brighter light on all of this issue. And since we are leaning towards a bug, it would be good to get someone working on that part. "Of course": - the optimizer could do a number of different things here - "rewritten spatial operations" is entirely conceivable but ... I would have to work from the plans (and possibly a representative db) to be abl to bring much else to this conversation. YAK: Yet Another Kibitzer P.S. FWIW it might also be interesting to see if you get similar problems with ST_CoveredBy and ST_WithinFilter
(30 Sep '16, 10:11)
Nick Elson S...
Replies hidden
Hi Nick, thanks for looking at this. The sub-query looks fine - just a sequential scan of a very small table and is the same in both cases. As I don't seem to be able upload plans to the forum, is it OK to email them to you? I could get you the database too if that's wanted - its not too big.
(30 Sep '16, 10:49)
Justin Willey
I think y'all are missing the point of my WAG. The subquery is not the problem, its use inside the l.IQXPos.ST_Within() function call in the WHERE clause is the problem... it is a very very very very very sophisticated construction since it refers to a table in the FROM clause. This is definitely NOT a case of a simple singleton-select subquery. I don't understand how this stuff works, BUT it is clear that the examples in the help DO NOT USE INNER JOIN or ON clauses at all when the ST_things() appear in the WHERE clause. The examples use old-fashioned "FROM tablename, tablename" clauses. Try changing this select p.personid, town from person p join postcodelocation l on p.postcode = l.postcode where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.town like 'D%'; to this select p.personid, town from person p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.town like 'D%' and p.postcode = l.postcode; to make it look more like the examples in the Help that use the comma join instead of inner join: SELECT c.PostalCode, c.City, z.CenterPoint FROM Massdata z, GROUPO.SpatialContacts c WHERE c.PostalCode = z.ZIP AND z.CenterPoint.ST_WithinDistance( ( SELECT CenterPoint FROM Massdata WHERE ZIP = '01775' ), 100, 'Statute mile' ) = 1 ORDER BY c.PostalCode;
(30 Sep '16, 11:13)
Breck Carter
I've tried select p.personid, town from person p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.town like 'D%' and p.postcode = l.postcode; and still get the wrong result set - exactly the same wrong results as in the question above. When I comment out the additional criteron eg select p.personid, town from person p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 -- and p.town like 'D%' and p.postcode = l.postcode;then we go back to the correct results. If we checkout the first PersonId returned in the "wrong" result set 'TI1238192502201117E7', we get this select postcode from person where personid = 'TI1238192502201117E7'; -- returns 'CH5 4XB' select IQXPos from postcodelocation where postcode = 'CH5 4XB'; -- returns 'Point (328810 369953)' select NEW ST_Point(328810,369953,0,999999).ST_Within((select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ); -- returns 0so the value is there even though it doesn't satisfy all the conditions in the WHERE clause. Whats really odd is the if I use a different "extra" criteria, say and p.name like 'D%' we get a different "wrong" result set: personid,town 'NZ0P0HKL16062011000D','Auckland' 'NZ0P0HKL16062011000D','Auckland' 'NZ0P0HKL16062011000D','Auckland' 'NZ0P0HKL16062011000D','Auckland' checking out this value 'NZ0P0HKL16062011000D' we get: select postcode from person where personid = 'NZ0P0HKL16062011000D'; -- returns '0604' select IQXPos from postcodelocation where postcode = '0604'; -- returns no record (unsurprisingly)which breaks a different condition.
(30 Sep '16, 12:16)
Justin Willey
ST_CoveredBy and ST_WithinFilter behave in exactly the same way.
(30 Sep '16, 13:21)
Justin Willey
1
OK, let's wave a dead chicken over the query: select p.personid, town from ( SELECT * FROM person p WHERE p.town like 'D%' ) AS p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.postcode = l.postcode;
(30 Sep '16, 14:27)
Breck Carter
OK - chicken has been dispatched and the waving commences: select p.personid, town from ( SELECT * FROM person p WHERE p.town like 'D%' ) AS p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.postcode = l.postcode; gives the wrong answer still, while removing the restriction on Person: select p.personid, town from ( SELECT * FROM person p) AS p, postcodelocation l where l.IQXPos.ST_Within ( (select Geometry from "Boundary-line-historic-counties_region" where "name" = 'Angus' ) ) = 1 and p.postcode = l.postcode;makes it work just fine. Again it doesn't seem to matter what condition is specified, it goes strange. I suppose I could try human sacrifice next - I have a little list ....
(02 Oct '16, 12:21)
Justin Willey
1
Coming back to your question: Is there a chance to post the plans as described in the workaround (i.e. as fake image files) in your according other question? (Yes, it's a nasty workaround but I'd still prefer that to yur suggested next step...)
(02 Oct '16, 16:37)
Volker Barth
|
Does the query work correctly when you use the original query as a derived query and then apply the filter on town lateron, i.e. something like:
Interestingly that produces the same wrong answer. I wonder if it is simply being re-written by the optimiser?
works fine.
Hm, what about putting the town filter in the Join condition, such as
or by using a derived table to filter person before you do the join with postcodelocation?
Yep, all wild guessing, but somehow I remember to have had (non-spatial) queries with similar problems...
That's the same again. I also tried:
but no better.
It's strange that making the original query an OUTER join works. As does making the join between DT and PostcodeLocation in the query above a left outer join.
So that's one of the few cases where the classical "your WHERE clause turns the LEFT JOIN into an INNER JOIN" error does not hold:)
Indeed - rather the opposite!
Hm, does your original schema with table person have a FOREIGN KEY to postcodelocation - and therefore an automatic index? (I'm asking as the table person3 apparently doesn't.) (Sorry, I have not yet checked the attached plans, I had to get to know the meaning of "bodge" first:)...)
So would an index hint on the join condition (aka the FK index) do the trick, too? - I.e. something like
If so, that would probably be still a bodge but better than a wrong result set.
There is no foreign key (the postcodes in Person may be blank or filled and may or may not exist in PostcodeLocation).
I tried:
but the same (wrong) behaviour happens.
Then I guess it would even be similarily wrong if you added a separate NULLable column like "known_postcode" with those codes existing in the other table and declared a FK on that, right?
So the plan did not use that index?
(Otherwise, your observation "It depends on whether or not the engine uses the index on person.postcode (if it does it gets the right answer)." would not hold...)
Besides that, I hope the SQL Anywhere engineers will take over...