SAP SQL Anywhere

Question

The work table is this:

SELECT
  client_ref AS account_ref
INTO #latest_account
FROM accounts
WHERE client_ref<>'';

The ultimate query (simplified into performing COUNTs instead of listing the fields) is this:

SELECT
  count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
  count(vcig.ALLOCATED_GROUP_ID) as vcig_count
FROM #latest_account AS l
JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id

The view is defined as such:

SELECT
  CI.COLLECTIONS_ITEM_ID,
  SLAG.AUTOKEY as ALLOCATED_GROUP_ID,
  SLAG.COMPONENT_BALANCE as DEBT_BALANCE,
  SLAG.INTEREST_BALANCE as INTEREST_BALANCE,
  SLAG.GROUP_BALANCE as TOTAL_BALANCE
FROM COLLECTIONS_ITEM as CI
JOIN DBA.EVENT_TRANS as ET on ET.PROCEDURE_CODE = 'DR' and ET.JOB_NO = CI.COLLECTIONS_ACCOUNT_ID and ET.SOURCE_ITEM_ID = CI.COLLECTIONS_ITEM_ID
JOIN SL_ALLOCATED_GROUP as SLAG on SLAG.ORIGINAL_TRANS_ID = ET.AUTOKEY
WHERE ET.TRANS_TYPE = 'C'

If I change the first query to TOP xxx I get these timings:

 20,000 = 1.81secs, ci_count=88,494
 30,000 = 2.67secs, ci_count=124,607
 31,000 = 2.61secs, ci_count=127,804
 32,000 = aborted after 5 minutes
 31,800 = 2.75secs, ci_count=130,126
 31,825 = aborted after 5 minutes

I had to run the complete job yesterday for the end-users report (278,000 records in #latest_account) - here is my log: 18/08/2014 09:02:20 !Executing: SELECT client_ref AS account_ref...
18/08/2014 09:02:42 !277,896 rows affected.
18/08/2014 09:02:42 !Executing: CREATE INDEX ix ON #latest_account (CLIENT_REF, ACCOUNT_REF)
18/08/2014 09:02:43 !277,896 rows affected.
18/08/2014 09:02:43 !Executing: DROP TABLE IF EXISTS #report
18/08/2014 09:02:43 !-1 rows affected.
18/08/2014 09:02:43 !Executing: SELECT a.client_ref AS ... INTO #report
19/08/2014 03:45:26 !524,073 rows affected.

Does anyone have any ideas why this is taking so long?

Answer 1

0

(see the attached plans fast.saplan and slow.saplan)

It looks like the optimizer has picked two different plans based on the estimated number of rows in #latest_account. When the estimated number of rows is 32,000 it puts #latest_account (l) higher in the tree, and does a full sequential scan on ci2... that's bad, because ci2 has 1141845 rows.

When the estimated number of rows in #latest_account is 31,000 it pushed that table down to the bottom, does a sequential scan on that table (which is OK because it is small) and does an index scan on ci2... much better than a sequential scan.

The optimizer picks plans dynamically, when the statement is executed, so the numbers of rows in different tables can have a big effect. Sometimes, it does the wrong thing.

One BRUTE FORCE technique that sometimes works is to force the optimizer to always use an index that it uses in the fast plan.

If you click on the ci2 box in the fast plan, you will see it says this...

Index Scan

Scan ci2 using index COLLECTIONS_ACCOUNT

Try changing this:

JOIN collections_item AS ci2

to this, to force that index to always be used:

JOIN collections_item AS ci2 WITH ( FORCE INDEX ( COLLECTIONS_ACCOUNT ) )

That technique is dangerous, because you might be telling it to do a bad thing in other cases (like maybe when #latest_account has millions of rows).

Other problems...

Your plans show only the estimates, not the actual results. Try clicking on the Statistics level dropdown to pick Detailed and node statistics in the future.
~~Your query seems to be using a view. It's hard to understand the plan without seeing the view.~~ ...you did show us the view earlier :)
It is also hard to understand the plan without seeing the tables. There might be a completely different way to write the query, to make it always run fast, maybe even faster than the fast query.
A later EBF of 11.0.1 might help, or better yet, a newer version of SQL Anywhere... if you upgrade, be sure try taking out the FORCE INDEX clause, it may be making things worse.

fast
----
DROP TABLE IF EXISTS #latest_account;
SELECT TOP 31000
  file_no AS account_ref
INTO #latest_account
FROM collections_account
WHERE client_ref<>'';

SELECT
  count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
  count(vcig.ALLOCATED_GROUP_ID) as vcig_count
FROM #latest_account AS l
JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id;


slow
----
DROP TABLE IF EXISTS #latest_account;
SELECT TOP 32000
  file_no AS account_ref
INTO #latest_account
FROM collections_account
WHERE client_ref<>'';

SELECT
  count(ci2.COLLECTIONS_ITEM_ID) as ci_count,
  count(vcig.ALLOCATED_GROUP_ID) as vcig_count
FROM #latest_account AS l
JOIN collections_item AS ci2 ON ci2.collections_account_id=l.account_ref
JOIN v_collections_item_group AS vcig on vcig.collections_item_id = ci2.collections_item_id

permanent link

answered 23 Aug '14, 21:15

Breck Carter
32.5k●541●726●1050
accept rate: 20%

edited 23 Aug '14, 21:20

FORCE doesn't work - it gives this error... Could not execute statement.
Syntax error near 'FORCE' on line 5
SQLCODE=-131, ODBC 3 State="42000"
Line 1, column 1

As I've never forced hints before, I have tried the WITH before and after the ON and even without an ON...

Here is a bit more information about the tables involved:

DESCRIBE collections_item:
Column,Type,Nullable,Primary Key
'COLLECTIONS_ITEM_ID','integer',0,1
'COLLECTIONS_ACCOUNT_ID','integer',0,0
'INVOICE_REF','varchar(30)',1,0
'SOURCE_REF','varchar(30)',1,0
'SOURCE_DATE','date',1,0
'SOURCE_TIME','time',1,0
'INVOICE_DATE','date',0,0
'ORDER_REF','varchar(30)',1,0
'NARRATIVE','varchar(50)',1,0
'NET_VALUE','numeric(19,4)',1,0
'VAT_RATE','numeric(18,2)',1,0
'VAT_VALUE','numeric(19,4)',1,0
'DELAY_DAYS','integer',1,0
'EFFECTIVE_DATE','date',1,0
'VAT_CODE','varchar(5)',1,0
'GROSS_VALUE','numeric(19,4)',0,0
'TIMESTAMP','timestamp',1,0
'DISPUTED','char(1)',1,0
'DISPUTE_CODE','varchar(10)',1,0
'STATUS','varchar(12)',1,0
'REMOVE_BALANCE','char(1)',1,0
'LITIGATION_SWITCH_ADDED','char(1)',1,0
'REMOVE_DEBT','numeric(19,4)',1,0
'REMOVE_INTEREST','numeric(19,4)',1,0
'DUE_DATE','date',1,0
'DISPUTE_DATE','date',1,0
'DISPUTE_NOTE','varchar(150)',1,0
'EXPECTED_CLIENT_BALANCE','numeric(19,4)',1,0
'EXPECTED_CLIENT_BALANCE_DATE','date',1,0

1,141,845 rows with index (FOREIGN) COLLECTIONS_ACCOUNT ON collections_account_id

DESCRIBE event_trans;
Column,Type,Nullable,Primary Key
'AUTOKEY','integer',0,1
'EVENT_HISTORY','integer',0,0
'PROCEDURE_CODE','varchar(3)',0,0
'JOB_NO','integer',0,0
'REFERENCE','varchar(25)',1,0
'NARRATIVE','varchar(45)',0,0
'DEBIT','numeric(18,2)',1,0
'CREDIT','numeric(18,2)',1,0
'TRANS_TYPE','varchar(3)',0,0
'NEW_BALANCE','numeric(19,4)',1,0
'ALLOC_METHOD_USED','varchar(3)',1,0
'PAYMENT_TYPE','varchar(4)',1,0
'SOURCE_CURRENCY','varchar(3)',1,0
'SOURCE_DEBIT','numeric(19,4)',1,0
'SOURCE_CREDIT','numeric(19,4)',1,0
'TIMESTAMP','timestamp',1,0
'PRINCIPAL_BALANCE','numeric(19,4)',1,0
'FEES_BALANCE','numeric(19,4)',1,0
'COSTS_BALANCE','numeric(19,4)',1,0
'INTEREST_BALANCE','numeric(19,4)',1,0
'EXCHANGE_RATE','numeric(18,6)',1,0
'SORT_CODE','varchar(8)',1,0
'BANK_ACCOUNT','varchar(8)',1,0
'ALLOCATION_DONE','char(1)',1,0
'IMPORT_EXPORT_STATUS','varchar(1)',1,0
'PAYER_NAME','varchar(30)',1,0
'PAYMENT_IN_FULL','char(1)',1,0
'EFFECTIVE_DATE','date',1,0
'SOURCE_ITEM_ID','integer',1,0
'APPLY_TO','integer',1,0
'INVOICE_DATE','date',1,0
'CREDIT_TO_TRANS_TYPE','varchar(3)',1,0
'DISPUTED','char(1)',1,0
'CREATE_NEW_GROUP','varchar(1)',1,0
'REMIT_DONE','char(1)',1,0
'REMIT_COMMISSIONABLE','numeric(19,4)',1,0
'REMIT_COMMISSION_RATE','numeric(18,2)',1,0
'REMIT_DATE','date',1,0
'REMIT_COMMISSION','numeric(19,4)',1,0
'REMIT_AMOUNT','numeric(19,4)',1,0
'REMIT_RUN_ID','integer',1,0
'REMIT_VAT_RATE','numeric(18,2)',1,0
'REMIT_VAT_ON_COMMISSION','numeric(19,4)',1,0
'REMIT_AMOUNT_GROSS','numeric(19,4)',1,0
'REMIT_AMOUNT_NET','numeric(19,4)',1,0
'REMIT_RESIDUAL_INVOICE','numeric(19,4)',1,0
'MANUALLY_ALLOCATED','char(1)',1,0
'SKIP_BILL_CHARGES','char(1)',1,0
'INPUT_TRANS_ID','integer',1,0
'INCLUDES_VAT_RATE','numeric(18,2)',1,0
'INCLUDES_VAT_AMOUNT','numeric(19,4)',1,0
'INTEREST_RATE_CODE','varchar(5)',1,0
'COLLECTIONS_ONLY_FLAG','char(1)',1,0
'LAST_COLLECTOR','varchar(3)',1,0
'LAST_COLLECTOR_ACTION_DATE','date',1,0

9,097,271 rows with index (FOREIGN) job ON procedure_code,job_no and INDEX i_source_item_id ON source_item_id

latest_account will currently have 650k rows but will continue to grow I am using 2960 EBF which AFAIK is the last one for 11.01.

I am trying to generate better plans (Details and node statistics) and will post then when it eventually comes back...

(26 Aug '14, 04:15) da69plc

Replies hidden

When you tell about a syntax error, it may be of help if you show the exact statement that raises the error (and possibly those variants that you have tried).

Given that, I guess Breck's sample has somewhat mixed up the FORCE INDEX vs. WITH hint clauses and should read:

JOIN collections_item AS ci2 FORCE INDEX (COLLECTIONS_ACCOUNT)

or

JOIN collections_item AS ci2 WITH (INDEX (COLLECTIONS_ACCOUNT))

(26 Aug '14, 04:39) Volker Barth

I am using 2960 EBF which AFAIK is the last one for 11.01.

Nope, there are several newer ones, up to 11.0.1.3158 (for Windows) - cf. Breck's EBF overview here.

(26 Aug '14, 04:42) Volker Barth

Re Syntax error - sorry I ran out of characters! Didn't think to post a second comment - doh...but you were right - using FORCE INDEX executed the statement - but I'm still waiting for the result set (15 mins so far) - looks like I going to need to re-write it!

Re EBFS - I have tried to get access to the latest patches for 11.0.1 to see if there are any that may help but the website is refusing me access even though I registered with service.sap.com ages ago...

PS. I am also still waiting for "Detailed and node statistics to finish"... (4 hours so far)

(26 Aug '14, 08:53) da69plc

I have found this helps: CREATE INDEX ix ON #latest_account (ACCOUNT_REF)

without the need to FORCE. Maybe the optimizer floated the temp table back to the top since it had an index or it could join to it . Nevertheless, it has reduced the time taken down from 18 hours to ~14 mins and then on the second run, 3.5 mins!).

I am interested in getting the seeing the patch change logs / updating our server to the latest one though. What do I need to do to do this?

(26 Aug '14, 10:12) da69plc

Replies hidden

CREATE INDEX ix ON #latest_account (ACCOUNT_REF)

Hm, that shows we have not yet really discussed the contents of that one-column table: As you are simply selecting without grouping etc. given your original description (copied below), will the temporary table contain duplicate values?

SELECT
  client_ref AS account_ref
INTO #latest_account
FROM accounts
WHERE client_ref<>'';

If so, it might be of help to just copy unique values, and possibly to pre-order them with the help of an ORDER BY in the SELECT, and you could then use a "real" PK definition instead of an index...

(26 Aug '14, 10:40) Volker Barth

I am interested in getting...

If you have problems to access the SAP Marketplace or the EBF page, I'd recommend to have a look at FAQs here on the tag "ebf" or to ask a separate question...

(26 Aug '14, 10:43) Volker Barth

The one-column #latest_account is actually a 2 column table...I cut/simplified the query down to the bare minimum to illustrate the slowness problem - here's the proper query in production:

-- FOR A CLIENT (WHO HAS MULTIPLE CLIENT CODES), PULL OUT THE 'LATEST' ACCOUNT BY THEIR REF
SELECT
  o.CLIENT_REF,
  -- PREFER OPEN ACCOUNTS OVER CLOSED OR LAST CLOSED ONE
  (SELECT TOP 1 OUR_REF FROM collections_account WHERE CLIENT_REF=o.CLIENT_REF AND COLLECTIONS_CLIENT_CODE IN ('2412','2455','2456','2426','2427','2428','2441','2442','2443','2451','2448') ORDER BY IF DATE_CLOSED IS NULL THEN 0 ELSE 1 ENDIF, FILE_NO DESC) AS ACCOUNT_REF
INTO #latest_account
FROM collections_account AS o
WHERE ISNULL(o.CLIENT_REF,'')<>'' AND o.COLLECTIONS_CLIENT_CODE IN ('2412','2455','2456','2426','2427','2428','2441','2442','2443','2451','2448') AND o.STATUS<>'CL6'
GROUP BY CLIENT_REF

...so as you can see, I do use GROUP BY... I couldn't add a PRIMARY index to a TEMPORARY TABLE and I don't bother with the ORDER BY as AFAIK that is only used to display the end-result set and wouldn't be needed as later, I use it as the main FROM table with an ORDER BY (again I simplified the second part into a simple COUNT query to illustrate the problem).

(27 Aug '14, 03:57) da69plc

Replies hidden

Ah, I see. Well, the question whether account_ref is unique is still open - if so, you might add an UNIQUE index instead of a normal one which will give the optimizer more clues.

FWIW, you are right that ORDER BY is primarily used for display purposes but if you use SELECT ... INTO (or INSERT ... SELECT) it helps to insert new values in sorted order which may (or may not) lead to a clustered storage of these new rows - without having to use a further index to do so.

I couldn't add a PRIMARY index to a TEMPORARY TABLE

Of course you are correct, that is not possible for local temporary tables (including those created automatically by SELECT INTO #temp).

(27 Aug '14, 04:27) Volker Barth

Comment Text Removed

> Breck's sample has somewhat mixed up the FORCE INDEX vs. WITH hint clauses

I do that EVERY time... why isn't it in the list of characteristic errors? :)

Plus, posting code without testing it is an open invitation to Murphy's Law... sigh.

(27 Aug '14, 08:33) Breck Carter

why isn't it in the list of characteristic errors?

Possibly as we all have learnt (particularly from Glenn, I bet) that query hints should better be avoided, if possible:

Two different syntactical variants of index hints help to assure you have to look after the exact syntax every time and will ask yourself if you reaaaaaally want/need to use them:)

(27 Aug '14, 08:49) Volker Barth

Volker, thanks for your comments...

with CREATE INDEX it takes 3m 25s to SELECT INTO the final work file with CREATE UNIQUE INDEX it takes 3m 23s but as (only I would know) the account_ref is unique I'll use a UNIQUE index...

re: ...may or may not lead to a clustered storage...without having to use a further index... As it's not a definite yes, I'll stick with my CREATE INDEX [that takes 1 second] ;)

(29 Aug '14, 08:59) da69plc

More comments hidden

showing 5 of 12 show all flat view

SAP SQL Anywhere

Query takes 2.75 seconds to count 31,800 records but takes hours to count 31,825 records...

Follow this question

Related questions