I have a table like described below with about 1E6 records.

create table SXE_QUOTE 
( 
isin     varchar(12)     not null, 
mic  varchar(4)  not null, 
period_start    timestamp    not null, 
period_end   timestamp   not null, 
ticks_count  smallint    not null, 
-- some other fields ....
constraint SXE_QUOTE_PK primary key (isin, mic, period_start) 
, constraint SXE_QUOTE_U1 unique (isin, mic, period_end) 
, constraint SXE_QUOTE_C01 check (period_start <= period_end) 
); 

And I select last period_end for each group <isin, mic=""> with a select:

select q.isin    as @q_isin, 
q.mic    as @q_mic, 
max(q.period_end)    as @q_period_end 
from sxe_quote q 
group by q.isin, q.mic

I took a look at execution plan and was appalled, because I have expected, that the index SXE_QUOTE_U1 will be used, but instead I saw only a parallel table scan.

So the question for me: why index is not used? And it's possible at all by ASA 11?

asked 31 Dec '11, 07:44

mikron's gravatar image

mikron
913412
accept rate: 0%

edited 09 May '12, 19:29

Calvin%20Allen's gravatar image

Calvin Allen
1.5k232638

The things goes worse and worse: right now table contains about 3 mio. records, and mentioned request delivers only 16(!) rows. But optimizer decide to make parallel full table scan!!! It must be a bug in optimizer. (I really hope so, because otherwise it were a real product weakness).

(09 May '12, 18:21) mikron
Replies hidden

Can you please post the query plan for this query for review? To do so, 1. Connect to the db in DBISQL 2. Enter the query into the SQL Statements pane 3. Select Tools | Plan Viewer 4. Select "Detailed and Node Statistics" Statistics level 5. Click Get Plan 6. Click Save As

(09 May '12, 18:51) Chris Keating

Are you saying there are only 16 different combinations of (isin, mic) in all these 3 million rows? Or is the above query just a sample, and you are really filtering with more conditions?

(10 May '12, 03:36) Volker Barth

I have realy posted request as it is, without modification. There are no other filter criteria.

(10 May '12, 05:41) mikron

SXE_QUOTE_U1 is a unique constraint on (isin, mic, period_end). That seems like a reasonable candidate for an index-based access plan to me since index entries would be clustered by (isin, mic) and the engine just needs to find the highest period_end for each unique (isin, mic). One of the query folks would know better but I'd bet the choice of plan was just based on heuristics in the cost-based optimizer and might also be related to the size of the index hash for varchar(12)+varchar(4) (which would also depend on the database collation). A different plan might be chosen if more of the table pages is in cache or if calibration has been performed. I'm just a guessing, really. As I said, a query team member would know better.

permanent link

answered 31 Dec '11, 10:30

John%20Smirnios's gravatar image

John Smirnios
8.9k377110
accept rate: 39%

One of the query folks would know better...

And we're still waiting for one of them to let us know, too:)

(10 May '12, 11:29) Volker Barth

Here's another wild guess; this query appears to work OK in V11.0.1, but whether or not it works BETTER on millions of rows is anyone's guess:

create table SXE_QUOTE 
( 
isin     varchar(12)     not null, 
mic  varchar(4)  not null, 
provider smallint not null,
period_start    timestamp    not null, 
period_end   timestamp   not null, 
ticks_count  smallint    not null, 
-- some other fields ....
constraint SXE_QUOTE_PK primary key (isin, mic, provider, period_start) 
, constraint SXE_QUOTE_U1 unique (isin, mic, provider, period_end) 
, constraint SXE_QUOTE_C01 check (period_start <= period_end) 
);

INSERT sxe_quote VALUES ( 'A', 'A', 1, '2001-01-01', '2001-01-01', 0 ); 
INSERT sxe_quote VALUES ( 'A', 'A', 1, '2001-01-02', '2001-01-02', 0 );

INSERT sxe_quote VALUES ( 'A', 'A', 2, '2001-01-03', '2001-01-03', 0 ); 
INSERT sxe_quote VALUES ( 'A', 'A', 2, '2001-01-04', '2001-01-04', 0 );

INSERT sxe_quote VALUES ( 'A', 'A', 3, '2001-01-05', '2001-01-05', 0 ); 
INSERT sxe_quote VALUES ( 'A', 'A', 3, '2001-01-06', '2001-01-06', 0 );

COMMIT;

SELECT isin, mic, provider, period_end
  FROM ( SELECT isin, mic, provider, period_end,
                rank ( ) OVER partition_window AS entry_rank
           FROM sxe_quote
         WINDOW partition_window AS (
                   PARTITION BY isin, mic, provider
                   ORDER BY period_end DESC )
        ) AS ranked_sxe_quote
 WHERE entry_rank = 1
 ORDER BY isin, mic, provider;

isin,mic,provider,period_end
'A','A',1,'2001-01-02 00:00:00.000'
'A','A',2,'2001-01-04 00:00:00.000'
'A','A',3,'2001-01-06 00:00:00.000'
permanent link

answered 14 May '12, 18:33

Breck%20Carter's gravatar image

Breck Carter
25.7k428592852
accept rate: 20%

Many thanks. That shows great promise and looks much better as my current workaround. Just to fullfil the whole bild:

-- do it only first time and keep it 
select distinct isin, mic, provider into tt from sxe_quote;

-- 
select tt.isin, tt.mic, tt.provider, (select max(s.period_end) from sxe_quote s where s.isin = tt.isin and s.mic = tt.mic and s.provider = tt.provider) as period_end
from tt
order by tt.isin, tt.mic, tt.provider;

P.S. SQL might be incorrect, because I wrote it blindly, and only to show the idea behind it. Anyway, such optimization strategy might be a very useful improvement for ASA 13/14.

(15 May '12, 03:20) mikron
Replies hidden
1

FWIW the following Help topic may hold a clue about way MAX() and GROUP BY are handled (or not handled) when it comes to optimization: Optimization for MIN and MAX functions http://dcx.sybase.com/index.html#1100/en/dbusage_en11/queryopt-sect2-5356394.html

The clue is this statement: "must not contain a GROUP BY clause".

(15 May '12, 08:49) Breck Carter

That may be a very helpful link - still I feel (and you will agree, Breck) that my comment on John's answer is more valid than before...

(15 May '12, 10:28) Volker Barth

Thanks. Apparently this restriction isn't dictated by a nature of applied sql pattern but rather by a product. And after some test I found out, that SA miss some usefull optimization strategies applicable on an index. AFAIK some competitors can use indexes much ofter to extract partial key values from an index.

http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Frzajq%2Frzajqgroupidx.htm

So I have just to hope that this gap isgoing to be closed in one of next releases.

(16 May '12, 08:16) mikron

By a way: yesterday I tested this SQL on the same data with a ASA 12 and after about 30 min I gave up. So I'm very curious to see test results from ASA 11.

(16 May '12, 08:30) mikron

I tested suggested optimization with ASA 11. Sadly I had no luck. I couldn't even get detailed execution plan. Hier is execution plan based on estimation: http://www.megafileupload.com/en/file/351427/not-better-saplan.html

(16 May '12, 16:36) mikron

FWIW, Breck has asked for an according product enhancement in his blog:

Product Suggestion: Index Skip Key Processing

(22 May '12, 06:28) Volker Barth
showing 3 of 7 show all flat view

First you should make sure, that you have the latest EBF applied and that you have refreshed your statistics (see Create Statistics).

Multiple ideas come to my mind:

1) Based on the data I would not expect to much from the optimizer, the optimizer doesn't have the knowledge, that only 16 of all isin+mic tuples are distinct. So maybe what you can do is split your table into two, one having just the isin+mic as a unique entry and then a foreign key relationship to the second table with the period_end entries. Then the optimizer wouldn't have to find the distinct tuples together with the max(period_end) but instead can concentrate on that desired task.

2) As your data rate seems rather high an alternative could be to use insert triggers and update an already aggregated table with just the distinct isin+mic + the max(period_end), so that don't have to visit the SXE_QUOTE table for your query at all.

3) Last but not least have you tried to force the usage of the index?

permanent link

answered 10 May '12, 03:15

Martin's gravatar image

Martin
8.6k116151237
accept rate: 14%

Just some wild guesses:

a) To your first suggestion: Would an index solely on (isin, mic) be appropriate (though John's reasoning would imply that the existing index on SXE_QUOTE_U1 would do here as well).

b) To your second suggestion: An immediate materialized view would come to mind as a "blackbox" implementation of your approach.

Now we are waiting on mikron to check these alternatives:)

(10 May '12, 03:34) Volker Barth

Last EBF (revision 2803) was applied before. 1) I thought that an index has a distribution statistic. And I would expect, that at least after first full table scan statistics get updated.

2) table data is very volatile and get updated 5 times per second. And select mentioned before will be executed only 6 times per hour. So I gess, the total cost shoulb be more with a mat. view.

3) No, i didn't right now but I give them a try.

(10 May '12, 06:20) mikron
Replies hidden

1) not to my knowlegde, as far as I know only columns maintain a statistic but not the index itself.

(10 May '12, 07:11) Martin

It seems you are right.

(10 May '12, 14:36) mikron

3) I have tested query with an index hint, but result is getting more worse because now index was used for full index scan, and not as expected for a range scan.

(10 May '12, 14:54) mikron

The query you posted originally:

select q.isin as @q_isin, q.mic as @q_mic, max(q.period_end) as @q_period_end from sxe_quote q group by q.isin, q.mic

matches the index : constraint SXE_QUOTE_U1 unique (isin, mic, period_end)

but, query in your bad_plan.saplan is different, has an extra column ("provider")in SELECT stmt and GROUP BY:

select isin, mic, provider, max(period_end) from sxe_quote group by isin, mic, provider

have you tried to create an index that will actually match this e.g. : constraint SXE_QUOTE_U2 unique (isin, mic, provider, period_end)

and then see what plan/performance you get.

permanent link

answered 11 May '12, 16:02

Lucjan's gravatar image

Lucjan
764
accept rate: 0%

I posted my first question in December 2011, and since that the data model has undergone some changes, that I haven't mentioned, and indeed, a primary key was extended by a column 'provider', but also unique index U1 has the same extension. I guess, you can just oversee that - it's really negligible.

(14 May '12, 17:03) mikron
Replies hidden

"it's really negligible" ...are you trying to make it easy, or hard, for folks to help you? If your answer is "hard" then just keep going like you have been... there is no risk or obligation to you, there is a Money Back Guarantee.

If, on the other hand, you really want help, try posting your query as a new question, with the matching plan, and the matching CREATE TABLE.

(14 May '12, 17:57) Breck Carter

Does SXE_QUOTE_U1 look like this now?

SXE_QUOTE_U1 unique (isin, mic, provider, period_end)

(14 May '12, 18:00) Breck Carter

Sorry for imprecision. You got it all right.

(15 May '12, 02:58) mikron

I have a close look on documentation about optimization methods in execution plan and haven't saw any proper optimization method. May be I'm spoiled by other RDBMS vendors.

permanent link

answered 10 May '12, 15:56

mikron's gravatar image

mikron
913412
accept rate: 0%

Please post a *saplan file containing a graphical plan with statistics.

(10 May '12, 16:04) Breck Carter

I posted instructions on how to get the plan in an earlier thread. We need the plan to understand the problem. Otherwise, we would simply be taking guesses.

(10 May '12, 20:57) Chris Keating
Replies hidden
(11 May '12, 03:24) mikron

Thanks! ...didn't see it before, oops :)

Here's a wild guess: put a stop to the intra-query parallelism to see if the optimizer picks a better plan, by doing this ahead of the SELECT: SET TEMPORARY OPTION max_query_tasks = '1';

(14 May '12, 17:33) Breck Carter
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×242
×137
×27

question asked: 31 Dec '11, 07:44

question was seen: 1,924 times

last updated: 22 May '12, 06:28