
Hi,

I have a table with over 500k rows, and when I execute a query that is supposed to return a count for a subset of the data (28k rows), the call to ResultSet.next() takes over 5 seconds. I don't see any delay when calling executeQuery(), only when calling ResultSet.next() to get the actual count value. What's more interesting is that when I actually need to fetch the values and use

select top x start at y ...
the performance of ResultSet.next() is much better, even though I'm fetching multiple columns.

I'm running UltraliteJ 12.0.1.3726 on Android, on a Galaxy Nexus device. The schema for the table in question is as follows:

CREATE TABLE "dba"."LIST_DATA_FLAT" ( 
"LIST_DATA_ID" integer NOT NULL 
, "LANGUAGE_ID" integer NOT NULL 
, "VALUE" varchar(512) NOT NULL 
, "DESCRIPTION" varchar(1024) NULL 
, "ATTR_0" varchar(512) NULL 
, "ATTR_1" varchar(512) NULL 
, "ATTR_2" varchar(512) NULL 
, "ATTR_3" varchar(512) NULL 
, "ATTR_4" varchar(512) NULL 
, "ATTR_5" varchar(512) NULL 
, "ATTR_6" varchar(512) NULL 
, "ATTR_7" varchar(512) NULL 
, "ATTR_8" varchar(512) NULL 
, "ATTR_9" varchar(512) NULL 
, "ATTR_OBJECT" long binary NULL 
, "TIMESTAMP_CREATED" timestamp DEFAULT CURRENT UTC TIMESTAMP NOT NULL 
, "TIMESTAMP_UPDATED" timestamp NOT NULL 
, "LISTS_ID" integer NOT NULL 
, "BACKEND_ID" varchar(128) NULL 
, "PARENT_ID" integer NOT NULL 
, "SORT_ORDER" integer NOT NULL 
, "LEVEL" integer NOT NULL 
, PRIMARY KEY ( "LIST_DATA_ID" asc
, "LANGUAGE_ID" asc ) 
)

FYI LISTS_ID is a foreign key.

The query that I'm running is this:

select count(distinct(ld.list_data_id))
from list_data_flat ld
where ld.lists_id = 384 AND ld.level = 0

Am I reaching the boundaries of Ultralite/UltraliteJ performance with that much data or is there something wrong with my schema?

Any hints/advice would be much appreciated,

Ross

asked 31 Jul '14, 10:18 by rshnaper

edited 05 Aug '14, 13:05


Please confirm you posted the correct plan.

I'm guessing it is because of the distinct: it has to build a temp table which, given the volume of rows, is just beyond what the device can do.

If you wish to investigate this further, I suggest opening a support case.

answered 12 Aug '14, 14:27 by PhilippeBertrand (SAP)

edited 12 Aug '14, 14:47

An index on lists_id would help.

executeQuery() just builds the query; the first next() is when it goes through the entire table and does the count. The first next() on your top x query only has to get to the first row, which may or may not be easy to find, depending on your data.

You can use dbisql to play with your queries and view the resulting plans. Index scans are good; full table scans are not.
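The effect Philippe describes can be sketched outside of UltraLiteJ. The snippet below uses Python's built-in sqlite3 module purely as a stand-in (UltraLiteJ's plan output and dbisql differ; the table contents here are made up) to show how adding an index on the filter column turns a full table scan into an index search:

```python
import sqlite3

# In-memory stand-in for the real database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_data_flat (list_data_id INTEGER, lists_id INTEGER, level INTEGER)"
)
conn.executemany(
    "INSERT INTO list_data_flat VALUES (?, ?, ?)",
    [(i, i % 500, 0) for i in range(5000)],
)

query = "SELECT COUNT(*) FROM list_data_flat WHERE lists_id = 384"

# Without an index the plan is a full table scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

# With an index on the filter column the plan becomes an index search.
conn.execute("CREATE INDEX ix_lists ON list_data_flat (lists_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

print(plan_before)  # a 'SCAN' over the whole table
print(plan_after)   # a 'SEARCH' using ix_lists
```

The same comparison in dbisql, against the real schema, is what tells you whether the optimizer is actually using your foreign-key index.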

answered 31 Jul '14, 10:48 by PhilippeBertrand (SAP)

Hi Philippe,

Thanks for the clarification. I already have an index on lists_id, since it is a foreign key. I looked at the execution plan in dbisql, and it does indeed do an index scan. This is what I get:

group-single[filter[index-scan(LIST_DATA_FLAT,LISTS)]]

(31 Jul '14, 11:24) rshnaper

I'm not familiar with the Galaxy Nexus device, so I can't comment on its performance. How much memory is in use?

The plan does look very good. Perhaps an index on (lists_id, level) would help, if there are many different levels; it really depends on the distribution of your data. Each extra index slows down insert/update/delete operations and requires more space. Did you use the default maximum hash size for indexes when you created your database?

(31 Jul '14, 11:51) PhilippeBert...
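The composite-index suggestion can be sketched the same way (again with sqlite3 standing in for UltraLiteJ, and a made-up data distribution where each list has rows at several levels): with an index on both columns, the engine can resolve both equality predicates from the index alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_data_flat (list_data_id INTEGER, lists_id INTEGER, level INTEGER)"
)
# Made-up distribution: 100 lists, each with rows at levels 0-4.
conn.executemany(
    "INSERT INTO list_data_flat VALUES (?, ?, ?)",
    [(i, i % 100, i % 5) for i in range(5000)],
)
conn.execute("CREATE INDEX ix_lists_level ON list_data_flat (lists_id, level)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM list_data_flat "
    "WHERE lists_id = 84 AND level = 0"
).fetchall()[0][3]
print(plan)  # a SEARCH using ix_lists_level
```

Whether this pays off on the real table depends, as Philippe says, on how selective level is within a given lists_id.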

The device has 1GB of memory with 300MB free; the app itself uses about 40MB. The hash size is the default of 4; I haven't touched that, as I'm not sure how it would affect the rest of the db's performance. I've tried running the same scenario using an UltraLite db on a PC (same exact schema) via custom JNI and had the same result, so I doubt it has anything to do with the device. The table is mostly used for querying, with very few updates, so I'll try adding the new index on (lists_id, level) and see if it does anything.

(31 Jul '14, 13:11) rshnaper

Also, the db is started with cache_size set to 4MB; I'm not sure if that matters or not.

(31 Jul '14, 13:12) rshnaper

Compare the count of rows when you remove the level = 0 predicate, and you will see how many extra rows you are processing without the index.

(31 Jul '14, 13:31) PhilippeBert...

I've tried adding the index on (lists_id, level), but it did not affect query performance one bit. The data for lists_id = 384 all has level = 0, so there are no extra entries with any other level. Does it make sense to open a case with SAP/Sybase for this, or is my number of rows just a bit much for UltraliteJ?

(05 Aug '14, 10:02) rshnaper

Given the distribution of the data, it makes sense that the extra index didn't help. What happens if you try a count(*) instead of the count you have? (I don't expect a difference, but it's worth the experiment just to make sure.)

What is the page size in your database?

(05 Aug '14, 10:29) PhilippeBert...

The page size of the database is 4k. I've run some tests using different types of count queries and got quite different timings depending on the query:

select count(*) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 1632 ms
select count(ld.list_data_id) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 707 ms
select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384 AND ld.level = 0 >>> 7202 ms

select count(*) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 1592 ms
select count(ld.list_data_id) from list_data_flat ld where ld.lists_id = 384 and ld.level = 0 >>> 761 ms
select count(distinct(ld.list_data_id)) from list_data_flat ld where ld.lists_id = 384 AND ld.level = 0 >>> 10155 ms

The third query is what is being executed by the app right now. The second query looks like the fastest; I'll have to figure out how to account for non-distinct rows if I were to use it.

(05 Aug '14, 11:48) rshnaper
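For reference, the semantic gap between the count variants can be reproduced in miniature with sqlite3 (a toy table with the same composite-key shape as the corrected schema; the two-languages-per-entry data is invented): with a (list_data_id, language_id) primary key, COUNT(*) counts every language row while COUNT(DISTINCT list_data_id) counts entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_data_flat ("
    " list_data_id INTEGER, language_id INTEGER,"
    " lists_id INTEGER, level INTEGER,"
    " PRIMARY KEY (list_data_id, language_id))"
)
# Three list entries, each stored in two languages.
rows = [(i, lang, 384, 0) for i in range(3) for lang in (1, 2)]
conn.executemany("INSERT INTO list_data_flat VALUES (?, ?, ?, ?)", rows)

where = "FROM list_data_flat WHERE lists_id = 384 AND level = 0"
total = conn.execute("SELECT COUNT(*) " + where).fetchone()[0]
unique = conn.execute(
    "SELECT COUNT(DISTINCT list_data_id) " + where
).fetchone()[0]
print(total, unique)  # 6 3
```

So the cheap count(list_data_id) form is only a correct substitute if there is exactly one language row per entry in the subset being counted.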

But list_data_id is the primary key, so it should always be distinct! (Your original post didn't include the distinct part.)

(05 Aug '14, 12:07) PhilippeBert...

Wow, I feel like an idiot. My apologies: I managed to post the wrong schema AND SQL, which of course changes things dramatically given your context. I've updated the schema with the proper one; the PK is (LIST_DATA_ID, LANGUAGE_ID), which is why I'm using distinct(list_data_id) in my count query.

(05 Aug '14, 13:08) rshnaper
