sql server - Adding Conditional around query increases time by over 2400% -

- March 15, 2010

update: query plan can.

we had poor performing query took 4 minutes particular organization. after usual recompiling stored proc , updating statistics didn't help, re-wrote if exists(...) select count(*)... , stored procedure when 4 minutes 70 milliseconds. problem conditional makes 70 ms query take 4 minutes? see examples

these take 4+ minutes:

if (   select count(*)            observationorganism  omo     join observation          om  on  om.observationid  = omo.observationmicid     join organism             o   on  o.organismid      = omo.organismid     join observationmicdrug   omd on  omd.observationorganismid = omo.observationorganismid     join sirn                 srn on  srn.sirnid        = omd.sirnid     join organismdrug         od  on  od.organismdrugid = omd.organismdrugid       om.statuscode in ('f', 'c')     , o.organismgroupid <> -1     , od.organismdruggroupid <> -1     , (om.labtype <> 'screen' or om.labtype null)) > 0  print 'records';

if (exists(   select *            observationorganism  omo     join observation          om  on  om.observationid  = omo.observationmicid     join organism             o   on  o.organismid      = omo.organismid     join observationmicdrug   omd on  omd.observationorganismid = omo.observationorganismid     join sirn                 srn on  srn.sirnid        = omd.sirnid     join organismdrug         od  on  od.organismdrugid = omd.organismdrugid       om.statuscode in ('f', 'c')     , o.organismgroupid <> -1     , od.organismdruggroupid <> -1     , (om.labtype <> 'screen' or om.labtype null))  print 'records'

this take 70 milliseconds:

declare @recordcount int; select @recordcount = count(*)            observationorganism  omo     join observation          om  on  om.observationid  = omo.observationmicid     join organism             o   on  o.organismid      = omo.organismid     join observationmicdrug   omd on  omd.observationorganismid = omo.observationorganismid     join sirn                 srn on  srn.sirnid        = omd.sirnid     join organismdrug         od  on  od.organismdrugid = omd.organismdrugid       om.statuscode in ('f', 'c')     , o.organismgroupid <> -1     , od.organismdruggroupid <> -1     , (om.labtype <> 'screen' or om.labtype null);  if(@recordcount > 0)   print 'records';

it doesn't make sense me why moving exact same count(*) query if statement causes such degradation or why 'exists' slower count. tried exists() in select case when exists() , still 4+ minutes.

given previous answer mentioned, i'll try explain again because these things pretty tricky. yes, think you're seeing same problem the other question. namely row goal issue.

so try , explain what's causing i'll start 3 types of joins @ disposal of engine (and pretty categorically): loop joins, merge joins, hash joins. loop joins sound like, nested loop on both sets of data. merge joins take 2 sorted lists , move through them in lock-step. , hash joins throw in smaller set filing cabinet , items in larger set once filing cabinet has been filled.

so performance wise, loop joins require pretty no set , if you're looking small amount of data they're optimal. merge best of best far join performance data size, require data sorted (which rare). hash joins require fair amount of setup allow large data sets joined quickly.

now query , difference between count(*) , exists/top 1. behavior you're seeing optimizer thinks rows of query (you can confirm planning query without grouping , seeing how many records thinks in last step). in particular thinks table in query, every record in table appear in output.

"eureka!" says, "if every row in table ends in output, find if 1 exists can cheap start-up loop join throughout because though it's slow large data sets, need 1 row." doesn't find row. , doesn't find again. , it's iterating through vast set of data using least efficient means @ disposal weeding through large sets of data.

by comparison, if ask full count of data, has find every record definition. sees vast set of data , picks choices best iterating through entire set of data instead of tiny sliver of it.

if, on other hand, correct , records correlated have found record smallest possible amount of server resources , maximized overall throughput.

Search This Blog

Stadnd

sql server - Adding Conditional around query increases time by over 2400% -

Comments

Post a Comment

Popular posts from this blog

Capture and play voice with Asterisk ARI -

visual studio - Installing Packages through Nuget - "Central Directory corrupt" -

python - Statsmodels.api Logit model error ValueError: endog must be in the unit interval -