Faster queries using narrow indexes and CROSS APPLY

It’s common to try to build the perfect index for a query: the key holds exactly the columns the query filters and sorts on, and the other, presentational, columns are pushed into the INCLUDE section of the index. While you can create a narrow key this way, the entire index row can still become wide.

A Sample Report

For this blog post, we’re going to work with the StackOverflow dataset. For more information on getting started with the StackOverflow database, check out How to Query the StackExchange database.

We want to find interesting, unanswered questions that have been asked in the last 90 days.

We’ll start by creating some indexes:

CREATE INDEX IX_UnansweredQuestions
    ON dbo.Posts (CreationDate) 
    INCLUDE (Id, Score, OwnerUserId, Title) 
    WHERE (AcceptedAnswerId = 0) ;

CREATE INDEX IX_ScoredCommentsByParent
    ON dbo.Posts (ParentId, Id)
    WHERE (Score > 0);

CREATE INDEX IX_VotesByPostId
    ON dbo.Votes (PostId) ;

CREATE NONCLUSTERED INDEX IX_Votes_VoteTypeId_CreationDate
    ON dbo.Votes (VoteTypeId, CreationDate)
    INCLUDE (PostId, UserId) ;

And here’s our sweet query:

SELECT  TOP 2000
        p.Id ,
        p.Score ,
        p.Title, 
        p.CreationDate,
        p.OwnerUserId ,
        p.Score
        + (SUM(CASE WHEN v.VoteTypeId = 2 THEN 1
                 WHEN v.VoteTypeId = 3 THEN -1
                 ELSE 0 END )
            / COUNT(*))
        + SUM(COALESCE(uv.Reputation, 0)) AS SortWeight
FROM    dbo.Posts AS p
        JOIN dbo.Votes AS v ON p.Id = v.PostId
                               AND v.VoteTypeId IN (2,3)
        LEFT JOIN dbo.Users AS uv ON v.UserId = uv.Id
WHERE   p.CreationDate > '2013-06-08'
        AND v.CreationDate > '2013-06-08'
        AND p.AcceptedAnswerId = 0
        AND p.PostTypeId = 1
        AND EXISTS (SELECT  *
                    FROM    dbo.Posts a
                    WHERE   a.ParentId = p.Id
                            AND a.Score > 0)
GROUP BY p.Id, p.Score, p.Title, 
        p.CreationDate,
        p.OwnerUserId
ORDER BY SortWeight DESC
;
GO

This query is CPU hungry: it uses nearly 5 seconds of CPU time on my machine between parsing, compilation, and execution. Overall it does around 1 million logical reads. That’s a few too many reads for my liking. I think we can make it better.
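The CPU time and logical read figures quoted here are the kind of numbers SQL Server reports when session statistics are switched on. A minimal sketch, assuming you run it in the same session as the query and read the output from the Messages tab (the exact numbers will differ from machine to machine):

```sql
-- Report logical reads per table, plus parse/compile and execution CPU time.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- Run the query under test here.

-- Switch the extra output off again when you are done measuring.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

Running each version of the query with the same settings makes the before-and-after comparison easier to trust.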

Simple, Small, Indexes and APPLY

We need a way to make this query faster and lighter. First, we get rid of the wide index we created before.

DROP INDEX IX_UnansweredQuestions ON dbo.Posts ;

We add two more indexes:

CREATE INDEX IX_UnansweredQuestions
    ON dbo.Posts (PostTypeId, AcceptedAnswerId, CreationDate) 
    INCLUDE (Id, Score) ;

CREATE NONCLUSTERED INDEX IX_Posts_AcceptedAnswerId_PostTypeId_CreationDate
    ON dbo.Posts (PostTypeId, CreationDate)
    INCLUDE (Id, Score);

And then we re-write the query with magic:

WITH recent_votes AS (
    SELECT  v.PostId,
            v.UserId,
            (SUM(CASE WHEN v.VoteTypeId = 2 THEN 1 
                     WHEN v.VoteTypeId = 3 THEN -1
                     ELSE 0 END ) 
                / COUNT(*)) AS VoteWeight
    FROM    dbo.Votes AS v
    WHERE   v.CreationDate > '2013-06-08'
            AND v.VoteTypeId IN (2,3)
    GROUP BY v.PostId, v.UserId
),
posts AS (
    SELECT  p.Id ,
            p.Score 
    FROM    dbo.Posts AS p 
            LEFT JOIN dbo.Posts AS answers ON answers.ParentId = p.Id
                                              AND answers.Score > 0 
    WHERE   p.CreationDate > '2013-06-08'
            AND p.AcceptedAnswerId = 0
            AND p.PostTypeId = 1
            AND answers.Id IS NOT NULL
)
SELECT  TOP 2000
        p.Id ,
        p.Score ,
        _.Title, _.CreationDate, _.OwnerUserId ,
        p.Score 
        + SUM(v.VoteWeight)
        + SUM(COALESCE(uv.Reputation, 0)) AS SortWeight
FROM    posts AS p
        JOIN recent_votes AS v ON p.Id = v.PostId
        LEFT JOIN dbo.Users AS uv ON v.UserId = uv.Id
        CROSS APPLY (   SELECT  Title, CreationDate, OwnerUserId
                        FROM    dbo.Posts p2
                        WHERE   p2.Id = p.Id ) _
GROUP BY p.Id, 
        p.Score,
        _.Title, _.CreationDate, _.OwnerUserId
ORDER BY SortWeight DESC
;
GO

The new query performs 800,000 reads but it only takes 1.2 seconds to execute and it goes parallel (1.9 total CPU seconds are used). This is a big improvement over the first query, and I’ll explain how we got there.

What makes the CROSS APPLY faster?

This query has a lot of differences from the first query. Let’s break down what’s going on in here.

We’ve split out the core of the query into two CTEs. In this case, we’re using the CTEs as optimization fences. Rather than let SQL Server do its own thing with optimization, we’re tricking it into providing the optimization path that we want – recent_votes and posts will be optimized separately. Because of this optimization fence, we’ve been able to construct very narrow indexes that bring back the data that we need for each.

After tricking SQL Server into optimizing our CTEs separately, we use CROSS APPLY to pull back the details we want based on a primary key lookup. This helps our query run faster: we need fewer pages allocated for our indexes, which means that SQL Server estimates that less memory will be used. The results speak for themselves, too: the second query does about 800,000 reads against roughly 1 million for the first, and it uses 1.9 seconds of CPU instead of nearly 5.
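The pattern can be seen in isolation on a tiny table. This is only a sketch: #Items, its columns, and the index name are hypothetical and are not part of the StackOverflow schema, and on a table this small the optimizer may well choose a plain scan instead of the narrow index.

```sql
-- Hypothetical demo table; not part of the StackOverflow database.
CREATE TABLE #Items
(
    Id        int           NOT NULL PRIMARY KEY CLUSTERED,
    CreatedAt date          NOT NULL,
    Score     int           NOT NULL,
    Title     nvarchar(250) NOT NULL   -- the wide, presentational column
);

-- Narrow index: only what is needed to filter and rank.
-- Id is the clustering key, so it is carried in the index automatically.
CREATE INDEX IX_Items_CreatedAt ON #Items (CreatedAt) INCLUDE (Score);

INSERT INTO #Items (Id, CreatedAt, Score, Title)
VALUES (1, '2013-05-01', 10, N'Old question'),
       (2, '2013-07-01',  3, N'Recent question'),
       (3, '2013-08-01',  7, N'Another recent question');

-- Step 1: find and rank the rows using only the narrow columns.
-- Step 2: fetch the wide column for each surviving row by primary key.
SELECT  narrow.Id, narrow.Score, wide.Title
FROM    (   SELECT  TOP (2) i.Id, i.Score
            FROM    #Items AS i
            WHERE   i.CreatedAt > '2013-06-08'
            ORDER BY i.Score DESC ) AS narrow
        CROSS APPLY (   SELECT  i2.Title
                        FROM    #Items AS i2
                        WHERE   i2.Id = narrow.Id ) AS wide
ORDER BY narrow.Score DESC;

DROP TABLE #Items;
```

The point of the shape is that the wide column is only read for rows that survive the filter and the TOP, so it never has to live in the nonclustered index.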

What’s the Result?

The end result is that we have a query that’s fast. Our indexes are free of bloat caused by wide INCLUDE columns.
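If you want to see how much space each index on dbo.Posts takes up, one option is to look at the page counts SQL Server tracks per index. A rough sketch, assuming you have permission to read the dynamic management views in the database (the UsedMB figure relies on SQL Server’s 8 KB page size):

```sql
-- Pages used by each index on dbo.Posts, largest first.
SELECT  i.name AS IndexName,
        SUM(ps.used_page_count) AS UsedPages,
        SUM(ps.used_page_count) * 8 / 1024 AS UsedMB
FROM    sys.indexes AS i
        JOIN sys.dm_db_partition_stats AS ps
            ON ps.object_id = i.object_id
           AND ps.index_id = i.index_id
WHERE   i.object_id = OBJECT_ID(N'dbo.Posts')
GROUP BY i.name
ORDER BY UsedPages DESC ;
```

Running it before and after swapping the wide index for the narrow ones shows whether the index footprint actually shrank.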

If you like this sort of thing, you might be interested in our Advanced Querying & Indexing class this August in Portland, OR.
