Channel: Brent Ozar Unlimited®

Computed Columns: Reversing Data For Easier Searching


During Training

We were talking about computed columns, and one of our students mentioned that he uses computed columns that run the REVERSE() function on a column for easier back-searching.

What’s back-searching? It’s a word I just made up.

The easiest example to think about and demo is Social Security Numbers.

A common security requirement is to look people up by just the last four digits.

Obviously, a search like WHERE ssn LIKE '%0000' would perform badly over large data sets, because the leading wildcard rules out an index seek.

Now, if you only ever needed the last four, you could just use SUBSTRING or RIGHT to pull them out.
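Here's a minimal sketch of that last-four-only approach. The temp table #LastFourDemo, the last_four column, and the ix_last_four index are all made-up names for illustration, and the three SSNs are dummy values.

```sql
-- Hypothetical scratch table, separate from the demo table used elsewhere in this post
CREATE TABLE #LastFourDemo
(
    id INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
    ssn VARCHAR(11),
    last_four AS RIGHT(ssn, 4) -- RIGHT() is deterministic, so this can be indexed without persisting it
);

INSERT INTO #LastFourDemo ( ssn )
VALUES ( '082-90-9156' ), ( '123-45-6789' ), ( '555-12-9156' );

CREATE NONCLUSTERED INDEX ix_last_four ON #LastFourDemo (last_four);

-- An equality predicate on the computed column gives the optimizer something it can seek on
SELECT COUNT(*) AS records
FROM #LastFourDemo AS lfd
WHERE lfd.last_four = '9156';

DROP TABLE #LastFourDemo;
```

Two of the three dummy rows end in 9156, so the count comes back as 2. The catch is that this only ever answers the last-four question.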

If you wanted to give people the ability to expand their search further, REVERSE becomes more valuable.

Oh, a demo

You must have said please.

Let’s mock up some dummy data.

USE tempdb;

DROP TABLE IF EXISTS dbo.AllYourPersonalInformation;

CREATE TABLE dbo.AllYourPersonalInformation
(
    id INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
    fname VARCHAR(10),
    lname VARCHAR(20),
    ssn VARCHAR(11)
);
 
-- Build each SSN from three independently randomized, zero-padded pieces.
-- Cross joining sys.messages to itself is just a cheap source of more than a million rows.
INSERT INTO dbo.AllYourPersonalInformation WITH ( TABLOCK )
           ( fname, lname, ssn )
SELECT     TOP ( 1000000 )
           'Does',
           'Notmatter',
             RIGHT('000'  + CONVERT(VARCHAR(11), ABS(CHECKSUM( NEWID() )) ), 3) + '-'
           + RIGHT('00'   + CONVERT(VARCHAR(11), ABS(CHECKSUM( NEWID() )) ), 2) + '-'
           + RIGHT('0000' + CONVERT(VARCHAR(11), ABS(CHECKSUM( NEWID() )) ), 4)
FROM       (SELECT 1 AS n FROM sys.messages AS m CROSS JOIN sys.messages AS m2) AS x;

CREATE INDEX ix_ssn ON dbo.AllYourPersonalInformation (ssn);

We should have 1 million rows of randomly generated (and unfortunately not terribly unique) data.

If we run this query, we get a query plan that scans the index on ssn and reads every row. This is the ~bad~ kind of index scan.

SELECT COUNT(*) AS records
FROM dbo.AllYourPersonalInformation AS aypi
WHERE aypi.ssn LIKE '%9156';

Ouch
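If you'd rather see numbers than a plan, here's a quick sketch that wraps the same query in STATISTICS IO. It assumes the demo table and index above are already in place. The logical reads show up in the Messages tab, and I'm not quoting any figures here because they'll vary with your data.

```sql
-- Report logical reads for each statement in this session
SET STATISTICS IO ON;

SELECT COUNT(*) AS records
FROM dbo.AllYourPersonalInformation AS aypi
WHERE aypi.ssn LIKE '%9156';

SET STATISTICS IO OFF;
```

Running the later queries the same way gives you a before-and-after comparison.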

Let’s add that reverse column and index it. We don’t have to persist it.

ALTER TABLE dbo.AllYourPersonalInformation
ADD reversi AS REVERSE(ssn);

CREATE NONCLUSTERED INDEX ix_reversi ON dbo.AllYourPersonalInformation (reversi);
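If you want to double-check that the column really is indexable without being persisted, something like this should do it. It's a sketch that assumes you're still in tempdb with the demo table in place. COLUMNPROPERTY and sys.computed_columns are standard metadata features, and the column aliases are just mine.

```sql
-- Each property returns 1 for true, 0 for false
SELECT COLUMNPROPERTY(OBJECT_ID('dbo.AllYourPersonalInformation'), 'reversi', 'IsComputed')      AS is_computed,
       COLUMNPROPERTY(OBJECT_ID('dbo.AllYourPersonalInformation'), 'reversi', 'IsDeterministic') AS is_deterministic,
       COLUMNPROPERTY(OBJECT_ID('dbo.AllYourPersonalInformation'), 'reversi', 'IsIndexable')     AS is_indexable;

-- is_persisted shows whether the computed value is physically stored in the table
SELECT cc.name, cc.definition, cc.is_persisted
FROM sys.computed_columns AS cc
WHERE cc.object_id = OBJECT_ID('dbo.AllYourPersonalInformation');
```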

Now we can run queries like this, without even directly referencing our reversed column, and get the desired index seeks. The optimizer can match the REVERSE(ssn) expression to the indexed computed column.

SELECT COUNT(*) AS records
FROM dbo.AllYourPersonalInformation AS aypi
WHERE REVERSE(aypi.ssn) LIKE REVERSE('%9156');

SELECT COUNT(*) AS records
FROM dbo.AllYourPersonalInformation AS aypi
WHERE REVERSE(aypi.ssn) LIKE REVERSE('%90-9156');

SELECT COUNT(*) AS records
FROM dbo.AllYourPersonalInformation AS aypi
WHERE REVERSE(aypi.ssn) LIKE REVERSE('%082-90-9156');

(Hot 97 air horn)

What I thought was cool

Was that the REVERSE on the LIKE predicate put the wildcard on the correct side. That’s usually the kind of thing that I hope works, but doesn’t.

Thanks, whoever wrote that!
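To see what's going on, you can run REVERSE on the search patterns by themselves. This needs no tables at all, and the column aliases are just labels I made up.

```sql
-- The leading wildcard ends up trailing once the pattern is reversed
SELECT REVERSE('%9156')        AS last_four_pattern,  -- 6519%
       REVERSE('%90-9156')     AS last_six_pattern,   -- 6519-09%
       REVERSE('%082-90-9156') AS full_ssn_pattern;   -- 6519-09-280%
```

A trailing wildcard is the kind of LIKE pattern that an index on the reversed column can seek on.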

And thank YOU for reading!


