How does case-insensitive collation work?

The default collation in SQL Server allows indexing against case-insensitive strings, yet the case of the data is persisted. How does this actually work? I'm looking for the actual nuts and bolts, bits and bytes, or a good resource that explains it in detail.

create table casetest (fruitnames nvarchar(50) not null);
create unique index IX_fruitnames on casetest(fruitnames);

insert into casetest values ('apples');
insert into casetest values ('Pears');
-- this insert fails: 'pears' equals 'Pears' under a case-insensitive collation
insert into casetest values ('pears');

-- this yields 'Pears' as a result
select * from casetest with (forceseek) where fruitnames = 'PEARS';

update casetest set fruitnames = 'pears' where fruitnames = 'pEArs';

-- this yields 'pears' as a result
select * from casetest with (forceseek) where fruitnames = 'PEARS';

"Questions About SQL Server Collations You Were Too Shy to Ask" by Robert Sheldon covers how to use collation. It does not cover how collation works. I'm interested in how an index can be efficiently created and queried without regard to case, while the original case of the data is still stored.

edited 6 hours ago

asked 9 hours ago

cocogorilla

227 reputation, 1 silver badge, 10 bronze badges

You can efficiently query (e.g. utilizing an index seek) case-insensitive strings against a case-sensitive field, but it's a little annoying.

– John Eisbrener
7 hours ago

sql-server collation

2 Answers

"indexing against case insensitive strings yet the case of the data is persisted. How does this actually work?"

This is actually not SQL Server-specific behavior; it's just how these things work in general.

So, the data is the data. If you are speaking about an index specifically, the data needs to be stored as it is; otherwise every read would require a lookup in the main table to get the actual value, and there would be no possibility of a covering index (at least not for string types).

The data, whether in the table/clustered index or a non-clustered index, does not contain any collation or sorting info. It is simply data. The collation rules (locale/culture and sensitivities) are just metadata attached to the column and used when a sort operation is called, which includes the creation or rebuild of an index. The rules defined by the particular collation are used to generate sort keys, which are binary representations of the string. These binary representations incorporate the linguistic rules (or none, if a binary collation is used). The sort keys are used to place the records in their proper order, but are not themselves stored in the index or table. They aren't truly needed, since they would merely be in the same order as the rows in the table or index anyway. At least, I have not seen these values in the index and was told that they aren't stored; I would think that storing them might make comparisons faster, but it would also make the index larger and may not be worth it in the end. But the physical order of the index is just sorting, not comparison.
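To make the idea concrete, here is a minimal Python sketch of a unique index that stores the original strings but orders and compares them through computed sort keys. It is only an illustration: `UniqueIndex` and `sort_key` are hypothetical names, `str.casefold()` stands in for a real case-insensitive collation, and nothing here reflects SQL Server's actual storage format.

```python
import bisect


def sort_key(value):
    # Stand-in for a case-insensitive collation (an assumption, not SQL Server's rules):
    # 'Pears', 'pears' and 'PEARS' all produce the same key.
    return value.casefold()


class UniqueIndex:
    """Toy unique index: rows keep their original case, ordering uses sort keys."""

    def __init__(self):
        self._rows = []  # original strings, kept in sort-key order

    def _position(self, value):
        # Keys are recomputed on demand and never stored alongside the rows.
        keys = [sort_key(row) for row in self._rows]
        return bisect.bisect_left(keys, sort_key(value))

    def seek(self, value):
        # Binary search on keys; returns the stored (case-preserved) value or None.
        pos = self._position(value)
        if pos < len(self._rows) and sort_key(self._rows[pos]) == sort_key(value):
            return self._rows[pos]
        return None

    def insert(self, value):
        if self.seek(value) is not None:
            raise ValueError(f"duplicate key: {value!r}")
        self._rows.insert(self._position(value), value)

    def update(self, old, new):
        # Replace the row matching `old` (by key) with `new`; False if no match.
        pos = self._position(old)
        if pos == len(self._rows) or sort_key(self._rows[pos]) != sort_key(old):
            return False
        original = self._rows.pop(pos)
        try:
            self.insert(new)
        except ValueError:
            self._rows.insert(pos, original)  # restore on a uniqueness conflict
            raise
        return True


index = UniqueIndex()
index.insert("apples")
index.insert("Pears")
try:
    index.insert("pears")
except ValueError as exc:
    print(exc)                  # duplicate key: 'pears'
print(index.seek("PEARS"))      # Pears
index.update("pEArs", "pears")
print(index.seek("PEARS"))      # pears
```

The usage lines mirror the statements in the question: the second insert is rejected because its key matches an existing row, and a seek with any casing returns whatever casing was stored last.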

There are two types of collations: SQL Server and Windows.

SQL Server collations (those with names starting with SQL_) are the older, pre-SQL Server 2000 way of sorting/comparing (even though SQL_Latin1_General_CP1_CI_AS is still the installation default on US English OSes, quite sadly). In this older, simplistic, non-Unicode model, each combination of locale, code page, and the various sensitivities is given a static mapping of each of the characters in that code page. Each character is assigned a value to denote how it equates with the others. The comparison operation in this model goes character by character to determine equality based on these underlying per-character values. This is what mustaccio is describing in his answer. The only sensitivities that can be adjusted in these collations are "case" and "accent" ("width", "kana type", and "variation selector" are not available). Also, none of these collations support Supplementary Characters (which makes sense, as those are Unicode-specific and these collations only apply to non-Unicode data). This approach applies only to VARCHAR data.

Windows collations (those with names not starting with SQL_) are the newer (starting in SQL Server 2000) way of sorting/comparing. In this newer, complex, Unicode model, each combination of locale, code page, and the various sensitivities is not given a static mapping. For one thing, there are no code pages in this model. This model assigns a default sort value to each character, and then each locale/culture can re-assign sort values to any number of characters. This allows multiple cultures to use the same characters in different ways. It also has the effect of allowing multiple languages to be sorted naturally using the same collation if they do not use the same characters (and if one of them does not need to re-assign any values and can simply use the defaults).

{more to come here...still typing up the Windows collation portion--not simple or short :-( }

The comparison operation in this model goes character by character per each sensitivity. All sensitivities can be adjusted in these collations: "case", "accent", "width", "kana type", and "variation selector". Also, some of these collations (when used with Unicode data) support Supplementary Characters. This approach applies to both NVARCHAR data and non-Unicode VARCHAR data. It applies to non-Unicode VARCHAR data by first converting the value to Unicode internally, and then applying the sort/comparison rules.
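As a rough illustration of comparing "per each sensitivity", the Python sketch below builds a multi-level key: base letters first, then accents, then case, with the latter two levels included only when that sensitivity is turned on. This is a simplified assumption-laden model built on Python's `unicodedata` module; `sort_key` and its parameters are hypothetical, and it does not reproduce the actual Windows collation algorithm or its weights.

```python
import unicodedata


def sort_key(text, case_sensitive=False, accent_sensitive=False):
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", text)
    base = [ch for ch in decomposed if not unicodedata.combining(ch)]

    # Level 1: base letters only, ignoring case and accents.
    levels = [tuple(ch.casefold() for ch in base)]

    # Level 2 (optional): the accents, as the combining marks that were split off.
    if accent_sensitive:
        levels.append(tuple(ch for ch in decomposed if unicodedata.combining(ch)))

    # Level 3 (optional): which base letters were upper case.
    if case_sensitive:
        levels.append(tuple(ch.isupper() for ch in base))

    return tuple(levels)


# Case-insensitive, accent-sensitive (in the spirit of a "_CI_AS" collation):
print(sort_key("Pears", accent_sensitive=True) == sort_key("pears", accent_sensitive=True))
# True
print(sort_key("R\u00e9sum\u00e9", accent_sensitive=True) == sort_key("resume", accent_sensitive=True))
# False
```

Because the levels are compared in order, a difference in base letters always outranks a difference in accents or case, which is the general shape of a multi-level comparison.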

edited 2 hours ago

Joe Obbish

24.6k reputation, 4 gold badges, 40 silver badges, 107 bronze badges

answered 7 hours ago

Solomon Rutzky

53.1k reputation, 5 gold badges, 96 silver badges, 211 bronze badges

Typically this is implemented using collation tables that assign a certain score to each character. The sorting routine has a comparator that uses an appropriate table, whether default or specified explicitly, to compare strings, character by character, using their collation scores. If, for example, a particular collation table assigns a score of 1 to "a" and 201 to "A", and a lower score in this particular implementation means higher precedence, then "a" will be sorted before "A". Another table might assign reverse scores: 201 to "a" and 1 to "A", and the sort order will consequently be reversed. Yet another table might assign equal scores to "a", "A", "Á", and "Å", which would lead to a case- and accent-insensitive comparison and sorting.

Similarly, such a collation table-based comparator is used when comparing an index key with the value supplied in the predicate.
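A small Python sketch of such a table-driven comparator follows. The scores for "a" and "A" are the ones from the example above; every other score, and the names `CASE_SENSITIVE`, `CASE_INSENSITIVE`, and `make_comparator`, are made up for illustration and do not come from any real collation table.

```python
from functools import cmp_to_key

# Hypothetical score tables; a real table would cover a whole code page.
CASE_SENSITIVE = {"a": 1, "A": 201, "b": 2, "B": 202}
CASE_INSENSITIVE = {
    "a": 1, "A": 1, "\u00c1": 1, "\u00c5": 1,  # a, A, Á, Å share one score
    "b": 2, "B": 2,
}


def make_comparator(table):
    """Build a comparator that walks two strings character by character."""

    def compare(left, right):
        for lch, rch in zip(left, right):
            diff = table[lch] - table[rch]
            if diff:
                return -1 if diff < 0 else 1  # lower score sorts first
        # All shared positions are equal: the shorter string sorts first.
        return (len(left) > len(right)) - (len(left) < len(right))

    return compare


case_sensitive = make_comparator(CASE_SENSITIVE)
case_insensitive = make_comparator(CASE_INSENSITIVE)

print(sorted(["A", "a", "Ba", "ab"], key=cmp_to_key(case_sensitive)))
# ['a', 'ab', 'A', 'Ba']
print(case_insensitive("\u00c1b", "aB"))
# 0, i.e. the two strings compare as equal
```

The same `compare` function serves both purposes mentioned above: sorting rows into index order, and testing an index key against a predicate value for equality (a result of 0).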

edited 8 hours ago

answered 8 hours ago

mustaccio

12.4k reputation, 9 gold badges, 30 silver badges, 46 bronze badges

Just FYI: this info is only correct in terms of using SQL Server collations (i.e. those with names starting with SQL_) when used on VARCHAR data. This is not exactly true for NVARCHAR data or VARCHAR data when using a Windows collation (names not starting with SQL_).

– Solomon Rutzky
7 hours ago
