How does case-insensitive collation work?
The default collation in SQL Server allows indexing against case-insensitive strings, yet the case of the data is persisted. How does this actually work? I'm looking for the actual nuts and bolts, bits and bytes, or a good resource that explains it in detail.
create table casetest (fruitnames nvarchar(50) not null);
create unique index IX_fruitnames on casetest(fruitnames);
insert into casetest values ('apples');
insert into casetest values ('Pears');
-- this insert fails: 'pears' equals 'Pears' under the case-insensitive unique index
insert into casetest values ('pears');
-- this yields 'Pears' as a result
select * from casetest with (forceseek) where fruitnames = 'PEARS';
update casetest set fruitnames = 'pears' where fruitnames = 'pEArs';
-- this yields 'pears' as a result
select * from casetest with (forceseek) where fruitnames = 'PEARS';
"Questions About SQL Server Collations You Were Too Shy to Ask" by Robert Sheldon covers how to use collations, but not how collation works. I'm interested in how an index can be efficiently created and queried without regard to case, while the case of the data is still stored.
sql-server collation
You can efficiently query (e.g. utilizing an index seek) case-insensitive strings against a case-sensitive field, but it's a little annoying.
– John Eisbrener
7 hours ago
asked 9 hours ago by cocogorilla; edited 6 hours ago
2 Answers
"indexing against case insensitive strings yet the case of the data is persisted. How does this actually work?"
This is actually not SQL Server-specific behavior; it's just how these things work in general.
So, the data is the data. If you are speaking about an index specifically, the data needs to be stored as-is. Otherwise, every query would require a look-up in the main table to get the actual value, and there would be no possibility of a covering index (at least not for string types).
The data, whether in the table/clustered index or a non-clustered index, does not contain any collation or sorting information. It is simply data. The collation rules (locale/culture and sensitivities) are just metadata attached to the column and used whenever a sort operation is called, which includes the creation or rebuild of an index. The rules defined by the particular collation are used to generate sort keys, which are binary representations of the string. These binary representations incorporate the linguistic rules (or none, if a binary collation is used). The sort keys are used to place the records in their proper order, but are not themselves stored in the index or table. They aren't truly needed there, since they would merely be in the same order as the rows in the table or index anyway. At least, I have not seen these values in the index and was told that they aren't stored; storing them might make comparisons faster, but it would also make the index larger and may not be worth it in the end. But the physical order of the index is just sorting, not comparison.
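As a rough illustration of the paragraph above (and not SQL Server's actual storage engine code), the following Python sketch keeps the original strings in an ordered list and derives a sort key only when two values have to be compared. The class name `CaseInsensitiveIndex` and the use of `str.casefold()` as the sort-key routine are assumptions made for this example.

```python
def sort_key(value: str) -> str:
    """Hypothetical stand-in for a collation's sort-key routine (case erased)."""
    return value.casefold()


class CaseInsensitiveIndex:
    """Toy unique index: rows keep their original case, order follows sort keys."""

    def __init__(self):
        self._rows = []  # original strings, kept in sort-key order

    def _lower_bound(self, value):
        # Binary search; sort keys are computed on the fly and never stored.
        target = sort_key(value)
        lo, hi = 0, len(self._rows)
        while lo < hi:
            mid = (lo + hi) // 2
            if sort_key(self._rows[mid]) < target:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def seek(self, value):
        """Return the stored string equal to `value` under the collation, or None."""
        pos = self._lower_bound(value)
        if pos < len(self._rows) and sort_key(self._rows[pos]) == sort_key(value):
            return self._rows[pos]
        return None

    def insert(self, value):
        if self.seek(value) is not None:
            raise ValueError(f"duplicate key: {value!r}")
        self._rows.insert(self._lower_bound(value), value)

    def update(self, old, new):
        pos = self._lower_bound(old)
        if pos == len(self._rows) or sort_key(self._rows[pos]) != sort_key(old):
            raise KeyError(old)
        removed = self._rows.pop(pos)
        try:
            self.insert(new)
        except ValueError:
            self._rows.insert(pos, removed)  # put the old row back
            raise


index = CaseInsensitiveIndex()
index.insert("apples")
index.insert("Pears")
print(index.seek("PEARS"))  # prints Pears
index.update("pEArs", "pears")
print(index.seek("PEARS"))  # prints pears
```

The stored value changes from 'Pears' to 'pears' while its position in the list stays the same, which mirrors the behavior shown in the question.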
There are two types of collations: SQL Server and Windows.
SQL Server collations (those with names starting with SQL_) are the older, pre-SQL Server 2000 way of sorting/comparing (even though SQL_Latin1_General_CP1_CI_AS is still the installation default on US English OSes, quite sadly). In this older, simplistic, non-Unicode model, each combination of locale, code page, and the various sensitivities is given a static mapping of each of the characters in that code page. Each character is assigned a value that denotes how it equates with the others. The comparison operation in this model goes character by character, determining equality from these underlying per-character values. This is what mustaccio is describing in his answer. The only sensitivities that can be adjusted in these collations are "case" and "accent" ("width", "kana type", and "variation selector" are not available). Also, none of these collations support Supplementary Characters (which makes sense, as those are Unicode-specific and these collations only apply to non-Unicode data). This approach applies only to VARCHAR data.
Windows collations (those with names not starting with SQL_) are the newer (starting in SQL Server 2000) way of sorting/comparing. In this newer, complex, Unicode model, each combination of locale, code page, and the various sensitivities is not given a static mapping. For one thing, there are no code pages in this model. This model assigns a default sort value to each character, and then each locale/culture can re-assign sort values to any number of characters. This allows multiple cultures to use the same characters in different ways. It also has the effect of allowing multiple languages to be sorted naturally using the same collation if they do not use the same characters (and if one of them does not need to re-assign any values and can simply use the defaults).
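The idea of default sort values plus per-locale re-assignments can be sketched in Python as follows. This is only an illustration: the weight numbers, the `make_sort_key` helper, and the `SWEDISH_OVERRIDES` table are made up for this example and are not the values Windows or SQL Server use. The example relies on the fact that the Swedish alphabet places ä and ö after z.

```python
import string

# Default weights: plain Latin letters in alphabetical order (a=0 ... z=25).
DEFAULT_WEIGHTS = {ch: i for i, ch in enumerate(string.ascii_lowercase)}
# Assumed default: umlauted letters are weighted like their base letters.
DEFAULT_WEIGHTS["\u00e4"] = DEFAULT_WEIGHTS["a"]  # ä
DEFAULT_WEIGHTS["\u00f6"] = DEFAULT_WEIGHTS["o"]  # ö

# Swedish-style tailoring: ä and ö are separate letters that sort after z.
SWEDISH_OVERRIDES = {"\u00e4": 27, "\u00f6": 28}


def make_sort_key(overrides=None):
    """Build a sort-key function from the defaults plus optional locale overrides."""
    weights = {**DEFAULT_WEIGHTS, **(overrides or {})}

    def sort_key(text):
        return tuple(weights[ch] for ch in text.lower())

    return sort_key


words = ["zebra", "\u00e4pple", "apa"]
print(sorted(words, key=make_sort_key()))                   # ['apa', 'äpple', 'zebra']
print(sorted(words, key=make_sort_key(SWEDISH_OVERRIDES)))  # ['apa', 'zebra', 'äpple']
```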
{more to come here...still typing up the Windows collation portion--not simple or short :-( }
The comparison operation in this model goes character by character per each sensitivity. All sensitivities can be adjusted in these collations: "case", "accent", "width", "kana type", and "variation selector". Also, some of these collations (when used with Unicode data) support Supplementary Characters. This approach applies to both NVARCHAR data and non-Unicode VARCHAR data. It applies to non-Unicode VARCHAR data by first converting the value to Unicode internally, and then applying the sort/comparison rules.
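One way to picture a comparison that runs "per each sensitivity" is a key built from several levels, where an insensitive collation simply leaves a level out. The Python sketch below is loosely modelled on that multi-level idea and covers only case and accent; it is an assumption-laden simplification, not the algorithm Windows collations actually implement. The function names and the use of Unicode NFD decomposition to separate accents from base letters are choices made for this example.

```python
import unicodedata


def split_levels(text):
    """Split text into base letters, accents per letter, and case per letter."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            # Accent mark: attach it to the preceding base letter.
            if secondary:
                secondary[-1] += ch
            continue
        primary.append(ch.casefold())
        secondary.append("")
        tertiary.append(ch.isupper())
    return tuple(primary), tuple(secondary), tuple(tertiary)


def sort_key(text, case_sensitive=False, accent_sensitive=False):
    """Build a key; levels for disabled sensitivities are left out entirely."""
    primary, secondary, tertiary = split_levels(text)
    key = [primary]
    if accent_sensitive:
        key.append(secondary)
    if case_sensitive:
        key.append(tertiary)
    return tuple(key)


def ci_ai(text):
    return sort_key(text)


def cs_as(text):
    return sort_key(text, case_sensitive=True, accent_sensitive=True)


print(ci_ai("Pears") == ci_ai("pEArs"))             # True
print(ci_ai("r\u00e9sum\u00e9") == ci_ai("RESUME"))  # True
print(cs_as("r\u00e9sum\u00e9") == cs_as("resume"))  # False
print(sorted(["r\u00e9sum\u00e9", "Resume", "rat", "resume"], key=cs_as))
# ['rat', 'resume', 'Resume', 'résumé']
```

Because the base letters are compared first, accent and case differences only break ties between strings that are otherwise equal.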
Typically this is implemented using collation tables that assign a certain score to each character. The sorting routine has a comparator that uses an appropriate table, whether default or specified explicitly, to compare strings, character by character, using their collation scores. If, for example, a particular collation table assigns a score of 1 to "a" and 201 to "A", and a lower score in this particular implementation means higher precedence, then "a" will be sorted before "A". Another table might assign reverse scores: 201 to "a" and 1 to "A", and the sort order will consequently be reversed. Yet another table might assign equal scores to "a", "A", "Á", and "Å", which would lead to a case- and accent-insensitive comparison and sorting.
Similarly, such a collation table-based comparator is used when comparing an index key with the value supplied in the predicate.
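A minimal Python sketch of such a table-based comparator, using the scores from the example above. The table names and the `compare` and `sort_with` helpers are hypothetical, and a real collation table would cover the whole character set.

```python
from functools import cmp_to_key

# Three illustrative collation tables (lower score sorts first).
LOWER_FIRST = {"a": 1, "A": 201}
UPPER_FIRST = {"a": 201, "A": 1}
INSENSITIVE = {"a": 1, "A": 1, "\u00c1": 1, "\u00c5": 1}  # a, A, Á, Å


def compare(left, right, table):
    """Compare two strings character by character using collation scores."""
    for l_char, r_char in zip(left, right):
        diff = table[l_char] - table[r_char]
        if diff:
            return -1 if diff < 0 else 1
    # All shared positions are equal: the shorter string sorts first.
    return (len(left) > len(right)) - (len(left) < len(right))


def sort_with(values, table):
    return sorted(values, key=cmp_to_key(lambda l, r: compare(l, r, table)))


print(sort_with(["A", "a"], LOWER_FIRST))          # ['a', 'A']
print(sort_with(["a", "A"], UPPER_FIRST))          # ['A', 'a']
print(compare("a\u00c1", "A\u00c5", INSENSITIVE))  # 0
```

The same `compare` call serves for an equality check against a predicate value: a result of 0 means the two strings are equal under the chosen table.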
Just FYI: this info is only correct in terms of using SQL Server collations (i.e. those with names starting with SQL_) when used on VARCHAR data. This is not exactly true for NVARCHAR data or VARCHAR data when using a Windows collation (names not starting with SQL_).
– Solomon Rutzky
7 hours ago
First answer: answered 7 hours ago by Solomon Rutzky; edited 2 hours ago by Joe Obbish
add a comment
|
add a comment
|
Typically this is implemented using collation tables that assign a certain score to each character. The sorting routine has a comparator that uses an appropriate table, whether default or specified explicitly, to compare strings, character by character, using their collation scores. If, for example, a particular collation table assigns a score of 1 to "a" and 201 to "A", and a lower score in this particular implementation means higher precedence, then "a" will be sorted before "A". Another table might assign the reverse scores: 201 to "a" and 1 to "A", and the sort order will consequently be reversed. Yet another table might assign equal scores to "a", "A", "Á", and "Å", which would lead to a case- and accent-insensitive comparison and sorting.
Similarly, such a collation table-based comparator is used when comparing an index key with the value supplied in the predicate.
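As a minimal sketch of the table-based approach described above (in Python, not anything a database engine actually ships): the two tables and the `make_comparator` helper are hypothetical, the scores for "a" and "A" are taken from the example in the text, and the scores for "b" and "B" are invented to have something else to sort.

```python
from functools import cmp_to_key

# Hypothetical collation tables; a lower score sorts first.
CASE_SENSITIVE = {"a": 1, "b": 2, "A": 201, "B": 202}
CASE_ACCENT_INSENSITIVE = {"a": 1, "A": 1, "Á": 1, "Å": 1, "b": 2, "B": 2}


def make_comparator(table):
    """Return a comparator that compares strings by per-character scores."""
    def compare(left, right):
        for l_ch, r_ch in zip(left, right):
            l_score, r_score = table[l_ch], table[r_ch]
            if l_score != r_score:
                return -1 if l_score < r_score else 1
        # All shared positions tie, so the shorter string sorts first.
        return (len(left) > len(right)) - (len(left) < len(right))
    return compare


cs_compare = make_comparator(CASE_SENSITIVE)
ci_compare = make_comparator(CASE_ACCENT_INSENSITIVE)

print(sorted(["B", "A", "b", "a"], key=cmp_to_key(cs_compare)))  # ['a', 'b', 'A', 'B']
print(ci_compare("aB", "Ab") == 0)  # True: equal scores at every position
print(ci_compare("Á", "a") == 0)    # True: accent and case are both ignored
```

The same comparator could be applied to an index key and a predicate value: a result of 0 means "equal under this collation", regardless of whether the underlying bytes match.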
edited 8 hours ago
answered 8 hours ago
mustaccio
12.4k reputation, 9 gold badges, 30 silver badges, 46 bronze badges
Just FYI: this info is only correct in terms of using SQL Server collations (i.e. those with names starting with SQL_) when used on VARCHAR data. This is not exactly true for NVARCHAR data or VARCHAR data when using a Windows collation (names not starting with SQL_).
– Solomon Rutzky
7 hours ago
You can efficiently query (e.g. utilizing an index seek) case-insensitive strings against a case-sensitive field, but it's a little annoying.
– John Eisbrener
7 hours ago