Which collation should I use for biblical Hebrew?
Which SQL Server collation should I use for biblical Hebrew? The database under consideration needs to accommodate diacritics (i.e., vowels, accents, trope, etc.).
sql-server database-design configuration sql-server-2017 collation
asked 8 hours ago by brian12345 (edited 8 hours ago by MDCCL)
2 Answers
It depends on a lot of things. A collation governs sorting, comparison, and the code page used for non-Unicode data.
This repo has a good list of options around Hebrew.
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
| Hebrew_BIN | Hebrew, binary sort |
| Hebrew_BIN2 | Hebrew, binary code point comparison sort |
| Hebrew_CI_AI | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AI_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AI_KS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AI_KS_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CI_AS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AS_KS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AS_KS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AI | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AI_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AI_KS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AI_KS_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AS_KS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AS_KS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_BIN | Hebrew-100, binary sort |
| Hebrew_100_BIN2 | Hebrew-100, binary code point comparison sort |
| Hebrew_100_CI_AI | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AI_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AI_KS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AI_KS_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AS_KS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AS_KS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AI | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AI_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AI_KS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AI_KS_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AS_KS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AS_KS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AI_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
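A list like the one above can also be regenerated on your own instance instead of being copied from elsewhere. A minimal sketch, assuming the built-in sys.fn_helpcollations() function; the name filter is my own choice for illustration:

```sql
-- List every Hebrew collation known to this instance, with its description.
-- [_] escapes the underscore, which is otherwise a single-character wildcard in LIKE.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Hebrew[_]%'
ORDER BY name;
```

The exact rows returned depend on the SQL Server version, so the output may not match the table above line for line.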
First, regardless of anything else, you want to use the newest set of collations, which are the _100_ series, as they have newer / more complete sort weights and linguistic rules than the older series with no version number in the name (technically they are version 80). (NOTE: testing is showing that the version 100 collations are not allowing cantillation marks to be ignored while the non-versioned collations are, which is the opposite of how it should be...will update with what I find out)
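To check which series a given collation belongs to without relying on its name, something like the following may help. It is only a sketch: it assumes the documented 'Version' property of COLLATIONPROPERTY, which reports a small integer (0 for unversioned names, 2 for _100_ names) rather than 80 or 100.

```sql
-- COLLATIONPROPERTY(..., 'Version') reports the collation version as a small integer.
-- Per the documentation, unversioned names map to 0 and _100_ names map to 2.
SELECT COLLATIONPROPERTY(N'Hebrew_CS_AS',        'Version') AS UnversionedSeries,
       COLLATIONPROPERTY(N'Hebrew_100_CS_AS_SC', 'Version') AS Version100Series;
```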
Next, it depends on how you will be interacting with the string values. Hebrew has no case variations, nor do the concepts of "width" or "Kana" apply. Accent-sensitive / insensitive might apply to diacritics used for vowels and cantillation marks (i.e. trope). Maybe answer these questions:
Do you need to distinguish between א and אֽ, or between ש, שׁ, and שׂ? If so, you want an _AS collation. (Actually, testing is showing that vowels and cantillation marks are always sensitive in version 100 collations, but not in unversioned / version 80 collations, which is the opposite of how this should be.) The following test shows that a vowel difference still compares as different in a newer, accent-insensitive collation, which I think should compare as the same:
SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_100_CI_AI_SC;
-- no rows (was expecting 1 row!!)
SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AI;
-- 1 row (expected)
-- א אֽ
SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AS;
-- no rows (expected)
Even just a cantillation mark difference is an issue with the newer accent-insensitive collations, and this seems odd given that the mark is Hebrew Accent Geresh (U+059C) (which has "accent" in the name!):
SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_100_CS_AI_SC;
-- no rows (was expecting 1 row!!)
SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AI;
-- 1 row (expected)
-- אֽ֜ אֽ
SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
NCHAR(0x05D0) + NCHAR(0x05BD)
WHERE NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AS;
-- no rows (expected)
Do you need to distinguish between upper-case and lower-case in other languages, such as English? If so, then you want a _CS collation; otherwise you can use a _CI collation (i.e. case-insensitive).
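The case question only affects non-Hebrew text stored alongside the Hebrew. A quick sketch of the difference, using two Latin-script strings that I picked purely for illustration:

```sql
-- Same two Latin-script strings, compared under a case-insensitive
-- and a case-sensitive Hebrew collation.
SELECT CASE WHEN N'Genesis' = N'GENESIS' COLLATE Hebrew_100_CI_AS_SC
            THEN 'equal' ELSE 'different' END AS CaseInsensitive,
       CASE WHEN N'Genesis' = N'GENESIS' COLLATE Hebrew_100_CS_AS_SC
            THEN 'equal' ELSE 'different' END AS CaseSensitive;
```

The first column should come back as 'equal' and the second as 'different'.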
You do not want a binary collation (_BIN or _BIN2), as those treat Hebrew letters that carry the same vowels and cantillation marks, but with the combining characters in a different order, as different strings; nor can they ignore vowels and other marks to equate things like א and אֽ.
For example (vowel and cantillation mark combining characters in opposite order):
SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C)
WHERE NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C) COLLATE Hebrew_100_CS_AS_SC;
-- אֽ֜ אֽ֜
SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C)
WHERE NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C) COLLATE Hebrew_100_BIN2;
-- no rows
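The reason the binary collation sees two different values becomes visible when the strings are converted to their raw bytes. A small sketch, assuming nothing beyond the same three code points used in the test above:

```sql
-- Show the raw UTF-16 (little-endian) bytes that a binary collation compares.
SELECT CONVERT(VARBINARY(12), NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD)) AS AccentThenVowel,
       CONVERT(VARBINARY(12), NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C)) AS VowelThenAccent;
-- AccentThenVowel: 0xD0059C05BD05
-- VowelThenAccent: 0xD005BD059C05
```

The two byte sequences differ from the third byte onward, so a byte-by-byte comparison cannot treat them as equal even though they render identically.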
Also, the collations ending in _SC support supplementary characters (i.e. full UTF-16), so it is usually best to pick one of those, if available.
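As a rough illustration of what _SC changes, the sketch below builds one supplementary character from its UTF-16 bytes and measures it under both kinds of collation. The choice of U+10900 (PHOENICIAN LETTER ALF) and the variable name are mine, not part of the original tests:

```sql
-- U+10900 (PHOENICIAN LETTER ALF) is a supplementary character: one code point,
-- stored in UTF-16 as the surrogate pair D802 DD00 (bytes 02 D8 00 DD, little-endian).
DECLARE @alf NVARCHAR(2) = CONVERT(NVARCHAR(2), 0x02D800DD);

SELECT LEN(@alf COLLATE Hebrew_100_CS_AS)    AS LenWithoutSC,  -- counts UTF-16 code units
       LEN(@alf COLLATE Hebrew_100_CS_AS_SC) AS LenWithSC;     -- counts code points
```

According to the documented behaviour of _SC collations, the first length should be 2 and the second 1. The data is stored the same way in both cases; only how functions interpret it changes.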
So, perhaps Hebrew_100_CI_AS_SC or Hebrew_100_CS_AS_SC for the columns, and you can override per expression / predicate via the COLLATE clause if you need to use a variation, such as COLLATE Hebrew_CS_AI.
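A minimal sketch of that arrangement, assuming a hypothetical temporary table #Verse with a VerseText column (neither name comes from the question): the column carries the accent-sensitive collation, and a single predicate switches to Hebrew_CS_AI to ignore the marks.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE #Verse
(
    VerseID   INT NOT NULL PRIMARY KEY,
    VerseText NVARCHAR(500) COLLATE Hebrew_100_CS_AS_SC NOT NULL
);

-- Alef + Geresh (cantillation mark) + Meteg
INSERT INTO #Verse (VerseID, VerseText)
VALUES (1, NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD));

-- Uses the column's own accent-sensitive collation, so the marks must match.
SELECT VerseID
FROM   #Verse
WHERE  VerseText = NCHAR(0x05D0) + NCHAR(0x05BD);

-- Overrides the collation for this one predicate; the same pair of strings
-- matched under Hebrew_CS_AI in the test shown earlier in this answer.
SELECT VerseID
FROM   #Verse
WHERE  VerseText = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AI;

DROP TABLE #Verse;
```

Keeping the stricter collation on the column and relaxing it per query means the stored data is never compared more loosely than you asked for. The trade-off is that a predicate with an overriding COLLATE generally cannot seek on an index built under the column's collation.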
Also, you will need to store the data in NVARCHAR columns / variables.
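One way to check this for your own data is to round-trip a sample value through VARCHAR and compare the bytes. The sketch below does that; the variable names are hypothetical, and the working assumption (not verified here) is that the code page a Hebrew collation uses for VARCHAR data does not cover every cantillation mark.

```sql
-- Alef + Geresh (cantillation mark) + Meteg, stored as Unicode.
DECLARE @Original NVARCHAR(10) = NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD);

-- Convert to VARCHAR under a Hebrew collation's code page, then back to NVARCHAR.
DECLARE @RoundTrip NVARCHAR(10) =
    CONVERT(NVARCHAR(10),
            CONVERT(VARCHAR(10), @Original COLLATE Hebrew_100_CS_AS_SC));

SELECT CONVERT(VARBINARY(20), @Original)  AS [OriginalBytes],
       CONVERT(VARBINARY(20), @RoundTrip) AS [RoundTripBytes],
       CASE WHEN CONVERT(VARBINARY(20), @Original) = CONVERT(VARBINARY(20), @RoundTrip)
            THEN 'preserved' ELSE 'changed' END AS [Result];
```

If the result is 'changed', at least one character could not be represented in VARCHAR and was replaced or dropped during the conversion.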
edited 6 hours ago
answered 7 hours ago
Solomon Rutzky
It depends on a lot of things. A collation controls sorting, comparison, and the code page used for non-Unicode data.
This repo has a good list of options around Hebrew.
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
| Hebrew_BIN | Hebrew, binary sort |
| Hebrew_BIN2 | Hebrew, binary code point comparison sort |
| Hebrew_CI_AI | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AI_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AI_KS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AI_KS_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CI_AS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AS_KS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AS_KS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AI | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AI_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AI_KS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AI_KS_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AS_KS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AS_KS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_BIN | Hebrew-100, binary sort |
| Hebrew_100_BIN2 | Hebrew-100, binary code point comparison sort |
| Hebrew_100_CI_AI | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AI_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AI_KS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AI_KS_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AS_KS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AS_KS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AI | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AI_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AI_KS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AI_KS_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AS_KS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AS_KS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AI_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
answered 8 hours ago
scsimon