Universal Acceptance (UA) Micro-Learning Module: Module 3- Unicode in Data Structures and Algorithms in Java.
Instructor Guide
1st Edition.
���© 2024 Creative Commons License - Attribution 4.0 International (CC BY 4.0).�
Universal Acceptance
Unicode In Data Structures and Algorithms in Java- Micro-Learning Module Objectives.
| 2
Note About the Utilization of Unicode String Literals :
| 3
Implementing Data Structures with Local Language Strings(1/6):
import java.util.ArrayList;
import java.util.List;
// Create a list of Unicode strings
List<String> unicodeList = new ArrayList<>();
unicodeList.add("ሰላም"); // Ethiopic (Amharic)
unicodeList.add("مرحبا"); // ‘Marhaba’ in Arabic, hello or welcome in English
unicodeList.add("සුබ උදෑසනක්"); // Suba Udaesank’ in Sinhala
unicodeList.add("你好"); // ‘nǐ hǎo’ in Chinese
| 4
Implementing Data Structures with Local Language Strings(2/6):
,
import java.util.HashMap;
import java.util.Map;
// Create a dictionary (map) with Unicode strings
Map<String, String> unicodeDictionary = new HashMap<>();
unicodeDictionary.put("ሰላም", "Hello"); // Ethiopic (Amharic)
unicodeDictionary.put("مرحبا", "Bonjour"); // Arabic
unicodeDictionary.put("සුබ උදෑසනක්", "Good morning"); // Sinhala
unicodeDictionary.put("你好", "Ni hao"); // Chinese
| 5
Implementing Data Structures with Local Language Strings(3/6):
,
import java.util.HashSet;
import java.util.Set;
// Create a set of Unicode strings
Set<String> unicodeSet = new HashSet<>();
unicodeSet.add("ሰላም"); // Ethiopic (Amharic)
unicodeSet.add("مرحبا"); // Arabic
unicodeSet.add("සුබ උදෑසනක්"); // Sinhala
unicodeSet.add("你好"); // Chinese
| 6
Implementing Data Structures with Local Language Strings(4/6):
,
public static class TreeNode {
String data;
TreeNode left;
TreeNode right;
public TreeNode(String data) {
this.data = data;
this.left = null;
this.right = null;
}
}
// Create a tree with Unicode strings
TreeNode root = new TreeNode("ሰላም"); // Ethiopic (Amharic)
root.left = new TreeNode("مرحبا"); // Arabic
root.right = new TreeNode("සුබ උදෑසනක්"); // Sinhala
root.left.left = new TreeNode("你好"); // Chinese
| 7
Implementing Data Structures with Local Language Strings(5/6):
,
| 8
Implementing Data Structures with Local Language Strings(6/6):
,
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// Create a graph with Unicode strings
Map<String, List<String>> graph = new HashMap<>();
// Add vertices
addVertex(graph, "ሰላም"); // Ethiopic (Amharic)
addVertex(graph, "مرحبا"); // Arabic
addVertex(graph, "සුබ උදෑසනක්"); // Sinhala
addVertex(graph, "你好"); // Chinese
// Add edges
addEdge(graph, "ሰላም", "مرحبا");
addEdge(graph, "مرحبا", "සුබ උදෑසනක්");
addEdge(graph, "مرحبا", "你好");
| 9
Unicode Collation Algorithm for Sorting of Unicode Strings:
| 10
How to Sort Strings Using the Unicode Collation Algorithm (UCA)(1/2):
Example: Sorting of Strings Using UCA in Java:
,
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;
public class UnicodeCollationExample {
public static void main(String[] args) {
// Create a Collator instance for the desired locale
Collator collator = Collator.getInstance(Locale.getDefault());
// Array of Unicode strings to be sorted
String[] strings = {
"अनुच्छेद", // Hindi: Paragraph
"Éclair", // French: Pastry
"Österreich", // German: Austria
"中国", // Chinese: China
"Россия", // Russian: Russia
"日本", // Japanese: Japan
"ሉሲ" //Ethiopian: Lucy
};
// Sort the strings using the collator
Arrays.sort(strings, collator);
// Print the sorted strings
for (String str : strings) {
System.out.println(str);
}
}
}
| 11
How to Sort Strings Using the Unicode Collation Algorithm (UCA)(2/2):
Example: Sorting of Strings Using UCA in Java -Output:
,
Output:
Éclair
Österreich
Россия
ሉሲ
अनुच्छेद
中国
日本
| 12
Some of Commonly Used Sorting Algorithms in Data Structures and Algorithms(1/4):
| 13
Some of Commonly Used Sorting Algorithms in Data Structures and Algorithms(2/4):
| 14
Some of Commonly Used Sorting Algorithms in Data Structures and Algorithms(3/4):
| 15
Some of Commonly Used Sorting Algorithms in Data Structures and Algorithms(4/4):
| 16
Example: Unicode-aware Bubble Sort in Java(1/3):
,
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;
public class UnicodeAwareBubbleSort {
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
unicodeAwareBubbleSort(strings);
// Print the sorted strings
for (String string : strings) {
System.out.println(string);
}
}
| 17
Example: Unicode-aware Bubble Sort in Java(2/3):
,
public static void unicodeAwareBubbleSort(String[] strings) {
int n = strings.length;
for (int i = 0; i < n; i++) {
// Flag to check if any swaps occurred in this pass
boolean swapped = false;
for (int j = 0; j < n - i - 1; j++) {
// Compare and swap if necessary using Unicode-aware comparison
if (compareStrings(strings[j], strings[j + 1]) > 0) {
String temp = strings[j];
strings[j] = strings[j + 1];
strings[j + 1] = temp;
swapped = true;
}
}
// If no swaps occurred in this pass, the array is already sorted
if (!swapped) {
break;
}
}
}
| 18
Example: Unicode-aware Bubble Sort in Java(3/3):
,
public static int compareStrings(String str1, String str2) {
// Compare the strings using the default locale-specific collator
Collator collator = Collator.getInstance(Locale.getDefault());
return collator.compare(str1, str2);
}
}
Output:
καθολική-αποδοχή-δοκιμή.ευ
универсальное-принятие-тест.москва
تجربة-القبول-الشامل.موريتانيا
सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन
সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি
ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා
უნივერსალური-თავსობადობის-ტესტი.გე
普遍适用测试.我爱你
| 19
Example: Unicode-aware Selection Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.Arrays;
public class UnicodeAwareSelectionSort {
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
selectionSort(strings);
// Print the sorted strings
for (String string : strings) {
System.out.println(string);
}
}
| 20
Example: Unicode-aware Selection Sort in Java(2/3):
,
public static void selectionSort(String[] arr) {
int n = arr.length;
for (int i = 0; i < n; i++) {
int minIndex = i;
for (int j = i + 1; j < n; j++) {
// Compare using Unicode-aware comparison
if (compareStrings(arr[j], arr[minIndex]) < 0) {
minIndex = j;
}
}
if (minIndex != i) {
// Swap elements
String temp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = temp;
}
}
}
| 21
Example: Unicode-aware Selection Sort in Java(3/3):
,
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return str1Norm.compareToIgnoreCase(str2Norm);
}
}
Output:
καθολική-αποδοχή-δοκιμή.ευ
универсальное-принятие-тест.москва
تجربة-القبول-الشامل.موريتانيا
सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन
সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি
ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා
უნივერსალური-თავსობადობის-ტესტი.გე
普遍适用测试.我爱你
| 22
Example: Unicode-aware Insertion Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.Arrays;
import java.util.Comparator;
public class StringComparator {
public static String[] insertionSort(String[] arr) {
int n = arr.length;
for (int i = 1; i < n; i++) {
String key = arr[i];
int j = i - 1;
while (j >= 0 && compareStrings(arr[j], key) > 0) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
return arr;
}
| 23
Example: Unicode-aware Insertion Sort in Java(2/3):
,
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return Character.codePointAt(str1Norm, 0) - Character.codePointAt(str2Norm, 0);
}
| 24
Example: Unicode-aware Insertion Sort in Java(3/3):
,
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
Arrays.sort(strings, Comparator.comparing(String::new, (str1, str2) -> compareStrings(str1, str2)));
for (String item : strings) {
System.out.println(item);
}
}
}
| 25
Example: Unicode-aware Merge Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.Arrays;
public class UStringMergeSorter {
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return Character.compare(str1Norm.charAt(0), str2Norm.charAt(0));
}
public static String[] mergeSort(String[] arr) {
if (arr.length <= 1) {
return arr;
}
int mid = arr.length / 2;
String[] leftHalf = Arrays.copyOfRange(arr, 0, mid);
String[] rightHalf = Arrays.copyOfRange(arr, mid, arr.length);
leftHalf = mergeSort(leftHalf);
rightHalf = mergeSort(rightHalf);
return merge(leftHalf, rightHalf);
}
| 26
Example: Unicode-aware Merge Sort in Java(2/3):
,
public static String[] merge(String[] left, String[] right) {
String[] merged = new String[left.length + right.length];
int i = 0, j = 0, k = 0;
while (i < left.length && j < right.length) {
if (compareStrings(left[i], right[j]) <= 0) {
merged[k++] = left[i++];
} else {
merged[k++] = right[j++];
}
}
while (i < left.length) {
merged[k++] = left[i++];
}
while (j < right.length) {
merged[k++] = right[j++];
}
return merged;
}
| 27
Example: Unicode-aware Merge Sort in Java(3/3):
,
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
String[] sortedStrings = mergeSort(strings);
for (String item : sortedStrings) {
System.out.println(item);
}
}
}
| 28
Example: Unicode-aware Quick Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
public class UStringQuickSorter {
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return Character.compare(str1Norm.charAt(0), str2Norm.charAt(0));
}
public static List<String> quickSort(List<String> arr) {
if (arr.size() <= 1) {
return arr;
}
String pivot = arr.get(arr.size() / 2);
| 29
Example: Unicode-aware Quick Sort in Java(2/3):
,
List<String> left = new ArrayList<>();
List<String> equal = new ArrayList<>();
List<String> right = new ArrayList<>();
for (String item : arr) {
int comparison = compareStrings(item, pivot);
if (comparison < 0) {
left.add(item);
} else if (comparison == 0) {
equal.add(item);
} else {
right.add(item);
}
}
left = quickSort(left);
right = quickSort(right);
List<String> sortedList = new ArrayList<>();
sortedList.addAll(left);
sortedList.addAll(equal);
sortedList.addAll(right);
return sortedList;
}
| 30
Example: Unicode-aware Quick Sort in Java(3/3):
,
public static void main(String[] args) {
List<String> strings = new ArrayList<>();
strings.add("সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি"); // Bangla (Bengali)
strings.add("универсальное-принятие-тест.москва"); // Cyrillic
strings.add("सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन"); // Devanagari
strings.add("უნივერსალური-თავსობადობის-ტესტი.გე"); // Georgian
strings.add("καθολική-αποδοχή-δοκιμή.ευ"); // Greek
strings.add("تجربة-القبول-الشامل.موريتانيا"); // Arabic
strings.add("普遍适用测试.我爱你"); // Simplified Chinese
strings.add("ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා"); // Sinhala
List<String> sortedStrings = quickSort(strings);
for (String item : sortedStrings) {
System.out.println(item);
}
}
}
| 31
Example: Unicode-aware Heap Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.Arrays;
public class UStringHeapSorter {
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return Character.compare(str1Norm.charAt(0), str2Norm.charAt(0));
}
public static void heapify(String[] arr, int n, int i) {
int largest = i;
int left = 2 * i + 1;
int right = 2 * i + 2;
if (left < n && compareStrings(arr[left], arr[largest]) > 0) {
largest = left;
}
if (right < n && compareStrings(arr[right], arr[largest]) > 0) {
largest = right;
}
| 32
Example: Unicode-aware Heap Sort in Java(2/3):
,
if (largest != i) {
String temp = arr[i];
arr[i] = arr[largest];
arr[largest] = temp;
heapify(arr, n, largest);
}
}
public static String[] heapSort(String[] arr) {
int n = arr.length;
// Build max heap
for (int i = n / 2 - 1; i >= 0; i--) {
heapify(arr, n, i);
}
// Heap sort
for (int i = n - 1; i > 0; i--) {
String temp = arr[0];
arr[0] = arr[i];
arr[i] = temp;
heapify(arr, i, 0);
}
return arr;
}
| 33
Example: Unicode-aware Heap Sort in Java(3/3):
,
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
strings = heapSort(strings);
for (String item : strings) {
System.out.println(item);
}
}
}
| 34
Example: Unicode-aware Tim Sort in Java(1/3):
,
import java.text.Normalizer;
import java.util.Arrays;
public class UStringTimSorter {
public static int compareStrings(String str1, String str2) {
// Normalize strings for consistent comparison
String str1Norm = Normalizer.normalize(str1, Normalizer.Form.NFD);
String str2Norm = Normalizer.normalize(str2, Normalizer.Form.NFD);
// Use case-insensitive comparison using Unicode code points
return Character.compare(str1Norm.charAt(0), str2Norm.charAt(0));
}
public static void insertionSort(String[] arr, int left, int right) {
for (int i = left + 1; i <= right; i++) {
String key = arr[i];
int j = i - 1;
while (j >= left && compareStrings(arr[j], key) > 0) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
}
| 35
Example: Unicode-aware Tim Sort in Java(2/3):
,
public static void merge(String[] arr, int left, int mid, int right) {
int len1 = mid - left + 1;
int len2 = right - mid;
String[] leftArr = Arrays.copyOfRange(arr, left, mid + 1);
String[] rightArr = Arrays.copyOfRange(arr, mid + 1, right + 1);
int i = 0, j = 0, k = left;
while (i < len1 && j < len2) {
if (compareStrings(leftArr[i], rightArr[j]) <= 0) {
arr[k] = leftArr[i];
i++;
} else {
arr[k] = rightArr[j];
j++;
}
k++;
}
while (i < len1) {
arr[k] = leftArr[i];
i++;
k++;
}
| 36
Example: Unicode-aware Tim Sort in Java(2/3):
,
while (j < len2) {
arr[k] = rightArr[j];
j++;
k++;
}
}
public static String[] timSort(String[] arr) {
int n = arr.length;
int minRun = 32;
for (int i = 0; i < n; i += minRun) {
insertionSort(arr, i, Math.min(i + minRun - 1, n - 1));
}
int size = minRun;
while (size < n) {
for (int start = 0; start < n; start += size * 2) {
int mid = Math.min(start + size - 1, n - 1);
int end = Math.min(start + size * 2 - 1, n - 1);
merge(arr, start, mid, end);
}
| 37
Example: Unicode-aware Tim Sort in Java(3/3):
,
size *= 2;
}
return arr;
}
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
strings = timSort(strings);
for (String item : strings) {
System.out.println(item);
}
}
}
| 38
Example: Binary Searches in Unicode Strings in Java:(1/3):
,
import java.util.Arrays;
public class LexicographicBinarySearch {
public static int binarySearchUnicode(String[] arr, String target) {
int low = 0;
int high = arr.length - 1;
while (low <= high) {
int mid = (low + high) / 2;
String midVal = arr[mid];
if (target.equals(midVal)) {
return mid;
} else if (target.compareTo(midVal) < 0) {
high = mid - 1;
} else {
low = mid + 1;
}
}
return -1; // Target not found
}
| 39
Example: Binary Searches in Unicode Strings in Java:(2/3):
,
public static void main(String[] args) {
String[] strings = {
"تجربة-بريد-الكتروني@تجربة-القبول-الشامل.موريتانيا", // Arabic Unicode@IDN.IDN email.
"Էլփոստ-թեստ@համընդհանուր-ընկալում-թեստ.հայ", // Armenian Unicode@IDN.IDN email.
"ই-মেইল-পরীক্ষা@সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangala Unicode@IDN.IDN email.
"ფოსტის-ტესტი@უნივერსალური-თავსობადობისტესტი.გე", // Georgian Unicode@IDN.IDN email.
"מבחן-דואר-אלקטרוני@מבחן-קבלה-אוניברסלי.קום", // Hebrew Unicode@IDN.IDN email.
"மின்னஞ்சல்-சோதனை@பொது-ஏற்பு-சோதனை.சிங்கப்பூர்", // Tamil Unicode@IDN.IDN email.
"อีเมลทดสอบ@ยูเอทดสอบ.ไทย", // Thai Unicode@IDN.IDN email.
"ኢሜይል-ሙከራ@ሁለንአቀፍ-ተቀባይነት-ሙከራ.com", // Ethiopic Unicode@IDN.ASCII email.
"အီးမေးလ်စမ်းသပ်ချက်@အလုံးစုံလက်ခံမှုစမ်းသပ်ချက်.com" // Myanmar Unicode@IDN.ASCII email.
};
| 40
Example: Binary Searches in Unicode Strings in Java:(3/3):
,
String targetString = "ই-মেইল-পরীক্ষা@সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি"; // Target string to search
Arrays.sort(strings); // Sorting in lexicographic order
int result = binarySearchUnicode(strings, targetString);
if (result != -1) {
System.out.println(targetString + " found at index " + result);
} else {
System.out.println(targetString + " not found");
}
}
}
Output:
ই-মেইল-পরীক্ষা@সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি found at index 3
| 41
Searching Through Unicode Text (Normalization + Collation)
| 42
Example: Searching Unicode Text in Java:(1/2):
,
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;
public class StringSearchExample {
public static void main(String[] args) {
String[] strings = {
"সর্বজনীন-স্বীকৃ তির-পরীক্ষা.ভারি", // Bangla (Bengali)
"универсальное-принятие-тест.москва", // Cyrillic
"सार्वभौमिक-स्र्ीकृति-परीक्षण.संगठन", // Devanagari
"უნივერსალური-თავსობადობის-ტესტი.გე", // Georgian
"καθολική-αποδοχή-δοκιμή.ευ", // Greek
"تجربة-القبول-الشامل.موريتانيا", // Arabic
"普遍适用测试.我爱你", // Simplified Chinese
"ඉ-තැපැල්-පිරික්සුම@විශ්ව-සම්මුති-පිරික්සුම.ලංකා" // Sinhala
};
String searchQuery = "καθολικη-αποδοχη-δοκιμη.ευ";
// Normalize the search query using Unicode normalization form NFD
String normalizedSearchQuery = Normalizer.normalize(searchQuery,
Normalizer.Form.NFD);
| 43
Example: Searching Unicode Text in Java:(2/2):
,
// Create a Collator instance for the default locale with strength PRIMARY
Collator collator = Collator.getInstance(Locale.getDefault());
collator.setStrength(Collator.PRIMARY);
boolean found = false;
for (String str : strings) {
String normalizedStr = Normalizer.normalize(str, Normalizer.Form.NFD);
if (collator.compare(normalizedSearchQuery, normalizedStr) == 0) {
found = true;
break;
}
}
if (found) {
System.out.println("Search text” + searchQuery + “ is found in the array.");
} else {
System.out.println("Search text” + searchQuery + “ is not found in the array.");
}
}
}
Output: Search text καθολικη-αποδοχη-δοκιμη.ευ is found in the array.
| 44
Reference:
| 45
Author:
| 46