Hi guys,

Is there a function in .NET that returns the base letter (the letter without diacritics such as the cedilla)? Something like:

Select Case c
  Case "á", "à", "ã", "â", "ä", "ª" : x = "a"
  Case "é", "è", "ê", "ë" : x = "e"
  Case "í", "ì", "î", "ï" : x = "i"
  Case "ó", "ò", "õ", "ô", "ö", "º" : x = "o"
  Case "ú", "ù", "û", "ü" : x = "u"

  Case "Á", "À", "Ã", "Â", "Ä" : x = "A"
  Case "É", "È", "Ê", "Ë" : x = "E"
  Case "Í", "Ì", "Î", "Ï" : x = "I"
  Case "Ó", "Ò", "Õ", "Ô", "Ö" : x = "O"
  Case "Ú", "Ù", "Û", "Ü" : x = "U"

  Case "ç" : x = "c"
  Case "Ç" : x = "C"

  Case Else
       x = c
End Select

This code misses some letters, but it's only for the sake of example :)

+2  A: 

By the way (completely unrelated to the question), your code operates on strings. This isn't only less efficient, it actually doesn't really make sense since you're interested in individual characters rather than strings, and these are distinct data types in .NET.

To get a single-character literal rather than a string literal, append c to your literal:

Select Case c
  Case "á"c, "à"c, "ã"c, "â"c, "ä"c, "ª"c : x = "a"c
  ' … and so on. '
End Select
Konrad Rudolph
+1  A: 

Taken from Chetan Sastry's response: here is the VB.NET code, along with the C# version copied from his GREAT answer :)

VB:

Imports System.Text
Imports System.Globalization

''' <summary>
''' Removes the special attributes of the letters passed in the word
''' </summary>
''' <param name="word">Word to be normalized</param>
Function RemoveDiacritics(ByVal word As String) As String
    Dim normalizedString As String = word.Normalize(NormalizationForm.FormD)
    Dim r As StringBuilder = New StringBuilder()
    Dim i As Integer
    Dim c As Char

    ' VB loop bounds are inclusive, so stop at Length - 1
    For i = 0 To normalizedString.Length - 1
        c = normalizedString(i)
        If (CharUnicodeInfo.GetUnicodeCategory(c) <> UnicodeCategory.NonSpacingMark) Then
            r.Append(c)
        End If
    Next

    RemoveDiacritics = r.ToString
End Function

C#:

using System.Text;
using System.Globalization;

/// <summary>
/// Removes the special attributes of the letters passed in the word
/// </summary>
/// <param name="word">Word to be normalized</param>
public String RemoveDiacritics(String word)
{
  String normalizedString = word.Normalize(NormalizationForm.FormD);
  StringBuilder stringBuilder = new StringBuilder();
  int i;
  Char c;

  for (i = 0; i < normalizedString.Length; i++)
  {
    c = normalizedString[i];
    if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
      stringBuilder.Append(c);
  }

  return stringBuilder.ToString();
}
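
One thing worth knowing about the approach above: FormD is a canonical decomposition, so it splits "á" into "a" plus a combining accent, but it leaves "ª" and "º" from the question untouched, because those only have compatibility decompositions. Here is a minimal sketch that lets the caller pick the form. The class name DiacriticsDemo and the optional form parameter are my own additions for illustration, not part of the original answer:

```csharp
using System.Globalization;
using System.Text;

public static class DiacriticsDemo
{
    // Decomposes the input, drops the combining (non-spacing) marks,
    // then recomposes whatever is left.
    // Pass NormalizationForm.FormKD to also fold compatibility
    // characters such as "ª" and "º".
    public static string RemoveDiacritics(
        string word,
        NormalizationForm form = NormalizationForm.FormD)
    {
        var builder = new StringBuilder(word.Length);

        foreach (char c in word.Normalize(form))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        return builder.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this sketch, DiacriticsDemo.RemoveDiacritics("áçãõ") gives "acao", DiacriticsDemo.RemoveDiacritics("ª") gives "ª" unchanged, and DiacriticsDemo.RemoveDiacritics("ª", NormalizationForm.FormKD) gives "a". Letters that have no decomposition at all, such as "ø", come back unchanged with either form.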

I hope it helps someone like me :)

balexandre
A: 

Hi guys, there is also a simple way to normalize strings for comparison in .NET:

// requires: using System.Text.RegularExpressions;
public static string NormalizeString(string value)
{
    string nameFormatted = value.Normalize(System.Text.NormalizationForm.FormKD);
    Regex reg = new Regex("[^a-zA-Z0-9 ]");
    return reg.Replace(nameFormatted, "");
}
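
To make the behavior of this regex approach concrete, here is a small self-contained sketch. The class name AsciiKeyDemo and the cached Regex field are assumptions of mine, the logic is otherwise the same. Note that the pattern removes everything that is not an ASCII letter, digit, or space, so punctuation and letters without a decomposition are dropped as well, not just the accents:

```csharp
using System.Text;
using System.Text.RegularExpressions;

public static class AsciiKeyDemo
{
    // Matches anything that is not an ASCII letter, an ASCII digit, or a space.
    private static readonly Regex NonAsciiAlnum = new Regex("[^a-zA-Z0-9 ]");

    public static string NormalizeString(string value)
    {
        // FormKD splits accented letters into base letter + combining mark;
        // the regex then deletes the marks along with any other
        // character outside the allowed set.
        string decomposed = value.Normalize(NormalizationForm.FormKD);
        return NonAsciiAlnum.Replace(decomposed, "");
    }
}
```

For example, AsciiKeyDemo.NormalizeString("São Paulo, Brasil!") gives "Sao Paulo Brasil", while AsciiKeyDemo.NormalizeString("Søren") gives "Sren", because "ø" has no decomposition and is simply deleted. That is probably fine for building a comparison key, but not for text you intend to display.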

Javier Mateos