Removing Accents In a Ruby String

Use this to convert "àéêîôûù" to "aeeiouu".

It's fairly easy to convert a string to a URL-safe version. Both PHP and Ruby have a way to do this, but they escape extended characters rather than replacing them, and that's often not what I want.
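To make the difference concrete, here is a small sketch of what escaping looks like. I'm assuming Ruby's standard `URI.encode_www_form_component` as the URL-safe conversion; it is only one of several ways to do this.

```ruby
require 'uri'

name = "François Léveillé"

# Each accented character is percent-encoded as its UTF-8 bytes and the
# space becomes "+": the accents are escaped, not replaced by base letters.
escaped = URI.encode_www_form_component(name)
puts escaped  # Fran%C3%A7ois+L%C3%A9veill%C3%A9
```

The result is valid in a URL, but it is not something you'd want a user to read or type.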

If you're like me, you've probably had many occasions where you wanted a 7-bit ASCII version of a string, with each accented character converted to its base letter. For example, you want to give each user their own page at [domain.com]/[name]. When the name is "Steve Smith", that's not a problem, but what if the name is "François Léveillé"?

What you want in this case is "francois_leveille" or "francoisleveille" and this is where this script will help you.
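For readers who want to see the idea without the script, here is a minimal sketch of one way to get that result. It is not the script's implementation: `ascii_slug` is a hypothetical helper, and it assumes a Ruby version that provides `String#unicode_normalize` (2.2 or later).

```ruby
# Hypothetical helper for illustration; not the downloadable script's code.
def ascii_slug(name)
  name.unicode_normalize(:nfd)   # "é" becomes "e" plus a combining accent
      .gsub(/\p{Mn}/, '')        # drop the combining marks
      .downcase
      .gsub(/[^a-z0-9]+/, '_')   # turn any other run of characters into "_"
      .gsub(/\A_+|_+\z/, '')     # trim underscores left at either end
end

puts ascii_slug("François Léveillé")  # francois_leveille
puts ascii_slug("àéêîôûù")            # aeeiouu
```

This only handles characters that decompose into a base letter plus an accent. Letters such as "ø" or "ß" have no such decomposition and would end up as underscores in this sketch.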

First, download the script. Make sure to require it in your project. If you are using Ruby on Rails, place it in the lib directory and add this line to the bottom of your config/environment.rb file:

    require 'extend_string'

Once that's done, the String class has two additional methods. This means that for any string anywhere in your project, you can do:

    # Set a sample string to test things out
    mystring = "Ceci Est UN test : éàòù"

    # The removeaccents method simply removes the accents
    # and returns the string
    mystring.removeaccents

    # The urlize method not only calls removeaccents,
    # but also a bunch of others to make it truly URL-ready.
    mystring.urlize

    # You can customize urlize with options:
    # :downcase => true
    #     will convert the entire string to lowercase
    # :convert_spaces => true
    #     will convert spaces to underscores
    # :regexp => //
    #     matching characters will be removed
    mystring.urlize({:downcase => true})
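To show how options like these can fit together, here is a rough sketch of an options-driven method. Everything in it is an assumption: `sketch_urlize` is a hypothetical stand-in, and its defaults and order of operations are my guesses for illustration, not the script's actual behavior.

```ruby
# Hypothetical stand-in for an options-driven method.
# The defaults and the order of the steps are assumptions.
def sketch_urlize(str, options = {})
  defaults = {
    :downcase       => true,
    :convert_spaces => true,
    :regexp         => /[^A-Za-z0-9_\-\s]/  # characters to remove
  }
  opts = defaults.merge(options)

  # Strip accents by decomposing and dropping the combining marks
  result = str.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
  result = result.downcase if opts[:downcase]
  result = result.gsub(opts[:regexp], '') if opts[:regexp]
  result = result.strip.gsub(/\s+/, '_') if opts[:convert_spaces]
  result
end

mystring = "Ceci Est UN test : éàòù"
puts sketch_urlize(mystring)                      # ceci_est_un_test_eaou
puts sketch_urlize(mystring, :downcase => false)  # Ceci_Est_UN_test_eaou
```

Merging the caller's hash over a hash of defaults keeps each option independent, so you can switch one off without restating the others.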

See the full documentation for more details on what's possible with the new methods.

This work is licensed under a Creative Commons Attribution-Share Alike 2.5 Canada License.