unicode_utils's Issues

Upcase does not work with Cyrillic characters

irb(main):001:0> require 'unicode_utils/upcase'
=> true
irb(main):002:0> puts UnicodeUtils.upcase('абвгдеёжзийклмнопрстуфхцчшщъыьэюя')
абвгде╤жзийклмноп└┴┬├─┼╞╟╚╔╩╦╠═╬╧
=> nil

Expected result is:
АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
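
One possible explanation, which the report does not confirm, is that the string reaches Ruby in a console code page rather than in UTF-8. The following is a minimal sketch for testing that idea. It assumes Ruby 2.4 or later (where the built-in String#upcase maps non-ASCII letters), and both the helper name upcase_utf8 and the choice of IBM866 as source encoding are hypothetical.

```ruby
# Hypothetical helper, not part of unicode_utils: reinterpret the raw bytes in
# the encoding they were actually typed in, then transcode them to UTF-8.
def upcase_utf8(raw, source_encoding)
  utf8 = raw.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
  utf8.upcase # Unicode-aware on Ruby >= 2.4
end

# Simulate console input arriving as IBM866 (CP866) bytes. The code page is an
# assumption made for illustration only.
console_bytes = "\u0430\u0431\u0432".encode("IBM866").b
puts console_bytes.encoding # binary: the bytes carry no text encoding yet
puts upcase_utf8(console_bytes, "IBM866")
```

If the transcoded string upcases correctly, the problem lies in how the console input is encoded rather than in the case mapping itself.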

Titlecase has a different behavior than Rails

I replaced the Rails titlecase method with UnicodeUtils.titlecase in order to handle UTF-8 characters. I noticed that the behavior is not exactly the same, even when no special characters are involved. For example:

Capitalized letter within word:

> "CompanyName".titlecase
 => "Company Name"
> UnicodeUtils.titlecase "CompanyName"
 => "Companyname"

Words that contain periods:

> "Company X.Y.Z.".titlecase
 => "Company X.Y.Z."
> UnicodeUtils.titlecase "Company X.Y.Z."
 => "Company X.y.z."

Is this intentional for some reason? It would be helpful if unicode_utils preserved the Rails behavior and differed only in how it handles special characters.
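
A rough sketch of the behavior requested above, covering only the two cases in this report. The helper name titleize_sketch is hypothetical. It is neither the Rails implementation nor a unicode_utils API, and it assumes Ruby 2.4 or later for Unicode-aware upcasing.

```ruby
# Hypothetical helper: approximates the two Rails results quoted above while
# still capitalizing non-ASCII letters.
def titleize_sketch(text)
  # Split camel case at a lowercase-to-uppercase boundary:
  # "CompanyName" -> "Company Name".
  spaced = text.gsub(/(\p{Ll})(\p{Lu})/, '\1 \2')
  # Upcase a lowercase letter only when it starts a word; letters that are
  # already uppercase (as in "X.Y.Z.") are left untouched.
  spaced.gsub(/(?<![\p{L}\p{N}'])\p{Ll}/) { |ch| ch.upcase }
end

puts titleize_sketch("CompanyName")
puts titleize_sketch("Company X.Y.Z.")
puts titleize_sketch("\u00e9lan vital")
```

The key design choice is that the sketch never downcases anything, which is why capitals after a period survive.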

What library should be used now?

Hello, this library doesn't seem to be maintained anymore (there are no recent commits). Which solution would you now recommend for Unicode normalization under Ruby 1.9?
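
The thread records no answer. For readers who can move to Ruby 2.2 or later (an assumption, and it does not help on Ruby 1.9 itself), the built-in String#unicode_normalize may be enough. A minimal sketch, with same_text? as a hypothetical helper name:

```ruby
# Hypothetical helper: compare two strings after NFC normalization.
# String#unicode_normalize is built into Ruby 2.2 and later.
def same_text?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

composed   = "\u00e9"  # "é" as a single code point
decomposed = "e\u0301" # "e" followed by a combining acute accent

puts composed == decomposed
puts same_text?(composed, decomposed)
```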

Licence missing in the rubygems version and in the gemspec

The "unicode_utils" gem seems not to have a license at all. Unless a license that specifies otherwise is included, nobody else can use, copy, distribute, or modify that library without being at risk of take-downs, shake-downs, or litigation.

I know that this gem has a license on GitHub; however, it is missing on rubygems.org and in the gemspec.
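
For illustration, this is roughly how a license is declared in a gemspec. Every value below is a placeholder, including the gem name and the "MIT" identifier. The actual license of unicode_utils is not stated in this report and would have to be taken from the repository.

```ruby
require "rubygems"

# Placeholder specification, NOT the real unicode_utils gemspec.
spec = Gem::Specification.new do |s|
  s.name    = "example_gem"          # placeholder
  s.version = "0.0.1"                # placeholder
  s.summary = "Placeholder summary"  # placeholder
  s.authors = ["Placeholder Author"] # placeholder
  s.files   = []
  s.license = "MIT"                  # placeholder license identifier
end

puts spec.license
```

RubyGems publishes this field with the gem, so it appears on the gem's rubygems.org page after the next release.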

Interested in UTR#30 support?

There was a UTR#30 for "ASCII folding". While it has been withdrawn from the Unicode standard, many people find it useful anyway. For instance, Solr/Lucene still supports it with its ICUFoldingFilterFactory.

Here are what I think are the relevant unicode ".txt" source files with mappings to implement UTR#30, from the lucene source: https://github.com/apache/lucene-solr/tree/trunk/lucene/analysis/icu/src/data/utr30

I note that unicode_utils uses the same kind of Unicode .txt mapping files as definitions for the parts of Unicode it already implements.

That would probably make UTR#30 support quite feasible too. Even though it is no longer part of Unicode, some people still find it useful and need it (myself included).

Are you interested in unicode_utils supporting UTR#30? I could try to create a pull request, although it would take me a while to figure out how to fit the .txt mapping definition files into unicode_utils properly. If you were interested, you could possibly do it in only a few minutes.
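
For readers who only need accent stripping, a rough approximation can be built from Ruby's built-in normalization (Ruby 2.2 or later). This is not UTR#30 and does not use its mapping files. It only decomposes characters and drops combining marks, and fold_diacritics is a hypothetical helper name.

```ruby
# Hypothetical helper: a crude stand-in for ASCII folding, NOT an
# implementation of UTR#30.
def fold_diacritics(text)
  # NFKD splits "é" into "e" plus a combining accent; the gsub then removes
  # the nonspacing marks (Unicode category Mn).
  text.unicode_normalize(:nfkd).gsub(/\p{Mn}/, "")
end

puts fold_diacritics("Cr\u00e8me Br\u00fbl\u00e9e") # prints "Creme Brulee"
```

Characters without a decomposition, such as "ø" or "ß", pass through unchanged, which is one of the gaps a table-driven approach like UTR#30 is meant to fill.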

Allowing only Ruby >= 1.9.1 versions

Hi,

I wanted to install this gem on REE (Ruby Enterprise Edition), but since the gem requires Ruby >= 1.9.1, it can't be installed.

Could you do a quick fix to allow installation on REE?

Thanks!

Kulgar.

require "unicode_utils" takes a very long time

I've installed unicode_utils on Ruby 2.0 on Windows 7 x64. When I execute this:

require "time"
beginning_time = Time.now
require "unicode_utils"
end_time = Time.now
puts "Time elapsed #{(end_time - beginning_time)} seconds"

it outputs "Time elapsed 94.805422 seconds" (or something to that effect; times vary between 93 and 101 seconds).
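
To narrow down where the time goes, individual features can be timed separately. The following is a sketch using Benchmark from the standard library. The helper name time_require is hypothetical, and "json" stands in for the feature being measured so that the snippet runs without the gem installed.

```ruby
require "benchmark"

# Hypothetical helper: measure how long a single require takes, in seconds.
# A feature that is already loaded returns almost immediately.
def time_require(feature)
  Benchmark.realtime { require feature }
end

# With the gem installed, substitute "unicode_utils" or a single feature such
# as "unicode_utils/upcase" (the path used in the first report above).
elapsed = time_require("json")
puts format("require took %.3f seconds", elapsed)
```

Run each measurement in a fresh Ruby process; otherwise features loaded by an earlier require skew the result.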
