elixir-cldr / cldr_collation
CLDR Collation
License: Other
Hello!
First of all, thank you for your work here. It's really helpful.
I'm opening this issue because I think I have found a bug while trying to init this library. The bug happens when you compile your app and then rename the folder where it is located. Example: I have my app in ~/projects/myapp and I rename it to ~/projects/mypersonalapp.
At this point the library stops working. I think the problem is here, and that it could be solved by modifying the init/0 function as follows:
def init do
  so_path = :code.priv_dir(:ex_cldr_collation) ++ '/ucol'
  num_scheds = :erlang.system_info(:schedulers)
  :ok = :erlang.load_nif(so_path, num_scheds)
end
instead of using the module attribute @so_path. Do you agree?
I've seen this approach in other libraries like AppSignal.
I found the problem while trying to deploy my application: we build the application in a /tmp folder and then move it to the /app folder.
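The difference between a module attribute and a runtime lookup, as discussed in this issue, can be illustrated with a minimal sketch. The module name PrivDirSketch is hypothetical, and :stdlib stands in for :ex_cldr_collation so that the snippet runs without the library installed; it does not load any NIF.

```elixir
defmodule PrivDirSketch do
  # Hypothetical module, for illustration only.
  # :stdlib stands in for :ex_cldr_collation so the sketch runs anywhere.

  # Module attributes are evaluated at compile time, so this absolute path
  # is stored in the compiled .beam file and never recomputed.
  @compile_time_path :code.priv_dir(:stdlib) ++ ~c"/ucol"

  def compile_time_path, do: @compile_time_path

  # Evaluated on every call, so it reflects where the application
  # lives when the code actually runs.
  def runtime_path, do: :code.priv_dir(:stdlib) ++ ~c"/ucol"
end
```

As long as nothing is moved, both functions return the same charlist. If the compiled code is relocated after the build (for example from a /tmp build folder to /app), only runtime_path/0 would be expected to follow the new location.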
Hello!
I'm dropping a line about my experience after upgrading Ubuntu to 22.04:
Under Ubuntu 22.04, the package candidate version of ICU is 70.
Without doing anything, Cldr.Collation fails, expecting libicu67.
I got everything up and running just by dpkg'ing https://packages.ubuntu.com/impish/amd64/libicu67/download
Many thanks for your work!
My main motivation for using ex_cldr_collation is to "properly" sort binaries with Polish letters:
iex> Enum.sort(["a", "b", "ą"])
["a", "b", "ą"]
iex> Enum.sort(["a", "b", "ą"], Cldr.Collation.Sensitive)
["a", "ą", "b"]
This is exactly what I need, but the ordering of capital letters is completely surprising:
iex> Enum.sort(["a", "b", "A", "B"])
["A", "B", "a", "b"]
iex> Enum.sort(["a", "b", "A", "B"], Cldr.Collation.Sensitive)
["a", "A", "b", "B"]
So not only does "a" come before "A", but "A" also comes before "b"! I guess the second part ("A" < "b") makes sense and I'm just too used to ASCII-table based sorting, but I was wondering if there is an easy way to sort so that "A" < "a" and "a" < "b"?
Either way, thanks for making this library! 👏
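For the plain ASCII case in the question above, the requested ordering ("A" < "a" and "a" < "b") can be sketched with the standard library alone. This is not a feature of ex_cldr_collation, the module name UpperFirstSketch is hypothetical, and because it compares raw bytes it does not place "ą" between "a" and "b" the way the collator does.

```elixir
defmodule UpperFirstSketch do
  # Hypothetical helper, plain Elixir only (no collation library involved).
  # Sorts by the downcased string first, then by the original string as a
  # tie-breaker. Uppercase ASCII letters have lower byte values than
  # lowercase ones, so "A" lands just before "a".
  def sort(strings) do
    Enum.sort_by(strings, fn s -> {String.downcase(s), s} end)
  end
end

UpperFirstSketch.sort(["a", "b", "A", "B"])
# => ["A", "a", "B", "b"]
```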
I'm trying to add ex_cldr_collation as a dependency to a standard Phoenix application. The problem arises when building and running the application in a Docker container. These are the steps to reproduce:
1. Run mix phx.new my_app (version 1.7.0-rc.3)
2. Add ex_cldr_collation in mix.exs (version 0.7.1)
3. Run mix phx.gen.release --docker to generate the default Dockerfile
4. Add libicu-dev to the builder image, as explained in the docs (pkgconf is not really required, I think)
5. docker build . -t my_app
6. docker run -it my_app
When the image starts the phoenix application, I get this error:
crasher:
initial call: application_master:init/4
pid: <0.2044.0>
registered_name: []
exception exit: {{shutdown,
{failed_to_start_child,kernel_safe_sup,
{on_load_function_failed,'Elixir.Cldr.Collation',
{{badmatch,
{error,
{load_failed,
"Failed to load NIF library: '/app/lib/ex_cldr_collation-0.7.1/priv/ucol.so: undefined symbol: ucol_strcollIter_67'"}}},
[{'Elixir.Cldr.Collation',init,0,
[{file,"lib/cldr_collation.ex"},{line,20}]},
{init,'-run_on_load_handlers/2-fun-0-',1,[]}]}}}},
{kernel,start,[normal,[]]}}
in function application_master:init/4 (application_master.erl, line 142)
...
Adding the Debian package libicu67 to the runner image doesn't solve the problem. This is unexpected though: the symbol ucol_strcollIter_67 is found in the shared object at /usr/lib/x86_64-linux-gnu/libicui18n.so (which comes with said Debian package).
I checked by running nm -D --demangle libicui18n.so in the Docker container (after installing binutils to get the nm utility), where the symbol shows up with a T marker, which should indicate it's present in the .so file (that's an assumption, because I don't really know what I'm doing here 🤷 ).
I don't know how this should work, but is there something missing to tell Elixir to also look into this other .so file when loading NIFs? I'd be surprised, because it seems to work on other distros without any problems.
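ICU appends its major version to the names of its exported symbols, which is why the missing symbol in the error above ends in _67. A small sketch of pulling that version out of a load_nif error message follows; the module IcuSymbolSketch and its function are hypothetical helpers, not part of ex_cldr_collation, and they only read the message without fixing the load failure.

```elixir
defmodule IcuSymbolSketch do
  # Hypothetical helper, for illustration only.
  # Extracts the numeric suffix of the undefined symbol, which for ICU
  # symbols is the major version the NIF was compiled against.
  def icu_major_version(message) when is_binary(message) do
    case Regex.run(~r/undefined symbol: \w+_(\d+)\b/, message) do
      [_full, version] -> {:ok, String.to_integer(version)}
      nil -> :error
    end
  end
end

message =
  "Failed to load NIF library: '/app/lib/ex_cldr_collation-0.7.1/priv/ucol.so: " <>
    "undefined symbol: ucol_strcollIter_67'"

IcuSymbolSketch.icu_major_version(message)
# => {:ok, 67}
```

Comparing that number with the ICU version installed in the runner image may help narrow down whether the builder and runner images ship different ICU releases.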
Hey!
Using OTP 23, I got a linking error while compiling my project (mix compile):
cc c_src/ucol.o -arch x86_64 -flat_namespace -undefined suppress -shared -L/usr/local/Cellar/erlang/23.2.2/lib/erlang/usr/lib -lerl_interface -lei -lpthread -lm -licucore -lstdc++ -o ./priv/ucol.so
ld: library not found for -lerl_interface
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [priv/ucol.so] Error 1
It looks like the erl_interface lib was removed (maybe renamed) starting with OTP 23. The lib folder of OTP 23 contains
liberts.a
libei_st.a
libei.a
liberts_r.a
while in OTP 22 the lib folder contains
liberl_interface.a
libei_st.a
libei.a
liberl_interface_st.a
liberts_r.a
liberts.a
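Based on the two directory listings above, one possible workaround is to pass -lerl_interface to the linker only when that library is actually present. The sketch below is an assumption about how such a check could look in Elixir; the module LinkFlagsSketch is hypothetical and is not how the project's Makefile is written.

```elixir
defmodule LinkFlagsSketch do
  # Hypothetical helper, for illustration only.
  # lib_dir is the directory passed to the linker with -L,
  # e.g. <erlang root>/usr/lib in the failing command above.
  def ei_libs(lib_dir) do
    if File.exists?(Path.join(lib_dir, "liberl_interface.a")) do
      # Layout seen on OTP 22: both libraries are available.
      ["-lerl_interface", "-lei"]
    else
      # Layout seen on OTP 23: only libei remains.
      ["-lei"]
    end
  end
end

# Inspect the running Erlang installation:
LinkFlagsSketch.ei_libs(Path.join(:code.root_dir(), "usr/lib"))
```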
icu_collator is fully TR #10 compliant and being in Rust can enable a fast, safe and developer friendly experience. I see the following advantages:
@foxbenjaminfox has kindly offered to work on the Rust bindings while I work on the overall library, Elixir API and documentation.
The basic Elixir API (not the NIF API) I envisage as:
- Cldr.Collation.sort(list_of_binaries, options) where options is a keyword list. This should only require one NIF round-trip but would require instantiating a new collator on each call. Options can include :locale (the default is Cldr.default_locale/0). Other options would be Elixir expressions of the icu_collator type CollatorOptions.
- Cldr.Collation.compare(string, string, options) where options is a keyword list.
- Cldr.Collation.collator(language_tag) which returns a collator (Rust-based resource) that can be reused and may (possibly) be stored in a Cldr.LanguageTag.t/0 struct. Options can include :locale (the default is Cldr.default_locale/0). Other options would be Elixir expressions of the icu_collator type CollatorOptions.
This API needs to be as simple as possible and be driven by the needs of icu_collator and Rustler. Hopefully the interface can directly use the binaries in the Rust code without copying, but the memory models are different and this may not be possible. Sorting by adjusting pointers would be more efficient than memory copies.
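The shape of the three proposed functions can be sketched in plain Elixir. Everything here is hypothetical: the module CollationApiSketch, the "en" default that stands in for Cldr.default_locale/0, the choice to pass the locale as an option to collator/1, and above all the comparison itself, which is a plain byte comparison where the real design would call into icu_collator through a NIF.

```elixir
defmodule CollationApiSketch do
  # Hypothetical sketch of the proposed API shape; no NIF is involved.
  @default_locale "en" # stands in for Cldr.default_locale/0

  # Builds a reusable "collator". In the proposal this would be a
  # Rust-based resource; here it is just a map holding a function.
  def collator(options \\ []) do
    locale = Keyword.get(options, :locale, @default_locale)
    %{locale: locale, compare: &placeholder_compare/2}
  end

  # Compares two strings, creating a new collator on each call.
  def compare(a, b, options \\ []) do
    %{compare: fun} = collator(options)
    fun.(a, b)
  end

  # Sorts a list with a single collator created for the whole call.
  def sort(list, options \\ []) do
    %{compare: fun} = collator(options)
    Enum.sort(list, fn a, b -> fun.(a, b) != :gt end)
  end

  # Placeholder: byte-wise comparison, NOT locale-aware collation.
  defp placeholder_compare(a, b) do
    cond do
      a < b -> :lt
      a > b -> :gt
      true -> :eq
    end
  end
end
```

Returning :lt, :eq or :gt from compare/3 mirrors the convention Elixir's Enum.sort/2 expects from a module-based sorter, which is one reason that return shape might be convenient.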
At minimum the NIF API should expect to accept:
Cldr.LanguageTag's :canonical_locale_name field)
It should be able to return:
CLDR collations are configured per-locale (typically per-language in reality) in a set of configuration files. These files need to be available to icu-collator through its data provider interface.
Including the data files in ex_cldr_collation seems reasonable. They are not large files since they represent only tailorings of the standard DUCET collation.
Does icu-collator depend on other CLDR data than these collation files?
Does icu-collator support loading these files? And if so, how is that configured?
I'll see what I can learn from reading more of the Rust docs, but I'm in deep water when it comes to that, so any suggestions you have would be warmly welcomed!