diacritics.net's Introduction

💬 About

C# software developer for customers in various industries. Recent projects include mobile apps, backend services and web APIs.

Working Skills

C#, .NET, .NET MAUI, Xamarin.Forms, WPF, ASP.NET Core, SQL, Entity Framework Core, Software Architecture

Tools

Visual Studio, Visual Studio Code, MSSQL, Android Studio, Xcode, Notepad++

Social

LinkedIn, Stack Overflow

diacritics.net's People

Contributors

darthramone, jerry2007, julien-vandenbussche, ltduy, monoman, much4cho, ryanoneill1970, steevequadra, thomasgalliker, zolrath

diacritics.net's Issues

Add .NET Standard support

When installing the nuget in a .NET Core project I see warnings at compile time: "Package 'Diacritics 1.0.6' was restored using '.NETFramework,Version=v4.6.1' instead of the project target framework '.NETStandard,Version=v2.0'. This package may not be fully compatible with your project."

It would be nice if it could be compiled under .NET Standard.

German umlaut mapping not correct

I came across this project while searching for an umlaut replacement library. I like the approach it takes to the general problem. Unfortunately, I found that something is wrong with the German mapping.

The character ß is correctly replaced by ss, but the other umlauts are not handled correctly. Ä should be replaced by Ae, Ö by Oe and Ü by Ue, and the same applies to the lower-case letters. This library, however, replaces each umlaut with a single letter instead of two.

Umlaut   Replacement
Ä        Ae
Ü        Ue
Ö        Oe
ä        ae
ü        ue
ö        oe

https://github.com/thomasgalliker/Diacritics.NET/blob/develop/Diacritics/AccentMappings/GermanAccentsMapping.cs#L10.L12
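
The table above can be sketched as a character-to-string mapping. This is only an illustration of the requested behaviour, not the library's implementation: the class `GermanTransliterationSketch` and its `Transliterate` method are hypothetical names, and the sketch assumes the input uses precomposed characters (for example `ü` as a single code point).

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of Diacritics.NET.
public static class GermanTransliterationSketch
{
    // Each umlaut maps to a two-letter replacement, as listed in the table above.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'Ä', "Ae" }, { 'Ö', "Oe" }, { 'Ü', "Ue" },
        { 'ä', "ae" }, { 'ö', "oe" }, { 'ü', "ue" },
        { 'ß', "ss" },
    };

    public static string Transliterate(string input)
    {
        var result = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // Characters without a mapping are copied unchanged.
            if (Map.TryGetValue(c, out var replacement))
                result.Append(replacement);
            else
                result.Append(c);
        }
        return result.ToString();
    }
}
```

With this mapping, `GermanTransliterationSketch.Transliterate("Grüße")` returns `"Gruesse"`.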

FrenchAccentsMapping: œ should map to oe, not o

Hi,

In French, we have some special words like "œuf" (egg). For these, RemoveDiacritics should return "oeuf" and not "ouf".
I tracked the mapping down to the file FrenchAccentsMapping. On line 22, the entry { 'œ', "o" } should be replaced by { 'œ', "oe" }.

Would it be possible to make this change?
To make it simple, I submitted a pull request.
Regards
Steeve
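
To make the difference concrete, here is a small sketch that applies the current entry and the proposed entry to the same word. The names `LigatureMappingSketch`, `Apply`, `Current` and `Proposed` are hypothetical, and the upper-case entry `Œ` to `Oe` is an assumption added for symmetry; it is not part of the request above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparison helper; not part of Diacritics.NET.
public static class LigatureMappingSketch
{
    // The entry as described in the issue: one character in, one character out.
    public static readonly IReadOnlyDictionary<char, string> Current =
        new Dictionary<char, string> { { 'œ', "o" } };

    // The proposed entry; the upper-case variant is an assumption.
    public static readonly IReadOnlyDictionary<char, string> Proposed =
        new Dictionary<char, string> { { 'œ', "oe" }, { 'Œ', "Oe" } };

    // Replaces every mapped character and copies all others unchanged.
    public static string Apply(string input, IReadOnlyDictionary<char, string> map) =>
        string.Concat(input.Select(c => map.TryGetValue(c, out var r) ? r : c.ToString()));
}
```

`LigatureMappingSketch.Apply("œuf", LigatureMappingSketch.Current)` gives `"ouf"`, while the same call with `Proposed` gives `"oeuf"`.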

ß-> ss

Could you please change the code so that ß is translated to ss?

Add a LICENSE.MD file

The NuGet license info says Apache 2.0, while README.MD doesn't specify a concrete license (apart from a non-commercial clause, even though Apache 2.0 allows commercial usage).

A LICENSE.MD or LICENSE.TXT file with a proper license would make this project useful for more people.

ơ => o

Hi.

Could you please add this mapping for Vietnamese "ơ" letter?

Thanks.

Release Note Availability

Thank you for your work on this project! My team is working on a project that uses your library and is looking at updating from 2.0.19240.3 to 3.3.18. Do you have release notes available anywhere that we could review for possible breaking changes? I didn't see any release information on GitHub that I could read through.

What about œ (e.g. in "cœur")

Hi Thomas, while looking for a solution to normalize diacritics and other digrams, I came across your implementation. I like the way you separated every language into its own set of rules.

However, I don't feel comfortable using it, since it produces unexpected conversions. Say you feed it with "cœur" in French. You'd expect to get "coeur" as an output, but since you map "œ" to "o" you finally get "cour" instead.

Same thing for German words, where "Grüße" might be more appropriately mapped to "Gruesse" (i.e. map ü → ue and ß → ss).

Can you explain why you chose your approach of a one-to-one mapping?

License

What does commercial use mean?
"For commercial use please contact the author."
Would developing an information system count as commercial use?

Add overloads to RemoveDiacritics and IsDiacritics extension methods to pass an IDiacriticsMapper

If you want to use the extension methods of the library, you have to register a global default diacritics mapper.
This relies on global state, and it does not allow different mappings for different strings without switching the global mapper.

StaticDiacritics.SetDefaultMapper(() =>
    new DiacriticsMapper(
        new MyGermanAccentMapping(),
        new GermanAccentsMapping(),
        new ItalianAccentsMapping(),
        new ArabicAccentsMapping()
    )
);

"Thöni".RemoveDiacritics() // "Thoeni"

The current "pure" approach is to instantiate a DiacriticsMapper with accent mappings and use the methods from this instance.

var myMapper = new DiacriticsMapper(
    new MyGermanAccentMapping(),
    new GermanAccentsMapping(),
    new ItalianAccentsMapping(),
    new ArabicAccentsMapping());

myMapper.RemoveDiacritics("Thöni") // "Thoeni"

This is fine.
But it would be convenient to have an overload of the extension methods to which the mapper (or single accent mappings) could be passed:

"Thöni".RemoveDiacritics(myMapper) // "Thoeni"

or simply as a params array:

"Thöni".RemoveDiacritics(new MyGermanAccentMapping()) // "Thoeni"
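
A minimal sketch of what such an overload could look like follows. To keep it self-contained it does not reference the library: `ISketchMapper` is a hypothetical stand-in for the library's mapper interface, and `DictionaryMapper` and `SketchStringExtensions` are likewise invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the library's mapper interface.
public interface ISketchMapper
{
    string RemoveDiacritics(string source);
}

// Hypothetical mapper backed by a plain dictionary.
public sealed class DictionaryMapper : ISketchMapper
{
    private readonly IReadOnlyDictionary<char, string> map;

    public DictionaryMapper(IReadOnlyDictionary<char, string> map)
    {
        this.map = map ?? throw new ArgumentNullException(nameof(map));
    }

    public string RemoveDiacritics(string source)
    {
        var result = new StringBuilder(source.Length);
        foreach (char c in source)
        {
            // Unmapped characters are copied unchanged.
            result.Append(this.map.TryGetValue(c, out var replacement) ? replacement : c.ToString());
        }
        return result.ToString();
    }
}

public static class SketchStringExtensions
{
    // The mapper is passed per call, so no global default has to be registered.
    public static string RemoveDiacritics(this string source, ISketchMapper mapper)
    {
        if (mapper == null) throw new ArgumentNullException(nameof(mapper));
        return mapper.RemoveDiacritics(source);
    }
}
```

With a mapper built from `{ 'ö', "oe" }`, the call `"Thöni".RemoveDiacritics(mapper)` returns `"Thoeni"`. Because the mapper is an explicit argument, two calls in the same program can use different mappings without touching shared state.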

Lower-case Latin variants hide upper diacritics

When a character with a diacritic has a lower-case Latin equivalent, the diacritic is not removed correctly.
Take for example the Turkish word "İngiltere" (England). When RemoveDiacritics is invoked, the input is converted to lower case before IndexOfAny is called. At that point the input has become "ingiltere", so the İ is never found and replaced, and the original string is returned.
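
One way to avoid the problem is to skip the case conversion and look up the original characters, with both upper-case and lower-case keys present in the mapping. The sketch below illustrates that idea only; `CaseSensitiveLookupSketch` is a hypothetical name, the three Turkish entries are assumptions chosen for the example, and none of this is the library's actual code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of Diacritics.NET.
public static class CaseSensitiveLookupSketch
{
    // Upper-case and lower-case forms are separate keys, so no case folding is needed.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\u0130', "I" }, // İ (capital I with dot above)
        { '\u011E', "G" }, // Ğ
        { '\u011F', "g" }, // ğ
    };

    private static readonly char[] Keys = Map.Keys.ToArray();

    // The search runs on the unmodified input, so İ is still present when it is checked.
    public static bool HasDiacritics(string source) => source.IndexOfAny(Keys) >= 0;

    public static string RemoveDiacritics(string source)
    {
        if (!HasDiacritics(source)) return source;

        var result = new StringBuilder(source.Length);
        foreach (char c in source)
        {
            result.Append(Map.TryGetValue(c, out var replacement) ? replacement : c.ToString());
        }
        return result.ToString();
    }
}
```

With these entries, `CaseSensitiveLookupSketch.RemoveDiacritics("\u0130ngiltere")` returns `"Ingiltere"`.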

ñ => n

Please add a mapping for the Spanish letter ñ (n with tilde).
