intl's Introduction

intl

A PHP 8.0+ internationalization library, powered by CLDR data.

Features:

  • NumberFormatter and CurrencyFormatter, inspired by intl.
  • Currencies
  • Languages

Looking for a list of countries and subdivisions? Check out commerceguys/addressing.

Why not use the intl extension?

The intl extension isn't present by default on PHP installs, so requiring it can hurt software adoption. Behind the scenes the extension relies on libicu, which includes the CLDR dataset, but depending on the OS/distribution used, the bundled data could be several major CLDR releases behind.

Since the CLDR dataset is freely available in JSON form, it is possible to reimplement the intl functionality in pure PHP code while ensuring that the dataset is always fresh.

Having access to the CLDR dataset also makes it possible to offer additional APIs, such as listing all currencies.

More backstory can be found in this blog post.

Formatting numbers

The library formats numbers (decimals, percents, currency amounts) using locale-specific rules.

Two formatters are provided for this purpose: NumberFormatter and CurrencyFormatter.

use CommerceGuys\Intl\Currency\CurrencyRepository;
use CommerceGuys\Intl\NumberFormat\NumberFormatRepository;
use CommerceGuys\Intl\Formatter\NumberFormatter;
use CommerceGuys\Intl\Formatter\CurrencyFormatter;

$numberFormatRepository = new NumberFormatRepository;
// Options can be provided to the constructor or to the
// individual methods. The locale defaults to 'en' when missing.
$numberFormatter = new NumberFormatter($numberFormatRepository);
echo $numberFormatter->format('1234.99'); // 1,234.99
echo $numberFormatter->format('0.75', ['style' => 'percent']); // 75%

$currencyRepository = new CurrencyRepository;
$currencyFormatter = new CurrencyFormatter($numberFormatRepository, $currencyRepository);
echo $currencyFormatter->format('2.99', 'USD'); // $2.99
// The accounting style shows negative numbers differently and is used
// primarily for amounts shown on invoices.
echo $currencyFormatter->format('-2.99', 'USD', ['style' => 'accounting']); // ($2.99)

// Arabic, Arabic extended, Bengali, Devanagari digits are supported as expected.
$currencyFormatter = new CurrencyFormatter($numberFormatRepository, $currencyRepository, ['locale' => 'ar']);
echo $currencyFormatter->format('1230.99', 'USD'); // US$ ١٬٢٣٠٫٩٩

// Parse formatted values into numeric values.
echo $currencyFormatter->parse('US$ ١٬٢٣٠٫٩٩', 'USD'); // 1230.99

Currencies

use CommerceGuys\Intl\Currency\CurrencyRepository;

// Reads the currency definitions from resources/currency.
$currencyRepository = new CurrencyRepository;

// Get the USD currency using the default locale (en).
$currency = $currencyRepository->get('USD');
echo $currency->getCurrencyCode(); // USD
echo $currency->getNumericCode(); // 840
echo $currency->getFractionDigits(); // 2
echo $currency->getName(); // US Dollar
echo $currency->getSymbol(); // $
echo $currency->getLocale(); // en

// Get the USD currency using the fr-FR locale.
$currency = $currencyRepository->get('USD', 'fr-FR');
echo $currency->getName(); // dollar des États-Unis
echo $currency->getSymbol(); // $US
echo $currency->getLocale(); // fr-FR

// Get all currencies, keyed by currency code.
$allCurrencies = $currencyRepository->getAll();

Languages

use CommerceGuys\Intl\Language\LanguageRepository;

// Reads the language definitions from resources/language.
$languageRepository = new LanguageRepository;

// Get the German language using the default locale (en).
$language = $languageRepository->get('de');
echo $language->getLanguageCode(); // de
echo $language->getName(); // German

// Get the German language using the fr-FR locale.
$language = $languageRepository->get('de', 'fr-FR');
echo $language->getName(); // allemand

// Get all languages, keyed by language code.
$allLanguages = $languageRepository->getAll();

Related projects

Laravel integration

intl's People

Contributors

arubacao, bojanz, borisson, boywijnmaalen, fbonzon, greatislander, jbekker, joshbrw, jsacksick, limenet, marcortola, mglaman, propaganistas, qzmenko, strykaizer, vishalghyv

intl's Issues

Implement rounding in the formatters

Every single number formatter out there (intl, number_format, Java's DecimalFormat, the Ruby on Rails helper) also rounds. I've always found it bizarre (showing a value should not modify it), but silently truncating (like we do now) isn't ideal either.

The generally expected rounding modes are: ceiling, floor, up, down, half up, half down, half even (default in intl/icu), half odd. We can support the last four with round(). The first two will be tricky because ceil() and floor() have floating point bugs (which round() doesn't). Rounding down is truncating. Rounding up would need special code.

Maybe we can get away with supporting only the round() ones?
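As a rough illustration of the "round() only" option, here is a minimal sketch that maps four mode names onto PHP's built-in round() constants. The function name roundNumber() and the mode names are hypothetical and not part of the library. The sketch works on floats for brevity, while the formatters themselves deal with numeric strings.

```php
<?php

// Hypothetical mapping from rounding mode names to PHP's round() constants.
const ROUNDING_MODES = [
    'half_up' => PHP_ROUND_HALF_UP,
    'half_down' => PHP_ROUND_HALF_DOWN,
    'half_even' => PHP_ROUND_HALF_EVEN,
    'half_odd' => PHP_ROUND_HALF_ODD,
];

// Rounds $number to $precision decimals using one of the four "half" modes.
function roundNumber(float $number, int $precision, string $mode = 'half_even'): float
{
    if (!isset(ROUNDING_MODES[$mode])) {
        throw new InvalidArgumentException("Unsupported rounding mode: $mode");
    }

    return round($number, $precision, ROUNDING_MODES[$mode]);
}

echo roundNumber(2.5, 0, 'half_up'), "\n";   // 3
echo roundNumber(2.5, 0, 'half_down'), "\n"; // 2
echo roundNumber(2.5, 0, 'half_even'), "\n"; // 2
echo roundNumber(2.5, 0, 'half_odd'), "\n";  // 3
```

Ceiling, floor, and "up" are deliberately left out, matching the concern above that they would need special code.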

Revamp fallback locale handling

  1. $fallbackLocale should default to 'en', not null, to ensure that by default usable data is always returned.

  2. $fallbackLocale is not useful on the individual repository methods, remove it. It can still be set via $repository->setFallbackLocale().

  3. If a fallback locale is complex ("en-US", for example) then it too should fall back ("en-US", "en").
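The fallback chain described in the list above could be sketched as follows. The function name localeCandidates() is hypothetical, and the sketch assumes that dropping the last hyphen-separated part is an acceptable way to find a parent locale (CLDR's real parent rules have exceptions).

```php
<?php

// Builds the ordered list of locales to try: the requested locale and its
// parents first, then the fallback locale and its parents.
function localeCandidates(string $locale, string $fallbackLocale = 'en'): array
{
    $candidates = [];
    foreach ([$locale, $fallbackLocale] as $current) {
        $parts = explode('-', $current);
        while ($parts) {
            $candidates[] = implode('-', $parts);
            array_pop($parts);
        }
    }

    // Remove duplicates (e.g. when both chains end in "en") and reindex.
    return array_values(array_unique($candidates));
}

print_r(localeCandidates('fr-CA', 'en-US')); // fr-CA, fr, en-US, en
```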

[Currency usage] Move Country::getCurrencyCode to a different class

It was convenient to add the currency code to the country object (and definition files), but this data is not actually needed as often as the other country data, and isn't completely tied to what a country is.

So, we can get away with moving this mapping to its own file, and provide a different class (named what?) for consuming it. This would reduce the size of the country definitions and the cost around consuming country data.

Initial idea: CurrencyUsageRepository with getCurrent($countryCode) and getAll($countryCode). Returns a CurrencyUsage class with a currency code and start/end dates.
However, we don't ship with deprecated currencies, so getAll() would return currency codes not in the library. So we either start shipping deprecated currencies, or don't report historical usage.

Move the country list to commerceguys/addressing

The country list now lives in commerceguys/addressing, right next to subdivisions.

That means that it should be removed from commerceguys/intl, and the README modified to point to addressing.

The currency symbol/code is not always properly spaced

We currently rely on the pattern to give us the complete layout of the final number.
But CLDR has additional rules that say when a space should be inserted around a currency symbol/code, which look like this:

"currencySpacing": {
            "beforeCurrency": {
              "currencyMatch": "[:^S:]",
              "surroundingMatch": "[:digit:]",
              "insertBetween": " "
            },
            "afterCurrency": {
              "currencyMatch": "[:^S:]",
              "surroundingMatch": "[:digit:]",
              "insertBetween": " "
            }
          },

Yes, that's quite confusing, which is why I missed it previously.

Looks like this is a good opportunity to check how our formatting logic compares with the ICU4J one.

Relevant links:
angular/angular#20708
andyearnshaw/Intl.js#221
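To make the spacing rules above more concrete, here is a minimal sketch based on one reading of them: "beforeCurrency" looks at the first character of the currency and the character just before the ¤ placeholder, while "afterCurrency" looks at the last character of the currency and the character just after the placeholder. The function name substituteCurrency() is hypothetical, the regular expressions are an approximation of [:^S:] and [:digit:], and the no-break space default for insertBetween is an assumption.

```php
<?php

// Replaces the ¤ placeholder in an already formatted number, inserting
// $insertBetween when a non-symbol currency character touches a digit.
function substituteCurrency(string $formatted, string $currency, string $insertBetween = "\u{00A0}"): string
{
    $parts = explode('¤', $formatted, 2);
    if (count($parts) < 2) {
        // No placeholder, nothing to substitute.
        return $formatted;
    }
    [$before, $after] = $parts;

    // beforeCurrency: a digit precedes ¤ and the currency starts with a non-symbol.
    if (preg_match('/\p{Nd}$/u', $before) && preg_match('/^\P{S}/u', $currency)) {
        $currency = $insertBetween . $currency;
    }
    // afterCurrency: a digit follows ¤ and the currency ends with a non-symbol.
    if (preg_match('/^\p{Nd}/u', $after) && preg_match('/\P{S}$/u', $currency)) {
        $currency .= $insertBetween;
    }

    return $before . $currency . $after;
}

echo substituteCurrency('¤1,234.99', '$'), "\n";   // "$" is a symbol, so no space is added
echo substituteCurrency('¤1,234.99', 'USD'), "\n"; // a no-break space follows "USD"
echo substituteCurrency('1,234.99¤', 'US$'), "\n"; // a no-break space precedes "US$"
```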

Add a getList() method to currency/country/language repositories

Showing lists (id => name pairs) of this data is a major use case, especially for countries.
Currently the way to do it is to call getAll(), then construct the list from the returned objects.
But the objects are not actually needed, so we construct them for no reason.

By introducing a getList() method we can make this use case less verbose, and save a few milliseconds by not instantiating hundreds of objects.
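A minimal sketch of the idea, assuming the loaded definitions are arrays keyed by code with a 'name' entry. The function buildList() and the sample data are hypothetical stand-ins for what a getList() method might do internally.

```php
<?php

// Turns raw definitions into code => name pairs without creating objects.
function buildList(array $definitions): array
{
    $list = [];
    foreach ($definitions as $code => $definition) {
        $list[$code] = $definition['name'];
    }

    return $list;
}

// Sample data shaped like a small language definition file (assumed shape).
$definitions = [
    'de' => ['name' => 'German'],
    'fr' => ['name' => 'French'],
];

print_r(buildList($definitions)); // de => German, fr => French
```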

extension may be installed using the bundled version as of PHP 5.3.0

The intl extension isn't present by default on PHP installs, requiring it can hurt software adoption.

I believe this sentence is incorrect, if I understand the PHP website correctly: http://php.net/manual/en/intl.installation.php

This extension may be installed using the bundled version as of PHP 5.3.0, or as a PECL extension as of PHP 5.2.0.

My assumption is that they specifically mean that 5.2 may not have intl by default. In my PHP versions it is installed by default.

Improve generation scripts

  1. The scripts should be runnable from outside scripts/, by having paths use __DIR__
  2. The scripts should be outside the destination directories, for easier data copying
  3. The scripts should write to files, instead of stdout, for easier copying.
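The first and third points could look roughly like this. The helper writeJson(), the directory layout, and the output location are all hypothetical. The sketch writes to the system temp directory only so that it can run anywhere.

```php
<?php

// Writes $data as JSON to "$directory/$name.json" and returns the path.
function writeJson(string $directory, string $name, array $data): string
{
    if (!is_dir($directory)) {
        mkdir($directory, 0777, true);
    }
    $path = $directory . '/' . $name . '.json';
    file_put_contents($path, json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE));

    return $path;
}

// Paths built from __DIR__ resolve relative to this file, regardless of
// the directory the script is started from (assumed layout).
$autoloadPath = __DIR__ . '/../../vendor/autoload.php';

// Hypothetical destination; a real script would target the resources directory.
$destination = sys_get_temp_dir() . '/intl-sketch-output';
$path = writeJson($destination, 'en', ['USD' => ['name' => 'US Dollar']]);
echo "Wrote $path\n";
```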

Rework the data model, start using value objects.

Goal:
Replace the Currency/Country/Language/NumberFormat objects with value objects (no setters).
Remove each *EntityInterface.

Reason:
Applications don't need us to dictate how their entities are going to look. And in many cases they might not have entities at all. Those that do can import the data from the library (using our repositories), provide their own entities, then provide their own repositories that load the data from the db and pass it to value objects. Basically, value objects are the envelope in which we put our data. This reduces friction by reducing the number of assumptions a library puts on the parent system.

The addressing library did this a year ago, with great results.
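For illustration, a value object in this style might look like the sketch below: all data comes in through the constructor and there are no setters. The class name LanguageValue and the definition keys are hypothetical. Getters are used instead of readonly properties so that the sketch also runs on PHP 8.0.

```php
<?php

// An immutable envelope for language data: constructor in, getters out.
final class LanguageValue
{
    private string $languageCode;
    private string $name;
    private string $locale;

    public function __construct(array $definition)
    {
        foreach (['language_code', 'name', 'locale'] as $property) {
            if (empty($definition[$property])) {
                throw new InvalidArgumentException("Missing required property: $property");
            }
        }
        $this->languageCode = $definition['language_code'];
        $this->name = $definition['name'];
        $this->locale = $definition['locale'];
    }

    public function getLanguageCode(): string
    {
        return $this->languageCode;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getLocale(): string
    {
        return $this->locale;
    }
}

$language = new LanguageValue(['language_code' => 'de', 'name' => 'German', 'locale' => 'en']);
echo $language->getName(), "\n"; // German
```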

vendor/autoload.php missing

vendor/autoload.php missing with no details on how to obtain it.

intl-master/scripts/language/generate.php

$ php generate.php

Warning: require(../../vendor/autoload.php): failed to open stream: No such file or directory

The number format for "bg" is broken

It has no "#", so the end result is always "0.00".
I checked the pattern that intl's NumberFormatter uses for "bg", and oddly enough, it does have a '#'.

Stop using file_exists() to resolve available locales

The LocaleResolverTrait currently uses file_exists() to see which locales have translations available.

This is bad because checking for non-existent files bypasses the stat cache, so the cost is incurred again every time. Furthermore, distributed filesystems (such as GlusterFS) make this call really slow (up to 15ms).

Solution:
Stop using file_exists(), make each repository responsible for knowing which locales are available.
Get this list from a separate json file.
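A sketch of the lookup once the list of available locales is known up front. The function resolveLocale() is hypothetical, and the list is hard-coded here where a repository would presumably load it once from the JSON file mentioned above.

```php
<?php

// Returns the first candidate that has data available, or null if none does.
// No filesystem access is needed: the check is a plain in-memory lookup.
function resolveLocale(array $candidates, array $availableLocales): ?string
{
    foreach ($candidates as $candidate) {
        if (in_array($candidate, $availableLocales, true)) {
            return $candidate;
        }
    }

    return null;
}

// Assumed sample of available locales.
$availableLocales = ['de', 'en', 'fr'];

echo resolveLocale(['fr-CA', 'fr', 'en'], $availableLocales), "\n"; // fr
```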

Make the formatters stateless

Right now the formatters have state: you can modify their options, which affects future callers due to the singleton nature of dependency containers. This requires a factory to always be used, which we don't even provide.

Let's move to passing options to format(), making the formatters stateless.

We did the same change in commerceguys/addressing.
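A minimal sketch of what "stateless" could mean here: defaults are fixed at construction, and per-call options are merged inside format() without being stored. The class StatelessFormatterSketch is hypothetical, and its output is a placeholder rather than real formatting.

```php
<?php

final class StatelessFormatterSketch
{
    private array $defaultOptions;

    public function __construct(array $defaultOptions = [])
    {
        // Defaults are set once and never modified afterwards.
        $this->defaultOptions = $defaultOptions + ['locale' => 'en', 'style' => 'decimal'];
    }

    public function format(string $number, array $options = []): string
    {
        // Per-call options win; missing keys are filled in from the defaults.
        $options += $this->defaultOptions;

        // Placeholder output that only shows which options were applied.
        return sprintf('%s [%s/%s]', $number, $options['locale'], $options['style']);
    }
}

$formatter = new StatelessFormatterSketch();
echo $formatter->format('0.75', ['style' => 'percent']), "\n"; // 0.75 [en/percent]
echo $formatter->format('0.75'), "\n";                         // 0.75 [en/decimal]
```

The second call is unaffected by the first, which is the property a shared container service needs.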

Walloon should not be filtered out

  • Walloon (official iso 'wa' code)
  • Taiwanese (or Taiyu)
  • Montenegrin (official iso 'cnr')

We make some translations in and to these languages. Could you please add them so we can add them to our website?
Or how can we add them ourselves?

Thank you

Percent formatting/parsing is broken

The README has an example of 0.75 becoming 75%, but there's no multiplication logic in the formatter, and the test just confirms that 50 becomes 50%. There's also no reverse logic for parsing numbers, nor a test case.
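The missing multiplication step could be sketched as a decimal point shift on the numeric string, which avoids float rounding. The function shiftDecimalRight() is hypothetical. It assumes a plain decimal string such as '0.75' and targets PHP 8.0+.

```php
<?php

// Multiplies a decimal string by 10^$places by moving the decimal point.
function shiftDecimalRight(string $number, int $places): string
{
    $negative = str_starts_with($number, '-');
    $digits = ltrim($number, '-');
    [$integer, $fraction] = array_pad(explode('.', $digits, 2), 2, '');

    // Pad the fraction so there are enough digits to move across.
    $fraction = str_pad($fraction, $places, '0');
    $integer .= substr($fraction, 0, $places);
    $fraction = substr($fraction, $places);

    // Drop leading zeros, but keep a single zero for values below one.
    $integer = ltrim($integer, '0');
    if ($integer === '') {
        $integer = '0';
    }
    $result = $fraction === '' ? $integer : $integer . '.' . $fraction;

    return ($negative && $result !== '0' ? '-' : '') . $result;
}

echo shiftDecimalRight('0.75', 2), '%', "\n";  // 75%
echo shiftDecimalRight('0.755', 2), '%', "\n"; // 75.5%
```

Parsing would need the reverse shift, which is not shown here.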

Maximum and minimum fraction digits are always set in defaultOptions

This line of code

if (!isset($options['minimum_fraction_digits'])) {
and also
if (!isset($options['maximum_fraction_digits'])) {
makes no sense because those array items are already set in defaultOptions, so if your options array doesn't contain minimum_fraction_digits and/or maximum_fraction_digits, they will never be taken from the Currency via getFractionDigits().

This causes problems like this:

  1. Currency is EUR
  2. Currency fractionDigits is 2 (set in yml file, talking here about Drupal 8 implementation).
  3. In the price formatter no minimum or maximum fraction digits are set, so the format method in CurrencyFormatter uses the defaultOptions instead. Their values are null, but isset() still returns true on lines 105 and 108.
  4. A formatted price with the number 5.123123 is shown as 5.123123 instead of 5.12.

Proposed solution:

  • replace !isset($options['minimum_fraction_digits']) with empty($options['minimum_fraction_digits'])
  • replace !isset($options['maximum_fraction_digits']) with empty($options['maximum_fraction_digits'])

This way a null value will force the method to take the settings from the Currency itself.

Pull request link: #77
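When weighing this proposal, it may help to check how the three candidate checks treat null and 0 in PHP. In particular, isset() returns false for a key whose value is null, and empty() also treats 0 as empty, which matters for currencies with zero fraction digits. The sketch below shows this, plus a hypothetical helper resolveFractionDigits() that uses the null coalescing operator instead.

```php
<?php

$options = ['minimum_fraction_digits' => null, 'maximum_fraction_digits' => 0];

var_dump(isset($options['minimum_fraction_digits']));            // bool(false)
var_dump(array_key_exists('minimum_fraction_digits', $options)); // bool(true)
var_dump(empty($options['maximum_fraction_digits']));            // bool(true)

// Uses the option when it is set and not null, otherwise the currency default.
// An explicit 0 is respected, unlike with empty().
function resolveFractionDigits(array $options, string $key, int $currencyFractionDigits): int
{
    return $options[$key] ?? $currencyFractionDigits;
}

echo resolveFractionDigits($options, 'minimum_fraction_digits', 2), "\n"; // 2
echo resolveFractionDigits($options, 'maximum_fraction_digits', 2), "\n"; // 0
```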

Revamp repository locale handling

#50 made some progress, but we have more problems:

  1. The RepositoryLocaleTrait methods are not on any of the repository interfaces
  2. Having setters for default locale / fallback locale introduces unnecessary state.
  3. Having a trait in the first place requires us to copy it over to addressing once we move countries there.
  4. We want to copy the repository locale approach to the NumberFormatter once we refactor it, but having extra methods there adds too much noise.
  5. NumberFormatRepository has a default locale that isn't even used.

Solution:

  • Inject $defaultLocale and $fallbackLocale via constructors (only $fallbackLocale in the case of NumberFormatRepository). No getters/setters. This emphasizes that these are application settings that come from outside.

Date formatting plans?

First of all, thanks for all the hard work put into this, really useful for myself and my team.
This is not really an issue report but more of a question.

Do you guys have any estimation/plans regarding introducing date formats support?
Thanks

[de-AT, fr-CH] Missing currencyGroup / currencyDecimal symbols in NumberFormat

Somewhere along the way CLDR added currencyGroup and currencyDecimal symbols, which are to be used instead of the group and decimal separators when formatting currency amounts.

Right now only two locales use it.

de-AT:

          "group": " ",
          "currencyGroup": ".",

and fr-CH:

          "decimal": ",",
          "currencyDecimal": ".",

We need to add these two properties to NumberFormat, modify the scripts to parse them from CLDR, and modify the formatter to use them when needed.
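The formatter side of this change could be sketched as a small lookup that prefers the currency-specific symbols when they exist. The function separatorsFor() and the array keys are hypothetical names, not the library's actual NumberFormat properties.

```php
<?php

// Picks the separators to use, falling back to the regular ones when the
// locale defines no currency-specific symbols.
function separatorsFor(array $numberFormat, bool $isCurrency): array
{
    $decimal = $numberFormat['decimal_separator'];
    $grouping = $numberFormat['grouping_separator'];
    if ($isCurrency) {
        $decimal = $numberFormat['currency_decimal_separator'] ?? $decimal;
        $grouping = $numberFormat['currency_grouping_separator'] ?? $grouping;
    }

    return ['decimal' => $decimal, 'grouping' => $grouping];
}

// Assumed fr-CH style data: comma for decimals, period for currency amounts.
$frCh = [
    'decimal_separator' => ',',
    'grouping_separator' => ' ',
    'currency_decimal_separator' => '.',
];

print_r(separatorsFor($frCh, false)); // decimal => ","
print_r(separatorsFor($frCh, true));  // decimal => "."
```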

Mark BCMath extension as a dependency

Fatal error: Call to undefined function CommerceGuys\Intl\Formatter\bccomp()

I was able to trigger this with the following Dockerfile snippet, used to run my fpm container:

# Install modules
RUN apt-get update && apt-get install -y \
        libfreetype6-dev \
        libjpeg62-turbo-dev \
        libmcrypt-dev \
        libpng12-dev \
        libxml2-dev \
    && docker-php-ext-install mcrypt pdo_mysql mysql mysqli mbstring opcache soap \
    && docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/ \
    && docker-php-ext-install gd
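Until the dependency is declared (Composer supports platform requirements such as ext-bcmath), an application could guard against the fatal error with an explicit check. The helper missingExtensions() below is a hypothetical sketch, not something the library provides.

```php
<?php

// Returns the names of required extensions that are not loaded.
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        fn (string $name): bool => !extension_loaded($name)
    ));
}

$missing = missingExtensions(['bcmath']);
if ($missing) {
    // Report a clear message instead of failing later on an undefined bccomp().
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . "\n";
}
```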

Move number format definitions into the repository

Same thing we did with address formats. There's not enough data to justify the performed I/O. With a few tricks, we end up with about 1200 lines, which is even less than the amount in AddressFormatRepository (1500).

Investigate quirks when rendering prices in Farsi

Our number format for Farsi ("fa") puts the currency symbol before the amount.

However, PHP's native NumberFormatter varies the symbol location based on the currency:

$formatter = new \NumberFormatter('fa', \NumberFormatter::CURRENCY);
echo $formatter->formatCurrency('645000', 'USD'); // Symbol before the amount
echo $formatter->formatCurrency('645000', 'IRR'); // Symbol after the amount

This could be related to #65, where our logic is too simplistic.

Simplify & shrink the country / language definitions

These definitions always have only a country code and a name, but we use an associative array instead of code => name pairs (which is how CLDR stores it). This was done just to be in sync with currencies, but it's not needed, and if we fix that, we can reduce the dataset size by 30%.

Split NumberFormatter into NumberFormatter and CurrencyFormatter

The current formatter does two distinct jobs: formats both numbers and currency amounts.
This allows it to be configured in weird ways. For example, you can instantiate it using the decimal/percent style, then use it to format currencies, and vice versa.
Recently I've been cleaning it up to behave more like a regular service, injecting the number format repository, etc. I want to do the same for currency formatting, to inject a currency repository and accept a currency code, instead of requiring callers to care about currency objects. But it's odd to have to inject the currency repository even when you just want to format regular numbers.

All of this makes it obvious that the code would be simpler when split, so let's go for it.

Example in readme makes no sense.

One of the examples in the readme makes no sense.

How can 1234.99 suddenly equal 123,456.99?

echo $numberFormatter->format('1234.99'); // 123,456.99

Reduce the number of available locales

I've done an analysis of the available locales, and concluded that easily half of them are non-essential and unlikely to be used. CLDR includes 75 African locales which represent non-official languages. It also includes minor European locales such as Walser and Western Frisian.

Each locale has a cost, in maintainer time when updating the dataset and verifying the results, and in memory costs (since NumberFormatRepository inlines all data, and each repository has its list of available locales). Plus the general weight of the library and number of files it contains. Furthermore, many of the smaller locales have partially translated data, with some missing translations completely.

So, I suggest imposing this ignore list (first 4 lines are already present):

$ignoredLocales = [
    // Esperanto, Interlingua, Volapuk are made up languages.
    'eo', 'ia', 'vo',
    // Church Slavic, Manx, Prussian are historical languages.
    'cu', 'gv', 'prg',
    // Africa secondary languages.
    'agq', 'ak', 'am', 'asa', 'bas', 'bem', 'bez', 'bm', 'cgg', 'dav',
    'dje', 'dua', 'dyo', 'ebu', 'ee', 'ewo', 'ff', 'ff-Latn', 'guz',
    'ha', 'ig', 'jgo', 'jmc', 'kab', 'kam', 'kea', 'kde', 'ki', 'kkj',
    'kln', 'khq', 'ksb', 'ksf', 'lag', 'luo', 'luy', 'lu', 'lg', 'ln',
    'mas', 'mer', 'mua', 'mgo', 'mgh', 'mfe', 'naq', 'nd', 'nmg', 'nnh',
    'nus', 'nyn', 'om', 'rof', 'rwk', 'saq', 'seh', 'ses', 'sbp', 'sg',
    'shi', 'sn', 'teo', 'ti', 'tzm', 'twq', 'vai', 'vai-Latn', 'vun',
    'wo', 'xog', 'xh', 'zgh', 'yav', 'yo', 'zu',
    // Europe secondary languages.
    'br', 'dsb', 'fo', 'fur', 'fy', 'hsb', 'ksh', 'kw', 'nds', 'or', 'rm',
    'se', 'smn', 'wae',
    // Other infrequently used locales.
    'ceb', 'ccp', 'chr', 'ckb', 'haw', 'ii', 'jv', 'kl', 'kn', 'lkt',
    'lrc', 'mi', 'mzn', 'os', 'qu', 'row', 'sah', 'tt', 'ug', 'yi',
];

This creates a danger of accidentally dropping a locale that people rely on, but it would be easy enough to revert such a change.

How to set an empty grouping_separator

Hi guys!
How can I set an empty grouping_separator?

I override the number format and set an empty grouping_separator in Drupal 8:

/**
   * Correct price format
   *
   * @param \Drupal\commerce_price\Event\NumberFormatDefinitionEvent $event
   */
  public function priceFormat(NumberFormatDefinitionEvent $event) {
    $def = $event->getDefinition();
    $def['grouping_separator'] = '';
    $def['currency_pattern'] = '#,##0.00 ¤';
    $event->setDefinition($def);
  }

After changing grouping_separator, strtr() returns FALSE in the parseNumber trait.

How can I set an empty grouping separator?

Thanks
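The FALSE result is consistent with how strtr() handles an empty search string in its replacement array: PHP versions before 8.0 return false in that case, while newer versions ignore the entry. A possible workaround, sketched with a hypothetical stripSeparators() helper rather than the library's actual parsing code, is to drop empty keys before calling strtr().

```php
<?php

// Removes grouping separators and normalizes the decimal separator to ".".
function stripSeparators(string $number, string $groupingSeparator, string $decimalSeparator): string
{
    $replacements = [
        $groupingSeparator => '',
        $decimalSeparator => '.',
    ];
    // Drop empty search strings so strtr() never sees an empty key.
    $replacements = array_filter(
        $replacements,
        fn ($search): bool => $search !== '',
        ARRAY_FILTER_USE_KEY
    );

    return strtr($number, $replacements);
}

echo stripSeparators('1,234.99', ',', '.'), "\n"; // 1234.99
echo stripSeparators('1234,99', '', ','), "\n";   // 1234.99
```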

Add NumberFormatter::parse()

The NumberFormatter has format() and formatCurrency(), but it only has parseCurrency(); the parse() needed to complete the pair is missing.

NumberFormatter no longer accepts floats

I've noticed that NumberFormatter underwent some changes from 0.* to 1.*. More specifically, the percent style got some breaking changes. I'm just curious whether the following ones are intended.

// version 0.*
$formatter->format(75, ['style' => 'percent']); // 75%

// version 1.*
$formatter->format(75, ['style' => 'percent']); // 7500%
// version 0.*
$formatter->format(0.75, ['style' => 'percent']); // 0.75%

// version 1.*
$formatter->format(0.75, ['style' => 'percent']); // Exception (floats cannot be passed)
$formatter->format('0.75', ['style' => 'percent']); // 75%
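The float rejection shown above presumably guards against binary floating point imprecision. The sketch below illustrates that concern and a hypothetical toNumericString() check in the same spirit. It is not the library's actual validation code.

```php
<?php

// Floats cannot represent most decimal fractions exactly.
var_dump(0.1 + 0.2 === 0.3); // bool(false)

// Accepts integers and numeric strings, rejects floats and anything else.
function toNumericString(mixed $number): string
{
    if (is_float($number)) {
        throw new InvalidArgumentException('Floats are not accepted, pass the number as a string.');
    }
    if (!is_numeric($number)) {
        throw new InvalidArgumentException('The provided value is not numeric.');
    }

    return (string) $number;
}

echo toNumericString('0.75'), "\n"; // 0.75
echo toNumericString(75), "\n";     // 75
```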

Move base currency/country data into the repositories

CurrencyRepository and CountryRepository currently have to load two files: base.json and $locale.json. On systems that care about performance, both need to be cached.

However, base definitions are small enough that they could be inlined into the repositories (just like we did with AddressFormatRepository in addressing), saving us a cache get.

So, let's do that.

Currency codes for CN, VG and ZA are incorrect

Thanks for your great library!

The currency codes in base.json for CN, VG and ZA are incorrect due to the array ordering of the CLDR data. (See currencyData.json#L979, #L3221, and #L3348.) CN has CNX instead of CNY, VG has GBP instead of USD and ZA has ZAL instead of ZAR. These are all historic or non-tender currencies.

In most countries in the list, the current currency is the last in the array, and in generate.php#L78 you are using key(end($currencyData['region'][$countryCode])) to retrieve it. These three countries are differently sorted for some reason, and the current currency is not the last in the array.

Rather than rely on the sorting, which could be arbitrary (although it only has these 3 exceptions at the moment), could you use the _from, _to and _tender values to filter the currency array before returning the last key?

Here is a messy example that I made quickly to identify what was wrong with these three countries:

// Determine the current currency for this country.
if (isset($currencyData['region'][$countryCode])) {
    $now = new DateTime();

    // Keep only entries that are legal tender and whose date range includes today.
    $currencyCandidates = array_filter($currencyData['region'][$countryCode], function ($currency) use ($now) {
        // Each entry is a single-element array: currency code => attributes.
        $currency = end($currency);

        if (isset($currency['_tender']) && $currency['_tender'] === 'false') {
            return false;
        }
        $started = !isset($currency['_from']) || $now >= new DateTime($currency['_from']);
        $notEnded = !isset($currency['_to']) || $now <= new DateTime($currency['_to']);

        return $started && $notEnded;
    });

    // The last remaining candidate is treated as the current currency.
    $currentCurrency = !empty($currencyCandidates) ? key(end($currencyCandidates)) : null;

    $baseData[$countryCode]['currency_code'] = $currentCurrency;
}

Create a Locale helper class

LocaleResolverTrait has independent helpers that would be cleaner in their own class, which is also how PHP's intl structures it. This would allow us easier testing, and leave room for future logic (determining the correct parent locale, iterating through a known list of locales to fix #17, etc).
