vitorluizc / normalize-text

📝 Provides simple functions to normalize texts, white-spaces, paragraphs & diacritics.

License: MIT License

normalize-text's Introduction

Normalize Text

Provides a simple API to normalize texts, white-spaces, names, paragraphs & diacritics (accents).

  • 📦 Distributions in ESM, CommonJS, UMD and UMD minified formats.

    • Supports Node.js ESM and CommonJS.
  • ⚡ Lightweight:

    • It's bundled with Rollup and Bublé.
    • Smaller than 1KB (min + gzip).
    • Supports tree shaking.
  • 🔋 Batteries included:

    • It's not based on newer browser APIs or ES2015+ features.
    • Only needs a String.prototype.normalize polyfill for older browsers, and doesn't crash without it.
  • 🏷 Safe:

    • Type declarations for IDE and editor autocomplete/IntelliSense.
    • Made with TypeScript as strict as possible.
    • Unit tests with Jest.
    • Travis CI that keeps tests running.

Install

normalize-text is published on the npm registry, so you can install it using any Node.js package manager.

npm install normalize-text --save

# If you're using Yarn.
yarn add normalize-text

Install from CDN

The bundles of this module are also available on the jsDelivr and UNPKG CDNs.

On both, you can import just the bundle you want or use the default one, UMD.

<!-- Using default bundle from JSDelivr -->
<script src="https://cdn.jsdelivr.net/npm/normalize-text"></script>

<!-- Using default bundle from UNPKG -->
<script src="https://unpkg.com/normalize-text"></script>

<script>
  /**
   * The UMD bundle exposes normalize-text through the `normalizeText` object.
   */
  normalizeText.capitalizeFirstLetter('vitor');
  //=> "Vitor"
</script>

Usage

All functions are exported from the module as named exports.

import { normalizeText } from 'normalize-text';

normalizeText([
  'Olá\r\n',
  '  como  está a   senhorita?'
]);
//=> "ola como esta a senhorita?"

API

capitalizeFirstLetter

Capitalizes the first character of the received text.

capitalizeFirstLetter('vitorLuizC');
//=> "VitorLuizC"
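
As a rough illustration of the behavior described above, here is a minimal sketch. The name `capitalizeFirstLetterSketch` is hypothetical and the body is an assumption about how such a function could work, not the library's source.

```javascript
// Hypothetical re-implementation for illustration; not the library's source.
function capitalizeFirstLetterSketch(text) {
  // Upper-case the first character and leave the rest of the text untouched.
  return text.charAt(0).toUpperCase() + text.slice(1);
}

console.log(capitalizeFirstLetterSketch('vitorLuizC'));
//=> "VitorLuizC"
```

Only the first character changes, which is why the inner capitals in "vitorLuizC" survive.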

normalizeDiacritics

If String.prototype.normalize is supported, it normalizes diacritics in the received text by replacing accented characters with their "clean" (unaccented) counterparts.

It doesn't normalize special characters.

normalizeDiacritics('Olá, você aí');
//=> 'Ola, voce ai'

normalizeDiacritics('àáãâäéèêëíìîïóòõôöúùûüñçÀÁÃÂÄÉÈÊËÍÌÎÏÓÒÕÔÖÚÙÛÜÑÇ');
//=> "aaaaaeeeeiiiiooooouuuuncAAAAAEEEEIIIIOOOOOUUUUNC"

normalizeDiacritics('@_$><=-#!,.`\'"');
//=> "@_$><=-#!,.`'\"";
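
One common way to get this behavior is to decompose the text with Unicode NFD and drop the combining marks. The sketch below assumes that technique and uses a hypothetical name, `normalizeDiacriticsSketch`. The library may do this differently.

```javascript
// Hypothetical sketch of diacritics removal; not the library's source.
function normalizeDiacriticsSketch(text) {
  // Without String.prototype.normalize, return the text unchanged instead of throwing.
  if (typeof String.prototype.normalize !== 'function') return text;
  // NFD splits "á" into "a" plus a combining mark; the regex drops marks in U+0300 to U+036F.
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(normalizeDiacriticsSketch('Olá, você aí'));
//=> "Ola, voce ai"
```

Symbols such as `@`, `$` and `#` carry no combining marks, so they pass through unchanged.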

normalizeName

Normalizes the received name by normalizing its white-spaces and capitalizing the first letter of every word, except for the words listed as exceptions (passed in lower-case as the second argument).

normalizeName(' fernanDA  MONTENEGRO');
//=> "Fernanda Montenegro"

normalizeName(' wilson da costa', ['da']);
//=> "Wilson da Costa"
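
The sketch below shows one way the description above could be implemented. `normalizeNameSketch` is a hypothetical name, and keeping the exception words in lower-case is an assumption drawn from the examples. Unlike the library, it uses ES2015+ syntax for brevity.

```javascript
// Hypothetical sketch; the library's actual implementation may differ.
function normalizeNameSketch(name, exceptions = []) {
  return name
    .trim()
    .split(/\s+/) // collapse any run of white-space between words
    .map((word) => {
      const lower = word.toLowerCase();
      // Words listed as exceptions stay in lower-case.
      if (exceptions.includes(lower)) return lower;
      return lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join(' ');
}

console.log(normalizeNameSketch(' wilson da costa', ['da']));
//=> "Wilson da Costa"
```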

normalizeParagraph

Normalizes a paragraph by normalizing its white-spaces, capitalizing the first letter and adding a period at the end.

normalizeParagraph(' once upon a time');
//=> "Once upon a time."

normalizeParagraph('hello world, my friend\r\n');
// => 'Hello world, my friend.'
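
A minimal sketch of the three steps described above follows. `normalizeParagraphSketch` is a hypothetical name. Skipping the period when the text already ends with one is an assumption; the README doesn't say how the library treats that case.

```javascript
// Hypothetical sketch; not the library's source.
function normalizeParagraphSketch(text) {
  // Step 1: collapse white-space runs and trim both ends.
  const clean = text.replace(/\s+/g, ' ').trim();
  if (clean === '') return clean;
  // Step 2: capitalize the first letter.
  const capitalized = clean.charAt(0).toUpperCase() + clean.slice(1);
  // Step 3 (assumption): add a period unless the text already ends with one.
  return capitalized.endsWith('.') ? capitalized : capitalized + '.';
}

console.log(normalizeParagraphSketch('hello world, my friend\r\n'));
//=> "Hello world, my friend."
```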

normalizeText

Resolves the received texts into a single string (joining them when it receives an Array), then normalizes its white-spaces and diacritics and transforms it to lower-case.

normalizeText(' so there\'s  a  Way to NORMALIZE ');
//=> "so there\'s a way to normalize"

normalizeText(['Olá\r\n', 'como está a   senhorita?']);
//=> "ola como esta a senhorita?"
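
The description reads as a small pipeline, which the sketch below spells out step by step. `normalizeTextSketch` is a hypothetical name, and joining array items with a single space is an assumption inferred from the example output above.

```javascript
// Hypothetical sketch of the pipeline; not the library's source.
function normalizeTextSketch(input) {
  // Assumption: array items are joined with a single space.
  const text = Array.isArray(input) ? input.join(' ') : input;
  // Collapse white-space runs (including \r\n) and trim both ends.
  const collapsed = text.replace(/\s+/g, ' ').trim();
  // Decompose accented characters and drop the combining marks.
  const stripped = collapsed.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
  return stripped.toLowerCase();
}

console.log(normalizeTextSketch(['Olá\r\n', 'como está a   senhorita?']));
//=> "ola como esta a senhorita?"
```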

normalizeWhiteSpaces

Normalizes all white-space characters and removes leading and trailing ones from the received text.

normalizeWhiteSpaces(' What exactly is it?   ');
//=> "What exactly is it?"

normalizeWhiteSpaces('Hi,   how is \r\n everything  \t?');
//=> 'Hi, how is everything ?'

normalizeWhiteSpaces`It is ${temperature}\n  degree\r outside.  `
//=> 'It is 25 degree outside.'
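
The examples above call the function both directly and as a template tag. The sketch below shows one way a single function could support both forms. `normalizeWhiteSpacesSketch` is a hypothetical name, and the detection via the `raw` property is an assumption rather than the library's confirmed approach.

```javascript
// Hypothetical sketch; not the library's source.
function normalizeWhiteSpacesSketch(strings, ...values) {
  // When used as a template tag, `strings` is an array carrying a `raw` property.
  const isTemplateCall = Array.isArray(strings) && Array.isArray(strings.raw);
  const text = isTemplateCall
    ? strings.reduce(
        // Interleave each literal part with the interpolated value that follows it.
        (out, part, i) => out + part + (i < values.length ? String(values[i]) : ''),
        ''
      )
    : String(strings);
  // Collapse every white-space run into one space and trim both ends.
  return text.replace(/\s+/g, ' ').trim();
}

const temperature = 25;
console.log(normalizeWhiteSpacesSketch`It is ${temperature}\n  degree\r outside.  `);
//=> "It is 25 degree outside."
```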

License

Released under the MIT license.

normalize-text's People

Contributors

hexetia, mechamobau, phuvinhbmt, rayzr522, vitorluizc, zscaiosi

normalize-text's Issues

Emoji Normalization Feature

I was thinking about this lib and there's a growing need to handle emojis effectively in text normalization. This feature would convert emojis into their corresponding textual descriptions, making the text more comprehensible and analyzable, especially when processing social media content or informal communications.

Use Case:
Often, emojis are used in texts to convey emotions or actions that are not captured by plain text. Normalizing these into words can aid in sentiment analysis, text-to-speech applications, and in contexts where emojis are not supported or are less meaningful.

Implementation Idea:
We could create a mapping of commonly used emojis to their respective descriptive phrases. The normalization function should then detect these emojis in the text and replace them with the mapped phrases.

The Gitmoji project could serve as a reference, since it lists every emoji and code that can be used in commit messages, and this feature could adapt that list to its own context (e.g. they use :bug: for commits that fix bugs; maybe :insect: or something similar could be used instead). GitHub also has its own text-to-emoji cheat sheet.

Potential Challenges:

  • Ensuring comprehensive coverage of frequently used emojis.
  • Deciding on standardized descriptive text for each emoji, considering cultural and contextual variances.

Benefits:

  • Enhances the utility of text normalization in modern communication contexts.
  • Facilitates better understanding and processing of texts rich in emojis.

I believe this feature would be a valuable addition to the 'normalize-text' project, helping people who want to support apps that receive emoji codes and handle the emojis as needed.
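
To make the proposal concrete, here is a minimal sketch of the mapping idea. Everything in it is hypothetical: the function name `describeEmojisSketch`, the tiny mapping table and the chosen descriptions are illustrative only, and nothing like this exists in the library today.

```javascript
// Hypothetical mapping; the descriptions are illustrative, not a standard.
const EMOJI_DESCRIPTIONS = new Map([
  ['🐛', 'bug'],
  ['🎉', 'party popper'],
  [':bug:', 'bug'],
  [':tada:', 'party popper'],
]);

function describeEmojisSketch(text) {
  // First replace known shortcodes such as ":bug:"; unknown ones are left as-is.
  const withoutShortcodes = text.replace(/:[a-z0-9_+-]+:/g, (code) =>
    EMOJI_DESCRIPTIONS.has(code) ? EMOJI_DESCRIPTIONS.get(code) : code
  );
  // Array.from iterates by code point, so surrogate pairs stay intact.
  return Array.from(withoutShortcodes)
    .map((symbol) => (EMOJI_DESCRIPTIONS.has(symbol) ? EMOJI_DESCRIPTIONS.get(symbol) : symbol))
    .join('');
}

console.log(describeEmojisSketch('Fixed a 🐛 :tada:'));
//=> "Fixed a bug party popper"
```

This per-code-point lookup would not cover emojis built from several code points (for example, sequences joined with a zero-width joiner), which ties into the coverage challenge listed above.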

Cannot read property 'normalize' of undefined

Hi,

I am currently working on a project developed with the NestJS framework and TypeScript, and when I try to use the normalizeText() function I get the error TypeError: Cannot read property 'normalize' of undefined.

The full error message is as follows:

TypeError: Cannot read property 'normalize' of undefined
    at normalizeDiacritics (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\node_modules\normalize-text\src\normalizeDiacritics.js:12:16)
    at C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\node_modules\@bitty\pipe\src\pipe.js:23:53
    at Array.reduce (<anonymous>)
    at Object.<anonymous> (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\node_modules\@bitty\pipe\src\pipe.js:23:20)
    at new User (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\src\users\users.entity.ts:82:44)
    at EntityMetadata.Object.<anonymous>.EntityMetadata.create (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\src\metadata\EntityMetadata.ts:527:23)
    at EntityMetadataValidator.Object.<anonymous>.EntityMetadataValidator.validate (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\src\metadata-builder\EntityMetadataValidator.ts:118:47)
    at C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\src\metadata-builder\EntityMetadataValidator.ts:46:56
    at Array.forEach (<anonymous>)
    at EntityMetadataValidator.Object.<anonymous>.EntityMetadataValidator.validateMany (C:\Users\jmcandia\Repositorios\proyecto-morpheus\morpheuscrm-api\src\metadata-builder\EntityMetadataValidator.ts:46:25)

The version of normalize-text I am using is 2.3.2.
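
The trace shows normalizeDiacritics reading `normalize` on an undefined value, reached from the `User` constructor. One plausible reading, not confirmed in the report, is that the constructor ran without a value to normalize. Under that assumption, a caller-side guard could look like the sketch below. `normalizeIfPresent` and `standInNormalizer` are hypothetical names, and the stand-in replaces the real normalizeText only to keep the example self-contained.

```javascript
// Hypothetical guard: only call the normalizer when there is text to normalize.
function normalizeIfPresent(normalize, value) {
  if (typeof value === 'string' || Array.isArray(value)) return normalize(value);
  // undefined, null and other non-text values are passed through untouched.
  return value;
}

// Stand-in for the real normalizer, used here only for illustration.
const standInNormalizer = (text) => String(text).trim().toLowerCase();

console.log(normalizeIfPresent(standInNormalizer, undefined));
//=> undefined
console.log(normalizeIfPresent(standInNormalizer, '  ANA '));
//=> "ana"
```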

Roadmap to version 1.0

  • Remove the uncouple dependency.
  • Move to a "not so functional" style.
    • Remove compose function.
  • Use TypeScript, even on tests.

Add normalizeName or a capitalize words function

Add a function to capitalize words like a name.

import { normalizeName } from 'normalize-text';

normalizeName('fernanda montenegro') // 'Fernanda Montenegro'
normalizeName('leonardo matos    nascimento   ') // 'Leonardo Matos Nascimento'
normalizeName('ALOÍSIO NUNES') // 'Aloísio Nunes'
