
djy's Introduction

What's this?

This is a library of character utility functions for Clojure, inspired by useful built-in string and character libraries from other languages, most significantly Haskell's Data.Char library.

It is currently somewhat cumbersome to work with characters in Clojure. Complicating matters is the inherent complexity of dealing with supplementary characters on the JVM. Java characters are 16-bit, so only code points in the range U+0000 to U+FFFF, known as the Basic Multilingual Plane (BMP), can be expressed as single characters. Unicode has since expanded beyond the BMP, and its supplementary characters do not fit in 16 bits. Java represents each supplementary character as a pair of 16-bit characters (a surrogate pair), for a combined total of 32 bits.
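To make the surrogate-pair representation concrete, here is a small REPL sketch that uses only standard Java interop (`Character/toChars`, `String.codePointAt`, and friends). It uses nothing from this library, and the var names `cp` and `s` are just for illustration.

```clojure
;; U+211D9 lies outside the BMP, so the JVM stores it as two 16-bit chars.
(def cp 0x211D9)                                  ; 135641 in decimal

;; Character/toChars returns a char array holding the surrogate pair.
(def ^String s (String. (Character/toChars cp)))

(count s)                                         ; => 2 (chars, not code points)
(Character/isHighSurrogate (.charAt s 0))         ; => true
(Character/isLowSurrogate (.charAt s 1))          ; => true
(.codePointAt s 0)                                ; => 135641
(.codePointCount s 0 (count s))                   ; => 1
```

Note that `count` reports 16-bit units, which is why naive character handling breaks on supplementary characters.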

This library aims to provide convenient wrappers for standard Java Character library functions, as well as some new utility functions to facilitate working with characters.

Many of these functions are polymorphic in nature, by way of a single multimethod, code-point-of, which can take as an argument a character, an integer representing a Unicode code point, or a string beginning with a supplementary character (i.e. two 16-bit Java characters). The focus in doing this is ease of use by the end-user.
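As a rough illustration of this kind of polymorphism, a multimethod dispatching on the argument's class might look like the sketch below. This is not the library's actual implementation: the name `code-point-sketch` is hypothetical, and the choice to return integers unchanged is an assumption.

```clojure
;; Hypothetical sketch: normalise a character, an integer, or a string
;; into an integer code point by dispatching on the argument's class.
(defmulti code-point-sketch class)

(defmethod code-point-sketch Character [c] (int c))

;; Assumption: integer arguments are already code points and pass through.
(defmethod code-point-sketch Long [n] n)
(defmethod code-point-sketch Integer [n] (long n))

;; For strings, read the code point starting at index 0; codePointAt
;; combines a leading surrogate pair into a single code point.
(defmethod code-point-sketch String [^String s] (.codePointAt s 0))

(code-point-sketch \a)        ; => 97
(code-point-sketch 135641)    ; => 135641
```

Functions built on top of such a multimethod can then accept any of the three argument types without the caller converting first.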

Among the new utility functions is char' (on analogy with clojure.core's +' and other "enhanced" arithmetic operators that support arbitrary precision), an extension of clojure.core/char that will return a string containing a supplementary character if provided with a code point above U+FFFF, e.g. (char' 135641) => "𡇙"
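The behaviour described above can be approximated with plain Java interop. The function below is a minimal sketch under that assumption, named `char-sketch` to make clear it is not the library's own char'.

```clojure
;; Hypothetical stand-in for char': returns a Character for BMP code
;; points and a two-char String for supplementary code points.
(defn char-sketch [cp]
  (if (Character/isSupplementaryCodePoint (int cp))
    (String. (Character/toChars (int cp)))
    (char cp)))

(char-sketch 97)              ; => \a
(count (char-sketch 135641))  ; => 2, a surrogate pair wrapped in a string
```

Returning a string for supplementary code points is necessary because a single Java `char` cannot hold them.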

Another convenient function is char-range, which returns the range (inclusive) between two characters, e.g. (char-range \a \z) => (\a \b \c ... \x \y \z). This provides a concise, readable syntax for representing ranges of characters, as compared to, e.g., (map char (range (int \a) (inc (int \z)))). As a bonus, this function also supports supplementary characters, as it uses char' internally.
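One way an inclusive range like this could be built is to convert both endpoints to code points, take the integer range, and convert back. The sketch below assumes that approach; `char-range-sketch` and its local helpers are hypothetical names, and it accepts only characters and strings.

```clojure
;; Hypothetical sketch of an inclusive character range that also covers
;; supplementary characters (passed in as two-char strings).
(defn char-range-sketch [start end]
  (letfn [(to-cp [x]
            (if (char? x)
              (int x)
              (.codePointAt ^String x 0)))
          (from-cp [cp]
            (if (Character/isBmpCodePoint (int cp))
              (char cp)
              (String. (Character/toChars (int cp)))))]
    (map from-cp (range (to-cp start) (inc (to-cp end))))))

(char-range-sketch \a \e)     ; => (\a \b \c \d \e)
```

Outside the BMP, each element of the result is a string rather than a Character, mirroring the char' behaviour described earlier.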

My hope is that this library will end up in clojure.contrib or (my pipe dream) as a part of Clojure proper as "clojure.char."

Any feedback and suggestions would be very welcome. Feel free to join the discussion going on in the Clojure dev Google group.

Enjoy!

- Dave Yarwood, 10/8/14

To do:

  • Write comprehensive, automated tests.
  • Remedy potential performance issues caused by dynamic type introspection, as noted by Mikera.
