alerque / decasify

A CLI utility, Rust crate, Lua Rock, Python module, and JavaScript module to cast strings to title case according to locale-specific style guides, including Turkish support.

License: GNU General Public License v3.0

decasify's Introduction

decasify

This project was born out of frustration with ALL CAPS TITLES in Markdown: no tooling seemed to properly support casting them to title case, particularly for Turkish. Many tools can handle the casing of single words, and some can handle English strings, but nothing seemed to exist for full Turkish strings.
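
To illustrate why generic casing tools fall short for Turkish, here is a minimal Python sketch. It is not decasify's implementation (that lives in the Rust crate), and the helper names `tr_lower` and `tr_upper` are hypothetical. Turkish treats dotted and dotless i as separate letters (I/ı and İ/i), which locale-unaware case mapping gets wrong:

```python
# Hypothetical helpers for illustration only; not part of decasify's API.
# Map the two Turkish i pairs explicitly before falling back to the
# locale-unaware str.lower() / str.upper().
TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})
TR_UPPER = str.maketrans({"ı": "I", "i": "İ"})


def tr_lower(text: str) -> str:
    """Lowercase text using Turkish dotted/dotless i rules."""
    return text.translate(TR_LOWER).lower()


def tr_upper(text: str) -> str:
    """Uppercase text using Turkish dotted/dotless i rules."""
    return text.translate(TR_UPPER).upper()


print("ILIK".lower())     # ilik  (generic mapping loses the dotless ı)
print(tr_lower("ILIK"))   # ılık
print(tr_upper("iten"))   # İTEN
```

A full title-casing routine needs considerably more than this (word boundaries, conjunctions such as "ve", punctuation), but the i mapping alone is enough to break tools that assume English rules.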

The CLI defaults to title case and English, but lower, upper, and sentence case options are also available. The Rust, Lua, Python, and JavaScript library APIs have functions specific to each operation. Where possible, the APIs currently default to English rules and (for English) the Gruber style rules, but others are available. The Turkish rules follow the Turkish Language Institute's guidelines.

For English, three style guides are known: Associated Press (AP), Chicago Manual of Style (CMOS), and John Gruber's Daring Fireball (Gruber). The Gruber style is by far the most complete, being implemented by the titlecase crate. The CMOS style handles a number of parts of speech but has punctuation-related issues. The AP style is largely unimplemented. Contributions are welcome for better style guide support or further languages.

$ decasify -l tr ILIK SU VE İTEN RÜZGARLAR
Ilık Su ve İten Rüzgarlar
$ echo ILIK SU VE İTEN RÜZGARLAR | decasify -l tr
Ilık Su ve İten Rüzgarlar
$ echo foo BAR AND baz: an alter ego | decasify -l en -s gruber
Foo BAR and Baz: An Alter Ego
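
The English example above can be approximated with a much-simplified Python sketch. This is an assumption-laden illustration, not the Gruber rules as implemented by decasify or the titlecase crate: the `SMALL_WORDS` list and the `simple_titlecase` function are hypothetical, and real style guides cover many more cases (final words, hyphenation, quotations, and so on).

```python
# Hypothetical, simplified rules for illustration only:
#   * small words are lowercased unless they start the title or follow a colon
#   * all-lowercase words get an initial capital
#   * words that already contain capitals (e.g. "BAR") are left alone
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for", "in",
               "of", "on", "or", "the", "to", "via", "vs"}


def simple_titlecase(text: str) -> str:
    out = []
    start = True  # True at the start of the title and right after a colon
    for word in text.split():
        core = word.rstrip(":")
        if core.lower() in SMALL_WORDS and not start:
            out.append(word.lower())
        elif core.islower():
            out.append(word[0].upper() + word[1:])
        else:
            out.append(word)
        start = word.endswith(":")
    return " ".join(out)


print(simple_titlecase("foo BAR AND baz: an alter ego"))
# Foo BAR and Baz: An Alter Ego
```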

Use as a CLI tool

Use of the CLI is pretty simple. Input may be either shell arguments or STDIN.

$ decasify --help
A CLI tool to convert all-caps strings to title-case or other less aggressive tones that supports
Turkish input

Usage: decasify [OPTIONS] [INPUT]...

Arguments:
  [INPUT]...  Input string

Options:
  -l, --locale <LOCALE>  Locale [default: EN] [possible values: EN, TR]
  -c, --case <CASE>      Target case [default: Title] [possible values: Lower, Sentence, Title,
                         Upper]
  -s, --style <STYLE>    Style Guide [possible values: ap, cmos, gruber]
  -h, --help             Print help
  -V, --version          Print version
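
As a rough sketch of the "arguments or STDIN" behaviour described above, the following Python snippet shows one way a wrapper could choose its input source. The function names are hypothetical, and processing STDIN line by line is an assumption; the source only shows that arguments are joined into a single string.

```python
import sys


def read_input(argv, stream):
    """Return the strings to convert.

    If positional arguments were given, join them into one string;
    otherwise read one string per line from the stream (assumed behaviour).
    """
    if argv:
        return [" ".join(argv)]
    return [line.rstrip("\n") for line in stream]


def main():
    # Hypothetical entry point: echo back what would be converted.
    for text in read_input(sys.argv[1:], sys.stdin):
        print(text)
```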

First, check your distro for packages; for example, on Arch Linux you can get it from the AUR.

Otherwise, on many platforms you can run it directly or install it to a shell using Nix Flakes:

$ nix run github:alerque/decasify

To do a full install from source, grab the tarball attached to the latest release or use Git to clone the repository. Don't use the "source code" zip/tar.gz files linked from releases; go for the tar.zst source file. If you use a Git clone, first run ./bootstrap.sh after checkout. This isn't needed for the source release tarballs. Next, configure and install with:

$ ./configure
$ make
$ sudo make install

Note that installing from source has the advantage of including a man page and shell completions. All the usual autotools options apply; see --help for details. The most commonly used option, especially for distro packagers, is probably --prefix /usr to change the install location from the default of /usr/local.

Of course the bare binary can also be installed directly with Cargo:

$ cargo install --features cli decasify

Use as Rust crate

Add the crate to your Cargo.toml file:

[dependencies]
decasify = "0.5"

Then use the crate functions and types in your project something like this:

use decasify::to_titlecase;
use decasify::{InputLocale, StyleGuide};

fn demo() {
    // Turkish input with no explicit style guide.
    let input = "ILIK SU VE İTEN RÜZGARLAR";
    let output = to_titlecase(input, InputLocale::TR, None);
    eprintln!("{output}");
    // English input using the Gruber (Daring Fireball) style guide.
    let input = "title with a twist: a colon";
    let output = to_titlecase(input, InputLocale::EN, Some(StyleGuide::DaringFireball));
    eprintln!("{output}");
}

Use as Lua Rock

Depend on the LuaRock in your project or install with luarocks install decasify:

dependencies = {
   "decasify"
}

Then import and use the provided functions:

local decasify = require("decasify")
local input = "ILIK SU VE İTEN RÜZGARLAR"
local output = decasify.titlecase(input, "tr")
print(output)
input = "title with a twist: a colon"
output = decasify.titlecase(input, "en", "gruber")
print(output)

Use as Python Module

Depend on the Python module in your project or install with pip install decasify:

[project]
dependencies = [
  "decasify"
]

Then import and use the provided functions and type classes:

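from decasify import titlecase, InputLocale, StyleGuide

# Turkish input with no explicit style guide.
input = "ILIK SU VE İTEN RÜZGARLAR"
output = titlecase(input, InputLocale.TR)
print(output)
# English input using the Gruber (Daring Fireball) style guide.
input = "title with a twist: a colon"
output = titlecase(input, InputLocale.EN, StyleGuide.DaringFireball)
print(output)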

Use as JavaScript (WASM) Module

Depend on the WASM-based JavaScript module in your project with npm add decasify.

Then import and use the provided functions and classes:

import { titlecase, uppercase, lowercase, InputLocale, StyleGuide } from 'decasify';

let input = "ILIK SU VE İTEN RÜZGARLAR";
let output = titlecase(input, InputLocale.TR);
console.log(output);

input = "title with a twist: a colon";
output = titlecase(input, InputLocale.EN, StyleGuide.DaringFireball);
console.log(output);

decasify's People

Contributors: alerque, patryk27

decasify's Issues

Consider re-licensing GPL → LGPL

I originally envisioned this just as a CLI tool, and the GPL license exactly suits my tastes. However, I quickly realized that for some use cases it would be easier to use as a library, and first the Rust, then the Lua, and now the Python library interfaces were born. The GPL does make use as a library a bit tough.

I would like to consider relaxing the GPL to be LGPL, at least for the library interfaces. The CLI could potentially stay GPL, but that seems unnecessary since it would just be a clap-based parsing wrapper around the Rust library anyway.

The main difference would be allowing use of decasify as a shared library in other projects without having to include a copy of the source (unless modified, in which case it should be published) and without the license being viral, which would require releasing the including project under a GPL license. This might have a significant impact on adoption, since one can't use a purely GPL-licensed library in a more permissively licensed project without creating an issue for the wider project.

@Patryk27, as the only outside contributor to date, I would need a sign-off from you to make this happen. Would you be okay with this?

Incorrect TR titlecase output

Not sure why this didn't get capitalized, but it ain't right:

$ decasify -l tr -c title <<< 'dualarımızda minnettarlık ve övgü'
Dualarımızda minnettarlık ve Övgü

$ decasify -V
decasify v0.5.7

Publish as Node/JavaScript package

With Lua and Python in the bag (#2), I guess I should look into publishing a JavaScript interface. At this point in history I assume that means something similar to how Python and Lua are handled with some way to compile to WASM and include a tiny bit of wrapper code and some tooling to make it accessible in the JS/Node ecosystem.

Provide Python module

It should be pretty easy to wrap the Rust library up in a Python library module. The existing Python titlecase package works okay for English, but it doesn't cover Turkish or style guides other than the NYT/MoS.

Compresses spaces in long strings

Yah not so good.... if you feed this a large block of text, say with indented lines, the extra spacing gets compressed. It works great for sentence level fragments with only single-space breaks, but not more complex text.
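
One plausible cause, stated here as an assumption rather than a diagnosis of decasify's code, is splitting on whitespace and re-joining with single spaces. The Python sketch below contrasts that with a split that keeps the original separators; both function names are hypothetical, and `str.capitalize` stands in for real title-casing rules.

```python
import re


def naive_title(text: str) -> str:
    # str.split() with no arguments collapses every run of whitespace,
    # so indentation and line breaks are lost when re-joining.
    return " ".join(word.capitalize() for word in text.split())


def preserving_title(text: str) -> str:
    # The capturing group makes re.split keep the whitespace runs, so
    # they can be passed through untouched.
    parts = re.split(r"(\s+)", text)
    return "".join(p if p.isspace() else p.capitalize() for p in parts)


sample = "foo   bar\n    baz"
print(repr(naive_title(sample)))       # 'Foo Bar Baz'
print(repr(preserving_title(sample)))  # 'Foo   Bar\n    Baz'
```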

Implement French title casing rules

It looks like French is one of the harder cases around for Title Case rules, and (perhaps consequently) also in demand. There also doesn't seem to be a Rust implementation yet. This one in JavaScript is the best I've seen so far, but even that highlights the complexity and disagreement over style guides.

If this library is going to be usable in places like SILE or Typst, it ought to have more than just English and Turkish out of the gate. Especially since SILE has French contributors, it seems like a likely candidate to try to implement here.

Publish as Typst package

With WASM builds working (see #6), it should be possible to set up a build that acts as a Typst package and exposes locale-aware casing functions to Typst, something it lacks at the moment.
