unique-country-prefixes

What is the shortest prefix that uniquely identifies the name of each country? (I'm using "prefix" in the computer science sense, so, for example, S, SP, SPA, and SPAI are all prefixes of SPAIN, as well as SPAIN itself, and the empty string, ε.)

An alternative formulation: if you had an autocomplete field for choosing a country, what is the shortest sequence of letters you would have to type before your options are narrowed down to one specific country?

Example: SWE is the shortest unique prefix for Sweden. Sweden is the only country that begins with this letter sequence, and there is no shorter prefix that has this property (because, for example, SW is shared with Switzerland).
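The definition above can be sketched in a few lines of Python. This is a minimal brute-force illustration, not necessarily how the notebook computes it; the function name `shortest_unique_prefix` and the tiny sample list are made up for the example. Names are compared case-insensitively, and `None` is returned when every prefix of a name (including the whole name) also begins another name.

```python
def shortest_unique_prefix(name, all_names):
    """Return the shortest prefix of `name` that no other name starts with.

    Returns None if no such prefix exists, i.e. the whole name is itself
    a prefix of some other name in `all_names`.
    """
    target = name.upper()
    others = [n.upper() for n in all_names if n.upper() != target]
    # Try prefixes from shortest to longest; the first unshared one wins.
    for length in range(1, len(target) + 1):
        prefix = target[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return None


# Hypothetical sample; the real input would be the full list of names.
countries = ["Sweden", "Switzerland", "Spain", "Niger", "Nigeria"]
for country in countries:
    print(country, shortest_unique_prefix(country, countries))
```

On this sample, Sweden needs SWE because SW is shared with Switzerland, while Niger gets `None` because every one of its prefixes also begins Nigeria.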

Some possibly surprising facts:

  • There are 3 countries that are uniquely specified by their first letter
  • The 2 countries with the longest shortest unique prefixes require 13 characters (including spaces) to distinguish: REPUBLIC OF _
  • There are 2 countries whose shortest unique prefix is not "proper" - i.e. it is the whole name of the country. (3 if you count Iran - see information on data sources below.)
  • There are 3 countries that have no unique prefix!

Data

For the purposes of this experiment, I used the 193 United Nations member states as of July 2022. I used the English name listed on the UN website, which may differ from the country's endonym or official full English name (e.g. 'Germany', rather than 'Deutschland' or 'Federal Republic of Germany'). Most of the forms used here are the recognizable ones used in everyday conversation, though there are a few exceptions (e.g. the country commonly known as Turkey has requested to be referred to as Türkiye as of May 2022).

I used the repository cristiroma/countries for two purposes:

  • For the flag icons used in the generated infographics. To load these images locally, you'll need to clone the countries repo under the root of this repo.
  • To generate countries.csv. I started from the file at countries/data/csv/countries.csv, then manually winnowed it down to just the UN member states, and manually updated the first 'name' column for a couple of states to match the form currently used by the UN.
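As a rough illustration of reading the names back out of countries.csv, here is a small sketch. It assumes the file has a header row and that the country name is the first column, as described above; the helper name `load_country_names` and the second column in the sample data are hypothetical.

```python
import csv
import io


def load_country_names(csv_file):
    """Return the first-column values of a CSV file, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # assumption: the first row is a header such as 'name,...'
    # Ignore blank rows and rows whose name cell is empty.
    return [row[0].strip() for row in reader if row and row[0].strip()]


# In-memory stand-in for the real file; 'alpha2' is a made-up column name.
sample = io.StringIO("name,alpha2\nSweden,SE\nSwitzerland,CH\n")
names = load_country_names(sample)
print(names)
```

With the real file, the same function could be called on `open("countries.csv", newline="", encoding="utf-8")`.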

Infographic generation

The IPython notebook prefixes.ipynb generates an html file pres.html.

I then convert this into a pdf using Chrome's 'print to pdf' feature.

Then I convert that to an image using an ImageMagick invocation along these lines:

convert -density 150 -trim pres.pdf -quality 100 pres.png

(Grossly circuitous, I know.)

The notebook also generates another html file, spoilers.html, which is the 'answer' key version of the infographic with the full name of each country. It differs from the other html file in a few ways:

  • It includes the css file sstyles.css, which has some specific rules, e.g. making the "suffix" elements that show the remainder of a country's name after the MUP (minimal unique prefix) visible.
  • It excludes the explanatory "preamble" text under the title

There is some ad-hoc fine-tuning that goes into the image conversion process. When printing to pdf, I'll generally set margins to 'none', and may fiddle with a custom 'scale' setting. When running ImageMagick, I may or may not need to do some cropping of margins (either automatically using the -trim flag, or manually with -crop).
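If you want to script the pdf-to-image step rather than retype it, something like the following could work. It is only a sketch: the helper names `build_convert_command` and `pdf_to_png` are invented here, it assumes ImageMagick's `convert` is on your PATH, and it only toggles the -trim flag (manual -crop geometry is left out).

```python
import subprocess


def build_convert_command(pdf_path, png_path, density=150, trim=True):
    """Build the argument list for the ImageMagick call described above."""
    command = ["convert", "-density", str(density)]
    if trim:
        command.append("-trim")  # automatically crop surrounding margins
    command += [pdf_path, "-quality", "100", png_path]
    return command


def pdf_to_png(pdf_path, png_path, **options):
    """Run the conversion; raises CalledProcessError if convert fails."""
    subprocess.run(build_convert_command(pdf_path, png_path, **options), check=True)


# Show the command that would be run with the default settings.
print(" ".join(build_convert_command("pres.pdf", "pres.png")))
```

Separating the command construction from the call makes it easy to inspect or tweak the arguments before anything is executed.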
