polylalia

Polylalia is an experiment exploring the capabilities of stylometric analysis of a body of text. It attempts anonymity by running the text through the Yandex translation engine multiple times, producing a result that is similar to the original yet somewhat stilted.

Think of Polylalia as a script that makes Yandex play the telephone game with itself and gives you the result. Polylalia works only on plaintext. It is modular, so you can import polylalia and use the Polylalia class in your own programs.
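
The telephone-game idea can be sketched roughly as below. This is an illustration only, not the code in polylalia.py: the function name `telephone`, the `hops` list, and the `translate(text, source, target)` callable are all assumptions made for the example, and a stand-in translator is used so the sketch runs without an API key or network access.

```python
from typing import Callable, Sequence

# Hypothetical translator signature: translate(text, source_lang, target_lang).
# A real one would wrap a translation service; this name is not from Polylalia.
Translator = Callable[[str, str, str], str]


def telephone(text: str, hops: Sequence[str], translate: Translator,
              home: str = "en") -> str:
    """Pass text through each language in hops in order, then back to home."""
    if not hops:
        # Nothing to do without at least one intermediate language.
        return text
    current = home
    for target in list(hops) + [home]:
        text = translate(text, current, target)
        current = target
    return text


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator that only records which hop was taken."""
    return f"{text} [{source}>{target}]"


if __name__ == "__main__":
    # Shows the chain of hops: en -> ru -> de -> en.
    print(telephone("hello", ["ru", "de"], fake_translate))
```

Passing the translator in as a parameter keeps the chaining logic separate from the translation service, so the hop order can be checked offline before any text is sent to a third party.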

Do not use to protect life or property

This is a basic tool for some research I am doing on stylometric fingerprinting and authorship, using the tools provided by Drexel University's PSAL. It is meant to test the hypothesis that PSAL's stylometric techniques can be circumvented by readily accessible scripts.

It has not been audited and makes no guarantee of privacy or anonymity against stylometric analysis. Also, understand that your raw text is sent to Yandex, so you are trusting Yandex with it. Anyone with access to the Yandex logs could fingerprint this script's behavior.

When I have results from testing it against the stylometry tools of PSAL and others, I will update this document.

Setup/installation

You will need the yandex.translate dependency and a free Yandex Translate API key, which you can get from the Yandex developer site.

# apt-get install -y python-pip
# pip install -r requirements.txt

Usage

For command-line use, see ./polylalia.py -h.

License

GNU GPL v3. See LICENSE.

Donate

If you like Polylalia and are interested in stylometry, donate your time by teaching others about the importance of free speech and privacy in your community.
