sfpc-py101's Introduction

Hello! 👻

Today we're going to talk a bit about text scraping, manipulation, and analysis in Python.

Workshop by Phil, Riley, and Yeli.

Tools

If you don't have a favorite text editor already, download Sublime Text. You can use Xcode for these exercises if you're used to it, but we recommend Sublime since it's simpler and less clunky.

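Next we need pip, Python's package installer. If running pip --version in a terminal already prints a version number, skip the next two commands. Otherwise, open up a terminal and run this command to download the pip installer: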

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Then, to install pip, run

sudo python get-pip.py

Once we've got pip set up, we can install Beautiful Soup with

sudo pip install beautifulsoup4

Beautiful Soup helps us scrape text from the internet. Muahahaha! 👹
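To give a feel for what that looks like, here is a minimal sketch that parses a tiny hand-written page instead of a live website, so it runs without a network connection. The HTML and the variable names are made up for illustration; it assumes Python 3 and that the install above succeeded.

```python
from bs4 import BeautifulSoup

# A tiny hand-written page, so the example works offline.
html = """
<html>
  <head><title>Ghost stories</title></head>
  <body>
    <h1>Ghost stories</h1>
    <p class="intro">Boo! This is the <b>first</b> paragraph.</p>
    <p>This is the second paragraph.</p>
  </body>
</html>
"""

# "html.parser" is the parser that ships with Python itself.
soup = BeautifulSoup(html, "html.parser")

# Pull out the page title and the plain text of every <p> tag.
title = soup.title.string
paragraphs = [p.get_text() for p in soup.find_all("p")]

print(title)
for text in paragraphs:
    print(text)
```

Note that get_text() drops the tags (like the bold one here) and keeps only the words, which is usually what we want before analyzing text.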

We can install NLTK the same way:

sudo pip install nltk

NLTK is a suite of text processing libraries for Python that lets us analyze text in some really interesting and powerful ways. For the intro exercises, we'll work through part of the NLTK Book. It's a great resource, so check it out!
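Before moving on, it can help to confirm that both installs worked. The following is a rough sketch using only the standard library; the helper name `is_installed` is ours (not part of pip, Beautiful Soup, or NLTK), and it assumes Python 3.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find a module with this import name."""
    return importlib.util.find_spec(module_name) is not None


# Note the import names: the beautifulsoup4 package is imported as "bs4".
for name in ["bs4", "nltk"]:
    status = "ok" if is_installed(name) else "MISSING"
    print(name, status)
```

If either line says MISSING, re-run the matching pip command and check its output for errors.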

NLTK comes loaded with a bunch of corpora and trained models. We're going to use some of them, so in your Python REPL type:

import nltk
nltk.download()

If it looks like nothing happened, check whether a new window popped open in the background. In that window, select book under the "Collections" tab and download it.
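If the downloader window never appears (for example on a machine without a display), the same collection can usually be fetched by passing its identifier straight to nltk.download. A small sketch, assuming a working network connection; the wrapper name `download_book_collection` is ours.

```python
def download_book_collection():
    """Download the NLTK "book" collection without opening the GUI."""
    # Imported inside the function so this file still loads
    # on a machine where NLTK is not installed yet.
    import nltk

    # "book" is the identifier of the collection listed under "Collections".
    return nltk.download("book")
```

Calling download_book_collection() from the REPL should print progress as each package in the collection is fetched.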
