doubaokun / seoserver

This project forked from moviepilot/seoserver

A PhantomJS-based SEO server that serves pages of JavaScript apps to search engine robots such as Googlebot.

License: Other

JavaScript 29.18% CoffeeScript 70.82%

seoserver's Introduction

Welcome!

Seo Server is a command-line tool that runs a server allowing Googlebot (and any other crawler) to crawl JavaScript-heavy websites. It requires very few changes to your server-side or client-side code.

Getting started

  • Install CoffeeScript (if not already installed)
    npm install -g coffee-script
  • Edit the sample configuration file src/config.coffee.sample and save it as src/config.coffee
  • Compile the config into the project's lib/ directory
    coffee --output lib/ -c src/config.coffee
  • Install npm dependencies
    npm install
  • Install PhantomJS
    npm install -g phantomjs
  • Start the main process on port 10300 with the default memcached configuration:
    bin/seoserver start -p 10300

Internals

The crawler has three parts:

lib/phantom-server.js A small PhantomJS script for fetching the page and returning the response along with the response headers in serialized form. It can be executed via:

phantomjs lib/phantom-server.js http://moviepilot.com/stories
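
As a rough illustration of how a Node process might invoke this script, the sketch below shells out with the same command line. The helper names phantomCommand and renderWithPhantom, the 30-second timeout and the buffer size are assumptions made for this example and are not taken from the project. The output is returned unparsed, because the serialization format is not described here.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: builds the same command line as shown above.
function phantomCommand(url, script = 'lib/phantom-server.js') {
  return { file: 'phantomjs', args: [script, url] };
}

// Hypothetical helper: runs the script and resolves with its raw stdout.
// Timeout and buffer size are illustrative values, not project defaults.
function renderWithPhantom(url, timeoutMs = 30000) {
  const { file, args } = phantomCommand(url);
  const options = { timeout: timeoutMs, maxBuffer: 10 * 1024 * 1024 };
  return new Promise((resolve, reject) => {
    execFile(file, args, options, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}
```

Passing the URL as a separate argument to execFile, rather than building a shell string, avoids shell interpretation of characters in the URL.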

lib/seoserver.js A Node.js Express app that accepts requests from Googlebot, checks memcached for a cached version of the page, and otherwise fetches the page via phantom-server.js.

You can start it locally with:

node lib/seoserver.js start

And test its output with:

curl -v http://localhost:10300
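
The cache-then-render flow described for lib/seoserver.js can be sketched roughly as follows. This is not the project's code: createPageFetcher is a hypothetical name, a plain Map stands in for memcached, and the render function stands in for the call to phantom-server.js.

```javascript
// Minimal sketch of a cache-then-render flow. All names are hypothetical.
// `cache` needs get(key) and set(key, value); a Map stands in for memcached.
// `render` is any function that returns a promise for the rendered page.
function createPageFetcher(cache, render) {
  return async function fetchPage(url) {
    const cached = cache.get(url);
    if (cached !== undefined) {
      // Cache hit: serve the stored page without rendering again.
      return { body: cached, fromCache: true };
    }
    // Cache miss: render the page, then store it for later requests.
    const body = await render(url);
    cache.set(url, body);
    return { body, fromCache: false };
  };
}
```

Injecting the cache and the renderer keeps the flow easy to exercise without memcached or PhantomJS installed.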

bin/seoserver A forever-monitor script for launching and monitoring the main Node process. Start it with:

bin/seoserver start -p 10300

Nginx and Varnish configuration examples

Your web server has to detect incoming search engine requests in order to route them to the seoserver. One way to do so is to look for the string "bot" in the User-Agent header or to check the URL for Google's escaped fragment. In Nginx you can check the variable $http_user_agent and set the backend similar to this:

location / {
  proxy_pass  http://defaultbackend;
  if ($http_user_agent ~* bot)  {
    proxy_pass  http://seoserver;
  }
}
location ~* escaped_fragment {
  proxy_pass  http://seoserver;
}
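
The same two checks can be expressed in application code, for example when experimenting without a proxy in front. The function below is a hypothetical helper written for this illustration and is not part of seoserver. It mirrors the case-insensitive match on "bot" and the escaped-fragment check described above.

```javascript
// Hypothetical helper mirroring the two routing checks described above.
// Returns true when a request should be routed to the seoserver.
function isCrawlerRequest(userAgent, url) {
  // Case-insensitive substring match, like `~* bot` in the Nginx example.
  const looksLikeBot = /bot/i.test(userAgent || '');
  // Matches Google's escaped fragment anywhere in the requested URL.
  const hasEscapedFragment = /escaped_fragment/i.test(url || '');
  return looksLikeBot || hasEscapedFragment;
}
```

Matching the bare substring "bot" is deliberately broad, so it will also catch any other client whose User-Agent happens to contain it.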

If you serve a cached version of your website through a reverse proxy, you can perform a similar check there. A VCL example for Varnish:

if (req.http.User-Agent ~ "bot" || req.url ~ "escaped_fragment") {
  set req.http.UA-Type = "crawler";
} else {
  set req.http.UA-Type = "regular";
}

Credits

This code is based on a tutorial by Thomas Davis and on https://github.com/apiengine/seoserver

