
alexrabarts / big_sitemap


A Sitemap library suitable for large sites. Compatible with most frameworks, including Rails and Merb.

Home Page: http://github.com/alexrabarts/big_sitemap

License: MIT License

Ruby 100.00%

big_sitemap's People

Contributors

alexrabarts, christianhellsten, christoph-buente, dalibor, eddm, lis2, mislav, rachaelroland, sbecker, solutus, sylvain, yannski


big_sitemap's Issues

find_for_sitemap method in model does not work ...

Hi,

I am trying to use my named_scopes to define the sitemap. Testing either of these in the model (product.rb) does not work:

def self.find_for_sitemap
  find(:first)
end

def find_for_sitemap
  Product.find(:first)
end

Or is it me?

Thanks! Val

:to_param option

Hey, just wondering how to force a :to_param parameter to be passed in.
Thanks for any info about it.

Error with Ruby 1.9

When I follow the first two steps from your readme, after adding a model I first get the following:

zlib(finalizer): Zlib::GzipWriter object must be closed explicitly.
zlib(finalizer): the stream was freed prematurely.

When I run 'sitemap.generate' I get the following exception: https://gist.github.com/886702

Btw. I am using Rails 3.0.3 and PostgreSQL.

doesn't /sitemaps directory ruin the sitemap?

I get confused every time I try to look up the sitemap specs, but it sounds like if the sitemap files are in /sitemaps, then they can only report urls under /sitemaps.

Does symlinking the sitemap index file to the root of the site solve this? I don't think it would?
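
For reference, the sitemaps.org protocol ties the set of URLs a sitemap file may list to the directory the file is served from. A minimal sketch of that rule, assuming a hypothetical helper named in_sitemap_scope? (it is not part of big_sitemap and ignores other ways of submitting sitemaps):

```ruby
require 'uri'

# Hypothetical helper, not part of big_sitemap.
# Returns true when `url` lives under the directory that contains the
# sitemap file at `sitemap_url` (same scheme, host and port).
def in_sitemap_scope?(sitemap_url, url)
  sitemap = URI.parse(sitemap_url)
  target  = URI.parse(url)
  same_origin = sitemap.scheme == target.scheme &&
                sitemap.host == target.host &&
                sitemap.port == target.port
  return false unless same_origin

  # Directory part of the sitemap path, always ending in "/"
  scope = sitemap.path.sub(%r{[^/]*\z}, '')
  target.path.start_with?(scope)
end

puts in_sitemap_scope?('http://example.com/sitemaps/sitemap_users.xml', 'http://example.com/users/1') # false
puts in_sitemap_scope?('http://example.com/sitemap_index.xml', 'http://example.com/users/1')          # true
```

Under this reading, what matters is the URL the sitemap file is served from, so files reachable only under /sitemaps/ would be limited to URLs below /sitemaps/.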

does partial_updates work?

It seems to re-create all the files from scratch.

The queries look like this though:
SQL (105.5ms) SELECT COUNT(users.id) FROM users WHERE (deleted=0 AND (4762191 >= '4762191'))
User Load (18.9ms) SELECT users.* FROM users WHERE (deleted=0 AND (4762191 >= '4762191')) LIMIT 5000 OFFSET 0
User Load (19.9ms) SELECT users.* FROM users WHERE (deleted=0 AND (4762191 >= '4762191') AND (id > '12655')) LIMIT 5000
User Load (20.4ms) SELECT users.* FROM users WHERE (deleted=0 AND (4762191 >= '4762191') AND (id > '12655') AND (id > '28679')) LIMIT 5000

So it's loading the old max_id, but then just including it as "4762191 >= '4762191'", which is always true and filters nothing.

Same model, different urls

My site is localized, so I have urls like http://example.com/en/users/1 and http://example.com/ar/users/1. The former url shows an English profile for the user, and the latter shows an Arabic profile for the same user.

How do I go about adding both urls to the sitemap? I tried sitemap.add User, :path => 'en/users' and another with :path => 'ar/users', but only the last one is being generated.

Ideas?
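
One possible workaround, sketched here without the gem: build every localized path explicitly and pass each one to the block-style add call that appears in other issues on this page. SketchUser, LOCALES and localized_user_paths are hypothetical names used only for illustration.

```ruby
# Stand-in for an ActiveRecord model; only the id matters for this sketch.
SketchUser = Struct.new(:id)

# Assumed list of locale prefixes used in the site's URLs.
LOCALES = %w[en ar].freeze

# Returns one path per (user, locale) pair, e.g. "/en/users/1".
def localized_user_paths(users, locales)
  users.flat_map do |user|
    locales.map { |locale| "/#{locale}/users/#{user.id}" }
  end
end

paths = localized_user_paths([SketchUser.new(1), SketchUser.new(2)], LOCALES)
puts paths
```

Inside a BigSitemap.generate block, each of these paths could then be handed to add, so that neither locale overwrites the other.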

suggestions

phew, that was a lot of work

seems like find_in_batches needs a sub iterator, as the first arg returned is an array of the first 1k results:

Group.where("deleted=0 AND id > #{get_last_id('item')}").find_in_batches do |set|
  set.each do |i|
    add i.full_url(:app), :change_frequency => 'never', :last_modified => i.updated_at
  end
end
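
The batch-versus-record distinction can be shown without ActiveRecord. In this sketch Array#each_slice stands in for find_in_batches (both yield arrays), and the ids and the /items/ path are made up for illustration. ActiveRecord also provides find_each, which yields one record at a time and avoids the inner loop.

```ruby
# Plain integers stand in for records; each_slice stands in for
# find_in_batches, since both yield one array per batch.
record_ids = (1..5).to_a

urls = []
record_ids.each_slice(2) do |batch|
  # `batch` is an array such as [1, 2], so a nested iterator is needed
  batch.each do |id|
    urls << "/items/#{id}"
  end
end

puts urls
```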

rewriting get_last_id was kind of a pain:
def get_last_id(filename)
  # Look for existing sitemap files for this name, gzipped or not
  path = "#{File.expand_path('../..', __FILE__)}/public/sitemaps/sitemap_#{filename}*.{xml,xml.gz}"
  #puts "path is #{path}"
  # Extract the id embedded in each file name and keep the highest one
  v = Dir[path].map do |file|
    file.to_s.scan(/#{filename}_(.+)\.xml/).flatten.last.to_i
  end.sort.last
  puts "starting from #{filename} id #{v}"
  v
end

My document_root directory is created, but the files aren't written inside it? I'll look into it more later I guess

Multiple I18n locales

I have to index a large website in two languages, but if I loop through the locales, each localized sitemap just overwrites the previously generated one.

Is it actually possible to generate a separate sitemap for each locale?

base_url + suggestion

It would be nice if each sitemap link was on its own line. Since the file is gzipped anyway, this doesn't really add anything to the file size.

Also, there's a bug when using :base_url instead of :url_options; I think the code is looking for @path_url or something instead.

thanks for a great project!

'add' method isn't working properly

It seems that the add method isn't working properly in my application.

After generating the sitemap XML, some URLs in the generated file don't match my application's routes. For example:

It should return:
<url>
  <loc>http://www.urieljuliatti.com.br/produto/1-ecailles-de-la-lune</loc>
  <lastmod>2011-12-05</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
</url>

Instead of:
<url>
  <loc>http://www.urieljuliatti.com.br/1-ecailles-de-la-lune</loc>
  <lastmod>2011-12-05</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
</url>

My routing system returns /produto/1-ecailles-de-la-lune, but the generated sitemap file doesn't include the product controller param. What's going on?

That's my rake task implementation:
namespace :sitemap do
  task :generate, [ :needs ] => [ :environment ] do
    require 'big_sitemap'
    include Rails.application.routes.url_helpers
    include I18n
    I18n.locale = :ptBR
    BigSitemap.generate(:base_url => Store.default.url, :document_root => "#{Rails.root}/public", :document_full => "#{Rails.root}/public") do
      Product.all.each do |product|
        add show_product_path(product), :change_frequency => 'daily', :priority => 0.5, :last_modified => product.updated_at
      end
      Brand.all.each do |brand|
        add brand_path(brand), :change_frequency => 'daily', :priority => 0.5, :last_modified => brand.updated_at
      end
    end
  end
end

installing gem

I don't know if anyone else has run into this, but I ran sudo gem install big_sitemap and it printed:
Installing ri documentation for big_sitemap-1.0.1...
Installing RDoc documentation for big_sitemap-1.0.1...

But when I ran rake routes, I got:
rake aborted!
no such file to load -- big_sitemap
The error came from my require 'big_sitemap', as if the gem had not been installed.

I got it to work, though, by putting gem "big_sitemap" in the Gemfile explicitly.

Why doesn't running gem install in the terminal add this to the Gemfile and the gemset list, without my having to do it manually?

Conditions hash issues with more than 1000 rows?

Hey! Love the gem! One issue I'm having is using a conditions hash instead of a string. For example, if I use the conditions hash:

{ :property_id => 1 }

It'll result in two batch finds, the first that works, the second that fails:

{:conditions=>{:property_id=>1}, :limit=>1001, :select=>nil, :include=>nil, :offset=>0, :group=>nil, :order=>nil, :joins=>nil}

{:conditions=>"property_id1 AND (id > 1001)", :limit=>1001, :select=>nil, :include=>nil, :offset=>nil, :group=>nil, :order=>nil, :joins=>nil}

It seems like the issue is the way the id clause for finds is combined with the conditions hash on line 161 of big_sitemap.rb. I'm not exactly sure how to solve this, but I wanted to put it out there.

Thanks!
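
One way such a merge could be made safe is to convert hash conditions into the array form (SQL with placeholders plus values) before appending the id clause. The helper below is a hypothetical sketch, not the code at line 161 of big_sitemap.rb, and it only handles simple equality (no nil or array values).

```ruby
# Hypothetical helper, not the gem's implementation.
# Combines hash or string conditions with "<primary_column> > last_id"
# and returns array-style conditions: [sql_with_placeholders, *values].
def conditions_with_id_clause(conditions, primary_column, last_id)
  id_clause = "(#{primary_column} > ?)"
  return [id_clause, last_id] if conditions.nil? || conditions.empty?

  case conditions
  when Hash
    # {:property_id => 1} becomes "(property_id = ?)" with value 1
    sql = conditions.keys.map { |key| "(#{key} = ?)" }.join(' AND ')
    ["#{sql} AND #{id_clause}", *conditions.values, last_id]
  when String
    ["(#{conditions}) AND #{id_clause}", last_id]
  else
    raise ArgumentError, "unsupported conditions: #{conditions.inspect}"
  end
end

p conditions_with_id_clause({ :property_id => 1 }, 'id', 1001)
```

The point of the sketch is that the hash is never interpolated into a string directly, which is what appears to produce "property_id1" in the second query above.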

Ambiguous "id" column when using joins

When you bring joins into the picture when defining :conditions for models to be added to the sitemap, the primary_column is still assumed to be id. Therefore, when splitting result sets, you're left with a query that looks for models with an (ambiguous) id > 123. Invalid SQL, as you'd expect.

One would assume this could be fixed by passing a custom primary_column of, for example, posts.id, but this causes another issue when the gem attempts to call the primary_column method on the objects in question (for example, this would call posts.id upon a Post).
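
A possible direction, sketched under assumptions: keep two forms of the column, a table-qualified one for SQL and a bare attribute name for calling on the record. split_primary_column and SketchPost are hypothetical names and not part of the gem.

```ruby
# Hypothetical helper, not part of big_sitemap.
# Accepts "posts.id" or "id" and returns the table-qualified form for SQL
# plus the bare attribute name for calling on a model instance.
def split_primary_column(primary_column, default_table)
  table, column = primary_column.to_s.split('.', 2)
  # No table prefix given: fall back to the model's own table
  table, column = default_table, table if column.nil?
  { :sql => "#{table}.#{column}", :attribute => column.to_sym }
end

# Stand-in for a model instance
SketchPost = Struct.new(:id)

parts = split_primary_column('posts.id', 'posts')
post  = SketchPost.new(123)

puts "#{parts[:sql]} > #{post.public_send(parts[:attribute])}"
```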

ActiveSupport::Inflector ignored

It seems that you are using Extlib for inflections. The problem is that inflections defined in my Rails app are ignored, and this affects table names.

Is there any way to make Extlib not override ActiveSupport inflections?

wrong number of arguments (1 for 0)

Hi, when I try to generate a sitemap with several models, I get the above error. What I did in my Rake task was simply:
sitemap.add Static
sitemap.add Category
sitemap.add Product
sitemap.add Company

Am I wrong on the syntax? Missing something?

dev mode sucks!

Just a note to anyone else having my problem... I thought this seemed really slow, I was only getting about 5k rows per minute.

The auto-reloading of dev mode sucks - run it in production mode (duh) and now I'm getting 5k rows/2 seconds.

Receiving error when running rake task

I followed the steps as outlined in the README, and it looks like the sitemap is building correctly, but I get the following error:

Don't know how to build task 'sitemap'

uninitialized constant Class

I am getting this error
uninitialized constant Class::Tweet

This is my rake task
require 'big_sitemap'

include Rails.application.routes.url_helpers # Allows access to Rails routes

BigSitemap.generate(:url_options => {:host => 'mysite'}, :document_root => "#{Rails.root}/public") do
  # Add some URLs with additional options
  Tweet.find(:all).each do |tweet|
    add tweet_path(tweet), :change_frequency => 'daily', :priority => 0.5
  end
end

undefined local variable or method `get_last_id' for BigSitemap:Class

In the following code:

BigSitemap.generate(:url_options => {:host => DEFAULT_HOST}, :document_root => "#{Rails.root}/public", :partial_update => true, :gzip => false) do
      Job.find_in_batches(:conditions => "id > #{get_last_id}").each do |job|
        add jobs_path(job), :id => job.id
      end
   end

protocol is not having the desired effect

url_options: { host: "example.com", protocol: 'https' }

I wanted the urls generated to have:

<loc>https://example.com/...</loc>

However, it still shows http.

Bug?
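
For comparison, this is the behaviour the report expects, written as a standalone sketch. base_url_from is a hypothetical helper; whether and where big_sitemap reads the :protocol key is not confirmed by this report.

```ruby
# Hypothetical helper illustrating the expected behaviour; not the gem's code.
# Builds a base URL from a url_options hash, defaulting to http.
def base_url_from(url_options)
  # Accept both 'https' and 'https://'
  protocol = (url_options[:protocol] || 'http').to_s.sub(%r{://\z}, '')
  host = url_options.fetch(:host)
  port = url_options[:port]

  url = "#{protocol}://#{host}"
  url += ":#{port}" if port
  url
end

puts base_url_from({ host: "example.com", protocol: 'https' })
```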

find_options should also exclude :change_frequency and :priority

in big_sitemap.rb, line 102:
find_options = options.except(:path)
should also exclude other options:
find_options = options.except(:path, :change_frequency, :priority)

If not, ActiveRecord's find method will throw an exception in 'assert_valid_keys':

    /usr/lib/ruby/gems/1.8/gems/activesupport-2.3.2/lib/active_support/core_ext/hash/keys.rb:47:in `assert_valid_keys'
    /usr/lib/ruby/gems/1.8/gems/activerecord-2.3.2/lib/active_record/base.rb:2407:in `validate_find_options'
    /usr/lib/ruby/gems/1.8/gems/activerecord-2.3.2/lib/active_record/base.rb:609:in `find'
    /usr/lib/ruby/gems/1.8/gems/activerecord-2.3.2/lib/active_record/base.rb:635:in `all'
    /usr/lib/ruby/gems/1.8/gems/alexrabarts-big_sitemap-0.3.3/lib/big_sitemap.rb:114:in `send' ...
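
The proposed fix amounts to separating sitemap-only keys from the options handed to find. A plain-Ruby sketch of that idea follows; SITEMAP_ONLY_KEYS and find_options_from are hypothetical names, and Hash#reject is used so the example runs without ActiveSupport.

```ruby
# Keys that describe the sitemap entry rather than the database query.
# This list mirrors the suggestion above and may not be exhaustive.
SITEMAP_ONLY_KEYS = [:path, :change_frequency, :priority].freeze

# Returns only the options that are safe to pass on to the finder.
def find_options_from(options)
  options.reject { |key, _value| SITEMAP_ONLY_KEYS.include?(key) }
end

options = { :path => 'users', :change_frequency => 'daily', :priority => 0.5, :conditions => 'deleted=0' }
p find_options_from(options)
```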
