sitemap.js's Issues

'changefreq' and 'priority' should be optional

According to http://www.sitemaps.org/protocol.html#xmlTagDefinitions:

These tags are optional:

  • changefreq
  • priority

Currently, it seems to be impossible to omit them from the generated XML.

Even if I can understand the '0.5' default value for priority, I see no point in adding it: it only increases the size of the file.

I could not find a 'weekly' default value for changefreq in the specification, but the same applies to this tag.
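
One way to make both tags optional is to emit an element only when the caller supplied a value. The following is a minimal sketch of that idea; `buildUrlEntry` is a hypothetical helper, not part of sitemap.js, and XML escaping is left out for brevity:

```javascript
// Hypothetical helper (not the library's API): builds one <url> entry and
// skips every optional tag whose value was not supplied.
function buildUrlEntry(item) {
  var parts = ['<loc>' + item.url + '</loc>'];
  ['lastmod', 'changefreq', 'priority'].forEach(function (tag) {
    // Compare against undefined/null so that a legitimate 0 is still emitted.
    if (item[tag] !== undefined && item[tag] !== null) {
      parts.push('<' + tag + '>' + item[tag] + '</' + tag + '>');
    }
  });
  return '<url>' + parts.join('') + '</url>';
}

console.log(buildUrlEntry({ url: 'http://example.com/' }));
console.log(buildUrlEntry({ url: 'http://example.com/a', changefreq: 'daily' }));
```

With no defaults applied, an entry that only has a URL produces only a `<loc>` element.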

Sitemap Styling

An option to add a style sheet would be greatly appreciated!

Async-style API ignores error thrown by `.toString()`

In lib/sitemap.js you have this:

Sitemap.prototype.toXML = function (callback) {
  if (typeof callback === 'undefined') {
    return this.toString();
  }
  var self = this;
  process.nextTick( function () {
    if (callback.length === 1) {
      callback( self.toString() );
    } else {
      callback( null, self.toString() );
    }
  });
}

However, Sitemap.prototype.toString() may throw an error, since it calls new SitemapItem(), which may also throw (when there is no URL, no protocol, etc.). The async-style API never passes the error to the callback, so the error becomes an uncaught exception, which kills the Node process. How about changing .toXML() to:

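Sitemap.prototype.toXML = function (callback) {
  if (typeof callback === 'undefined') {
    return this.toString();
  }
  var self = this;
  process.nextTick( function () {
    var xml;
    try {
      xml = self.toString();
    } catch (err) {
      return callback(err);
    }
    // Call the callback outside the try block so that an exception thrown
    // by the callback itself does not trigger a second call with that error.
    return callback(null, xml);
  });
}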

I know this will break backward compatibility but I think this is the proper way to do it. If agreed, I can work on this.

chunk sitemap failed with large number of urls

With 1,300,000 URLs I get a segmentation fault.

It fails in the utils.chunckArray() function.

I rewrote this function using the underscore module and it works, but that adds a new dependency.
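
Chunking can also be done with a plain loop and Array.prototype.slice, which needs no extra dependency and no recursion. This is only a sketch of the technique; how the library's own function is implemented is not shown here, and the name `chunkArray` and the chunk size of 50,000 are assumptions for illustration:

```javascript
// Sketch: split an array into consecutive chunks of at most chunkSize items.
// Iterative, so the call stack stays flat regardless of the array length.
function chunkArray(arr, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError('chunkSize must be a positive number');
  }
  var chunks = [];
  for (var i = 0; i < arr.length; i += chunkSize) {
    chunks.push(arr.slice(i, i + chunkSize));
  }
  return chunks;
}

// Build 1,300,000 placeholder URLs and split them into chunks of 50,000.
var urls = [];
for (var n = 0; n < 1300000; n++) {
  urls.push('/page/' + n);
}
console.log(chunkArray(urls, 50000).length); // prints 26
```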

2 hostnames

Is there any chance to specify http and https together?

Thanks, G.

Caching blocks eventloop from exiting gracefully

First of all, nice module! I stumbled upon one little problem though,

Enabling cache on this module starts an interval timer. This makes my program unable to exit. This is not a problem in production since it shouldn't exit there but it blocks my unit tests from finishing gracefully.

https://github.com/ekalinin/sitemap.js/blob/master/lib/sitemap.js#L150-L152

The current mechanism is also flawed: if a sitemap is generated right before the interval fires, the cache is cleared immediately and the sitemap is not cached for the desired time.

I think it would be better to introduce a caching mechanism where, on generating a sitemap, the current time is stored with the cache, and on retrieval this time is checked against the desired caching time. This would eliminate the need for an interval timer.
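
The timestamp-based approach described above could look roughly like this. It is a sketch under assumptions, not the module's code: `TimedCache` and its injectable `now` function are hypothetical names introduced here so that the behaviour can be tested without waiting.

```javascript
// Sketch of a timer-free cache: the time of generation is stored with the
// value and compared against cacheTime on every retrieval.
function TimedCache(cacheTime, now) {
  this.cacheTime = cacheTime;      // lifetime in milliseconds
  this.now = now || Date.now;      // clock source, replaceable in tests
  this.hasValue = false;
  this.value = null;
  this.storedAt = 0;
}

// Store a freshly generated sitemap together with the current time.
TimedCache.prototype.set = function (value) {
  this.value = value;
  this.storedAt = this.now();
  this.hasValue = true;
};

// Return the cached value, or null once it is older than cacheTime.
// Reading does not touch storedAt, so the expiration is not sliding.
TimedCache.prototype.get = function () {
  if (!this.hasValue) {
    return null;
  }
  if (this.now() - this.storedAt >= this.cacheTime) {
    this.hasValue = false;
    this.value = null;
    return null;
  }
  return this.value;
};

var cache = new TimedCache(600000);
cache.set('<urlset></urlset>');
console.log(cache.get() !== null); // prints true
```

Because no setInterval is involved, nothing keeps the event loop alive after the program's other work is done.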

I resolved the issue for me by disabling cache and rolling my own. (I will stick with this even if you fix this issue since it's a better separation of concerns anyway)

How to test the package?

When I install it, I see no "tests" folder. Did you forget to add it to npm, or am I missing some extra command?

Thanks!

Sitemap can't be used with browserify

If sitemap.js is pulled into a browserify build, the javascript has frontend errors, because the use of fs.readFileSync to get the version number is done in a way that brfs can't understand:

/**
 * Framework version.
 */
var fs = require('fs')
  , path = require('path')
  , pack_file = path.join(__dirname, 'package.json');

if ( !module.exports.version ) {
  module.exports.version = JSON.parse(
    fs.readFileSync(pack_file, 'utf8')).version;
}

If this was changed to be something like:

var fs = require('fs');

if ( !module.exports.version ) {
  module.exports.version = JSON.parse(
    fs.readFileSync(__dirname + "/package.json", 'utf8')).version;
}

It should work with brfs

Generate sitemap from app or router

Nice module!
I was wondering if there was a way to pass in an express object, or express router object to sitemap.js and have it spit out the sitemap.xml file.

Doesn't normalize urls

So:

hostname: 'http://website.com' with url: 'page.html' => http://website.compage.html
hostname: 'http://website.com/' with url: '/page.html' => http://website.com//page.html

I'd expect this module to remove and add slashes where necessary.
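
A minimal sketch of the expected normalization, assuming the hostname and the path should always be separated by exactly one slash. `joinUrl` is a hypothetical helper written for this illustration, not a function of the module:

```javascript
// Sketch: join hostname and path with exactly one slash between them.
function joinUrl(hostname, url) {
  var base = hostname.replace(/\/+$/, '');  // drop trailing slashes
  var path = url.replace(/^\/+/, '');       // drop leading slashes
  return base + '/' + path;
}

console.log(joinUrl('http://website.com', 'page.html'));   // prints http://website.com/page.html
console.log(joinUrl('http://website.com/', '/page.html')); // prints http://website.com/page.html
```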

What is the point of cache?

Sure, when adding pages while the app is running, it makes total sense.

But how do I replace URLs, since they are passed as an array at creation time?

Missing underscore in package.json dependencies

module.js:338
throw err;
^
Error: Cannot find module 'underscore'
at Function.Module._resolveFilename (module.js:336:15)
at Function.Module._load (module.js:278:25)
at Module.require (module.js:365:17)
at require (module.js:384:17)
at Object.<anonymous> (/Users/joaoribeiro/Documents/Projects/cloudtasks.io/node_modules/sitemap/lib/utils.js:7:9)
at Module._compile (module.js:460:26)
at Object.Module._extensions..js (module.js:478:10)
at Module.load (module.js:355:32)
at Function.Module._load (module.js:310:12)
at Module.require (module.js:365:17)
at require (module.js:384:17)

Dynamic sitemap index

I want to generate sitemap index that gets updated every month with new sitemaps. I tried the following:

var opts = {
    cacheTime: 600000,
    hostname: 'https://xxx.com/sitemaps',
    sitemapName: 'sitemap',
    sitemapSize: 1,
    targetFolder: path.join(__dirname, '../public/sitemaps')
};
var arr = [];

for (var x in res) {
    console.log(res[x].url);
    arr.push(res[x].url);
}

opts['urls'] = arr;
var sitemapIndex = sm.createSitemapIndex(opts);

But this generates the following files:

sitemap-index.xml
sitemap-0.xml
sitemap-1.xml
sitemap-2.xml

where the sitemap-index file contains:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<sitemap>
    <loc>https://xxx/sitemaps/sitemap-0.xml</loc>
</sitemap>
<sitemap>
    <loc>https://xxx/sitemaps/sitemap-1.xml</loc>
</sitemap>
<sitemap>
    <loc>https://xxx/sitemaps/sitemap-2.xml</loc>
</sitemap>
</sitemapindex>

And each contains:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url> <loc>https://actual_site_url</loc> </url>
</urlset>

I actually want the sitemap-index to be:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<sitemap>
    <loc>https://path_to_xml1_specified_by_me.xml</loc>
</sitemap>
<sitemap>
    <loc>https://path_to_xml2_specified_by_me.xml</loc>
</sitemap>
<sitemap>
    <loc>https://path_to_xml3_specified_by_me.xml</loc>
</sitemap>
</sitemapindex>

How can I give an array of paths to sitemap XML files when calling .createSitemapIndex, instead of having it create sitemap-1, sitemap-2 on its own?
I am using S3 to store the XML files, so I want to put the paths to the files on my Amazon S3 server inside the sitemap index.

wrong urls added to sitemap

var sm = require('sitemap')

var staticUrls = ['/', '/terms', '/login']


var sitemap = sm.createSitemap({
    hostname: process.env.HOST_URL || 'http://babeleo.com',
    urls: staticUrls
})

sitemap.add({
    url: '/details/' + 'url1'
})

console.log('sitemap1 urls', sitemap.urls)

var sitemap2 = sm.createSitemap({
    hostname: process.env.HOST_URL || 'http://babeleo.com',
    urls: staticUrls
})

console.log('sitemap2 urls', sitemap2.urls)

And the output is:

sitemap1 urls [ '/', '/terms', '/login', { url: '/details/url1' } ]
sitemap2 urls [ '/', '/terms', '/login', { url: '/details/url1' } ]

Expected output:

sitemap1 urls [ '/', '/terms', '/login', { url: '/details/url1' } ]
sitemap2 urls [ '/', '/terms', '/login' ]
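
The output is consistent with both sitemaps holding a reference to the same staticUrls array, so that add() on one also changes the other. That is an assumption based on the output alone, not something confirmed from the library source. Under that assumption, copying the array on construction would give the expected result. `SketchSitemap` below is a hypothetical stand-in, not the module's class:

```javascript
// Sketch: take a shallow copy of the caller's array so that add() never
// mutates an array that is shared with other sitemaps.
function SketchSitemap(urls) {
  if (Array.isArray(urls)) {
    this.urls = urls.slice();
  } else {
    this.urls = urls ? [urls] : [];
  }
}

SketchSitemap.prototype.add = function (url) {
  this.urls.push(url);
};

var staticUrls = ['/', '/terms', '/login'];

var first = new SketchSitemap(staticUrls);
first.add({ url: '/details/url1' });

var second = new SketchSitemap(staticUrls);
console.log(first.urls.length);  // prints 4
console.log(second.urls.length); // prints 3
```

Until the module copies its input, a caller-side workaround along the same lines would be to pass `staticUrls.slice()` instead of `staticUrls`.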

Can't test the package

According to readme...

make env 
make: *** No rule to make target 'env'.  Stop.

Please don't close the issue this time. Thanks!

support for news site map

Hi,
I need to create a news sitemap for our company blog.
It needs to support the following:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://www.website.com/news/article/[ARTICLE_SLUG]</loc>
    <news:news>
      <news:publication>
        <news:name>[Company Name]</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:genres>Blog</news:genres>
      <news:publication_date>[PUBLISH_DATE]</news:publication_date>
      <news:title>[ARTICLE_TITLE]</news:title>
      <news:keywords>[META_KEYWORDS]</news:keywords>
    </news:news>
  </url>
</urlset>

Does this package support this?

Error while deploying to GCLOUD

I get this error:

ERROR: (gcloud.preview.app.deploy) Error Response: [400] Invalid character in filename: node_modules/sitemap/env/lib/python2.7/site-packages/setuptools/script (dev).tmpl

lastmod field returns NaN-NaN-NaN

{ url: '', lastmod: moment().toISOString(), changefreq: 'daily', priority: 0.3 },
{ url: '/culture', lastmod: moment().toISOString(), changefreq: 'monthly', priority: 0.9 },
{ url: '/culture/putting-dent-into-the-universe', lastmod: moment().toISOString(), changefreq: 'monthly', priority: 0.89 },
{ url: '/culture/workmanship-is-the-key', lastmod: moment().toISOString(), changefreq: 'monthly', priority: 0.88 }

We always get NaN-NaN-NaN for the lastmod field in the generated sitemap.xml file.

Any idea?

Image loc and caption problems

Hi, whenever I add an image with its caption I always end up with a wrong URL such as:

<url>
  <loc> https://URL </loc>
  <image:image>
    <image:loc>https://URL/[object Object]</image:loc>
  </image:image>
</url>

The image:loc is wrong and the caption doesn't seem to appear.
The problem seems to be at line 383 in your sitemap.js file.

an error in README

section: https://github.com/ekalinin/sitemap.js#example-of-sitemap-index-as-string

current:

var sm = require('sitemap')
  , smi = new sm.buildSitemapIndex({
      urls: ['https://example.com/sitemap1.xml', 'https://example.com/sitemap2.xml'],
      xslUrl: 'https://example.com/style.xsl' // optional
    });

should be:

var sm = require('sitemap')
  , smi = sm.buildSitemapIndex({
      urls: ['https://example.com/sitemap1.xml', 'https://example.com/sitemap2.xml'],
      xslUrl: 'https://example.com/style.xsl' // optional
    });

because sm.buildSitemapIndex is a static method.

SitemapIndex can only be used if you want to write it as a file

Surely createSitemapIndex() should work like createSitemap()? Instead I find that I'm having to write a file and then read it! Thus I'm not using this module to generate my sitemap index any more.

If I get a chance, would you mind if I did a PR to implement functionality so that I can use createSitemapIndex(conf).toString() ?

Deprecated/wrong usage example

The following code from the example will not work:

sitemap.toXML( function(xml){ console.log(xml) });

You have to change the example to handle the error:

sitemap.toXML( function(err, xml){ if (!err){console.log(xml)} });

Please add AMP support

As indicated in this link: https://developers.google.com/search/docs/guides/create-URLs
We have to add entries for both canonical and AMP URLs.
It's a little similar to androidLink support.

Just make those changes to sitemap.js:

...
+ this.ampLink = conf['ampLink'] || null;
...
-  var xml = '<url> {loc} {img} {lastmod} {changefreq} {priority} {links} {androidLink} {mobile} {news}</url>'
-    , props = ['loc', 'img', 'lastmod', 'changefreq', 'priority', 'links', 'androidLink', 'mobile', 'news']
+  var xml = '<url> {loc} {img} {lastmod} {changefreq} {priority} {links} {androidLink} {ampLink}  {mobile} {news}</url>'
+    , props = ['loc', 'img', 'lastmod', 'changefreq', 'priority', 'links', 'androidLink', 'ampLink', 'mobile', 'news']
...
+    } else if (this[p] && p == 'ampLink') {
+      xml = xml.replace('{' + p + '}', '<xhtml:link rel="amphtml" href="' + this[p] + '" />');
...

Thank you.

Unit Test fails due to timezone offset

The tests currently fail on my machine, probably due to the timezone offset (I'm currently in New Zealand, UTC+12), and the test fails because

(calculated by sitemap.js with input lastmod 2011-06-27 and timezone offset) 2011-06-28 ≠ 2011-06-27 (set in comparison string)

Seems like we should have a look at the date calculation.
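
One way to make the result independent of the machine's timezone is to format the date with the UTC getters only. How sitemap.js currently computes the value is not shown here, so this is just a sketch of the technique; `formatDateUTC` is a hypothetical helper, and it assumes a date-only lastmod is meant as a UTC calendar date:

```javascript
// Left-pad a one-digit number with a zero.
function pad(n) {
  return n < 10 ? '0' + n : String(n);
}

// Sketch: format a Date as YYYY-MM-DD using UTC components only, so the
// result does not depend on the local timezone of the machine.
function formatDateUTC(date) {
  return date.getUTCFullYear() + '-' +
    pad(date.getUTCMonth() + 1) + '-' +
    pad(date.getUTCDate());
}

console.log(formatDateUTC(new Date('2011-06-27T00:00:00Z'))); // prints 2011-06-27
```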

cache is sliding and will never invalidate in certain scenarios

Given a cacheDuration of say 5000, if requests come in every 4 seconds then the cache will never expire and therefore never allow new dynamic items to be added when using isCacheValid.

This line effectively resets the cacheSetTimestamp anytime it's called. Therefore creating a sliding expiration.

Maybe I'm using the cache in an unintended way but it seems to me a typical use case to build a sitemap, and rebuild it after a set amount of time to allow new dynamic items to appear.

Since the cache behavior is not documented I'm not sure if this could be considered a bug or not.

XML not valid

I tried to generate a sitemap and to validate it with:
https://validator.w3.org/

And it does not validate.

This is not necessarily a problem, but it would probably be better to fix it.

(Thank you for this library).

Large Sitemaps?

Does this module support multiple sitemaps for particularly large sites?

E.g. chunking based on the sitemap max size?

No appropriate support for urls with hashbang #!

I have an AngularJs powered website which uses hashbang (#!) in the URLs.
I tried setting my hostname and map url to every possible way but couldn't make the hashbang url work.

For example my url is: "www.example.com/#!/home". No matter what I configure I always get "www.example.com#!/home". Note the missing slash after dot com.

It might be a bug in the "url-join" component. Please investigate.

Lastmod for sitemap index entries?

Is it possible to add a lastmod property for entries in a sitemap index?

I currently have code like this to create the sitemap index XML string:

// create the xml to be used for the sitemap-index by giving it an array of urls
const sitemapIndexXML = pd.xml(sitemap.buildSitemapIndex({
  urls: fileNameArray.map((fileName) => `${baseUrl}/${fileName}`)
}));

Is there any way to get the generated sitemap index to have a <lastmod> element, similar to how the module allows you to pass a lastmodISO property when creating a sitemap?

Ideally it would produce XML like this:

<sitemap>
  <loc>http://example.com/sitemap-us.xml</loc>
  <lastmod>2016-11-22T18:06:38.207Z</lastmod>
</sitemap>
<sitemap>
  <loc>http://example.com/sitemap-ca.xml</loc>
  <lastmod>2016-11-22T18:06:38.207Z</lastmod>
</sitemap>
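
Until the module supports this, the index can be assembled by hand. The sketch below is not part of sitemap.js: `buildIndexWithLastmod` and the `{ url, lastmod }` entry shape are assumptions made for illustration. The sitemaps.org protocol does allow an optional <lastmod> inside each <sitemap> element.

```javascript
// Escape the five XML special characters in text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Sketch: build a sitemap index string from entries of the assumed form
// { url: string, lastmod?: string }. lastmod is emitted only when given.
function buildIndexWithLastmod(entries) {
  var lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ];
  entries.forEach(function (entry) {
    lines.push('<sitemap>');
    lines.push('  <loc>' + escapeXml(entry.url) + '</loc>');
    if (entry.lastmod) {
      lines.push('  <lastmod>' + escapeXml(entry.lastmod) + '</lastmod>');
    }
    lines.push('</sitemap>');
  });
  lines.push('</sitemapindex>');
  return lines.join('\n');
}

console.log(buildIndexWithLastmod([
  { url: 'http://example.com/sitemap-us.xml', lastmod: '2016-11-22T18:06:38.207Z' },
  { url: 'http://example.com/sitemap-ca.xml', lastmod: '2016-11-22T18:06:38.207Z' }
]));
```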

Priority 0 gets removed from the sitemap

Priority 0 gets removed from the sitemap. Should it not show <priority>0.0</priority> instead?

Also, setting priority to 1 shows as 1 in the sitemap XML. Would it be wise to always print one decimal place in the XML, e.g. so that 1 always becomes <priority>1.0</priority>?
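
A value of 0 typically disappears when the code tests the priority for truthiness, although whether that is what happens in the module is an assumption. A sketch that keeps 0 and always prints one decimal place could look like this; `formatPriority` is a hypothetical helper, not the module's API:

```javascript
// Sketch: render the <priority> element, treating 0 as a valid value.
// Only undefined and null mean "no priority given".
function formatPriority(priority) {
  if (priority === undefined || priority === null) {
    return '';
  }
  var value = Number(priority);
  if (isNaN(value) || value < 0 || value > 1) {
    throw new RangeError('priority must be a number between 0.0 and 1.0');
  }
  // toFixed(1) always prints one decimal place; note that it also rounds
  // finer values, e.g. 0.89 becomes 0.9.
  return '<priority>' + value.toFixed(1) + '</priority>';
}

console.log(formatPriority(0)); // prints <priority>0.0</priority>
console.log(formatPriority(1)); // prints <priority>1.0</priority>
```

If priorities with two decimals need to be preserved, a fixed single decimal is the wrong choice and the value would have to be printed as given.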
