thu0ng91 / sitemapgen4j
Automatically exported from code.google.com/p/sitemapgen4j
Sample showing how to generate sitemap XML that can, and should, be subject to
namespace validation:
//important imports
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.Namespace;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;
//code sample
Document document = new Document();
Namespace xmlns =
Namespace.getNamespace("http://www.sitemaps.org/schemas/sitemap/0.9");
Namespace xsi = Namespace.getNamespace("xsi",
"http://www.w3.org/2001/XMLSchema-instance");
Element rootElement = new Element("urlset", xmlns);
rootElement.addNamespaceDeclaration(xsi);
Attribute schemaLocation = new Attribute("schemaLocation",
"http://www.sitemaps.org/schemas/sitemap/0.9" +
" http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd", xsi);
rootElement.setAttribute(schemaLocation);
document.setRootElement(rootElement);
//begin url harvesting
Element url = new Element("url", xmlns);
Element loc = new Element("loc", xmlns);
Element lastmod = new Element("lastmod", xmlns);
Element changefreq = new Element("changefreq", xmlns);
Element priority = new Element("priority", xmlns);
loc.addContent("");
lastmod.addContent("");
changefreq.addContent("");
priority.addContent("");
url.addContent(loc);
url.addContent(lastmod);
url.addContent(changefreq);
url.addContent(priority);
rootElement.addContent(url);
//end loop
createSitemapFile(document);
Original issue reported on code.google.com by [email protected]
on 16 Mar 2010 at 2:17
What steps will reproduce the problem?
1. Use builder
2. Set GZIP=true
3. Set autoValidate=true
What is the expected output? What do you see instead?
Expected the output to validate despite being gzipped.
Actually, it fails trying to read GZIP as XML.
What version of the product are you using? On what operating system?
1.0.1 under Ubuntu.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 1 Mar 2012 at 1:43
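The gzip report above comes down to reading compressed bytes as if they were XML. A minimal sketch of one way around that, using only the JDK: peek at the first two bytes and wrap the stream in a GZIPInputStream when they match the gzip magic number. The class and method names here (GzipAwareSitemapReader, maybeGunzip, rootElementName) are hypothetical and not part of sitemapgen4j, and the parse only checks well-formedness rather than validating against the sitemap schema.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of sitemapgen4j.
final class GzipAwareSitemapReader {

    /** Returns the stream unchanged, or wrapped in a GZIPInputStream if it starts with 0x1f 0x8b. */
    static InputStream maybeGunzip(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 2);
        byte[] magic = new byte[2];
        int n = 0;
        while (n < 2) {
            int r = in.read(magic, n, 2 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            in.unread(magic, 0, n); // put the peeked bytes back for the real reader
        }
        boolean gzipped = n == 2 && (magic[0] & 0xff) == 0x1f && (magic[1] & 0xff) == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }

    /** Parses a plain or gzipped sitemap and returns the local name of its root element. */
    static String rootElementName(byte[] sitemapBytes) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try (InputStream in = maybeGunzip(new ByteArrayInputStream(sitemapBytes))) {
            return factory.newDocumentBuilder().parse(in).getDocumentElement().getLocalName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Sitemap is not well-formed XML", e);
        }
    }

    /** Helper for trying the sketch out: gzips a string as UTF-8. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Sniffing the magic bytes rather than trusting a ".gz" file extension means the same code path handles both gzipped and plain output.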
What steps will reproduce the problem?
wsg.addUrl("http://www.facebook.com/[some FB user]");
What is the expected output? What do you see instead?
Stack trace:
java.lang.RuntimeException: Url http://www.facebook.com/[some FB user] doesn't start with base URL http://www.[some host].com
at com.redfin.sitemapgenerator.UrlUtils.checkUrl(UrlUtils.java:10)
at com.redfin.sitemapgenerator.SitemapGenerator.addUrl(SitemapGenerator.java:59)
at com.redfin.sitemapgenerator.SitemapGenerator.addUrl(SitemapGenerator.java:119)
What version of the product are you using? On what operating system?
sitemapgen4j-1.0.1.jar
OS: Windows 7
Please provide any additional information below.
I've never heard of a restriction where you are not allowed to include external
links into your sitemap. In fact having external links is one of the criteria
for Google's rating.
Original issue reported on code.google.com by [email protected]
on 1 Aug 2012 at 8:16
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80472
describes an optional field <video:expiration_date> for a date after which
the video will no longer be available.
GoogleVideoSitemapUrl.Options should have a method
expirationDate(java.util.Date publicationDate) to set this field and
GoogleVideoSitemapGenerator should write this field to the Video-Sitemap.
What version of the product are you using?
sitemapgen4j-1.0.1
Original issue reported on code.google.com by [email protected]
on 1 Sep 2009 at 10:56
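For the expiration date request above, a rough sketch of how a java.util.Date could be rendered into the element text. The helper class VideoExpirationDate and its methods are hypothetical (they are not sitemapgen4j API), and the sketch assumes a UTC timestamp in W3C date-time form is acceptable for this field.

```java
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Date;

// Hypothetical helper, not part of sitemapgen4j.
final class VideoExpirationDate {

    /** Formats the date as a UTC timestamp with second precision, e.g. 1970-01-01T00:00:00Z. */
    static String toW3cDateTime(Date date) {
        return DateTimeFormatter.ISO_INSTANT.format(
                date.toInstant().truncatedTo(ChronoUnit.SECONDS));
    }

    /** Wraps the formatted date in the element named in the request above. */
    static String expirationDateElement(Date date) {
        return "<video:expiration_date>" + toW3cDateTime(date) + "</video:expiration_date>";
    }
}
```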
SitemapGeneratorBuilder.gzip is package-private; should be public.
Original issue reported on code.google.com by [email protected]
on 3 Feb 2009 at 9:48
Hi,
Do you plan on supporting mixed Google Image and Video sitemaps as specified in
https://support.google.com/webmasters/answer/183668?hl=en
I was hoping I could extend GoogleVideoSitemapGenerator and
GoogleVideoSitemapUrl myself to do this, but unfortunately
AbstractSitemapGeneratorOptions isn't visible.
Cheers,
Mark
Original issue reported on code.google.com by [email protected]
on 1 Aug 2014 at 10:39
Please change in:
com.redfin.sitemapgenerator.SitemapGenerator
in method:
private void writeSiteMap(OutputStreamWriter out) throws IOException
out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\" ");
TO:
out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\" " +
" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"" +
" xsi:schemaLocation=\"http://www.sitemaps.org/schemas/sitemap/0.9" +
" http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd\"");
Original issue reported on code.google.com by [email protected]
on 15 Mar 2010 at 4:27
I find that the program limits SitemapIndexGenerator.maxUrls to no more than
1000 (MAX_SITEMAPS_PER_INDEX), but as far as I know, Google does not impose this limit.
Original issue reported on code.google.com by [email protected]
on 20 Sep 2012 at 4:58
What steps will reproduce the problem?
there is no image:image tag
What is the expected output? What do you see instead?
<url>
<loc>...</loc>
<image:image>
<image:loc>...</image:loc>
<image:caption>...</image:caption>
<image:title>...</image:title>
</image:image>
<image:image>
...
</image:image>
...
<lastmod>...</lastmod>
<priority>...</priority>
</url>
What version of the product are you using? On what operating system?
1.0.1, Linux
Please provide any additional information below.
I would like to add image URLs to a URL, as described at:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=178636
Original issue reported on code.google.com by [email protected]
on 2 Jul 2012 at 9:35
1. Generate a sitemap with URLs on a different domain from the main domain.
Because we use subdomains and aliases, our URLs do not always match the main
domain.
Example:
www.mydomain.com
URL added:
www.aliasspecific.com/.....
Would it be possible to add a flag so that generation skips the checkUrl
check?
Thanks,
Jul
Original issue reported on code.google.com by [email protected]
on 21 May 2012 at 9:38
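The flag requested above could look roughly like the sketch below. This is not the library's actual UrlUtils code. BaseUrlCheck and the allowExternal parameter are assumptions made for illustration, and only the error message wording is borrowed from the stack trace in the earlier report.

```java
// Hypothetical sketch of a base-URL check that can be switched off.
final class BaseUrlCheck {

    /**
     * Rejects URLs that do not start with the base URL, unless allowExternal is true.
     */
    static void checkUrl(String url, String baseUrl, boolean allowExternal) {
        if (allowExternal) {
            return; // caller has opted out of the same-host restriction
        }
        if (!url.startsWith(baseUrl)) {
            throw new IllegalArgumentException(
                    "Url " + url + " doesn't start with base URL " + baseUrl);
        }
    }
}
```

Keeping the check on by default would preserve the current behavior for existing callers.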
Is it possible for the WebSitemapGenerator builder to have a method where you
specify a Commons VFS manager to store the sitemap on different filesystems
(local, remote, RAM, FTP, Samba server, Amazon S3, etc.)
Original issue reported on code.google.com by [email protected]
on 29 Oct 2014 at 12:12
When I use a custom URL with Arabic or Chinese characters, I get '?'.
I'm using version 1.0.1 on Windows 8.
Here is the fix:
Line 245 [SitemapGenerator.class]
if (gzip) {
FileOutputStream fileStream = new FileOutputStream(outFile);
GZIPOutputStream gzipStream = new GZIPOutputStream(fileStream);
out = new OutputStreamWriter(gzipStream, Charset.forName("UTF-8").newEncoder());
} else {
out = new OutputStreamWriter(
new FileOutputStream(outFile),
Charset.forName("UTF-8").newEncoder());
}
Original issue reported on code.google.com by [email protected]
on 17 Jul 2013 at 3:11
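The fix above works because the writer no longer falls back to the platform default charset. A self-contained sketch of the same idea, assuming Java 7 or later so that StandardCharsets.UTF_8 is available. Utf8SitemapWriter is a hypothetical name, and the sketch writes to memory instead of outFile so the round trip can be checked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of sitemapgen4j.
final class Utf8SitemapWriter {

    /** Encodes the text as UTF-8 and gzips it; closing the writer finishes the gzip stream. */
    static byte[] writeGzippedUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(
                new GZIPOutputStream(bytes), StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Gunzips the data and decodes it as UTF-8. */
    static String readGzippedUtf8(byte[] data) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```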
This may not be a bug/issue/problem.
I sought to generate a site map-with-index per the example in the documentation
Example code from documentation:
WebSitemapGenerator wsg = new WebSitemapGenerator("http://www.example.com",
myDir);
for (int i = 0; i < 60000; i++)
wsg.addUrl("http://www.example.com/doc"+i+".html");
wsg.write();
wsg.writeSitemapsWithIndex(); // generate the sitemap_index.xml
What steps will reproduce the problem?
1. Instantiate WebSitemapGenerator
2. Add URLs
3. Call wsg.write()
4. Call wsg.writeSitemapsWithIndex()
Expected Result
===================
Code generates sitemap.xml and site map index xml file.
Actual Results
=====================
Exception while exeecuting task 'Generate Site Map' (999)
java.lang.RuntimeException: No URLs added, sitemap index would be empty; you
must add some URLs with addUrls
Issue:
-I don't see a place where the api requires/allows you to 'add a url' between
the write() and the wsg.writeSitemapsWithIndex();
-I have a very small data set: one url. This does not seem relevant to the
problem.
Version
===========
Version 1.0.1
Appendix A. Code
=====================
WebSitemapGenerator wsg = new WebSitemapGenerator(BASE_URL, myDir);
for (Post post : posts)
{
ChangeFreq changeFreq = ChangeFreq.HOURLY;
WebSitemapUrl url = new WebSitemapUrl.Options(BASE_URL + ELEMENT_URL + post.getID())
.lastMod(post.getThreadActivityDate())
.priority(1.0)
.changeFreq(changeFreq).build();
wsg.addUrl(url);
}
wsg.write();
wsg.writeSitemapsWithIndex();
Original issue reported on code.google.com by [email protected]
on 13 Jul 2010 at 12:46
It's currently not possible to create a complete
Google News sitemap with this tool.
Tags like "title" or "publication" are important and missing.
Original issue reported on code.google.com by [email protected]
on 27 Jan 2010 at 2:51
Hi,
Very nice, clean project. However, I have a feature request:
- The ability to write the output to targets other than a file.
I'd like to be able to write to various implementations of output streams /
writers as opposed to being limited to files only. Specifically, I have in mind
writing to ServletOutputStreams, StringWriters, etc.
Thanks
Original issue reported on code.google.com by [email protected]
on 27 Mar 2010 at 3:11
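One possible shape for the request above is a method that takes any java.io.Writer, so the caller decides whether the bytes end up in a file, a servlet response, or a string. WriterSitemapSketch and its methods are hypothetical and only illustrate the idea. They emit a bare urlset with loc entries and nothing else.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical sketch, not part of sitemapgen4j.
final class WriterSitemapSketch {

    /** Escapes the five characters that need entity escaping in XML text. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    /** Writes a minimal sitemap to the given writer; the caller owns and closes the writer. */
    static void writeSitemap(Writer out, List<String> urls) throws IOException {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            out.write("  <url><loc>" + escapeXml(url) + "</loc></url>\n");
        }
        out.write("</urlset>\n");
        out.flush();
    }

    /** Convenience wrapper that renders the sitemap into a String via StringWriter. */
    static String toXmlString(List<String> urls) {
        StringWriter sw = new StringWriter();
        try {
            writeSitemap(sw, urls);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

A servlet could pass response.getWriter() to writeSitemap in the same way, with no temporary file involved.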
What steps will reproduce the problem?
1. add a URL containing a & in path, e.g. http://www.domain.com/user/me&you/
2. generate the sitemap
3. ampersand is not correctly encoded for XML
What is the expected output? What do you see instead?
The ampersand should be encoded for XML:
http://www.domain.tld/user/me&amp;you/
What version of the product are you using? On what operating system?
1.0.1 on win xp
Please provide any additional information below.
Both & and < are valid characters of a URL, but not in XML
see URL RFC (3.3 / 3.4): http://www.ietf.org/rfc/rfc2396.txt
and XML Spec (2.4): http://www.w3.org/TR/REC-xml/
Original issue reported on code.google.com by [email protected]
on 9 May 2009 at 4:47
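The escaping asked for above can be sketched in a few lines of plain Java. XmlEscaper is a hypothetical helper rather than the library's own code. It replaces the five characters that have predefined XML entities before the URL is written into a loc element.

```java
// Hypothetical helper, not part of sitemapgen4j.
final class XmlEscaper {

    /** Replaces &, <, >, " and ' with their predefined XML entities. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                case '"':
                    sb.append("&quot;");
                    break;
                case '\'':
                    sb.append("&apos;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a loc element with the URL escaped for XML. */
    static String locElement(String url) {
        return "<loc>" + escapeXml(url) + "</loc>";
    }
}
```

Note that input which is already escaped would be escaped a second time, so the helper should be applied exactly once to the raw URL.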
What steps will reproduce the problem?
1. Refer to WebSitemapUrl.class from a class that is not in package
com.redfin.sitemapgenerator. Our use case is a test with Mockito using code
similar to that below:
WebSitemapGenerator mock = Mockito.mock(WebSitemapGenerator.class);
...
Mockito.verify(mock, Mockito.times(1)).addUrl(Mockito.any(WebSitemapUrl.class));
2. Compile with Java 1.7.0_25
3. Get error: ISitemapUrl is not public in com.redfin.sitemapgenerator; cannot
be accessed from outside package
What is the expected output? What do you see instead?
It should compile, but instead it fails because ISitemapUrl is not declared
public
What version of the product are you using? On what operating system?
1.0.1
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 16 Jul 2013 at 5:24
This is missing from GoogleVideoSiteUrl and Options.
See https://support.google.com/webmasters/answer/80472#4
Original issue reported on code.google.com by [email protected]
on 1 Aug 2014 at 1:31
What steps will reproduce the problem?
1. Try using this on App Engine.
What is the expected output? What do you see instead?
Expected: it works. Actual: it doesn't, due to the restrictions on java.io and file system access.
What version of the product are you using? On what operating system?
1.0, Google App Engine
Please provide any additional information below.
I think it should be easy enough to add a method that instead of trying to
write to a file gives you the output as a blob or some kind of input stream so
you can then write the output to the datastore or even memory to be sent to the
crawler upon request. Thanks!
Original issue reported on code.google.com by [email protected]
on 14 Dec 2010 at 7:55