drewm / hkit
Automatically exported from code.google.com/p/hkit
Hi,
I am a systems engineer at W3C and we noticed multiple requests to our online
Tidy service [1] from sites running code based on the hKit library (as an
example see my report for the Extended Profile WordPress plugin [2]).
Looking at hkit.class.php I see it is set by default to use the online W3C tidy
service:
class hKit
{
public $tidy_mode = 'proxy'; // 'proxy', 'exec', 'php' or 'none'
public $tidy_proxy = 'http://cgi.w3.org/cgi-bin/tidy?forceXML=on&docAddr='; // required only for tidy_mode=proxy
public $tmp_dir = '/path/to/writable/dir/'; // required only for tidy_mode=exec
Could you please modify your library so that it does not rely on W3C's service
by default?
My feeling is that by default hKit should use a local tidy instance which would
give better performance and reliability.
Also FYI note that our online tidy service has moved from
http://cgi.w3.org/cgi-bin/tidy to http://services.w3.org/tidy/tidy
Thanks,
Vivien Lacourba
W3C Systems Team
[1] http://services.w3.org/tidy/tidy
[2] https://wordpress.org/support/topic/plugin-should-not-use-the-online-w3c-tidy-service?replies=1
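One way to avoid the remote proxy is to pick a local tidy mode at runtime. The following is a minimal sketch, assuming the mode names 'php' and 'none' from the comment in hkit.class.php; pickLocalTidyMode() is a hypothetical helper and not part of hKit.

```php
<?php
// Hypothetical helper, not part of hKit: choose a tidy mode that never
// calls the W3C online service.
function pickLocalTidyMode()
{
    // 'php' relies on PHP's tidy extension; 'none' skips tidying entirely.
    return function_exists('tidy_repair_string') ? 'php' : 'none';
}

// Usage, assuming hkit.class.php has been loaded (left commented out so
// the sketch runs on its own):
// $h = new hKit();
// $h->tidy_mode = pickLocalTidyMode();

echo pickLocalTidyMode(), "\n";
```

The 'exec' mode is left out here because it needs a tidy binary and a writable $tmp_dir, which cannot be detected reliably from PHP alone.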
Original issue reported on code.google.com by [email protected]
on 10 Oct 2014 at 3:46
What steps will reproduce the problem?
1. WordPress Mu 2.9.1.1 + Extended Profile
http://mu.wordpress.org
http://wordpress.org/extend/plugins/extended-profile/
2. Log into Dashboard -> Plugins -> Activate Extended Profile
3. In Dashboard -> User -> Add New - Create New User, result:
Parse error: syntax error, unexpected T_STRING, expecting T_OLD_FUNCTION or
T_FUNCTION or T_VAR or '}' in
/home/phnxps/public_html/wp-content/plugins/extended-profile/hkit.class.php on line 63
What is the expected output? What do you see instead?
It should create the new user. Instead, it shows the error above and takes
you back to /wp-admin/user-new.php
What version of the product are you using? On what operating system?
Linux host.wildwaterweb.com 2.6.9-67.0.15.EL #1 Thu May 8 10:39:19 EDT 2008
i686
PHP Version 5.2.6
Please provide any additional information below.
Commenting out line 63 resolves the error (side effects unknown):
// public $tidy_mode = 'proxy'; // 'proxy', 'exec', 'php' or 'none'
Original issue reported on code.google.com by [email protected]
on 21 Jan 2010 at 11:20
What steps will reproduce the problem?
1. parse http://www.flickr.com/people/12755805@N00/ with the hcard profile
What is the expected output? What do you see instead?
'note' should say 'Lorem ipsum dolor sit amet, consectetur adipisicing
elit, ...'
but it says 'Lorem ipsum dolor sit amet, consectetur adipisicingelit, ...'
-> look at the last two words.
What version of the product are you using? On what operating system?
0.5 (HEAD)
proxy setting
I will look into the source of the errors.
Original issue reported on code.google.com by [email protected]
on 21 Apr 2008 at 11:25
What steps will reproduce the problem?
1. Create an hcard with some property set to the value "0"
2. Process it with hKit
3. Inspect the results
What is the expected output?
The relevant array element set to a value of "0"
What do you see instead?
The relevant array element is missing
What version of the product are you using? On what operating system?
0.5 on Ubuntu 10.4
Please provide any additional information below.
This is perhaps unlikely with hcard, but I discovered it working on a profile
for an alternative microformat that includes the property 'distance'. This can
legitimately be 0.
The problem is in the function removeTextVals(). The final call to
array_filter, with no callback, removes all elements whose value casts to
boolean false, which includes the string "0" and the numeric value 0.
The attached patch uses a callback to remove only zero length strings from the
array.
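The difference can be sketched as follows. This is an illustration only, not the attached patch; removeEmptyStrings() and the sample array are hypothetical.

```php
<?php
// Hypothetical helper: drop only zero-length strings and nulls,
// keeping falsy but meaningful values such as "0" and 0.
function removeEmptyStrings(array $values)
{
    return array_filter($values, function ($v) {
        return $v !== '' && $v !== null;
    });
}

$props = array('distance' => '0', 'note' => '', 'fn' => 'Jane');

// Without a callback, array_filter drops every value that casts to false.
var_dump(array_filter($props));      // only 'fn' survives

// With the strict callback, 'distance' => '0' is kept.
var_dump(removeEmptyStrings($props)); // 'distance' and 'fn' survive
```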
Original issue reported on code.google.com by [email protected]
on 6 May 2012 at 4:52
Attachments:
What steps will reproduce the problem?
1. extend private function
2. use own extended function
3. fail
What is the expected output? What do you see instead?
Own function to be used
What version of the product are you using? On what operating system?
r21
Please provide any additional information below.
Private functions like loadURL() or tidyThis() should be changed to
protected, so that one can extend the hKit class.
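The effect of the visibility keyword can be shown with a small sketch. The class names below are hypothetical stand-ins, not hKit's code; only the method name loadURL is taken from the report.

```php
<?php
// With a private method, the parent's own calls always resolve to the
// parent's version, even when a subclass declares the same name.
class PrivateBase
{
    public function fetch($url) { return $this->loadURL($url); }
    private function loadURL($url) { return 'base:' . $url; }
}

class PrivateChild extends PrivateBase
{
    // Never reached from PrivateBase::fetch().
    private function loadURL($url) { return 'child:' . $url; }
}

// With a protected method, the subclass version overrides the parent's.
class ProtectedBase
{
    public function fetch($url) { return $this->loadURL($url); }
    protected function loadURL($url) { return 'base:' . $url; }
}

class ProtectedChild extends ProtectedBase
{
    protected function loadURL($url) { return 'child:' . $url; }
}

$a = new PrivateChild();
$b = new ProtectedChild();
echo $a->fetch('x'), "\n"; // base:x
echo $b->fetch('x'), "\n"; // child:x
```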
Original issue reported on code.google.com by [email protected]
on 29 Jan 2010 at 12:15
Attachments:
It's not always useful to follow location redirects (301 or 302) because
some pages are using redirects instead of sending a 404 code if a page
doesn't exist, so the:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
could cause problems.
For example:
http://www.dozentenscout.de/Mitglieder/details/Andrea-Kutschan
http://www.dozentenscout.de/Mitglieder/details/Andrea-Kutsch
The second URL yields an unrelated user taken from the home page, so it
would be helpful to control this behaviour, for example with an attribute
such as the following:
public $follow_location = true; // should curl follow the location redirects
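A minimal sketch of how such an attribute could be wired into the cURL call, assuming the curl extension is available. RedirectAwareFetcher and its methods are hypothetical and do not reflect hKit's actual loadURL() implementation.

```php
<?php
// Hypothetical fetcher, not hKit's code.
class RedirectAwareFetcher
{
    public $follow_location = true; // should curl follow the location redirects

    // Build the cURL options separately so they can be inspected
    // without making a network request.
    public function curlOptions()
    {
        return array(
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => (bool) $this->follow_location,
        );
    }

    // Fetch a URL and return the final HTTP status code and body.
    public function fetch($url)
    {
        $ch = curl_init($url);
        curl_setopt_array($ch, $this->curlOptions());
        $body = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        return array('status' => $status, 'body' => $body);
    }
}
```

With $follow_location set to false, a 301 or 302 is reported as the status instead of being followed, so the caller can decide whether to treat the redirect as a missing page.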
Original issue reported on code.google.com by pfefferle
on 22 Mar 2010 at 5:05
Attachments:
What steps will reproduce the problem?
1. load http://www.last.fm/user/gw with hcard profile
I get php warnings. Also, I don't think fn should be required.
The patch I provide checks if the first hierarchy of $s is an array with
non-numeric keys (that means, processNodes only found one hcard) and wraps
it into an array, to make the formatting consistent.
In the long term, processNodes should return consistent, predictable results,
i.e. the top level is always an indexed array containing all found results.
An example of the applied patch in use:
http://web84.login-65.hoststar.ch/php_profile_import/profile_import_example.php
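The wrapping check described above might look roughly like this. It is a sketch under the assumption that a single result is an associative array of properties; wrapSingleResult() is a hypothetical name and this is not the attached patch.

```php
<?php
// Hypothetical helper: if the top level has any non-numeric key, treat it
// as a single hCard and wrap it, so callers always get a list of results.
function wrapSingleResult(array $s)
{
    foreach (array_keys($s) as $key) {
        if (!is_int($key)) {
            return array($s);
        }
    }
    return $s;
}

$single = array('fn' => 'Jane', 'url' => 'http://example.com/');
$many = array(array('fn' => 'Jane'), array('fn' => 'John'));

echo count(wrapSingleResult($single)), "\n"; // 1
echo count(wrapSingleResult($many)), "\n";   // 2
```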
Original issue reported on code.google.com by [email protected]
on 21 Apr 2008 at 10:11
Attachments:
http://tools.microformatic.com/query/plain/hkit/http://bergie.iki.fi/
hKit parses the note as:
[note] => Bergie, in old data that got corrupted sometime in 2003/2004
starts biting suddenly Bergie, in old data that got corrupted sometime in
2003/2004 starts biting suddenly
Whereas the HTML on the page is:
<li class="location"><span class="note">Bergie, in <span class="geo
adr"><abbr class="latitude" title="60,1539">60° 9,234 N</abbr> <abbr
class="longitude" title="24,8797">24° 52,782 E</abbr> <span
class="locality">Helsinki</span>, <span
class="country-name">FI</span></span> old data that got corrupted sometime
in 2003/2004 starts biting suddenly</span></span></li>
Original issue reported on code.google.com by henri.bergius
on 16 Jan 2008 at 1:37
hKit doesn't work with non-utf8 encodings.
Temporary patch:
case 'php':
    // Convert to UTF-8 only when the source is not already valid UTF-8
    // and iconv is available (the original condition used ||, which would
    // call iconv even when the extension is missing).
    if (!preg_match('//u', $source) && function_exists('iconv')) {
        if (preg_match('!<meta\s.*?content="text/html;\s*charset=([^\s"\']+)!s', $source, $match)) {
            $source = iconv($match[1], 'UTF-8//IGNORE//TRANSLIT', $source);
        }
    }
    $config = array(
        'output-xml' => true,
        'quote-nbsp' => false,
        'output-encoding' => 'utf8',
    );
    return tidy_repair_string($source, $config, 'UTF8');
Original issue reported on code.google.com by [email protected]
on 7 Aug 2009 at 10:10
Having a list of urls with tags set in rel="" would be great. To be flexible,
you would not even need
to filter the tags, but just report everything that is in there and separated
with a space.
Original issue reported on code.google.com by [email protected]
on 28 Jan 2008 at 12:34
<a class="email" href="mailto:[email protected]">[email protected]</a>
<div class="tel">321-321-321</div>
<div class="tel"><span class="type">fax</span><span class="value">123-123-123</span></div>
<div class="tel"><span class="type">fax</span><span class="value">+7 (921) 123-123-123</span></div>
The markup above is parsed into only the last two phone numbers:
["tel"]=>
array(2) {
[1]=>
array(2) {
["type"]=>
string(3) "fax"
["value"]=>
string(11) "123-123-123"
}
[2]=>
array(2) {
["type"]=>
string(3) "fax"
["value"]=>
string(20) "+7 (921) 123-123-123"
}
}
Expected: the email address and all three phone numbers.
Original issue reported on code.google.com by [email protected]
on 4 Sep 2008 at 11:42
$tidy = tidy_parse_string($source);
return tidy_clean_repair($tidy);
instead of
$tidy = tidy_parse_string($source);
tidy_clean_repair($tidy);
return $tidy;
Original issue reported on code.google.com by [email protected]
on 7 Aug 2009 at 9:53
When using hkit to parse nested hCards, it appears that hkit incorrectly
treats all of the nested hCard data as belonging to the top-level hCard
("incorrectly" as defined by
http://microformats.org/wiki/hcard-parsing#nested_hCards ).
Note that I am filing this bug report based on the information that
getSatisfaction.com is using hKit for their hCard profile import.
What steps will reproduce the problem?
1. Start signing up for getSatisfaction: http://getsatisfaction.com/people/new
2. Select "other" service for hCard import.
3. Enter a Mahalo URL for a user that has friends, like
http://www.mahalo.com/member/Cfinke
What is the expected output?
Name: cfinke
Profile URL: http://www.mahalo.com/member/Cfinke
Image URL: http://mho_users.s3.amazonaws.com/cfinke/weemee.jpg
What do you see instead?
Name: Sara
Profile URL:
http://www.mahalo.com/member/cfinke,http://www.mahalo.com/member/Sara,http://www.mahalo.com/member/Sean percival,http://www.mahalo.com/member/Spinchange,http://www.mahalo.com/member/Shalunov,http://www.mahalo.com/member/Tummblr,http://www.mahalo.com/member/Leahculver,http://www.mahalo.com/member/Rcade,http://www.mahalo.com/member/Jschuur,http://www.mahalo.com/member/Jordan,,,,
Image URL:
http://mho_users.s3.amazonaws.com/cfinke/weemee.jpg,http://mho_users.s3.amazonaws.com/sara/sara_lg.png,http://mho_users.s3.amazonaws.com/sean_percival/sean_912_lg.jpg,http://mho_users.s3.amazonaws.com/tummblr/boo.jpg,http://mho_users.s3.amazonaws.com/leahculver/weemee.jpg,http://mho_users.s3.amazonaws.com/jschuur/southparkjoost.gif,http://mho_users.s3.amazonaws.com/travis/Picture_1.png,http://mho_users.s3.amazonaws.com/laurend/laurend_252_lg.jpg,http://mho_users.s3.amazonaws.com/connectedgeek/me_smaller.jpg,http://mho_users.s3.amazonaws.com/melindam/melindam_842_lg.png,http://mho_users.s3.amazonaws.com/danielle/danielle_lg.png,http://mho_users.s3.amazonaws.com/steepdecline/steepdecline_lg.jpg,http://mho_users.s3.amazonaws.com/julia/julia_lg.jpg,http://mho_users.s3.amazonaws.com/ssravp/weemee.jpg,http://mho_users.s3.amazonaws.com/tantek/icon-2007-256px.png,http://mho_users.s3.amazonaws.com/tomer/Tomer_Cohen.JPG,http://mho_users.s3.amazonaws.com/scottorama/weemee-crop.jpg,http://mho_users.s3.amazonaws.com/sebastian/sebastian__2_quadrat.jpg
What version of the product are you using? On what operating system?
Using hkit via getSatisfaction.
Please provide any additional information below.
This incorrect implementation is widespread (it appears in the Operator
Toolbar for Firefox and the Safari microformat plugin as well), but we have
verified with Tantek Celik of microformats.org that our HTML is correct,
and that the parsing is the issue.
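The scoping problem described in this report can be illustrated with a rough sketch: a property is attributed to an hCard only when that hCard is its nearest ancestor with class "vcard". This is a simplification of the hcard-parsing rules and not hKit's implementation; the function names are hypothetical, and PHP's DOM extension is assumed.

```php
<?php
// True when the element's class attribute contains the given class name.
function hasClass(DOMElement $el, $name)
{
    $classes = preg_split('/\s+/', trim($el->getAttribute('class')));
    return in_array($name, $classes, true);
}

// Walk up from $el to the closest ancestor carrying class="vcard".
function nearestHCard(DOMElement $el)
{
    for ($n = $el->parentNode; $n instanceof DOMElement; $n = $n->parentNode) {
        if (hasClass($n, 'vcard')) {
            return $n;
        }
    }
    return null;
}

// Collect values of $prop that belong to $card itself, skipping any that
// sit inside a nested hCard.
function ownProperty(DOMElement $card, $prop)
{
    $values = array();
    foreach ($card->getElementsByTagName('*') as $el) {
        $owner = nearestHCard($el);
        if (hasClass($el, $prop) && $owner !== null && $owner->isSameNode($card)) {
            $values[] = trim($el->textContent);
        }
    }
    return $values;
}

$doc = new DOMDocument();
$doc->loadXML('<div class="vcard"><span class="fn">cfinke</span>'
    . '<div class="vcard"><span class="fn">Sara</span></div></div>');
print_r(ownProperty($doc->documentElement, 'fn'));
```

In this sample markup only "cfinke" is attributed to the top-level card; "Sara" belongs to the nested one.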
Original issue reported on code.google.com by cfinke
on 11 Feb 2008 at 4:09
This is my (very hacky) hListing profile for hKit. It covers the base
properties, but there are a number of key areas to improve that would
require changes to the hKit core. Namely:
• Need support for nested microformats (for hCalendar items, hCard listers,
rel-tag)
• Need support for rel values (for permalinks, rel-tag)
• Support for the class=value pattern (this has to be hard coded for price
at the moment)
Anyway, this does function, so may be useful for some early, experimental
parsing.
Ben
Original issue reported on code.google.com by ben%[email protected]
on 15 Apr 2008 at 12:47
Attachments: