hkit's Issues

hKit Library should not rely on the online W3C Tidy service by default

Hi,

I am a systems engineer at W3C, and we noticed multiple requests to our online 
Tidy service [1] from sites running code based on the hKit library (as an 
example, see my report for the Extended Profile WordPress plugin [2]).

Looking at hkit.class.php I see it is set by default to use the online W3C tidy 
service:


class hKit
{
    public $tidy_mode   = 'proxy'; // 'proxy', 'exec', 'php' or 'none'
    public $tidy_proxy  = 'http://cgi.w3.org/cgi-bin/tidy?forceXML=on&docAddr='; // required only for tidy_mode=proxy
    public $tmp_dir     = '/path/to/writable/dir/'; // required only for tidy_mode=exec


Could you please modify your library so that it does not rely on W3C's service 
by default?

My feeling is that by default hKit should use a local tidy instance which would 
give better performance and reliability.

Also FYI note that our online tidy service has moved from 
http://cgi.w3.org/cgi-bin/tidy to http://services.w3.org/tidy/tidy

Thanks,
Vivien Lacourba
W3C Systems Team

[1] http://services.w3.org/tidy/tidy
[2] https://wordpress.org/support/topic/plugin-should-not-use-the-online-w3c-tidy-service?replies=1

Original issue reported on code.google.com by [email protected] on 10 Oct 2014 at 3:46
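
A minimal sketch of the default requested above, assuming only the four tidy_mode values quoted from hkit.class.php ('proxy', 'exec', 'php', 'none'). choose_tidy_mode() is a hypothetical helper, not part of hKit; it prefers a local tidy and never falls back to the remote proxy on its own.

```php
<?php
// Hypothetical helper (not part of hKit): pick a tidy_mode that avoids the
// remote W3C proxy whenever a local option is available.
function choose_tidy_mode($hasTidyExtension, $hasTidyBinary)
{
    if ($hasTidyExtension) {
        return 'php';   // PHP tidy extension is loaded
    }
    if ($hasTidyBinary) {
        return 'exec';  // a local tidy executable can be called
    }
    return 'none';      // no local tidy: skip tidying instead of using the proxy
}

// The second argument is an assumption here; detecting a tidy binary is
// installation specific.
$mode = choose_tidy_mode(extension_loaded('tidy'), false);
```

A caller would then assign the result to the public property before parsing, for example `$hkit->tidy_mode = $mode;`.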

Parse error: syntax error, unexpected T_STRING, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or '}' in hkit.class.php on line 63

What steps will reproduce the problem?

1. WordPress Mu 2.9.1.1 + Extended Profile 
http://mu.wordpress.org
http://wordpress.org/extend/plugins/extended-profile/
2. Log into Dashboard -> Plugins -> Activate Extended Profile
3. In Dashboard -> User -> Add New - Create New User, result:

Parse error: syntax error, unexpected T_STRING, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or '}' in /home/phnxps/public_html/wp-content/plugins/extended-profile/hkit.class.php on line 63

What is the expected output? What do you see instead?
The new user should be created and you should be taken back to /wp-admin/user-new.php.
Unfortunately, it gives you the error above.

What version of the product are you using? On what operating system?
Linux host.wildwaterweb.com 2.6.9-67.0.15.EL #1 Thu May 8 10:39:19 EDT 2008 i686
PHP Version 5.2.6

Please provide any additional information below.

Commenting out line 63 makes the error go away (but with unknown side effects):

// public $tidy_mode    = 'proxy'; // 'proxy', 'exec', 'php' or 'none'

Original issue reported on code.google.com by [email protected] on 21 Jan 2010 at 11:20
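
This kind of message (expecting T_OLD_FUNCTION or T_VAR inside a class body) is commonly seen when PHP 5 visibility keywords such as `public` are parsed by a PHP 4 interpreter, so it may be worth confirming which PHP version actually executes the plugin. The sketch below is an assumption about the cause, not a confirmed diagnosis, and supports_visibility_keywords() is a hypothetical helper.

```php
<?php
// Hypothetical check (not part of hKit): property visibility keywords such as
// "public" were introduced with PHP 5.
function supports_visibility_keywords($version)
{
    return version_compare($version, '5.0.0', '>=');
}

if (!supports_visibility_keywords(PHP_VERSION)) {
    // The check must live in a file that is loaded before hkit.class.php,
    // because a parse error aborts the whole file that contains it.
    trigger_error('hKit needs PHP 5 or later, running ' . PHP_VERSION, E_USER_WARNING);
}
```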

dropped blanks

What steps will reproduce the problem?
1. parse http://www.flickr.com/people/12755805@N00/ with the hcard profile

What is the expected output? What do you see instead?
'note' should say 'Lorem ipsum dolor sit amet, consectetur adipisicing
elit, ...'
but it says 'Lorem ipsum dolor sit amet, consectetur adipisicingelit, ...'
-> look at the last two words.

What version of the product are you using? On what operating system?
0.5 (HEAD)
proxy setting

I will look into the source of the errors. 

Original issue reported on code.google.com by [email protected] on 21 Apr 2008 at 11:25
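
One plausible cause, which the report does not confirm, is that line breaks in the source text are stripped rather than replaced by a space. A sketch of whitespace normalisation that keeps word boundaries; collapse_whitespace() is a hypothetical name, not an hKit function.

```php
<?php
// Hypothetical helper: collapse every run of whitespace (including line
// breaks) into a single space so adjacent words are never glued together.
function collapse_whitespace($text)
{
    return trim(preg_replace('/\s+/', ' ', $text));
}

$note = collapse_whitespace("Lorem ipsum dolor sit amet, consectetur adipisicing\nelit");
```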

Properties with a value of "0" are not returned

What steps will reproduce the problem?
1. Create an hcard with some property set to the value "0"
2. Process it with hKit
3. Inspect the results

What is the expected output? 
The relevant array element set to a value of "0"

What do you see instead?
The relevant array element is missing

What version of the product are you using? On what operating system?
0.5 on Ubuntu 10.4

Please provide any additional information below.

This is perhaps unlikely with hcard, but I discovered it working on a profile 
for an alternative microformat that includes the property 'distance'. This can 
legitimately be 0.

The problem is in the function removeTextVals(). The final call to 
array_filter, with no callback, removes all elements whose value casts to 
boolean false, which includes the string "0" and the numeric value 0.

The attached patch uses a callback to remove only zero length strings from the 
array.

Original issue reported on code.google.com by [email protected] on 6 May 2012 at 4:52
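
The attached patch is not reproduced on this page; the following is only a sketch of the approach the report describes, with a hypothetical function name. It passes a callback to array_filter so that only zero-length strings are dropped.

```php
<?php
// Hypothetical helper: drop only zero-length strings, keeping "0" and 0.
function remove_empty_strings(array $values)
{
    return array_filter($values, function ($value) {
        return $value !== '';
    });
}

$props = array('fn' => 'Example', 'distance' => '0', 'note' => '', 'count' => 0);

$lossy = array_filter($props);         // no callback: '0', '' and 0 are all removed
$kept  = remove_empty_strings($props); // only the empty 'note' is removed
```

array_filter preserves keys, so the surviving properties keep their names.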

Change private functions to be protected for one to be able to extend hkit class.

What steps will reproduce the problem?
1. extend private function
2. use own extended function
3. fail


What is the expected output? What do you see instead?
Own function to be used 

What version of the product are you using? On what operating system?
r21

Please provide any additional information below.
Private functions like loadURL() or tidyThis() should be changed to
protected, so one can extend hkit class.

Original issue reported on code.google.com by [email protected] on 29 Jan 2010 at 12:15
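
A small illustration of why the visibility matters. The classes below are hypothetical stand-ins, not hKit code; only the method name loadURL() is taken from the report.

```php
<?php
// Hypothetical base class standing in for hKit.
class BaseParser
{
    public function fetch($url)
    {
        // With a protected loadURL() this call is dispatched to the subclass
        // override; with a private one the base version would always run.
        return $this->loadURL($url);
    }

    protected function loadURL($url)
    {
        return 'base:' . $url;
    }
}

// Hypothetical subclass that replaces the loading step.
class CachedParser extends BaseParser
{
    protected function loadURL($url)
    {
        return 'cached:' . $url;
    }
}

$parser = new CachedParser();
$result = $parser->fetch('http://example.com/');
```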

Possibility to disable FOLLOWLOCATION (curl)

It's not always useful to follow location redirects (301 or 302) because
some pages are using redirects instead of sending a 404 code if a page
doesn't exist, so the:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

could cause problems.

For example:

http://www.dozentenscout.de/Mitglieder/details/Andrea-Kutschan
http://www.dozentenscout.de/Mitglieder/details/Andrea-Kutsch

The second URL returns a different user from the home page, so it would be
helpful to control this mechanism, for example with an attribute like the
following:

public $follow_location = true; // should curl follow the location redirects

Original issue reported on code.google.com by pfefferle on 22 Mar 2010 at 5:05
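
A sketch of how such a switch could feed into the cURL setup, assuming the cURL extension is loaded. build_curl_options() is a hypothetical helper; only CURLOPT_FOLLOWLOCATION and the idea of a $follow_location flag come from the report.

```php
<?php
// Hypothetical helper: build the cURL options with redirects made optional.
function build_curl_options($url, $followLocation = true)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => (bool) $followLocation,
    );
}

// With redirects disabled, a 301 or 302 response stays visible to the caller.
$options = build_curl_options('http://example.com/profile', false);
```

The array can be applied to a handle with `curl_setopt_array($ch, $options);`, after which `curl_getinfo($ch, CURLINFO_HTTP_CODE)` reports the redirect status instead of the target page's status.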

warnings for certain urls

What steps will reproduce the problem?
1. load http://www.last.fm/user/gw with hcard profile

I get php warnings. Also, I don't think fn should be required. 
The patch I provide checks if the first hierarchy of $s is an array with
non-numeric keys (that means, processNodes only found one hcard) and wraps
it into an array, to make the formatting consistent.

In the long term, processNodes should return consistent, predictable results,
i.e. the top level is always an indexed array containing all found results.

An example of the applied patch in use: 
http://web84.login-65.hoststar.ch/php_profile_import/profile_import_example.php

Original issue reported on code.google.com by [email protected] on 21 Apr 2008 at 10:11
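
The patch itself is not shown on this page. The wrapping step it describes could look roughly like the sketch below; wrap_single_result() is a hypothetical name, and the shape of $s is assumed from the description.

```php
<?php
// Hypothetical helper: if the top level has any non-integer key, treat it as
// a single result and wrap it so callers always receive an indexed list.
function wrap_single_result(array $s)
{
    foreach (array_keys($s) as $key) {
        if (!is_int($key)) {
            return array($s);
        }
    }
    return $s; // already an indexed list (or empty)
}

$single = wrap_single_result(array('fn' => 'Example', 'url' => 'http://example.com/'));
$many   = wrap_single_result(array(array('fn' => 'A'), array('fn' => 'B')));
```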

Duplication of hCard note element contents

http://tools.microformatic.com/query/plain/hkit/http://bergie.iki.fi/

hKit parses the note as:
[note] => Bergie, in old data that got corrupted sometime in 2003/2004
starts biting suddenly Bergie, in old data that got corrupted sometime in
2003/2004 starts biting suddenly

Whereas the HTML on the page is:
<li class="location"><span class="note">Bergie, in <span class="geo
adr"><abbr class="latitude" title="60,1539">60° 9,234 N</abbr> <abbr
class="longitude" title="24,8797">24° 52,782 E</abbr> <span
class="locality">Helsinki</span>, <span
class="country-name">FI</span></span> old data that got corrupted sometime
in 2003/2004 starts biting suddenly</span></span></li>

Original issue reported on code.google.com by henri.bergius on 16 Jan 2008 at 1:37

non-utf8 encoding

hKit doesn't work with non-utf8 encodings.

Temporary patch:

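                case 'php':
                    // Convert only when the source is not valid UTF-8 and iconv is available
                    // (the original "||" would call iconv even when it does not exist).
                    if (!preg_match('//u', $source) && function_exists('iconv')) {
                        // Read the declared charset from the <meta> tag, if there is one.
                        if (preg_match('!<meta\s.*?content="text/html;\s*charset=([^\s"\']+)!s', $source, $match)) {
                            $source = iconv($match[1], 'UTF-8//IGNORE//TRANSLIT', $source);
                        }
                    }

                    $config = array(
                        'output-xml' => true,
                        'quote-nbsp' => false,
                        'output-encoding' => 'utf8',
                    );

                    return tidy_repair_string($source, $config, 'UTF8');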

Original issue reported on code.google.com by [email protected] on 7 Aug 2009 at 10:10

Add XFN recognition

Having a list of urls with tags set in rel="" would be great. To be flexible, 
you would not even need to filter the tags, but just report everything that 
is in there and separated with a space.

Original issue reported on code.google.com by [email protected] on 28 Jan 2008 at 12:34
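
A sketch of the unfiltered reporting described above, using PHP's DOM extension rather than hKit's own parsing. extract_rel_links() and the shape of its return value are assumptions made for illustration.

```php
<?php
// Hypothetical helper: list every link that carries a rel attribute, with
// the rel value split on whitespace and otherwise left unfiltered.
function extract_rel_links($html)
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings for sloppy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $rel = trim($anchor->getAttribute('rel'));
        if ($rel === '' || !$anchor->hasAttribute('href')) {
            continue; // not a rel-annotated link
        }
        $links[] = array(
            'url' => $anchor->getAttribute('href'),
            'rel' => preg_split('/\s+/', $rel),
        );
    }
    return $links;
}

$links = extract_rel_links(
    '<p><a href="http://example.com/alice" rel="friend met">Alice</a> '
    . '<a href="http://example.com/about">About</a></p>'
);
```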

Wrong parse of phone and email with value/type

 <a class="email" href="mailto:[email protected]">[email protected]</a> 
 <div class="tel">321-321-321</div> 
 <div class="tel"><span class="type">fax</span><span class="value">123-123-123</span></div> 
 <div class="tel"><span class="type">fax</span><span class="value">+7 (921) 123-123-123</span></div> 

The lines above are parsed as only the last two phones:
  ["tel"]=>
  array(2) {
    [1]=>
    array(2) {
      ["type"]=>
      string(3) "fax"
      ["value"]=>
      string(11) "123-123-123"
    }
    [2]=>
    array(2) {
      ["type"]=>
      string(3) "fax"
      ["value"]=>
      string(20) "+7 (921) 123-123-123"
    }
  }

Expected: one email and three phones.


Original issue reported on code.google.com by [email protected] on 4 Sep 2008 at 11:42

Error in tidyThis

$tidy = tidy_parse_string($source);
return tidy_clean_repair($tidy);

should be the following, because tidy_clean_repair() returns a boolean rather than the repaired document:

$tidy = tidy_parse_string($source);
tidy_clean_repair($tidy);
return $tidy;

Original issue reported on code.google.com by [email protected] on 7 Aug 2009 at 9:53

Poor support for nested hCards

When using hkit to parse nested hCards, it appears that hkit incorrectly
treats all of the nested hCard data as belonging to the top-level hCard
("incorrectly" as defined by
http://microformats.org/wiki/hcard-parsing#nested_hCards ).

Note that I am filing this bug report based on the information that
getSatisfaction.com is using hKit for their hCard profile import.

What steps will reproduce the problem?

1. Start signing up for getSatisfaction: http://getsatisfaction.com/people/new

2. Select "other" service for hCard import.

3. Enter a Mahalo URL for a user that has friends, like
http://www.mahalo.com/member/Cfinke

What is the expected output? 

Name: cfinke
Profile URL: http://www.mahalo.com/member/Cfinke
Image URL: http://mho_users.s3.amazonaws.com/cfinke/weemee.jpg

What do you see instead?

Name: Sara
Profile URL:
http://www.mahalo.com/member/cfinke,http://www.mahalo.com/member/Sara,http://www.mahalo.com/member/Sean percival,http://www.mahalo.com/member/Spinchange,http://www.mahalo.com/member/Shalunov,http://www.mahalo.com/member/Tummblr,http://www.mahalo.com/member/Leahculver,http://www.mahalo.com/member/Rcade,http://www.mahalo.com/member/Jschuur,http://www.mahalo.com/member/Jordan,,,,
Image URL:
http://mho_users.s3.amazonaws.com/cfinke/weemee.jpg,http://mho_users.s3.amazonaws.com/sara/sara_lg.png,http://mho_users.s3.amazonaws.com/sean_percival/sean_912_lg.jpg,http://mho_users.s3.amazonaws.com/tummblr/boo.jpg,http://mho_users.s3.amazonaws.com/leahculver/weemee.jpg,http://mho_users.s3.amazonaws.com/jschuur/southparkjoost.gif,http://mho_users.s3.amazonaws.com/travis/Picture_1.png,http://mho_users.s3.amazonaws.com/laurend/laurend_252_lg.jpg,http://mho_users.s3.amazonaws.com/connectedgeek/me_smaller.jpg,http://mho_users.s3.amazonaws.com/melindam/melindam_842_lg.png,http://mho_users.s3.amazonaws.com/danielle/danielle_lg.png,http://mho_users.s3.amazonaws.com/steepdecline/steepdecline_lg.jpg,http://mho_users.s3.amazonaws.com/julia/julia_lg.jpg,http://mho_users.s3.amazonaws.com/ssravp/weemee.jpg,http://mho_users.s3.amazonaws.com/tantek/icon-2007-256px.png,http://mho_users.s3.amazonaws.com/tomer/Tomer_Cohen.JPG,http://mho_users.s3.amazonaws.com/scottorama/weemee-crop.jpg,http://mho_users.s3.amazonaws.com/sebastian/sebastian__2_quadrat.jpg

What version of the product are you using? On what operating system?

Using hkit via getSatisfaction.

Please provide any additional information below.

This incorrect implementation is widespread (it appears in the Operator
Toolbar for Firefox and the Safari microformat plugin as well), but we have
verified with Tantek Celik of microformats.org that our HTML is correct,
and that the parsing is the issue.

Original issue reported on code.google.com by cfinke on 11 Feb 2008 at 4:09
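
One way to keep nested hCards apart, sketched with PHP's DOM extension and XPath rather than hKit's own code: select only the elements with class "vcard" that have no "vcard" ancestor, and treat any vcard inside them as a separate card. find_top_level_hcards() is a hypothetical helper.

```php
<?php
// Hypothetical helper: return the vcard elements that are not nested inside
// another vcard.
function find_top_level_hcards($html)
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings for sloppy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Matches "vcard" as a whole class token, not as a substring.
    $isVcard = "contains(concat(' ', normalize-space(@class), ' '), ' vcard ')";
    $query = '//*[' . $isVcard . '][not(ancestor::*[' . $isVcard . '])]';

    $xpath = new DOMXPath($doc);
    $cards = array();
    foreach ($xpath->query($query) as $node) {
        $cards[] = $node;
    }
    return $cards;
}

$cards = find_top_level_hcards(
    '<div class="vcard" id="owner"><span class="fn">cfinke</span>'
    . '<div class="friend vcard" id="friend"><span class="fn">Sara</span></div></div>'
);
```

Properties would then be collected per card while skipping any descendant that sits inside a nested vcard.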

hListing Support

This is my (very hacky) hListing profile for hKit. It covers the base 
properties, but there are a number of key areas to improve that would require 
changes to the hKit core. Namely:

• Need support for nested microformats (for hCalendar items, hCard listers, 
  rel-tag)
• Need support for rel values (for permalinks, rel-tag)
• Support for the class=value pattern (this has to be hard coded for price 
  at the moment)

Anyway, this does function, so may be useful for some early, experimental 
parsing.

Ben

Original issue reported on code.google.com by ben%[email protected] on 15 Apr 2008 at 12:47
