brahma-dev / metafetch
NodeJS package that fetches a given URL's title, description, images, links etc.
Home Page: https://www.npmjs.org/package/metafetch
License: MIT License
Describe the bug
For some URLs, the timeout param passed into the fetch call is not respected: the call never returns and never throws an error.
Request goes into the void, never timing out.
Request should time out after the specified amount of time.
...
(might need to try a different url, or set the timeout lower, but this one reproduces the issue most of the time for me)
const metafetch = require('metafetch');

metafetch.fetch(
  'https://www.seattletimes.com/nation-world/witnesses-myanmar-air-attack-kills-13-including-7-children/',
  {
    flags: { links: false, images: false },
    // timeout is in milliseconds; the report says it is ignored for some URLs
    http: { timeout: 2500, headers: { 'Accept-Encoding': '*' } }
  }
).then(metadata => {
  console.log(metadata);
}).catch(e => {
  console.log(e);
});
Looking through the lib, it looks like the timeout param is not actually passed into the axios request.
Adding timeout: http_options.timeout,
to the axios request options object seems to correct the issue.
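Until the timeout is forwarded inside the library, a caller-side guard may help. The following is a minimal sketch, not part of metafetch: withTimeout is a hypothetical helper that races any promise against a timer, so the caller gets a rejection even if the underlying request never settles. Note that it does not cancel the request itself.

```javascript
// Hypothetical helper (not part of metafetch): rejects if `promise` has not
// settled within `ms` milliseconds. The underlying request is NOT aborted.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}
```

Assuming the fetch call from the report above, usage would look like withTimeout(metafetch.fetch(url, options), 2500).catch(e => console.log(e));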
When I request some pages with this lib, some characters come back broken (JP, CN, KO ...).
I found the cause in the restler dependency: it uses an old version of iconv-lite.
Furthermore, I think restler is no longer maintained.
So, how about updating the API client library?
I would like to set my own custom user-agent when making the request, just like Facebook or LinkedIn do.
Is this possible?
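One possible approach, sketched under an assumption: the timeout report earlier on this page passes an http.headers object in the fetch options, so a User-Agent header might be accepted the same way. Whether a given metafetch version forwards these headers to its HTTP client is not confirmed here. withUserAgent is a hypothetical helper that only builds the options object.

```javascript
// Hypothetical helper (not part of metafetch): returns a copy of `options`
// with a User-Agent header merged into options.http.headers.
// ASSUMPTION: the library forwards http.headers to its HTTP client.
function withUserAgent(options, userAgent) {
  const http = options.http || {};
  return {
    ...options,
    http: {
      ...http,
      headers: { ...(http.headers || {}), 'User-Agent': userAgent },
    },
  };
}
```

The result could then be passed as the second argument to metafetch.fetch(url, ...). The user-agent string itself is up to the caller.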
I received the error shown below.
On inspection, I found that fetch() is written with a promise but without a catch().
Can you please check this?
UnhandledPromiseRejectionWarning: TypeError [ERR_UNESCAPED_CHARACTERS]: Request path contains unescaped characters
at new ClientRequest (_http_client.js:115:13)
at Object.request (http.js:42:10)
at Request.request (/var/app/current/node_modules/superagent/lib/node/index.js:622:31)
at Request.end (/var/app/current/node_modules/superagent/lib/node/index.js:764:8)
at /var/app/current/node_modules/metafetch/index.js:203:74
at new Promise (<anonymous>)
at Object.Client.fetch (/var/app/current/node_modules/metafetch/index.js:138:9)
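As a caller-side workaround for ERR_UNESCAPED_CHARACTERS, the URL can be percent-encoded before it reaches the library. This is a minimal sketch using Node's built-in WHATWG URL class. normalizeUrl is a hypothetical helper name, and the sketch does not fix the missing catch() inside the library.

```javascript
// Hypothetical helper (not part of metafetch): percent-encodes spaces and
// non-ASCII characters in a URL using the built-in WHATWG URL parser.
// Throws a TypeError if `raw` is not a valid absolute URL.
function normalizeUrl(raw) {
  return new URL(raw).href;
}
```

Calling normalizeUrl(rawUrl) inside a try/catch and passing the result to metafetch.fetch should keep unescaped characters out of the request path, and lets invalid input be handled before any request is made.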
I am trying to retrieve data from https://artplusmarketing.com/two-questions-at-the-heart-of-every-great-brand-strategy-c237180c3aa4
but I am getting a 409 error. When I try the same URL with another package (url-metadata.js), it gives a proper response.
Please advise how to correct this issue.
Hi,
I know we are not supposed to fetch an image (https://www.google.co.in/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png) with this library, but I want to catch the exception and don't want the library to break in this case.
It breaks in index.js:
var $ = cheerio.load(body);
inside cheerio's parse.js:
var oldParent = node.parent || node.root,
TypeError: Cannot read property 'parent' of undefined
Thank you.
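One way to avoid handing a non-HTML body to cheerio is to check the response's Content-Type first. The sketch below is an illustration only: isHtmlContentType is a hypothetical helper, and where such a check would sit inside metafetch's index.js is not shown in this report.

```javascript
// Hypothetical guard (not part of metafetch): returns true only when the
// Content-Type header value describes an HTML or XHTML document.
function isHtmlContentType(contentType) {
  if (typeof contentType !== 'string') return false;
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mime = contentType.split(';')[0].trim().toLowerCase();
  return mime === 'text/html' || mime === 'application/xhtml+xml';
}
```

A caller could reject with a descriptive error when the guard returns false (for example, for image/png) instead of letting the parser throw.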
./node_modules/metafetch/index.js:165
return r();
^
ReferenceError: r is not defined
https://github.com/afzaalace/metafetch/blob/master/index.js#L142
https://github.com/afzaalace/metafetch/blob/master/index.js#L165
This only happens when the link redirects.
NOTE: This is the link that caused this error -> http://www.dianitica.com/
It redirects to -> http://dianitica.com/
When I get the meta for this page https://www.hunziker-thalwil.ch/fr/hunziker-idep.htm metafetch returns the title as:
Plate-forme d'�ducation digitale individuelle de Suisse | Hunziker AG Thalwil
The description is read as:
La plate-forme digitale individuelle Edu est la premi�re solution suisse globale pour l'enseignement num�rique compl�mentaire conforme aux dispositions de la DSG et de l'ODASG.: hunziker-idep
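The replacement characters above are what one would typically see when bytes in a single-byte encoding are decoded as UTF-8. The sketch below illustrates the effect and one possible fix, under the assumption (not verified here) that the page is served as ISO-8859-1. decodeBody is a hypothetical helper that only handles UTF-8 and Latin-1.

```javascript
// Hypothetical helper (not part of metafetch): decodes a response body using
// the charset named in the Content-Type header. Only UTF-8 and ISO-8859-1
// (latin1) are handled in this sketch; anything else falls back to UTF-8.
function decodeBody(buffer, contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)/i.exec(contentType || '');
  const charset = match ? match[1].toLowerCase() : 'utf-8';
  const isLatin1 = charset === 'iso-8859-1' || charset === 'latin1';
  return buffer.toString(isLatin1 ? 'latin1' : 'utf8');
}

// Latin-1 bytes for a word taken from the description above.
const bytes = Buffer.from('première', 'latin1');
console.log(decodeBody(bytes, 'text/html'));                     // contains U+FFFD
console.log(decodeBody(bytes, 'text/html; charset=ISO-8859-1')); // première
```

A fuller solution would also read the charset from the page's meta tags and support more encodings, for example through a current iconv-lite.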