kanasimi / wikiapi
JavaScript MediaWiki API for node.js
Home Page: https://kanasimi.github.io/wikiapi/
License: BSD 3-Clause "New" or "Revised" License
const Wikiapi= require('wikiapi');
const fetch = require('node-fetch');
const fs = require('fs');
const logins = require('./logins-ShufaBot.js');
const files = require('./data/zi-reds.js'); // ls -1 *.svg > zi-reds.js
// PURPOSE: Script uploads file 丁-red.png and similar to Commons.
// Edit login credentials
const USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
let wikicode = '';
console.log('Username:', USER);
(async () => {
// Connect
console.log('Connects');
const targetWiki = new Wikiapi;
await targetWiki.login(USER, PASS, API);
console.log('Connected!');
// upload file / media
for (let i = 0; i < files.length; i++) {
const zi = files[i].zi;
wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`;
let result = await targetWiki.upload({
file_path: `./reds/png/${zi}.png`,
comment: `Upload red stroke order for Chinese character ${zi}.`,
text: wikicode
});
}
})();
Then:
$/node ./create-Commons-upload.js
Username: ShufaBot
Connects
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 1/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 2/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 3/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 4/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Got error when retrieving [https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1]: BAD STATUS
wiki_API_query: BAD STATUS: https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 1/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 2/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 3/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 4/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Got error when retrieving [https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1]: BAD STATUS
wiki_API_query: BAD STATUS: https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
wiki_API.login: 無法 login! Abort! Response:
BAD STATUS
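Every request in the log above goes to https://commons.wikimedia.org/api.php and gets a 404. Wikimedia wikis serve the Action API under /w/api.php, so the API value in the logins file is probably missing the /w segment. A minimal sketch of a sanity check that could run before login; checkApiUrl is a hypothetical helper written for this note, not part of wikiapi:

```javascript
// Hypothetical helper (not part of wikiapi): flag API URLs that are likely
// to answer 404 because the "/w" script path is missing.
function checkApiUrl(apiUrl) {
  const { hostname, pathname } = new URL(apiUrl);
  const isWikimedia = /\.(wikimedia|wikipedia|wiktionary|wikidata)\.org$/.test(hostname);
  if (isWikimedia && pathname !== '/w/api.php') {
    return `Suspicious API path "${pathname}" on ${hostname}; expected "/w/api.php".`;
  }
  return null; // nothing obviously wrong
}

console.log(checkApiUrl('https://commons.wikimedia.org/api.php'));
console.log(checkApiUrl('https://commons.wikimedia.org/w/api.php'));
```

If the first call prints a warning for your configured URL, try the /w/api.php form in the logins file.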
Hello,
I see on the readme the following :
// upload file / media
(async () => {
const wiki = new Wikiapi;
await wiki.login('user', 'password', 'test');
let result = await wiki.upload({ file_path: '/local/file/path', comment: '', text: '' });
})();
Nice. It does the job. But what is the recommended approach to fill in what the Commons.wikimedia.org Upload Wizard's form asks for: author, source, license, categories and so on? Should I write a second section of code which edits the page after upload, or should I use text: "value", with "value" being my whole page's wikicode?
Note: you can answer me here, I will PR the README with a clarification. 😉
Please also note that you don't have tests for uploads.
Hi, following the existing README.md, I may suggest new feature as perceived to be practical cases.
Hey:
I'm trying to build a little template-to-csv tool with wikiapi. The problem in the title occurs when the script runs new Wikiapi("https://warframe.huijiwiki.com/api.php").
The environment I use is as below:
The full code I ran is as below:
const Wikiapi = require('wikiapi')
(async() => {
const wiki = new Wikiapi('https://warframe.huijiwiki.com/api.php')
let pageData = await wiki.page('吞天沙暴', {})
console.log(pageData)
})()
the error in the console:
D:\git repo\warframe-items-for-huiji\node_modules\cejs\application\net\wiki.js:201
API_URL.includes('://')) {
^
at new wiki_API (D:\git repo\warframe-items-for-huiji\node_modules\cejs\application\net\wiki.js:201:21)
at Wikiapi (D:\git repo\warframe-items-for-huiji\node_modules\wikiapi\Wikiapi.js:92:23)
at Object.<anonymous> (D:\git repo\warframe-items-for-huiji\scripts\parseTemplateData.js:6:1)
at Module._compile (node:internal/modules/cjs/loader:1103:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1155:10)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
at node:internal/main/run_main_module:17:47
I tried a little debug on the module and located the problem:
/cejs/application/net/wiki.js
line 194, right before API_URL.includes('://')
yet I'm too noob to fix it, I need advice, thank you~
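A possible cause, judging only from the stack trace (Wikiapi is invoked directly from the script with something that has no .includes method): the line const Wikiapi = require('wikiapi') has no semicolon and the next line starts with a parenthesis, so JavaScript parses the two lines as one call, passing the async function in as the API URL. The sketch below reproduces that parsing behaviour with a stand-in; load() and FakeWikiapi are hypothetical names, not the library's code:

```javascript
const received = [];

// Stand-in for require('wikiapi') (hypothetical): returns a constructor-like
// function that records what it was called with.
function load() {
  return function FakeWikiapi(apiUrl) {
    received.push(typeof apiUrl);
    return () => undefined;
  };
}

// No semicolon after load(): the parenthesised function on the next line is
// parsed as an argument list, i.e. load()(async () => {})().
const Broken = load()
(async () => {})()

// With the semicolon, the two statements stay separate.
const Fixed = load();
(async () => {})();

console.log(received);      // what FakeWikiapi was called with
console.log(typeof Broken); // no longer the constructor
console.log(typeof Fixed);
```

If this is what happened, adding a semicolon after require('wikiapi') should be enough.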
I can't seem to find a decent place to change the default User-Agent used in the library. Wikipedia has a User-Agent policy which calls for a specific User-Agent string and it would be great if there were an option to provide a custom User-Agent which was specific to the calling application vs. the wikiapi package.
Regards,
Steve
Any idea what this is?
== [20210408T2243]: 6 pages done ==
Add temporary category to ease next step.
: 6 pages done, 28 s elapsed.
* First, use 476 ms to get 6 pages.
# '''[[:File:Letter-a-colorful.svg]]''' 2 s elapsed, [[Special:Diff/551289225|finished]] at 20210408T2243
# '''[[:File:Letter-b-colorful.svg]]''' 7 s elapsed, [[Special:Diff/551289231|finished]] at 20210408T2243
# '''[[:File:Letter-c-colorful.svg]]''' 3 s elapsed, [[Special:Diff/551289255|finished]] at 20210408T2243
# '''[[:File:Letter-d-colorful.svg]]''' 5 s elapsed, [[Special:Diff/551289270|finished]] at 20210408T2243
# '''[[:File:Letter-e-colorful.svg]]''' 5 s elapsed, [[Special:Diff/551289289|finished]] at 20210408T2243
# '''[[:File:Letter-f-colorful.svg]]''' 5 s elapsed, [[Special:Diff/551289301|finished]] at 20210408T2243
It showed up at the end of this script's run:
// PURPOSE: Script to edit targets using hand-picked targets.
// Run: $node wiki-upload-many.js
const Wikiapi= require('wikiapi');
//const logins = require('./logins.js');
const logins = require('../DragonsBot/logins-ShufaBot.js');
const letters= require('./data/letters.js');
// Login credentials from .login*.js
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
(async () => {
const wiki = new Wikiapi;
await wiki.login(USER, PASS, API);
console.log(`Username ${USER} is connected !`);
/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* */
var listPages = letters.map(item => `File:Letter-${item.letter}-colorful.svg`);
// Add template {stub}, replace-remove vandalism if any, add category.
await wiki.for_each_page(
listPages,
d => { return d.wikitext //.replace(/^/g,'Thanos says: ')
+`\n[[Category:${USER} test: edit]]`;
}, // new content
{bot: 1, nocreate: 0, minor: 1, summary: 'Add temporary category to ease next step.'} // edit options
);
/* END CORE ****************************************************** */
/* *************************************************************** */
})();
// For details, see documentation : https://kanasimi.github.io/wikiapi/
[Please close this issue. Already solved. Just here to create some record / documentation.]
I got an error message with another of my code...
commons-rename.js:
// PURPOSE: Script renames targets following hand-coded patterns.
const Wikiapi= require('wikiapi');
const logins = require('./logins-ShufaBot.js');
const files = require('./data/zi-reds.js');
// Edit login credentials
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
(async () => {
// Connect
console.log('Connecting...');
const targetWiki = new Wikiapi;
await targetWiki.login(USER, PASS, API);
console.log(`Username ${USER} is connected !`);
// Renaming by pattern
for (let i = 0; i < files.length; i++) {
const zi = files[i].zi;
console.log(zi)
// File page exist ?
let pageData = await targetWiki.page(`File:${zi}.png`, {});
console.log('pageExists: ',pageData.wikitext!=='')
if(pageData.wikitext!=='') {
var initialTitle=`File:${zi}.png`,
newTitle=`File:${zi}-newname.png`,
reason='ShufaBot test: renaming file.',
revertReason='ShufaBot test: renaming file, revert.';
console.log(initialTitle,newTitle);
// Rename
let result = await targetWiki.move_page(initialTitle, newTitle, { reason: reason, noredirect: true, movetalk: true });
// Revert rename
await targetWiki.page(newTitle);
result = await targetWiki.move_to(initialTitle, { reason: revertReason, noredirect: true, movetalk: true });
}
}
})();
$ node ./edit-Commons-filename.js
Connecting...
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot@ShufaBot is connected !
112379664-b9a79100-8ce8-11eb-981d-727884b32993
pageExists: true
File:112379664-b9a79100-8ce8-11eb-981d-727884b32993.png File:112379664-b9a79100-8ce8-11eb-981d-727884b32993-newname.png
(node:71405) UnhandledPromiseRejectionWarning: #<Object>
(Use `node --trace-warnings ...` to show where the warning was created)
(node:71405) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:71405) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
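The warnings above mean that a promise rejected inside the async function and nothing caught it; "#<Object>" suggests the rejection value was a plain object instead of an Error, which is why no message is shown. A minimal sketch of catching and printing such rejections, assuming a stand-in movePage() that always fails (it is hypothetical and only imitates a rejecting wiki call):

```javascript
// Hypothetical stand-in for a wiki call that rejects with a plain object,
// which is what "#<Object>" in the warning suggests.
async function movePage(from, to) {
  throw { code: 'example-error', info: `cannot move ${from} to ${to}` };
}

async function main() {
  const failures = [];
  for (const title of ['File:A.png', 'File:B.png']) {
    try {
      await movePage(title, title.replace('.png', '-newname.png'));
    } catch (error) {
      // Print the whole object so the real reason is visible.
      console.error(`Move failed for ${title}:`, JSON.stringify(error));
      failures.push(title);
    }
  }
  return failures;
}

// A final catch keeps anything unexpected from becoming an unhandled rejection.
const done = main().catch((error) => {
  console.error('Fatal:', error);
  process.exitCode = 1;
  return [];
});
```

With the try/catch in the loop, one failed rename is reported and the remaining files are still processed.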
Following #30 API documentation revamp (JSDoc migration), the README.md must be rewritten. (I propose to lead this while @kanasimi works on wikiapi.js JSDoc3 comments.)
README.md code gallery:
Use cases and examples should use Commons:Sandbox/wikiapijs and subpages (./1, ./2, ...) as default testing ground when possible, or :en:Wikipedia:Sandbox/wikiapijs.
(Or an official test wiki? But it's less fun...)
Hello,
Is there a parameter for that, or should I first read the target page to check if it exists ?
I have a code working via
let pageData = await targetWiki.page(`File:${zi}-red.png`, {});
let isEmpty = pageData.wikitext==='';
It's more a documentation issue.
Hello Kanasimi,
I tried to create a minimal bot to contribute to Wikimedia's LinguaLibre.org. Login fails. I suspect it's because it has delegated login to Commons via OAuth. I am confused about how to connect to log in and how to connect to edit.
My objective is to create 1000 pages based on @unicode-org/unilex's 1000's files.
// edit page: method 2
(async () => {
// The wiki object must be declared before it is used for login.
const wiki = new Wikiapi('https://www.lingualibre.org/api.php');
await wiki.login('Dragons Bot', 'pass', 'https://commons.wikimedia.org/api.php');
await wiki.edit_page('User:Dragons_Bot', function(page_data) {
return page_data.wikitext
+ `\nHi, I'm Dragons Bot ! I plan to upload lists and others maintenances.`;
}, {bot: 1, nocreate: 1, minor: 1});
console.log('Done.');
})();
Simple readability proposal. When a page exists and I edit it with the same content, I see :
wiki_API_edit: [[List:Ibo/Most used words, UNILEX 1: words 00001 to 00200]]: no change
Edit page: Done.
It could be more natural to have :
wiki_API_edit: [[List:Ibo/Most used words, UNILEX 1: words 00001 to 00200]]: no difference
Edit page: Done (none).
Feel free to close this issue. Depending on what you do under the hood your current text may be more accurate.
Does this module support browsers?
I'm specifically using it together with Snowpack (& Skypack CDN) to make it work on the browser but I get the following errors:
[23:33:25] [snowpack] + [email protected]
[23:33:26] [esinstall:wikiapi] Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\Wikiapi.js
Import "./_CeL.loader.nodejs.js" could not be resolved from file.
[23:33:26] [esinstall:wikiapi] Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\Wikiapi.js
Import "./_CeL.loader.nodejs.js" could not be resolved from file.
[23:33:26] [esinstall:wikiapi] Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\_CeL.loader.nodejs.js?commonjs-external
Module "Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\_CeL.loader.nodejs.js" could not be resolved by Snowpack (Is it installed?).
[23:33:26] [snowpack] Install failed.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
This is how I import it:
import wikiapi from 'wikiapi'
Wikiapi is too broad and non-distinctive and confusing.
WikiapiJS should be used to qualify your project, as often as possible.
See also #48
A yarn upgrade from 1.14.1 to 1.15.1 resulted in the following error:
[Node] /Users/strefethen/github/traffic-graphql/node_modules/wikiapi/Wikiapi.js:1371
[Node] no_message: options?.no_edit,
node --version
v12.18.3
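The reported line uses optional chaining (options?.no_edit), which Node.js only supports from version 14, so Node v12.18.3 most likely rejects the file with a syntax error (an assumption, since the trace above is cut off). A small sketch that checks the runtime and shows the equivalent expression in older syntax; readNoEdit is a hypothetical helper used only for illustration:

```javascript
// Optional chaining needs Node.js 14 or later.
const major = Number(process.versions.node.split('.')[0]);
console.log(`Node ${process.versions.node}:`,
  major >= 14 ? 'optional chaining supported' : 'upgrade Node or stay on the older package version');

// Equivalent of `options?.no_edit` in syntax that older runtimes accept.
function readNoEdit(options) {
  return options == null ? undefined : options.no_edit;
}

console.log(readNoEdit(undefined));
console.log(readNoEdit({ no_edit: true }));
```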
I got filemover user rights on Commons for my bot:
Any familiarity with this error message?
Error: abusefilter-autopromote-blocked: This action has been automatically identified as harmful, and it has been disallowed. In addition, as a security measure, some privileges routinely granted to established accounts have been temporarily revoked from your account. A brief description of the abuse rule which your action matched is: Pagemove throttle for new users
at /home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/admin.js:166:15
at check_session_badtoken (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/query.js:181:4)
at /home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/query.js:603:4
at IncomingMessage.<anonymous> (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:2221:6)
at IncomingMessage.emit (events.js:327:22)
at endReadableNT (_stream_readable.js:1224:12)
at processTicksAndRejections (internal/process/task_queues.js:84:21)
If I remember well, we are limited to 1 edit every 5 secs. But I can't get my hand on the doc at the moment, and renaming may have special rules.
If you don't know no need to look for the answer : I asked on Commons for guidance.
/* ************************************************************* */
/* ********* ************** */
/* ********* REWRITING ONGOING. ************** */
/* ********* DO NOT READ YET. ************** */
/* ********* ************** */
/* ************************************************************* */
I don't know how cejs and WikiapiJS' .download() currently handle their queries, but I suspect it lists the target files in an array via .categorymembers(), .search() or others, which return something such as:
Categorymembers files:
// var files = await targetwiki.categorymembers('Category:Lingua Libre pronunciation-cmn', { namespace: 'File' });
// returns :
[
{"pageid":98560779,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-不.wav"},
{"pageid":98560774,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-了.wav"},
....
{"pageid":98560798,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-什么.wav"},
]
It then uses the pageid to run a new API query for each file, gets the url, and downloads the file.
Source | Comment |
---|---|
https://commons.wikimedia.org/w/api.php? | API queries have ratelimit |
https://upload.wikimedia.org | Direct downloads don't (I'm not 100% sure for that 😅 ). |
There are Special:ApiSandbox queries which, using a single API call, can fetch by category name a few hundred category member files, with exact file url and timestamp.
{
"batchcomplete": "",
"continue": {
"gcmcontinue": "file|313235300a4c4c2d513135302028465241292d504f534c4f56495443482d313235302e574156|88497069",
"continue": "gcmcontinue||"
},
"query": {
"pages": {
"82101585": { ... },
"104331639": {
"pageid": 104331639,
"ns": 6,
"title": "File:LL-Q150 (fra)-Kitel WP-%.wav",
"imagerepository": "local",
"imageinfo": [{
"timestamp": "2021-04-25T15:49:00Z",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a3/LL-Q150_%28fra%29-Kitel_WP-%25.wav",
"descriptionurl": "https://commons.wikimedia.org/wiki/File:LL-Q150_(fra)-Kitel_WP-%25.wav",
"descriptionshorturl": "https://commons.wikimedia.org/w/index.php?curid=104331639"
}]
},
"104381091": { ... }
}
}
}
title, timestamp and url are the most relevant properties, I believe.
See also: API:Categorymembers, API:Allimages, API:Imageinfo.
For files, the url property gives a direct download url allowing download from upload.wikimedia.org without an additional API query on https://commons.wikimedia.org/w/api.php.
With one API query we can have 500 direct urls to download from at higher speed.
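As a rough illustration of the single-call idea, the sketch below builds such a query with generator=categorymembers plus prop=imageinfo and reduces a response of the shape shown above to title, timestamp and url. The function names are hypothetical and this is not how cejs or wikiapi build their requests:

```javascript
const API = 'https://commons.wikimedia.org/w/api.php';

// Build one query that lists a category's files together with their imageinfo.
function buildQueryUrl(category, gcmcontinue) {
  const params = new URLSearchParams({
    action: 'query',
    format: 'json',
    generator: 'categorymembers',
    gcmtitle: category,
    gcmtype: 'file',
    gcmlimit: '500',
    prop: 'imageinfo',
    iiprop: 'timestamp|url',
  });
  // Pass the continuation token from the previous response to get the next batch.
  if (gcmcontinue) params.set('gcmcontinue', gcmcontinue);
  return `${API}?${params}`;
}

// Reduce a response to the three properties of interest.
function extractFiles(response) {
  const pages = (response.query && response.query.pages) || {};
  return Object.values(pages)
    .filter((page) => Array.isArray(page.imageinfo) && page.imageinfo.length > 0)
    .map((page) => ({
      title: page.title,
      timestamp: page.imageinfo[0].timestamp,
      url: page.imageinfo[0].url,
    }));
}

console.log(buildQueryUrl('Category:Lingua Libre pronunciation-cmn'));
```

Each entry returned by extractFiles() can then be fetched directly from its url, and the continue.gcmcontinue value of the response feeds the next call.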
The Wikimedia Discord api-bot channel gave several inputs to this project:
The ambiguity comes from the MediaWiki API already being commonly known as the wiki api.
Wikiapi.js (capitalized + JavaScript indicator): for human-oriented documentation text, readme.md, wiki, community announcements, emails, etc. Minor alternative: WikiapiJS.
wikiapijs (lowercase): for computer-friendly urls, hashtags, tags.
wikiapi (lowercase): for in-code usage within .js files, the GitHub repository name, npm (which is already a JS ecosystem). Note: renaming the npm package is bad for dependencies, so it cannot be done there.
Same practice as for D3.js (in human conversation), d3js (in hashtags), and d3 (in .js files and npm).
Important to note: D3 has no ambiguity with another homonymous "D3" project. We do.
Do you mean that renaming on github changes the package's name on npm ?
Hello,
You already have quite a lot of useful, practical code in the README.md, from Wikipedias to Wikidata, from GET to Replace.
Since development is active, I would suggest the following new direction:
// Step 4: POST request to upload a file directly
function upload(csrf_token) {
var params_3 = {
action: "upload",
filename: "Sandboxfile1.jpg", // `data[i].filename` comes here
ignorewarnings: "1",
comment: "comment here", // `data[i].comment` comes here
token: csrf_token,
format: "json"
};
var file = {
file: fs.createReadStream('My.jpg') // `data[i].filepath` comes here
};
var formData = Object.assign( {}, params_3, file );
request.post({
url: url, // API's root url such as "https://test.wikipedia.org/w/api.php";
formData: formData
}, function (error, res, body) {
if (error) { return; } // check for a transport error before parsing the body
body = JSON.parse(body);
if (body.upload.result === "Success"){
console.log("File Uploaded :)");
}
});
}
filepath (string): path to the file to upload. Example: "birds.png"
filename (string): target filename as wished on Wikimedia Commons. Example: "wikiapiTestUploadOnCommons.png"
comment (string): upload comment. Also used as the initial page text for new files if text is not specified. Example: "Upload successful !"+ new Date().toISOString().slice(0,-5)
Under consideration:
directory (string): path to directory with image(s). (Dev note: maybe just push it into data[i].filename?) Ex: ~/Documents/ImagesLouvres/ or ./
tags: change tags to apply to the upload log entry and file page revision.
ignorewarnings: ignore any warnings.
The difference between uploading a local file and uploading an online file is minor and can be handled with a conditional.
Example 1: Upload a local file directly
// Step 4: POST request to upload a file directly
function upload(csrf_token) {
var params_3 = {
action: "upload",
filename: "Sandboxfile1.jpg",
ignorewarnings: "1",
token: csrf_token,
format: "json"
};
var file = {
file: fs.createReadStream('My.jpg')
};
var formData = Object.assign({}, params_3, file);
request.post({
url: url, // API https://en.wikipedia.org/w/api.php
formData: formData
}, function(error, res, body) {
if (error) { return; } // check for a transport error before parsing the body
body = JSON.parse(body);
if (body.upload.result === "Success") {
console.log("File Uploaded :)");
}
});
}
Example 2: Upload file from URL
// Step 4: POST request to upload a file from a URL
function editRequest(csrf_token) {
var params_3 = {
action: "upload",
filename: "Test-ABCD.jpg",
url: "https://farm9.staticflickr.com/8213/8300206113_374c017fc5.jpg",
ignorewarnings: "1",
token: csrf_token,
format: "json"
};
request.post({
url: url, // API https://en.wikipedia.org/w/api.php
form: params_3
}, function(error, res, body) {
if (error) { return; } // check for a transport error before parsing the body
body = JSON.parse(body);
if (body.upload.result === "Success") {
console.log("File Uploaded :)");
}
});
}
So a conditional and some minor fixes should be enough.
url: URL to fetch the file from.
if( filepath.indexOf("http") == 0 ) {
params_3.url = filepath
} else {
params_3.file = fs.createReadStream(filepath)
}
Note: form and formData are slightly different (I am not sure it affects our case).
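The conditional above can be wrapped in a small function. The sketch below is only an illustration: addSource is a hypothetical name, and the scheme test is a stricter variant of indexOf("http") == 0, which would also match a local file named, for example, "http-logo.png":

```javascript
const fs = require('fs');

// Decide between "upload by URL" and "upload a local file".
function addSource(params, filepath) {
  if (/^https?:\/\//i.test(filepath)) {
    return { ...params, url: filepath };
  }
  // Lazy: the read stream is only opened when the caller invokes file().
  return { ...params, file: () => fs.createReadStream(filepath) };
}

const base = { action: 'upload', filename: 'Test-ABCD.jpg', format: 'json' };
console.log(Object.keys(addSource(base, 'https://example.org/picture.jpg')));
console.log(Object.keys(addSource(base, './My.jpg')));
```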
See table in :mw:Manual:Creating_a_bot
Replace:
// load page
(async () => {
const wiki = new Wikiapi('en');
let page_data = await wiki.page('Universe');
console.log(page_data.wikitext);
})();
// load page of other wiki
(async () => {
const wiki = new Wikiapi('https://awoiaf.westeros.org/api.php');
let page_data = await wiki.page('Game of Thrones');
console.log(page_data.wikitext);
})();
by
// load page from any wiki
(async () => {
const wiki = new Wikiapi('https://en.wikipedia.org/w/api.php'); // on Wikipedia...
// const wiki = new Wikiapi('https://awoiaf.westeros.org/api.php'); // ...or any private wiki
let page_data = await wiki.page('Game of Thrones');
console.log(page_data.wikitext);
})();
I started a demo bot repository which will contain convenient basic codes, better organized and ready to fire.
See : hugolpz/WikiapiJS-Eggs
Given the code :
const Wikiapi= require('wikiapi');
const logins = require('./logins.js');
// Login credentials from .login*.js
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
(async () => {
// Connect
var targetwiki = new Wikiapi;
await targetwiki.login(USER, PASS, API);
console.log(`Username ${USER.split('@')[0]} is connected !`);
/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* */
// List of categories
var categories = (await targetwiki.category_tree('Lingua_Libre_pronunciation', { depth: 1, cmtype: 'subcat', get_flated_subcategories: true })).flated_subcategories;
const keys = Object.keys(categories);
console.log(keys.length+' keys :\n '+JSON.stringify(keys))
/* END CORE ****************************************************** */
/* *************************************************************** */
})();
Update cejs done:
~/Documents/WikiapiJS-Eggs$ node GitHub.updater.node.js
Read the latest version from cache file CeJS-master.version.json
Get the infomation of latest version of CeJS...
Already have the latest version: 2022-01-05T08:20:26Z
Due to .category_tree(), the script prints the following warning message, while the script continues to work fine:
/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
get_list: Unknown response: [{"error":{"code":"toomanyvalues","info":"Too many values supplied for parameter \"pageids\". The limit is 50.","limit":50,"lowlimit":50,"highlimit":500,"*":"See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."},"servedby":"mw1363"}]
Trace: {
error: {
code: 'toomanyvalues',
info: 'Too many values supplied for parameter "pageids". The limit is 50.',
limit: 50,
lowlimit: 50,
highlimit: 500,
'*': 'See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.'
},
servedby: 'mw1363'
}
at wiki_API_list_callback (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/list.js:1075:13)
at wiki_API_next_list_callback (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/task.js:571:16)
at /home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/list.js:587:5
at check_session_badtoken (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/query.js:189:4)
at XMLHttp_handler (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/query.js:687:4)
at IncomingMessage.<anonymous> (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/Ajax.js:2265:6)
at IncomingMessage.emit (node:events:402:35)
at endReadableNT (node:internal/streams/readable:1340:12)
at processTicksAndRejections (node:internal/process/task_queues:83:21)
for_category_info_list: [object Object]
^Cpth 2/1: 17/131 Lingua Libre pronunciation-bdu: 0 item(s). evels left)
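The API error above says that at most 50 values may be passed in pageids per request (500 with the higher limit). One common way to respect such a limit is to split the list into batches; the sketch below shows that idea only and is not a description of how cejs handles it. chunk() is a hypothetical helper and the page ids are made up:

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(values, size) {
  const batches = [];
  for (let start = 0; start < values.length; start += size) {
    batches.push(values.slice(start, start + size));
  }
  return batches;
}

// 131 made-up page ids, matching the 131 subcategories seen in the log.
const pageids = Array.from({ length: 131 }, (_, index) => 1000 + index);
const batches = chunk(pageids, 50);

console.log(batches.map((batch) => batch.length)); // [ 50, 50, 31 ]
console.log(batches[0].join('|').slice(0, 24) + '...'); // start of one pageids value
```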
Hello Kanasimi, Hope you are going well.
Given a Commons category name, I want to download all its (12) files.
I updated wikiapi :
npm update
npm view wikiapi
[email protected] | BSD-3-Clause | deps: 1 | versions: 32
...
Following the quite elegant doc, I "coded" the following:
// PURPOSE: Script to upload targets using an external data file.
// Run: $node wiki-upload-many.js
const Wikiapi= require('wikiapi');
const logins = require('./logins.js');
// Login credentials from .login*.js
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
(async () => {
// Connect
var targetwiki = new Wikiapi;
await targetwiki.login(USER, PASS, API);
console.log(`Username ${USER.split('@')[0]} is connected !`);
/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* * /
// List of targets
const list = await targetwiki.categorymembers(`Category:Lingua Libre pronunciation by Bile rene`, { namespace: 'File' });
// Loop on targets & save
for(i=0;i<list.length;i++){// Set pages titles (current and new), reason and revertReason :
page_data = list[i];
try {
await targetwiki.download(page_data.title, { directory: './downloads' });
} catch (error) { console.log(`Download error on ${page_data.title} : ${error}`) }
}
*/
// Download all files from a (Commons) category
for (const page_data of await targetwiki.categorymembers(`Category:Lingua Libre pronunciation by Bile rene`, { namespace: 'File' })) {
try {
//if (targetwiki.is_namespace(page_data, 'File'))
const file_data = await targetwiki.download(page_data.title, { directory: './downloads' });
} catch (error) { console.log(`Download error on ${page_data.title} : ${error}`) }
}
/* END CORE ****************************************************** */
/* *************************************************************** */
})();
I get the following error message:
/WikiapiJS-Eggs$ node wiki-download-many.js
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Download error on File:LL-Q33093 (bas)-Bile rene-Bonjour.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Bonjour.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Bonne nuit.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Bonne nuit.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-C'est comment?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-C'est comment?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Comment vas-tu?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Comment vas-tu?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-guten tag.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Kal nga'a?(Comment allez-vous?).wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Que se passe t-il?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Que se passe t-il?.wav : TypeError: wiki.download is not a function
Given that JSDoc3 generates the documentation from the code, and I do find download at line 1781 in both your GitHub and my local file... I am quite confused. Any idea?
I guess ideally the docs should be up to date with the npm version. In any case, I happen to be trying to use this library and need to delete.
Thanks!
Timestamp property could be used to compare with existing local file's timestamp.
If API timestamp property is smaller (older) than local file timestamp, then skip download.
The imageinfo's "timestamp": "2021-04-25T15:49:00Z" indeed matches the file description page, which indicates:
 | Date/Time | Thumbnail | Dimensions | User | Comment |
---|---|---|---|---|---|
current | 15:49, 25 April 2021 | | 1.1 s (99 KB) | Kitel WP (talk, contribs) | |
After verification, files with several uploads provide by default the timestamp of the last upload (default : 1 revision, the latest).
When a file already exists locally, it could be skipped faster.
Given a the time per download, x the number of files to download, and b the initial categorymember query time (estimated b = 60 sec), the second attempt (update) could take about 2.7*14+60 = 97.8 sec instead of 540 sec.
@Poslovitch pointed out that filenames are not enough, some versioning check may be ongoing so recently updated files on commons are indeed re-downloaded. (Discord server invitation)
Ciencia-Al-Poder pointed out "First of all, you should avoid redownload files that you downloaded on a previous run. The api will return you the file modification/creation time. Use it to check if the file has been updated." (Discord link)
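The timestamp check suggested above could look roughly like this. It is a sketch under assumptions: needsDownload is a hypothetical helper, the local file's modification time stands in for "when it was downloaded", and the remote timestamp is the imageinfo timestamp string returned by the API:

```javascript
const fs = require('fs');

// Return true when the file is missing locally or the remote copy is newer.
function needsDownload(localPath, remoteTimestamp) {
  let stat;
  try {
    stat = fs.statSync(localPath);
  } catch (error) {
    if (error.code === 'ENOENT') return true; // no local copy yet
    throw error;
  }
  // The API timestamp is ISO 8601, so Date can parse it directly.
  return new Date(remoteTimestamp).getTime() > stat.mtimeMs;
}

console.log(needsDownload('./downloads/does-not-exist.wav', '2021-04-25T15:49:00Z'));
```

Files for which needsDownload() returns false can be skipped without any request to upload.wikimedia.org.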
Unsure if I should file this here or kanasimi/CeJS; the issue seems to be this regexp pattern:
https://github.com/kanasimi/CeJS/blob/521b966ccf2810455c9d89ec893f478f06d4575a/application/net/wiki/namespace.js#L399
Simply try getting a page from a domain that does not have a subdomain or domain extension, or is only an IP, or has a port number attached to it:
const Wikiapi = require("wikiapi");
(async () => {
const url = 'https://gbf.wiki/api.php';
// These will also fail
// const url = 'http://localhost/api.php';
// const url = 'http://127.0.0.1/api.php';
// const url = 'https://www.example.com:8080/api.php';
const wiki = new Wikiapi(url);
const data = await wiki.page('Main_Page');
console.log(data.wikitext);
})();
Expected: it just works.
Actual: I get the following error:
api_URL: Unknown project: [https://gbf.wiki/api.php]! Using default API URL.
And it simply queries the default (en wikipedia) instead.
I can still get it working by running a local reverse proxy with nginx and modifying /etc/hosts, but this is far from ideal.
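For comparison, Node's built-in URL class parses all four of the URLs from the report, including bare domains, IP addresses and ports. This is only a sketch of an alternative to a hand-written pattern, not a description of what CeJS does; describeApiUrl is a hypothetical helper:

```javascript
// Split an API URL into its parts with the WHATWG URL parser.
function describeApiUrl(apiUrl) {
  const { protocol, hostname, port, pathname } = new URL(apiUrl);
  return { protocol, hostname, port, pathname };
}

const urls = [
  'https://gbf.wiki/api.php',
  'http://localhost/api.php',
  'http://127.0.0.1/api.php',
  'https://www.example.com:8080/api.php',
];

for (const url of urls) {
  console.log(describeApiUrl(url));
}
```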
Previous issues helped develop an efficient recursive download over categories of files via :
await targetwiki.download(
"Category:Lingua_Libre_pronunciation-cmn", {
directory: './',
max_threads: 4,
page_filter(page_data) {
console.log('@Yug: ',JSON.stringify(page_data))
return true;
}
});
Q1: what is page_filter(page_data){ ...}
Q2 script speed limit and internet connection ?:
Is this speed I observe:
If 2., then :
Q3 max_threads: if I set max_threads: 8, I should be 2 times faster than with max_threads: 4, right? I briefly tested with 12, and this seems true.
Q4 depth: is it possible to limit category depth ? how ?
Q5 depth loop: is there a risk of depth infinite loop ? (Category:A contains Category:B which contains Category:A)
Q6 header: is the XHR header properly set? See meta:User-Agent_policy
Q7 resilience: Is the script resilient ?
Q8 ogg: how to download an alternative file format, such as .mp3 from .wav?
Example : file in category File:LL-Q7737_(rus)-1Apollinariya1-кофе.wav -> file to download LL-Q7737_(rus)-1Apollinariya1-кофе.mp3
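Regarding Q4 and Q5, a depth limit and a visited set are the usual guards against runaway or cyclic category trees. The sketch below works on an in-memory map rather than the live API, so `subcategories` and `collectCategories` are illustrative names, not Wikiapi options.

```javascript
// Breadth-first walk over a category graph with a depth cap and a visited set.
// `subcategories` maps a category title to the titles of its direct subcategories.
function collectCategories(subcategories, root, maxDepth) {
  const visited = new Set([root]);
  let frontier = [root];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const category of frontier) {
      for (const child of subcategories[category] || []) {
        if (!visited.has(child)) { // skip categories already seen (breaks cycles)
          visited.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  return [...visited];
}

// Category:A contains Category:B, which contains Category:A again.
const graph = {
  'Category:A': ['Category:B'],
  'Category:B': ['Category:A', 'Category:C'],
  'Category:C': [],
};
console.log(collectCategories(graph, 'Category:A', 1));
console.log(collectCategories(graph, 'Category:A', 5));
```

Because each category is added to `visited` before it is expanded, the A/B cycle is walked only once, whatever the depth limit.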
I noticed a lot of commented-out console.log and console.trace calls in the code.
It would be nice if those could be toggled on/off with a flag. It would help when running in CI, so that bugs from production edits can be caught more easily.
Nice library btw, very versatile :D
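One possible shape for such a toggle is sketched below. It is only an illustration: `createDebugLogger` is a hypothetical helper and `WIKIAPI_DEBUG` is an assumed environment variable name, not an existing option of Wikiapi or CeJS.

```javascript
// Hypothetical sketch: gate diagnostic output behind one flag
// instead of commenting log calls in and out.
function createDebugLogger(enabled, sink = console.log) {
  return (...args) => {
    if (enabled) sink('[debug]', ...args);
  };
}

// WIKIAPI_DEBUG is an assumed variable name, not an existing option.
const debug = createDebugLogger(process.env.WIKIAPI_DEBUG === '1');
debug('fetching page', 'Main_Page');
```

In CI the flag could be switched on for failing jobs only, keeping normal runs quiet.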
Following up on yesterday's upload issue. Further down the road, I now run into this:
$ node ./create-Commons-upload.js
Connecting...
get_API_parameters: Set enwiki: path=query+siteinfo # <------- Side issue: why talk about enwiki ? My target is Commons.
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot@ShufaBot is connected !
wiki_API_page: Not exists: [[:File:A-red.png]] # <------- this is good, I want to upload when page is empty
get_API_parameters: Set commonswiki: path=upload
/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:553
: !item.includes(boundary);
^
TypeError: item.includes is not a function # <------- This is the rocks on the road !
at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:553:12)
at Array.every (<anonymous>)
at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:551:38)
at Array.every (<anonymous>)
at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:551:38)
at give_boundary (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:570:9)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:712:5)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:790:4)
at push_and_callback (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:630:5)
at get_file_object (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:661:5)
In console I see :
wiki_API_page: Not exists: [[User talk:Dragons Bot@Dragons Bot/Stop]]
We should expect :
wiki_API_page: Not exists: [[User talk:Dragons Bot/Stop]]
no ?
commons-upload.js:
// PURPOSE: Script uploads file 丁-red.png and similar to Commons.
const Wikiapi= require('wikiapi');
const fetch = require('node-fetch');
const fs = require('fs');
const logins = require('./logins-ShufaBot.js');
const files = require('./data/zi-reds.js'); // PROVIDES THE CHARACTERS
// Edit login credentials
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api, // comma, so the names below are declared by the same var
wikicode = '',
zi='',
path='';
console.log('Username:', USER);
(async () => {
// Connects
const targetWiki = new Wikiapi;
await targetWiki.login(USER, PASS, API);
console.log('Connected!');
// upload file / media
for(let i=0;i<files.length;i++){
zi = files[i].zi;
wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`;
path = `/home/yug/Documents/DragonsBot/reds/png/${zi}-red.png`;
let result = await targetWiki.upload({
file_path: path,
comment: `Upload red stroke order for Chinese character ${zi}.`,
text: wikicode
});
}
})();
Then:
$ node ./create-Commons-upload.js --unhandled-rejections=strict
Username: ShufaBot@ShufaBot
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Connected!
to_form_data: Failed to get file: [/home/yug/Documents/DragonsBot/reds/png/a-red.png]
(node:50492) UnhandledPromiseRejectionWarning: #<Object>
(Use `node --trace-warnings ...` to show where the warning was created)
(node:50492) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:50492) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
I copy-pasted that URL into Chrome and the image shows up, so the path is correct.
Example file, to save as 丁-red.png: 丁-red.png
ASCII file, to save as a-red.png: a-red.png
Both 丁-red.png and a-red.png return the same error message, so it is not an encoding issue.
This issue is only a review and outline of associated issues. Please discuss on related issues.
I like writing documentation and plan to do a push on the wikiapi README.md.[1] I still have to understand the project better. Today I understood that you base your parameter naming convention on the MediaWiki API. 🥇 I still have to understand more before tackling your README.md. Your project is elegant, the code is easy to write and understand, and the development is active, which is great. Your project has two major weak points:
Key | Point | Actionable |
---|---|---|
Doc | The manual via examples is insufficient and noisy; too many things (parameters) are opaque | Needs a revamp of the README.md |
Devs | The lack of a multi-developer community. Everything rests on you. It's not distributed enough. | Needs to onboard 1~2 other JS devs of your level or who understand your ES6 code. |
A better README will help bring in new devs. While @kanasimi works great alone, for sustainability it is better to have teammates and reviewers who can fix things when you, Kanasimi, can't for various reasons.
[1]: this is an open source "plan". Plans are only real once they are done. So anyone who can get to it first can go ahead and propose improvements.
@hugolpz @maltejur @pcjtulsa @trustedtomato @vangberg
In the README, the section on editing a page does something I don't understand.
Here's the snippet in question
// edit page
(async () => {
const enwiki = new Wikiapi;
await enwiki.login('bot name', 'password', 'en');
let page_data = await enwiki.page('Wikipedia:Sandbox');
await enwiki.edit(function(page_data) {
return page_data.wikitext
+ '\nTest edit using {{GitHub|kanasimi/wikiapi}}.';
}, {bot: 1});
// alternative method
await enwiki.edit_page('Wikipedia:Sandbox', function(page_data) {
return page_data.wikitext
+ '\nTest edit using {{GitHub|kanasimi/wikiapi}}.';
}, {bot: 1, nocreate: 1, minor: 1});
console.log('Done.');
})();
It first retrieves the page data with let page_data = await enwiki.page('Wikipedia:Sandbox'); but then never uses page_data. Inside all of the edit and edit_page methods, page_data is passed in by the caller and shadows the outer variable.
Does this mean edit will do a .page() call internally for you? Is the await enwiki.page() call not needed then? Or should it be that edit actually doesn't get passed page_data?
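The shadowing itself is plain JavaScript behavior and can be checked without the library. In this sketch `fakeEdit` is a stand-in, not Wikiapi's `edit`; it only shows that the callback sees whatever the caller passes in, never the outer variable of the same name.

```javascript
// Stand-in for an edit method: it calls the callback with its own page object.
function fakeEdit(callback) {
  const internalPage = { wikitext: 'inner text' };
  return callback(internalPage);
}

const page_data = { wikitext: 'outer text' };

// The parameter named page_data shadows the outer constant inside the callback.
const result = fakeEdit(function (page_data) {
  return page_data.wikitext + '\nTest edit.';
});

console.log(result);
```

Whether the real `edit` fetches the page itself or reuses the result of the preceding `.page()` call is a question about the library that this sketch does not answer.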
Given core script.js :
const Wikiapi= require('wikiapi');
(async () => {
var targetwiki = new Wikiapi('commons');
try {
// await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation-cmn', 'Category'), { directory: './downloads',max_threads: 4 });
await targetwiki.download(
'Lingua_Libre_pronunciation-cmn',
{
directory: './',
max_threads: 4,
page_filter(page_data) {
console.log('@Yug: ',JSON.stringify(page_data))
return true;
}
});
} catch (error) { console.log(`Download error : ${JSON.stringify(error)}`) }
})();
It failed to download, and simply returned :
yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node ./wiki-download-many-category_tree_and_files.js
get_API_parameters: Cache commonswiki: path=query+siteinfo
get_API_parameters: Cache commonswiki: path=query+imageinfo
(Also, I don't understand the role of page_filter(), so I neutralised it.)
! ===========================================================================================
! DRAFT MESSAGE - to publish when ready to receive error reports and PRs
! ===========================================================================================
TL;DR: Wikiapi documentation has been created ; WikiapiJS Eggs, a kick starter project with classic usecases, as well.
Hello @kanasimi,
Hello @acagastya @maltejur @pcjtulsa @trustedtomato @vangberg,
March 2021 has been a very active month for WikiapiJS, and we are happy to announce two major milestones for the project.
Wikiapi welcomes a proper JSDoc3-based documentation. This makes learning and coding much easier.
Welcome also to WikiapiJS Eggs ! 🥚 🐉 WikiapiJS Eggs is the demo kickstarter for WikiapiJS. Install node, git clone WikiapiJS Eggs, add your username+password+target_site, and you are ready to mass fire on your wiki 😎
Both the Wikiapi documentation and WikiapiJS Eggs are in alpha release. While they add great value, we expect some errors. You are welcome to create issues or PRs when you see something to improve.
-- Kanasimi & Hugolpz
I started a script to upload 10 files. One target file is missing:
Node script (relevant section):
try{
// upload
let result = await targetWiki.upload({
file_path: `../DragonsBot/SOP/${zi}-${locale}${major}.${extension}`,
comment: `Upload Chinese radical ${zi} in red style, raster format.`, // <------------------------ TO EDIT
text: wikicode,
ignorewarnings: 1, // overwrite or create,
author: '[[User:Yug]]',
categories:[``]
});
}catch(e){ console.error(e); }
The node shell/script crashes on it, returning :
get_API_parameters: Set commonswiki: path=upload
internal/fs/utils.js:259
throw err;
^
Error: ENOENT: no such file or directory, open '../DragonsBot/SOP/亠-red.png'
at Object.openSync (fs.js:461:3)
at Object.readFileSync (fs.js:364:35)
at get_file_object (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:638:24)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:759:5)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
at to_form_data (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:793:3)
at get_URL_node (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:1395:4) {
errno: -2,
syscall: 'open',
code: 'ENOENT',
path: '../DragonsBot/SOP/亠-red.png'
}
I assumed the script would notify me with "file missing", then skip it and move on to the next target.
The script works properly when files exist.
Note/idea: if there are both successful and missing files, it may be useful to log the list of all successful / failed uploads, then print it when the script ends.
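The behavior wished for here (report the missing file, move on, summarize at the end) can be approximated on the caller's side. In this sketch `uploadAll` and its `uploadOne` argument are hypothetical; in a real script `uploadOne` would wrap the `targetWiki.upload(...)` call shown above.

```javascript
const fs = require('fs');

// Try every path; record successes and failures instead of stopping at the first error.
async function uploadAll(filePaths, uploadOne) {
  const report = { uploaded: [], failed: [] };
  for (const filePath of filePaths) {
    // Check up front, so a missing file never reaches the upload call.
    if (!fs.existsSync(filePath)) {
      report.failed.push({ filePath, reason: 'file missing' });
      continue;
    }
    try {
      await uploadOne(filePath);
      report.uploaded.push(filePath);
    } catch (error) {
      report.failed.push({ filePath, reason: String(error) });
    }
  }
  return report;
}
```

The up-front `fs.existsSync` check matters because the ENOENT in the trace above is thrown by a synchronous `readFileSync` deep inside the request code, where the surrounding `try`/`catch` did not stop the crash.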
You may want to check page_data.wikitext in your readme.md, which is not aligned with your data structure.
console.log(pagedata)
pagedata {
pageid: 438578,
ns: 3,
title: 'User talk:Dragons Bot',
revisions: [
{
revid: 464938,
parentid: 464937,
timestamp: '2021-03-01T22:36:55Z',
contentformat: 'text/x-wiki',
contentmodel: 'wikitext',
'*': 'undefined\n' +
'2\n' +
"Hi, I'm Dragons Bot ! I plan to upload lists and others maintenances."
}
],
convert_from: 'User_talk:Dragons_Bot',
original_title: 'User_talk:Dragons_Bot',
is_Flow: false
}
const Wikiapi = require('wikiapi');
const fetch = require("node-fetch");
var newContent = `\nHi, I'm Dragons Bot ! I plan to upload lists and others maintenances.`;
/* (async () => {
const url1 = 'https://raw.githubusercontent.com/lingua-libre/unilex/master/data/frequency/ig.txt'
const response = await fetch(url1);
const data = await response.text();
console.log('Fetch: Done.');
})(); */
// edit page: method 2
(async () => {
const targetWiki = new Wikiapi;
await targetWiki.login('user', 'pass', 'en');
await targetWiki.edit_page('Wikipedia:Sandbox', function(page_data) {
console.log('pagedata',page_data)
console.log('wikitext',page_data.revisions[0]['*'])
return page_data.revisions[0]['*'] // <------- `return page_data.wikitext` from your readme.md but actually undefined
+ '\n2'+newContent;
}, {bot: 1, nocreate: 1, minor: 1});
console.log('Edit: Done.');
})();
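Until the README and the returned structure agree, a small accessor can cover both shapes. `getWikitext` is a hypothetical helper; it assumes `page_data.wikitext` may be absent and falls back to the `revisions[0]['*']` field visible in the dump above.

```javascript
// Hypothetical helper: read wikitext from either shape of page_data.
function getWikitext(page_data) {
  if (typeof page_data.wikitext === 'string') return page_data.wikitext;
  const revision = page_data.revisions && page_data.revisions[0];
  return revision && typeof revision['*'] === 'string' ? revision['*'] : undefined;
}

const sample = {
  title: 'User talk:Dragons Bot',
  revisions: [{ contentmodel: 'wikitext', '*': 'Hello' }],
};
console.log(getWikitext(sample));
```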
Hi kanasimi,
I just started using this to parse content from wikipedia and I'm fetching infobox content and I would like to wikitext parse some sub-elements of the content. For example, the infobox has a string like this:
{{plainlist\n|*{{Jct|state=CA|SR|24}} in Oakland\n*{{Jct|state=CA|SR|123}} in [[Berkeley, California|Berkeley]]<!--might as well include it, since there are two lines-->\n}}
I'd like to use the parser to parse this string to allow easier processing, and it's not clear to me whether this is possible with the current implementation. It looks like this would ultimately require surfacing the CeJS parsing in wikiapi, but am I missing some alternative?
Thanks,
-Steve
For later. There are clearly some avenues to make the docs more minimalist and savvy. 👍🏼 That's pretty cool!
This is possibly not a bug.
I just uploaded over an existing file, and the page content did not change.
// upload
let result = await targetWiki.upload({
comment: `Upload Chinese radical ${zi} in red style, raster format.`,
text: `{{SOlicense|${zi}|${locale}|${major}.${extension}||${rad[0].replace(/^0+/,'')}|date=${today}|author=[[User:Yug|]]|license=PD-font|fontname=Arphic PL UKai}}`
+`\n{{MakeMeAHanzi}}\n{{AnimeCJK}}`
+`\n{{Rcat|{{Radnum|${rad[0]}}}|${rad[0]}|0||x}}`,
filepath : `../DragonsBot/SOP/${zi}-${locale}${major}.${extension}`,
ignorewarnings: 1, // overwrite or create,
author: '[[User:Yug]]',
categories:[`ShufaBot test: upload`]
});
Parameter comment is not effective.
Parameter text is not effective.
Parameter author is not effective.
Parameter categories is not effective.
So I plan an edit round thereafter.
I'm not having much luck using this package on Debian 4.17.171-2 with Node version v12.18.3 and yarn v1.22.5. The package fails to load properly. I'm developing on macOS using Node v12.18.3 using yarn v1.22.5 on my dev machine and the package works just fine but when I install on Debian I'm seeing these errors:
I'm curious if you might have seen/heard of this?
TypeError: Cannot assign to read only property 'name' of function 'function env(name, value) {
if (!name)
// return;
return undefined;
// typeof value !== 'undefi...<omitted>... }'
at Function.assign (<anonymous>)
at Function._.reset_env (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2),
at eval (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2), <anonymous>:4747:5)
at eval (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2), <anonymous>:4751:3)
And CeL fails to load so the first dependent line to execute then fails.
TypeError: CeL.run is not a function
at Object.<anonymous> (/node_modules/wikiapi/wikiapi.js:19:5)
at Module._compile (internal/modules/cjs/loader.js:1137:30)
Doesn't work 😭 .
Target category is Category:Lingua_Libre_pronunciation, exists, and contains sub-categories.
(Also tested with Category:Lingua_Libre_pronunciation-cmn, exists, and only contains files. Same results.)
The directory in which I want to store all those downloads is ./downloads and this directory exists.
// PURPOSE: Given a root category, script list subcategories and download all files from them into dedicated repositories.
// Run: $node script.js
const Wikiapi= require('wikiapi');
const logins = require('./logins.js');
// Login credentials from .login*.js
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api;
(async () => {
// Connect
var targetwiki = new Wikiapi;
await targetwiki.login(USER, PASS, API);
console.log(`Username ${USER.split('@')[0]} is connected !`);
// CORE CODE
await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });
})();
{ directory: './downloads' }
await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });
yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Cannot download Category:Lingua Libre pronunciation
node:internal/process/promises:246
triggerUncaughtException(err, true /* fromPromise */);
^
[UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "[object Array]".] {
code: 'ERR_UNHANDLED_REJECTION'
}
Node.js v17.0.1
try { ... } catch (error) { ... }
Errors are similar for :
{ directory: './downloads' }
try {
await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });
} catch (error) { console.log(`Download error : ${JSON.stringify(error)}`) }
yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Cannot download Category:Lingua Libre pronunciation
Download error : [{"pageid":69811642,"ns":14,"title":"Category:Lingua Libre pronunciation"}]
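As both logs show, the rejection reason here is an array of page objects rather than an `Error`, which is why Node prints "[object Array]". A caller can normalize such reasons before logging them. `describeRejection` below is a hypothetical helper and makes no claim about which reasons the library may reject with in general.

```javascript
// Hypothetical helper: turn any rejection reason into a readable message.
function describeRejection(reason) {
  if (reason instanceof Error) return reason.message;
  if (Array.isArray(reason)) {
    // Prefer page titles when the items look like page objects.
    return reason.map((item) => (item && item.title) || JSON.stringify(item)).join(', ');
  }
  return String(reason);
}

const reason = [{ pageid: 69811642, ns: 14, title: 'Category:Lingua Libre pronunciation' }];
console.log(`Download error: ${describeRejection(reason)}`);
```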
OSX Big Sur, node.js: 7.22.0
const Wikiapi = require('wikiapi');
(async()=>{
const wiki = new Wikiapi('zn');
let page_data = await wiki.page('Universe', {});
console.log('page_data: ', page_data);
console.log('page_data.text: ', page_data.wikitext)
})()
/Users/name/Documents/fun dev/wikiScraper/node_modules/cejs/application/net/wiki.js:201
&& API_URL.includes('://')) {
^
TypeError: API_URL.includes is not a function
at new wiki_API (/Users/name/Documents/fun dev/wikiScraper/node_modules/cejs/application/net/wiki.js:201:14)
at Wikiapi (/Users/name/Documents/fun dev/wikiScraper/node_modules/Wikiapi/Wikiapi.js:92:23)
at Object.<anonymous> (/Users/name/Documents/fun dev/wikiScraper/index.js:3:1)
I can only run the example that pulls the hanzi character from the page; every other example throws this error, and so does using English Wikipedia or any other input.
(I don't know if this is user error or not)
const Wikiapi= require('wikiapi');
const fetch = require('node-fetch');
const fs = require('fs');
const logins = require('./logins-ShufaBot.js');
const files = require('./data/zi-reds.js'); // to get the list: ls -1 *.svg > zi-reds.js
// PURPOSE: Script uploads file 丁-red.png and similar to Commons.
// Edit login credentials
var USER = logins.commons.user,
PASS = logins.commons.pass,
API = logins.commons.api, // comma, so wikicode below is declared by the same var
wikicode = '';
console.log('Username:', USER);
(async () => {
// Connect
console.log('Connects');
const targetWiki = new Wikiapi;
await targetWiki.login(USER, PASS, API);
console.log('Connected!');
// upload file / media
for(let i=0;i<files.length;i++){
let zi = files[i].zi;
wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`;
let result = await targetWiki.upload_file({
file_path: `./reds/png/${zi}.png`,
comment: `Upload red stroke order for Chinese character ${zi}.`,
text: wikicode
});
}
})();
Then:
$ node ./create-Commons-upload.js --trace-warnings
Username: ShufaBot@ShufaBot # personal console.log
Connects
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Connected! # personal console.log
(node:48717) UnhandledPromiseRejectionWarning: TypeError: targetWiki.upload_file is not a function
at /home/yug/Documents/DragonsBot/create-Commons-upload.js:25:36
at processTicksAndRejections (internal/process/task_queues.js:97:5)
(Use `node --trace-warnings ...` to show where the warning was created)
(node:48717) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:48717) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
Hi there,
I'm using WikiapiJS to code a wikiapi-egg (script) which will download all Commons files from target categories. My 3 largest target categories currently have about 50k audio files each, each file being 1.5KB. Do you know:
cmlimit=500 for regular users, cmlimit=5000 with the apihighlimits user right.
The goal is to provide the public with direct and convenient dumps of LinguaLibre's audio assets on a per-language basis. We want to create periodic (weekly?) dumps on our Lili server.
We want to keep a local dump synchronized with Wikimedia Commons. We are talking about 700,000 files so far. According to the test durations above, the initial synchronization would take 21 days, which is OK.
But an "update" a week later would require about 15 days, even though only 1~2% of the files (7,000-15,000) are new and need downloading.
Do you see any possible optimization?
WikiapiJS download worked on tiny categories (files = 12). See the code in #48.
I'm currently reluctant to test further for fear of being banned.
.download() benchmark (1)
Ok, I decided to test anyway on a category with n=369.