kanasimi / wikiapi

JavaScript MediaWiki API for node.js

Home Page: https://kanasimi.github.io/wikiapi/

License: BSD 3-Clause "New" or "Revised" License

JavaScript 100.00%
wikipedia-bot mediawiki wikiapi wikidata wikitext cejs wikipedia wikipedia-api wiki wikitext-parser

wikiapi's Issues

Fails to login: `異常 HTTP 狀態碼 404` (abnormal HTTP status code 404)

const Wikiapi= require('wikiapi');
const fetch  = require('node-fetch');
const fs     = require('fs');
const logins = require('./logins-ShufaBot.js');
const files  = require('./data/zi-reds.js'); // ls -1 *.svg > zi-reds.js
  
// PURPOSE: Script uploads file 丁-red.png and similar to Commons.

// Edit login credentials
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api,
	wikicode = '';
console.log('Username:', USER);

(async () => {
    // Connect
    console.log('Connects');
    const targetWiki = new Wikiapi;
    await targetWiki.login(USER, PASS, API);
    console.log('Connected!');
    // upload file / media
    for(let i=0;i<files.length;i++){
        let zi = files[i].zi;
        wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`;
	    let result = await targetWiki.upload({ 
            file_path: `./reds/png/${zi}.png`,
            comment: `Upload red stroke order for Chinese character ${zi}.`, 
            text: wikicode 
        });
    }
})();

Then:

$/node ./create-Commons-upload.js 
Username: ShufaBot 
Connects
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 1/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 2/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 3/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Retry 4/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: Got error when retrieving [https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1]: BAD STATUS
wiki_API_query: BAD STATUS: https://commons.wikimedia.org/api.php?assert=user&maxlag=5&format=json&utf8=1
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 1/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 2/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 3/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Retry 4/4: BAD STATUS
get_URL_node: 異常 HTTP 狀態碼 404:https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
get_URL_node: Got error when retrieving [https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1]: BAD STATUS
wiki_API_query: BAD STATUS: https://commons.wikimedia.org/api.php?action=query&meta=tokens&type=login&maxlag=5&format=json&utf8=1
wiki_API.login: 無法 login! Abort! Response:
BAD STATUS
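
A possible reading of the log above: every request goes to https://commons.wikimedia.org/api.php, which answers 404. Wikimedia-hosted wikis serve the action API under /w/api.php (the Commons endpoint quoted elsewhere on this page is https://commons.wikimedia.org/w/api.php), so the value stored in logins.commons.api is a likely culprit. The helper below is a hypothetical sketch, not part of wikiapi, that rewrites such an endpoint before it is passed to login().

```javascript
// Hypothetical helper (not part of wikiapi): rewrites a Wikimedia-hosted
// "/api.php" endpoint to the "/w/api.php" path those wikis serve.
function fixWikimediaApiUrl(apiUrl) {
  const url = new URL(apiUrl);
  const isWikimediaHost = /\.(wikimedia|wikipedia|wikidata)\.org$/.test(url.hostname);
  if (isWikimediaHost && url.pathname === '/api.php') {
    url.pathname = '/w/api.php';
  }
  return url.toString();
}

console.log(fixWikimediaApiUrl('https://commons.wikimedia.org/api.php'));
// Other wikis are left untouched, since their API path may differ.
console.log(fixWikimediaApiUrl('https://awoiaf.westeros.org/api.php'));
```

Third-party wikis often do serve the API at /api.php, which is why the sketch only touches the Wikimedia host names it recognises.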

Upload to commons

Hello,
I see the following in the README:

// upload file / media
(async () => {
	const wiki = new Wikiapi;
	await wiki.login('user', 'password', 'test');
	let result = await wiki.upload({ file_path: '/local/file/path', comment: '', text: '' });
})();

Nice, it does the job. But what is the recommended approach for filling in what the Commons.wikimedia.org Upload Wizard's form asks for: author, source, license, categories and so on? Should I write a second section of code that edits the page after the upload, or should I pass text: "value", with "value" being the whole wikicode of my page?

Note: you can answer me here and I will open a PR clarifying the README. 😉

Please also note that there are no tests for uploads.
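
One way to explore the second option is to assemble the whole file description page as wikitext and pass it as text. The sketch below only builds the string. The function name and field names are hypothetical, and the {{Information}} template with the filedesc and license-header section titles follows the usual Commons page layout rather than anything wikiapi prescribes.

```javascript
// Hypothetical helper: builds Commons-style file page wikitext.
function buildFilePageWikitext({ description, date, source, author, license, categories = [] }) {
  return [
    '=={{int:filedesc}}==',
    '{{Information',
    `|description=${description}`,
    `|date=${date}`,
    `|source=${source}`,
    `|author=${author}`,
    '}}',
    '',
    '=={{int:license-header}}==',
    `{{${license}}}`,
    '',
    ...categories.map(name => `[[Category:${name}]]`),
  ].join('\n');
}

const pageText = buildFilePageWikitext({
  description: 'Example description',
  date: '2021-04-08',
  source: '{{own}}',
  author: '[[User:Example]]',
  license: 'PD-self',
  categories: ['Example category'],
});
console.log(pageText);

// Assumed usage, based on the README snippet quoted above (needs a logged-in wiki):
// await wiki.upload({ file_path: '/local/file/path', comment: 'Upload', text: pageText });
```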

Feature request: on existing wikidata item, add a property's value

Hi, following the existing README.md, I would like to suggest a new feature based on what I perceive to be a practical use case.

Initial state

  • On Wikidata (or any Wikibase), an item exists, known by name or QID.
  • On this Wikibase, a property exists, known to us by its PID.
  • In our local script we have a value for this property (but it is not yet on Wikidata).

Action

  • We want to POST this value to the wikidata item, creating or updating the property.
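
For reference, the raw Wikibase action API covers the "creating" half of this with action=wbcreateclaim. The sketch below only builds the POST parameters for a string-typed property. The helper name, IDs and token are placeholders, and it says nothing about how wikiapi would expose this. Other datatypes need a structured JSON value, and changing an existing statement goes through a different module (wbsetclaimvalue).

```javascript
// Hypothetical helper: POST parameters for the Wikibase "wbcreateclaim" module.
// Assumes a string-typed property; other datatypes need a structured value.
function buildCreateClaimParams(entityId, propertyId, stringValue, csrfToken) {
  return {
    action: 'wbcreateclaim',
    entity: entityId,       // e.g. a QID
    property: propertyId,   // e.g. a PID
    snaktype: 'value',
    value: JSON.stringify(stringValue), // the API expects a JSON-encoded value
    token: csrfToken,
    format: 'json',
  };
}

// Placeholder IDs and token, for illustration only.
console.log(buildCreateClaimParams('Q1', 'P1', 'example value', 'TOKEN'));
```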

throws "API_URL.includes" is not a function when creating wikiapi instance

Hey:
I'm trying to build a little template-to-CSV tool with wikiapi. The problem in the title occurs when the script runs new Wikiapi("https://warframe.huijiwiki.com/api.php").

My environment:

  • node.js v16.14.0
  • wikiapi 1.19.2
  • cejs 4.5.1 (installed along with wikiapi; your package.json asks for the latest version)

The full code I ran:

const Wikiapi = require('wikiapi')
(async() => {
    const wiki = new Wikiapi('https://warframe.huijiwiki.com/api.php')
    let pageData = await wiki.page('吞天沙暴', {})
    console.log(pageData)
})()

The error in the console:

D:\git repo\warframe-items-for-huiji\node_modules\cejs\application\net\wiki.js:201
            API_URL.includes('://')) {
                    ^

    at new wiki_API (D:\git repo\warframe-items-for-huiji\node_modules\cejs\application\net\wiki.js:201:21)
    at Wikiapi (D:\git repo\warframe-items-for-huiji\node_modules\wikiapi\Wikiapi.js:92:23)
    at Object.<anonymous> (D:\git repo\warframe-items-for-huiji\scripts\parseTemplateData.js:6:1)
    at Module._compile (node:internal/modules/cjs/loader:1103:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1155:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
    at node:internal/main/run_main_module:17:47

I did a little debugging in the module and located the problem:

  • I inserted a debugger statement in /cejs/application/net/wiki.js at line 194, right before API_URL.includes('://'), and found that API_URL is an async function instead of a string.

I am too inexperienced to fix it myself and would appreciate advice. Thank you!
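
A likely cause, judging from the posted script and the stack trace: the line const Wikiapi = require('wikiapi') has no trailing semicolon and the next line starts with "(", so JavaScript parses the two lines as one expression and calls the exported function with the async function as its first argument. That would explain both the at Wikiapi frame (no new) and API_URL being an async function. The sketch below reproduces the effect with stand-in functions (FakeWikiapi and fakeRequire are hypothetical and only record what they receive).

```javascript
// Stand-in for the function exported by a module; it records the type of
// the argument it receives (hypothetical, for illustration only).
const receivedTypes = [];
function FakeWikiapi(apiUrl) {
  receivedTypes.push(typeof apiUrl);
  return () => {};
}
const fakeRequire = () => FakeWikiapi;

// No semicolon: the "(" on the next line continues the expression, so this
// parses as fakeRequire()(async () => {})() and the async function
// becomes the first argument of FakeWikiapi.
const Broken = fakeRequire()
(async () => {})()

// With semicolons the two statements stay separate.
const Fixed = fakeRequire();
(async () => {})();

console.log(receivedTypes);          // [ 'function' ]
console.log(Fixed === FakeWikiapi);  // true
```

If this is indeed the cause, ending the require line with a semicolon (or starting the IIFE line with one) should be enough.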

Provide an option to set a custom User-Agent string

I can't seem to find a decent place to change the default User-Agent used in the library. Wikipedia has a User-Agent policy which calls for a specific User-Agent string and it would be great if there were an option to provide a custom User-Agent which was specific to the calling application vs. the wikiapi package.

Regards,
Steve
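
For context, the Wikimedia User-Agent policy suggests a string of the form client/version (contact information) library/version. The sketch below only formats such a string. The helper and every value in the example are hypothetical, and how (or whether) wikiapi accepts a custom value is exactly what this issue asks.

```javascript
// Hypothetical helper: formats a User-Agent string in the shape suggested by
// the Wikimedia User-Agent policy: "<client>/<version> (<contact>) <library>/<version>".
function buildUserAgent({ name, version, contact, library }) {
  const base = `${name}/${version} (${contact})`;
  return library ? `${base} ${library}` : base;
}

console.log(buildUserAgent({
  name: 'ExampleBot',                 // placeholder application name
  version: '0.1',
  contact: 'https://example.org/bot; bot@example.org',
  library: 'wikiapi/1.19.2',
}));
```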

Unexpected edit on bot's userpage

Any idea what this is?

== [20210408T2243]: 6 pages done ==

Add temporary category to ease next step.
: 6 pages done, 28 s elapsed.
* First, use 476 ms to get 6 pages.
# '''[[:File:Letter-a-colorful.svg]]''' 2 s elapsed,  [[Special:Diff/551289225|finished]] at 20210408T2243
# '''[[:File:Letter-b-colorful.svg]]''' 7 s elapsed,  [[Special:Diff/551289231|finished]] at 20210408T2243
# '''[[:File:Letter-c-colorful.svg]]''' 3 s elapsed,  [[Special:Diff/551289255|finished]] at 20210408T2243
# '''[[:File:Letter-d-colorful.svg]]''' 5 s elapsed,  [[Special:Diff/551289270|finished]] at 20210408T2243
# '''[[:File:Letter-e-colorful.svg]]''' 5 s elapsed,  [[Special:Diff/551289289|finished]] at 20210408T2243
# '''[[:File:Letter-f-colorful.svg]]''' 5 s elapsed,  [[Special:Diff/551289301|finished]] at 20210408T2243

It showed up after this script finished running:

// PURPOSE: Script to edit targets using hand-picked targets.
// Run: $node wiki-upload-many.js
const Wikiapi= require('wikiapi');
//const logins = require('./logins.js');
const logins = require('../DragonsBot/logins-ShufaBot.js');
const letters= require('./data/letters.js'); 

// Login credentials from .login*.js
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;

(async () => {
    const wiki = new Wikiapi;
    await wiki.login(USER, PASS, API);
    console.log(`Username ${USER} is connected !`);

/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* */
    var listPages = letters.map(item => `File:Letter-${item.letter}-colorful.svg`);
    // Add template {stub}, replace-remove vandalism if any, add category.
    await wiki.for_each_page(
        listPages, 
        d => { return d.wikitext //.replace(/^/g,'Thanos says: ')
        	+`\n[[Category:${USER} test: edit]]`; 
        	}, // new content
           {bot: 1, nocreate: 0, minor: 1, summary: 'Add temporary category to ease next step.'}  // edit options
   );
/* END CORE ****************************************************** */
/* *************************************************************** */

})();

// For details, see documentation : https://kanasimi.github.io/wikiapi/

Renaming file fails

[Please close this issue. Already solved. Just here to create some record / documentation.]

I got an error message from another of my scripts...

commons-rename.js:

// PURPOSE: Script renames targets following hand-coded patterns.
const Wikiapi= require('wikiapi');
const logins = require('./logins-ShufaBot.js');
const files  = require('./data/zi-reds.js'); 

// Edit login credentials
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;

(async () => {
    // Connect
    console.log('Connecting...');
    const targetWiki = new Wikiapi;
    await targetWiki.login(USER, PASS, API);
    console.log(`Username ${USER} is connected !`);

    // Renaming by pattern
    for(let i=0;i<files.length;i++){
        let zi = files[i].zi;
        console.log(zi)
        // File page exist ?
        let pageData = await targetWiki.page(`File:${zi}.png`, {});
        console.log('pageExists: ',pageData.wikitext!=='')
        if(pageData.wikitext!=='') {
            var initialTitle=`File:${zi}.png`,
                newTitle=`File:${zi}-newname.png`,
                reason='ShufaBot test: renaming file.',
                revertReason='ShufaBot test: renaming file, revert.';
            console.log(initialTitle,newTitle);
            // Rename
            result = await targetWiki.move_page(initialTitle, newTitle, { reason: reason, noredirect: true, movetalk: true });
            // Revert rename
            await targetWiki.page(newTitle);
            result = await targetWiki.move_to(initialTitle, { reason: revertReason, noredirect: true, movetalk: true });
        }

    }
})();

$ node ./edit-Commons-filename.js
Connecting...
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot@ShufaBot is connected !
112379664-b9a79100-8ce8-11eb-981d-727884b32993
pageExists:  true
File:112379664-b9a79100-8ce8-11eb-981d-727884b32993.png File:112379664-b9a79100-8ce8-11eb-981d-727884b32993-newname.png
(node:71405) UnhandledPromiseRejectionWarning: #<Object>
(Use `node --trace-warnings ...` to show where the warning was created)
(node:71405) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:71405) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
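
The output above hides the actual failure behind UnhandledPromiseRejectionWarning: #<Object>, which suggests the rejection value was a plain object rather than an Error. A minimal sketch of how such a rejection can be surfaced is to wrap each awaited call in try/catch. The attempt helper and fakeMove stand-in below are hypothetical and make no wikiapi calls.

```javascript
// Hypothetical wrapper: runs an async action and reports any rejection
// instead of leaving it unhandled.
async function attempt(label, action) {
  try {
    return { ok: true, value: await action() };
  } catch (error) {
    // Rejections are not always Error instances; print whatever was thrown.
    console.error(`${label} failed:`, error);
    return { ok: false, error };
  }
}

// Stand-in for a move that the server rejects with a plain object.
const fakeMove = async () => { throw { code: 'example-error' }; };

const outcomePromise = attempt('move', fakeMove);
outcomePromise.then(outcome => console.log(outcome.ok)); // false
```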

Upload file only if file does not exist ?

Hello,
Is there a parameter for that, or should I first read the target page to check if it exists ?

I have working code using:

let pageData = await targetWiki.page(`File:${zi}-red.png`, {});
let isEmpty = pageData.wikitext==='';

It's more of a documentation issue.
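
As an alternative to comparing wikitext with an empty string, the raw MediaWiki action=query response marks non-existent titles with a missing key. The sketch below reads that marker from a response in the default JSON format. The helper name and the sample response are illustrative, and whether wikiapi exposes the same flag on its page objects is not confirmed here.

```javascript
// Hypothetical helper: lists the titles that a raw action=query response
// (default JSON format) reports as missing.
function missingTitles(queryResponse) {
  return Object.values(queryResponse.query.pages)
    .filter(page => 'missing' in page)
    .map(page => page.title);
}

// Illustrative response: one existing page, one missing page.
const sampleResponse = {
  query: {
    pages: {
      '123': { pageid: 123, ns: 6, title: 'File:Existing-red.png' },
      '-1': { ns: 6, title: 'File:Absent-red.png', missing: '' },
    },
  },
};

console.log(missingTitles(sampleResponse)); // [ 'File:Absent-red.png' ]
```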

Connection to LinguaLibre.org via Commons ?

Hello Kanasimi,

I tried to create a minimal bot to contribute to Wikimedia's LinguaLibre.org. Login fails. I suspect it's because login has been delegated to Commons via OAuth. I am confused about how to connect to log in and how to connect to edit.

My objective is to create 1000 pages based on @unicode-org/unilex's 1000's files.

Current code


// edit page: method 2
(async () => {
	// The wiki object must be created before login() is called on it.
	const wiki = new Wikiapi('https://www.lingualibre.org/api.php');
	await wiki.login('Dragons Bot', 'pass', 'https://commons.wikimedia.org/api.php');
	await wiki.edit_page('User:Dragons_Bot', function(page_data) {
		return page_data.wikitext
			+ `\nHi, I'm Dragons Bot ! I plan to upload lists and others maintenances.`;
	}, {bot: 1, nocreate: 1, minor: 1});
	console.log('Done.');
})();

wiki_API_edit: `no change`

Simple readability proposal. When a page exists and I edit it with the same content, I see :

wiki_API_edit: [[List:Ibo/Most used words, UNILEX 1: words 00001 to 00200]]: no change
Edit page: Done.

It could be more natural to have :

wiki_API_edit: [[List:Ibo/Most used words, UNILEX 1: words 00001 to 00200]]: no difference
Edit page: Done (none).

Feel free to close this issue. Depending on what you do under the hood your current text may be more accurate.

Snowpack support

Does this module support browsers?

I'm specifically using it together with Snowpack (& Skypack CDN) to make it work on the browser but I get the following errors:

[23:33:25] [snowpack] + [email protected]
[23:33:26] [esinstall:wikiapi] Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\Wikiapi.js
   Import "./_CeL.loader.nodejs.js" could not be resolved from file.
[23:33:26] [esinstall:wikiapi] Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\Wikiapi.js
   Import "./_CeL.loader.nodejs.js" could not be resolved from file.
[23:33:26] [esinstall:wikiapi]  Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\_CeL.loader.nodejs.js?commonjs-external
   Module "Home\DEV\ppsl-app-v1.5\node_modules\wikiapi\_CeL.loader.nodejs.js" could not be resolved by Snowpack (Is it installed?).
[23:33:26] [snowpack] Install failed.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

This is how I import it:

import wikiapi from 'wikiapi'

Edit repository details to add `wikiapijs` tag.

  • WikiapiJS-eggs : tag added.
  • Wikiapi

The name Wikiapi is too broad, non-distinctive, and confusing.
WikiapiJS should be used to refer to your project as often as possible.

Also...

  • #40: May I suggest renaming the project's title as well? The project is only lightly used as of now, but with clean documentation, elegant API code, and a very popular language (JS), it is going to become moderately popular. Better to rename now.

SyntaxError: Unexpected token '.'

Running yarn upgrade from 1.14.1 to 1.15.1 resulted in the following error:

[Node] /Users/strefethen/github/traffic-graphql/node_modules/wikiapi/Wikiapi.js:1371
[Node] 			no_message: options?.no_edit,

node --version
v12.18.3
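
The failing token is the optional chaining operator (options?.no_edit), which Node.js only parses from version 14 onwards, so Node 12.18.3 rejects the file at load time. A small sketch of a version guard follows. The helper name is hypothetical, and the check has to live in a file that itself avoids the newer syntax.

```javascript
// Hypothetical guard: optional chaining (?.) is parsed by Node.js 14 and later.
function supportsOptionalChaining(nodeVersion) {
  const major = Number(nodeVersion.replace(/^v/, '').split('.')[0]);
  return major >= 14;
}

console.log(supportsOptionalChaining('v12.18.3')); // false
console.log(supportsOptionalChaining(process.version));
```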

`Pagemove throttle for new users` : file move get rejected due to speed.

I got filemover userrights on Commons for my bot :

  • The 1st rename worked 🥇
  • The 2nd rename (the revert) was blocked, and the bot apparently lost its move rights temporarily.

Any familiarity with this error message ?

Error: abusefilter-autopromote-blocked: This action has been automatically identified as harmful, and it has been disallowed. In addition, as a security measure, some privileges routinely granted to established accounts have been temporarily revoked from your account. A brief description of the abuse rule which your action matched is: Pagemove throttle for new users
    at /home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/admin.js:166:15
    at check_session_badtoken (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/query.js:181:4)
    at /home/yug/Documents/DragonsBot/node_modules/cejs/application/net/wiki/query.js:603:4
    at IncomingMessage.<anonymous> (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:2221:6)
    at IncomingMessage.emit (events.js:327:22)
    at endReadableNT (_stream_readable.js:1224:12)
    at processTicksAndRejections (internal/process/task_queues.js:84:21)

If I remember correctly, we are limited to 1 edit every 5 seconds. But I can't find the documentation at the moment, and renaming may have special rules.

If you don't know, there is no need to look for the answer: I asked on Commons for guidance.
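
Whatever the exact limit turns out to be, spacing the moves out is straightforward. The sketch below runs async tasks one after another with a fixed pause between them. The helper names are hypothetical, the tasks are stand-ins, and the delay is a parameter because the 5-second figure above is only a recollection.

```javascript
// Resolves after the given number of milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: runs tasks sequentially, pausing delayMs between them.
async function runSpaced(tasks, delayMs) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(delayMs);
    results.push(await tasks[i]());
  }
  return results;
}

// Stand-in tasks; a real script would call the move function here instead.
const fakeMoves = ['first', 'second', 'third'].map(name => async () => `${name} moved`);

// A short delay keeps the demo quick; a bot would use a longer pause.
const spacedRun = runSpaced(fakeMoves, 10);
spacedRun.then(results => console.log(results));
```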

`.download()` : reduce calls to api.php, directly hit on https://upload.wikimedia.org

/* ************************************************************* */
/* *********                                      ************** */
/* *********       REWRITING ONGOING.             ************** */
/* *********         DO NOT READ YET.             ************** */
/* *********                                      ************** */
/* ************************************************************* */

I don't know how cejs and WikiapiJS' .download() currently handle their queries, but I suspect they list the target files in an array via .categorymembers(), .search() or similar, which returns something like:

Categorymembers files:

// var files = await targetwiki.categorymembers('Category:Lingua Libre pronunciation-cmn', { namespace: 'File' });
// returns : 
[
  {"pageid":98560779,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-不.wav"},
  {"pageid":98560774,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-了.wav"},
  ....
  {"pageid":98560798,"ns":6,"title":"File:LL-Q9192 (cmn)-Assassas77-什么.wav"},
]

It then presumably uses each pageid to run a new API query per file, get the URL, and download the file.

Limits

Source                                      Comment
https://commons.wikimedia.org/w/api.php?    API queries have a rate limit.
https://upload.wikimedia.org                Direct downloads don't (I'm not 100% sure about that 😅).

Wikimedia API and group queries

There are Special:ApiSandbox queries which, using a single API call, can fetch a few hundred category member files by category name, with the exact file URL and timestamp.

{
  "batchcomplete": "",
  "continue": {
    "gcmcontinue": "file|313235300a4c4c2d513135302028465241292d504f534c4f56495443482d313235302e574156|88497069",
    "continue": "gcmcontinue||"
  },
  "query": {
    "pages": {
      "82101585": { ... },
      "104331639": {
        "pageid": 104331639,
        "ns": 6,
        "title": "File:LL-Q150 (fra)-Kitel WP-%.wav",
        "imagerepository": "local",
        "imageinfo": [{
          "timestamp": "2021-04-25T15:49:00Z",
          "url": "https://upload.wikimedia.org/wikipedia/commons/a/a3/LL-Q150_%28fra%29-Kitel_WP-%25.wav",
          "descriptionurl": "https://commons.wikimedia.org/wiki/File:LL-Q150_(fra)-Kitel_WP-%25.wav",
          "descriptionshorturl": "https://commons.wikimedia.org/w/index.php?curid=104331639"
        }]
      },
      "104381091": { ... }
    }
  }
}

title, timestamp and url are the most relevant properties I believe.

See also: API:Categorymembers, API:Allimages, API:Imageinfo.

Url

For files, the url property gives a direct download URL, allowing download from upload.wikimedia.org without an additional API query on https://commons.wikimedia.org/w/api.php?.
With one API query we can get 500 direct URLs to download from at higher speed.
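
The grouped query described above can be sketched as two small pieces: the parameters for a generator=categorymembers request that also asks for imageinfo, and a function that pulls title, timestamp and url out of a response shaped like the JSON shown earlier. Both helper names are hypothetical and the sketch does no network I/O. The parameters follow the MediaWiki action API, not wikiapi's internals.

```javascript
// Hypothetical helper: parameters for one grouped API call that returns
// category member files together with their direct URLs.
function buildCategoryFilesQuery(categoryTitle) {
  return {
    action: 'query',
    format: 'json',
    generator: 'categorymembers',
    gcmtitle: categoryTitle,
    gcmtype: 'file',
    gcmlimit: 'max',
    prop: 'imageinfo',
    iiprop: 'url|timestamp',
  };
}

// Hypothetical helper: extracts download targets from such a response.
function extractDownloads(response) {
  return Object.values(response.query.pages)
    .filter(page => Array.isArray(page.imageinfo) && page.imageinfo.length > 0)
    .map(page => ({
      title: page.title,
      timestamp: page.imageinfo[0].timestamp,
      url: page.imageinfo[0].url,
    }));
}

// Trimmed copy of the response shown above.
const sample = {
  query: {
    pages: {
      '104331639': {
        pageid: 104331639,
        ns: 6,
        title: 'File:LL-Q150 (fra)-Kitel WP-%.wav',
        imageinfo: [{
          timestamp: '2021-04-25T15:49:00Z',
          url: 'https://upload.wikimedia.org/wikipedia/commons/a/a3/LL-Q150_%28fra%29-Kitel_WP-%25.wav',
        }],
      },
    },
  },
};

console.log(buildCategoryFilesQuery('Category:Lingua Libre pronunciation-cmn'));
console.log(extractDownloads(sample));
```

Results beyond one batch would still need the gcmcontinue value from the response passed back in a follow-up request.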

Sustained "burst" management

The Wikimedia Discord api-bot channel contributed several inputs to this project:

  • meta:User-Agent_policy
  • Discord discussion
    • Consensus: limit to 5~8 concurrent downloads.
    • @Anticomposite: make requests from only one IP.
    • @Ciencia-Al-Poder: file requests are not served from a normal filesystem. Because of the large set of files uploaded to Commons, they are stored on Swift (IIRC), which may incur some overhead, especially when requesting files that nobody has requested in a long time.
    • @Marreromarco, using Wikiget, reported an average of 10 downloads per second (730,000 files in 20 hours).
      • Done by running Wikiget in 20 terminals in parallel.
    • An average of 100 downloads per second is not unheard of but likely requires careful request headers and whitelisting.
    • Resilience when the internet connection is unstable may help.
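
The "5~8 concurrent downloads" point can be sketched as a small worker pool: a fixed number of workers pull tasks from a shared queue, so no more than limit tasks run at once. The helper and the fake download task are hypothetical. A real task would fetch a URL and write it to disk.

```javascript
// Hypothetical helper: runs async tasks with at most `limit` in flight.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Stand-in download that tracks how many tasks are active at once.
let active = 0;
let peak = 0;
const fakeDownload = () => new Promise(resolve => {
  active++;
  peak = Math.max(peak, active);
  setTimeout(() => { active--; resolve('done'); }, 5);
});

const tasks = Array.from({ length: 20 }, () => fakeDownload);
const demo = runWithLimit(tasks, 5);
demo.then(results => console.log(results.length, peak)); // 20 5
```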

Clarify naming convention (project branding)

The ambiguity comes from the MediaWiki API already being commonly known as the wiki api.

Recommendation 💯 🚀 👨🏼‍🎤

  • Wikiapi.js (capitalized + JavaScript indicator) - for human-oriented documentation text, readme.md, wiki, community announcements, emails, etc. Minor alternative: WikiapiJS.
  • wikiapijs (lowercase) - for computer-friendly URLs, hashtags, tags.
  • wikiapi (lowercase) - for in-code usage within .js files, the GitHub repository name, and npm (which is already a JS ecosystem). Note: renaming the npm package would break dependencies, so it cannot be done there.

D3js case

Same practice as for D3.js (in human conversation), d3js (in hashtags), and d3 (in .js files and npm).
Important to note: D3 has no ambiguity with another homonymous "D3" project. We do.

Github rename

Do you mean that renaming on github changes the package's name on npm ?

Feature request: upload.

Hello,
You already have quite a lot of useful, practical code in the README.md, from Wikipedias to Wikidata, from GET to replace.

Since development is active, I would suggest the following new direction:

Node.js syntax from mw:API:Upload

// Step 4: POST request to upload a file directly
function upload(csrf_token) {
    var params_3 = {
        action: "upload",
        filename: "Sandboxfile1.jpg",                  // `data[i].filename` comes here
        ignorewarnings: "1",
        comment: "comment here",                    // `data[i].comment` comes here
        token: csrf_token,
        format: "json"
    };

    var file = {
        file: fs.createReadStream('My.jpg')         // `data[i].filepath` comes here
    };

    var formData = Object.assign( {}, params_3, file );

    request.post({ 
        url: url,               // API's root url such as "https://test.wikipedia.org/w/api.php";
        formData: formData 
    }, function (error, res, body) {
        if (error) { return; }                      // check for a transport error before parsing
        body = JSON.parse(body);
        if (body.upload && body.upload.result === "Success"){
            console.log("File Uploaded :)");
        }
    });
}

Input parameters

  • filepath: path to file to upload.
    • type: string
    • default: "birds.png"
  • filename: target filename as wished on Wikimedia Commons.
    • type: string
    • default: "wikiapiTestUploadOnCommons.png"
  • comment: Upload comment. Also used as the initial page text for new files if text is not specified.
    • type: string
    • default: "Upload successful !"+ new Date().toISOString().slice(0,-5)

Under consideration:

  • directory: path to directory with image(s). (Dev note: maybe just push it into data[i].filename ?)
    • type: string. Ex: ~/Documents/ImagesLouvres/
    • default: ./

Some default parameters

  • tags: Change tags to apply to the upload log entry and file page revision.
    • type: Values (separate with | or alternative): possible vandalism, repeating characters
  • ignorewarnings: Ignore any warnings
    • type: boolean

Url or filepath

The difference between uploading a local file and an online file is minor and can be handled with a conditional.

Example 1: Upload a local file directly

// Step 4: POST request to upload a file directly
function upload(csrf_token) {
  var params_3 = {
    action: "upload",
    filename: "Sandboxfile1.jpg",
    ignorewarnings: "1",
    token: csrf_token,
    format: "json"
  };

  var file = {
    file: fs.createReadStream('My.jpg')
  };

  var formData = Object.assign({}, params_3, file);

  request.post({
    url: url,  // API https://en.wikipedia.org/w/api.php
    formData: formData
  }, function(error, res, body) {
    if (error) { return; }  // check for a transport error before parsing
    body = JSON.parse(body);
    if (body.upload && body.upload.result === "Success") {
      console.log("File Uploaded :)");
    }
  });
}

Example 2: Upload file from URL

// Step 4: POST request to upload a file from a URL
function editRequest(csrf_token) {
  var params_3 = {
    action: "upload",
    filename: "Test-ABCD.jpg",
    url: "https://farm9.staticflickr.com/8213/8300206113_374c017fc5.jpg",
    ignorewarnings: "1",
    token: csrf_token,
    format: "json"
  };

  request.post({
    url: url,  // API https://en.wikipedia.org/w/api.php
    form: params_3
  }, function(error, res, body) {
    if (error) { return; }  // check for a transport error before parsing
    body = JSON.parse(body);
    if (body.upload && body.upload.result === "Success") {
      console.log("File Uploaded :)");
    }
  });
}

So a conditional and some minor fixes should be enough

  • url: URL to fetch the file from.
    • conditional:
if( filepath.indexOf("http") == 0 ) {
   params_3.url = filepath
} else {
   params_3.file = fs.createReadStream(filepath)
}

Note: form and formData are slightly different (I'm not sure whether that affects our case).

Attached file

  • birds.png: open it in a browser, save it as birds.png within your git repository, and git add it. Adapt the code so that, by default, it loads and uploads this file to Wikimedia Commons.

Simplify readme.md : load page

Replace:

// load page
(async () => {
	const wiki = new Wikiapi('en');
	let page_data = await wiki.page('Universe');
	console.log(page_data.wikitext);
})();

// load page of other wiki
(async () => {
	const wiki = new Wikiapi('https://awoiaf.westeros.org/api.php');
	let page_data = await wiki.page('Game of Thrones');
	console.log(page_data.wikitext);
})();

by

// load page from any wiki
(async () => {
	const wiki    = new Wikiapi('https://en.wikipedia.org/w/api.php');    // on Wikipedia...
	// const wiki = new Wikiapi('https://awoiaf.westeros.org/api.php'); // ...or any private wiki
	let page_data = await wiki.page('Game of Thrones');
	console.log(page_data.wikitext);
})();

Gain

  • Reduce readme.md's length
  • Uniform syntax
  • Transparent API address / reminder : helpful for non wikimedians.

PR

  • Do you prefer pull requests ?

Others

  • Should I collaborate with you here or via a Wikimedia project page ?

"Too many values supplied for parameter \"pageids\". The limit is 50."

Given the code :

const Wikiapi= require('wikiapi');
const logins = require('./logins.js');

// Login credentials from .login*.js
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;

(async () => {
    // Connect
    var targetwiki = new Wikiapi;
    await targetwiki.login(USER, PASS, API);
    console.log(`Username ${USER.split('@')[0]} is connected !`);

/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* */
    // List of categories
    var categories = (await targetwiki.category_tree('Lingua_Libre_pronunciation', { depth: 1, cmtype: 'subcat', get_flated_subcategories: true })).flated_subcategories;
    const keys = Object.keys(categories);
    console.log(keys.length+' keys :\n '+JSON.stringify(keys))
    
/* END CORE ****************************************************** */
/* *************************************************************** */
})();

Update cejs done:

~/Documents/WikiapiJS-Eggs$ node GitHub.updater.node.js
Read the latest version from cache file CeJS-master.version.json
Get the infomation of latest version of CeJS...
Already have the latest version: 2022-01-05T08:20:26Z

Because of .category_tree(), the script prints the following warning, although it continues to work fine:

/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js 
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
get_list: Unknown response: [{"error":{"code":"toomanyvalues","info":"Too many values supplied for parameter \"pageids\". The limit is 50.","limit":50,"lowlimit":50,"highlimit":500,"*":"See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/&gt; for notice of API deprecations and breaking changes."},"servedby":"mw1363"}]
Trace: {
  error: {
    code: 'toomanyvalues',
    info: 'Too many values supplied for parameter "pageids". The limit is 50.',
    limit: 50,
    lowlimit: 50,
    highlimit: 500,
    '*': 'See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/&gt; for notice of API deprecations and breaking changes.'
  },
  servedby: 'mw1363'
}
    at wiki_API_list_callback (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/list.js:1075:13)
    at wiki_API_next_list_callback (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/task.js:571:16)
    at /home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/list.js:587:5
    at check_session_badtoken (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/query.js:189:4)
    at XMLHttp_handler (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/wiki/query.js:687:4)
    at IncomingMessage.<anonymous> (/home/yug/Documents/WikiapiJS-Eggs/node_modules/wikiapi/node_modules/cejs/application/net/Ajax.js:2265:6)
    at IncomingMessage.emit (node:events:402:35)
    at endReadableNT (node:internal/streams/readable:1340:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)
for_category_info_list: [object Object]
^Cpth 2/1: 17/131 Lingua Libre pronunciation-bdu: 0 item(s).    evels left)  
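
The error says the API accepts at most 50 pageids per request for this account (the highlimit of 500 applies to accounts with higher API limits). Until the library batches the values itself, a caller-side workaround is to split the list into groups of 50. The sketch below shows only the splitting. The helper name is hypothetical, and the count of 131 is taken from the progress line above.

```javascript
// Hypothetical helper: splits a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 131 placeholder page ids, matching the "17/131" progress shown above.
const pageIds = Array.from({ length: 131 }, (_, i) => i + 1);
console.log(chunk(pageIds, 50).map(batch => batch.length)); // [ 50, 50, 31 ]
```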

download is not a function

Hello Kanasimi, I hope you are doing well.
Given a Commons category name, I want to download all its (12) files.

I updated wikiapi :

npm update
npm view wikiapi
[email protected] | BSD-3-Clause | deps: 1 | versions: 32
...

Following the quite elegant doc, I "coded" the following:

// PURPOSE: Script to upload targets using an external data file.
// Run: $node wiki-upload-many.js
const Wikiapi= require('wikiapi');
const logins = require('./logins.js');

// Login credentials from .login*.js
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;

(async () => {
    // Connect
    var targetwiki = new Wikiapi;
    await targetwiki.login(USER, PASS, API);
    console.log(`Username ${USER.split('@')[0]} is connected !`);

/* *************************************************************** */
/* CORE ACTION(S) HERE : HACK ME ! ******************************* * /
    // List of targets
    const list = await targetwiki.categorymembers(`Category:Lingua Libre pronunciation by Bile rene`, { namespace: 'File' });
    // Loop on targets & save
    for(i=0;i<list.length;i++){// Set pages titles (current and new), reason and revertReason :
        page_data = list[i];
        try {
            await targetwiki.download(page_data.title, { directory: './downloads' });
        } catch (error) { console.log(`Download error on ${page_data.title} : ${error}`) }
    }
    
*/
    // Download all files from a (Commons) category
for (const page_data of await targetwiki.categorymembers(`Category:Lingua Libre pronunciation by Bile rene`, { namespace: 'File' })) {
	try {
		//if (targetwiki.is_namespace(page_data, 'File'))
		const file_data = await targetwiki.download(page_data.title, { directory: './downloads' });
	} catch (error) { console.log(`Download error on ${page_data.title} : ${error}`) }
}
/* END CORE ****************************************************** */
/* *************************************************************** */

})();

I get the following error message:

/WikiapiJS-Eggs$ node wiki-download-many.js 
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Download error on File:LL-Q33093 (bas)-Bile rene-Bonjour.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Bonjour.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Bonne nuit.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Bonne nuit.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-C'est comment?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-C'est comment?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Comment vas-tu?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Comment vas-tu?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-guten tag.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Kal nga'a?(Comment allez-vous?).wav : TypeError: wiki.download is not a function
Download error on File:LL-Q33093 (bas)-Bile rene-Que se passe t-il?.wav : TypeError: wiki.download is not a function
Download error on File:LL-Q56668 (mcn)-Bile rene-Que se passe t-il?.wav : TypeError: wiki.download is not a function

Given that JSdoc3 generates the documentation from the code, and I do find download at line 1781 in both your GitHub repository and my local file, I am quite confused. Any idea?

Please publish new version with delete

I guess ideally the docs should be up to date with the npm version. In any case, I happen to be trying to use this library and need to delete.

Thanks!

`.download()` : compare local and remote files by timestamps before downloading

Timestamp

Timestamp property could be used to compare with existing local file's timestamp.
If API timestamp property is smaller (older) than local file timestamp, then skip download.
The imageinfo's "timestamp": "2021-04-25T15:49:00Z" indeed matches file description page indicating :

Date/Time Thumbnail Dimensions User Comment
current 15:49, 25 April 2021 1.1 s (99 KB) Kitel WP (talk | contribs)

After verification, files with several uploads provide by default the timestamp of the last upload (default : 1 revision, the latest).
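
The comparison described above can be sketched as follows. This is a minimal sketch, not wikiapi's implementation: `shouldDownload` is a hypothetical helper, the remote timestamp is assumed to be the ISO 8601 string found in imageinfo, and the local timestamp is assumed to be the file's modification time.

```javascript
const fs = require('node:fs');

// Hypothetical helper: decide whether a remote file needs downloading.
// remoteTimestamp: ISO 8601 string such as "2021-04-25T15:49:00Z" (from imageinfo).
// localPath: where the file is (or would be) stored locally.
function shouldDownload(remoteTimestamp, localPath) {
	let localStat;
	try {
		localStat = fs.statSync(localPath);
	} catch (error) {
		// No readable local copy: download it.
		return true;
	}
	const remoteTime = Date.parse(remoteTimestamp);
	if (Number.isNaN(remoteTime)) {
		// Unparseable remote timestamp: download to be safe.
		return true;
	}
	// Skip only when the remote upload is older than the local copy.
	return remoteTime >= localStat.mtimeMs;
}
```

Relying on the local modification time is a design assumption: it only works if nothing else touches the downloaded files after they are written.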

Q4-related (✅ #51)

When a file already exists locally, it could be skipped faster.
With a the time per download, x the number of files to download, and b the initial categorymembers query time (estimated at b = 60 s), the second (update) run could take about a*x + b = 2.7*14 + 60 = 97.8 s instead of 540 s.
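
The estimate above is a simple linear model and can be written out as a small function. This is only a sketch of that arithmetic; `estimateUpdateSeconds` is a hypothetical name, and the inputs are the figures quoted in this issue.

```javascript
// Estimated duration of an update run: a * x + b
// a: seconds per download, x: number of files to download,
// b: seconds spent on the initial category listing.
function estimateUpdateSeconds(secondsPerDownload, filesToDownload, listingSeconds) {
	return secondsPerDownload * filesToDownload + listingSeconds;
}

// Figures from this issue: a = 2.7 s, x = 14 files, b = 60 s.
const estimate = estimateUpdateSeconds(2.7, 14, 60);
console.log(`${estimate.toFixed(1)} s`);
```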

Q5-related (✅ #51)

@Poslovitch pointed out that filenames are not enough: some version check is needed so that files recently updated on Commons are indeed re-downloaded. (Discord server invitation)
Ciencia-Al-Poder pointed out: "First of all, you should avoid redownload files that you downloaded on a previous run. The api will return you the file modification/creation time. Use it to check if the file has been updated." (Discord link)

API URL matching pattern is too rigid

Unsure if I should file this here or kanasimi/CeJS; the issue seems to be this regexp pattern:
https://github.com/kanasimi/CeJS/blob/521b966ccf2810455c9d89ec893f478f06d4575a/application/net/wiki/namespace.js#L399

Steps to reproduce

Simply try getting a page from a domain that has no subdomain or domain extension, is only an IP address, or has a port number attached:

const Wikiapi = require("wikiapi");

(async () => {
  const url = 'https://gbf.wiki/api.php';
  // These will also fail
  // const url = 'http://localhost/api.php';
  // const url = 'http://127.0.0.1/api.php';
  // const url = 'https://www.example.com:8080/api.php';

  const wiki = new Wikiapi(url);
  const data = await wiki.page('Main_Page');
  console.log(data.wikitext);
})();

Expected

It just works

Actual

Get the following error:

api_URL: Unknown project: [https://gbf.wiki/api.php]! Using default API URL.

And it simply queries the default (en wikipedia) instead.

Workaround

I can still get it working by running a local reverse proxy with nginx and modifying /etc/hosts, but this is very non-ideal.
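
For comparison, a more permissive check can be built on the standard WHATWG `URL` class instead of a hand-written pattern. The sketch below is not the CeJS code and not a proposed patch; `isApiEndpoint` is a hypothetical helper, and treating "path ends with /api.php" as the criterion is an assumption.

```javascript
// Hypothetical helper (not part of wikiapi or CeJS): accept any http(s) URL
// whose path ends in /api.php, whatever the host looks like.
function isApiEndpoint(candidate) {
	let url;
	try {
		url = new URL(candidate);
	} catch (error) {
		// Not an absolute URL (for example a bare project code).
		return false;
	}
	const isHttp = url.protocol === 'http:' || url.protocol === 'https:';
	return isHttp && url.pathname.endsWith('/api.php');
}

// The URLs from the steps to reproduce.
const samples = [
	'https://gbf.wiki/api.php',
	'http://localhost/api.php',
	'http://127.0.0.1/api.php',
	'https://www.example.com:8080/api.php',
];
for (const sample of samples) {
	console.log(sample, isApiEndpoint(sample));
}
```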

`.download()` on category, recursive : questions.

Previous issues helped develop an efficient recursive download over categories of files via :

 await targetwiki.download(
        "Category:Lingua_Libre_pronunciation-cmn", {
        directory: './',
        max_threads: 4,
        page_filter(page_data) {
            console.log('@Yug: ',JSON.stringify(page_data))
            return true;
        }
    });
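
The `max_threads` option above caps how many downloads run in parallel. How wikiapi implements this is not shown here; the general idea of a concurrency cap can be sketched as below, where `runWithLimit` and the worker are hypothetical stand-ins.

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight.
// Hypothetical helper; wikiapi's own max_threads handling may differ.
async function runWithLimit(items, limit, worker) {
	const results = new Array(items.length);
	let next = 0;
	// Each lane repeatedly takes the next unprocessed item until none remain.
	async function lane() {
		while (next < items.length) {
			const index = next++;
			results[index] = await worker(items[index], index);
		}
	}
	const laneCount = Math.max(1, Math.min(limit, items.length));
	await Promise.all(Array.from({ length: laneCount }, () => lane()));
	return results;
}
```

Doubling the cap only doubles throughput while the network and the remote server keep up, so any speed-up is best measured rather than assumed.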

Questions

Q1: what is page_filter(page_data){ ...}

Q2 script speed limit and internet connection:
Is the speed I observe:

  1. limited by some setting within your code?
  2. purely dependent on my internet connection?

If 2., then:

  • my benchmark may have been affected by variations in my local internet connection
  • someone running my exact script from San Francisco could reach much higher download speeds. If this issue seems relevant, please open a new ticket for a speed-limiting option.

Q3 max_threads: if I set max_threads: 8, I should be 2 times faster than max_threads: 4, right ?
I tested shortly with 12, seems true.

Q4 depth: is it possible to limit category depth ? how ?

Q5 depth loop: is there a risk of depth infinite loop ? (Category:A contains Category:B which contains Category:A)

Q6 header: is the XHR header properly set? See meta:User-Agent_policy

Q7 resilience: Is the script resilient ?

  • 7a) can survive poor and temporarily interrupted internet connection ?
  • 7b) can resume where it ended ?

Q8 ogg: how to download alternative file format such as .mp3 from .wav ?
Example : file in category File:LL-Q7737_(rus)-1Apollinariya1-кофе.wav -> file to download LL-Q7737_(rus)-1Apollinariya1-кофе.mp3
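
Regarding Q4 and Q5, a depth limit and loop protection are usually obtained with a breadth-first walk that remembers visited categories. The sketch below runs on a hypothetical in-memory graph, not on the MediaWiki API, and says nothing about how wikiapi itself traverses categories.

```javascript
// Hypothetical category graph with a cycle: A -> B -> A.
const graph = {
	'Category:A': { subcategories: ['Category:B'], files: ['File:a.wav'] },
	'Category:B': { subcategories: ['Category:A', 'Category:C'], files: ['File:b.wav'] },
	'Category:C': { subcategories: [], files: ['File:c.wav'] },
};

// Collect files under `root`, descending at most `maxDepth` levels of
// subcategories and visiting each category once so that cycles terminate.
function collectFiles(categoryGraph, root, maxDepth) {
	const visited = new Set([root]);
	const files = [];
	let level = [root];
	for (let depth = 0; level.length > 0 && depth <= maxDepth; depth++) {
		const nextLevel = [];
		for (const category of level) {
			const node = categoryGraph[category];
			if (!node) continue;
			files.push(...node.files);
			for (const sub of node.subcategories) {
				if (!visited.has(sub)) {
					visited.add(sub);
					nextLevel.push(sub);
				}
			}
		}
		level = nextLevel;
	}
	return files;
}

console.log(collectFiles(graph, 'Category:A', 1));
```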

Feature Request: Verbose mode

I noticed a lot of commented console.logs and console.traces in the code.

It would be nice if those could be toggled on/off with a flag. It would help when running in CI so that bugs from production edits can be caught more easily.
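
One possible shape for such a flag, as a sketch only: `createLogger` and the `WIKIAPI_VERBOSE` environment variable are invented names for illustration and are not existing wikiapi options.

```javascript
// Hypothetical logger whose debug output is toggled by a flag.
function createLogger({ verbose = false, sink = console.log } = {}) {
	return {
		// Always printed.
		info: (...args) => sink(...args),
		// Printed only in verbose mode.
		debug: (...args) => {
			if (verbose) sink('[debug]', ...args);
		},
	};
}

// Example: enable verbose output from the environment (assumed variable name).
const logger = createLogger({ verbose: process.env.WIKIAPI_VERBOSE === '1' });
logger.debug('request parameters', { action: 'query' });
```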

Nice library btw, very versatile :D

`TypeError: item.includes is not a function` when uploading.

Following up on yesterday's upload issue. Further down the road I now run into this:

$ node ./create-Commons-upload.js
Connecting...
get_API_parameters: Set enwiki: path=query+siteinfo                    # <------- Side issue: why talk about enwiki ? My target is Commons.
get_API_parameters: Set commonswiki: path=query+siteinfo
Username ShufaBot@ShufaBot is connected !
wiki_API_page: Not exists: [[:File:A-red.png]]                         # <------- this is good, I want to upload when page is empty
get_API_parameters: Set commonswiki: path=upload
/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:553
			: !item.includes(boundary);
			        ^

TypeError: item.includes is not a function                              # <------- This is the rocks on the road ! 
    at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:553:12)
    at Array.every (<anonymous>)
    at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:551:38)
    at Array.every (<anonymous>)
    at not_includes_in (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:551:38)
    at give_boundary (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:570:9)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:712:5)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:790:4)
    at push_and_callback (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:630:5)
    at get_file_object (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:661:5)

[[*/Stop]]

In console I see :

wiki_API_page: Not exists: [[User talk:Dragons Bot@Dragons Bot/Stop]]

We should expect :

wiki_API_page: Not exists: [[User talk:Dragons Bot/Stop]]

no ?

Failed to get local file while the path is correct.

commons-upload.js:

// PURPOSE: Script uploads file 丁-red.png and similar to Commons.
const Wikiapi= require('wikiapi');
const fetch  = require('node-fetch');
const fs     = require('fs');
const logins = require('./logins-ShufaBot.js');
const files  = require('./data/zi-reds.js');     // PROVIDES THE CHARACTERS

// Edit login credentials
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api,
    wikicode = '',
    zi='',
    path='';
    console.log('Username:', USER);

(async () => {
    // Connects
    const targetWiki = new Wikiapi;
    await targetWiki.login(USER, PASS, API);
    console.log('Connected!');
    // upload file / media
    for(let i=0;i<files.length;i++){
        zi = files[i].zi,
        wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`,
        path = `/home/yug/Documents/DragonsBot/reds/png/${zi}-red.png`;
	    let result = await targetWiki.upload({ 
            file_path: path,
            comment: `Upload red stroke order for Chinese character ${zi}.`, 
            text: wikicode 
        });
    }
})();

Then:

$ node ./create-Commons-upload.js --unhandled-rejections=strict
Username: ShufaBot@ShufaBot
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Connected!
to_form_data: Failed to get file: [/home/yug/Documents/DragonsBot/reds/png/a-red.png]
(node:50492) UnhandledPromiseRejectionWarning: #<Object>
(Use `node --trace-warnings ...` to show where the warning was created)
(node:50492) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:50492) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

I copy-pasted that path into Chrome and the image shows up, so the path is correct.

Example file, to save as 丁-red.png :
丁-red.png

ASCII file, to save as a-red.png :
a-red.png

Both 丁-red.png and a-red.png return the same error message, so it is not an encoding problem.

March 2021 review

This issue is only a review and outline of associated issues. Please discuss on related issues.

General review

I like writing documentation and plan to make a push on the wikiapi README.md.[1] I still have to understand the project better. Today I understood that you base your parameter naming convention on the MediaWiki API. 🥇 I still have more to understand before tackling your README.md. Your project is elegant, the code is easy to write and understand, and development is active, which is great. Your project has two major weak points:

| Key | Point | Actionable |
|---|---|---|
| Doc | The manual via examples is insufficient and noisy; too many things (parameters) are opaque. | Needs a revamp of the README.md. |
| Devs | The lack of a multi-developer community. Everything rests on you; it's not distributed enough. | Needs to onboard 1~2 other JS devs of your level, or who understand your ES6 code. |

A better README will help attract new devs. While @kanasimi works great alone, for sustainability it is better to have teammates and reviewers who can fix things when you, Kanasimi, can't for various reasons.

[1]: this is an open-source "plan". Plans only become real once they are done, so anyone who can do something first should go ahead and propose improvements.

Related issues

  • #30 API documentation revamp
  • #32 README.md revamp
  • #35 Boilerplate wikiapi bot based on Dragons Bot

Late spring 2021

  • Call for more users.
  • List of previous forkers : @hugolpz @maltejur @pcjtulsa @trustedtomato @vangberg

Summer 2021 / optional

  • Call for more devs
  • #31 JSdoc3 css customization

README: Question on edit page

In the README, the section on editing a page does something I don't understand.

Here's the snippet in question

// edit page
(async () => {
	const enwiki = new Wikiapi;
	await enwiki.login('bot name', 'password', 'en');
	let page_data = await enwiki.page('Wikipedia:Sandbox');
	await enwiki.edit(function(page_data) {
		return page_data.wikitext
			+ '\nTest edit using {{GitHub|kanasimi/wikiapi}}.';
	}, {bot: 1});

	// alternative method
	await enwiki.edit_page('Wikipedia:Sandbox', function(page_data) {
		return page_data.wikitext
			+ '\nTest edit using {{GitHub|kanasimi/wikiapi}}.';
	}, {bot: 1, nocreate: 1, minor: 1});

	console.log('Done.');
})();

It first retrieves the page data with let page_data = await enwiki.page('Wikipedia:Sandbox'); but then never uses page_data. Inside all of the edit and edit_page methods, page_data is passed in by the caller and shadows the outer variable.

Does this mean edit will do a .page() call internally for you? Is the await enwiki.page() call not needed then? Or should it be that edit actually doesn't get passed page_data?
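
The shadowing itself can be shown in isolation. In this sketch `editLike` is a hypothetical stand-in that passes its own object to the callback; it makes no claim about what wikiapi's `edit` or `edit_page` do internally.

```javascript
// The outer constant and the callback parameter share a name
// but are two distinct bindings.
const page_data = { wikitext: 'outer' };

// Hypothetical stand-in for an edit-like method:
// it supplies its own page object to the callback.
function editLike(callback) {
	const internalPage = { wikitext: 'supplied by the method' };
	return callback(internalPage);
}

const result = editLike(function (page_data) {
	// Here page_data is the parameter, not the outer constant.
	return page_data.wikitext;
});
console.log(result);
```

Whatever the method passes in is what the callback sees, so the outer `page_data` is never read inside it.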

`.download()`: `session.to_namespace('Category Name', 'Category')` needed

Given core script.js :

const Wikiapi= require('wikiapi');

(async () => {
    var targetwiki = new Wikiapi('commons');
    try {
    //    await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation-cmn', 'Category'), { directory: './downloads',max_threads: 4 });
    await targetwiki.download(
        'Lingua_Libre_pronunciation-cmn',
        {
            directory: './',
            max_threads: 4,
            page_filter(page_data) {
                console.log('@Yug: ',JSON.stringify(page_data))
                return true;
            }
        });
    } catch (error) { console.log(`Download error : ${JSON.stringify(error)}`) }

})();

It failed to download, and simply returned :

yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node ./wiki-download-many-category_tree_and_files.js
get_API_parameters: Cache commonswiki: path=query+siteinfo
get_API_parameters: Cache commonswiki: path=query+imageinfo

(Also, I don't understand the role of page_filter(), so I neutralised it.)

April 2021 Public announcement !

! ===========================================================================================
! DRAFT MESSAGE - to publish when ready to receive error reports and PRs                     
! ===========================================================================================

TL;DR: Wikiapi documentation has been created, as well as WikiapiJS Eggs, a kick-starter project covering classic use cases.
Hello @kanasimi,
Hello @acagastya @maltejur @pcjtulsa @trustedtomato @vangberg,
March 2021 has been a very active month for WikiapiJS, and we are happy to announce two major milestones for the project.

Wikiapi documentation (new, alpha)

Wikiapi welcomes a proper JSDoc3-based documentation. This makes learning and coding much easier.

WikiapiJS Eggs (new, alpha)

Welcome also to WikiapiJS Eggs ! 🥚 🐉 WikiapiJS Eggs is the demo kickstarter for WikiapiJS. Install node, git clone WikiapiJS Eggs, add your username+password+target_site, and you are ready to mass fire on your wiki 😎

Mini-community message

Both the Wikiapi documentation and WikiapiJS Eggs are in alpha release. While they add great value, we expect some errors. You are welcome to create issues or PRs when you see something to improve.

-- Kanasimi & Hugolpz

Upload of missing file crashes script.

I started a script to upload 10 files. One target file is missing:

  • path: ../DragonsBot/SOP/亠-red.png

Node script (relevant section):

            try{ 
                // upload
                let result = await targetWiki.upload({
                    file_path: `../DragonsBot/SOP/${zi}-${locale}${major}.${extension}`,
                    comment: `Upload Chinese radical ${zi} in red style, raster format.`, // <------------------------ TO EDIT
                    text: wikicode,
                    ignorewarnings: 1, // overwrite or create,
                    author: '[[User:Yug]]',
                    categories:[``]
                });
            }catch(e){ console.error(e); }

The node shell/script crashes on it, returning :

get_API_parameters: Set commonswiki: path=upload
internal/fs/utils.js:259
    throw err;
    ^

Error: ENOENT: no such file or directory, open '../DragonsBot/SOP/亠-red.png'
    at Object.openSync (fs.js:461:3)
    at Object.readFileSync (fs.js:364:35)
    at get_file_object (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:638:24)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:759:5)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
    at process_next (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:791:4)
    at to_form_data (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:793:3)
    at get_URL_node (/home/yug/Documents/DragonsBot/node_modules/cejs/application/net/Ajax.js:1395:4) {
  errno: -2,
  syscall: 'open',
  code: 'ENOENT',
  path: '../DragonsBot/SOP/亠-red.png'
}

I expected the script to report "file missing", skip that file, and move on to the next target.

The script works properly when the files exist.

Note/idea: if there are both successful and missing files, it may be useful to collect the lists of successful and failed uploads and print them when the script ends.
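
Until the library handles this, a caller-side guard could look like the sketch below. `uploadAll` is a hypothetical helper, and `uploadOne` is a placeholder for whatever actually performs the upload (for example a wrapper around the upload call shown above).

```javascript
const fs = require('node:fs');

// Upload every existing file, skip missing ones, and return a report.
// uploadOne: caller-supplied async function taking a file path (placeholder).
async function uploadAll(filePaths, uploadOne) {
	const report = { uploaded: [], missing: [], failed: [] };
	for (const filePath of filePaths) {
		if (!fs.existsSync(filePath)) {
			// Record the missing file and move on to the next target.
			report.missing.push(filePath);
			continue;
		}
		try {
			await uploadOne(filePath);
			report.uploaded.push(filePath);
		} catch (error) {
			report.failed.push({ filePath, error });
		}
	}
	return report;
}
```

Printing the returned report once the loop ends gives the summary suggested in the note above.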

README.md `page_data.wikitext` working ?

You may want to check page_data.wikitext in your readme.md, which is not aligned with your data structure.

console.log(pagedata)
pagedata {
  pageid: 438578,
  ns: 3,
  title: 'User talk:Dragons Bot',
  revisions: [
    {
      revid: 464938,
      parentid: 464937,
      timestamp: '2021-03-01T22:36:55Z',
      contentformat: 'text/x-wiki',
      contentmodel: 'wikitext',
      '*': 'undefined\n' +
        '2\n' +
        "Hi, I'm Dragons Bot ! I plan to upload lists and others maintenances."
    }
  ],
  convert_from: 'User_talk:Dragons_Bot',
  original_title: 'User_talk:Dragons_Bot',
  is_Flow: false
}

My code

const Wikiapi = require('wikiapi');
const fetch = require("node-fetch");

 var newContent = `\nHi, I'm Dragons Bot ! I plan to upload lists and others maintenances.`;

/* (async () => {
	const url1 = 'https://raw.githubusercontent.com/lingua-libre/unilex/master/data/frequency/ig.txt'
	const response = await fetch(url1);
	const data = await response.text();
	console.log('Fetch: Done.');
})(); */

// edit page: method 2
(async () => {
	const targetWiki = new Wikiapi;
	await targetWiki.login('user', 'pass', 'en');

	await targetWiki.edit_page('Wikipedia:Sandbox', function(page_data) {
		console.log('pagedata',page_data)
		console.log('wikitext',page_data.revisions[0]['*'])
		return page_data.revisions[0]['*']                                 // <------- `return page_data.wikitext` from your readme.md but actually undefined
			+ '\n2'+newContent;
	}, {bot: 1, nocreate: 1, minor: 1});
	console.log('Edit: Done.');

})();
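
The dump above shows the text under `revisions[0]['*']`, while the README reads `page_data.wikitext`. A defensive accessor covering both shapes might look like this; `getWikitext` is a hypothetical helper, not a wikiapi function.

```javascript
// Return the page text from either page_data.wikitext or the
// revisions[0]['*'] field seen in the dump above; undefined if neither exists.
function getWikitext(page_data) {
	if (typeof page_data.wikitext === 'string') {
		return page_data.wikitext;
	}
	const revision = page_data.revisions && page_data.revisions[0];
	if (revision && typeof revision['*'] === 'string') {
		return revision['*'];
	}
	return undefined;
}

// Shape taken from the console output above (shortened).
const sample = {
	title: 'User talk:Dragons Bot',
	revisions: [{ contentmodel: 'wikitext', '*': 'Hi' }],
};
console.log(getWikitext(sample));
```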

API documentation revamp (JSDoc migration)

API actionables

  • Identify and list the core steps or js methods (here)
  • Demo initial documentation via D3js documentation style
  • Explore documentation generators
  • JSdoc3 implementation.
  • Docdash theme installation
  • Improve in-code comments according to JSdoc3 syntax
  • Set and improve github actions

Accessing wikitext parser from wikiapi

Hi kanasimi,
I just started using this to parse content from Wikipedia. I'm fetching infobox content and would like to parse some sub-elements of that content as wikitext. For example, the infobox has a string like this:

{{plainlist\n|*{{Jct|state=CA|SR|24}} in Oakland\n*{{Jct|state=CA|SR|123}} in [[Berkeley, California|Berkeley]]<!--might as well include it, since there are two lines-->\n}}

I'd like to use the parser on this string for easier processing, but it's not clear to me whether this is possible with the current implementation. Ultimately, it looks like this would mean surfacing the CeJS parser through wikiapi, or am I missing some alternative?

Thanks,
-Steve

Upload upon : no edit of the page ?

This is possibly not a bug.

I just uploaded over an existing file, and the page content did not change.

// upload
let result = await targetWiki.upload({ 
  comment: `Upload Chinese radical ${zi} in red style, raster format.`,
  text: `{{SOlicense|${zi}|${locale}|${major}.${extension}||${rad[0].replace(/^0+/,'')}|date=${today}|author=[[User:Yug|]]|license=PD-font|fontname=Arphic PL UKai}}`
  +`\n{{MakeMeAHanzi}}\n{{AnimeCJK}}`
  +`\n{{Rcat|{{Radnum|${rad[0]}}}|${rad[0]}|0||x}}`,
  filepath : `../DragonsBot/SOP/${zi}-${locale}${major}.${extension}`,
  ignorewarnings: 1, // overwrite or create,
  author: '[[User:Yug]]',
  categories:[`ShufaBot test: upload`]
});

Parameter comment is not effective.
Parameter text is not effective.
Parameter author is not effective.
Parameter categories is not effective.

So I plan an edit round thereafter.

TypeError: CeL.run is not a function on Debian using Node v12.18.3

I'm not having much luck using this package on Debian 4.17.171-2 with Node version v12.18.3 and yarn v1.22.5. The package fails to load properly. I'm developing on macOS using Node v12.18.3 using yarn v1.22.5 on my dev machine and the package works just fine but when I install on Debian I'm seeing these errors:

I'm curious if you might have seen/heard of this?

TypeError: Cannot assign to read only property 'name' of function 'function env(name, value) {
if (!name)
     // return;
         return undefined;
     // typeof value !== 'undefi...<omitted>...    }'
 at Function.assign (<anonymous>)
 at Function._.reset_env (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2),
 at eval (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2), <anonymous>:4747:5)
 at eval (eval at <anonymous> (/node_modules/cejs/_for include/node.loader.js:79:2), <anonymous>:4751:3)

And CeL fails to load so the first dependent line to execute then fails.

TypeError: CeL.run is not a function
   at Object.<anonymous> (/node_modules/wikiapi/wikiapi.js:19:5)
   at Module._compile (internal/modules/cjs/loader.js:1137:30)

Cannot download : triggerUncaughtException

Doesn't work 😭 .

Input

Target category is Category:Lingua_Libre_pronunciation, exists, and contains sub-categories.
(Also tested with Category:Lingua_Libre_pronunciation-cmn, exists, and only contains files. Same results.)

The directory in which I want to store all those downloads is ./downloads and this directory exists.

Full code (base)

// PURPOSE: Given a root category, the script lists subcategories and downloads all files from them into dedicated directories.
// Run: $node script.js
const Wikiapi= require('wikiapi');
const logins = require('./logins.js');

// Login credentials from .login*.js
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;

(async () => {
    // Connect
    var targetwiki = new Wikiapi;
    await targetwiki.login(USER, PASS, API);
    console.log(`Username ${USER.split('@')[0]} is connected !`);

    // CORE CODE
    await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });

})();

Variations

Cannot download

{ directory: './downloads' }

    await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });
yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js 
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Cannot download Category:Lingua Libre pronunciation
node:internal/process/promises:246
          triggerUncaughtException(err, true /* fromPromise */);
          ^

[UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "[object Array]".] {
  code: 'ERR_UNHANDLED_REJECTION'
}

Node.js v17.0.1

Variations within try { ... } catch (error) { ... }

Errors are similar for :

Cannot download

{ directory: './downloads' }

try {
   await targetwiki.download(targetwiki.to_namespace('Lingua_Libre_pronunciation', 'Category'), { directory: './downloads' });
} catch (error) { console.log(`Download error : ${JSON.stringify(error)}`) }
yug@yug-k401ub:~/Documents/WikiapiJS-Eggs$ node wiki-category_tree-many.js 
get_API_parameters: Cache commonswiki: path=query+siteinfo
Username ShufaBot is connected !
Cannot download Category:Lingua Libre pronunciation
Download error : [{"pageid":69811642,"ns":14,"title":"Category:Lingua Libre pronunciation"}]
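
As the two runs show, the rejection reason here is an array rather than an `Error`, which is why the uncaught version only prints "[object Array]". A small formatter that copes with both kinds of reason could look like this; `describeRejection` is a hypothetical helper.

```javascript
// Turn any rejection reason into a readable string.
function describeRejection(reason) {
	if (reason instanceof Error) {
		// Errors carry their message (and usually a stack trace).
		return reason.stack || reason.message;
	}
	if (typeof reason === 'string') {
		return reason;
	}
	try {
		const json = JSON.stringify(reason);
		return json === undefined ? String(reason) : json;
	} catch (error) {
		// For example circular structures.
		return String(reason);
	}
}

// Reason shape taken from the output above.
console.log(describeRejection([{ pageid: 69811642, ns: 14, title: 'Category:Lingua Libre pronunciation' }]));
```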

Example broken on OSX

Machine:

OSX Big Sur, node.js: 7.22.0

Code:

const Wikiapi = require('wikiapi');

(async()=>{
    const wiki = new Wikiapi('zn');		
    let page_data = await wiki.page('Universe', {});
	console.log('page_data: ', page_data);
	console.log('page_data.text: ', page_data.wikitext)
})()

Output:

/Users/name/Documents/fun dev/wikiScraper/node_modules/cejs/application/net/wiki.js:201
                && API_URL.includes('://')) {
                           ^

TypeError: API_URL.includes is not a function
    at new wiki_API (/Users/name/Documents/fun dev/wikiScraper/node_modules/cejs/application/net/wiki.js:201:14)
    at Wikiapi (/Users/name/Documents/fun dev/wikiScraper/node_modules/Wikiapi/Wikiapi.js:92:23)
    at Object.<anonymous> (/Users/name/Documents/fun dev/wikiScraper/index.js:3:1)

etc:

I can only run the example that pulls the hanzi character from the page; every other example raises this error, as does using English Wikipedia or any other input.

(I don't know if this is user error or not)

.upload_file is not a function

const Wikiapi= require('wikiapi');
const fetch  = require('node-fetch');
const fs     = require('fs');
const logins = require('./logins-ShufaBot.js');
const files  = require('./data/zi-reds.js');     // to get the list: ls -1 *.svg > zi-reds.js
  
// PURPOSE: Script uploads file 丁-red.png and similar to Commons.

// Edit login credentials
var USER = logins.commons.user,
	PASS = logins.commons.pass,
	API  = logins.commons.api;
    wikicode = '';
    console.log('Username:', USER);

(async () => {
    // Connect
    console.log('Connects');
    const targetWiki = new Wikiapi;
    await targetWiki.login(USER, PASS, API);
    console.log('Connected!');
    // upload file / media
    for(i=0;i<files.length;i++){
        let zi = files[i].zi;
        wikicode=`{{SOlicense|${zi}||red.png||license=PD}}`;
	    let result = await targetWiki.upload_file({ 
                file_path: `./reds/png/${zi}.png`,
                comment: `Upload red stroke order for Chinese character ${zi}.`, 
                text: wikicode 
        });
    }
})();

Then:

$ node ./create-Commons-upload.js --trace-warnings
Username: ShufaBot@ShufaBot             # personal console.log
Connects
get_API_parameters: Set enwiki: path=query+siteinfo
get_API_parameters: Set commonswiki: path=query+siteinfo
Connected!                                              # personal console.log
(node:48717) UnhandledPromiseRejectionWarning: TypeError: targetWiki.upload_file is not a function
    at /home/yug/Documents/DragonsBot/create-Commons-upload.js:25:36
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
(Use `node --trace-warnings ...` to show where the warning was created)
(node:48717) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:48717) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

`.download()` scalability ?

Hi there,
I'm using WikiapiJS to code a wikiapi-egg (script) which will download all Commons files from target categories. My 3 largest target categories currently have about 50k audio files each, each file being about 1.5KB. Do you know:

  • Q1: Does WikiapiJS have such scale-up in mind?
  • Q2: What are the API limitations for such mass downloads?
    • Listing items: the MediaWiki API has a categorymembers limit: cmlimit=500 for regular users, cmlimit=5000 with the apihighlimits user right.
    • Downloads: I don't see limits on the downloads themselves.
  • Q3: Do you handle categories with more than 500 files successfully? (API limit)
  • Q4: Do you skip already downloaded files efficiently (quickly)?
  • Q5: Do you compare local and remote file creation dates so as to re-download from Commons when a new version is available?
  • Q6: What should I avoid in order not to be blocked?

Scale up

The goal is to provide the public with direct and convenient dumps of LinguaLibre's audio assets on a per-language basis. We want to create periodic (weekly?) dumps on our Lili server.

We want to keep a local dump synchronized with Wikimedia Commons. We are talking about 700,000 files so far. According to the measured test durations, the initial synchronization would take 21 days, which is OK.
But a later update, a week after, would require about 15 days even though only 1~2% of the files (7,000-15,000) are new and need downloading.

Do you see any possible optimization?

WikiapiJS download worked on tiny categories (12 files). See the code in #48.
I'm currently reluctant to test further for fear of being banned.


.download() benchmark (1)

Ok, I decided to test anyway on a category with n=369.

  • Initial attempt:
    • categorymembers=369
    • downloads=369
    • running time: 16min or 960sec --> 2.7s./file
  • Removed 14 files from local directory
  • Update attempt:
    • categorymembers=369
    • downloads=14
    • running time: 9min or 540sec --> 38.6s./file
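
To see what these figures imply at scale, the per-file rate can be extrapolated. This is a rough sketch that assumes the rate stays constant; `estimateDays` is a hypothetical helper, and the inputs are the numbers reported in this issue (2.7 s per file, 700,000 files).

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Extrapolate a constant per-file duration to a number of days.
function estimateDays(fileCount, secondsPerFile) {
	return (fileCount * secondsPerFile) / SECONDS_PER_DAY;
}

// Initial synchronization: every file is downloaded at the measured rate.
const initialSyncDays = estimateDays(700000, 2.7);
console.log(`initial sync: about ${initialSyncDays.toFixed(1)} days`);

// Per-file cost of the update attempt above: 540 s for 14 downloads.
const updateSecondsPerDownload = 540 / 14;
console.log(`update cost: about ${updateSecondsPerDownload.toFixed(1)} s per downloaded file`);
```

The gap between the two per-file costs suggests that, in the update run, most of the time goes to work other than downloading.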
