Comments (6)
Hi! This is actually available already. You can refer to home/wiki/Help:Import/Command-line within XOWA. There's a section for command-line import at: home/wiki/Help:Contents
Let me know if you run into any issues, or if the instructions aren't clear. Thanks!
from xowa.
Good to hear. I can't find the instructions you are referring to. Are they somewhere at https://gnosygnu.github.io/xowa/?
from xowa.
Nope. XOWA currently has most of its documentation within the app. In this case, you would do the following:
- Start XOWA
- Copy-paste "home/wiki/Help:Import/Command-line" into the URL bar
I list the wikitext below, but you're better off reading it within XOWA.
I am planning to upload these to https://gnosygnu.github.io/xowa/. However, there are a lot of pages and I'd like to automate generation and synchronization of them. If I can't get around to coding a system in the next few months, I'll just upload them all by hand.
XOWA can import a wiki using a plain text file and a command-line.
{{Help/Css}}
== Import simple.wikipedia.org through the command-line ==
* Open up a command-line. For example, on Windows, run <span class='bold'>cmd</span>
* Run the following: <span class='console'>java -jar {{#invoke:Xowa_url|plat_jar}} --cmd_file {{#invoke:Xowa_url|plat_url|xowa_build.gfs}} --app_mode cmd</span>
* Wait about 10 minutes for the script to complete
* Launch XOWA and enter <span class='url'>simple.wikipedia.org</span> in the URL bar
== Import a different wiki by editing the build script ==
* Open the following file in a [[Help:Text_editor|text editor]]: <span class='path'>{{#invoke:Xowa_url|plat_url|xowa_build.gfs}}</span>. See Script below for the full text.
* Replace all instances of <span class='bold'>simple.wikipedia.org</span> with the domain name. For example, for English Wikipedia, use <span class='bold'>en.wikipedia.org</span>
* Run the command-line import again.
* Launch XOWA and enter the domain name in the URL bar.
== Import a wiki with a manual download ==
=== Download the wiki dump ===
* Navigate to https://dumps.wikimedia.org/enwiki
* Click on the '''latest''' directory
* Download the file just under "'''Articles, templates, media/file descriptions, and primary meta-pages.'''". It should read '''enwiki-latest-pages-articles.xml.bz2'''
: The download is 11+ GB and may take anywhere between 2 and 5 hours to complete.
: If you also want talk pages, you should download the "'''Recombine all pages, current versions only.'''" version. It should read '''enwiki-latest-pages-meta-current.xml.bz2'''. Note that this dump is twice the size of the regular dump.
=== Specify location of the wiki dump ===
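* In the build script, replace the <span class='code'>text.init</span> line with the following, pointing <span class='code'>src_bz2_fil</span> at the downloaded dump file:
: <span class='code'>add ('simple.wikipedia.org', 'text.init') {src_bz2_fil = '/your_directory/simplewiki-20130103-pages-articles.xml.bz2';}</span>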
== Script ==
<pre class='code'>
// do not show a "Press enter to continue" at the end of the script
app.bldr.pause_at_end = 'n';
// run xowa.gfs
app.scripts.run_file_by_type('xowa_cfg_app');
// import wiki; for more info see [[Help:Import/Command-line]]
app.bldr.cmds {
  // delete all files in directory; note that subdirectories and file databases ("-file.xowa") will not be deleted
  add ('simple.wikipedia.org' , 'util.cleanup') {delete_all = 'y';}
  // download main dump file; contains all articles
  add ('simple.wikipedia.org' , 'util.download') {dump_type = 'pages-articles';}
  // download categorylinks file; contains links from category to pages
  add ('simple.wikipedia.org' , 'util.download') {dump_type = 'categorylinks';}
  // download page_props file; contains information on hidden categories
  add ('simple.wikipedia.org' , 'util.download') {dump_type = 'page_props';}
  // start wiki import
  add ('simple.wikipedia.org' , 'text.init');
  // import articles
  add ('simple.wikipedia.org' , 'text.page');
  // generate search data
  add ('simple.wikipedia.org' , 'text.search');
  // generate main category data
  add ('simple.wikipedia.org' , 'text.cat.core');
  // import category links
  add ('simple.wikipedia.org' , 'text.cat.link');
  // apply hidden categories
  add ('simple.wikipedia.org' , 'text.cat.hidden');
  // end import
  add ('simple.wikipedia.org' , 'text.term');
  // import css into wiki
  add ('simple.wikipedia.org' , 'text.css');
  // cleanup temp files; delete xml and bz2
  add ('simple.wikipedia.org' , 'util.cleanup') {delete_tmp = 'y'; delete_by_match('*.xml|*.sql|*.bz2|*.gz');}
}
// run cmds
app.bldr.run;
</pre>
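If you are setting up several wikis without a UI, the "replace all instances of simple.wikipedia.org" step can be scripted instead of done in a text editor. The following is only a rough Python sketch and is not part of XOWA: the helper names `retarget_build_script` and `retarget_build_file` are hypothetical, and it assumes the build script is a plain UTF-8 text file in which the domain appears only where it should be swapped.

```python
from pathlib import Path

# Domain used throughout the build script shown above.
DEFAULT_DOMAIN = "simple.wikipedia.org"


def retarget_build_script(script_text: str, new_domain: str,
                          old_domain: str = DEFAULT_DOMAIN) -> str:
    """Return script_text with every occurrence of old_domain replaced by new_domain."""
    if not new_domain.strip():
        raise ValueError("new_domain must not be empty")
    return script_text.replace(old_domain, new_domain)


def retarget_build_file(src: Path, dst: Path, new_domain: str) -> int:
    """Write a retargeted copy of src to dst and return how many replacements were made.

    Hypothetical convenience wrapper; src is left untouched.
    """
    text = src.read_text(encoding="utf-8")
    count = text.count(DEFAULT_DOMAIN)
    dst.write_text(retarget_build_script(text, new_domain), encoding="utf-8")
    return count
```

For example, `retarget_build_file(Path("xowa_build.gfs"), Path("xowa_build_en.gfs"), "en.wikipedia.org")` would write a copy aimed at English Wikipedia, which you could then pass to `--cmd_file`. Writing to a separate file keeps the original script available for comparison.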
from xowa.
Thanks. One of the steps is "Launch XOWA" that involves starting a UI. So actually there's no way to use it in completely headless mode, right?
from xowa.
So actually there's no way to use it in completely headless mode, right?
Well, there are two other ways, but I'm not sure how they'll work for your environment:
Run XOWA as an HTTP-server
- Open up a shell
- Run the following: java -jar xowa_linux_64.jar --app_mode http_server
- Open up a web-browser
- Navigate to any of the following links:
Run XOWA in command-line mode
- Open up a shell
- Run either of the following:
- (wikitext) java -jar xowa_linux_64.jar --app_mode cmd --show_license n --show_args n --cmd_text "app.shell.fetch_page('Help:Import/Command-line' 'wiki');"
- (html) java -jar xowa_linux_64.jar --app_mode cmd --show_license n --show_args n --cmd_text "app.shell.fetch_page('Help:Import/Command-line' 'html');"
- Read the text in the shell, or pipe to a text file and read in a text-editor
Let me know if neither of the above works. Thanks.
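If you want to drive the command-line mode from a script rather than typing it each time, something along these lines might work. It is only a sketch in Python: the function names `build_fetch_command` and `fetch_page_to_file` are made up for illustration, the jar name and flags are copied from the commands above, and the `fetch_page` argument syntax is reproduced exactly as written there. It assumes `java` is on the PATH and that the jar is in the current directory.

```python
import subprocess
from typing import List


def build_fetch_command(page: str, fmt: str = "html",
                        jar: str = "xowa_linux_64.jar") -> List[str]:
    """Build the argument list for the command-line fetch shown above."""
    if fmt not in ("wiki", "html"):
        raise ValueError("fmt must be 'wiki' or 'html'")
    if "'" in page:
        # The page title is embedded in a single-quoted script string.
        raise ValueError("page title must not contain a single quote")
    cmd_text = f"app.shell.fetch_page('{page}' '{fmt}');"
    return [
        "java", "-jar", jar,
        "--app_mode", "cmd",
        "--show_license", "n",
        "--show_args", "n",
        "--cmd_text", cmd_text,
    ]


def fetch_page_to_file(page: str, out_path: str, fmt: str = "html",
                       jar: str = "xowa_linux_64.jar") -> None:
    """Run the command and redirect its standard output to out_path."""
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(build_fetch_command(page, fmt, jar), stdout=out, check=True)
```

Calling `fetch_page_to_file("Help:Import/Command-line", "import_help.html")` would be the scripted equivalent of piping the shell output to a text file. Passing the arguments as a list avoids shell quoting problems with the embedded single quotes.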
from xowa.
I'm going to mark this ticket closed. The command-line route should handle setup without a UI. Keep in mind this is what I use to generate all the wikis for archive.org. If you have questions, please feel free to reopen the issue and I'll respond. Thanks.
from xowa.