Comments (5)
We are considering adding a feature that dumps the in-memory gstore to binary files on DFS and reloads them, but it is not ready yet. In the meantime, here are some tips that might help you.
- After loading the dataset, you can write new queries and run them as long as you don't need to change the dataset.
- Most configuration items can be changed at runtime with the “config -s XXX” command, without restarting Wukong.
- To speed up dataset loading, you can
  - always split your input RDF data into multiple files (#files >= #machines * #engine-threads), since Wukong can load them in parallel.
  - replace “str_normal” with “str_normal_minimal”, which only contains the ID mappings used by your queries (if you don't need to read the query results in string format). Currently, Wukong loads the ID-mapping file (“str_normal” or “str_normal_minimal”) sequentially.
  - disable the planner in the “config” file (i.e., global_enable_planner), so that Wukong will not collect the data statistics used by the planner when loading the dataset. Currently, Wukong does this sequentially without optimizations. Note that after disabling the planner you must manually decide the order of the triple patterns and their access directions in your query (e.g., script/query/lubm_q*).
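The file-splitting tip can be sketched as a small standalone script. This is only an illustration, assuming the input is a line-oriented triple file (one triple per line); the function names and the "part_N.nt" output naming are hypothetical and are not part of Wukong, so adapt them to whatever file names your loading setup expects.

```python
from pathlib import Path
from typing import List


def min_file_count(num_machines: int, engine_threads: int) -> int:
    """Lower bound from the rule of thumb: #files >= #machines * #engine-threads."""
    return num_machines * engine_threads


def split_triples(src: Path, out_dir: Path, num_files: int) -> List[Path]:
    """Distribute the lines of `src` round-robin over `num_files` output files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme; rename to match what your loader expects.
    paths = [out_dir / f"part_{i}.nt" for i in range(num_files)]
    handles = [p.open("w", encoding="utf-8") for p in paths]
    try:
        with src.open(encoding="utf-8") as f:
            for n, line in enumerate(f):
                # Line n goes to file n mod num_files, so sizes stay balanced.
                handles[n % num_files].write(line)
    finally:
        for h in handles:
            h.close()
    return paths
```

For example, with 4 machines and 8 engine threads each, min_file_count(4, 8) gives 32, so the data would be split into at least 32 files.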
from wukong.
Thank you for the comment. I will try your advice. I have one question related to 3-iii (disabling the planner) and another about batch mode.
Q1 (related to 3-iii): I want to try queries other than script/query/lubm_q*. Are there any suggestions (or rules) for deciding the matching order of triple patterns?
Q2 (batch mode): The help command shows "-b a set of queries configured by (batch-mode)". Does batch mode mean that Wukong processes a batch of queries concurrently? How can I configure a set of queries in an input file? (I don't understand the meaning of the numbers in scripts/batch/mix_config.)
from wukong.
For Q1, we only have some simple heuristic rules. For selective queries, a good plan commonly starts from a normal vertex (a constant entity, like <http://www.Department0.University0.edu> in lubm_q4). For non-selective queries, a good plan commonly starts from an index vertex (a type or predicate, like ub:FullProfessor in lubm_q7 and ub:undergraduateDegreeFrom in lubm_q3) and favors relatively large pruning. Moreover, you can evaluate different plans on a small dataset and reuse the best one on a large dataset if the two datasets follow the same style.
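The starting-point heuristic above can be written down as a rough sketch. It is not Wukong's planner: the tuple representation of triple patterns, the function names, and the choice to try a type before a predicate when no constant entity exists are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def is_variable(term: str) -> bool:
    return term.startswith("?")


def is_constant_entity(term: str) -> bool:
    # Assumption: constant entities are written as full IRIs in angle brackets.
    return term.startswith("<") and term.endswith(">")


def choose_start(patterns: List[Triple]) -> Optional[Tuple[str, str]]:
    """Return (kind, term) for a suggested starting point, or None if nothing fits."""
    # Selective query: start from a normal vertex (a constant entity).
    for s, _p, o in patterns:
        for term in (s, o):
            if is_constant_entity(term):
                return ("normal", term)
    # Non-selective query: start from an index vertex.
    # Trying types before predicates is an arbitrary choice in this sketch.
    for _s, p, o in patterns:
        if p in ("rdf:type", "a") and not is_variable(o):
            return ("type-index", o)
    for _s, p, _o in patterns:
        if not is_variable(p):
            return ("predicate-index", p)
    return None
```

The sketch only picks where to start; the order of the remaining patterns and their access directions still have to be decided by hand, for instance by comparing candidate plans on a small dataset.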
For Q2, the batch mode is used to evaluate the throughput of Wukong (sorry for the confusing name): it generates a large number of queries from templates and continuously submits them to the engines. A configuration file (i.e., script/batch/mix_config) specifies the combination and submission rate of queries, which are defined in several query template files (i.e., script/batch/q?).
The only difference between a query file and a query template file is that the start constant vertex (e.g., <http://www.Department0.University0.edu> in script/query/lubm_q4) is replaced by a type (e.g., %ub:Department in script/batch/q4). Wukong's proxy first sends a query to retrieve a pool of all vertices belonging to the type, and then generates queries by randomly replacing the type in the query template with a vertex from the pool.
BTW, the real batch mode you wanted is easy to implement. You can first add a new command, which reads a configuration file with a set of query files, and then the proxy can send and receive the queries one-by-one or in parallel.
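A rough sketch of that idea, written as a standalone script rather than as a Wukong command: the configuration format (one query file path per line, with '#' comments) and the send_query callable are hypothetical stand-ins for however queries actually reach the proxy.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List


def read_batch_config(config: Path) -> List[Path]:
    """Read one query file path per line; blank lines and '#' comments are skipped."""
    files = []
    for line in config.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            files.append(Path(line))
    return files


def run_batch(files: List[Path],
              send_query: Callable[[str], str],
              parallel: int = 1) -> List[str]:
    """Send every query, one by one or in parallel; results keep the input order."""
    queries = [f.read_text(encoding="utf-8") for f in files]
    if parallel <= 1:
        return [send_query(q) for q in queries]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # Executor.map returns results in the order the queries were submitted.
        return list(pool.map(send_query, queries))
```

With parallel=1 the queries are sent one by one; a larger value submits them concurrently while still returning the results in the order of the configuration file.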
from wukong.
Thank you.
from wukong.
The new version (v0.2.0) supports running a batch of SPARQL queries with a single command.
from wukong.