salesforce / WikiSQL

A large annotated semantic parsing corpus for developing natural language interfaces.

License: BSD 3-Clause "New" or "Revised" License

Python 49.09% Dockerfile 0.55% HTML 50.36%
natural-language dataset database machine-learning natural-language-processing natural-language-interface

WikiSQL's Issues

[BUG] Unable to clone WikiSQL in a system

πŸ›πŸ› Bug Report

The installation instructions for WikiSQL given in the README are not working.

#639(2)

It is showing a warning.

#639(1)

The WikiSQL folder is also empty.

#639(3)

⚙️ Environment

  • Python version(s): [3.8.5]
  • OS: [Windows 10]

Phase 1 vs. phase 2

Hi,
I am confused by the phase 1 and phase 2 annotations in the dataset files.
The paper says phase 1 is a paraphrasing phase while phase 2 is a verification phase. As I understand it, phase 2 just discards wrong paraphrases, so what does it mean for a given example to have been collected in phase 1 versus phase 2?
Thanks,

--Ahmed

Requirements.txt versions

For those having trouble running the evaluation script in 2022 due to version updates in the dependencies: use the following requirements.txt to pin the packages to the versions from the original era (circa 2017–2018).

tqdm
sqlalchemy==1.2
records==0.5.3
babel==2.5.1
tabulate==0.8.1

Question Generation using a Template

Hi Team,

Could you please tell me about the question-generation template that was used to generate questions before the paraphrasing phase? Is the generation template/code part of this codebase?

How to get the original SQL query from the "sql"/"query" field, which is in JSON format

How can I convert the following "sql" field into the original SQL query?

{"phase": 1, "table_id": "1-1000181-1", "question": "Tell me what the notes are for South Australia ", "sql": {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0}}

I tried the lib.query.Query.from_dict method, but it gives SELECT col5 FROM table WHERE col3 = SOUTH AUSTRALIA,
and the lib.dbengine.DBEngine.execute_query method, but it gives SELECT col5 AS result FROM table_1_1000181_1 WHERE col3 = :col3.
Neither method produces the original query with real column names, so how can I get it? Can anybody help?
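Until the authors weigh in, one workaround is to rebuild a readable query yourself by joining the "sql" dict with the table's header list from the companion tables file. Below is a minimal sketch; the helper name `to_readable_sql` is mine, and the operator lists mirror the ones in lib/query.py:

```python
# Hypothetical helper: rebuild a human-readable SQL string from a WikiSQL
# "sql" dict plus the table's column headers (taken from tables.jsonl).
AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']   # as in lib/query.py
COND_OPS = ['=', '>', '<', 'OP']                      # as in lib/query.py

def to_readable_sql(sql, headers, table_name='table'):
    sel = headers[sql['sel']]
    agg = AGG_OPS[sql['agg']]
    select = f"{agg}({sel})" if agg else sel
    conds = ' AND '.join(f"{headers[col]} {COND_OPS[op]} '{val}'"
                         for col, op, val in sql['conds'])
    query = f"SELECT {select} FROM {table_name}"
    return f"{query} WHERE {conds}" if conds else query

headers = ["State/territory", "Text/background colour", "Format",
           "Current slogan", "Current series", "Notes"]
sql = {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0}
print(to_readable_sql(sql, headers))
# SELECT Notes FROM table WHERE Current slogan = 'SOUTH AUSTRALIA'
```

Note that this only produces a display string; it does not escape values or quote identifiers, so it is not safe to execute as-is.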

Incorrect Expected Queries

It seems many of the expected queries are incorrect, given the question posed. Here is a preliminary list of some questions I noticed with incorrect expected queries/answers:

Question: How many games was Damien Wilkins (27) the high scorer?
Expected query: SELECT MIN(Game) FROM 1-11964154-2 WHERE High points = 'damien wilkins (27)'
Expected result: ['6.0']

Question: What is the name of the integrated where allied-related is shared?
Expected query: SELECT (Component) FROM 1-11944282-1 WHERE Allied-Related = 'shared'
Expected result: ['customers']

Question: what is the integrated in which the holding allied-unrelated is many?
Expected query: SELECT (Holding) FROM 1-11944282-1 WHERE Allied-Unrelated = 'many'
Expected result: ['many']

Question: How many integrated allied-related are there?
Expected query: SELECT (Integrated) FROM 1-11944282-1 WHERE Allied-Related = 'many'
Expected result: ['one']

Question: Which authority has a rocket launch called rehbar-5?
Expected query: SELECT COUNT(Derivatives) FROM 1-11869952-1 WHERE Rocket launch = 'rehbar-5'
Expected result: ['1']

Question: Who had an evening gown score of 9.78?
Expected query: SELECT (Interview) FROM 1-11690135-1 WHERE Evening Gown = '9.78'
Expected result: ['8.91']

How to evaluate a Seq2Seq model ?

Hi,

I have built a seq2seq model that takes the question and generates the SQL query directly.

My question is: how do I evaluate my model, given that its predictions are flat sequences ("select ... from ... where ...")?

It's very urgent, please.

Thanks
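One option, pending an official answer, is to map each generated sequence back into WikiSQL's {sel, agg, conds} logical form and then reuse the official scorer. The parser below is a rough, illustrative sketch (function name and regexes are mine) that assumes predictions follow the canonical single-table grammar with AND-joined conditions:

```python
import re

AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
COND_OPS = ['=', '>', '<']

def parse_flat_query(seq, headers):
    """Parse "select ... from table where ..." into a WikiSQL-style sql dict."""
    seq = seq.lower().strip()
    lower = [h.lower() for h in headers]
    sel_part, _, where_part = seq.partition(' where ')
    m = re.match(r"select (?:(max|min|count|sum|avg)\()?(.+?)\)? from", sel_part)
    agg = AGG_OPS.index(m.group(1).upper()) if m.group(1) else 0
    sel = lower.index(m.group(2).strip())
    conds = []
    # Note: this naive split breaks if a condition value itself contains " and ".
    for cond in filter(None, where_part.split(' and ')):
        col, op, val = re.match(r"(.+?) (=|>|<) (.+)", cond.strip()).groups()
        conds.append([lower.index(col.strip()), COND_OPS.index(op),
                      val.strip().strip("'")])
    return {'sel': sel, 'agg': agg, 'conds': conds}

headers = ["State/territory", "Text/background colour", "Format",
           "Current slogan", "Current series", "Notes"]
pred = "SELECT Notes FROM table WHERE Current slogan = 'south australia'"
print(parse_flat_query(pred, headers))
# {'sel': 5, 'agg': 0, 'conds': [[3, 0, 'south australia']]}
```

Once predictions are in this form, they can be written to a .jsonl file in the same shape as example.pred.dev.jsonl and scored with evaluate.py.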

In Google Colab - stanza.server.client.PermanentlyFailedException: Timed out waiting for service to come alive.

Hello Team,

While running the setup in Google Colab I am facing the error below. Please advise.

Traceback (most recent call last):
File "annotate.py", line 113, in
a = annotate_example(d, tables[d['table_id']])
File "annotate.py", line 38, in annotate_example
ann['question'] = annotate(example['question'])
File "annotate.py", line 22, in annotate
for s in client.annotate(sentence):
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 470, in annotate
r = self._request(text.encode('utf-8'), request_properties, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 379, in _request
self.ensure_alive()
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 203, in ensure_alive
raise PermanentlyFailedException("Timed out waiting for service to come alive.")
stanza.server.client.PermanentlyFailedException: Timed out waiting for service to come alive.
0% 0/56355 [02:00<?, ?it/s]

Cannot operate on a closed database

When running evaluation.py as in the Dockerfile, the following error occurs:

Traceback (most recent call last):
  File "bug1.py", line 5, in <module>
    print(db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name='table_1_10015132_11').first())
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 214, in first
    record = self[0]
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 152, in __getitem__
    next(self)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 136, in __next__
    nextrow = next(self._rows)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 365, in <genexpr>
    row_gen = (Record(cursor.keys(), row) for row in cursor)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 946, in __iter__
    row = self.fetchone()
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
    e, None, None, self.cursor, self.context
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1458, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 276, in reraise
    raise value.with_traceback(tb)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
    row = self._fetchone_impl()
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
    return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database. (Background on this error at: http://sqlalche.me/e/f405)

It looks like this issue is related: kennethreitz/records#128

How to get exact sql query for the natural language question ?

Hi all,

I want to generate a custom JSON file that has the SQL query for each natural-language question.
I am unable to install Docker to execute the annotate and query.py files to get the SQL queries, because I have Windows 10 Home and the Docker installation needs Windows 10 Pro.
Can you please suggest how I can do this without Docker?
Or, if you have already generated the SQL queries for WikiSQL, could you share the file?

Thanks
Anshu

.travis.yml missing

While submitting a pull request, the Travis CI bot failed, saying that the ".travis.yml file could not be found". It appears to be a configuration file for the Travis continuous-integration service. I tried to locate it in the repo but couldn't find it; because of this, the pull request doesn't pass the built-in checks.

Leaderboard for weakly supervised WikiSQL task

Since the results on the supervised learning benchmark are quite close to saturated, I think a leaderboard for models trained using only weak supervision would be a more relevant benchmark (for example, Memory Augmented Program Synthesis from Liang et al. beats some of the older entries in the strongly supervised leaderboard using only weak supervision).

Using this method on my own SQL database

Hi @vzhong,

I would like to get inferences on my own SQL table. To a similar question asked in Jan '18 you replied:

"You would have to train a model on this data, then perform inference on your data. Xiaojun and Chang from Berkeley has kindly made their model available here: https://github.com/xiaojunxu/SQLNet".

I am a little confused. Won't I need to train on my own SQL table? My column names could be very different. Won't I need to create .jsonl files like those in the "data" directory? Could you please help me understand your comment above?

Thank you,
Shruti

dev.sql and test.sql files

Hi all, does anyone have the dev.sql and test.sql files for the WikiSQL dataset? Somehow I have train.sql, but I don't recall where I got it. I need the schema.sql files for dev and test.
Thank you so much!
Anshu

Higher LF accuracy on sample output.

Hi, I'm new to this dataset. I followed the README and ran the evaluation on example.pred.dev.jsonl, and got the following result:
{
"ex_accuracy": 0.5380596128725804,
"lf_accuracy": 0.45208407552547203
}

I'm not sure what is wrong. Can you give me a hint?

AttributeError: 'NoneType' object has no attribute 'number_symbols'

When I run evaluate.py on the example pred.dev.jsonl, I get this error. What should I do?
C:\Users\Admin\anaconda3\python.exe C:/Users/Admin/Desktop/WikiSQL-master/evaluate.py
30%|██▉       | 2517/8421 [00:04<00:10, 562.58it/s]
Traceback (most recent call last):
File "C:/Users/Admin/Desktop/WikiSQL-master/evaluate.py", line 29, in
gold = engine.execute_query(eg['table_id'], qg, lower=True)
File "C:\Users\Admin\Desktop\WikiSQL-master\lib\dbengine.py", line 18, in execute_query
return self.execute(table_id, query.sel_index, query.agg_index, query.conditions, *args, **kwargs)
File "C:\Users\Admin\Desktop\WikiSQL-master\lib\dbengine.py", line 40, in execute
val = float(parse_decimal(val))
File "C:\Users\Admin\anaconda3\lib\site-packages\babel\numbers.py", line 707, in parse_decimal
group_symbol = get_group_symbol(locale)
File "C:\Users\Admin\anaconda3\lib\site-packages\babel\numbers.py", line 332, in get_group_symbol
return Locale.parse(locale).number_symbols.get('group', u',')
AttributeError: 'NoneType' object has no attribute 'number_symbols'
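This error usually means Babel could not resolve a default locale from the environment (common on Windows, where LC_NUMERIC is often unset, so Locale.parse returns None). A possible workaround, not an official fix, is to pass an explicit locale wherever lib/dbengine.py calls parse_decimal:

```python
from babel.numbers import parse_decimal

# Hedged workaround sketch: use an explicit locale instead of the LC_NUMERIC
# environment default, which may be unresolvable on Windows.
# In lib/dbengine.py this would mean changing
#     val = float(parse_decimal(val))
# to something like:
val = float(parse_decimal('1,234.5', locale='en_US'))
print(val)
# 1234.5
```

Alternatively, setting the LC_ALL/LANG environment variables to a concrete locale (e.g. en_US.UTF-8) before running evaluate.py may also resolve it.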

What is the input to annotate.py

Hi,

Could you please share the input format or an example file for annotate.py? I would like to create SQL queries for a new, separate SQL database.

Thank you,
Sandy
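Until the authors reply: judging from the records quoted elsewhere in this tracker, annotate.py reads JSON-lines files, one example per line, plus a companion tables file keyed by table_id. A hedged sketch of the shapes involved (field names are copied from quoted records; the "id" and "rows" fields of the tables file are assumptions, so check them against your copy of data/*.tables.jsonl):

```python
import json

# One example line, as quoted in other issues in this tracker.
example = {
    "phase": 1,
    "table_id": "1-1000181-1",
    "question": "Tell me what the notes are for South Australia ",
    "sql": {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0},
}

# Companion table schema; the "rows" content here is a placeholder assumption.
table = {
    "id": "1-1000181-1",
    "header": ["State/territory", "Text/background colour", "Format",
               "Current slogan", "Current series", "Notes"],
    "rows": [],
}

example_line = json.dumps(example)   # one line of e.g. data/train.jsonl
table_line = json.dumps(table)       # one line of e.g. data/train.tables.jsonl
print(example_line)
```

Writing one such JSON object per line into a pair of .jsonl files should give annotate.py inputs in the same shape as the released data.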

query word "symend" is not in input vocabulary

Thanks for sharing the dataset and preprocessing scripts.

I tried to run the following command to generate examples:

python annotate.py 

But the error message indicates that is_valid_example() returns False because of "symend":

annotating data/train.jsonl
loading tables
100%|██████████████████████████████████████████████████████████| 18585/18585 [00:01<00:00, 11703.27it/s]
loading examples
  0%|                                                                         | 0/61297 [00:00<?, ?it/s]query word "symend" is not in input vocabulary.
[u'symsyms', u'symselect', u'symwhere', u'symand', u'symcol', u'symtable', u'symcaption', u'sympage', u'symsection', u'symop', u'symcond', u'symquestion', u'symagg', u'symaggops', u'symcondops', u'symaggops', u'max', u'min', u'count', u'sum', u'avg', u'symcondops', u'=', u'>', u'<', u'op', u'symtable', u'symcol', u'state/territory', u'symcol', u'text/background', u'colour', u'symcol', u'format', u'symcol', u'current', u'slogan', u'symcol', u'current', u'series', u'symcol', u'notes', u'symquestion', u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia']

Traceback (most recent call last):
  File "annotate.py", line 114, in <module>
    raise Exception(str(a))
Exception: {'seq_output': {'gloss': [u'SYMSELECT', u'SYMAGG', u'SYMCOL', u'Notes', u'SYMWHERE', u'SYMCOL', u'Current', u'slogan', u'SYMOP', u'=', u'SYMCOND', u'SOUTH', u'AUSTRALIA', u'SYMEND'], 'words': [u'symselect', u'symagg', u'symcol', u'notes', u'symwhere', u'symcol', u'current', u'slogan', u'symop', u'=', u'symcond', u'south', u'australia', u'symend'], 'after': [u' ', u'  ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'where_output': {'gloss': [u'SYMWHERE', u'SYMCOL', u'Current', u'slogan', u'SYMOP', u'=', u'SYMCOND', u'SOUTH', u'AUSTRALIA', u'SYMEND'], 'words': [u'symwhere', u'symcol', u'current', u'slogan', u'symop', u'=', u'symcond', u'south', u'australia', u'symend'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'question': {'gloss': [u'Tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'South', u'Australia'], 'words': [u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'table_id': u'1-1000181-1', 'table': {'header': [{'gloss': [u'State/territory'], 'words': [u'state/territory'], 'after': [u'']}, {'gloss': [u'Text/background', u'colour'], 'words': [u'text/background', u'colour'], 'after': [u' ', u'']}, {'gloss': [u'Format'], 'words': [u'format'], 'after': [u'']}, {'gloss': [u'Current', u'slogan'], 'words': [u'current', u'slogan'], 'after': [u' ', u'']}, {'gloss': [u'Current', u'series'], 'words': [u'current', u'series'], 'after': [u' ', u'']}, {'gloss': [u'Notes'], 'words': [u'notes'], 'after': [u'']}]}, 'query': {u'agg': 0, u'sel': 5, u'conds': [[3, 0, {'gloss': [u'SOUTH', u'AUSTRALIA'], 'words': [u'south', u'australia'], 'after': [u' ', u'']}]]}, 'seq_input': {'gloss': [u'SYMSYMS', u'SYMSELECT', u'SYMWHERE', u'SYMAND', u'SYMCOL', u'SYMTABLE', u'SYMCAPTION', u'SYMPAGE', u'SYMSECTION', u'SYMOP', u'SYMCOND', u'SYMQUESTION', u'SYMAGG', u'SYMAGGOPS', u'SYMCONDOPS', u'SYMAGGOPS', u'MAX', u'MIN', 
u'COUNT', u'SUM', u'AVG', u'SYMCONDOPS', u'=', u'>', u'<', u'OP', u'SYMTABLE', u'SYMCOL', u'State/territory', u'SYMCOL', u'Text/background', u'colour', u'SYMCOL', u'Format', u'SYMCOL', u'Current', u'slogan', u'SYMCOL', u'Current', u'series', u'SYMCOL', u'Notes', u'SYMQUESTION', u'Tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'South', u'Australia'], 'words': [u'symsyms', u'symselect', u'symwhere', u'symand', u'symcol', u'symtable', u'symcaption', u'sympage', u'symsection', u'symop', u'symcond', u'symquestion', u'symagg', u'symaggops', u'symcondops', u'symaggops', u'max', u'min', u'count', u'sum', u'avg', u'symcondops', u'=', u'>', u'<', u'op', u'symtable', u'symcol', u'state/territory', u'symcol', u'text/background', u'colour', u'symcol', u'format', u'symcol', u'current', u'slogan', u'symcol', u'current', u'series', u'symcol', u'notes', u'symquestion', u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'  ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}}

unable to load library CoreNLPClient

Hello Team,

While executing annotate.py from the repo, I am getting the error below for CoreNLPClient:

annotating data/train.jsonl
loading tables
100% 18585/18585 [00:00<00:00, 19025.85it/s]
loading examples
0% 0/56355 [00:00<?, ?it/s]
Traceback (most recent call last):
File "annotate.py", line 113, in
a = annotate_example(d, tables[d['table_id']])
File "annotate.py", line 38, in annotate_example
ann['question'] = annotate(example['question'])
File "annotate.py", line 20, in annotate
client = CoreNLPClient(default_annotators='ssplit,tokenize'.split(','))
NameError: name 'CoreNLPClient' is not defined

Please advise
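The NameError suggests that the import of CoreNLPClient at the top of annotate.py failed silently, so the name was never defined. A small, hypothetical sanity check for the underlying dependency (the helper is mine; the stanza package name is taken from the Colab traceback in another issue above):

```python
import importlib.util

def dependency_hint(pkg):
    """Return an install hint if pkg is missing, else None."""
    if importlib.util.find_spec(pkg) is None:
        return f"{pkg} is not installed; try: pip install {pkg}"
    return None

# json ships with Python, so no hint is expected for it;
# stanza may or may not be present in your environment.
print(dependency_hint("json"))
print(dependency_hint("stanza"))
```

If the dependency is missing, installing it (and making sure a CoreNLP server is reachable) should be the first step before rerunning annotate.py.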

Error while running evaluate.py

I tried running evaluate.py with the dev data.
The command I ran in cmd:
python evaluate.py data\dev.jsonl data\dev.db test\example.pred.dev.jsonl

I'm getting this error:

[screenshot of the error]

Can someone help where I'm going wrong?

Regards,

Data collection template

Hi @vzhong ,
I've been testing some models on your data, and now I would like to create my own data following your format. In your paper you said that you made available examples of the interface used during paraphrasing. Where could I find the template you used for your data collection?

Thanks

Can you provide the original data of WikiSQL?

At present, we are conducting a multimodal task evaluation and hope to include WikiSQL, but it contains no image or layout information. We hope you can provide the original data.

Thanks

Error in the first train record

In train.jsonl, the first record's sql->conds attribute says 3, but it should say 0 instead.

For reference following is the object
{'phase': 1, 'table_id': '1-1000181-1', 'question': 'Tell me what the notes are for South Australia ', 'sql': {'sel': 5, 'conds': [[3, 0, 'SOUTH AUSTRALIA']], 'agg': 0}}

and following is the header for the particular table
["State/territory", "Text/background colour", "Format", "Current slogan", "Current series", "Notes"]

Intuitively, it should choose the State/territory column in the WHERE clause.

Following is the translated query
SELECT Notes AS result FROM table_1_1000181_1 WHERE Current slogan = 'south australia';
tell me what the notes are for south australia
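As context for other readers: each entry in conds is a [column_index, operator_index, value] triple, so the record as released conditions on column 3 ("Current slogan"), which matches the translated query above. Decoding it explicitly (operator list as in lib/query.py):

```python
# Decode the condition triple from the quoted record against the table header.
header = ["State/territory", "Text/background colour", "Format",
          "Current slogan", "Current series", "Notes"]
cond_ops = ['=', '>', '<', 'OP']   # as in lib/query.py

col, op, val = [3, 0, "SOUTH AUSTRALIA"]
decoded = f"{header[col]} {cond_ops[op]} '{val}'"
print(decoded)
# Current slogan = 'SOUTH AUSTRALIA'
```

Whether "Current slogan" or "State/territory" is the correct annotation for this question is a separate matter; the decoding above only shows what the released record actually encodes.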

Slight improvement:

In the installation notes, the instruction for cloning the repository says to use the git clone url command. The URL, however, lacks the .git extension, because of which the git-lfs checkout hook doesn't get executed. It took me some time to realise why I was not able to extract the data; I was getting the error that "(stdin) is not a bzip2 file".
Let me know what the details of this problem are.

annotate.py throws exception: query word '.' is not in input vocabulary.

query word "." is not in input vocabulary.
['symsyms', 'symselect', 'symwhere', 'symand', 'symcol', 'symtable', 'symcaption', 'sympage', 'symsection', 'symop', 'symcond', 'symquestion', 'symagg', 'symaggops', 'symcondops', 'symaggops', 'max', 'min', 'count', 'sum', 'avg', 'symcondops', '=', '>', '<', 'op', 'symtable', 'symcol', 'species', 'symcol', 'indole', 'symcol', 'methyl', 'red', 'symcol', 'voges-proskauer', 'symcol', 'citrate', 'symquestion', 'what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'symend']
Traceback (most recent call last):
File "annotate.py", line 119, in
raise Exception(str(a))
Exception: {'table_id': '1-16083989-1', 'question': {'gloss': ['What', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?'], 'words': ['what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', '']}, 'table': {'header': [{'gloss': ['Species'], 'words': ['species'], 'after': ['']}, {'gloss': ['Indole'], 'words': ['indole'], 'after': ['']}, {'gloss': ['Methyl', 'Red'], 'words': ['methyl', 'red'], 'after': [' ', '']}, {'gloss': ['Voges-Proskauer'], 'words': ['voges-proskauer'], 'after': ['']}, {'gloss': ['Citrate'], 'words': ['citrate'], 'after': ['']}]}, 'query': {'sel': 4, 'conds': [[0, 0, {'gloss': ['Salmonella', 'spp', '.'], 'words': ['salmonella', 'spp.', '.'], 'after': [' ', '', '']}]], 'agg': 3}, 'seq_input': {'gloss': ['SYMSYMS', 'SYMSELECT', 'SYMWHERE', 'SYMAND', 'SYMCOL', 'SYMTABLE', 'SYMCAPTION', 'SYMPAGE', 'SYMSECTION', 'SYMOP', 'SYMCOND', 'SYMQUESTION', 'SYMAGG', 'SYMAGGOPS', 'SYMCONDOPS', 'SYMAGGOPS', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG', 'SYMCONDOPS', '=', '>', '<', 'OP', 'SYMTABLE', 'SYMCOL', 'Species', 'SYMCOL', 'Indole', 'SYMCOL', 'Methyl', 'Red', 'SYMCOL', 'Voges-Proskauer', 'SYMCOL', 'Citrate', 'SYMQUESTION', 'What', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'SYMEND'], 'words': ['symsyms', 'symselect', 'symwhere', 'symand', 'symcol', 'symtable', 'symcaption', 'sympage', 'symsection', 'symop', 'symcond', 'symquestion', 'symagg', 'symaggops', 'symcondops', 'symaggops', 'max', 'min', 'count', 'sum', 'avg', 'symcondops', '=', '>', '<', 'op', 'symtable', 'symcol', 'species', 'symcol', 'indole', 'symcol', 'methyl', 'red', 'symcol', 'voges-proskauer', 'symcol', 'citrate', 'symquestion', 'what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' 
', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}, 'seq_output': {'gloss': ['SYMSELECT', 'SYMAGG', 'COUNT', 'SYMCOL', 'Citrate', 'SYMWHERE', 'SYMCOL', 'Species', 'SYMOP', '=', 'SYMCOND', 'Salmonella', 'spp', '.', 'SYMEND'], 'words': ['symselect', 'symagg', 'count', 'symcol', 'citrate', 'symwhere', 'symcol', 'species', 'symop', '=', 'symcond', 'salmonella', 'spp.', '.', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}, 'where_output': {'gloss': ['SYMWHERE', 'SYMCOL', 'Species', 'SYMOP', '=', 'SYMCOND', 'Salmonella', 'spp', '.', 'SYMEND'], 'words': ['symwhere', 'symcol', 'species', 'symop', '=', 'symcond', 'salmonella', 'spp.', '.', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}}

How do I begin to use this on my own database?

Hi,

I'm no NLP expert, but I would like to understand how I can apply this library and its techniques to my own data. Is there a hello-world example of how to modify the data files so I can use NL queries over my own data?

Invalid File Names while cloning the GitHub repo

On cloning this repository using the command git clone https://github.com/salesforce/WikiSQL,
Git is able to download the repository but is not able to extract all the files. The following error is encountered:

Cloning into 'WikiSQL'...
remote: Enumerating objects: 386, done.
remote: Counting objects: 100% (192/192), done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 386 (delta 185), reused 154 (delta 154), pack-reused 194
Receiving objects: 100% (386/386), 50.72 MiB | 19.88 MiB/s, done.
Resolving deltas: 100% (212/212), done.
error: unable to create file collection/paraphrase/Icon?: Invalid argument
error: unable to create file collection/paraphrase/paraphrase_files/Icon?: Invalid argument
error: unable to create file collection/verify/Icon?: Invalid argument
error: unable to create file collection/verify/verify_files/Icon?: Invalid argument
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

This is the output of running git status

On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	deleted:    .dockerignore
	deleted:    .gitattributes
	deleted:    .gitignore
	deleted:    .travis.yml
	deleted:    CODEOWNERS
	deleted:    LICENSE
	deleted:    README.md
	deleted:    annotate.py
	deleted:    collection/README.md
	deleted:    "collection/paraphrase/Icon\r"
	deleted:    collection/paraphrase/index.html
	deleted:    "collection/paraphrase/paraphrase_files/Icon\r"
	deleted:    collection/paraphrase/paraphrase_files/bootstrap.min.css
	deleted:    collection/paraphrase/paraphrase_files/bootstrap.min.js
	deleted:    collection/paraphrase/paraphrase_files/jquery-3.2.1.min.js
	deleted:    collection/paraphrase/paraphrase_files/toastr.min.css
	deleted:    collection/paraphrase/paraphrase_files/toastr.min.js
	deleted:    "collection/verify/Icon\r"
	deleted:    collection/verify/verify.html
	deleted:    "collection/verify/verify_files/Icon\r"
	deleted:    collection/verify/verify_files/bootstrap.min.css
	deleted:    collection/verify/verify_files/bootstrap.min.js
	deleted:    collection/verify/verify_files/jquery-3.2.1.min.js
	deleted:    collection/verify/verify_files/toastr.min.css
	deleted:    collection/verify/verify_files/toastr.min.js
	deleted:    data.tar.bz2
	deleted:    evaluate.py
	deleted:    lib/__init__.py
	deleted:    lib/common.py
	deleted:    lib/dbengine.py
	deleted:    lib/query.py
	deleted:    lib/table.py
	deleted:    requirements.txt
	deleted:    test/Dockerfile
	deleted:    test/check.py
	deleted:    test/example.pred.dev.jsonl.bz2
	deleted:    version.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.dockerignore
	.gitattributes
	.gitignore
	.travis.yml
	CODEOWNERS
	LICENSE
	README.md
	annotate.py
	collection/
	data.tar.bz2
	evaluate.py
	lib/
	requirements.txt
	test/
	version.txt

And the output of running git restore --source=HEAD :/ as suggested:

error: unable to create file collection/paraphrase/Icon?: Invalid argument
error: unable to create file collection/paraphrase/paraphrase_files/Icon?: Invalid argument
error: unable to create file collection/verify/Icon?: Invalid argument
error: unable to create file collection/verify/verify_files/Icon?: Invalid argument

It seems the issue is with the file names, which contain question marks, a character that is not allowed in file names on some file systems.

I attempted to resolve the issue by downloading the missing files directly from GitHub into the directories where they are supposed to be, for example this file. But it is not possible to download this file as-is, and using wget fails as well.

An alternative method I tried was to download the master branch as a ZIP file and extract it using the unzip WikiSQL-master.zip command. This works fine and, in fact, even the offending files (such as collection/paraphrase/Icon) were successfully extracted with no illegal characters in their names. It seems this is an issue with how Git checks out the files in this repository.

[BUG] Docker build via test/Dockerfile fails

OS is Ubuntu x64

yummyyyy@yummyyyy-virtual-machine:~/公共的/WikiSQL$

sudo docker build -t wikisqltest -f test/Dockerfile .

Sending build context to Docker daemon 80.38MB
Step 1/8 : FROM python:3.6.2-alpine
---> 294201c0731f
Step 2/8 : RUN mkdir -p /eval
---> Using cache
---> 998bcbd64c08
Step 3/8 : WORKDIR /eval
---> Using cache
---> 94acbfabbb4f
Step 4/8 : ADD . /eval/
---> Using cache
---> 942024fe38cb
Step 5/8 : RUN pip install -r requirements.txt
---> Running in fbdf67763a18
Collecting tqdm (from -r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/9c/05/cf212f57daa0eb6106fa668a04d74d932e9881fd4a22f322ea1dadb5aba0/tqdm-4.62.2-py2.py3-none-any.whl (76kB)
Collecting records (from -r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/ef/93/2467c761ea3729713ab97842a46cc125ad09d14a0a174cb637bee4983911/records-0.5.3-py2.py3-none-any.whl
Collecting babel (from -r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/aa/96/4ba93c5f40459dc850d25f9ba93f869a623e77aaecc7a9344e19c01942cf/Babel-2.9.1-py2.py3-none-any.whl (8.8MB)
Collecting tabulate (from -r requirements.txt (line 4))
Downloading https://files.pythonhosted.org/packages/ca/80/7c0cad11bd99985cfe7c09427ee0b4f9bd6b048bd13d4ffb32c6db237dfb/tabulate-0.8.9-py3-none-any.whl
Collecting openpyxl<2.5.0 (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/77/26/0bd1a39776f53b4f28e5bb1d26b3fcd99068584a7e1ddca4e09c0d5fd592/openpyxl-2.4.11.tar.gz (158kB)
Collecting SQLAlchemy; python_version >= "3.0" (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/ad/c7/61ff52be84f5ac86c72672ceac941981f1685b4ef90391d405a1f89677d0/SQLAlchemy-1.4.23.tar.gz (7.7MB)
Collecting docopt (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/a2/55/8f8cab2afd404cf578136ef2cc5dfb50baa1761b68c9da1fb1e4eed343c9/docopt-0.6.2.tar.gz
Collecting tablib>=0.11.4 (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/16/85/078fc037b15aa1120d6a0287ec9d092d93d632ab01a0e7a3e69b4733da5e/tablib-3.0.0-py3-none-any.whl (47kB)
Collecting pytz>=2015.7 (from babel->-r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/70/94/784178ca5dd892a98f113cdd923372024dc04b8d40abe77ca76b5fb90ca6/pytz-2021.1-py2.py3-none-any.whl (510kB)
Collecting jdcal (from openpyxl<2.5.0->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/f0/da/572cbc0bc582390480bbd7c4e93d14dc46079778ed915b505dc494b37c57/jdcal-1.4.1-py2.py3-none-any.whl
Collecting et_xmlfile (from openpyxl<2.5.0->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/96/c2/3dd434b0108730014f1b96fd286040dc3bcb70066346f7e01ec2ac95865f/et_xmlfile-1.1.0-py3-none-any.whl
Collecting importlib-metadata (from SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/71/c2/cb1855f0b2a0ae9ccc9b69f150a7aebd4a8d815bd951e74621c4154c52a8/importlib_metadata-4.8.1-py3-none-any.whl
Collecting greenlet!=0.4.17 (from SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/72/7e/d8586068d47adba73afc085249712bd266cd7ffbf27d8bc260c33e9d6133/greenlet-1.1.1.tar.gz (85kB)
Collecting zipp>=0.5 (from importlib-metadata->SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/92/d9/89f433969fb8dc5b9cbdd4b4deb587720ec1aeb59a020cf15002b9593eef/zipp-3.5.0-py3-none-any.whl
Collecting typing-extensions>=3.6.4; python_version < "3.8" (from importlib-metadata->SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/74/60/18783336cc7fcdd95dae91d73477830aa53f5d3181ae4fe20491d7fc3199/typing_extensions-3.10.0.2-py3-none-any.whl
Building wheels for collected packages: openpyxl, SQLAlchemy, docopt, greenlet
Running setup.py bdist_wheel for openpyxl: started
Running setup.py bdist_wheel for openpyxl: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/59/44/27/63b211425501ad51d197ff8ed00e9e469e38b9e516cb69b1c2
Running setup.py bdist_wheel for SQLAlchemy: started
Running setup.py bdist_wheel for SQLAlchemy: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/7d/52/1c/117179bb38418ab4e06deb5c8288acd8ee1e0b418f5e59608f
Running setup.py bdist_wheel for docopt: started
Running setup.py bdist_wheel for docopt: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/9b/04/dd/7daf4150b6d9b12949298737de9431a324d4b797ffd63f526e
Running setup.py bdist_wheel for greenlet: started
Running setup.py bdist_wheel for greenlet: finished with status 'error'
Complete output from command /usr/local/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d /tmp/tmpug655ghzpip-wheel- --python-tag cp36:
/usr/local/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'project_urls'
warnings.warn(msg)
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/__init__.py -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator_nested.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_stack_saved.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_leaks.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_version.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_gc.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_cpp.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_tracing.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_weakref.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_extension_interface.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_greenlet.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_contextvars.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_throw.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/__init__.py -> build/lib.linux-x86_64-3.6/greenlet/tests
running egg_info
writing src/greenlet.egg-info/PKG-INFO
writing dependency_links to src/greenlet.egg-info/dependency_links.txt
writing requirements to src/greenlet.egg-info/requires.txt
writing top-level names to src/greenlet.egg-info/top_level.txt
reading manifest file 'src/greenlet.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'docs/_build'
warning: no files found matching '*.py' under directory 'appveyor'
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
warning: no previously-included files matching '.coverage' found anywhere in distribution
writing manifest file 'src/greenlet.egg-info/SOURCES.txt'
copying src/greenlet/greenlet.c -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/greenlet.h -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/slp_platformselect.h -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/setup_switch_x64_masm.cmd -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_aarch64_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_alpha_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_amd64_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_ios.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_csky_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_m68k_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_mips_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_macosx.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_riscv_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_s390_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_sparc_sun_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x32_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.asm -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.obj -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/tests/_test_extension.c -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/_test_extension_cpp.cpp -> build/lib.linux-x86_64-3.6/greenlet/tests
running build_ext
building 'greenlet._greenlet' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
creating build/temp.linux-x86_64-3.6/src/greenlet
gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/local/include/python3.6m -c src/greenlet/greenlet.c -o build/temp.linux-x86_64-3.6/src/greenlet/greenlet.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1


Failed building wheel for greenlet
Running setup.py clean for greenlet
Successfully built openpyxl SQLAlchemy docopt
Failed to build greenlet
Installing collected packages: tqdm, jdcal, et-xmlfile, openpyxl, zipp, typing-extensions, importlib-metadata, greenlet, SQLAlchemy, docopt, tablib, records, pytz, babel, tabulate
Running setup.py install for greenlet: started
Running setup.py install for greenlet: finished with status 'error'
Complete output from command /usr/local/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tmdjgba9-record/install-record.txt --single-version-externally-managed --compile:
/usr/local/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'project_urls'
warnings.warn(msg)
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/__init__.py -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator_nested.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_stack_saved.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_leaks.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_version.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_gc.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_cpp.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_tracing.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_weakref.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_extension_interface.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_greenlet.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_contextvars.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_throw.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/__init__.py -> build/lib.linux-x86_64-3.6/greenlet/tests
running egg_info
writing src/greenlet.egg-info/PKG-INFO
writing dependency_links to src/greenlet.egg-info/dependency_links.txt
writing requirements to src/greenlet.egg-info/requires.txt
writing top-level names to src/greenlet.egg-info/top_level.txt
reading manifest file 'src/greenlet.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'docs/_build'
warning: no files found matching '*.py' under directory 'appveyor'
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
warning: no previously-included files matching '.coverage' found anywhere in distribution
writing manifest file 'src/greenlet.egg-info/SOURCES.txt'
copying src/greenlet/greenlet.c -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/greenlet.h -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/slp_platformselect.h -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/setup_switch_x64_masm.cmd -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_aarch64_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_alpha_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_amd64_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_ios.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_csky_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_m68k_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_mips_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_macosx.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_riscv_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_s390_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_sparc_sun_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x32_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.asm -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.obj -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/tests/_test_extension.c -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/_test_extension_cpp.cpp -> build/lib.linux-x86_64-3.6/greenlet/tests
running build_ext
building 'greenlet._greenlet' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
creating build/temp.linux-x86_64-3.6/src/greenlet
gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/local/include/python3.6m -c src/greenlet/greenlet.c -o build/temp.linux-x86_64-3.6/src/greenlet/greenlet.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1

----------------------------------------

Command "/usr/local/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tmdjgba9-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-e0uqasn_/greenlet/
You are using pip version 9.0.1, however version 21.2.4 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
The command '/bin/sh -c pip install -r requirements.txt' returned a non-zero code: 1
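Judging from the log above, the failure is not in the pinned versions themselves but in the missing C compiler: pip found no usable greenlet wheel for this platform, fell back to building it from source, and hit "unable to execute 'gcc'". One possible fix (a sketch, assuming a Debian/Ubuntu-based base image) is to install a C toolchain before the `pip install` step in the Dockerfile:

```dockerfile
# Hypothetical Dockerfile fragment: install a C toolchain so pip can
# compile greenlet from source (assumes a Debian/Ubuntu base image).
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libc6-dev \
    && rm -rf /var/lib/apt/lists/*
RUN pip install -r requirements.txt
```

Alternatively, a newer pip (the log itself suggests upgrading from 9.0.1) may be able to fetch a prebuilt manylinux wheel for greenlet and skip compilation entirely.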

How to differentiate 'AND' vs 'OR'

  • In the JSON, the "conds" field is just an array of conditions separated by commas, for both OR and AND queries.
    So what signals whether the conditions are combined with "AND" or with "OR"?
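For reference, WikiSQL's query format can only express conjunctions: lib/query.py joins every entry of "conds" with AND, and there is no field for OR. A minimal sketch of that rendering (the operator and aggregation lists mirror the ones in the official repo, but the helper itself is illustrative, not the repo's code):

```python
# WikiSQL's fixed aggregation and condition-operator vocabularies.
AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
COND_OPS = ['=', '>', '<', 'OP']

def to_sql(sql_dict, table='table'):
    """Render a WikiSQL 'sql' dict as a SQL string; conds are always AND-joined."""
    sel = f"col{sql_dict['sel']}"
    agg = AGG_OPS[sql_dict['agg']]
    if agg:
        sel = f"{agg}({sel})"
    where = ' AND '.join(
        f"col{col} {COND_OPS[op]} {val!r}" for col, op, val in sql_dict['conds'])
    query = f"SELECT {sel} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query

print(to_sql({'sel': 5, 'conds': [[3, 0, 'SOUTH AUSTRALIA']], 'agg': 0}))
# -> SELECT col5 FROM table WHERE col3 = 'SOUTH AUSTRALIA'
```

So when two conditions appear in "conds", the intended query is always `cond1 AND cond2`; a dataset with OR semantics would need a different representation.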

\mathrm in question

Hi @vzhong

I think I found some unusual questions in train.jsonl (see the image below) which contain lots of \mathrm.

[screenshot: examples of train.jsonl questions containing \mathrm]

In my humble opinion, since there are only a few of them (6 questions), could they either be removed or converted to plain text (rather than LaTeX form)?

Thanks!
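If removing the examples is undesirable, the conversion could be as simple as unwrapping the LaTeX markup. A hedged sketch (handles only non-nested `\mathrm{...}` groups, which should cover the 6 questions mentioned):

```python
import re

def strip_mathrm(question):
    """Replace \mathrm{X} with X; nested braces are not handled."""
    return re.sub(r'\\mathrm\{([^{}]*)\}', r'\1', question)

print(strip_mathrm(r'What is \mathrm{He} used for?'))
# -> What is He used for?
```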

Not sure about NSM

The Neural Symbolic Machines (NSM) paper says:

[screenshot: excerpt from the NSM paper]

So I am not sure whether NSM uses the logical forms for training.

[screenshot: second excerpt]

[None] gold questions

I've instrumented evaluate.py to count the gold questions whose gold answer is [None]. In the dev set, evaluated via the Docker instructions on the front page, 657 / 8421 ≈ 8% of questions produce a [None] result. I've manually looked at a few such questions, and they have various annotation errors, most notably condition values that lack the space before commas present in the table values, and > used where the question implies >=.

Is this a known issue?

{'phase': 2, 'table_id': '2-12955969-1', 'question': 'What is the year of the tournament played at Melbourne, Australia?', 'sql': {'sel': 0, 'conds': [[2, 0, 'melbourne, australia']], 'agg': 5}}
SELECT AVG(col0) AS result FROM table_2_12955969_1 WHERE col2 = :col2
{'col2': 'melbourne, australia'}
Manual fix: SELECT AVG(col0) AS result FROM table_2_12955969_1 WHERE col2 = 'melbourne , australia';

{'phase': 2, 'table_id': '2-12312050-1', 'question': "What's the sum of points for the 1963 season when there are more than 30 games?", 'sql': {'sel': 4, 'conds': [[0, 0, '1963'], [2, 1, 30]], 'agg': 4}}
SELECT SUM(col4) AS result FROM table_2_12312050_1 WHERE col0 = :col0 AND col2 > :col2
{'col0': '1963', 'col2': 30}
Manual fix: SELECT SUM(col4) AS result FROM table_2_12312050_1 WHERE col0 = '1963' AND col2 >= 30;
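The second example shows why the strict operator yields [None]: SQL aggregates like SUM return NULL over an empty match set. A self-contained demonstration with hypothetical table values (the row below is invented to make the boundary case concrete, not taken from the real table):

```python
import sqlite3

# Tiny stand-in for table_2_12312050_1: (season, games, points).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col0 TEXT, col2 INTEGER, col4 INTEGER)")
conn.execute("INSERT INTO t VALUES ('1963', 30, 121)")  # exactly 30 games

# Gold annotation uses strict '>' although the question says "more than 30".
strict = conn.execute(
    "SELECT SUM(col4) FROM t WHERE col0 = '1963' AND col2 > 30").fetchone()
fixed = conn.execute(
    "SELECT SUM(col4) FROM t WHERE col0 = '1963' AND col2 >= 30").fetchone()

print(strict)  # (None,) -- SUM over an empty match set yields NULL
print(fixed)   # (121,)
```

So whenever the gold condition excludes every row, execution "succeeds" but returns [None], which is what the 657 counted questions have in common.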
