salesforce / WikiSQL

A large annotated semantic parsing corpus for developing natural language interfaces.

License: BSD 3-Clause "New" or "Revised" License

Python 49.09% Dockerfile 0.55% HTML 50.36%
natural-language dataset database machine-learning natural-language-processing natural-language-interface

WikiSQL's Issues

[BUG] Unable to clone WikiSQL in a system

πŸ›πŸ› Bug Report

The installation instructions for WikiSQL given in the README are not working.

#639(2)

It is showing a warning.

#639(1)

The WikiSQL folder is also empty.

#639(3)

⚙️ Environment

  • Python version(s): [3.8.5]
  • OS: [Windows 10]

Phase 1 vs. phase 2

Hi,
I am confused by the phase 1 and phase 2 annotations in the dataset files.
The paper says phase 1 is a paraphrasing phase while phase 2 is a verification phase. As I understand it, phase 2 just discards wrong paraphrases, so what does it mean for a given example to have been collected in phase 1 versus phase 2?
Thanks,

--Ahmed

Requirements.txt versions

For those having trouble running the evaluation script in 2022 due to version updates in the dependencies: use the following requirements.txt to pin the packages to the versions from the original era (circa 2017–2018).

tqdm
sqlalchemy==1.2
records==0.5.3
babel==2.5.1
tabulate==0.8.1

Question Generation using a Template

Hi Team,

Could you please tell me about the question-generation template that was used to generate questions before the paraphrasing phase? Is the generation template/code part of this codebase?

How to get the original SQL query from the "sql"/"query" field, which is in JSON format

How can I convert the following "sql" field into the original SQL query?

{"phase": 1, "table_id": "1-1000181-1", "question": "Tell me what the notes are for South Australia ", "sql": {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0}}

I tried the lib.query.Query.from_dict method, but it gives SELECT col5 FROM table WHERE col3 = SOUTH AUSTRALIA,
and the lib.dbengine.DBEngine.execute_query method, but it gives SELECT col5 AS result FROM table_1_1000181_1 WHERE col3 = :col3.
Neither method produces the original query with real column names, so how can I get it? Can anybody help?
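Until the authors weigh in, one workaround is to rebuild a readable query yourself by joining the "sql" dict with the table's header list from the companion tables file. Below is a minimal sketch; the helper name `to_readable_sql` is mine, and the operator lists mirror the ones in lib/query.py:

```python
# Hypothetical helper: rebuild a human-readable SQL string from a WikiSQL
# "sql" dict plus the table's column headers (taken from tables.jsonl).
AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']   # as in lib/query.py
COND_OPS = ['=', '>', '<', 'OP']                      # as in lib/query.py

def to_readable_sql(sql, headers, table_name='table'):
    sel = headers[sql['sel']]
    agg = AGG_OPS[sql['agg']]
    select = f"{agg}({sel})" if agg else sel
    conds = ' AND '.join(f"{headers[col]} {COND_OPS[op]} '{val}'"
                         for col, op, val in sql['conds'])
    query = f"SELECT {select} FROM {table_name}"
    return f"{query} WHERE {conds}" if conds else query

headers = ["State/territory", "Text/background colour", "Format",
           "Current slogan", "Current series", "Notes"]
sql = {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0}
print(to_readable_sql(sql, headers))
# SELECT Notes FROM table WHERE Current slogan = 'SOUTH AUSTRALIA'
```

Note that this only produces a display string; it does not escape values or quote identifiers, so it is not safe to execute as-is.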

Incorrect Expected Queries

It seems many of the expected queries are incorrect, given the question posed. Here is a preliminary list of some questions I noticed with incorrect expected queries/answers:

Question: How many games was Damien Wilkins (27) the high scorer?
Expected query: SELECT MIN(Game) FROM 1-11964154-2 WHERE High points = 'damien wilkins (27)'
Expected result: ['6.0']

Question: What is the name of the integrated where allied-related is shared?
Expected query: SELECT (Component) FROM 1-11944282-1 WHERE Allied-Related = 'shared'
Expected result: ['customers']

Question: what is the integrated in which the holding allied-unrelated is many?
Expected query: SELECT (Holding) FROM 1-11944282-1 WHERE Allied-Unrelated = 'many'
Expected result: ['many']

Question: How many integrated allied-related are there?
Expected query: SELECT (Integrated) FROM 1-11944282-1 WHERE Allied-Related = 'many'
Expected result: ['one']

Question: Which authority has a rocket launch called rehbar-5?
Expected query: SELECT COUNT(Derivatives) FROM 1-11869952-1 WHERE Rocket launch = 'rehbar-5'
Expected result: ['1']

Question: Who had an evening gown score of 9.78?
Expected query: SELECT (Interview) FROM 1-11690135-1 WHERE Evening Gown = '9.78'
Expected result: ['8.91']

How to evaluate a Seq2Seq model ?

Hi,

I have built a seq2seq model that takes the question and generates the SQL query directly.

My question is: how do I evaluate my model, given that its predictions are flat sequences ("select ... from ... where ...")?

It's very urgent, please.

Thanks
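One option, pending an official answer, is to map each generated sequence back into WikiSQL's {sel, agg, conds} logical form and then reuse the official scorer. The parser below is a rough, illustrative sketch (function name and regexes are mine) that assumes predictions follow the canonical single-table grammar with AND-joined conditions:

```python
import re

AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
COND_OPS = ['=', '>', '<']

def parse_flat_query(seq, headers):
    """Parse "select ... from table where ..." into a WikiSQL-style sql dict."""
    seq = seq.lower().strip()
    lower = [h.lower() for h in headers]
    sel_part, _, where_part = seq.partition(' where ')
    m = re.match(r"select (?:(max|min|count|sum|avg)\()?(.+?)\)? from", sel_part)
    agg = AGG_OPS.index(m.group(1).upper()) if m.group(1) else 0
    sel = lower.index(m.group(2).strip())
    conds = []
    # Note: this naive split breaks if a condition value itself contains " and ".
    for cond in filter(None, where_part.split(' and ')):
        col, op, val = re.match(r"(.+?) (=|>|<) (.+)", cond.strip()).groups()
        conds.append([lower.index(col.strip()), COND_OPS.index(op),
                      val.strip().strip("'")])
    return {'sel': sel, 'agg': agg, 'conds': conds}

headers = ["State/territory", "Text/background colour", "Format",
           "Current slogan", "Current series", "Notes"]
pred = "SELECT Notes FROM table WHERE Current slogan = 'south australia'"
print(parse_flat_query(pred, headers))
# {'sel': 5, 'agg': 0, 'conds': [[3, 0, 'south australia']]}
```

Once predictions are in this form, they can be written to a .jsonl file in the same shape as example.pred.dev.jsonl and scored with evaluate.py.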

In Google Colab - stanza.server.client.PermanentlyFailedException: Timed out waiting for service to come alive.

Hello Team,

While running the setup in Google Colab I am facing the error below. Please advise.

Traceback (most recent call last):
File "annotate.py", line 113, in
a = annotate_example(d, tables[d['table_id']])
File "annotate.py", line 38, in annotate_example
ann['question'] = annotate(example['question'])
File "annotate.py", line 22, in annotate
for s in client.annotate(sentence):
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 470, in annotate
r = self._request(text.encode('utf-8'), request_properties, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 379, in _request
self.ensure_alive()
File "/usr/local/lib/python3.6/dist-packages/stanza/server/client.py", line 203, in ensure_alive
raise PermanentlyFailedException("Timed out waiting for service to come alive.")
stanza.server.client.PermanentlyFailedException: Timed out waiting for service to come alive.
0% 0/56355 [02:00<?, ?it/s]

Cannot operate on a closed database

When running evaluation.py as in the Dockerfile, the following error occurs:

Traceback (most recent call last):
  File "bug1.py", line 5, in <module>
    print(db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name='table_1_10015132_11').first())
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 214, in first
    record = self[0]
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 152, in __getitem__
    next(self)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 136, in __next__
    nextrow = next(self._rows)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/records.py", line 365, in <genexpr>
    row_gen = (Record(cursor.keys(), row) for row in cursor)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 946, in __iter__
    row = self.fetchone()
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
    e, None, None, self.cursor, self.context
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1458, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 276, in reraise
    raise value.with_traceback(tb)
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
    row = self._fetchone_impl()
  File "/home/zgzhen/projects/cse517-project/eval/seq2sql/WikiSQL/.venv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
    return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database. (Background on this error at: http://sqlalche.me/e/f405)

It looks like this issue is related: kennethreitz/records#128

How to get exact sql query for the natural language question ?

Hi all,

I want to generate a custom JSON file that has the SQL query for each natural-language question.
I am unable to install Docker to execute the annotate and query.py files to get the SQL queries, because I have Windows 10 Home and the Docker installation needs Windows 10 Pro.
Can you please suggest how I can do this without Docker?
Or, if you have already generated the SQL queries for WikiSQL, could you share the file?

Thanks
Anshu

.travis.yml missing

While submitting a pull request, the Travis CI bot failed, saying that the ".travis.yml file could not be found". It appears to be a configuration file for the Travis continuous-integration service. I tried to locate it in the repo but couldn't find it; because of this, the pull request doesn't pass the built-in checks.

Leaderboard for weakly supervised WikiSQL task

Since the results on the supervised learning benchmark are quite close to saturated, I think a leaderboard for models trained using only weak supervision would be a more relevant benchmark (for example, Memory Augmented Program Synthesis from Liang et al. beats some of the older entries in the strongly supervised leaderboard using only weak supervision).

Using this method on my own SQL database

Hi @vzhong,

I would like to get inferences on my own SQL table. To a similar question asked in Jan '18 you replied:

"You would have to train a model on this data, then perform inference on your data. Xiaojun and Chang from Berkeley has kindly made their model available here: https://github.com/xiaojunxu/SQLNet".

I am a little confused. Won't I need to train on my own SQL table? My column names could be very different. Won't I need to create .jsonl files like those in the "data" directory? Could you please help me understand your comment above?

Thank you,
Shruti

dev.sql and test.sql files

Hi all, does anyone have the dev.sql and test.sql files for the WikiSQL dataset? Somehow I have train.sql, but I don't recall where I got it. I need the schema.sql files for dev and test.
Thank you so much!
Anshu

Higher LF accuracy on sample output.

Hi, I'm new to this dataset. I followed the README and ran the evaluation on example.pred.dev.jsonl, and got the following result:
{
"ex_accuracy": 0.5380596128725804,
"lf_accuracy": 0.45208407552547203
}

I'm not sure what is wrong. Can you give me a hint?

AttributeError: 'NoneType' object has no attribute 'number_symbols'

When I run evaluate.py on the example pred.dev.jsonl, I get this error. What should I do?
C:\Users\Admin\anaconda3\python.exe C:/Users/Admin/Desktop/WikiSQL-master/evaluate.py
30%|██▉       | 2517/8421 [00:04<00:10, 562.58it/s]
Traceback (most recent call last):
File "C:/Users/Admin/Desktop/WikiSQL-master/evaluate.py", line 29, in
gold = engine.execute_query(eg['table_id'], qg, lower=True)
File "C:\Users\Admin\Desktop\WikiSQL-master\lib\dbengine.py", line 18, in execute_query
return self.execute(table_id, query.sel_index, query.agg_index, query.conditions, *args, **kwargs)
File "C:\Users\Admin\Desktop\WikiSQL-master\lib\dbengine.py", line 40, in execute
val = float(parse_decimal(val))
File "C:\Users\Admin\anaconda3\lib\site-packages\babel\numbers.py", line 707, in parse_decimal
group_symbol = get_group_symbol(locale)
File "C:\Users\Admin\anaconda3\lib\site-packages\babel\numbers.py", line 332, in get_group_symbol
return Locale.parse(locale).number_symbols.get('group', u',')
AttributeError: 'NoneType' object has no attribute 'number_symbols'
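This error usually means Babel could not resolve a default locale from the environment (common on Windows, where LC_NUMERIC is often unset, so Locale.parse returns None). A possible workaround, not an official fix, is to pass an explicit locale wherever lib/dbengine.py calls parse_decimal:

```python
from babel.numbers import parse_decimal

# Hedged workaround sketch: use an explicit locale instead of the LC_NUMERIC
# environment default, which may be unresolvable on Windows.
# In lib/dbengine.py this would mean changing
#     val = float(parse_decimal(val))
# to something like:
val = float(parse_decimal('1,234.5', locale='en_US'))
print(val)
# 1234.5
```

Alternatively, setting the LC_ALL/LANG environment variables to a concrete locale (e.g. en_US.UTF-8) before running evaluate.py may also resolve it.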

What is the input to annotate.py

Hi,

Could you please share the input format or an example file for annotate.py? I would like to create SQL queries for a new, separate SQL database.

Thank you,
Sandy
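Until the authors reply: judging from the records quoted elsewhere in this tracker, annotate.py reads JSON-lines files, one example per line, plus a companion tables file keyed by table_id. A hedged sketch of the shapes involved (field names are copied from quoted records; the "id" and "rows" fields of the tables file are assumptions, so check them against your copy of data/*.tables.jsonl):

```python
import json

# One example line, as quoted in other issues in this tracker.
example = {
    "phase": 1,
    "table_id": "1-1000181-1",
    "question": "Tell me what the notes are for South Australia ",
    "sql": {"sel": 5, "conds": [[3, 0, "SOUTH AUSTRALIA"]], "agg": 0},
}

# Companion table schema; the "rows" content here is a placeholder assumption.
table = {
    "id": "1-1000181-1",
    "header": ["State/territory", "Text/background colour", "Format",
               "Current slogan", "Current series", "Notes"],
    "rows": [],
}

example_line = json.dumps(example)   # one line of e.g. data/train.jsonl
table_line = json.dumps(table)       # one line of e.g. data/train.tables.jsonl
print(example_line)
```

Writing one such JSON object per line into a pair of .jsonl files should give annotate.py inputs in the same shape as the released data.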

query word "symend" is not in input vocabulary

Thanks for sharing the dataset and preprocessing scripts.

I tried to run the following command to generate examples:

python annotate.py 

But the error message indicates that is_valid_example() returns False because of "symend":

annotating data/train.jsonl
loading tables
100%|██████████████████████████████████████████████████████████| 18585/18585 [00:01<00:00, 11703.27it/s]
loading examples
  0%|                                                                         | 0/61297 [00:00<?, ?it/s]query word "symend" is not in input vocabulary.
[u'symsyms', u'symselect', u'symwhere', u'symand', u'symcol', u'symtable', u'symcaption', u'sympage', u'symsection', u'symop', u'symcond', u'symquestion', u'symagg', u'symaggops', u'symcondops', u'symaggops', u'max', u'min', u'count', u'sum', u'avg', u'symcondops', u'=', u'>', u'<', u'op', u'symtable', u'symcol', u'state/territory', u'symcol', u'text/background', u'colour', u'symcol', u'format', u'symcol', u'current', u'slogan', u'symcol', u'current', u'series', u'symcol', u'notes', u'symquestion', u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia']

Traceback (most recent call last):
  File "annotate.py", line 114, in <module>
    raise Exception(str(a))
Exception: {'seq_output': {'gloss': [u'SYMSELECT', u'SYMAGG', u'SYMCOL', u'Notes', u'SYMWHERE', u'SYMCOL', u'Current', u'slogan', u'SYMOP', u'=', u'SYMCOND', u'SOUTH', u'AUSTRALIA', u'SYMEND'], 'words': [u'symselect', u'symagg', u'symcol', u'notes', u'symwhere', u'symcol', u'current', u'slogan', u'symop', u'=', u'symcond', u'south', u'australia', u'symend'], 'after': [u' ', u'  ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'where_output': {'gloss': [u'SYMWHERE', u'SYMCOL', u'Current', u'slogan', u'SYMOP', u'=', u'SYMCOND', u'SOUTH', u'AUSTRALIA', u'SYMEND'], 'words': [u'symwhere', u'symcol', u'current', u'slogan', u'symop', u'=', u'symcond', u'south', u'australia', u'symend'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'question': {'gloss': [u'Tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'South', u'Australia'], 'words': [u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}, 'table_id': u'1-1000181-1', 'table': {'header': [{'gloss': [u'State/territory'], 'words': [u'state/territory'], 'after': [u'']}, {'gloss': [u'Text/background', u'colour'], 'words': [u'text/background', u'colour'], 'after': [u' ', u'']}, {'gloss': [u'Format'], 'words': [u'format'], 'after': [u'']}, {'gloss': [u'Current', u'slogan'], 'words': [u'current', u'slogan'], 'after': [u' ', u'']}, {'gloss': [u'Current', u'series'], 'words': [u'current', u'series'], 'after': [u' ', u'']}, {'gloss': [u'Notes'], 'words': [u'notes'], 'after': [u'']}]}, 'query': {u'agg': 0, u'sel': 5, u'conds': [[3, 0, {'gloss': [u'SOUTH', u'AUSTRALIA'], 'words': [u'south', u'australia'], 'after': [u' ', u'']}]]}, 'seq_input': {'gloss': [u'SYMSYMS', u'SYMSELECT', u'SYMWHERE', u'SYMAND', u'SYMCOL', u'SYMTABLE', u'SYMCAPTION', u'SYMPAGE', u'SYMSECTION', u'SYMOP', u'SYMCOND', u'SYMQUESTION', u'SYMAGG', u'SYMAGGOPS', u'SYMCONDOPS', u'SYMAGGOPS', u'MAX', u'MIN', 
u'COUNT', u'SUM', u'AVG', u'SYMCONDOPS', u'=', u'>', u'<', u'OP', u'SYMTABLE', u'SYMCOL', u'State/territory', u'SYMCOL', u'Text/background', u'colour', u'SYMCOL', u'Format', u'SYMCOL', u'Current', u'slogan', u'SYMCOL', u'Current', u'series', u'SYMCOL', u'Notes', u'SYMQUESTION', u'Tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'South', u'Australia'], 'words': [u'symsyms', u'symselect', u'symwhere', u'symand', u'symcol', u'symtable', u'symcaption', u'sympage', u'symsection', u'symop', u'symcond', u'symquestion', u'symagg', u'symaggops', u'symcondops', u'symaggops', u'max', u'min', u'count', u'sum', u'avg', u'symcondops', u'=', u'>', u'<', u'op', u'symtable', u'symcol', u'state/territory', u'symcol', u'text/background', u'colour', u'symcol', u'format', u'symcol', u'current', u'slogan', u'symcol', u'current', u'series', u'symcol', u'notes', u'symquestion', u'tell', u'me', u'what', u'the', u'notes', u'are', u'for', u'south', u'australia'], 'after': [u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'  ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u' ', u'']}}

unable to load library CoreNLPClient

Hello Team,

While executing annotate.py from the repo, I am getting the error below for CoreNLPClient:

annotating data/train.jsonl
loading tables
100% 18585/18585 [00:00<00:00, 19025.85it/s]
loading examples
0% 0/56355 [00:00<?, ?it/s]
Traceback (most recent call last):
File "annotate.py", line 113, in
a = annotate_example(d, tables[d['table_id']])
File "annotate.py", line 38, in annotate_example
ann['question'] = annotate(example['question'])
File "annotate.py", line 20, in annotate
client = CoreNLPClient(default_annotators='ssplit,tokenize'.split(','))
NameError: name 'CoreNLPClient' is not defined

Please advise
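The NameError suggests that the import of CoreNLPClient at the top of annotate.py failed silently, so the name was never defined. A small, hypothetical sanity check for the underlying dependency (the helper is mine; the stanza package name is taken from the Colab traceback in another issue above):

```python
import importlib.util

def dependency_hint(pkg):
    """Return an install hint if pkg is missing, else None."""
    if importlib.util.find_spec(pkg) is None:
        return f"{pkg} is not installed; try: pip install {pkg}"
    return None

# json ships with Python, so no hint is expected for it;
# stanza may or may not be present in your environment.
print(dependency_hint("json"))
print(dependency_hint("stanza"))
```

If the dependency is missing, installing it (and making sure a CoreNLP server is reachable) should be the first step before rerunning annotate.py.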

Error while running evaluate.py

I tried running evaluate.py with the dev data.
The command I ran in cmd:
python evaluate.py data\dev.jsonl data\dev.db test\example.pred.dev.jsonl

I'm getting this error:

[screenshot of the error]

Can someone help where I'm going wrong?

Regards,

Data collection template

Hi @vzhong ,
I've been testing some models on your data, and now I would like to create my own data following your format. In your paper you said that you made available examples of the interface used during paraphrasing. Where could I find the template you used for your data collection?

Thanks

Can you provide the original data of WikiSQL?

At present, we are conducting a multimodal task evaluation and hope to include WikiSQL, but it contains no image or layout information. We hope you can provide the original data.

Thanks

Error in the first train record

In train.jsonl, the first record's sql->conds attribute says 3, but it should say 0 instead.

For reference following is the object
{'phase': 1, 'table_id': '1-1000181-1', 'question': 'Tell me what the notes are for South Australia ', 'sql': {'sel': 5, 'conds': [[3, 0, 'SOUTH AUSTRALIA']], 'agg': 0}}

and following is the header for the particular table
["State/territory", "Text/background colour", "Format", "Current slogan", "Current series", "Notes"]

Intuitively, it should choose the State/territory column in the WHERE clause.

Following is the translated query
SELECT Notes AS result FROM table_1_1000181_1 WHERE Current slogan = 'south australia';
tell me what the notes are for south australia
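As context for other readers: each entry in conds is a [column_index, operator_index, value] triple, so the record as released conditions on column 3 ("Current slogan"), which matches the translated query above. Decoding it explicitly (operator list as in lib/query.py):

```python
# Decode the condition triple from the quoted record against the table header.
header = ["State/territory", "Text/background colour", "Format",
          "Current slogan", "Current series", "Notes"]
cond_ops = ['=', '>', '<', 'OP']   # as in lib/query.py

col, op, val = [3, 0, "SOUTH AUSTRALIA"]
decoded = f"{header[col]} {cond_ops[op]} '{val}'"
print(decoded)
# Current slogan = 'SOUTH AUSTRALIA'
```

Whether "Current slogan" or "State/territory" is the correct annotation for this question is a separate matter; the decoding above only shows what the released record actually encodes.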

Slight improvement:

In the installation notes, the instruction for cloning the repository says to use the git clone url command. The URL, however, lacks the .git extension, because of which the git-lfs checkout hook doesn't get executed. It took me some time to realise why I was not able to extract the data; I was getting the error that "(stdin) is not a bzip2 file".
Let me know what the details of this problem are.

annotate.py throws exception: query word '.' is not in input vocabulary.

query word "." is not in input vocabulary.
['symsyms', 'symselect', 'symwhere', 'symand', 'symcol', 'symtable', 'symcaption', 'sympage', 'symsection', 'symop', 'symcond', 'symquestion', 'symagg', 'symaggops', 'symcondops', 'symaggops', 'max', 'min', 'count', 'sum', 'avg', 'symcondops', '=', '>', '<', 'op', 'symtable', 'symcol', 'species', 'symcol', 'indole', 'symcol', 'methyl', 'red', 'symcol', 'voges-proskauer', 'symcol', 'citrate', 'symquestion', 'what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'symend']
Traceback (most recent call last):
File "annotate.py", line 119, in
raise Exception(str(a))
Exception: {'table_id': '1-16083989-1', 'question': {'gloss': ['What', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?'], 'words': ['what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', '']}, 'table': {'header': [{'gloss': ['Species'], 'words': ['species'], 'after': ['']}, {'gloss': ['Indole'], 'words': ['indole'], 'after': ['']}, {'gloss': ['Methyl', 'Red'], 'words': ['methyl', 'red'], 'after': [' ', '']}, {'gloss': ['Voges-Proskauer'], 'words': ['voges-proskauer'], 'after': ['']}, {'gloss': ['Citrate'], 'words': ['citrate'], 'after': ['']}]}, 'query': {'sel': 4, 'conds': [[0, 0, {'gloss': ['Salmonella', 'spp', '.'], 'words': ['salmonella', 'spp.', '.'], 'after': [' ', '', '']}]], 'agg': 3}, 'seq_input': {'gloss': ['SYMSYMS', 'SYMSELECT', 'SYMWHERE', 'SYMAND', 'SYMCOL', 'SYMTABLE', 'SYMCAPTION', 'SYMPAGE', 'SYMSECTION', 'SYMOP', 'SYMCOND', 'SYMQUESTION', 'SYMAGG', 'SYMAGGOPS', 'SYMCONDOPS', 'SYMAGGOPS', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG', 'SYMCONDOPS', '=', '>', '<', 'OP', 'SYMTABLE', 'SYMCOL', 'Species', 'SYMCOL', 'Indole', 'SYMCOL', 'Methyl', 'Red', 'SYMCOL', 'Voges-Proskauer', 'SYMCOL', 'Citrate', 'SYMQUESTION', 'What', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'SYMEND'], 'words': ['symsyms', 'symselect', 'symwhere', 'symand', 'symcol', 'symtable', 'symcaption', 'sympage', 'symsection', 'symop', 'symcond', 'symquestion', 'symagg', 'symaggops', 'symcondops', 'symaggops', 'max', 'min', 'count', 'sum', 'avg', 'symcondops', '=', '>', '<', 'op', 'symtable', 'symcol', 'species', 'symcol', 'indole', 'symcol', 'methyl', 'red', 'symcol', 'voges-proskauer', 'symcol', 'citrate', 'symquestion', 'what', 'is', 'the', 'result', 'for', 'salmonella', 'spp.', 'if', 'you', 'use', 'citrate', '?', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' 
', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}, 'seq_output': {'gloss': ['SYMSELECT', 'SYMAGG', 'COUNT', 'SYMCOL', 'Citrate', 'SYMWHERE', 'SYMCOL', 'Species', 'SYMOP', '=', 'SYMCOND', 'Salmonella', 'spp', '.', 'SYMEND'], 'words': ['symselect', 'symagg', 'count', 'symcol', 'citrate', 'symwhere', 'symcol', 'species', 'symop', '=', 'symcond', 'salmonella', 'spp.', '.', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}, 'where_output': {'gloss': ['SYMWHERE', 'SYMCOL', 'Species', 'SYMOP', '=', 'SYMCOND', 'Salmonella', 'spp', '.', 'SYMEND'], 'words': ['symwhere', 'symcol', 'species', 'symop', '=', 'symcond', 'salmonella', 'spp.', '.', 'symend'], 'after': [' ', ' ', ' ', ' ', ' ', ' ', ' ', '', ' ', '']}}

How do I begin to use this on my own database?

Hi,

I'm no NLP expert, but I would like to understand how I can apply this library and its techniques to my own data. Is there a hello-world example of how to modify the data files so I can use NL queries over my own data?

Invalid File Names while cloning the GitHub repo

On cloning this repository using the command git clone https://github.com/salesforce/WikiSQL,
Git is able to download the repository but is not able to extract all the files. The following error is encountered:

Cloning into 'WikiSQL'...
remote: Enumerating objects: 386, done.
remote: Counting objects: 100% (192/192), done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 386 (delta 185), reused 154 (delta 154), pack-reused 194
Receiving objects: 100% (386/386), 50.72 MiB | 19.88 MiB/s, done.
Resolving deltas: 100% (212/212), done.
error: unable to create file collection/paraphrase/Icon?: Invalid argument
error: unable to create file collection/paraphrase/paraphrase_files/Icon?: Invalid argument
error: unable to create file collection/verify/Icon?: Invalid argument
error: unable to create file collection/verify/verify_files/Icon?: Invalid argument
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

This is the output of running git status

On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	deleted:    .dockerignore
	deleted:    .gitattributes
	deleted:    .gitignore
	deleted:    .travis.yml
	deleted:    CODEOWNERS
	deleted:    LICENSE
	deleted:    README.md
	deleted:    annotate.py
	deleted:    collection/README.md
	deleted:    "collection/paraphrase/Icon\r"
	deleted:    collection/paraphrase/index.html
	deleted:    "collection/paraphrase/paraphrase_files/Icon\r"
	deleted:    collection/paraphrase/paraphrase_files/bootstrap.min.css
	deleted:    collection/paraphrase/paraphrase_files/bootstrap.min.js
	deleted:    collection/paraphrase/paraphrase_files/jquery-3.2.1.min.js
	deleted:    collection/paraphrase/paraphrase_files/toastr.min.css
	deleted:    collection/paraphrase/paraphrase_files/toastr.min.js
	deleted:    "collection/verify/Icon\r"
	deleted:    collection/verify/verify.html
	deleted:    "collection/verify/verify_files/Icon\r"
	deleted:    collection/verify/verify_files/bootstrap.min.css
	deleted:    collection/verify/verify_files/bootstrap.min.js
	deleted:    collection/verify/verify_files/jquery-3.2.1.min.js
	deleted:    collection/verify/verify_files/toastr.min.css
	deleted:    collection/verify/verify_files/toastr.min.js
	deleted:    data.tar.bz2
	deleted:    evaluate.py
	deleted:    lib/__init__.py
	deleted:    lib/common.py
	deleted:    lib/dbengine.py
	deleted:    lib/query.py
	deleted:    lib/table.py
	deleted:    requirements.txt
	deleted:    test/Dockerfile
	deleted:    test/check.py
	deleted:    test/example.pred.dev.jsonl.bz2
	deleted:    version.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.dockerignore
	.gitattributes
	.gitignore
	.travis.yml
	CODEOWNERS
	LICENSE
	README.md
	annotate.py
	collection/
	data.tar.bz2
	evaluate.py
	lib/
	requirements.txt
	test/
	version.txt

And the output of running git restore --source=HEAD :/ as suggested:

error: unable to create file collection/paraphrase/Icon?: Invalid argument
error: unable to create file collection/paraphrase/paraphrase_files/Icon?: Invalid argument
error: unable to create file collection/verify/Icon?: Invalid argument
error: unable to create file collection/verify/verify_files/Icon?: Invalid argument

It seems the issue is with the file names, which contain question marks, a character that is not allowed in file names on some file systems.

I attempted to resolve the issue by downloading the missing files directly from GitHub into the directories where they are supposed to be, for example this file. But it is not possible to download this file as-is, and using wget fails as well.

An alternative method I tried was to download the master branch as a ZIP file and extract it using the unzip WikiSQL-master.zip command. This works fine and, in fact, even the offending files (such as collection/paraphrase/Icon) were successfully extracted with no illegal characters in their names. It seems this is an issue with how Git checks out the files in this repository.

[BUG] Docker build via test/Dockerfile fails

OS is Ubuntu x64

yummyyyy@yummyyyy-virtual-machine:~/公共的/WikiSQL$

sudo docker build -t wikisqltest -f test/Dockerfile .

Sending build context to Docker daemon 80.38MB
Step 1/8 : FROM python:3.6.2-alpine
---> 294201c0731f
Step 2/8 : RUN mkdir -p /eval
---> Using cache
---> 998bcbd64c08
Step 3/8 : WORKDIR /eval
---> Using cache
---> 94acbfabbb4f
Step 4/8 : ADD . /eval/
---> Using cache
---> 942024fe38cb
Step 5/8 : RUN pip install -r requirements.txt
---> Running in fbdf67763a18
Collecting tqdm (from -r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/9c/05/cf212f57daa0eb6106fa668a04d74d932e9881fd4a22f322ea1dadb5aba0/tqdm-4.62.2-py2.py3-none-any.whl (76kB)
Collecting records (from -r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/ef/93/2467c761ea3729713ab97842a46cc125ad09d14a0a174cb637bee4983911/records-0.5.3-py2.py3-none-any.whl
Collecting babel (from -r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/aa/96/4ba93c5f40459dc850d25f9ba93f869a623e77aaecc7a9344e19c01942cf/Babel-2.9.1-py2.py3-none-any.whl (8.8MB)
Collecting tabulate (from -r requirements.txt (line 4))
Downloading https://files.pythonhosted.org/packages/ca/80/7c0cad11bd99985cfe7c09427ee0b4f9bd6b048bd13d4ffb32c6db237dfb/tabulate-0.8.9-py3-none-any.whl
Collecting openpyxl<2.5.0 (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/77/26/0bd1a39776f53b4f28e5bb1d26b3fcd99068584a7e1ddca4e09c0d5fd592/openpyxl-2.4.11.tar.gz (158kB)
Collecting SQLAlchemy; python_version >= "3.0" (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/ad/c7/61ff52be84f5ac86c72672ceac941981f1685b4ef90391d405a1f89677d0/SQLAlchemy-1.4.23.tar.gz (7.7MB)
Collecting docopt (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/a2/55/8f8cab2afd404cf578136ef2cc5dfb50baa1761b68c9da1fb1e4eed343c9/docopt-0.6.2.tar.gz
Collecting tablib>=0.11.4 (from records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/16/85/078fc037b15aa1120d6a0287ec9d092d93d632ab01a0e7a3e69b4733da5e/tablib-3.0.0-py3-none-any.whl (47kB)
Collecting pytz>=2015.7 (from babel->-r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/70/94/784178ca5dd892a98f113cdd923372024dc04b8d40abe77ca76b5fb90ca6/pytz-2021.1-py2.py3-none-any.whl (510kB)
Collecting jdcal (from openpyxl<2.5.0->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/f0/da/572cbc0bc582390480bbd7c4e93d14dc46079778ed915b505dc494b37c57/jdcal-1.4.1-py2.py3-none-any.whl
Collecting et_xmlfile (from openpyxl<2.5.0->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/96/c2/3dd434b0108730014f1b96fd286040dc3bcb70066346f7e01ec2ac95865f/et_xmlfile-1.1.0-py3-none-any.whl
Collecting importlib-metadata (from SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/71/c2/cb1855f0b2a0ae9ccc9b69f150a7aebd4a8d815bd951e74621c4154c52a8/importlib_metadata-4.8.1-py3-none-any.whl
Collecting greenlet!=0.4.17 (from SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/72/7e/d8586068d47adba73afc085249712bd266cd7ffbf27d8bc260c33e9d6133/greenlet-1.1.1.tar.gz (85kB)
Collecting zipp>=0.5 (from importlib-metadata->SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/92/d9/89f433969fb8dc5b9cbdd4b4deb587720ec1aeb59a020cf15002b9593eef/zipp-3.5.0-py3-none-any.whl
Collecting typing-extensions>=3.6.4; python_version < "3.8" (from importlib-metadata->SQLAlchemy; python_version >= "3.0"->records->-r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/74/60/18783336cc7fcdd95dae91d73477830aa53f5d3181ae4fe20491d7fc3199/typing_extensions-3.10.0.2-py3-none-any.whl
Building wheels for collected packages: openpyxl, SQLAlchemy, docopt, greenlet
Running setup.py bdist_wheel for openpyxl: started
Running setup.py bdist_wheel for openpyxl: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/59/44/27/63b211425501ad51d197ff8ed00e9e469e38b9e516cb69b1c2
Running setup.py bdist_wheel for SQLAlchemy: started
Running setup.py bdist_wheel for SQLAlchemy: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/7d/52/1c/117179bb38418ab4e06deb5c8288acd8ee1e0b418f5e59608f
Running setup.py bdist_wheel for docopt: started
Running setup.py bdist_wheel for docopt: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/9b/04/dd/7daf4150b6d9b12949298737de9431a324d4b797ffd63f526e
Running setup.py bdist_wheel for greenlet: started
Running setup.py bdist_wheel for greenlet: finished with status 'error'
Complete output from command /usr/local/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d /tmp/tmpug655ghzpip-wheel- --python-tag cp36:
/usr/local/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'project_urls'
warnings.warn(msg)
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/__init__.py -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator_nested.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_stack_saved.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_leaks.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_version.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_gc.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_cpp.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_tracing.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_weakref.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_extension_interface.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_greenlet.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_contextvars.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_throw.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/__init__.py -> build/lib.linux-x86_64-3.6/greenlet/tests
running egg_info
writing src/greenlet.egg-info/PKG-INFO
writing dependency_links to src/greenlet.egg-info/dependency_links.txt
writing requirements to src/greenlet.egg-info/requires.txt
writing top-level names to src/greenlet.egg-info/top_level.txt
reading manifest file 'src/greenlet.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'docs/_build'
warning: no files found matching '*.py' under directory 'appveyor'
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
warning: no previously-included files matching '.coverage' found anywhere in distribution
writing manifest file 'src/greenlet.egg-info/SOURCES.txt'
copying src/greenlet/greenlet.c -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/greenlet.h -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/slp_platformselect.h -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/setup_switch_x64_masm.cmd -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_aarch64_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_alpha_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_amd64_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_ios.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_csky_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_m68k_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_mips_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_macosx.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_riscv_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_s390_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_sparc_sun_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x32_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.asm -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.obj -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/tests/_test_extension.c -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/_test_extension_cpp.cpp -> build/lib.linux-x86_64-3.6/greenlet/tests
running build_ext
building 'greenlet._greenlet' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
creating build/temp.linux-x86_64-3.6/src/greenlet
gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/local/include/python3.6m -c src/greenlet/greenlet.c -o build/temp.linux-x86_64-3.6/src/greenlet/greenlet.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1


Failed building wheel for greenlet
Running setup.py clean for greenlet
Successfully built openpyxl SQLAlchemy docopt
Failed to build greenlet
Installing collected packages: tqdm, jdcal, et-xmlfile, openpyxl, zipp, typing-extensions, importlib-metadata, greenlet, SQLAlchemy, docopt, tablib, records, pytz, babel, tabulate
Running setup.py install for greenlet: started
Running setup.py install for greenlet: finished with status 'error'
Complete output from command /usr/local/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tmdjgba9-record/install-record.txt --single-version-externally-managed --compile:
/usr/local/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'project_urls'
warnings.warn(msg)
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/__init__.py -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator_nested.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_stack_saved.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_leaks.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_version.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_gc.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_cpp.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_tracing.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_weakref.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_extension_interface.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_greenlet.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_contextvars.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_throw.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/test_generator.py -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/__init__.py -> build/lib.linux-x86_64-3.6/greenlet/tests
running egg_info
writing src/greenlet.egg-info/PKG-INFO
writing dependency_links to src/greenlet.egg-info/dependency_links.txt
writing requirements to src/greenlet.egg-info/requires.txt
writing top-level names to src/greenlet.egg-info/top_level.txt
reading manifest file 'src/greenlet.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'docs/_build'
warning: no files found matching '*.py' under directory 'appveyor'
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
warning: no previously-included files matching '.coverage' found anywhere in distribution
writing manifest file 'src/greenlet.egg-info/SOURCES.txt'
copying src/greenlet/greenlet.c -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/greenlet.h -> build/lib.linux-x86_64-3.6/greenlet
copying src/greenlet/slp_platformselect.h -> build/lib.linux-x86_64-3.6/greenlet
creating build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/setup_switch_x64_masm.cmd -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_aarch64_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_alpha_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_amd64_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_arm32_ios.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_csky_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_m68k_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_mips_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc64_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_aix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_linux.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_macosx.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_ppc_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_riscv_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_s390_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_sparc_sun_gcc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x32_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.asm -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_masm.obj -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x64_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_msvc.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/platform/switch_x86_unix.h -> build/lib.linux-x86_64-3.6/greenlet/platform
copying src/greenlet/tests/_test_extension.c -> build/lib.linux-x86_64-3.6/greenlet/tests
copying src/greenlet/tests/_test_extension_cpp.cpp -> build/lib.linux-x86_64-3.6/greenlet/tests
running build_ext
building 'greenlet._greenlet' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
creating build/temp.linux-x86_64-3.6/src/greenlet
gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/local/include/python3.6m -c src/greenlet/greenlet.c -o build/temp.linux-x86_64-3.6/src/greenlet/greenlet.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1

----------------------------------------

Command "/usr/local/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-e0uqasn_/greenlet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tmdjgba9-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-e0uqasn_/greenlet/
You are using pip version 9.0.1, however version 21.2.4 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
The command '/bin/sh -c pip install -r requirements.txt' returned a non-zero code: 1
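Judging from the log above, the failure is not in the pinned versions themselves but in the missing C compiler: pip found no usable greenlet wheel for this platform, fell back to building it from source, and hit "unable to execute 'gcc'". One possible fix (a sketch, assuming a Debian/Ubuntu-based base image) is to install a C toolchain before the `pip install` step in the Dockerfile:

```dockerfile
# Hypothetical Dockerfile fragment: install a C toolchain so pip can
# compile greenlet from source (assumes a Debian/Ubuntu base image).
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libc6-dev \
    && rm -rf /var/lib/apt/lists/*
RUN pip install -r requirements.txt
```

Alternatively, a newer pip (the log itself suggests upgrading from 9.0.1) may be able to fetch a prebuilt manylinux wheel for greenlet and skip compilation entirely.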

How to differentiate 'AND' vs 'OR'

  • In the JSON, the "conds" field is just an array of conditions separated by commas, for both OR and AND queries.
    So what signals whether the conditions are combined with "AND" or with "OR"?
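For reference, WikiSQL's query format can only express conjunctions: lib/query.py joins every entry of "conds" with AND, and there is no field for OR. A minimal sketch of that rendering (the operator and aggregation lists mirror the ones in the official repo, but the helper itself is illustrative, not the repo's code):

```python
# WikiSQL's fixed aggregation and condition-operator vocabularies.
AGG_OPS = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
COND_OPS = ['=', '>', '<', 'OP']

def to_sql(sql_dict, table='table'):
    """Render a WikiSQL 'sql' dict as a SQL string; conds are always AND-joined."""
    sel = f"col{sql_dict['sel']}"
    agg = AGG_OPS[sql_dict['agg']]
    if agg:
        sel = f"{agg}({sel})"
    where = ' AND '.join(
        f"col{col} {COND_OPS[op]} {val!r}" for col, op, val in sql_dict['conds'])
    query = f"SELECT {sel} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query

print(to_sql({'sel': 5, 'conds': [[3, 0, 'SOUTH AUSTRALIA']], 'agg': 0}))
# -> SELECT col5 FROM table WHERE col3 = 'SOUTH AUSTRALIA'
```

So when two conditions appear in "conds", the intended query is always `cond1 AND cond2`; a dataset with OR semantics would need a different representation.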

\mathrm in question

Hi @vzhong

I think I found some unusual questions in train.jsonl (see the image below) which contain lots of \mathrm.

[screenshot: examples of train.jsonl questions containing \mathrm]

In my humble opinion, since there are only a few of them (6 questions), could they either be removed or converted to plain text (rather than LaTeX form)?

Thanks!
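If removing the examples is undesirable, the conversion could be as simple as unwrapping the LaTeX markup. A hedged sketch (handles only non-nested `\mathrm{...}` groups, which should cover the 6 questions mentioned):

```python
import re

def strip_mathrm(question):
    """Replace \mathrm{X} with X; nested braces are not handled."""
    return re.sub(r'\\mathrm\{([^{}]*)\}', r'\1', question)

print(strip_mathrm(r'What is \mathrm{He} used for?'))
# -> What is He used for?
```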

Not sure about NSM

The Neural Symbolic Machines (NSM) paper says:

[screenshot: excerpt from the NSM paper]

So I am not sure whether NSM uses the logical forms for training.

[screenshot: second excerpt]

[None] gold questions

I've instrumented evaluate.py to count the gold questions whose gold answer is [None]. In the dev set, evaluated via the Docker instructions on the front page, 657 / 8421 ≈ 8% of questions produce a [None] result. I've manually looked at a few such questions, and they have various annotation errors, most notably condition values that lack the space before commas present in the table values, and > used where the question implies >=.

Is this a known issue?

{'phase': 2, 'table_id': '2-12955969-1', 'question': 'What is the year of the tournament played at Melbourne, Australia?', 'sql': {'sel': 0, 'conds': [[2, 0, 'melbourne, australia']], 'agg': 5}}
SELECT AVG(col0) AS result FROM table_2_12955969_1 WHERE col2 = :col2
{'col2': 'melbourne, australia'}
Manual fix: SELECT AVG(col0) AS result FROM table_2_12955969_1 WHERE col2 = 'melbourne , australia';

{'phase': 2, 'table_id': '2-12312050-1', 'question': "What's the sum of points for the 1963 season when there are more than 30 games?", 'sql': {'sel': 4, 'conds': [[0, 0, '1963'], [2, 1, 30]], 'agg': 4}}
SELECT SUM(col4) AS result FROM table_2_12312050_1 WHERE col0 = :col0 AND col2 > :col2
{'col0': '1963', 'col2': 30}
Manual fix: SELECT SUM(col4) AS result FROM table_2_12312050_1 WHERE col0 = '1963' AND col2 >= 30;
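The second example shows why the strict operator yields [None]: SQL aggregates like SUM return NULL over an empty match set. A self-contained demonstration with hypothetical table values (the row below is invented to make the boundary case concrete, not taken from the real table):

```python
import sqlite3

# Tiny stand-in for table_2_12312050_1: (season, games, points).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col0 TEXT, col2 INTEGER, col4 INTEGER)")
conn.execute("INSERT INTO t VALUES ('1963', 30, 121)")  # exactly 30 games

# Gold annotation uses strict '>' although the question says "more than 30".
strict = conn.execute(
    "SELECT SUM(col4) FROM t WHERE col0 = '1963' AND col2 > 30").fetchone()
fixed = conn.execute(
    "SELECT SUM(col4) FROM t WHERE col0 = '1963' AND col2 >= 30").fetchone()

print(strict)  # (None,) -- SUM over an empty match set yields NULL
print(fixed)   # (121,)
```

So whenever the gold condition excludes every row, execution "succeeds" but returns [None], which is what the 657 counted questions have in common.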
