Using vega_lite_spec = nl4dv_instance.render_vis(query) will rightly raise an error, because in that case vega_lite_spec contains more than just the Vega-Lite specification (vlSpec).
Using output["visList"][0]["vlSpec"] is syntactically correct. To help debug this, do you mind doing the following:
- Share an (anonymized) version of the output JSON of output["visList"][0]["vlSpec"]. I mainly want to look at the "url" property within the "data" property: is the path correct and accessible by Streamlit? If you believe the URL is correct, I would replace "url" with "values", supply some inline test data (example), and paste the updated Vega-Lite spec into the online Vega-Lite editor, just to pinpoint whether the data is the issue.
From the Streamlit [documentation](https://docs.streamlit.io/library/api-reference/charts/st.vega_lite_chart#:~:text=st.vega_lite_chart(data,for%20more%20info.), Streamlit seems to accept both data and a spec that already contains data. Syntactically, what you did is fine; it is quite possible that there is some issue with the "data" key.
- Check whether there are unwanted transformations (filters) in the vlSpec JSON. These can sometimes get applied when some of your query keywords match a filter criterion, leaving few or no data values and hence an empty chart. This should not be the case here, but it is still a small possibility.
It would be great if you could share the vlSpec JSON output(s).
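For the inline-data check suggested above, a minimal sketch could look like the following. The helper name `with_inline_values`, the placeholder path, and the sample rows are all made up for illustration; only the "data", "url", and "values" keys come from the Vega-Lite spec itself.

```python
import copy
import json


def with_inline_values(vl_spec, rows):
    """Return a copy of vl_spec whose "data" holds inline "values" instead of a "url"."""
    spec = copy.deepcopy(vl_spec)  # leave the original spec untouched
    spec["data"] = {"values": rows}
    return spec


# Hypothetical spec and test rows, only to illustrate the swap.
original_spec = {
    "data": {"format": {"type": "csv"}, "url": "some/path.csv"},
    "mark": {"type": "bar"},
    "encoding": {
        "x": {"field": "State", "type": "nominal"},
        "y": {"field": "Expenditure", "type": "quantitative", "aggregate": "mean"},
    },
    "transform": [],
}
sample_rows = [
    {"State": "A", "Expenditure": 1.0},
    {"State": "B", "Expenditure": 2.0},
]

debug_spec = with_inline_values(original_spec, sample_rows)
# Paste this JSON into the online Vega-Lite editor to see whether the chart renders.
print(json.dumps(debug_spec, indent=2))
```

If the chart renders with inline values but not with the "url", the problem is most likely the data path rather than the encoding or the transforms.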
from nl4dv.
The following is the vlSpec output:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "format": {
      "type": "csv"
    },
    "url": ".\\Detailed Exp. of each MP 16th LS.csv"
  },
  "encoding": {
    "x": {
      "field": "State",
      "type": "nominal"
    },
    "y": {
      "aggregate": "mean",
      "axis": {
        "format": "s"
      },
      "field": "Expenditure",
      "type": "quantitative"
    }
  },
  "mark": {
    "tooltip": true,
    "type": "bar"
  },
  "transform": []
}
The data indeed appears to be the issue: after replacing the URL with inline data in the online Vega editor, I am seeing the output.
Please find attached the CSV file. It is publicly available data.
Detailed Exp. of each MP 16th LS.csv
Thanks for sharing. I suggest two potential solutions:
- Follow the Vega-Lite docs on how the data URL works. I am worried it doesn't like the ".\" prefix. Can you try different variations, e.g., just "Detailed Exp. of each MP 16th LS.csv", or "./Detailed Exp. of each MP 16th LS.csv"? It looks like you are using Windows; I am not used to backslashes.
- If the data is public and has an HTTP URL, you could replace the local path with that URL, either when you pass it as input to NL4DV or directly in the vlSpec. Alternatively, you could delete the "url" property in the vlSpec and instead add a new "values" property that supplies the array of objects (the inline data solution).
Let me know what works best for you.
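To try the forward-slash variation without retyping the path, something like this sketch may help. It uses only the standard library; the helper name `to_forward_slashes` is hypothetical, and whether the resulting path is actually reachable from the page that renders the chart still has to be checked separately.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Rewrite a Windows-style path so that it uses forward slashes only."""
    return PureWindowsPath(windows_path).as_posix()


local_path = ".\\Detailed Exp. of each MP 16th LS.csv"
print(to_forward_slashes(local_path))
```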
I tried the following:
df = pd.read_csv('Detailed Exp. of each MP 16th LS.csv')
print(df)
nl4dv_instance = NL4DV(data_value=df)
and also
nl4dv_instance.set_data(data_value=df)
and the dataframe content is printed correctly:
MPName Constituency State Entitlement FundReceivedGOI ... Cummulative amt for MPLAD works Amt Sanc Expenditure % UtilizationOverRelease UnspentBalance
0 Tabassum Hasan KAIRANA UTTAR PRADESH 5.0 2.5 ... 5.1121 5.0438 4.9787 197.148000 0.3569
1 Sarfaraz Alam ARARIA BIHAR 5.0 5.0 ... 9.9550 9.4817 9.0717 181.434000 0.4100
2 Rajiv Pratap Rudy SARAN BIHAR 32.5 15.0 ... 42.0000 31.0179 23.0726 153.317333 0.7835
3 Maganti Venkateswara Rao Magantti ELURU ANDHRA PRADESH 27.5 17.5 ... 33.9361 29.9008 23.6092 132.909714 2.5743
4 Rajendra Dhedya Gavit PALGHAR(ST) MAHARASTRA 5.0 5.0 ... 12.0365 6.8301 6.7016 132.032000 0.6363
.. ... ... ... ... ... ... ... ... ... ... ...
564 Raghu Sharma AJMER RAJASTHAN 5.0 5.0 ... 8.9551 2.7540 1.3038 25.076000 3.7276
565 Nagendra Pratap Singh Patel PHULPUR UTTAR PRADESH 5.0 5.0 ... 5.4363 2.4363 0.7666 14.332000 4.2741
566 Chand Nath ALWAR RAJASTHAN 2.5 2.5 ... 0.3400 0.3400 0.3852 13.408000 2.6350
567 Karan Singh Yadav ALWAR RAJASTHAN 20.0 17.5 ... 0.3400 0.3400 0.3852 2.201143 17.6350
568 Nanabhau Falgunrao Patole BHANDARA-GONDIYA MAHARASTRA 17.5 17.5 ... 0.0000 0.0000 0.0000 0.000000 17.5000
[569 rows x 11 columns]
But in both cases the generated Vega-Lite spec has no data:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "format": {
      "type": null
    },
    "url": null
  },
  "encoding": {
    "x": {
      "field": "State",
      "type": "nominal"
    },
    "y": {
      "aggregate": "mean",
      "axis": {
        "format": "s"
      },
      "field": "Expenditure",
      "type": "quantitative"
    }
  },
  "mark": {
    "tooltip": true,
    "type": "bar"
  },
  "transform": []
}
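A quick way to detect this situation in code, before handing the spec to a renderer, might be a check like the one below. This is only a sketch: `spec_has_data` is a hypothetical helper, not part of nl4dv or Streamlit, and it only looks at the "url" and "values" keys.

```python
def spec_has_data(vl_spec):
    """Return True if the spec's "data" block has a non-empty "url" or "values"."""
    data = vl_spec.get("data") or {}
    return bool(data.get("url")) or bool(data.get("values"))


# Mirrors the "data" block of the spec above, where both fields are null.
empty_spec = {"data": {"format": {"type": None}, "url": None}}
print(spec_has_data(empty_spec))
```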
As an interim solution, I have got it working with the following code:
import pandas as pd
import streamlit as st
from nl4dv import NL4DV

# Load the CSV into a dataframe and pass it to NL4DV directly.
df = pd.read_csv('Detailed Exp. of each MP 16th LS.csv')
nl4dv_instance = NL4DV(data_value=df)
dependency_parser_config = {"name": "spacy", "model": "en_core_web_sm", "parser": None}
nl4dv_instance.set_dependency_parser(config=dependency_parser_config)
query = "create a barchart showing average Expenditure across State"
output = nl4dv_instance.analyze_query(query)
vega_lite_spec = output["visList"][0]["vlSpec"]
# Drop the empty "data" block and let Streamlit supply the dataframe instead.
del vega_lite_spec['data']
st.vega_lite_chart(df, vega_lite_spec)
Oh, I see.
So, if you pass the "url" of the data, nl4dv keeps the "url" value in the vlSpec. However, because you passed the data by value (a pandas dataframe or a list of dictionaries), it drops the "data" from the output vlSpec object, expecting the developer to supply it back at the end.
This design choice exists because nl4dv recommends a list of visualizations (not just one) in the visList. If every vlSpec carried the raw data (values or dataframe), the output would become very large, which is a concern if someone has to transfer it from the backend to the frontend. Hence, we drop the "data" from the output spec. The developer can always supply it back, right? We let "url" values stay because they are just strings. What do you think?
In any case, we will check whether this note is in the current documentation (probably not); if not, I'll add it. Thanks.
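Supplying the data back for every recommended visualization could be sketched roughly as follows. The helper `attach_inline_data` and the tiny `output` dictionary are hypothetical stand-ins; only the "visList" and "vlSpec" keys follow the structure discussed in this thread.

```python
import copy


def attach_inline_data(vis_list, records):
    """Return copies of each vlSpec in vis_list with the records set as inline "values"."""
    specs = []
    for vis in vis_list:
        spec = copy.deepcopy(vis["vlSpec"])  # do not mutate the nl4dv output
        spec["data"] = {"values": records}
        specs.append(spec)
    return specs


# Stand-in for the analyze_query output when the data was passed by value.
output = {
    "visList": [
        {"vlSpec": {"data": {"format": {"type": None}, "url": None},
                    "mark": {"type": "bar"}}},
    ]
}
# With a pandas dataframe, df.to_dict(orient="records") produces this shape.
records = [
    {"State": "BIHAR", "Expenditure": 9.0717},
    {"State": "RAJASTHAN", "Expenditure": 1.3038},
]

specs_with_data = attach_inline_data(output["visList"], records)
```

This keeps the nl4dv output small and attaches the data only at the point where a chart is rendered.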