Comments (10)
MultiIndex
can be used for a DataFrame's _ columns_ as well. In that case, what behavior would you want to see? A single header row, or one header row for each level in the MultiIndex (which is what I would expect to see)?- The API seems to have changed with version 0.24.0, where "labels" got renamed to "codes". I really strive to have
gspread-dataframe
work with a wide range of Pandas versions and I strive not to call specific API methods when possible, precisely because of these kinds of API changes. I'm hoping there's a way to avoid referencing labels/codes directly in this package's code. - Please tell me more about how you'd like to see MultiIndex support: any extra parameters or options you'd like to see?
from gspread-dataframe.
MultiIndex
can be used for a DataFrame's _ columns_ as well. In that case, what behavior would you want to see? A single header row, or one header row for each level in the MultiIndex (which is what I would expect to see)?
the latter - one header row for each level
- The API seems to have changed with version 0.24.0, where "labels" got renamed to "codes". I really strive to have
gspread-dataframe
work with a wide range of Pandas versions and I strive not to call specific API methods when possible, precisely because of these kinds of API changes. I'm hoping there's a way to avoid referencing labels/codes directly in this package's code.
I understand the predicament you are in - backwards compatibility is important, but I see no reason to keep support for pre-1.0 pandas.
- Please tell me more about how you'd like to see MultiIndex support: any extra parameters or options you'd like to see?
Generally speaking, I would like the appearance of the uploaded sheet to be similar to what I see when a DataFrame
is printed.
In this case, I would like to have a separate column (with a header!) for each level.
I am torn between
foo one
foo two
and
foo one
two
I think there should be an option to select one of them, the default being the 1st variant.
The same goes for column indexes.
Thank you for your attention!
from gspread-dataframe.
@sam-s I've pushed a commit to master that offers support for MultiIndex indexes in set_with_dataframe
. (MultiIndexes have always been supported via index_col
parameter to get_as_dataframe
.)
If you test the package code directly from master (I added integration test coverage of the feature, and it passes), and you're happy with it, I can cut a new release today.
There is not yet support for:
- the second variant that displays levels without redundant values in higher levels;
- MultiIndex support for a dataframe's columns. That will come in a later commit; it's more complicated to implement.
from gspread-dataframe.
I also just noticed that pandas.read_csv interprets multiple header rows in an unexpected way. If you have a spreadsheet like
the columns headers are read as a MultiIndex with names ['SQL', 'Query']
and levels ['SQL', 'SQL', 'Misc', 'Misc', 'Misc'], ['Parameters', 'Plan', 'Date', 'Author', 'Misc']
. @sam-s Is that what you would expect?
from gspread-dataframe.
Basically,
| foo | foo |
| bar | baz |
| 1 | 2 |
is treated the same as
| foo |
| bar | baz |
| 1 | 2 |
which does seem wrong
from gspread-dataframe.
It gets even worse if there's an index column with a name, or if there's an index column that's a multi-index. From my local testing, I actually don't see a way for read_csv
in pandas to succeed in reading the levels of the non-index columns and the levels of the index column(s) properly.
I do think there's a way to support MultiIndex either for index column when include_index=True, or MultiIndex for dataframe.columns when include_column_header=True. I'll probably have MultiIndex for dataframe.columns implemented by tomorrow.
But not both in the same dataframe, at least not where a round-trip back from Google Sheets produces the same MultiIndexes. I'll implement so that at least a write will succeed, and let pandas.TextParser
take the blame for not reading it back properly. :)
from gspread-dataframe.
Implementation complete, even when include_index=True and index is MultiIndex and columns are MultiIndex. Release 3.0.5 is now on PyPI with this feature. Please confirm that 3.0.5 is working for you!
(The option to have the second display variant needs to wait. I'm going to open another issue just for that and mention you.)
from gspread-dataframe.
It works! Thank you!
from gspread-dataframe.
@sam-s If you're interested, try gspread_formatting
which offers a good default sheet formatter for dataframes, and with release 0.2.1 today supports MultiIndex index or columns.
from gspread-dataframe.
I do use gspread_formatting
, but I don't like its default behavior.
Here is what I do:
from gspread_dataframe import set_with_dataframe
from gspread_formatting import set_frozen, CellFormat, Color, format_cell_ranges, \
NumberFormat, TextFormat
def upload_dataframe(workbook, worksheet, df, percent=lambda s: s.endswith("%")):
"Upload DataFrame to Google Sheet."
ws = get_worksheet(get_workbook(workbook),worksheet)
set_with_dataframe(ws, df, include_index=True, resize=True)
set_frozen(ws, rows=df.columns.nlevels, cols=df.index.nlevels)
bold = CellFormat(textFormat=TextFormat(bold=True),
backgroundColor=Color(0.9,0.9,0.9))
pct = CellFormat(numberFormat=NumberFormat(type="percent",pattern="##.##%"))
ws_cols = list(string.ascii_uppercase)
if len(ws_cols) < df.shape[1] + df.index.nlevels:
ws_cols.extend(["%s%s"%p for p in itertools.product(ws_cols,repeat=2)])
format_cell_ranges(ws, [("A:%s" % (ws_cols[df.index.nlevels-1]), bold),
("1:%d" % (df.columns.nlevels), bold)] +
[("%s:%s" % (c1,c1),pct) for c0,c1 in zip(
[""]*df.index.nlevels + list(df.columns), ws_cols)
if percent(c0)])
from gspread-dataframe.
Related Issues (20)
- buggy behavior when dealing with decimals? HOT 5
- Bug: Object of type int64 is not JSON serializable - introduced by Version 3.1.1 HOT 8
- Bug: escaping when allow_formulas = False is not returning any value HOT 1
- set_with_dataframe: TypeError: Object of type 'int64' is not JSON serializable HOT 2
- Sheets API erroneously rejects new rowCount if (new rowCount * old columnCount) > 5000000 (was: Bug in the number of cells of the dataframe HOT 4
- Cannot use gspread's add_row() method before using set_with_dataframe() HOT 8
- Can't import, "No module named gspread.models" HOT 1
- Consider allowing option to drop duplicates in set_with_dataframe HOT 4
- Data gets interpreted even when using dtype=str HOT 1
- ModuleNotFoundError: No module named 'gspread.models' HOT 7
- Write header rows for DataFrames with MultiIndex columns so that Pandas reads them properly HOT 1
- Include formatting options in set_with_dataframe
- Consider offering a reader/writer object based on a given dataframe
- return API response from set_with_dataframe HOT 5
- No module named 'gspread' HOT 2
- Feature Request: Append feature to an existing sheet. HOT 1
- Max Cell Count
- Requests to sheets API cause 400 error is worksheet name is valid cell reference (was: Error using get_as_dataframe) HOT 1
- Newest version gspread not compatible HOT 7
- get_as_dataframe() is reading the whole sheet data HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gspread-dataframe.