Comments (19)
We offer this in the Pro compendium. Since companies have paid for the features already, it would be unfair to them if we turned around and made it available as open source. We have a longer comment in a gist.
from sheetjs.
Can I get a demo of how these styles can be applied in the latest version?
from sheetjs.
Wow! thanks @elad for this. My question now would be... how much effort should it be to add "XML Styles" compatibility?
Example:
Given I have
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Arial Unicode MS" ss:Size="11" ss:Color="#000000"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s58">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
</Style>
<Style ss:ID="s59">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Borders/>
<Font ss:Bold="1"/>
<Interior ss:Color="#99CC00" ss:Pattern="Solid"/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s60">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Calibri" ss:Size="14" ss:Bold="1"/>
<Interior ss:Color="#FF6600" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s61">
<Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
<Font ss:FontName="Calibri" ss:Size="16" ss:Color="#FFFFFF" ss:Bold="1"/>
<Interior ss:Color="#333333" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s62">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Font ss:Color="#FFFFFF" />
<Interior ss:Color="#333333" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s63">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<NumberFormat ss:Format="0.0%"/>
</Style>
<Style ss:ID="s64" ss:Parent="s62">
<NumberFormat ss:Format="0.0%"/>
</Style>
</Styles>
I would like to keep those styles (or convert them) when reading an XML and writing an XLSX.
from sheetjs.
Very important for what I need to do. :) What can I do to help make this happen?
from sheetjs.
@elad love your enthusiasm :)
As is usually the case, the hardest part is settling on a JS representation. For example, XLSB uses a bit field for representing certain properties whereas XLSX uses its own kinda-sorta-like-HTML-but-not-quite rich text format.
My initial thought was to save an HTML representation for each cell (and in fact, XLSX does generate HTML by parsing the rich text runs), but that makes the reverse process somewhat tricky (how do you handle CSS styles? Do you parse the CSS to figure out if the text is bold?).
NOTE: since the XLSX writer recomputes styles anyway, we don't need to stick to a style array or some other representation that is tightly coupled to the actual file representation
Extracting the style information is not too difficult, once we settle on a representation:
- XLSX Global styles are stored in the styles.xml file (Section 18.8 of ECMA-376) -- see bits/57_styxml.js for relevant code
- XLSB Global styles are stored in the styles.bin file (Overview in Section 2.1.7.50 of MS-XLSB) -- see bits/58_stybin.js for relevant code
- Strings with special in-text formatting are represented as rich text (Section 18.4.8 of ECMA-376, 2.1.7.121 of MS-XLSB). Example files: https://github.com/SheetJS/test_files/blob/master/rich_text_stress.xlsx?raw=true https://github.com/SheetJS/test_files/blob/master/rich_text_stress.xlsb?raw=true
- themes.xml also contains some information (though the styles and cells store most of the relevant information)
@elad what do you think is the best way to store this information? Note that we don't necessarily have to store a pretty format: we can (and should) write functions that "parse" the intermediate representation and give output.
@jokerslab @gcoonrod @artemryzhov since you raised issues on the matter, hopefully you can chime in as well :)
from sheetjs.
(keep in mind I'm not very familiar with this stuff yet :)
I'm not sure HTML representation is the best way to do this. Some libraries seem to just expose the "raw" values for each field, a few examples:
http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.backgroundcolor.aspx
http://stackoverflow.com/questions/10756206/getting-cell-backgroundcolor-in-excel-with-open-xml-2-0
http://stackoverflow.com/questions/12043973/how-to-read-the-xlsx-color-infomation-by-using-openpyxl
By the way, I say "raw" because I printed the data in parse_sty_xml to see what's in it, and I see that the relevant tags/attributes in the XML also appear in the different APIs. (What I'm still trying to figure out is what maps each cell's formatting to that style data...)
So it seems like the best way would be to at least have each cell have a style object that will contain the raw values... Makes sense?
from sheetjs.
What I'm still trying to figure out is what maps each cell's formatting to that style data..
The overall cell style is linked to the cell's "s" attribute (page 1589 of the ECMA-376 PDF I linked to). The relevant logic here is in the worksheet processing: https://github.com/SheetJS/js-xlsx/blob/master/bits/72_wsxml.js#L75-L80
...
if(cell.s && styles.CellXf) {
var cf = styles.CellXf[cell.s];
...
So it seems like the best way would be to at least have each cell have a style object that will contain the raw values... Makes sense?
If you want to see the raw value, it's already exposed by default in the (.r) field:
> require('xlsx').readFile('rich_text_stress.xlsx').Sheets.Sheet1.B13
{ v: 'this text is double accounting underlined sure enough',
t: 's',
r: '<r><t xml:space="preserve">this text is </t></r><r><rPr><u val="doubleAccounting"/><sz val="12"/><color theme="1"/><rFont val="Calibri"/><scheme val="minor"/></rPr><t>double accounting underlined</t></r><r><rPr><sz val="12"/><color theme="1"/><rFont val="Calibri"/><family val="2"/><scheme val="minor"/></rPr><t xml:space="preserve"> sure enough</t></r>',
h: 'this text is <span style="">double accounting underlined</span><span style=""> sure enough</span>',
w: 'this text is double accounting underlined sure enough' }
Unfortunately, XLS and XLSB and XLML use different representations :/ There are three ways around this:
- Convert everything to/from HTML
- Convert everything to/from the XLSX representation (in XLSX, it's already exposed as .r)
- Devise a new representation.
I'll think about it a bit more.
@elad do you know specifically what you need from the styles? In particular, do you need something like an HTML representation or just certain vitals (like background color, font, etc)? In the latter case, we probably could craft a short style object with the basic details
from sheetjs.
First, thanks for being so quick on the replies! :)
I had a feeling the mapping happened through s, although one of my rows had one cell with a different index (specifically, A1 through J1 had s="2" except for B1 which had s="3") so I wasn't 100% sure.
I think the example you provided works only for cell-specific styling. For example, my worksheet applies styling to the entire row, and this is what I get:
{ v: '2.4.2014',
t: 's',
r: '<t>2.4.2014</t>',
h: '2.4.2014',
w: '2.4.2014' }
The color used is Aqua, Accent 5, Lighter 40%, which I believe corresponds to this:
<fill>
<patternFill patternType="solid">
<fgColor theme="8" tint="0.39997558519241921"/>
<bgColor indexed="64"/>
</patternFill>
</fill>
What am I missing in order to access this styling through .r?
What I need from the styles is the background color. I suspect a lot of people use row colors to signify meaning that isn't otherwise conveyed through an actual column. In my case, the row color represents "type" and in order to import data from Excel to a database I need to figure out what the row color is.
I think having something simple like you suggest would cover the needs of most folks who are interested in this feature, and if not would serve as a great foundation to further expand.
from sheetjs.
@elad row-level information is currently not made available via .r. :/
Let's settle on putting each cell's background in cell.s.bgcolor (and other one-offs, like foreground color, general bolding, will also be in the .s field).
Upon reflection, it requires a bit of work (because the themes are not currently processed). I will take a stab at it later today:
- The themes.xml file should be parsed to find the actual colors. Unfortunately, that is not currently done, but it would follow the same pattern as parsing styles (actually, somewhat easier since XLSB also uses themes.xml)
- You'll see the comment /* fills CT_Fills ? */ in parse_sty_xml. Here, the fills should be parsed (in the ECMA spec you'll see CT_Fills defined somewhere below -- for now, you can just focus on the patternFill, fgColor, bgColor). You can mirror the approach in cellXfs and numFmts
- in parse_cellXfs, when you see an <xf, check if it has a fillId. If it is nonzero, then add the fill object (just like how the number format is added)
- in parse_ws_xml there's a "formatting" comment. In the following block, it tries to find the cell format. At that point, add the fill information to the cell
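The fill-related steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: parseAttrs, parseFills and fillForCell are hypothetical helper names (not functions from the library), the sample XML is made up around the fill quoted earlier in the thread, and a regex-based reader stands in for whatever parsing approach the real code uses.

```javascript
// Hypothetical helpers for illustration; these are not the library's functions.

// Attributes that should be stored as numbers rather than strings.
const NUMERIC_ATTRS = new Set(['theme', 'indexed', 'tint']);

// Read the attributes of a single XML tag into a plain object.
function parseAttrs(tag) {
  const out = {};
  for (const m of tag.matchAll(/([\w:]+)="([^"]*)"/g)) {
    out[m[1]] = NUMERIC_ATTRS.has(m[1]) ? Number(m[2]) : m[2];
  }
  return out;
}

// Parse every <fill> element, keeping only patternFill, fgColor and bgColor.
function parseFills(xml) {
  const fills = [];
  for (const m of xml.matchAll(/<fill>([\s\S]*?)<\/fill>/g)) {
    const body = m[1];
    const fill = {};
    const pattern = body.match(/<patternFill\b[^>]*>/);
    if (pattern) fill.patternType = parseAttrs(pattern[0]).patternType;
    const fg = body.match(/<fgColor\b[^>]*>/);
    if (fg) fill.fgColor = parseAttrs(fg[0]);
    const bg = body.match(/<bgColor\b[^>]*>/);
    if (bg) fill.bgColor = parseAttrs(bg[0]);
    fills.push(fill);
  }
  return fills;
}

// Resolve a cell's style index to its fill, skipping a zero fillId.
function fillForCell(cell, styles) {
  const xf = styles.CellXf[cell.s];
  if (!xf || !xf.fillId) return undefined;
  return styles.Fills[xf.fillId];
}

// Illustrative input only; a real styles.xml contains much more.
const sampleXml = `
<fills>
  <fill><patternFill patternType="none"/></fill>
  <fill>
    <patternFill patternType="solid">
      <fgColor theme="8" tint="0.39997558519241921"/>
      <bgColor indexed="64"/>
    </patternFill>
  </fill>
</fills>`;

const styles = {
  Fills: parseFills(sampleXml),
  CellXf: [{ fillId: 0 }, { fillId: 1 }],
};
const fill = fillForCell({ s: 1 }, styles);
console.log(fill);
```

Keeping the fill as a plain object of raw attribute values matches the "raw values" idea discussed earlier; resolving theme and tint into an RGB value is left as a separate step.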
from sheetjs.
Okay, I did as you said - except for the themes.xml part, because I'm not yet sure how to do that - and this is what I get:
{ v: '2.4.2014',
t: 's',
r: '<t>2.4.2014</t>',
h: '2.4.2014',
w: '2.4.2014',
s:
{ patternType: 'solid',
fgColor: { theme: 8, tint: 0.3999755851924192 },
bgColor: { indexed: 64 } } }
So now the cell's s field has the relevant fill data as it appears in the raw XML I printed earlier. Is this what you meant?
Hopefully it is, in which case - what do we do about themes.xml? I assume it requires changes in parse_zip, but I see there are type-specific parsing routines there (parse_sst, parse_sty, parse_wb, etc.). Does it require a similar parse_themes function to be written?
from sheetjs.
Following up... It seems the answer to my question is "yes."
I looked and saw that dir.themes is an array:
themes: [ '/xl/theme/theme1.xml' ]
So I printed the contents of this XML file and found the clrScheme collection the spec mentions (page 1757), and indeed, at index 8 (counting from 0) was the RGB value of the color for my row, sans tint! Given the spec also shows how to calculate the final color from the RGB value + tint (pages 1757-1758), I think we're good to go.
I'm now writing parse_themes (a single function for parsing XML - no binary/XML differentiation because I understand that's not necessary). I'll soon fork this tree and push my changes so you can take a look.
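As a rough illustration of the clrScheme lookup described here, the sketch below pulls the color slots out of a theme XML string in document order, so that index 8 (counting from 0) can be read off directly. The function name parseClrScheme, its return shape, and the sample theme fragment are assumptions made for this example; the eventual parse_themes may look quite different.

```javascript
// Hypothetical sketch; the function name and return shape are assumptions.
function parseClrScheme(themeXml) {
  const scheme = themeXml.match(/<a:clrScheme\b[^>]*>([\s\S]*?)<\/a:clrScheme>/);
  if (!scheme) return [];
  const colors = [];
  // Each slot (e.g. <a:accent5>) wraps either <a:srgbClr val="RRGGBB"/>
  // or <a:sysClr val="..." lastClr="RRGGBB"/>.
  const slot = /<a:(\w+)>\s*<a:(srgbClr|sysClr)\b([^>]*)\/>\s*<\/a:\1>/g;
  for (const m of scheme[1].matchAll(slot)) {
    const attr = m[2] === 'srgbClr'
      ? /val="([0-9A-Fa-f]{6})"/
      : /lastClr="([0-9A-Fa-f]{6})"/;
    const rgb = m[3].match(attr);
    colors.push({ name: m[1], rgb: rgb ? rgb[1].toUpperCase() : undefined });
  }
  return colors;
}

// Illustrative theme fragment, not copied from the file in this thread.
const sampleTheme = `
<a:clrScheme name="Office">
  <a:dk1><a:sysClr val="windowText" lastClr="000000"/></a:dk1>
  <a:lt1><a:sysClr val="window" lastClr="FFFFFF"/></a:lt1>
  <a:dk2><a:srgbClr val="1F497D"/></a:dk2>
  <a:lt2><a:srgbClr val="EEECE1"/></a:lt2>
  <a:accent1><a:srgbClr val="4F81BD"/></a:accent1>
  <a:accent2><a:srgbClr val="C0504D"/></a:accent2>
  <a:accent3><a:srgbClr val="9BBB59"/></a:accent3>
  <a:accent4><a:srgbClr val="8064A2"/></a:accent4>
  <a:accent5><a:srgbClr val="4BACC6"/></a:accent5>
  <a:accent6><a:srgbClr val="F79646"/></a:accent6>
  <a:hlink><a:srgbClr val="0000FF"/></a:hlink>
  <a:folHlink><a:srgbClr val="800080"/></a:folHlink>
</a:clrScheme>`;

const themeColors = parseClrScheme(sampleTheme);
console.log(themeColors[8]);
```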
from sheetjs.
@elad it sounds like you have the right idea :) Looking forward to the PR
from sheetjs.
I think I got it. I added basic support for parsing the theme and tested it. I also added some utility functions to provide the RGB color with the tint applied, to make it easier to get the actual color. It seems that the tinting algorithm is either incorrect or I'm missing something though because it doesn't work if I use the version from the specification verbatim. Also, there should probably be a lot of testing here because I'm sure my use case isn't the only one. :)
Output:
$ node parser
A1:
{ v: '2.4.2014',
t: 's',
r: '<t>2.4.2014</t>',
h: '2.4.2014',
w: '2.4.2014',
s:
{ patternType: 'solid',
fgColor: { theme: 8, tint: 0.3999755851924192, rgb: '9ED2E0' },
bgColor: { indexed: 64 } } }
color:
{ name: 'accent5', rgb: '4BACC6' }
$
This shows the data for the A1 cell, including the RGB value with the tint applied (9ED2E0), as well as the theme color scheme definition for index 8, which in this case is (correctly) Accent 5, 4BACC6.
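For reference, here is a minimal sketch of a tint calculation along the lines of the one the spec describes, as I understand it: convert the RGB value to hue/saturation/luminance, scale the luminance toward black for a negative tint or toward white for a positive tint, and convert back. All function names are hypothetical, luminance is kept in the range 0 to 1 rather than the spec's HLSMAX scale, and this is not the utility code from the pull request, so it should not be expected to reproduce the exact value shown above.

```javascript
// Hypothetical helpers; not the utility functions from the pull request.
function hexToRgb(hex) {
  const n = parseInt(hex, 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

function rgbToHex(rgb) {
  return rgb.map(c => c.toString(16).padStart(2, '0')).join('').toUpperCase();
}

// Standard RGB -> HSL conversion with h, s, l all in [0, 1].
function rgbToHsl([r, g, b]) {
  r /= 255; g /= 255; b /= 255;
  const max = Math.max(r, g, b), min = Math.min(r, g, b);
  const l = (max + min) / 2;
  if (max === min) return [0, 0, l];
  const d = max - min;
  const s = l > 0.5 ? d / (2 - max - min) : d / (max + min);
  let h;
  if (max === r) h = (g - b) / d + (g < b ? 6 : 0);
  else if (max === g) h = (b - r) / d + 2;
  else h = (r - g) / d + 4;
  return [h / 6, s, l];
}

// Standard HSL -> RGB conversion, rounding each channel to 0..255.
function hslToRgb([h, s, l]) {
  if (s === 0) {
    const v = Math.round(l * 255);
    return [v, v, v];
  }
  const q = l < 0.5 ? l * (1 + s) : l + s - l * s;
  const p = 2 * l - q;
  const channel = t => {
    if (t < 0) t += 1;
    if (t > 1) t -= 1;
    if (t < 1 / 6) return p + (q - p) * 6 * t;
    if (t < 1 / 2) return q;
    if (t < 2 / 3) return p + (q - p) * (2 / 3 - t) * 6;
    return p;
  };
  return [channel(h + 1 / 3), channel(h), channel(h - 1 / 3)]
    .map(v => Math.round(v * 255));
}

// Negative tint darkens (L * (1 + tint)); positive tint lightens
// (L * (1 - tint) + tint, i.e. moving L toward 1).
function applyTint(hex, tint) {
  const [h, s, l] = rgbToHsl(hexToRgb(hex));
  const l2 = tint < 0 ? l * (1 + tint) : l * (1 - tint) + tint;
  return rgbToHex(hslToRgb([h, s, l2]));
}

console.log(applyTint('4BACC6', 0.3999755851924192));
```

Working in HSL keeps hue and saturation fixed while only the luminance moves, which is why a tint of 1 always ends at white and a tint of -1 at black.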
Will submit a pull request shortly. Please note that at the very least it should be marked as experimental. :)
from sheetjs.
@elad add an option cellStyles that defaults to false (see bits/84_defaults.js). If it is true, parse the themes file and populate the s field (so there should be a check in the parse_zip function as well as in the parse_ws_xml function).
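A small sketch of the gating described above, assuming a defaults object and a stand-in reader. Only the option name cellStyles and its default of false come from this comment; DEFAULTS, withDefaults and readCells are hypothetical names, and the real checks would live in parse_zip and parse_ws_xml.

```javascript
// Hypothetical names; only the cellStyles option and its default are from the thread.
const DEFAULTS = { cellStyles: false };

// Merge caller options over the defaults without mutating either object.
function withDefaults(opts) {
  return Object.assign({}, DEFAULTS, opts);
}

// Stand-in for the worksheet reader: the s field is populated only on request.
function readCells(rawCells, styleTable, opts) {
  const o = withDefaults(opts);
  return rawCells.map(raw => {
    const cell = { v: raw.v, t: raw.t };
    const xf = styleTable.CellXf[raw.s];
    if (o.cellStyles && xf && xf.fillId) {
      cell.s = styleTable.Fills[xf.fillId];
    }
    return cell;
  });
}

// Illustrative style table and cell.
const styleTable = {
  Fills: [{ patternType: 'none' }, { patternType: 'solid', fgColor: { theme: 8 } }],
  CellXf: [{ fillId: 0 }, { fillId: 1 }],
};
const raw = [{ v: '2.4.2014', t: 's', s: 1 }];

const plain = readCells(raw, styleTable);                        // default: no s field
const styled = readCells(raw, styleTable, { cellStyles: true }); // s field populated
console.log(plain[0], styled[0]);
```

Defaulting to false keeps existing callers unaffected and avoids the cost of theme parsing unless the caller opts in.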
from sheetjs.
The code implementing the functionality requested by this issue has been merged, I think it's safe to close it.
from sheetjs.
@elad we'll close once XLSB and ODS also use the same format
from sheetjs.
Gotcha. If you manage to find a moment, please provide some status on date/style issues raised elsewhere - I'd like to help with the code but since I know you're working on a new version I'm afraid changes would be conflicting. :)
from sheetjs.
Is there no link on how the styles can be applied to cells?
from sheetjs.
Hello all, it would be really awesome if there were an example of how to provide style metadata:
let data = ['a','b'];
let style = ["bgColor: blue","bgColor: green"];
Is there a way to achieve this? A code snippet would be of great help.
Thank you in advance
from sheetjs.