shuchkin / simplexls Goto Github PK
View Code? Open in Web Editor NEWParse and retrieve data from old format Excel XLS files. MS Excel 97 workbooks PHP reader.
License: MIT License
Parse and retrieve data from old format Excel XLS files. MS Excel 97 workbooks PHP reader.
License: MIT License
The class won't able to read file.xlsb or file.xla
public function encode($txt){
$arr=array('EUC-CN' => 'GB2312','CP936' => 'GBK','SJIS-mac'=>'MacJapanese','SJIS-Mobile#DOCOMO'=>'SJIS-DOCOMO','SJIS-Mobile#KDDI'=>'SJIS-KDDI','SJIS-Mobile#SOFTBANK'=>'SJIS-SOFTBANK','UTF-8-Mobile#DOCOMO'=>'UTF-8-DOCOMO','UTF-8-Mobile#KDDI-B'=>'UTF-8-KDDI','UTF-8-Mobile#SOFTBANK'=>'UTF-8-SOFTBANK','ISO-2022-JP-MOBILE#KDDI'=>'ISO-2022-JP-KDDI');
$encode = mb_detect_encoding($txt, ['ASCII', 'UTF-8', 'GB2312', 'GBK', 'BIG5']);
if(array_key_exists($encode, $arr)) $encode = $arr[$encode];
return mb_convert_encoding($txt, 'UTF-8', $encode);
}
//sheetnames and cells
//sheetnames: 'name' => $rec_name,
//cells: $s['cells'][$i][$j ];
Okay.
Found in an excel 97 file that a numeric cell, formatted as number 0,00
if negative and lower than -2
gets interpreted badly, seems an overlflow in the IEEE754 code.
I placed a value of -3,15
that is not working:
_GetInt4d
returns 4294966039
out of byte values 23, 64256, 16711680, 4278190080
_IEEE754
returns then 1073741509
10737415,09
The cell is interpreted as of type MULRK
, and changing the formatting to even text does not solve the problem.
If the cell is expressed as a formula =-3,15
the parsing is fine.
Positive values of the same cell work fine.
When I upload the xls file, the text in the rows and columns is always cut off the last character and the text is incomplete.
shuchkin/simplexls/master/src/SimpleXLS.php
If you create an xls in openoffice, general numbers parsed as date (yyyy-mm-dd hh:mm:ss). If you set cell type: number -1234 it works as expected.
Hi there,
I have a problem when I try to read my "XLS" file and more specifically on the column that contains this value "202906789123456".
The problem is that this value is transformed into an exponential number and the value I get when retrieving the data is as follows: "2.0290678912346E+14".
How can I block the cast?
Thanks.
Please you can help me, i try using
header('Content-Type: text/html; charset=utf-8'); $lines = \SimpleXLS::parse($filepath);
But this characters don't convert "Ibirité" -> "Ibirit�".
You can help me. and sorry for bad english i'am learning now.
Could you help me? I have a xls file that was generated in a restaurant system with the daily transactions. In Excel 365 this file opens in Compatible Mode. But, In the SimpleXLS it gives "File is not XLS" message.
It seems the problem was in ole identification, but my knowledge is not so deep to solve it... :)
File could be reached at link:
Xls file
Tks for your help.
Hi, and thank you for this light reader.
I have a problem (spent one day no clue) :
I need to find active sheet
In simpleXLSX I found it with a tiny modification but in simpleXLS (because source is binary) I don't succeed yet.
Is there any guid to catch the tabId?
Any help would be appreciated.
In SimpleXLSX, I added the possibility to have the state of a sheet (hidden or not) and others attributes:
foreach ( $workbook->sheets->sheet as $s ) { $this->sheetNames[ $index ] = (string) $s['name']; $this->sheetAttributes[ $index ] = [ 'name' => (string) $s['name'], 'sheetId' => (string) $s['sheetId'], 'id' => (string) $s['id'], 'state' => (string) $s['state'] ]; //added line $index_rId[ $index ] = (string) $s['id']; $index ++; }
And
public function sheetAttributes() { return $this->sheetAttributes; }
I want to ignore hidden tabs during my reading file because if the user decided to hidden it, it's voluntary.
With SimpleXLS, it's possible too but I am not a BIFF expert. With the Hexinator application, I found that the information is in before the tab name.
Example:
00 00 00 00 05 00 54 61 62 20 31 85 00 0D 00 9E 43 00 00 01 00 05 00 54 61 62 20 32 85 00 0D 00 54 45 00 00 01 00 05 00 54 61 62 20 33 85 00 0D 00 0A 47 00 00 00 00 05 00 54 61 62 20 34
The tabs are named "Tab 1" (visible), "Tab 2" (hidden), "Tab 3" (hidden) and "Tab 4" (visible) -> 54 61 62 20 ## (according to the number)
In bold, you have the hex who indicates if hidden (01) or not (00).
In your code, we can easely push the information here (Line ~800):
$this->boundsheets[] = array( 'name' => $rec_name, 'offset' => $rec_offset,
adding 'hidden' => 0/1
I found the right position for BIFF8 / charset === 0 who is "ord($this->data[ $pos + 8])", I suppose it's the same for any chartype but I don't know how to test. And I don't know for BIFF7 (I see it's for document before year 2000, impossible to generate a new one no?).
Thanks for your help
Line 527 - PHP print warnings, when row is empty
I suggest a change:
before:
$name .= $d[ $i ];
after:
if (isset($d[ $i ])) { $name .= $d[ $i ]; }
Hello,
When I tried to parse a .XLS file with a date like that in it : '2019-07-05 00:00:00'
That line got parsed as a default timestamp value : 43651
I have no problem with other date formats.
I was working an XLS file that has cell formulas which return as zeros when I use $xls->rows(). It would be helpful to have a slimmed down version of SimpleXLSX's rowEX() function to detect when formulas are in play. Getting any other information about the cell would of course be a nice bonus.
In the _parsesheet function, a count of the rows is gathered and set as the 'numRows' value in the returned object. If the number of rows is 65535, a row count of 0 is returned despite there being 65535 rows in the 'data' array. I found removing a single row from the data was enough to get it to work.
This code here returns 0 as the 'numRows' value:
This causes the '->rows' method to never enter this for loop because the 'numRows' count is 0:
I implemented a workaround by using count() on the ['cells'] property, if the count appears to be 0:
Let me know if you want a copy of the offending file.
Hey! I'm using your library to import data from XLS document and it seems that the setDateTimeFormat() method have no effect on the output of the row() method.
The problem is that the content of the $datetimeFormat attribute is used during the parsing (before the object is fully instantiated) and we can't parse the content of the file afterward.
Maybe the parse of the file should occur at the rows() or rowsEx() call and not during the instantiation of the object?
Got an XLS spreadsheet where a simple formula (=F19/$F$22
) shows 100.00% correctly in Excel, but results in 1.00% after parsing with this library.
I've reviewed the code line-by-line, but can't see where or how formulas are evaluated. (Major respect for those of you that understand what's happening inside of XLS files!)
Does this library do actual formula evaluation, or are formula results stored in the XLS data and simply read by this library?
The date time field in my Excel file is not parsed correctly. Could you please take a look at my file?
I only parsed the file contents ($file_data is the actual contents of the file, not the path) and printed the rows.
$xls = SimpleXLS::parse($file_data, TRUE));
print_r($xls->rows());
Cell B1 of first sheet :
test_pv_95.xls
I get :
&�?���??????
Instead of :
&é?¥®™?????? (Excel 2003 to 95 lost arabic and hebrew characters, I shoud have had &é₤¥®™≠بةائӨ)
I think we are near of the issue #10
I put an utf8_encode() and it works but I don't know if it can generates side effects.
I want to just take the specific cell like
$rowcount = count($excel -> rows(1));
for( $i=1; $i <= $rowcount; $i+=3){
$cell = 'C'.$i ;
echo $cell;
}
Whenever there is a Turkish character in any cell it converts it to a question mark. Specifically, in the reading of "Ü,ü,Ö,ö” characters problem arises.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.