libxls's Introduction

libxls - Read XLS files from C

This is libxls, a C library for reading Excel files in the nasty old binary OLE format, plus a command-line tool for converting XLS to CSV (named, appropriately enough, xls2csv).

After several years of neglect, libxls is under new management as of the 1.5.x series. Head over to releases to get the latest stable version of libxls 1.5, which fixes many security vulnerabilities found in libxls 1.4 and earlier.

Libxls 1.5 also includes new APIs for parsing files stored in memory buffers, and returns errors instead of exiting upon encountering malformed input. If you find a bug, please file it on the GitHub issue tracker.

Changes to libxls since 1.4:

  • Hosted on GitHub (hooray!)
  • New in-memory parsing API (see xls_open_buffer)
  • Internals rewritten to return errors instead of exiting
  • Heavily fuzz-tested with clang's libFuzzer, fixing many memory leaks and CVEs
  • Improved compatibility with C++
  • Continuous integration tests on Mac, Linux, and Windows
  • Lots of other small fixes, see the commit history

The C API is pretty simple; this will get you started:

xls_error_t error = LIBXLS_OK;
xlsWorkBook *wb = xls_open_file("/path/to/finances.xls", "UTF-8", &error);
if (wb == NULL) {
    printf("Error reading file: %s\n", xls_getError(error));
    exit(1);
}
for (int i=0; i<wb->sheets.count; i++) { // sheets
    xlsWorkSheet *work_sheet = xls_getWorkSheet(wb, i);
    error = xls_parseWorkSheet(work_sheet);
    for (int j=0; j<=work_sheet->rows.lastrow; j++) { // rows
        xlsRow *row = xls_row(work_sheet, j);
        for (int k=0; k<=work_sheet->rows.lastcol; k++) { // columns
            xlsCell *cell = &row->cells.cell[k];
            // do something with cell
            if (cell->id == XLS_RECORD_BLANK) {
                // do something with a blank cell
            } else if (cell->id == XLS_RECORD_NUMBER) {
               // use cell->d, a double-precision number
            } else if (cell->id == XLS_RECORD_FORMULA) {
                if (strcmp(cell->str, "bool") == 0) {
                    // it's a boolean; test cell->d > 0.0 for true
                } else if (strcmp(cell->str, "error") == 0) {
                    // formula is in error
                } else {
                    // cell->str is valid as the result of a string formula.
                }
            } else if (cell->str != NULL) {
                // cell->str contains a string value
            }
        }
    }
    xls_close_WS(work_sheet);
}
xls_close_WB(wb);

The library also includes a CLI tool for converting Excel files to CSV:

./xls2csv /path/to/file.xls

The man page for xls2csv has more details.

Libxls should run fine on both little-endian and big-endian systems, but if not please open an issue.

If you want to hack on the source, you should first familiarize yourself with the Microsoft Excel File Format as well as Compound Document file format (documentation provided by the nice folks at OpenOffice.org).

Installation

If you want a stable version, check out the Releases section, which has copies of everything you'll find on SourceForge, and download version 1.5.0 or later.

For full instructions see INSTALL, or here's the tl;dr:

To install a stable release:

./configure
make
make install

If you've cloned the git repository, you'll need to run this first:

./bootstrap

That will generate all the supporting files. It assumes autotools is already installed on the system (and also expects Autoconf Archive to be present).

Language bindings

If C is not your cup of tea, you can make use of libxls in several other languages, including:

libxls's People

Contributors

dhoerl, evanmiller, gaborcsardi, godzie44, jennybc, leo-neat, lockedbyte, mdwagner, neilbryson, qulogic, sjmulder, stephematician, tbeu, zodf0055980

libxls's Issues

problem with wcstombs under windows

Hi,

The line mentioned below causes problems under Windows because it always returns zero, even for valid entries like "Root Entry". If the max length field (3rd parameter) is changed to len, it works.

Regards
Otto

count = wcstombs(NULL, w, 0);
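One way to avoid depending on the length query at all is to size the buffer for the worst case and convert in a single call. The sketch below is only an illustration of that idea, not the fix applied in libxls; `wide_to_multibyte` is a hypothetical helper name, and the conversion uses whatever locale is active when it runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a NUL-terminated wide string to a newly
 * allocated multibyte string in the current locale. The caller frees the
 * result. Returns NULL on allocation or conversion failure. */
static char *wide_to_multibyte(const wchar_t *w, size_t *out_len) {
    assert(w != NULL);

    /* Worst case: every wide character needs MB_CUR_MAX bytes. */
    size_t cap = wcslen(w) * MB_CUR_MAX + 1;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;

    /* Non-NULL destination with a real size limit, so the result does not
     * depend on how wcstombs(NULL, w, 0) behaves on a given platform. */
    size_t n = wcstombs(buf, w, cap - 1);
    if (n == (size_t)-1) {
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    if (out_len) *out_len = n;
    return buf;
}
```

The trade-off is a somewhat larger allocation in exchange for a single conversion pass.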

File not found with xls2csv, segfault with readxl

tidyverse/readxl#417

xls from here

https://www.seco.admin.ch/dam/seco/de/dokumente/Wirtschaft/Wirtschaftslage/VIP%20Quartalsschätzungen/qna_p_na.xls.download.xls/qna_p_na.xls

With xls2csv:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/Downloads/qna_p_na.xls
FILE: /Users/jenny/Downloads/qna_p_na.xls
File not found

With readxl:

> library(readxl)
> read_excel("~/Downloads/qna_p_na.xls")

 *** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
 1: .Call(`_readxl_read_xls_`, path, sheet_i, limits, shim, col_names,     col_types, na, trim_ws, guess_max)
 2: read_fun(path = path, sheet_i = sheet, limits = limits, shim = shim,     col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws,     guess_max = guess_max)
 3: tibble::as_tibble(read_fun(path = path, sheet_i = sheet, limits = limits,     shim = shim, col_names = col_names, col_types = col_types,     na = na, trim_ws = trim_ws, guess_max = guess_max), validate = FALSE)
 4: tibble::repair_names(tibble::as_tibble(read_fun(path = path,     sheet_i = sheet, limits = limits, shim = shim, col_names = col_names,     col_types = col_types, na = na, trim_ws = trim_ws, guess_max = guess_max),     validate = FALSE), prefix = "X", sep = "__")
 5: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names,     col_types = col_types, na = na, trim_ws = trim_ws, skip = skip,     n_max = n_max, guess_max = guess_max, format = format)
 6: read_excel("~/Downloads/qna_p_na.xls")

Unreadable files with, I think, Latin-1 encoded contents

Passing along from readxl tidyverse/readxl#564.

User has 6 xls files, one of which can be read by readxl and xls2csv. The remaining five do not throw an error, but they produce no output.

I note I can read these files with a completely separate tool (the R package gdata, which wraps a Perl script 😬), but only if I specify the encoding: e.g., as read.xls(files[2], fileEncoding="latin1").

I've attempted to pass encoding to xls2csv but it does not change my results. That could be user error because I've never really specified encoding to libxls before.

I can also confirm what the user reports: merely opening and closing the problematic files in Excel makes them readable by readxl/libxls. I suppose this is changing the encoding?

The attached ZIP file is the one provided by my user in the readxl issue. It contains 6 directories, each of which holds one .txt and one .xls file.

fs::dir_tree("investigations/Data")
#> investigations/Data
#> ├── 20190326.seq
#> │   ├── 2019-3-26-.txt
#> │   └── 33.0000.XLS
#> ├── 20190327.seq
#> │   ├── 01.0000.XLS
#> │   └── 2019-3-27-.txt
#> ├── 20190328.seq
#> │   ├── 04.0000.XLS
#> │   └── 2019-3-28-.txt
#> ├── 20190329.seq
#> │   ├── 15.0000.XLS
#> │   └── 2019-3-29-.txt
#> ├── 20190330.seq
#> │   ├── 09.0000.XLS
#> │   └── 2019-3-30-.txt
#> ├── 20190331.seq
#> │   ├── 03.0000.XLS
#> │   └── 2019-3-31-.txt
#> └── Data.Rproj

Created on 2019-04-02 by the reprex package (v0.2.1.9000)

Data.zip

xls file throws no error, but no cell data retrieved ("Arrivals sheet")

We've discussed this one before and you applied a fix that went from "segfault" to "no segfault but no data returned". At that time, you expressed (somewhere? email?) that you were focused on preventing segfaults.

Previous issue here: #20

Tracking in readxl here: tidyverse/readxl#471

I bring it up again in case it's a better time to run this one down for real.

This is an xls that libxls and readxl used to be able to read (readxl v1.0.0), so it's a regression.

The xls is here:

http://homepage.stat.uiowa.edu/~luke/data/Arrivals-2017-01-06.xls

With the standalone xls2csv tool as of libxls b6192a3, I see:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/rrr/readxl/investigations/Arrivals-2017-01-06.xls
FILE: /Users/jenny/rrr/readxl/investigations/Arrivals-2017-01-06.xls

and a whole whack of newlines. The verbose mode with -v reveals the file is being parsed.

"Unable to allocate memory"

Via readxl, I have a file that generates this error when I attempt to read with xls2csv:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/rrr/readxl/investigations/pop-sexe-age-quinquennal6814.xls
FILE: /Users/jenny/rrr/readxl/investigations/pop-sexe-age-quinquennal6814.xls
Error reading XLS file: Unable to allocate memory

See tidyverse/readxl#478

Is this something you care to pursue? That is, is the error message just a sentinel and we're bumping against a semi-arbitrary limit you might raise? The OP reports being able to read this xls file with an earlier version of readxl (haven't verified, but have no reason to doubt) and same file, saved as xlsx, is also readable.

How to get the formula of a cell?

I can parse an xls file containing a formula like $C$5*D5, but I can only get the computed result of the formula, not the formula itself. Can the $C$5*D5 string be retrieved?

Unnecessary configure checks?

configure appears to check for a C++ compiler, but no C++ code is compiled (though some does exist). It also checks for pkg-config, but no dependencies are queried using it.

xls file throws no error, but no cell data retrieved ("Los Alamos sheet")

Original readxl issue: tidyverse/readxl#483

File is "ctl_summary.xls" available from the top xls link here:
https://www.hiv.lanl.gov/content/immunology/tables/ctl_summary.html

With xls2csv built from b20521f, I see this:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/rrr/readxl/investigations/ctl_summary.xls
FILE: /Users/jenny/rrr/readxl/investigations/ctl_summary.xls

No error, but also no data.

Using xls2csv with the -v verbose flag reveals the file is being parsed but somehow no data comes out.

Difficulty parsing shared string table

I can build readxl with the current version of libxls and pass all but three tests (out of 365). Here are 3 problematic files, which jointly account for two failing tests.

We could read these test sheets with prior versions of libxls.

https://github.com/tidyverse/readxl/blob/master/tests/testthat/sheets/dates-1900.xls
https://github.com/tidyverse/readxl/blob/master/tests/testthat/sheets/dates-1904.xls
https://github.com/tidyverse/readxl/blob/master/tests/testthat/sheets/missing-first-column.xls

They all appear to trigger a parse error here:

https://github.com/evanmiller/libxls/blob/3a2acf14173463e40d6f18a8cc63b901db16e233/src/xls.c#L865

Maybe relates to #10

Any plans to read xlw?

Do you know anything about xlw and/or have any vague plans to support it here?

If not, I'm going to close this on readxl (tidyverse/readxl#350), because we certainly won't tackle that directly.

fread crash in ole.c:637 sector_read function

Test Version

dev version, git clone https://github.com/evanmiller/libxls

Test Environment

root@leon-virtual-machine:/proc# uname -a
Linux leon-virtual-machine 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Test Program

xls2csv [infile]

Gdb and Backtrace

Reading symbols from xls2csv...done.
(gdb) run xls2csv_ole_ole2_fread_327.crash
Starting program: /opt/normal/bin/xls2csv xls2csv_ole_ole2_fread_327.crash
FILE: xls2csv_ole_ole2_fread_327.crash

Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2 () at ../sysdeps/x86_64/multiarch/../memcpy.S:220
220     ../sysdeps/x86_64/multiarch/../memcpy.S: No such file or directory.
(gdb) bt
#0  __memcpy_sse2 () at ../sysdeps/x86_64/multiarch/../memcpy.S:220
#1  0x00007ffff7a85fd3 in __GI__IO_file_xsgetn (fp=0x60e4b0, data=<optimized out>, n=512) at fileops.c:1383
#2  0x00007ffff7a7b236 in __GI__IO_fread (buf=buf@entry=0x7ffff7fe1e10, size=512, count=count@entry=1, fp=0x60e4b0) at iofread.c:38
#3  0x0000000000406579 in fread (__stream=<optimized out>, __n=1, __size=<optimized out>, __ptr=0x7ffff7fe1e10)
    at /usr/include/x86_64-linux-gnu/bits/stdio2.h:295
#4  ole2_fread (ole2=ole2@entry=0x60e420, buffer=buffer@entry=0x7ffff7fe1e10, size=<optimized out>, nitems=nitems@entry=1) at src/ole.c:327
#5  0x000000000040666a in sector_read (ole2=0x60e420, buffer=0x7ffff7fe1e10, sid=0) at src/ole.c:604
#6  0x000000000040714c in read_MSAT_body (sectorCount=8389121, sectorOffset=<optimized out>, ole2=0x60e420) at src/ole.c:663
#7  read_MSAT (oleh=0x60e6e0, ole2=0x60e420) at src/ole.c:757
#8  ole2_read_header (ole=ole@entry=0x60e420) at src/ole.c:399
#9  0x0000000000407442 in ole2_open_file (file=file@entry=0x7fffffffe752 "xls2csv_ole_ole2_fread_327.crash") at src/ole.c:552
#10 0x0000000000404a62 in xls_open_file (file=0x7fffffffe752 "xls2csv_ole_ole2_fread_327.crash", charset=0x4075ff "UTF-8",
    outError=outError@entry=0x7fffffffe394) at src/xls.c:1471
#11 0x0000000000400f5a in main (argc=2, argv=0x7fffffffe4b8) at src/xls2csv.c:116
(gdb) f 5
#5  0x000000000040666a in sector_read (ole2=0x60e420, buffer=0x7ffff7fe1e10, sid=0) at src/ole.c:604
604         if ((num = ole2_fread(ole2, buffer, ole2->lsector, 1)) != 1) {
(gdb) l
599                     if (xls_debug) fprintf(stderr, "Error: wanted to seek to sector %u (0x%x) loc=%u\n", sid, sid,
600                     (unsigned int)sector_pos(ole2, sid));
601             return -1;
602         }
603
604         if ((num = ole2_fread(ole2, buffer, ole2->lsector, 1)) != 1) {
605             if (xls_debug) fprintf(stderr, "Error: fread wanted 1 got %lu loc=%u\n", (unsigned long)num,
606                     (unsigned int)sector_pos(ole2, sid));
607             return -1;
608         }
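Frame #6 of the backtrace shows read_MSAT_body being called with sectorCount=8389121, a value that presumably comes from the crafted file's header. A general defence against this class of input is to compare a header-supplied sector count with what the file could physically hold before reading anything. The sketch below illustrates that check under stated assumptions: `sector_count_plausible` and `OLE2_HEADER_SIZE` are hypothetical names, this is not libxls code, and it is not necessarily the fix that was adopted. It relies only on the compound-document layout in which a 512-byte header precedes the sectors.

```c
#include <assert.h>
#include <stdint.h>

#define OLE2_HEADER_SIZE 512u  /* the compound-document header occupies 512 bytes */

/* Hypothetical check: returns 1 if a file of `file_size` bytes could hold
 * `claimed` sectors of `sector_size` bytes after the header, 0 otherwise. */
static int sector_count_plausible(uint64_t file_size, uint32_t sector_size,
                                  uint32_t claimed) {
    assert(sector_size > 0);
    if (file_size < OLE2_HEADER_SIZE) return 0;  /* too short for a header */
    uint64_t max_sectors = (file_size - OLE2_HEADER_SIZE) / sector_size;
    return claimed <= max_sectors;
}
```

A caller would reject the file with an error as soon as the check fails, rather than looping over the claimed number of sectors.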

POC file

xls2csv_ole_ole2_fread_327.zip

CREDIT

Zhao Liang, Huawei Weiran Labs

Testing collection

The recent reports of "libxls / readxl used to read this xls but now it doesn't" make me wonder about the best way to capture these examples. Most of the files in question are too big to include in readxl's test suite. But it seems like a lost opportunity to not hold on to them in some fashion.

It seems worthwhile to capture them and periodically do a semi-manual test run that verifies: (1) no segfault and (2) reads correct number of rows and columns.

Do you have any thoughts about this?

Status of libxls and this repo specifically wrt fixing security vulnerabilities

I'm excited that this repo is slated to become the official fork or home of libxls.

I'm the maintainer of readxl, an R package to read Excel files. It uses libxls for .xls.

I've been aware of these for a while:

https://www.talosintelligence.com/vulnerability_reports/TALOS-2017-0426
https://www.talosintelligence.com/vulnerability_reports/TALOS-2017-0404
https://www.talosintelligence.com/vulnerability_reports/TALOS-2017-0403

I think there might be a few more too?

This Debian bug report has brought all this back to the top of my mind:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=895564

Is there a version of libxls that addresses these vulnerabilities? And if so, is it the version here or the one on SourceForge?

Thanks for your advice!

xlstools.c -> "unicode_decode_wcstombs" method fails for unicode (utf-16) string

I created an xls file called hello.xls with only 1 sheet and 1 cell with value "こんにちは".
I passed the file to xls2csv and nothing was printed to console.
When I started to debug I found wcstombs() at line#294 of xlstools.c failed.

OS: Windows
LIBICONV: NO

How to run libxls in windows
This is the hack I followed.
I installed Cygwin64 with gcc
I ran ./configure in Cygwin64
Then I copied the source only "include" and "src" directories and config.h to my visual studio project to build it as a static library.
I modified config.h to ignore iconv
I removed xls2csv from the project
I had to change ssize_t to size_t in some of the source files (I don't know what might be affected by this. Your input is appreciated.)
Visual Studio 2015 didn't recognize ssize_t (signed size_t).
One solution is to define ssize_t in the header file as

/* ssize_t is not defined on Windows */
#ifndef ssize_t
# if defined(_WIN64)
typedef signed __int64 ssize_t;
# else
typedef signed long ssize_t;
# endif
#endif  /* !ssize_t */
/* On MSVC, ssize_t is SSIZE_T */
#ifdef _MSC_VER
#include <BaseTsd.h>
#define ssize_t SSIZE_T
#endif

Reference
https://gitlab.freedesktop.org/libnice/libnice/commit/3735b73d54f05facd36e49f3b5ee7c6fa82de9cf
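On the question of what replacing ssize_t with size_t might affect: the usual risk is that a function signals failure by returning -1, and that value stops looking negative once it is stored in an unsigned type. The following sketch illustrates the effect with a hypothetical `read_chunk` function (not part of libxls); whether any particular libxls call site is affected would need to be checked in the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef _MSC_VER
#include <BaseTsd.h>
typedef SSIZE_T ssize_t;   /* MSVC spells the signed size type SSIZE_T */
#else
#include <sys/types.h>     /* provides ssize_t on POSIX systems */
#endif

/* Hypothetical reader: returns the number of bytes read, or -1 on error. */
static ssize_t read_chunk(int simulate_error) {
    return simulate_error ? (ssize_t)-1 : (ssize_t)512;
}

/* With a signed result, the usual "< 0" test detects the error. */
static int failed(int simulate_error) {
    ssize_t n = read_chunk(simulate_error);
    assert(n == -1 || n >= 0);
    return n < 0;
}

/* Stored in a size_t, the same -1 wraps around to SIZE_MAX and looks like
 * a huge successful read unless it is compared against (size_t)-1. */
static size_t as_unsigned(int simulate_error) {
    return (size_t)read_chunk(simulate_error);
}
```

Any code that tests such a result with `< 0`, or uses it as a length, would need review after the type change.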

I created a separate visual studio c++ console program for xls2csv by removing all linux specific code and added libxls project as its dependency.
I had to change a lot of code like sprintf -> sprintf_s, wcstombs -> wcstombs_s, etc to resolve most of the build errors.

When I was debugging the code I encountered the bug mentioned above.

This is my modified version of unicode_decode_wcstombs() method in xlstools.c source file to resolve the issue.

static char *unicode_decode_wcstombs(const char *s, size_t len, size_t *newlen) {
	// Do wcstombs conversion
	char *converted = NULL;
	errno_t err;
	size_t count, count2;
	size_t i;
	wchar_t *w;

#if defined(_WIN32) || defined(WIN32) || defined(_WIN64) || defined(WIN64) || defined(WINDOWS)
	_locale_t loc = _create_locale(LC_CTYPE, ".65001");
#else
	_locale_t loc = _create_locale(LC_CTYPE, "");
#endif

	if (loc == NULL)
	{
		printf("_create_locale failed: %d\n", errno);
		return NULL;
	}

	w = (wchar_t*)malloc((len / 2 + 1) * sizeof(wchar_t));

	for (i = 0; i < len / 2; i++)
	{
		w[i] = (BYTE)s[2 * i] + ((BYTE)s[2 * i + 1] << 8);
	}
	w[len / 2] = L'\0';

	err = _wcstombs_s_l(&count, NULL, 0, w, INT_MAX, loc);
	if (err != 0)
	{
		if (newlen) *newlen = 0;
		free(w);
		_free_locale(loc); /* release the locale created above */
		return NULL;
	}

	converted = calloc(count + 1, sizeof(char));
	err = _wcstombs_s_l(&count2, converted, count, w, count, loc);
	free(w);
	_free_locale(loc); /* release the locale created above */
	if (err != 0)
	{
		printf("_wcstombs_s_l failed (%lu)\n", (unsigned long)len / 2);
		if (newlen) *newlen = 0;
		return converted;
	}
	if (newlen) *newlen = count2;
	return converted;
}

It successfully converted the UTF-16 string to UTF-8.
The ".65001" argument to _create_locale() on Windows selects the UTF-8 code page.

Is this the correct way?
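An alternative that avoids locale-dependent conversion altogether is to transcode the UTF-16LE bytes to UTF-8 by hand. The sketch below is one possible approach and not how libxls itself handles strings; `utf16le_to_utf8` and `put_utf8` are hypothetical names, and unpaired surrogates are replaced with U+FFFD as a design choice of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode one Unicode code point as UTF-8; returns the bytes written (1-4). */
static size_t put_utf8(uint32_t cp, char *out) {
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Convert `len` bytes of UTF-16LE to a newly allocated, NUL-terminated
 * UTF-8 string. The caller frees the result. Returns NULL if allocation fails. */
static char *utf16le_to_utf8(const unsigned char *s, size_t len, size_t *newlen) {
    assert(s != NULL || len == 0);
    size_t units = len / 2, i = 0, o = 0;

    /* Each 16-bit unit yields at most 3 UTF-8 bytes; a surrogate pair
     * (2 units) yields 4, so units * 3 is always enough. */
    char *out = malloc(units * 3 + 1);
    if (out == NULL) return NULL;

    while (i < units) {
        uint32_t cp = (uint32_t)s[2 * i] | ((uint32_t)s[2 * i + 1] << 8);
        i++;
        if (cp >= 0xD800 && cp <= 0xDBFF && i < units) {
            uint32_t lo = (uint32_t)s[2 * i] | ((uint32_t)s[2 * i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                /* Combine a high/low surrogate pair into one code point. */
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i++;
            }
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) cp = 0xFFFD; /* unpaired surrogate */
        o += put_utf8(cp, out + o);
    }
    out[o] = '\0';
    if (newlen) *newlen = o;
    return out;
}
```

Because no locale is consulted, the output is the same on every platform, which sidesteps both the ssize_t and the wcstombs portability questions raised above.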

I have attached my visual studio 2015 libxls solution.
Note: I have changed ssize_t to size_t in my code. You might want to change it back to original. (Copy the original source files, resolve any build errors like I mentioned above or use _CRT_SECURE_NO_WARNINGS pre-processor)
libxls.zip

Load of misaligned address ... for type 'DWORD', which requires 4 byte alignment

I'm seeing this when I check readxl on platform "Debian Linux, R-devel, GCC ASAN/UBSAN".

> test_check("readxl")
xlstool.c:587:22: runtime error: load of misaligned address 0x55698c6fc416 for type 'DWORD', which requires 4 byte alignment
0x55698c6fc416: note: pointer points here
 00 00 0f 00 1c 00  00 00 00 00 00 01 0f 00  33 33 33 33 33 33 d3 3f  33 33 33 33 33 33 d3 3f  00 24
             ^ 

This area has been the focus of other changes since I started working on readxl and prior to libxls moving to GitHub.

Double Free vulnerability in read_MSAT function

Test Version

dev version, git clone https://github.com/evanmiller/libxls

Test Program

xls2csv [infile]

Gdb and Backtrace

Reading symbols from xls2csv...done.
(gdb) run xls2csv_ole_read_MSAT_772.xls 
Starting program: /var/normal/bin/xls2csv xls2csv_ole_read_MSAT_772.xls
FILE: xls2csv_ole_read_MSAT_772.xls
*** Error in `/var/normal/bin/xls2csv': double free or corruption (out): 0x0000000000604900 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7ffff78737e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7ffff787c37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7ffff788053c]
/var/normal/lib/libxlsreader.so.1(+0x6e6a)[0x7ffff7bcce6a]
/var/normal/lib/libxlsreader.so.1(ole2_open_file+0x47)[0x7ffff7bcd5b7]
/var/normal/lib/libxlsreader.so.1(xls_open_file+0x12)[0x7ffff7bd0a62]
/var/normal/bin/xls2csv[0x400d6a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ffff781c830]
/var/normal/bin/xls2csv[0x401139]
======= Memory map: ========
00400000-00402000 r-xp 00000000 08:01 526493                             /var/normal/bin/xls2csv
00601000-00602000 r--p 00001000 08:01 526493                             /var/normal/bin/xls2csv
00602000-00603000 rw-p 00002000 08:01 526493                             /var/normal/bin/xls2csv
00603000-00624000 rw-p 00000000 00:00 0                                  [heap]
7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0 
7ffff0021000-7ffff4000000 ---p 00000000 00:00 0 
7ffff75e6000-7ffff75fc000 r-xp 00000000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff75fc000-7ffff77fb000 ---p 00016000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff77fb000-7ffff77fc000 rw-p 00015000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff77fc000-7ffff79bc000 r-xp 00000000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff79bc000-7ffff7bbc000 ---p 001c0000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bbc000-7ffff7bc0000 r--p 001c0000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bc0000-7ffff7bc2000 rw-p 001c4000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bc2000-7ffff7bc6000 rw-p 00000000 00:00 0 
7ffff7bc6000-7ffff7bd5000 r-xp 00000000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7bd5000-7ffff7dd4000 ---p 0000f000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd4000-7ffff7dd5000 r--p 0000e000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd5000-7ffff7dd7000 rw-p 0000f000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd7000-7ffff7dfd000 r-xp 00000000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7fdd000-7ffff7fe0000 rw-p 00000000 00:00 0 
7ffff7ff5000-7ffff7ff7000 rw-p 00000000 00:00 0 
7ffff7ff7000-7ffff7ffa000 r--p 00000000 00:00 0                          [vvar]
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0                          [vdso]
7ffff7ffc000-7ffff7ffd000 r--p 00025000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffd000-7ffff7ffe000 rw-p 00026000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0 
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Program received signal SIGABRT, Aborted.
0x00007ffff7831428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

(gdb) bt
#0  0x00007ffff7831428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff783302a in __GI_abort () at abort.c:89
#2  0x00007ffff78737ea in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff798ced8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007ffff787c37a in malloc_printerr (ar_ptr=<optimized out>, ptr=<optimized out>, str=0x7ffff798cfe8 "double free or corruption (out)", action=3) at malloc.c:5006
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3867
#5  0x00007ffff788053c in __GI___libc_free (mem=<optimized out>) at malloc.c:2968
#6  0x00007ffff7bcce6a in read_MSAT (oleh=0x6036e0, ole2=0x603420) at src/ole.c:772
#7  ole2_read_header (ole=ole@entry=0x603420) at src/ole.c:399
#8  0x00007ffff7bcd5b7 in ole2_open_file (file=file@entry=0x7fffffffe721 "xls2csv_ole_read_MSAT_772.xls") at src/ole.c:552
#9  0x00007ffff7bd0a62 in xls_open_file (file=0x7fffffffe721 "xls2csv_ole_read_MSAT_772.xls", charset=0x4016e5 "UTF-8", outError=outError@entry=0x7fffffffe364) at src/xls.c:1471
#10 0x0000000000400d6a in main (argc=2, argv=0x7fffffffe488) at src/xls2csv.c:116

(gdb) f 6
#6  0x00007ffff7bcce6a in read_MSAT (oleh=0x6036e0, ole2=0x603420) at src/ole.c:772
772                 free(ole2->SecID);
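The abort happens at free(ole2->SecID), which suggests the same buffer is released on more than one cleanup path. A common way to make repeated cleanup harmless is to clear the pointer immediately after freeing it, since free(NULL) is a no-op. The sketch below shows the pattern with a hypothetical `parser_ctx` struct standing in for the real context; it is an illustration, not the patch applied to libxls.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parser context that owns a heap buffer. */
typedef struct {
    uint32_t *sec_ids;
    size_t    sec_id_count;
} parser_ctx;

/* Release the table and clear the pointer, so that a second cleanup pass
 * (for example from an outer error handler) becomes a harmless free(NULL). */
static void ctx_release_sec_ids(parser_ctx *ctx) {
    assert(ctx != NULL);
    free(ctx->sec_ids);
    ctx->sec_ids = NULL;
    ctx->sec_id_count = 0;
}
```

Routing every error path through one such release function also makes it easier to audit that each buffer is freed exactly once.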

POC file

xls2csv_ole_read_MSAT_772.zip

CREDIT

Zhao Liang, Huawei Weiran Labs

BIFF5 xls written by 3rd party tool that stores text strings in LABEL records

I can build readxl with the current version of libxls and pass all but three tests (out of 365). This is one of the 4 problematic files.

We could read this test sheet with prior versions of libxls.

https://github.com/tidyverse/readxl/blob/master/tests/testthat/sheets/biff5-label-records.xls

This appears to trigger a parse error here:

https://github.com/evanmiller/libxls/blob/3a2acf14173463e40d6f18a8cc63b901db16e233/src/xls.c#L801

Backstory on this sheet and test: https://github.com/tidyverse/readxl/blob/eeeebf8171540a7cd14b373d20b08efbac7e3cd2/tests/testthat/test-compatibility.R#L33-L45

Relates to #10

"Unable to seek to sector" regression

The release of readxl v1.1.0 has brought a regression to light. Related issue in readxl tidyverse/readxl#463.

I can reproduce the OP's report that this xls file could be read with readxl v1.0.0 but not with readxl v1.1.0.

I can also reproduce the failure with xls2csv built from current HEAD here (79d31e0), i.e. without readxl in the equation. I think it's a coincidence that I managed to break things for the same file on the xlsx side as well 🤔. Rather, I imagine this tool (Economist Intelligence Unit) writes valid-but-nonstandard files for both xls and xlsx.

jenny@2015-mbp libxls-evanmiller-github $ git log --pretty=oneline -1
79d31e0e7c7dfcc93aec6bb0accb2128856c2c79 (HEAD -> master, upstream/master) Merge pull request #16 from jennybc/simplify-uint64
jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/Downloads/EIU/EIU_Data.xls
FILE: /Users/jenny/Downloads/EIU/EIU_Data.xls
Error reading XLS file: Unable to open file

I know you are mostly focused on other things here, at the moment. But libxls used to read this file OK. I'm attaching the OP's xls in a .zip file.
EIU.zip

Alloc-Dealloc-Mismatch in read_MSAT_body function

Test Version

dev version, git clone https://github.com/evanmiller/libxls

Test Program

xls2csv [infile]

Gdb and Backtrace

Reading symbols from xls2csv...done.
(gdb) run xls2csv_ole_read_MSAT_body_687.xls 
Starting program: /var/normal/bin/xls2csv xls2csv_ole_read_MSAT_body_687.xls
FILE: xls2csv_ole_read_MSAT_body_687.xls
*** Error in `/var/normal/bin/xls2csv': free(): invalid next size (normal): 0x0000000000604b10 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7ffff78737e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7ffff787c37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7ffff788053c]
/var/normal/lib/libxlsreader.so.1(+0x6fdd)[0x7ffff7bccfdd]
/var/normal/lib/libxlsreader.so.1(ole2_open_file+0x47)[0x7ffff7bcd5b7]
/var/normal/lib/libxlsreader.so.1(xls_open_file+0x12)[0x7ffff7bd0a62]
/var/normal/bin/xls2csv[0x400d6a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ffff781c830]
/var/normal/bin/xls2csv[0x401139]
======= Memory map: ========
00400000-00402000 r-xp 00000000 08:01 526493                             /var/normal/bin/xls2csv
00601000-00602000 r--p 00001000 08:01 526493                             /var/normal/bin/xls2csv
00602000-00603000 rw-p 00002000 08:01 526493                             /var/normal/bin/xls2csv
00603000-00624000 rw-p 00000000 00:00 0                                  [heap]
7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0 
7ffff0021000-7ffff4000000 ---p 00000000 00:00 0 
7ffff75e6000-7ffff75fc000 r-xp 00000000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff75fc000-7ffff77fb000 ---p 00016000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff77fb000-7ffff77fc000 rw-p 00015000 08:01 1053568                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffff77fc000-7ffff79bc000 r-xp 00000000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff79bc000-7ffff7bbc000 ---p 001c0000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bbc000-7ffff7bc0000 r--p 001c0000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bc0000-7ffff7bc2000 rw-p 001c4000 08:01 1053530                    /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7bc2000-7ffff7bc6000 rw-p 00000000 00:00 0 
7ffff7bc6000-7ffff7bd5000 r-xp 00000000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7bd5000-7ffff7dd4000 ---p 0000f000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd4000-7ffff7dd5000 r--p 0000e000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd5000-7ffff7dd7000 rw-p 0000f000 08:01 526488                     /var/normal/lib/libxlsreader.so.1.4.0
7ffff7dd7000-7ffff7dfd000 r-xp 00000000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7fdd000-7ffff7fe0000 rw-p 00000000 00:00 0 
7ffff7ff5000-7ffff7ff7000 rw-p 00000000 00:00 0 
7ffff7ff7000-7ffff7ffa000 r--p 00000000 00:00 0                          [vvar]
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0                          [vdso]
7ffff7ffc000-7ffff7ffd000 r--p 00025000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffd000-7ffff7ffe000 rw-p 00026000 08:01 1053502                    /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0 
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Program received signal SIGABRT, Aborted.
0x00007ffff7831428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

(gdb) bt
#0  0x00007ffff7831428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff783302a in __GI_abort () at abort.c:89
#2  0x00007ffff78737ea in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff798ced8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007ffff787c37a in malloc_printerr (ar_ptr=<optimized out>, ptr=<optimized out>, str=0x7ffff798d030 "free(): invalid next size (normal)", action=3) at malloc.c:5006
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3867
#5  0x00007ffff788053c in __GI___libc_free (mem=<optimized out>) at malloc.c:2968
#6  0x00007ffff7bccfdd in read_MSAT_body (sectorCount=8388609, sectorOffset=<optimized out>, ole2=0x603420) at src/ole.c:687
#7  read_MSAT (oleh=0x6036e0, ole2=0x603420) at src/ole.c:757
#8  ole2_read_header (ole=ole@entry=0x603420) at src/ole.c:399
#9  0x00007ffff7bcd5b7 in ole2_open_file (file=file@entry=0x7fffffffe71c "xls2csv_ole_read_MSAT_body_687.xls") at src/ole.c:552
#10 0x00007ffff7bd0a62 in xls_open_file (file=0x7fffffffe71c "xls2csv_ole_read_MSAT_body_687.xls", charset=0x4016e5 "UTF-8", outError=outError@entry=0x7fffffffe364)
    at src/xls.c:1471
#11 0x0000000000400d6a in main (argc=2, argv=0x7fffffffe488) at src/xls2csv.c:116

(gdb) f 6
#6  0x00007ffff7bccfdd in read_MSAT_body (sectorCount=8388609, sectorOffset=<optimized out>, ole2=0x603420) at src/ole.c:687
687         free(sector);

Asan Debug Information

root@leon-virtual-machine:/var/asan/bin# ./xls2csv xls2csv_ole_read_MSAT_body_687.xls 
FILE: xls2csv_ole_read_MSAT_body_687.xls
=================================================================
==42611==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new vs free) on 0x61500000fd00
    #0 0x7fd8432c02ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x7fd8430012f7 in ole2_read_header src/ole.c:406
    #2 0x7fd843001fd5 in ole2_open_file src/ole.c:552
    #3 0x7fd84301035d in xls_open_file src/xls.c:1471
    #4 0x4016e8 in main src/xls2csv.c:116
    #5 0x7fd842c3b82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #6 0x401008 in _start (/var/asan/bin/xls2csv+0x401008)

AddressSanitizer can not describe address in more detail (wild memory access suspected).
SUMMARY: AddressSanitizer: alloc-dealloc-mismatch ??:0 __interceptor_free
==42611==HINT: if you don't care about these warnings you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
==42611==ABORTING

POC file

xls2csv_ole_read_MSAT_body_687.zip

CREDIT

Zhao Liang, Huawei Weiran Labs

Release planning

Hi @evanmiller,

Exciting to see activity here!

For external reasons, I have to do a minor readxl release very soon. I always update the embedded libxls when I do this.

Do you predict an official libxls release in the next week and I should try to plan around that?

Or should I proceed on my own schedule? If I do not / cannot wait for an official libxls release, is there a specific SHA or tag you regard as more stable and release-ish?

compile xls2csv on windows

Hi,

Could you tell me the command for building only an .exe from xls2csv.c using gcc on Windows?

Thank you in advance.
Best regards.

"Unable to allocate memory" with a fairly innocuous-looking xls

Original report from tidyverse/readxl#578

I have reproduced the report. Namely that this file

https://www.gov.scot/binaries/content/documents/govscot/publications/statistics/2014/11/recorded-crime-scotland-2013-14/documents/recorded-crime-scotland-2013-14-excel-tables/recorded-crime-scotland-2013-14-excel-tables/govscot%3Adocument/00467392.xls

can't be read with readxl, or, really, with libxls. I see:

> read_excel("~/Downloads/00467392.xls")
Error: 
  filepath: /Users/jenny/Downloads/00467392.xls
  libxls error: Unable to allocate memory

It's not really that large. It has lots of worksheets, but it's not obvious to me why it's a dealkiller.

Contrary to what OP reports, if I save this as xlsx in Excel, then I can read it with readxl, which travels an entirely different code path.

AX_CXX_COMPILE_STDCXX_11 not found

Not sure what the proper solution is but with the recent C++ changes I get

./configure: line 17117: AX_CXX_COMPILE_STDCXX_11: command not found

Guidance on packaging?

Hi, I'm creating a pkgsrc package for libxls. There are a few options wrt. the CVE situation:

  1. Release 1.4.0 as-is and mark it as vulnerable (users will have to override a flag to install)
  2. Release 1.4.0 with cherry picked patches for the CVE and memory issues
  3. Wait for 1.5.0

I'm a bit wary of no. 2 so I suppose it's mostly about the timeframe of a 1.5.0 release. What would you recommend?

xls file throws no error, but no cell data retrieved ("Chile sheet")

Original report: tidyverse/readxl#516

OP notes that, with another tool, they can read it if they specify the encoding is latin1.

With xls2csv built from b20521f, I see this:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/rrr/readxl/investigations/8029.xls -e latin1
FILE: /Users/jenny/rrr/readxl/investigations/8029.xls

No error, but also no data.

I attach the offending file, 8029.xls, as a zip. The original website is in Spanish and has lots of dropdown menus 😬

8029.zip

workflow hiccup for people who git pull

Something that's changed recently means that I have to do autoreconf -f -i every time I pull from upstream and want to rebuild the xls2csv tool.

Is there anything that is easy to change re: the setup that would make this problem go away? Or do I just need to remember to do this now? If that's the case, it should probably be added to the README.

jenny@2015-mbp libxls-evanmiller-github $ autoreconf -f -i
glibtoolize: putting auxiliary files in '.'.
glibtoolize: copying file './ltmain.sh'
glibtoolize: Consider adding 'AC_CONFIG_MACRO_DIRS([m4])' to configure.ac,
glibtoolize: and rerunning glibtoolize and aclocal.
glibtoolize: Consider adding '-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
configure.ac:25: installing './compile'
configure.ac:22: installing './missing'
Makefile.am: installing './depcomp'

Release planning

Thanks for such prompt attention to the recent issues from readxl.

We are pushing to a release, in order to get a readxl version on CRAN that includes the fixes for the CVEs.

readxl is in a functional state as of c7e5e49, i.e. all tests pass.

However, we have to do a fair amount of manual patching to make embedded libxls acceptable to CRAN 😔, i.e. to generate no NOTEs with their compiler settings on all tested platforms.

I now see many commits have been made since then and I want to take advantage of the most current libxls when I submit to CRAN. But I also want to minimize the manual application of my "CRAN fixups".

Can you give me a sense of your near-term plans? Do you foresee sporadic development for a while or are you pushing towards a distinct release too? It helps me plan my work.

Thanks in advance.

MinGW Compilation

Hi,

I find this library very good. However, I'm lost as to how it can be compiled on Windows using MinGW. Also, could you provide a how-to for generating a static version of the library?

Kudos and kind regards!

Can't run ./bootstrap

I'd like to rerun ./configure on my machine, but first I gather I need to do ./bootstrap. But that fails:

jenny@2015-mbp libxls-evanmiller-github $ ./bootstrap
configure.ac:67: warning: macro 'AM_ICONV' not found in library
glibtoolize: putting auxiliary files in '.'.
glibtoolize: copying file './ltmain.sh'
glibtoolize: Consider adding 'AC_CONFIG_MACRO_DIRS([m4])' to configure.ac,
glibtoolize: and rerunning glibtoolize and aclocal.
glibtoolize: Consider adding '-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
configure.ac:67: warning: macro 'AM_ICONV' not found in library
configure.ac:67: error: possibly undefined macro: AM_ICONV
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
autoreconf: /usr/local/Cellar/autoconf/2.69/bin/autoconf failed with exit status: 1

Malloc failed in ole.c:637 read_MSAT_body

This seems to be different from issue 35.

Test Version

dev version, git clone https://github.com/evanmiller/libxls

Test Program

xls2csv [infile]

Gdb and Backtrace

Reading symbols from xls2csv...done.
(gdb) run xls2csv_ole_read_MSAT_body_637.crash
Starting program: /opt/normal/bin/xls2csv xls2csv_ole_read_MSAT_body_637.crash
FILE: xls2csv_ole_read_MSAT_body_637.crash
xls2csv: malloc.c:2394: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff7a42428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7a42428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff7a4402a in __GI_abort () at abort.c:89
#2  0x00007ffff7a8a2e8 in __malloc_assert (
    assertion=assertion@entry=0x7ffff7b9e190 "(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)", file=file@entry=0x7ffff7b9abc5 "malloc.c", line=line@entry=2394,
    function=function@entry=0x7ffff7b9e9d8 <__func__.11509> "sysmalloc") at malloc.c:301
#3  0x00007ffff7a8e426 in sysmalloc (nb=nb@entry=528, av=av@entry=0x7ffff7dd1b20 <main_arena>) at malloc.c:2391
#4  0x00007ffff7a8f743 in _int_malloc (av=av@entry=0x7ffff7dd1b20 <main_arena>, bytes=bytes@entry=512) at malloc.c:3827
#5  0x00007ffff7a91184 in __GI___libc_malloc (bytes=512) at malloc.c:2913
#6  0x000000000040726a in ole_malloc (len=<optimized out>) at src/ole.c:66
#7  read_MSAT_body (sectorCount=8388610, sectorOffset=109, ole2=0x60e420) at src/ole.c:637
#8  read_MSAT (oleh=0x60e6e0, ole2=0x60e420) at src/ole.c:757
#9  ole2_read_header (ole=ole@entry=0x60e420) at src/ole.c:399
#10 0x0000000000407442 in ole2_open_file (file=file@entry=0x7fffffffe74e "xls2csv_ole_read_MSAT_body_637.crash") at src/ole.c:552
#11 0x0000000000404a62 in xls_open_file (file=0x7fffffffe74e "xls2csv_ole_read_MSAT_body_637.crash", charset=0x4075ff "UTF-8",
    outError=outError@entry=0x7fffffffe394) at src/xls.c:1471
#12 0x0000000000400f5a in main (argc=2, argv=0x7fffffffe4b8) at src/xls2csv.c:116
(gdb) f 7
#7  read_MSAT_body (sectorCount=8388610, sectorOffset=109, ole2=0x60e420) at src/ole.c:637
637         DWORD *sector = ole_malloc(ole2->lsector);
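
The backtrace above shows `sectorCount=8388610`, far more sectors than a small test file could contain. As a minimal sketch only (this is not the libxls fix, and `sector_count_is_plausible` is a hypothetical helper that does not exist in `src/ole.c`), a parser could reject such a count before allocating or reading anything:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libxls: returns true when a sector
 * count read from an untrusted header could actually fit inside a file
 * of file_size bytes, given sector_size bytes per sector. */
bool sector_count_is_plausible(uint32_t sector_count,
                               size_t sector_size,
                               size_t file_size)
{
    if (sector_size == 0)
        return false;   /* avoid dividing by zero on a corrupt header */

    /* Divide rather than multiply so the comparison cannot overflow. */
    return sector_count <= file_size / sector_size;
}
```

With 512-byte sectors, a 64 KiB file can hold at most 128 sectors, so a count of 8388610 would be refused up front rather than driving later reads and allocations.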

POC file

xls2csv_ole_read_MSAT_body_637.zip

CREDIT

Zhao Liang, Huawei Weiran Labs

Add support for password-protected (encrypted) files

We've discussed this one before:

#14

I have no evidence that libxls or readxl has ever been able to read this file. So it's conceivable that we continue to let this one go. But it's worth a quick look.

xls is here

https://www.seco.admin.ch/dam/seco/de/dokumente/Wirtschaft/Wirtschaftslage/VIP%20Quartalsschätzungen/qna_p_na.xls.download.xls/qna_p_na.xls

With the xls2csv tool built from b6192a3eb4f5ba99435db259ca1fb442b1e8cba7 I see this:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/rrr/readxl/investigations/qna_p_na.xls
FILE: /Users/jenny/rrr/readxl/investigations/qna_p_na.xls
Error reading XLS file: Unable to allocate memory

Tracking in readxl here: tidyverse/readxl#417

LICENSE

readxl currently communicates the license of libxls via the file LICENSE.note as per CRAN guidelines. I include the entirety of the file below.

Q1: Does this look as it should?

Q2: OK maybe this is more of a comment. I can't find clear licensing info in this repo. But perhaps that is due to the fact that the status of this repo is not yet fully defined.

libxls

The included libxls code is licensed under the BSD 2 clause license:

Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of
    conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list
    of conditions and the following disclaimer in the documentation and/or other materials
    provided with the distribution.

THIS SOFTWARE IS PROVIDED BY David Hoerl ''AS IS'' AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL David Hoerl OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

How to build static library?

I have built the shared library, it works fine and I am happy.
But, I prefer a static library, how can I build it?

Build Issue

configure fails, as shown at the end of this report.

[root@localhost sf_Libraries]# wget https://github.com/libxls/libxls/archive/v1.5.2.tar.gz
--2019-11-16 15:55:16-- https://github.com/libxls/libxls/archive/v1.5.2.tar.gz
Resolving github.com (github.com)... 13.234.210.38
Connecting to github.com (github.com)|13.234.210.38|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/libxls/libxls/tar.gz/v1.5.2 [following]
--2019-11-16 15:55:16-- https://codeload.github.com/libxls/libxls/tar.gz/v1.5.2
Resolving codeload.github.com (codeload.github.com)... 13.233.43.20
Connecting to codeload.github.com (codeload.github.com)|13.233.43.20|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘v1.5.2.tar.gz’

[ <=>                                   ] 322,770     --.-K/s   in 0.1s

2019-11-16 15:55:17 (2.71 MB/s) - ‘v1.5.2.tar.gz’ saved [322770]

[root@localhost sf_Libraries]# mv v1.5.2.tar.gz libxls.v1.5.2.tar.gz
[root@localhost sf_Libraries]# tar -xf libxls.v1.5.2.tar.gz
[root@localhost sf_Libraries]# cd libxls-1.5.2/
[root@localhost libxls-1.5.2]# ./bootstrap
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
configure.ac:24: installing './config.guess'
configure.ac:24: installing './config.sub'
configure.ac:21: installing './install-sh'
configure.ac:21: installing './missing'
Makefile.am: installing './depcomp'
parallel-tests: installing './test-driver'
[root@localhost libxls-1.5.2]# ./configure --prefix=/media/sf_Libraries/Installed/libxls
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /usr/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for fgrep... /usr/bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... no, using cp -pR
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @file support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... no
checking if : is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... no
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... no
checking for gcc... (cached) gcc
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to accept ISO C89... (cached) none needed
checking dependency style of gcc... (cached) gcc3
checking for gcc option to accept ISO C99... -std=gnu99
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... no
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
./configure: line 16356: syntax error near unexpected token `,'
./configure: line 16356: `AX_CXX_COMPILE_STDCXX_11(, optional)'
[root@localhost libxls-1.5.2]# cd ..
[root@localhost sf_Libraries]#

Zero size arrays and flexible array members

Re: manual patches applied in readxl when we pull from upstream libxls and whether the need can be designed away.

readxl gets built with a C++ compiler and with flags dictated by CRAN.

We get several warnings about zero size arrays and flexible array members. One example:

./libxls/xlsstruct.h:126:10: warning: flexible array members are a C99 feature [-Wc99-extensions]
    char        name[];
                ^

And another:

./libxls/xlsstruct.h:243:18: warning: zero size arrays are an extension [-Wzero-length-array]
    BYTE        strings[0];
                        ^

This has been true for a long time, certainly before I even took over as maintainer. Our traditional "solution" has been to replace every instance of thing[] or thing[0] with thing[1]. After the most recent wave of changes, this hack is no longer an option. Although readxl compiles with these changes, we get segfaults or nonsense in some of our tests and examples.

@jimhester is helping me work around this for the current readxl release, but I wanted to open a conversation about working towards a better mutual solution in the future.
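
To illustrate why swapping `thing[]` for `thing[1]` is not a neutral change, here is a small C sketch. The structs are hypothetical stand-ins, not copies of anything in `xlsstruct.h`, and whether this effect is behind the readxl failures is an assumption, not something confirmed in this thread. The point is that `sizeof` starts counting the trailing element once it is declared as `[1]`, so code that sizes or locates trailing data with `sizeof` sees different numbers, while `offsetof` stays the same for both declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a C99 flexible array member. */
struct record_fam {
    unsigned short id;
    char name[];        /* contributes no storage to sizeof */
};

/* The same record after the thing[1] workaround. */
struct record_one {
    unsigned short id;
    char name[1];       /* sizeof now includes this element and padding */
};

/* Allocate a record holding a copy of name, including its terminator.
 * offsetof() yields the header size for either declaration style. */
struct record_fam *record_new(unsigned short id, const char *name)
{
    assert(name != NULL);
    size_t name_len = strlen(name);
    struct record_fam *rec =
        malloc(offsetof(struct record_fam, name) + name_len + 1);
    if (rec == NULL)
        return NULL;
    rec->id = id;
    memcpy(rec->name, name, name_len + 1);
    return rec;
}
```

Here `offsetof(struct record_fam, name)` equals `offsetof(struct record_one, name)`, but `sizeof(struct record_one)` is strictly larger than that offset. Sizing allocations and pointer arithmetic with `offsetof` rather than `sizeof` is one way to keep such code independent of how the trailing array is declared.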

"make distcheck" fails with latest C++ inclusions

@QuLogic Would you mind taking a look?

$ make distcheck

... stuff compiles ...

  CXX      cplusplus/main.o
../../cplusplus/main.cpp:43:10: fatal error: 'XlsReader.h' file not found
#include "XlsReader.h"
         ^~~~~~~~~~~~~
1 error generated.
make[2]: *** [cplusplus/main.o] Error 1
make[1]: *** [all] Error 2
make: *** [distcheck] Error 1

I believe the Makefile.am needs to be updated - is XlsReader.h supposed to be installed or is it just for testing? If so, should it go in the regular include path or somewhere else? (I am really not familiar with C++ installation conventions.)

Line flagged in CRAN's clang-UBSAN checks

readxl currently has a ... memo, I'll call it, from their clang-UBSAN checks. readxl still passes R CMD check and our own tests cleanly but this line gets flagged:

snprintf(ret, retlen, "%d", (int)cell->d);

Here's the detailed report:

https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-UBSAN/readxl/tests/testthat.Rout

I suspect that this can be seen as a false positive, in light of external knowledge of the xls spec and memory layout. I'm trying to figure that out for myself, but would be happy for any wisdom or wording you have to share.
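
For reference, converting a `double` to `int` is undefined behavior in C when the value is NaN or lies outside the range of `int`, which is the kind of thing UBSAN reports. Assuming that conversion is what was flagged (the report itself is not reproduced here), a guarded version might look like the sketch below. `format_double_as_int` is a hypothetical helper, not a libxls function, and clamping is just one possible policy.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write d to buf as a decimal integer.
 * Out-of-range values are clamped and NaN is written as 0, so the
 * double-to-int conversion below is always well defined. */
int format_double_as_int(char *buf, size_t buflen, double d)
{
    int value;

    assert(buf != NULL && buflen > 0);

    if (d != d)                          /* true only for NaN */
        value = 0;
    else if (d >= (double)INT_MAX)
        value = INT_MAX;
    else if (d <= (double)INT_MIN)
        value = INT_MIN;
    else
        value = (int)d;                  /* in range: truncates toward zero */

    return snprintf(buf, buflen, "%d", value);
}
```

Called with `42.9`, this writes `42`; very large or very small inputs are clamped instead of being converted directly.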
