bfabiszewski / libmobi
C library for handling Kindle (MOBI) formats of ebook documents
License: GNU Lesser General Public License v3.0
A detailed bug report is available here. Developers can access it by logging in.
There is an error in how mobi_load_rec parses record offsets. If the next record's offset is lower than the current record's offset, the subtraction yields a negative size that wraps around in the unsigned integer, so the malloc in mobi_load_recdata can be enormous.
    if (curr->next != NULL) {
        next = curr->next;
        size = next->offset - curr->offset; // <- integer overflow here
    } else {
        ....stripped
    }
    curr->size = size;
    ret = mobi_load_recdata(curr, file); // -> malloc(curr->size); -> enormous malloc
Here is a sample that shows this behaviour:
sample.zip
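One way to guard against the wrap-around described above is to compare the two offsets before subtracting. The following is only a minimal sketch: record_size_checked is a hypothetical helper that does not exist in libmobi, and the use of uint32_t for offsets is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libmobi.
 * Stores next_offset - curr_offset in *size only when the offsets are in
 * ascending order. Returns false (and leaves *size untouched) when the
 * subtraction would wrap around to a huge unsigned value. */
static bool record_size_checked(uint32_t curr_offset, uint32_t next_offset,
                                uint32_t *size) {
    assert(size != NULL);
    if (next_offset < curr_offset) {
        return false; /* malformed file: offsets out of order */
    }
    *size = next_offset - curr_offset;
    return true;
}
```

A caller would treat a false return as a malformed file and stop loading before any allocation is attempted.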
Please take a look: I can't convert this PDB file to EPUB. I get:
"Error while loading document (Unsupported document format)"
PDB.zip
I am puzzled by the _buffer_get_varlen function. Why does it read only 7 bits from each byte, and why does it stop when a byte has bit 7 set? I am also confused about this condition. Shouldn't it simply read 8 bits at a time?
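For context, a variable-length integer of this kind stores 7 data bits per byte and reserves bit 7 as an end marker, so the value needs no separate length field. The sketch below illustrates the forward direction only. varlen_decode_forward is a hypothetical stand-alone function, not libmobi's implementation, and it assumes the most significant 7-bit group comes first and that at most 4 bytes are read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone decoder, not libmobi's code.
 * Reads at most 4 bytes from data. Each byte contributes its low 7 bits,
 * most significant group first. A byte with bit 7 set ends the number.
 * Returns the number of bytes consumed, or 0 if the data ran out before
 * a terminating byte (or 4 bytes) was seen. */
static size_t varlen_decode_forward(const uint8_t *data, size_t len,
                                    uint32_t *value) {
    assert(data != NULL && value != NULL);
    uint32_t val = 0;
    size_t used = 0;
    while (used < 4) {
        if (used >= len) {
            return 0; /* ran past the end of the input */
        }
        uint8_t byte = data[used++];
        val = (val << 7) | (uint32_t)(byte & 0x7f); /* keep 7 data bits */
        if (byte & 0x80) {
            break; /* bit 7 set: this was the last byte */
        }
    }
    *value = val;
    return used;
}
```

With this layout the three bytes 0x04 0x22 0x91 decode to 0x11111: the first two bytes have bit 7 clear and only carry data, and the third has bit 7 set and closes the number.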
Hello @bfabiszewski
I am using your other library, QLMobi, together with libmobi to parse HTML and images from MOBI books.
Most books work great, but for some books the media images cannot be retrieved.
I have tried to fix this but cannot find the cause. I hope you can help; I think this is the last problem for me.
Both QLMobi and libmobi are great, nearly perfect libraries.
Thank you very much for your great work.
World of Warcraft - Dawn of the Aspects Part I.mobi.zip
I am also the developer of Alook Browser - 2x Speed (https://itunes.apple.com/us/app/alook-web-browser-2x-speed/id1261944766?mt=8). If you are using iOS, here is a promotional code: JWYTH3FE4JJK
Forgive my poor English.
Best Regards.
Hey there!
I'd like to report a security issue but cannot find contact instructions on your repository.
If it's not a hassle, could you kindly add a SECURITY.md file with an email address or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.
Thank you for your consideration, and I look forward to hearing from you!
(cc @huntr-helper)
Using our fuzzer, we found several heap-buffer-overflow errors when compiling libmobi with AddressSanitizer and running the command mobitool -i7m $file. Someone else also found a few others here.
We will list them separately in the following issue threads; this is the first one.
POC (proof-of-crash) files:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A212_1.mobi
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A212_2.mobi
gdb output:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c:212_1.mobi.gdb.txt
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c:212_2.mobi.gdb.txt
POC files:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_util.c%3A2759_1.mobi
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_util.c%3A2759_2.mobi
gdb output:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_util.c%3A2759_1.mobi.gdb.txt
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_util.c%3A2759_2.mobi.gdb.txt
The GitHub-provided tarballs are impractical because they are about 45 megabytes in size.
Converting a MOBI file to EPUB completes successfully, but the resulting EPUB is malformed: it cannot be opened by iBooks or by many Android EPUB readers. I checked the EPUB file with calibre-edit and got the errors below:
ERROR: Parsing failed: xmlParseEntityRef: no name, line 1, column 807 [OEBPS/part00000.html]
INFO: File too large [OEBPS/part00000.html]
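The "xmlParseEntityRef: no name" message is what libxml2 reports when an '&' is not followed by a valid entity name, which usually points to a bare ampersand in the markup. Whether that is the cause in this particular file is an assumption. The sketch below shows one way such ampersands could be escaped before the HTML is written; both functions are hypothetical and not part of libmobi, and the check only looks at the shape of a reference, not at whether a named entity is actually defined.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s (which points at '&') is shaped like
 * a reference such as &name; &#123; or &#x1F; and 0 otherwise. */
static int looks_like_reference(const char *s) {
    assert(s != NULL && *s == '&');
    const char *p = s + 1;
    if (*p == '#') {
        p++;
        if (*p == 'x' || *p == 'X') {
            p++;
            if (!isxdigit((unsigned char) *p)) return 0;
            while (isxdigit((unsigned char) *p)) p++;
        } else {
            if (!isdigit((unsigned char) *p)) return 0;
            while (isdigit((unsigned char) *p)) p++;
        }
    } else {
        if (!isalpha((unsigned char) *p)) return 0;
        while (isalnum((unsigned char) *p)) p++;
    }
    return *p == ';';
}

/* Hypothetical helper: copies src into dst, replacing every bare '&' with
 * "&amp;". Returns the length written (excluding the terminator), or
 * (size_t) -1 if dst is too small. */
static size_t escape_bare_ampersands(const char *src, char *dst,
                                     size_t dst_size) {
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (; *src != '\0'; src++) {
        const char *piece = src;
        size_t piece_len = 1;
        if (*src == '&' && !looks_like_reference(src)) {
            piece = "&amp;";
            piece_len = 5;
        }
        if (out + piece_len >= dst_size) {
            return (size_t) -1; /* no room left for data plus terminator */
        }
        memcpy(dst + out, piece, piece_len);
        out += piece_len;
    }
    if (out >= dst_size) {
        return (size_t) -1;
    }
    dst[out] = '\0';
    return out;
}
```

In this sketch "Tom & Jerry" becomes "Tom &amp; Jerry", while text that already contains "&amp;" or "&#38;" is copied unchanged.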
When I parse a big file, MOBI_RET mobi_parse_rawml(MOBIRawml *rawml, const MOBIData *m) is too slow.
How can I get the HTML content with a specified sequence number as fast as possible?
Parsing this MOBI file fails, but it can be opened in the Kindle app.
The file is attached.
Thank you for your great work~
Best Regards.
World of Warcraft - Dawn of the Aspects Part I.mobi.zip
Developers can access the bug details here.
libmobi.rc is missing from repo.
I have an AZW3 file that I cannot post publicly, but I could send it to you by email for testing. When converted to EPUB, it produces a non-functional Table of Contents (TOC): the chapter names are correct, but the links do not work. The TOC entries look like this:
<navPoint id="toc-2" playOrder="2">
  <navLabel>
    <text>CAP&Iacute;TULO II: Otra mudanza ca&oacute;tica</text>
  </navLabel>
  <content src="part00000.html#"/>
</navPoint>
Note that the content src value is missing an anchor after the # character. The same happens with internal links in the ebook text that point to the chapter titles. The same file converts fine to EPUB with other tools, e.g. Calibre.
BTW, tried to email you privately about this first, but the email does not go through and sits in the retry queue. Your own mail server at your .net domain says that your email address is graylisted...
Greg
POC files:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A230_1.mobi
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A230_2.mobi
gdb output:
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A230_1.mobi.gdb.txt
https://github.com/ntu-sec/pocs/blob/master/libmobi/hbo_buffer.c%3A230_2.mobi.gdb.txt
Currently mobitool -t can output both .mobi and .azw3.
If a user has thousands of MOBI files and only wants the .azw3 files, generating all the .mobi and .azw3 files and then deleting all the .mobi files is a waste of time and disk space.
Please add an option like -k8 or -k7 to mobitool so that the user can extract only one type.
The README lists a lot of features, but they're all apparently centered around reading or modifying an existing file.
Can libmobi also create new ebooks from scratch? (For use in an EPUB->MOBI conversion software) If yes, maybe another bullet point in the README clarifying that would be useful 🙂
Thanks for creating this cool library!
Thanks for your project!
Please add support for the new Kindle format, KFX
(sample attached)
sample.zip
Parsing this MOBI file fails, but it can be opened in the Kindle app.
The file is attached.
请停止无效努力.mobi.zip
Thank you for your great work~
Best Regards.
Homebrew is an awesome package manager for macOS. If you add a brew formula, i.e. libmobi.rb, it will be very convenient to install libmobi on macOS.
Please increase MOBI_ATTRNAME_MAXSIZE and MOBI_ATTRVALUE_MAXSIZE to 150
#define MOBI_ATTRNAME_MAXSIZE 150 /**< Maximum length of tag attribute name, like "href" */
#define MOBI_ATTRVALUE_MAXSIZE 150 /**< Maximum length of tag attribute value */
thanks
printf("Could not initialize zip archive\n");
Here is the link to the file I tested.
https://1drv.ms/u/s!AkaVccfysLmAhI5Odqj2pZ1QCMci6g?e=U9lkC3
Hi,
I use create_epub(const MOBIRawml *rawml, const char *fullpath) to create an EPUB file, but the resulting file has the wrong format and can't be opened. Thanks.
Hi Bartek!
One of the users of my app (@voice Aloud Reader in Google Play) sent me the first azw4 ebook. Do you think you could include this format into your library? Would you need any help with this? I was able to convert the file to epub using the latest calibre, but the format is weird - short lines about 80 characters long formatted as <p>...</p>. Could be a problem with this original file, or calibre's conversion process, don't know at this time.
OK, just managed to update my old Kindle HDX 3rd generation, and it opened the azw4 file fine, no problem with formatting there. Apparently Calibre's conversion is not perfect yet. Please let me know if you have any plans regarding AZW4. Thanks!
Grzesiek
When you create a MOBIBuffer object:
typedef struct {
    size_t offset; /**< Current offset in respect to buffer start */
    size_t maxlen; /**< Length of the buffer data */
    unsigned char *data; /**< Pointer to buffer data */
    MOBI_RET error; /**< MOBI_SUCCESS = 0 if operation on buffer is successful, non-zero value on failure */
} MOBIBuffer;
the initial value of buf->offset is 0:
MOBIBuffer * mobi_buffer_init_null(unsigned char *data, const size_t len) {
    MOBIBuffer *buf = malloc(sizeof(MOBIBuffer));
    if (buf == NULL) {
        debug_print("%s", "Buffer allocation failed\n");
        return NULL;
    }
    buf->data = data;
    buf->offset = 0;
    buf->maxlen = len;
    buf->error = MOBI_SUCCESS;
    return buf;
}
I think there is a problem when calling mobi_buffer_get_varlen_internal with direction set to -1 (read buffer backwards) and buf->offset equal to 3.
If buf->offset is 3, the function should, per its documentation, "Reads maximum 4 bytes from the buffer. Stops when byte has bit 7 set.", so it should read byte number 3, byte number 2, byte number 1, and then byte number 0.
But when it comes to reading byte number 0, we hit the following check at line 267:
if (buf->offset < 1) {
It checks whether zero is less than 1, which it is, so an error is printed and only the 3 bytes already read are returned, not all 4 (even though, according to the pull request, it should return 0).
If the function needs to read byte number 0, it should read it and then return without decrementing buf->offset below 0, because doing so would cause an integer underflow and leave the maximum size_t value in buf->offset. So I suggest checking whether buf->offset is 0 after reading the byte into byte and updating val. If buf->offset is 0, we should check byte_count and, depending on its value, decide whether to execute
debug_print("%s", "End of buffer\n");
buf->error = MOBI_BUFFER_END;
return 0;
or to set byte to stop_flag so that it stops reading and returns val, while keeping buf->offset at 0.
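The backward read discussed above can be sketched so that index 0 is read like any other byte and the cursor is never decremented below zero. This is a stand-alone illustration, not libmobi's code: varlen_decode_backward is a hypothetical function, and unlike the library it leaves the cursor on the last byte it consumed instead of one position before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone sketch, not libmobi's implementation.
 * Decodes a variable-length integer backwards, starting at data[*pos].
 * Each byte contributes its low 7 bits, least significant group first.
 * Reading stops after a byte with bit 7 set, after 4 bytes, or after
 * index 0 has been read. On return *pos is the index of the last byte
 * consumed, so it can reach 0 but never wraps around.
 * Returns the number of bytes consumed, or 0 if *pos is out of range. */
static size_t varlen_decode_backward(const uint8_t *data, size_t len,
                                     size_t *pos, uint32_t *value) {
    assert(data != NULL && pos != NULL && value != NULL);
    if (*pos >= len) {
        return 0; /* cursor outside the buffer */
    }
    uint32_t val = 0;
    unsigned shift = 0;
    size_t used = 0;
    size_t i = *pos;
    for (;;) {
        uint8_t byte = data[i];
        val |= (uint32_t)(byte & 0x7f) << shift;
        shift += 7;
        used++;
        if ((byte & 0x80) || used == 4 || i == 0) {
            break; /* stop flag, byte limit, or start of buffer reached */
        }
        i--; /* safe: i > 0 here */
    }
    *pos = i;
    *value = val;
    return used;
}
```

Starting at index 3 of a four-byte buffer with no stop flag, this reads bytes 3, 2, 1 and 0 and leaves the cursor at 0. A caller that wants to treat "start of buffer reached without a stop flag" as an error can compare the return value and the last byte read, which corresponds to the byte_count check suggested above.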
main.c:10:11: fatal error: 'mobi.h' file not found
How can I fix this error?
How can I build this as a dll so that I can consume in a C# project?
Thank you, and sorry to trouble you again.
Best Regards.
I am getting a CMake error if I enable MOBI_DEBUG on Windows (VS 2022):
cl : command line error D8021: invalid numeric argument '/Wextra'
The issue is with "World of Warcraft - Dawn of the Aspects Part I.mobi" ebook file, submitted with "Mobi file can't parse #10" by @LiuDeng:
The toc.ncx that libmobi generates from this file has wrong links. For example for "Part I" we have in toc.ncx:
<navPoint id="toc-3" playOrder="3">
  <navLabel>
    <text>Part I</text>
  </navLabel>
  <content src="part00000.html#0000006908"/>
However, there is no element with id "0000006908" in part00000.html at all. Instead, the "Part I" header is preceded by:
<a id="0000006902">
Could you maybe tell me where and how the toc.ncx is constructed, maybe then I could find a fix on my own...
Greg
It would be good if you could add CHM to EPUB conversion.
It should be easy for you to extract the CHM into a list of HTML files and create an EPUB from them.
Maybe you can use
https://github.com/jedwing/CHMLib
$ mobitool.exe -d data/googled.mobi -o .
Title: Googled
Author: Ken Auletta
. . . <output snipped> . . .
Dumping rawml...
Saving rawml to data/googled.rawml
diff --git a/src/util.c b/src/util.c
index be08b26..8887afd 100644
--- a/src/util.c
+++ b/src/util.c
@@ -1601,7 +1601,7 @@ static MOBI_RET mobi_decompress_content(const MOBIData *m, char *text, FILE *fil
if (dump) {
fwrite(decompressed, 1, decompressed_size, file);
} else {
- if (text_length > *len) {
+ if (text_length + decompressed_size > *len) {
debug_print("%s", "Text buffer too small\n");
/* free huff/cdic tables */
mobi_free_huffcdic(huffcdic);
--
2.7.4
Developers of this repo can access the bug details here.