Coder Social home page Coder Social logo

epubreader's Introduction

EpubReader

.NET library for reading EPUB files.

Build NuGet

Supports .NET Standard >= 1.3 (which includes .NET Core >= 1.0 and .NET >= 5) and .NET Framework >= 4.6.

Supports EPUB 2 (2.0, 2.0.1) and EPUB 3 (3.0, 3.0.1, 3.1, 3.2).

Download | WPF & .NET 6 console demo apps

Example

// Opens a book and reads all of its content into memory
EpubBook epubBook = EpubReader.ReadBook("alice_in_wonderland.epub");

            
// COMMON PROPERTIES

// Book's title
string title = epubBook.Title;

// Book's authors (comma separated list)
string author = epubBook.Author;

// Book's authors (list of authors names)
List<string> authors = epubBook.AuthorList;

// Book's cover image (null if there is no cover)
byte[] coverImageContent = epubBook.CoverImage;
if (coverImageContent != null)
{
    using (MemoryStream coverImageStream = new MemoryStream(coverImageContent))
    {
        Image coverImage = Image.FromStream(coverImageStream);
    }
}
            
// TABLE OF CONTENTS

// Enumerating chapters
foreach (EpubNavigationItem chapter in epubBook.Navigation)
{
    // Title of chapter
    string chapterTitle = chapter.Title;
                
    // Nested chapters
    List<EpubNavigationItem> subChapters = chapter.NestedItems;
}

// READING ORDER

// Enumerating the whole text content of the book in the order of reading
foreach (EpubTextContentFile textContentFile in book.ReadingOrder)
{
    // HTML of current text content file
    string htmlContent = textContentFile.Content;
}

            
// CONTENT

// Book's content (HTML files, stlylesheets, images, fonts, etc.)
EpubContent bookContent = epubBook.Content;

            
// IMAGES

// All images in the book (file name is the key)
Dictionary<string, EpubByteContentFile> images = bookContent.Images;

EpubByteContentFile firstImage = images.Values.First();

// Content type (e.g. EpubContentType.IMAGE_JPEG, EpubContentType.IMAGE_PNG)
EpubContentType contentType = firstImage.ContentType;

// MIME type (e.g. "image/jpeg", "image/png")
string mimeType = firstImage.ContentMimeType;

// Creating Image class instance from the content
using (MemoryStream imageStream = new MemoryStream(firstImage.Content))
{
    Image image = Image.FromStream(imageStream);
}

// Cover metadata
if (bookContent.Cover != null)
{
    string coverFileName = bookContent.Cover.FileName;
    EpubContentType coverContentType = bookContent.Cover.ContentType;
    string coverMimeType = bookContent.Cover.ContentMimeType;
}

// HTML & CSS

// All XHTML files in the book (file name is the key)
Dictionary<string, EpubTextContentFile> htmlFiles = bookContent.Html;

// All CSS files in the book (file name is the key)
Dictionary<string, EpubTextContentFile> cssFiles = bookContent.Css;

// Entire HTML content of the book
foreach (EpubTextContentFile htmlFile in htmlFiles.Values)
{
    string htmlContent = htmlFile.Content;
}

// All CSS content in the book
foreach (EpubTextContentFile cssFile in cssFiles.Values)
{
    string cssContent = cssFile.Content;
}


// OTHER CONTENT

// All fonts in the book (file name is the key)
Dictionary<string, EpubByteContentFile> fonts = bookContent.Fonts;

// All files in the book (including HTML, CSS, images, fonts, and other types of files)
Dictionary<string, EpubContentFile> allFiles = bookContent.AllFiles;


// ACCESSING RAW SCHEMA INFORMATION

// EPUB OPF data
EpubPackage package = epubBook.Schema.Package;

// Enumerating book's contributors
foreach (EpubMetadataContributor contributor in package.Metadata.Contributors)
{
    string contributorName = contributor.Contributor;
    string contributorRole = contributor.Role;
}

// EPUB 2 NCX data
Epub2Ncx epub2Ncx = epubBook.Schema.Epub2Ncx;

// Enumerating EPUB 2 NCX metadata
foreach (Epub2NcxHeadMeta meta in epub2Ncx.Head)
{
    string metadataItemName = meta.Name;
    string metadataItemContent = meta.Content;
}

// EPUB 3 navigation
Epub3NavDocument epub3NavDocument = epubBook.Schema.Epub3NavDocument;

// Accessing structural semantics data of the head item
StructuralSemanticsProperty? ssp = epub3NavDocument.Navs.First().Type;

More examples

  1. How to extract the plain text of the whole book.
  2. How to extract the table of contents.
  3. How to iterate over all EPUB files in a directory and collect some statistics.

Download latest stable release

Download from nuget.org

DLL file from GitHub: for .NET Framework (40.4 KB) / for .NET Standard (40.5 KB)

Demo apps

Download WPF demo app (WpfDemo.zip, 488 KB)

This .NET Framework application demonstrates how to open EPUB books and extract their content using the library.

HTML renderer used in this demo app may have difficulties while rendering content for some of the books if the HTML structure is too complicated.

Download .NET 6 console demo app (ConsoleDemo.zip, 185 KB)

This .NET 6 console application demonstrates how to open EPUB books and retrieve their text content.

epubreader's People

Contributors

vers-one avatar lellid avatar janek91 avatar fl3pp avatar krauser123 avatar frankdrebin893 avatar rudacs avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.