daskycodes / infer

Infer file and MIME type by checking the magic number signature

Home Page: https://hexdocs.pm/infer/readme.html

License: MIT License

Elixir 99.83% HTML 0.16% Shell 0.02%
elixir filetype mime magic-number

infer's Introduction

Infer

A dependency-free library to infer file and MIME type by checking the magic number signature.

An Elixir adaptation of the infer Rust library.
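
As a rough illustration of what "checking the magic number signature" means, the sketch below matches the leading bytes of a binary against two well-known signatures using Elixir binary pattern matching. The module name MagicSketch and its functions are hypothetical and are not part of Infer; the library's own matchers may be implemented differently.

```elixir
defmodule MagicSketch do
  # Hypothetical module for illustration only; not part of Infer.

  # PNG files begin with the fixed 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
  def png?(<<0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, _rest::binary>>), do: true
  def png?(_binary), do: false

  # GIF files begin with the ASCII text "GIF87a" or "GIF89a".
  def gif?(<<"GIF87a", _rest::binary>>), do: true
  def gif?(<<"GIF89a", _rest::binary>>), do: true
  def gif?(_binary), do: false
end
```

Because only the first few bytes are inspected, a check like this does not need the whole file in memory, and it does not depend on the file name or extension.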

Installation

The package can be installed by adding infer to your list of dependencies in mix.exs:

def deps do
  [
    {:infer, "~> 0.2.6"}
  ]
end

The docs can be found at https://hexdocs.pm/infer.

Examples

Infer.get/1

Takes the binary file contents as argument and returns the Infer.Type.t() if the file matches one of the supported types. Returns nil otherwise.

iex> binary = File.read!("test/images/sample.png")
iex> Infer.get(binary)
%Infer.Type{extension: "png", matcher: &Infer.Image.png?/1, matcher_type: :image, mime_type: "image/png"}

Infer.get_from_path/1

Similar to Infer.get/1, but takes the file path as argument.

iex> Infer.get_from_path("test/images/sample.png")
%Infer.Type{extension: "png", matcher: &Infer.Image.png?/1, matcher_type: :image, mime_type: "image/png"}

Infer.is?/2

Takes the binary content and the file extension as arguments. Returns whether the file content is of the given extension.

iex> binary = File.read!("test/images/sample.png")
iex> Infer.is?(binary, "png")
true

Infer.mime?/2

Takes the binary content and the MIME type as arguments. Returns whether the file content is of the given MIME type.

iex> binary = File.read!("test/images/sample.png")
iex> Infer.mime?(binary, "image/png")
true

Infer.image?/1

Takes the binary file contents as argument and returns whether the file is an image or not.

iex> binary = File.read!("test/images/sample.png")
iex> Infer.image?(binary)
true

Infer.document?/1

Takes the binary file contents as argument and returns whether the file is a document (Microsoft Office, Open Office).

iex> binary = File.read!("test/docs/sample.xlsx")
iex> Infer.document?(binary)
true

Infer.Doc.docx?/1

Takes the binary file contents as argument. Returns true if it's Microsoft Word Open XML Format Document (DOCX) data.

iex> binary = File.read!("test/docs/sample.docx")
iex> Infer.Doc.docx?(binary)
true

Supported Types

Image

MIME Extension
image/jpeg jpg
image/jp2 jp2
image/png png
image/gif gif
image/webp webp
image/x-canon-cr2 cr2
image/tiff tif
image/bmp bmp
image/vnd.ms-photo jxr
image/vnd.adobe.photoshop psd
image/vnd.microsoft.icon ico
image/heif heif
image/avif avif

Video

MIME Extension
video/mp4 mp4
video/x-m4v m4v
video/x-matroska mkv
video/webm webm
video/quicktime mov
video/x-msvideo avi
video/x-ms-wmv wmv
video/mpeg mpg
video/x-flv flv

Audio

MIME Extension
audio/midi midi
audio/mpeg mp3
audio/m4a m4a
audio/ogg ogg
audio/x-flac flac
audio/x-wav wav
audio/amr amr
audio/aac aac
audio/x-aiff aiff

Document

MIME Extension
application/msword doc
application/vnd.openxmlformats-officedocument.wordprocessingml.document docx
application/vnd.ms-excel xls
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx
application/vnd.ms-powerpoint ppt
application/vnd.openxmlformats-officedocument.presentationml.presentation pptx
application/vnd.oasis.opendocument.text odt
application/vnd.oasis.opendocument.spreadsheet ods
application/vnd.oasis.opendocument.presentation odp

Archive

MIME Extension
application/epub+zip epub
application/zip zip
application/x-tar tar
application/vnd.rar rar
application/gzip gz
application/x-bzip2 bz2
application/x-7z-compressed 7z
application/x-xz xz
application/pdf pdf
application/x-shockwave-flash swf
application/rtf rtf
application/octet-stream eot
application/postscript ps
application/vnd.sqlite3 sqlite
application/x-nintendo-nes-rom nes
application/x-unix-archive ar
application/x-compressed Z
application/x-lzip lz
application/x-rpm rpm
application/dicom dcm
application/zstd zst
application/x-ole-storage msi
application/x-google-chrome-extension crx
application/vnd.ms-cab-compressed cab
application/vnd.debian.binary-package deb

Font

MIME Extension
application/font-woff woff
application/font-woff2 woff2
application/font-sfnt ttf
application/font-sfnt otf

Book

MIME Extension
application/epub+zip epub
application/x-mobipocket-ebook mobi

Application

MIME Extension
application/wasm wasm
application/x-executable elf
application/vnd.microsoft.portable-executable exe
application/vnd.microsoft.portable-executable dll
application/java class
application/x-llvm bc
application/x-mach-binary mach
application/vnd.android.dex dex
application/vnd.android.dey dey
application/x-x509-ca-cert der
application/x-executable obj

License

This project is licensed under the MIT License - see the LICENSE file for details.

Copyright (c) 2021 Daniel Khaapamyaki

infer's People

Contributors

adriankumpf, daskycodes, ding-an-sich, hannesbraun, shreyanjain9

infer's Issues

Possible leaking file descriptor in `Infer.get_from_path/1,2`

Consider the implementation of Infer.get_from_path/1,2:

infer/lib/infer.ex

Lines 46 to 53 in 65beb86

@spec get_from_path(binary()) :: Infer.Type.t() | nil
def get_from_path(path, byte_size \\ 2048) do
  with {:ok, io_device} <- :file.open(path, [:read, :binary]),
       {:ok, binary} <- :file.read(io_device, byte_size) do
    :file.close(io_device)
    Enum.find(@matchers, & &1.matcher.(binary))
  end
end

If :file.read/2 returns anything other than {:ok, data}, the file won't be closed. One possible solution is to use File.open/3 with a function as an argument because it ensures the file will be closed even in case of error:

def get_from_path(path, byte_size \\ 2048) do
  result = File.open(path, [:binary, :read], fn io_device ->
    case IO.binread(io_device, byte_size) do
      binary when is_binary(binary) -> Enum.find(@matchers, & &1.matcher.(binary)) 
      _other -> nil
    end
  end)

  case result do
    {:ok, %Infer.Type{} = type} -> type
    _other -> nil
  end
end

(this implementation would also fix #9)

Infer functions fail to typecheck (functions don't exist)

Consider the following minimal example (or see the repo):

defmodule InferTypecheckIssue do
  def get(binary), do: Infer.get(binary)
  def get_from_path(path), do: Infer.get_from_path(path)
end

When running dialyzer (via dialyxir v1.4.2), it fails with errors:

$ mix dialyzer
[...snip...]
Total errors: 2, Skipped: 0, Unnecessary Skips: 0
done in 0m0.69s
lib/infer_typecheck_issue.ex:2:unknown_function
Function Infer.get/1 does not exist.
________________________________________________________________________________
lib/infer_typecheck_issue.ex:3:unknown_function
Function Infer.get_from_path/1 does not exist.
________________________________________________________________________________
done (warnings were emitted)
Halting VM with exit status 2

I must admit I have no idea why this happens. I thought it had something to do with infer's project config, so I looked at mix.exs but didn't find anything suspicious.

Invalid type spec for `Infer.get_from_path/1,2`

The type specification states that the function returns either Infer.Type.t() or nil, but it returns {:error, :file.posix()} if given a non-existing path:

iex(1)> Infer.get_from_path("/path/to/non/existing/file.jpeg")
{:error, :enoent}
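
Until the spec and the behaviour agree, a caller could normalize the three return shapes described above (a type struct, nil, or an error tuple) into tagged tuples. This is only a sketch on the calling side, not a proposed change to the library; the module name InferPathSketch and the :unknown_type atom are made up for illustration.

```elixir
defmodule InferPathSketch do
  # Hypothetical helper; pass it whatever Infer.get_from_path/1 returned.
  # Intended use: path |> Infer.get_from_path() |> InferPathSketch.normalize()

  # File errors such as {:error, :enoent} are passed through unchanged.
  def normalize({:error, reason}), do: {:error, reason}

  # nil means the file was read but no matcher recognised it.
  # :unknown_type is an illustrative atom, not something Infer returns.
  def normalize(nil), do: {:error, :unknown_type}

  # Anything else is assumed to be the matched type.
  def normalize(type), do: {:ok, type}
end
```

With this wrapper a single case expression can tell "file missing" apart from "file type not recognised", which the bare nil return does not allow.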

Replace sample files with smaller filesize

Currently the sample files infer uses for testing are generic sample files.

To improve the speed of the test suite, most of the files can be replaced by smaller files.

  • Replace sample files with smaller files
