janhq / nitro

An inference server on top of llama.cpp. OpenAI-compatible API, queue, & scaling. Embed a prod-ready, local inference engine in your apps. Powers Jan

Home Page: https://nitro.jan.ai/

License: GNU Affero General Public License v3.0

Shell 0.38% CMake 0.53% C++ 77.99% Batchfile 0.14% C 17.51% Makefile 0.13% TypeScript 3.31% JavaScript 0.02%
gguf llama2 llamacpp tensorrt-llm accelerated ai inference-engine openai-api stable-diffusion cuda

nitro's Introduction

Nitro - Embeddable AI


Documentation - API Reference - Changelog - Bug reports - Discord

โš ๏ธ Nitro is currently in Development: Expect breaking changes and bugs!

Features

  • Fast Inference: Built on top of the cutting-edge inference library llama.cpp, modified to be production ready.
  • Lightweight: Only 3MB, ideal for resource-sensitive environments.
  • Easily Embeddable: Simple integration into existing applications, offering flexibility.
  • Quick Setup: Approximately 10-second initialization for swift deployment.
  • Enhanced Web Framework: Incorporates the Drogon C++ framework to boost web service efficiency.

About Nitro

Nitro is a high-efficiency C++ inference engine for edge computing, powering Jan. It is lightweight and embeddable, ideal for product integration.

The zipped Nitro binary is only ~3 MB, with minimal to no dependencies (for example, CUDA is needed only if you use a GPU), making it well suited to any edge or server deployment 👍.

Read more about Nitro at https://nitro.jan.ai/

Repo Structure

.
├── controllers
├── docs
├── llama.cpp -> Upstream llama.cpp
├── nitro_deps -> Dependencies of the Nitro project as a sub-project
└── utils

Quickstart

Step 1: Install Nitro

  • For Linux and macOS

    curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh | sudo /bin/bash -
  • For Windows

    powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat; Remove-Item -Path 'install.bat' }"

Step 2: Download a Model

mkdir model && cd model
wget -O llama-2-7b-model.gguf "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true"

Step 3: Run Nitro server

nitro

Step 4: Load model

curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H 'Content-Type: application/json' \
  -d '{
    "llama_model_path": "/model/llama-2-7b-model.gguf",
    "ctx_len": 512,
    "ngl": 100,
  }'

Step 5: Making an Inference

curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'
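
If a model has been loaded, the server replies with an OpenAI-style chat completion object. An abridged, illustrative response (the field values are placeholders, not actual output) looks like:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The Los Angeles Dodgers won the World Series in 2020."
      }
    }
  ]
}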

Table of parameters

Parameter Type Description
llama_model_path String The file path to the LLaMA model.
ngl Integer The number of GPU layers to use.
ctx_len Integer The context length for model operations.
embedding Boolean Whether to use embedding in the model.
n_parallel Integer The number of parallel operations.
cont_batching Boolean Whether to use continuous batching.
user_prompt String The prompt to use for the user.
ai_prompt String The prompt to use for the AI assistant.
system_prompt String The prompt to use for system rules.
pre_prompt String The prompt to use for internal configuration.
cpu_threads Integer The number of threads to use for inference (CPU mode only).
n_batch Integer The batch size for the prompt evaluation step.
caching_enabled Boolean Whether to enable prompt caching.
clean_cache_threshold Integer The number of chats that triggers a clean-cache action.
grp_attn_n Integer The group-attention factor in self-extend.
grp_attn_w Integer The group-attention width in self-extend.
mlock Boolean Whether to prevent the system from swapping the model to disk (macOS).
grammar_file String The path to a GBNF grammar file used to constrain sampling.
model_type String The model type to use: llm or embedding (default: llm).
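
As an example, a loadmodel call that sets several of these parameters might look like the following sketch (the parameter names come from the table above; the values are illustrative, not recommendations):

curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H 'Content-Type: application/json' \
  -d '{
    "llama_model_path": "/model/llama-2-7b-model.gguf",
    "ctx_len": 2048,
    "ngl": 100,
    "n_parallel": 2,
    "cont_batching": true,
    "cpu_threads": 6,
    "caching_enabled": true
  }'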

OPTIONAL: You can run Nitro on a different port, e.g. 5000 instead of the default 3928, by starting it manually in a terminal (see the example after this list):

./nitro 1 127.0.0.1 5000 ([thread_num] [host] [port] [uploads_folder_path])
  • thread_num : the number of threads for the Nitro web server
  • host : the host address, normally 127.0.0.1 (localhost) or 0.0.0.0 (all interfaces)
  • port : the port that Nitro listens on
  • uploads_folder_path : a custom path for file uploads in Drogon
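
For example, the following starts Nitro with 4 threads on port 5000 and sends a chat request there (a sketch; the thread count and port are arbitrary choices):

./nitro 4 127.0.0.1 5000

curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'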

Nitro server is compatible with the OpenAI format, so you can expect the same output as the OpenAI ChatGPT API.

Compile from source

To compile Nitro from source, see the Compile from source guide.

Download

  • Stable (Recommended): Windows (CPU, CUDA), macOS (Intel, M1/M2), Linux (CPU, CUDA)
  • Experimental (Nightly Build): GitHub Action artifacts

Download the latest version of Nitro at https://nitro.jan.ai/ or visit the GitHub Releases to download any previous release.

Nightly Build

A nightly build is a process where the software is built automatically every night, which helps detect and fix bugs early in the development cycle. The process for this project is defined in .github/workflows/build.yml.

You can join our Discord server here and go to the github-nitro channel to monitor the build process.

The nightly build is triggered at 2:00 AM UTC every day.

The nightly build can be downloaded from the URL posted in the Discord channel. Open the URL in a browser and download the build artifacts from there.

Manual Build

A manual build is a process where the software is built on demand by the developers, usually when a new feature is implemented or a bug is fixed. The process for this project is defined in .github/workflows/build.yml.

It is similar to the nightly build process, except that it is triggered manually by the developers.
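
Assuming you have the GitHub CLI installed and write access to the repository, a manual trigger might look like this (a sketch; it requires the workflow to expose a workflow_dispatch trigger):

# Trigger the build workflow manually via the GitHub CLI
gh workflow run build.yml --repo janhq/nitro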

Contact

  • For support, please file a GitHub ticket.
  • For questions, join our Discord here.
  • For long-form inquiries, please email [email protected].

Star History

Star History Chart

nitro's People

Contributors

0xsage, cameronng, dan-jan, dotieuthien, hahuyhoang411, henryh0x1, hiento09, hientominh, hiro-v, ikraduya, imtuyethan, innoobwetrust, jan-service-account, louis-jan, maurodruwel, mevemo, psugihara, shavit, tikikun, tohrnii, urmauur, vansangpfiev, wujjpp


nitro's Issues

chore: refactor jan-inference -> Nitro repo

Nitro, at the moment, encompasses:

  • llama-python-backend, llama.cpp
  • C++ server
  • Accelerated models (submodule?)
  • GGML models (submodule?)

The point is that we'll be adding more to it long term.

Load model failure should exit with code 1 instead of continuing to serve the HTTP server

[1] stderr: gguf_init_from_file: invalid magic number 0a8a0280
[1] 
[1] stderr: error loading model: llama_model_loader: failed to load model from /Users/louis/Library/Application Support/jan-electron/pytorch_model.bin
[1] 
[1] llama_load_model_from_file: failed to load model
[1] llama_init_from_gpt_params: error: failed to load model '/Users/louis/Library/Application Support/jan-electron/pytorch_model.bin'
[1] 
[1] stdout: 20231005 01:38:04.960344 UTC 4991698 INFO   - main.cc:27
[1] 20231005 01:38:04.971173 UTC 4991698 INFO  {"timestamp":1696469884,"level":"WARNING","function":"llamaCPP","line":1198,"message":"build info","build":1273,"commit":"99115f3"} - llamaCPP.h:108
[1] 20231005 01:38:04.971215 UTC 4991698 INFO  {"timestamp":1696469884,"level":"WARNING","function":"llamaCPP","line":1204,"message":"system info","n_threads":6,"total_threads":10,"system_info":"AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | "} - llamaCPP.h:108
[1] 20231005 01:38:04.971447 UTC 4991698 INFO  {"timestamp":1696469884,"level":"ERROR","function":"loadModel","line":245,"message":"unable to load model","model":"/Users/louis/Library/Application Support/jan-electron/pytorch_model.bin"} - llamaCPP.h:108
[1] 20231005 01:38:04.971451 UTC 4991698 INFO  "Error loading the model" - llamaCPP.h:108
[1]       ___                                   ___           ___     
[1]      /__/        ___           ___        /  /\         /  /\    
[1]      \  \:\      /  /\         /  /\      /  /::\       /  /::\   
[1]       \  \:\    /  /:/        /  /:/     /  /:/\:\     /  /:/\:\  
[1]   _____\__\:\  /__/::\       /  /:/     /  /:/  \:\   /  /:/  \:\ 
[1]  /__/::::::::\ \__\/\:\__   /  /::\    /__/:/ /:/___ /__/:/ \__\:\
[1]  \  \:\~~\~~\/    \  \:\/\ /__/:/\:\   \  \:\/:::::/ \  \:\ /  /:/
[1]   \  \:\  ~~~      \__\::/ \__\/  \:\   \  \::/~~~~   \  \:\  /:/ 
[1]    \  \:\          /__/:/       \  \:\   \  \:\        \  \:\/:/  
[1]     \  \:\         \__\/         \__\/    \  \:\        \  \::/   
[1]      \__\/                                 \__\/         \__\/    
[1] 

bug: cuBlas build is currently not working

mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release

These commands are supposed to enable Nitro to run on NVIDIA GPUs, but the build currently segfaults.

Ship Nitro as a Binary

Nitro should be statically built and distributed as a binary

Tasks

  • Build drogon with llama cpp
  • Spin up Mac VM for testing
  • Target mac os, x86, metal supported binary
  • (clarify?) We have llm endpoint using ggml
  • Server can be configured using a config file

Success criteria

  • Nitro is a multi-platform binary
  • Runs a Drogon C++ server
  • Serves llama.cpp in Metal or CPU-only modes
  • Includes encoding / decoding
  • An architecture diagram showing how everything fits together

Add github action for nitro build

  • Use Github action in janhq
  • Artifacts: Github releases
  • Runner matrix for build status

Platform

  • Linux - amd64 - with/without CUDA
  • Mac - amd64 - without Metal
  • Mac - arm64 - with Metal

feat: Nitro speed up for 1st inference time after model loaded

Problem
The first request to the Nitro web server is slow, which is frustrating. Once the server is ready, a user should be able to get a quick result.

Success Criteria

  • The first user request should be fast
  • A mock request should be made right after the model loads, to warm it up
  • /health should return 500 while model warm-up is not yet done, and 200 once it is (see the polling sketch below). The process-exit case has already been handled

Additional context
None atm
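
Assuming the /health behavior described above, a client could wait for warm-up to finish with a simple polling loop (a sketch; the port and endpoint path follow the defaults used elsewhere in this README):

# Poll /health once per second until the server reports 200 (warm-up done)
until [ "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:3928/health)" = "200" ]; do
  sleep 1
done
echo "Nitro is warmed up and ready"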

feat: Nitro should support docker image

Problem

  • The manual build steps in README.md are frustrating, especially when there are system-dependency bugs

Success Criteria

  • Dockerfile - related to #32
  • Prebuilt docker images

Additional context
None
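
Once a Dockerfile and prebuilt images exist, usage might look like the following (a sketch; the image name janhq/nitro and port mapping are assumptions, not a published image):

# Build the image from the repository root (assumes a Dockerfile is present)
docker build -t janhq/nitro .

# Run the server, exposing Nitro's default port
docker run -p 3928:3928 janhq/nitro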

Nitro has an installation script and configuration

Success Criteria

  • User should be able to configure Nitro and change defaults
  • User should be able to run a single script to install Nitro/OS dependencies
  • User should be able to deploy Nitro service
  • User should be able to integrate Nitro with Jan seamlessly

Rough spec

An installation path could look like the following

  1. Install dependencies: ./install.sh

    logs:

    # If gpu_mode:
    echo "Running Nitro on GPUs, checking dependencies"
    # install nvidia-smi
    ...

  2. Configure .env

    NITRO_PORT: 8000
    GPU_MODE: true

    # What other configs are possible for a good UX?
  3. Install the model(s) into a directory

    wget ... /models
  4. Run Nitro: run.sh
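
A minimal run.sh matching this spec might look like the following (a hypothetical sketch; NITRO_PORT and GPU_MODE come from the .env in step 2, and the sketch assumes .env uses KEY=VALUE lines, e.g. NITRO_PORT=8000):

#!/bin/bash
set -e

# Load NITRO_PORT, GPU_MODE, etc. from .env (assumes KEY=VALUE lines)
export $(grep -v '^#' .env | xargs)

# Start Nitro with 1 thread on the configured port, defaulting to 3928
./nitro 1 127.0.0.1 "${NITRO_PORT:-3928}"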

Epic: Refactor Nitro into a standalone inference service on top of Llama.cpp, compatible with Jan

Deliverable

  • janhq/nitro Github Repo #20
  • Nitro documentation with overview and installation janhq/jan#113
  • Stretch goal: endpoint /models returns a list of models that have been downloaded & are ready to be used

Owners

Big Picture

  • Jan can take in a Nitro server URL
  • Nitro can run on Apple Silicon (GGUF, can drop GGML)
  • Nitro can run on Nvidia GPUs (with llama.cpp)
  • We are a .cpp compatible server

Exclusions

  • Focus on Llama.cpp first; we will tackle TensorRT in a subsequent sprint (aligned with our DGX cluster arriving)
