
rk3588_npu_llm_server's Introduction

Server/Web API for RK3588 NPU LLM

Update: there is now a web UI available: https://github.com/av1d/NPU-Chat/

The goal is to make LLMs running on the NPU practical and usable. I'm not a fan of CLI-only interaction because of its limited usability. The server returns a JSON response, so you can call it with cURL, AJAX, Python, or any other HTTP client.

Currently only works with Qwen.

First, install ezrknpu from Pelochus if you haven't yet:
https://github.com/Pelochus/ezrknpu
Side note: Parts of this server are borrowed from the original ezrknn-llm/rkllm-runtime/example/src/main.cpp file from that repo.

Next, install Boost and the C++ REST SDK (cpprest):
sudo apt install libboost-all-dev libcpprest-dev

Test Boost. Put this in a file named test.cpp:

#include <iostream>
#include <boost/version.hpp>

int main() {
    std::cout << "Using Boost " << BOOST_VERSION / 100000 << "."  // major version
              << BOOST_VERSION / 100 % 1000 << "."  // minor version
              << BOOST_VERSION % 100  // patch level
              << std::endl;
    return 0;
}

Compile: g++ test.cpp
Run test: ./a.out
The output should look something like "Using Boost 1.74.0".
If not, RTFM: https://www.boost.org/doc/libs/1_74_0/
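The arithmetic in test.cpp follows Boost's encoding of BOOST_VERSION as major * 100000 + minor * 100 + patch. As a quick cross-check, here is a small Python sketch of the same decoding. The function name decode_boost_version is made up for illustration and is not part of this project.

```python
def decode_boost_version(boost_version: int) -> str:
    """Decode a BOOST_VERSION integer (major*100000 + minor*100 + patch)."""
    major = boost_version // 100000      # same as BOOST_VERSION / 100000 in C++
    minor = boost_version // 100 % 1000  # same as BOOST_VERSION / 100 % 1000
    patch = boost_version % 100          # same as BOOST_VERSION % 100
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    # 107400 is the value that Boost 1.74.0 defines for BOOST_VERSION.
    print("Using Boost", decode_boost_version(107400))
```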

Compile server.cpp. Change the library search path (-L) if rkllmrt is installed somewhere else.
If you have locate installed, try locate rkllmrt to find it.
The path in the following command is probably correct, though:
g++ server.cpp -o server -std=c++11 -lcpprest -lcrypto -L/usr/lib -lrkllmrt

You now have an executable named server in the current working directory.
Its arguments are, in order: IP address, port, and path to the model. Start it:
./server 192.168.0.196 31337 ../qwen-1_8B-rk3588/qwen-chat-1_8B.rkllm
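If you start the server from a script, it can help to sanity-check the three arguments first. The following is only a sketch: build_server_command is a hypothetical helper, and the assumption that the first argument must be a literal IP address (rather than a hostname) comes from the example above, not from server.cpp.

```python
import ipaddress


def build_server_command(ip: str, port: int, model_path: str) -> list:
    """Return the argv list for ./server after basic sanity checks."""
    # Assumption: the server expects a literal IP address, as in the example above.
    ipaddress.ip_address(ip)  # raises ValueError if ip is not a valid IP literal
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Argument order matches the README: IP, port, path to model.
    return ["./server", ip, str(port), model_path]


if __name__ == "__main__":
    cmd = build_server_command(
        "192.168.0.196", 31337, "../qwen-1_8B-rk3588/qwen-chat-1_8B.rkllm"
    )
    print(" ".join(cmd))
    # To actually launch it, pass cmd to subprocess.run(cmd, check=True).
```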

Test it. Change the value of "input_str" ("hello, how are you ") if you like, then send the request:
curl -H "Content-Type: application/json" -d '{"PROMPT_TEXT_PREFIX":"<|im_start|>system You are a helpful assistant. <|im_end|> <|im_start|>user ","input_str":"hello, how are you ","PROMPT_TEXT_POSTFIX":"<|im_end|><|im_start|>assistant "}' http://192.168.0.196:31337/
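The same request can be sent from Python using only the standard library. This is a minimal sketch, not part of the project: build_payload and ask are names chosen here for illustration, the 120-second timeout is an arbitrary guess, and the response is returned as raw text because the JSON schema of the reply is not described in this README.

```python
import json
import urllib.request

# Prompt wrappers copied from the cURL example above (Qwen chat format).
PROMPT_TEXT_PREFIX = (
    "<|im_start|>system You are a helpful assistant. <|im_end|> <|im_start|>user "
)
PROMPT_TEXT_POSTFIX = "<|im_end|><|im_start|>assistant "


def build_payload(user_text: str) -> bytes:
    """Encode the three fields the server expects as a UTF-8 JSON body."""
    body = {
        "PROMPT_TEXT_PREFIX": PROMPT_TEXT_PREFIX,
        "input_str": user_text,
        "PROMPT_TEXT_POSTFIX": PROMPT_TEXT_POSTFIX,
    }
    return json.dumps(body).encode("utf-8")


def ask(url: str, user_text: str, timeout: float = 120.0) -> str:
    """POST the prompt to the server and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=build_payload(user_text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (needs a running server):
# print(ask("http://192.168.0.196:31337/", "hello, how are you "))
```

Using json.dumps instead of a hand-written JSON string means quotes or newlines in the user's text are escaped correctly.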

The same request in PHP:

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://192.168.0.196:31337/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
]);
curl_setopt($ch, CURLOPT_POSTFIELDS, '{"PROMPT_TEXT_PREFIX":"<|im_start|>system You are a helpful assistant. <|im_end|> <|im_start|>user ","input_str":"hello, how are you ","PROMPT_TEXT_POSTFIX":"<|im_end|><|im_start|>assistant "}');

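// With CURLOPT_RETURNTRANSFER set, curl_exec() returns the body, or false on failure.
$response = curl_exec($ch);
if ($response === false) {
    echo 'cURL error: ' . curl_error($ch) . PHP_EOL;
}

curl_close($ch);

var_dump($response);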

jQuery:

$.ajax({
  url: 'http://192.168.0.196:31337/',
  crossDomain: true,
  method: 'post',
  contentType: 'application/json',
  data: JSON.stringify({
    'PROMPT_TEXT_PREFIX': '<|im_start|>system You are a helpful assistant. <|im_end|> <|im_start|>user ',
    'input_str': 'hello, how are you ',
    'PROMPT_TEXT_POSTFIX': '<|im_end|><|im_start|>assistant '
  })
}).done(function(response) {
  console.log(response);
});

Parts of this software taken from ezrknn-llm/rkllm-runtime/example/src/main.cpp are covered by the Apache License:

Copyright (c) 2024 by Rockchip Electronics Co., Ltd. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Everything else falls under the MIT license.

Do not run this server in a production environment. It is lacking sanitization and security features.
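To give a rough idea of the kind of checking that is missing, here is a minimal Python sketch that validates a request body before it is forwarded to the server, for example from a small proxy placed in front of it. The function name, the length limit, and the proxy itself are all assumptions for illustration. None of this exists in server.cpp, and it is not a complete security layer.

```python
REQUIRED_FIELDS = ("PROMPT_TEXT_PREFIX", "input_str", "PROMPT_TEXT_POSTFIX")
MAX_FIELD_CHARS = 4096  # arbitrary illustrative limit, not taken from the server


def validate_request(body: object) -> dict:
    """Return a cleaned copy of the request body or raise ValueError."""
    if not isinstance(body, dict):
        raise ValueError("request body must be a JSON object")
    cleaned = {}
    for field in REQUIRED_FIELDS:
        value = body.get(field)
        if not isinstance(value, str):
            raise ValueError(f"{field} must be a string")
        if len(value) > MAX_FIELD_CHARS:
            raise ValueError(f"{field} is longer than {MAX_FIELD_CHARS} characters")
        cleaned[field] = value
    # Only the three known fields are copied, so unknown extra keys are dropped.
    return cleaned


if __name__ == "__main__":
    ok = validate_request({
        "PROMPT_TEXT_PREFIX": "<|im_start|>user ",
        "input_str": "hello, how are you ",
        "PROMPT_TEXT_POSTFIX": "<|im_end|><|im_start|>assistant ",
        "unexpected": 123,
    })
    print(sorted(ok))
```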
Shouts to r/RockchipNPU, check it out!
