pemja's Introduction

PemJa

What is it?

PemJa is an open-source cross-language call framework based on FFI. It aims to provide a high-performance framework for calls between different languages.

Where to get it

Python binary installers for the latest released version are available from the Python Package Index (PyPI):

pip install pemja

Java Maven Dependency

<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>pemja</artifactId>
    <version>{version}</version>
</dependency>

Dependencies

Installation from sources

Prerequisites for building PemJa:

  • Unix-like environment (we use Linux, Mac OS X)
  • Git
  • Maven (we recommend version 3.2.5 and require at least 3.1.1)
  • Java 8 or 11 (Java 9 or 10 may work)
  • Python >= 3.8 (we recommend version 3.8, 3.9, 3.10, 3.11)

git clone https://github.com/alibaba/pemja.git
cd pemja
mvn clean install -DskipTests
pip install -r dev/dev-requirements.txt
python setup.py sdist
pip install dist/*.tar.gz

Usage

String path = ...;
PythonInterpreterConfig config = PythonInterpreterConfig
    .newBuilder()
    .setPythonExec("python3") // specify python exec
    .addPythonPaths(path) // add path to search path
    .build();

PythonInterpreter interpreter = new PythonInterpreter(config);

// set & get
interpreter.set("a", 12345);
interpreter.get("a"); // Object
interpreter.get("a", Integer.class); // Integer

// exec & eval
interpreter.exec("print(a)");

// invoke functions
interpreter.exec("import str_upper");
String result = (String) interpreter.invoke("str_upper.upper", "abcd"); // invoke returns Object, so cast
// Object invoke(String name, Object... args);
// Object invoke(String name, Object[] args, Map<String, Object> kwargs);

// invoke object methods
/*
// invoke.py
class A:
    def __init__(self):
        self._a = 0

    def get_value(self):
        return self._a

    def add(self, n):
        self._a += n

    def add_all(self, *args):
        for item in args:
            self._a += item
        # return the total once every item has been added
        return self._a

    def minus(self, n):
        self._a -= n
        return self._a
*/

interpreter.exec("import invoke");
interpreter.exec("a = invoke.A()");
interpreter.invokeMethod("a", "add", 3);
interpreter.invokeMethod("a", "minus", 2);
interpreter.invokeMethod("a", "add_all", 1, 2, 3);


// Python calling back into Java methods
/*
// invoke_callback.py
from pemja import findClass

StringBuilder = findClass('java.lang.StringBuilder')
Integer = findClass('java.lang.Integer')

def callback_java():
    sb = StringBuilder()
    sb.append('pemja')
    sb.append('java')
    sb.append('python')
    sb.append(Integer.toHexString(Integer.MAX_VALUE))
    return sb.toString()
*/
interpreter.exec("import invoke_callback");
System.out.println(interpreter.invoke("invoke_callback.callback_java"));
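
To make the expected results of the three invokeMethod calls on invoke.py concrete, the same class can be exercised in plain Python without PemJa. This is only a sketch: it assumes add_all is meant to return the total after its loop finishes, and the variable names after_minus and after_add_all are introduced here for illustration.

```python
class A:
    """Plain-Python copy of the invoke.py example class."""

    def __init__(self):
        self._a = 0

    def get_value(self):
        return self._a

    def add(self, n):
        self._a += n

    def add_all(self, *args):
        for item in args:
            self._a += item
        # Assumption: the total is returned after all items are added.
        return self._a

    def minus(self, n):
        self._a -= n
        return self._a


# Same call sequence as the Java invokeMethod calls: add 3, minus 2, add_all 1, 2, 3.
a = A()
a.add(3)
after_minus = a.minus(2)
after_add_all = a.add_all(1, 2, 3)
print(after_minus, after_add_all)  # prints: 1 7
```

On the Java side the same values would come back as Object from invokeMethod, so the caller still has to cast or convert them.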

Documentation

pemja's People

Contributors

a49a, alibaba-oss, huangxingbo, robbie-palmer, syntomic

pemja's Issues

Generic invokeMethod

It would be great if invokeMethod could take the return type, similar to public <T> T get(String name, Class<T> clazz).

The code below, which passes an integer to Python and expects an integer back, fails with the error "class java.lang.Long cannot be cast to class java.lang.Integer":

var pyFnName = "add_one";
var pythonCode = pyFnName + " = lambda x: x + 1";
int input = 5;
try (var interpreter = new PythonInterpreter(config);) {
    interpreter.exec(pythonCode);
    var pyFn = (PyObject) interpreter.get(pyFnName);
    var out = (int) pyFn.invokeMethod("__call__", input);
    System.out.println(out);
}

An integer can be retrieved by using the generic get alongside exec, but this is fiddlier and less pretty:

var pyFnName = "add_one";
var pythonCode = pyFnName + " = lambda x: x + 1";
int input = 5;
try (var interpreter = new PythonInterpreter(config);) {
    interpreter.exec(pythonCode);
    var resultVarName = "result";
    interpreter.exec(resultVarName + "=" + pyFnName + "(" + input + ")");
    var out = (int) interpreter.get(resultVarName, Integer.class);
    System.out.println(out);
}

Alternatively, the value can be accepted back as a Long and converted to an int:

var out = (Long) pyFn.invokeMethod("__call__", input);
var intOut = out.intValue();

So functionally everything is available; a public <T> T invokeMethod(String name, Class<T> clazz, Object... args) method would just provide syntactic sugar.

mac m1 throw UnsatisfiedLinkError

Running the demo on a Mac M1 throws
java.lang.UnsatisfiedLinkError: pemja.core.PythonInterpreter$MainInterpreter.initialize()
Debugging shows that the System.load(pemjaLibPath) call in PythonInterpreter.initialize() throws the error, even though the file at pemjaLibPath, pemja_core.cpython-39-darwin.so, exists. Any help would be appreciated.

Can pemja support JVM scheduling through PVM?

In many scenarios, Python is used as the entry point to call Java code, but PemJa does not currently support scheduling the JVM from the PVM. We hope that the PemJa team can support this feature. Thank you!

Comparison vs DL4J / java-cpp

FLIP-206 compares PemJa to Jython, GraalVM, JPype and Jep.

deeplearning4j's Python4J framework seems more comparable to PemJa than any of these.
It depends on javacpp-presets, which bundles CPython into a Jar, and on java-cpp to abstract away the JNI.

API example:

try(PythonGIL gil = PythonGIL.lock()){
        try(PythonGC gc = PythonGC.watch()){
            List<PythonVariable> inputs = new ArrayList<>();
            inputs.add(new PythonVariable<>("x", PythonTypes.STR, "Hello "));
            inputs.add(new PythonVariable<>("y", PythonTypes.STR, "World"));
            PythonVariable out = new PythonVariable<>("z", PythonTypes.STR);
            String code = "z = x + y";
            PythonExecutioner.exec(code, inputs, Collections.singletonList(out));
            System.out.println(out.getValue());
        }
    }catch (Throwable e){
        e.printStackTrace();
    }

A comparison between this approach and that of PemJa would be very useful

TypeError: expected str, bytes or os.PathLike object, not NoneType

Unable to install pemja on a Raspberry Pi; tried with Python 3.7 and 3.9.

Steps to reproduce:

$ virtualenv -p /usr/bin/python3.9 venv3.9
$ source venv3.9/bin/activate      
$ python --version
Python 3.9.0
$ pip --version
pip 22.1.1 from /home/pi/venv3.9/lib/python3.9/site-packages/pip (python 3.9)

$ pip install pemja

Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting pemja
  Downloading pemja-0.1.5.tar.gz (32 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [10 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-iqprsatx/pemja_afa58ffaf082492bb729b5377e1491da/setup.py", line 185, in <module>
          include_dirs=get_java_include() + ['src/main/c/pemja/core/include'] + get_numpy_include(),
        File "/tmp/pip-install-iqprsatx/pemja_afa58ffaf082492bb729b5377e1491da/setup.py", line 112, in get_java_include
          inc = os.path.join(get_java_home(), inc_name)
        File "/home/pi/venv3.9/lib/python3.9/posixpath.py", line 76, in join
          a = os.fspath(a)
      TypeError: expected str, bytes or os.PathLike object, not NoneType
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
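
The traceback shows that get_java_home() returned None, which os.path.join cannot accept. A likely cause, though this is an assumption and not confirmed by the log, is that no JDK could be located (for example because JAVA_HOME is unset). The sketch below reproduces the TypeError and shows a guard that fails with a clearer message; java_include_dir is a hypothetical helper, not part of pemja's setup.py.

```python
import os


def java_include_dir(java_home, inc_name="include"):
    """Hypothetical stand-in for the path join done in setup.py's get_java_include."""
    if java_home is None:
        # Fail early with an actionable message instead of a TypeError deep in posixpath.
        raise RuntimeError(
            "No Java home found; install a JDK and set JAVA_HOME before installing pemja."
        )
    return os.path.join(java_home, inc_name)


# Reproduce the failure from the traceback: joining None with a path component.
try:
    os.path.join(None, "include")
except TypeError as exc:
    reproduced = str(exc)

print(reproduced)
```

If the assumption holds, running echo $JAVA_HOME in the same shell before pip install pemja should show whether a JDK location is visible to the build.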

Issue with Thread mode in Flink 1.16 and M1

I am running Flink 1.16 on a Mac M1. Everything works as expected, apart from a few tweaks I had to make to get PyFlink 1.16 to work on my M1. However, when I decided to test the job in thread mode, I got the following error:

2022-11-14 17:01:51
pemja.core.PythonException: <class 'TypeError'>: 'NoneType' object is not iterable
	at /usr/local/lib/python3.8/site-packages/pyflink/fn_execution/embedded/operations.process_element2(operations.py:140)
	at /usr/local/lib/python3.8/site-packages/pyflink/fn_execution/embedded/operations._output_elements(operations.py:57)
	at /usr/local/lib/python3.8/site-packages/pyflink/fn_execution/embedded/operations._process_elements_on_operation(operations.py:48)
	at /usr/local/lib/python3.8/site-packages/pyflink/fn_execution/datastream/embedded/operations.process_element_func2(operations.py:208)
	at /usr/local/lib/python3.8/site-packages/pyflink/fn_execution/datastream/embedded/operations.process_func(operations.py:111)
	at pemja.core.object.PyIterator.next(Native Method)
	at pemja.core.object.PyIterator.hasNext(PyIterator.java:40)
	at org.apache.flink.streaming.api.operators.python.embedded.AbstractTwoInputEmbeddedPythonFunctionOperator.processElement2(AbstractTwoInputEmbeddedPythonFunctionOperator.java:208)
	at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.processRecord2(StreamTwoInputProcessorFactory.java:225)
	at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.lambda$create$1(StreamTwoInputProcessorFactory.java:194)
	at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$StreamTaskNetworkOutput.emitRecord(StreamTwoInputProcessorFactory.java:266)
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
	at org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor.processInput(StreamMultipleInputProcessor.java:85)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
	at java.base/java.lang.Thread.run(Thread.java:829)

The following are the relevant settings in my job:

    env = StreamExecutionEnvironment.get_execution_environment()

    env.set_stream_time_characteristic(TimeCharacteristic.EventTime)

    # Additional python settings
    env_config = Configuration(
        j_configuration=get_j_env_configuration(env._j_stream_execution_environment)
    )
    env_config.set_string("python.execution-mode", "thread")

I am running the job with a parallelism of 2.

Integrate pemja in java web to call python deep learning model

Hi, is it possible to use PemJa in a Java web application to call a deep learning model written in Python and so implement an AI service? Our scenario involves two teams: one is good at writing Java web applications, and the other is good at using Python to research machine translation models. We hope to use PemJa to connect the two teams and deliver a robust and efficient model service, so that no C++ or Python development team is needed to build the web service. Does PemJa's design support Java calls to persistent Python instances held in RAM or GPU memory?

Can't install with python 3.10

I can't seem to get the pip install to work with python 3.10.6 and pip 22.3.1

pip install requests pemja

Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (2.25.1)
Collecting pemja
  Using cached pemja-0.2.6.tar.gz (48 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [24 lines of output]
      setup.py:34: RuntimeWarning: Pemja may not yet support Python 3.10.
        warnings.warn(
      Traceback (most recent call last):
        File "/home/richard/.local/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 351, in <module>
          main()
        File "/home/richard/.local/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/richard/.local/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-gt1_4mbr/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 162, in get_requires_for_build_wheel
          return self._get_build_requires(
        File "/tmp/pip-build-env-gt1_4mbr/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 143, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-gt1_4mbr/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 267, in run_setup
          super(_BuildMetaLegacyBackend,
        File "/tmp/pip-build-env-gt1_4mbr/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 158, in run_setup
          exec(compile(code, file, 'exec'), locals())
        File "setup.py", line 184, in <module>
          include_dirs=get_java_include() + ['src/main/c/pemja/core/include'] + get_numpy_include(),
        File "setup.py", line 111, in get_java_include
          inc = os.path.join(get_java_home(), inc_name)
        File "/usr/lib/python3.10/posixpath.py", line 76, in join
          a = os.fspath(a)
      TypeError: expected str, bytes or os.PathLike object, not NoneType

When building from scratch I get:

python setup.py sdist
/home/richard/src/pemja/setup.py:23: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.command.build_ext import build_ext as old_build_ext
/home/richard/src/pemja/setup.py:34: RuntimeWarning: Pemja may not yet support Python 3.10.
  warnings.warn(
Traceback (most recent call last):
  File "/home/richard/src/pemja/setup.py", line 184, in <module>
    include_dirs=get_java_include() + ['src/main/c/pemja/core/include'] + get_numpy_include(),
  File "/home/richard/src/pemja/setup.py", line 111, in get_java_include
    inc = os.path.join(get_java_home(), inc_name)
  File "/usr/lib/python3.10/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Unknown Number class int.

Executing this example, adapted from the README, fails:

interpreter.set("a", 12345);
interpreter.get("a"); // Object
interpreter.get("a", int.class);

With error: Exception in thread "main" pemja.core.PythonException: Unknown Number class int.

interpreter.get("a", Integer.class); runs but returns a boxed integer instead of the primitive int

Can not import 'findClass'

When I follow the documentation to call back into Java from Python,

from pemja import findClass

StringBuilder = findClass('java.lang.StringBuilder')

I get this error:

ImportError: cannot import name 'findClass' from 'pemja' (/root/miniconda3/envs/pemja/lib/python3.8/site-packages/pemja/__init__.py)

I created the pemja environment with conda; here are all my packages as listed by 'pip list':

Package        Version
-------------- ---------
certifi        2022.9.24
find-libpython 0.3.0
numpy          1.21.4
pemja          0.2.6
pip            22.2.2
setuptools     65.5.0
wheel          0.37.1

python = 3.8
jdk = openjdk version "1.8.0_292"
pemja = 0.2.6

Failed to use pemja in Flink DataStream API

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;
import pemja.core.PythonInterpreter;
import pemja.core.PythonInterpreterConfig;

public class PythonExecTest {

public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

    tableEnv.executeSql("create table datagen (f1 string, f2 int) with ('connector' = 'datagen')");

    Table table = tableEnv.from("datagen");

    DataStream<Row> dataStream = tableEnv.toDataStream(table);

    System.out.println("======================================");

    dataStream.flatMap(new MyRichFlatMapFunction());

    System.out.println("================end======================");

}

static class MyRichFlatMapFunction extends RichFlatMapFunction<Row, Row> {

    private static PythonInterpreterConfig pythonInterpreterConfig;

    static {
        System.out.println("=================open==================");
        pythonInterpreterConfig = PythonInterpreterConfig
                .newBuilder()
                .setExcType(PythonInterpreterConfig.ExecType.SUB_INTERPRETER)
                // .setExcType(PythonInterpreterConfig.ExecType.MULTI_THREAD)
                // Set the path of the Python executable
                .setPythonExec("/root/miniconda3/bin/python3")
                // Set the path of the Python code to execute
                .addPythonPaths("/root/pyAutoflow/python")
                // Set the path of the dependencies
                .addPythonPaths("/root/miniconda3/lib/python3.8/site-packages")
                .build();
    }

    @Override
    public void flatMap(Row row, Collector<Row> collector) throws Exception {
        System.out.println("=================flatMap==================");
        // Build the interpreter
        PythonInterpreter interpreter = new PythonInterpreter(pythonInterpreterConfig);
        interpreter.set("a", 12345);
        Integer a = interpreter.get("a", Integer.class);
        // Execute script content
        interpreter.exec("print(a)");

        // Module to execute
        interpreter.exec("import funcs");
        for (int i = 0; i < 10; i++) {
            // Invoke the function
            Object result = interpreter.invoke("funcs.add", i, 2);
            System.out.println("result-------------->" + result);
        }
    }
}

}

======================================
=================open==================
================end======================
[failed]

Incompatible with Official Python Docker Images: Failed to find libpython

I'm trying to build a docker image to host my app with both Java and Python components

FROM python:3.9
RUN apt-get update
RUN apt install default-jre -y
COPY myapp.jar ./
CMD java -classpath myapp.jar foo.Main

But I get the error

java.lang.RuntimeException: Failed to find libpython
        at pemja.utils.CommonUtils.getPythonLibrary(CommonUtils.java:175)
        at pemja.core.PythonInterpreter$MainInterpreter.initialize(PythonInterpreter.java:358)
        at pemja.core.PythonInterpreter.initialize(PythonInterpreter.java:145)
        at pemja.core.PythonInterpreter.<init>(PythonInterpreter.java:46)

This appears to be because the path pattern doesn't match what is in this Python image
PemJa looks for ^libpython.*so$

String libPythonPathPattern;
if (isLinuxOs()) {
    libPythonPathPattern = "^libpython.*so$";
} else if (isMacOs()) {
    libPythonPathPattern = "^libpython.*dylib$";
} else {
    throw new RuntimeException("Unsupported os ");
}
if (libFile.isDirectory()) {
    for (File f : Objects.requireNonNull(libFile.listFiles())) {
        if (f.isFile() && Pattern.matches(libPythonPathPattern, f.getName())) {
            return f.getAbsolutePath();
        }
    }
}
throw new RuntimeException("Failed to find libpython");

When the actual contents of /usr/local/lib is:
libpython3.9.so libpython3.9.so.1.0 libpython3.so pkgconfig python3.9
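
As a quick check of the Linux file-name pattern against the listing above, the sketch below uses Python's re.fullmatch as a stand-in for Java's Pattern.matches (for this simple pattern both require the whole name to match). The names LINUX_PATTERN, listed and matches are introduced here for illustration.

```python
import re

# Pattern copied from the Java snippet above (Linux branch).
LINUX_PATTERN = r"^libpython.*so$"

# Contents of /usr/local/lib as reported in the issue.
listed = [
    "libpython3.9.so",
    "libpython3.9.so.1.0",
    "libpython3.so",
    "pkgconfig",
    "python3.9",
]

# Keep only the names the pattern accepts in full.
matches = [name for name in listed if re.fullmatch(LINUX_PATTERN, name)]
print(matches)  # prints: ['libpython3.9.so', 'libpython3.so']
```

Since two of the listed names are accepted, the pattern itself may not be the problem; it may be worth checking which directory libFile points to in this image, though that is only a guess from the listing.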

Recursive callbacks cause a deadlock

Hi,

I am trying to use PEMJA to build an interface between python and scala code. This is working fine for simple cases, however I run into a deadlock in the following situation:

  1. Call python code from jvm
  2. Python code calls a jvm method
  3. The jvm method tries to call python again (deadlock)

Things I have tried to work around this issue:

  • Try to create a new python interpreter in the jvm for step 3 above (with MULTI_THREAD this still deadlocks; with SUB_INTERPRETER this fails with a numpy import error).
  • Try to create a new Thread in python (this causes a segfault in C [pemja_core.cpython-310-darwin.so+0x7fc8] JcpPyJObject_New+0xf8 when trying to call non-trivial jvm methods from the new python thread).

Do you have any guidance on how to work around this issue?

findClass module was missing in pemja

python -c "from pemja import findClass; Integer = findClass('java.lang.Integer'); print(Integer.toHexString(Integer.MAX_VALUE))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'findClass' from 'pemja' (/usr/local/lib/python3.8/site-packages/pemja/__init__.py)

Failed to find the function after exec

I define a function with exec, but invoking it fails, e.g.:

interpreter.exec("def f(a):return a");
interpreter.invoke("f", 1);

This throws:

Exception in thread "main" pemja.core.PythonException: <class 'RuntimeError'>: Failed to find the function `f` 
        at pemja.core.PythonInterpreter.invokeOneArgInt(Native Method)
        at pemja.core.PythonInterpreter.invokeOneArg(PythonInterpreter.java:184)
        at pemja.core.PythonInterpreter.invoke(PythonInterpreter.java:93)

python setup.py egg_info failed

python setup.py egg_info

The following is the error message:

numpy not found

python -m pip install -r flink-python/dev/dev-requirements.txt

The following is the error message:

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pemja==0.1.5
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/36/32/18615e64be80b70c4c95dc2d7d3b20a9d706dc5fcefc92ebafe0348ca3dc/pemja-0.1.5.tar.gz (32 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 255
  ╰─> [1 lines of output]
      numpy not found
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
