Coder Social home page Coder Social logo

renjin-client-data-utils's Introduction

renjin-client-data-utils

Maven Central javadoc

Utilities to facilitate working with R data from Renjin in Java. In R, there are various basic data structures such as matrix and data.frame and are faithfully backed by Java classes such as ListVector for data.frame that while very easy to work with in R, can be a bit cumbersome to work with directly in Java. This library aims to simplify working with data derived from Renjin R from Java code.

To use it add the following dependency to your pom.xml:

<dependency>
    <groupId>se.alipsa</groupId>
    <artifactId>renjin-client-data-utils</artifactId>
    <version>1.5.0</version>
</dependency>

Note: 1.5.0 and later requires java 11, previous version required java 8.

se.alipsa.renjin.client.datautils.Table

The Table class makes it easy to interact with R data.frame and matrix data. Data frames in R are "column based" (variable based) which is very convenient for analysis but Java is Object / Observation based, so a Table which essentially is just a List of rows (observations), makes is much easier to work with the data in Java. The data in a Table is immutable once created. If you need mutable data, consider using RDataTRansformer instead (see below).

Example:

Given the following R code:

employee <- c('John Doe','Peter Smith','Jane Doe')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2013-11-1','2018-3-25','2017-3-14')) 
endDate <- as.POSIXct(c('2020-01-10 00:00:00', '2020-04-12 12:10:13', '2020-10-06 10:00:05'), tz='UTC' ) 
data.frame(employee, salary, startdate, endDate)

I.e a two dimensional table looking like this:

employee salary startdate enddate
John Doe 21000 2013-11-1 2020-01-10 00:00:00
Peter Smith 23400 2018-3-25 2020-04-12 12:10:13
Jane Doe 26800 2017-3-14 2020-10-06 10:00:05

... you create it in the RenjinScriptEngine like this:

import org.renjin.script.RenjinScriptEngine;
import org.renjin.sexp.SEXP;
import se.alipsa.renjin.client.datautils.Table;

public class Example {

  RenjinScriptEngine engine;

  public Example() {
    RenjinScriptEngineFactory factory = new RenjinScriptEngineFactory();
    Session session = new SessionBuilder().withDefaultPackages().build();
    engine = factory.getScriptEngine(session);
  }

  public Table getData(String code) {
    SEXP sexp = (SEXP) engine.eval(code);
    return Table.createTable(sexp);
  }
}

... the following assertions would be valid for the Table returned by the getData() method:

assertThat(table.getValue(0, 0), equalTo("John Doe"));
assertThat(table.getValue(1, 1), equalTo(23400.0));
assertThat(table.getValueAsLocalDate(2, 2), equalTo(LocalDate.of(2017, 3, 14)));
assertThat(table.getValueAsLocalDateTime(2, 3), equalTo(LocalDateTime.of(2020, 10, 6, 10, 0, 5)));

As you see from the above we access the Table data in a row, column way which makes it easy to display the data in Swing or JavaFx.

See TableTest for more examples.

se.alipsa.renjin.client.datautils.RDataTransformer

This is a utility class that transforms various R data types into Java. In many ways it is similar to the Table class but allows for more atomic operations useful when you only want the result in a particular way once, or you need mutable data.

See DataTransformationTest for examples.

Version history

1.5.1

Add automodule info

1.5.0

Upgrade to java 11, no module info though (not possible due to BLAS dependencies)

1.4.4

  • Add constructor for column wise constructing of a table
  • Add a DateArrayVector to handle column wise constructing of data types

1.4.3

  • make the convenience methods more robust and performing a conversion (if possible) into the type requested
  • Add more Javadocs
  • Add conversion from Vectors (c() and list() constructs in R) to java.util.List
  • Bump dependency versions for spotbugs and junit

1.4.2

  • If no info on column types are given when creating a table then treat them as Strings
  • Add customizable decimal format awareness for getAsFloat and getAsDouble convenience methods.
  • Bump dependency versions for spotbugs and require maven 3.6.3 or higher for cleaner output

1.4.1

Handle transformations of empty tables (used to get an ArrayIndexOutOfBoundsException) - now returning empty List instead

1.4

Fix handling of factorized columns with NA values Change Row naming to use an IntSequence instead of the no longer existing RowNamesVector(numberOfRows) Bump dependency versions

1.3

add Table column convenience methods (getColumnForName, getColumnIndex, getColumn) Make Table data immutable, this allows for some performance optimizations and makes usage much clearer (Table to process data from Renjin, RDataTransformer to handle mutable data to and from Renjin)

1.2

  • Add docs
  • cleanup and expand tests
  • add @SafeVarargs to Table constructor to silence compiler warning which does not apply
  • add getRowForName providing the ability to find a row based on a cell value

1.1

  • Add some utility methods, so we can remove the Ride version of the same class.
  • Add docs

1.0

  • Initial version

renjin-client-data-utils's People

Contributors

pernyfelt avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.