activeviam / autopivot

Atoti AutoPivot automatically creates in-memory OLAP cubes from CSV files, which you can explore from Excel, Tableau, or the embedded Atoti UI web frontend.

Home Page: https://www.activeviam.com

License: Apache License 2.0

Languages: Java 99.36%, JavaScript 0.64%

Topics: cube, activepivot, olap, atoti, csv, multidimensional

autopivot's Introduction

Atoti AutoPivot

Atoti AutoPivot is a standalone application for online analysis (OLAP) of CSV files.

Atoti AutoPivot discovers the structure of CSV files (field separator, column names, column types) and loads the data in memory with a high-throughput parallel CSV source. It exposes the data as a cube with hierarchies and metrics that can be manipulated in the Atoti UI frontend or directly from the Microsoft Excel Pivot Table, using the XMLA protocol and the MDX query language.
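
To give a feel for what separator discovery can involve, here is a minimal sketch. It is not the actual AutoPivot implementation: the class name SeparatorGuess, the candidate list, and the "same count on every sample line" heuristic are all assumptions made for illustration.

```java
import java.util.List;

/** Hypothetical helper, not part of AutoPivot: guesses a CSV field separator. */
final class SeparatorGuess {

    /** Candidate separators (an assumption made for this sketch). */
    private static final char[] CANDIDATES = {',', ';', '\t', '|'};

    /** Counts occurrences of sep in line, ignoring characters inside double quotes. */
    static int count(String line, char sep) {
        int n = 0;
        boolean withinQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == '"') {
                withinQuotes = !withinQuotes;
            } else if (ch == sep && !withinQuotes) {
                n++;
            }
        }
        return n;
    }

    /**
     * Returns the candidate that appears the same non-zero number of times on
     * every sample line, preferring the highest count. Falls back to ','.
     */
    static char guess(List<String> sampleLines) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int expected = -1;
            boolean consistent = true;
            for (String line : sampleLines) {
                int c = count(line, candidate);
                if (expected < 0) {
                    expected = c;
                } else if (c != expected) {
                    consistent = false;
                    break;
                }
            }
            if (consistent && expected > bestCount) {
                best = candidate;
                bestCount = expected;
            }
        }
        return best;
    }
}
```

A real implementation would likely sample many lines and break ties more carefully; this sketch only shows the general idea.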

This project is packaged using Spring Boot.

Launching Atoti AutoPivot

Build the project with Maven using the standard mvn clean install command. This generates a jar file, which can be run with the standard java -jar command. Once the application has started, Atoti UI, ActiveViam's user interface for exploring the cube, is available at http://localhost:9090/ui.

Performance

The multithreaded CSV source usually parses CSV data at several hundred MB/s. This kind of throughput can only be reached with fast storage, for instance a local SSD or network storage accessed over a network of at least 10 Gbps.

Atoti AutoPivot is powered by Atoti Server, the in-memory analytical platform developed by ActiveViam. Atoti Server runs on all sizes of hardware, from laptops to large servers with hundreds of cores and tens of terabytes of memory. When used in fire-and-forget mode, AutoPivot targets files of up to a few hundred gigabytes.

CSV Format

Atoti AutoPivot expects a standard CSV file, with headers (column names) on the first row.

Options

The most common options can be set in the /src/main/resources/application.properties file. All the supported options can also be passed as JVM parameters such as -DfileName=data.csv.
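
The idea of a JVM parameter overriding a file default can be sketched in plain Java. This is only an illustration: Spring Boot resolves options through its own configuration mechanism, and the class OptionLookup and the value default.csv are made up for this example. Only the fileName option comes from the text above.

```java
import java.util.Properties;

/** Hypothetical sketch, not AutoPivot code: a -D option overrides a file default. */
final class OptionLookup {

    /** Returns the JVM system property if set, else the file default (may be null). */
    static String resolve(String name, Properties fileDefaults) {
        String fromJvm = System.getProperty(name);
        return fromJvm != null ? fromJvm : fileDefaults.getProperty(name);
    }

    public static void main(String[] args) {
        // Stand-in for application.properties; "default.csv" is a made-up value.
        Properties fileDefaults = new Properties();
        fileDefaults.setProperty("fileName", "default.csv");
        // Prints "data.csv" when launched with -DfileName=data.csv,
        // and "default.csv" when no such JVM parameter is given.
        System.out.println(resolve("fileName", fileDefaults));
    }
}
```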

Tweaking The Project

Atoti AutoPivot tries to guess what's in the data and do everything automatically. More generally, it illustrates that Atoti cubes can be configured programmatically and started on the fly, a powerful concept for Atoti developers that can be reused well beyond AutoPivot itself.

Here are some entry points to jump into the code, starting from src/main/java:

  • com.av.csv.discover.CSVDiscovery: logic to discover the CSV separator character and the data types of the columns
  • com.av.autopivot.AutoPivotGenerator: logic to create an Atoti cube (hierarchies, aggregates...) based on the file format
  • com.av.autopivot.spring: this package contains the Spring configuration of the AutoPivot application
  • src/main/resources/application.properties: options of the AutoPivot application
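
As a rough idea of what discovering column data types can look like, the sketch below guesses a type from a few sample values. It does not reproduce CSVDiscovery: the class ColumnTypeGuess, its three-type enum, and the rule "narrowest type that accepts every non-empty sample" are assumptions chosen to keep the example short.

```java
import java.util.List;

/** Hypothetical helper, not AutoPivot's CSVDiscovery: guesses a column type from samples. */
final class ColumnTypeGuess {

    enum ColumnType { LONG, DOUBLE, STRING }

    static boolean isLong(String s) {
        try {
            Long.parseLong(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isDouble(String s) {
        try {
            Double.parseDouble(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Returns the narrowest type that accepts every non-empty sample value. */
    static ColumnType guess(List<String> samples) {
        boolean allLong = true;
        boolean allDouble = true;
        boolean sawValue = false;
        for (String s : samples) {
            if (s == null || s.trim().isEmpty()) {
                continue; // empty cells carry no type information
            }
            sawValue = true;
            allLong &= isLong(s);
            allDouble &= isDouble(s);
        }
        if (!sawValue) {
            return ColumnType.STRING;
        }
        if (allLong) {
            return ColumnType.LONG;
        }
        return allDouble ? ColumnType.DOUBLE : ColumnType.STRING;
    }
}
```

Note that Double.parseDouble also accepts inputs such as "NaN", so a production-grade discovery step would probably apply stricter rules and handle dates as well.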

Licensing

The code of the Atoti AutoPivot application is open source, licensed under the Apache License 2.0. However, the application depends on the commercial Atoti Server software: the Atoti Server jar files distributed by ActiveViam must be available in your Maven repository for the application to build. Running the Atoti AutoPivot application requires a license for the Atoti software. To use the Atoti UI frontend, the Atoti license must have the Atoti UI option enabled.

autopivot's People

Contributors

chamb, champialex, dependabot[bot], fabiencelier, gbactiveviam, joiivier, tjj225

autopivot's Issues

BOM markers in csv files

It looks like discoverFile doesn't handle the BOM (byte order mark) that Windows tools often write at the start of a file, so the first column name comes out as "\uFEFFBook" (with an invisible BOM character in front) instead of "Book".

You could do something like this at the start:
lines.get(0).replace("\uFEFF", "")
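
A slightly fuller version of that suggestion might look like the sketch below. The class BomStripper and its method names are hypothetical, and the sketch assumes the file has already been decoded into strings, so that a UTF-8 BOM shows up as the single character U+FEFF. It returns a copy because String.replace does not modify the list in place.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of AutoPivot: drops a leading Unicode BOM. */
final class BomStripper {

    /** U+FEFF, the character a UTF-8 BOM decodes to. */
    private static final char BOM = '\uFEFF';

    /** Removes a single leading BOM character, if present. */
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    /** Returns a copy of the lines in which the first line has no leading BOM. */
    static List<String> stripBomFromFirstLine(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        if (!copy.isEmpty()) {
            copy.set(0, stripBom(copy.get(0)));
        }
        return copy;
    }
}
```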

reading quoted text bug

the line

FOO,BAR,"FOO,BAR,",TEST

should split as FOO <> BAR <> FOO,BAR, <> TEST (the quoted third field keeps its commas).

Instead, it splits as
[FOO,BAR,"FOO,BAR,",TEST, null, null]

I think there is an error in the logic around double quotes that appear after a comma. With the following code, the parser thinks it is still within quotes if the character before the closing quote is a comma:

			if(c == 0 || sep == text.charAt(c-1)) {
				// Beginning of a new field, delimited by quotes
				withinQuotes = true;
			}
			// or else is it the end of a field?
			else if((c == text.length()-1) || (sep == text.charAt(c+1))) {
				if(!withinQuotes) {
					LOGGER.warning("Unexpected double quote character at the end of a field: " + text);
				}
				withinQuotes = false;
			}

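This is incorrect, however: when the closing quote of a field is preceded by a comma, as in "FOO,BAR,", the first branch treats it as the start of a new quoted field and sets withinQuotes to true, when it should actually end the quoted field.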
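
One possible way around this is to decide based on the current withinQuotes state rather than on the neighbouring characters. The sketch below is a standalone illustration under that assumption, not a patch for the AutoPivot parser: the class QuotedSplit is hypothetical, and it also treats a doubled quote inside a quoted field as one literal quote.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not AutoPivot's parser: splits one CSV line, honoring double quotes. */
final class QuotedSplit {

    static List<String> split(String text, char sep) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean withinQuotes = false;
        for (int c = 0; c < text.length(); c++) {
            char ch = text.charAt(c);
            if (ch == '"') {
                if (withinQuotes && c + 1 < text.length() && text.charAt(c + 1) == '"') {
                    // Doubled quote inside a quoted field: keep one literal quote.
                    field.append('"');
                    c++;
                } else {
                    // The current state decides: a quote seen while inside quotes
                    // closes the field, whatever the previous character was.
                    withinQuotes = !withinQuotes;
                }
            } else if (ch == sep && !withinQuotes) {
                // Unquoted separator: the current field is complete.
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(ch);
            }
        }
        fields.add(field.toString());
        return fields;
    }
}
```

With this approach, the line FOO,BAR,"FOO,BAR,",TEST yields four fields, the third being FOO,BAR, with its commas intact.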

cannot build mvn clean install

I ran mvn clean install, but the build fails with:

[ERROR] Failed to execute goal on project autopivot: Could not resolve dependencies for project com.activeviam.sandbox:autopivot:jar:2.0.0-SNAPSHOT: The following artifacts could not be resolved: com.activeviam.activepivot:activepivot-server-spring:jar:5.8.3-jdk8, com.activeviam.activepivot:activepivot-test:jar:5.8.3-jdk8, com.activeviam.activeui:activeui:jar:4.2.11: Could not find artifact com.activeviam.activepivot:activepivot-server-spring:jar:5.8.3-jdk8 in central (https://repo.maven.apache.org/maven2) -> [Help 1]
