biddata / bidmat Goto Github PK
A CPU and GPU-accelerated matrix library for data mining
License: BSD 3-Clause "New" or "Revised" License
Hi, do you have a Google Group for the BidMach and BidMat projects?
@jcanny I want to leverage GPU resources in Spark, for example using the GPU for matrix computation.
I am thinking about how to configure BIDMat as the GPU backend for Spark.
I use Maven; how should I add it to my pom.xml? (Attached.)
<prerequisites>
<maven>3.0.4</maven>
</prerequisites>
<mailingLists>
<mailingList>
<name>Dev Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
<mailingList>
<name>User Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
<mailingList>
<name>Commits Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
</mailingLists>
<modules>
<module>core</module>
<module>bagel</module>
<module>graphx</module>
<module>mllib</module>
<module>tools</module>
<module>streaming</module>
<module>sql/catalyst</module>
<module>sql/core</module>
<module>sql/hive</module>
<module>repl</module>
<module>assembly</module>
<module>external/twitter</module>
<module>external/kafka</module>
<module>external/flume</module>
<module>external/flume-sink</module>
<module>external/zeromq</module>
<module>external/mqtt</module>
<module>examples</module>
</modules>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.6</java.version>
<sbt.project.name>spark</sbt.project.name>
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
<scala.macros.version>2.0.1</scala.macros.version>
<mesos.version>0.18.1</mesos.version>
<mesos.classifier>shaded-protobuf</mesos.classifier>
<akka.group>org.spark-project.akka</akka.group>
<akka.version>2.2.3-shaded-protobuf</akka.version>
<slf4j.version>1.7.5</slf4j.version>
<log4j.version>1.2.17</log4j.version>
<hadoop.version>1.0.4</hadoop.version>
<protobuf.version>2.4.1</protobuf.version>
<yarn.version>${hadoop.version}</yarn.version>
<hbase.version>0.94.6</hbase.version>
<flume.version>1.4.0</flume.version>
<zookeeper.version>3.4.5</zookeeper.version>
<hive.version>0.12.0</hive.version>
<parquet.version>1.4.3</parquet.version>
<jblas.version>1.2.3</jblas.version>
<jetty.version>8.1.14.v20131031</jetty.version>
<chill.version>0.3.6</chill.version>
<codahale.metrics.version>3.0.0</codahale.metrics.version>
<avro.version>1.7.6</avro.version>
<jets3t.version>0.7.1</jets3t.version>
<aws.java.sdk.version>1.8.3</aws.java.sdk.version>
<aws.kinesis.client.version>1.1.0</aws.kinesis.client.version>
<PermGen>64m</PermGen>
<MaxPermGen>512m</MaxPermGen>
</properties>
<repositories>
<repository>
<id>central</id>
<!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
<name>Maven Repository</name>
<url>https://repo1.maven.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>apache-repo</id>
<name>Apache Repository</name>
<url>https://repository.apache.org/content/repositories/releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>jboss-repo</id>
<name>JBoss Repository</name>
<url>https://repository.jboss.org/nexus/content/repositories/releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>mqtt-repo</id>
<name>MQTT Repository</name>
<url>https://repo.eclipse.org/content/repositories/paho-releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>cloudera-repo</id>
<name>Cloudera Repository</name>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>mapr-repo</id>
<name>MapR Repository</name>
<url>http://repository.mapr.com/maven</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>spring-releases</id>
<name>Spring Release Repository</name>
<url>https://repo.spring.io/libs-release</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>central</id>
<url>https://repo1.maven.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-security</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-plus</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>14.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.5</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<version>3.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
<version>1.3.9</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jcl-over-slf4j</artifactId>
<version>${slf4j.version}</version>
<!-- <scope>runtime</scope> --> <!-- more correct, but scalac 2.10.3 doesn't like it -->
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>com.ning</groupId>
<artifactId>compress-lzf</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.xerial.snappy</groupId>
<artifactId>snappy-java</artifactId>
<version>1.0.5.3</version>
</dependency>
<dependency>
<groupId>net.jpountz.lz4</groupId>
<artifactId>lz4</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>com.clearspring.analytics</groupId>
<artifactId>stream</artifactId>
<version>2.7.0</version>
<exclusions>
<!-- Only HyperLogLogPlus is used, which doesn't depend on fastutil -->
<exclusion>
<groupId>it.unimi.dsi</groupId>
<artifactId>fastutil</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- In theory we need not directly depend on protobuf since Spark does not directly
use it. However, when building with Hadoop/YARN 2.2 Maven doesn't correctly bump
the protobuf version up from the one Mesos gives. For now we include this variable
to explicitly bump the version when building with YARN. It would be nice to figure
out why Maven can't resolve this correctly (like SBT does). -->
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>${protobuf.version}</version>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill_${scala.binary.version}</artifactId>
<version>${chill.version}</version>
<exclusions>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm-commons</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill-java</artifactId>
<version>${chill.version}</version>
<exclusions>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm-commons</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-actor_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-remote_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-slf4j_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-testkit_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>colt</groupId>
<artifactId>colt</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.mesos</groupId>
<artifactId>mesos</artifactId>
<version>${mesos.version}</version>
<classifier>${mesos.classifier}</classifier>
<exclusions>
<exclusion>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>commons-net</groupId>
<artifactId>commons-net</artifactId>
<version>2.2</version>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.0.23.Final</version>
</dependency>
<dependency>
<groupId>org.apache.derby</groupId>
<artifactId>derby</artifactId>
<version>10.4.2.0</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-core</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-jvm</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-json</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-ganglia</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-graphite</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>jline</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-actors</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scalap</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>2.1.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.easymock</groupId>
<artifactId>easymockclassextension</artifactId>
<version>3.1</version>
<scope>test</scope>
</dependency>
<!-- Needed by cglib which is needed by easymock. -->
<dependency>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
<version>3.3.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.9.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<version>1.11.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.novocode</groupId>
<artifactId>junit-interface</artifactId>
<version>0.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.4.0</version>
<exclusions>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api-2.5</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>${avro.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-ipc</artifactId>
<version>${avro.version}</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.velocity</groupId>
<artifactId>velocity</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>${avro.version}</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.velocity</groupId>
<artifactId>velocity</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- See SPARK-1556 for info on this dependency: -->
<dependency>
<groupId>net.java.dev.jets3t</groupId>
<artifactId>jets3t</artifactId>
<version>${jets3t.version}</version>
<exclusions>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-web-proxy</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<!-- Matches the version of jackson-core-asl pulled in by avro -->
<groupId>org.codehaus.jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
<version>1.8.8</version>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
<version>1.3.1</version>
<executions>
<execution>
<id>enforce-versions</id>
<goals>
<goal>enforce</goal>
</goals>
<configuration>
<rules>
<requireMavenVersion>
<version>3.0.4</version>
</requireMavenVersion>
<requireJavaVersion>
<version>${java.version}</version>
</requireJavaVersion>
</rules>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<version>1.8</version>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<id>scala-compile-first</id>
<phase>process-resources</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
<execution>
<id>scala-test-compile-first</id>
<phase>process-test-resources</phase>
<goals>
<goal>testCompile</goal>
</goals>
</execution>
<execution>
<id>attach-scaladocs</id>
<phase>verify</phase>
<goals>
<goal>doc-jar</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<recompileMode>incremental</recompileMode>
<useZincServer>true</useZincServer>
<args>
<arg>-unchecked</arg>
<arg>-deprecation</arg>
<arg>-feature</arg>
<arg>-language:postfixOps</arg>
</args>
<jvmArgs>
<jvmArg>-Xms1024m</jvmArg>
<jvmArg>-Xmx1024m</jvmArg>
<jvmArg>-XX:PermSize=${PermGen}</jvmArg>
<jvmArg>-XX:MaxPermSize=${MaxPermGen}</jvmArg>
</jvmArgs>
<javacArgs>
<javacArg>-source</javacArg>
<javacArg>${java.version}</javacArg>
<javacArg>-target</javacArg>
<javacArg>${java.version}</javacArg>
</javacArgs>
<!-- The following plugin is required to use quasiquotes in Scala 2.10 and is used
by Spark SQL for code generation. -->
<compilerPlugins>
<compilerPlugin>
<groupId>org.scalamacros</groupId>
<artifactId>paradise_${scala.version}</artifactId>
<version>${scala.macros.version}</version>
</compilerPlugin>
</compilerPlugins>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<encoding>UTF-8</encoding>
<maxmem>1024m</maxmem>
<fork>true</fork>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.17</version>
<configuration>
<!-- Uses scalatest instead -->
<skipTests>true</skipTests>
</configuration>
</plugin>
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0-RC2</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>${project.build.directory}/SparkTestSuite.txt</filereports>
<argLine>-Xmx3g -XX:MaxPermSize=${MaxPermGen} -XX:ReservedCodeCacheSize=512m</argLine>
<stderr />
<systemProperties>
<java.awt.headless>true</java.awt.headless>
<spark.test.home>${session.executionRootDirectory}</spark.test.home>
<spark.testing>1</spark.testing>
</systemProperties>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.7</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.2</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>2.2.1</version>
<configuration>
<attach>true</attach>
</configuration>
<executions>
<execution>
<id>create-source-jar</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-clean-plugin</artifactId>
<version>2.5</version>
<configuration>
<filesets>
<fileset>
<directory>work</directory>
</fileset>
<fileset>
<directory>checkpoint</directory>
</fileset>
</filesets>
</configuration>
</plugin>
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<executions>
<execution>
<id>add-scala-sources</id>
<phase>generate-sources</phase>
<goals>
<goal>add-source</goal>
</goals>
<configuration>
<sources>
<source>src/main/scala</source>
</sources>
</configuration>
</execution>
<execution>
<id>add-scala-test-sources</id>
<phase>generate-test-sources</phase>
<goals>
<goal>add-test-source</goal>
</goals>
<configuration>
<sources>
<source>src/test/scala</source>
</sources>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.scalastyle</groupId>
<artifactId>scalastyle-maven-plugin</artifactId>
<version>0.4.0</version>
<configuration>
<verbose>false</verbose>
<failOnViolation>true</failOnViolation>
<includeTestSourceDirectory>false</includeTestSourceDirectory>
<failOnWarning>false</failOnWarning>
<sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
<testSourceDirectory>${basedir}/src/test/scala</testSourceDirectory>
<configLocation>scalastyle-config.xml</configLocation>
<outputFile>scalastyle-output.xml</outputFile>
<outputEncoding>UTF-8</outputEncoding>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<!-- Ganglia integration is not included by default due to LGPL-licensed code -->
<profile>
<id>spark-ganglia-lgpl</id>
<modules>
<module>extras/spark-ganglia-lgpl</module>
</modules>
</profile>
<!-- Kinesis integration is not included by default due to ASL-licensed code -->
<profile>
<id>kinesis-asl</id>
<modules>
<module>extras/kinesis-asl</module>
</modules>
</profile>
<profile>
<id>java8-tests</id>
<build>
<plugins>
<!-- Needed for publishing test jars as it is needed by java8-tests -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>test-jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<modules>
<module>extras/java8-tests</module>
</modules>
</profile>
<!-- A series of build profiles where customizations for particular Hadoop releases can be made -->
<profile>
<id>hadoop-0.23</id>
<!-- SPARK-1121: Adds an explicit dependency on Avro to work around a Hadoop 0.23.X issue -->
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
<properties>
<hadoop.version>0.23.10</hadoop.version>
</properties>
</profile>
<profile>
<id>hadoop-2.2</id>
<properties>
<hadoop.version>2.2.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
</properties>
</profile>
<profile>
<id>hadoop-2.3</id>
<properties>
<hadoop.version>2.3.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
<jets3t.version>0.9.0</jets3t.version>
</properties>
</profile>
<profile>
<id>hadoop-2.4</id>
<properties>
<hadoop.version>2.4.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
<jets3t.version>0.9.0</jets3t.version>
</properties>
</profile>
<profile>
<id>yarn-alpha</id>
<modules>
<module>yarn</module>
</modules>
</profile>
<profile>
<id>yarn</id>
<modules>
<module>yarn</module>
</modules>
</profile>
<profile>
<id>mapr3</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>
<hadoop.version>1.0.3-mapr-3.0.3</hadoop.version>
<yarn.version>2.3.0-mapr-4.0.0-FCS</yarn.version>
<hbase.version>0.94.17-mapr-1405</hbase.version>
<zookeeper.version>3.4.5-mapr-1406</zookeeper.version>
</properties>
</profile>
<profile>
<id>mapr4</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>
<hadoop.version>2.3.0-mapr-4.0.0-FCS</hadoop.version>
<yarn.version>2.3.0-mapr-4.0.0-FCS</yarn.version>
<hbase.version>0.94.17-mapr-1405-4.0.0-FCS</hbase.version>
<zookeeper.version>3.4.5-mapr-1406</zookeeper.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.4.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>3.4.5-mapr-1406</version>
</dependency>
</dependencies>
</profile>
<!-- Build without Hadoop dependencies that are included in some runtime environments. -->
<profile>
<id>hadoop-provided</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-web-proxy</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-ipc</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>${zookeeper.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
</profile>
<profile>
<id>hive</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<modules>
<module>sql/hive-thriftserver</module>
</modules>
</profile>
</profiles>
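One way to wire BIDMat into a Maven build like the one above: since BIDMat is not published to Maven Central (as far as I know), install its jar into the local repository and then reference it as a normal dependency. The coordinates below are illustrative, not official; pick whatever groupId/artifactId you use in the `install-file` command:

```xml
<!-- Illustrative only: install the BIDMat jar locally first, e.g.
     mvn install:install-file -Dfile=BIDMat.jar -DgroupId=edu.berkeley.bid \
         -DartifactId=bidmat -Dversion=1.0.0 -Dpackaging=jar
     then reference the same (made-up) coordinates here: -->
<dependency>
  <groupId>edu.berkeley.bid</groupId>
  <artifactId>bidmat</artifactId>
  <version>1.0.0</version>
</dependency>
```

You would also need the native CUDA/JCuda libraries from BIDMat/lib on the library path at runtime; Maven only handles the JVM side.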
Currently the TParse code outputs an IMat and an SBMat for a column of strings with delimiters in the raw data file. Each row of the IMat is a (row, keyword_id)
pair, where the row number corresponds to the row number in the input data file, and keyword_id is the id of a keyword in the SBMat dictionary of unique keywords.
However, the documentation for TParse's behavior on string data (https://github.com/BIDData/BIDMach/wiki/Data-Wrangling#String_Fields) says TParse will output an SMat and an SBMat for a column of strings with delimiters, where each column of the SMat is a list of keyword_ids.
It seems there is a missing step between the current TParse output and the behavior described in the documentation, in which the IMat is converted to an SMat according to the specified encoding. Can anyone either fix the documentation or fix the code?
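For what it's worth, the missing conversion could look something like the following plain-Scala sketch (this is not the BIDMat API; names and the CSC layout are illustrative). Each input row becomes one sparse column, and the keyword ids become that column's row indices:

```scala
object TParseConvert {
  // Convert (row, keywordId) pairs -- what TParse currently emits as an IMat --
  // into CSC-style arrays: one sparse column per input row, with the keyword
  // ids as the row indices of that column. Illustrative only, not BIDMat API.
  def pairsToCSC(pairs: Array[(Int, Int)], nDocs: Int): (Array[Int], Array[Int]) = {
    val sorted = pairs.sortBy(_._1)          // stable sort keeps keyword order per row
    val colPtrs = new Array[Int](nDocs + 1)  // CSC column-pointer array
    for ((r, _) <- sorted) colPtrs(r + 1) += 1
    for (i <- 1 to nDocs) colPtrs(i) += colPtrs(i - 1)  // prefix sum over counts
    val rowIds = sorted.map(_._2)            // keyword ids become row indices
    (colPtrs, rowIds)
  }
}
```

Wrapping those two arrays (plus an all-ones value array) in an SMat constructor would give the documented output.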
CUDA 7.0 was recently released. I upgraded to it to finally get Caffe working on my laptop, and I removed the earlier version (6.5) that I had been using for BIDMach/BIDMat.
I downloaded the JCuda binaries from:
http://www.jcuda.org/downloads/downloads.html
and pasted these files into the BIDMat/lib folder:
jcublas-0.7.0.jar
jcuda-0.7.0.jar
...(etc)...
libJCublas-apple-x86_64.dylib
libJCublas2-apple-x86_64.dylib
...(etc)...
I can compile successfully, but at startup I get this (note the "Something went wrong..." line):
dhcp-46-165:BIDMat danielseita$ ./bidmat
Loading /Users/danielseita/BIDMat/lib/bidmat_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
1 CUDA device found, CUDA version 7.0
Something went wrong while loading BIDMat CUDA library
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_05).
Type in expressions to have them evaluated.
Type :help for more information.
scala> sys.exit
Specifics about my system: OS X 10.9, using the latest BIDMat version (e6e764f).
In the bidmat launch script there's a place where I can specify the CUDA version, but even changing that doesn't help. Has 7.0 been tested, and if not, what are the plans to support it?
In the meantime I'll stick with CUDA 6.5, but I thought I'd raise the issue now.
Hi all,
Is there any plan to implement GPU versions of the binomial and negative binomial random number generators?
Thanks!
Marek
I noticed this issue while browsing through Mat.scala, specifically in the getOS function:
final val OS_WINDOWS = 0;
final val OS_LINUX = 1;
final val OS_OSX = 2;
final val OS_ANDROID = 3;
def getOS: Int = {
  val osname = System.getProperty("os.name");
  if (osname.startsWith("Windows")) OS_WINDOWS
  else if (osname.startsWith("Linux")) OS_LINUX
  else if (osname.startsWith("Mac")) OS_OSX
  else OS_ANDROID
}
val ostype = getOS
According to http://developer.android.com/reference/java/lang/System.html#getProperty(java.lang.String), System.getProperty("os.name") always returns "Linux" on Android.
Seems like another way might be to check System.getProperty("java.vendor") for "The Android Project", then fall back to System.getProperty("os.name") for the remaining platforms?
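A minimal sketch of that alternative ordering in plain JVM Scala (no BIDMat; the "Android" vendor check and the fallback default are my assumptions, not tested on a device):

```scala
// OS codes, matching the constants in Mat.scala.
val OS_WINDOWS = 0
val OS_LINUX = 1
val OS_OSX = 2
val OS_ANDROID = 3

// Sketch: check java.vendor for Android first, since os.name is "Linux"
// there, then fall back to os.name for desktop platforms.
def getOS2: Int = {
  val vendor = System.getProperty("java.vendor", "")
  if (vendor.contains("Android")) OS_ANDROID
  else {
    val osname = System.getProperty("os.name", "")
    if (osname.startsWith("Windows")) OS_WINDOWS
    else if (osname.startsWith("Mac")) OS_OSX
    else OS_LINUX // unknown Unix: Linux seems the safest default here
  }
}
```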
I am trying to identify the top 10 most similar documents from a list of 2.7 million documents for each new document I get. Each document is represented as a vector of 500 doubles. The number of documents is 2,633,142, so we have a matrix of size gc = (2633142 x 500) for all the processed documents. Whenever a new document arrives, we compute its vector e = (500 x 1). Assuming the vectors are normalized, we need to compute the dot product of e with each row of gc and pick the maximum value from the resulting (2,633,142 x 1) matrix.
When I compute the dot product of a new document's features with those of the 2.7 million documents using BIDMat matrix-matrix multiplication, it takes 300 milliseconds using FMat and 1200 milliseconds using GMat. I am attaching a screenshot. I was expecting GMat to be faster than FMat. Another thing I observe is that the amount of GPU memory used while creating the GMat is less than 100 MB, while I was expecting it to use 3 GB. It looks like the matrix is not being cached in GPU memory, which leads to the slower execution. Can you tell me if I am doing something wrong here, or whether anything can be done to cache the data properly in GPU memory and reduce the time of the GMat-based matrix multiplication?
The computation is a matrix multiplication of dimensions:
(2633142 x 500) * (500 x 1) = (2633142 x 1)
I don't understand why the CPU is faster than the GPU here, or how to obtain faster speed with the GPU, say within a few hundred milliseconds.
Notebook.pdf
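For reference, the scoring step being described is just a matrix-vector product followed by an argmax; a plain-Scala sketch at toy sizes (no BIDMat, column-major storage assumed as in the question):

```scala
// gc is nDocs x dim, stored column-major: element (r, c) is at c*nDocs + r.
// Score each row against e and take the argmax -- the same computation
// as gc * e followed by a max, at toy scale.
val nDocs = 4
val dim = 3
val gc = Array[Double](
  0.1, 0.9, 0.3, 0.2,   // column 0
  0.2, 0.1, 0.8, 0.2,   // column 1
  0.3, 0.1, 0.5, 0.9)   // column 2
val e = Array[Double](1.0, 0.0, 0.0)
val scores = Array.tabulate(nDocs) { r =>
  (0 until dim).map(c => gc(c * nDocs + r) * e(c)).sum
}
val best = scores.indexOf(scores.max) // index of the most similar document
```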
John,
In some code I'm writing, I have a matrix and a set of indices into that matrix (column-major order as usual), called innz. I would like to set all of the elements at the spots specified by innz to zero. With CPU matrices, it is straightforward:
scala> val a = IMat(rand(3,12)*5)+1
a: BIDMat.IMat =
4 3 3 3 3 4 1 2 2 5 3 3
5 2 5 4 1 4 1 5 1 3 3 4
4 4 3 1 5 4 5 3 3 1 5 5
scala> val innz = 2 \ 4 \ 9 \ 12 \ 18
innz: BIDMat.IMat = 2,4,9,12,18
scala> a(innz) = 0
res6: BIDMat.IMat =
4 3 3 0 0 4 0 2 2 5 3 3
5 0 5 4 1 4 1 5 1 3 3 4
0 4 3 1 5 4 5 3 3 1 5 5
Alternatively, one can do this:
scala> a(innz) = izeros(1,5)
res7: BIDMat.IMat = 0,0,0,0,0
scala> a
res8: BIDMat.IMat =
4 3 3 0 0 4 0 2 2 5 3 3
5 0 5 4 1 4 1 5 1 3 3 4
0 4 3 1 5 4 5 3 3 1 5 5
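For clarity, the linear-index update shown above can be sketched over a plain column-major backing array (no BIDMat):

```scala
// a(innz) = 0 zeroes the elements at column-major linear positions innz;
// element (r, c) of an nrows x ncols matrix lives at c * nrows + r.
val nrows = 3
val data = Array.fill(3 * 12)(1)  // stand-in for the matrix contents
val innz = Array(2, 4, 9, 12, 18)
for (i <- innz) data(i) = 0
```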
With GPU matrices it is a little more complicated, because a few update methods are missing. (I am not sure whether these omissions are deliberate; for instance, the wiki states that the ^* operator is missing, but that one is deliberate right now.) I set up another random matrix, using the same set of indices to target for zeros, but assigning a single 0 won't work because there are no linear updates. I tried several approaches, and by checking the source code, the only way that works is to set the right-hand side to be a GIMat, as shown below:
scala> val ga = GIMat(grand(3,12)*5)+1
ga: BIDMat.GIMat =
3 2 5 4 3 3 2 5 2 5 2 2
3 5 1 3 3 2 5 1 5 5 4 3
1 3 5 5 4 3 1 4 3 1 1 3
scala> val innz = GIMat(2 \ 4 \ 9 \ 12 \ 18)
innz: BIDMat.GIMat = 2,4,9,12,18
scala> ga(innz) = 0
java.lang.RuntimeException: operator linear update not implemented for GIMat
at BIDMat.Mat.notImplemented0(Mat.scala:21)
at BIDMat.Mat.update(Mat.scala:135)
... 33 elided
scala> ga(innz) = gizeros(1,5)
res4: BIDMat.GIMat =
3 2 5 4 3 3 2 5 2 5 2 2
3 5 1 3 3 2 5 1 5 5 4 3
1 3 5 5 4 3 1 4 3 1 1 3
However, this does not modify the components of ga!
The source code traces back to the def updatex(I:GIMat, v:GIMat):GIMat method in GIMat.scala, which then calls some GPU code: val err = CUMAT.copyToInds(data, v.data, I.data, I.llength);. I read through the BIDMat/jni/src/BIDMat_CUMAT.cpp file, which has that function declaration, but the definition isn't there, so it must be somewhere else (or maybe it's from CUDA itself, so you didn't write it?). EDIT: it's actually in MatKernel.cu, sorry; that seems to be where the matrix kernels in CUDA live. The definition seems to make sense, based on my rudimentary understanding of CUDA syntax...
Regardless, I wanted to check whether this is the intended behavior for block updates; it doesn't seem that way to me. If this is not the right way to go, any suggestions on how to update the values at specified indices?
Thanks for your time,
Daniel
Version: d2b1d59
Seems like there's a link error.
scala> val a8 = BIDMat.GDMat(BIDMat.DMat(1 on 4))
a8: BIDMat.GDMat =
1
4

scala> a8.t
java.lang.UnsatisfiedLinkError: edu.berkeley.bid.CUMATD.transpose(Ljcuda/Pointer;ILjcuda/Pointer;III)I
at edu.berkeley.bid.CUMATD.transpose(Native Method)
at BIDMat.GDMat.t(GDMat.scala:301)
... 33 elided
The only difference is that the GDMats and GLMats use this:
CUMATD.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
whereas GIMats and GMats use
CUMAT.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
To this:
${BIDMAT_ROOT}/scala/bin/scala -nobootcp -cp "${ALL_LIBS}" -Yrepl-sync -i
to allow local Scala instance to be found.
Hi all, please find below the error I receive:
scala> val a = binornd(10,0.5,2,2)
MKL ERROR: Parameter 5 was incorrect on entry to viRngBinomial.
a: BIDMat.IMat =
0 0
0 0
Version: 38035fd
I was looking through the DenseMat.scala code to document it. (I was going to document this one, then copy and paste some of the comments elsewhere.) I noticed that there are many methods that return an appropriate matrix internally but will not print it out on the command line. Do you think this may confuse new users? Is it worthwhile to make those methods private or package-protected?
Here are two examples:
(1) DenseMat has mkdiag and getdiag methods that act as a "second" way of getting diagonals. For instance, the usual way to make a diagonal matrix is the following:
scala> val a = mkdiag(1 on 2 on 3)
a: BIDMat.IMat =
1 0 0
0 2 0
0 0 3
Though I can also do it this way:
scala> val a = (1 on 2 on 3).mkdiag
a: BIDMat.DenseMat[Int] =
scala>
While it's subtle, the second way actually produces a correct matrix; it just doesn't print normally, and we have to wrap an IMat() around it to "see" it normally. The same holds true for getdiag.
Second example:
scala> val b = DMat(1\2 on 3\4)
b: BIDMat.DMat =
1 2
3 4
scala> b.ghorzcat(b)
res10: BIDMat.DenseMat[Double] =
scala> DMat(b.ghorzcat(b))
res11: BIDMat.DMat =
1 2 1 2
3 4 3 4
Looking through the code, I couldn't find a way to specify the number of GPUs to use on the machine. Is this possible?
John,
I've been having some problems debugging memory allocation with BIDMach (BayesNet.scala specifically). I'm using the Java Runtime class, which should give a reliable measure of memory allocation. But in the following test script, I'm noticing some weird results:
import java.text.NumberFormat

def computeMemory = {
  val runtime = Runtime.getRuntime()
  val format = NumberFormat.getInstance()
  val sb = new StringBuilder()
  val maxMemory = runtime.maxMemory()
  val allocatedMemory = runtime.totalMemory()
  val freeMemory = runtime.freeMemory()
  sb.append("free memory: " + format.format(freeMemory / (1024*1024)) + "M ");
  sb.append("allocated/total memory: " + format.format(allocatedMemory / (1024*1024)) + "M\n");
  print(sb.toString())
}

for (i <- 0 until 100) {
  val a = rand(67,4367)
  //Thread sleep 3000
  println("memory at iteration i = " + i)
  computeMemory
}
The computeMemory function reports the free memory and the total memory (the initial heap size is set with -Xms). The loop then creates a bunch of random matrices and prints the memory as we go.
The output is as follows:
dhcp-46-165:BIDMach danielseita$ ./bidmach test.ssc
Loading /Users/danielseita/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
1 CUDA device found, CUDA version 7.0
Loading test.ssc...
import java.text.NumberFormat
computeMemory: Unit
memory at iteration i = 0
free memory: 13,534M allocated/total memory: 13,739M
memory at iteration i = 1
free memory: 13,534M allocated/total memory: 13,739M
memory at iteration i = 2
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 3
free memory: 13,462M allocated/total memory: 13,739M
// More of the same...
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 66
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 67
free memory: 13,389M allocated/total memory: 13,739M
memory at iteration i = 68
free memory: 13,389M allocated/total memory: 13,739M
memory at iteration i = 69
// More of the same ...
What happens is that the amount of free memory drops at seemingly random times (by about 72-73 MB each time). In fact, when you add up the memory allocated across all 100 iterations for these 67 x 4367, non-cached matrices (approximately (67 x 4367 x 8) / (1024 x 1024) MB each, if we assume one element takes about 8 bytes), the final free-memory figure makes sense. The free-memory value just does not seem to get updated frequently enough.
So my question: do you know of some esoteric detail about memory allocation with BIDMach/BIDMat data structures that would cause the Java Runtime counters to behave weirdly and not update promptly? Normally this wouldn't be too big of a problem, but I'm trying to debug matrix-caching problems with Gibbs sampling, and for that it would be nice to confidently point out the places in the code where memory gets allocated, rather than seeing "random" drops of memory. Adding a thread sleep to delay the measurement does not seem to affect this.
It's likely that this is more of a Java Runtime problem or some timing issue between Runtime and Scala, so I'll probably try printing out the GUIDs of matrices to help me debug instead, but I just wanted to check. This happens both on my laptop and on stout.
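One mitigation worth trying (a sketch in plain JVM code, not BIDMach-specific): request a GC pass before sampling the Runtime counters, since freeMemory() only reflects allocation lazily as the JVM touches new heap regions.

```scala
// Requesting (not forcing) a collection before sampling tends to give
// more stable used-memory readings than raw freeMemory() deltas.
def usedMemoryMB(): Long = {
  val rt = Runtime.getRuntime
  System.gc() // a hint to the JVM, not a guarantee
  (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L)
}
```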
I just pulled and ran into some errors building:
~/BIDMat> ./sbt package
[info] Set current project to BIDMat (in build file:/home/kjameslubin/BIDMat/)
[warn] Credentials file /home/kjameslubin/.ivy2/.credentials does not exist
[info] Compiling 9 Scala sources to /home/kjameslubin/BIDMat/target/scala-2.11/classes...
[error] /home/kjameslubin/BIDMat/src/main/scala/BIDMat/JSON.scala:4: object cedarsoftware is not a member of package com
[error] import com.cedarsoftware.util.io.JsonWriter;
[error] ^
etc...
It took me a minute to figure out that I needed to rerun ./getdevlibs.sh, and then it built without issue. Perhaps we should update INSTALLING.txt?
Something like:
Install from source:
apt-get < reqs >
git clone <repo name>
./getlibs.sh
./sbt package
cd scripts && bidmach workout.ssc
I have only installed from scratch on Ubuntu and OS X and could produce instructions for them - though my OS X laptop is too old to "really" run the recent CUDA versions.
Assigning an element in an IMat works
scala> var a = 0\0\0 on 0\0\0;
a: BIDMat.IMat =
0 0 0
0 0 0
scala> a(0,2) = 3;
res2: Int = 3
but the same fails with an FMat:
scala> var b = 0.1\0\0 on 0\0\0;
b: BIDMat.FMat =
0.10000 0 0
0 0 0
scala> b(0,2) = 3.4;
<console>:22: error: ambiguous reference to overloaded definition,
both method update in class FMat of type (i: Int, jv: BIDMat.Mat, b: BIDMat.Mat)BIDMat.FMat
and method update in class FMat of type (iv: BIDMat.IMat, j: Int, b: BIDMat.FMat)BIDMat.FMat
match argument types (Int,Int,Double)
b(0,2) = 3.4;
BIDMat 0.9.7.
In the wiki/Dictionaries page, there is a reference to a union2 method of the Dict class:
>val (d, dm1, dm2) = Dict.union2(d1,d2)
I think this should actually be Dict.union3(d1,d2)? The union2 command seems to apply to IDicts.
John,
There seems to be some new issue with loading/saving matrices. Currently on the master branch of BIDMach/BIDMat (for both my Mac 10.9 laptop and on stout), I get some NoClassDefFoundErrors when loading/saving matrices by following the BIDMat Loading/Saving Wiki page:
[seita@stout BIDMach]$ ./bidmach
Loading /home/seita/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
4 CUDA devices found, CUDA version 6.5
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.
scala> saveFMat("hi", rand(3,3))
java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4BlockInputStream
at BIDMat.MatFunctions$.saveFMat(MatFunctions.scala:1852)
... 33 elided
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4BlockInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 34 more
This is on my laptop:
scala> val a = loadSMat("moods_data/moodsByUser.smat.lz4")
java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4BlockInputStream
at BIDMat.MatFunctions$.loadSMat(MatFunctions.scala:1846)
... 33 elided
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4BlockInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 34 more
In the past, I tried saving matrices by using the HMat.saveFMatTxt("name", a) workaround, but that suddenly gives me the same error.
Do you know if a recent update changed this loading/saving process, or is there something I need to download? Thanks.
-Daniel
We are trying to write and automatically run specs for our piece of software that uses BIDMat. Some of our code is GPU-powered, some is not.
Is there a way to do that without having every machine GPU-enabled with jcuda installed?
I imagine that with an interface like Mat we could do that: write tests against the Mat interface, providing an FMat when testing and a GMat when on the GPU.
I tried that, but even using SciFunctions requires the jcuda package.
How would you suggest overcoming this issue?
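The Mat-interface idea can be sketched like this (hypothetical names of my own, not BIDMat's API): code against a small trait so that tests exercise a CPU implementation and never load jcuda classes unless the GPU path is actually selected.

```scala
// Hypothetical backend trait; tests use CpuBackend only, so nothing
// jcuda-related is ever touched on GPU-less machines.
trait Backend {
  def zeros(rows: Int, cols: Int): Array[Float]
}

object CpuBackend extends Backend {
  def zeros(rows: Int, cols: Int): Array[Float] = new Array[Float](rows * cols)
}

// The GPU backend would live in a separate module, loaded only when a
// device is present, keeping jcuda off the test classpath entirely.
def backendFor(useGpu: Boolean): Backend =
  if (useGpu) sys.error("GPU backend requires jcuda on the classpath")
  else CpuBackend
```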
Running sbt in the main directory of the repository, I get this error:
[info] Resolving org.scala-sbt#precompiled-2_10_0;0.12.2 ...
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
sbt.ResolveException: unresolved dependency: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:214)
at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:122)
at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:121)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:114)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:114)
at sbt.IvySbt$$anonfun$withIvy$1.apply(Ivy.scala:102)
at sbt.IvySbt.liftedTree1$1(Ivy.scala:49)
at sbt.IvySbt.action$1(Ivy.scala:49)
at sbt.IvySbt$$anon$3.call(Ivy.scala:58)
at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:75)
at xsbt.boot.Locks$GlobalLock.withChannelRetries$1(Locks.scala:58)
at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:79)
at xsbt.boot.Using$.withResource(Using.scala:11)
at xsbt.boot.Using$.apply(Using.scala:10)
at xsbt.boot.Locks$GlobalLock.liftedTree1$1(Locks.scala:51)
at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:51)
at xsbt.boot.Locks$.apply0(Locks.scala:30)
at xsbt.boot.Locks$.apply(Locks.scala:27)
at sbt.IvySbt.withDefaultLogger(Ivy.scala:58)
at sbt.IvySbt.withIvy(Ivy.scala:99)
at sbt.IvySbt.withIvy(Ivy.scala:95)
at sbt.IvySbt$Module.withModule(Ivy.scala:114)
at sbt.IvyActions$.update(IvyActions.scala:121)
at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:951)
at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:949)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$54.apply(Defaults.scala:972)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$54.apply(Defaults.scala:970)
at sbt.Tracked$$anonfun$lastOutput$1.apply(Tracked.scala:35)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:974)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:969)
at sbt.Tracked$$anonfun$inputChanged$1.apply(Tracked.scala:45)
at sbt.Classpaths$.cachedUpdate(Defaults.scala:977)
at sbt.Classpaths$$anonfun$45.apply(Defaults.scala:856)
at sbt.Classpaths$$anonfun$45.apply(Defaults.scala:853)
at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
at scala.Function1$$anonfun$compose$1.apply(Function1.scala:49)
at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:41)
at sbt.std.Transform$$anon$5.work(System.scala:71)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:18)
at sbt.Execute.work(Execute.scala:238)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:160)
at sbt.CompletionService$$anon$2.call(CompletionService.scala:30)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
error sbt.ResolveException: unresolved dependency: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
JCuda now supports 7.5. Is 7.5 on the roadmap somewhere for BIDMat? I took a run at adding it myself, but it got a little complicated since 7.5 requires x64 and I know next to nothing about JNI.
Using saveAs, I've gotten an HDF5LibraryException even when load works fine.
I made two changes and it fixed the problem:
Nothing is blocking, but I wanted to post an issue because there may be a more helpful way to do the error reporting here. Since the HDF5LibraryException seems to cover a wide variety of possible problems (including not having LD_LIBRARY_PATH set properly), it would be nice for usability to catch it and report a more specific error when possible.
John, I added out.clear lines to the DenseMat accum, i.e., what gets called when we do accum for CPU matrices. The issue was that originally, my accum calls would keep the results of the previous accum call in out, so cumulative sums were added on top of the previous output rather than to a cleared matrix of zeros. The GPU versions of accum have out.clear in them (check GMat.scala), so I'm assuming that the lack of out.clear in the CPU versions is just an oversight. I didn't catch this in testing, but it came up in the BayesNet.scala tests with matrix caching on.
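The effect of the fix, sketched over plain arrays (an assumed simplification of the real accum, not BIDMat code):

```scala
// Without clearing, a cached `out` buffer carries sums from the previous
// call; zeroing it first restores the expected accumulate semantics.
def accum(inds: Array[Int], vals: Array[Float], out: Array[Float]): Array[Float] = {
  java.util.Arrays.fill(out, 0f)                 // the out.clear equivalent
  for (k <- inds.indices) out(inds(k)) += vals(k)
  out
}
```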
John, assuming this doesn't break anything, feel free to close this.
We want to load a matrix from a file/stream directly to the GPU. Currently we load the whole thing into an FMat and then copy it to the GPU, but it might be faster to load it directly.
Could you point us to how to do this directly on the GPU?
Also, I currently do not see a way to update a given cell in a GMat, or to update a whole row/column in a GMat.
Notice that the first two multiplications match, but the last is off by a uniform factor of three. This could easily be my mistake or some peculiarity of my local configuration.
Trace:
scala> val c = GMat(bernrnd(0.3,6,6) : FMat)
c: BIDMat.GMat =
0 0 0 1 1 0
0 0 1 0 0 0
0 1 0 1 1 1
1 0 0 0 0 0
0 1 1 1 0 0
.. .. .. .. .. ..
scala> val s = GSMat(sparse(bernrnd(0.3,6,6)) : SMat)
s: BIDMat.GSMat =
( 2, 0) 1
( 3, 0) 1
( 1, 1) 1
( 3, 1) 1
( 4, 1) 1
( 5, 1) 1
( 2, 2) 1
( 5, 2) 1
... ... ...
scala> c*s
res18: BIDMat.GMat =
1 2 0 2 1 0
1 0 1 1 1 0
1 4 1 3 2 1
0 0 0 0 1 0
2 2 1 2 2 0
.. .. .. .. .. ..
scala> full(s)
res19: BIDMat.GMat =
0 0 0 0 1 0
0 1 0 0 0 0
1 0 1 1 1 0
1 1 0 1 1 0
0 1 0 1 0 0
.. .. .. .. .. ..
scala> c*full(s)
res20: BIDMat.GMat =
1 2 0 2 1 0
1 0 1 1 1 0
1 4 1 3 2 1
0 0 0 0 1 0
2 2 1 2 2 0
.. .. .. .. .. ..
scala> c.tileMult(c.nrows, s.ncols, s.ncols, 0,0,s,0,0,GMat(FMat.zeros(6,6)),0,0)
res21: BIDMat.GMat =
3 6 0 6 3 0
3 0 3 3 3 0
3 12 3 9 6 3
0 0 0 0 3 0
6 6 3 6 6 0
.. .. .. .. .. ..
Hello,
I've been using the bleeding-edge/developer version of BIDMat (tried this with a clone from February 2, and now a fresh pull today) and I have found that the dense-sparse matrix multiply no longer works. At first I thought it had to do with my own library's determinism (it is built on top of BIDMat), since the results appear different on every re-run of the program, but then I ran this simple vignette after tracing the computation to its source:
var datX = Array(1f, 0f, 1f, 0f, 0f, 0f,
0f, 1f, 1f, 0f, 0f, 0f,
1f, 1f, 1f, 0f, 0f, 0f,
0f, 0f, 0f, 1f, 0f, 1f,
0f, 0f, 0f, 0f, 1f, 1f,
0f, 0f, 0f, 1f, 1f, 1f)
var x1 = new FMat(6,6,datX)
var x2 = sparse(x1)
println("prod:\n"+(x1 * x2))
Reproduced below are two subsequent runs of the program (in a Scala object with a main method), labelled Trial 1 and Trial 2 respectively:
Trial 1:
prod:
1 0 4 0 0 0
0 1 3 0 0 0
1 1 5 0 0 0
0 0 0 2 1 2
0 0 0 1 2 2
.. .. .. .. .. ..
Trial 2:
prod:
1 1 2 0 0 1
0 2 2 0 0 2
1 2 3 0 0 2
0 0 0 2 1 6
0 0 0 1 2 5
.. .. .. .. .. ..
I'm not sure whether something in a recent BIDMat update broke the matrix-matrix operations involving sparse matrices, but at minimum the multiply is affected. Could you please look into this and fix this bug? Your dense-sparse/sparse ops are quite critical to my library's backbone. Note that this issue goes away when I do NOT use the most up-to-date version of BIDMat (and instead revert to one of the bundles available on the website).
Thanks!
The implementation should probably be something like:
applySFun(beta, FMat(beta.nrows, beta.ncols), null, math.signum _, 1L)
sbt is the default way to manage dependencies in Scala, yet even though BIDMat uses sbt itself, it does not seem possible to reference BIDMat as an sbt dependency.
Expected behaviour: libraryDependencies += "edu.berkeley.bid" %% "BIDMat" % "1.0.2"
This does not work.
Calling setseed and then randperm always returns 0 as the first element of the permutation, but only the first time randperm is called after setseed:
scala> setseed(45664)
scala> randperm(10)
res49: BIDMat.IMat = 0,8,4,7,1,6,3,2,5,9
scala> randperm(10)
res50: BIDMat.IMat = 8,4,2,6,0,9,3,7,5,1
scala> setseed(464)
scala> randperm(10)
res52: BIDMat.IMat = 0,6,9,7,5,2,3,1,8,4
scala> randperm(10)
res53: BIDMat.IMat = 4,1,9,2,5,8,3,6,7,0
scala> randperm(10)
res54: BIDMat.IMat = 9,3,0,8,2,5,7,4,1,6
scala> setseed(8559)
scala> randperm(10)
res56: BIDMat.IMat = 0,3,7,2,5,4,8,1,9,6
scala> randperm(10)
res57: BIDMat.IMat = 5,9,4,3,8,7,2,0,6,1
This is the error I got when I do A/5.0 where A is a DMat.
java.lang.RuntimeException: operator / not implemented for DMat
at BIDMat.Mat.notImplemented0(Mat.scala:10)
CUDA 5.5 is old. Please support CUDA 7.0.
Hi, I'm unable to find the source to compile libbidmatcuda. Where would I find it?
I'm attempting to build BIDMach from the git repo in order to test out the Word2Vec model. I'm unable to compile BIDMach without recompiling BIDMat.
Daniel
I downloaded the bigger bundle of BIDMat from http://bid2.berkeley.edu/bid-data-project/download/ and extracted it. But when I run ./bidmat from there, I get:
"2 CUDA devices found, CUDA version 7.0
Something went wrong while loading BIDMat CUDA library"
There are a k20 and a k40 on the machine and I can see libbidmatcuda-linux-x86_64.so in the lib directory.
I realize I should buy MKL, but I'm only planning to run on CUDA, so MKL should not be required, correct?
Make of course fails if the MKL library is not available.
Is there a configure file for that particular instance?
Thanks for the software though. The packages are excellent!
Accessing an item in an IMat by index works fine in the interpreter, but the same code produces errors when compiled. The same happens when accessing .data. The fix is to use .data0, but this seems to be an unnecessarily roundabout way to do it.
This seems to be the case for IMats created in a class method. Note that the IMat will definitely have a size > 0.
val hourMat: SMat = load(filePathStr, "X")
val (rowIndices, colIndices) = find2(hourMat)
val bb1 = rowIndices(0) // Causes error.
val bb2 = rowIndices.apply(0) // Causes error.
val bb3 = rowIndices.data // Causes error.
val rowArray = rowIndices.data0 // No error.
val matEx: IMat = new IMat(3,1,Array(1,2,3))
matEx(0) // Causes error.
matEx.apply(0) // Causes error.
matEx.data(0) // Causes error.
matEx.data0(0) // No error.
On the other hand, doing the same in main() does not cause any trouble, so this does not seem to be a problem with find2 or with the creation of the IMat.
Finally, this is the error message from the compiler:
exception when typing $anonfun.this.$outer .row().update$mcI$sp($anonfun.this.$outer .numberOfTokenInOneBatch(), $anonfun.this.$outer .monthlyDict().apply(token))
overloaded method value update$mcI$sp with alternatives:
(a: BIDMat.IMat,b: Int)Int
(im: BIDMat.IMat,b: BIDMat.DenseMat)BIDMat.DenseMat
(i0: Int,v: Int)Int
cannot be applied to (Int, java.lang.Object) in file tt.scala
scala.tools.nsc.symtab.Types$TypeError: overloaded method value update$mcI$sp with alternatives:
(a: BIDMat.IMat,b: Int)Int
(im: BIDMat.IMat,b: BIDMat.DenseMat)BIDMat.DenseMat
(i0: Int,v: Int)Int
cannot be applied to (Int, java.lang.Object)
The CUDA code was recently refactored in this commit: 77fdb70
I tested the compilation process on my Mac 10.9 laptop and on stout (Linux), and got similar errors. These were the errors I got after doing git pull origin master, then going into the jni/src directory, then doing make clean; ./configure; make. It does not appear to be a major issue, since I think it can be resolved with some simple renaming/updates. For instance, I think the apply_binop definitions in MatKernel.hpp can be transferred to MatKernelD.hpp, with a corresponding change in the .cu files. Note that the BIDMat_CUMAT.cpp file compiles just fine.
Mac:
dhcp-46-165:src danielseita$ ./configure
Creating config for apple x86_64
dhcp-46-165:src danielseita$ make
g++ -fPIC -c -O2 -g -DNDEBUG -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include/darwin -I/Users/danielseita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMAT.cpp
g++ -fPIC -c -O2 -g -DNDEBUG -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include/darwin -I/Users/danielseita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMATD.cpp
BIDMat_CUMATD.cpp:63:12: error: use of undeclared identifier 'apply_binop'
return apply_binop(nativeA, Anrows, Ancols, nativeB, Bnrows, Bncols, nativeC, opn);
^
BIDMat_CUMATD.cpp:75:12: error: use of undeclared identifier 'sdopcol'
return sdopcol(nrows, ncols, nnz, A, Air, B, len, opn);
^
BIDMat_CUMATD.cpp:86:12: error: use of undeclared identifier 'sdoprow'
return sdoprow(nrows, ncols, nnz, A, Aic, B, len, opn);
^
BIDMat_CUMATD.cpp:153:12: error: use of undeclared identifier 'apply_gfun'
return apply_gfun(nativeA, nativeB, N, opn);
^
BIDMat_CUMATD.cpp:163:12: error: use of undeclared identifier 'apply_gfun2'
return apply_gfun2(nativeA, nativeB, nativeC, N, opn);
^
5 errors generated.
Linux:
[seita@stout src]$ ./configure
Creating config for linux x86_64
[seita@stout src]$ make
g++ -fPIC -c -O2 -DNDEBUG -I/usr/java/default/include -I/usr/java/default/include/linux -I/home/seita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMAT.cpp
g++ -fPIC -c -O2 -DNDEBUG -I/usr/java/default/include -I/usr/java/default/include/linux -I/home/seita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMATD.cpp
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applyop(JNIEnv*, _jobject*, _jobject*, jint, jint, _jobject*, jint, jint, _jobject*, jint)’:
BIDMat_CUMATD.cpp:63: error: ‘apply_binop’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_sdopcol(JNIEnv*, _jobject*, jint, jint, jint, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:75: error: ‘sdopcol’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_sdoprow(JNIEnv*, _jobject*, jint, jint, jint, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:86: error: ‘sdoprow’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applygfun(JNIEnv*, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:153: error: ‘apply_gfun’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applygfun2(JNIEnv*, _jobject*, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:163: error: ‘apply_gfun2’ was not declared in this scope
make: *** [BIDMat_CUMATD.o] Error 1
Line 167: out.jc(i) = nnzx
Should be: out.jc(i) = nnzx + ioff
Reason: Although all the data is transferred over to Array[Byte] data, Array[Int] jc has its last value set inconsistently with its other values. Printing, or converting back to CSMat, then loses the last byte/character.
Example:
scala> var a = new CSMat(1,1,Array("abcd"))
a: BIDMat.CSMat = abcd
scala> var r = BMat(a)
r: BIDMat.BMat = abc ...
scala> r.data
res2: Array[Byte] = Array(97, 98, 99, 100)
scala> r.jc
res3: Array[Int] = Array(1, 4)
./bidmat
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
I have tried with Scala 2.11.2 and 2.10.4.
I have Java version 1.7.0_80.
I'm running into an error in GMat's allocation during a matrix multiply operation run in the bidmach/Scala REPL:
[user1 ~]$ bidmach
Loading /homes/user1/git/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
1 CUDA device found, CUDA version 7.0
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val f = grand(256,40000)
f: BIDMat.GMat =
0.88383 0.36261 0.67829 0.41245 0.98139 0.49479 0.73682 0.072556 0.28022 0.86777 0.63998 0.60432 0.34435 0.63388 0.34711 0.37487 0.027557 0.029789 0.74355 0.78470 0.63183 0.019272 0.48245...
0.81384 0.74068 0.59507 0.056584 0.065755 0.73149 0.94198 0.16632 0.38067 0.63685 0.62824 0.063965 0.88211 0.41027 0.60684 0.16312 0.76326 0.054253 0.91629 0.22353 0.57114 0.19324 0.82873...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val g = grand(256,80000)
g: BIDMat.GMat =
0.38165 0.0094942 0.33977 0.52591 0.36995 0.93983 0.53091 0.21534 0.49424 0.17834 0.31689 0.11909 0.30771 0.26260 0.69630 0.36673 0.47742 0.33469 0.032663 0.82122...
0.28144 0.34397 0.38882 0.94709 0.77737 0.72754 0.060895 0.39984 0.80299 0.10944 0.33119 0.64009 0.10358 0.35609 0.54367 0.44958 0.17401 0.52444 0.010265 0.95499...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val h = f ^* g
java.lang.IllegalArgumentException: Negative capacity: -84901888
at java.nio.Buffer.<init>(Buffer.java:191)
at java.nio.ByteBuffer.<init>(ByteBuffer.java:276)
at java.nio.ByteBuffer.<init>(ByteBuffer.java:284)
at java.nio.MappedByteBuffer.<init>(MappedByteBuffer.java:89)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:162)
at jcuda.runtime.JCuda.cudaMallocHostNative(Native Method)
at jcuda.runtime.JCuda.cudaMallocHost(JCuda.java:3902)
at BIDMat.GMat$.apply(GMat.scala:1661)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:2411)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:2445)
at BIDMat.GMat.GTMult(GMat.scala:658)
at BIDMat.GMat.$up$times(GMat.scala:1141)
... 33 elided
But if the column dimensions are different, it works:
scala> val f = grand(256,38970)
f: BIDMat.GMat =
0.88383 0.36261 0.67829 0.41245 0.98139 0.49479 0.73682 0.072556 0.28022 0.86777 0.63998 0.60432 0.34435 0.63388 0.34711 0.37487 0.027557 0.029789 0.74355 0.78470 0.63183 0.019272 0.48245...
0.81384 0.74068 0.59507 0.056584 0.065755 0.73149 0.94198 0.16632 0.38067 0.63685 0.62824 0.063965 0.88211 0.41027 0.60684 0.16312 0.76326 0.054253 0.91629 0.22353 0.57114 0.19324 0.82873...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val g = grand(256,176965)
g: BIDMat.GMat =
0.34660 0.30266 0.70115 0.53805 0.47667 0.99830 0.72554 0.33751 0.22702 0.13273 0.40718 0.33229 0.30885 0.42985 0.74604 0.45333 0.60975 0.50820 0.68527 0.97660 0.59849 0.29411 0.39118...
0.85486 0.14368 0.16848 0.82664 0.89762 0.60588 0.30885 0.46973 0.11753 0.47522 0.95762 0.047029 0.28322 0.90902 0.11110 0.60526 0.70672 0.17768 0.51984 0.25233 0.64633 0.097652 0.22967...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val h = f ^* g
h: BIDMat.GMat =
64.222 64.623 67.955 61.407 62.221 61.621 62.616 65.751 68.504 64.667 65.463 62.693 70.220 69.879 65.115 68.670 70.400 67.562 68.949 62.765 62.149 68.763 66.013 67.540 65.336 66.513 64.797 66.269...
66.547 63.118 70.510 65.141 64.089 61.146 66.066 67.357 68.428 67.383 64.990 62.570 69.941 71.695 66.835 66.209 70.526 66.446 70.568 63.454 62.166 65.496 65.838 67.402 65.898 67.939 63.881 65.722...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
The relevant code is in GMat.scala, line 1661:
1654 def apply(nr:Int, nc:Int):GMat = {
1655 val retv = new GMat(nr, nc, new Pointer(), 1L*nr*nc)
1656 if (Mat.debugMem) {
1657 println("GMat %d %d, %d %f" format (nr, nc, SciFunctions.getGPU, SciFunctions.GPUmem._1))
1658 if (nr*nc > Mat.debugMemThreshold) throw new RuntimeException("GMat alloc too large");
1659 }
1660 var err = if (1L*nr*nc*Sizeof.FLOAT > Mat.hostAllocSize) {
1661 cudaMallocHost(retv.data, 1L*nr*nc*Sizeof.FLOAT);
1662 } else {
1663 cudaMalloc(retv.data, 1L*nr*nc*Sizeof.FLOAT);
1664 }
1665 cudaDeviceSynchronize;
1666 if (err == 0) err = cudaGetLastError();
1667 if (err != 0) throw new RuntimeException("CUDA alloc failed " + cudaGetErrorString(err));
1668 retv
1669 }
Notice that:
scala> 1L*40000*80000*4
res2: Long = 12800000000
scala> 1L*(40000*80000*4)
res3: Long = -84901888
The latter is the value that cudaMallocHost() was complaining about.
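Presumably the Long size is truncated back to an Int somewhere in the host-allocation path. The wrap-around itself is easy to reproduce, and forcing Long arithmetic before the multiply avoids it (`floatBytes` below stands in for Sizeof.FLOAT):

```scala
// Int multiplication wraps silently; multiplying by 1L first promotes the
// whole expression to Long before any overflow can occur.
val floatBytes = 4
def allocBytes(nr: Int, nc: Int): Long = 1L * nr * nc * floatBytes

allocBytes(40000, 80000)    // 12800000000L, the intended allocation size
40000 * 80000 * floatBytes  // -84901888, the overflowed Int from the trace
```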
The version of the code I'm using is:
[user1 BIDMat]$ git log
commit 5c4f7ed945d7d7aac34f8fa2544258e40a7c1568
Author: John Canny <[email protected]>
Date: Tue Nov 17 16:26:12 2015 -0800
added HDFSIO.scala
Grepping for hash instances in the project:
huitseeker@jollyjumper:~/tmp/BIDMat(master)$ ag Murmur -G'.*\.scala'
src/main/scala/BIDMat/DMat.scala
6:import scala.util.hashing.MurmurHash3
51: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/FMat.scala
7:import scala.util.hashing.MurmurHash3
49: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GDMat.scala
16:import scala.util.hashing.MurmurHash3
46: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GIMat.scala
13:import scala.util.hashing.MurmurHash3
45: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GLMat.scala
10:import scala.util.hashing.MurmurHash3
49: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GMat.scala
13:import scala.util.hashing.MurmurHash3
43: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/IMat.scala
5:import scala.util.hashing.MurmurHash3
42: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/LMat.scala
5:import scala.util.hashing.MurmurHash3
42: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/ND.scala
11:import edu.berkeley.bid.MurmurHash3
199: MurmurHash3.MurmurHash3_x64_64(inds.map(_.GUID), 0x3142341)
203: MurmurHash3.MurmurHash3_x64_64(inds.map(_.toLong), 0x3142341)
huitseeker@jollyjumper:~/tmp/BIDMat(master)$
I've found a lot of hashes on 32-bit Ints. Have you considered replacing MurmurHash3 with the faster xxHash?
https://github.com/Cyan4973/xxHash
(as for the one usage in its 64-bit version, note the existence of XXH64)
Version: master branch, after commit 97cda3d on Feb 6 2015.
The code for MatFunctions.scala has:
def getdiag(a:DMat) = DMat(a.getdiag)
def getdiag(a:FMat) = FMat(a.getdiag)
def getdiag(a:IMat) = IMat(a.getdiag)
def getdiag(a:CMat) = CMat(a.getdiag)
def getdiag(a:GMat) = a.mkdiag
I am not sure why there's a mkdiag for the GPU case. I can get around this by calling it from the GMat code, i.e., a.getdiag() if a is a GMat.
Some examples:
scala> val a = 1\2\3 on 4\5\6 on 7\8\9
a: BIDMat.IMat =
1 2 3
4 5 6
7 8 9
scala> getdiag(a)
res8: BIDMat.IMat =
1
5
9
scala> getdiag(BIDMat.GMat(a))
java.lang.RuntimeException: mkdiag requires a vector argument, but dims= 3 3
at BIDMat.GMat.mkdiag(GMat.scala:636)
at BIDMat.MatFunctions$.getdiag(MatFunctions.scala:1437)
... 33 elided
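For reference, the intended semantics of the two operations can be sketched with plain arrays (assumed semantics with hypothetical helpers, not BIDMat's actual API): getdiag extracts the diagonal of a square matrix, while mkdiag builds a diagonal matrix from a vector, which is why passing a 3x3 GMat to mkdiag fails.

```scala
// getdiag: extract the diagonal of a square matrix.
def getdiag(m: Array[Array[Int]]): Array[Int] =
  m.indices.map(i => m(i)(i)).toArray

// mkdiag: build a diagonal matrix from a vector.
def mkdiag(v: Array[Int]): Array[Array[Int]] =
  Array.tabulate(v.length, v.length)((i, j) => if (i == j) v(i) else 0)

val a = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
getdiag(a)             // Array(1, 5, 9): what the GMat overload should return
mkdiag(Array(1, 5, 9)) // 3x3 diagonal matrix: expects a vector, hence the error
```

So the GMat case in MatFunctions.scala presumably wants a.getdiag rather than a.mkdiag.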
Version: 9b1557e
I'm not sure whether this is a serious bug or a deliberate design choice, but when we call a multiplication with a generic matrix on the left-hand side, BIDMat dispatches to the multiplication operator in the Mat.scala class, which throws an "operator xxx not implemented for ..." error.
The key is that the generic matrix has a compile-time type of Mat. Even if its runtime type is FMat, SMat, etc., the multiplication still resolves against the Mat class.
In some code I'm writing, for instance, I have either GPU or CPU mode to consider, so my matrices a and b have type Mat to be generic. Then I set them equal to something:
val a:Mat = null
val b:Mat = null
// Initialize a to be either a GSMat or an SMat, depending on GPU/CPU mode
// Initialize b to be either a GMat or FMat, depending on GPU/CPU mode
a * b
The a * b line then fails in CPU mode with the error given in the title, with XMat = SMat and YMat = FMat.
I can get this code to work by wrapping the operands, as in SMat(a) * FMat(b), but I wanted to check in with you, since this is potentially confusing: we are encouraged to use Mat to stay generic, yet we must repeatedly check cases and cast to SMat(a), FMat(b), etc. What are your thoughts?
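The underlying mechanism is Scala's static overload resolution: the overload is chosen by the declared type of the argument, so a Mat-typed operand always lands on the generic path. A toy sketch with stand-in classes (not BIDMat's types):

```scala
class Dense
class Sparse {
  // Two overloads, mimicking a specialized and a generic operator.
  def *(b: Dense): String  = "specialized sparse * dense multiply"
  def *(b: AnyRef): String = "generic path: pattern-match on runtime types"
}

val s = new Sparse
val b: AnyRef = new Dense  // declared generically, like val b: Mat
s * b                      // AnyRef overload: static type decides, not runtime
s * new Dense              // specialized overload: static type is Dense
```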
Some command line examples:
scala> val a:Mat=sprand(1000,1000,0.1)
a: BIDMat.Mat =
( 5, 0) 0.52206
( 6, 0) 0.71377
( 13, 0) 0.62309
( 28, 0) 0.20338
( 54, 0) 0.24161
( 61, 0) 0.38814
( 63, 0) 0.67045
( 74, 0) 0.33437
... ... ...
scala> val b:Mat=sprand(1000,1000,0.1)
b: BIDMat.Mat =
( 2, 0) 0.32644
( 16, 0) 0.78167
( 18, 0) 0.35468
( 41, 0) 0.16965
( 53, 0) 0.60286
( 56, 0) 0.27231
( 64, 0) 0.70402
( 65, 0) 0.20928
... ... ...
scala> a*b
res2: BIDMat.Mat =
( 0, 0) 2.2325
( 1, 0) 2.6090
( 2, 0) 1.5920
( 3, 0) 2.0934
( 4, 0) 2.0458
( 5, 0) 3.6955
( 6, 0) 1.9645
( 7, 0) 1.3880
... ... ...
scala> a*rand(1000,1000)
java.lang.RuntimeException: operator * not implemented for SMat and FMat
at BIDMat.Mop$class.notImplemented(Operators.scala:327)
at BIDMat.Mop_Times$.notImplemented(Operators.scala:382)
at BIDMat.Mop$class.sop(Operators.scala:28)
at BIDMat.Mop_Times$.sop(Operators.scala:382)
at BIDMat.Mop$class.op(Operators.scala:143)
at BIDMat.Mop_Times$.op(Operators.scala:382)
at BIDMat.SMat.$times(SMat.scala:531)
... 33 elided
scala> rand(1000,1000)*a
res4: BIDMat.Mat =
31.505 24.170 20.940 18.018 27.939 22.453 25.225 24.584 22.139 24.005 25.443 28.746 20.686 23.867 19.050 29.900 20.043 24.953 27.049 25.659 26.288 28.829 26.924 26.171...
29.118 26.772 21.919 21.094 26.713 19.681 24.325 23.950 24.116 22.451 22.003 26.480 22.981 22.656 18.127 29.811 19.218 23.364 22.904 23.538 23.502 24.597 27.557 25.794...
28.578 25.883 20.647 20.381 24.435 17.622 22.413 23.946 23.355 21.007 24.580 28.317 20.682 24.526 18.536 26.255 17.030 22.624 26.466 24.642 25.480 27.498 24.700 26.631...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val c:Mat = rand(1000,1000)
c: BIDMat.Mat =
0.91933 0.054223 0.0029891 0.97654 0.045690 0.60623 0.45060 0.91492 0.00043926 0.77734 0.033706 0.27043 0.54978 0.25489 0.47594 0.35749...
0.54238 0.62099 0.89218 0.71610 0.86469 0.69615 0.28406 0.93660 0.15506 0.091589 0.48099 0.82811 0.79517 0.67379 0.15774 0.88002...
0.82720 0.88533 0.27555 0.88446 0.71065 0.61293 0.14963 0.053130 0.36748 0.25745 0.21042 0.96924 0.70845 0.77234 0.042210 0.76511...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> a*c
java.lang.RuntimeException: operator * not implemented for SMat and FMat
at BIDMat.Mop$class.notImplemented(Operators.scala:327)
at BIDMat.Mop_Times$.notImplemented(Operators.scala:382)
at BIDMat.Mop$class.sop(Operators.scala:28)
at BIDMat.Mop_Times$.sop(Operators.scala:382)
at BIDMat.Mop$class.op(Operators.scala:143)
at BIDMat.Mop_Times$.op(Operators.scala:382)
at BIDMat.SMat.$times(SMat.scala:531)
... 33 elided
The GSMat class contains no transpose method. Something like this might work:
override def t = {
val out = GSMat.newOrCheckGSMat(ncols, nrows, nnz0, null, GUID, "t".##)
CUMATD.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
cudaDeviceSynchronize()
out
}
(Note the extra nnz0 term to include when checking the cache for GSMats).
However, I did a few tests and got some odd behavior when multiplying GSMats with other matrices (e.g., GSMat(a) * GMat(mkdiag(ones(3,1)))), so I wanted to check whether GSMats are really supposed to have transposes. It seems they should, since SMats do.
The mean, sum, and transpose operators do not support GMat so far.
I am implementing a sum method for TMat's as it appears frequently and seems useful. The plan is to iterate over tiles and to apply the sum individually on them, and then aggregate the result in a single vector, likely an FMat or GMat. I am tempted to allocate sum vectors of size matching the result vector for each of the constituent tiles, which is a waste of memory. To be memory efficient, this method could sum the (sum of the) tiles with the running result vector, in place. The problem is tracking the indices within the result vector - it is much easier to just sum vectors of a uniform size - and I'm not sure if the existing vector sum methods support smart sub-indexing. Thus I propose a sum with offset function - it would be quite similar to tileMult, in spirit.
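The offset idea can be sketched with plain arrays. The Tile shape below (an x offset, dimensions, and column-major data) is a simplifying assumption for illustration, not TMat's actual layout or API:

```scala
// A tile occupying columns [x, x + ncols) of the full matrix, column-major.
case class Tile(x: Int, nrows: Int, ncols: Int, data: Array[Float])

// "Sum with offset": fold each tile's column sums into the right slice of a
// single shared result vector, with no per-tile temporary vectors.
def sumTilesInto(tiles: Seq[Tile], result: Array[Float]): Unit =
  for (t <- tiles; j <- 0 until t.ncols) {
    var s = 0f
    var i = 0
    while (i < t.nrows) { s += t.data(j * t.nrows + i); i += 1 }
    result(t.x + j) += s  // accumulate in place at the tile's column offset
  }

// Two overlapping 2x2 tiles covering columns 0-1 and 1-2:
val t1 = Tile(0, 2, 2, Array(1f, 2f, 3f, 4f)) // column sums 3, 7
val t2 = Tile(1, 2, 2, Array(1f, 1f, 1f, 1f)) // column sums 2, 2
val res = new Array[Float](3)
sumTilesInto(Seq(t1, t2), res)                // res = Array(3, 9, 2)
```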
Given that the omat parameter is always generic anyway, it's likely we could write a catch-all newOrCheck method in Mat and override it individually. The reason I mention it is that implementing caching for generic matrix methods could end up requiring quite a few pattern matches. It's manageable at the moment (just FMat's, SMat's, GMat's and GSMat's) but could grow in complexity.
In practice the methods pattern match on the omat type inside the function body, so it should be 'safe' to make the method generic.
https://github.com/BIDData/BIDMat/blob/master/src/main/scala/BIDMat/DMat.scala#L1459-L1465
Something might be lost vis-a-vis (e.g.) reflection, though. I still don't understand Scala's type erasure semantics very well.
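A catch-all along those lines might look like the following sketch. The case classes are stand-ins for BIDMat's hierarchy, and the real newOrCheck methods also consult the GUID-keyed cache, which is omitted here:

```scala
// Stand-ins for BIDMat's matrix types.
sealed trait Mat { def nrows: Int; def ncols: Int }
case class FMat(nrows: Int, ncols: Int) extends Mat
case class SMat(nrows: Int, ncols: Int) extends Mat

// One generic entry point: reuse omat when its runtime type and shape fit,
// otherwise allocate a fresh matrix of the matching type.
def newOrCheck(nr: Int, nc: Int, omat: Mat): Mat = omat match {
  case m: FMat if m.nrows == nr && m.ncols == nc => m
  case _: FMat                                   => FMat(nr, nc)
  case m: SMat if m.nrows == nr && m.ncols == nc => m
  case _: SMat                                   => SMat(nr, nc)
  case null                                      => FMat(nr, nc) // default dense
}
```

The pattern matches grow linearly with the number of matrix types, which is exactly the complexity concern above; subclass overrides would keep each match local to its own class.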
scala> val a = rand(3,4)
a: BIDMat.FMat =
0.40857 0.65735 0.18252 0.21719
0.82198 0.38203 0.57885 0.25747
0.13365 0.036090 0.57698 0.28741
scala> val b = rand(3,4)
b: BIDMat.FMat =
0.44102 0.63252 0.38195 0.052384
0.99267 0.053118 0.0094332 0.070619
0.34677 0.82116 0.74243 0.42345
scala> a *^ b
java.lang.ArrayIndexOutOfBoundsException: 9
at BIDMat.FMat.fDMultTHelper(FMat.scala:752)
at BIDMat.FMat.multT(FMat.scala:778)
at BIDMat.FMat.$times$up(FMat.scala:1253)
... 33 elided
I can't tell if cummaxByKey is implemented for GPU matrices. If it's not, I think it's just missing a few methods in one of the .cu files in BIDMat's GPU code directory (which was recently refactored in commit f9f5720).
Example on my Mac 10.9 laptop:
I do git pull, then cd BIDMat/jni/src; make clean; ./configure; make; make install to get the CUDA libraries. Then I do ./sbt package. The following code shows the GPU version failing (while the CPU version works):
scala> val a = rand(3,10)
a: BIDMat.FMat =
0.17454 0.72285 0.33957 0.23828 0.47790 0.41824 0.37690 0.083308 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.76176 0.66493 0.75904 0.0053793 0.65965
0.16096 0.34379 0.048064 0.23330 0.32797 0.86558 0.81054 0.53446 0.91294 0.60398
scala> val ga = grand(3,10)
ga: BIDMat.GMat =
0.78258 0.081876 0.44771 0.53182 0.63474 0.78777 0.73650 0.69435 0.39765 0.44639
0.53618 0.083131 0.098099 0.34346 0.027273 0.90388 0.29731 0.14853 0.83988 0.091234
0.14477 0.36829 0.64452 0.77051 0.31326 0.89641 0.67069 0.90067 0.67521 0.99995
scala> val keys = zeros(3,10)
keys: BIDMat.FMat =
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
scala> val gkeys = gzeros(3,10)
gkeys: BIDMat.GMat =
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
scala> cummaxByKey(a,keys)
res9: BIDMat.FMat =
0.17454 0.72285 0.33957 0.23828 0.47790 0.41824 0.37690 0.083308 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.76176 0.66493 0.75904 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.86558 0.81054 0.75904 0.91294 0.91638
scala> cummaxByKey(ga,gkeys)
java.lang.UnsatisfiedLinkError: edu.berkeley.bid.CUMAT.cummaxByKeyFL(Ljcuda/Pointer;Ljcuda/Pointer;Ljcuda/Pointer;J)I
at edu.berkeley.bid.CUMAT.cummaxByKeyFL(Native Method)
at BIDMat.GMat.cummaxByKey(GMat.scala:974)
at BIDMat.GMat.cummaxByKey(GMat.scala:1004)
at BIDMat.SciFunctions$.cummaxByKey(SciFunctions.scala:1070)
... 33 elided
Thanks.
Steps:
Partial solution: Add these lines to build.sbt:
libraryDependencies += "org.apache.commons" % "commons-math3" % "3.0"
libraryDependencies += "net.jpountz.lz4" % "lz4" % "1.2.0"
libraryDependencies += "org.scala-saddle" % "jhdf5" % "2.9"
However, after adding these, the build still fails, so either the repo contains unbuildable code or I selected the wrong version(s) of these libraries.