biddata / bidmat Goto Github PK
A CPU and GPU-accelerated matrix library for data mining
License: BSD 3-Clause "New" or "Revised" License
Hi, do you have a Google Group for the BidMach and BidMat projects?
@jcanny I want to leverage GPU resources in Spark, for example using the GPU for matrix computation.
I am thinking about how to configure BIDMat as the GPU backend for Spark.
I use Maven; how should I add it to my pom.xml? (Attached.)
<prerequisites>
<maven>3.0.4</maven>
</prerequisites>
<mailingLists>
<mailingList>
<name>Dev Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
<mailingList>
<name>User Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
<mailingList>
<name>Commits Mailing List</name>
<post>[email protected]</post>
<subscribe>[email protected]</subscribe>
<unsubscribe>[email protected]</unsubscribe>
</mailingList>
</mailingLists>
<modules>
<module>core</module>
<module>bagel</module>
<module>graphx</module>
<module>mllib</module>
<module>tools</module>
<module>streaming</module>
<module>sql/catalyst</module>
<module>sql/core</module>
<module>sql/hive</module>
<module>repl</module>
<module>assembly</module>
<module>external/twitter</module>
<module>external/kafka</module>
<module>external/flume</module>
<module>external/flume-sink</module>
<module>external/zeromq</module>
<module>external/mqtt</module>
<module>examples</module>
</modules>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.6</java.version>
<sbt.project.name>spark</sbt.project.name>
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
<scala.macros.version>2.0.1</scala.macros.version>
<mesos.version>0.18.1</mesos.version>
<mesos.classifier>shaded-protobuf</mesos.classifier>
<akka.group>org.spark-project.akka</akka.group>
<akka.version>2.2.3-shaded-protobuf</akka.version>
<slf4j.version>1.7.5</slf4j.version>
<log4j.version>1.2.17</log4j.version>
<hadoop.version>1.0.4</hadoop.version>
<protobuf.version>2.4.1</protobuf.version>
<yarn.version>${hadoop.version}</yarn.version>
<hbase.version>0.94.6</hbase.version>
<flume.version>1.4.0</flume.version>
<zookeeper.version>3.4.5</zookeeper.version>
<hive.version>0.12.0</hive.version>
<parquet.version>1.4.3</parquet.version>
<jblas.version>1.2.3</jblas.version>
<jetty.version>8.1.14.v20131031</jetty.version>
<chill.version>0.3.6</chill.version>
<codahale.metrics.version>3.0.0</codahale.metrics.version>
<avro.version>1.7.6</avro.version>
<jets3t.version>0.7.1</jets3t.version>
<aws.java.sdk.version>1.8.3</aws.java.sdk.version>
<aws.kinesis.client.version>1.1.0</aws.kinesis.client.version>
<PermGen>64m</PermGen>
<MaxPermGen>512m</MaxPermGen>
</properties>
<repositories>
<repository>
<id>central</id>
<!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
<name>Maven Repository</name>
<url>https://repo1.maven.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>apache-repo</id>
<name>Apache Repository</name>
<url>https://repository.apache.org/content/repositories/releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>jboss-repo</id>
<name>JBoss Repository</name>
<url>https://repository.jboss.org/nexus/content/repositories/releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>mqtt-repo</id>
<name>MQTT Repository</name>
<url>https://repo.eclipse.org/content/repositories/paho-releases</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>cloudera-repo</id>
<name>Cloudera Repository</name>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>mapr-repo</id>
<name>MapR Repository</name>
<url>http://repository.mapr.com/maven</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>spring-releases</id>
<name>Spring Release Repository</name>
<url>https://repo.spring.io/libs-release</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>central</id>
<url>https://repo1.maven.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-security</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-plus</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
<version>${jetty.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>14.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.5</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<version>3.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
<version>1.3.9</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jcl-over-slf4j</artifactId>
<version>${slf4j.version}</version>
<!-- <scope>runtime</scope> --> <!-- more correct, but scalac 2.10.3 doesn't like it -->
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>com.ning</groupId>
<artifactId>compress-lzf</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.xerial.snappy</groupId>
<artifactId>snappy-java</artifactId>
<version>1.0.5.3</version>
</dependency>
<dependency>
<groupId>net.jpountz.lz4</groupId>
<artifactId>lz4</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>com.clearspring.analytics</groupId>
<artifactId>stream</artifactId>
<version>2.7.0</version>
<exclusions>
<!-- Only HyperLogLogPlus is used, which doesn't depend on fastutil -->
<exclusion>
<groupId>it.unimi.dsi</groupId>
<artifactId>fastutil</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- In theory we need not directly depend on protobuf since Spark does not directly
use it. However, when building with Hadoop/YARN 2.2 Maven doesn't correctly bump
the protobuf version up from the one Mesos gives. For now we include this variable
to explicitly bump the version when building with YARN. It would be nice to figure
out why Maven can't resolve this correctly (like SBT does). -->
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>${protobuf.version}</version>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill_${scala.binary.version}</artifactId>
<version>${chill.version}</version>
<exclusions>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm-commons</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill-java</artifactId>
<version>${chill.version}</version>
<exclusions>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm-commons</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-actor_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-remote_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-slf4j_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>${akka.group}</groupId>
<artifactId>akka-testkit_${scala.binary.version}</artifactId>
<version>${akka.version}</version>
</dependency>
<dependency>
<groupId>colt</groupId>
<artifactId>colt</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.mesos</groupId>
<artifactId>mesos</artifactId>
<version>${mesos.version}</version>
<classifier>${mesos.classifier}</classifier>
<exclusions>
<exclusion>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>commons-net</groupId>
<artifactId>commons-net</artifactId>
<version>2.2</version>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.0.23.Final</version>
</dependency>
<dependency>
<groupId>org.apache.derby</groupId>
<artifactId>derby</artifactId>
<version>10.4.2.0</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-core</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-jvm</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-json</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-ganglia</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>com.codahale.metrics</groupId>
<artifactId>metrics-graphite</artifactId>
<version>${codahale.metrics.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>jline</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-actors</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scalap</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>2.1.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.easymock</groupId>
<artifactId>easymockclassextension</artifactId>
<version>3.1</version>
<scope>test</scope>
</dependency>
<!-- Needed by cglib which is needed by easymock. -->
<dependency>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
<version>3.3.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.9.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<version>1.11.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.novocode</groupId>
<artifactId>junit-interface</artifactId>
<version>0.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.4.0</version>
<exclusions>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api-2.5</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>${avro.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-ipc</artifactId>
<version>${avro.version}</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.velocity</groupId>
<artifactId>velocity</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>${avro.version}</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
</exclusion>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.velocity</groupId>
<artifactId>velocity</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- See SPARK-1556 for info on this dependency: -->
<dependency>
<groupId>net.java.dev.jets3t</groupId>
<artifactId>jets3t</artifactId>
<version>${jets3t.version}</version>
<exclusions>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-web-proxy</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>${yarn.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
</exclusion>
<exclusion>
<groupId>org.jboss.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<!-- Matches the version of jackson-core-asl pulled in by avro -->
<groupId>org.codehaus.jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
<version>1.8.8</version>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
<version>1.3.1</version>
<executions>
<execution>
<id>enforce-versions</id>
<goals>
<goal>enforce</goal>
</goals>
<configuration>
<rules>
<requireMavenVersion>
<version>3.0.4</version>
</requireMavenVersion>
<requireJavaVersion>
<version>${java.version}</version>
</requireJavaVersion>
</rules>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<version>1.8</version>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<id>scala-compile-first</id>
<phase>process-resources</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
<execution>
<id>scala-test-compile-first</id>
<phase>process-test-resources</phase>
<goals>
<goal>testCompile</goal>
</goals>
</execution>
<execution>
<id>attach-scaladocs</id>
<phase>verify</phase>
<goals>
<goal>doc-jar</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<recompileMode>incremental</recompileMode>
<useZincServer>true</useZincServer>
<args>
<arg>-unchecked</arg>
<arg>-deprecation</arg>
<arg>-feature</arg>
<arg>-language:postfixOps</arg>
</args>
<jvmArgs>
<jvmArg>-Xms1024m</jvmArg>
<jvmArg>-Xmx1024m</jvmArg>
<jvmArg>-XX:PermSize=${PermGen}</jvmArg>
<jvmArg>-XX:MaxPermSize=${MaxPermGen}</jvmArg>
</jvmArgs>
<javacArgs>
<javacArg>-source</javacArg>
<javacArg>${java.version}</javacArg>
<javacArg>-target</javacArg>
<javacArg>${java.version}</javacArg>
</javacArgs>
<!-- The following plugin is required to use quasiquotes in Scala 2.10 and is used
by Spark SQL for code generation. -->
<compilerPlugins>
<compilerPlugin>
<groupId>org.scalamacros</groupId>
<artifactId>paradise_${scala.version}</artifactId>
<version>${scala.macros.version}</version>
</compilerPlugin>
</compilerPlugins>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<encoding>UTF-8</encoding>
<maxmem>1024m</maxmem>
<fork>true</fork>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.17</version>
<configuration>
<!-- Uses scalatest instead -->
<skipTests>true</skipTests>
</configuration>
</plugin>
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0-RC2</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>${project.build.directory}/SparkTestSuite.txt</filereports>
<argLine>-Xmx3g -XX:MaxPermSize=${MaxPermGen} -XX:ReservedCodeCacheSize=512m</argLine>
<stderr />
<systemProperties>
<java.awt.headless>true</java.awt.headless>
<spark.test.home>${session.executionRootDirectory}</spark.test.home>
<spark.testing>1</spark.testing>
</systemProperties>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.7</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.2</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>2.2.1</version>
<configuration>
<attach>true</attach>
</configuration>
<executions>
<execution>
<id>create-source-jar</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-clean-plugin</artifactId>
<version>2.5</version>
<configuration>
<filesets>
<fileset>
<directory>work</directory>
</fileset>
<fileset>
<directory>checkpoint</directory>
</fileset>
</filesets>
</configuration>
</plugin>
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<executions>
<execution>
<id>add-scala-sources</id>
<phase>generate-sources</phase>
<goals>
<goal>add-source</goal>
</goals>
<configuration>
<sources>
<source>src/main/scala</source>
</sources>
</configuration>
</execution>
<execution>
<id>add-scala-test-sources</id>
<phase>generate-test-sources</phase>
<goals>
<goal>add-test-source</goal>
</goals>
<configuration>
<sources>
<source>src/test/scala</source>
</sources>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.scalastyle</groupId>
<artifactId>scalastyle-maven-plugin</artifactId>
<version>0.4.0</version>
<configuration>
<verbose>false</verbose>
<failOnViolation>true</failOnViolation>
<includeTestSourceDirectory>false</includeTestSourceDirectory>
<failOnWarning>false</failOnWarning>
<sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
<testSourceDirectory>${basedir}/src/test/scala</testSourceDirectory>
<configLocation>scalastyle-config.xml</configLocation>
<outputFile>scalastyle-output.xml</outputFile>
<outputEncoding>UTF-8</outputEncoding>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<!-- Ganglia integration is not included by default due to LGPL-licensed code -->
<profile>
<id>spark-ganglia-lgpl</id>
<modules>
<module>extras/spark-ganglia-lgpl</module>
</modules>
</profile>
<!-- Kinesis integration is not included by default due to ASL-licensed code -->
<profile>
<id>kinesis-asl</id>
<modules>
<module>extras/kinesis-asl</module>
</modules>
</profile>
<profile>
<id>java8-tests</id>
<build>
<plugins>
<!-- Needed for publishing test jars as it is needed by java8-tests -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>test-jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<modules>
<module>extras/java8-tests</module>
</modules>
</profile>
<!-- A series of build profiles where customizations for particular Hadoop releases can be made -->
<profile>
<id>hadoop-0.23</id>
<!-- SPARK-1121: Adds an explicit dependency on Avro to work around a Hadoop 0.23.X issue -->
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
<properties>
<hadoop.version>0.23.10</hadoop.version>
</properties>
</profile>
<profile>
<id>hadoop-2.2</id>
<properties>
<hadoop.version>2.2.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
</properties>
</profile>
<profile>
<id>hadoop-2.3</id>
<properties>
<hadoop.version>2.3.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
<jets3t.version>0.9.0</jets3t.version>
</properties>
</profile>
<profile>
<id>hadoop-2.4</id>
<properties>
<hadoop.version>2.4.0</hadoop.version>
<protobuf.version>2.5.0</protobuf.version>
<jets3t.version>0.9.0</jets3t.version>
</properties>
</profile>
<profile>
<id>yarn-alpha</id>
<modules>
<module>yarn</module>
</modules>
</profile>
<profile>
<id>yarn</id>
<modules>
<module>yarn</module>
</modules>
</profile>
<profile>
<id>mapr3</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>
<hadoop.version>1.0.3-mapr-3.0.3</hadoop.version>
<yarn.version>2.3.0-mapr-4.0.0-FCS</yarn.version>
<hbase.version>0.94.17-mapr-1405</hbase.version>
<zookeeper.version>3.4.5-mapr-1406</zookeeper.version>
</properties>
</profile>
<profile>
<id>mapr4</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>
<hadoop.version>2.3.0-mapr-4.0.0-FCS</hadoop.version>
<yarn.version>2.3.0-mapr-4.0.0-FCS</yarn.version>
<hbase.version>0.94.17-mapr-1405-4.0.0-FCS</hbase.version>
<zookeeper.version>3.4.5-mapr-1406</zookeeper.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.4.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>3.4.5-mapr-1406</version>
</dependency>
</dependencies>
</profile>
<!-- Build without Hadoop dependencies that are included in some runtime environments. -->
<profile>
<id>hadoop-provided</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-web-proxy</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-ipc</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>${zookeeper.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
</profile>
<profile>
<id>hive</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<modules>
<module>sql/hive-thriftserver</module>
</modules>
</profile>
</profiles>
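One way to wire BIDMat into a Maven build like the one above: since BIDMat is not published to Maven Central (as far as I know), install its jar into the local repository and then reference it as a normal dependency. The coordinates below are illustrative, not official; pick whatever groupId/artifactId you use in the `install-file` command:

```xml
<!-- Illustrative only: install the BIDMat jar locally first, e.g.
     mvn install:install-file -Dfile=BIDMat.jar -DgroupId=edu.berkeley.bid \
         -DartifactId=bidmat -Dversion=1.0.0 -Dpackaging=jar
     then reference the same (made-up) coordinates here: -->
<dependency>
  <groupId>edu.berkeley.bid</groupId>
  <artifactId>bidmat</artifactId>
  <version>1.0.0</version>
</dependency>
```

You would also need the native CUDA/JCuda libraries from BIDMat/lib on the library path at runtime; Maven only handles the JVM side.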
Currently the TParse code outputs an IMat and an SBMat for a column of strings with delimiters in the raw data file. Each row of the IMat is a (row, keyword_id)
pair, where the row number corresponds to the row number in the input data file, and keyword_id is the id of a keyword in the SBMat dictionary of unique keywords.
However, the documentation for TParse's behavior on string data (https://github.com/BIDData/BIDMach/wiki/Data-Wrangling#String_Fields) says TParse will output an SMat and an SBMat for a column of strings with delimiters, where each column of the SMat is a list of keyword_ids.
It seems there is a missing step between the current TParse output and the behavior described in the documentation, in which the IMat is converted to an SMat according to the specified encoding. Can anyone either fix the documentation or fix the code?
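For what it's worth, the missing conversion could look something like the following plain-Scala sketch (this is not the BIDMat API; names and the CSC layout are illustrative). Each input row becomes one sparse column, and the keyword ids become that column's row indices:

```scala
object TParseConvert {
  // Convert (row, keywordId) pairs -- what TParse currently emits as an IMat --
  // into CSC-style arrays: one sparse column per input row, with the keyword
  // ids as the row indices of that column. Illustrative only, not BIDMat API.
  def pairsToCSC(pairs: Array[(Int, Int)], nDocs: Int): (Array[Int], Array[Int]) = {
    val sorted = pairs.sortBy(_._1)          // stable sort keeps keyword order per row
    val colPtrs = new Array[Int](nDocs + 1)  // CSC column-pointer array
    for ((r, _) <- sorted) colPtrs(r + 1) += 1
    for (i <- 1 to nDocs) colPtrs(i) += colPtrs(i - 1)  // prefix sum over counts
    val rowIds = sorted.map(_._2)            // keyword ids become row indices
    (colPtrs, rowIds)
  }
}
```

Wrapping those two arrays (plus an all-ones value array) in an SMat constructor would give the documented output.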
CUDA 7.0 was recently released. I upgraded to it to finally get Caffe working on my laptop, and I removed the earlier version (6.5) that I had been using for BIDMach/BIDMat.
I downloaded the JCuda binaries from:
http://www.jcuda.org/downloads/downloads.html
and pasted these files into the BIDMat/lib folder:
jcublas-0.7.0.jar
jcuda-0.7.0.jar
...(etc)...
libJCublas-apple-x86_64.dylib
libJCublas2-apple-x86_64.dylib
...(etc)...
I can compile successfully, but at startup I get this (note the "Something went wrong..." line):
dhcp-46-165:BIDMat danielseita$ ./bidmat
Loading /Users/danielseita/BIDMat/lib/bidmat_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
1 CUDA device found, CUDA version 7.0
Something went wrong while loading BIDMat CUDA library
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_05).
Type in expressions to have them evaluated.
Type :help for more information.
scala> sys.exit
Specifics about my system: OS X 10.9, using the latest BIDMat version (e6e764f).
In the bidmat launch script there's a place where I can specify the CUDA version, but even changing that doesn't help. Has 7.0 been tested, and if not, what are the plans to support it?
In the meantime I'll stick with CUDA 6.5, but I thought I'd raise the issue now.
Hi all,
Is there any plan to implement GPU versions of the binomial and negative binomial random number generators?
Thanks!
Marek
I noticed this issue while browsing through Mat.scala, specifically in the getOS function:
final val OS_WINDOWS = 0;
final val OS_LINUX = 1;
final val OS_OSX = 2;
final val OS_ANDROID = 3;
def getOS: Int = {
  val osname = System.getProperty("os.name");
  if (osname.startsWith("Windows")) OS_WINDOWS
  else if (osname.startsWith("Linux")) OS_LINUX
  else if (osname.startsWith("Mac")) OS_OSX
  else OS_ANDROID
}
val ostype = getOS
According to http://developer.android.com/reference/java/lang/System.html#getProperty(java.lang.String), System.getProperty("os.name") always returns "Linux" on Android.
Seems like another way might be to check System.getProperty("java.vendor") for "The Android Project", then fall back to System.getProperty("os.name") for the remaining platforms?
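A minimal sketch of that alternative ordering in plain JVM Scala (no BIDMat; the "Android" vendor check and the fallback default are my assumptions, not tested on a device):

```scala
// OS codes, matching the constants in Mat.scala.
val OS_WINDOWS = 0
val OS_LINUX = 1
val OS_OSX = 2
val OS_ANDROID = 3

// Sketch: check java.vendor for Android first, since os.name is "Linux"
// there, then fall back to os.name for desktop platforms.
def getOS2: Int = {
  val vendor = System.getProperty("java.vendor", "")
  if (vendor.contains("Android")) OS_ANDROID
  else {
    val osname = System.getProperty("os.name", "")
    if (osname.startsWith("Windows")) OS_WINDOWS
    else if (osname.startsWith("Mac")) OS_OSX
    else OS_LINUX // unknown Unix: Linux seems the safest default here
  }
}
```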
I am trying to identify the top 10 most similar documents from a list of 2.7 million documents for each new document I get. Each document is represented as a vector of 500 doubles. The number of documents is 2,633,142, so we have a matrix of size gc = (2633142 x 500) for all the processed documents. Whenever a new document arrives, we compute its vector e = (500 x 1). Assuming the vectors are normalized, we need to compute the dot product of e with each row of gc and pick the maximum value from the resulting (2,633,142 x 1) matrix.
When I compute the dot product of a new document's features with those of the 2.7 million documents using BIDMat matrix-matrix multiplication, it takes 300 milliseconds using FMat and 1200 milliseconds using GMat. I am attaching a screenshot. I was expecting GMat to be faster than FMat. Another thing I observe is that the amount of GPU memory used while creating the GMat is less than 100 MB, while I was expecting it to use 3 GB. It looks like the matrix is not being cached in GPU memory, which leads to the slower execution. Can you tell me if I am doing something wrong here, or whether anything can be done to cache the data properly in GPU memory and reduce the time of the GMat-based matrix multiplication?
The computation is a matrix multiplication of dimensions:
(2633142 x 500) * (500 x 1) = (2633142 x 1)
I don't understand why the CPU is faster than the GPU here, or how to obtain faster speed with the GPU, say within a few hundred milliseconds.
Notebook.pdf
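For reference, the scoring step being described is just a matrix-vector product followed by an argmax; a plain-Scala sketch at toy sizes (no BIDMat, column-major storage assumed as in the question):

```scala
// gc is nDocs x dim, stored column-major: element (r, c) is at c*nDocs + r.
// Score each row against e and take the argmax -- the same computation
// as gc * e followed by a max, at toy scale.
val nDocs = 4
val dim = 3
val gc = Array[Double](
  0.1, 0.9, 0.3, 0.2,   // column 0
  0.2, 0.1, 0.8, 0.2,   // column 1
  0.3, 0.1, 0.5, 0.9)   // column 2
val e = Array[Double](1.0, 0.0, 0.0)
val scores = Array.tabulate(nDocs) { r =>
  (0 until dim).map(c => gc(c * nDocs + r) * e(c)).sum
}
val best = scores.indexOf(scores.max) // index of the most similar document
```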
John,
In some code I'm writing, I have a matrix and a set of indices into that matrix (column-major order as usual), called innz. I would like to set all of the elements at the spots specified by innz to zero. With CPU matrices, it is straightforward:
scala> val a = IMat(rand(3,12)*5)+1
a: BIDMat.IMat =
4 3 3 3 3 4 1 2 2 5 3 3
5 2 5 4 1 4 1 5 1 3 3 4
4 4 3 1 5 4 5 3 3 1 5 5
scala> val innz = 2 \ 4 \ 9 \ 12 \ 18
innz: BIDMat.IMat = 2,4,9,12,18
scala> a(innz) = 0
res6: BIDMat.IMat =
4 3 3 0 0 4 0 2 2 5 3 3
5 0 5 4 1 4 1 5 1 3 3 4
0 4 3 1 5 4 5 3 3 1 5 5
Alternatively, one can do this:
scala> a(innz) = izeros(1,5)
res7: BIDMat.IMat = 0,0,0,0,0
scala> a
res8: BIDMat.IMat =
4 3 3 0 0 4 0 2 2 5 3 3
5 0 5 4 1 4 1 5 1 3 3 4
0 4 3 1 5 4 5 3 3 1 5 5
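For clarity, the linear-index update shown above can be sketched over a plain column-major backing array (no BIDMat):

```scala
// a(innz) = 0 zeroes the elements at column-major linear positions innz;
// element (r, c) of an nrows x ncols matrix lives at c * nrows + r.
val nrows = 3
val data = Array.fill(3 * 12)(1)  // stand-in for the matrix contents
val innz = Array(2, 4, 9, 12, 18)
for (i <- innz) data(i) = 0
```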
With GPU matrices it is a little more complicated, because a few update methods are missing. (I am not sure whether these omissions are deliberate; for instance, the wiki states that the ^* operator is missing, but that one is deliberate right now.) I set up another random matrix, using the same set of indices to target for zeros, but assigning a single 0 won't work because there are no linear updates. I tried several approaches, and by checking the source code, the only way that works is to set the right-hand side to be a GIMat, as shown below:
scala> val ga = GIMat(grand(3,12)*5)+1
ga: BIDMat.GIMat =
3 2 5 4 3 3 2 5 2 5 2 2
3 5 1 3 3 2 5 1 5 5 4 3
1 3 5 5 4 3 1 4 3 1 1 3
scala> val innz = GIMat(2 \ 4 \ 9 \ 12 \ 18)
innz: BIDMat.GIMat = 2,4,9,12,18
scala> ga(innz) = 0
java.lang.RuntimeException: operator linear update not implemented for GIMat
at BIDMat.Mat.notImplemented0(Mat.scala:21)
at BIDMat.Mat.update(Mat.scala:135)
... 33 elided
scala> ga(innz) = gizeros(1,5)
res4: BIDMat.GIMat =
3 2 5 4 3 3 2 5 2 5 2 2
3 5 1 3 3 2 5 1 5 5 4 3
1 3 5 5 4 3 1 4 3 1 1 3
However, this does not modify the components of ga!
The source code traces back to the def updatex(I:GIMat, v:GIMat):GIMat method in GIMat.scala, which then calls some GPU code: val err = CUMAT.copyToInds(data, v.data, I.data, I.llength);. I read through the BIDMat/jni/src/BIDMat_CUMAT.cpp file, which has that function declaration, but the definition isn't there, so it must be somewhere else (or maybe it's from CUDA itself, so you didn't write it?). EDIT: it's actually in MatKernel.cu, sorry; that seems to be where the matrix kernels in CUDA live. The definition seems to make sense, based on my rudimentary understanding of CUDA syntax...
Regardless, I wanted to check whether this is the intended behavior for block updates; it doesn't seem that way to me. If this is not the right way to go, any suggestions on how to update the values at specified indices?
Thanks for your time,
Daniel
Version: d2b1d59
Seems like there's a link error.
scala> val a8 = BIDMat.GDMat(BIDMat.DMat(1 on 4))
a8: BIDMat.GDMat =
1
4

scala> a8.t
java.lang.UnsatisfiedLinkError: edu.berkeley.bid.CUMATD.transpose(Ljcuda/Pointer;ILjcuda/Pointer;III)I
at edu.berkeley.bid.CUMATD.transpose(Native Method)
at BIDMat.GDMat.t(GDMat.scala:301)
... 33 elided
The only difference is that the GDMats and GLMats use this:
CUMATD.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
whereas GIMats and GMats use
CUMAT.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
To this:
${BIDMAT_ROOT}/scala/bin/scala -nobootcp -cp "${ALL_LIBS}" -Yrepl-sync -i
to allow local Scala instance to be found.
Hi all, please find below the error I receive:
scala> val a = binornd(10,0.5,2,2)
MKL ERROR: Parameter 5 was incorrect on entry to viRngBinomial.
a: BIDMat.IMat =
0 0
0 0
Version: 38035fd
I was looking through the DenseMat.scala code to document it. (I was going to document this one, then copy and paste some of the comments elsewhere.) I noticed that there are many methods that return an appropriate matrix internally but will not print it out on the command line. Do you think this may confuse new users? Is it worthwhile to make those methods private or package-protected?
Here are two examples:
(1) DenseMat has mkdiag and getdiag methods that act as a "second" way of getting diagonals. For instance, the usual way to make a diagonal matrix is the following:
scala> val a = mkdiag(1 on 2 on 3)
a: BIDMat.IMat =
1 0 0
0 2 0
0 0 3
Though I can also do it this way:
scala> val a = (1 on 2 on 3).mkdiag
a: BIDMat.DenseMat[Int] =
scala>
While it's subtle, the second way actually produces a correct matrix; it just doesn't print normally, and we have to wrap an IMat() around it to "see" it normally. The same holds true for getdiag.
Second example:
scala> val b = DMat(1\2 on 3\4)
b: BIDMat.DMat =
1 2
3 4
scala> b.ghorzcat(b)
res10: BIDMat.DenseMat[Double] =
scala> DMat(b.ghorzcat(b))
res11: BIDMat.DMat =
1 2 1 2
3 4 3 4
Looking through the code, I couldn't find a way to specify the number of GPUs to use on the machine. Is this possible?
John,
I've been having some problems debugging memory allocation with BIDMach (BayesNet.scala specifically). I'm using the Java Runtime class, which should give a reliable measure of memory allocation. But in the following test script, I'm noticing some weird results:
import java.text.NumberFormat

def computeMemory = {
  val runtime = Runtime.getRuntime()
  val format = NumberFormat.getInstance()
  val sb = new StringBuilder()
  val maxMemory = runtime.maxMemory()
  val allocatedMemory = runtime.totalMemory()
  val freeMemory = runtime.freeMemory()
  sb.append("free memory: " + format.format(freeMemory / (1024*1024)) + "M ");
  sb.append("allocated/total memory: " + format.format(allocatedMemory / (1024*1024)) + "M\n");
  print(sb.toString())
}

for (i <- 0 until 100) {
  val a = rand(67,4367)
  //Thread sleep 3000
  println("memory at iteration i = " + i)
  computeMemory
}
The computeMemory function reports the free memory and the total memory (the initial heap size is set with -Xms). The loop then creates a bunch of random matrices and prints the memory as we go.
The output is as follows:
dhcp-46-165:BIDMach danielseita$ ./bidmach test.ssc
Loading /Users/danielseita/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
1 CUDA device found, CUDA version 7.0
Loading test.ssc...
import java.text.NumberFormat
computeMemory: Unit
memory at iteration i = 0
free memory: 13,534M allocated/total memory: 13,739M
memory at iteration i = 1
free memory: 13,534M allocated/total memory: 13,739M
memory at iteration i = 2
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 3
free memory: 13,462M allocated/total memory: 13,739M
// More of the same...
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 66
free memory: 13,462M allocated/total memory: 13,739M
memory at iteration i = 67
free memory: 13,389M allocated/total memory: 13,739M
memory at iteration i = 68
free memory: 13,389M allocated/total memory: 13,739M
memory at iteration i = 69
// More of the same ...
What happens is that the amount of free memory drops at seemingly random times (by about 72-73 MB each time). In fact, when you add up the memory allocated across all 100 iterations for these 67 x 4367, non-cached matrices (approximately (67 x 4367 x 8) / (1024 x 1024) MB each, if we assume one element takes about 8 bytes), the final free-memory figure makes sense. The free-memory value just does not seem to get updated frequently enough.
So my question: do you know of some esoteric detail about memory allocation with BIDMach/BIDMat data structures that would cause the Java Runtime counters to behave weirdly and not update promptly? Normally this wouldn't be too big of a problem, but I'm trying to debug matrix-caching problems with Gibbs sampling, and for that it would be nice to confidently point out the places in the code where memory gets allocated, rather than seeing "random" drops of memory. Adding a thread sleep to delay the measurement does not seem to affect this.
It's likely that this is more of a Java Runtime problem or some timing issue between Runtime and Scala, so I'll probably try printing out the GUIDs of matrices to help me debug instead, but I just wanted to check. This happens both on my laptop and on stout.
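One mitigation worth trying (a sketch in plain JVM code, not BIDMach-specific): request a GC pass before sampling the Runtime counters, since freeMemory() only reflects allocation lazily as the JVM touches new heap regions.

```scala
// Requesting (not forcing) a collection before sampling tends to give
// more stable used-memory readings than raw freeMemory() deltas.
def usedMemoryMB(): Long = {
  val rt = Runtime.getRuntime
  System.gc() // a hint to the JVM, not a guarantee
  (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L)
}
```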
I just pulled and ran into some errors building:
~/BIDMat> ./sbt package
[info] Set current project to BIDMat (in build file:/home/kjameslubin/BIDMat/)
[warn] Credentials file /home/kjameslubin/.ivy2/.credentials does not exist
[info] Compiling 9 Scala sources to /home/kjameslubin/BIDMat/target/scala-2.11/classes...
[error] /home/kjameslubin/BIDMat/src/main/scala/BIDMat/JSON.scala:4: object cedarsoftware is not a member of package com
[error] import com.cedarsoftware.util.io.JsonWriter;
[error] ^
etc...
It took me a minute to figure out that I needed to rerun ./getdevlibs.sh, and then it built without issue. Perhaps we should update INSTALLING.txt?
Something like:
Install from source:
apt-get < reqs >
git clone <repo name>
./getlibs.sh
./sbt package
cd scripts && bidmach workout.ssc
I have only installed from scratch on Ubuntu and OS X and could produce instructions for them - though my OS X laptop is too old to "really" run the recent CUDA versions.
Assigning an element in an IMat works
scala> var a = 0\0\0 on 0\0\0;
a: BIDMat.IMat =
0 0 0
0 0 0
scala> a(0,2) = 3;
res2: Int = 3
but the same fails with an FMat:
scala> var b = 0.1\0\0 on 0\0\0;
b: BIDMat.FMat =
0.10000 0 0
0 0 0
scala> b(0,2) = 3.4;
<console>:22: error: ambiguous reference to overloaded definition,
both method update in class FMat of type (i: Int, jv: BIDMat.Mat, b: BIDMat.Mat)BIDMat.FMat
and method update in class FMat of type (iv: BIDMat.IMat, j: Int, b: BIDMat.FMat)BIDMat.FMat
match argument types (Int,Int,Double)
b(0,2) = 3.4;
BIDMat 0.9.7.
In the wiki/Dictionaries page, there is a reference to a union2 method of the Dict class:
>val (d, dm1, dm2) = Dict.union2(d1,d2)
I think this should actually be Dict.union3(d1,d2)? The union2 command seems to apply to IDicts.
John,
There seems to be some new issue with loading/saving matrices. Currently on the master branch of BIDMach/BIDMat (for both my Mac 10.9 laptop and on stout), I get some NoClassDefFoundErrors when loading/saving matrices by following the BIDMat Loading/Saving Wiki page:
[seita@stout BIDMach]$ ./bidmach
Loading /home/seita/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
4 CUDA devices found, CUDA version 6.5
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.
scala> saveFMat("hi", rand(3,3))
java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4BlockInputStream
at BIDMat.MatFunctions$.saveFMat(MatFunctions.scala:1852)
... 33 elided
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4BlockInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 34 more
This is on my laptop:
scala> val a = loadSMat("moods_data/moodsByUser.smat.lz4")
java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4BlockInputStream
at BIDMat.MatFunctions$.loadSMat(MatFunctions.scala:1846)
... 33 elided
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4BlockInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 34 more
In the past, I tried saving matrices by using the HMat.saveFMatTxt("name", a) workaround, but that suddenly gives me the same error.
Do you know if a recent update changed this loading/saving process, or is there something I need to download? Thanks.
-Daniel
We are trying to write and automatically run specs for our piece of software that uses BIDMat. Some of our code is GPU-powered, some is not.
Is there a way to do that without having every machine GPU-enabled with jcuda installed?
I imagine that with an interface like Mat we could do that: write tests against the Mat interface, providing an FMat when testing and a GMat when on the GPU.
I tried that, but even using SciFunctions requires the jcuda package.
How would you suggest overcoming this issue?
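The Mat-interface idea can be sketched like this (hypothetical names of my own, not BIDMat's API): code against a small trait so that tests exercise a CPU implementation and never load jcuda classes unless the GPU path is actually selected.

```scala
// Hypothetical backend trait; tests use CpuBackend only, so nothing
// jcuda-related is ever touched on GPU-less machines.
trait Backend {
  def zeros(rows: Int, cols: Int): Array[Float]
}

object CpuBackend extends Backend {
  def zeros(rows: Int, cols: Int): Array[Float] = new Array[Float](rows * cols)
}

// The GPU backend would live in a separate module, loaded only when a
// device is present, keeping jcuda off the test classpath entirely.
def backendFor(useGpu: Boolean): Backend =
  if (useGpu) sys.error("GPU backend requires jcuda on the classpath")
  else CpuBackend
```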
Running sbt in the main directory of the repository, I get this error:
[info] Resolving org.scala-sbt#precompiled-2_10_0;0.12.2 ...
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
sbt.ResolveException: unresolved dependency: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:214)
at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:122)
at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:121)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:114)
at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:114)
at sbt.IvySbt$$anonfun$withIvy$1.apply(Ivy.scala:102)
at sbt.IvySbt.liftedTree1$1(Ivy.scala:49)
at sbt.IvySbt.action$1(Ivy.scala:49)
at sbt.IvySbt$$anon$3.call(Ivy.scala:58)
at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:75)
at xsbt.boot.Locks$GlobalLock.withChannelRetries$1(Locks.scala:58)
at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:79)
at xsbt.boot.Using$.withResource(Using.scala:11)
at xsbt.boot.Using$.apply(Using.scala:10)
at xsbt.boot.Locks$GlobalLock.liftedTree1$1(Locks.scala:51)
at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:51)
at xsbt.boot.Locks$.apply0(Locks.scala:30)
at xsbt.boot.Locks$.apply(Locks.scala:27)
at sbt.IvySbt.withDefaultLogger(Ivy.scala:58)
at sbt.IvySbt.withIvy(Ivy.scala:99)
at sbt.IvySbt.withIvy(Ivy.scala:95)
at sbt.IvySbt$Module.withModule(Ivy.scala:114)
at sbt.IvyActions$.update(IvyActions.scala:121)
at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:951)
at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:949)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$54.apply(Defaults.scala:972)
at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$54.apply(Defaults.scala:970)
at sbt.Tracked$$anonfun$lastOutput$1.apply(Tracked.scala:35)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:974)
at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:969)
at sbt.Tracked$$anonfun$inputChanged$1.apply(Tracked.scala:45)
at sbt.Classpaths$.cachedUpdate(Defaults.scala:977)
at sbt.Classpaths$$anonfun$45.apply(Defaults.scala:856)
at sbt.Classpaths$$anonfun$45.apply(Defaults.scala:853)
at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
at scala.Function1$$anonfun$compose$1.apply(Function1.scala:49)
at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:41)
at sbt.std.Transform$$anon$5.work(System.scala:71)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:18)
at sbt.Execute.work(Execute.scala:238)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:160)
at sbt.CompletionService$$anon$2.call(CompletionService.scala:30)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
error sbt.ResolveException: unresolved dependency: com.github.siasia#xsbt-proguard-plugin_2.9.2;0.12.2-0.1.1: not found
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
JCuda now supports 7.5. Is 7.5 on the roadmap somewhere for BIDMat? I took a run at adding it myself, but it got a little complicated since 7.5 requires x64 and I know next to nothing about JNI.
Using saveAs, I've gotten an HDF5LibraryException even when load works fine.
I made two changes and it fixed the problem:
Nothing is blocking, but I wanted to post an issue because there may be a more helpful way to do the error reporting here. Since the HDF5LibraryException seems to cover a wide variety of possible problems (including not having LD_LIBRARY_PATH set properly), it would be nice for usability to catch it and report a more specific error when possible.
John, I added out.clear lines to the DenseMat accum, i.e., what gets called when we do accum for CPU matrices. The issue was that originally, my accum calls would keep the results of the previous accum call in out, so cumulative sums were added on top of the previous output rather than to a cleared matrix of zeros. The GPU versions of accum have out.clear in them (check GMat.scala), so I'm assuming that the lack of out.clear in the CPU versions is just an oversight. I didn't catch this in testing, but it came up in the BayesNet.scala tests with matrix caching on.
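The effect of the fix, sketched over plain arrays (an assumed simplification of the real accum, not BIDMat code):

```scala
// Without clearing, a cached `out` buffer carries sums from the previous
// call; zeroing it first restores the expected accumulate semantics.
def accum(inds: Array[Int], vals: Array[Float], out: Array[Float]): Array[Float] = {
  java.util.Arrays.fill(out, 0f)                 // the out.clear equivalent
  for (k <- inds.indices) out(inds(k)) += vals(k)
  out
}
```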
John, assuming this doesn't break anything, feel free to close this.
We want to load a matrix from a file/stream directly to the GPU. Currently we load the whole thing into an FMat and then copy it to the GPU, but it might be faster to load it directly.
Could you point us to how to do this directly on the GPU?
Also, I currently do not see a way to update a given cell in a GMat, or to update a whole row/column in a GMat.
Notice that the first two multiplications match, but the last is off by a uniform factor of three. This could easily be my mistake or some peculiarity of my local configuration.
Trace:
scala> val c = GMat(bernrnd(0.3,6,6) : FMat)
c: BIDMat.GMat =
0 0 0 1 1 0
0 0 1 0 0 0
0 1 0 1 1 1
1 0 0 0 0 0
0 1 1 1 0 0
.. .. .. .. .. ..
scala> val s = GSMat(sparse(bernrnd(0.3,6,6)) : SMat)
s: BIDMat.GSMat =
( 2, 0) 1
( 3, 0) 1
( 1, 1) 1
( 3, 1) 1
( 4, 1) 1
( 5, 1) 1
( 2, 2) 1
( 5, 2) 1
... ... ...
scala> c*s
res18: BIDMat.GMat =
1 2 0 2 1 0
1 0 1 1 1 0
1 4 1 3 2 1
0 0 0 0 1 0
2 2 1 2 2 0
.. .. .. .. .. ..
scala> full(s)
res19: BIDMat.GMat =
0 0 0 0 1 0
0 1 0 0 0 0
1 0 1 1 1 0
1 1 0 1 1 0
0 1 0 1 0 0
.. .. .. .. .. ..
scala> c*full(s)
res20: BIDMat.GMat =
1 2 0 2 1 0
1 0 1 1 1 0
1 4 1 3 2 1
0 0 0 0 1 0
2 2 1 2 2 0
.. .. .. .. .. ..
scala> c.tileMult(c.nrows, s.ncols, s.ncols, 0,0,s,0,0,GMat(FMat.zeros(6,6)),0,0)
res21: BIDMat.GMat =
3 6 0 6 3 0
3 0 3 3 3 0
3 12 3 9 6 3
0 0 0 0 3 0
6 6 3 6 6 0
.. .. .. .. .. ..
Hello,
I've been using the bleeding-edge/developer version of BIDMat (tried this with a clone from February 2, and now a fresh pull today) and I have found that the dense-sparse matrix multiply no longer works. At first I thought it had to do with my own library's determinism (it is built on top of BIDMat), since the results appear different on every re-run of the program, but then I ran this simple vignette after tracing the computation to its source:
var datX = Array(1f, 0f, 1f, 0f, 0f, 0f,
0f, 1f, 1f, 0f, 0f, 0f,
1f, 1f, 1f, 0f, 0f, 0f,
0f, 0f, 0f, 1f, 0f, 1f,
0f, 0f, 0f, 0f, 1f, 1f,
0f, 0f, 0f, 1f, 1f, 1f)
var x1 = new FMat(6,6,datX)
var x2 = sparse(x1)
println("prod:\n"+(x1 * x2))
Reproduced below are two subsequent runs of the program (in a Scala object with a main method), labelled Trial 1 and Trial 2 respectively:
Trial 1:
prod:
1 0 4 0 0 0
0 1 3 0 0 0
1 1 5 0 0 0
0 0 0 2 1 2
0 0 0 1 2 2
.. .. .. .. .. ..
Trial 2:
prod:
1 1 2 0 0 1
0 2 2 0 0 2
1 2 3 0 0 2
0 0 0 2 1 6
0 0 0 1 2 5
.. .. .. .. .. ..
I'm not sure whether something in a recent BIDMat update broke the matrix-matrix operations involving sparse matrices, but at minimum the multiply is affected. Could you please look into this and fix this bug? Your dense-sparse/sparse ops are quite critical to my library's backbone. Note that this issue goes away when I do NOT use the most up-to-date version of BIDMat (and instead revert to one of the bundles available on the website).
Thanks!
The implementation should probably be something like:
applySFun(beta, FMat(beta.nrows, beta.ncols), null, math.signum _, 1L)
sbt is the default way to manage dependencies in Scala, yet even though BIDMat uses sbt itself, it does not seem possible to reference BIDMat as an sbt dependency.
Expected behaviour: libraryDependencies += "edu.berkeley.bid" %% "BIDMat" % "1.0.2"
This does not work.
Calling setseed and then randperm always returns 0 as the first element of the permutation, but only the first time randperm is called after setseed:
scala> setseed(45664)
scala> randperm(10)
res49: BIDMat.IMat = 0,8,4,7,1,6,3,2,5,9
scala> randperm(10)
res50: BIDMat.IMat = 8,4,2,6,0,9,3,7,5,1
scala> setseed(464)
scala> randperm(10)
res52: BIDMat.IMat = 0,6,9,7,5,2,3,1,8,4
scala> randperm(10)
res53: BIDMat.IMat = 4,1,9,2,5,8,3,6,7,0
scala> randperm(10)
res54: BIDMat.IMat = 9,3,0,8,2,5,7,4,1,6
scala> setseed(8559)
scala> randperm(10)
res56: BIDMat.IMat = 0,3,7,2,5,4,8,1,9,6
scala> randperm(10)
res57: BIDMat.IMat = 5,9,4,3,8,7,2,0,6,1
This is the error I got when I do A/5.0 where A is a DMat.
java.lang.RuntimeException: operator / not implemented for DMat
at BIDMat.Mat.notImplemented0(Mat.scala:10)
CUDA 5.5 is old. Please support CUDA 7.0.
Hi, I'm unable to find the source to compile libbidmatcuda. Where would I find it?
I'm attempting to build BIDMach from the git repo in order to test out the Word2Vec model. I'm unable to compile BIDMach without recompiling BIDMat.
Daniel
I downloaded the bigger bundle of BIDMat from http://bid2.berkeley.edu/bid-data-project/download/ and extracted it. But when I run ./bidmat from there, I get:
"2 CUDA devices found, CUDA version 7.0
Something went wrong while loading BIDMat CUDA library"
There are a k20 and a k40 on the machine and I can see libbidmatcuda-linux-x86_64.so in the lib directory.
I realize I should buy MKL, but I'm only planning to run on CUDA, so MKL should not be required, correct?
Make of course fails if the MKL library is not available.
Is there a configure file for that particular instance?
Thanks for the software though. The packages are excellent!
Accessing an item in an IMat by index works fine in the interpreter, but the same code produces errors when compiled. The same happens when accessing .data. The fix is to use .data0, but this seems to be an unnecessarily roundabout way to do it.
This seems to be the case for IMats created in a class method. Note that the IMat will definitely have a size > 0.
val hourMat: SMat = load(filePathStr, "X")
val (rowIndices, colIndices) = find2(hourMat)
val bb1 = rowIndices(0) // Causes error.
val bb2 = rowIndices.apply(0) // Causes error.
val bb3 = rowIndices.data // Causes error.
val rowArray = rowIndices.data0 // No error.
val matEx: IMat = new IMat(3,1,Array(1,2,3))
matEx(0) // Causes error.
matEx.apply(0) // Causes error.
matEx.data(0) // Causes error.
matEx.data0(0) // No error.
On the other hand, doing the same in main() does not cause any trouble, so this does not seem to be a problem with find2 or with the creation of the IMat.
Finally, this is the error message from the compiler:
exception when typing $anonfun.this.$outer .row().update$mcI$sp($anonfun.this.$outer .numberOfTokenInOneBatch(), $anonfun.this.$outer .monthlyDict().apply(token))
overloaded method value update$mcI$sp with alternatives:
(a: BIDMat.IMat,b: Int)Int
(im: BIDMat.IMat,b: BIDMat.DenseMat)BIDMat.DenseMat
(i0: Int,v: Int)Int
cannot be applied to (Int, java.lang.Object) in file tt.scala
scala.tools.nsc.symtab.Types$TypeError: overloaded method value update$mcI$sp with alternatives:
(a: BIDMat.IMat,b: Int)Int
(im: BIDMat.IMat,b: BIDMat.DenseMat)BIDMat.DenseMat
(i0: Int,v: Int)Int
cannot be applied to (Int, java.lang.Object)
The CUDA code was recently refactored in this commit: 77fdb70
I tested the compilation process on my Mac 10.9 laptop and on stout (Linux), and got similar errors. These were the errors I got after doing git pull origin master, then going into the jni/src directory, then doing make clean; ./configure; make. It does not appear to be a major issue, since I think it can be resolved with some simple renaming/updates. For instance, I think the apply_binop definitions in MatKernel.hpp can be transferred to MatKernelD.hpp, with a corresponding change in the .cu files. Note that the BIDMat_CUMAT.cpp file compiles just fine.
Mac:
dhcp-46-165:src danielseita$ ./configure
Creating config for apple x86_64
dhcp-46-165:src danielseita$ make
g++ -fPIC -c -O2 -g -DNDEBUG -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include/darwin -I/Users/danielseita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMAT.cpp
g++ -fPIC -c -O2 -g -DNDEBUG -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/include/darwin -I/Users/danielseita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMATD.cpp
BIDMat_CUMATD.cpp:63:12: error: use of undeclared identifier 'apply_binop'
return apply_binop(nativeA, Anrows, Ancols, nativeB, Bnrows, Bncols, nativeC, opn);
^
BIDMat_CUMATD.cpp:75:12: error: use of undeclared identifier 'sdopcol'
return sdopcol(nrows, ncols, nnz, A, Air, B, len, opn);
^
BIDMat_CUMATD.cpp:86:12: error: use of undeclared identifier 'sdoprow'
return sdoprow(nrows, ncols, nnz, A, Aic, B, len, opn);
^
BIDMat_CUMATD.cpp:153:12: error: use of undeclared identifier 'apply_gfun'
return apply_gfun(nativeA, nativeB, N, opn);
^
BIDMat_CUMATD.cpp:163:12: error: use of undeclared identifier 'apply_gfun2'
return apply_gfun2(nativeA, nativeB, nativeC, N, opn);
^
5 errors generated.
Linux:
[seita@stout src]$ ./configure
Creating config for linux x86_64
[seita@stout src]$ make
g++ -fPIC -c -O2 -DNDEBUG -I/usr/java/default/include -I/usr/java/default/include/linux -I/home/seita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMAT.cpp
g++ -fPIC -c -O2 -DNDEBUG -I/usr/java/default/include -I/usr/java/default/include/linux -I/home/seita/BIDMat/jni/include -I/usr/local/cuda/include BIDMat_CUMATD.cpp
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applyop(JNIEnv*, _jobject*, _jobject*, jint, jint, _jobject*, jint, jint, _jobject*, jint)’:
BIDMat_CUMATD.cpp:63: error: ‘apply_binop’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_sdopcol(JNIEnv*, _jobject*, jint, jint, jint, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:75: error: ‘sdopcol’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_sdoprow(JNIEnv*, _jobject*, jint, jint, jint, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:86: error: ‘sdoprow’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applygfun(JNIEnv*, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:153: error: ‘apply_gfun’ was not declared in this scope
BIDMat_CUMATD.cpp: In function ‘jint Java_edu_berkeley_bid_CUMATD_applygfun2(JNIEnv*, _jobject*, _jobject*, _jobject*, _jobject*, jint, jint)’:
BIDMat_CUMATD.cpp:163: error: ‘apply_gfun2’ was not declared in this scope
make: *** [BIDMat_CUMATD.o] Error 1
Line 167: out.jc(i) = nnzx
Should be: out.jc(i) = nnzx + ioff
Reason: Although all the data is transferred over to Array[Byte] data, Array[Int] jc has its last value set inconsistently with its other values. Printing, or converting back to CSMat, then loses the last byte/character.
Example:
scala> var a = new CSMat(1,1,Array("abcd"))
a: BIDMat.CSMat = abcd
scala> var r = BMat(a)
r: BIDMat.BMat = abc ...
scala> r.data
res2: Array[Byte] = Array(97, 98, 99, 100)
scala> r.jc
res3: Array[Int] = Array(1, 4)
./bidmat
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
I have tried with Scala 2.11.2 and 2.10.4.
I have Java version 1.7.0_80.
I'm running into an error in GMat's allocation during a matrix multiply operation run in the bidmach/Scala REPL:
[user1 ~]$ bidmach
Loading /homes/user1/git/BIDMach/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.networks.DNN
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
1 CUDA device found, CUDA version 7.0
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val f = grand(256,40000)
f: BIDMat.GMat =
0.88383 0.36261 0.67829 0.41245 0.98139 0.49479 0.73682 0.072556 0.28022 0.86777 0.63998 0.60432 0.34435 0.63388 0.34711 0.37487 0.027557 0.029789 0.74355 0.78470 0.63183 0.019272 0.48245...
0.81384 0.74068 0.59507 0.056584 0.065755 0.73149 0.94198 0.16632 0.38067 0.63685 0.62824 0.063965 0.88211 0.41027 0.60684 0.16312 0.76326 0.054253 0.91629 0.22353 0.57114 0.19324 0.82873...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val g = grand(256,80000)
g: BIDMat.GMat =
0.38165 0.0094942 0.33977 0.52591 0.36995 0.93983 0.53091 0.21534 0.49424 0.17834 0.31689 0.11909 0.30771 0.26260 0.69630 0.36673 0.47742 0.33469 0.032663 0.82122...
0.28144 0.34397 0.38882 0.94709 0.77737 0.72754 0.060895 0.39984 0.80299 0.10944 0.33119 0.64009 0.10358 0.35609 0.54367 0.44958 0.17401 0.52444 0.010265 0.95499...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val h = f ^* g
java.lang.IllegalArgumentException: Negative capacity: -84901888
at java.nio.Buffer.<init>(Buffer.java:191)
at java.nio.ByteBuffer.<init>(ByteBuffer.java:276)
at java.nio.ByteBuffer.<init>(ByteBuffer.java:284)
at java.nio.MappedByteBuffer.<init>(MappedByteBuffer.java:89)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:162)
at jcuda.runtime.JCuda.cudaMallocHostNative(Native Method)
at jcuda.runtime.JCuda.cudaMallocHost(JCuda.java:3902)
at BIDMat.GMat$.apply(GMat.scala:1661)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:2411)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:2445)
at BIDMat.GMat.GTMult(GMat.scala:658)
at BIDMat.GMat.$up$times(GMat.scala:1141)
... 33 elided
But if the column dimensions are different, it works:
scala> val f = grand(256,38970)
f: BIDMat.GMat =
0.88383 0.36261 0.67829 0.41245 0.98139 0.49479 0.73682 0.072556 0.28022 0.86777 0.63998 0.60432 0.34435 0.63388 0.34711 0.37487 0.027557 0.029789 0.74355 0.78470 0.63183 0.019272 0.48245...
0.81384 0.74068 0.59507 0.056584 0.065755 0.73149 0.94198 0.16632 0.38067 0.63685 0.62824 0.063965 0.88211 0.41027 0.60684 0.16312 0.76326 0.054253 0.91629 0.22353 0.57114 0.19324 0.82873...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val g = grand(256,176965)
g: BIDMat.GMat =
0.34660 0.30266 0.70115 0.53805 0.47667 0.99830 0.72554 0.33751 0.22702 0.13273 0.40718 0.33229 0.30885 0.42985 0.74604 0.45333 0.60975 0.50820 0.68527 0.97660 0.59849 0.29411 0.39118...
0.85486 0.14368 0.16848 0.82664 0.89762 0.60588 0.30885 0.46973 0.11753 0.47522 0.95762 0.047029 0.28322 0.90902 0.11110 0.60526 0.70672 0.17768 0.51984 0.25233 0.64633 0.097652 0.22967...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val h = f ^* g
h: BIDMat.GMat =
64.222 64.623 67.955 61.407 62.221 61.621 62.616 65.751 68.504 64.667 65.463 62.693 70.220 69.879 65.115 68.670 70.400 67.562 68.949 62.765 62.149 68.763 66.013 67.540 65.336 66.513 64.797 66.269...
66.547 63.118 70.510 65.141 64.089 61.146 66.066 67.357 68.428 67.383 64.990 62.570 69.941 71.695 66.835 66.209 70.526 66.446 70.568 63.454 62.166 65.496 65.838 67.402 65.898 67.939 63.881 65.722...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
The relevant code is in GMat.scala, line 1661:
1654 def apply(nr:Int, nc:Int):GMat = {
1655 val retv = new GMat(nr, nc, new Pointer(), 1L*nr*nc)
1656 if (Mat.debugMem) {
1657 println("GMat %d %d, %d %f" format (nr, nc, SciFunctions.getGPU, SciFunctions.GPUmem._1))
1658 if (nr*nc > Mat.debugMemThreshold) throw new RuntimeException("GMat alloc too large");
1659 }
1660 var err = if (1L*nr*nc*Sizeof.FLOAT > Mat.hostAllocSize) {
1661 cudaMallocHost(retv.data, 1L*nr*nc*Sizeof.FLOAT);
1662 } else {
1663 cudaMalloc(retv.data, 1L*nr*nc*Sizeof.FLOAT);
1664 }
1665 cudaDeviceSynchronize;
1666 if (err == 0) err = cudaGetLastError();
1667 if (err != 0) throw new RuntimeException("CUDA alloc failed " + cudaGetErrorString(err));
1668 retv
1669 }
Notice that:
scala> 1L*40000*80000*4
res2: Long = 12800000000
scala> 1L*(40000*80000*4)
res3: Long = -84901888
The latter is the value that cudaMallocHost() was complaining about.
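Presumably the Long size is truncated back to an Int somewhere in the host-allocation path. The wrap-around itself is easy to reproduce, and forcing Long arithmetic before the multiply avoids it (`floatBytes` below stands in for Sizeof.FLOAT):

```scala
// Int multiplication wraps silently; multiplying by 1L first promotes the
// whole expression to Long before any overflow can occur.
val floatBytes = 4
def allocBytes(nr: Int, nc: Int): Long = 1L * nr * nc * floatBytes

allocBytes(40000, 80000)    // 12800000000L, the intended allocation size
40000 * 80000 * floatBytes  // -84901888, the overflowed Int from the trace
```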
The version of the code I'm using is:
[user1 BIDMat]$ git log
commit 5c4f7ed945d7d7aac34f8fa2544258e40a7c1568
Author: John Canny <[email protected]>
Date: Tue Nov 17 16:26:12 2015 -0800
added HDFSIO.scala
Grepping for hash instances in the project:
huitseeker@jollyjumper:~/tmp/BIDMat(master)$ ag Murmur -G'.*\.scala'
src/main/scala/BIDMat/DMat.scala
6:import scala.util.hashing.MurmurHash3
51: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/FMat.scala
7:import scala.util.hashing.MurmurHash3
49: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GDMat.scala
16:import scala.util.hashing.MurmurHash3
46: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GIMat.scala
13:import scala.util.hashing.MurmurHash3
45: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GLMat.scala
10:import scala.util.hashing.MurmurHash3
49: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/GMat.scala
13:import scala.util.hashing.MurmurHash3
43: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/IMat.scala
5:import scala.util.hashing.MurmurHash3
42: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/LMat.scala
5:import scala.util.hashing.MurmurHash3
42: out.setGUID(MurmurHash3.mix(MurmurHash3.mix(nr, nc), (GUID*3145341).toInt));
src/main/scala/BIDMat/ND.scala
11:import edu.berkeley.bid.MurmurHash3
199: MurmurHash3.MurmurHash3_x64_64(inds.map(_.GUID), 0x3142341)
203: MurmurHash3.MurmurHash3_x64_64(inds.map(_.toLong), 0x3142341)
huitseeker@jollyjumper:~/tmp/BIDMat(master)$
I've found a lot of hashes on 32-bit Ints. Have you considered replacing MurmurHash3 with the faster xxHash?
https://github.com/Cyan4973/xxHash
(as for the one usage in its 64-bit version, note the existence of XXH64)
Version: master branch, after commit 97cda3d on Feb 6 2015.
The code for MatFunctions.scala has:
def getdiag(a:DMat) = DMat(a.getdiag)
def getdiag(a:FMat) = FMat(a.getdiag)
def getdiag(a:IMat) = IMat(a.getdiag)
def getdiag(a:CMat) = CMat(a.getdiag)
def getdiag(a:GMat) = a.mkdiag
I am not sure why there's a mkdiag for the GPU case. I can get around this by calling it from the GMat code, i.e., a.getdiag() if a is a GMat.
Some examples:
scala> val a = 1\2\3 on 4\5\6 on 7\8\9
a: BIDMat.IMat =
1 2 3
4 5 6
7 8 9
scala> getdiag(a)
res8: BIDMat.IMat =
1
5
9
scala> getdiag(BIDMat.GMat(a))
java.lang.RuntimeException: mkdiag requires a vector argument, but dims= 3 3
at BIDMat.GMat.mkdiag(GMat.scala:636)
at BIDMat.MatFunctions$.getdiag(MatFunctions.scala:1437)
... 33 elided
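For reference, the intended semantics of the two operations can be sketched with plain arrays (assumed semantics with hypothetical helpers, not BIDMat's actual API): getdiag extracts the diagonal of a square matrix, while mkdiag builds a diagonal matrix from a vector, which is why passing a 3x3 GMat to mkdiag fails.

```scala
// getdiag: extract the diagonal of a square matrix.
def getdiag(m: Array[Array[Int]]): Array[Int] =
  m.indices.map(i => m(i)(i)).toArray

// mkdiag: build a diagonal matrix from a vector.
def mkdiag(v: Array[Int]): Array[Array[Int]] =
  Array.tabulate(v.length, v.length)((i, j) => if (i == j) v(i) else 0)

val a = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
getdiag(a)             // Array(1, 5, 9): what the GMat overload should return
mkdiag(Array(1, 5, 9)) // 3x3 diagonal matrix: expects a vector, hence the error
```

So the GMat case in MatFunctions.scala presumably wants a.getdiag rather than a.mkdiag.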
Version: 9b1557e
I'm not sure whether this is a serious bug or a deliberate design choice, but when we call a multiplication with a generic matrix on the left-hand side, BIDMat dispatches to the multiplication operator in the Mat.scala class, which throws an "operator xxx not implemented for ..." error.
The key is that the generic matrix has a compile-time type of Mat. Even if its runtime type is FMat, SMat, etc., the multiplication still resolves against the Mat class.
In some code I'm writing, for instance, I have either GPU or CPU mode to consider, so my matrices a and b have type Mat to be generic. Then I set them equal to something:
val a:Mat = null
val b:Mat = null
// Initialize a to be either a GSMat or an SMat, depending on GPU/CPU mode
// Initialize b to be either a GMat or FMat, depending on GPU/CPU mode
a * b
The a * b line then fails in CPU mode with the error given in the title, with XMat = SMat and YMat = FMat.
I can get this code to work by wrapping the operands, as in SMat(a) * FMat(b), but I wanted to check in with you, since this is potentially confusing: we are encouraged to use Mat to stay generic, yet we must repeatedly check cases and cast to SMat(a), FMat(b), etc. What are your thoughts?
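The underlying mechanism is Scala's static overload resolution: the overload is chosen by the declared type of the argument, so a Mat-typed operand always lands on the generic path. A toy sketch with stand-in classes (not BIDMat's types):

```scala
class Dense
class Sparse {
  // Two overloads, mimicking a specialized and a generic operator.
  def *(b: Dense): String  = "specialized sparse * dense multiply"
  def *(b: AnyRef): String = "generic path: pattern-match on runtime types"
}

val s = new Sparse
val b: AnyRef = new Dense  // declared generically, like val b: Mat
s * b                      // AnyRef overload: static type decides, not runtime
s * new Dense              // specialized overload: static type is Dense
```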
Some command line examples:
scala> val a:Mat=sprand(1000,1000,0.1)
a: BIDMat.Mat =
( 5, 0) 0.52206
( 6, 0) 0.71377
( 13, 0) 0.62309
( 28, 0) 0.20338
( 54, 0) 0.24161
( 61, 0) 0.38814
( 63, 0) 0.67045
( 74, 0) 0.33437
... ... ...
scala> val b:Mat=sprand(1000,1000,0.1)
b: BIDMat.Mat =
( 2, 0) 0.32644
( 16, 0) 0.78167
( 18, 0) 0.35468
( 41, 0) 0.16965
( 53, 0) 0.60286
( 56, 0) 0.27231
( 64, 0) 0.70402
( 65, 0) 0.20928
... ... ...
scala> a*b
res2: BIDMat.Mat =
( 0, 0) 2.2325
( 1, 0) 2.6090
( 2, 0) 1.5920
( 3, 0) 2.0934
( 4, 0) 2.0458
( 5, 0) 3.6955
( 6, 0) 1.9645
( 7, 0) 1.3880
... ... ...
scala> a*rand(1000,1000)
java.lang.RuntimeException: operator * not implemented for SMat and FMat
at BIDMat.Mop$class.notImplemented(Operators.scala:327)
at BIDMat.Mop_Times$.notImplemented(Operators.scala:382)
at BIDMat.Mop$class.sop(Operators.scala:28)
at BIDMat.Mop_Times$.sop(Operators.scala:382)
at BIDMat.Mop$class.op(Operators.scala:143)
at BIDMat.Mop_Times$.op(Operators.scala:382)
at BIDMat.SMat.$times(SMat.scala:531)
... 33 elided
scala> rand(1000,1000)*a
res4: BIDMat.Mat =
31.505 24.170 20.940 18.018 27.939 22.453 25.225 24.584 22.139 24.005 25.443 28.746 20.686 23.867 19.050 29.900 20.043 24.953 27.049 25.659 26.288 28.829 26.924 26.171...
29.118 26.772 21.919 21.094 26.713 19.681 24.325 23.950 24.116 22.451 22.003 26.480 22.981 22.656 18.127 29.811 19.218 23.364 22.904 23.538 23.502 24.597 27.557 25.794...
28.578 25.883 20.647 20.381 24.435 17.622 22.413 23.946 23.355 21.007 24.580 28.317 20.682 24.526 18.536 26.255 17.030 22.624 26.466 24.642 25.480 27.498 24.700 26.631...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val c:Mat = rand(1000,1000)
c: BIDMat.Mat =
0.91933 0.054223 0.0029891 0.97654 0.045690 0.60623 0.45060 0.91492 0.00043926 0.77734 0.033706 0.27043 0.54978 0.25489 0.47594 0.35749...
0.54238 0.62099 0.89218 0.71610 0.86469 0.69615 0.28406 0.93660 0.15506 0.091589 0.48099 0.82811 0.79517 0.67379 0.15774 0.88002...
0.82720 0.88533 0.27555 0.88446 0.71065 0.61293 0.14963 0.053130 0.36748 0.25745 0.21042 0.96924 0.70845 0.77234 0.042210 0.76511...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> a*c
java.lang.RuntimeException: operator * not implemented for SMat and FMat
at BIDMat.Mop$class.notImplemented(Operators.scala:327)
at BIDMat.Mop_Times$.notImplemented(Operators.scala:382)
at BIDMat.Mop$class.sop(Operators.scala:28)
at BIDMat.Mop_Times$.sop(Operators.scala:382)
at BIDMat.Mop$class.op(Operators.scala:143)
at BIDMat.Mop_Times$.op(Operators.scala:382)
at BIDMat.SMat.$times(SMat.scala:531)
... 33 elided
The GSMat class contains no transpose method. Something like this might work:
override def t = {
val out = GSMat.newOrCheckGSMat(ncols, nrows, nnz0, null, GUID, "t".##)
CUMATD.transpose(this.data, nrows, out.data, ncols, nrows, ncols)
cudaDeviceSynchronize()
out
}
(Note the extra nnz0 term to include when checking the cache for GSMats).
However, I did a few tests and got some odd behavior when multiplying GSMats with other matrices (e.g., GSMat(a) * GMat(mkdiag(ones(3,1)))), so I wanted to check whether GSMats are really supposed to have transposes. It seems they should, since SMats do.
The mean, sum, and transpose operators do not support GMat so far.
I am implementing a sum method for TMat's as it appears frequently and seems useful. The plan is to iterate over tiles and to apply the sum individually on them, and then aggregate the result in a single vector, likely an FMat or GMat. I am tempted to allocate sum vectors of size matching the result vector for each of the constituent tiles, which is a waste of memory. To be memory efficient, this method could sum the (sum of the) tiles with the running result vector, in place. The problem is tracking the indices within the result vector - it is much easier to just sum vectors of a uniform size - and I'm not sure if the existing vector sum methods support smart sub-indexing. Thus I propose a sum with offset function - it would be quite similar to tileMult, in spirit.
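The offset idea can be sketched with plain arrays. The Tile shape below (an x offset, dimensions, and column-major data) is a simplifying assumption for illustration, not TMat's actual layout or API:

```scala
// A tile occupying columns [x, x + ncols) of the full matrix, column-major.
case class Tile(x: Int, nrows: Int, ncols: Int, data: Array[Float])

// "Sum with offset": fold each tile's column sums into the right slice of a
// single shared result vector, with no per-tile temporary vectors.
def sumTilesInto(tiles: Seq[Tile], result: Array[Float]): Unit =
  for (t <- tiles; j <- 0 until t.ncols) {
    var s = 0f
    var i = 0
    while (i < t.nrows) { s += t.data(j * t.nrows + i); i += 1 }
    result(t.x + j) += s  // accumulate in place at the tile's column offset
  }

// Two overlapping 2x2 tiles covering columns 0-1 and 1-2:
val t1 = Tile(0, 2, 2, Array(1f, 2f, 3f, 4f)) // column sums 3, 7
val t2 = Tile(1, 2, 2, Array(1f, 1f, 1f, 1f)) // column sums 2, 2
val res = new Array[Float](3)
sumTilesInto(Seq(t1, t2), res)                // res = Array(3, 9, 2)
```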
Given that the omat parameter is always generic anyway, it's likely we could write a catch-all newOrCheck method in Mat and override it individually. The reason I mention it is that implementing caching for generic matrix methods could end up requiring quite a few pattern matches. It's manageable at the moment (just FMat's, SMat's, GMat's and GSMat's) but could grow in complexity.
In practice the methods pattern match on the omat type inside the function body, so it should be 'safe' to make the method generic.
https://github.com/BIDData/BIDMat/blob/master/src/main/scala/BIDMat/DMat.scala#L1459-L1465
Something might be lost vis-a-vis (e.g.) reflection, though. I still don't understand Scala's type erasure semantics very well.
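A catch-all along those lines might look like the following sketch. The case classes are stand-ins for BIDMat's hierarchy, and the real newOrCheck methods also consult the GUID-keyed cache, which is omitted here:

```scala
// Stand-ins for BIDMat's matrix types.
sealed trait Mat { def nrows: Int; def ncols: Int }
case class FMat(nrows: Int, ncols: Int) extends Mat
case class SMat(nrows: Int, ncols: Int) extends Mat

// One generic entry point: reuse omat when its runtime type and shape fit,
// otherwise allocate a fresh matrix of the matching type.
def newOrCheck(nr: Int, nc: Int, omat: Mat): Mat = omat match {
  case m: FMat if m.nrows == nr && m.ncols == nc => m
  case _: FMat                                   => FMat(nr, nc)
  case m: SMat if m.nrows == nr && m.ncols == nc => m
  case _: SMat                                   => SMat(nr, nc)
  case null                                      => FMat(nr, nc) // default dense
}
```

The pattern matches grow linearly with the number of matrix types, which is exactly the complexity concern above; subclass overrides would keep each match local to its own class.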
scala> val a = rand(3,4)
a: BIDMat.FMat =
0.40857 0.65735 0.18252 0.21719
0.82198 0.38203 0.57885 0.25747
0.13365 0.036090 0.57698 0.28741
scala> val b = rand(3,4)
b: BIDMat.FMat =
0.44102 0.63252 0.38195 0.052384
0.99267 0.053118 0.0094332 0.070619
0.34677 0.82116 0.74243 0.42345
scala> a *^ b
java.lang.ArrayIndexOutOfBoundsException: 9
at BIDMat.FMat.fDMultTHelper(FMat.scala:752)
at BIDMat.FMat.multT(FMat.scala:778)
at BIDMat.FMat.$times$up(FMat.scala:1253)
... 33 elided
I can't tell if cummaxByKey is implemented for GPU matrices. If it's not, I think it's just missing a few methods in one of the .cu files in BIDMat's GPU code directory (which was recently refactored in commit f9f5720).
Example on my Mac 10.9 laptop:
I do git pull, then cd BIDMat/jni/src; make clean; ./configure; make; make install to get the CUDA libraries. Then I do ./sbt package. The following code shows the GPU version failing (while the CPU version works):
scala> val a = rand(3,10)
a: BIDMat.FMat =
0.17454 0.72285 0.33957 0.23828 0.47790 0.41824 0.37690 0.083308 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.76176 0.66493 0.75904 0.0053793 0.65965
0.16096 0.34379 0.048064 0.23330 0.32797 0.86558 0.81054 0.53446 0.91294 0.60398
scala> val ga = grand(3,10)
ga: BIDMat.GMat =
0.78258 0.081876 0.44771 0.53182 0.63474 0.78777 0.73650 0.69435 0.39765 0.44639
0.53618 0.083131 0.098099 0.34346 0.027273 0.90388 0.29731 0.14853 0.83988 0.091234
0.14477 0.36829 0.64452 0.77051 0.31326 0.89641 0.67069 0.90067 0.67521 0.99995
scala> val keys = zeros(3,10)
keys: BIDMat.FMat =
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
scala> val gkeys = gzeros(3,10)
gkeys: BIDMat.GMat =
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
scala> cummaxByKey(a,keys)
res9: BIDMat.FMat =
0.17454 0.72285 0.33957 0.23828 0.47790 0.41824 0.37690 0.083308 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.76176 0.66493 0.75904 0.39092 0.91638
0.91813 0.83321 0.55037 0.63189 0.80421 0.86558 0.81054 0.75904 0.91294 0.91638
scala> cummaxByKey(ga,gkeys)
java.lang.UnsatisfiedLinkError: edu.berkeley.bid.CUMAT.cummaxByKeyFL(Ljcuda/Pointer;Ljcuda/Pointer;Ljcuda/Pointer;J)I
at edu.berkeley.bid.CUMAT.cummaxByKeyFL(Native Method)
at BIDMat.GMat.cummaxByKey(GMat.scala:974)
at BIDMat.GMat.cummaxByKey(GMat.scala:1004)
at BIDMat.SciFunctions$.cummaxByKey(SciFunctions.scala:1070)
... 33 elided
Thanks.
Steps:
Partial solution: Add these lines to build.sbt:
libraryDependencies += "org.apache.commons" % "commons-math3" % "3.0"
libraryDependencies += "net.jpountz.lz4" % "lz4" % "1.2.0"
libraryDependencies += "org.scala-saddle" % "jhdf5" % "2.9"
However, after adding these, the build still fails, so either the repo contains unbuildable code or I selected the wrong version(s) of these libraries.