biddata / bidmach_spark
Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix).
For this to work, do I need Spark built with Scala 2.11 support? The Spark binaries currently available for download are built with Scala 2.10.
http://spark.apache.org/downloads.html
I would like to create a small PR with a small test and/or example. Some suggested code to use for these would be appreciated.
Otherwise I will attempt to reverse engineer something from the APIs.
The compilation errors below refer to code that is not checked in (and never has been) to BIDMat
package
[warn] Credentials file /Users/steve/.ivy2/.credentials does not exist
[info] Compiling 3 Scala sources to /git/BIDMach_Spark/target/scala-2.11/classes...
[error] /git/BIDMach_Spark/src/main/scala/BIDMach/RunOnSpark.scala:107: value combineModels is not a member of BIDMach.models.Model
[error] l.model.combineModels(ipass, r.model)
[error] ^
[error] /git/BIDMach_Spark/src/main/scala/BIDMach/RunOnSpark.scala:116: value updateM is not a member of BIDMach.Learner
[error] reduced_learner.updateM(0)
[error] ^
[error] /git/BIDMach_Spark/src/main/scala/BIDMach/RunOnSpark.scala:130: value updateM is not a member of BIDMach.Learner
[error] reduced_learner.updateM(i)
[error] ^
[error] three errors found
error Compilation failed
[error] Total time: 1 s, completed May 11, 2016 9:25:53 AM
Similar to mat = HMat.loadMat("/tmp/A.smat"), if "/tmp/A.smat" is located in HDFS, can we load it directly? i.e. something like mat = hdfsio.readMat("hdfs:///myhostname/tmp/A.smat").
From the code, readMat/Mats only read from SequenceFiles, not from .xmat files directly. Is that right?
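One detail in the URI above is worth checking independently of which loader is used. The following is a minimal sketch using only `java.net.URI`; the names `HdfsUriCheck` and `hostAndPath` are made up for illustration and are not part of BIDMat or BIDMach_Spark. It shows that the three-slash form `hdfs:///myhostname/...` parses with an empty authority, so `myhostname` ends up as the first path component rather than as the host:

```scala
import java.net.URI

object HdfsUriCheck {
  // Hypothetical helper: returns (host, path) as parsed by java.net.URI.
  // The host is None when the URI has an empty authority.
  def hostAndPath(s: String): (Option[String], String) = {
    val u = new URI(s)
    (Option(u.getHost), u.getPath)
  }

  def main(args: Array[String]): Unit = {
    // Three slashes: no host, "myhostname" is part of the path.
    println(hostAndPath("hdfs:///myhostname/tmp/A.smat"))
    // Two slashes: "myhostname" is the host, the path is /tmp/A.smat.
    println(hostAndPath("hdfs://myhostname/tmp/A.smat"))
  }
}
```

If `myhostname` is meant to be the NameNode, the two-slash form is probably what was intended. With an empty authority, Hadoop clients generally fall back to the default file system from their configuration.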
Where does this BIDMat.SerText class come from?
I can't find a reference to it anywhere in the BIDMat repo.
error: object SerText is not a member of package BIDMat
import BIDMat.SerText
I have the suggested AMI up and running with BIDMach. I cloned this repo into it, but (a) I do not know where HADOOP_HOME is, and (b) I am getting a lot of compilation errors.
./sbt package
[ec2-user@ip-10-217-152-46 BIDMach_Spark]$ ./sbt package
[info] Set current project to BIDMatHDFS (in build file:/opt/BIDMach_Spark/)
[warn] Credentials file /home/ec2-user/.ivy2/.credentials does not exist
[info] Compiling 3 Scala sources to /opt/BIDMach_Spark/target/scala-2.11/classes...
[error] /opt/BIDMach_Spark/src/main/scala/BIDMach/RunOnSpark.scala:3: object IteratorSource is not a member of package BIDMach.datasources
[error] import BIDMach.datasources.IteratorSource
[error] ^
(52 errors)
Note that I did already copy the BIDMach.jar and BIDMat.jar to the lib dir.
$ ll lib
total 3824
-rwxrwxr-x 1 ec2-user ec2-user 134344 Feb 24 2015 lz4-1.1.2.jar
-rwxr-xr-x 1 ec2-user ec2-user 1756333 Feb 24 2015 BIDMat.jar
-rwxr-xr-x 1 ec2-user ec2-user 871121 Feb 24 2015 BIDMach.jar
-rw-r--r-- 1 ec2-user ec2-user 1141129 May 12 23:03 sbt-launch.jar
drwxr-xr-x 9 ec2-user ec2-user 4096 May 12 23:06 ..
drwxr-xr-x 2 ec2-user ec2-user 4096 May 12 23:11 .
[error] /home/ubuntu/Downloads/BIDMach_Spark/src/main/scala/BIDMat/MatIO.scala:31: type mismatch;
[error] found : java.io.DataOutput
[error] required: String
[error] case fM:FMat => {out.writeInt(MatTypeTag.FMat); HMat.saveFMat(out, fM);}
[error]
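The mismatch above says the call site passes a `java.io.DataOutput` where the `HMat.saveFMat` on the classpath expects a `String`. For readers unfamiliar with the pattern that line appears to follow (write an integer type tag, then the matrix payload, to a stream), here is a minimal stdlib-only sketch. Everything in it is an assumption for illustration: `TaggedMatrixSketch`, `writeTagged`, `readTagged`, the tag value, and the byte layout are hypothetical and are not BIDMat's actual API or file format.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.io.{DataInput, DataInputStream, DataOutput, DataOutputStream}

object TaggedMatrixSketch {
  // Hypothetical type tag; BIDMat's real MatTypeTag values are not shown in the log.
  val FMatTag = 1

  // Write a type tag, the dimensions, then the float values to any DataOutput.
  def writeTagged(out: DataOutput, nrows: Int, ncols: Int, data: Array[Float]): Unit = {
    require(data.length == nrows * ncols, "data length must equal nrows * ncols")
    out.writeInt(FMatTag)
    out.writeInt(nrows)
    out.writeInt(ncols)
    data.foreach(v => out.writeFloat(v))
  }

  // Read back what writeTagged produced, checking the tag first.
  def readTagged(in: DataInput): (Int, Int, Array[Float]) = {
    val tag = in.readInt()
    require(tag == FMatTag, s"unexpected type tag: $tag")
    val nrows = in.readInt()
    val ncols = in.readInt()
    val data = Array.fill(nrows * ncols)(in.readFloat())
    (nrows, ncols, data)
  }

  // Round trip through an in-memory buffer.
  def roundTrip(nrows: Int, ncols: Int, data: Array[Float]): (Int, Int, Array[Float]) = {
    val bytes = new ByteArrayOutputStream()
    writeTagged(new DataOutputStream(bytes), nrows, ncols, data)
    readTagged(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray)))
  }
}
```

A stream-based writer like this works with any `DataOutput`, which is the shape the failing call site assumes. A `String` parameter, by contrast, suggests a file-name based overload in the jar being compiled against.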