feldspar / feldspar-language Goto Github PK
View Code? Open in Web Editor NEWThe goal of the Feldspar project is to define a high-level language that allows description of high-performance digital signal processing algorithms.
License: Other
The goal of the Feldspar project is to define a high-level language that allows description of high-performance digital signal processing algorithms.
License: Other
Implement a function for constructing dynamic tables:
table :: Syntax a => [a] -> Data [Internal a]
This is needed for making lookup tables, and it is also needed by the Repa library.
minmintest :: Vector1 IntN -> Vector1 IntN -> Vector1 IntN
minmintest lx lz = zipWith3 (\x y z -> 7) lforward lforward lx
where
lforward = replicate (i2n (lx ! 0)) (zipWith (+) lx lz)
printExpr minmintest
(\var0 -> (\var1 -> (letBind (i2n (var0 ! 0)) (\var3 -> (parallel (min var3 (min var3 (getLength var0))) (\var2 -> 7))))))
Changing Vector to use Prelude.foldl1 instead of Prelude.foldr1 gives:
printExpr minmintest
(\var0 -> (\var1 -> (parallel (min (i2n (var0 ! 0)) (getLength var0)) (\var2 -> 7))))
Range propagation should be aware of conditionals (and asserts?). For example:
condition (x<10) (max x 10) 10
can be reduced to the literal 10.
As seen in #122, I'm working on adding support for system calls to Feldspar.
There are some issues related to this that I would like to discuss.
I'm currently (ab)using the M
monad for system calls which is not so good given that M
is for mutable data structures and that it has a run function. A better solution seems to be to define a new Sys
monad as a newtype wrapper. The core language would use the same monad for both Sys
and M
, but the interfaces would be different:
Sys
runMutable
only works for M
The only problem I see with this is that one would have to redefine the mutable data structures for the Sys
monad, since we need references and arrays also at the system level.
Another option might be to just have a single monad, and have a static analysis that checks for sys calls under runMutable
.
foreignCall
requires a semantic function to be provided. But I think it will be hard to model all sys calls as Haskell functions. So I suggest that we simply leave the semantics undefined for sys calls (when there is no sensible semantics to give).
Here is my motivation: Sys calls will be used to implement various system abstractions in Feldspar. These abstractions should be implemented as shallow or deep EDSLs on top of Feldspar, but their implementation should not rely on sys calls. Instead, we will provide one or more compilation functions that translate the EDSL to Feldspar programs with sys calls. But when evaluating the EDSL, we should not use the compilation function but rather a specialized evaluator for the EDSL (that calls Feldspar's eval
).
As an example, the implementation of Stream
does not use sys calls, but one can imagine a compilation function that translates a stream into a loop in the Sys
monad that uses sys calls write out the output stream.
I used splitAt and got
(min var3 (shiftR var3 1))
in the output from printExpr. I was about to add an rewrite rule but I get a cyclic import problem:
Module imports form a cycle:
module `Feldspar.Core.Constructs.Logic' (work/feldspar-language/src/Feldspar/Core/Constructs/Logic.hs)
imports `Feldspar.Core.Constructs.Ord' (work/feldspar-language/src/Feldspar/Core/Constructs/Ord.hs)
which imports `Feldspar.Core.Constructs.Bits' (work/feldspar-language/src/Feldspar/Core/Constructs/Bits.hs)
which imports `Feldspar.Core.Constructs.Logic' (work/feldspar-language/src/Feldspar/Core/Constructs/Logic.hs)
Can the problem be avoided easily or is a major refactoring required?
Introduce a select combinator which is rendered as a multiple choice in the backend (e.g switch statement in C)
-- | Multiple choice. The Index in the first argument is used to select a result from the
-- key-value list in the second argument. The third argument provides a default result if
-- the key is nor present.
select :: Syntax a => Data Index -> [(Index, a)] -> a -> a
See this proposal.
See also the issue for feldspar-compiler.
Currently we do optimizations before running code motion, which means that optimizations are not aware of any sharing in the program. The syntax tree without sharing can be exponentially larger than the tree with sharing, which causes a blow-up in compilation time for certain programs.
Here is an example of a program that takes quite a long time to process despite being quite small after code motion:
f as = force (bs ++ cs)
where
(bs,cs) = splitAt (length as `div` 2) as
prog1 :: Vector1 Index -> Vector1 Index
prog1 as = Prelude.iterate f as Prelude.!! 3
test1 = drawAST prog1
Here is the same program with explicit sharing:
iterate :: Syntax a => (a -> a) -> a -> [a]
iterate f a = a : iterate f (share a f)
prog2 :: Vector1 Index -> Vector1 Index
prog2 as = iterate f as Prelude.!! 3
test2 = drawAST prog2
test2
terminates instantly.
One possibility would be to run optimizations after code motion, but that wouldn't work because our optimizations don't work across let bindings.
We need some kind of graph representation of the syntax (possibly using a special kind of let binding), and adapt the optimizations to work on this representation.
A related issue is that code motion itself is slow because it doesn't detect any sharing in the original program. This should be fixed in the Syntactic package.
Are we ready to deprecate Feldspar.Matrix in favour of Feldspar.Repa?
(The Mul
class etc. may still be useful.)
(@josefs)
It is not possible to build the modules outside cabal since they do not have the language pragmas
This could be used in the back end to identify the saved value (e.g. used as the variable name).
QuickCheck reports the following bug:
prop_rangeRem Int8: FAIL
*** Failed! Falsifiable (after 60 tests and 1 shrink):
Range {lowerBound = 124, upperBound = 127}
Range {lowerBound = -128, upperBound = -1}
127
-128
It is indeed a bug in rangeRem.
Make a collection of commented example files to serve as a Feldspar user's guide.
Each file should be relatively self-contained and focus on a single topic.
Topic suggestions:
> eval $ case Indexed 0 (\i -> err "sdf") Empty :: Matrix Index of Indexed l ixf Empty -> (ixf 0)
*** Exception: Prelude.undefined
Expected result
*** Exception: sdf
Due to constant folding?
In the system-core branch I've added a construct to do foreign import of system calls. For example:
alert :: Data WordN -> M ()
alert = foreignCall "alert" print
How should I represent these calls in the middle end? @pjonsson
Call
seems like a good candidate, but it seems to have to do with forking, so I'm not sure if that would be abuse.
It seems wrong to use ForeignImport
, because it's handled like an expression in the back end, and the only way to distinguish a system call from a pure function call would be to look at the return type.
Another option is of course to add a new construct in the middle end.
Minor note: foreignCall
currently allows any monadic return type, but it could easily be restricted to M ()
if that's more appropriate for a sys call.
For example:
getIx :: Syntax a => Data [Internal a] -> Data Index -> a
setIx :: Syntax a => Data [Internal a] -> Data Index -> a -> Data [Internal a]
Is this a good idea?
*Main> printExpr $ \a -> parallel 10 (*2) ! a
(\var0 -> ([0,2,4,6,8,10,12,14,16,18] ! var0))
Expected result:
*Main> printExpr $ \a -> parallel 10 (*2) ! a
(\var0 -> (var0 * 2))
div
is currently defined in terms of /
. When Feldspar/feldspar-compiler#16 has been fixed, it can be made a primitive again.
Multidimensional vectors leave index calculations in the code, and some of them are not necessary. Here's one example:
divmodtest :: Data Index -> Data Index -> Data Index
divmodtest x y = x `rem` y + y * (x `quot` y)
I have the rule for simplification implemented, but Num.hs can not import Integral.hs without creating a cyclic import. The offender is the second rule in Integral.hs, which optimizes x quot
(y^2k).
Short term: is that an important rule, or can we disable it until later?
Longer term: Can we solve the cyclic import issues for numbers by simply putting the number-related things, except for maybe complex numbers into the same module?
I have a program which tries to store an intermediate vector to memory. But the vector is still being fused.
f :: Manifest DIM1 (Data Double) -> Manifest DIM1 (Data Double)
f = store . s1 . store. s1
s1 vs = zipWith avg vs (tail vs)
avg a b = (a + b) / 2
The generated code for f
is the same (modulo variable names) as if the middle store
in between the two calls to s1
wasn't there.
The current implementation is quite incomplete and it only deals with merging literals, so it can't cancel out variables; e.g.
(x+x)-x ===> x
It would be better to optimize a whole arithmetic expression at once. Gather all variables in one list, all literals in one list and all non-arithmetic sub-terms in one list. Then make a new optimized expression by combining the three lists.
However, doing this compositionally will probably lead to a lot of re-traversals of the same sub-terms, so the optimization framework will probably have to be modified so that arithmetic optimization only happens at feasible places (i.e. arithmetic sub-terms whose parents are not arithmetic expressions).
We should also think about how to deal with examples like this one (reported by Gergely):
> icompile (\(n::Data Word8) -> (n-2)+1)
...
void test(uint8_t v0, uint8_t * out)
{
(* out) = (v0 + 255);
}
The result is correct, but not ideal from a readability point of view.
Right now, these are effectively 32 bit numbers. They should really be target-dependent.
I tried to use feldspar today, but it was not cabal install-able because of the following problem.
src/Feldspar/Core/Types.hs:257:10:
Duplicate instance declarations:
instance Show (IORef a)
-- Defined at src/Feldspar/Core/Types.hs:257:10
instance Show (IORef a)
-- Defined in `monad-par-0.3.4.1:Control.Monad.Par.Scheds.DirectInternal'
Failed to install feldspar-language-0.6.0.2
Best,
Takayuki
When a MultiDim vector is converted to its internal representation, type information is lost.
Internal (Pull DIM2) a
becomes
([Length],[Internal a])
The number of dimensions is converted into a runtime property and the information is lost at the type level.
Among other things, this makes it impossible to write a good Arbitrary
instance that will generate only arrays with the correct number of dimensions.
Can we make the internal representation an explicit (new)type instead of the pair and encode the dimensions so that the number of dimensions is still available in the type.
cc @josefs
See the TODO in the optimization of ForLoop
.
If so, this could be done by adding a field to Info
saying whether or not a node contains save
.
From the old trac:
We currently optimize division by a constant, but not remainder by a constant.
Feldspar has a few corners with undefined semantics. These are the ones I can think of:
undef
and err
getIx
and setIx
setLength
can be used to make an array longer, which semantically amounts to appending undefined elements.Some undefined programs result in errors in eval
, but invalid size annotations will instead result in a bogus value. The C code usually produces a bogus value for undefined programs, but it can probably also crash with a segfault.
It would be nice to have a safe compilation mode where all undefined behavior would be caught by assert/abort in the C code.
Question: Is it an error to use setLength
to make an array longer? If yes, this error could easily be caught by an assert. If no, I don't think it's possible to catch errors from indexing uninitialized elements. It would require keeping track of exactly which elements of an array are initialized and which are not. This ultimately requires storing an extra Boolean for each element.
A related problem is when eval
has well-defined semantics, but the C code has not. This is the case for e.g. [http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html signed integer overflow]. Should this be caught by an assert as well?
From the old trac:
Currently we only optimize division by a constant when it is a power of two. This is the simple case; division can be replaced by a right shift. But it is also possible to optimize the general case.
Interpretation.hs currently has hard coded compilation targets. This is non-modular. A better solution would be to parameterize optimization on the set of rewrite rules and let the back end decide which set to use.
Add support for rewriting compiler known functions before optimizations occur, see Feldspar/feldspar-compiler#101.
...to make nesting easier, see this page.
Any objections?
I've been trying to optimize doubly-nested let-bindings and my rewrite rules did not fire. Here's one test case:
stestL2 :: Data Index -> Data Length -> Data [[Index]]
stestL2 m x = parallel x (\x1 -> let z = let y = x `mod` m in (y, y) in parallel 2 (\x -> fst z))
Gives the result:
*Main> printExpr stestL2
(\var0 -> (\var1 -> (letBind (letBind (rem var1 var0) (\var4 -> (parallel 2 (\var3 -> var4)))) (\var5 -> (parallel var1 (\var2 -> var5))))))
Turns out that the let-bindings are introduced by codeMotion and never passed through the rewrite rules which explains why my rules never matched.
I would prefer to get the nesting of the let-bindings flattened (and the let-bindings floated after that), but there's currently no hook after codeMotion is performed in feldspar-language. Could we re-run the optimize function after codeMotion, or alternatively add another pass that I can hook into?
See this issue in the compiler.
The code motion strives to push shared expressions as high up as possible in the syntax tree. When combined with the FUTURE
feature some expressions will "travel in time" and execute now instead of in the future.
import qualified Prelude
import Feldspar
import Feldspar.Vector
import Feldspar.Algorithm.FFT
pmap f = map await . force . map (future . f)
p0 = pmap (fft . ifft)
In p0
, we expect the syntax tree to consist of two major parts; the fused application of fft . ifft
wrapped in a future
, and a loop await
-ing the results.
However, we end up with a tree where the ifft
is performed before spawning any future
expressions.
The code motion tries to move (hoist) shared expressions as high up as possible in the syntax tree to maximize sharing. An expression is eligible for hoisting if the outermost symbol allows sharing and the expression is used at least twice.
In our example the resulting vector of the ifft
is used several times inside the fft
(both for its length and its values) and thus it is selected for sharing.
Now, the code motion does not know about the future
construct and thus the expression is lifted above the future
in the syntax tree. This lifting makes it difficult for a programmer to predict which code will run in parallel and which will be computed serially before the parallel parts of the program.
A fix is provided in Feldspar/feldspar-language@ 7bac736.
The fix depends on unreleased changes to the syntactic library.
With the fix it is possible to disallow sharing through a symbol, i.e. make hoistOver
return False
.
Some files in the tutorial don't build anymore (due to the multidimensional arrays). Reenable the tutorial test suite once this has been fixed.
The Range module is usable outside Feldspar
Let
is intended to be strict (at least that's how it's compiled to C), but eval
gives non-strict semantics.
Break Vector
into SimpleVector
and Vector
. SimpleVector
would be single-segment vectors, and a Vector
could be defined as
type Vector a = [SimpleVector a]
Range propagation defaults to universal for signed operands. Universal is of course a correct upper bound of the range, but it is not tight.
Add a function like cap
that has "union" semantics instead of "intersection". This can be used e.g. to increase the size of the initial state of a for loop to make sure that the fixed-point is reached quickly.
Not meaningful until #15 is fixed.
The history of this ticket is that #38 exposed some expressions that should be optimized, for example min var3 (var3 >> 1)
. That optimization was checked in with the class Monotonic
. Commit a23d9a1 renamed Monotonic
to Crease
, but that is not a great name. The optimization rule before that commit reads:
constructFeatOpt _ (C' Min) (a :* b :* Nil)
| as <- viewMonotonicDec a
, any (alphaEq b) as = return a
I am convinced about the correctness of the optimization suggested in #38. Revisiting this generalization makes me a bit worried that the API is suggestive for making incorrect rewrite rules that will show upp in transitive uses, but I don't have a counter example at hand. I will try to construct a counter example this week.
The second issue is the name of the class. Aren't we really trying to Ord
functions and their output?
f x < x
I had a closer look at the problem mentioned in #122 and that needs a bit of deeper surgery. Tracking progress in this ticket.
It is possible to use runMutable
in an unsafe way. We decided no to use the "ST trick" to enforce safe use in order to keep the types simple.
It should be possible to make a static analysis that discovers any unsafe use of runMutable
. The downside compared to a type-based approach is that a static analysis won't be able to point to a line number when a program is rejected.
There is no test case for range propagation over shift operators. Also, the implementation should be updated to propagate over signed operands.
This blocks #13.
The first pattern match in enumFromTo in MultiDim is faulty:
Main> eval $ length $ (0...(7::Data Index))
6
*Main> eval $ length $ (1...(8::Data Index))
8
*Main> eval $ (0...(7::Data Index))
([6],[0,1,2,3,4,5])
*Main> eval $ (1...(8::Data Index))
([8],[1,2,3,4,5,6,7,8])
Both printExpr and printExprUnOpt hangs on the following for me:
*Feldspar> printExprUnOpt $ parallel 10 (*2)
Tried going back a couple of months in the repository to find a working point but they all seem to hang on printExpr. Anything obvious that pops out?
Optimization should throw an error when the size of an expression is over-constrained. It can only happen if there's a bug in the general size inference, or if the user has stated invalid size constraints. In both cases it may lead to incorrect optimizations, so throwing an error seems preferable.
optimizeM
should turn any expression with a singleton range into a literal. That way, optimizations that match on literals won't have to check for singleton ranges as well.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.