Fragment-based code distribution!

License: BSD 3-Clause "New" or "Revised" License

Fragnix

Fragnix is an experimental code package manager for Haskell. The central idea is that we should share and reuse code in units of small code fragments instead of in units of packages. The current state of development is technology preview, not even alpha.

Installation

Follow these steps to get a development version of fragnix. You need at least GHC 8.0.

git clone https://github.com/fragnix/fragnix
cd fragnix

Building with cabal < 3.0

cabal new-build
cabal new-run fragnix

You can install fragnix on your system with:

cabal new-install exe:fragnix

This should place a fragnix executable in ~/.cabal/bin.

Building with cabal >= 3.0

You can use the same cabal instructions as above, omitting the "new-" prefixes.

Building with stack

stack build
stack exec fragnix

You can install fragnix on your system with:

stack install fragnix

The fragnix executable is then in ~/.local/bin.

Example

If you have completed the installation you have a fragnix executable. The following assumes you have it somewhere in your PATH. You also have to have GHC 8.0 in your PATH.

In tests/quick/HelloFragnix/ we have two Haskell module files: Greet.hs and Main.hs.

module Greet where

putHello :: String -> IO ()
putHello x = putStrLn ("Hello " ++ x)

putHi :: String -> IO ()
putHi x = putStrLn ("Hi " ++ x)

module Main where

import Greet (putHello)

main :: IO ()
main = putHello "Fragnix!"

When we provide fragnix build with a list of Haskell module files, and there is a Main module that contains a definition for main, it will build a program.

> fragnix build ./tests/quick/HelloFragnix/Greet.hs ./tests/quick/HelloFragnix/Main.hs
Loading environment ...
Took   0.38s
Parsing modules ...
Took   0.05s
Extracting declarations ...
Took   0.35s
Slicing ...
Took   0.00s
Updating environment ...
Took   0.00s
Compiling 792930286580032004
Generating compilation units...
Took   0.06s
Invoking GHC
[1 of 2] Compiling F6662111434988992012 ( fragnix/temp/compilationunits/F6662111434988992012.hs, fragnix/temp/compilationunits/F6662111434988992012.o )
[2 of 2] Compiling F792930286580032004 ( fragnix/temp/compilationunits/F792930286580032004.hs, fragnix/temp/compilationunits/F792930286580032004.o )
Linking main ...
Took   3.63s

We can then invoke the produced executable main:

> ./main
Hello Fragnix!

Vision

Fragnix is an experiment to find out if the advantages of fragment-based code distribution can be realized. This is my vision of modern code distribution.

Lightweight dependencies

I've read the term "inflict a dependency". Reusing code should be something good. You should never have to think twice about whether to use something existing, roll your own, or copy-paste part of a package into your codebase.

Easy contribution

Some packages are missing helpful functionality. Sometimes we implement a helper function that we are sure someone else must have implemented as well but don't bother to release it. Another common pattern is to release a packagename-extras package on hackage. Fragnix allows you to contribute useful functions without going through the maintainer and without releasing (and maintaining) a package. You write the useful function, click submit and it is online for others to enjoy.

Discoverability

Code forms a giant directed acyclic graph. We can use this graph to rank code search results and recommend related functions. We can find real-world example uses of a given function. "People who have used this function have also used...".

Immutable code

Code is immutable. What we today call a "change to a function" is actually a different function that happens to have the same name. The old function will never go away. You can use both in your project at the same time. This reduces the tension between stability and evolution and hopefully eliminates one use-case of the C preprocessor.

First class updates

Your environment is frozen by default. You have to explicitly apply updates. Updates can have metadata (non-breaking, performance, whitespace, ...). Different people can have different policies which updates to apply automatically. There could be a tool for example that (automatically) applies all non-breaking updates to your environment. The hope is that updates have a finer granularity than today. Then you could for example selectively apply a non-breaking bugfix without also getting the breaking changes.
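
One way to picture update policies is as plain predicates over tagged updates. The following is only a sketch under assumptions: the types UpdateKind, Update and Policy and the functions nonBreakingOnly and selectUpdates are hypothetical names for illustration and are not part of fragnix.

```haskell
-- Hypothetical metadata tags an update could carry.
data UpdateKind = NonBreaking | Breaking | Performance | Whitespace
  deriving (Eq, Show)

-- Hypothetical update: replace one slice by another, tagged with metadata.
data Update = Update
  { oldSlice :: Integer
  , newSlice :: Integer
  , kinds :: [UpdateKind]
  } deriving (Eq, Show)

-- A policy decides which updates are applied automatically.
type Policy = Update -> Bool

-- Accept only updates that are marked non-breaking and not marked breaking.
nonBreakingOnly :: Policy
nonBreakingOnly update =
  NonBreaking `elem` kinds update && Breaking `notElem` kinds update

-- Select the updates a policy would apply to a frozen environment.
selectUpdates :: Policy -> [Update] -> [Update]
selectUpdates = filter
```

With such a representation, different people could share the same stream of updates and differ only in the policy they pass to selectUpdates.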

Platform support

Fragnix reduces your project to a small set of Haskell modules. If you want to build your code on a different platform you can invoke GHC on that platform on the set of generated modules. This should make it easier to build your project on, for example, a Raspberry Pi. Under traditional package-based dependency management, the build fails if some part of a cabal package is not supported on the target platform, even if you don't actually use the unsupported part. With fragnix the build will succeed.

First class environments

Multiple environments for example for beginners, web development, data science and so on can coexist. We see a recent trend to develop custom Preludes. This is the same idea, only fully supported. As long as different environments rely on the same core data types they are compatible.

Foreign function interface

On some platforms (C, Javascript, JVM) we have to integrate with foreign code. It is an open question where fragnix dependency management will stop.

Compatibility with multiple Haskell compilers

A Haskell program is a set of modules (according to the Haskell Report 2010). Fragnix produces a set of plain Haskell modules. All the compiler has to do is take this set of modules and produce a program. No special support required. Fragnix should eventually work with Haskell compilers like GHCJS, Haste, GHC, Fay, Frege, Clash, Purescript, UHC, JHC, ETA, HaLVM, CodeWorld and those that are not written yet. Some code fragments will be shared across compilers and some won't.

Lower binary size and compilation time

Fragnix does dead code elimination by design. Because the dead code elimination is static it helps to avoid compilation of large parts of programs. This should speed up compilation.
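
The idea can be sketched as a reachability computation over slice dependencies: only slices reachable from the main slice are ever handed to the compiler. This is a simplified illustration, not fragnix's implementation; the Dependencies type and the reachable function are assumptions made for the example.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type SliceId = Integer

-- Each slice maps to the slices it directly uses (assumed representation).
type Dependencies = Map.Map SliceId [SliceId]

-- All slices reachable from the root. Everything else is never compiled.
reachable :: Dependencies -> SliceId -> Set.Set SliceId
reachable deps root = go Set.empty [root]
  where
    go seen [] = seen
    go seen (s : rest)
      | s `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert s seen) (Map.findWithDefault [] s deps ++ rest)
```

In the HelloFragnix example above, the build log lists only two compilation units even though Greet.hs defines both putHello and putHi.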

Which code you use is obvious and explicit

Find out if the code you rely on uses certain unsafe features. Find out if you are affected by a security vulnerability. Find out if any function you use is deprecated.

Cache compilation results forever

Fragnix uniquely identifies code fragments by a hash. We can cache the compilation results based on this hash even across machines.
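
A minimal sketch of such a cache, assuming the slice hash is already available as an Integer and that a compilation result can be represented as a String. The names Cache and compileCached are hypothetical.

```haskell
import qualified Data.Map.Strict as Map

-- The hash that identifies a slice (assumed to be computed elsewhere).
type SliceId = Integer

-- Cache from slice hash to compilation result (a String stands in for
-- object code here).
type Cache = Map.Map SliceId String

-- Look up the result by hash and only invoke the compiler on a miss.
compileCached :: (SliceId -> String) -> SliceId -> Cache -> (String, Cache)
compileCached compile sliceId cache =
  case Map.lookup sliceId cache of
    Just result -> (result, cache)
    Nothing ->
      let result = compile sliceId
      in (result, Map.insert sliceId result cache)
```

Because the key depends only on the code itself, the same map could in principle be shared between machines.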

Browse through code

There will be an online platform to browse through code. If we have a code snippet that describes for example a picture or a piece of music we can show that picture or a player for the music right next to the code.

Integration with source control

We will use text-based formats to make it possible to use existing source control tools to manage your environment.

Separate metadata from code

The online platform will make it possible to annotate code fragments with comments, upvotes, tags, supported platforms, deprecation, benchmarks, tests, ... This enables crowd-sourced documentation.

Related work

Rich Hickey has similar thoughts.

Joe Armstrong wondered "Why do we need modules at all?".

Thomas Schilling suggested to track dependencies at the level of individual functions, types, etc.

Paul Chiusano's unison

sourcegraph

Moon language

Some people use npm the way you would use fragnix. For example this guy. Remember left-pad?

This project dumps the call graph of installed rust packages. This project walks the call graph to find uses of unsafe features.

Interlisp had a tool called masterscope to analyse the static call graph of your program.

fragnix's Issues

Flatten fragnix/builtin module names

The names of the symbol files in fragnix/builtin are hierarchical and not flat as opposed to the ones in fragnix/names. Write a one time script to flatten them.

Language Extensions

A lot of Haskell code is not actually written in Haskell but uses one or more of the extensions that GHC provides. We need to keep track of that on a per-slice basis to be able to compile.

Symbols used in instance declarations are missing

Somehow we are still missing symbols used by declarations in instances.

Slice explorer

We want a web app to explore a folder full of slices. Viewing them in an editor is tedious.

Operator symbols not working properly

In F5685558312108961428.hs:

{-# LANGUAGE NoImplicitPrelude #-}
module F5685558312108961428 where
import GHC.Arr (!)
import F4889493513583856799 (vertices)
import F4140369130282544907 (Edge)
import F4001183171300763828 (Graph)
edges g = [(v, w) | v <- vertices g, w <- g ! v]

edges :: Graph -> [Edge]

The import of ! should be import GHC.Arr ((!))
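
A possible shape for the fix, sketched under the assumption that imports are rendered from plain strings: decide whether the name is an operator and wrap it in parentheses if so. The functions isOperator and renderImport are hypothetical and do not refer to existing fragnix code.

```haskell
import Data.Char (isAlpha)

-- An identifier starts with a letter or an underscore. Anything else is
-- treated as an operator symbol in this sketch.
isOperator :: String -> Bool
isOperator [] = False
isOperator (c : _) = not (isAlpha c || c == '_')

-- Render a single-name import, wrapping operators in parentheses.
renderImport :: String -> String -> String
renderImport moduleName name
  | isOperator name = "import " ++ moduleName ++ " ((" ++ name ++ "))"
  | otherwise = "import " ++ moduleName ++ " (" ++ name ++ ")"
```

For example, renderImport "GHC.Arr" "!" yields the corrected import line from this issue. Constructor operators inside a type's import list would need separate handling.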

Qualified usage of names

We cannot extract the qualification of names from modules.

Packages with dependencies

package-modules does not work on packages with dependencies. It uses cabal install with our custom haskell-modules compiler. But this compiler does not implement a proper package database. Possible directions to fix this are:

get rid of cabal install (but we need it for dependency resolution)
implement a package database
reuse haskell-suite's implementation of a package database

Put folders for temporary files in fragnix/temp/

We now have multiple folders under fragnix. Some of those are meant to be permanent and some are for temporary files. We should make this clear by creating a separate temp folder.

Own names database

We currently use the global names database that comes with haskell-names. Going further we will want our own database in fragnix/names for easy distribution and better control.

Install names and declarations into separate folders

Names and declaration files are next to each other in the same folder, but we want to have them in their own directory trees.

Declarations other than FunBind and PatBind

The only declarations we extract from a module are function and pattern bindings.

Command line executable.

The path of the file fragnix is invoked on is currently hard-coded. We want to take it on the command line. Eventually we want to take an entire folder (after #5 is resolved). We also want some output to indicate progress and nice error messages.

Mutually recursive declarations are not supported

Mutually recursive declarations cause problems in different phases:

Extraction: We need to detect cycles
Hashing: We cannot compute the hash without knowing it first
Compilation: We need hs-boot files for mutually recursive modules

Probably mutually recursive declarations have to be grouped into one slice.
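
Grouping by cycle is a strongly connected components computation. The sketch below uses stronglyConnComp from Data.Graph in the containers package. The Declaration representation and the groupDeclarations function are assumptions for illustration only.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import Data.List (sort)

-- A declaration, identified by its name, with the names it mentions.
type Declaration = (String, [String])

-- Group declarations so that every cycle of mutually recursive
-- declarations ends up in a single group (a candidate slice).
groupDeclarations :: [Declaration] -> [[String]]
groupDeclarations decls =
  map members (stronglyConnComp [(name, name, uses) | (name, uses) <- decls])
  where
    members (AcyclicSCC name) = [name]
    members (CyclicSCC names) = sort names
```

Since each cycle collapses into one group, the dependencies between groups are acyclic, which sidesteps both the hashing problem and the need for hs-boot files.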

Group type signature and function definition.

Type signatures are just declarations, but we need to find a way to associate them to the function body.
A type signature can specify the type of many symbols and therefore many functions at once. In this case we should split it.
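
The split itself is straightforward once a signature is represented as a list of names plus a type. This is a simplified stand-in, not the haskell-src-exts representation; TypeSig and splitTypeSig are hypothetical names.

```haskell
-- A simplified type signature: the names it covers and the type as text.
data TypeSig = TypeSig [String] String
  deriving (Eq, Show)

-- Split a signature like `f, g :: T` into `f :: T` and `g :: T`, so that
-- each piece can be grouped with the binding of the same name.
splitTypeSig :: TypeSig -> [TypeSig]
splitTypeSig (TypeSig names ty) = [TypeSig [name] ty | name <- names]
```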

Find a way to put instances into their own slice

We put every instance into its own slice. The problem is that the instance depends on both the class it is of and the type it is for. When compiling we would have to import the instance in either one so that it is propagated. But this leads to cyclic imports. Maybe instances really need their own very special treatment.

Fake package database with base for cabal

The cabal solver wants to know that a version of base is installed. Find a way to say so without a package database file.

First class instances

We want to reliably find out what class and type an instance is for. Right now we only determine the class, and we take it to be all mentioned class symbols, which is clearly a bug. We could have a special instance symbol.
Needed for #36.

Do not use pretty printer but original source as fragments

Pretty printing is not pretty enough in my opinion. We should check if a fragment parses and then use the original source.

Ignore INLINE (and other) pragmas

INLINE pragmas are not properly resolved by haskell-names. Let's ignore them for now.

Something goes wrong with type operators

The following does not compile:

{-# LANGUAGE NoImplicitPrelude #-}
module F5030602782144409221 where
import GHC.Prim (seq)
import Data.Complex (Complex(:+))
import Data.Complex (Complex)
import F5363716782726002741 (NFData)
import F5363716782726002741 (rnf)

instance (RealFloat a, NFData a) => NFData (Complex a) where
        rnf (x :+ y) = rnf x `seq` rnf y `seq` ()

Recover difference between VarId and SymId

The difference between variables like x and symbols like ++ is lost during name resolution.

Tests for the entire pipeline.

Now that the pipeline from module files to compiled program works we need the infrastructure to test module files with more and more features.

Group fixity declarations with their binding declaration

Fixities currently end up in their own compilation unit, which is wrong.

Multiple Modules

We want to be able to convert a set of modules.

Instance symbol for derived instances

We do have an instance symbol for instance declarations but not for deriving declarations.

Not online

There should be a repository online where many slices are hosted.

Layout signature fragment directly above binding declaration

We get slices like this, which is ugly:

{-# LANGUAGE NoImplicitPrelude #-}
module F7007818387610957033 where
import F8137826209908809703 (deepseq)
import F2127085270274173167 (NFData)

infixr 0 $!!
f $!! x = x `deepseq` f x

($!!) :: (NFData a) => (a -> b) -> a -> b

List builtin symbols

We need a list of all symbols that are considered builtin for name resolution and for classifying references during slice generation. We currently have builtin modules in fragnix/builtin. Related to #20.

Dependencies on foreign files

We need to track dependencies on foreign source and header files. Probably there should be foreign slices.

Mentioned values in type signatures

We only extract mentioned symbols from QNames. The symbols in TypeSigs are never qualified and therefore Names. A nice fix would be to convince the people at haskell-src-exts to make these names QNames.

Cpp problems with infix \\

There is a strange bug with infix declarations for \\ and CPP. The workaround is:

infixl 9 \\{-This comment teaches CPP correct behaviour -}

Separate declaration extraction and slice extraction

We want three tools:
package-declarations: given a package extracts all declarations from it
module-declarations: given a set of modules extracts all declarations from them
declarations-slices: given a set of declarations finds all slices

Properly find the class and type of an instance

The best would be to fork haskell-names and change the symbol type

Dependency on instances

We need an extra case for UsedName where the name should be empty and we are using an instance.

Safe Haskell is in the way

Sometimes we have Safe and/or Trustworthy pragmas that somehow prevent compilation. As an easy fix we could ignore them when assembling the compilation units.

Extract module sources in separate step

package-declarations takes the name of a package and produces fragnix/names and fragnix/declarations. There should be two different steps instead: package-modules takes a package and yields perfect, preprocessed, parseable Haskell module files. Then module-declarations takes these modules, does name resolution and extracts the declarations.

Slice tester

We want a script to attempt to compile all slices and summarize the results

Type class instances

We need to find a good heuristic for when a declaration does or does not depend on a type class instance.

Depend on names in instance constraints

Part of #4 is to properly extract used names in instance constraints. The following fails to compile:

{-# LANGUAGE NoImplicitPrelude #-}
module F1979410200377991118 where
import GHC.Arr (bounds)
import GHC.Arr (elems)
import GHC.Arr (Array)
import F5363716782726002741 (NFData)
import F5363716782726002741 (rnf)

instance (Ix a, NFData a, NFData b) => NFData (Array a b) where
        rnf x = rnf (bounds x, Data.Array.elems x)

Actually compute hash

The resolver currently assigns temporary IDs to slices. We need to compute a hash of each slice and replace all temporary IDs.
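
One way to do this, sketched here under assumptions, is to let the hash of a slice cover its own text plus the hashes of the slices it uses. The Slice type, the hashSlices function and the FNV-1a stand-in hash are all hypothetical and are not necessarily what fragnix uses.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import qualified Data.Map.Lazy as Map
import Data.Word (Word64)

type TempId = Int

-- A slice before hashing: its source text and the temporary IDs it uses.
data Slice = Slice String [TempId]

-- Stand-in hash: 64-bit FNV-1a applied to character code points.
hashText :: String -> Word64
hashText = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- The hash of a slice covers its text and the hashes of the slices it
-- uses. The lazy map lets each entry refer to the others. This only
-- terminates when the dependencies between slices are acyclic.
hashSlices :: Map.Map TempId Slice -> Map.Map TempId Word64
hashSlices slices = hashes
  where
    hashes = Map.map hashOne slices
    hashOne (Slice text uses) =
      hashText (text ++ concatMap dependencyHash uses)
    -- Unknown temporary IDs are rendered as "?" in this sketch.
    dependencyHash u = maybe "?" show (Map.lookup u hashes)
```

With this scheme a change to one slice changes the hash of every slice that transitively uses it. Replacing the temporary IDs is then a lookup in the resulting map.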

Group instances for builtin type classes with the data type

We group all type class instances with the class declaration, but for builtin type classes there is no class declaration, so we need to group them with the data type the instance is for.

More tests

Write a few more tests for the entire pipeline and the individual steps

Environments (Map Symbol SliceId)

We want to store some Map Symbol SliceId called an environment to make nests mergeable.
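
A minimal sketch of what merging could look like, assuming symbols are plain strings and slice IDs are integers. In fragnix the Symbol type would come from name resolution, and mergeEnvironments is a hypothetical function.

```haskell
import qualified Data.Map.Strict as Map

type Symbol = String
type SliceId = Integer
type Environment = Map.Map Symbol SliceId

-- Merge two environments. Symbols that are bound to different slices on
-- the two sides are reported as conflicts instead of being overwritten.
mergeEnvironments :: Environment -> Environment -> Either [Symbol] Environment
mergeEnvironments left right
  | null conflicts = Right (Map.union left right)
  | otherwise = Left conflicts
  where
    conflicts =
      Map.keys (Map.filter id (Map.intersectionWith (/=) left right))
```

Reporting conflicts explicitly leaves the choice between two slices for the same symbol to the user or to a policy.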

Compile against integer-simple instead of integer-gmp

haskell-names comes with names for integer-simple but not for integer-gmp. We want to configure all packages (currently containers and deepseq) to use integer-simple

integer-gmp for primitive names instead of integer-simple

The modules in fragnix/primitive for Integer related functionality are from integer-simple. We need them for integer-gmp

Name clashes because things all of a sudden are in the same module

The NFData class slice also has all instances for NFData in it. Some of these mention the Bin constructor from Data.Set, some from Data.IntMap and some from Data.Map. But these were previously used unqualified with no problem. So they have no qualification.