Coder Social home page Coder Social logo

multiset's People

Contributors

chris-martin avatar fsestini avatar jmendyk avatar lambda-fairy avatar larskuhtz avatar ollef avatar rcook avatar sjakobi avatar twanvl avatar vlopezj avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

multiset's Issues

Convenience functions elemAt and deleteAt

Data.Set has:

elemAt :: Int -> Set a -> a  -- O(log n)
deleteAt :: Int -> Set a -> Set a  -- O(log n)

It would be nice to have similar functions for MultiSet:

elemAt :: Int -> MultiSet a -> a  -- O(n)
distinctElemAt :: Int -> MultiSet a -> a  -- O(log n)
deleteAt :: Int -> MultiSet a -> MultiSet a  -- O(n)
distinctDeleteAt :: Int -> MultiSet a -> MultiSet a  -- O(log n)
distinctDeleteManyAt :: Int -> Occur -> MultiSet a -> MultiSet a  -- O(log n)
distinctDeleteAllAt :: Int -> MultiSet a -> MultiSet a  -- O(log n)

I'm mostly interested in elemAt for my purposes (using MultiSet as a bag to randomly take from).

You could theoretically bring down all the complexities to O(log n) but it would require changing the internal structure.

What to do with 0 or negative values?

Even though on the docs it says it should work only with positive values, sometimes it is easy missed introducing a hard to find bug (at least It happened to me :/ ). As an example, this can be unintuitive fromListOccur [(x,1),(x,-1)] == fromListOccur [(x,0)] /= fromListOccur [] . I think it might be a good idea either:

  • Detect when a zero or negative has been used and throw an error "negative or zero occurrences not supported".
  • Extend the implementation such it handle zero and negative values. (Internally only storing elements with an occurrence different than 0 ).

I think the second option is more interesting cause it eliminates the problem and makes the library useful for new use-cases; but it would slightly change the current semantics of some functions.

Build breaks on 7.6

This commit breaks the build on GHC 7.6. Please either depend on a base version high enough to exclude 7.6 or tweak the CPP so the GHC option is not used in 7.6.

5aebca1

test suite compilation failure

Preprocessing test suite 'doctests' for multiset-0.3.2...
[1 of 1] Compiling Main             ( test/Main.hs, dist/build/doctests/doctests-tmp/Main.o )
Linking dist/build/doctests/doctests ...
> /tmp/stackage-build8/multiset-0.3.2$ dist/build/doctests/doctests

/tmp/stackage-build8/multiset-0.3.2/Data/IntMultiSet.hs:671:0: error:
     fatal error: Typeable.h: No such file or directory
     #include "Typeable.h"
     ^
compilation terminated.
doctests: `gcc' failed in phase `C pre-processor'. (Exit code: 1)

Remove references to hedge algorithms

Map stopped using "hedge" algorithms several years ago: the much simpler divide-and-conquer ones have better-understood performance characteristics and seem to perform better in practice. I don't think IntMap has ever used anything like that. At the same time, performance claims can be updated accordingly.

Wish: Make MultiSet instance of NFData

I would like to use MultiSet inside a data structure that has
deriving (Eq,Ord,Read,Show, Generic, NFData, Data).
This lead to an error * No instance for (NFData (MultiSet.MultiSet BExpr))
Could MultiSet just like Set be derived from NFData?

mapEither is inefficient

It first builds lists, then converts them lazily into multisets. I think it makes a lot more sense to accumulate those multisets eagerly.

doctest failure with filemanip and Glob in scope

This is a limitation of doctest and won't occur unless both these packages are in scope. I think it can be resolved by using PackageImports - assuming it matters to you. I'll disable this test-suite on stackage at least for now.

/tmp/stackage-build1/multiset-0.3.1/test/Main.hs:3:8:
    Ambiguous module name ‘System.FilePath.Glob’:
      it was found in multiple packages:
      filemanip-0.3.6.3@filem_AFSulF8Y70TAA5R8pjLEdB Glob-0.7.5@Glob_J8eYhnaKSeW8dehUANn0F1

Convenience function mapM

As a user, I would like the monad version of map to be provided by the multiset library instead of having to define it locally each time I need it.
Consequently, when a better implementation is possible, I don't have to change my code in many places.

MultiSet already offers the map function of signature Ord b => (a-> b) -> MultiSet.MultiSet a -> MultiSet.MultiSet b .
Could you also offer something like

{-# LANGUAGE ScopedTypeVariables #-}
mapM :: forall a b m . (Ord b, Monad m) => (a-> m b) -> MultiSet.MultiSet a -> m (MultiSet.MultiSet b)
mapM f a = MultiSet.fromOccurList <$> mapM f' (MultiSet.toOccurList a)
          where f' :: (a, MultiSet.Occur) -> m (b, MultiSet.Occur)
                     f' (x,c) = do
                                      v <- f x
                                      return (v,c)

I use the naming convention:
A postfix 'M' always stands for a function in the Kleisli category: The monad type constructor m is added to function results (modulo currying) and nowhere else. So, for example,

filter  ::              (a ->   Bool) -> [a] ->   [a]
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]

Building with GHC 7.10.2 generates warnings related to imports

Due to changes in Prelude and Data.Monoid, compilation with GHC 7.10.2 results in the following warnings:

$ stack ghc -- --version
The Glorious Glasgow Haskell Compilation System, version 7.10.2
$ stack build
multiset-0.3.1: unregistering (local file changes: Data/IntMultiSet.hs Data/MultiSet.hs multiset.cabal test/Main.hs)
multiset-0.3.1: configure
Configuring multiset-0.3.1...
multiset-0.3.1: build
Preprocessing library multiset-0.3.1...
[1 of 2] Compiling Data.MultiSet    ( Data/MultiSet.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/Data/MultiSet.o )

Data/MultiSet.hs:139:1: Warning:
    Module ‘Prelude’ does not export ‘join’

Data/MultiSet.hs:144:1: Warning:
    The import of ‘Data.Monoid’ is redundant
      except perhaps to import instances from ‘Data.Monoid’
    To import instances alone, use: import Data.Monoid()

Data/MultiSet.hs:146:1: Warning:
    The qualified import of ‘List.foldr’
    from module ‘Data.Foldable’ is redundant
[2 of 2] Compiling Data.IntMultiSet ( Data/IntMultiSet.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/Data/IntMultiSet.o )

Data/IntMultiSet.hs:133:1: Warning:
    Module ‘Prelude’ does not export ‘join’

Data/IntMultiSet.hs:138:1: Warning:
    The import of ‘Data.Monoid’ is redundant
      except perhaps to import instances from ‘Data.Monoid’
    To import instances alone, use: import Data.Monoid()
In-place registering multiset-0.3.1...
multiset-0.3.1: copy/register
Installing library in
/home/user/src/multiset/.stack-work/install/x86_64-linux/lts-3.20/7.10.2/lib/x86_64-linux-ghc-7.10.2/multiset-0.3.1-APke0rlS0Vt5EHV2CxbSF5
Registering multiset-0.3.1...

typo in documentation

The summary of MultiSet
An efficient implementation of multisets, also somtimes called bags.
Should be
An efficient implementation of multisets, also sometimes called bags.

Running tests using "stack test" issues warnings related to "Typeable.h"

Here's the output from running stack test:

$ stack test
multiset-0.3.1: unregistering (dependencies changed)
multiset-0.3.1: configure (lib + test)
Configuring multiset-0.3.1...
multiset-0.3.1: build (lib + test)
Preprocessing library multiset-0.3.1...
In-place registering multiset-0.3.1...
Preprocessing test suite 'doctests' for multiset-0.3.1...
[1 of 1] Compiling Main             ( test/Main.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/doctests/doctests-tmp/Main.o )
Linking .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/doctests/doctests ...
multiset-0.3.1: copy/register
Installing library in
/home/user/src/multiset/.stack-work/install/x86_64-linux/lts-3.20/7.10.2/lib/x86_64-linux-ghc-7.10.2/multiset-0.3.1-APke0rlS0Vt5EHV2CxbSF5
Registering multiset-0.3.1...
multiset-0.3.1: test (suite: doctests)

Progress: 1/2
In file included from /home/user/src/multiset/Data/MultiSet.hs:666:0:


/home/user/.stack/programs/x86_64-linux/ghc-7.10.2/lib/ghc-7.10.2/base_GDytRqRVSUX7zckgKqJjgw/include/Typeable.h:17:2:
     warning: #warning <Typeable.h> is obsolete and will be removed in GHC 7.10 [-Wcpp]
     #warning <Typeable.h> is obsolete and will be removed in GHC 7.10
      ^

In file included from /home/user/src/multiset/Data/IntMultiSet.hs:672:0:


/home/user/.stack/programs/x86_64-linux/ghc-7.10.2/lib/ghc-7.10.2/base_GDytRqRVSUX7zckgKqJjgw/include/Typeable.h:17:2:
     warning: #warning <Typeable.h> is obsolete and will be removed in GHC 7.10 [-Wcpp]
     #warning <Typeable.h> is obsolete and will be removed in GHC 7.10
      ^

In file included from /home/user/src/multiset/Data/MultiSet.hs:666:0:


/home/user/.stack/programs/x86_64-linux/ghc-7.10.2/lib/ghc-7.10.2/base_GDytRqRVSUX7zckgKqJjgw/include/Typeable.h:17:2:
     warning: #warning <Typeable.h> is obsolete and will be removed in GHC 7.10 [-Wcpp]
     #warning <Typeable.h> is obsolete and will be removed in GHC 7.10
      ^

In file included from /home/user/src/multiset/Data/IntMultiSet.hs:672:0:


/home/user/.stack/programs/x86_64-linux/ghc-7.10.2/lib/ghc-7.10.2/base_GDytRqRVSUX7zckgKqJjgw/include/Typeable.h:17:2:
     warning: #warning <Typeable.h> is obsolete and will be removed in GHC 7.10 [-Wcpp]
     #warning <Typeable.h> is obsolete and will be removed in GHC 7.10
      ^
Examples: 4  Tried: 4  Errors: 0  Failures: 0
Examples: 0  Tried: 0  Errors: 0  Failures: 0

Completed 2 action(s).

This is caused by doctest compiling Multiset.hs and IntMultiset.hs which includes Typeable.h from the base package via #include directives, when - instead - they should be including the version supplied under the include directory.

Build breaks on 8.4

Any chance we can get a version of this that works on 8.4? Specifically, it needs a Semigroup instance.

Test suite failure with GHC 8

> /tmp/stackage-build8/multiset-0.3.2$ runghc -clear-package-db -global-package-db -package-db=/home/stackage/work/builds/nightly/pkgdb Setup configure --enable-tests --package-db=clear --package-db=global --package-db=/home/stackage/work/builds/nightly/pkgdb --libdir=/home/stackage/work/builds/nightly/lib --bindir=/home/stackage/work/builds/nightly/bin --datadir=/home/stackage/work/builds/nightly/share --libexecdir=/home/stackage/work/builds/nightly/libexec --sysconfdir=/home/stackage/work/builds/nightly/etc --docdir=/home/stackage/work/builds/nightly/doc/multiset-0.3.2 --htmldir=/home/stackage/work/builds/nightly/doc/multiset-0.3.2 --haddockdir=/home/stackage/work/builds/nightly/doc/multiset-0.3.2 --flags=
Configuring multiset-0.3.2...
> /tmp/stackage-build8/multiset-0.3.2$ runghc -clear-package-db -global-package-db -package-db=/home/stackage/work/builds/nightly/pkgdb Setup build
Building multiset-0.3.2...
Preprocessing library multiset-0.3.2...
[1 of 2] Compiling Data.MultiSet    ( Data/MultiSet.hs, dist/build/Data/MultiSet.o )

Data/MultiSet.hs:139:1: warning: [-Wdodgy-imports]
    Module ‘Prelude’ does not export ‘join’

Data/MultiSet.hs:419:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Ord a
    • In the type signature for:
           filter :: Ord a => (a -> Bool) -> MultiSet a -> MultiSet a

Data/MultiSet.hs:425:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Ord a
    • In the type signature for:
           partition :: Ord a =>
                        (a -> Bool) -> MultiSet a -> (MultiSet a, MultiSet a)

Data/MultiSet.hs:580:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Ord a
    • In the type signature for:
           fromMap :: Ord a => Map a Occur -> MultiSet a
[2 of 2] Compiling Data.IntMultiSet ( Data/IntMultiSet.hs, dist/build/Data/IntMultiSet.o )

Data/IntMultiSet.hs:133:1: warning: [-Wdodgy-imports]
    Module ‘Prelude’ does not export ‘join’
Preprocessing test suite 'doctests' for multiset-0.3.2...
[1 of 1] Compiling Main             ( test/Main.hs, dist/build/doctests/doctests-tmp/Main.o )
Linking dist/build/doctests/doctests ...
> /tmp/stackage-build8/multiset-0.3.2$ dist/build/doctests/doctests

/tmp/stackage-build8/multiset-0.3.2/Data/IntMultiSet.hs:671:0: error:
     fatal error: Typeable.h: No such file or directory
     #include "Typeable.h"
     ^
compilation terminated.
doctests: `gcc' failed in phase `C pre-processor'. (Exit code: 1)

Typo in maxView

In both MultiSet and IntMultiSet, the definitions of maxView and minView are identical. In maxView, you should replace deleteFindMin by deleteFindMax.

Settle on Ord instance

There are a bunch of comments on the Ord instances suggesting that it might be nice to use a different ordering. This shouldn't be a big deal, I don't think. Start by viewing the ends of each to see if they have the same key, in which case special logic is needed. Otherwise do the simple thing. Either way, use Data.Ord.Down. @twanvl, I'll be happy to do this, but only if you want it. Please let me know.

Add mapPair

mapEither isn't very general as a multiset operation. It generalizes nicely to

mapPair
  :: Ord a
  => (a -> Occur -> ((b, Occur), (c, Occur)))
  -> MultiSet a -> (MultiSet b, MultiSet c)

By digging into Data.Map internals, it's also possible to write a very efficient version for when the passed function is strictly increasing (relative to the natural partial ordering of pairs).

mapPairAsc
  :: (a -> Occur -> ((b, Occur), (c, Occur)))
  -> MultiSet a -> (MultiSet b, MultiSet c)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.