Comments (7)
Something like `GroupBy` or `ToLookup` here https://codeblog.jonskeet.uk/category/edulinq/, I think. The latter is eager, while the former is deferred until the outer iterator is evaluated, I believe. That's in contrast with the other iterator adapters, which are mostly lazy.
from itertools.
Sure, it's a bit like the current unique, except you collect all values that map to the same key.
I'm interested in submitting a PR for this, but I don't quite understand the explanation. Care to give an example?
Sorry, I'm not familiar with C#. A Rust snippet showing how the method is used and the intended result would be appreciated.
@matematikaadit And I don't really know Rust, so the following might make little sense. I offered that link as a guideline for the API, not for the implementation. Anyway, consider this code:
// numbers and their lengths in English:
let items = vec![(4, "zero"), (3, "one"), (3, "two"), (4, "four"), (4, "five"),
(3, "six"), (5, "seven"), (5, "eight"), (4, "nine"), (3, "ten")];
// group by length, associate keys with references to the original values
let grouped = items.iter().group_by(|&item| item.0).collect::<Vec<_>>();
// grouped: Vec<(usize, Vec<&({Integer}, &str)>)>
assert_eq!(grouped.len(), 3); // three distinct lengths
// the keys, in the original order, but distinct
assert_eq!(grouped[0].0, 4);
assert_eq!(grouped[1].0, 3);
assert_eq!(grouped[2].0, 5);
assert_eq!(grouped[0].1[0].1, "zero");
assert_eq!(grouped[0].1[1].1, "four");
assert_eq!(grouped[1].1[3].1, "ten");
// that looks rather ugly, now that I've typed it
// another variant, also available in the .NET API:
// this one has key and value selectors
let grouped = items.iter().group_by(|&item| item.0, |&item| &item.1).collect::<Vec<_>>();
// grouped: Vec<(usize, Vec<&str>)>
// or, maybe more idiomatic and easier to implement without overloading, return keys and values
let grouped = items.iter().group_by(|&item| (item.0, item.1)).collect::<Vec<_>>();
// which is more or less the same as
let grouped = items.iter().group_by(|item| item).collect::<Vec<_>>();
// a "real" example, take the length of each number, group by length,
// sort by how many of them there are
items.iter()
.group_by(|item| item)
.map(|group| (group.0, group.1.len()))
.sorted_by(|group1, group2| group1.1.cmp(&group2.1))
.collect::<Vec<_>>();
// should yield [(5, 2), (4, 4), (3, 4)] (ascending by count, ties keep their original order)
With some hand-waving about the lifetimes and references -- I assumed above that `group_by` can return references to the items, but that probably doesn't make sense as they don't necessarily live long enough.
As for the specifics, the .NET implementation returns keys in the original order and is somewhat lazy, in that the result is constructed only when the iterator is first dereferenced. The values in each group are also in the original order, if I recall correctly. It probably builds a hash table, with the caveat that it also needs to remember the key order. It's different from other LINQ/Iterator methods in that it needs to allocate.
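The behaviour described above (keys in first-seen order, values in original order, a hash table that also remembers key order) could be sketched in Rust roughly as follows. This is only an illustration under assumptions: `group_by_ordered` is a hypothetical name, not an itertools method, and it builds the result eagerly and takes ownership of the items to sidestep the lifetime question.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Groups `items` by `key_of`, keeping keys in first-seen order and
/// values in their original order within each group.
/// Hypothetical helper for illustration; not part of itertools.
fn group_by_ordered<I, K, F>(items: I, mut key_of: F) -> Vec<(K, Vec<I::Item>)>
where
    I: IntoIterator,
    K: Eq + Hash + Clone,
    F: FnMut(&I::Item) -> K,
{
    // Maps each key to the index of its group in `groups`.
    let mut slot_of: HashMap<K, usize> = HashMap::new();
    let mut groups: Vec<(K, Vec<I::Item>)> = Vec::new();
    for item in items {
        let key = key_of(&item);
        let slot = *slot_of.entry(key.clone()).or_insert_with(|| {
            // First time this key is seen: open a new group at the end.
            groups.push((key, Vec::new()));
            groups.len() - 1
        });
        groups[slot].1.push(item);
    }
    groups
}

fn main() {
    let items = vec![(4, "zero"), (3, "one"), (3, "two"), (4, "four"), (4, "five"),
                     (3, "six"), (5, "seven"), (5, "eight"), (4, "nine"), (3, "ten")];
    let grouped = group_by_ordered(items, |item| item.0);
    for (len, members) in &grouped {
        let words: Vec<&str> = members.iter().map(|m| m.1).collect();
        println!("{len}: {words:?}");
    }
}
```

The key is cloned once per item so that it can live both in the index map and in the output; a real implementation might avoid that cost.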
Hope I made sense. You can also look at `group_by` in itertools, which does a similar thing but assumes that equal keys are in consecutive positions. This allows it to work without the hash table, but is less general.
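To make the "consecutive positions" limitation concrete, here is a rough std-only sketch of run-based grouping. `group_consecutive` is a hypothetical helper written for this illustration, not the itertools adapter itself; it only merges adjacent items whose keys compare equal, so it needs `PartialEq` rather than a hash table.

```rust
/// Groups only adjacent items with equal keys; no hash table is needed.
/// Hypothetical helper for illustration; not the itertools adapter.
fn group_consecutive<T, K, F>(items: Vec<T>, mut key_of: F) -> Vec<(K, Vec<T>)>
where
    K: PartialEq,
    F: FnMut(&T) -> K,
{
    let mut groups: Vec<(K, Vec<T>)> = Vec::new();
    for item in items {
        let key = key_of(&item);
        // A new run starts when there is no previous run or the key changed.
        let starts_new_run = groups.last().map_or(true, |(last_key, _)| *last_key != key);
        if starts_new_run {
            groups.push((key, vec![item]));
        } else if let Some((_, run)) = groups.last_mut() {
            run.push(item);
        }
    }
    groups
}

fn main() {
    // The word lengths from the example above, in their original order.
    let lengths = vec![4, 3, 3, 4, 4, 3, 5, 5, 4, 3];
    let runs = group_consecutive(lengths, |len| *len);
    // Equal keys that are not adjacent end up in separate runs.
    println!("{} runs: {:?}", runs.len(), runs);
}
```

On unsorted input like this, the same key shows up in several runs, which is why a hash-based variant is the more general tool.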
I'm interested in this iterator too. The exact output still needs to be decided however. The way I see it, the returned struct could be one of:
- A `HashMap<K, Vec<V>>`,
- A `Vec<(K, Vec<V>)>` (preserving first-key-encounter order),
- A wrapper over one of the above, with an Iterator implementation where `Item=(K, V)`, similar to `GroupBy`.
Whichever underlying structure it is, it would mean a wasted allocation if the caller wants to then convert it to the other structure.
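As a rough illustration of that conversion cost, the sketch below moves between the two candidate shapes. The helper names `to_map` and `to_sorted_pairs` are made up for this example. In both directions the inner `Vec<V>` buffers are moved rather than copied, but the outer container has to be built from scratch.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Vec -> HashMap: builds a fresh hash table; first-seen key order is lost.
/// Hypothetical helper for illustration.
fn to_map<K: Eq + Hash, V>(groups: Vec<(K, Vec<V>)>) -> HashMap<K, Vec<V>> {
    groups.into_iter().collect()
}

/// HashMap -> Vec: builds a fresh Vec; since the map's iteration order is
/// unspecified, the pairs are sorted by key to get a deterministic result.
/// Hypothetical helper for illustration.
fn to_sorted_pairs<K: Ord, V>(map: HashMap<K, Vec<V>>) -> Vec<(K, Vec<V>)> {
    let mut pairs: Vec<(K, Vec<V>)> = map.into_iter().collect();
    pairs.sort_by(|a, b| a.0.cmp(&b.0));
    pairs
}

fn main() {
    let ordered = vec![(4, vec!["zero", "four"]), (3, vec!["one", "two"])];
    let map = to_map(ordered);
    println!("length 3: {:?}", map[&3]);
    let pairs = to_sorted_pairs(map);
    println!("sorted by key: {:?}", pairs);
}
```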
@bluss, do you have any preference?
HashMap, then it will have reasonable performance for all scales of input. I think we should just go with HashMap here and let that be the practical solution. Any more general solution will not materialize for a while now.
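A minimal sketch of the `HashMap` option, assuming the caller supplies `(key, value)` pairs and using only the standard library's entry API. The name `group_into_map` is hypothetical and this is not the code that ended up in itertools.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Collects `(key, value)` pairs into a map from each key to all of its
/// values, in the order the values were encountered.
/// Hypothetical helper for illustration.
fn group_into_map<I, K, V>(pairs: I) -> HashMap<K, Vec<V>>
where
    I: IntoIterator<Item = (K, V)>,
    K: Eq + Hash,
{
    let mut map: HashMap<K, Vec<V>> = HashMap::new();
    for (key, value) in pairs {
        // Create an empty Vec on first sight of a key, then append.
        map.entry(key).or_default().push(value);
    }
    map
}

fn main() {
    let items = vec![(4, "zero"), (3, "one"), (3, "two"), (4, "four"), (4, "five"),
                     (3, "six"), (5, "seven"), (5, "eight"), (4, "nine"), (3, "ten")];
    let by_len = group_into_map(items);
    println!("length 5: {:?}", by_len[&5]);
    println!("distinct lengths: {}", by_len.len());
}
```

The values within each group keep their original order, but the order of the keys themselves is not preserved; that is the trade-off against the `Vec<(K, Vec<V>)>` shape.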