Coder Social home page Coder Social logo

Add joinBy method to Frame about framian HOT 6 OPEN

tixxit avatar tixxit commented on July 22, 2024
Add joinBy method to Frame

from framian.

Comments (6)

charleskreps avatar charleskreps commented on July 22, 2024

I have two variations on the same approach. Both produce a frame with the original keys...however:

  • The first has to drop any keys from the "that" frame since type A != Row.
  • The second includes keys from both frames, but changes the method to use Row and Col types for all frames to do it.

Not sure how to have my cake and eat it too here:

  // Original version that only has keys from "this" frame:
  def joinBy1[A: Order: ClassTag](by: Cols[Col, A])(that: Frame[A, Col]): Frame[Row, Col] = {
    val reIdx = reindex(by)
    val thatReIdx = that.reindex(by)
    val joined = reIdx.merge(thatReIdx)(Merge.Outer)
    val (k, v) = joined.rowIndex.flatMap{ case (r, i) =>
      reIdx.rowIndex.getAll(r).indices.map(k => (rowIndex.keyAt(k), i))
    }.unzip
    joined.withRowIndex(Index.apply(k, v))
  }
  // Version where types match and keys from both are in result:
  def joinBy2(by: Cols[Col, Row])(that: Frame[Row, Col]): Frame[Row, Col] = {
    val reIdx = reindex(by)
    val thatReIdx = that.reindex(by)
    val joined = reIdx.merge(thatReIdx)(Merge.Outer)
    val (k, v) = joined.rowIndex.flatMap{ case (r, i) =>
      reIdx.rowIndex.getAll(r).indices.map(k => (rowIndex.keyAt(k), i)) ++
        thatReIdx.rowIndex.getAll(r).indices.map(k => (that.rowIndex.keyAt(k), i))
    }.unzip
    joined.withRowIndex(Index.apply(k, v))
  }

thoughts?

from framian.

tixxit avatar tixxit commented on July 22, 2024

I prefer joinBy1. It would be nice if we had the option of Inner or Left (Right and Outer are Inner and Left because we drop the right keys). Perhaps, innerJoinBy(...)(...) and leftJoinBy(...)(...)? What do you think?

from framian.

charleskreps avatar charleskreps commented on July 22, 2024

I like it. But does it make sense to be consistent with the other methods and take the Join as a param? Even though Right = Inner and Outer = Left they are all valid arguments right?

from framian.

tixxit avatar tixxit commented on July 22, 2024

My only issue is that while they are technically valid, they would be confusing. I'd rather remove that point of confusion... However, I am not against making a sub-trait of Join that only works for Left/Inner and Right/Inner... so, something like this:

sealed trait Join {
  val leftOuter: Boolean
  val rightOuter: Boolean
}

sealed trait LeftBiasedJoin extends Join {
  val rightOuter: Boolean = false
}

sealed trait RightBiasedJoin extends Join {
  val leftOuter: Boolean = false
}

object Join {
  case object Inner extends LeftBiasedJoin with RightBiasedJoin
  case object Left extends LeftBiasedJoin { val leftOuter = true }
  case object Right extends RightBiasedJoin { val rightOuter = true }
  case object Outer extends Join { val leftOuter = true; val rightOuter = true }
}

And then we have:

def joinBy(by: Cols[Col, A])(that: Frame[_, Col])(join: LeftBiasedJoin): Frame[Row, Col]

Of course, the major issue here is we break binary compatibility, so it would have to be in the 0.4.0 release.

from framian.

charleskreps avatar charleskreps commented on July 22, 2024

Probably not a hard breaking change to handle when released. I have this now:

  def joinBy[A: Order: ClassTag](by: Cols[Col, A])(that: Frame[A, Col])(join: LeftBiasedJoin): Frame[Row, Col] = {
    val reIdx = reindex(by)
    val joined = reIdx.join(that.reindex(by))(join)
    joined.withRowIndex(
      joined.rowIndex.flatMap{ case (r, i) =>
        reIdx.rowIndex.getAll(r).indices.map(k => (rowIndex.keyAt(k), i))
      }.unzip match {
        case (k, v) => Index.ordered(k, v)
      }
    )
  }

I can clean it up and add the PR if you like.

from framian.

tixxit avatar tixxit commented on July 22, 2024

Please - just make the rows parameter on that a wildcard (Frame[_, Col]), since I don't think you meant to put the A from Cols[Col, A] there ;)

from framian.

Related Issues (15)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.