Comments (6)

FelipeAdachi avatar FelipeAdachi commented on June 3, 2024

Each row in the picture represents a Langkit metric, whose documentation can be found here:
https://github.com/whylabs/langkit/blob/main/langkit/docs/modules.md

Each column in the picture represents a whylogs metric. The following example discusses these metrics:
https://github.com/whylabs/whylogs/blob/mainline/python/examples/basic/Inspecting_Profiles.ipynb

Let me know if that answers your question!

from whylogs.

pradeepdev-1995 avatar pradeepdev-1995 commented on June 3, 2024

So how do I analyze this report?
Screenshot from 2024-02-29 10-30-11

Consider the second row:
row: prompt.aggregate_reading_level (aggregate reading level of the input text, as calculated by the textstat library)
column: cardinality/est (the estimated number of unique values for each feature)
The value is 17.00000001.
What does this mean?
@FelipeAdachi

FelipeAdachi avatar FelipeAdachi commented on June 3, 2024

It means that, within the data you profiled (50 rows, as shown by counts/n), there are approximately 17 unique values for prompt.aggregate_reading_level. This is an estimate rather than an exact count, which is what the est after cardinality denotes.

The cardinality value can be more or less useful depending on the feature you're inspecting. In this case, it might not be very interesting. For aggregate_reading_level, the distribution metrics might give more useful information, such as mean, median, or the quantile values.
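
To make the difference concrete, here is a minimal sketch using only the Python standard library. It does not call whylogs, and the scores are made-up stand-ins (50 rows with 17 distinct values, chosen only to mirror the numbers in this thread). It computes the exact distinct count that cardinality/est approximates, next to the kind of distribution summary that is usually more informative for a numeric feature:

```python
import statistics

# Made-up reading-level scores, for illustration only:
# 50 rows that contain 17 distinct values.
scores = [float(i % 17) for i in range(50)]

n = len(scores)                      # what counts/n reports
exact_cardinality = len(set(scores)) # what cardinality/est approximates

# Distribution-style summaries of the same column
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

print(f"n={n}, distinct values={exact_cardinality}")
print(f"mean={mean_score:.2f}, median={median_score:.2f}")
```

The profile stores an estimate of the distinct count rather than this exact number, which is why the report shows a value like 17.00000001 instead of 17.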

pradeepdev-1995 avatar pradeepdev-1995 commented on June 3, 2024

@FelipeAdachi Okay. So how do I write code to get the exact values for aggregate_reading_level, character_count, automated_readability_index, sentence_count, etc.?
I am looking for the actual values rather than summary statistics such as est, mean, and median.

FelipeAdachi avatar FelipeAdachi commented on June 3, 2024

Then you can use langkit directly, like this:

from langkit import llm_metrics, extract
import pandas as pd

df = pd.DataFrame({"prompt":["Hi! how are you?"],"response":["I'm ok, thanks for asking!"]})

enhanced_df = extract(df)

enhanced_df.head()

This will calculate the metrics in llm_metrics for each row of the DataFrame, so you get one value per row instead of a summary.
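
If it helps to picture the shape of the result, here is a rough stand-in written in plain Python. It does not use langkit and will not reproduce its numbers: extract_demo, DEMO_METRICS, and the two metric functions are hypothetical names invented for this sketch. The only point is that each input row comes back with extra per-row columns named after the text column they were computed from:

```python
# Stand-in metrics for illustration only; these are NOT langkit's implementations.
def char_count(text: str) -> int:
    """Count the non-whitespace characters in the text."""
    return sum(1 for ch in text if not ch.isspace())


def word_count(text: str) -> int:
    """Count whitespace-separated tokens in the text."""
    return len(text.split())


# Hypothetical registry of per-row metrics (assumed names, not langkit's).
DEMO_METRICS = {"demo_char_count": char_count, "demo_word_count": word_count}


def extract_demo(rows):
    """Return a copy of each row with one extra value per (column, metric) pair."""
    enhanced = []
    for row in rows:
        new_row = dict(row)
        for column, text in row.items():
            for name, metric in DEMO_METRICS.items():
                new_row[f"{column}.{name}"] = metric(text)
        enhanced.append(new_row)
    return enhanced


rows = [{"prompt": "Hi! how are you?", "response": "I'm ok, thanks for asking!"}]
enhanced_rows = extract_demo(rows)

for key, value in enhanced_rows[0].items():
    print(key, "=", value)
```

Each value here belongs to a single row, which is the "actual value" as opposed to the mean, median, or estimated cardinality that a profile reports across all rows.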

I'm closing this issue as it is now langkit-related.

pradeepdev-1995 avatar pradeepdev-1995 commented on June 3, 2024

@FelipeAdachi
Thank you so much
