Comments (6)
Each row in the picture represents a Langkit metric, whose documentation can be found here:
https://github.com/whylabs/langkit/blob/main/langkit/docs/modules.md
Each column in the picture represents a whylogs metric. The following example discusses these metrics:
https://github.com/whylabs/whylogs/blob/mainline/python/examples/basic/Inspecting_Profiles.ipynb
Let me know if that answers your question!
from whylogs.
So how do I analyze this report?
Consider the second row:
row: prompt.aggregate_reading_level (aggregate reading level of the input text, as calculated by the textstat library)
column: cardinality/est (the estimated number of unique values for each feature)
The value is 17.00000001.
What does this mean?
@FelipeAdachi
It means that, within the data you profiled (50 rows, as shown by counts/n), there are approximately 17 unique values for prompt.aggregate_reading_level. This is not the exact value; it's an estimate, denoted by est after cardinality.
The cardinality value can be more or less useful depending on the feature you're inspecting. In this case, it might not be very interesting. For aggregate_reading_level, the distribution metrics might give more useful information, such as mean, median, or the quantile values.
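To make the difference concrete, here is a minimal sketch in plain Python (standard library only, not whylogs itself) that computes a row count, a distinct-value count, and a few distribution statistics for one numeric feature. The scores are made-up illustrative numbers, not values from the report above. Note that this sketch counts distinct values exactly, whereas the report shows an estimate, which is why a value like 17.00000001 can appear there.

```python
import statistics

# Hypothetical reading-level scores for a small batch of prompts
# (made-up numbers, not taken from the report in this thread).
scores = [5.0, 7.0, 7.0, 8.0, 9.0, 9.0, 9.0, 12.0]

count = len(scores)                   # number of rows, comparable to counts/n
exact_cardinality = len(set(scores))  # exact distinct count (the report shows an estimate)
mean = statistics.mean(scores)        # average score
median = statistics.median(scores)    # middle score
# Three cut points: 25th, 50th and 75th percentiles
quartiles = statistics.quantiles(scores, n=4, method="inclusive")

print(f"rows={count} unique={exact_cardinality} mean={mean} median={median}")
print(f"quartiles={quartiles}")
```

For a continuous score like this, the distinct count mostly reflects how many rows happen to share a value, while the mean, median, and quartiles describe where the scores actually sit.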
@FelipeAdachi Okay. So to get the exact values for aggregate_reading_level, character_count, automated_readability_index, sentence_count, etc., how do I write the code?
I am looking for the actual values rather than distribution summaries like est, mean, median, etc.
Then you can use langkit directly, like this:
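# llm_metrics is imported so that its metrics are available to extract()
from langkit import llm_metrics, extract
import pandas as pd

# One row per prompt/response pair
df = pd.DataFrame({"prompt": ["Hi! how are you?"], "response": ["I'm ok, thanks for asking!"]})
# Compute the metrics for each row and return them alongside the original columns
enhanced_df = extract(df)
enhanced_df.head()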
This will calculate the metrics in llm_metrics for each row of the dataframe.
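If you only need a handful of metrics, one option is to pull their per-row values out of the result. The sketch below is an illustration, not part of langkit: it assumes the output columns are named <column>.<metric> (as prompt.aggregate_reading_level in the report suggests) and works on a list of row dicts such as the one returned by enhanced_df.to_dict("records"). The helper name exact_values and the sample values are hypothetical.

```python
# Stand-in for enhanced_df.to_dict("records"): one dict per row.
# Column names are assumed to follow the "<column>.<metric>" pattern;
# the numbers are made up for illustration only.
records = [
    {"prompt": "first prompt", "prompt.character_count": 11, "prompt.sentence_count": 1},
    {"prompt": "second prompt", "prompt.character_count": 12, "prompt.sentence_count": 1},
]


def exact_values(rows, metric, column="prompt"):
    """Return the per-row value of '<column>.<metric>' for every row that has it."""
    key = f"{column}.{metric}"
    return [row[key] for row in rows if key in row]


character_counts = exact_values(records, "character_count")
sentence_counts = exact_values(records, "sentence_count")
missing = exact_values(records, "not_a_metric")  # empty list when the column is absent

print(character_counts, sentence_counts, missing)
```

With the dataframe itself, indexing a single column (for example enhanced_df["prompt.aggregate_reading_level"], if that column name is present) gives the same per-row values.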
I'm closing this issue as it is now langkit-related.
@FelipeAdachi
Thank you so much