
MLJFlow.jl's People

Contributors

ablaom, pebeto

MLJFlow.jl's Issues

Improve test of `save`.

Can we make this test a little stronger? For example, make a prediction using the original machine and check it agrees with the prediction of the reconstructed machine.

@test loaded_mach.model isa ProbabilisticPipeline
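
A minimal sketch of the stronger check, assuming `mach` is the original fitted machine, `loaded_mach` the reconstructed one, and `X` the training input (names illustrative):

@test loaded_mach.model isa ProbabilisticPipeline
@test predict_mode(loaded_mach, X) == predict_mode(mach, X)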

Can't find my saved machine artifact

This used to work for me but doesn't any longer. What's strange is that tests successfully pass locally for me, and I believe saving artifacts is in the tests.

Following the instructions in the README.md:

using MLJ, MLFlowClient

X, y = make_moons()
model = ConstantClassifier()
mach = machine(model, X, y) |> fit!

logger = MLJFlow.Logger("http://127.0.0.1:5000/api")
run = MLJ.save(logger, mach)

service = MLJFlow.service(logger)

artifacts = MLFlowClient.listartifacts(service, run)
@assert !isempty(artifacts)

# ERROR: AssertionError: !(isempty(artifacts))
# Stacktrace:
#  [1] top-level scope
#    @ REPL[27]:1

# julia> versioninfo()
# Julia Version 1.10.3
# Commit 0b4590a5507 (2024-04-30 10:59 UTC)
# Build Info:
#   Official https://julialang.org/ release
# Platform Info:
#   OS: macOS (x86_64-apple-darwin22.4.0)
#   CPU: 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
#   WORD_SIZE: 64
#   LIBM: libopenlibm
#   LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
# Threads: 12 default, 0 interactive, 6 GC (on 12 virtual cores)
# Environment:
#   JULIA_LTS_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
#   JULIA_PATH = /Applications/Julia-1.10.app/Contents/Resources/julia/bin/julia
#   JULIA_EGLOT_PATH = /Applications/Julia-1.7.app/Contents/Resources/julia/bin/julia
#   JULIA_NUM_THREADS = 12
#   JULIA_NIGHTLY_PATH = /Applications/Julia-1.10.app/Contents/Resources/julia/bin/julia

# (jl_I6VBrN) pkg> st
# Status `/private/var/folders/4n/gvbmlhdc8xj973001s6vdyw00000gq/T/jl_I6VBrN/Project.toml`
#   [64a0f543] MLFlowClient v0.5.1
#   [add582a8] MLJ v0.20.5

Export names sparingly

https://github.com/pebeto/MLJFlow.jl/blob/70e961e9244645b4dfc52a99eab06aea0852f0b1/src/MLJFlow.jl#L20

Generally, I prefer to export names sparingly, as removing them later on is always a breaking change. Typically, export a name only if it is going to be used by an ordinary user; otherwise don't.

I think the only method your package needs to export is MLFlowLogger (for constructing loggers), and, if you are providing it, the client(::MLFlowLogger) method that the general user needs to access the client. I'm assuming runs is an internal method, yes?

Do not export save: this is such a common method name (provided by serialization packages, for example) that exporting it routinely leads to name conflicts. No MLJ package exports save.
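
A minimal sketch of the suggested export policy (struct and method bodies are stubs; layout illustrative):

module MLJFlow

export MLFlowLogger                  # the only name an ordinary user constructs directly

struct MLFlowLogger end              # stub standing in for the real definition

# Unexported accessors: users reach them as MLJFlow.client(logger), etc.
client(logger::MLFlowLogger) = nothing

end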

[Tracking] Towards registration

Todo:

  • Add tests for simple pipelines (e.g., Standardizer() |> DecisionTreeClassifier(); see the sketch after this list)
  • Add code coverage badge
  • Add keys for CompatHelper
  • Add example to readme
  • #7
  • #8
  • #10
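
A hedged sketch of such a pipeline test, assuming MLJ and MLJDecisionTreeInterface are installed, an MLflow service is running locally, and the logger constructor is MLFlowLogger as at the time of this issue (all illustrative):

using MLJ
using Test
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree

X, y = make_moons()
pipe = Standardizer() |> DecisionTreeClassifier()   # simple two-stage pipeline

logger = MLFlowLogger("http://127.0.0.1:5000", experiment_name="pipeline test")
e = evaluate(pipe, X, y; logger)                    # evaluation is logged to MLflow
@test e isa PerformanceEvaluation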

Make artifact retrieval more convenient

I wonder whether, while we think of a remedy for #42, we could also think about a better way for MLJ users to retrieve artifacts, one that requires neither familiarity with the MLFlowClient API nor so many steps. What about something along the lines of a function MLJFlow.artifact(logger, run) to enable this workflow:

run = MLJ.save(logger, mach)
restored_mach = MLJFlow.artifact(logger, run)

Also, what if the user failed to record run? Is there no way to retrieve the artifact? Can artifacts be automatically tagged in some way that provides a user handle for later retrieval, without knowing the actual run object? Then, instead of run above, we could substitute the string handle.

Add accessor method to extract the "client" from an `MLFlowLogger` object

As discussed on a call.

I know we have been calling this client, but I wonder if this is really the correct word, since it is on a server most (all?) of the time. I see that the MLFlowClient.jl documentation calls an instance of MLFlow an "MLFlow API service", so maybe "service" is a better word for this.
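
A minimal sketch of the accessor, assuming the wrapped MLFlowClient instance is stored in a field also named service (field name illustrative):

service(logger::MLFlowLogger) = logger.service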

Improve local testability

Running Pkg.test("MLJFlow") locally requires that an MLflow service is already running on your machine and that the uri "http://localhost:5000" will connect to it. On my Mac that uri does not work and I must manually edit it to "http://127.0.0.1:5000", which is a pain.

Here's one suggestion: to run the tests, one must set a local environment variable MLFLOW_URI to the uri of an active MLflow service. If the variable is unset or empty, a helpful error explaining what to do is thrown.
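
A minimal sketch of what runtests.jl could do under this suggestion (variable name as proposed; message wording illustrative):

uri = get(ENV, "MLFLOW_URI", "")
if isempty(uri)
    error(
        "MLJFlow tests require a running MLflow service. Set the MLFLOW_URI " *
        "environment variable, e.g. ENV[\"MLFLOW_URI\"] = \"http://127.0.0.1:5000\".",
    )
end
logger = MLJFlow.Logger(uri)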

@pebeto Do you have any other suggestions?

Rename `MLFlowLogger` ?

I think having MLJFlow, MLFlowLogger, etc. - sometimes there is a "j" and sometimes not - is a bit of a cognitive burden. What if we rename MLFlowLogger to Logger? So, in MLJ the user does

logger = Logger(...)

In the rare case that there are multiple logging platforms at play, the user can resolve the ambiguity with MLJFlow.Logger. Or we could just not export the name and have the user do logger = MLJFlow.Logger(...). I.e., the user does

logger = MLJFlow.Logger()

Of course this is a breaking change but I think that's fine at this early stage of development.

@pebeto What do you think?

Need handling for models with zero hyperparameters

ConstantClassifier is a model with no hyperparameters. If I change ConstantClassifier below to DecisionTreeClassifier, for example, then no error is thrown.

using MLJ
using .Threads
nthreads()
# 5

logger = MLFlowLogger("http://127.0.0.1:5000", experiment_name="rooster")
X, y = make_moons()
model = ConstantClassifier()
#model = (@load RandomForestClassifier pkg=DecisionTree)()

evaluate(
    model,
    X,
    y;
    logger,
)

# ERROR: HTTP.Exceptions.StatusError(400, "POST", "/api/2.0/mlflow/runs/log-parameter", HTTP.Messages.Response:
# """
# HTTP/1.1 400 Bad Request
# Server: gunicorn
# Date: Mon, 11 Sep 2023 19:09:40 GMT
# Connection: close
# Content-Type: application/json
# Content-Length: 163

# {"error_code": "INVALID_PARAMETER_VALUE", "message": "Missing value for required parameter 'key'. See the API docs for more information about request parameters."}""")
# Stacktrace:
#   [1] mlfpost(mlf::MLFlowClient.MLFlow, endpoint::String; kwargs::Base.Pairs{Symbol, String, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:run_id, :key, :value), Tuple{String, String, String}}})
#     @ MLFlowClient ~/.julia/packages/MLFlowClient/Szkbv/src/utils.jl:74
#   [2] mlfpost
#     @ ~/.julia/packages/MLFlowClient/Szkbv/src/utils.jl:66 [inlined]
#   [3] logparam(mlf::MLFlowClient.MLFlow, run_id::String, key::Symbol, value::ConstantClassifier)       
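
A hedged sketch of one possible guard (not necessarily the package's actual fix), assuming MLJBase.params returns the model's hyperparameters as a (possibly empty) named tuple:

hyperparams = MLJBase.params(model)
if !isempty(hyperparams)             # empty for hyperparameter-free models like ConstantClassifier
    for (name, value) in pairs(hyperparams)
        # logparam signature as seen in the stacktrace above
        MLFlowClient.logparam(service, run_id, string(name), string(value))
    end
end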

Improve `show` for logger instances

I think it would be better if a logger instance displays as you would construct it. That way the user can easily guess how to add keywords without looking up the docstring.

So, instead of

julia> MLFlowLogger("http://127.0.0.1:5000")
MLFlowLogger(MLFlow(
    baseuri = "http://127.0.0.1:5000", 
    apiversion = 2.0
), 1, "MLJ experiment", nothing)

we arrange

julia> MLFlowLogger("http://127.0.0.1:5000")
MLFlowLogger("http://127.0.0.1:5000",
    experiment_name="MLJ experiment",
    artifact_location=nothing,
)

This skips display of the apiversion, but I don't think that's a big deal. But we could do:

julia> MLFlowLogger("http://127.0.0.1:5000")
MLFlowLogger("http://127.0.0.1:5000",
    experiment_name="MLJ experiment",
    artifact_location=nothing,
) using MLflow version 2.0
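
A hedged sketch of one way to arrange the simpler display above, assuming the wrapped MLFlowClient.MLFlow instance is stored in a field called service and that the other fields are experiment_name and artifact_location (all field names illustrative):

function Base.show(io::IO, logger::MLFlowLogger)
    print(io,
        "MLFlowLogger(\"", logger.service.baseuri, "\",\n",
        "    experiment_name=\"", logger.experiment_name, "\",\n",
        "    artifact_location=", repr(logger.artifact_location), ",\n",
        ")",
    )
end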

@pebeto What do you think?

Add a `verbosity` field to the `MLFlowLogger` wrapper

defaulting to 1.

I think we will want this as the user interface point for specifying how much to log in tuning and iteration. For example, if verbosity < 1 then only log the performance evaluation for the best model.
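
A hedged sketch of the addition (struct layout and constructor are illustrative; the point is only the new verbosity field, defaulting to 1):

struct MLFlowLogger
    service::MLFlowClient.MLFlow
    experiment_name::String
    artifact_location::Union{String,Nothing}
    verbosity::Int                   # new field: how much to log in tuning and iteration
end

MLFlowLogger(uri; experiment_name="MLJ experiment",
             artifact_location=nothing, verbosity=1) =
    MLFlowLogger(MLFlowClient.MLFlow(uri), experiment_name, artifact_location, verbosity)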

Include type of `resampling` in the run tag

As discussed on a call.

This will require us to add resampling to the PerformanceEvaluation objects in MLJBase, as we did for the model, to make it available to our overloading of log_evaluation.

Duplicate run names intentional?

When I tried out the example and repeated the evaluation to generate new runs, they all had the same name. Is that intentional?

[Screenshot, 2023-08-17, showing the duplicated run names]

`mlflow` health endpoint is doubtful

Local mlflow instances provide /health and /ping endpoints to check whether the service is running. Platforms like DagsHub don't have them (well, as far as I have been able to test), leaving users unable to use this package. Should we remove that check?

Recheck readme example after MLJ.jl update

In the pending PR #10, the example does `using MLJBase`, `using MLJModels`, and so forth. After we roll out the new version of MLJ, which will include MLJFlow, we can simplify the example to do `using MLJ` and so forth.

Method ambiguity (due to type piracy) in extension of `MLJBase.save`

using MLJ
model = ConstantClassifier()
tmodel = TunedModel(models=[model, model])
mach = machine(tmodel, (@load_iris)...) |> fit!
MLJ.save(IOBuffer(), mach)

# ERROR: MethodError: save(::MLJTuning.ProbabilisticTunedModel{Explicit, Probabilistic}, ::Machine{ConstantClassifier, true}) is ambiguous.

# Candidates:
#   save(tmodel::Union{MLJTuning.DeterministicTunedModel{T, M}, MLJTuning.ProbabilisticTunedModel{T, M}} where {T, M}, fitresult)
#     @ MLJTuning ~/.julia/packages/MLJTuning/CLXum/src/tuned_models.jl:832
#   save(logger, machine::Machine)
#     @ MLJFlow ~/.julia/packages/MLJFlow/AAdc1/src/base.jl:22

# Possible fix, define
#   save(::Union{MLJTuning.DeterministicTunedModel{T, M}, MLJTuning.ProbabilisticTunedModel{T, M}} where {T, M}, ::Machine)

Clean up doc string

https://github.com/pebeto/MLJFlow.jl/blob/70e961e9244645b4dfc52a99eab06aea0852f0b1/src/types.jl#L9

The first part of the docstring should address the casual user. So your "Fields" section should not mention client at all (this is part of the implementation, not the user interface) and somewhere you should explain the meaning of the baseuri argument.

(Implementation details generally go in code comments, unless you are providing a method that is part of a public interface, such as our log_evaluation method, in which case put that at the end, possibly under a separate "New implementations" section.)
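
A hedged sketch of the suggested docstring shape (wording and signature illustrative):

"""
    MLFlowLogger(baseuri; experiment_name="MLJ experiment", artifact_location=nothing)

Logger for tracking MLJ workflows on an MLflow service reachable at `baseuri`,
for example "http://127.0.0.1:5000".

# New implementations

`log_evaluation(logger, performance_evaluation)` is called by MLJ on the outcome of
every `evaluate` call made with this `logger`.
"""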

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
