
Comments (14)

timholy commented on August 29, 2024

To be clear, here's what I'm proposing: for users who really know what they are doing, the right workflow is

using SnoopCompileCore
invslist = @snoopr using SomePkg
using SnoopCompile
trees = invalidation_trees(invslist)

or

using SnoopCompileCore
inf_timing = @snoopi tmin=0.01 do_something()
using SnoopCompile
pc = SnoopCompile.parcel(inf_timing)
SnoopCompile.write("/tmp/precompile", pc)

This doesn't load parcel until after measurements have been collected.

from snoopcompile.jl.

timholy commented on August 29, 2024

I do think splitting out the bot is technically possible (and is indeed a fairly natural solution), but from a purist's standpoint I wonder if making the measurements using the absolutely-minimal code makes most sense anyway. That is, split @snoopi, @snoopc, and @snoopr out into one or more "micro" packages and let SnoopCompile be everything and the kitchen sink.

I'm a bit interested in playing with the new multiple-packages-in-one-repo (JuliaLang/Pkg.jl#1251), this might be a good time to try it. If it really works well I think it may make maintaining Images.jl a bit easier (it's a pretty tall list of dependencies that need to be co-developed).


timholy commented on August 29, 2024

I think we're both saying mostly the same thing: SnoopCompile's ability to collect measurements depends on very little code, and the bulk of its code comes down to analysis.

If we do split out the core, I don't think one needs to spin up a separate process to make the measurements. It's so nice that @snoopi and @snoopr return the actual MethodInstances rather than dumping to a text file that you then have to parse, make sure all dependent packages are reloaded, evaluate, error-handle, and reconstruct. Having the MethodInstances is much of what makes these tools so simple and solid.


aminya commented on August 29, 2024

> To be clear, here's what I'm proposing: for users who really know what they are doing, the right workflow is
>
> using SnoopCompileCore
> invslist = @snoopr using SomePkg
> using SnoopCompile
> trees = invalidation_trees(invslist)
>
> or
>
> using SnoopCompileCore
> inf_timing = @snoopi tmin=0.01 do_something()
> using SnoopCompile
> pc = SnoopCompile.parcel(inf_timing)
> SnoopCompile.write("/tmp/precompile", pc)
>
> This doesn't load parcel until after measurements have been collected.

That is a good solution! With lazy `using`, we will not need external Julia processes. I love it.


KristofferC commented on August 29, 2024

Move the Bot stuff to a separate package? Gets rid of YAML and FilePathsBase dependencies and a lot of code from the more core parts.


timholy commented on August 29, 2024

Yes, that is a good option. We could say that developing the bot within SnoopCompile made sense while we worked out the functionality that each needed from the other, but that now that it exists it makes more sense to split them.

I still wonder, though, if even the parcel/write infrastructure could be more perturbative than we want. It's not a long list of things to worry about, but it's not zero either.


aminya commented on August 29, 2024

> Move the Bot stuff to a separate package?

That is not a solution, since the bot has the same goal as SnoopCompile: both should detect compilation properly.

  • One technique that I used was to run the benchmarks in an external Julia process. We can move the core of snoopi and snoopr to a module (SnoopCompileCore) and load it inside a clean external process. This way the code in SnoopCompile does not affect SnoopCompileCore. This core module does not need to be in another repository.
    julia_cmd = `julia --project=@. -e $snooping_code`

> Gets rid of YAML and FilePathsBase dependencies and a lot of code from the more core parts.

To remove dependencies, we can do the following:

> you can see that the overwhelming source of these invalidations is FilePaths.jl,

  • From FilePathsBase, only its walkpath is used. Again, this was added for the user's convenience, to do a recursive search, because it is a little hard to use relative paths in a CI environment. If people use absolute paths, walkpath is not needed.
    We can try to write a walkpath using walkdir, which I think we absolutely should! Base needs this for many operations.
    for file in walkpath(Path(rootpath))

> deprecations.jl (the @eval statements),

  • Let's remove them.

  • If we want to absolutely minimize the amount of Julia code compilation in the convenience layer (I don't think we need to, considering my first point), we can use external tools: for example, doing the recursive search with a bash script, or using an external binary for YAML parsing (written in C/C++). The same applies to the parcel. We might want to do that. Doesn't PackageCompiler provide an easy way to make a binary that is independent of the running Julia process?
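
As a rough sketch of the walkdir idea above: the helper name `walkfiles` is hypothetical (it is not part of Base or FilePathsBase), and it returns only regular files as plain strings, which may differ from what walkpath yields. It relies solely on Base.walkdir and joinpath.

```julia
# Hypothetical helper: recursively collect every file under `root`
# using only Base.walkdir (no FilePathsBase dependency).
function walkfiles(root::AbstractString)
    files = String[]
    # walkdir yields (directory, subdirectory names, file names) tuples.
    for (dir, _, filenames) in walkdir(root)
        for name in filenames
            push!(files, joinpath(dir, name))
        end
    end
    return files
end

# Usage: build a small temporary tree and list the files in it.
mktempdir() do root
    mkpath(joinpath(root, "src", "sub"))
    touch(joinpath(root, "Project.toml"))
    touch(joinpath(root, "src", "sub", "a.jl"))
    for file in walkfiles(root)
        println(relpath(file, root))
    end
end
```

Whether this covers everything the bot needs from walkpath would have to be checked against the bot's actual call sites.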


aminya commented on August 29, 2024

@timholy

> I do think splitting out the bot is technically possible (and is indeed a fairly natural solution), but from a purist's standpoint I wonder if making the measurements using the absolutely-minimal code makes most sense anyway. That is, split @snoopi, @snoopc, and @snoopr out into one or more "micro" packages and let SnoopCompile be everything and the kitchen sink.

To me, using a micro package makes more sense. If we want to snoop Pkg, for example, the micro package should even be independent of that. We might even use baremodule.

> I'm a bit interested in playing with the new multiple-packages-in-one-repo (JuliaLang/Pkg.jl#1251), this might be a good time to try it. If it really works well I think it may make maintaining Images.jl a bit easier (it's a pretty tall list of dependencies that need to be co-developed).

Multiple packages in one repo are only needed for their registration (Pkg stuff). 🤔 This does not affect the actual Julia code. Tokenize, for example, has an internal module called Tokens.
https://github.com/JuliaLang/Tokenize.jl/blob/2a2766a7f0b45d0c506da7568580a0ad33b47611/src/token.jl#L1
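
A minimal sketch of what an internal module looks like, with every name here (`DemoPkg`, `Measure`, `elapsed`, `slowest`) made up for illustration. Note that a submodule only provides a namespace: loading the outer package still loads all of its code, so this by itself does not let a user load just the measurement part.

```julia
module DemoPkg                 # hypothetical package name

module Measure                 # internal submodule (hypothetical)
export elapsed

# Time a zero-argument function and return (seconds, result).
function elapsed(f)
    t0 = time()
    result = f()
    return (time() - t0, result)
end

end # module Measure

using .Measure                 # relative import of the submodule

# "Analysis" code living in the outer module:
# the largest time among (time, item) pairs.
slowest(timings) = maximum(first, timings)

end # module DemoPkg

# Usage: call the submodule's function through the outer module.
t, val = DemoPkg.Measure.elapsed(() -> sum(1:10))
println("sum = ", val, ", took ", t, " seconds")
```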


timholy commented on August 29, 2024

> Let's remove them

Can't really do that in the 1.x series, but we can in the 2.x series.

> we can use external tools

That has headaches of its own. I'd rather keep everything pure Julia as it is more flexible and has fewer deployment issues. There are other ways to solve this.

> Multiple packages in one repo are only needed for their registration

Right, I've written a few packages with sub-modules (ImageFiltering being an example). My interest though is allowing the user to control how much code gets loaded, and that requires the Pkg stuff.


aminya commented on August 29, 2024

> That has headaches of its own. I'd rather keep everything pure Julia as it is more flexible and has fewer deployment issues. There are other ways to solve this.

What if these external processes are just other clean Julia processes?

> Right, I've written a few packages with sub-modules (ImageFiltering being an example). My interest though is allowing the user to control how much code gets loaded, and that requires the Pkg stuff.

If we use external Julia processes, there is no need for that. The actual snooping process would always be clean, no matter what.

julia_cmd = `julia --project=@. -e $snooping_code`
run(julia_cmd)

That is exactly how I benchmark a package twice! Once with precompilation and once without.

run($julia_cmd)

run($julia_cmd)
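
A minimal sketch of the clean-external-process idea, under a few assumptions: it uses Base.julia_cmd() instead of a bare `julia` so the child runs the same binary as the parent, it passes --startup-file=no so a user startup file cannot load extra code, and the child code is a stand-in that only reports how many modules it has loaded rather than doing any real snooping.

```julia
# Stand-in for the real snooping code: the child just reports how many
# modules are loaded in its own (fresh) session.
snooping_code = """
print(length(Base.loaded_modules))
"""

# Build the command for a fresh Julia process running the same binary.
julia_cmd = `$(Base.julia_cmd()) --startup-file=no -e $snooping_code`

# Run it and capture what the child printed to stdout.
output = read(julia_cmd, String)
n_loaded = parse(Int, strip(output))

println("Modules loaded in the clean process: ", n_loaded)
println("Modules loaded in this process:      ", length(Base.loaded_modules))
```

The child does not see anything the parent has loaded, which is the property the benchmark relies on. The cost is that results must come back as text (or a file) rather than as live Julia objects.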


timholy commented on August 29, 2024

That works for @snoopc, but both @snoopi and @snoopr require that you first load SnoopCompile. That's the problem this issue is focused on.


aminya commented on August 29, 2024

> That works for @snoopc, but both @snoopi and @snoopr require that you first load SnoopCompile. That's the problem this issue is focused on.

Once we move the core code, this part, for example, just needs to change to include("src/SnoopCompileCore.jl"):

using SnoopCompile

Even for @snoopi we don't need all of the code (the macro, the sorting). We should pipe the snooping code into the command string. We just need these:

const __inf_timing__ = Tuple{Float64,MethodInstance}[]
if isdefined(Core.Compiler, :Params)
    function typeinf_ext_timed(linfo::Core.MethodInstance, params::Core.Compiler.Params)
        tstart = time()
        ret = Core.Compiler.typeinf_ext(linfo, params)
        tstop = time()
        push!(__inf_timing__, (tstop-tstart, linfo))
        return ret
    end
    function typeinf_ext_timed(linfo::Core.MethodInstance, world::UInt)
        tstart = time()
        ret = Core.Compiler.typeinf_ext(linfo, world)
        tstop = time()
        push!(__inf_timing__, (tstop-tstart, linfo))
        return ret
    end
    @noinline stop_timing() = ccall(:jl_set_typeinf_func, Cvoid, (Any,), Core.Compiler.typeinf_ext)
else
    function typeinf_ext_timed(interp::Core.Compiler.AbstractInterpreter, linfo::Core.MethodInstance)
        tstart = time()
        ret = Core.Compiler.typeinf_ext_toplevel(interp, linfo)
        tstop = time()
        push!(__inf_timing__, (tstop-tstart, linfo))
        return ret
    end
    function typeinf_ext_timed(linfo::Core.MethodInstance, world::UInt)
        tstart = time()
        ret = Core.Compiler.typeinf_ext_toplevel(linfo, world)
        tstop = time()
        push!(__inf_timing__, (tstop-tstart, linfo))
        return ret
    end
    @noinline stop_timing() = ccall(:jl_set_typeinf_func, Cvoid, (Any,), Core.Compiler.typeinf_ext_toplevel)
end
@noinline start_timing() = ccall(:jl_set_typeinf_func, Cvoid, (Any,), typeinf_ext_timed)

function _snoopi(cmd::Expr, tmin = 0.0)
    return quote
        empty!(__inf_timing__)
        start_timing()
        try
            $(esc(cmd))
        finally
            stop_timing()
        end
        $sort_timed_inf($tmin)
    end
end

function __init__()
    # typeinf_ext_timed must be compiled before it gets run
    # We do this in __init__ to make sure it gets compiled to native code
    # (the *.ji file stores only the inferred code)
    if isdefined(Core.Compiler, :Params)
        @assert precompile(typeinf_ext_timed, (Core.MethodInstance, Core.Compiler.Params))
        @assert precompile(typeinf_ext_timed, (Core.MethodInstance, UInt))
    else
        @assert precompile(typeinf_ext_timed, (Core.Compiler.NativeInterpreter, Core.MethodInstance))
        @assert precompile(typeinf_ext_timed, (Core.MethodInstance, UInt))
    end
    precompile(start_timing, ())
    precompile(stop_timing, ())
    nothing
end

We will also need some code to dump the snooping data to disk, for which we can just use plain text (Base.write).
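
A minimal sketch of such a plain-text dump, assuming the data is a list of (time, item) pairs. The function names `dump_timings` and `load_timings` and the tab-separated layout are hypothetical, and the sample entries are strings standing in for MethodInstances. Reading the file back yields only the printed description of each item, not the original object.

```julia
# Hypothetical helper: write (time, item) pairs as tab-separated text,
# one pair per line. `item` is written using its printed representation.
function dump_timings(path::AbstractString, timings)
    open(path, "w") do io
        for (t, item) in timings
            println(io, t, '\t', item)
        end
    end
    return path
end

# Hypothetical helper: read the file back as (Float64, String) pairs.
# The second column is only a description; the original object is gone.
function load_timings(path::AbstractString)
    result = Tuple{Float64,String}[]
    for line in eachline(path)
        t, desc = split(line, '\t'; limit = 2)
        push!(result, (parse(Float64, t), String(desc)))
    end
    return result
end

# Usage with stand-in data (strings instead of real MethodInstances).
timings = [(0.0123, "foo(::Int64)"), (0.5, "bar(::String)")]
path = dump_timings(tempname(), timings)
restored = load_timings(path)
println(restored)
```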


aminya commented on August 29, 2024

> I think we're both saying mostly the same thing: SnoopCompile's ability to collect measurements depends on very little code, and the bulk of its code comes down to analysis.

Yes. I agree.

> If we do split out the core, I don't think one needs to spin up a separate process to make the measurements. It's so nice that @snoopi and @snoopr return the actual MethodInstances rather than dumping to a text file that you then have to parse, make sure all dependent packages are reloaded, evaluate, error-handle, and reconstruct. Having the MethodInstances is much of what makes these tools so simple and solid.

If serialization is not feasible, we should see which part of the analysis is using MethodInstance directly. That would be SnoopCompile.write and parcel_snoopi. I am saying even parcel_snoopi may have a bad effect. If we can separate the parcel too, that would be better.


aminya commented on August 29, 2024

Considering this solution #95 (comment), there should be three modules!

SnoopCompile:

  • SnoopCompileCore: doing snooping
  • SnoopCompileAnalysis: to generate useful data from the Core data
  • SnoopCompileBot: the rest which runs the above code in an external process + other bot codes

module SnoopCompileBot # used by user

# Code that will run in the clean external process
snooping_code = """
using SnoopCompileCore
inf_timing = @snoopi tmin=0.01 do_something()
using SnoopCompileAnalysis
pc = SnoopCompileAnalysis.parcel(inf_timing)
SnoopCompileAnalysis.write("/tmp/precompile", pc)
"""
julia_cmd = `julia --project=@. -e $snooping_code`
run(julia_cmd)

end

This way we can automate the whole process, and the user will only need to use SnoopCompileBot directly, which abstracts the internals, and the data will be always reliable.

In the above code we might need to use include("src/SnoopCompileCore.jl") if Pkg does not allow multiple modules in one repo, or just make them three repositories.

