
Comments (12)

scohen commented on July 26, 2024

I feel a little responsible for this, but I don't think reducing the cyclomatic complexity of these functions will yield code that's meaningfully better. What you have here is all the possible cases for zig-zag encoding spelled out in one block. Reducing CC would mean splitting this over a bunch of functions, making the code much harder to follow.


sirikid commented on July 26, 2024

I'm not sure yet how we're going to fix this; maybe we'll just turn off this warning.


scohen commented on July 26, 2024

I would; the cyclomatic complexity warning often plagues macro-heavy code, and it's easy to disable.
I would recommend fixing up connect/1 though; I've never been a big fan of that code. The return value of {:ok, conn, {:selected, _, _}} is a head-scratcher, as is throwing away the values in the tuple. This might be a better, minimal change:

  with {:ok, conn} <- Client.connect(scheme, hostname, port),
       {:ok, conn, {:selected, _, _}} <-
         Client.request(conn, @ping_query, @ping_params, timeout, username, password, database) do
    response = %__MODULE__{
      conn: conn,
      conn_opts: [
        scheme: scheme,
        hostname: hostname,
        port: port,
        database: database,
        username: username,
        password: password,
        timeout: timeout
      ]
    }

    {:ok, response}
  end

Also, I'd make the conn module have options instead of conn_opts, but those are heavier refactorings.


sirikid commented on July 26, 2024

connect/1 was the easy part, but I also have a possible solution for build_extractor/5.


scohen commented on July 26, 2024

@sirikid see what I mean? What was once a bunch of literal functions is now a lot of meta-meta programming; it's much, much harder to grok now.


sirikid commented on July 26, 2024

@scohen indeed, but I think these functions are easier to support than the literal functions that were there before.


scohen commented on July 26, 2024

Two questions:

  1. If something is harder to understand, how will it be easier to support? The next time you look at this code, it'll take you a long time to build up the requisite context in order to begin to change it.
  2. The old functions are a complete implementation of zig-zag encoding, what's left to add?


sirikid commented on July 26, 2024

The old functions are a complete implementation of zig-zag encoding

Not really. If I understand correctly, zig-zag encoding is defined for arbitrarily sized integers, but our implementation only supports up to 10 zigzags.
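
For reference, the general (arbitrary-precision) mapping is roughly the following; this is a sketch of the definition, not the driver's code:

# Sketch of the arbitrary-precision zig-zag mapping: non-negative n maps to 2n,
# negative n maps to -2n - 1, so values with small magnitude stay small.
defmodule ZigZagSketch do
  def encode(n) when n >= 0, do: 2 * n
  def encode(n), do: -2 * n - 1

  def decode(z) when rem(z, 2) == 0, do: div(z, 2)
  def decode(z), do: -div(z + 1, 2)
end

# ZigZagSketch.encode(-1) #=> 1
# ZigZagSketch.encode(1)  #=> 2
# ZigZagSketch.decode(3)  #=> -2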

what's left to add?

Tests, I feel like I should cover this with tests.

If something is harder to understand, how will it be easier to support? The next time you look at this code, it'll take you a long time to build up the requisite context in order to begin to change it.

I don't think the new implementation is harder to understand; it is literally a compile-time unrolling of the imperative algorithm.
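
Roughly, the unrolling trades a recursive loop over 7-bit groups for one clause per group count; a purely illustrative sketch (not the project's actual generated code) of the two shapes:

# Illustrative contrast only, not the project's generated code.
defmodule UnrollSketch do
  import Bitwise

  # The "imperative" loop: accumulate 7-bit groups until the continuation bit is 0.
  def decode_varint(binary), do: decode_varint(binary, 0, 0)

  defp decode_varint(<<0::1, group::7, rest::binary>>, shift, acc),
    do: {acc ||| group <<< shift, rest}

  defp decode_varint(<<1::1, group::7, rest::binary>>, shift, acc),
    do: decode_varint(rest, shift + 7, acc ||| group <<< shift)

  # The unrolled shape: one clause per byte count, so the whole varint is
  # matched in a single head (only the 1- and 2-byte cases shown here).
  def decode_unrolled(<<0::1, a::7, rest::binary>>), do: {a, rest}

  def decode_unrolled(<<1::1, a::7, 0::1, b::7, rest::binary>>),
    do: {b <<< 7 ||| a, rest}
end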


scohen commented on July 26, 2024

supports up to 10 zigzags

You're right that the implementation doesn't support arbitrarily-sized integers, but since this is for a clickhouse driver, it supports integers in a range that's representable inside of clickhouse.

Tests, I feel like I should cover this with tests.

I agree, and things like this are a great use case for property-based tests, as each generated function has a well-defined range that it decodes. I don't think the new code is any easier to test than the old one, and you're going to write the same tests in either regime.
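
Something along these lines would do it: a sketch using StreamData, where the Codec alias, decode_varint/1, and its {value, rest} return shape are placeholders rather than the codec's actual API:

# Sketch only: Codec.decode_varint/1 and its {value, rest} return shape are
# placeholders for whichever generated decode function is under test.
defmodule Clickhousex.Codec.VarintPropertyTest do
  use ExUnit.Case, async: true
  use ExUnitProperties

  import Bitwise

  # Test-only reference encoder: 7 bits per byte, least-significant group
  # first, continuation bit set on every byte except the last.
  defp encode_varint(n) when n < 128, do: <<n>>
  defp encode_varint(n), do: <<1::1, (n &&& 0x7F)::7, encode_varint(n >>> 7)::binary>>

  property "decoding inverts the reference encoding for unsigned 64-bit values" do
    check all n <- integer(0..18_446_744_073_709_551_615),
              trailing <- binary() do
      assert {^n, ^trailing} = Codec.decode_varint(encode_varint(n) <> trailing)
    end
  end
end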

I don't think the new implementation is harder to understand

Depends on the audience, and on how fresh the code is in your mind. I'm trying to imagine someone coming to look at the code in a year and figuring out what it does; the previous code is plain and unadorned, while the new code is code that writes code and will require someone to read and understand it before they can make any changes or even know what the generated code looks like. When I wrote it, I picked being repetitive over being clever, and I found myself able to dive right back in and understand what things did.

In the end, it's your project, and all I really care about is that the new code maintains the binary match context optimizations that were present in the old code. Do you know how to ensure that's the case?


sirikid commented on July 26, 2024

all I really care about is that the new code maintains the binary match context optimizations that were present in the old code. Do you know how to ensure that's the case?

It generates almost the same code. I could add another reverse and it would be exactly the same.


scohen commented on July 26, 2024

So two things.

First, the binary optimizations are really tricky, which is why this macro-heavy code was introduced in the first place. It would have been much easier to decode varints in one go, but that won't work if you're trying to get the optimizations to kick in. I recommend actually having the compiler tell you if the new code is optimized rather than just guessing. To do that, add the following in the module that uses this module:

  @compile {:bin_opt_info, true}

Then mix compile will tell you if things are optimized. You should see a lot of:

warning: OPTIMIZED: match context reused
  lib/clickhousex/codec/row_binary.ex:183

Make note of it before and after applying your changes. Also note that the resume functions will obviously not work with this optimization.
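
The clause shape that earns that warning is, roughly, one that matches a prefix and tail-calls on the untouched rest instead of returning it; an illustrative sketch (not the actual row_binary.ex code):

# Illustrative only: when each clause matches a prefix and immediately
# tail-calls with the remaining binary, the compiler reports
# "OPTIMIZED: match context reused" and never builds a sub-binary.
defmodule MatchContextSketch do
  def skip_strings(<<len, _string::binary-size(len), rest::binary>>, count) do
    skip_strings(rest, count + 1)
  end

  def skip_strings(<<>>, count), do: count

  # Returning the remainder (as the resume functions must) forces a
  # sub-binary to be created, so the optimization does not apply here.
  def take_one(<<len, string::binary-size(len), rest::binary>>) do
    {string, rest}
  end
end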

Secondly, if it generates the same code, the cyclomatic complexity is the same; it's just being hidden from credo, in which case it's effectively the same as ignoring the warning. That's the reason I really dislike this error when it comes to code that writes code.

Another approach (that's slightly less efficient) that might work would be to do something like this (untested, but should preserve the match context -- again, check my work):

# this goes outside the quote block
varint_extractor_name = :"do_extract_varint_#{System.unique_integer([:positive, :monotonic])}"

def unquote(extractor_name)(<<>>, unquote_splicing(extractor_args)) do
  {:resume, &unquote(extractor_name)(&1, unquote_splicing(extractor_args))}
end

def unquote(extractor_name)(data, unquote_splicing(extractor_args)) do
  # non-empty input: hand off to the varint decoder
  unquote(varint_extractor_name)(data, unquote_splicing(extractor_args))
end

def unquote(varint_extractor_name)(binary, unquote_splicing(extractor_args)) do
  unquote(extractor_name)(binary, [], unquote_splicing(extractor_args))
end

def unquote(extractor_name)(<<0::size(1), part::size(7), rest::binary>>, acc, unquote_splicing(extractor_args)) do
  acc = [part | acc]
  # Check my work below on calculating the value (assumes Bitwise is imported)
  unquote(int_variable) =
    acc
    |> Enum.reverse()
    |> Enum.with_index()
    |> Enum.reduce(0, fn {part, index}, value ->
      shift = index * 7
      value ||| part <<< shift
    end)

  unquote(landing_call)
end

def unquote(extractor_name)(<<1::size(1), part::size(7), rest::binary>>, acc, unquote_splicing(extractor_args)) do
  unquote(extractor_name)(rest, [part | acc], unquote_splicing(extractor_args))
end


sirikid commented on July 26, 2024

First, the binary optimizations are really tricky, which is why this macro-heavy code was introduced in the first place. It would have been much easier to decode varints in one go, but that won't work if you're trying to get the optimizations to kick in. I recommend actually having the compiler tell you if the new code is optimized rather than just guessing. To do that, add the following in the module that uses this module:

  @compile {:bin_opt_info, true}

Then mix compile will tell you if things are optimized. You should see a lot of:

warning: OPTIMIZED: match context reused
  lib/clickhousex/codec/row_binary.ex:183

Wow, didn't know about this. Optimizations haven't changed.

Secondly, if it generates the same code, the cyclomatic complexity is the same; it's just being hidden from credo, in which case it's effectively the same as ignoring the warning. That's the reason I really dislike this error when it comes to code that writes code.

Credo does not check the complexity of the generated code, and the complexity of the macro itself is decreased, imo.

