rescript-association / reanalyze Goto Github PK

View Code? Open in Web Editor NEW

269.0 14.0 20.0 4.36 MB

Experimental analyses for ReScript and OCaml: globally dead values/types, exception analysis, and termination analysis.

License: MIT License

JavaScript 0.44% Dockerfile 0.02% Shell 0.03% OCaml 99.51%

ocaml analysis dce termination-detection-algorithm deadcodeelimination rescript

reanalyze's People

Contributors

Stargazers

Watchers

reanalyze's Issues

Whitelist for interface files constituting a library interface

In the implementation of a library, it is usual for much of it to be dead, as there is no client that actually calls the entry points. In order to analyze libraries, it would be good to be able to specify an interface file (or maybe several) that constitutes the exposed library api. Everything in that file ought to then be interpreted as @live.

Model `Yojson.Basic.from_string`

Both Yojson.Basic.from_string and Basic.from_string after open Yojson.

Analyzing modules with information from local library

I have a project which has the "usual" bin/lib structure. The binary uses the library in lib, which exposes a function that raises an exception. Is there a way to make the exception analyzer be aware that the code calling this function raises an exception?

I have written the example here. More concretely, I think I would like hello_world.ml:8:10 Call(Mylib.Lib.trace, modulePath:) to use the information from the corresponding lib.ml that was already processed, and warn that it might raise Bad. Is there a way of doing this? Thanks!

Division by zero should special case constants.

Currently the exception analysis reports on x / 4.

Track unused record fields?

Would it be possible to somehow track unused fields on records?

The reason I ask is because both reason-relay and graphql_ppx (v1.0 of graphql_ppx which is just around the corner) output records to describe the data coming back from the GraphQL API through fragments/queries/subscriptions. If we can use Reanalyze to (opt in?) track whether you actually use what you select in your GraphQL operations (through access of the record fields), that'd be a really big win, because it'd mean we can help the user to not fetch more data than what's needed.

This has been attempted via eslint plugins before, but the result has never really been that solid to my knowledge. Maybe we have a better chance of doing this well in ReasonML with Reanalyze?

I understand there's nuances to this, but I figured we could get a discussion going.

cc @jfrolich who I'm sure would be interested in this too, and @sgrove who I know have expressed interest for similar features previously.

Add attribute to bypass redundant optional arg

Use case: ffi to react components.

Whitelist for function names

It would be nice if there was a way to define a whitelist of value names that should be considered live in any module signature. That way basic operations that every type could have (e.g. compare, hash, hash_fold_t, t_of_sexp, sexp_of_t) could be whitelisted and the source would not need to be cluttered up the source all over the place with e.g.:

  include sig
    [@@@ocaml.warning "-32"]

    val compare : t -> t -> int
    val hash : t -> int
    val hash_fold_t : t Hash.folder
    val t_of_sexp : Sexp.t -> t
    val sexp_of_t : t -> Sexp.t
  end

Add check for dead modules.

When a module defined by a ml-mli pair of files is dead, it would be nice to report that concisely rather than to report that each item in the module is dead.

Static Analysis

Support `exit`

Can't even catch this one. Will require the user to annotate all the callsites. Which is appropriate.

DCE reports only on inner value or an example with recursion.

let rec subList = (b, e, l) =>
  switch (l) {
  | [] => failwith("subList")
  | [h, ...t] =>
    let tail =
      if (e == 0) {
        [];
      } else {
        subList(b - 1, e - 1, t);
      };
    if (b > 0) {
      tail;
    } else {
      [h, ...tail];
    };
  };

This reports on tail but not on subList.

Add support for exception analysis.

We've found cases where we used a Pervasives, atd, or yojson function without a proper try/catch, and it didn't hit us until it was in production, at which point the lack any meaningful stack traces in Async Reason meant that we had a challenging time tracking down the source.

This is made worse by the fact that our code base has grown into using the Result type, and there are still a few places that throw even in our own code. Now that we've grown to a certain size and experience, it'd be nice if we had a way of generating a TODO list to clean up all of these exceptions, and to know that they could never throw and escape the type system.

Symbolic links to _esy disturb build.

Read annotation in doc comment for exception

Ocamldoc has special annotation @raise Invalid_argument that we can add in the doc comment.
And when building the doc we can see it. A example of the stdlib https://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html#VALinit and in the code https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.mli#L82. We can see that the Exception is highlighted in the generated html.

Atm reanalyze support only [@raise ] in the implementation file as attribute so it is not read by ocamldoc. It would be convenient if reanalyze could read doc comment and suggest to add in the comment. This way it help us to write good comment.

Report incorrect annotations

If I add a [@dead] annotation to the declaration of a function f in an interface file, when f is actually live, then there is a report that the implementation is dead (assuming all the uses are from other modules). It would be more useful if such incorrect annotations were reported as such, rather than trusting them and misreporting on the implementation. I'm thinking of the use case of using the dead annotations as a way to record the current state of the dead code analysis, and thinking of what happens when a new use of a previously dead value is added. In that case, the old dead annotation is incorrect, and it would be helpful to report that.

Records passed to externals can complain about field never being read

Use case: Defining things like configuration objects that are only intended to be passed into externals.

type config = {size: int}
type something

@module("something") @val
external makeSomething: config => something = "create"

@live
let something = makeSomething({size: 5})

reanalyze complains that size in config is never used to read a value. The intention however isn't to ever read from the record, only construct it and pass it into the external.

Adding @live to the field alleviates the problem, and with the new "suppress warning for a full type", maybe the best solution is to use that. However, if it could be inferred that the record is in fact passed into an external, that'd be pretty slick. I worry a bit that having to annotate things you do often becomes noisy. What do you think?

Running scoped/parts of DCE analysis only?

Hi!

I'm wondering if there's a way to run DCE on only a part of the code base's types. Let me explain my use case:

I'm building and maintaining a way of using Relay (FB GraphQL client fw) with ReScript. Relay has its own compiler that'll take a GraphQL definition written inside of ReScript, and generate an actual file with type definitions etc matching that GraphQL definition. The ReScript code can then reference those types for type safe interaction with the data Relay gives it. It looks roughly like this:

Define a GraphQL operation:
https://github.com/zth/rescript-relay/blob/master/example/src/SingleTodo.res#L1-L7

The compiler generates a file with records representing the ReScript types that GraphQL can return:
https://github.com/zth/rescript-relay/blob/master/example/src/__generated__/SingleTodo_todoItem_graphql.res#L6-L11

All of the generated files are put in a single directory, __generated__.

Now, sometimes you forget to remove selections from your GraphQL definitions when you stop needing/using the data. That in turn causes overfetching. What I'd like to be able to do is to run DCE analysis only on the records inside of the module Types in each res file in the __generated__ folder, since it's those records that represent the actual fields in GraphQL that the developer has selected. So, basically, via DCE figure out if there are record fields that aren't accessed anymore, and that can be safely removed.

I'm guessing this isn't possible as of now, but do you understand the use case/what I'm after...?

[exception analysis] Add support for inner modules.

Currently inner modules are not tracked correctly, in terms of path resolution.

foo defined inside inner module M.N should be stores as M.N.foo.
Lookup of foo from within inner module M.N should look up M.N.foo then M.foo then foo.

More info in circular references diagnostics

On some code I see many Results for ... could be inaccurate because of circular references warnings. More information on what the cycle of references is would make these possibly debuggable.

How to analyza projects spanning multiple directories, compiled with `-open`

The attached mini project open_test.tar.gz consists of:

$ tree
.
├── bin
│   ├── dune
│   └── test_bin.ml
├── dune-project
├── lib
│   ├── dune
│   └── test_lib.ml
├── test_bin.ml
└── test_lib.opam

How should reanalyze be invoked in this case. I think the crux is that there is a library under lib that is used in a binary under bin, and so they have different directories of object files. I have tried various choices of root directory. E.g. reanalyze -dce-cmt _build/default/lib/ reports values that are used in the bin as dead, as expected. On the other hand, reanalyze -dce-cmt _build/default/ does not report anything as dead even though Test_lib.N.y is dead.

Additionally, test_bin is compiled with -open Test_lib, which might complicate things.

[Question][ReScript] Possible to track promise rejections?

I wonder if it is possible to extend exception tracking to raise(Exception) and Js.Promise.reject called inside Js.Promise.make and Js.Promise.then and ensure they are caught with Js.Promise.catch.

If this works then it would be great for ReScript since that world doesn't use Result.t and converting between exception and Result can be painful.

Using Result type itself with promises can be painful when you often run into types like result<Promise.t<result<option<_>, _>>, _>. And for code paths that is most the green path, allocation Result types every time is less attractive than just raising exceptions, however we do lose type checking.

This would also be a huge sell when people compare ReScript to other languages like TypeScript since the latter has untyped errors and provide no help in tracking which errors can be caught.

+emptyArray is never used for any react jsx element

Running on my project, i'm getting " +emptyArray is never used" for a lot of reasonreact elements.

Warning 32 ought to be respected

Attributes that disable warning 32 in sections of source code are used to silence the compiler's dead code warnings. It would be best if reanalyze were to also respect these warnings. For example, deriving compare in signatures is careful to disable warning 32 to prevent reports on generated code:

let compare_int = compare

module M : sig
  type t = int [@@deriving_inline compare]

  include sig
    [@@@ocaml.warning "-32"]

    val compare : t -> t -> int
  end
  [@@ocaml.doc "@inline"]

  [@@@end]
end = struct
  type t = int [@@deriving_inline compare]

  let _ = fun (_ : t) -> ()
  let compare = (compare_int : t -> t -> int)
  let _ = compare

  [@@@end]
end

This is used in generated code, but also in non-generated code, such as here.

`-write` removes trailing newline from modified file

Running reanalyze 389dd68 on Goblint goblint/analyzer@a544002 revealed the following.

When -write modifies a file to add dead code attributes, it removes the trailing newline from the file. Since all other code style is nicely preserved, it shouldn't be removing that.

Exception analysis: bs-json

Run the analysis bs-json and confirm what exceptions are raised glennsl/bs-json#70
Add models for bs-json
Test on some projects, e.g. https://github.com/reason-association/reasonml.org

Give warning when `@doesNotRaise` is used but not needed.

Add a watch mode?

Thanks for this awesome tool!

While going through a repo and resolving things that are brough up, it would be nice to get a stream of remaining dead code left.

Add support for belt

@inline seems to negate @live/@dead

@inline @live
let someValue = "something"

reanalyze reports someValue as dead even though it's annotated with @live. Same when using @dead. Removing @inline makes it all work as intended again.

Unremovable constructor reported as dead

In code such as:

module M : sig
  type t = A of int | B of int

  val _A : int -> t
  val get : t -> int
end = struct
  type t = A of int | B of int

  let _A field = A field
  let get = function A field | B field -> field
end

let () =
  print_int (M.get (M._A 0)) ;
  print_int (M.get (M.B 0))

Executing dune build @check @all && reanalyze -dce-cmt _build/default reports:

  Warning Dead Type
  dce_test.ml 2:12-19
  M.t.+A is a variant case which is never constructed
  <-- line 2
    type t = A of int [@dead "M.t.+A"]  | B of int

On one hand, it is true that the constructor A is not used to construct a value outside of M. But on the other hand it is used to construct values within M, and it cannot be excluded from the type t in the signature. So this report seems to be correct in some sense, but is noisy since there is no possible action to take. Or am I overlooking something?

dce_test.tar.gz

Publish the project on opam-repository

Hello, I'm one of the maintainers of https://github.com/ocaml/opam-repository and I've been using your project here and there and it looks really really good I must say.
However it is currently sadly not available through opam. Would it be possible to release it there? I can help you do that, or even do the appropriate PR for you, if needed.

Support implementation/interface in optional arg analysis.

Support raise @@ Exn

Currently only raise(Exn) is supported.
Also support raise @@ Exn, and apply some hygiene rules: warn on raise being used in any other context.

E.g. raise is used indirectly: the resulting exception won't be recognized.

Dead code warnings on ppx generated code

The measures that ppx code generators take to avoid the compiler's various "unused" warnings do not work for reanalyze, and result in dead code warnings on generated code. For instance:

let compare_int = compare

type t = int [@@deriving_inline compare]

let _ = fun (_ : t) -> ()
let compare = (compare_int : t -> t -> int)
let _ = compare

[@@@end]

leads to

  Warning Dead Value
  dce_test.ml 5:1-25
  _ has no side effects and can be removed
  <-- line 5
  let _ = fun (_ : t) -> () [@@dead "_"]

  Warning Dead Value
  dce_test.ml 7:1-15
  _ has no side effects and can be removed
  <-- line 7
  let _ = compare [@@dead "_"]

These let _ = compare bindings are generated in order to silence warnings, but reanalyze warns on them directly.

Perhaps it would make sense to omit warnings on values (and types, and...) whose names start with _, as this is what the compiler does. Perhaps with a mode to warn on them anyhow.

dce_test.tar.gz

Fatal error: exception End_of_file

Hi!

node_modules/.bin/reanalyze -dce
Fatal error: exception End_of_file

This is running Reanalyze in a mid-sized project. Since the error is quite sparse, what can I do to help debug it? What type of information can I supply?

Incorrect Division_by_zero warning in ReScript

let f = x => x / 2

> reanalyze -exception

  Exception Analysis
  File "test.res", line 1, characters 4-5
  f might raise Division_by_zero (test.res:1:15) and is not annotated with @raises Division_by_zero
  
  Analysis reported 1 issues (Exception Analysis:1)

This isn't accurate, since ReScript compiles to JS which does not raise this exception.

function f(x) {
  return x / 2 | 0;
}

Edit: it seems to have the same problem with mod.

"ModuleName.+unsafe_expr is never used" incorrect report for graphql ppx module

not sure for other info is needed. simply incorrect report for a used module/expression.
The expression in question looks somewhat like:

module ModuleName = [%graphql
  {|
      mutation XXX($x: ID!, $y: String!) {
        xxx(pageId: $x, text: $y){
...
        }
      }
  |}
];

GraphQL PPX example

// generated code by the PPX
module MyQuery = {
  module Raw = {
    type t = {scalarsInput: string};
  };
  let query = "query ($arg: VariousScalarsInput!)  {\nscalarsInput(arg: $arg)  \n}\n";
  type t = {scalarsInput: string};
  type t_variables = {arg: t_variables_VariousScalarsInput}
  and t_variables_VariousScalarsInput = {
    nullableString: option(string),
    string,
    nullableInt: option(int),
    int,
    nullableFloat: option(float),
    float,
    nullableBoolean: option(bool),
    boolean: bool,
    nullableID: option(string),
    id: string,
  };
  let parse: Raw.t => t =
    (value) => (
      {

        scalarsInput: {
          let value = (value: Raw.t).scalarsInput;

          value;
        },
      }: t
    );
  let serialize: t => Raw.t =
    (value) => (
      {
        let scalarsInput = {
          let value = (value: t).scalarsInput;

          value;
        };
        {

          scalarsInput: scalarsInput,
        };
      }: Raw.t
    );
  let rec serializeVariables: t_variables => Js.Json.t =
    inp =>
      [|
        (
          "arg",
          (a => Some(serializeInputObjectVariousScalarsInput(a)))(inp.arg),
        ),
      |]
      |> Js.Array.filter(
           fun
           | (_, None) => false
           | (_, Some(_)) => true,
         )
      |> Js.Array.map(
           fun
           | (k, Some(v)) => (k, v)
           | (k, None) => (k, Js.Json.null),
         )
      |> Js.Dict.fromArray
      |> Js.Json.object_
  and serializeInputObjectVariousScalarsInput:
    t_variables_VariousScalarsInput => Js.Json.t =
    inp =>
      [|
        (
          "nullableString",
          (
            a =>
              switch (a) {
              | None => None
              | Some(b) => (a => Some(Js.Json.string(a)))(b)
              }
          )(
            inp.nullableString,
          ),
        ),
        ("string", (a => Some(Js.Json.string(a)))(inp.string)),
        (
          "nullableInt",
          (
            a =>
              switch (a) {
              | None => None
              | Some(b) => (a => Some(Js.Json.number(float_of_int(a))))(b)
              }
          )(
            inp.nullableInt,
          ),
        ),
        ("int", (a => Some(Js.Json.number(float_of_int(a))))(inp.int)),
        (
          "nullableFloat",
          (
            a =>
              switch (a) {
              | None => None
              | Some(b) => (a => Some(Js.Json.number(a)))(b)
              }
          )(
            inp.nullableFloat,
          ),
        ),
        ("float", (a => Some(Js.Json.number(a)))(inp.float)),
        (
          "nullableBoolean",
          (
            a =>
              switch (a) {
              | None => None
              | Some(b) => (a => Some(Js.Json.boolean(a)))(b)
              }
          )(
            inp.nullableBoolean,
          ),
        ),
        ("boolean", (a => Some(Js.Json.boolean(a)))(inp.boolean)),
        (
          "nullableID",
          (
            a =>
              switch (a) {
              | None => None
              | Some(b) => (a => Some(Js.Json.string(a)))(b)
              }
          )(
            inp.nullableID,
          ),
        ),
        ("id", (a => Some(Js.Json.string(a)))(inp.id)),
      |]
      |> Js.Array.filter(
           fun
           | (_, None) => false
           | (_, Some(_)) => true,
         )
      |> Js.Array.map(
           fun
           | (k, Some(v)) => (k, v)
           | (k, None) => (k, Js.Json.null),
         )
      |> Js.Dict.fromArray
      |> Js.Json.object_;
  let makeVar = (~f, ~arg, ()) =>
    f(
      serializeVariables(
        {

          arg: arg,
        }: t_variables,
      ),
    )
  and makeInputObjectVariousScalarsInput =
      (
        ~nullableString=?,
        ~string,
        ~nullableInt=?,
        ~int,
        ~nullableFloat=?,
        ~float,
        ~nullableBoolean=?,
        ~boolean,
        ~nullableID=?,
        ~id,
        (),
      )
      : t_variables_VariousScalarsInput => {

    nullableString,

    string,

    nullableInt,

    int,

    nullableFloat,

    float,

    nullableBoolean,

    boolean,

    nullableID,

    id,
  };
  let definition = (parse, query, makeVar, serialize);
  let makeVariables = makeVar(~f=f => f);
};
// end of generated code by the ppx

Not sure if it's the best example. parse, serialize, makeVariables. definition and the input object constructors may or may not be used. In most cases only definition and makeVariables are used (definition passed to the GraphQL client and makeVariables to construct variables). But I think that is a tricky case because definition is referencing everything.

Incorrectly reports dead value when using first class module as a react component argument.

module type ReplacebleComponent = {
  [@react.component]
  let make: unit => React.element;
};

module Component = {
  [@react.component]
  let make = () => <div> "A"->React.string </div>;
};

module Container = {
  [@react.component]
  let make = (~impl: (module ReplacebleComponent)) => {
    let (module X) = impl;

    <X />;
  };
};

[@live]
let out = <Container impl=(module Component) />;

Reports:

Warning Dead Value
  File "/test/ReanalyzeTest.re", line 14, characters 4-25
  Container.+X is never used
  <-- line 14
      [@dead "Container.+X"] let (module X) = impl;

More DCE checks.

Collection of possible additional DCE checks:

An optional arg is never passed.
An optional arg is always passed.
Apply the same checks for optional args to components and JSX.
A record field of type option(t) is always None, or always Some.

False alarm on record field

In the attached project executing:

dune build @check @all && Debug=1 reanalyze -dce-cmt _build/default

reports:

File "trace/trace.ml", line 3, characters 11-28
+pf.+pf is a record label never used to read a value
  <-- line 3
  type pf = {pf: 'a. 'a printf [@dead "+pf.+pf"] }

This is complaining about the type definition:

type pf = {pf: 'a. 'a printf}

but there is a use:

Trace.infok "" "" (fun {pf} -> pf "%s" "")

since Trace.infok is s.t.:

val infok : string -> string -> (pf -> 'a) -> 'a

Notably, removing the trace.mli file eliminates the report. Perhaps the declaration in the interface is not correctly identified with the one in the implementation.

Confusing message when nothing's actually raised

The code doesn't actually raise anything, so the error message shows this:

It's really saying "might raise ". Aka it won't raise.

The notion of dead record label is not clear.

Thanks @jchavarri for reporting it.

type userData = {
  a: bool,
  b: int,
  c: string,
};

type tabState = {
  a: bool,
  b: int,
  c: string,
  d: int,
  e: int,
  f: string,
};

let userData =
    ({a, b, c}): userData => {
  a,
  b,
  c,
};

Js.log(userData);

Exception false positive report

let toMinMax = (str: string) => {
  switch (Js.String.split("-", str)) {
  | [|min, max|] => {min: int_of_string(min), max: int_of_string(max)}
  | _ => {min: (-1), max: (-1)}
  };
};

Exception Analysis
toMinMax might raise Failure and is not annotated with @raises Failure

If any pattern is present it should be safe?

private types and "a variant case which is never constructed" warning

If you have a private variant type, then it will always trigger a "variant case which is never constructed" warning. Of course, because it's private, it can't be constructed outside of the implementation.

Allow configuration via bsconfig.json

It would be slick to be able to provide configuration via bsconfig.json. Real world use cases I had in mind after running reanalyze on a fairly large and diverse project:

Having a bindings folder with ad hoc bindings, where you really want to consider all of the code in there live (like it's an actual lib).
Using things like Storybook that expects you to have a bunch of files/components that are exported to Storybook, but that reanalyze will think are dead since they're never used in the app itself.

I believe this is possible to suppress via CLI args right now. But, it'd be slick to be able to configure it via bsconfig.json too, so that general tooling (like a dead code mode in the editor extension) can take advantage of the config too. Maybe letting the CLI args override the eventual values in bsconfig.json?

Local module alias is not followed for exception analysis

I have some code that shadows an exception-throwing module with a safer one, but reanalyze still flags the code as throwing an exception. This is a contrived example, but it does replicate the issue:

module Array = Belt.Array;

let rec search = (arr, idx, predicate) =>
  switch (arr->Array.get(idx)) {
  | Some((k, v)) => predicate(k, v) ? search(arr, idx + 1, predicate) : false
  | None => true
  };

The analysis reports search might raise Invalid_argument (File.re:4:17) and is not annotated with @raises Invalid_argument. It appears reanalyse is incorrectly resolving this reference as OCaml's default Array.get.

Explicitly referencing the Belt module raises no warnings:

let rec search = (arr, idx, predicate) =>
  switch (arr->Belt.Array.get(idx)) {
  | Some((k, v)) => predicate(k, v) ? search(arr, idx + 1, predicate) : false
  | None => true
  };

I am using version 2.10.0

Explore corner cases of exception analysis.

A corner case is a case where it's not clear whether or not one would like to report, even though it is true that the function raises.

Here's one:

String.make(length, ' ')

The call throws if length is negative. And the analysis can become noisy when mistakes are rare, but the analysis always reports.
Might be best to postpone corner cases until the entire analysis flow is more developed.

Investigate adding models for atdgen.

https://github.com/ahrefs/atd/search?q=raise&unscoped_q=raise

[Feature request] Exhaustive dependencies checks for ReasonReact hooks

Greetings everyone! I asked in ReasonML forum if anyone was aware of a way to check for exhaustive deps checks when using ReasonReact hooks. I really think that this issue is a great source of bugs in React, but in JS world there are tons of things like this, and obviously the ecosystem tools are lifesavers.

In ReasonML most of the source of bugs are handled at compile-time, but unfortunately this is not the case with exhaustive deps. I really believe that there should be a way to express React hooks using an abstraction layer that could detect these kind of issue at compile-time (maybe using ppx?), but I still don't know the language enough to write this sort of implementation (again, if it is possible).

I think that implementing an exhaustive deps check in reanalyze would help a lot -- it took me hours to find one missing dependency in one hook for a toy project. Do you think that it would be something doable?

rescript-association / reanalyze Goto Github PK

reanalyze's People

Contributors

Stargazers

Watchers

Forkers

reanalyze's Issues

Recommend Projects

Recommend Topics

Recommend Org