rescript-association / reanalyze Goto Github PK
View Code? Open in Web Editor NEWExperimental analyses for ReScript and OCaml: globally dead values/types, exception analysis, and termination analysis.
License: MIT License
Experimental analyses for ReScript and OCaml: globally dead values/types, exception analysis, and termination analysis.
License: MIT License
In the implementation of a library, it is usual for much of it to be dead, as there is no client that actually calls the entry points. In order to analyze libraries, it would be good to be able to specify an interface file (or maybe several) that constitutes the exposed library api. Everything in that file ought to then be interpreted as @live
.
Both Yojson.Basic.from_string
and Basic.from_string
after open Yojson
.
I have a project which has the "usual" bin
/lib
structure. The binary uses the library in lib
, which exposes a function that raises an exception. Is there a way to make the exception analyzer be aware that the code calling this function raises an exception?
I have written the example here. More concretely, I think I would like hello_world.ml:8:10 Call(Mylib.Lib.trace, modulePath:)
to use the information from the corresponding lib.ml
that was already processed, and warn that it might raise Bad
. Is there a way of doing this? Thanks!
Currently the exception analysis reports on x / 4
.
Would it be possible to somehow track unused fields on records?
The reason I ask is because both reason-relay
and graphql_ppx
(v1.0 of graphql_ppx
which is just around the corner) output records to describe the data coming back from the GraphQL API through fragments/queries/subscriptions. If we can use Reanalyze to (opt in?) track whether you actually use what you select in your GraphQL operations (through access of the record fields), that'd be a really big win, because it'd mean we can help the user to not fetch more data than what's needed.
This has been attempted via eslint
plugins before, but the result has never really been that solid to my knowledge. Maybe we have a better chance of doing this well in ReasonML with Reanalyze?
I understand there's nuances to this, but I figured we could get a discussion going.
cc @jfrolich who I'm sure would be interested in this too, and @sgrove who I know have expressed interest for similar features previously.
Use case: ffi to react components.
It would be nice if there was a way to define a whitelist of value names that should be considered live in any module signature. That way basic operations that every type could have (e.g. compare
, hash
, hash_fold_t
, t_of_sexp
, sexp_of_t
) could be whitelisted and the source would not need to be cluttered up the source all over the place with e.g.:
include sig
[@@@ocaml.warning "-32"]
val compare : t -> t -> int
val hash : t -> int
val hash_fold_t : t Hash.folder
val t_of_sexp : Sexp.t -> t
val sexp_of_t : t -> Sexp.t
end
When a module defined by a ml-mli pair of files is dead, it would be nice to report that concisely rather than to report that each item in the module is dead.
Can't even catch this one. Will require the user to annotate all the callsites. Which is appropriate.
let rec subList = (b, e, l) =>
switch (l) {
| [] => failwith("subList")
| [h, ...t] =>
let tail =
if (e == 0) {
[];
} else {
subList(b - 1, e - 1, t);
};
if (b > 0) {
tail;
} else {
[h, ...tail];
};
};
This reports on tail
but not on subList
.
We've found cases where we used a Pervasives, atd, or yojson function without a proper try/catch, and it didn't hit us until it was in production, at which point the lack any meaningful stack traces in Async Reason meant that we had a challenging time tracking down the source.
This is made worse by the fact that our code base has grown into using the Result
type, and there are still a few places that throw even in our own code. Now that we've grown to a certain size and experience, it'd be nice if we had a way of generating a TODO list to clean up all of these exceptions, and to know that they could never throw and escape the type system.
Ocamldoc has special annotation @raise Invalid_argument
that we can add in the doc comment.
And when building the doc we can see it. A example of the stdlib https://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html#VALinit and in the code https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.mli#L82. We can see that the Exception is highlighted in the generated html.
Atm reanalyze support only [@raise ]
in the implementation file as attribute so it is not read by ocamldoc. It would be convenient if reanalyze could read doc comment and suggest to add in the comment. This way it help us to write good comment.
If I add a [@dead]
annotation to the declaration of a function f
in an interface file, when f
is actually live, then there is a report that the implementation is dead (assuming all the uses are from other modules). It would be more useful if such incorrect annotations were reported as such, rather than trusting them and misreporting on the implementation. I'm thinking of the use case of using the dead annotations as a way to record the current state of the dead code analysis, and thinking of what happens when a new use of a previously dead value is added. In that case, the old dead annotation is incorrect, and it would be helpful to report that.
Use case: Defining things like configuration objects that are only intended to be passed into externals.
type config = {size: int}
type something
@module("something") @val
external makeSomething: config => something = "create"
@live
let something = makeSomething({size: 5})
reanalyze
complains that size
in config
is never used to read a value. The intention however isn't to ever read from the record, only construct it and pass it into the external.
Adding @live
to the field alleviates the problem, and with the new "suppress warning for a full type", maybe the best solution is to use that. However, if it could be inferred that the record is in fact passed into an external, that'd be pretty slick. I worry a bit that having to annotate things you do often becomes noisy. What do you think?
Hi!
I'm wondering if there's a way to run DCE on only a part of the code base's types. Let me explain my use case:
I'm building and maintaining a way of using Relay (FB GraphQL client fw) with ReScript. Relay has its own compiler that'll take a GraphQL definition written inside of ReScript, and generate an actual file with type definitions etc matching that GraphQL definition. The ReScript code can then reference those types for type safe interaction with the data Relay gives it. It looks roughly like this:
Define a GraphQL operation:
https://github.com/zth/rescript-relay/blob/master/example/src/SingleTodo.res#L1-L7
The compiler generates a file with records representing the ReScript types that GraphQL can return:
https://github.com/zth/rescript-relay/blob/master/example/src/__generated__/SingleTodo_todoItem_graphql.res#L6-L11
All of the generated files are put in a single directory, __generated__
.
Now, sometimes you forget to remove selections from your GraphQL definitions when you stop needing/using the data. That in turn causes overfetching. What I'd like to be able to do is to run DCE analysis only on the records inside of the module Types
in each res
file in the __generated__
folder, since it's those records that represent the actual fields in GraphQL that the developer has selected. So, basically, via DCE figure out if there are record fields that aren't accessed anymore, and that can be safely removed.
I'm guessing this isn't possible as of now, but do you understand the use case/what I'm after...?
Currently inner modules are not tracked correctly, in terms of path resolution.
foo
defined inside inner module M.N
should be stores as M.N.foo
.foo
from within inner module M.N
should look up M.N.foo
then M.foo
then foo
.On some code I see many Results for ... could be inaccurate because of circular references
warnings. More information on what the cycle of references is would make these possibly debuggable.
The attached mini project open_test.tar.gz consists of:
$ tree
.
├── bin
│ ├── dune
│ └── test_bin.ml
├── dune-project
├── lib
│ ├── dune
│ └── test_lib.ml
├── test_bin.ml
└── test_lib.opam
How should reanalyze be invoked in this case. I think the crux is that there is a library under lib
that is used in a binary under bin
, and so they have different directories of object files. I have tried various choices of root directory. E.g. reanalyze -dce-cmt _build/default/lib/
reports values that are used in the bin as dead, as expected. On the other hand, reanalyze -dce-cmt _build/default/
does not report anything as dead even though Test_lib.N.y
is dead.
Additionally, test_bin
is compiled with -open Test_lib
, which might complicate things.
I wonder if it is possible to extend exception tracking to raise(Exception)
and Js.Promise.reject
called inside Js.Promise.make
and Js.Promise.then
and ensure they are caught with Js.Promise.catch
.
If this works then it would be great for ReScript since that world doesn't use Result.t and converting between exception and Result can be painful.
Using Result type itself with promises can be painful when you often run into types like result<Promise.t<result<option<_>, _>>, _>
. And for code paths that is most the green path, allocation Result types every time is less attractive than just raising exceptions, however we do lose type checking.
This would also be a huge sell when people compare ReScript to other languages like TypeScript since the latter has untyped errors and provide no help in tracking which errors can be caught.
Running on my project, i'm getting " +emptyArray is never used" for a lot of reasonreact elements.
Attributes that disable warning 32 in sections of source code are used to silence the compiler's dead code warnings. It would be best if reanalyze were to also respect these warnings. For example, deriving compare
in signatures is careful to disable warning 32 to prevent reports on generated code:
let compare_int = compare
module M : sig
type t = int [@@deriving_inline compare]
include sig
[@@@ocaml.warning "-32"]
val compare : t -> t -> int
end
[@@ocaml.doc "@inline"]
[@@@end]
end = struct
type t = int [@@deriving_inline compare]
let _ = fun (_ : t) -> ()
let compare = (compare_int : t -> t -> int)
let _ = compare
[@@@end]
end
This is used in generated code, but also in non-generated code, such as here.
Running reanalyze 389dd68 on Goblint goblint/analyzer@a544002 revealed the following.
When -write
modifies a file to add dead code attributes, it removes the trailing newline from the file. Since all other code style is nicely preserved, it shouldn't be removing that.
Run the analysis bs-json and confirm what exceptions are raised glennsl/bs-json#70
Add models for bs-json
Test on some projects, e.g. https://github.com/reason-association/reasonml.org
Thanks for this awesome tool!
While going through a repo and resolving things that are brough up, it would be nice to get a stream of remaining dead code left.
@inline @live
let someValue = "something"
reanalyze
reports someValue
as dead even though it's annotated with @live
. Same when using @dead
. Removing @inline
makes it all work as intended again.
In code such as:
module M : sig
type t = A of int | B of int
val _A : int -> t
val get : t -> int
end = struct
type t = A of int | B of int
let _A field = A field
let get = function A field | B field -> field
end
let () =
print_int (M.get (M._A 0)) ;
print_int (M.get (M.B 0))
Executing dune build @check @all && reanalyze -dce-cmt _build/default
reports:
Warning Dead Type
dce_test.ml 2:12-19
M.t.+A is a variant case which is never constructed
<-- line 2
type t = A of int [@dead "M.t.+A"] | B of int
On one hand, it is true that the constructor A
is not used to construct a value outside of M
. But on the other hand it is used to construct values within M
, and it cannot be excluded from the type t
in the signature. So this report seems to be correct in some sense, but is noisy since there is no possible action to take. Or am I overlooking something?
Hello, I'm one of the maintainers of https://github.com/ocaml/opam-repository and I've been using your project here and there and it looks really really good I must say.
However it is currently sadly not available through opam. Would it be possible to release it there? I can help you do that, or even do the appropriate PR for you, if needed.
Currently only raise(Exn)
is supported.
Also support raise @@ Exn
, and apply some hygiene rules: warn on raise
being used in any other context.
E.g. raise
is used indirectly: the resulting exception won't be recognized.
The measures that ppx code generators take to avoid the compiler's various "unused" warnings do not work for reanalyze, and result in dead code warnings on generated code. For instance:
let compare_int = compare
type t = int [@@deriving_inline compare]
let _ = fun (_ : t) -> ()
let compare = (compare_int : t -> t -> int)
let _ = compare
[@@@end]
leads to
Warning Dead Value
dce_test.ml 5:1-25
_ has no side effects and can be removed
<-- line 5
let _ = fun (_ : t) -> () [@@dead "_"]
Warning Dead Value
dce_test.ml 7:1-15
_ has no side effects and can be removed
<-- line 7
let _ = compare [@@dead "_"]
These let _ = compare
bindings are generated in order to silence warnings, but reanalyze warns on them directly.
Perhaps it would make sense to omit warnings on values (and types, and...) whose names start with _
, as this is what the compiler does. Perhaps with a mode to warn on them anyhow.
Hi!
node_modules/.bin/reanalyze -dce
Fatal error: exception End_of_file
This is running Reanalyze in a mid-sized project. Since the error is quite sparse, what can I do to help debug it? What type of information can I supply?
let f = x => x / 2
> reanalyze -exception
Exception Analysis
File "test.res", line 1, characters 4-5
f might raise Division_by_zero (test.res:1:15) and is not annotated with @raises Division_by_zero
Analysis reported 1 issues (Exception Analysis:1)
This isn't accurate, since ReScript compiles to JS which does not raise this exception.
function f(x) {
return x / 2 | 0;
}
Edit: it seems to have the same problem with mod
.
not sure for other info is needed. simply incorrect report for a used module/expression.
The expression in question looks somewhat like:
module ModuleName = [%graphql
{|
mutation XXX($x: ID!, $y: String!) {
xxx(pageId: $x, text: $y){
...
}
}
|}
];
// generated code by the PPX
module MyQuery = {
module Raw = {
type t = {scalarsInput: string};
};
let query = "query ($arg: VariousScalarsInput!) {\nscalarsInput(arg: $arg) \n}\n";
type t = {scalarsInput: string};
type t_variables = {arg: t_variables_VariousScalarsInput}
and t_variables_VariousScalarsInput = {
nullableString: option(string),
string,
nullableInt: option(int),
int,
nullableFloat: option(float),
float,
nullableBoolean: option(bool),
boolean: bool,
nullableID: option(string),
id: string,
};
let parse: Raw.t => t =
(value) => (
{
scalarsInput: {
let value = (value: Raw.t).scalarsInput;
value;
},
}: t
);
let serialize: t => Raw.t =
(value) => (
{
let scalarsInput = {
let value = (value: t).scalarsInput;
value;
};
{
scalarsInput: scalarsInput,
};
}: Raw.t
);
let rec serializeVariables: t_variables => Js.Json.t =
inp =>
[|
(
"arg",
(a => Some(serializeInputObjectVariousScalarsInput(a)))(inp.arg),
),
|]
|> Js.Array.filter(
fun
| (_, None) => false
| (_, Some(_)) => true,
)
|> Js.Array.map(
fun
| (k, Some(v)) => (k, v)
| (k, None) => (k, Js.Json.null),
)
|> Js.Dict.fromArray
|> Js.Json.object_
and serializeInputObjectVariousScalarsInput:
t_variables_VariousScalarsInput => Js.Json.t =
inp =>
[|
(
"nullableString",
(
a =>
switch (a) {
| None => None
| Some(b) => (a => Some(Js.Json.string(a)))(b)
}
)(
inp.nullableString,
),
),
("string", (a => Some(Js.Json.string(a)))(inp.string)),
(
"nullableInt",
(
a =>
switch (a) {
| None => None
| Some(b) => (a => Some(Js.Json.number(float_of_int(a))))(b)
}
)(
inp.nullableInt,
),
),
("int", (a => Some(Js.Json.number(float_of_int(a))))(inp.int)),
(
"nullableFloat",
(
a =>
switch (a) {
| None => None
| Some(b) => (a => Some(Js.Json.number(a)))(b)
}
)(
inp.nullableFloat,
),
),
("float", (a => Some(Js.Json.number(a)))(inp.float)),
(
"nullableBoolean",
(
a =>
switch (a) {
| None => None
| Some(b) => (a => Some(Js.Json.boolean(a)))(b)
}
)(
inp.nullableBoolean,
),
),
("boolean", (a => Some(Js.Json.boolean(a)))(inp.boolean)),
(
"nullableID",
(
a =>
switch (a) {
| None => None
| Some(b) => (a => Some(Js.Json.string(a)))(b)
}
)(
inp.nullableID,
),
),
("id", (a => Some(Js.Json.string(a)))(inp.id)),
|]
|> Js.Array.filter(
fun
| (_, None) => false
| (_, Some(_)) => true,
)
|> Js.Array.map(
fun
| (k, Some(v)) => (k, v)
| (k, None) => (k, Js.Json.null),
)
|> Js.Dict.fromArray
|> Js.Json.object_;
let makeVar = (~f, ~arg, ()) =>
f(
serializeVariables(
{
arg: arg,
}: t_variables,
),
)
and makeInputObjectVariousScalarsInput =
(
~nullableString=?,
~string,
~nullableInt=?,
~int,
~nullableFloat=?,
~float,
~nullableBoolean=?,
~boolean,
~nullableID=?,
~id,
(),
)
: t_variables_VariousScalarsInput => {
nullableString,
string,
nullableInt,
int,
nullableFloat,
float,
nullableBoolean,
boolean,
nullableID,
id,
};
let definition = (parse, query, makeVar, serialize);
let makeVariables = makeVar(~f=f => f);
};
// end of generated code by the ppx
Not sure if it's the best example. parse
, serialize
, makeVariables
. definition
and the input object constructors may or may not be used. In most cases only definition
and makeVariables
are used (definition
passed to the GraphQL client and makeVariables
to construct variables). But I think that is a tricky case because definition
is referencing everything.
module type ReplacebleComponent = {
[@react.component]
let make: unit => React.element;
};
module Component = {
[@react.component]
let make = () => <div> "A"->React.string </div>;
};
module Container = {
[@react.component]
let make = (~impl: (module ReplacebleComponent)) => {
let (module X) = impl;
<X />;
};
};
[@live]
let out = <Container impl=(module Component) />;
Reports:
Warning Dead Value
File "/test/ReanalyzeTest.re", line 14, characters 4-25
Container.+X is never used
<-- line 14
[@dead "Container.+X"] let (module X) = impl;
Collection of possible additional DCE checks:
option(t)
is always None
, or always Some
.In the attached project executing:
dune build @check @all && Debug=1 reanalyze -dce-cmt _build/default
reports:
File "trace/trace.ml", line 3, characters 11-28
+pf.+pf is a record label never used to read a value
<-- line 3
type pf = {pf: 'a. 'a printf [@dead "+pf.+pf"] }
This is complaining about the type definition:
type pf = {pf: 'a. 'a printf}
but there is a use:
Trace.infok "" "" (fun {pf} -> pf "%s" "")
since Trace.infok
is s.t.:
val infok : string -> string -> (pf -> 'a) -> 'a
Notably, removing the trace.mli
file eliminates the report. Perhaps the declaration in the interface is not correctly identified with the one in the implementation.
Thanks @jchavarri for reporting it.
type userData = {
a: bool,
b: int,
c: string,
};
type tabState = {
a: bool,
b: int,
c: string,
d: int,
e: int,
f: string,
};
let userData =
({a, b, c}): userData => {
a,
b,
c,
};
Js.log(userData);
let toMinMax = (str: string) => {
switch (Js.String.split("-", str)) {
| [|min, max|] => {min: int_of_string(min), max: int_of_string(max)}
| _ => {min: (-1), max: (-1)}
};
};
Exception Analysis
toMinMax might raise Failure and is not annotated with @raises Failure
If any pattern is present it should be safe?
If you have a private variant type, then it will always trigger a "variant case which is never constructed" warning. Of course, because it's private, it can't be constructed outside of the implementation.
It would be slick to be able to provide configuration via bsconfig.json
. Real world use cases I had in mind after running reanalyze
on a fairly large and diverse project:
bindings
folder with ad hoc bindings, where you really want to consider all of the code in there live (like it's an actual lib).reanalyze
will think are dead since they're never used in the app itself.I believe this is possible to suppress via CLI args right now. But, it'd be slick to be able to configure it via bsconfig.json too, so that general tooling (like a dead code mode in the editor extension) can take advantage of the config too. Maybe letting the CLI args override the eventual values in bsconfig.json?
I have some code that shadows an exception-throwing module with a safer one, but reanalyze still flags the code as throwing an exception. This is a contrived example, but it does replicate the issue:
module Array = Belt.Array;
let rec search = (arr, idx, predicate) =>
switch (arr->Array.get(idx)) {
| Some((k, v)) => predicate(k, v) ? search(arr, idx + 1, predicate) : false
| None => true
};
The analysis reports search might raise Invalid_argument (File.re:4:17) and is not annotated with @raises Invalid_argument
. It appears reanalyse is incorrectly resolving this reference as OCaml's default Array.get
.
Explicitly referencing the Belt module raises no warnings:
let rec search = (arr, idx, predicate) =>
switch (arr->Belt.Array.get(idx)) {
| Some((k, v)) => predicate(k, v) ? search(arr, idx + 1, predicate) : false
| None => true
};
I am using version 2.10.0
A corner case is a case where it's not clear whether or not one would like to report, even though it is true that the function raises.
Here's one:
String.make(length, ' ')
The call throws if length
is negative. And the analysis can become noisy when mistakes are rare, but the analysis always reports.
Might be best to postpone corner cases until the entire analysis flow is more developed.
Greetings everyone! I asked in ReasonML forum if anyone was aware of a way to check for exhaustive deps checks when using ReasonReact hooks. I really think that this issue is a great source of bugs in React, but in JS world there are tons of things like this, and obviously the ecosystem tools are lifesavers.
In ReasonML most of the source of bugs are handled at compile-time, but unfortunately this is not the case with exhaustive deps. I really believe that there should be a way to express React hooks using an abstraction layer that could detect these kind of issue at compile-time (maybe using ppx?), but I still don't know the language enough to write this sort of implementation (again, if it is possible).
I think that implementing an exhaustive deps check in reanalyze would help a lot -- it took me hours to find one missing dependency in one hook for a toy project. Do you think that it would be something doable?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.