GHC Proposals

This repository contains specifications for proposed changes to the Glasgow Haskell Compiler. The purpose of the GHC proposal process and of the GHC Steering Committee is to broaden the discussion of the evolution of GHC.

What is a proposal?

A GHC Proposal is a document describing a proposed change to the compiler, the GHC/Haskell language, or the libraries in the GHC.* module namespace. These include,

  • A syntactic change to GHC/Haskell (e.g. the various ShortImports proposals, do expressions without $)
  • A major change to the user-visible behaviour of the compiler (e.g. the recent change in super-class solving, and -Wall behavior)
  • The addition of major features to the compiler (e.g. -XTypeInType, GHCi commands, type-indexed Typeable representations)

Changes to the GHC API or the plugin API are not automatically within the scope of the committee, and can be contributed following the usual GHC workflow. Should the GHC maintainers deem a change significant or controversial enough to warrant that, they may, at their discretion, involve the committee and ask the contributor to write a formal proposal.

Proposals are evaluated against our principles, which cover both language design and language stability.

The life cycle of a proposal

This section outlines the stages a proposal may go through. Each stage is indicated by a GitHub label, noted in the following list.

  1. (No label.) The author drafts a proposal.

    What is a proposal?

  2. (No label.) The author submits the proposal to the wider Haskell community for discussion, as a pull request against this repository.

    How to submit a proposal

  3. (No label.) The wider community discusses the proposal in the comments section of the pull request, while the author refines the proposal. This phase lasts as long as necessary.

    Discussion goals How to comment on a proposal ≡ List of proposals under discussion

  4. Label: Pending shepherd recommendation. Eventually the proposal author brings the proposal before the committee for review.

    How to bring a proposal before the committee Who is the committee? ≡ List of proposals waiting for shepherd

  5. Label: Pending committee review. One committee member steps up as a shepherd, and generates consensus within the committee within four or five weeks.

    Committee process Review criteria ≡ List of proposals under review

  6. Eventually, the committee rejects a proposal (label: Rejected), passes it back to the author for revision (label: Needs revision), or accepts it (label: Accepted).

    Acceptance of the proposal implies that the implementation will be accepted into GHC provided it is well-engineered, well-documented, conforms to the specification and does not complicate the code-base too much. However, the GHC maintainers may reject an implementation if there turn out to be significant gaps in the specification, unforeseen interactions with existing features, or unexpected breaking changes not covered by the backwards compatibility assessment. In this case the proposal should be revised.

    ≡ List of accepted proposals ≡ List of proposals being revised ≡ List of rejected proposals

  7. Label: Dormant. If a proposal sees no activity for a long time, it is marked as “dormant”, and eventually closed.

    What is a dormant proposal? ≡ List of dormant proposals

  8. Label: Implemented. Once a proposal is accepted, it still has to be implemented. The author may do that, or someone else. We mark the proposal as “implemented” once it hits GHC’s master branch (and we are happy to be nudged to do so by email, GitHub issue, or a comment on the relevant pull request).

    ≡ List of proposals pending implementation ≡ List of implemented proposals

Do not hesitate to contact us if you have questions.

How to start a new proposal

Proposals are written in either ReStructuredText or Markdown. While the proposal process itself has no preference, keep in mind that the GHC Users Guide uses ReStructuredText exclusively. Accepted proposals written in ReStructuredText thus have the slight benefit that they can be more easily included in the official GHC documentation.

Proposals should follow the structure given in the ReStructuredText template, or the Markdown template. (The two are identical except for format.)

See the section Review criteria below for more information about what makes a strong proposal, and how it will be reviewed.

To start a proposal, create a pull request that adds your proposal as proposals/0000-proposal-name.rst or proposals/0000-proposal-name.md. Use the corresponding proposals/0000-template file as a template.

The pull request summary should include a brief description of your proposal, along with a link to the rendered view of the proposal document in your branch. For instance,

This is a proposal augmenting our existing `Typeable` mechanism with a
variant, `Type.Reflection`, which provides a more strongly typed variant as
originally described in [A Reflection on
Types](http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-dynamic/index.htm)
(Peyton Jones, _et al._ 2016).

[Rendered](https://github.com/bgamari/ghc-proposals/blob/typeable/proposals/0000-type-indexed-typeable.rst)

How to amend an accepted proposal

Some proposals amend an existing proposal. Such an amendment:

  • Makes a significant (i.e. not just editorial or typographical) change, and hence warrants approval by the committee
  • Is too small, or too closely tied to the existing proposal, to make sense as a new standalone proposal.

Often, this happens after a proposal is accepted, but before or while it is implemented. In these cases, a PR that changes the accepted proposal can be opened. It goes through the same process as an original proposal.

Discussion goals

Members of the Haskell community are warmly invited to offer feedback on proposals. Feedback ensures that a variety of perspectives are heard, that alternative designs are considered, and that all of the pros and cons of a design are uncovered. We particularly encourage the following types of feedback,

  • Completeness: Is the proposal missing a case?
  • Soundness: Is the specification sound or does it include mistakes?
  • Alternatives: Are all reasonable alternatives listed and discussed? Are the pros and cons argued convincingly?
  • Costs: Are the costs for implementation believable? How much would this hinder learning the language?
  • Other questions: Ask critical questions that need to be resolved.
  • Motivation: Is the motivation reasonable?

How to comment on a proposal

To comment on a proposal, you need to be viewing the proposal's diff in the "source diff" view. To switch to this view, use the buttons in the top-right corner of the "Files Changed" tab.

Use the view selector buttons on the top right corner of the "Files Changed" tab to change between "source diff" and "rich diff" views.

Feedback on open pull requests can be offered using both GitHub's in-line and pull request commenting features. Inline comments can be added by hovering over a line of the diff.

Hover over a line in the source diff view of a pull request and click on the + to leave an inline comment

For the maintenance of general sanity, try to avoid leaving "me too" comments. If you would like to register your approval or disapproval of a particular comment or proposal, feel free to use GitHub's "Reactions" feature.

How to bring a proposal before the committee

When the discussion has ebbed down and the author thinks the proposal is ready, they

  1. Review the discussion thread and ensure that the proposal text accounts for all salient points. Remember, the proposal must stand by itself, and be understandable without reading the discussion thread.
  2. Add a comment to the pull request, briefly summarizing the major points raised during the discussion period and stating their belief that the proposal is ready for review. In this comment, they should tag the committee secretary (currently @adamgundry).

The secretary will then label the pull request with Pending shepherd recommendation and start the committee process. (If this does not happen within a few days, please ping the secretary or the committee.)

What is a dormant proposal?

In order to keep better track of actively discussed proposals, proposals that see no activity for an extended period of time (a month or two) might be marked as “dormant”. At any time the proposer, or someone else, can revive the proposal by picking up the discussion (and possibly asking the secretary to remove the Dormant tag).

You can see the list of dormant proposals.

Who is the committee?

You can reach the committee by email at [email protected]. This is a mailing list with public archives.

The current members, including their GitHub handle, when they first joined, when their term was last renewed, when their term expires, and their role, are:

Simon Marlow @simonmar 2017/02 2024/02 2027/02 co-chair
Simon Peyton-Jones @simonpj 2017/02 2024/02 2027/02 co-chair
Eric Seidel @gridaphobe 2018/09 2022/03 2025/03 member
Chris Dornan @cdornan 2022/03 - 2025/03 member
Arnaud Spiwack @aspiwack 2019/07 2022/10 2025/10 member
Adam Gundry @adamgundry 2022/10 - 2025/10 secretary
Moritz Angermann @angerman 2023/02 - 2026/02 member
Malte Ott @maralorn 2024/03 - 2027/03 member
Matthías Páll Gissurarson @Tritlo 2024/03 - 2027/03 member

The committee members have committed to adhere to the Haskell committee guidelines for respectful communication and are subject to the committee bylaws.

We would also like to thank our former members:

Ryan Newton @rrnewton 2017/02 - 2018/09
Roman Leshchinskiy @rleshchinskiy 2017/02 - 2018/11
Ben Gamari @bgamari 2017/02 - 2019/07
Manuel M T Chakravarty @mchakravarty 2017/02 - 2019/07
Sandy Maguire @isovector 2019/07 - 2019/12
Christopher Allen @bitemyapp 2017/02 - 2020/05
Iavor Diatchki @yav 2017/02 - 2021/05
Cale Gibbard @cgibbard 2020/01 - 2021/07
Alejandro Serrano @serras 2020/01 - 2022/01
Vitaly Bragilevsky @bravit 2018/09 - 2022/02
Baldur Blöndal @icelandjack 2022/03 - 2022/09
Tom Harding @i-am-tom 2020/01 - 2023/02
Joachim Breitner @nomeata 2017/02 - 2024/03
Richard Eisenberg @goldfirere 2017/02 - 2024/03
Vladislav Zavialov @int-index 2021/03 - 2024/03

Committee process for responding to a proposal

The committee process starts once the secretary has been notified that a proposal is ready for decision.

The steps below have timescales attached, so that everyone shares the same expectations. But they are only reasonable expectations. The committee consists of volunteers with day jobs, who are reviewing proposals in their spare time. If they do not meet the timescales indicated below (e.g. they might be on holiday), a reasonable response is a polite ping/enquiry.

  • The secretary nominates a member of the committee, the shepherd, to oversee the discussion. The secretary
    • labels the proposal as Pending shepherd recommendation,
    • assigns the proposal to the shepherd,
    • drops a short mail on the mailing list, informing the committee about the status change.
  • Based on the proposal text (but not the GitHub commentary), the shepherd decides whether the proposal ought to be accepted or rejected or returned for revision. The shepherd should do this within two weeks.
  • If the shepherd thinks the proposal ought to be rejected, they post their justifications on the GitHub thread, and invite the authors to respond with a rebuttal and/or refine the proposal. This continues until either
    • the shepherd changes their mind and supports the proposal now,
    • the authors withdraw their proposal,
    • the authors indicate that they will revise the proposal to address the shepherd's points (the shepherd will then label the pull request as Needs revision), or
    • the authors and the shepherd fully understand each other’s differing positions, even if they disagree on the conclusion.
  • Now the shepherd proposes to accept or reject the proposal. To do so, they
    • post their recommendation, with a rationale, on the GitHub discussion thread,
    • label the pull request as Pending committee review,
    • re-title the proposal pull request, appending (under review) at the end. (This enables easy email filtering.)
    • drop a short mail to the mailing list informing the committee that discussion has started.
  • Discussion among the committee ensues, in two places

    • Technical discussion takes place on the discussion thread, where others may continue to contribute.
    • Evaluative discussion, about whether to accept, reject, or return the proposal for revision, takes place on the committee's email list, which others can read but not post to.

    It is expected that every committee member express an opinion about every proposal under review. The most minimal way to do this is to "thumbs-up" the shepherd's recommendation on GitHub.

    Ideally, the committee reaches consensus, as determined by the secretary or the shepherd. If consensus is elusive, then we vote, with the Simons retaining veto power.

    This phase should conclude within a month.

  • For acceptance, a proposal must have at least some enthusiastic support from member(s) of the committee. The committee, fallible though its members may be, is the guardian of the language. If all of them are lukewarm about a change, there is a presumption that it should be rejected, or at least "parked". (See "It should offer evidence of utility" under Review criteria below.)
  • A typical situation is that the committee, now that they have been asked to review the proposal in detail, unearths some substantive technical issues. This is absolutely fine -- it is what the review process is for!

    If the technical debate is not rapidly resolved, the shepherd should return the proposal for revision. Further technical discussion can then take place, the author can incorporate those conclusions into the proposal itself, and re-submit it. Returning a proposal for revision is not a negative judgement; on the contrary, it might connote "we absolutely love this proposal but we want it to be clear on these points".

    In fact, this should happen if any substantive technical debate takes place. The goal of the committee review is to say yes/no to a proposal as it stands. If new issues come up, they should be resolved, incorporated in the proposal, and the revised proposal should then be re-submitted for timely yes/no decision. In this way, no proposal should languish in the committee review stage for long, and every proposal can be accepted as-is, rather than subject to a raft of ill-specified further modifications.

    The author of the proposal may invite committee collaboration on clarifying technical points; conversely members of the committee may offer such help.

    When a proposal is returned for revision, GitHub labels are updated accordingly and the (under review) suffix is removed from the title of the PR.

  • The decision is announced, by the shepherd or the secretary, on the GitHub thread and the mailing list.

    Notwithstanding the return/resubmit cycle described above, it may be that the shepherd accepts a proposal subject to some specified minor changes to the proposal text. In that case the author should carry them out.

    The secretary then tags the pull request accordingly, and either merges or closes it. In particular

    • If we say no: The pull request will be closed and labeled Rejected.

      If the proposer wants to revise and try again, the new proposal should explicitly address the rejection comments.

      In the case that the proposed change has already been implemented in GHC, it will be reverted.

    • If we say yes: The pull request will be merged and labeled Accepted. Its meta-data will be updated to include the acceptance date. A link to the accepted proposal is added to the top of the PR discussion, together with the sentence “The proposal has been accepted; the following discussion is mostly of historic interest.”.

      At this point, the proposal process is technically complete. It is outside the purview of the committee to implement, oversee implementation, attract implementors, etc.

      The proposal authors or other implementors are encouraged to update the proposal with the implementation status (i.e. ticket URL and the first version of GHC implementing it.)

      Committee members should see the acceptance page for a checklist to be applied to accepted proposals and the steps necessary in order to mark a proposal as accepted.

Review criteria

Here are some characteristics that a good proposal should have.

  • It should follow our design principles. These principles cover both the language design and its stability over time.
  • It should be self-standing. Some proposals accumulate a long and interesting discussion thread, but in ten years' time all that will be gone (except for the most assiduous readers). Before acceptance, therefore, the proposal should be edited to reflect the fruits of that discussion, so that it can stand alone.
  • It should be precise, especially the "Proposed change specification" section. Language design is complicated, with lots of interactions. It is not enough to offer a few suggestive examples and hope that the reader can infer the rest. Vague proposals waste everyone's time; precision is highly valued.

    We do not insist on a fully formal specification, with a machine-checked proof. There is no such baseline to work from, and it would set the bar far too high. On the other hand, for proposals involving syntactic changes, it is very reasonable to ask for a BNF for the changes. (The Haskell 2010 Report, or GHC's alex- and happy-formatted files for the lexer and parser, are good starting points.)
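
    For instance, a proposal adding a new expression form might sketch its grammar change as a happy-style fragment like the following. (The token 'NEWKW', the constructor NewForm, and the production names are invented for illustration; they are not part of GHC's actual grammar, and a real proposal would patch the corresponding productions in GHC's parser.)

```
exp : 'NEWKW' aexp '{' fbinds '}'   { NewForm $2 $4 }
    | exp10                         { $1 }
```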

    Ultimately, the necessary degree of precision is a judgement that the committee must make; but authors should try hard to offer precision.

  • It should offer evidence of utility. Even the strongest proposals carry costs:

    • For programmers: most proposals make the language just a bit more complicated;
    • For GHC maintainers: most proposals make the implementation a bit more complicated;
    • For future proposers: most proposals consume syntactic design space and/or add new backwards-compatibility burdens, both of which make new proposals harder to fit in.
    • It is much, much harder subsequently to remove an extension than it is to add it.

    All these costs constitute a permanent tax on every future programmer, language designer, and GHC maintainer. The tax may well be worth it (a language without polymorphism would be simpler but we don't want it), but the case should be made.

    The case is stronger if lots of people express support by giving a "thumbs-up" in GitHub. Even better is when the community contributes new examples that illustrate how the proposal will be broadly useful. The committee is often faced with proposals that are reasonable, but where there is a suspicion that no one other than the author cares. Defusing this suspicion, by describing use-cases and inviting support from others, is helpful.

  • It should be copiously illustrated with examples, to aid understanding. However, these examples should not be the specification.

Below are some criteria that the committee and the supporting GHC community will generally use to evaluate a proposal. These criteria are guidelines and questions that the committee will consider. None of these criteria is an absolute bar: it is the committee's job to weigh them, and any other relevant considerations, appropriately.

  • Utility and user demand. What exactly is the problem that the feature solves? Is it an important problem, felt by many users, or is it very specialised? The whole point of a new feature is to be useful to people, so a good proposal will explain why this is so, and ideally offer evidence of some form. The "Endorsements" section of the proposal provides an opportunity for third parties to express their support for the proposal, and the reasons they would like to see it adopted.
  • Elegant and principled. Haskell is a beautiful and principled language. It is tempting to pile feature upon feature (and GHC Haskell has quite a bit of that), but we should constantly and consciously strive for simplicity and elegance.

    This is not always easy. Sometimes an important problem has lots of solutions, none of which have that "aha" feeling of "this is the Right Way to solve this"; in that case we might delay rather than forge ahead regardless.

  • Does not create a language fork. By a "fork" we mean

    • It fails the test "Is this extension something that most people would be happy to enable, even if they don't want to use it?";
    • And it also fails the test "Do we think there's a reasonable chance this extension will make it into a future language standard?"; that is, the proposal reflects the stylistic preferences of a subset of the Haskell community, rather than a consensus about the direction that (in the committee's judgement) we want to push the whole language.

    The idea is that unless we can see a path to a point where everyone has the extension turned on, we're left with different groups of people using incompatible dialects of the language. A similar problem arises with extensions that are mutually incompatible.

  • Fit with the language. If we just throw things into GHC willy-nilly, it will become a large ball of incoherent and inconsistent mud. We strive to add features that are consistent with the rest of the language.
  • Specification cost. Does the benefit of the feature justify the extra complexity in the language specification? Does the new feature interact awkwardly with existing features, or does it enhance them? How easy is it for users to understand the new feature?
  • Implementation cost. How hard is it to implement?
  • Maintainability. Writing code is cheap; maintaining it is expensive. GHC is a very large piece of software, with a lifetime stretching over decades. It is tempting to think that if you propose a feature and offer a patch that implements it, then the implementation cost to GHC is zero and the patch should be accepted.

    But in fact every new feature imposes a tax on future implementors, (a) to keep it working, and (b) to understand and manage its interactions with other new features. In the common case the original implementor of a feature moves on to other things after a few years, and this maintenance burden falls on others.

  • It should conform to existing principles. This repository contains a principles document that lays out various principles guiding future directions for GHC. Proposals should seek to uphold these principles in new features, as much as possible. Note that these principles are not absolutes, and regressions against the principles are possible, if a proposal is otherwise very strong.
  • Backward compatibility. Will the change break existing code, and if so, has an adequate impact assessment been carried out to determine whether the benefits outweigh the costs? Is there a clearly documented migration path? Will users receive warnings in advance of the breaking change, and reasonable error messages afterwards? See the Backward Compatibility section of the proposal template for specifics of how breakage is assessed.

How to build the proposals?

The proposals can be rendered by running:

nix-shell shell.nix --run "make html"

This will create a directory _build containing an index.html file and the other rendered proposals. This is useful when developing a proposal, to ensure that your file is syntactically correct.

Questions?

Feel free to contact any of the members of the GHC Steering Committee with questions. Email and IRC (#ghc on irc.freenode.net) are both good ways of accomplishing this.

ghc-proposals's Issues

Syntax for Records/update/puns, extensible records (future): how to have our cake and eat it?

[This is loaded to github as an 'Issue' for discussion, not as a full-blooded proposal.]

Thumbnail Make this valid syntax for records

aPoint = {Point | x = 1.0, y = 2.5 }                 -- record build syntax

bar r = {r | x = 3.4}                                -- record update syntax
bar r = {r :: Point | x = 3.4}                       -- record update with signature

baz {Point | x } = x                                 -- record pattern syntax, with field pun
  • The form with signature might be just the right johnny-on-the-spot for disambiguating DuplicateRecordFields.

Motivation

Of the many things to regret about H98 records, and the backwards-compatibility corner it has painted GHC/Haskell into, perhaps the most surprising/ugly/un-Haskelly is record build and update syntax:

data Point = Point{ x, y :: Float }         deriving (Eq, Show, Read)

aPoint = Point { x = 1.0, y = 2.5 }

foo r = r {x = 3.4}

>  foo aPoint { y = 7.8 }
===> Point {x = 3.4, y = 7.8}

r {x = 3.4} looks to the uninitiated like it's applying function r to something. Even worse, foo aPoint { y = 7.8 } looks like it's applying function foo to two arguments; but it isn't:

> foo $ aPoint {y = 7.8}
===> Point {x = 3.4, y = 7.8}
> foo aPoint $ {y = 7.8}
===> ERROR - Syntax error in expression ...
> foo . aPoint $ {y = 7.8}                                         -- aaargh

To my eye, even Point { x = 1.0, y = 2.5 } looks like it should be applying constructor Point to something.

I think we can do better; whilst retaining the current semantics; and opening up some syntax space for extensible records (design in the future).

  • retain that { ..., ... } means 'here comes a record'. That's also strongly flavoured with maths 'here comes a set': there's a uniqueness property; there's a sequence-immaterial property distinct from tuples inside ( ..., ...) or list construction inside [ ..., ... ].
  • retain that the core content is x = 1.2, y = 3.4, ... label-value pairs and/or with field puns x = 1.2, y, z ... label-(implicit)-variable pairs;
  • retain the properties: each label must appear exactly once; order is immaterial; but allow as follows ...

Proposed Change Specification

Maths notation already allows more than elements inside { ... }, as in set-builder notation. That uses | to demarcate 'something more than a list of elements'. By a happy coincidence, | is a reserved symbol in Haskell, so it can't be a user-defined operator; furthermore it's a symbol that can't appear in expressions or in patterns. (Although it appears close to patterns when marking guards, I think we're OK: the proposal is that | appears only inside { ... }.)

  • Retain existing H98 record syntax, including GHC extensions for field label puns and record wildcards. That includes H98 record build and record update syntax with { ... }.

  • Allow { ... | ... } syntax as first-class for expressions, providing it conforms to one of the following. (Note none of these forms clash with H98 syntax, because all have | inside { ... }.)

      aPoint = {Point | x = 1.0, y = 2.5 }             -- record build syntax
    
      bar r = {r | x = 3.4}                            -- record update syntax
      bar r = {r :: Point | x = 3.4}                   -- .. with signature
    
      baz {Point | x } = x                             -- record pattern syntax, with field pun
    

In record update syntax {r | x = ... }, r may be any expression yielding a record datatype, as currently allowed by H98. The syntax would also tolerate { r :: t | x = ... } without parens, which might be the happy place to supply a disambiguating signature under the DuplicateRecordFields extension.
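
To illustrate the point about signatures: today a disambiguating signature on the record being updated requires parentheses. A minimal sketch (the {... | ...} form shown in the comment is the hypothetical proposed syntax, not valid in GHC today):

```haskell
data Point = Point { x, y :: Float } deriving Show

-- Today's update syntax: the signature needs parentheses.
bar :: Point -> Point
bar r = (r :: Point) { x = 3.4 }

-- The proposed form would express the same update without parens:
--   bar r = {r :: Point | x = 3.4}
```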

Comparison with other record systems -- Hugs Trex

The Gaster & Jones 1996 paper describing Hugs' 'Extensible Records' uses { ... } to mark records. For the implementation (before the H98 standard!), that syntax was found to clash with H98 records, so Hugs uses parens ( x = 1.2, ... ) instead.

Hugs/Trex also uses | inside { ... } -- or actually ( ... ) -- to denote record extension, for example:

> let rename (x = x | r ) = (x2 = x | r) in rename (x = 1.2, y = 3.4, z = 7.8)
===> (x2 = 1.2, y = 3.4, z = 7.8)

in which the pattern ( ... | r ) means: bind the named fields, bind r to the remainder of the record; and in the record construction means extend r with the named fields.

Now although Hugs uses |, it appears in a syntactic construct distinct from this proposal (after the label-value pairs rather than before). Note that Trex doesn't have label field puns [##], nor label wildcards, but the 'remainder of the record' binding serves the same job, and more succinctly.
[##] Hugs does support label field puns for H98 records, same as in GHC; it's Trex-style records that don't, because that would make the syntax ambiguous.

To be clear: this proposal is not advocating for Hugs/Trex nor any other style of extensible/anonymous records. I'm merely pointing out that the proposal leaves some syntactic space for an extensible/anonymous design to sit alongside existing H98 records and this proposal.

Prototype

I've implemented the proposed syntax in Hugs, including in Trex. Furthermore, I've implemented it purely in the parser; it's all syntactic sugar that translates down to existing H98 and Trex semantics.

I've allowed Trex to use { ... } syntax alongside H98 and this proposal. I've needed to make a small compromise:

trexPoint = (x = 1.0, y = 2.3)                  -- valid in Hugs 2006, and in Trex doesn't clash with this proposal

notaTrexPoint = {x = 1.2, y = 2.3}              -- not allowed by this proposal, because too confusing

nowaTrexPoint = {x = 1.2, y = 2.3 |}            -- to be a Trex record, must have |

Note that in Trex, all records are notionally extensible: (x = 1.0, y = 2.3) is shorthand for (x = 1.0, y = 2.3 | emptyRec) -- that is, start with emptyRec, extend it with fields labelled x, y.

Effects and Interactions

Although each proposed syntactic form (including the possible-future extensible records) is distinct and doesn't clash with H98/current GHC syntax+extensions, it does make the syntax inside { ... } crowded. It would be easy to make a small slip and trigger either a syntax error or (worse) a valid but mistaken interpretation by the compiler. Note, however, that the mistaken interpretations will turn out to be type-incorrect, so it's fail-safe.

Easter Egg (apologies it's late, I wanted to get my prototype going before proposing anything)

Record puns were in Hugs for years before GHC. Hugs nearly took them out for fear they were too bleeding-edge. Oh, also because "current records implementation in haskell seems like somewhat
temporary solution" -- we wish!

A compact internal representation for record accessors

This is an issue rather than a PR. I don't know enough about the guts of GHC and the ease or difficulty of various approaches to suggest a full solution, and I'm hoping for GHC experts to weigh in. I've been pondering the problems described in Well-Typed's work on large records: https://well-typed.com/blog/2021/08/large-records/

If we only consider the simplest example -- quadratic blowup in code size for accessors -- this comes from the fact that accessors are generated for all fields, and each also needs to enumerate all fields. Given how Core currently looks, this seems pretty inevitable and natural. My question is the following: is there a way to change Core to have some "magic" notion of a compact accessor that serves as sugar for this? It seems to me also that accessors should invariably be inlined. So could we just have a magic accessor token in Core that is always expanded at use -- i.e. a compact representation that "desugars" to the full inlined thing when invoked, but is otherwise efficient to generate and keep around?
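To make the quadratic blowup concrete, here is a sketch (field and function names are made up for illustration) of what the generated selectors amount to for a small record:

```haskell
-- Illustrative only: a 3-field record and selectors equivalent to
-- the ones GHC derives for it.
data R = MkR { f1 :: Int, f2 :: Bool, f3 :: Char }

-- One top-level selector per field, and each selector's pattern
-- spells out every field of the constructor:
f1' :: R -> Int
f1' (MkR x _ _) = x

f2' :: R -> Bool
f2' (MkR _ x _) = x

f3' :: R -> Char
f3' (MkR _ _ x) = x
-- n selectors, each of size O(n): O(n^2) generated code overall.
```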

Personally, I think solving issues like Applicative, etc. is less of a key case, since most fields in the autogenerated big records I am used to are not parameterized. I haven't thought through the other generated typeclasses enough to consider which ones may or may not present similar quadratic issues -- experience makes me wary of Read/Show/Eq/Ord, but logically I don't think a number of them would be subject to these problems.

Encourage review comments over pull request comments?

Top-level pull request comments can't be resolved, so derailed conversations tend to fill the page rapidly without being structured.
It would be easier to read conversations if they were attached to the Unresolved Questions section as review comments. Thoughts?

Add 'GHC-X.Y' labels to 'Implemented' GHC proposals

I can view all implemented proposals under the following link, which is very convenient!

But I think that it would be even nicer to see from the GitHub in what exact GHC version each proposal was implemented. This can be interesting for historical reasons or for having a brief understanding of which features were introduced and when. Adding a label in a form like GHC-8.8 or GHC-8.6.1 to each of the Implemented proposals can help with this 🙂

Template suggests using BNF

BNF grammar and semantics of any new syntactic constructs

However, there is no BNF grammar for all constructs in GHC. There are alex and happy input files, which are not "BNF".

Would it be better & more explicit to ask how GHC's alex and happy files need to be changed (omitting AST construction)?

Proposal: labeled arguments

There's a library that does this, but it's not nearly as convenient as having it built in the way OCaml does (though without default arguments). My main goal here is to eliminate flip gymnastics and make partial application more general. The details of this would have to be worked out more, but I think the core idea is worth bringing up explicitly.
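A sketch of what built-in labeled arguments might look like; the syntax below is entirely made up, loosely following OCaml's labelled arguments, and nothing here is implemented:

```haskell
-- Hypothetical syntax, for illustration only:
replicate' :: (count :: Int) -> (item :: a) -> [a]

xs :: String
xs = replicate' (item = 'x') (count = 3)  -- labels free the argument order

rep3 :: (item :: a) -> [a]
rep3 = replicate' (count = 3)             -- partial application by label,
                                          -- no flip gymnastics
```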

sectnum wreaks havoc on rendered proposals

All of the accepted proposals have strange section numbers. This is entirely a cosmetic issue, but it does make the proposals look pretty bad. At some point, sectnum was added to all proposals, but the effect it produces is terrible. I see things like

210 Motivation
...
211 Proposed Change Specification
...

These numbers should be 1 and 2, not 210 and 211. What was the purpose of introducing sectnum? Can it be removed?

Promoted methods before the rest of DH

Today's type families, while trying to do a generally-liked thing, are a bit weird. See #381 and #177 for examples of problems with them. Type families are also quite difficult to reform. @goldfirere concludes in part of a comment at the bottom of #177:

The more I work on Haskell, the more I wonder whether type families are something of a misfeature: instead, we should just be able to use functions in types. Class methods would, naturally, induce constraints, just like proposed here. But I worry that the investment in this idea would end up being redundant once ordinary functions (and methods) can be used in types. (To be fair, I'm very pleased that Haskell got type families when it did. We would have never gotten to this point without them. It's just that I wonder about their future.)

To provide a bit of context, "constrained type families" as discussed in #177 suggest borrowing the idea of constraints to improve type families. But class methods at the type level naturally would keep their constraints too. And the openness of class methods can fill the role of the openness of type families.

I agree with @goldfirere's quote. But, I also think "better" type families are useful even if we don't have Dependent Haskell. That isn't to say I don't want dependent Haskell --- I do --- but rather than making users wait for Dependent Haskell to see these issues with families fixed, perhaps we could provide a DH-anticipating alternative[1] to type families. I think showing how the DH approach can both reveal these subtle problems and solve them can also build more support for the rest of DH.

[1]: I say "alternative" type families, because they are not backwards-compatible with today's type families. [We should probably give them another name than type families, too.]

Better multi-line string literals + deprecate `QuasiQuotes`

It came up in #125 that QuasiQuotes is basically a work-around for not having good multi-line string literal syntax (in conjunction with TH splices), and furthermore one that has a number of downsides.

Over the long term I think it would be good to deprecate it for those reasons. Deprecating features in wide use seems scary but 1) we can leave it around for a while after deprecating, 2) we could have tools to automatically rewrite code to its replacement. Opening this issue in hopes that someone writes a proposal for this.
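For comparison, here is the status quo next to one possible replacement. The 's' quoter is from the string-qq package; the triple-quote syntax is purely illustrative and not a concrete design:

```haskell
{-# LANGUAGE QuasiQuotes #-}
import Data.String.QQ (s)  -- from the string-qq package

-- Today: a TH dependency just to avoid "\n"-and-backslash noise.
today :: String
today = [s|Hello,
world!|]

-- A hypothetical built-in multi-line literal, needing no TH at all:
--
--   later :: String
--   later = """Hello,
--   world!"""
```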

Pattern-based macros in the far future

Languages like Scheme, and Rust after it, have pattern-based macros. It would be nice to have these in Haskell, but what would they even be for us?

  1. What are the patterns? They are pattern matching on syntax, fine, but what is that syntax? In the older Schemes it is just s-expressions. But what is the equivalent of s-expressions in a language with far more syntax? In Rust the s-expression analogue is "token trees", see https://doc.rust-lang.org/1.7.0/book/macros.html#syntactic-requirements. This I think is a very clever thing: yes, the parser refines what the lexer produces, but the lexer makes something flat---bad for pattern matching. Instead, take what the lexer does but also ensure that (, and ), [, and ], { and }, and layout's implied brackets are matched, making a tree.

    I don't see why we shouldn't also do that. In fact, always do that---I think it would be a good user experience to not have unbalanced bracket errors mixed in with other "full syntax" parse errors.

    Another thing to note is that traditional Scheme and Rust both have a pretty rigid separation between pattern-based macros and quote-antiquote-based macros. In Scheme's case, that was because they didn't know how to combine the two. In Rust's case, this is because their "procedural macros" are pre-compiled and rather out-of-band, for ease of implementation but also interface stability concerns.

    We already have well-developed in-band quote-antiquote-based macros in the form of TH. So we should build on that and not have that strict separation. Our guide here is Racket, which for syntax has a sort of enriched s-expression (https://docs.racket-lang.org/reference/syntax-model.html#%28part._stxobj-model%29)---the extra information keeps things hygienic as opposed to the limited expressive power of pure-pattern based macros and a cacophony of renaming tricks.

    Having a datatype for syntax also creates the expectation of regular pattern matching---Rust and traditional Scheme have special pattern matching and construction forms (Scheme has no regular pattern matching; Rust does, but it is unrelated) as part of the special pattern-based macro language. But with an actual data type, and regular Haskell (phase-separated, in TH), we should be able to do regular pattern matching.

  2. What about interleaved arbitrary token trees with regular Haskell? In Rust, the leaves of the pattern can be regular language forms like "expression", "type", or arbitrary token trees. This is nice so macros don't need to either reparse the language by hand, or suffer users to just get syntax errors in splices---what the macro outputs---instead of what they inputted.

    As a first step, we could make some magic view patterns, but I think what would be even cooler is to mix up TH and happy, following haskell/happy#149 . If TH could use happy to parse grammars of its own choosing, and those grammars could refer to (a possibly simplified interface to) the regular happy grammar, this could result in a really nice experience for the user of macros.

  3. What about types? Something like typed TH (especially after @mpickering is done with it:)) is a good and necessary first step. But what we really need beyond typed quoting and splicing is something for custom binding forms, which I suppose means being able to manipulate the context for type checking directly. https://github.com/justinpombrio/thesis/raw/master/resugaring-thesis.pdf is also a good resource.

I think a test to see how the above turns out is whether do notation could be removed from the language and redone as a macro without a degradation in user experience around error messages.

template-haskell plainTV type inconsistency

@yav pointed out this inconsistency here: https://gitlab.haskell.org/ghc/ghc/-/issues/17650

Here is a file using plainInvisTV

{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH.Lib.Internal
import Language.Haskell.TH (mkName)

f :: $(forallT [ plainInvisTV (mkName "n") inferredSpec ] (cxt []) [t| () |])
f = undefined

It does not work when the imports are changed:

{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH
-- or to Language.Haskell.TH.Lib

f :: $(forallT [ plainInvisTV (mkName "n") inferredSpec ] (cxt []) [t| () |])
f = undefined
t.hs:5:18: error:
    • Couldn't match type ‘Language.Haskell.TH.Syntax.TyVarBndr
                             Language.Haskell.TH.Syntax.Specificity’
                     with ‘Language.Haskell.TH.Syntax.Specificity’
      Expected: Language.Haskell.TH.Syntax.TyVarBndr
                  Language.Haskell.TH.Syntax.Specificity
        Actual: Language.Haskell.TH.Syntax.TyVarBndr
                  (Language.Haskell.TH.Syntax.TyVarBndr
                     Language.Haskell.TH.Syntax.Specificity)
    • In the expression: plainInvisTV (mkName "n") inferredSpec
      In the first argument of ‘forallT’, namely
        ‘[plainInvisTV (mkName "n") inferredSpec]’
      In the expression:
        forallT [plainInvisTV (mkName "n") inferredSpec] (cxt []) [t| () |]

The comment in TH/Lib.hs

but if a change occurs to Template
Haskell which requires breaking the API offered in this module, we opt to
copy the old definition here, and make the changes in
Language.Haskell.TH.Lib.Internal.

Should be changed to:

but if a change occurs to Template Haskell

1. When .Internal functions gain additional arguments, the
   function in .Lib will not.
2. Expressions that typecheck with .Internal also typecheck with .Lib,
   provided that any additional arguments are removed
3. In typical expressions such as `forallT [plainTV 'x] (cxt []) [t| Int |]`,
   intermediate types may change between template-haskell versions, but 
   the original expression will still typecheck if possible.

With template-haskell 2.16 you could have forallT [plainTV _]. It doesn't work in 2.17 because forallT's type changed. If 2.17 had changed plainTV instead, perhaps to plainTV x = plainInvisTV x specifiedSpec, the original expression would still typecheck. In other words, "breaking the API" should read "typical expressions no longer typecheck", not "definitions in Lib.hs changed".

Likewise forallVisT :: Quote m => [m (TyVarBndr ())] -> m Type -> m Type should have been changed to

forallVisT :: Quote m => [m (TyVarBndr a)] -> m Type -> m Type, and then you could have only 4 definitions under type variable binders, namely plainTV, kindedTV, plainInferredTV, kindedInferredTV. If .Internal continues to distinguish between TyVarBndr () and TyVarBndr Specificity, this part of the proposal prioritizes (3.) over (2.). Template Haskell code that creates type variables could have been unaffected by "explicit specificity". Maybe it's not too late to go back for the next release.

Optimized overloaded lists

Currently, OverloadedLists leads to run-time conversion from (and occasionally to) lists. I think we could improve matters using some Template Haskell. Suppose we add a method to IsList:

class IsList t where
  type Item t
  fromList :: [Item t] -> t
  toList :: t -> [Item t]
  fromListTH :: Quote m => [Code m (Item t)] -> Code m t
  fromListTH cs = [|| fromList $$(flopper cs) ||]

flopper :: Quote m => [Code m a] -> Code m [a]
flopper [] = [|| [] ||]
flopper (x : xs) = [|| $$x : $$(flopper xs) ||]

Now a literal list [e1, e2, e3] :: T could be desugared

let
  v1, v2, v3 :: Item T
  v1 = e1
  v2 = e2
  v3 = e3
in $$(fromListTH [ [||v1||], [||v2||], [||v3||] ])

If fromListTH has a custom definition, this could build the result immediately, rather than converting from a list at run-time.
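For instance, a container like Data.Sequence.Seq could build its result directly rather than going through a list. This is a sketch only: it assumes IsList has gained the proposed fromListTH method, which Seq's real IsList instance (in containers) does not have today:

```haskell
{-# LANGUAGE TemplateHaskell, TypeFamilies #-}
import qualified Data.Foldable as Foldable
import qualified Data.Sequence as Seq
import Data.Sequence (Seq)

-- Hypothetical instance, assuming the extended IsList class above:
instance IsList (Seq a) where
  type Item (Seq a) = a
  fromList = Seq.fromList
  toList   = Foldable.toList
  -- Build the Seq right-to-left with (<|); no intermediate [a] at run time:
  fromListTH = foldr (\c r -> [|| $$c Seq.<| $$r ||]) [|| Seq.empty ||]
```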

Of course, this approach only works when the specific instance is fixed at splice time. With OverloadedLists, you can write things like

foo :: (IsList t, Item t ~ Int) => t
foo = [1,2,3]

but that would be bogus for OptimizedOverloadedLists.


What about list patterns? Well, we could get a method like this:

untypedListPat
  :: Quote m
  => Int -- ^ The length of the list pattern
  -> [Pat] -- ^ The patterns within it
  -> m Pat

So a pattern

[1, a, b]

would be desugared

$(untypedListPat 3 =<< sequence [litP (integerL 1), varP =<< newName "a", varP =<< newName "b"])

It's not very satisfying that we don't (yet?) have a pattern language in typed Template Haskell, but I don't think we should wait for that.

Default polymorphic type parameters?

Does this sound familiar?

  1. You write a function that takes an Int
  2. You later on want to make it general to take in any Num
  3. Now you have to go back to all the literal numbers and explicitly annotate them as Int (to fix the -Wtype-defaults warnings)

What do people think about being able to define extended default rules at function definition instead of at call site? e.g.

logJSON ::
  ( ToJSON a
  , default a (Text, String)
  ) => a -> IO ()

So now, the following calls work with -Wall and -XOverloadedStrings, without needing to turn on the -fextended-default-rules option:

logJSON "this is Text"
logJSON $ "this would've given an obscure error message before about unifying a ~ [b] and ToJSON a: " ++ if x then "yes" else "no"
logJSON $ Aeson.String "non-polymorphic values still work"

URL links

Make :set +m the default in GHCi

This was asked in the #ghc IRC channel:

anchpop: geekosaur, do you know why multiline mode (+m) is not enabled by default?
geekosaur: no

So, are there reasons not to do that?
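For reference, this is roughly what the option does today (a GHCi session; the blank line ends the multi-line block):

```
ghci> :set +m
ghci> let fact 0 = 1
ghci|     fact n = n * fact (n - 1)
ghci|
ghci> fact 5
120
```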

newtype <--> data parity

I've observed that a very valuable thing to have is a writer monad where (equationally) undefined :: Writer a b = Writer (undefined :: a) (undefined :: b); this can usefully convert infinite inputs to infinite outputs (or to outputs with a finite defined prefix and a _|_ tail) if it does so with maximum laziness.

I propose an extension NArityNewtypeConstructor to enable this feature by extending newtypes to accept 0 or more fields and some further enhancements that become immediately relevant which I mention a bit further below. I think this can only be done if we can have the same unlifted behaviour from a newtype with multiple fields as we do from a newtype with only one.

Furthermore, I suspect that much of the richness of the syntax and semantics of datatypes can be provided for newtypes, the difference being that newtypes have to retain the constraint that they have one and only one constructor, and this is the basis of this proposal. This puts me in mind of a possible practical implementation strategy in the next paragraph.

This might look just like ordinary Haskell strictness, but strict expressions with datatypes require computing the head of the expression, whereas this proposal requires no such computation: just as you'd expect for a newtype, the head can be assumed thanks to the type. That means it's useful for values computed from infinite values. It also allows the easy parts of Haskell to be cheaply used for many common systems-programming tasks, so very great Haskell expertise isn't needed before becoming very effective at those things.
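The single-field case already behaves this way in today's GHC, which is the behaviour this proposal would extend to zero or many fields:

```haskell
newtype N a = N a   -- matching N is free: the constructor is erased at run time
data    D a = D a   -- matching D must compute the head

matchN :: String
matchN = case (undefined :: N Int) of N _ -> "fine"   -- evaluates to "fine"

matchD :: String
matchD = case (undefined :: D Int) of D _ -> "fine"   -- hits the undefined
```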

I suspect this can be implemented in the parser/typechecker by defining a datatype for each newtype to act as the representation type instead of using the first field as the representation type; the system would treat pattern matches on newtypes as irrefutable pattern matches on that representation type. That which is the representation type today would become the type of the only field of the constructor of the representation type in the new concept for the special case of the single field that we currently support. I think seq has to apply to each field of the representation value. If any path through the graph of occupants is unboxed then the type is "somewhat unboxed", for example if any field is either unboxed or "somewhat unboxed", and the various restrictions against the use of unboxed types apply.

I also propose:

  • a LiftingAssertion extension to allow people to assert in their module that a specific type must be a newtype or it must be a datatype to ensure the semantics of being lifted or being unlifted are as expected as a program is changed over time.

  • an UnliftedTuples extension where a module can use tuple syntax for tuples that are defined as newtypes instead of as datatypes (and the core library should define a module for the newtype variant, which is in the prelude for any module that uses that extension). This would be for modules where the most common need is maximally defined values - I expect this to become commonplace once it's available. Provide a module for the data and newtype definitions for the two kinds of tuple; the existing tuple module can re-export them along with its tuple-syntax type aliases, while a new tuple module can re-export them along with the unlifted tuple definitions. GHC will report an error if the two conflicting tuple-syntax type aliases are imported into one module.

  • that major functors should have newtype variants defined, such as NEmpty, NIdentity, NWriterT, NWriter, etc.; that data syntax be enhanced to make it trivial to define a counterpart newtype; and that instance syntax be enhanced to make it trivial to define two instances with one definition - one for the lifted type and one for the unlifted type. People can use this wherever they're confident that the newtype instance is correct and the datatype instance is merely less defined. Perhaps this would be enabled with an UnliftedCounterpartTypes extension adding an optional deriving newtype as .. of .. -> clause in addition to the current deriving syntax:

    • data DatatypeName a = SoloConstructor a deriving Class deriving newtype as NewtypeName of SoloConstructor -> NSoloConstructor
  • correct the error message when using fix id with unboxed types to report that it's incompatible with unboxed types instead of reporting that it's incompatible with unlifted types; fix id is already compatible with unlifted types in the form of any newtype where a pattern can match on the constructor of the value of fix id :: SomeNewtype.

I am trying to implement this, all useful advice on this matter is welcome and solicited.

Absurd patterns

The discussion on #302, starting here #302 (comment), has recently been focused on the possibility of introducing absurd patterns. In lieu of a full proposal and in an effort to have a proper place to hold that discussion, I'm creating this issue.

Using _|_ as tentative syntax to denote an absurd pattern, the idea is that it can appear anywhere where patterns can appear, and signifies that there is no possible non-bottom value that it could match. The compiler would complain if this is not the case. For example:

-- currently:
fromBoolVoidAndInt :: (Bool, Void) -> Int -> String
fromBoolVoidAndInt (b, v) x = case v of {}

-- with absurd pattern:
fromBoolVoidAndInt :: (Bool, Void) -> Int -> String
fromBoolVoidAndInt (b, _|_) x

Since an absurd pattern can never be successfully matched, no right-hand side is needed for this equation.

There are a few open questions, though:

  • If an absurd pattern tries to match a bottom value, is the resulting bottom value the matched value, or an error about a pattern-match failure? (A bang in front of the absurd pattern could change this.)
  • How exactly should this interact with GADTs?

Feel free to make this issue obsolete by creating a proper proposal.

[meta] Update member list

The member list in the README needs an update to reflect recent changes in members (one departure, two arrivals)

Offer a way to construct primop-like functions in Haskell

As touched on in #130, one of the things we can't do in surface Haskell is prevent the garbage collector from running during a critical operation (such as initializing an array). As a result, all such operations must be implemented as primops, or as calls to CMM through the FFI. The trouble is that Haskell code may allocate memory at any time, and if the nursery fills up, we must run the garbage collector. It would be very nice if we could find a nice way to work around this.

Suppose we add a prim or similar declaration keyword, indicating that the function being defined should be "primop-like". Specify a restricted subset of Haskell that can be used in prim functions, and guarantee that the garbage collector will not run while the function is being evaluated. The simplest version might look like this:

prim appendArrays# :: Array# a -> Array# a -> Array# a
prim appendArrays# xs ys = case runRW# $ \s ->
  case newUninitializedArray# (sxs +# sys) s of { (# s', mry #) ->
  case copyArray# xs 0# mry 0# sxs s' of { s'' ->
  case copyArray# ys 0# mry sxs sys s'' of { s''' ->
  unsafeFreezeArray# mry s'''}}} of (# _, arr #) -> arr
  where
    sxs = sizeofArray# xs
    sys = sizeofArray# ys

Note that we don't perform any calculations affecting allocation after our first (in this case only) allocation. How can we specify the restricted language to ensure this is the case?

`EarlyDo` - return early from a do-block

Problem statement

The problem it solves is returning early from a do expression.

app :: IO (Either Error String)
app = do
  path <- grabEnv "PATH"?
  putStrLn "Look ma, no lifts!"
  magic <- grabEnv "MAGIC"?
  pure (Right (path ++ magic))

Implementation

The syntax stmt? is desugared in this way:

  • do stmt?; next becomes earlyThen stmt (do next)
  • do pat <- stmt?; next; next2 becomes early stmt (\pat -> do next; next2; ...)

Where early/earlyThen simply return Left if stmt returns Left, or if Right a, they pass the a over to the continuation. Nothing clever or special. It’s equivalent to writing

do
  result <- m
  case result of
    Left e -> pure (Left e)
    Right a -> f a

It could be overloaded to any failure-like type, or not:

-- monomorphic version
early :: Monad m => m (Either e a) -> (a -> m (Either e b)) -> m (Either e b)
-- polymorphic version
early :: (Monad m, Early f) => m (f a) -> (a -> m (f b)) -> m (f b)

Why an extension

I’ve been quite happily using this compiler plugin, which I wrote a year ago:

https://github.com/inflex-io/early (more details and motivation written there, including FAQs like “why not ExceptT)

I’d like a proper language extension for it. We have arrows and mdo and a bunch of other syntactic extensions that I’ve never used for anything, but this one has practical value for me and I’m sure for others. My plugin is OK, but it has a heavy dependency (ghc-parser-lib), the parser is not 100% accurate, and plugins add compile overhead (although not that much).

A factor I didn’t consider, but is true, is that Haskell newbies would be able to use this easily, without having to learn what a monad transformer is. But that’s not really a concern of mine.

I think Haskellers deserve to explore other ways to handle early returning; doing it via clever monad instances seems to have been fully explored.

Considerations

Technical considerations:

  • Should it work for any “Early” type (Either, Maybe, Failure, etc), like here
  • Or should it be monomorphic? The diff for the two APIs is here
  • Because ? cannot appear validly at the end of a statement (without parens) in Haskell, this should present no conflicts. But I could be wrong.

If feedback is positive, then I can submit a proposal. And then if accepted, do the GHC language implementation.

A good summary by Ollie Charles

I would like to see this. Error handling is an important aspect of programming, and I don't think Haskell has a very good story. Our predominant tools are:

  • Exceptions. These are simple to produce, and also fairly simple to deal with, but they are also essentially untyped (they don't appear in the type of things that throw exceptions). I think this means they work well in the small, but are much more problematic in the large.
  • Monad transformers (e.g., ExceptT). These make it easy to propagate errors, but incur a cost (every bind now has to do pattern matching), and don't combine with all monads (e.g., we can't easily use ExceptT with MonadUnliftIO)
  • Returning a sum type. This is the approach advocated by this issue, but it is tedious for users as they now have to deal with unpacking and error propagation.

I think there is elegance and simplicity to the third approach - the error appears clearly in the type, and in the return position. It's a shame that actually working in this style is so difficult though, and I like the idea of having a special early-out bind function to ease that pain.

I'm fairly indifferent to what the actual syntax is. As long as I can just add a few characters to x <- f a b c to get error propagation, I'm happy.

I am sympathetic to concerns about adding extra syntax - Haskell is definitely a heavy language. However, I still feel having canonical syntax for typed error handling in Haskell would be extremely beneficial in the long run.

The only thing I don't like about this approach is that it's very easy to accidentally forget to check for errors for functions that don't return anything:

foo :: IO (Either Bang ())

bar = do
  ...
  ...
  foo -- Whoops, even if this fails we'll continue!
  foo? -- We should have written this
  ...

However, I think -Wunused-do-bind would alert about this.

Objections summarized

Objection: Special syntax for a single type of effect

@Ericson2314

@chrisdone OK. Then, this boils down to a difference of opinion.

I view Rust's ? as just a weakened form of ! because they don't have a general notion like Monad. .await is another example of this. Rust developers by no means like this state of affairs either; see https://without.boats/blog/the-problem-of-effects/ for example. And regular users who don't know what Monads are still wish they didn't have to, e.g., write things twice for async and regular code.

I would not be a fan of seeing what feels like ad-hoc per-effect syntax in Haskell when it's not the only option.

Similarity to case bind

@tomjaguarpaw

Similarity to #327

This seems similar in spirit to #327, though the latter achieves greater generality at the cost of additional syntactic noise. I think what you would write with EarlyDo as

app = do
  path <- grabEnv "PATH"?
  ...

you would write in #327 as

app = do
  Right path <- case grabEnv "PATH" of
     e -> e
   ...

#327 has the benefit that you can distinguish multiple failure cases.

Explicitly declare the absence of an instance

Motivation

Instances are global and should be universally agreed upon. When a consensus is reached we can implement an instance. But when we know that an instance is controversial and we know that there is no consensus, then we can't make that explicit.

Changes

I would suggest adding a declaration to allow us to specify explicitly absent instances:

-- |  Use 'Sum' or 'Product' instead
noinstance Semigroup Int

-- With MultiParamTypeClasses
class Void
noinstance Void

With GADTs you can already implement something similar:

class Void
instance Int ~ Bool => Void

instance Void => Semigroup Int

But this gives terrible error messages.

Advantages

So, there are two main advantages:

  1. It allows us to generate improved error messages in many situations: GHC can stop suggesting strange instances like Num String as in haskell/error-messages#5, users can get errors when they define instances like that, and this gives an obvious place for documenting why particular instances cannot exist.
  2. It makes it possible to write unsatisfiable constraints without hacky GADTs or TypeFamilies, e.g. Void is the same as Int ~ Bool.

Disadvantages and problems

One disadvantage is that this prevents the creation of local unexported orphan instances for strange instances like Num String. Such instances are sometimes fun to play around with, but I don't think they belong in serious code, so I don't think this is a very important problem.

Perhaps a big open question still is what to do with superclasses. Can you define noinstance (Monad f, Monad g) => Monad (Compose f g) and what would its semantics be? I think just disallowing superclasses could be a possible approach, but maybe noinstance Monad (Compose f g) is not satisfactory.

I expect that there are many problems with this idea that I cannot foresee because I don't know how difficult this will be to implement in GHC, so I am opening it as an issue instead of a proper pull request.

I have also not yet thought about interaction with other extensions like OverlappingInstances, hopefully it can mostly have the same semantics as instance {-# OVERLAPPING #-} Void => ... instances.

Alternatives

An alternative would be a pragma like DEPRECATED which just causes warnings instead of errors, but that could still guide some error messages. E.g.:

{-# NOINSTANCE Semigroup Int "Use 'Sum' or 'Product' instead" #-}

Break out fromInteger/fromRational from Num/Fractional?

There have been a few cases where I would like to use a numeric literal for a custom data type, but that data type does not have valid implementations for the other functions in the type class (or you don't want to implement the other functions for various reasons). If I were redesigning everything from scratch, I'd do:

-- integral literals
class FromInteger a where
  fromInteger :: Integer -> a

-- decimal literals
class FromRational a where
  fromRational :: Rational -> a

-- string literals, renamed from IsString
class FromString s where
  fromString :: String -> s

-- new IsString; I've always wanted this anyway,
-- to generalize functions that can take in any
-- string-like thing (e.g. logging or golden tests)
class FromString s => IsString s where
  toString :: s -> String

-- list literals, broken out from IsList
class FromList l where
  type Item l
  fromList :: [Item l] -> l

-- new IsList
class FromList l => IsList l where
  toList :: l -> [Item l]

For example, we have a BigDecimal newtype over Scientific because Scientific has unsafe instances. This way, we can provide explicit unsafeAdd functions and prevent people from accidentally using +. But currently, there's no way for a literal 10 to mean BigDecimal 10. With this proposal, one could implement FromInteger but not Num as a whole.
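A minimal sketch of how that would look, with a hypothetical FromInteger class and an Integer stand-in for the Scientific-backed newtype (all names here are illustrative):

```haskell
import Prelude hiding (fromInteger)

-- hypothetical standalone class, as proposed above
class FromInteger a where
  fromInteger :: Integer -> a

-- stand-in for the real BigDecimal newtype over Scientific
newtype BigDecimal = BigDecimal Integer deriving (Eq, Show)

-- literals could convert in, but with no Num instance
-- there is no accidental (+)
instance FromInteger BigDecimal where
  fromInteger = BigDecimal

-- additions must go through an explicit, named function
unsafeAdd :: BigDecimal -> BigDecimal -> BigDecimal
unsafeAdd (BigDecimal a) (BigDecimal b) = BigDecimal (a + b)
```

With RebindableSyntax, the literal 10 would elaborate to this fromInteger directly; without it, callers write fromInteger 10 explicitly.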

This is mentioned as part of #124, but I think it's a worthwhile unit separate from the "prevent fromInteger from being partial" discussion. Obviously, this is a big breaking change, but I just wanted to write this up and see if others have any thoughts on this before trying to come up with a proposal/migration plan.

Sectnum has been removed from proposal metadata

I was just building the proposals locally, and the contents page is no longer numbered with the number of each proposal. I had added sectnum headings to all the proposals so they would get a number, but it seems they have recently been removed.

New:
2019-12-17-161409_927x945_scrot

Old, desirable behaviour:
2019-12-17-161535_1142x900_scrot

Syntax for modifying records

The Record Dot Syntax proposal does not provide syntax for modifying records, because no appealing syntax has been suggested yet.

I have thought up a pretty good syntax for this, so I'll introduce it here.

data Company = Company {name :: String, owner :: Person}
data Person = Person {name :: String, age :: Int}

getAged :: Company -> Company
getAged company = company { owner.age@age = age + 1 }

What do you think about it?
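For comparison, here is what the proposed one-liner has to be spelled as today, as a nested record update (a sketch using the declarations above):

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data Company = Company { name :: String, owner :: Person } deriving Show
data Person  = Person  { name :: String, age :: Int }      deriving Show

-- today's equivalent of the proposed `company { owner.age@age = age + 1 }`
getAged :: Company -> Company
getAged company =
  company { owner = (owner company) { age = age (owner company) + 1 } }
```

The proposed syntax saves both the repetition of `company` and the inner update block.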

Make TemplateHaskell brackets polymorphic in the monad

We have [| blah |] :: Q Exp, but we don't have to: it could be Monad m => m Exp, or even Applicative f => f Exp.

Brackets desugar to functions from Language.Haskell.TH.Lib, and I suspect we could achieve this by changing the type signatures there. Take parensE, for instance; it's defined as

parensE :: ExpQ -> ExpQ
parensE x = do { x' <- x; return (ParensE x') }

but it might as well be

parensE :: Functor f => f Exp -> f Exp
parensE = fmap ParensE

Here, even Functor is sufficient! But we would need Applicative for combinators that use sequence.


Why do this? Because then it's possible to extract the quoted expression without running Q, by writing runIdentity [| blah |] :: Exp.
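A sketch of the payoff, with a primed stand-in for the generalized combinator (today's parensE is Q-only, so the primed names are illustrative):

```haskell
import Data.Functor.Identity (Identity (..))
import Language.Haskell.TH (Exp (..), mkName)

-- parensE generalized as suggested above
parensE' :: Functor f => f Exp -> f Exp
parensE' = fmap ParensE

-- building an Exp purely, with no Q in sight; with polymorphic
-- brackets this could simply be runIdentity [| x |]
example :: Exp
example = runIdentity (parensE' (Identity (VarE (mkName "x"))))
```

The same trick would work for any combinator that only rearranges syntax, which is most of Language.Haskell.TH.Lib.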

Representation hiding for class instances

If you have an abstract datatype, instances of certain classes may expose details of the representation, and thereby provide a way around the abstraction boundary. The most obvious case of this is Generic. Thus it would be nice to have a way to control whether downstream modules see an instance.

Something like this exists already for some built-in constraints:

  • Coercible looks through newtypes only if the newtype constructor is in scope
  • HasField is solved only if the corresponding record field selector is in scope

We could special-case Generic, but I think it would be nicer to have a language feature that can be used for arbitrary classes. I'm imagining being able to mark particular instances as "private", with the semantics that the instance is available only if the data constructor(s) of the type are in scope. (An interesting question is whether the instance is hidden completely, or if GHC should refuse to use it and report an error.)

For example:

module Internal where
  data AbstractType = MkAbstractType { foo :: Int }
    deriving private Generic

module External (AbstractType) where
  import Internal
  -- instance Generic AbstractType is available in this module, because the constructor was imported

module Client where
  import External
  -- instance Generic AbstractType is not available here, because the constructor was not imported

Does this sound worthwhile to anyone else?

See kcsongor/generic-lens#142 for the discussion that prompted this ticket.

Print warnings with errors?

@adinapoli and I are cleaning out some accumulated dust in the error-handling infrastructure, en route to implementing structured error messages (#306).

In my work on this this afternoon, I came across some interesting behavior, and I don't know whether it's wrong, and if it is, in which direction to fix it.

First off, consider

foo x = not ()

GHC reports that () does not match Bool. It does not report anything about unused variable x, even with warnings enabled. This is because reporting of warnings is suppressed when there are errors about.

However, let's examine

foo x = \x -> x

GHC reports three warnings: one because I've made a top-level definition without a type signature, one because I didn't use the first x, and one because my second x shadows the first. All good.

Now I recompile with -Werror=name-shadowing. Interestingly, I now see two warnings and an error -- something I had not thought possible in GHC.

So:

  1. Should GHC suppress warnings when there are errors?
  2. If answering "yes" to (1), then should warnings-promoted-into-errors with -Werror also be suppressed? ("Yes" means that warnings-promoted-into-errors are treated more like warnings here.)
  3. If answering "yes" to (1), then should warnings-promoted-into-errors with -Werror also suppress? ("No" means that warnings-promoted-into-errors are treated more like warnings here.)

Right now, GHC's implementation answers "yes", "yes", and "no". To me, this feels like an odd design choice. If the user writes -Werror, then I would think that warnings-promoted-into-errors get treated just like errors. But maybe this is too rigid, and the current design is more useful in practice.

My choice would be to answer "no" to (1): just print out all that we've got. If you don't want the errors, use -w.

What do you think?

(Note: this isn't really a GHC proposal, because it's just fiddling with diagnostics, which are generally considered just beyond the scope of proposals.)

Why doesn't template-haskell provide basic QuasiQuoters?

Is there a reason why template-haskell doesn't provide QuasiQuoters corresponding to TH quotes [e| ... |], [d| ... |], etc.? A lot of packages provide QuasiQuoters that could be written as "[e| ... |] + extra stuff"; for example, a text interpolation library could be implemented as

-- the QuasiQuoter corresponding to [e| ... |]
eQQ :: QuasiQuoter

-- Interpolate: (let x = 1 in [text|x is: #{x}|]) == "x is: 1"
text :: QuasiQuoter
text =
  eQQ
    { quoteExp =
        appE [| concat |]
        . listE
        . map
          ( \case
            Left raw -> litE $ stringL raw
            Right interpolated -> quoteExp eQQ interpolated
          )
        . parse
    }

-- Parses "x is: #{x}" into [Left "x is: ", Right "x"]
parse :: String -> [Either String String]

Whereas currently, it'd have to pull in haskell-src-meta to parse the string.

As another example, we have a use case for easily writing lists of hlists:

rows =
  eQQ
    { quoteExp = listE . map (toHList . quoteExp eQQ) . lines
    }

-- generates [1 :& "hello" :& True :& Nil, 2 :& "world" :& False :& Nil]
[rows|
  [1, "hello", True]
  [2, "world", False]
|]

Today, the best way to do this without haskell-src-meta is to do

rows = listE . map toHList

$( rows
     [ [| [1, "hello", True] |]
     , [| [2, "world", False] |]
     ]
 )

Related: #260

RecordWildCards should support the "empty record"

Forked out from haskell/error-messages#15

data Foo = Foo { a :: Int }

f (Foo {..}) = 1

being allowed, but

data Foo = Foo { }

f (Foo {..}) = 1

not being allowed, feels arbitrary and annoying. A constructor with no fields vacuously has a name for every field, so on what grounds do we say it is not a record?


Note that I had thought Rust did this, allowing all ways of pattern matching on an empty struct regardless of how it was defined, but I am in fact mistaken:

struct Foo();
struct Bar {}
struct Baz;

fn main() {

  //// illegal because shadowing (?)
  //match (Foo {}) {
  //  Foo => 1, // binds a name Foo, rather than pattern matches on constructor
  //};
  match (Foo {}) {
    Foo() => 1,
  };
  match Foo() {
    Foo {} => 1,
  };
  
  match (Bar {}) {
    Bar => 1, // binds a name Bar, rather than pattern matches on constructor
  };
  match (Bar {}) {
    //Bar() => 1,
    Bar {} => 1,
  };
  //match Bar() {
  match (Bar {}) {
    Bar {} => 1,
  };
  
  match Baz {
    Baz => 1,
  };
  match (Baz{}) {
    //Baz() => 1,
    Baz {} => 1,
  };
  //match Baz() {
  match Baz {
    Baz {} => 1,
  };
}

The commented combinations are those that do not work.

I can't figure out the pattern here; I guess it's not such clear precedent. But do note the record patterns are in fact the only sort that works across all 3 definitions.

Caching code at a more granular level than the module level

Currently GHC caches at the module level, so any change to a module will lead to fully recompiling / re-interpreting all modules that import it, even if those modules are partially or completely unchanged.

This leads to a significant amount of developer time being wasted waiting for compilation/interpretation, and it can also get in the way of certain developer choices. For example it is ergonomically quite nice to have various re-exporting/bundling modules to avoid having to spend as much time messing with imports, but doing so leads to significantly more cache misses.

I understand that at the very first step of compilation there is inevitably going to be a bit of module-level caching due to the dependence on file modification times, but it seems like once the changed files are parsed, you can immediately get down to a more granular level of caching.

One method of doing this would be to take a Nix/Merkle-tree approach, just at the function/type level instead of the package/derivation level. Nix correctly recompiles only those packages whose dependencies have changed, even if the nix file they are configured in changes. Likewise, Haskell functions should only be recompiled if the functions they call or the types they use have changed.

Using a standardized Merkle-tree format such as IPLD would be nice to avoid reinventing the wheel, and to get other desirable interop for free. For example being able to decentrally store and retrieve various stages of compilation, whilst keeping the trusted input->output mapping as lightweight as possible (just input hashes and output hashes). Additionally any tooling for parsing/editing/introspecting these objects can be reused, which is much nicer than trying to work with a custom binary format.

what type of performance regression testing does GHC go through?

Hi,

Does the GHC release or development process include regression testing for performance?

Is this the place to discuss ideas for implementing such a thing and to eventually craft a proposal?

I believe the performance impact of changes to GHC needs to be verified/validated before release. I also believe this would be feasible if we tracked metrics on building a wide variety of real-world packages. Using real-world packages is one of the best ways to see the actual impact users will experience. It's also a great way to broaden the scope of tests, particularly with the combination of language pragmas and enabled features within the compiler.

Unidirectional Coercible

For newtypes that guarantee some invariant, we might want to coerce in one direction (unwrap) but not the other.

newtype NonNegative a = NonNegative a

coerce :: NonNegative a -> a -- safe
coerce :: a -> NonNegative a -- unsafe

Right now the only option is to not export the constructor, which makes both directions of coerce unavailable.
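Today the workaround is a smart-constructor module: hide the constructor, and export a hand-written unwrapper plus a checked constructor. A sketch (names illustrative):

```haskell
-- Imagine this module exports NonNegative, getNonNegative, mkNonNegative,
-- but not the NonNegative constructor itself.
newtype NonNegative a = NonNegative a deriving (Eq, Show)

-- the safe direction, written by hand since coerce needs the constructor
getNonNegative :: NonNegative a -> a
getNonNegative (NonNegative a) = a

-- the unsafe direction, guarded by a runtime check
mkNonNegative :: (Num a, Ord a) => a -> Maybe (NonNegative a)
mkNonNegative a
  | a >= 0    = Just (NonNegative a)
  | otherwise = Nothing
```

The cost relative to a unidirectional Coercible is that the unwrapper cannot be lifted through containers for free the way coerce can (e.g. [NonNegative a] -> [a]).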

Explore ways to weaken or abandon the Instance Consistency Condition for fundeps

In Functional dependencies in GHC we identify two major functional-dependency-related conditions on instance declarations:

  • The Coverage Condition
  • The Instance Consistency Condition

Proposal #374 explores ways of weakening the Coverage Condition.

This ticket explores ways to weaken (or even abandon) the Instance Consistency Condition. It is not yet a proposal ... a ticket will do for now.


Abandoning the instance consistency condition

Here is one concrete possibility, triggered by Examples 2, 3, and 4 on the wiki page:

  • Abandon the LICC altogether. It is too weak (Examples 2,3) and too strong (Example 4).
  • Instead, when considering improvement of a Wanted constraint against the global instances, do the following:
    • Find all the instances that can possibly match the Wanted (where "can possibly match" is "unify with").
    • Among these instances, see if there is one "best" instance, which is more specific than (is a substitution instance of) all the others
    • If there is just one such, add the fundeps from that instance.
    • "Add fundeps from instance" means (precisely as now): for each fundep, if the LHS tys match, then generate an equality with the instantiated RHS tys.

These rules are very like the rules for overlapping instances.

#9210 is a very relevant ticket, with interesting discussion.

Examples of how it works

Now consider Example 2:

  • Only one instance unifies with [W] C Bool [alpha] beta
  • So we take fundeps from that instance alone, giving beta ~ [alpha]
  • Now we can solve [W] C Bool [alpha] [alpha]
  • Yielding f :: a -> [a] as desired

Now Example 3. [W] C alpha beta (gamma,delta) unifies with both instances, so we get no fundeps at all.
If alpha gets unified with, say, Int by some other constraint, then it'll unify with just one instance and we can use the fundeps.

Example 4. If we have [W] TypeEq Int Int r, that unifies with only one instance, so we'll get a fundep r ~ True as desired. Similarly if the first two arguments are apart. All is good.

In the OP from #9210 we have

class Foo s t a b | a b s -> t where
instance Foo (x, a) (y, a) x y where
instance Foo (a, x) (a, y) x y where

We want to solve [W] Foo (Int,Int) alpha Int String. This unifies with both instances, so we will not use either fundep. We need more information to disambiguate.

Less completeness

Because this new rule is a bit less aggressive on using fundeps, it may fail to solve some constraints that we can solve today.

class CX x a b | a -> b
instance CX Int Int Bool
instance CX Bool Int Bool

class C2 a b | a -> b
instance C2 Bool Bool

Now suppose we are solving [W] CX alpha Int beta, [W] C2 beta alpha.
With our new rule, both instances unify, so no fundeps are used. We are stuck.

But in GHC today we take fundeps from both instances, giving beta ~ Bool. Then we can get fundeps from [W] C2 Bool alpha, giving alpha ~ Bool. Now we can solve the CX constraint.

But this is an extremely delicate setup.

Can we do any instance consistency checks at all?

We want to weaken instance consistency, but can we do any consistency checking on instances? For example, these ones look pretty suspicious.

class C2 a b | a -> b
instance C2 Int Bool
instance C2 Int Char

But what about this:

instance C2 Int [Int]
instance C2 Int (Maybe Bool)

Here if we have [W] C2 Int [alpha], only one instance matches, and perhaps we can improve alpha to Int.

Be careful: we want to allow Example 4.

Not functional dependencies in the original sense

Mark Jones introduced the term "functional dependencies" by lifting it from the database world. There, if a -> b, we really mean that a fully determines b. That property is essential for the translation to type families proposed by Karachalias & Schrijvers (paper link at top).

All proposals for liberal coverage conditions, and liberal (or even dropped) instance consistency, move decisively away from this story. a does not fully determine b; translation into type families is impossible. Fundeps guide type inference; they do not carry evidence.

Changing direction on function type.

I'd love to make a discussion here to know if it's compelling enough for a proposal.

If we take a function type as f :: B <- A, then below are advantages:

1. Function composition

f ::     B <- A
g ::     C <- B
g . f :: C <- A

It's natural to read and think in type.
It's easier to follow return types, because they're fixed at one level of indentation.
Instead of the current backward composition with f :: A -> B

f     :: A -> B
g     :: B -> C
g . f :: A -> C   (why not `f . g` here?)

2. Function application and type-level application:

If we have

f :: B <- A
x :: A
then
f x ~ (B <- A) A ~ B

So we reason about type-level application the same as term-level application.
Instead of current backward application on type:

f :: A -> B
x :: A
f x ~ (A -> B) A ~ B   (why not just destructing B, instead of A?)

3. do notation

If we have

f :: Monad m => m a
g :: Monad m => m b <- a
then
it could be used in a do block as
do
  x <- f     ~ ( a <- m a )
  y <- g x   ~ b <- (m b <- a) a
  return y   ~ m b

4. =<< notation

If we have

f :: Monad m => m a
g :: Monad m => m b <- a

then

g =<< f ~ (m b <- a) (m a) ~ m b

So both (3) and (4) are consistent and natural to think in type, too.

5. Multiple arguments function application:

If we have

f :: A <- B <- C <- D
and 
x :: D, y :: C, z :: B
then
f x y z ~ (A <- B <- C <- D) D C B ~ A

It's just ordinary destructuring of a function type to get the return type.

Amending a Proposal

In some of the trailing comments on the PtrRep proposal, there's a discussion going on about modifying the proposal. However, it was already accepted. What is the procedure for modifying an accepted proposal that has not yet been implemented? I suspect that opening a PR that modifies the original proposal would be most appropriate. If this is the procedure, it would be nice to mention this, even if only briefly, in the readme.

Fundeps: shorthands for common idioms, esp with Overlaps

Motivation

There are common idioms for writing FunDep code. Furthermore if combined with Overlaps, the 'textbook' examples don't work reliably, so needing rather verbose idioms to get dependable behaviour. Here I'm thinking out loud about shorthands. This isn't introducing any new functionality or semantics/nothing to see here/move along please.

(These thoughts are prompted mostly by reviewing GHC tickets of unreliable/bogus behaviour and the work-rounds; also somewhat by looking at the 'Habit' flavour of Haskell -- I can see what they're aiming for, but I dislike the specifics.)

Contents: (sorry, Issues markup doesn't support clickable TOC)

  1. Class decls

  2. Instance Heads

  3. Type Family calls in Instance Head

  4. Type class calls in Instance Head

  5. Fewer UndecidableInstances

  6. Type class calls in types

  7. Less distraction in the Class Head

  8. Less distraction in the Instance Head

  9. Single param class calls in Instance Head

  10. Summary/Proposed Change Specification

1. Class decls

By far the most common FunDeps have:

  • a single FunDep;
  • the target type parameter as the rightmost in the class decl;
  • the Fundep is 'full' -- that is, all the other type params make up the source.

And that structure is natural: the class acting as a type-level function.

class C a b c  | a b -> c  where ...

suggested shorthand

class C a b -> c  where ...

Discussion/possible complications/objections

  • If you want a type-level function, use a Type Family

    Type Classes serve three purposes; whereas Type Families serve only one of those: a) a predicate on types; b) an overloading mechanism for methods; c) a mechanism for type improvement. Type Families sometimes work well in combination with type classes, sometimes not. This shorthand is for where not.

    Oh, and Type Families seem to have some serious shortcomings in terms of semantic coherence.

  • What if I want other FunDeps as well?

    I'm not proposing taking away the longer-hand current syntax. Would be nice (but not essential) to combine them:

      class C a b -> c  | c b -> a  where ...
    
  • Bikeshed about lexical syntax

    Habit uses = where I've used ->. I had a strong reaction against that: = is what we use in equations, but this is a decl. So I think it's appropriate to use type-flavoured lexemes. (Type Families use = in the instances/equations, not the TF decl.) -> for the target type fits with thinking of FunDeps as type-level functions; it also directly corresponds to -> in a fully written out FunDep. (The arrow comes from Relational Database theory of FunDeps.)

2. Instance Heads

The theory behind FunDeps says these two instances should give equivalent behaviour.

class C a b -> c  where ...

instance C Int Bool Char  where ...                    -- (A)

instance (c ~ Char) => C Int Bool c  where ...         -- (B)

Indeed the translation via the CHRs paper defines the first to behave as the second.

But thanks to GHC's dodgy implementation, there's an observable difference: if the use site gives c ~ Float, GHC ignores that instance (A) and indeed might select some other instance. Whereas with instance (B), GHC selects that instance, then attempts the improvement under ~, then rejects the program [correct behaviour]. Trac #15632 and many others linked from it.

This bites particularly where combining FunDeps with overlaps, in many of the utilities from Oleg:

data TTrue  = TTrue   deriving (Eq, Show, Read)         -- using these so they're showable
data TFalse = TFalse  deriving (Eq, Show, Read)

class IsFunc a p  | a -> p  where
  isFunc :: a -> p

instance IsFunc (a1 -> a2) TTrue  where                 -- unreliable, see below
--  instance (p ~ TTrue) => IsFunc (a1 -> a2) p  where  -- the right way to do that
  isFunc _ = TTrue

instance {-# OVERLAPPABLE #-} (p ~ TFalse) => IsFunc a p  where
  isFunc _ = TFalse

λ> isFunc id == TTrue    ===> True                                       [correct]
λ> isFunc id == TFalse   ===> True, with the direct TTrue instance       [wrong]
                         ===> mismatch rejection, with the ~ instance    [correct]

To emphasise: Overlap is not essential to make your instances go wrong, but it certainly helps. For another datapoint: Hugs rejects that isFunc id == TFalse example with Couldn't match type `TFalse' with `TTrue', even if there's only the one instance defined.

Then well-disciplined code should use the ~ constraint for improvement on every instance, even though it's verbose and obscures the logic. Suggested shorthand (if we're already comfortable with decorating the target position of a class):

instance C ty1 ty2 ~ty3  where ...

shorthand for

instance (c ~ ty3) => C ty1 ty2 c  where ...

That is: create a fresh tyvar (ideally per the class head); put that in the instance head; add a ~ constraint. This is a purely syntactic sweetener/no validation that the ~-decorated argument is in fact a FunDep target. Then ...

3. Type Family calls in Instance Head

Type Families are not allowed in instance heads. That's a nuisance/again adds verbosity that obscures the logic. You must write

instance (c ~ F ty1 ty2) => C ty1 ty2 c  where ...

OK I just fixed that, by allowing anything to follow the ~ decoration:

instance C ty1 ty2 ~(F ty1 ty2)  where ...

  • Bikeshed about lexical syntax

I've used ~ because it's strongly flavoured by Haskell's irrefutable pattern match at the term level. It's a pretty coincidence ~ is also the pseudo-operator for equality constraints. ~ is not an operator, it's a reservedsym with hard-coded syntax. The syntax I'm suggesting in instance heads actually has ~ as a prefix, even though it looks infix. This would be allowed

instance D ty1 ~ty2 ~ty3  where ...

sugar for

instance (b ~ ty2, c ~ ty3) => D ty1 b c  where ...

4. Type class calls in Instance Head

If we're getting familiar with writing classy stuff as C ty1 ty2 ~ty3 ...

The idiom of a bare tyvar in the instance head that is a target for improvement from constraints is more pervasive, especially where using a recursive typeclass to walk some type structure. Consider

data Z = Z            deriving (Eq, Show, Read)
data S n = S n        deriving (Eq, Show, Read)

-- type-level natural for the number of args to a function
class FuncArgs a n  | a -> n  where
  funcArgs :: a -> n

instance (FuncArgs a2 n') => FuncArgs (a1 -> a2) (S n')  where -- recursive case
  funcArgs (_ :: a1 -> a2) = S (funcArgs (undefined :: a2))

instance {-# OVERLAPPABLE #-} (n ~ Z) => FuncArgs a n  where   -- base case
  funcArgs _ = Z

λ> funcArgs not
===> S Z

Suggestion 4a: allow type class calls in the instance head. Then combining with earlier suggestions that example looks like

class FuncArgs a -> n   where
  funcArgs :: a -> n

instance FuncArgs (a1 -> a2) ~(S (FuncArgs a2 ~n'))  where     -- recursive case
  funcArgs (_ :: a1 -> a2) = S (funcArgs (undefined :: a2))

instance {-# OVERLAPPABLE #-} FuncArgs a ~Z  where             -- base case
  funcArgs _ = Z

(Note the ~(S (FuncArgs a2 ~n')) uses two ~. The outermost is per Suggestion 2.)

The syntactic rule is:

  • ~ prefixed at the outermost level in the instance head requires generating a fresh tyvar and moving what's prefixed (which is usually a type) to an equality constraint.

  • ~ prefixed at nested level, which must be prefixed to a bare tyvar, moves the whole of what's nested to a constraint, leaving the bare tyvar in situ, and removing the ~.

There won't necessarily be any type constructors at outer level. Here's the classic HList example

data HNil = HNil             deriving (Eq, Show, Read)
data HCons e l = HCons e l   deriving (Eq, Show, Read)

-- delete all elements of type `e`
class HDeleteMany e l -> l'  where ...

instance HDeleteMany e HNil ~HNil  where ...

instance HDeleteMany e (HCons e l'') (HDeleteMany e l'' ~l')  where ...   -- no outer ~

-- telescope the desugaring/no need for a fresh tyvar
instance (HDeleteMany e l'' l') => HDeleteMany e (HCons e l'') l'  where ...

The outer ~ can be dropped, because ~l' is introducing a fresh tyvar anyway. (Keeping it would not be wrong, but is redundant.)

Another readability gain here is that there's less clutter of constraints before you get to the 'meat' of the instance head. These days with some library code it's really hard to fight through the thicket of constraints to figure out what some instance is talking about. The visual clue used to be to skim forward to the =>. But these days with Implication Constraints, even that clue isn't reliable. (The Habit approach here is to move constraints to follow the head -- both for instances and superclass constraints -- I'll come to them.)

Suggestion 4b: variant form for Type Family calls in instance head

Perhaps you're not so keen on Suggestions 2 and 3 creating fresh tyvars, and would prefer to control the naming. Then remembering that 4a is a purely syntactic desugaring ...

 instance C ty1 ty2 (F ty1 ty2 ~ c)  where ...                   -- must prefix ~ to a bare tyvar

desugars to

 instance (F ty1 ty2 ~ c) => C ty1 ty2 c  where ...    

Ah, but there's a problem: the ~ must be retained in the constraint, because that's a genuine equality constraint, not a typeclass constraint. They're syntactically indistinguishable (but that's a major reason for using ~ in a pseudo-infix fashion). Does GHC have the smarts to know which is which? Before you answer ...

Suggestion 4c: ~ as a visual clue in constraints

Even without allowing typeclass calls in the instance head, with a pile-up of calls in the constraints, it would be handy to provide some visual sign of the chaining of return types to further calls, like this:

instance (D ty1 ~a', E ty2 ~b', F a' b' ~ c) => C ty1 ty2 c  where ...

Here D and E are regular type classes. The ~ is there as visual clue, entirely unnecessary, and is to be merely dropped. But F is a Type Family, the ~ is needed to make a constraint, so can't be dropped.

If the worst comes to the worst, we could resort to the timeworn work-round: ~ prefixed with no space is to be dropped; ~ with whitespace to its right is to be retained.

That last example could be this. How does it look aesthetically?

instance C ty1 ty2 (F (D ty1 ~a') (E ty2 ~b') ~ c)  where ...

(Habit in a roughly-comparable usage would omit the result type altogether. Then if a typeclass call appears unsaturated, take that as a pseudo-Type function call, and insert a fresh tyvar, moving the typeclass to constraints. I.e. instance C ty1 ty2 (F (D ty1) (E ty2)) where .... Note that Habit doesn't include Type Families nor ~ constraints, so doesn't have to deal with that particular ambiguity.)

5. Fewer UndecidableInstances

The idiom of a fresh/bare tyvar in the instance head has another annoyance: it needs the scary-sounding UndecidableInstances. There's a proposal for that, because most usages are perfectly decidable. Laying down the rules for 'decidable' constraints has become a bit of a mission.

The shorthands here perhaps offer an easier way: if the bare tyvar in the instance head arises from one of these shorthands, then it doesn't count as Undecidable. (That's a bit flakey: these are purely syntactic sugar, with no guarantee the ~ are prefixed to targets of a FunDep.)

6. Type class calls in types

insert2 :: (Collects ce ~e) -> e -> ce -> ce

As sugar for

insert2 :: Collects ce e => e -> e -> ce -> ce

TBH, the old chestnut of the Collects class is a good candidate for an Associated Type. Here there's not much saving over:

insert2 :: Elem ce -> Elem ce -> ce -> ce
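A sketch of that Associated Type version, with a minimal Collects class and a list instance (the class body here is my own minimal reconstruction of the old chestnut):

```haskell
{-# LANGUAGE TypeFamilies #-}

-- the Collects class, recast with an associated Elem type
class Collects ce where
  type Elem ce
  empty  :: ce
  insert :: Elem ce -> ce -> ce

instance Collects [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)

-- no FunDep machinery needed: the signature reads off directly
insert2 :: Collects ce => Elem ce -> Elem ce -> ce -> ce
insert2 x y = insert x . insert y
```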

7. Less distraction in the Class Head

Talking of "clutter of constraints before you get to the 'meat'", what I'd like to see straight after the class keyword is the class name. Instead I get constraints. Then a =>; but that isn't doing the same job as => in a type signature or instance decl. In fact, taking it as implication, that arrow is pointing backwards: class Eq a => Ord a ... means 'anywhere you see Ord, you can take it Eq is also satisfied'. Then move the distraction out of the way

class C a b -> c  | Eq a, Num b  where ...

  • Bikeshed about lexical syntax

Oh, but that | signals FunDeps. Is it ambiguous? FunDeps are a comma-separated list of stuff before the where. Superclass constraints are a comma-separated list of stuff. I'd argue FunDeps semantically belong to the class as constraints (on whole instances, rather than on parameters).

But items between the ,s are not ambiguous:

  • FunDeps have a -> at top level amongst tyvars (lower-case); constraints do not.
  • Constraints start with a class (upper-case), or perhaps these days have an infix type operator, but not ->. Just possibly they might have a -> as type constructor, but that must be nested.

Is there any human difficulty parsing this?

class C a b -> c  | Eq a, c b -> a, Num b, F (a -> b) ~ c  where ...

Semantically, I'm struggling to imagine a class like that, but I'm asking about the syntax.

(This suggestion is straight out of the force of Habit.)

The | (vaguely) suggests guard syntax; for any proffered instance, bind the variables from the class head, then make sure the constraints hold before accepting that instance.

8. Less distraction in the Instance Head

Similarly I'd prefer the class name to follow the instance keyword (but I'm not myself convinced this has great value):

instance C ty1 ty2 c  | (D ty1 ~a', E ty2 ~b', F a' b' ~ c) where ...

as sugar for

instance (D ty1 ~a', E ty2 ~b', F a' b' ~ c) => C ty1 ty2 c  where ...

Again we might observe that => is not straightforward implication, and not doing the same job as in type signatures.

  • Bikeshed about lexical syntax

OTOH | suggesting guard syntax here might add to a common confusion amongst newbies that the constraints restrict an instance from applying. Haskell semantics is to select the instance first, based purely on the instance head, then validate that the constraints hold, including applying any type improvement from FunDeps or ~ constraints.

(Habit's semantics here is the opposite way round: only select the instance if the constraints hold; otherwise look for another instance. And to show that semantics, Habit has keyword if. There's some evidence an older version of Hugs tried to apply that semantics, but gave it up as too hard.)

9. Single param class calls in Instance Head

As a consequence of 4 (allowing class calls in Instance Heads), and because this is a purely syntactic extension, this would be valid:

instance E (Num ~ty2) (Eq ~ty3)  where ...

as sugar for

instance (Num ty2, Eq ty3) => E ty2 ty3  where ... 
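
The desugared form here is also expressible today; a runnable sketch (the class E, its method describe, and the instance body are hypothetical stand-ins):

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

class E a b where
  describe :: a -> b -> String   -- hypothetical method, for illustration

-- today's spelling of `instance (Num ty2, Eq ty3) => E ty2 ty3 where ...`
instance (Num a, Eq b) => E a b where
  describe _ y = if y == y then "ok" else "impossible"
```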

10. Summary/Proposed Change Specification

Suggested additional syntax, and what it desugars to:

Class decls (FunDeps, Constraints):

  class C a b -> c  where ...
    -- desugars to
  class C a b c | a b -> c  where ...

  class C a b -> c | c b -> a  where ...
    -- desugars to
  class C a b c | a b -> c, c b -> a  where ...

  class C a b -> c | Eq a, c b -> a, Num b, F (a -> b) ~ c  where ...
    -- desugars to
  class (Eq a, Num b, F (a -> b) ~ c)
        => C a b c | a b -> c, c b -> a  where ...

Instance decls (head, Constraints, Type Family calls, class calls):

  instance C ty1 ty2 ~ty3  where ...
    -- desugars to (c is a fresh tyvar)
  instance (c ~ ty3) => C ty1 ty2 c  where ...

  instance C ty1 ty2 ~(F ty1 ty2)  where ...
    -- desugars to
  instance ((F ty1 ty2) ~ c) => C ty1 ty2 c  where ...

  instance C ty1 ty2 ~(F (D ty3 ~a') (E ty4 ~b'))  where ...
    -- desugars first to
  instance (D ty3 ~a', E ty4 ~b', F a' b' ~ c)
        => C ty1 ty2 c  where ...
    -- and then, expanding the class calls (D, E classes; F a Type Family), to
  instance (D ty3 a', E ty4 b', F a' b' ~ c)
        => C ty1 ty2 c  where ...

Type signatures (embedded class constraints):

  insert2 :: (Collects ce ~e) -> e -> ce -> ce
    -- desugars to
  insert2 :: Collects ce e => e -> e -> ce -> ce

Binding Guards

Main Idea

While translating an imperative algorithm to Haskell I keep running into situations where guards on binding statements would be neat and it seems like a minor syntactic extension. Here's an artificial example:

do x | 0 <= i && i < size arr <- read arr i
     | otherwise              <- pure 0
   return x

Which would desugar to:

do x <- if 0 <= i && i < size arr then
          read arr i
        else if otherwise then
          pure 0
        else
          fail "binding guard failure"
   return x

With a prettier error message.
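
A runnable model of that desugaring, using safe list indexing in Maybe in place of the mutable-array read (all names here are stand-ins for those in the example):

```haskell
-- Stand-in: a list plays the array, `readAt` plays `read arr i`.
readAt :: [Int] -> Int -> Maybe Int
readAt xs i = Just (xs !! i)

-- the desugared binding-guard example, written out in today's Haskell
example :: [Int] -> Int -> Maybe Int
example arr i = do
  x <- if 0 <= i && i < length arr
         then readAt arr i
         else pure 0              -- the `| otherwise` alternative
  return x
```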

Extension 1

An extension of this idea would be to allow mixing pure and impure bindings:

do x | 0 <= i && i < size arr <- read arr i
     | otherwise              =  0
   return x

If we allow this then we should really also allow pure bindings without let in do blocks for consistency. We already do that in GHCi, so why not everywhere?

Edit: one problem is recursive bindings. let bindings in do blocks are still recursive, so this mixing would mean that even the pure bindings could not be recursive, which is inconsistent with that behaviour.

Extension 2

Initially I was actually thinking it could be really useful if it allowed for conditional bindings, like so:

do s <- initialValue
   s | someCondition <- newValue

This would be desugared to:

do s <- initialValue
   s <- if someCondition then newValue else pure s

I think that would be very useful in some cases, but I suspect it won't always be: it is too unexpected that a missing branch means the old value is kept, and it only works if the variable was already bound and is now shadowed.

We could make this a bit more consistent if the semantics are to always add a | otherwise <- pure <name> guard to all guard lists if <name> is already bound.
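
A runnable sketch of Extension 2's desugaring, here in Maybe (someCondition, the initial value 1, and the new value 99 are all stand-ins for the example's names):

```haskell
-- Extension 2's `s | someCondition <- newValue`, written out by hand
conditionalRebind :: Bool -> Maybe Int
conditionalRebind someCondition = do
  s <- pure 1                                     -- s <- initialValue
  s <- if someCondition then pure 99 else pure s  -- guarded rebinding (shadows s)
  return s
```

Note the second binding deliberately shadows the first, which is exactly the property the paragraph above worries about.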

Conclusion

All in all, I think the main idea is a minor extension which can improve readability a bit, but I don't know if it is worth it on its own. I think both extensions are interesting but also not obviously good enough to make the cut. I did want to write this down in case maybe somebody finds a way to improve it or finds a better use-case.

Pattern synonyms deriving Read

Pattern synonyms in expressions are great for providing a less error-prone/shorthand way to construct values that would be complex using the raw constructors from the datatype; or to provide smart constructors that keep the raw constructors abstract.

Then a natural thing to want to do is use those patterns in Strings; and read the String to get a value. (Probably the pattern is embedded inside some more complex literal.) Problem: there's no way to derive Read instances for patterns. Furthermore instances are for the type not the constructor(s), and in general there's no way for ghc to know all the patterns declared for some type. If the code declares the type ... deriving (Read, ...), we can't insert any read overloading later for the patterns.

Hopelessly ideal solution

pattern ...                       -- must be bidirectional, either implicit or explicit
  deriving (Read)

(I can't think of any other class than Read which it makes sense to derive.)

Workable approach(?)

  • On the base data type, do not put ... deriving (Read).

  • Then extend StandaloneDeriving:

      deriving instance Read (datatype)              -- standard deriving instance
        pattern (conid1, conid2, ..., conidn)
    

Downsides:

  • If you're importing the base data type decl, and it already has ... deriving (Read), you're sunk.
  • Or if you want to add some patterns to those already provided with Read instances.
  • What if you don't want a Read instance for the raw data constructors?
    Do we require mentioning them in the list of conid?

If you're avoiding the ... deriving (Read) on the data type, then yes you could hand-craft instance Read (datatype) where ... and overload read to parse all the pattern constructors. That's tedious and error-prone code; far better to have the compiler generate it.
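
For concreteness, here is the kind of hand-crafted instance that paragraph describes, for a hypothetical type T with one pattern synonym Zero; this is the tedious code the proposal would have the compiler generate:

```haskell
{-# LANGUAGE PatternSynonyms #-}

data T = MkT Int deriving (Show, Eq)

pattern Zero :: T
pattern Zero = MkT 0    -- implicitly bidirectional

-- hand-written Read covering both the synonym and the raw constructor
instance Read T where
  readsPrec _ s = case lex s of
    [("Zero", rest)] -> [(Zero, rest)]
    [("MkT", rest)]  -> [ (MkT n, rest') | (n, rest') <- readsPrec 11 rest ]
    _                -> []
```

Even this small example has to hard-code every conid it accepts, which is why adding patterns later (from another module, say) doesn't scale.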

Clumsy/ugly approach

Provide a callout option/mechanism from deriving (Read) for the data type: if the parse fails on the base constructors:

  • Parse a conid from the front of read's input string.

  • Call method patternRead of class PatternRead, with two type parameters:

  • The data type to be returned (as with Read).

  • A Symbol, being the conid promoted to a type-level String.

  • Then we can generate an instance with this pattern decl:

      pattern ...
        deriving (PatternRead)
    
  • Hmm: what about infix conids? They might be a long way from the front of read's input string, with tricky parsing needed.

Producing a type-level Symbol will need a heavy dose of dependent typing, and could fail at runtime with a missing instance. But if we provide the conid as merely a String, we can't freely add patterns to PatternRead: there could be only one instance per datatype, and it would need to 'know' all the conids in advance.

Some other type-level mechanism for chaining together attempts to parse a pattern for the type?

Add naive Core/STG interpreter so TH and GHCi work even with ABI changes

External interpreters and --target help with the cross-compilation case, but that still leaves the ABI-compatibility problem. The end result for GHC development itself is the same: TH and GHCi are not available in certain bootstrapping scenarios, which restricts the way GHC and the core libs must be developed in myriad obscure ways that are highly hostile to new contributors.

Also, it seems like a dedicated Core interpreter would be a good thing to have in general as a "living quasi-specification", with benefits such as giving people something authoritative to read, serving as an oracle for tests, etc.

CC @angerman

Purpose of issue tracker

There's nothing in the readme about the issue tracker on this repository, so what exactly is the issue tracker for? The two kinds of issues that I see are:

  1. Questions about the procedure of GHC proposals
  2. Incomplete proposals that attract some discussion and then linger forever

This issue falls into the first category. I propose that the readme be amended to restrict the purpose of the issue tracker to just questions about procedure. Subsequently, all proposals on the issue tracker would be closed with a request that if the author was still interested, the proposal should be reopened as a PR.

Named functional dependencies

In #236 (comment), @cartazio brings up the topic of generating function inverses from functional dependencies. I think it should be possible by giving them a name.

Today we can write:

type family F a = r | r -> a where
  F True = 1
  F False = 0
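
For reference, the existing form compiles today with TypeFamilyDependencies once the kinds are made explicit; a runnable sketch:

```haskell
{-# LANGUAGE DataKinds, TypeFamilyDependencies #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (Nat, natVal)

-- the existing injective-family syntax, with kinds made explicit
type family F (a :: Bool) = (r :: Nat) | r -> a where
  F 'True  = 1
  F 'False = 0

-- reflect a type-level result back to a value, to show F in action
demo :: Integer
demo = natVal (Proxy :: Proxy (F 'True))
```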

I propose we write this instead:

type family F a = r | G r = a where
  F True = 1
  F False = 0

Then GHC would derive the following definition:

type family G r = a where
  G 1 = True
  G 0 = False

Github discussion feature

Recently, GitHub introduced a discussion feature.

Since most of the issues opened here are discussions, maybe it would make sense to enable discussions for this repository? (It is also possible to convert existing issues into discussions.)

The main advantage they offer over issues is that they allow threaded comments.

One might even consider using discussions for the actual proposals (that's how they do it over in F#-land), though since the code review feature has turned out to be quite useful for proposals, I'm less convinced that that would make sense.
