
akimd / bison


GNU Bison

License: GNU General Public License v3.0

Makefile 1.65% Shell 3.73% Perl 2.52% M4 11.68% C 63.09% C++ 11.29% XSLT 3.15% CSS 0.10% Python 0.18% Ruby 0.14% Java 2.45%
gnu parser-generator yacc bison

bison's Introduction

GNU Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables. Bison can also generate IELR(1) or canonical LR(1) parser tables. Once you are proficient with Bison, you can use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages.

Bison is upward compatible with Yacc: all properly-written Yacc grammars work with Bison with no change. Anyone familiar with Yacc should be able to use Bison with little trouble. You need to be fluent in C, C++, D or Java programming in order to use Bison.

Bison and the parsers it generates are portable: they do not require any specific compiler.

GNU Bison's home page is https://gnu.org/software/bison/.

Installation

Build from git

The README-hacking.md file is about building, modifying and checking Bison. See its "Working from the Repository" section to build Bison from the git repo. Roughly, run:

$ git submodule update --init
$ ./bootstrap

then proceed with the usual configure && make steps.

Build from tarball

See the INSTALL file for generic compilation and installation instructions.

Bison requires GNU m4 1.4.6 or later. See https://ftp.gnu.org/gnu/m4/m4-1.4.6.tar.gz.

Running a non-installed bison

Once you have run make, you might want to toy with this fresh bison before installing it. In that case, do not use src/bison: it would use the installed files (skeletons, etc.), not the local ones. Use tests/bison.

Colored diagnostics

As an experimental feature, diagnostics are now colored, controlled by the --color and --style options.

To use them, install the libtextstyle library, 0.20.5 or newer, before configuring Bison. It is available from https://alpha.gnu.org/gnu/gettext/, for instance https://alpha.gnu.org/gnu/gettext/libtextstyle-0.20.5.tar.gz, or as part of Gettext 0.21 or newer, for instance https://ftp.gnu.org/gnu/gettext/gettext-0.21.tar.gz.

The option --color supports the following arguments:

  • always, yes: Enable colors.
  • never, no: Disable colors.
  • auto, tty (default): Enable colors if the output device is a tty.

To customize the styles, create a CSS file, say bison-bw.css, similar to

/* bison-bw.css */
.warning   { }
.error     { font-weight: 800; text-decoration: underline; }
.note      { }

then invoke bison with --style=bison-bw.css, or set the BISON_STYLE environment variable to bison-bw.css.

In some diagnostics, bison uses libtextstyle to emit special escapes to generate clickable hyperlinks. The environment variable NO_TERM_HYPERLINKS can be used to suppress them. This may be useful for terminal emulators which produce garbage output when they receive the escape sequence for a hyperlink. Currently (as of 2020), this affects some versions of emacs, guake, konsole, lxterminal, rxvt, yakuake.

Relocatability

If you pass --enable-relocatable to configure, Bison is relocatable.

A relocatable program can be moved or copied to a different location on the file system. It can also be used through mount points for network sharing. It is possible to make symlinks to the installed and moved programs, and invoke them through the symlink.

See "Enabling Relocatability" in the documentation.

Internationalization

Bison supports two catalogs: one for Bison itself (i.e., for the maintainer-side parser generation), and one for the generated parsers (i.e., for the user-side parser execution). The requirements between both differ: bison needs ngettext, the generated parsers do not. To simplify the build system, neither are installed if ngettext is not supported, even if generated parsers could have been localized. See https://lists.gnu.org/r/bug-bison/2009-08/msg00006.html for more details.

Questions

See the section FAQ in the documentation (doc/bison.info) for frequently asked questions. The documentation is also available in PDF and HTML, provided you have a recent version of Texinfo installed: run make pdf or make html.

If you have questions about using Bison and the documentation does not answer them, please send mail to help-bison@gnu.org.

Bug reports

Please send bug reports to bug-bison@gnu.org. Be sure to include the version number from bison --version, and a complete, self-contained test case in each bug report.

Copyright statements

For any copyright year range specified as YYYY-ZZZZ in this package, note that the range specifies every single year in that closed interval.

bison's People

Contributors

adelavais, adl, akimd, bhaible, bonzini, dmacnet, ebblake, eggert, eric-s-raymond, jannick0, jmgdjgpp, jpewdev, jrn, jsoref, jurik42, kaladron, meyering, mranno, nickg, nitnelave, noahfriedman, pnhilfinger, scfc, shure, slattarini, vogelsgesang, wojciechpolak, xaec6, yroeht, yui-knk

bison's Issues

publish bison version number in generated output

I was trying to migrate from %name-prefix to api.prefix while trying to support the as-of-today versions of bison in various distributions.

I ran into one issue, for which I have created a workaround, but since it was a pain to fix, let me suggest some additions to the generated parser file. The suggestion is to publish

YYBISON_VERSION_MAJOR 3
YYBISON_VERSION_MINOR 5

macros, containing the major/minor version numbers. These would be in addition to the already existing YYBISON_VERSION macro.

Rationale:

YYBISON_VERSION as of today is a string representation of the bison version number, thus it does not allow me to create conditional code blocks in my grammar, which I needed in order to resolve a compatibility issue.

The issue I had was:

  • I am using YYEMPTY identifier in my rules, which in bison 3.5.1 was named YYEMPTY even if I used api.prefix.
  • starting with 3.6.1, YYEMPTY got renamed to _EMPTY when api.prefix is used
  • to make my grammar work with all bison versions, I #defined _EMPTY to YYEMPTY in case I was using 3.5.1

In order to have the version used to generate the .c file, I needed some sed magic in my Makefile rules to insert this information.

.y.c:
        $(AM_V_YACC)$(am__skipyacc) $(SHELL) $(YLWRAP) $< y.tab.c $@ y.tab.h $*.h y.output $*.output -- $(YACCCOMPILE)
        $(AM_V_GEN) sed -i -e '1i #define SYSLOG_NG_BISON_MAJOR @bison_version_major@\n#define SYSLOG_NG_BISON_MINOR @bison_version_minor@'  $@
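
As a rough sketch of how the injected macros might then be consumed in C: the comparison below is written once as a function and once as a preprocessor macro usable in #if. The helper name bison_at_least, the macro BISON_AT_LEAST, the flag COMPAT_USES_YYEMPTY, the fallback values, and the 3.6 cut-off are all assumptions made for illustration; only the SYSLOG_NG_BISON_MAJOR and SYSLOG_NG_BISON_MINOR names come from the Makefile rule above.

```c
#include <stdbool.h>

/* Assumption: the build injects these two macros (as the sed rule does);
   the fallback values below are illustrative only.  */
#ifndef SYSLOG_NG_BISON_MAJOR
# define SYSLOG_NG_BISON_MAJOR 3
#endif
#ifndef SYSLOG_NG_BISON_MINOR
# define SYSLOG_NG_BISON_MINOR 5
#endif

/* Hypothetical helper: true when (major, minor) is at least
   (want_major, want_minor).  */
bool
bison_at_least (int major, int minor, int want_major, int want_minor)
{
  return major > want_major
         || (major == want_major && minor >= want_minor);
}

/* Preprocessor form of the same comparison, usable in #if, which a
   string-valued YYBISON_VERSION cannot be.  */
#define BISON_AT_LEAST(Major, Minor)                    \
  (SYSLOG_NG_BISON_MAJOR > (Major)                      \
   || (SYSLOG_NG_BISON_MAJOR == (Major)                 \
       && SYSLOG_NG_BISON_MINOR >= (Minor)))

/* Assumed cut-off: the report observed the rename from 3.6.1 on.  */
#if !BISON_AT_LEAST (3, 6)
# define COMPAT_USES_YYEMPTY 1  /* older Bison: unprefixed name */
#else
# define COMPAT_USES_YYEMPTY 0  /* newer Bison: prefixed name */
#endif
```

With the proposed YYBISON_VERSION_MAJOR and YYBISON_VERSION_MINOR macros, the same #if could be written without any Makefile post-processing.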

Here's the code in the upstream codebase I needed: https://github.com/syslog-ng/syslog-ng/blob/059c1154ab87833bfb2598600c20ede4f09a8aaf/lib/rewrite/rewrite-expr-grammar.ym#L48-L55

Just in case you are interested, the migration from the old style to the new one took us almost 2 years, as documented in this pull request: syslog-ng/syslog-ng#2526

Broken for D parsers: %code lexer {

The use of %code lexer { in the grammar file appears to be broken for %language "D". In file bison/data/skeletons/lalr1.d at line 272, changing implements to : seems to be all that is needed to fix this issue (from limited testing).

It's great to see D support in bison, and I hope it will continue to be developed/extended.

Usage of HAVE_UNISTD_H should be consistent

HAVE_UNISTD_H is found in various files as a guard against including unistd.h.

functions.m4
headers.m4
specific.m4

all contain #ifdef HAVE_UNISTD_H.

However,

src/files.c
src/system.h

do not.

The usage of HAVE_UNISTD_H should be consistent across the codebase.
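
For reference, a minimal sketch of the guarded form being asked for. It assumes HAVE_UNISTD_H is defined by the configure-generated config.h when the header exists; the helper unistd_guard_active is hypothetical and only reports which branch was compiled in.

```c
/* Assumption: HAVE_UNISTD_H comes from the configure-generated config.h;
   here it is simply left to the compiler command line.  */
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif

/* Hypothetical helper: returns 1 when the guarded include was compiled
   in, 0 otherwise.  */
int
unistd_guard_active (void)
{
#ifdef HAVE_UNISTD_H
  return 1;
#else
  return 0;
#endif
}
```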

[TC] Problem: Bad Variant Access errors are not explicit

Hello,

I am a Tiger maintainer (2024) at Epita.

This year we ran into a recurring issue with Bison. Students
almost always hit a bad variant access error when they mismatch
types. The problem is that they have no way to identify
the source of the error.

It would be a good idea to have some safety checks
around variant access.
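
To illustrate the kind of check meant here, the sketch below uses a plain C tagged union rather than Bison's C++ variants, so it is only an analogy. All names (value_kind, struct value, value_get_int) are hypothetical. The point is that a checked accessor can report both the expected and the stored kind instead of failing without context.

```c
#include <stdio.h>

/* Hypothetical value kinds; a real parser would have one per semantic
   type.  */
enum value_kind { KIND_NONE, KIND_INT, KIND_STRING };

struct value
{
  enum value_kind kind;            /* which member of `as` is valid */
  union { int i; const char *s; } as;
};

static const char *
kind_name (enum value_kind k)
{
  switch (k)
    {
    case KIND_INT:    return "int";
    case KIND_STRING: return "string";
    default:          return "none";
    }
}

/* Checked accessor: returns 1 and stores the integer on success.  On a
   kind mismatch it reports the expected and the stored kind, leaves
   *out untouched, and returns 0.  */
int
value_get_int (const struct value *v, int *out)
{
  if (v->kind != KIND_INT)
    {
      fprintf (stderr, "bad variant access: expected %s, found %s\n",
               kind_name (KIND_INT), kind_name (v->kind));
      return 0;
    }
  *out = v->as.i;
  return 1;
}
```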

Makefile:4424: recipe for target 'src/bison-scan-code-c.o' failed

I couldn't build bison due to a fatal error:
Makefile:4424: recipe for target 'src/bison-scan-code-c.o' failed.
The error is: src/scan-code-c.c:3:10: fatal error: src/scan-code.c: No such file or directory
#include "src/scan-code.c"
There is a 'scan-code.l' and there is a 'scan-code-c.c' in the src directory, but no 'scan-code.c'.


Running make > ~/error.txt produces:
make[1]: Entering directory '/home/rob/Source/github/bison'
LEX examples/c/reccalc/scan.stamp
make[1]: Leaving directory '/home/rob/Source/github/bison'
make[1]: Entering directory '/home/rob/Source/github/bison'
LEX examples/c/reccalc/scan.stamp
make[1]: Leaving directory '/home/rob/Source/github/bison'
LEX src/scan-code.c
LEX src/scan-gram.c
LEX src/scan-skel.c
make all-recursive
make[1]: Entering directory '/home/rob/Source/github/bison'
Making all in po
make[2]: Entering directory '/home/rob/Source/github/bison/po'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/rob/Source/github/bison/po'
Making all in runtime-po
make[2]: Entering directory '/home/rob/Source/github/bison/runtime-po'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/rob/Source/github/bison/runtime-po'
Making all in gnulib-po
make[2]: Entering directory '/home/rob/Source/github/bison/gnulib-po'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/rob/Source/github/bison/gnulib-po'
Making all in .
make[2]: Entering directory '/home/rob/Source/github/bison'
CC src/bison-scan-code-c.o
Makefile:4424: recipe for target 'src/bison-scan-code-c.o' failed
make[2]: Leaving directory '/home/rob/Source/github/bison'
Makefile:5371: recipe for target 'all-recursive' failed
make[1]: Leaving directory '/home/rob/Source/github/bison'
Makefile:2940: recipe for target 'all' failed

Associating %prec with a rule without any occurrences of tokens

I want to parse ML-style whitespace sensitive function application using operator precedence, so I'd start from a grammar like

%left APP
%%
e : "f" 
  | e e %prec APP

and I would hope that this would be accepted without conflict by parsing chains of whitespace-separated "f" tokens as left-associated lists.
Alas, it appears that %prec declarations only work if the rule starts with a terminal, or some other hidden condition:

$ bison -v test.y -Wcounterexamples
test.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
test.y: warning: shift/reduce conflict on token "f" [-Wcounterexamples]
  Example: e e • "f"
  Shift derivation
    e
    ↳ 2: e e
           ↳ 2: e e
                  ↳ 1: • "f"
  Reduce derivation
    e
    ↳ 2: e          e
         ↳ 2: e e • ↳ 1: "f"

If I use an arbitrary infix operator @ instead and specify %left @, all conflicts can be resolved.

Is there some technical reason why the operator-less use case can't be supported?

Test

Some
words
on
several
lines.

undefined reference to `rpl_fprintf'

I encountered the error below when building bison. As far as I can see, the function location_print does not call rpl_fprintf; it calls fprintf. Why does the linker complain about "undefined reference to `rpl_fprintf'"? Thanks!

src/bison-location.o: In function `location_print':
location.c:(.text+0xae5): undefined reference to `rpl_fprintf'
location.c:(.text+0xb22): undefined reference to `rpl_fprintf'
location.c:(.text+0xb3d): undefined reference to `rpl_fprintf'
location.c:(.text+0xb65): undefined reference to `rpl_fprintf'
location.c:(.text+0xb80): undefined reference to `rpl_fprintf'
src/bison-location.o:location.c:(.text+0xbc4): more undefined references to `rpl_fprintf' follow
src/bison-getargs.o: In function `getargs':
getargs.c:(.text+0x1581): undefined reference to `rpl_printf'
getargs.c:(.text+0x18b1): undefined reference to `rpl_printf'
getargs.c:(.text+0x1928): undefined reference to `rpl_fprintf'
getargs.c:(.text+0x1d73): undefined reference to `rpl_printf'
getargs.c:(.text+0x1f63): undefined reference to `rpl_printf'
getargs.c:(.text+0x1f7f): undefined reference to `rpl_printf'
getargs.c:(.text+0x2037): undefined reference to `rpl_fprintf'
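
One likely explanation, offered as an assumption rather than a diagnosis of this particular build: Bison bundles gnulib, whose replacement stdio.h can redefine fprintf as a macro that expands to rpl_fprintf. The source then says fprintf while the object file references rpl_fprintf, and the link fails if the library providing the replacement is missing or stale. The sketch below reproduces the mechanism in isolation; rpl_fprintf here is a stand-in written for the example, and print_location is a hypothetical function.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for a replacement function that a portability library would
   normally provide in its own object files.  */
int
rpl_fprintf (FILE *stream, const char *format, ...)
{
  va_list args;
  va_start (args, format);
  int written = vfprintf (stream, format, args);
  va_end (args);
  return written;
}

/* From here on the preprocessor rewrites every use of the name fprintf,
   so compiled code references rpl_fprintf instead.  */
#define fprintf rpl_fprintf

/* The source text says fprintf, but the call targets rpl_fprintf.
   Returns the number of characters written.  */
int
print_location (FILE *out, int line, int column)
{
  return fprintf (out, "%d.%d", line, column);
}
```

If the definition of rpl_fprintf were removed from this file, linking would fail with the same kind of "undefined reference" message even though no call to rpl_fprintf appears in the source.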

bison regression since 3.3.90 up to and including 3.7.4

I am pretty positive that af1c6f9 (using bitsets instead of a linear lookup table) introduced a subtle regression that is difficult to trigger, but that I have run into. I have confirmed that reverting this patch "fixes" the problem for me on top of 3.7.4.

Description of the problem

Due to the way we use bison generated grammars in syslog-ng, we have a lot of unused terminals and non-terminals in our grammar (and we suppress the generated warning).

I wanted to introduce a new %token. The grammar without the %token worked (I was using 3.7.4), but simply adding the new token (with no rules) caused the parser to behave incorrectly.

This is the only change on the grammar that triggers the incorrect behavior:

diff --git a/modules/csvparser/csvparser-grammar.ym b/modules/csvparser/csvparser-grammar.ym
index 27177db05..93304e01e 100644
--- a/modules/csvparser/csvparser-grammar.ym
+++ b/modules/csvparser/csvparser-grammar.ym
@@ -60,6 +60,7 @@
 %token KW_NULL
 %token KW_CHARS
 %token KW_STRINGS
+%token KW_DROP_INVALID
 
 %type  <ptr> parser_expr_csv
 %type   <num> parser_csv_flags

The grammar is pretty long and complex (because of the way we generate it from its source files), so I am not attaching it here, but I can do that if needed. Let me instead describe how I found the buggy patch.

What happens is that with the only change above, bison reduces using the wrong rule. Here's the debug output of the parser:

Good

Next token is token ')' (5.47: )
Reducing stack by rule 29 (line 841):
-> $$ = nterm string_list_build (5.47: )
Entering state 57
Stack now 0 1 3 6 7 12 24 40 40 40 57
Reducing stack by rule 28 (line 840):
   $1 = nterm string (5.42-46: )
   $2 = nterm string_list_build (5.47: )
-> $$ = nterm string_list_build (5.42-46: )
Entering state 57
Stack now 0 1 3 6 7 12 24 40 40 57
Reducing stack by rule 28 (line 840):
   $1 = nterm string (5.36-40: )
   $2 = nterm string_list_build (5.42-46: )
-> $$ = nterm string_list_build (5.36-46: )
Entering state 57
Stack now 0 1 3 6 7 12 24 40 57
Reducing stack by rule 28 (line 840):
   $1 = nterm string (5.30-34: )
   $2 = nterm string_list_build (5.36-46: )
-> $$ = nterm string_list_build (5.30-46: )
Entering state 42
Stack now 0 1 3 6 7 12 24 42
Reducing stack by rule 27 (line 836):
   $1 = nterm string_list_build (5.30-46: )
-> $$ = nterm string_list (5.30-46: )
Entering state 41
Stack now 0 1 3 6 7 12 24 41
Next token is token ')' (5.47: )
Shifting token ')' (5.47: )
Entering state 58
Stack now 0 1 3 6 7 12 24 41 58
Reducing stack by rule 13 (line 437):
   $1 = token KW_COLUMNS (5.22-28: )
   $2 = token '(' (5.29: )
   $3 = nterm string_list (5.30-46: )
   $4 = token ')' (5.47: )
-> $$ = nterm parser_csv_opt (5.22-47: )
Entering state 18
Stack now 0 1 3 6 7 18
Reading a token
Next token is token ')' (5.48: )
Reducing stack by rule 5 (line 424):
-> $$ = nterm parser_csv_opts (5.48: )
Entering state 30
Stack now 0 1 3 6 7 18 30
Reducing stack by rule 4 (line 423):
   $1 = nterm parser_csv_opt (5.22-47: )
   $2 = nterm parser_csv_opts (5.48: )
-> $$ = nterm parser_csv_opts (5.22-47: )
Entering state 17
Stack now 0 1 3 6 7 17
Next token is token ')' (5.48: )
Shifting token ')' (5.48: )
Entering state 29
Stack now 0 1 3 6 7 17 29
Reducing stack by rule 3 (line 413):
   $1 = token KW_CSV_PARSER (5.11-20: )
   $2 = token '(' (5.21: )
   $3 = nterm $@1 (5.22: )
   $4 = nterm parser_csv_opts (5.22-47: )
   $5 = token ')' (5.48: )
-> $$ = nterm parser_expr_csv (5.11-48: )
Entering state 4
Stack now 0 1 4
Reducing stack by rule 1 (line 408):
   $1 = token LL_CONTEXT_PARSER (5.11-20: )
   $2 = nterm parser_expr_csv (5.11-48: )
Stack now 0

As you can see, the entire input is properly consumed, reductions happen via rules:

  • Reducing stack by rule 29 (line 841):
  • Reducing stack by rule 28 (line 840):
  • Reducing stack by rule 28 (line 840):
  • Reducing stack by rule 28 (line 840):
  • Reducing stack by rule 27 (line 836):
  • Reducing stack by rule 13 (line 437):
  • Reducing stack by rule 5 (line 424):
  • Reducing stack by rule 4 (line 423):
  • Reducing stack by rule 3 (line 413):
  • Reducing stack by rule 1 (line 408):

The very same in the bad case:

In this case the input file and the grammar are the same, but bison does not have the revert, i.e. it is vanilla 3.7.4:

Next token is token ')' (5.47: )
Shifting token ')' (5.47: )
Entering state 61
Stack now 0 1 3 6 7 12 24 40 40 40 61
Reducing stack by rule 9 (line 433):
   $1 = nterm string (5.30-34: )
   $2 = nterm string (5.36-40: )
   $3 = nterm string (5.42-46: )
   $4 = token ')' (5.47: )
-> $$ = nterm parser_csv_opt (5.30-47: )
Entering state 18
Stack now 0 1 3 6 7 12 24 18
Reading a token
Next token is token ')' (5.48: )
Reducing stack by rule 5 (line 424):
-> $$ = nterm parser_csv_opts (5.48: )
Entering state 30
Stack now 0 1 3 6 7 12 24 18 30
Reducing stack by rule 4 (line 423):
   $1 = nterm parser_csv_opt (5.30-47: )
   $2 = nterm parser_csv_opts (5.48: )
-> $$ = nterm parser_csv_opts (5.30-47: )
Entering state 17
Stack now 0 1 3 6 7 12 24 17
Next token is token ')' (5.48: )
Shifting token ')' (5.48: )
Entering state 29
Stack now 0 1 3 6 7 12 24 17 29
Reducing stack by rule 3 (line 413):
   $1 = nterm $@1 (5.22: )
   $2 = token KW_COLUMNS (5.22-28: )
   $3 = token '(' (5.29: )
   $4 = nterm parser_csv_opts (5.30-47: )
   $5 = token ')' (5.48: )
-> $$ = nterm parser_expr_csv (5.22-48: )
Entering state 4
Stack now 0 1 3 6 4
Reducing stack by rule 1 (line 408):
   $1 = token '(' (5.21: )
   $2 = nterm parser_expr_csv (5.22-48: )
Stack now 0 1 3
Cleanup: popping token KW_CSV_PARSER (5.11-20: )
Cleanup: popping token LL_CONTEXT_PARSER (5.11-20: )

In this case, we have two leftover tokens and reductions happen via:

  • Reducing stack by rule 9 (line 433):
  • Reducing stack by rule 5 (line 424):
  • Reducing stack by rule 4 (line 423):
  • Reducing stack by rule 3 (line 413):
  • Reducing stack by rule 1 (line 408):

Comparing bison outputs

The report file is the same in both cases (-r all), but the generated .c files differ:

$ diff -u rossz.c jo.c
--- rossz.c	2021-01-23 10:29:45.522057550 +0000
+++ jo.c	2021-01-23 10:30:38.365153239 +0000
@@ -2235,7 +2235,7 @@
 };
 #endif
 
-#define YYPACT_NINF (-148)
+#define YYPACT_NINF (-149)
 
 #define yypact_value_is_default(Yyn) \
   ((Yyn) == YYPACT_NINF)
@@ -2249,13 +2249,13 @@
      STATE-NUM.  */
 static const yytype_int16 yypact[] =
 {
-      -4,  -142,    16,  -147,  -148,  -148,  -148,   -87,  -146,  -143,
-    -142,  -141,  -140,  -139,  -138,  -137,  -136,  -136,   -87,  -148,
-    -124,  -124,  -124,  -124,  -124,  -125,  -124,  -124,  -124,  -148,
-    -148,  -148,  -148,  -135,  -148,  -124,  -134,  -133,  -148,  -132,
-    -124,  -131,  -148,  -129,  -126,  -124,  -148,  -147,  -148,  -123,
-    -122,  -121,  -148,  -148,  -148,  -148,  -148,  -148,  -148,  -124,
-    -124,  -148,  -148,  -148,  -148,  -148,  -120,  -119,  -148,  -148
+      -4,  -142,    16,  -146,  -149,  -149,  -149,   -87,  -143,  -141,
+    -140,  -139,  -138,  -137,  -136,  -135,  -134,  -148,   -87,  -149,
+    -124,  -124,  -124,  -124,  -124,  -125,  -124,  -124,  -124,  -149,
+    -149,  -149,  -149,  -133,  -149,  -124,  -132,  -131,  -149,  -130,
+    -124,  -127,  -149,  -123,  -122,  -121,  -149,  -147,  -149,  -120,
+    -119,  -118,  -149,  -149,  -149,  -149,  -149,  -149,  -149,  -124,
+    -124,  -149,  -149,  -149,  -149,  -149,  -117,  -116,  -149,  -149
 };
 
   /* YYDEFACT[STATE-NUM] -- Default reduction number in state STATE-NUM.
@@ -2275,8 +2275,8 @@
   /* YYPGOTO[NTERM-NUM].  */
 static const yytype_int16 yypgoto[] =
 {
-    -148,  -148,  -148,  -148,    29,  -148,  -148,     1,  -148,    14,
-    -148,   -20,    28,    -9,    12,  -148
+    -149,  -149,  -149,  -149,     4,  -149,  -149,   -16,  -149,     8,
+    -149,   -20,    28,    -9,    12,  -149
 };
 
   /* YYDEFGOTO[NTERM-NUM].  */
@@ -2292,10 +2292,10 @@
 static const yytype_int8 yytable[] =
 {
        8,    36,     1,    39,    40,    48,    49,    50,    51,    31,
-      31,     3,    32,    32,    43,    44,     5,     6,    20,     9,
-      40,    21,    22,    23,    24,    25,    26,    27,    28,    29,
-      52,    54,    55,    56,    58,    59,    43,    44,    60,    66,
-      40,    61,    63,    64,    65,    68,    69,    30,    62,    53,
+      31,     3,    32,    32,    43,    44,     5,    29,     6,     9,
+      40,    20,    30,    21,    22,    23,    24,    25,    26,    27,
+      28,    62,    52,    54,    55,    56,    43,    44,    58,    66,
+      40,    59,    60,    53,    61,    63,    64,    65,    68,    69,
       38,    67,    57,     0,     0,     0,     0,     0,     0,     0,
        0,     0,     0,     0,     0,     0,     0,    10,    11,    12,
       13,    14,    15,    16
@@ -2304,10 +2304,10 @@
 static const yytype_int16 yycheck[] =
 {
       87,    21,     6,    23,    24,    25,    26,    27,    28,   134,
-     134,   153,   137,   137,   161,   162,     0,   164,   164,   106,
-      40,   164,   164,   164,   164,   164,   164,   164,   164,   165,
-     165,   165,   165,   165,   165,   164,   161,   162,   164,    59,
-      60,   165,   165,   165,   165,   165,   165,    18,    47,    35,
+     134,   153,   137,   137,   161,   162,     0,   165,   164,   106,
+      40,   164,    18,   164,   164,   164,   164,   164,   164,   164,
+     164,    47,   165,   165,   165,   165,   161,   162,   165,    59,
+      60,   164,   164,    35,   165,   165,   165,   165,   165,   165,
       22,    60,    40,    -1,    -1,    -1,    -1,    -1,    -1,    -1,
       -1,    -1,    -1,    -1,    -1,    -1,    -1,   154,   155,   156,
      157,   158,   159,   160

I suspect there's an off-by-one error somewhere. I have yet to diagnose the patch itself; fortunately, that single change is what triggers the bug and it can easily be reverted on top of the latest release.

Let me know what else you would need.

Possible crash on invalid input

I'm porting GNU Bison to Windows and received an issue reporting a crash on an invalid input file:
lexxmark/winflexbison#64

I reproduced it, and the stack trace is as follows:
states_free ()
closure_free()
bitset_free (ruleset);
BITSET_FREE_ (bset); <<<< bset is NULL

Could you check whether it also affects the original Bison code on Linux?

Wrong fix for #72

Hi,

It seems the fix for #72 was not entirely successful. The bitset is not properly shifted to the right. See this patch:

diff --git a/src/tables.c b/src/tables.c
index b04a496e..60e3ec93 100644
--- a/src/tables.c
+++ b/src/tables.c
@@ -186,8 +186,8 @@ pos_set_set (int pos)
       bitset_resize (pos_set, new_size);
       // Shift all the bits by DELTA.
       // FIXME: add bitset_assign, and bitset_shift?
-      for (int i = new_size - 1; delta <= i ; --i)
-        if (bitset_test (pos_set, i))
+      for (int i = old_size - 1; i >= -delta ; --i)
+        if (i >= 0 && bitset_test (pos_set, i))
           bitset_set (pos_set, i + delta);
         else
           bitset_reset (pos_set, i + delta);

With this patch in place, the generated grammars work for me.
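
The intent of the patched loop can be sketched in isolation, with a plain bool array standing in for the bitset. This is not the code in src/tables.c; the function name shift_bits_up is hypothetical. The patch iterates over source indices i and writes to i + delta, while the sketch iterates over destination indices, which visits the same positions in the same order.

```c
#include <stdbool.h>

/* Move the first old_size flags of `bits` towards higher indices by
   `delta` and clear the `delta` vacated low positions.  `bits` must
   have room for old_size + delta entries.  */
void
shift_bits_up (bool *bits, int old_size, int delta)
{
  /* Walk destinations from the top down so that no source bit is
     overwritten before it has been read.  */
  for (int dst = old_size + delta - 1; dst >= 0; --dst)
    {
      int src = dst - delta;
      bits[dst] = (src >= 0) ? bits[src] : false;
    }
}
```

By contrast, the loop removed by the patch starts reading at new_size - 1 and writes at i + delta, which is beyond the resized set.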

Incorrect D code generated when enabling locations.

Hello,

When generating a parser for the following file with location support:

%language "D"
%define api.parser.class {Parser}
%locations

/* Types the parser handles for return and input */
%union {
    AST           tast;
    Declaration[] tdecls;
    Declaration   tdecl;
    Statement[]   tstmts;
    Statement     tstmt;
    Type          ttype;
    string        str;
    char          chr;
    long          lng;
    double        dbl;
}

/* Single-char token declarations */
%token SEMICOLON    ";"
       PERIOD       "."
       OPENPAREN    "("
       CLOSEPAREN   ")"
       OPENBRACE    "{"
       CLOSEBRACE   "}"
       OPENBRACKET  "["
       CLOSEBRACKET "]"
       PLUS         "+"
       MINUS        "-"
       ASTERISK     "*"
       SLASH        "/"

/* Keywords */
%token RETURN "return"
       VOID   "void"
       INT    "int"
       FLOAT  "float"
       CHAR   "char"
       STRING "string"

/* Values */
%token <str> IDENTIFIER STRINGVALUE
%token <chr> CHARACTER
%token <lng> INTEGER
%token <dbl> FLOATVALUE

%%
ast : /* Empty*/
    | declarations
    ;

declarations : declaration
             | declaration declarations
             ;

declaration : type IDENTIFIER "(" ")" "{" statements "}"
            ;

statements : statement
           | statement statements
           ;

statement : "return" ";"
          ;

type : "void"
     ;
%%

struct AST {}
class Declaration {}
class Statement   {}
final class Function : Declaration {}
final class Return : Statement {}
enum Type {}

Compiling the generated code yields the following error, which does not happen when not enabling locations:

cdc.p/parser.d(764): Error: function `parser.Parser.YYStack.push(int state, YYSemanticType value, ref YYLocation loc)` is not callable using argument types `(int, immutable(YYSemanticType), YYLocation)`
cdc.p/parser.d(764):        cannot pass argument `yy_semantic_null` of type `immutable(YYSemanticType)` to parameter `YYSemanticType value`
cdc.p/parser.d(765): Error: function `parser.Parser.YYStack.push(int state, YYSemanticType value, ref YYLocation loc)` is not callable using argument types `(int, immutable(YYSemanticType), YYLocation)`
cdc.p/parser.d(765):        cannot pass argument `yy_semantic_null` of type `immutable(YYSemanticType)` to parameter `YYSemanticType value`

using ldc2.

Removing the user-defined classes from the union seems to fix it, but this cannot be done in the original code as they serve a purpose there.

What is at fault here? What could be a fix for this issue? Thanks in advance.

lalr1.cc and increased usage of noexcept

lalr1.cc contains the following code fragment:

class context
{
public:
context (const ]b4_parser_class[& yyparser, const symbol_type& yyla);
const symbol_type& lookahead () const { return yyla_; }
symbol_kind_type token () const { return yyla_.kind (); }]b4_locations_if([[
const location_type& location () const { return yyla_.location; }

lookahead(), token(), and location() should be tagged with YY_NOEXCEPT.

Note that kind() is already tagged with YY_NOEXCEPT as shown below.

/// The symbol kind (corresponding to \a state).
/// \a ]b4_symbol(-2, kind)[ when empty.
symbol_kind_type kind () const YY_NOEXCEPT;

Generate location.hh when api.location.type is provided

Hi, I'd like to extend the default location class, but when %define api.location.type { MyLocation } is provided, location.hh is not generated.

Is it possible to add some option for such case?

#include <location.hh> 
struct MyLocation : yy_cl::location
{
    int argument;
}; 

How to remove entries from the token list at runtime / from being passed to yysyntax_error?

I have a rather complex grammar which handles different "dialects".
To support that there is an internal list of reserved words for each dialect and their matching tokens, which is used by the scanner to distinguish between "this literal means either TOKEN1, TOKEN2 or the token WORD" and the grammar uses all possible tokens.

This works quite well in general, until the parser is given input that does not match the "dialect".

Example: TOKEN2 is "disabled" in a given dialect, in this case the scanner returns the token WORD and the user gets an error message like

"unexpected WORD expecting TOKEN1 or TOKEN2"

This is of course confusing because "TOKEN2" is what is actually given as input (the scanner just passes it as a different internal token).

I see two possible options to handle this:

  1. Improve the parser's error handling and performance by dropping any "disabled" tokens / "clear" them from the internal parser list to never let it be an expected token any more (I have no idea if this is possible or could be achieved, but that seems to be the ideal solution)
  2. get into the construction of the diagnostic and remove the "disabled" tokens from the list, before bison produces the message (= also not consider it in the amount of tokens available - too much will disable the verbosity), this is possibly the easiest option but I'm not sure how to actually do this - the definition of yyerror does not help as that is too late as the error message is already constructed
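
The second option can be sketched as a small filter over the list of expected tokens, applied before the message is built. Everything in the block is hypothetical: the token codes, the enabled table and the function filter_expected. In Bison 3.6 and later, %define parse.error custom lets the grammar supply its own yyreport_syntax_error, which can obtain the expected tokens with yypcontext_expected_tokens; that would be one place to apply such a filter, assuming a recent enough Bison is available.

```c
#include <stdbool.h>

/* Hypothetical token codes; a real grammar would use the symbol kinds
   generated by Bison.  */
enum { TOK_WORD = 1, TOK_TOKEN1 = 2, TOK_TOKEN2 = 3, TOK_COUNT = 4 };

/* Copy into `out` the entries of `expected` whose code is enabled in
   the current dialect, preserving order.  `enabled` is indexed by token
   code; `out` must have room for n entries.  Returns the number of
   entries kept.  */
int
filter_expected (const int *expected, int n, const bool *enabled,
                 int *out)
{
  int kept = 0;
  for (int i = 0; i < n; ++i)
    if (enabled[expected[i]])
      out[kept++] = expected[i];
  return kept;
}
```

With TOKEN2 disabled, an expected list of TOKEN1 and TOKEN2 shrinks to TOKEN1 alone, so the message no longer names a token the dialect does not accept.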

Where is the INSTALL file?

Hello,
I want to cross-compile from the tarball, but I cannot find the "INSTALL" file. What should I do?

Escaping dollar sign in a rule action

Hi!
Is there a way to pass the dollar sign from the grammar.y file to the parser output file without modification?
I want to do something like this:

grammar.y

%%
rule { \$this->var = $1; }
...
%%

parser

...
switch (yyn) {
    case 2: { $this->var = yystack.valueAt(0)); }
break;
...

I spent several hours on this and I think it's impossible.
Can you help me?

make: *** [Makefile:7745: src/bison-scan-code-c.o] Error 1

I am trying to install bison and when I run make I get this:
GPERF lib/iconv_open-aix.h
GPERF lib/iconv_open-hpux.h
GPERF lib/iconv_open-irix.h
GPERF lib/iconv_open-osf.h
GPERF lib/iconv_open-solaris.h
GPERF lib/iconv_open-zos.h
GEN lib/inttypes.h
GEN lib/textstyle.h
GEN lib/limits.h
GEN lib/locale.h
GEN lib/math.h
GEN lib/sched.h
GEN lib/signal.h
GEN lib/spawn.h
GEN lib/stdint.h
GEN lib/stdio.h
GEN lib/stdlib.h
GEN lib/string.h
GEN lib/sys/ioctl.h
GEN lib/sys/resource.h
GEN lib/sys/time.h
GEN lib/sys/times.h
GEN lib/sys/types.h
GEN lib/sys/wait.h
GEN lib/termios.h
GEN lib/time.h
GEN lib/unistd.h
GEN lib/unistr.h
GEN lib/unitypes.h
GEN lib/uniwidth.h
GEN lib/wchar.h
GEN lib/wctype.h
LEX src/scan-code.c
LEX src/scan-gram.c
LEX src/scan-skel.c
echo 3.7.1.23-6e1d8 > .version-t && mv .version-t .version
CC src/bison-AnnotationList.o
CC src/bison-InadequacyList.o
CC src/bison-Sbitset.o
CC src/bison-assoc.o
CC src/bison-closure.o
CC src/bison-complain.o
CC src/bison-conflicts.o
CC src/bison-counterexample.o
CC src/bison-derivation.o
CC src/bison-derives.o
CC src/bison-files.o
CC src/bison-fixits.o
CC src/bison-getargs.o
CC src/bison-glyphs.o
CC src/bison-gram.o
CC src/bison-graphviz.o
CC src/bison-ielr.o
CC src/bison-lalr.o
CC src/bison-location.o
CC src/bison-lr0.o
CC src/bison-lssi.o
CC src/bison-main.o
CC src/bison-muscle-tab.o
CC src/bison-named-ref.o
CC src/bison-nullable.o
CC src/bison-output.o
CC src/bison-parse-gram.o
CC src/bison-parse-simulation.o
CC src/bison-print-graph.o
CC src/bison-print-xml.o
CC src/bison-print.o
CC src/bison-reader.o
CC src/bison-reduce.o
CC src/bison-relation.o
CC src/bison-scan-code-c.o
src/scan-code-c.c:3:10: fatal error: src/scan-code.c: No such file or directory
#include "src/scan-code.c"
^~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:7745: src/bison-scan-code-c.o] Error 1

[NonFatal Error]: use-of-uninitialized-value in bison (version 3.8.2.45, commit 25b3d0e1)

Crash Inputs

Here are the files that trigger the bug - muscle-tab.c_186_3-in-muscle_grow.zip

Bug Description

I built bison with MSan (MemorySanitizer) to check for errors, and it reported the following.

MemorySanitizer: use-of-uninitialized-value
    #0 0x54f335 in muscle_grow /data/code/bison/src/muscle-tab.c:186:3
    #1 0x54e4c4 in muscle_syncline_grow /data/code/bison/src/muscle-tab.c:214:3
    #2 0x54c815 in muscle_code_grow /data/code/bison/src/muscle-tab.c:227:3
    #3 0x5c4783 in gram_parse /data/code/bison/src/parse-gram.c:2082:7
    #4 0x6074a5 in reader /data/code/bison/src/reader.c:766:3
    #5 0x54a754 in main /data/code/bison/src/main.c:118:3
    #6 0x7f62fc25a082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
    #7 0x41d70d in _start (/data/program/bison/orig-msan/bin/bison+0x41d70d)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /data/code/bison/src/muscle-tab.c:186:3 in muscle_grow

How to Reproduce

The bug reproduces reliably in version 3.8.2.45 (commit id 25b3d0e).

  1. Download the bison source code from the official link.
  2. Using clang/clang++ (10.0.0-4ubuntu1), build bison with MSan.
    • -U_FORTIFY_SOURCE -fsanitize=memory -g
  3. Execute bison with the provided input files.
    • e.g.: /data/program/bison/orig-msan/bin/bison <input-file-path>

FIXME - java

Hello!
I am currently working on a small project, which utilises Bison for Kotlin (KMP/KMM).
So far I have created a Gradle task and checked the Java source for obstacles that make the Java2Kotlin converter do funny things, and I have started to eliminate them.
However, while doing so I stumbled over this FIXME, which unfortunately does not provide any further information or point to a test case that it breaks. It would therefore be lovely if you could point me to additional information or give me a hint about what is wrong there, so that I may be able to fix it. Of course, if I am able to fix it, I will gladly contribute the fix to Bison.
In the long run, direct Kotlin support would be great, but my knowledge of M4 is for now too limited to be of any help in making something similar to this project.

yynerrs unused-but-set-variable warning with Clang 15

When building bison-generated parsers using Clang 15, a -Wunused-but-set-variable warning is thrown for the usually unused yynerrs variable:

/home/nikic/php/php-src-fast/Zend/zend_language_parser.c:4487:9: error: variable 'zendnerrs' set but not used [-Werror,-Wunused-but-set-variable]
    int yynerrs;
        ^
/home/nikic/php/php-src-fast/Zend/zend_language_parser.c:93:25: note: expanded from macro 'yynerrs'
#define yynerrs         zendnerrs

Probably the declaration needs to be annotated with YY_ATTRIBUTE_UNUSED.

bison 3.6.1 generates unexpected nested comments, but 3.5.4 has no problem

This code works with bison 3.5.4 without problems:
https://github.com/verilator/verilator/blob/master/src/verilog.y#L655-L673

But after upgrading to bison 3.6.1, it breaks.

The original code:

%token<fl>		yVL_CLOCK		"/*verilator sc_clock*/"
%token<fl>		yVL_CLOCKER		"/*verilator clocker*/"
%token<fl>		yVL_NO_CLOCKER		"/*verilator no_clocker*/"
%token<fl>		yVL_CLOCK_ENABLE	"/*verilator clock_enable*/"
%token<fl>		yVL_COVERAGE_BLOCK_OFF	"/*verilator coverage_block_off*/"
%token<fl>		yVL_FULL_CASE		"/*verilator full_case*/"
%token<fl>		yVL_INLINE_MODULE	"/*verilator inline_module*/"
%token<fl>		yVL_ISOLATE_ASSIGNMENTS	"/*verilator isolate_assignments*/"
%token<fl>		yVL_NO_INLINE_MODULE	"/*verilator no_inline_module*/"
%token<fl>		yVL_NO_INLINE_TASK	"/*verilator no_inline_task*/"
%token<fl>		yVL_SC_BV		"/*verilator sc_bv*/"
%token<fl>		yVL_SFORMAT		"/*verilator sformat*/"
%token<fl>		yVL_PARALLEL_CASE	"/*verilator parallel_case*/"
%token<fl>		yVL_PUBLIC		"/*verilator public*/"
%token<fl>		yVL_PUBLIC_FLAT		"/*verilator public_flat*/"
%token<fl>		yVL_PUBLIC_FLAT_RD	"/*verilator public_flat_rd*/"
%token<fl>		yVL_PUBLIC_FLAT_RW	"/*verilator public_flat_rw*/"
%token<fl>		yVL_PUBLIC_MODULE	"/*verilator public_module*/"
%token<fl>		yVL_SPLIT_VAR		"/*verilator split_var*/"

The generated code, which fails to compile:

    yVL_CLOCK = 610,               /* "/*verilator sc_clock*/"  */
    yVL_CLOCKER = 611,             /* "/*verilator clocker*/"  */
    yVL_NO_CLOCKER = 612,          /* "/*verilator no_clocker*/"  */
    yVL_CLOCK_ENABLE = 613,        /* "/*verilator clock_enable*/"  */
    yVL_COVERAGE_BLOCK_OFF = 614,  /* "/*verilator coverage_block_off*/"  */
    yVL_FULL_CASE = 615,           /* "/*verilator full_case*/"  */
    yVL_INLINE_MODULE = 616,       /* "/*verilator inline_module*/"  */
    yVL_ISOLATE_ASSIGNMENTS = 617, /* "/*verilator isolate_assignments*/"  */
    yVL_NO_INLINE_MODULE = 618,    /* "/*verilator no_inline_module*/"  */
    yVL_NO_INLINE_TASK = 619,      /* "/*verilator no_inline_task*/"  */
    yVL_SC_BV = 620,               /* "/*verilator sc_bv*/"  */
    yVL_SFORMAT = 621,             /* "/*verilator sformat*/"  */
    yVL_PARALLEL_CASE = 622,       /* "/*verilator parallel_case*/"  */
    yVL_PUBLIC = 623,              /* "/*verilator public*/"  */
    yVL_PUBLIC_FLAT = 624,         /* "/*verilator public_flat*/"  */
    yVL_PUBLIC_FLAT_RD = 625,      /* "/*verilator public_flat_rd*/"  */
    yVL_PUBLIC_FLAT_RW = 626,      /* "/*verilator public_flat_rw*/"  */
    yVL_PUBLIC_MODULE = 627,       /* "/*verilator public_module*/"  */
    yVL_SPLIT_VAR = 628,           /* "/*verilator split_var*/"  */

Is there an option to not generate these comments (to stay compatible with older versions)? This may be a breaking change for a lot of software.

verilator/verilator#2320
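
One way a generator could avoid emitting nested comments is to break up any `/*` and `*/` sequences in a token's string alias before copying it into a C comment. The sketch below is in Python and is only an illustration: the helper names `sanitize_for_c_comment` and `enum_line`, and the choice of inserting a space, are assumptions made here, not Bison's actual behavior or fix.

```python
def sanitize_for_c_comment(text: str) -> str:
    """Return text with C comment delimiters broken up by a space.

    Hypothetical helper for illustration; not taken from Bison.
    """
    # Break "/*" first, then "*/". Inserting a space never creates a new
    # delimiter, so one pass of each replacement is enough.
    return text.replace("/*", "/ *").replace("*/", "* /")


def enum_line(name: str, value: int, alias: str) -> str:
    """Format one enumerator with its (sanitized) alias in a trailing comment."""
    return f'    {name} = {value}, /* "{sanitize_for_c_comment(alias)}"  */'


print(enum_line("yVL_CLOCK", 610, "/*verilator sc_clock*/"))
```

With the first token from the report, this prints `    yVL_CLOCK = 610, /* "/ *verilator sc_clock* /"  */`, which contains exactly one comment opener and one closer.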

c++.m4 and increased usage of noexcept

c++.m4 contains the following code fragments

/// Destroy contents, and record that is empty. (around line 341)
void clear ()

clear() should be tagged as YY_NOEXCEPT.

Each of the destroy calls in the switch are already marked YY_NOEXCEPT.

/// Record that this symbol is empty. (around line 424)
void clear ();

clear() should be tagged as YY_NOEXCEPT.

]b4_inline([$1])[void (around line 545)
]b4_parser_class[::by_kind::clear ()
{
kind_ = ]b4_symbol(-2, kind)[;
}

clear() should be tagged as YY_NOEXCEPT to match declaration.

[BUG] abitset_set is reachable by crafted input, which causes the program to abort

short summary

Hello, I was testing my fuzzer and found that the function abitset_set in lib/bitset/array.c:92 can be reached when bison parses a crafted input. As the comment in the code indicates, it should not be reached. I'm not sure whether it's a bug or just error handling; please ignore this if it's expected behavior.

Steps to reproduce

CC="gcc -fsanitize=address -g " CXX="g++ -fsanitize=address -g" ./autogen.sh && ./configure --disable-shared && make -j$(nproc)
./src/bison $POC

Environment

Ubuntu 22.04 (docker image)
gcc 11.2.0
bison latest commit 6376364

Output / gdb log

poc1:40.30: warning: stray '$' [-Wother]
   40 | %printer { fprintf (y_o, "%g"$ E$); } <double>;
      |                              ^
...
      |                                       ^
poc1:98.24: warning: empty rule without %empty [-Wempty-rule]
   98 |   {            } YYEOF:
      |                        ^
      |                        %empty

Program received signal SIGABRT, Aborted.

[----------------------------------registers-----------------------------------]
RAX: 0x0
RBX: 0x7ffff72aac00 (0x00007ffff72aac00)
RCX: 0x7ffff7440828 (<__GI___pthread_kill+248>: mov    r13d,eax)
RDX: 0xffffffe6 --> 0x0
RSI: 0x6
RDI: 0x257de6
RBP: 0x6
RSP: 0x7fffffffda60 --> 0xffffffffb4e --> 0x0
RIP: 0x7ffff7440828 (<__GI___pthread_kill+248>: mov    r13d,eax)
R8 : 0x7fffffffdb30 --> 0x20 (' ')
R9 : 0x0
R10: 0x8
R11: 0x246
R12: 0x5555556cbc60 --> 0xb ('\x0b')
R13: 0x16
R14: 0x1e
R15: 0x5555556e8ca0 --> 0x29 (')')
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7ffff744081f <__GI___pthread_kill+239>:    mov    edi,eax
   0x7ffff7440821 <__GI___pthread_kill+241>:    mov    eax,0x3e
   0x7ffff7440826 <__GI___pthread_kill+246>:    syscall
=> 0x7ffff7440828 <__GI___pthread_kill+248>:    mov    r13d,eax
   0x7ffff744082b <__GI___pthread_kill+251>:    neg    r13d
   0x7ffff744082e <__GI___pthread_kill+254>:    cmp    eax,0xfffff000
   0x7ffff7440833 <__GI___pthread_kill+259>:    mov    eax,0x0
   0x7ffff7440838 <__GI___pthread_kill+264>:    cmovbe r13d,eax
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffda60 --> 0xffffffffb4e --> 0x0
0008| 0x7fffffffda68 --> 0x5555555c799d (<location_caret_suggestion+557>:       jmp    0x5555555c793c <location_caret_suggestion+460>)
0016| 0x7fffffffda70 --> 0x41b58ab3
0024| 0x7fffffffda78 --> 0x5555556a8a00 ("4 32 24 7 now:175 96 24 7 now:126 160 144 8 self:105 368 144 8 chld:107")
0032| 0x7fffffffda80 --> 0x555555667330 (<timevar_push>:        endbr64)
0040| 0x7fffffffda88 --> 0x5555556a8a48 ("4 32 24 7 now:214 96 24 7 now:126 160 144 8 self:105 368 144 8 chld:107")
0048| 0x7fffffffda90 --> 0x555555667970 (<timevar_pop>: endbr64)
0056| 0x7fffffffda98 --> 0x606000004128 --> 0x602000001090 --> 0x606000004160 --> 0x602000000250 --> 0x31636f70 ('poc1')
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
__pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7ffff72aac00) at pthread_kill.c:44
44      pthread_kill.c: No such file or directory.
gdb-peda$ bt
#0  __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7ffff72aac00) at pthread_kill.c:44
#1  __pthread_kill_internal (signo=0x6, threadid=0x7ffff72aac00) at pthread_kill.c:80
#2  __GI___pthread_kill (threadid=0x7ffff72aac00, signo=signo@entry=0x6) at pthread_kill.c:91
#3  0x00007ffff73ec476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff73d27b7 in __GI_abort () at abort.c:79
#5  0x000055555558a638 in abitset_set (dst=<optimized out>, bitno=<optimized out>) at lib/bitset/array.c:92
#6  0x00005555556013ca in bitset_set (bitno=<optimized out>, bset=0x6040000009d0) at ./lib/bitset.h:146
#7  bitset_set (bitno=<optimized out>, bset=0x6040000009d0) at ./lib/bitset.h:138
#8  useless_nonterminals () at src/reduce.c:121
#9  reduce_grammar () at src/reduce.c:377
#10 0x000055555558b9ad in main (argc=argc@entry=0x2, argv=argv@entry=0x7fffffffde58) at src/main.c:126
#11 0x00007ffff73d3fd0 in __libc_start_call_main (main=main@entry=0x55555558b780 <main>, argc=argc@entry=0x2, argv=argv@entry=0x7fffffffde58)
    at ../sysdeps/nptl/libc_start_call_main.h:58
#12 0x00007ffff73d407d in __libc_start_main_impl (main=0x55555558b780 <main>, argc=0x2, argv=0x7fffffffde58, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffde48) at ../csu/libc-start.c:409
#13 0x000055555558d0c5 in _start ()

It seems that the function abitset_set should never be reached. I'm not sure whether it's a bug or just error handling.

/* Set bit BITNO in bitset DST.  */
static void
abitset_set (MAYBE_UNUSED bitset dst, MAYBE_UNUSED bitset_bindex bitno)
{
  /* This should never occur for abitsets since we should always hit
     the cache.  It is likely someone is trying to access outside the
     bounds of the bitset.  */
  abort ();
}

POC

poc1.zip

Credit

Han Zheng (NCNIPC of China, Hexhive)

Bison crash on counterexamples report

This bug was originally reported on the bug-bison mailing list by Michal Bartkowiak on 5 Jan 2021 - https://lists.gnu.org/archive/html/bug-bison/2021-01/msg00000.html

I've investigated it and am opening an issue here for ease of tracking - many thanks Akim btw for creating this github mirror!

Here's what happens with the grammar file parser.yc.gz provided by Michal:

$ bison --report=counterexamples parser.yc
parser.yc: warning: 2 shift/reduce conflicts [-Wconflicts-sr]
parser.yc: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
parser.yc: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
Segmentation fault (core dumped)

This is bison built from master

$ bison --version
bison (GNU Bison) 3.7.4.284-fb14

Here's the stack at the crash:

Thread 1 "bison" received signal SIGSEGV, Segmentation fault.
0x000000010041cfc0 in eligible_state_items (target=0x800061460) at lssi.c:141
141           BITSET_FOR_EACH (biter, rsi, sin, 0)
#0  eligible_state_items (target=0x800061460) at lssi.c:141
#1  0x000000010041d087 in shortest_path_from_start (target=28, next_sym=8) at lssi.c:156
#2  0x000000010040bf6d in counterexample_report (itm1=28, itm2=29, next_sym=8, shift_reduce=false, out=0x800050318,
    prefix=0x1004fe958 <__func__.0+384> "    ") at counterexample.c:1255
#3  0x000000010040c7d4 in counterexample_report_reduce_reduce (Reading in symbols for print.c...
itm1=28, itm2=29, conflict_syms=0x800066b80, out=0x800050318, prefix=0x1004fe958 <__func__.0+384> "    ")
    at counterexample.c:1350
#4  0x000000010040cb3c in counterexample_report_state (s=0x80005ea00, out=0x800050318,
    prefix=0x1004fe958 <__func__.0+384> "    ") at counterexample.c:1394
#5  0x000000010043ee02 in print_state (Reading in symbols for main.c...
out=0x800050318, s=0x80005ea00) at print.c:366
#6  0x000000010043f42d in print_results () at print.c:473
#7  0x000000010041e0fc in main (argc=3, argv=0xffffcc10) at main.c:179

This is the source in lssi.c

(gdb) l
136           bitset_set (result, si - state_items);
137           // search all reverse edges.
138           bitset rsi = si->revs;
139           bitset_iterator biter;
140           state_item_number sin;
141           BITSET_FOR_EACH (biter, rsi, sin, 0)
142             gl_list_add_last (queue, &state_items[sin]);

The reason is that the bitset rsi passed to BITSET_FOR_EACH comes from a disabled state_item si, meaning the bitset was previously freed and has garbage vtable pointers, in particular the list function pointer, which is accessed by gl_list_add_last, causing the segfault.

This is the disabled state_item

(gdb) p si.trans
$2 = -2
(gdb) p *si.item
$3 = 15
(gdb) p *si.state
$4 = {
  number = 13,
  accessing_symbol = 4,
  consistent = false,
  solved_conflicts = 0x0,
  solved_conflicts_xml = 0x0,
  nitems = 2,
  items = {22}
}

The state_item was disabled and the bitset was freed by prune_backward in state-item.c:

#0  disable_state_item (si=0x8000615e0) at state-item.c:381
#1  0x000000010045a726 in prune_backward (si=0x800061520) at state-item.c:453
#2  0x000000010045a7c8 in prune_disabled_paths () at state-item.c:471
#3  0x000000010045adf0 in state_items_init () at state-item.c:557
#4  0x000000010040bf00 in counterexample_init () at counterexample.c:1229
#5  0x000000010041e07c in main (argc=3, argv=0xffffcc10) at main.c:154

(gdb) p *si.state
$9 = {
  number = 13,
  accessing_symbol = 4,
  consistent = false,
  solved_conflicts = 0x0,
  solved_conflicts_xml = 0x0,
  nitems = 2,
  items = {22}
}

(gdb) p *si.item
$7 = 15

Clearly the loop in eligible_state_items should not be accessing a disabled state_item, but I don't know enough to say whether the loop should avoid adding a disabled state_item to the queue in the first place.
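
Both options (skipping a disabled item when it is dequeued, or never enqueuing it) amount to a guard in the breadth-first search over reverse edges. The following is a minimal Python model of that idea, not Bison's C code: `revs`, `disabled` and `eligible_items` are hypothetical stand-ins for the state-item array, the pruned set and `eligible_state_items`.

```python
from collections import deque


def eligible_items(target, revs, disabled):
    """Collect every live item that can reach `target` via reverse edges.

    `revs[i]` lists the predecessors of item i; `disabled` is the set of
    pruned items. Both are hypothetical stand-ins for Bison's data.
    """
    result = set()
    queue = deque([target])
    while queue:
        item = queue.popleft()
        if item in result or item in disabled:
            continue  # already visited, or pruned: never read its edges
        result.add(item)
        # Only enqueue live predecessors, so disabled items never enter the queue.
        queue.extend(p for p in revs.get(item, ()) if p not in disabled)
    return result
```

For example, with `revs = {3: [1, 2], 2: [0], 1: [0], 0: []}` and `disabled = {2}`, the search from item 3 visits items 3, 1 and 0 and never touches the edges of item 2.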

I have some comments on the grammar below

Parallel build issue

We recently increased the number of build jobs used by our CI runner to 8 and observed the following build errors with bison-3.7.1.

On a musl based system (x86_64-pc-linux-musl):

x86_64-pc-linux-musl-cc -DEXEEXT=\"\"   -I. -I./lib -I. -I./lib -DDEFAULT_TEXT_DOMAIN=\"bison-gnulib\" -march=x86-64 -mtune=generic -pipe -O2 -I/usr/x86_64-pc-linux-musl/include  -march=x86-64 -mtune=generic -pipe -O2 -c -o lib/libbison_a-careadlinkat.o `test -f 'lib/careadlinkat.c' || echo './'`lib/careadlinkat.c
In file included from lib/careadlinkat.h:24,
                 from lib/careadlinkat.c:23:
./lib/unistd.h:632:11: fatal error: getopt-cdefs.h: No such file or directory
  632 | # include <getopt-cdefs.h>
      |           ^~~~~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:4943: lib/libbison_a-careadlinkat.o] Error 1

It happens far less often on a glibc-based system (x86_64-pc-linux-gnu), but I saw this error:

x86_64-pc-linux-gnu-cc -DEXEEXT=\"\"   -I. -I./lib -I. -I./lib -DDEFAULT_TEXT_DOMAIN=\"bison-gnulib\" -march=x86-64 -mtune=generic -pipe -O2 -I/usr/x86_64-pc-linux-gnu/include  -march=x86-64 -mtune=generic -pipe -O2 -c -o lib/libbison_a-pipe2.o `test -f 'lib/pipe2.c' || echo './'`lib/pipe2.c
In file included from lib/pipe2.c:25:
lib/binary-io.h: In function '__gl_setmode':
lib/binary-io.h:52:10: error: 'O_BINARY' undeclared (first use in this function)
   52 |   return O_BINARY;
      |          ^~~~~~~~
lib/binary-io.h:52:10: note: each undeclared identifier is reported only once for each function it appears in
lib/pipe2.c: In function 'rpl_pipe2':
lib/pipe2.c:70:43: error: 'O_BINARY' undeclared (first use in this function); did you mean 'SET_BINARY'?
   70 |   if ((flags & ~(O_CLOEXEC | O_NONBLOCK | O_BINARY | O_TEXT)) != 0)
      |                                           ^~~~~~~~
      |                                           SET_BINARY
lib/pipe2.c:70:54: error: 'O_TEXT' undeclared (first use in this function); did you mean 'F_TEST'?
   70 |   if ((flags & ~(O_CLOEXEC | O_NONBLOCK | O_BINARY | O_TEXT)) != 0)
      |                                                      ^~~~~~
      |                                                      F_TEST
make: *** [Makefile:5391: lib/libbison_a-pipe2.o] Error 1

I found a Gentoo bug describing the same problem: https://bugs.gentoo.org/713556

make is version 4.3

CVE found in releases prior to 3.5.4

Apologies if this is not the right platform to ask this.

As you might be aware a CVE in bison has been found in releases prior to 3.5.4.

GNU Bison before 3.5.4 allows attackers to cause a denial of service (application crash).

Is this vulnerability carried over into the applications/binaries that we build using Bison? (This is not clear from the vulnerability description.)

If yes, any idea where I can get a 3.5.4+ executable for Windows?

Thanks!

Some counterexamples seem to be repeated for -Wcounterexamples

bison 3.7.5 seems to print some counterexamples twice with -Wcounterexamples.

The output below is for a grammar provided by Christoph Grüninger on the help-bison mailing list, asking for help with resolving the conflicts (https://lists.gnu.org/archive/html/help-bison/2021-02/msg00000.html). The grammar file cmDependsJavaParser.y is at https://lists.gnu.org/archive/html/help-bison/2021-02/txtD1gzy_c0wb.txt

I've numbered each counterexample. You'll notice there are 6 counterexamples for 4 conflicts, and counterexamples 2 and 5 seem to be the same. Similarly, 3 and 6 seem to be duplicates.

In fact, the duplicate counterexamples are for different parser states, but this is apparent only when the diagnostics report file is generated with --report=counterexamples --report-file=report.txt. The report shows each conflict and counterexample by parser state. Without the report, the output can be confusing.

Could parser states be referenced in the output for -Wcounterexamples?

$ bison -Wcounterexamples cmDependsJavaParser.y

cmDependsJavaParser.y: warning: 4 shift/reduce conflicts [-Wconflicts-sr]

1. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_SEMICOL [-Wcounterexamples]
Example: ClassBodyDeclarations MethodHeader MethodBody • jp_SEMICOL
Shift derivation
ClassBodyDeclarations
↳ 79: ClassBodyDeclarations ClassBodyDeclaration
                            ↳ 80: ClassMemberDeclaration
                                  ↳ 85: MethodDeclaration
                                        ↳ 97: MethodHeader MethodBody • jp_SEMICOL
Example: ClassBodyDeclarations MethodHeader MethodBody • jp_SEMICOL
Reduce derivation
ClassBodyDeclarations
↳ 79: ClassBodyDeclarations                                                   ClassBodyDeclaration
      ↳ 79: ClassBodyDeclarations ClassBodyDeclaration                        ↳ 83: TypeDeclaration
                                  ↳ 80: ClassMemberDeclaration                      ↳ 52: jp_SEMICOL
                                        ↳ 85: MethodDeclaration
                                              ↳ 96: MethodHeader MethodBody •

2. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_DOT [-Wcounterexamples]
Example: jp_THIS • jp_DOT Identifier
Shift derivation
FieldAccess
↳ 268: jp_THIS • jp_DOT Identifier
Example: jp_THIS • jp_DOT Identifier
Reduce derivation
FieldAccess
↳ 266: Primary                  jp_DOT Identifier
       ↳ 239: PrimaryNoNewArray
              ↳ 242: jp_THIS •

3. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_DOT [-Wcounterexamples]
Example: jp_THIS • jp_DOT Identifier jp_PARESTART jp_PAREEND
Shift derivation
MethodInvocation
↳ 273: jp_THIS • jp_DOT Identifier jp_PARESTART ArgumentListopt jp_PAREEND
                                                ↳ 273: ε
Example: jp_THIS • jp_DOT Identifier jp_PARESTART jp_PAREEND
Reduce derivation
MethodInvocation
↳ 271: Primary                  jp_DOT Identifier jp_PARESTART ArgumentListopt jp_PAREEND
       ↳ 239: PrimaryNoNewArray                                ↳ 271: ε
              ↳ 242: jp_THIS •

4. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_SEMICOL [-Wcounterexamples]
Example: ClassBodyDeclarations Modifiersopt ConstructorDeclarator Throwsopt ConstructorBody • jp_SEMICOL
Shift derivation
ClassBodyDeclarations
↳ 79: ClassBodyDeclarations ClassBodyDeclaration
                            ↳ 82: ConstructorDeclaration
                                  ↳ 115: Modifiersopt ConstructorDeclarator Throwsopt ConstructorBody • jp_SEMICOL
Example: ClassBodyDeclarations Modifiersopt ConstructorDeclarator Throwsopt ConstructorBody • jp_SEMICOL
Reduce derivation
ClassBodyDeclarations
↳ 79: ClassBodyDeclarations                                                                                   ClassBodyDeclaration
      ↳ 79: ClassBodyDeclarations ClassBodyDeclaration                                                        ↳ 83: TypeDeclaration
                                  ↳ 82: ConstructorDeclaration                                                      ↳ 52: jp_SEMICOL
                                        ↳ 114: Modifiersopt ConstructorDeclarator Throwsopt ConstructorBody •

5. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_DOT [-Wcounterexamples]
Example: jp_THIS • jp_DOT Identifier
Shift derivation
FieldAccess
↳ 268: jp_THIS • jp_DOT Identifier
Example: jp_THIS • jp_DOT Identifier
Reduce derivation
FieldAccess
↳ 266: Primary                  jp_DOT Identifier
       ↳ 239: PrimaryNoNewArray
              ↳ 242: jp_THIS •

6. cmDependsJavaParser.y: warning: shift/reduce conflict on token jp_DOT [-Wcounterexamples]
Example: jp_THIS • jp_DOT Identifier jp_PARESTART jp_PAREEND
Shift derivation
MethodInvocation
↳ 273: jp_THIS • jp_DOT Identifier jp_PARESTART ArgumentListopt jp_PAREEND
                                                ↳ 273: ε
Example: jp_THIS • jp_DOT Identifier jp_PARESTART jp_PAREEND
Reduce derivation
MethodInvocation
↳ 271: Primary                  jp_DOT Identifier jp_PARESTART ArgumentListopt jp_PAREEND
       ↳ 239: PrimaryNoNewArray                                ↳ 271: ε
              ↳ 242: jp_THIS •

git shallow clone - error: Server does not allow request for unadvertised object - git.sv.gnu.org - git.savannah.gnu.org

A git shallow clone fails with:

error: Server does not allow request for unadvertised object 66fdaea3cfb4e758212c1891913e9a59441d49af

This is slightly annoying, because I have to fetch the whole git history (deep clone).

See also: Allow fetch of specific commit hash from git://git.sv.gnu.org

git clone --depth 1 --recurse-submodules --shallow-submodules https://github.com/akimd/bison

Cloning into 'bison'...
remote: Enumerating objects: 376, done.
remote: Counting objects: 100% (376/376), done.
remote: Compressing objects: 100% (342/342), done.
remote: Total 376 (delta 87), reused 91 (delta 15), pack-reused 0
Receiving objects: 100% (376/376), 1.34 MiB | 2.57 MiB/s, done.
Resolving deltas: 100% (87/87), done.
Submodule 'gnulib' (git://git.savannah.gnu.org/gnulib.git) registered for path 'gnulib'
Submodule 'submodules/autoconf' (git://git.sv.gnu.org/autoconf.git) registered for path 'submodules/autoconf'
Cloning into '/home/user/src/bison/bison/gnulib'...
remote: Counting objects: 10725, done.        
remote: Compressing objects: 100% (9641/9641), done.        
remote: Total 10725 (delta 5376), reused 2358 (delta 1066)        
Receiving objects: 100% (10725/10725), 9.69 MiB | 3.29 MiB/s, done.
Resolving deltas: 100% (5376/5376), done.
Cloning into '/home/user/src/bison/bison/submodules/autoconf'...
remote: Counting objects: 161, done.        
remote: Compressing objects: 100% (155/155), done.        
remote: Total 161 (delta 13), reused 64 (delta 4)
Receiving objects: 100% (161/161), 1.50 MiB | 1.57 MiB/s, done.
Resolving deltas: 100% (13/13), done.
remote: Total 0 (delta 0), reused 0 (delta 0)
remote: Counting objects: 1041, done.
remote: Compressing objects: 100% (986/986), done.
remote: Total 1041 (delta 1027), reused 60 (delta 55)
Receiving objects: 100% (1041/1041), 681.60 KiB | 1.05 MiB/s, done.
Resolving deltas: 100% (1027/1027), completed with 970 local objects.
From git://git.savannah.gnu.org/gnulib
 * branch            71b603702b8cf7977dedd5f6b71ea0ffc1669894 -> FETCH_HEAD
Submodule path 'gnulib': checked out '71b603702b8cf7977dedd5f6b71ea0ffc1669894'
remote: Total 0 (delta 0), reused 0 (delta 0)
error: Server does not allow request for unadvertised object 66fdaea3cfb4e758212c1891913e9a59441d49af
fatal: Fetched in submodule path 'submodules/autoconf', but it did not contain 66fdaea3cfb4e758212c1891913e9a59441d49af. Direct fetching of that commit failed.

git.sv.gnu.org

cd $(mktemp -d)
git init 
git remote add asdf git://git.sv.gnu.org/autoconf.git
git fetch asdf 66fdaea3cfb4e758212c1891913e9a59441d49af 
# error: Server does not allow request for unadvertised object 66fdaea3cfb4e758212c1891913e9a59441d49af

git.savannah.gnu.org

cd $(mktemp -d)
git init 
git remote add asdf git://git.savannah.gnu.org/autoconf.git
git fetch asdf 66fdaea3cfb4e758212c1891913e9a59441d49af 
# error: Server does not allow request for unadvertised object 66fdaea3cfb4e758212c1891913e9a59441d49af

workaround

Use the GitHub mirror https://github.com/autotools-mirror/autoconf

git submodule set-url submodules/autoconf https://github.com/autotools-mirror/autoconf
git submodule update --init --recursive --depth 1

make fails

Output of make:
In file included from src/scan-gram-c.c:3:
src/scan-gram.l: In function 'gram_scanner_open':
src/scan-gram.l:1029:3: error: 'gram__flex_debug' undeclared (first use in this function); did you mean 'gram_flexdebug'?
1029 | gram_debug = trace_flag & trace_parse;
| ^~~~~~~~~~~~~~~~
The error is reproducible.

make error

src/scan-code-c.c:3:10: fatal error: src/scan-code.c: No such file or directory
#include "src/scan-code.c"

Request for latest release

I am facing issue #89 with the latest available release, bison 3.8.2. It seems to be fixed in the current master branch, but the fix is not yet available in a release version. The most recent release is from September 2021. Could you please make a new release? Thank you for your time and consideration.

bison prints shift/reduce conflicts for unambiguous grammars

This grammar fails with shift/reduce conflicts:

%token END

%%

top
  : body ';' END
  ;
body
  : RepeatI
  | RepeatI ';' RepeatD
  ;
RepeatI
  : 'I'
  | 'I' ';' RepeatI
  ;
RepeatD
  : 'D'
  | 'D' ';' RepeatD
  ;

%%

Failure message:

test-parser.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
test-parser.y:10.5-11: warning: rule useless in parser due to conflicts [-Wother]
   10 |   : RepeatI
      |     ^~~~~~~
test-parser.y:14.5-7: warning: rule useless in parser due to conflicts [-Wother]
   14 |   : 'I'
      |     ^~~
test-parser.y:18.5-7: warning: rule useless in parser due to conflicts [-Wother]
   18 |   : 'D'
      |     ^~~

But this grammar has a state machine that unambiguously implements its logic:

State0:
        when 'I' => goto State1

State1:
        when ';' => goto State2

State2:
        when 'I' => goto State1
        when 'D' => goto State3
        when END => goto <FINISH>

State3:
        when ';' => goto State4

State4:
        when 'D' => goto State3
        when END => goto <FINISH>

I am sure it is possible to reformulate the grammar such that the above problems would go away, but there should be no need to do this.

Bison should compile any grammar that unambiguously defines a state machine that can parse it, without issuing unnecessary shift/reduce conflicts.
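
The state machine listed above can be written as a small table-driven recognizer. The sketch below is in Python and only decides whether a token sequence is accepted; it does not perform the reductions (`RepeatI`, `RepeatD`, `body`) that a Bison parser for the grammar would carry out. The names `TRANSITIONS` and `accepts`, and the use of plain strings as tokens, are illustrative choices.

```python
# Transition table copied from the state machine in the issue text.
# Tokens are plain strings; "END" stands for the END token.
TRANSITIONS = {
    ("State0", "I"): "State1",
    ("State1", ";"): "State2",
    ("State2", "I"): "State1",
    ("State2", "D"): "State3",
    ("State2", "END"): "FINISH",
    ("State3", ";"): "State4",
    ("State4", "D"): "State3",
    ("State4", "END"): "FINISH",
}


def accepts(tokens):
    """Return True if the token sequence drives the machine to FINISH."""
    state = "State0"
    for token in tokens:
        if state == "FINISH":
            return False  # input continues after END
        state = TRANSITIONS.get((state, token))
        if state is None:
            return False  # no transition for this token: reject
    return state == "FINISH"
```

For instance, `accepts(["I", ";", "I", ";", "D", ";", "END"])` is true, while a sequence that returns to `'I'` after a `'D'` is rejected.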

Bison should support counterexample generation

The classic problem with parser generators—and especially with LALR parser generators—is that it is often hard to diagnose what went wrong when there are conflicts. We developed some efficient algorithms for generating useful, concise counterexamples in our PLDI 2015 paper, "Finding Counterexamples from Parsing Conflicts". Unfortunately, we implemented it only in the CUP parser generator. It would be an excellent feature for Bison, though.

[BUG] reachable assertion in string_decode, bison

short summary

Hello, I was testing my fuzzer and found a reachable assertion in string_decode, src/muscle-tab.c:317. The assertion can be reached when parsing a crafted file, as shown in the attachment.

Steps to reproduce

CC="gcc -fsanitize=address -g " CXX="g++ -fsanitize=address -g" ./autogen.sh && ./configure --disable-shared && make -j$(nproc)
./src/bison $POC

Environment

  • Ubuntu 22.04 (docker image)
  • gcc 11.2.0
  • bison latest commit 5555f4d

Output

poc0:185.12-186.0: error: missing ‘"’ at end of line
  185 |   TYPENAME "tyuuuuup$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$...
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bison: src/muscle-tab.c:317: string_decode: Assertion `false' failed.
Aborted

Credit

Han Zheng (NCNIPC of China, Hexhive)

POC

poc0.zip

Compilation error: No rule to make target 'textstyle.h'

Hi!
How to reproduce the error:

git clone git@github.com:akimd/bison.git && cd bison
git submodule update --init
docker build -t bison .

Dockerfile

FROM debian:buster-slim

RUN apt-get update && apt-get upgrade -y \
  gcc \
  git \
  wget \
  autopoint \
  gperf \
  make

RUN wget ftp://ftp.gnu.org/gnu/m4/m4-1.4.19.tar.gz
RUN tar -xvzf m4-1.4.19.tar.gz
WORKDIR /m4-1.4.19
RUN ./configure --prefix=/usr/local/m4
RUN make
RUN make install
RUN #cp /usr/local/m4/bin/* /usr/bin
ENV PATH="/usr/local/m4/bin:${PATH}"
WORKDIR /

RUN apt-get remove autoconf -y
RUN wget ftp://ftp.gnu.org/gnu/autoconf/autoconf-2.71.tar.gz
RUN tar -xvzf autoconf-2.71.tar.gz
WORKDIR /autoconf-2.71
RUN ./configure --prefix=/usr/local/autoconf
RUN make
RUN make install
ENV PATH="/usr/local/autoconf/bin:${PATH}"
WORKDIR /

RUN wget ftp://ftp.gnu.org/gnu/automake/automake-1.16.5.tar.gz
RUN tar -xvzf automake-1.16.5.tar.gz
WORKDIR /automake-1.16.5
RUN ./configure --prefix=/usr/local/automake
RUN make
RUN make install
ENV PATH="/usr/local/automake/bin:${PATH}"
WORKDIR /

RUN wget ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.11.tar.gz
RUN tar -xvzf libiconv-1.11.tar.gz
WORKDIR /libiconv-1.11
RUN ./configure --prefix=/usr/local/libiconv
RUN make
RUN make install
WORKDIR /

COPY . /bison

WORKDIR /bison
RUN ./bootstrap
RUN ./configure --prefix=/usr/local/bison --with-libiconv-prefix=/usr/local/libiconv
RUN make
RUN make install

Error:

make: *** No rule to make target 'textstyle.h', needed by 'all'.  Stop.

Can you help me with this?

D skeleton file breaks recent D compilers with example code from manual

The following code uses a definition of reportSyntaxError taken from the Bison manual, §10.2.6. The result compiles with gdc version 10.2.1, but not with more recent versions of gdc, ldc and dmd.exe. (Note that the change to lalr1.d described in issue #84 is needed, too. The version of Bison tested was 3.8.2.12-013d.)

%language "D"
%output "bison_bug.d"
%define parse.error custom
%locations
%define api.value.type union

%token <int> DUMMY SYNTAXERROR 

%code lexer {

import std.stdio : stderr, writeln;

public void reportSyntaxError(YYParser.Context ctx)
{
  stderr.write(ctx.getLocation(), ": syntax error");
  // Report the expected tokens.
  {
    immutable int TOKENMAX = 5;
    YYParser.SymbolKind[] arg = new YYParser.SymbolKind[TOKENMAX];
    int n = ctx.getExpectedTokens(arg, TOKENMAX);
    if (n < TOKENMAX)
      for (int i = 0; i < n; ++i)
        stderr.write((i == 0 ? ": expected " : " or "), arg[i]);
  }
  // Report the unexpected token which triggered the error.
  {
    YYParser.SymbolKind lookahead = ctx.getToken();
    stderr.writeln(" before ", lookahead);
  }
}

public void yyerror(const(YYLocation) loc, string s) {
	stderr.writeln("error: ", s);
}

public Symbol yylex() {
	return Symbol(TokenKind.SYNTAXERROR, YYLocation());
}

}

%%

Stmts   : %empty
        | Stmts Stmt
        ;

Stmt	: DUMMY
	;

%%

int main() {
	auto parser = new YYParser();
	parser.parse();
	
	return 0;
}

Below is the output when compiling the resulting bison_bug.d with dmd.exe v2.098.1:

C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\internal\write.d(166): Error: no property `yycode_` for type `bison_bug.YYParser.SymbolKind`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\internal\write.d(248): Error: no property `yycode_` for type `bison_bug.YYParser.SymbolKind`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\internal\write.d(248): Error: incompatible types for `(obj) != (0)`: `SymbolKind` and `int`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\internal\write.d(253): Error: no property `yycode_` for type `bison_bug.YYParser.SymbolKind`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\internal\write.d(253): Error: incompatible types for `(obj) != (0)`: `SymbolKind` and `int`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\write.d(1239): Error: template instance `std.format.internal.write.formatValueImpl!(LockingTextWriter, SymbolKind, char)` error instantiating
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\format\write.d(632):        instantiated from here: `formatValue!(LockingTextWriter, SymbolKind, char)`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(1719):        instantiated from here: `formattedWrite!(LockingTextWriter, char, SymbolKind)`
bison_bug.y(23):        instantiated from here: `write!(string, SymbolKind)`
C:\Program Files\Digital Mars\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(1765): Error: template instance `std.stdio.File.write!(string, SymbolKind, char)` error instantiating
bison_bug.y(28):        instantiated from here: `writeln!(string, SymbolKind)`

The proposed fix is to apply the following changes to source file data/skeletons/d.m4, or the installed /usr/share/bison/skeletons/d.m4:

--- [...]/data/skeletons/d.m4
+++ /usr/share/bison/skeletons/d.m4
@@ -282,23 +282,22 @@
     /* Return YYSTR after stripping away unnecessary quotes and
      backslashes, so that it's suitable for yyerror.  The heuristic is
      that double-quoting is unnecessary unless the string contains an
      apostrophe, a comma, or backslash (other than backslash-backslash).
      YYSTR is taken from yytname.  */
-    final void toString(W)(W sink) const
-    if (isOutputRange!(W, char))
+    final void toString(void delegate(const(char)[]) sink) const
     {
       immutable string[] yy_sname = @{
   ]b4_symbol_names[
       @};]b4_has_translations_if([[
       /* YYTRANSLATABLE[SYMBOL-NUM] -- Whether YY_SNAME[SYMBOL-NUM] is
         internationalizable.  */
       immutable ]b4_int_type_for([b4_translatable])[[] yytranslatable = @{
   ]b4_translatable[
       @};]])[
 
-      put(sink, yy_sname[yycode_]);
+      sink.formattedWrite!"%s"(yy_sname[yycode_]);
     }
   }
 ]])

Now the output is as expected:

1.1: syntax error: expected end of file or DUMMY before SYNTAXERROR

A delegate is D's term for a lambda function that captures its immediately enclosing scope, and it appears to be the more "modern" way of implementing this pattern (see [Çehreli,2017] §72.4, p. 485). It should be valid D code for any compiler that can otherwise handle the D output of Bison (i.e. one that implements static foreach etc.).

[Çehreli,2017] Programming in D, First Edition, Ali Çehreli, 2009-17

Inconsistent use of static_cast in c++ skeleton (lalr1.cc)

Line 924 reads:
yypush_ ("Shifting", static_cast<state_type> (yyn), YY_MOVE (yyla));]b4_lac_if([[

Line 855 does not use static_cast:
YYCDEBUG << "Entering state " << int (yystack_[0].state) << '\n';
It should be:
YYCDEBUG << "Entering state " << static_cast<int> (yystack_[0].state) << '\n';

Line 1437 does not use static_cast:
*yycdebug_ << ' ' << int (i->state);
It should be:
*yycdebug_ << ' ' << static_cast<int> (i->state);
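
A minimal sketch of what these casts do, not the skeleton's actual code. It assumes the state type is a narrow, char-like integer (here a hypothetical `signed char` typedef); the helper names `entering_raw`, `entering_functional` and `entering_cast` are invented for illustration. Streaming such a value directly prints a character, while `int (s)` and `static_cast<int> (s)` both print the number, so the change requested above is about consistent style rather than behavior.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Assumption: the state type is a narrow, char-like integer, as can happen
// when the smallest type that fits the state numbers is chosen.
typedef signed char state_type;

// Streams the state without a cast: operator<< treats signed char as text.
std::string entering_raw (state_type s)
{
  std::ostringstream out;
  out << "Entering state " << s;
  return out.str ();
}

// Streams the state through a functional-style cast, int (s).
std::string entering_functional (state_type s)
{
  std::ostringstream out;
  out << "Entering state " << int (s);
  return out.str ();
}

// Streams the state through static_cast<int>, the spelling asked for above.
std::string entering_cast (state_type s)
{
  assert (s >= 0);  // state numbers are assumed to be non-negative
  std::ostringstream out;
  out << "Entering state " << static_cast<int> (s);
  return out.str ();
}
```

With state 65, `entering_raw` yields "Entering state A", whereas both cast variants yield "Entering state 65".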

More noexcept

Hi,

The following lines should be marked YY_NOEXCEPT, as shown below:

c++.m4: line 311
basic_symbol () YY_NOEXCEPT
c++.m4: line 493
symbol_type () YY_NOEXCEPT {}

lalr1.cc: line 477
void yypop_ (int n = 1) YY_NOEXCEPT;
lalr1.cc: line 782
]b4_parser_class[::yypop_ (int n) YY_NOEXCEPT

lalr1.cc: line 324
static state_type yy_lr_goto_state_ (state_type yystate, int yysym) YY_NOEXCEPT;
lalr1.cc: line 815
]b4_parser_class[::yy_lr_goto_state_ (state_type yystate, int yysym) YY_NOEXCEPT

lalr1.cc: line 328
static bool yy_pact_value_is_default_ (int yyvalue) YY_NOEXCEPT;
lalr1.cc: line 825
]b4_parser_class[::yy_pact_value_is_default_ (int yyvalue) YY_NOEXCEPT

lalr1.cc: line 332
static bool yy_table_value_is_error_ (int yyvalue) YY_NOEXCEPT;
lalr1.cc: line 831
]b4_parser_class[::yy_table_value_is_error_ (int yyvalue) YY_NOEXCEPT

c++.m4: line 441
by_kind () YY_NOEXCEPT;
c++.m4: line 449
by_kind (const by_kind& that) YY_NOEXCEPT;
c++.m4: line 463
by_kind (kind_type t) YY_NOEXCEPT;
c++.m4: line 567
]b4_inline([$1])b4_parser_class[::by_kind::by_kind () YY_NOEXCEPT
c++.m4: line 579
]b4_inline([$1])b4_parser_class[::by_kind::by_kind (const by_kind& that) YY_NOEXCEPT
c++.m4: line 583
]b4_inline([$1])b4_parser_class[::by_kind::by_kind (token_kind_type t) YY_NOEXCEPT

c++.m4: line 646
]b4_parser_class[::yytranslate_ (int t) YY_NOEXCEPT
lalr1.cc: line 340
static symbol_kind_type yytranslate_ (int t) YY_NOEXCEPT;

stack.hh: line 40
stack (size_type n = 200) YY_NOEXCEPT
stack.hh: line 119
slice (const stack& stack, index_type range) YY_NOEXCEPT
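
The pattern requested above can be sketched as follows. This is an illustration under stated assumptions, not code from the skeletons: `DEMO_NOEXCEPT` is a hypothetical stand-in for a YY_NOEXCEPT-style macro, and `by_kind_demo`, `table_demo` and the `-1` sentinel are invented names and values. The point it shows is that the specifier has to appear on both the in-class declaration and the out-of-line definition, which is why the list above pairs the lines.

```cpp
#include <cassert>

// Hypothetical stand-in for a YY_NOEXCEPT-style macro: expands to noexcept
// on C++11 and later, and to nothing on older standards.
#if 201103L <= __cplusplus
# define DEMO_NOEXCEPT noexcept
#else
# define DEMO_NOEXCEPT
#endif

// Trivial constructors that cannot throw are natural candidates.
struct by_kind_demo
{
  by_kind_demo () DEMO_NOEXCEPT : kind_ (0) {}
  by_kind_demo (const by_kind_demo& that) DEMO_NOEXCEPT : kind_ (that.kind_) {}
  explicit by_kind_demo (int k) DEMO_NOEXCEPT : kind_ (k) {}
  int kind_;
};

// The declaration and the out-of-line definition must carry the same
// exception specification.
struct table_demo
{
  static bool value_is_default (int v) DEMO_NOEXCEPT;
};

bool table_demo::value_is_default (int v) DEMO_NOEXCEPT
{
  return v == -1;  // -1 is an arbitrary sentinel chosen for this sketch
}

// Self-check; the noexcept operator used here requires C++11.
void check_demo ()
{
  assert (noexcept (by_kind_demo ()));
  assert (noexcept (by_kind_demo (1)));
  assert (noexcept (table_demo::value_is_default (0)));
}
```

Calling `check_demo ()` passes on a C++11 or later compiler, where the macro expands to `noexcept`.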

Generated code fails with -ansi

My SWIG build fails with the latest Bison release, 3.6: the generated code does not compile with the -ansi compiler flag.

The cause is the C++-style comment here: https://github.com/akimd/bison/blob/master/src/parse-gram.c#L2708 (it also appears in some other locations: data/skeletons/yacc.c, ...)

A fix would be:

--- src/parse-gram.c  2020-05-09 11:07:53.483627254 +0000
+++ src/parse-gram.c  2020-05-09 11:08:05.619952863 +0000
@@ -2705,7 +2705,7 @@
 yyerrlab1:
   yyerrstatus = 3;      /* Each real token shifted decrements this.  */
 
-  // Pop stack until we find a state that shifts the error token.
+  /* Pop stack until we find a state that shifts the error token. */
   for (;;)
     {
       yyn = yypact[yystate];

--- data/skeletons/yacc.c  2020-05-09 11:20:47.180538119 +0000
+++ data/skeletons/yacc.c  2020-05-09 11:21:03.464980276 +0000
@@ -1979,7 +1979,7 @@
 yyerrlab1:
   yyerrstatus = 3;      /* Each real token shifted decrements this.  */
 
-  // Pop stack until we find a state that shifts the error token.
+  /* Pop stack until we find a state that shifts the error token. */
   for (;;)
     {
       yyn = yypact[yystate];

Thanks.

glr2.cc skeleton throwing errors

I get the following errors when running Bison with the glr2.cc skeleton on the attached dprec++2.bison:

bison --defines=dprec++2.bison.h --output=dprec++2.bison.cpp dprec++2.bison
dprec++2.bison: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:569: undefined macro `b4_symbol(empty, kind)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:1029: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:1793: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:1844: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:1986: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:1997: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2162: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2624: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2648: undefined macro `b4_symbol(0, kind)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2695: undefined macro `b4_symbol(1, kind)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2791: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2842: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2867: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2905: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:2909: undefined macro `b4_symbol(empty, id)'
/usr/bin/m4:/usr/share/bison/skeletons/glr2.cc:3187: undefined macro `b4_symbol(empty, id)'

Bison works fine on the same file when it uses `%skeleton "glr.cc"` instead.

For completeness, I am using Ubuntu on WSL2. I ran the full 'make check' and got only one, unrelated, failure:

User Actions.

311: Midrule actions                                 ok
312: Typed midrule actions                           ok
313: Implicitly empty rule                           ok
314: Invalid uses of %empty                          FAILED (actions.at:192)

~~~~~

## ------------- ##
## Test results. ##
## ------------- ##

ERROR: 662 tests were run,
1 failed unexpectedly.
52 tests were skipped.

dprec++2.bison is adapted from https://www.gnu.org/software/bison/manual/html_node/Merging-GLR-Parses.html

%{
  #include <stdio.h>
  #define YYSTYPE char const *
  int yylex (void);
  void yyerror (char const *);
%}

%token TYPENAME ID

%right '='
%left '+'

%require "3.2"
%language "c++"

%skeleton "glr2.cc"
%glr-parser

%%

prog:
    %empty
  | prog stmt                           { printf ("\n"); }
;

stmt:
    expr ';'  %dprec 1
  | decl      %dprec 2
;

expr:
    ID                                  { printf ("%s ", $$); }
  | TYPENAME '(' expr ')'               { printf ("%s <cast> ", $1); }
  | expr '+' expr                       { printf ("+ "); }
  | expr '=' expr                       { printf ("= "); }
;

decl:
    TYPENAME declarator ';'             { printf ("%s <declare> ", $1); }
  | TYPENAME declarator '=' expr ';'    { printf ("%s <init-declare> ", $1); }
;

declarator:
    ID                                  { printf ("\"%s\" ", $1); }
  | '(' declarator ')'
;

dprec++.bison

dprec++2.bison

No example code in the C++ examples

It seems that the bison/examples/c++/calc++ directory does not actually contain any example code: there are no .cpp files, nor any scanner or parser source files.

3.8.2 testsuite segfault on armv6

When updating the bison package on Alpine Linux to 3.8.2, we ran into a segfault on armv6.

Here is a backtrace:

(gdb) run
Starting program: /home/ncopa/aports/main/bison/src/bison-3.8.2/src/bison -o y.tab.c --defines -Werror -Wall,dangling-alias --report=all --no-lines /home/ncopa/aports/main/bison/src/bison-3.8.2/examples/c/calc/calc.y

Program received signal SIGSEGV, Segmentation fault.
0x00448544 in abitset_small_list (src=src@entry=0xf7f44ef0, list=list@entry=0xfffeea8c, num=num@entry=1024, next=next@entry=0xfffeea88) at lib/bitset/array.c:69
69            list[count++] = bitno + pos;
(gdb) bt
#0  0x00448544 in abitset_small_list (src=src@entry=0xf7f44ef0, list=list@entry=0xfffeea8c, num=num@entry=1024, next=next@entry=0xfffeea88) at lib/bitset/array.c:69
#1  0x00447b2c in bitset_count_ (src=0xf7f44ef0) at lib/bitset.c:356
#2  0xfffffffe in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
