p4-spec's People

Contributors

abe149, akeep, anirudhsk, antoninbas, apinski-cavium, blp, cc10512, chkim4142, chrisdodd, chrispsommers, cole-barefoot, gbrebner, grg, hackedy, hanw, hanw-bfn, hesingh, jafingerhut, jfingerh, jklr, jnfoster, jonathan-dilorenzo, liujed, mary-grace, mbudiu-bfn, mihaibudiu, mollydream, pataei, rst0git, santiagobautista

p4-spec's Issues

P4 2016 draft: integer division is actually well-defined in C

Section 10.6 says:

Division and modulo are illegal for negative values (the C language does not give a clear semantics to division of signed integers when values are negative).

but this is incorrect. The 2011 edition of the ISO C standard defines division clearly and unambiguously in section 6.5.5 "Multiplicative operators" (this has only been clarified slightly since the 1999 edition):

When integers are divided, the result of the / operator is the algebraic quotient with any
fractional part discarded. 105) If the quotient a/b is representable, the expression
(a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is
undefined.
105) This is often called "truncation toward zero".

It's one thing if P4 doesn't want to define this but it shouldn't make incorrect claims about the C language.

Even if we don't want to define division for negative values, it's probably best if P4 defines 0/x as zero instead of forbidding it.
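
The quoted rule can be checked with a small model. This is only a sketch in Python, whose own // operator rounds toward negative infinity rather than toward zero, so c_div and c_mod are hypothetical helpers written for this illustration and not library functions.

```python
def c_div(a: int, b: int) -> int:
    """Quotient as C99/C11 define it: the algebraic quotient with the
    fractional part discarded (truncation toward zero)."""
    if b == 0:
        raise ZeroDivisionError("division by zero is undefined in C")
    q = abs(a) // abs(b)
    # The result is negative exactly when the operand signs differ.
    return q if (a < 0) == (b < 0) else -q


def c_mod(a: int, b: int) -> int:
    """Remainder chosen so that (a/b)*b + a%b == a, as the standard requires."""
    return a - c_div(a, b) * b


# The identity from section 6.5.5 holds for every sign combination.
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2), (0, -5)]:
    assert c_div(a, b) * b + c_mod(a, b) == a
```

Under this model -7 / 2 is -3 with remainder -1, and 0 / x is 0 for any non-zero x, including negative x.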

P4 2016 draft: naming of methods for header stacks

Section 10.14 describes two methods named "push_front" and "pop_front" for header stacks but suggests that they be renamed using "shift". I'd support that: "unshift" and "shift" match the naming for similar operations in Bourne shell and Perl.

[design] support Union type

Some headers and metadata may need to reuse the same type with different formats for different purposes. For example, for memory and bandwidth efficiency, metadata could use a union type, selected by context, to reduce the amount of data that needs to be passed between modules. A union type can also describe packet headers, for example a large packet header containing both L3 and L4 headers. In that case the TCP header and the UDP header can form a union within the large header spec.

[prototyping] support for new typing syntax

For v1.1 we proposed a new typing syntax. Instead of using just the field length in bits, we introduced bit<>, varbit, int, and void.

It seems we'll need to do the following:

  • Make the v1.1 compiler take the new type syntax (per v1.1-rc) only
  • Let people start using the new syntax upon the release of v1.1 spec. We should officially deprecate the old syntax.

P4 2016 draft: parser loops and Turing completeness.

I'm finding section 13.11, which discusses parser loops, a bit confusing. Reactions from @mbudiu-vmw and @blp appreciated.

The section starts with the following observation, which is useful for compiler writers:

A parser processing header stacks may introduce loops in the parser graph. Since header stacks have finite maximum depth, the parser loops that extract header stacks can be statically unrolled into larger acyclic graphs. The unrolled program should be identical in behavior with the original one.

This is certainly true, but loops can also arise in examples that don't involve header stacks. For example, the TLV_parser.p4 example from the SIGCOMM '15 tutorial has a loop that terminates when a metadata field becomes 0. So to me, it's not entirely clear that this observation belongs in the specification.

The section then goes on to say:

A compiler may reject a parse graphs that contains a cycle which does not advance the packet_in.nextBitIndex cursor, since it could lead to an infinite loop at runtime.

This is potentially confusing because nextBitIndex is a component of the parser abstract machine introduced in the specification and not a language-level construct. But that's a minor issue. A more serious concern is that P4 parsers are actually Turing complete (!), which surprised me. So deciding if they can be unrolled into a DAG is actually undecidable. A simple encoding of two-counter machines is given below.

So at this point, I'm not sure what we actually want. I find it unattractive to start layering on various syntactic restrictions to ensure termination. Such restrictions are bound to be ad hoc and they complicate the language specification. For example, we already have an example where it's useful to have a parser state that does not advance any cursor (cf. the TLV example). I can also imagine scenarios where such states get combined in a larger program in a way that would make the restriction cumbersome. More generally, since parsers can now use expressions and statements, it's completely natural to ensure termination by counting down with a variable (or several variables). Who knows what people may find useful when writing parsers for new protocols by hand? Also, when P4 is used as an intermediate language, it will likely be convenient to be quite liberal in the notion of state machine the language accepts.

Instead, I would prefer to simply say that targets may impose a limit on how much time may be spent parsing and cut things off (with a specific error) if this limit is exceeded. Of course, good compilers would probably analyze parsers and issue a warning if they cannot determine that the parser will terminate on all inputs.

Turing Completeness

We will show that P4 parsers are Turing complete by encoding two-counter machines. The formulation of two-counter machines used here comes from Pierce93.

Two-Counter Machines.

A two-counter machine is a tuple <PC; A; B; I1,...,Ik> where A and B are integer registers, and PC and I1 through Ik are instructions given by the following grammar:

I ::= INCA => m
   | INCB => m
   | TSTA => m / n
   | TSTB => m / n
   | HALT

Intuitively, the program is stored as the instructions I1 through Ik and the current instruction being executed is PC. The semantics of instructions is as follows:

  • INCA => m increments the value of the A register and then swaps in Im into the PC slot (and INCB => m is similar).
  • TSTA => m / n swaps Im into the PC slot if A is zero and otherwise decrements A and swaps In into the PC slot (and TSTB => m / n is similar).
  • The HALT instruction halts execution of the machine.

More formally, we can define a transition function --> on two-counter machines as follows:

  • <INCA => m; A; B; I1,...,Ik> --> <Im; A + 1; B; I1,...,Ik>
  • <INCB => m; A; B; I1,...,Ik> --> <Im; A; B + 1; I1,...,Ik>
  • <TSTA => m / n; 0; B; I1,...,Ik> --> <Im; 0; B; I1,...,Ik>
  • <TSTA => m / n; A; B; I1,...,Ik> --> <In; A - 1; B; I1,...,Ik> where A > 0
  • <TSTB => m / n; A; 0; I1,...,Ik> --> <Im; A; 0; I1,...,Ik>
  • <TSTB => m / n; A; B; I1,...,Ik> --> <In; A; B - 1; I1,...,Ik> where B > 0
  • <HALT; A; B; I1,...,Ik> --> <HALT; A; B; I1,...,Ik>

We can then define the multi-step transition function -->* as the reflexive and transitive closure of the single-step transition function.

Definition [Halts]: A two-counter machine <PC; A; B; I1,...,Ik> halts iff <PC; A; B; I1,...,Ik> -->* <HALT; A'; B'; I1,...,Ik>.

Fact: The halting problem for two-counter machines is undecidable.
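
To make the transition rules concrete, here is a minimal interpreter sketch in Python. The tuple encoding of instructions, the choice of I1 as the initial instruction, and the max_steps budget are assumptions made for this illustration. The budget also mirrors the earlier suggestion that a target may cut parsing off after a fixed amount of work, since halting cannot be decided in general.

```python
def run(program, a=0, b=0, max_steps=10_000):
    """Run a two-counter machine.

    program maps an instruction index i to one of:
      ("INCA", m), ("INCB", m), ("TSTA", m, n), ("TSTB", m, n), ("HALT",)
    Returns (a, b) when HALT is reached, or None if the step budget
    is exhausted first.
    """
    pc = 1  # assumption: execution starts at I1
    for _ in range(max_steps):
        instr = program[pc]
        op = instr[0]
        if op == "HALT":
            return a, b
        if op == "INCA":
            a, pc = a + 1, instr[1]
        elif op == "INCB":
            b, pc = b + 1, instr[1]
        elif op == "TSTA":
            if a == 0:
                pc = instr[1]
            else:
                a, pc = a - 1, instr[2]
        elif op == "TSTB":
            if b == 0:
                pc = instr[1]
            else:
                b, pc = b - 1, instr[2]
        else:
            raise ValueError(f"unknown instruction {instr!r}")
    return None


# Hypothetical example program: drain A into B, then halt.
move_a_to_b = {1: ("TSTA", 3, 2), 2: ("INCB", 1), 3: ("HALT",)}

# A machine that never halts: I1 increments A and jumps back to I1.
spin = {1: ("INCA", 1)}
```

With these definitions, run(move_a_to_b, a=3, b=1) returns (0, 4), while run(spin) gives up and returns None.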

P4 Encoding

We can encode a two-counter machine as a P4 parser as follows.

First, we define a type to store the values of the registers.

struct registers { int a; int b; }

Next, we construct two kinds of states for each instruction I1 to Ik. The first state handles the implementation of the instruction, and is defined using a case analysis on the instruction itself:

  • If Ii is INCA => m the state is,
state Ii {
  regs.a = regs.a + 1;
  transition Im;
}

and similarly for INCB => m

  • If Ii is TSTA => m / n the state is,
state Ii {
  transition select(regs.a) {
    0 : Im;
    default : deca_In;
  }
}

and similarly for TSTB => m / n

  • If Ii is HALT, the state is,
state Ii {
  transition accept;
}

We also need to add the deca_Ii states to handle decrementing and transitions for each state,

state deca_Ii { 
  regs.a = regs.a - 1;
  transition Ii;
}

and similarly for the decb_Ii states.

Finally, we encode the entire two-counter machine as a top-level parser tcm, declared as follows

parser tcm(registers regs) {
  ...
}

The start state has the encoding of the initial PC instruction.
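
The encoding is mechanical enough to generate. The following Python sketch emits the text of the parser states for a two-counter program. The tuple representation of instructions and the encode function are assumptions of this illustration, the state names (I1, deca_I2, ...) follow the scheme described above, and the output has not been checked against any P4 compiler. The enclosing parser declaration and the start state are left out.

```python
def encode(program):
    """Emit P4-style parser states for a two-counter program.

    program maps an index i to ("INCA", m), ("INCB", m),
    ("TSTA", m, n), ("TSTB", m, n) or ("HALT",).
    """
    states = []
    dec_states = set()  # (register, n) pairs that need a decrement state
    for i, instr in sorted(program.items()):
        op = instr[0]
        if op in ("INCA", "INCB"):
            r = op[-1].lower()
            body = f"  regs.{r} = regs.{r} + 1;\n  transition I{instr[1]};"
        elif op in ("TSTA", "TSTB"):
            r = op[-1].lower()
            m, n = instr[1], instr[2]
            dec_states.add((r, n))
            body = (
                f"  transition select(regs.{r}) {{\n"
                f"    0 : I{m};\n"
                f"    default : dec{r}_I{n};\n"
                f"  }}"
            )
        elif op == "HALT":
            body = "  transition accept;"
        else:
            raise ValueError(f"unknown instruction {instr!r}")
        states.append(f"state I{i} {{\n{body}\n}}")
    # One decrement state per (register, target) pair used by a test.
    for r, n in sorted(dec_states):
        states.append(
            f"state dec{r}_I{n} {{\n"
            f"  regs.{r} = regs.{r} - 1;\n"
            f"  transition I{n};\n"
            f"}}"
        )
    return "\n\n".join(states)


# Hypothetical example: drain A into B, then halt.
print(encode({1: ("TSTA", 3, 2), 2: ("INCB", 1), 3: ("HALT",)}))
```

For this three-instruction program the generator produces four states: I1, I2, I3 and deca_I2.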

P4 2016 draft: pass-by-reference vs. pass-by-copy-in-copy-out

The draft states in a couple of places that arguments are passed with copy-in, copy-out semantics. This surprises me a little for a language aimed at targets with limited memory resources, since some of the arguments that are likely to be passed are relatively large headers or perhaps entire packets (e.g. Parsed_packet in the example in section 7.3). Of course, pass-by-reference is usually equivalent to copy-in, copy-out, that is, when there are no aliases. I think that this is the common case for P4, and there is wording in 8.6 to make the compiler report an error for obvious aliases, but... I still worry due to lack of implementation experience. Will a naive compiler need to copy around big structures in some cases? Is there a reason to avoid specifying pass-by-reference or to say that a compiler is allowed to use either one?

Also, the wording in 8.6 only says that a compiler should report aliases on out and inout arguments. I think a compiler will have to make a copy if the same data structure is passed as in and out arguments, since the function might interleave reads from the in argument with writes to the out argument.
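
The aliasing concern can be illustrated with a small model. This is a sketch in Python with dictionaries standing in for headers. The function names are hypothetical, and it says nothing about how a real P4 compiler lowers calls.

```python
import copy


def swap_fields(src, dst):
    """Callee that interleaves reads from `src` with writes to `dst`."""
    dst["x"] = src["y"]
    dst["y"] = src["x"]


def call_by_reference(arg_in, arg_out):
    # Both parameters refer directly to the caller's storage.
    swap_fields(arg_in, arg_out)


def call_copy_in_copy_out(arg_in, arg_out):
    local_in = copy.deepcopy(arg_in)  # copy-in for the `in` parameter
    local_out = {}                    # `out` parameter starts uninitialized
    swap_fields(local_in, local_out)
    arg_out.update(local_out)         # copy-out when the call returns


# Same structure passed as both the `in` and the `out` argument.
h_ref = {"x": 1, "y": 2}
call_by_reference(h_ref, h_ref)

h_copy = {"x": 1, "y": 2}
call_copy_in_copy_out(h_copy, h_copy)
```

After these calls h_ref is {"x": 2, "y": 2} while h_copy is {"x": 2, "y": 1}: the two conventions agree only when the arguments do not alias.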

[feature] modeling tables that could share the same match entries

[From Peter's original email]

Something we do fairly often in packet forwarding is look up both the source address and the destination address in the same match table. We will take different actions in the case of a hit but we use a single match table. For example, in IPv4 we will look up the destination address for forwarding and the source address for a reverse path forwarding security check.

In the P4 table definition there is a one-to-one correspondence between the lookup fields and the match table. I can’t express two different lookups using the same match table.

Has anyone thought about how to handle this? We need some sort of indication in the P4 source that two different tables can be implemented using the same match entries. Would we suggest using a pragma directive to the compiler to handle this or is it important enough to do something in the P4 source?

Request for examples

A handful of features of both the 1.0.2 and 1.1 rc1 specs do not have any example programs to show how they should be used.

  • Features unused from both versions of the spec:
    1. There are no parser_exception declarations
    2. There are no explicit returns to parse_error
    3. There are no uses of parser_value_set
    4. There are no examples of the hit or miss cases in any table apply statement
  • Features unused from the new spec (some features have started showing up in 1.0.2 programs)
    1. There are no uses of control return
    2. There are no examples of the cast operator

P4 2016 draft: value of -8w10

In C, this program prints 246:

#include <stdio.h>
#include <stdint.h>
int main(void)
{
   uint8_t x = -10;
   printf("%d\n", x);
   return 0;
}

The table in section 9.1.6.6 says that, in P4, -8w10 has the value 245, though. Is this a typo?
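
The C result follows from conversion to an unsigned type being reduction modulo 2^N. Assuming P4 intends the same wraparound for an 8-bit unsigned value (the draft would have to confirm this), the expected value can be computed with a short Python sketch. bit_value is a hypothetical helper name.

```python
def bit_value(width: int, value: int) -> int:
    """Value of an unsigned `width`-bit quantity under modular wraparound."""
    # Python's % yields a non-negative result for a positive modulus.
    return value % (1 << width)


print(bit_value(8, -10))
```

This gives 246, matching the C program rather than the 245 in the table.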

[design] Support for incremental parsing

There have been concerns that allowing any parser or control flow to call any other parser or control flow (irrespective of whether they belong to ingress or egress) might lead to a P4 coding style that is supportable only on certain types of machines. If we want P4 to be widely applicable, we should err on the conservative side.

To enable both pipeline-style targets and CPU/NPU-style targets, the group felt the following approach would be reasonable. Note #1 and #2 below are already possible with the current P4 spec. Only #3 is a new feature.

  1. Allow exiting a parser through multiple exit points, each of which can lead to a different control flow
  2. Allow a control flow to call other sub-control flows. Although currently the egress must start with one pre-defined fixed entry point (which is "egress"), one can quickly branch out and call multiple sub-control flows from there.
  3. Allow a control flow to call a parser via resubmit() or continue_parsing(). The current spec doesn't define the semantics of resubmit() -- it doesn't tell whether resubmit() resets the parse pointer of the resubmitted packet to zero or retains the last position where parsing exited. We should make it clear (the former way) in the spec and should consider introducing another primitive action called continue_parsing(), which retains the last position where parsing stopped.

[spec] v1.1 spec writing

We'll need to update the v1.1rc to produce the official v1.1 spec.

Keep only officially-approved features in the spec

  • Blackbox (consider renaming)
  • TLV header parsing --> change to Sec. 4
  • Sequential execution semantics --> change to Sec. 7.1.1
  • Stronger typing --> a new section?
  • Having modify_field() take expressions --> change to Sec 12.1
  • “direct” lookup semantics --> change to Sec. 8

What happens to Sec. 10 & 13 of v1.1-rc?

  • Consider moving them to Appendix and name them "provisional"

P4 2016 draft: how to include core library

Section 8.2.1 says that every P4 program must include the core library, but it doesn't say how. It appears from examples that one should do this with:

#include "core.p4"

which is a little surprising since usually in C standard headers are included with <>, e.g.

#include <core.p4>

Perhaps 8.2.1 should recommend a syntax.

P4 2016 draft: potential for octal constant confusion

It's going to be somewhat surprising to C programmers that 0377 has value 377, instead of 255. In section 8.3.2, I'd consider either forbidding leading zeros on decimal literals (that is, other than 0 itself), or providing an example at the end, e.g.:
0377 // a *decimal* integer with value 377 (not 255!)
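
The three readings of a literal such as 0377 can be compared side by side. This Python sketch handles plain digit strings only (no width prefixes, hex, or underscores), and the function names are hypothetical.

```python
def c_value(literal: str) -> int:
    """Value under C rules: a leading 0 on a digit string selects octal."""
    if len(literal) > 1 and literal[0] == "0" and literal.isdigit():
        return int(literal, 8)
    return int(literal, 10)


def draft_value(literal: str) -> int:
    """Value under the draft rule described above: always decimal."""
    return int(literal, 10)


def strict_value(literal: str) -> int:
    """Alternative suggested above: reject leading zeros, except 0 itself."""
    if len(literal) > 1 and literal[0] == "0":
        raise ValueError(f"leading zero in decimal literal {literal!r}")
    return int(literal, 10)
```

Here c_value("0377") is 255, draft_value("0377") is 377, and strict_value("0377") raises ValueError, which turns the silent disagreement into a diagnostic.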

P4 2016 draft: nonsensical cube expressions?

Section 10.11.3 doesn't say anything about nonsense cube expressions, that is, ones where bits are set in the value but not in the mask, such as 8w8 &&& 8w7. According to the definition provided, such an expression is equivalent to 8w0 &&& 8w7, but it might be worth making the meaning implementation-defined or undefined since it doesn't make much sense to have a 1-bit in the value but a 0-bit in the mask.
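
A small model shows why the stray value bit is invisible. This Python sketch assumes the usual ternary-match reading, where a key matches when it agrees with the value on every bit set in the mask. That is consistent with the equivalence stated above, but it is still an assumption about the draft's definition. The helper names are hypothetical.

```python
def matches(key: int, value: int, mask: int) -> bool:
    """True when key agrees with value on every bit that is set in mask."""
    return (key & mask) == (value & mask)


def is_canonical(value: int, mask: int) -> bool:
    """True when value has no 1-bits outside the mask."""
    return value & ~mask == 0


# 8w8 &&& 8w7 and 8w0 &&& 8w7 accept exactly the same 8-bit keys.
same = all(matches(k, 8, 7) == matches(k, 0, 7) for k in range(256))
```

A compiler could use a check like is_canonical(8, 7), which is False, to warn about or reject such expressions.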

[design] Define the standard architecture model (aka Standard Model)

Portability will be critical for P4's long-term success. While we all admit complete portability is an impractical goal, it'd be crucial to try to maximize the commonalities across different target architectures and hence achieve ease of porting.

Unfortunately, target vendors (understandably) can't afford to share the full details of their target architectures. Yet we need to be able to help potential P4 writers author practically meaningful and viable P4 programs that can run on various types of targets. To achieve these goals, we need the Standard Model (SM).

P4 2016 draft: are string literals ASCII or UTF-8?

Section 8.3 says:

P4 compilers should handle correctly UTF-8 characters in comments and string literals.

Appendix 23 says:

The STRING_LITERAL token corresponds to a doubly-quoted string containing only printable ASCII characters and spaces. No escape sequences are supported.

Which is correct?

(Furthermore, by saying that comments and string literals are always UTF-8, does this mean that UTF-8 is the only valid encoding for P4 source files? In a C context, this could be a claim about the execution character set, not the source character set, but of course P4 doesn't have an execution character set.)

P4 2016 draft: definition of path and path prefix

Section 8.6 "Paths" talks about paths but never defines one. Perhaps it should.

There is no definition in the grammar for a path. The intended definition of a path is actually somewhat unclear to me. Perhaps a path is a path prefix followed by an identifier, but because of the definition of a path prefix that means that every path has at least one dot in it. It looks to me like every reference to pathPrefix in the grammar has an alternative that does not include a pathPrefix, and I don't know whether those alternatives without a pathPrefix are considered paths or just identifiers.

The text in section 8.6 says that a path prefix is separated by dots; it looks to me like every pathPrefix also ends in a dot, so maybe it should say that a path prefix is a series of namespaces separated by and ending with a dot.

[prototyping] Action parameter types and directionality annotations

Require types on action parameters

  • Parameters have directionality
    1. in: May not be modified, can be a constant or other non-L-value
    2. out: May be modified, implied to be uninitialized
    3. inout: May be modified, already initialized with data
  • Use syntax similar to header field declaration

action foo (inout header ipv4_t ipv4, in bit<32> new_dst_addr)
{
  modify_field(ipv4.dstAddr, new_dst_addr);
}

P4 2016 draft: verify

The current specification describes the verify function as coming from the architecture (9.2.5.3) and as a built-in function (13.7).

It seems to me that, as currently used, it should really be part of the main language and cannot be defined by the architecture. First, the functionality required is extremely simple (evaluate a boolean expression and set an error if false). Second, when a call to verify fails, the parser moves to the reject state. A standard built-in function would not have the latter behavior.

But perhaps I am missing something. It would be great to hear from @mbudiu-vmw and @blp (who I believe owns parser-related topics in the current P4-16 design effort).

P4 2016 draft: parserLocalElements

Section 13.2 "Parser declarations" gives the definition of parserLocalElement but not parserLocalElements. Though the reader can jump to the appendix, it would be kind to copy its definition into 13.2.

[spec] Sequential execution semantics

Compound action execution semantics should be sequential. Each target can support a certain maximum-length dependency chain in each compound action. If a compound action exceeds that maximum dependency chain length, the compiler for the target must detect it and generate a compile-time error.

If a target does have parallel execution support, its compiler can infer parallelism where possible and take advantage of it.
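
The difference between the two semantics, and the notion of a dependency chain, can be sketched in Python. A compound action is modeled here as a list of (destination, source) field copies. That representation, the function names, and the read-after-write-only chain measure are simplifying assumptions of this illustration.

```python
def run_sequential(fields, assignments):
    """Apply each (dst, src) copy in order; later copies see earlier writes."""
    state = dict(fields)
    for dst, src in assignments:
        state[dst] = state[src]
    return state


def run_parallel(fields, assignments):
    """Apply all copies against the original values (all reads happen first)."""
    state = dict(fields)
    for dst, src in assignments:
        state[dst] = fields[src]
    return state


def chain_length(assignments):
    """Longest read-after-write chain under sequential semantics."""
    depth = {}  # field -> length of the chain that produced its current value
    longest = 0
    for dst, src in assignments:
        d = depth.get(src, 0) + 1
        depth[dst] = d
        longest = max(longest, d)
    return longest


fields = {"a": 1, "b": 2, "c": 3}
action = [("b", "a"), ("c", "b")]  # c depends on the new value of b
```

For this action the sequential result is {"a": 1, "b": 1, "c": 1}, the parallel result is {"a": 1, "b": 1, "c": 2}, and chain_length(action) is 2. A target whose limit is 1 would have to reject the action rather than silently run it in parallel.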

Problem statement and proposal:
sequential execution semantics -- Anirudh.pdf

P4 2016 draft: inconsistent use of semicolons after declarations

At least in the sample P4 program in section 7.1, some top-level P4 syntax elements end with semicolons, and some do not. At least from this C programmer's perspective, it is going to be difficult to remember which ones take semicolons and which ones do not. I hope that, if semicolons are not required after particular declaration elements, they are at least allowed, so that a programmer who always adds a semicolon will not cause syntax errors. Or perhaps the semicolons could always be made part of the syntax.

P4 2016 draft: meaning of 1s10

Section 9.1.6.3 says that signed integers must have a width greater than 1.

Section 9.1.6.6 says that an s prefix indicates a signed integer.

The table in section 9.1.6.6 gives an example for 1s10 and says that it's an unsigned integer with value 0. I think that it should be a syntax error since the s-prefix means a signed integer and section 9.1.6.3 forbids 1-bit signed integers.
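
For reference, truncating 10 to a single bit does give 0, which is presumably where the table's value comes from. The Python sketch below assumes two's complement for signed values and uses hypothetical helper names. It also shows why a 1-bit signed type is awkward.

```python
def to_unsigned(width: int, value: int) -> int:
    """Keep only the low `width` bits of value."""
    return value % (1 << width)


def to_signed(width: int, value: int) -> int:
    """Interpret the low `width` bits of value as two's complement."""
    raw = to_unsigned(width, value)
    if raw >= (1 << (width - 1)):
        return raw - (1 << width)
    return raw


# Every value a 1-bit signed integer could hold under this reading.
one_bit_signed = sorted({to_signed(1, v) for v in range(16)})
```

Here to_unsigned(1, 10) and to_signed(1, 10) are both 0, and one_bit_signed is [-1, 0]: the type cannot represent 1, so rejecting 1s10 outright seems reasonable.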

[design] Deparser modeling

How to model the deparsing logic?

[Problem statement]

[Potential solutions]

  • Inference approach: Let a user express all the parsing state in the parse graph, and let the compiler infer deparsing order from the parse graph. This is what the current spec (v1.0) suggests.
  • Monolithic imperative ordering w/o conditional branching. Only valid headers are serialized.
  • Linked-list approach.

P4 2016 draft: varbit field size

There is no equivalent of the P4-14 syntax for specifying the size of a varbit field, where the size of the field is computed from other header fields.

P4 2016 draft: Invoking sub-parsers

Section 13.10 mentions that sub-parsers can be invoked similarly to subroutines. It's unclear what happens when the call graphs of such parser invocations involve cycles. For example, if they are like subroutines, is there a run-time stack to keep track of local variables etc.? This can't be what's intended...

See also the related issue #46.

@blp @mbudiu-vmw

Corrections & missing items in P4 1.1rc spec

There are a few missing items in the P4 1.1 rc spec that should either be addressed in the 1.1 final spec, or added as an addendum to the 1.1 final. These suggestions are a mix of features implemented in the current implementation but not listed in the spec and features used in examples (including the textual example in the 1.1 RC document).

  • the mtag-edge.p4 example program includes expressions in the size indicator for several tables, however the grammar specifies these must be constants. The grammar could be extended to include a const_expr ::= const_value | const_expr bin_op const_expr
    and then size, min_size, max_size, etc. fields could allow for this type of expression which can be computed at compile-time of the P4 program
  • The switch P4 example program has an action_selector that specifies the additional select_mode attribute, and the HLIR parser seems to handle this in addition to a selection_type attribute.
  • The switch P4 example program also makes use of a @pragma, which is not described in the grammar, but could be added as:

p4_pragma ::= @pragma pragma_name pragma_text

where pragma text is really any string or strings up to the end of the line.

  • The 1.1rc1 spec appears to have dropped the text about P4's namespaces. I think it is worthwhile for the spec to indicate if there is now a single namespace, or if there are separate namespaces (as there were in 1.0.2).
  • A little more description of how inout action parameters should be interpreted should be added to the spec.

[spec] New wording for the preprocessing section

This is a proposal for the definition of a new source code preprocessing behavior that obviates the need to use an actual C preprocessor over P4 source files prior to compilation. This wording would replace the contents of the 8.2 Preprocessing section of the spec.

The most important change here is replacing the #include directive with a new #import directive that behaves somewhat differently. For example, it's no longer a blind copy-and-paste of the target file.

I think #include brings too many undesirable behaviors from C (e.g. include guards, multiply defined symbols, context-dependent parsing, etc) that I think a more modern language shouldn't have.

A prototype implementation of this proposal for the p4c compiler can be seen here: p4lang/p4c#68


8.2 Preprocessing

P4 does not support separate compilation or linking: a P4 compiler must be provided a complete P4 program. To aid composition of programs from multiple source files, the P4 language supports a small set of preprocessing directives that are inspired by similar C preprocessor functionality:

  • #import
  • #define for defining macros without arguments / #undef
  • #if / #elif / #else / #endif / #ifdef
  • #line

This functionality allows P4 programs to be built from multiple source files, potentially produced by different programmers at different times:

  • the P4 core library, produced by the P4 language designers
  • the target architecture interfaces, specified by the target manufacturer
  • target libraries, describing extern blocks provided by the target architecture
  • user-defined and other libraries of useful components (e.g, standard protocol header definitions)
  • the P4 programs that control programmable functional blocks of a target

8.2.1 P4 core library

Similar to the C standard library, the P4 language specification defines a core library that declares useful P4 constructs. A description of the P4 core library is provided in Appendix 21. All P4 programs must include the core library.

8.2.2 Preprocessor directives

Preprocessor directives begin with the '#' character and must appear at the beginning of a line (ignoring leading whitespace). They always occupy a separate line of source code and may end with a single-line comment.

Preprocessor directives aren't part of the final structure of the program. The grammar definitions used in this section are for explanatory purposes and don't form part of the actual P4 language grammar.

8.2.3 Preprocessor definitions

At every point in the program the compiler has a list of preprocessor variables that are currently defined. This includes predefined variables - which are always available - as well as user-defined variables passed to the compiler or defined previously in source code using the #define directive.

Preprocessor variables don't carry a value; they can only either be defined or undefined. They may only be used in preprocessor directives and are not otherwise available to the P4 program.

8.2.4 Predefined variables

The following table lists preprocessor variables that are implicitly defined by the compiler and available automatically at every point in the program. These variables can't be undefined by the user:

Name      Description
__p4__    Defined when building P4 source code.

8.2.5 #import directive

importDirective
    : '#' IMPORT '<' path '>'
    | '#' IMPORT '"' path '"'
    ;

The #import directive is used to make definitions from another file available to the current file. The effect of an #import directive is as if all definitions from the named file, as well as all definitions from transitively imported files, were present in the current file at the point of inclusion.

Preprocessor variables defined in the current file via #define directives have no effect on the contents of the imported file, that is, the set of definitions imported by an #import directive depends exclusively on global definitions and #define directives found in the included file. Similarly, #define directives found in the included file have no effect on the current file's preprocessor state.

There are two forms of the #import directive: an angle bracket form and a quoted form. Both require a valid relative or absolute path between the separators. The two forms differ only when given a relative path. The angle bracket form instructs the preprocessor to search for the imported file in the system import directories. The quoted form makes the preprocessor first search for the file starting in the importing file's directory, and then fall back to the system import directories if the file was not found there.

Multiple #import directives that resolve to the same file behave as if only the first #import directive were present in the source file. That is, importing a given file multiple times has no effect.
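
The transitive and import-once behavior described in this section can be sketched as a graph walk. In this Python illustration the in-memory files mapping, the file names, and the resolve_imports function are all hypothetical. Path searching (angle bracket versus quoted form) and preprocessor variables are not modeled.

```python
def resolve_imports(files, root):
    """Return the files whose definitions become visible when compiling
    `root`, with each imported file placed before the file that imports it.

    files maps a file name to the list of file names it imports.
    """
    seen = set()
    order = []

    def visit(name):
        if name in seen:
            return  # a repeated import of the same file has no effect
        seen.add(name)
        for dep in files[name]:
            visit(dep)
        order.append(name)

    visit(root)
    return order


# Hypothetical program layout; core.p4 is imported three times in total.
files = {
    "main.p4": ["core.p4", "headers.p4", "core.p4"],
    "headers.p4": ["core.p4"],
    "core.p4": [],
}
```

With this layout resolve_imports(files, "main.p4") yields ["core.p4", "headers.p4", "main.p4"]. Marking a file as seen before visiting its imports also keeps the walk from looping on circular imports, a case the proposed wording does not address.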

8.2.6 #define / #undef directives

defineDirective
    : '#' DEFINE name
    ;

undefDirective
    : '#' UNDEF name
    ;

The #define directive makes a new definition available from the point of definition to the end of the current source file. A #define directive that defines an already-defined variable has no effect.

The #undef directive removes a variable definition, making it unavailable from that point to the end of the current source file. An #undef directive that references an undefined variable has no effect.

The #define and #undef directives must appear at the beginning of a source file before any other program tokens. This doesn't include comments or other preprocessor directives, which can appear before a #define or #undef.

8.2.7 #if / #elif / #else / #endif directives

ifDirective
    : '#' IF ppExpression
    ;

elifDirective
    : '#' ELIF ppExpression
    ;

elseDirective
    : '#' ELSE
    ;

endifDirective
    : '#' ENDIF
    ;

ppExpression
    : name
    | '(' ppExpression ')'
    | '!' ppExpression
    | ppExpression AND ppExpression // &&
    | ppExpression OR ppExpression  // ||
    ;

The #if directive is used for conditional compilation and it defines a region of code that will be parsed only if the associated expression evaluates to true. The affected code is the region from the line following the #if directive to the line preceding the next matching #elif, #else or #endif directive.

The #elif directive is used to provide a new condition to test when the previous #if or #elif directive evaluated to false. The #else directive is used to include a region of code only if the previous condition evaluated to false.

Every #if directive must be followed by zero or more #elif directives, then by an optional #else directive, and finally one #endif directive. It's possible to nest conditional compilation blocks.

The expression in an #if or #elif directive involves one or more preprocessor variables and can only contain the logical operators !, && and ||. These operators have the same precedence rules as they have in regular P4 expressions. Preprocessor variables evaluate to the logical value true if they are defined and evaluate to false otherwise.
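
The ppExpression grammar is small enough to evaluate with a recursive-descent parser. The Python sketch below assumes the conventional precedence (! binds tighter than &&, which binds tighter than ||). The tokenize and evaluate names are hypothetical, and this is not the p4c implementation.

```python
import re

TOKEN = re.compile(r"\s*(&&|\|\||[!()]|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    """Split a preprocessor expression into names and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(text, defined):
    """Evaluate a ppExpression; a name is true iff it is in `defined`."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of expression")
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_unary()
        while peek() == "&&":
            take()
            rhs = parse_unary()
            value = value and rhs
        return value

    def parse_unary():
        tok = take()
        if tok == "!":
            return not parse_unary()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok in ("&&", "||", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok in defined

    result = parse_or()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

For example, evaluate("__p4__ && !DEBUG", {"__p4__"}) is True, and evaluate("A || B && C", {"A"}) is True because && groups before ||.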

8.2.8 #line directive

lineDirective
    : '#' LINE INTEGER
    | '#' LINE INTEGER STRING_LITERAL
    ;

The #line directive is used to provide the compiler with mapping information that can be used to translate locations in the input source file to locations in the original source file. This is useful to provide better diagnostics when the input source file was generated from another file, potentially written in a different language.

The #line directive provides a line number and optionally the original source file name that apply to the block of source code following it. The compiler applies the new line numbering starting with the next source code line.
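
The remapping rule can be sketched as a single pass over the input. In this Python illustration the regular expression, the map_locations function, and the file names are hypothetical. Trailing comments on the directive line are not handled.

```python
import re

# '#line N' or '#line N "file"', optionally indented.
LINE = re.compile(r'^\s*#\s*line\s+(\d+)(?:\s+"([^"]*)")?\s*$')


def map_locations(source, input_name):
    """Return (original_file, original_line, text) for each line of
    `source` that is not itself a #line directive."""
    current_file, next_line = input_name, 1
    mapped = []
    for text in source.splitlines():
        m = LINE.match(text)
        if m:
            # The new numbering applies starting with the next source line.
            next_line = int(m.group(1))
            if m.group(2) is not None:
                current_file = m.group(2)
            continue
        mapped.append((current_file, next_line, text))
        next_line += 1
    return mapped


source = 'first\n#line 40 "gen.tmpl"\nsecond\nthird\n#line 7\nfourth'
locations = map_locations(source, "prog.p4")
```

Here "second" and "third" are reported at gen.tmpl lines 40 and 41, and "fourth" at gen.tmpl line 7, since the second directive changes only the line number.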


Below is my original proposal's wording:


8.2 Preprocessing and compilation (ORIGINAL PROPOSAL)

P4 does not support separate compilation or linking: a P4 compiler must be provided a complete P4 program as a set of input source files. These source files must contain a definition for all the declarations used throughout the program, with the exception of declarations from the standard library which are always available.

This functionality allows P4 programs to be built from multiple source files, potentially produced by different programmers at different times:

  • the P4 core library, produced by the P4 language designers
  • the target architecture interfaces, specified by the target manufacturer
  • target libraries, describing extern blocks provided by the target architecture
  • user-defined and other libraries of useful components (e.g., standard protocol header definitions)
  • the P4 programs that control programmable functional blocks of a target

To aid composition of programs from multiple source files and conditional compilation of source code, the P4 language supports a small set of preprocessing directives inspired by similar C preprocessor functionality:

  • #define for defining macros without arguments / #undef
  • #if / #elif / #else / #endif / #ifdef
  • #line

8.2.1 P4 core library

Similar to the C standard library, the P4 language specification defines a core library that declares useful P4 constructs. A description of the P4 core library is provided in Appendix 21. Every P4 program implicitly includes the core library.

8.2.2 Preprocessor directives

Preprocessor directives begin with the '#' character and must appear at the beginning of a line (ignoring leading whitespace). They always occupy a separate line of source code and may end with a single-line comment.

Preprocessor directives are consumed by the lexer and aren't part of the final structure of the program. The grammar definitions used in this section are for explanatory purposes and don't form part of the actual P4 language grammar.

8.2.3 Preprocessor definitions

At every point in the program the compiler has a list of preprocessor variables that are currently defined. This includes predefined variables - which are always available - as well as user-defined variables passed to the compiler or defined previously in source code using the #define directive.

Preprocessor variables don't carry a value; they can only either be defined or undefined. They may only be used in preprocessor directives and are not otherwise available to the P4 program.

8.2.4 Predefined variables

The following table lists preprocessor variables that are implicitly defined by the compiler and available automatically at every point in the program. These variables can't be undefined or redefined by the user:

Name Description
__p4__ Defined when building P4 source code.

8.2.5 #define / #undef directives

defineDirective
    : '#' DEFINE name
    ;

undefDirective
    : '#' UNDEF name
    ;

The #define directive makes a new definition available from the point of definition to the end of the current source file. A #define directive that defines an already-defined variable has no effect.

The #undef directive removes a variable definition, making it unavailable from that point to the end of the current source file. An #undef directive that references an undefined variable has no effect.

The #define and #undef directives must appear at the beginning of a source file before any other program tokens. This doesn't include comments or other preprocessor directives, which can appear before a #define or #undef.
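
The define/undefine rules boil down to set operations. Below is a small Python model of them. The class name `Definitions` is hypothetical, and rejecting `#define` or `#undef` of a predefined variable with an error is an assumption: the text says only that such variables can't be undefined or redefined, not how a compiler should react.

```python
PREDEFINED = frozenset({"__p4__"})  # from the predefined variables table


class Definitions:
    """The preprocessor variables defined at the current point of one source file."""

    def __init__(self, command_line=()):
        # Predefined names plus user-defined names passed to the compiler.
        self._names = set(PREDEFINED) | set(command_line)

    def define(self, name):
        if name in PREDEFINED:
            # Assumption: the compiler reports an error here.
            raise ValueError(f"cannot redefine predefined variable {name}")
        self._names.add(name)  # no effect if already defined

    def undef(self, name):
        if name in PREDEFINED:
            # Assumption: the compiler reports an error here.
            raise ValueError(f"cannot undefine predefined variable {name}")
        self._names.discard(name)  # no effect if not defined

    def is_defined(self, name):
        return name in self._names
```

Since variables carry no value, a set is enough. A fresh `Definitions` object per source file would model the "to the end of the current source file" scoping.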

8.2.6 #if / #elif / #else / #endif directives

ifDirective
    : '#' IF ppExpression
    ;

elifDirective
    : '#' ELIF ppExpression
    ;

elseDirective
    : '#' ELSE
    ;

endifDirective
    : '#' ENDIF
    ;

ppExpression
    : name
    | '(' ppExpression ')'
    | '!' ppExpression
    | ppExpression AND ppExpression // &&
    | ppExpression OR ppExpression  // ||
    ;

The #if directive is used for conditional compilation and it defines a region of code that will be parsed only if the associated expression evaluates to true. The affected code is the region from the line following the #if directive to the line preceding the next matching #elif, #else or #endif directive.

The #elif directive is used to provide a new condition to test when the previous #if or #elif directive evaluated to false. The #else directive is used to include a region of code only if the previous condition evaluated to false.

Every #if directive must be followed by zero or more #elif directives, then by an optional #else directive, and finally one #endif directive. It's possible to nest conditional compilation blocks.

The expression in an #if or #elif directive involves one or more preprocessor variables and can only contain the logical operators !, && and ||. These operators have the same precedence rules as they have in regular P4 expressions. Preprocessor variables evaluate to the logical value true if they are defined and evaluate to false otherwise.
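
The nesting and branch-selection rules of this section can be modelled with a stack of frames, one per open #if. The Python below is an illustration under simplifying assumptions: `select_lines` is a made-up name, directives are recognised only when written without a space after '#', trailing comments on directive lines are not stripped, and the truth of a condition is delegated to a caller-supplied `is_true` function.

```python
def select_lines(lines, is_true):
    """Return the lines that survive conditional compilation.

    is_true(expr) decides the truth of the text following #if or #elif.
    """
    # One frame per open #if: [parent_active, branch_taken, active_now, seen_else]
    stack = []
    kept = []
    for line in lines:
        words = line.strip().split(None, 1)
        head = words[0] if words else ""
        arg = words[1] if len(words) > 1 else ""
        active = stack[-1][2] if stack else True
        if head == "#if":
            taken = active and is_true(arg)
            stack.append([active, taken, taken, False])
        elif head == "#elif":
            if not stack or stack[-1][3]:
                raise SyntaxError("#elif without a matching #if")
            frame = stack[-1]
            # Tested only when no earlier branch of this block was taken.
            now = frame[0] and not frame[1] and is_true(arg)
            frame[1] = frame[1] or now
            frame[2] = now
        elif head == "#else":
            if not stack or stack[-1][3]:
                raise SyntaxError("#else without a matching #if")
            frame = stack[-1]
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
            frame[3] = True
        elif head == "#endif":
            if not stack:
                raise SyntaxError("#endif without a matching #if")
            stack.pop()
        elif active:
            kept.append(line)
    if stack:
        raise SyntaxError("missing #endif")
    return kept
```

A nested block is kept only if its enclosing block is active, which is why each frame records `parent_active`. Passing `lambda expr: expr in defined` as `is_true` is enough for conditions that are a single variable name.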

8.2.7 #line directive

lineDirective
    : '#' LINE INTEGER
    | '#' LINE INTEGER STRING_LITERAL
    ;

The #line directive is used to provide the compiler with mapping information that can be used to translate locations in the input source file to locations in the original source file. This is useful to provide better diagnostics when the input source file was generated from another file, potentially written in a different language.

The #line directive provides a line number and optionally the original source file name that apply to the block of source code following it. The compiler applies the new line numbering starting with the next source code line.


[design] Renaming "blackbox"

It seems the best option we've come across so far (inspired by DanT's email a while ago) is "extern". To define a new blackbox type, you can use the keyword "extern_type". To create a blackbox instance, you start with "extern". In that sense, "extern" is still a type qualifier, not a type in itself.


An example for counter would look like the following.

/* counter type definition */
extern_type counter {
    /* attribute and method declaration */
}

/* counter instantiation */
extern counter per_prefix_counter {
    type : packets;
    direct : ipv4_lpm;
}

[design] Concurrency model for P4

P4 lacks a formal concurrency model. I can see at least two scenarios that demand such a model.

  1. Interactions between the control and data plane: What happens when the controller changes a table entry while a packet is being processed in the data plane?
  2. Interactions between multiple packet processors in the data plane: These packet processors could be match-action tables, pipelines, or cores. Further, these packet processors can share state. As an example, consider flowlet switching from the SIGCOMM 2015 P4 tutorial. The state stored in register flowlet_id is shared by two tables:
    • It is read in the action lookup_flowlet_map in the flowlet table
    • It is later written in the action update_flowlet_id in the new_flowlet table.

P4 doesn't clarify behavior for these scenarios. For instance, in scenario 1, is the table entry guaranteed to be either the old or the new entry, but not some muddled combination? In scenario 2, is the value of flowlet_id read by one packet in lookup_flowlet_map guaranteed to be the value of flowlet_id written by the previous packet in update_flowlet_id?

I think we need a concurrency model for such scenarios. One conservative start is to forbid any state sharing and further guarantee that any state updated by a packet processor (table, pipeline, or core) is visible to the next packet, i.e., actions within a packet processor are atomic.

But to span networking devices that permit shared state, a more expansive model might be required: for instance, we could make an entire control flow block "atomic". Semantically, this atomic control flow block would process exactly one packet at a time. A compiler would then generate a pipelined implementation guaranteeing these semantics that processes multiple packets concurrently.
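
As a rough illustration of the "atomic control flow block" idea, here is a Python sketch in which a lock stands in for whatever mechanism a target would use. All names (`AtomicControlBlock`, `process`, `run`) are hypothetical, and the read-then-increment body is a stand-in for the flowlet example's lookup_flowlet_map and update_flowlet_id actions, not their real logic.

```python
import threading


class AtomicControlBlock:
    """A control block that semantically processes exactly one packet at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.flowlet_id = 0  # shared register
        self.seen = []       # (packet, register value read by that packet)

    def process(self, packet):
        # The whole read-modify-write sequence is one atomic step.
        with self._lock:
            current = self.flowlet_id       # read, as in lookup_flowlet_map
            self.seen.append((packet, current))
            self.flowlet_id = current + 1   # write, as in update_flowlet_id


def run(num_packets, num_workers=4):
    """Feed packets to one block from several concurrent workers."""
    block = AtomicControlBlock()

    def worker(start):
        for packet in range(start, num_packets, num_workers):
            block.process(packet)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return block
```

Under this model every packet reads exactly the value written by the packet processed immediately before it, whichever worker handled it. A real compiler would have to preserve that guarantee while pipelining, rather than serialising packets as the lock does here.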
