Minimal set of low level words to build forth, about forthhub/discussion

Comments (48)

monsonite commented on June 3, 2024

You could look at sectorForth, implemented in under 512 bytes, to fit in the boot sector, in x86 assembly language

https://github.com/cesarblum/sectorforth

It was inspired by Bernd Paysan's post from 1996, using 8 primitives plus KEY and EMIT.

Have a look at the "Hello World" example to see how the language is brought up from scratch.

https://github.com/cesarblum/sectorforth/blob/master/examples/01-helloworld.f

from discussion.

pebhidecs commented on June 3, 2024

On 12/27/2020 at 9:29 PM, "kt97679" wrote:

Hi folks, I was curious how many low level words do you need to build forth. So I took [sod32](https://lennartb.home.xs4all.nl/sod32.tar.gz) by Lennart Benschop, which uses 32 low level words, and started to redefine low level words via other low level words. The whole process can be seen [here](https://github.com/kt97679/forth-dev/commits/reduce_opcodes). At the end of the day I managed to reduce the low level words to 7 primitives: nop, nand, !, @, um+, special, lit. The inner interpreter has additional logic for 'exit' and 'call'. I was running tester.fr to ensure all words behave as expected and also used run time to track performance. My final implementation runs 708 times slower than the original sod32. This whole effort can hardly be used in practice, but it was a fun little project. I wonder if we really need 3 primitives to access memory: @, ! and lit? Please let me know if there is any trick I can use to reduce this.

While the three word Forth might be implemented with just the ability to communicate with a terminal device and the words XC@, XC!, and @execute, the sensible minimum is something a little more wordy. Peter Knaggs and I put together this proposal for the 2015 conference: <http://www.euroforth.org/ef15/papers/knaggs.pdf>

I understand C. H. Ting has a similarly sparse set of primitives.

While many Forth words can be implemented in terms of others, I expect that you would only do so as part of bootstrapping a system onto a totally new architecture where there are no helpful resources. In the end, gaining reasonable performance will demand re-coding for efficiency.

Regards,
Paul E. Bennett IEng MIET, Systems Engineer

kt97679 commented on June 3, 2024

Hi Paul,

do you know by chance how we can implement arithmetic and boolean operations just using XC@, XC!, and
@execute? With all honesty I have no idea how this can be done.

Thanks,
Kirill.

pebhidecs commented on June 3, 2024

As I understand it, those three words are used to insert and check the machine code instructions of the processor you are working with in order to construct the rest of the Forth VM primitives to a level where you can continue more easily. Think of them as the Peek and Poke and JUMP via Vector set.

Regards,
Paul E. Bennett

mitra42 commented on June 3, 2024

I think I'd want to see the rest of, for example, Ting's set implemented in these 7 words.

I started with Ting's set when building webForth, though there were a few more words that made a HUGE difference when coded as primitives - especially find or some part of it.

mitra42 commented on June 3, 2024

@paul if XC!, XC@ etc. are used to write machine code, then I don't see this as a minimum set, because all it's done is move "code" into forth.

@kiril - what is "special" in your list?

kt97679 commented on June 3, 2024

@mitra42 special is used to call os functions like read and write files. It can be implemented via syscalls.

mitra42 commented on June 3, 2024

A lot of the exercises in this tend to just implement a virtual machine (for example with a huge switch statement), and then define a few primitives on top of that, and then define forth in those primitives - for example I could define forth in one word opcode then define "rot" as 5 opcode. But if I do that, then I haven't really made the port any easier since I've still got to define about 30 words in my switch statement. It doesn't look like you've done this, but I still can't see for example how to build rot from your 7 words: nop, nand, !, @, um+, special, lit. Can you define rot in those words so we can see your approach?

kt97679 commented on June 3, 2024

Here is the new rot definition: kt97679/forth-dev@2a3b456#diff-6e5f6da6018a5fcb663d9b6fdbe51567e1bfb78e4cb8827663465886e0ac9547R120 You can check other commits to see how I replaced low level opcodes with high level definitions.

mitra42 commented on June 3, 2024

So if I read that commit correctly, you are using some working variables (at addresses 8, 12, 16) to allow things like ROT and SWAP to be defined.

(By the way, it's a coincidence that my example opcode two comments above uses the same word as SOD32 does.)

kt97679 commented on June 3, 2024

Yes, location 0 holds stack pointer, location 4 holds return stack pointer, locations after that are "registers" to manipulate stack values.
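
The layout described above can be sketched in Python. This is only an illustration under stated assumptions, not kt97679's actual definition: memory is a flat list of cells, addresses are cell indices rather than the byte addresses (0, 4, 8, ...) used in the thread, the stack grows downward, and every name (`fetch`, `store`, `push`, `pop`, `T0`..`T2`) is hypothetical. Python's `+` and `-` stand in for what would be built from `um+` in the real system.

```python
# Toy model: memory is a list of cells, indexed by cell number (an assumption).
SP = 0                # cell 0 holds the data stack pointer
T0, T1, T2 = 2, 3, 4  # scratch "registers" placed after the two stack pointers

mem = [0] * 64
mem[SP] = 64          # empty stack; it grows downward from the top of memory

def fetch(addr):      # models the primitive @
    return mem[addr]

def store(value, addr):  # models the primitive !
    mem[addr] = value

def push(value):
    store(fetch(SP) - 1, SP)       # move the stack pointer down one cell
    store(value, fetch(SP))        # write the value at the new top

def pop():
    value = fetch(fetch(SP))       # read the cell the stack pointer names
    store(fetch(SP) + 1, SP)       # move the stack pointer back up
    return value

def rot():            # ( a b c -- b c a )
    store(pop(), T2)  # c
    store(pop(), T1)  # b
    store(pop(), T0)  # a
    push(fetch(T1))
    push(fetch(T2))
    push(fetch(T0))

for v in (1, 2, 3):
    push(v)
rot()
result = [pop(), pop(), pop()]     # top of stack first
print(result)
```

Because the stack itself lives in ordinary memory, the stack shuffling words need nothing beyond fetch, store and pointer arithmetic, at the cost of several memory accesses per word.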

Bushmills commented on June 3, 2024

"I still can't see for example how to build rot from your 7 words: nop, nand, !, @, um+, special, lit. can you define rot in those words."

One can build flip flops and address decoders from NANDs, SRAM and registers from flip flops and decoders, and stacks from SRAM and register (if not implemented as cascaded shift register). From there, it needs moving stack tops between two stacks and from/to a temporary holding register (to avoid, at this point, swap), and voila, there's your ROT :)

monsonite commented on June 3, 2024

IMHO 7 primitives is a bit too few - and this is confirmed in the 708 times slower performance.

Both C.H. Ting and C.H. Moore settled on "about" 32 primitives as the minimum sub-set to allow efficient synthesis and execution of Forth.

Needless to say, you can do a lot with fewer instructions.

The PDP-8 was a perfectly viable machine capable of running 4K FOCAL and BASIC with just TAD, AND, DCA, ISZ, JMP, JMS plus the modify accumulator instructions (Clear, Complement, Shift left, shift right, set and clear carry etc) that were made available through the OPR class of instruction. I think a minimum of 16 and a maximum of 32 primitives would be needed depending on the machine architecture.

kt97679 commented on June 3, 2024

@monsonite yes, I mentioned at the very beginning that there is not much practical use in reducing the number of primitives. This is more of a theoretical question: what is the most orthogonal set of words that can be used to build forth and can't be reduced further?

jacereda commented on June 3, 2024

If it's a theoretical question you might be interested in https://en.wikipedia.org/wiki/One-instruction_set_computer

kt97679 commented on June 3, 2024

@jacereda yes, I'm aware of the OISC, but I'm specifically interested in forth primitives that can be used to define the rest of the system. It is absolutely possible to implement forth in OISC, but in this case OISC single command will be used as an assembler, not as a forth primitive. And we will need to implement stack and everything else.

Bushmills commented on June 3, 2024

given that the logic of a NAND gate is enough to derive all other gates from it, and all kinds of gate logic in turn enough to model all primitives Forth would ever want to use, it seems that AND and 0= should all be needed, in addition to those words needed to glue them together such that new words can be written and executed - a NAND primitive alone shouldn't be enough to bootstrap a full system from it, as the NAND doesn't know about how to "connect" them, that is, how to build new words from it. It may also be a bit hefty to, say, having to implement a stack from NANDs alone. But as the question was about "needed", I suppose that required effort to implement a minimal system can be disregarded. Or is practicality also an aspect?
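
As a rough illustration of how far a bitwise NAND goes, the following Python sketch derives the other logic words from `nand`, and negation and subtraction from `nand` plus `um+` (both taken from the 7-primitive list earlier in the thread). The 32-bit cell width and all function names are assumptions made for the example.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit cells

def nand(a, b):
    return ~(a & b) & MASK

def um_plus(a, b):
    """Unsigned add returning (sum, carry), modelling um+."""
    total = a + b
    return total & MASK, total >> 32

def invert(a):
    return nand(a, a)

def and_(a, b):
    return invert(nand(a, b))

def or_(a, b):
    return nand(invert(a), invert(b))

def xor(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def negate(a):
    # Two's complement: invert all bits, then add one.
    return um_plus(invert(a), 1)[0]

def minus(a, b):
    return um_plus(a, negate(b))[0]

print(and_(0b1100, 0b1010), or_(0b1100, 0b1010), xor(0b1100, 0b1010), minus(7, 5))
```

None of this addresses the "glue" the comment mentions: the logic words are derivable, but something else still has to sequence them and define new words.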

alexdowad commented on June 3, 2024

given that the logic of a NAND gate is enough to derive all other gates from it, and all kinds of gate logic in turn enough to model all primitives Forth would ever want to use, it seems that AND and 0= should all be needed...

Digital logic doesn't just require gates, but also something like flip-flops; some kind of memory.

sturem commented on June 3, 2024

Ah, but with (several) NAND gates you can make a flipflop...

Bushmills commented on June 3, 2024

And the same with memory: have enough flip-flops (from NANDs) and decoders (from NANDs), and you can build any amount of RAM (from just NANDs)
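
The flip-flop claim can be checked with a small simulation. The sketch below models an SR latch made of two cross-coupled NAND gates with active-low inputs and iterates the gates until the outputs settle; the function names and the settle loop are assumptions of this toy model, not a circuit-accurate simulation.

```python
def nand(a, b):
    return 0 if (a and b) else 1

def sr_latch(set_n, reset_n, q, q_n):
    """Cross-coupled NAND latch with active-low inputs; iterate until stable."""
    for _ in range(4):
        new_q = nand(set_n, q_n)
        new_q_n = nand(reset_n, new_q)
        if (new_q, new_q_n) == (q, q_n):
            break
        q, q_n = new_q, new_q_n
    return q, q_n

state = (0, 1)                        # start with Q = 0
state = sr_latch(0, 1, *state)        # pulse "set" low
after_set = sr_latch(1, 1, *state)    # release inputs: the latch holds Q = 1
state = sr_latch(1, 0, *after_set)    # pulse "reset" low
after_reset = sr_latch(1, 1, *state)  # release inputs: the latch holds Q = 0
print(after_set, after_reset)
```

With both inputs released the outputs keep their previous value, which is the one bit of memory that RAM built from NANDs would repeat many times over.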

GarthWilson commented on June 3, 2024

When I try responding by email, it doesn't work; so I'm trying again in the github page:

True; but then you also need to be able to tri-state I/O bits. The only way you can do that with a NAND is if you add a separate output-enable input, or if the output is the wire-OR kind of thing which is a power hog and slows things down.

On 1/18/22 11:02 PM, Stuart wrote:

Ah, but with (several) NAND gates you can make a flipflop...

Bushmills commented on June 3, 2024

You don't need to tri-state outputs. You can also connect them as long as not both states are driven, as with open collector outputs. Those connect to a common usually high potential bar, pulled high by a pull-up, and any switched gate will pull it low. Or any number of active gates, doesn't matter. That's called a "wired or" and an alternative to tri-state when wanting to connect multiple outputs together.
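
The behaviour of such a shared line is simple enough to state as a one-line model. In this sketch (the function name and the boolean encoding are assumptions), the line reads high unless at least one open-collector driver is switched on and pulling it low.

```python
def bus_level(pulling_low):
    """Level of an open-collector line with a pull-up resistor.

    pulling_low: iterable of booleans, True where a driver's output
    transistor is switched on and sinks the line to ground.
    """
    return 0 if any(pulling_low) else 1

print(bus_level([False, False, False]))  # nobody drives: the pull-up wins
print(bus_level([False, True, False]))   # one driver pulls the line low
print(bus_level([True, True, False]))    # several drivers: still just low
```

Any number of drivers may pull low at once without conflict, since no output ever drives the line high.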

GarthWilson commented on June 3, 2024

Right. As I said. That makes it a slow power hog though.

Bushmills commented on June 3, 2024

Would a minimal set of primitives, built from NANDs, really care much about power? But another problem may arise when trying to build them from NANDs only, which I didn't consider in my first post: the lack of concurrency when "connecting" them in software.

SirWumpus commented on June 3, 2024

One of the IOCCC 1992 winners wrote a tokenised indirect threaded Forth (like) with 13 primitives. I haven't looked at it in years, but it might be a fun snow day activity to study: https://www.ioccc.org/years.html#1992_buzzard.2

scherrey commented on June 3, 2024

Ulrich Hoffmann has also made considerable efforts to identify the useful minimum that will get you a practical Forth. See his presentation "Forth - The New Synthesis: Growing Forth with preForth and seedForth" from FOSDEM 2020 here: https://archive.fosdem.org/2020/schedule/event/forth_new_synthesis/ . preForth has 13 primitives and seedForth has 31.

niclash commented on June 3, 2024

On this subject, there are plenty of one-instruction and even zero-instruction CPUs, and one of my favorites is the SUBLEQ computer; see Gary Explains: https://www.youtube.com/watch?v=jRZDnetjGuo (other resources are available too, but this one is fairly easy to understand by comparison).

So, with a SUBLEQ instruction, one should be able to bootstrap the rest of a FORTH system. And it should be fairly easy to implement that in an FPGA or with discrete ICs in hardware.

Enjoy
Niclas
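
For readers who have not met SUBLEQ, here is a minimal interpreter sketch in Python. Each instruction is three cells `a b c`: subtract `mem[a]` from `mem[b]`, and jump to `c` if the result is less than or equal to zero. Halting on a negative jump target is a common convention that is assumed here, and the function name, step limit and sample program layout are all made up for the example.

```python
def run_subleq(mem, pc=0, max_steps=10_000):
    """Run SUBLEQ: mem[b] -= mem[a]; jump to c if the result <= 0.

    A negative program counter halts (an assumed convention).
    """
    steps = 0
    while pc >= 0 and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem

# Program: Y = Y + X, using Z as scratch. Data lives at X=9, Y=10, Z=11.
program = [
    9, 11, 3,    # Z -= X        (Z becomes -X), continue at 3
    11, 10, 6,   # Y -= Z        (Y becomes Y + X), continue at 6
    11, 11, -1,  # Z -= Z        (Z becomes 0), then halt
    7, 5, 0,     # X = 7, Y = 5, Z = 0
]
result = run_subleq(program)
print(result[10])
```

Even a plain addition takes three instructions here, which hints at why a Forth bootstrapped this way would treat SUBLEQ as an assembler target rather than as a Forth primitive.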

kt97679 commented on June 3, 2024

Hi folks, I would like to clarify that my original research was about discovering an orthogonal minimal set of words sufficient for building the whole forth system. Alan Kay once described the fundamental parts of lisp as "Maxwell's equations of software" (<https://www.gnu.org/software/mes/manual/html_node/LISP-as-Maxwell_0027s-Equations-of-Software.html>) and I was curious what a forth analog would be. This was not about building the smallest forth in the world or using a cpu with the smallest set of commands.

Thanks,
Kirill.

jwoehr commented on June 3, 2024

1 (one) word. Probably Chuck would point out that all you need is an operator that can compile a byte into RWX memory.

kt97679 commented on June 3, 2024

I'm afraid 1 word will not work. Memory modification can't be used to implement arithmetic operations.

jwoehr commented on June 3, 2024

Sure it can be. I specified RWX memory. You use the 1 operator to compile an arithmetic word, then do your operation!

kt97679 commented on June 3, 2024

May I ask you to clarify what you mean by RWX memory? I feel I'm missing something here.

jwoehr commented on June 3, 2024

Read-Write-Execute memory. Most memory management these days doesn't allow a process to execute code from a memory region that same process can write to. Security, right?!

What I said was perhaps tongue in cheek :) LISP can be coded with four opaque operators, because it's not a machine language, it's a virtual machine with four basic primitives (ideally, not efficiently). Forth, on the other hand, is The Greatest Macro Assembler In The Universe with a cool little threading trick and a couple of stacks.

So "ideally" (ha-ha) you'd only need one opaque operator, one that allowed you to write a byte to RWX memory, and you could build the rest of Forth by hand from there. That's more or less how Chuck built ColorForth. He got tired of MASM and started using MS-DOS DEBUG to write bytes to memory. He had the Intel 80x86 opcodes memorized.

paraplegic commented on June 3, 2024

Embedded, bare metal, perhaps. Modern operating systems expend a fair bit of effort to ensure that the text (program) memory segments cannot be modified at runtime.

kt97679 commented on June 3, 2024

I'm terrible at explaining. I'm not looking for the word to generate machine language. I'm looking for the minimal set of the forth words that can be used to define all other words of the complete forth system.

iru- commented on June 3, 2024

Hi Kiril,

You are welcome to read my (code) take on this matter in https://github.com/iru-/nopforth, which is a Forth dialect I've been writing for my own use for the past 4 years. It is definitely not a try at reaching a minimum set of primitives, but may provide another data point.

By the way, porting it to arm64 has organically reduced some of the "primitives" compared to the x86-64 code.

Kind regards,
Iruatã

iru- commented on June 3, 2024

Embedded, bare metal, perhaps. Modern operating systems expend a fair bit of effort to ensure that the text (program) memory segments cannot be modified at runtime.

They sure do!
Check the dance I have to do to compile arm64 instructions on macOS on Apple Silicon at runtime: https://github.com/iru-/nopforth/blob/main/src/arm64/boot.s#L612

RGD2 commented on June 3, 2024

I'm not sure your question makes as much philosophical sense as it at first appears to, but I'll do my best. Possibly relevant (or not), but if you look at cross.fs, basewords.fs, nuc.fs in here: https://github.com/jamesbowman/swapforth/tree/master/j1a/ These are run with gforth like `gforth cross.fs basewords.fs nuc.fs` to produce a nuc.hex image which is then loaded into sram of a forth machine implemented as a non-virtual machine in an FPGA (or you can run it in a simulated version of the 'SoC' using verilator, if you don't have the hardware). The description of nuc.fs is 'just enough to let the j1 (hardware) finish bootstrapping the rest of swapforth', which itself is a mostly ans-forth compliant 16bit forth. (Well enough that I've been able to use it to control lab hardware). This is likely not anything like a true minimal set, but it's interesting in that James would have been motivated to keep nuc.fs relatively small. Also, I've been able to get something like 69MHz clock rate (each clock executes one word, most of the time) out of actual hardware with some tweaking (mostly just overclocking, there is a new timing-driven placer that might do better now). Not bad for 'low power' FPGA hardware. The standard j1a runs with a 12MHz clock, and goes fast enough to produce a 1 MHz square wave with just: ```forth : toggle begin 0 2 io! 1 2 io! again ; toggle ``` Squirted over a serial port at it... I expect the answer to your inquiry actually depends a great deal on exactly what forth 'machine' you are running, and how it works. This J1 core is basically just a single-cycle finite state machine which has two actual hardware stacks and separate IO and SRAM busses, and pulls data 16 bits at a time so it can run every non-fetch word in one clock cycle. Essentially a CPU that has no registers: even the IP is basically just 'top of return stack'. 
But it can also access the top few 'stack' items simultaneously - something you can't do if you're implementing your stacks using a hardware register just to hold a pointer into ram somewhere: you end up putting the top few stack places into your register file, and this requires some time spent on 'housekeeping' which the J1a doesn't need to bother with. Because of its design, it can implement either a jump or a call to anywhere in its (admittedly tiny) memory space, with just one instruction fetch. But it isn't a VM like pretty much all other non-hardware forth implementations: it's using the forth execution model directly as actual hardware.

Forth kind of has to build up from the bottom, and you're asking, 'what's the smallest set I can start with from the bottom?' (essentially). This is just going to be so dependent on what you will be executing on, as well as what you consider necessary capabilities for your 'done' FORTH to have.

James could have used the 'three word forth' method, and just had cross.fs talk directly over a serial link to the 3WF to transfer binaries compiled entirely from within gforth. But the way he does it means the implementation could be bootstrapped from another, simpler forth implementation having just enough for cross.fs to run. And that just so happens to be some kind of implementation of your question.

The other example I would suggest you examine would be a forth implementation in JavaScript, like jeforth.3ce. It does something recognisably ans-forth after implementing the forth machine in JS. A JS forth implementation might be a good choice for running cross.fs for swapforth also.

The thing about lisp is that it's really a 'top down' designed language that happens to be so simple one barely needs to implement anything to get it to work. There's now a LISP implementation *with a garbage collector* that fits in a DOS boot sector...
https://github.com/jart/sectorlisp (There's an interesting x86-64 'visual vm' simulator called blinkenlights being shown in animated gifs there, which would be handy if you want to go for the answer to your question in an x86-specific design). According to https://justine.lol/sectorlisp2/, sectorLISP is just 436 bytes, even smaller than sectorFORTH, which weighs in at 491.

The trouble is that LISP is a 'machine' somewhat like a turing machine, that 'lives' in mental space: the virtual space of all mathematics. It's made of 'high level conceptualisation', and it is 'simple' such that so long as you have basically infinite ram (like any good turing machine) it's trivial to implement, and the details only get complex because of the finite limitations of real hardware.

To have a FORTH, you've got to choose specifics: what does the architecture look like? And then decide how to implement a forth 'VM' within that. (Where is RAM? How is it accessed? Where are the stacks, and how are they accessed? How is the program stored and where is it loaded from?) Because you're working bottom up, not top down. The forth spec is just defining 'a bottom' to start with.

You can of course define a LISP in FORTH (ie, retroforth). Maybe what you're really asking is how much LISP you need to be able to then bootstrap a FORTH emulator?

Something has to start with some kind of ISA / choice of opcode which then feeds some kind of finite state machine, the details of which decide what words need to be implemented, and how, in order to arrive at a system that behaves like a FORTH, which can then accept further code to eventually end up with (I assume) something like an ANS-forth compliant dictionary.

Doesn't the J1a processor, itself, within its opcode format, define the very 'lowest' level of words, upon which the rest of the swapforth implementation (nuc.fs) builds? They're all there in basewords.fs.
After that 'swapforth.fs', run on the J1a, does what you expect and defines enough words to arrive at a useful set which ends up with something very nearly an ANS-forth compliant dictionary.

So, whilst the 'one instruction computer' answers others are giving sound disingenuous, they do seem relevant to me. Your file could just be a stream of binary that ends up implementing a forth system. Are you asking what the minimal size of that file could be?

In LISP, code is data; whereas in FORTH, data is code: you're looking down, not up. Forth code 'lives' equally validly transmitted across a serial line as sitting in a file, or being held in a block on a forth system that does that.

I'd argue that as-loaded from the FPGA's bitfile image read from the eeprom chip about a millisecond after power up, the J1a is a FORTH. This remains true even if that image came from just nuc.hex and not the 'full' compiled j1a.hex (which normally results in a bitfile image that boots straight into the nearly-ans-forth-compliant swapforth interpreter waiting on the next byte to come in via the serial line).

Swapforth is probably one of the 'biggest' forth implementations, considered by the size of that file, because it's got just the configuration switches in the FPGA to implement itself in, but it does so with the effect that the black box having an rs232 serial connection and a few other pins usably implements a FORTH system. That file contains not only the initial ram contents of the resulting machine, but also all the bits necessary to make the FPGA into the machine.

The opposite extreme would be GreenArrays' chips - they take forth (their version) code directly. Also basically over a serial line. They're hardwired to be a forth. IIRC they found that letting their processors run straight from source was faster than loading pre-compiled binaries. Those execute at something like 400 M words / sec, although they also don't actually have a system clock nor clock rate.
Their 'serial ports' include bidirectional handshaking, so data flows only when it can be accepted, and everything is asynchronous and just 'ready when it's done'. I bet the minimal set of data to make those chips behave recognisably like a FORTH is smaller than sectorLISP.


-- Remy


Bushmills commented on June 3, 2024

It might also be necessary to describe better what a "low level word" is, as its definition has a massive impact on the number of "low level words" needed. Must they be Forth primitives, as required by any standard of choice? Can they be created for the sole purpose of providing tools for building a Forth? Do they have to correspond with, or can they be built from, executable instructions of a host CPU? Different answers to these questions will result in a different set, in both composition and size, of the "minimal set".
I assume that those low level "words" will be combined with each other, in such a way that they can be, say, executed in sequence, to form new, more complex behaviour. If those words from the minimal set don't need to be existing Forth primitives, there's a simple way to reduce the number of needed words to one already: I implement a low level word, which I call "dispatch". It receives an argument, and depending on its value, branches to one or another mode of behaviour coded into it. Say, calling it with 1 results in code which duplicates top of stack, 2 causes it to behave in such a way that top of stack is discarded, 3 makes it swap the two top stack items, and so on. Everything in one single low level word. Therefore, a complete Forth can be built from just one single low level word.
"But that's cheating" you might say - in that case, so are all solutions requiring some external processing which is only interfaced by a small set of low level words to reference it, like memory architectures which yield results by appropriate addressing. But then, isn't using CPU instructions also a way of using capabilities external to the Forth we want to code?
So let's reduce this further, and only require that we can combine our lowest level code expression units. Evidently we are allowed to use CPU instructions. Customising the CPU is of course one route, but not even necessary: any CPU allows us to access all aspects of its capabilities by merely putting zeroes and ones into a suitable sequence. So we combine the minimal set of "1" and "0" to obtain more complex behaviour. Say, we arrange them such that the effect, when executing a thusly built instruction, is to discard the top item of a stack. Or another, causing it to get duplicated. Therefore, all we need are zero instructions - we can, after all, build any by simply putting zeroes and ones into the proper sequence. "But that's cheating again" one may say - no, it isn't, as long as the "rules" defining what those "low level words" are don't exclude such an approach, and currently, the - nonexistent - definition of those doesn't.
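The single "dispatch" word described above can be sketched roughly as follows. This is an illustration only, written in Python rather than Forth or assembly; the opcode numbers 1 to 3 follow the comment, while opcode 4 (add) and the helper names `dispatch`, `run` and `DOUBLE` are assumptions added for the example.

```python
# Hypothetical sketch of a single "dispatch" primitive.
# Opcodes 1-3 follow the comment above; opcode 4 is an added assumption.

def dispatch(opcode, stack):
    """Apply the behaviour selected by `opcode` to `stack` (a Python list)."""
    if opcode == 1:            # duplicate top of stack
        stack.append(stack[-1])
    elif opcode == 2:          # discard top of stack
        stack.pop()
    elif opcode == 3:          # swap the two top stack items
        stack[-1], stack[-2] = stack[-2], stack[-1]
    elif opcode == 4:          # add the two top stack items (assumed opcode)
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)
    else:
        raise ValueError(f"unknown opcode {opcode}")
    return stack


def run(opcodes, stack):
    """Execute a sequence of opcodes, each one through the same primitive."""
    for opcode in opcodes:
        dispatch(opcode, stack)
    return stack


# A "higher level word" is then just a sequence of arguments to dispatch.
DOUBLE = [1, 4]   # dup +

if __name__ == "__main__":
    print(run(DOUBLE, [21]))
```

Counted this way there is one low level word, but all the usual primitives are still there as branches inside it, which is the point the comment makes about how the count depends on the definition.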


monsonite commented on June 3, 2024

Bushmills, I also use the "dispatch" technique for my bytecode interpreter MINT. However the arguments consist of printable ascii characters, either directly from the terminal input buffer, or from another area in memory. On receipt of a character, a lookup table provides the jump address of the code associated with the particular macro. Once executed, the macro returns execution to the interpreter which fetches the next character. Highly influenced by Forth, it is very much a lightweight solution, a form of shorthand. As all characters are printable ascii they can be assigned in such a way to give a high degree of human readability. It has all the usual stack operations, arithmetic and logical operations, comparisons, looping, arrays, decimal and hexadecimal input and output and the means to extend the language using a series of user macros, which are very similar to colon definitions. The main limitations are that you are restricted to single character macros - but this removes the complexity of multi-character parsing and the dictionary. The complete interpreter has been implemented on a Z80 in around 1700 bytes of assembly language. It's not Forth, but a means to provide an interactive, interpreted programming language for a variety of resource limited micros, using a limited command set, as an alternative to assembly language.
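A character-driven dispatch loop of the kind described above might look roughly like this. It is a sketch in Python, not MINT itself: the character assignments (`d`, `x`, `s`, `+`, `*`, `.`) and the function name `interpret` are made up for illustration and are not claimed to match MINT's actual command set.

```python
# Hypothetical sketch of a one-character-per-command interpreter.
# The command characters below are illustrative, not MINT's real ones.

DIGITS = "0123456789"


def interpret(source):
    """Run `source` one character at a time; return (stack, output)."""
    stack = []
    output = []

    def op_dup():
        stack.append(stack[-1])

    def op_drop():
        stack.pop()

    def op_swap():
        stack[-1], stack[-2] = stack[-2], stack[-1]

    def op_add():
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)

    def op_mul():
        b = stack.pop()
        a = stack.pop()
        stack.append(a * b)

    def op_print():
        output.append(str(stack.pop()))

    # Lookup table: one printable character selects one routine.
    table = {
        "d": op_dup,
        "x": op_drop,
        "s": op_swap,
        "+": op_add,
        "*": op_mul,
        ".": op_print,
    }

    i = 0
    while i < len(source):
        ch = source[i]
        if ch in DIGITS:
            # Accumulate a decimal number and push it.
            n = 0
            while i < len(source) and source[i] in DIGITS:
                n = n * 10 + int(source[i])
                i += 1
            stack.append(n)
            continue
        if ch != " ":
            table[ch]()        # dispatch on the character itself
        i += 1
    return stack, output


if __name__ == "__main__":
    print(interpret("3 4+."))
```

Because every command is a single character, the table lookup stands in for both the parser and the dictionary, which is the trade-off the comment describes.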


Bushmills commented on June 3, 2024

@monsonite, then you are the living proof that a single-primitive Forth (-alike) is not just a hypothetical or academic exercise :-)


kt97679 commented on June 3, 2024

Maybe my question should be transformed to "what is the absolute minimum amount of assembly code we need to write to implement ANS Forth on a 64-bit Linux system"?


iru- commented on June 3, 2024

I will again suggest that you read part of the nopforth bootstrap code. For Linux on x86-64 that’s https://github.com/iru-/nopforth/blob/main/src/x86_64/Linux.s. nopforth compiles native code, which probably makes its bootstrap a little more complex than that of threaded forths. Hence, this is not the minimum set, but it is a realistic/practical set for my purposes. Even then, the bulk of the code is in nop itself.


iru- commented on June 3, 2024

I forgot to mention that nopforth is not ANS, but I have the impression it doesn’t matter that much at this level.


Bushmills commented on June 3, 2024

CPUs capable of running 64 bit Linux may have loadable microcode (Intel, AMD) or exist as, say, Verilog source which, after compiling and synthesizing, can be loaded into an FPGA (such as RISC-V) - will any "cheats" employing customization of any of those be acceptable? Like, adding an equivalent of such a "dispatch" instruction?


RGD2 commented on June 3, 2024

Why specifically that? Although, “64-bit Linux assembler” isn’t actually very specific. I assume you mean x86-64 (aka amd64) Linux, but there are quite a few different models of 64-bit ARM CPUs running Linux now: different architecture, different assembler, different solution. There are also a few other 64-bit ISAs that Linux can be had on.

So the best way to be ‘generic’ is probably to use some kind of very portable C code… Justine’s lisp is in C, and is written with cosmopolitan, a C lib that produces binaries that will run unmodified on an x86-64 running any of (Windows, Linux, FreeBSD, macOS, NetBSD, none). And which is small and fast: it works by having a binary format that all of the above OSes will accept as an executable. Ends up being a .com file to appease Windows.

I don’t think cosmopolitan (https://github.com/jart/cosmopolitan BTW) supports any architecture other than x86-64 aka amd64. I wonder if cosmopolitan could be extended to support other ISAs in a similar way to what Apple once did? They have long had executable file formats supporting multiple ISAs on their Macintosh platform. They went from m68k to PowerPC and introduced a “fat” application format that would just run on either. I think they more recently are doing something similar for their current x86-64 -> arm64 transition.

So that all makes me think you may be back to sectorFORTH again.

Then again, if you’re able to assume Linux, there’s probably a lot of functionality you can find in-kernel to exploit with system calls for less assembler. IIRC there exists some malware that does that on Windows (because it’s a monoculture and everything is in predictable locations) so they could deliberately (mis)interpret strings of code in RAM by jumping to it to execute “words” that they didn’t have to bring along with them.
IIRC this exploited both the huge bloat of Windows as well as the fact that, with variable-length instruction encoding, it’s sometimes possible to find a pre-existing string in executable RAM that you can jump to in order to do something useful, because it is interpreted differently when your jump is out of phase with the intended program entry. That reputedly made for an even more minimal threaded-interpretive language system that made infiltration easier because it could use a smaller payload, after crashing a stack or whatever.

So, maybe using some version of Windows so you can do similar might give an even shorter answer? So long as you pick one that doesn’t load all libraries as position-independent code at randomised addresses, which I can only hope they’ve been doing for a while now, but I’m not sure.

Or: possibly more useful would be to do it on WebAssembly. Which might be very short, since I think it has stacks and already basically acts a lot like Forth. Let’s see what already exists; Google says: wasm-forth, WAForth, and “Ricardo Forth - WebAssembly edition”.

-- Remy


kt97679 commented on June 3, 2024

@iru- I looked at Linux.s but it is just an aggregator of other files, so I assume you mean arch.ns? Yes, your system looks pretty minimal, but probably it can be minimized even further? Since you are solving practical tasks this may not be reasonable in your case, since the system would become much slower.

@Bushmills let's not use any cheats.

@RGD2 I just tried to narrow down the requirements. We can definitely use any other OS and any other architecture (or, as you noted, we can use wasm). I looked into both sectorforth and sectorlisp and those are amazing projects, but they are about minimizing the whole code of the system. I'm not trying to do that. I'm trying to see how much of the system we need to define in low level code so that the rest of the system can be defined on top of that low level base. From my research I see that we can build a complete Forth system with only 9 words implemented in low level assembly. I was curious whether this is the limit or whether we can reduce the number of those words further.
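To make "the rest of the system defined on top of a low level base" concrete, here is a small sketch in Python of deriving common words from two primitives named in the opening post of this thread, `nand` and `um+`. The 32-bit cell width, the function names, and the particular derivations are assumptions for illustration, not code taken from sod32 or from the reduced-opcode branch.

```python
# Hypothetical sketch: deriving words from two primitives, nand and um+.
# Assumes 32-bit cells; names and derivations are illustrative only.

MASK = 0xFFFFFFFF  # one 32-bit cell


def nand(a, b):
    """Primitive: bitwise NAND on one cell."""
    return ~(a & b) & MASK


def um_plus(a, b):
    """Primitive: unsigned add, returning (sum, carry)."""
    total = a + b
    return total & MASK, total >> 32


# Everything below is built only from the two primitives above.

def invert(a):
    return nand(a, a)


def and_(a, b):
    return invert(nand(a, b))


def or_(a, b):
    return nand(invert(a), invert(b))


def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def plus(a, b):
    return um_plus(a, b)[0]      # drop the carry


def negate(a):
    return plus(invert(a), 1)    # two's complement


def minus(a, b):
    return plus(a, negate(b))


if __name__ == "__main__":
    print(hex(xor_(0b1100, 0b1010)), minus(10, 3))
```

Each derived word costs several primitive calls, which gives a feel for why a system reduced this far runs much slower than one with a richer primitive set.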


iru- commented on June 3, 2024

> @iru- I looked at Linux.s but it is just aggregator of other files so I assume you mean arch.ns ? Yes, your system looks pretty minimal but probably it can be minimized even further?

Yes, Linux.s is an aggregator. The real work is done by:

- x86_64/boot.s: the main routine and the bulk of what's needed in assembly to start interpreting/compiling nop code. This is mostly independent of OS, but does assume some POSIX interfaces.
- x86_64/sysv.s: utility routines to aid in interfacing with the Linux/*BSD environment around nop, such as command-line arguments and loading dynamic libraries.
- dicts.s: assembled-in dictionary

You mentioned x86_64/arch.ns. That's the only architecture-dependent source that is written in nop itself rather than in assembly. Everything included after it in Linux.s is portable.

As you correctly guessed, the assembly code can be minimized further. For fun, I've experimented, successfully, with not interpreting numbers in assembly at all, leaving that to routines in nop. The end result was a little less clear than the current code, so I didn't commit it to the main branch. Also, I'm currently porting nop to arm64, and it turned out that some of what used to be assembly ended up being implemented in arm64's arch.ns, just because it didn't need to be in assembly.

> Since you are solving some practical tasks this may not be reasonable in your case since system will become much slower.

I write and use nop for practical tasks (and having fun!). However, I never cared specifically about it being much slower or faster than it is. I just try to write nop itself in a way that both it and the programs written in it are not too complex.

As an example, initially I tried to use syscalls directly for my interaction with the OS. It turns out that unix-like operating systems don't have a very strict definition of who should implement the expected behavior of a call, IOW this behavior is usually accomplished by a combination of the syscall itself and its counterpart in libc. To avoid having to deal with this problem altogether, I stopped dealing with syscalls directly and started using libc.
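As a rough illustration of going through libc instead of issuing system calls directly, the following Python sketch loads the C library dynamically and calls its `getpid` and `write` wrappers. It is not nop code; it assumes a Unix-like system where `ctypes.CDLL(None)` exposes the libc symbols of the running process, and the helper names `libc_getpid` and `libc_write` are made up for the example.

```python
import ctypes
import os

# Assumption: a Unix-like system where passing None loads the symbols of the
# running process, which include the C library.
libc = ctypes.CDLL(None, use_errno=True)

# Declare the C prototypes so ctypes converts arguments and results correctly.
libc.getpid.argtypes = []
libc.getpid.restype = ctypes.c_int
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t


def libc_getpid():
    """Return the process id by calling libc's getpid() wrapper."""
    return libc.getpid()


def libc_write(fd, data):
    """Write bytes to a file descriptor through libc's write() wrapper."""
    written = libc.write(fd, data, len(data))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written


if __name__ == "__main__":
    libc_write(1, b"hello via libc\n")
    print(libc_getpid() == os.getpid())
```

The caller only needs the C calling convention and a symbol lookup; whatever split exists between the kernel's syscall and the library's wrapper stays hidden behind the libc function.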

Regarding performance, it doesn't seem like nop is too bad. Whereas cat(1)-ing its README on Linux takes around 0.002s, using nop's examples/cat.ns on the same file takes around 0.005s, but the latter involves bootstrapping nop (which compiles the whole nucleus on the fly), compiling cat.ns and running it.
