Coder Social home page Coder Social logo

Comments (64)

mfld-fr avatar mfld-fr commented on June 11, 2024

Not really a bug. After your change, the ETH / NE2K driver requests the IRQ11, but in irqtab.S, the interrupt for the board is still fixed to IRQ9 (IRQ11 is surrounded by #if 0). You are the first one to need to set the interrupt number else than to 9 😄.

Could you please change that piece of code and test again, just to validate the fix, before we submit a clean patch ?

! IRQ9 is used by the Ethernet device

#ifdef CONFIG_ETH
_irq9:
	push    ax
	mov     ax, #9
	br      _irqit
#endif

#if 0
_irq10:
	push	ax
	mov	ax,#10
	br	_irqit
_irq11:
	push	ax
	mov	ax,#11
	jmp	_irqit

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

The basic test for the ETH / NE2K driver is the eth command, that you could build in /elks/elkscmd/test/eth/. It tests the connection to the local network with ARP requests / replies. Don't forget to update the IP addresses in the code with the ones of your real local network, because the defaults are for QEMU as configured in qemu.sh.

from elks.

lithoxs avatar lithoxs commented on June 11, 2024

Actually, two changes are needed in irqtab.S. The one pointed to by mfld-fr and another where the pointer to the ISR is placed in the interrupt vector table.
The code for irqtab.S has changed recently. The relevant parts for you case should now be:

#ifdef CONFIG_ETH_NE2K
#define METH 0x800
#else
...

#if 0
_irq9: ! Ethernet device
call _irqit
.byte 9
_irq10: ! USB
call _irqit
.byte 10
#endif
#ifdef CONFIG_ETH_NE2K
_irq11: ! Sound
call _irqit
.byte 11
#endif
#if 0
_irq12: ! Mouse
call _irqit

from elks.

lithoxs avatar lithoxs commented on June 11, 2024

Regarding the FPU interrupt, ELKS does not know anything about the FPU.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Hello, any news on that issue ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Mmm... what is your CPU model ? Because I just remember I used some 80186 instructions in the Ne2k driver, and this could explain the failure.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Forget my last idea if you have a 80286.
About "never stops sending", could you please detail ?
Did you catch an Ethernet dump before the power supply break ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Hello, any progress in your testing ? Were you able to test the HW with another OS ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

OK, thanks for the update. So the problem is specific to ELKS.

If you could connect point-to-point your target with the NE2K card to a PC, run the eth test program, reboot the target, then run the telnet client program, and capture the packets with WireShark, it would help us to diagnose.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Could you please take a photo of your ISA card, so that we could identify the main chip and check its datasheet against the current implementation ?

You could also disable all the network applets in /etc/rc.d/rc.sysinit (ktcp, telnetd, httpd, etc), only run the /bin/eth command and check the captured packets.

And last but not least, you could add some printk() in ne2k-main.c to try to find what is looping.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Sorry, but I see no picture in your latest reply.

About /bin/eth, it looks like you forgot to set your build environment as explained in the main README : in the top directory (here .../elks-master), run . tools/env.sh before trying to invoke any make command.

Do you still have the legacy dev86 package installed on your system ? Please uninstall it if stills there : now ELKS relies on the auto-build & latest dev86 of @jbruchon, not the legacy one that is rather outdated and not able to run "in-tree".

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

By the way, I just saw that there is a mistake in /elkscmd/Makefile(114) that prevents the test programs to be automatically built : please replace ifdef $(CONFIG_APP_TEST) by ifdef CONFIG_APP_TEST. Just fixed it for the next PR.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

jbruchon avatar jbruchon commented on June 11, 2024

You cannot post attachments of any kind (other than plain text) to Github issues; they are discarded and you're posting by email. Post a link to the image on an image host such as Imgur instead.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Hey... I also just realized you are not capturing the packets on a point-to-point connection, but you are using a router. So if you are sniffing the packets behind that router, you won't be able to see the packets that are actually sent by ELKS, if they are malformed. Please make the capture directly on the NE2K side, not behind a router.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

About the switch / router : I am wondering why the traffic LED on your switch / router is blinking while you capture no packet except the ARP ones ? I cannot explain this else than thinking that your switch / router discards some packets. This is why I am suggesting to connect the NE2K socket directly to the host socket where you run the capture program.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Thanks for the photos. Unfortunately, I cannot find any datasheet for the W89C90, W89C901 or W89C902. I only found some references in the W89C94 one, but this latest is for PCI and has no detailed information on the NE2K implementation.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

About /bin/eth, I checked on my side. After fixing the mistake as described above, and selecting the test applets in the configuration menu, that program is automatically built. Did you check you have no legacy dev86 in your path, just the one in /cross ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Just a question : /bin/eth should be run alone, with all the other internet applets down. Are you sure you ran it after removing ktcp, telnetd, etc, from the configuration ? Otherwise it conflicts with ktcp on the /dev/eth resource.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

For the other tests, please proceed step by step : first remove all applets, and try to pring and connect from outside. Then start only ktcp, and try again (ping and connect). Then start only ktcp & telnetd, then retry.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Ok so then your test on the ARP loop is meaningful. It shows the data path is good, but I agree, is not stable, as the test loop does not complete. Thanks to your traces, it also shows that there is only one interrupt for two operations (write + read), where two are expected (one for the end of the write, one for data available for read). Let me dig into the code to try to understand this wrong behavior...

About 'connect from outside' : I mean initiate the connection from the host side (= inbound), with the ping and telnet clients, without any reader, to see how the driver would behave. Expected is a 'read interrupt' for each packet ELKS would receive until buffer exhaustion. And yes, ICMP looks like to be implented in ktcp, according to its sources. Please do not forget to capture the packets in P2P in all that test cases.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Thank you very much for all that test data ! Still working on it...

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

@Mellvik : I fixed the problem of the screwed characters in #210.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

@Mellvik : I fixed the problem of the endless looping in #212.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

@Mellvik : could you please rebase your IRQ 11 commits on the latest upstream master and test again on your physical machines ? This is to know if we still have a problem specific to your Ethernet adapter, in addition to the bad terminal management tracked by #211.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Let us forget the terminal stuff for now (that is reproducible in QEMU and so could be tested later without your real HW), and let us focus on the "no TX" problem with your ETH board, and only with inbound telnet to avoid any side effect of a key press in the ELKS console (select only the ktcp and telnetd applets in the configuration).

From what you describe since the beginning, I am starting to think that your board does not reset the TXP bit 2 in the command register (CR offset 00h) after transmitting the first and only packet (i.e. the ARP reply).

In current implementation, the driver puts the packet to send in the board TX buffer, then sets this bit to trigger the TX, and will block any more write / send until this bit is reset (see ne2k_tx_stat() call in ne2k_write()).

So I would add some traces in the ne2k_write() to confirm that this function completes on first call (= the observed ARP reply), and is blocked on second call (= the expected reply to TCP SYN packet from telnet), because ne2k_tx_stat() never returns NE2K_STAT_TX and ne2k_write() enters interruptible_sleep_on(), causing ktcp to hang.

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

OK, what is the meaning of R0 / R1, or W2 in that screenshot ?

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Could you also activate the commented traces in the file ktcp/deveth.c, function deveth_send(), to see if at least ktcp tries to respond to the telnet SYNs ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Sorry, but I still not understand your R0 / R1 / W2. The res variable could be a return value from a lower level function, or the size of the received / sent packet. What actual value are you displaying in your trace ?

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

And last but not least, what is the speed of this R0W2It endless loop ? Very fast or an iteration each second ? If the period is around one second, I think I understood the current behavior (there is a retry loop in ktcp_run() every second).

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Thanks for these latest info. Forget my previous though about the TXP bit, it is working as expected. So ELKS actually responds to the TCP SYNs and the packets are really sent on the wire (I guess the router traffic LED is blinking at the same speed as that loop, right ?), as we get the TX interrupt after the write.

So why do you not capture any packet from ELKS ? I guess this is because these packets are too short, and are discarded by your host. Ethernet requires a minimal packet size of 64. QEMU is more tolerant and works with little packets.

Let me prepare a patch to confirm that analysis...

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Yes, in addition to the invalid packet size, there is another problem in the select() that does not honour the timeout of 1 second before ktcp tries again to resend the SYN ACK. But let us fix that quick looping in a second step, it should not be a blocking point in the traffic.

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Here is the patch to have an ETH packet minimum size of 64 bytes.

In ne2k_main.c, line 128:

		// Client should write packet once
		// otherwise end of packet will be lost

		if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
		memcpy_fromfs (send_buf, data, len);

		if (len < 64) len = 64;  /* issue #133 */
		res = ne2k_pack_put (send_buf, len);

Could you please apply that patch and test again the inbound telnet ?

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Is this capture really made in P2P, not through the switch / router ?

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

Could you please clean and rebuild all ELKS ? I cannot understand how just adding a line to force the minimal size could cause such a big effect...

from elks.

Mellvik avatar Mellvik commented on June 11, 2024

from elks.

mfld-fr avatar mfld-fr commented on June 11, 2024

So nice 😃 !

Yes, you are right for the switch nature, I don't know why I started to consider it as a router earlier in the discussion ?..

There is still some extra and useless traffic because your HW is rather fast, coupled to the select() ineffective timeout, but it looks like we are done to make the NE2K driver & board to basically work.

Let us then close this long issue by submitting the above patch, and focus on other defective parts in dedicated issues (#213 #211 #125 #124 and so on).

from elks.

pawosm-arm avatar pawosm-arm commented on June 11, 2024

Here is the patch to have an ETH packet minimum size of 64 bytes.

memcpy_fromfs(send_buf, data, len);
if (len < 64U) len = 64U;  /* issue #133 */
...
res = len;

^^ I don't think this is right. The value returned through res isn't telling how many bytes were put into send buffer.

IMO it should be like this:

                if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
                memcpy_fromfs(send_buf, data, len);

                if (ne2k_pack_put(send_buf,
                     (len < 64) ? 64 : len /* issue #133 */
                   )) {
                        res = -EIO;
                        break;
                }

                res = len;

from elks.

ghaerr avatar ghaerr commented on June 11, 2024

Hello @pawosm-arm,

I can see you've been intricately inspecting driver source :)

You are correct on the point that the write return value isn't what the user specified when len < 64, as it is returned larger, which is usually not a good thing, since the user's buffer wasn't actually advanced. Note that the len can be returned smaller (also never handled nor needed to be in ktcp) when len > MAX_PACKET_ETH. The driver has made it clear that multiple writes for a packet are not supported.

However, the current code does return the amount of bytes sent on the wire, if you include the garbage bytes required for proper operation. It could be said it would be better to return the larger len to inform the application of what happened.

That said, the code could be changed to what you're talking about with less code using:

if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
res = len [add this statement]
...
res = len; [remove this statement]

I'll have to think which return approach would be better. In general, ktcp can't do anything different either way, and the only real purpose in the return value is to report anomalies, which is currently only checked for read returns. Write return values are ignored because of the higher probability of packet loss anyways.

Thank you!

from elks.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.