Comments (64)
Not really a bug. After your change, the ETH / NE2K driver requests IRQ11, but in irqtab.S the interrupt for the board is still fixed to IRQ9 (IRQ11 is surrounded by #if 0). You are the first one who needs an interrupt number other than 9 😄.
Could you please change that piece of code and test again, just to validate the fix, before we submit a clean patch?
! IRQ9 is used by the Ethernet device
#ifdef CONFIG_ETH
_irq9:
push ax
mov ax, #9
br _irqit
#endif
#if 0
_irq10:
push ax
mov ax,#10
br _irqit
_irq11:
push ax
mov ax,#11
jmp _irqit
from elks.
The basic test for the ETH / NE2K driver is the eth command, which you can build in /elks/elkscmd/test/eth/. It tests the connection to the local network with ARP requests / replies. Don't forget to replace the IP addresses in the code with those of your real local network, because the defaults are for QEMU as configured in qemu.sh.
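To make the ARP test more concrete, here is a minimal sketch in C of how a broadcast ARP request frame is laid out. It is not the code of the eth test program; the function name build_arp_request and the helper put_be16 are hypothetical, and the addresses are supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6
#define ETH_HDR_LEN   14
#define ARP_LEN       28
#define ARP_FRAME_LEN (ETH_HDR_LEN + ARP_LEN)   /* 42 bytes before any padding */

/* Store a 16-bit value in network (big-endian) byte order. */
void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

/*
 * Build a broadcast ARP request "who has target_ip? tell sender_ip".
 * Hypothetical helper for illustration; returns the number of bytes
 * written to frame (always ARP_FRAME_LEN).
 */
size_t build_arp_request(uint8_t *frame,
                         const uint8_t sender_mac[ETH_ALEN],
                         const uint8_t sender_ip[4],
                         const uint8_t target_ip[4])
{
    uint8_t *arp;

    assert(frame != NULL);
    arp = frame + ETH_HDR_LEN;

    /* Ethernet header: broadcast destination, our source, EtherType ARP. */
    memset(frame, 0xFF, ETH_ALEN);
    memcpy(frame + ETH_ALEN, sender_mac, ETH_ALEN);
    put_be16(frame + 12, 0x0806);

    /* ARP payload for IPv4 over Ethernet. */
    put_be16(arp + 0, 1);            /* hardware type: Ethernet */
    put_be16(arp + 2, 0x0800);       /* protocol type: IPv4 */
    arp[4] = ETH_ALEN;               /* hardware address length */
    arp[5] = 4;                      /* protocol address length */
    put_be16(arp + 6, 1);            /* opcode: request */
    memcpy(arp + 8, sender_mac, ETH_ALEN);
    memcpy(arp + 14, sender_ip, 4);
    memset(arp + 18, 0, ETH_ALEN);   /* target MAC is what we are asking for */
    memcpy(arp + 24, target_ip, 4);

    return ARP_FRAME_LEN;
}
```

The caller needs a buffer of at least 42 bytes. Note that the resulting frame is shorter than the Ethernet minimum size, which becomes relevant later in this thread.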
Actually, two changes are needed in irqtab.S: the one pointed to by mfld-fr, and another where the pointer to the ISR is placed in the interrupt vector table.
The code for irqtab.S has changed recently. The relevant parts for your case should now be:
#ifdef CONFIG_ETH_NE2K
#define METH 0x800
#else
...
#if 0
_irq9: ! Ethernet device
call _irqit
.byte 9
_irq10: ! USB
call _irqit
.byte 10
#endif
#ifdef CONFIG_ETH_NE2K
_irq11: ! Sound
call _irqit
.byte 11
#endif
#if 0
_irq12: ! Mouse
call _irqit
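As a side note on the two changes above, the numbers involved can be checked with a small C sketch. It rests on two assumptions: that METH is a bit mask with only the bit of the chosen IRQ set (0x800 is bit 11), and that the hardware IRQs keep the standard PC/AT BIOS vector mapping. Neither is confirmed here for ELKS, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask with only the bit for the given IRQ line set (IRQ 11 -> 0x0800). */
uint16_t irq_mask(unsigned irq)
{
    assert(irq < 16);
    return (uint16_t)(1u << irq);
}

/*
 * Interrupt vector number for a hardware IRQ, ASSUMING the standard PC/AT
 * BIOS mapping: IRQ 0-7 -> INT 08h-0Fh, IRQ 8-15 -> INT 70h-77h.
 */
uint8_t irq_to_vector(unsigned irq)
{
    assert(irq < 16);
    return (uint8_t)(irq < 8 ? 0x08 + irq : 0x70 + (irq - 8));
}

/* Byte offset of a vector's far pointer in the real-mode IVT (4 bytes each). */
uint16_t ivt_offset(uint8_t vector)
{
    return (uint16_t)(vector * 4u);
}
```

Under these assumptions, moving the board from IRQ 9 to IRQ 11 changes both the mask and the vector table slot that must point to the ISR stub.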
Regarding the FPU interrupt, ELKS does not know anything about the FPU.
Hello, any news on that issue ?
Mmm... what is your CPU model? I just remembered that I used some 80186 instructions in the NE2K driver, and this could explain the failure.
Forget my last idea if you have an 80286.
About "never stops sending", could you please give more details?
Did you catch an Ethernet dump before the power supply break ?
Hello, any progress in your testing ? Were you able to test the HW with another OS ?
OK, thanks for the update. So the problem is specific to ELKS.
If you could connect your target with the NE2K card point-to-point to a PC, run the eth test program, reboot the target, then run the telnet client program, and capture the packets with Wireshark, it would help us diagnose the problem.
Could you please take a photo of your ISA card, so that we could identify the main chip and check its datasheet against the current implementation ?
You could also disable all the network applets in /etc/rc.d/rc.sysinit (ktcp, telnetd, httpd, etc.), run only the /bin/eth command, and check the captured packets.
And last but not least, you could add some printk() calls in ne2k-main.c to try to find what is looping.
Sorry, but I see no picture in your latest reply.
About /bin/eth, it looks like you forgot to set up your build environment as explained in the main README: in the top directory (here .../elks-master), run . tools/env.sh before trying to invoke any make command.
Do you still have the legacy dev86 package installed on your system? Please uninstall it if it is still there: ELKS now relies on the auto-built, latest dev86 of @jbruchon, not the legacy one, which is rather outdated and not able to run "in-tree".
By the way, I just saw that there is a mistake in /elkscmd/Makefile(114) that prevents the test programs from being built automatically: please replace ifdef $(CONFIG_APP_TEST) with ifdef CONFIG_APP_TEST. I just fixed it for the next PR.
You are posting by email, and attachments of any kind (other than plain text) sent that way to GitHub issues are discarded. Post a link to the image on an image host such as Imgur instead.
Hey... I also just realized you are not capturing the packets on a point-to-point connection, but you are using a router. So if you are sniffing the packets behind that router, you won't be able to see the packets that are actually sent by ELKS, if they are malformed. Please make the capture directly on the NE2K side, not behind a router.
About the switch / router: I am wondering why the traffic LED on your switch / router is blinking while you capture no packets except the ARP ones. The only explanation I can think of is that your switch / router discards some packets. This is why I am suggesting that you connect the NE2K socket directly to the host socket where you run the capture program.
Thanks for the photos. Unfortunately, I cannot find any datasheet for the W89C90, W89C901 or W89C902. I only found some references in the W89C94 one, but that chip is for PCI and its datasheet has no detailed information on the NE2K implementation.
About /bin/eth, I checked on my side. After fixing the mistake as described above, and selecting the test applets in the configuration menu, that program is built automatically. Did you check that you have no legacy dev86 in your path, just the one in /cross?
Just a question : /bin/eth should be run alone, with all the other internet applets down. Are you sure you ran it after removing ktcp, telnetd, etc, from the configuration ? Otherwise it conflicts with ktcp on the /dev/eth resource.
For the other tests, please proceed step by step: first remove all applets, and try to ping and connect from outside. Then start only ktcp, and try again (ping and connect). Then start only ktcp & telnetd, then retry.
OK, so your test on the ARP loop is meaningful. It shows that the data path is good but, I agree, not stable, as the test loop does not complete. Thanks to your traces, it also shows that there is only one interrupt for two operations (write + read), where two are expected (one for the end of the write, one for data available for read). Let me dig into the code to try to understand this wrong behavior...
About 'connect from outside': I mean initiating the connection from the host side (= inbound), with the ping and telnet clients, without any reader, to see how the driver behaves. A 'read interrupt' is expected for each packet ELKS receives, until buffer exhaustion. And yes, ICMP appears to be implemented in ktcp, according to its sources. Please do not forget to capture the packets in P2P in all those test cases.
Thank you very much for all that test data ! Still working on it...
@Mellvik : I fixed the problem of the screwed characters in #210.
@Mellvik : I fixed the problem of the endless looping in #212.
@Mellvik : could you please rebase your IRQ 11 commits on the latest upstream master and test again on your physical machines ? This is to know if we still have a problem specific to your Ethernet adapter, in addition to the bad terminal management tracked by #211.
Let us forget the terminal stuff for now (it is reproducible in QEMU and so can be tested later without your real HW), and let us focus on the "no TX" problem with your ETH board, and only with inbound telnet, to avoid any side effect of a key press in the ELKS console (select only the ktcp and telnetd applets in the configuration).
From what you have described since the beginning, I am starting to think that your board does not reset the TXP bit (bit 2) in the command register (CR, offset 00h) after transmitting the first and only packet (i.e. the ARP reply).
In the current implementation, the driver puts the packet to send in the board TX buffer, then sets this bit to trigger the TX, and blocks any further write / send until this bit is reset (see the ne2k_tx_stat() call in ne2k_write()).
So I would add some traces in ne2k_write() to confirm that this function completes on the first call (= the observed ARP reply) and is blocked on the second call (= the expected reply to the TCP SYN packet from telnet), because ne2k_tx_stat() never returns NE2K_STAT_TX and ne2k_write() enters interruptible_sleep_on(), causing ktcp to hang.
OK, what is the meaning of R0 / R1, or W2 in that screenshot ?
Could you also activate the commented-out traces in the file ktcp/deveth.c, function deveth_send(), to see if ktcp at least tries to respond to the telnet SYNs?
Sorry, but I still do not understand your R0 / R1 / W2. The res variable could be a return value from a lower-level function, or the size of the received / sent packet. What actual value are you displaying in your trace?
And last but not least, what is the speed of this R0W2It endless loop? Very fast, or one iteration per second? If the period is around one second, I think I understand the current behavior (there is a retry loop in ktcp_run() every second).
Thanks for this latest info. Forget my previous thought about the TXP bit, it is working as expected. So ELKS actually responds to the TCP SYNs and the packets are really sent on the wire (I guess the router traffic LED is blinking at the same speed as that loop, right?), as we get the TX interrupt after the write.
So why do you not capture any packet from ELKS? I guess this is because these packets are too short and are discarded by your host. Ethernet requires a minimum packet size of 64 bytes. QEMU is more tolerant and works with shorter packets.
Let me prepare a patch to confirm that analysis...
Yes, in addition to the invalid packet size, there is another problem: select() does not honour the 1-second timeout before ktcp tries again to resend the SYN ACK. But let us fix that fast looping in a second step; it should not block the traffic.
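For reference, the behavior expected from a select() call with a timeout can be sketched in C on a POSIX host as follows. This does not reproduce the ELKS problem and is not ktcp code; the wrapper name wait_readable is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Both the set and the timeout are rebuilt before every call,
     * because select() may modify them. */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

With no data pending, a call such as wait_readable(fd, 1000) should block for about one second and return 0; a return well before that, without data, would match the fast retry loop described above.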
Here is the patch to have an ETH packet minimum size of 64 bytes.
In ne2k-main.c, line 128:
// Client should write packet once
// otherwise end of packet will be lost
if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
memcpy_fromfs (send_buf, data, len);
if (len < 64) len = 64; /* issue #133 */
res = ne2k_pack_put (send_buf, len);
Could you please apply that patch and test again the inbound telnet ?
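Note that the patch raises len but leaves whatever was already in send_buf in the extra bytes. A common variation, shown here as a C sketch and not as the submitted patch, is to zero the padding. The helper name eth_pad_frame and the cap parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_MIN_LEN 64u   /* minimum size used by the patch above */

/*
 * Pad a frame held in buf (capacity cap bytes) up to ETH_MIN_LEN with zeros.
 * Returns the length to hand to the NIC; frames that are already long
 * enough are returned unchanged.
 */
size_t eth_pad_frame(unsigned char *buf, size_t len, size_t cap)
{
    assert(buf != NULL);
    assert(len <= cap && cap >= ETH_MIN_LEN);

    if (len < ETH_MIN_LEN) {
        memset(buf + len, 0, ETH_MIN_LEN - len);
        len = ETH_MIN_LEN;
    }
    return len;
}
```

For a 42-byte ARP reply this returns 64 and clears bytes 42 to 63, so no stale data from a previous packet goes out on the wire.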
Is this capture really made in P2P, not through the switch / router ?
Could you please clean and rebuild all ELKS ? I cannot understand how just adding a line to force the minimal size could cause such a big effect...
So nice 😃 !
Yes, you are right about the nature of the switch; I don't know why I started to consider it a router earlier in the discussion...
There is still some extra, useless traffic because your HW is rather fast, combined with the ineffective select() timeout, but it looks like we have got the NE2K driver & board basically working.
Let us then close this long issue by submitting the above patch, and focus on other defective parts in dedicated issues (#213 #211 #125 #124 and so on).
Here is the patch to have an ETH packet minimum size of 64 bytes.
memcpy_fromfs(send_buf, data, len);
if (len < 64U) len = 64U; /* issue #133 */
...
res = len;
^^ I don't think this is right. The value returned through res doesn't tell how many bytes were put into the send buffer.
IMO it should be like this:
if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
memcpy_fromfs(send_buf, data, len);
if (ne2k_pack_put(send_buf,
                  (len < 64) ? 64 : len /* issue #133 */
                 )) {
    res = -EIO;
    break;
}
res = len;
Hello @pawosm-arm,
I can see you've been intricately inspecting driver source :)
You are correct that the write return value isn't what the user specified when len < 64, as it is returned larger, which is usually not a good thing, since the user's buffer wasn't actually advanced. Note that len can also be returned smaller (which is never handled in ktcp, nor needs to be) when len > MAX_PACKET_ETH. The driver has made it clear that multiple writes for a packet are not supported.
However, the current code does return the number of bytes sent on the wire, if you include the garbage bytes required for proper operation. It could be argued that it is better to return the larger len to inform the application of what happened.
That said, the code could be changed to what you're talking about with less code using:
if (len > MAX_PACKET_ETH) len = MAX_PACKET_ETH;
res = len; [add this statement]
...
res = len; [remove this statement]
I'll have to think which return approach would be better. In general, ktcp can't do anything different either way, and the only real purpose in the return value is to report anomalies, which is currently only checked for read returns. Write return values are ignored because of the higher probability of packet loss anyways.
Thank you!
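The two lengths discussed in this exchange can be separated explicitly. The following C sketch is an illustration of that idea, not the driver code: struct tx_lengths and tx_plan are hypothetical names, and the value of MAX_PACKET_ETH is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PACKET_ETH 1536u   /* ASSUMED value, for illustration only */
#define MIN_PACKET_ETH 64u     /* minimum size from the patch (issue #133) */

/* Lengths derived from one write request. */
struct tx_lengths {
    size_t wire;       /* bytes handed to the NIC, including padding */
    size_t reported;   /* bytes reported back to the caller */
};

/*
 * Compute both lengths for a write of len bytes: oversized requests are
 * truncated, undersized ones are padded on the wire only.
 */
struct tx_lengths tx_plan(size_t len)
{
    struct tx_lengths t;

    if (len > MAX_PACKET_ETH)
        len = MAX_PACKET_ETH;
    t.reported = len;   /* what the caller's buffer actually supplied */
    t.wire = len < MIN_PACKET_ETH ? MIN_PACKET_ETH : len;
    assert(t.reported <= t.wire);
    return t;
}
```

For a 42-byte write, this gives 64 bytes on the wire and 42 reported, which corresponds to the first return approach; returning the wire field instead corresponds to the current code.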