Comments (55)

earlephilhower avatar earlephilhower commented on May 20, 2024

Try latest release 0.9.2. I tested using that lib myself, but forgot to git add that header.

from arduino-pico.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Thanks, EEPROM now works fine! The Adafruit ILI9341 library gets a bit further but gives these errors:

C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp: In constructor 'Adafruit_SPIDevice::Adafruit_SPIDevice(int8_t, int8_t, int8_t, int8_t, uint32_t, BitOrder, uint8_t)':
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp:51:48: error: 'digitalPinToPort' was not declared in this scope
51 | csPort = (BusIO_PortReg *)portOutputRegister(digitalPinToPort(cspin));
| ^~~~~~~~~~~~~~~~
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp:51:29: error: 'portOutputRegister' was not declared in this scope
51 | csPort = (BusIO_PortReg *)portOutputRegister(digitalPinToPort(cspin));
| ^~~~~~~~~~~~~~~~~~
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp:52:15: error: 'digitalPinToBitMask' was not declared in this scope
52 | csPinMask = digitalPinToBitMask(cspin);
| ^~~~~~~~~~~~~~~~~~~
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp:58:33: error: 'portInputRegister' was not declared in this scope
58 | misoPort = (BusIO_PortReg *)portInputRegister(digitalPinToPort(misopin));
| ^~~~~~~~~~~~~~~~~
exit status 1
Error compiling for board Raspberry Pi Pico.

Library versions:

Adafruit BusIO 1.72
Adafruit GFX 1.10.6
Adafruit ILI9341 1.5.6

If I #undef BUSIO_USE_FAST_PINIO in Adafruit_SPIDevice.h, then it compiles. I have not connected a real display yet.

Thanks! Marcel

earlephilhower avatar earlephilhower commented on May 20, 2024

> if I #undef BUSIO_USE_FAST_PINIO in Adafruit_SPIDevice_h then it compiles. Have not connected a real display yet.

That’s expected. The direct pin-register writes that the fast I/O path uses aren’t needed or supported on the Pico.

We might want to see if it’s possible to add a wrapper to support them at some point. PRs always welcome. :)

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Hi,
Adafruit ILI9341 works with software SPI but is very slow (slower than the 50-year-old computer it simulates). Is there any scope to run it with hardware SPI? The library implements its own SPI, so that may be a problem.
Thanks,
Marcel

earlephilhower avatar earlephilhower commented on May 20, 2024

A quick look at the Adafruit_BusIO library shows that there is an Adafruit_SPIDevice constructor which takes a hardware SPI object. Why not pass in the existing hardware SPI device (the SPI object, which has been tested to work with SdFat to read and write FAT SD cards)?

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Thanks, that works and is 5 times faster, but it is still very slow. For comparison, with hardware SPI on a Teensy 4.0 it is 80 times faster (SPI clock set to 75 MHz), while the Teensy 4.0 CPU clock is 5 times higher. Interestingly, your default SPI is on pins 0-3, while the Pico schematic shows the default on pins 16-19.

earlephilhower avatar earlephilhower commented on May 20, 2024

> Thanks, that works, and is 5 times faster but still very slow. For comparison with hardware SPI on a teensy 4.0, it is 80 times faster (SPI clock set to 75MHz), while teensy4.0 CPU clock is 5 times higher.

Did you specify the SPI clock as part of the SPIConfig? Are you equipped to check the SPI clock frequency? I might have an issue setting it somewhere.

> Interesting your default SPI is on pins 0-3, while the pico schematic shows the default on pins 16-19.

Where did you find the defaults? I just worked from the RP2040 datasheet and picked the first block of pin-mux options listed, not for any real reason. There are also setXXX calls which can change the pins used (call them before SPI::begin).

earlephilhower avatar earlephilhower commented on May 20, 2024

Odd. There is very little code in the SPI.cpp wrapper: https://github.com/earlephilhower/arduino-pico/blob/master/libraries/SPI/SPI.cpp

Basically, the Pico SDK is doing all the work. It is possible the spi_set_format call before sending might be factored out if it is slowing things down, but unfortunately that call is needed to specify 8- or 16-bit transfers. We could cache whether we've already set it properly, I guess. If you've got the time to give it a try, a PR would be very nice if it helps. :)

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Thanks, I just tried adding the cache; it makes little difference. I'll keep looking. There is no floating point used either.
byte n=0;
if (n!=8) spi_set_format(_spi, n=8, cpol(), cpha(), SPI_MSB_FIRST);
if (n!=16) spi_set_format(_spi, n=16, cpol(), cpha(), SPI_MSB_FIRST);

earlephilhower avatar earlephilhower commented on May 20, 2024

Just checking, @marcelvanherk, but did you make n static? Otherwise it'll always end up calling spi_set_format on each pass...
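
To illustrate why the static matters, here is a minimal desktop sketch. The function fakeSetFormat is a hypothetical stand-in that only counts calls; it is not the SDK routine, and the surrounding function names are made up for the example.

```cpp
#include <cstdint>

// Stand-in for the SDK's format call: it only counts invocations so the
// effect of caching can be observed on a desktop compiler.
static int g_formatCalls = 0;
static void fakeSetFormat(uint8_t bits) {
    (void)bits;
    ++g_formatCalls;
}

// Uncached variant: n is re-created as 0 on every call, so the comparison
// always succeeds and the format is rewritten every time.
void send8_uncached() {
    uint8_t n = 0;
    if (n != 8) fakeSetFormat(n = 8);
    // ... the transfer itself would go here ...
}

// Cached variant: the remembered width lives in a static variable, so it
// survives between calls and the format is only rewritten on a real change.
static uint8_t s_bits = 0;  // shared by the 8- and 16-bit paths
void send8_cached() {
    if (s_bits != 8) fakeSetFormat(s_bits = 8);
}
void send16_cached() {
    if (s_bits != 16) fakeSetFormat(s_bits = 16);
}

// Helpers for inspecting and resetting the counter.
int formatCalls() { return g_formatCalls; }
void resetFormatCalls() { g_formatCalls = 0; s_bits = 0; }
```

Calling send8_uncached() 100 times triggers 100 format writes, while 100 calls to send8_cached() trigger only one.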

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Hi again,

Could the speed difference be due to different compiler optimisation? My teensy 3.5 at 120 MHz runs about 4 times faster than the 125 MHz PICO, both with hardware SPI, running exactly the same code.

Marcel

earlephilhower avatar earlephilhower commented on May 20, 2024

Possibly. I've selected -Os (optimize size) to allow for larger sketches (and eventually keep enough room onboard for flash file systems even in the 2MB RPI version).

Edit platform.txt and swap -O2 in for -Os, and restart the IDE, and give it a whirl.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Tried -O2; speed is similar. I guess it is partly due to the speed difference between the Adafruit and Teensy libraries, and of course the Teensy 4 is quite a bit faster. Thanks for your help.

earlephilhower avatar earlephilhower commented on May 20, 2024

Check out http://www.ganssle.com/tem/tem228.html and the "Battle of the CPUs" section. Teensy seems to have a Cortex-M4 which is massively faster clock-for-clock than the M0.

There's always overclocking the Pico to try. It's at 125MHz stock and people seem to be able to run them way faster...

Good luck!

marcelvanherk avatar marcelvanherk commented on May 20, 2024

The Pico beats the Teensy 3.5 without the TFT (I made a dummy tft class): 1969 DG BASIC running on a simulated 1971 DG Nova 1200 calculates, in 10 s, 737 square roots on the Pico and 489 on the Teensy 3.5. The issue must be in the display path.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Tested now between the 120 MHz T35 and the 125 MHz Pico with exactly the same Adafruit library, both with HW SPI, constructed with Adafruit_ILI9341(CS, DC) on the T35 and with Adafruit_ILI9341(&SPI, DC) on the Pico. The T35 runs 3 times faster, while the Pico's processor is faster (see above). The delay seems to be in the Pico SPI. Or could it be in pin I/O? There is lots of graphics, see movie.

[Video: 20210327_161709.mp4]

earlephilhower avatar earlephilhower commented on May 20, 2024

I believe you, but I don't have a good feeling for where the issue lies.

I think a simpler SPI-only benchmark is needed. Say, something that sends or receives 10K of data over and over and checks the runtime. If that is significantly different, you have your smoking gun.

If that shows the Pico very slow, the next step would be to write the same test for the raw Pico SDK and not Arduino.

If SPI-Arduino == slow and SPI-SDK == fast, then the problem is obviously in the Arduino wrapper somewhere. If both are slow, then there's something up with the SDK.
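
The kind of benchmark described above could be structured roughly as follows. This is only a desktop sketch: runBench and BenchResult are hypothetical names, the transfer routine is passed in as a callable, and std::chrono stands in for whatever timer the target board provides.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Result of one timed run.
struct BenchResult {
    std::size_t calls;   // how many times the transfer routine was invoked
    std::size_t bytes;   // total payload bytes handed to it
    double seconds;      // wall-clock time for the whole run
};

// Times `repeats` passes over a buffer, handing `chunk` bytes to `xfer` per call.
// chunk == 1 models many single transfers; chunk == buf.size() models one block.
template <typename Xfer>
BenchResult runBench(const std::vector<uint8_t>& buf, std::size_t chunk,
                     int repeats, Xfer xfer) {
    BenchResult r{0, 0, 0.0};
    if (chunk == 0) return r;  // nothing sensible to measure
    auto t0 = std::chrono::steady_clock::now();
    for (int rep = 0; rep < repeats; ++rep) {
        for (std::size_t off = 0; off < buf.size(); off += chunk) {
            std::size_t left = buf.size() - off;
            std::size_t n = left < chunk ? left : chunk;
            xfer(buf.data() + off, n);
            ++r.calls;
            r.bytes += n;
        }
    }
    auto t1 = std::chrono::steady_clock::now();
    r.seconds = std::chrono::duration<double>(t1 - t0).count();
    return r;
}
```

On the board, the callable would wrap the Arduino SPI transfer in one build and the raw SDK write in the other, so that the two timings can be compared for the same 10K buffer.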

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Hi again. On the Pico, software SPI has the same speed as HW SPI. If I replace digitalWrite with a direct call to gpio_put in Adafruit_SPITFT.cpp (e.g. for SPI_MOSI_HIGH), SW SPI becomes 2x faster than HW SPI: it writes 16 bits in 2 us. SW writes bits faster but then there is a gap making it slower.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Interestingly, if I code SW SPI in your SPI.cpp, it is as slow as HW SPI, as if calls like hwspi._spi->transfer16 create the overhead. I'll keep searching.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

If I remove spi_init and spi_deinit and use SW SPI coded in SPI.CPP speed is getting close to SW SPI in Adafruit library (patched to use gpio calls). My application does a lot of small draws so lots of single word transfers that apparently have a lot of overhead on the PICO.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

It may be possible to NOT call spi_init for every transaction, but only in ::begin, and to cache the baud rate. I just tried that; software SPI implemented in your SPI.cpp is still twice as fast as hardware. I think the RPi library is the main limiting factor.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Increasing the clock helps: HW SPI at 75 MHz and SW SPI at 10 MHz in SPI.cpp are on par. Could it be that the FIFO delays single-word writes?

earlephilhower avatar earlephilhower commented on May 20, 2024

Why not try moving spi_init to only in the ::begin, and in the ::beginTransaction call spi_set_baudrate to set the right clock, and move spi_deinit to only in the ::end and see how it goes?

earlephilhower avatar earlephilhower commented on May 20, 2024

Another thing to try would be to only call spi_set_baudrate when it's different from the last time. It could be that the HW/SW is giving the PLL time to stabilize at the new clock frequency (hard busy wait).
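
The call pattern being proposed (init once, reprogram the clock only on change, deinit once) might look like this in outline. CachedSpi and FakeHw are hypothetical names, and the counters merely stand in for the real init, deinit and baud-rate calls so the pattern can be checked on a desktop.

```cpp
#include <cstdint>

// Counters standing in for the expensive hardware calls.
struct FakeHw {
    int inits = 0;
    int deinits = 0;
    int baudSets = 0;
};

// Hypothetical wrapper: init once in begin(), deinit once in end(),
// and touch the clock only when the requested rate changes.
class CachedSpi {
public:
    explicit CachedSpi(FakeHw& hw) : hw_(hw) {}

    void begin() {
        if (running_) return;
        ++hw_.inits;
        running_ = true;
        lastBaud_ = 0;  // force the first transaction to program the clock
    }

    void beginTransaction(uint32_t baud) {
        if (!running_ || baud == lastBaud_) return;  // nothing to change
        ++hw_.baudSets;
        lastBaud_ = baud;
    }

    void endTransaction() {}  // deliberately cheap: nothing is torn down

    void end() {
        if (!running_) return;
        ++hw_.deinits;
        running_ = false;
    }

private:
    FakeHw& hw_;
    bool running_ = false;
    uint32_t lastBaud_ = 0;
};
```

With this shape, a thousand transactions at the same rate cost one init and one baud-rate write in total, rather than one of each per transaction.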

marcelvanherk avatar marcelvanherk commented on May 20, 2024

[Image: i2c_speed]
This is the SPI_SCK signal.
Only calls are:
if (n!=16) spi_set_format(_spi, n=16, ccpol, ccpha, SPI_MSB_FIRST);
spi_write16_read16_blocking(_spi, &data, &ret, 1);
if (n!=8) spi_set_format(_spi, n=8, ccpol, ccpha, SPI_MSB_FIRST);
spi_write_read_blocking(_spi, &data, &ret, 1);

marcelvanherk avatar marcelvanherk commented on May 20, 2024

[Image: i2c_speed2]
Pico versus Teensy 3.5: same speed but less SPI overhead on the Teensy 3.5; both SPI clocks are at maximum (~60-75 MHz). The overhead is not in the CPU: the Pico beats the Teensy at pure (integer) computation. Both pictures use the same code and library (Adafruit_ILI9341).

earlephilhower avatar earlephilhower commented on May 20, 2024

Just to clarify, are you doing just a simple test loop app which does nothing but dummy SPI writes, or your whole application? If it's the whole app, then the difference could be in the app code and have nothing to do w/SPI.

If it's a test app, then maybe this is something that should be opened up on the pico-sdk repo? It could be a HW or driver issue.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

It is the whole app, but the first pictures just show a short test loop {tft.setCursor(i, j); tft.write(char)}. The bottom one compares the Teensy and the Pico on a random sample of drawing, so a more complex app. But the app code is the same in all pictures and on both processors.

A test app would look like this:

for (i = 0; i < 65536; i++) {
  spi_set_format(_spi, 8, ccpol, ccpha, SPI_MSB_FIRST);
  spi_write_read_blocking(_spi, &data8, &ret8, 1);
  spi_set_format(_spi, 16, ccpol, ccpha, SPI_MSB_FIRST);
  spi_write16_read16_blocking(_spi, &data16, &ret16, 1);
  spi_write16_read16_blocking(_spi, &data16, &ret16, 1);
}

I would want ~90% efficiency, but it would be closer to ~15% given the comparison between HW and SW SPI in the same code.

Regards, Marcel

marcelvanherk avatar marcelvanherk commented on May 20, 2024

#include <hardware/spi.h>
#include <hardware/gpio.h>

spi_cpol_t cccpol;
spi_cpha_t cccpha;

void setup() {
  gpio_set_function(16, GPIO_FUNC_SPI);
  gpio_set_function(17, GPIO_FUNC_SPI);
  gpio_set_function(18, GPIO_FUNC_SPI);
  gpio_set_function(19, GPIO_FUNC_SPI);

  spi_init(spi0, 100000000);
  cccpol = SPI_CPOL_0;
  cccpha = SPI_CPHA_0;
}

void loop() {
  for (int i = 0; i < 65536; i++) {
    uint16_t ret16;
    uint16_t data16 = 0;
    uint8_t ret8;
    uint8_t data8 = 0;
    spi_set_format(spi0, 8, cccpol, cccpha, SPI_MSB_FIRST);
    spi_write_read_blocking(spi0, &data8, &ret8, 1);
    spi_set_format(spi0, 16, cccpol, cccpha, SPI_MSB_FIRST);
    spi_write16_read16_blocking(spi0, &data16, &ret16, 1);
    spi_write16_read16_blocking(spi0, &data16, &ret16, 1);
  }
}

marcelvanherk avatar marcelvanherk commented on May 20, 2024

[Image]
0.8 us overhead per call, 1 us if switching format.

Sorry, test program is a bit untidy!
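
To put the 0.8 us figure in perspective, the fraction of time the bus is actually clocking data can be estimated. This is a rough sketch that assumes the SPI clock really runs at 62.5 MHz (the rate quoted elsewhere in this thread for the Pico) and treats the overhead as a fixed gap after each transfer; busEfficiency is a made-up helper name.

```cpp
// Fraction of wall-clock time during which the clock line is actually
// toggling, for one transfer of `bits` bits followed by a fixed idle gap.
double busEfficiency(int bits, double clockHz, double gapSeconds) {
    double active = bits / clockHz;          // time spent shifting bits out
    return active / (active + gapSeconds);   // share of the total time
}
```

Under those assumptions a 16-bit word with a 0.8 us gap comes out at about 0.24, and an 8-bit word at about 0.14, which is in the same range as the ~15% estimated above.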

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Raised it there!

raspberrypi/pico-sdk#292

Bodmer avatar Bodmer commented on May 20, 2024

I noticed this too.

Examining the SDK code indicates that it is more suited to bulk transfers due to the way it handles the SPI buffers.

If you send a block of data it is much faster: at 62.5 MHz the gap between 16-bit values is ~30 ns:
[Image]

The SPI clock rate is 62.5 MHz but the 'scope only has 50 MHz bandwidth, hence the "sine" clock! However, the delay is representative.

I have the Pico running with an ILI9341 using the Arduino IDE and Earle's excellent core, see here. Results are good when blocks are sent to the TFT:

Benchmark,                Time (microseconds)
Screen fill (5 times),    114981  (43.49 fps)
Text,                     31986
Lines,                    128851
Horiz/Vert Lines,         18031
Rectangles (outline),     10210
Rectangles (filled),      241497
Circles (filled),         70702
Circles (outline),        58318
Triangles (outline),      33224
Triangles (filled),       109793
Rounded rects (outline),  30585
Rounded rects (filled),   255167
Total = 1103345us
Total = 1.1033s

Not so good for individual pixels though. However the Pico has RAM enough for a full screen buffer... and DMA...
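
As a sanity check on the screen-fill row above, the frame rate and the bus utilisation can be recomputed. This is a sketch that assumes a 240x320 panel with 16-bit pixels and the 62.5 MHz SPI clock mentioned earlier; both function names are made up for illustration.

```cpp
// Frames per second from a benchmark that draws `frames` full screens
// in `micros` microseconds.
double framesPerSecond(int frames, double micros) {
    return frames * 1e6 / micros;
}

// Lower bound on the time (in microseconds) needed to clock out one full
// frame of 16-bit pixels at the given SPI clock, ignoring all overhead.
double idealFrameMicros(int width, int height, double clockHz) {
    double bits = static_cast<double>(width) * height * 16.0;
    return bits / clockHz * 1e6;
}
```

With the reported 114981 us for 5 fills, framesPerSecond gives about 43.49 fps, matching the table. The ideal time for 5 frames is 5 x 19660.8 us, roughly 85% of the measured time, so block transfers keep the bus busy most of the time.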

Bodmer avatar Bodmer commented on May 20, 2024

Here is an example, without DMA but performance is good at ~70fps to update a 180x180 pixel area from a buffer. Video here.

Example SPI transmit code:

/***************************************************************************************
** Function name:           pushBlock - for RP2040
** Description:             Write a block of pixels of the same colour
***************************************************************************************/
void TFT_eSPI::pushBlock(uint16_t color, uint32_t len){

  bool loaded = false;
  uint16_t colorBuf[64];
  const uint16_t* colorPtr = colorBuf;
  if (len>63) {
    loaded = true;
    for (uint32_t i = 0; i < 64; i++) colorBuf[i] = color;
    while(len>63) {
      spi_write16_blocking(spi0, (const uint16_t*)colorPtr, 64);
      len -=64;
    }
  }

  if (len) {
    if (!loaded) for (uint32_t i = 0; i < len; i++) colorBuf[i] = color;
    spi_write16_blocking(spi0, (const uint16_t*)colorPtr, len);
  }
}

/***************************************************************************************
** Function name:           pushPixels - for RP2040
** Description:             Write a sequence of pixels
***************************************************************************************/
void TFT_eSPI::pushPixels(const void* data_in, uint32_t len){

  if (_swapBytes) {
    spi_write16_blocking(spi0, (const uint16_t*)data_in, len);
  }
  else {
    spi_set_format(spi0, 8, (spi_cpol_t)0, (spi_cpha_t)0, SPI_MSB_FIRST);
    spi_write_blocking(spi0, (const uint8_t*)data_in, len * 2);
    spi_set_format(spi0, 16, (spi_cpol_t)0, (spi_cpha_t)0, SPI_MSB_FIRST);
  }
}

Bodmer avatar Bodmer commented on May 20, 2024

Understood. The Adafruit library does not include any optimisations for the RP2040 and thus runs about 10x slower than it could. The Teensy library, on the other hand, has specific optimisations.

Adafruit may decide to optimise their library at some point for the RP2040 but until then it may be best to stick with the Teensy for your project.

The graphicstest results for Adafruit_GFX + Adafruit_ILI9341 libraries on Pico are slow with 62.5MHz hardware SPI:

Benchmark,                Time (microseconds)
Screen fill (5 times),    1378508  (3.63 fps)
Text,                     104438
Lines,                    763413
Horiz/Vert Lines,         120718
Rectangles (outline),     75152
Rectangles (filled),      2863128
Circles (filled),         356630
Circles (outline),        346576
Triangles (outline),      175697
Triangles (filled),       946461
Rounded rects (outline),  152487
Rounded rects (filled),   2852664
Total = 10135872us
Total = 10.1359s

Whereas with an optimised library (TFT_eSPI):

Benchmark,                Time (microseconds)
Screen fill (5 times),    114981  (43.49 fps)
Text,                     31986
Lines,                    128851
Horiz/Vert Lines,         18031
Rectangles (outline),     10210
Rectangles (filled),      241497
Circles (filled),         70702
Circles (outline),        58318
Triangles (outline),      33224
Triangles (filled),       109793
Rounded rects (outline),  30585
Rounded rects (filled),   255167
Total = 1103345us
Total = 1.1033s

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Although this comparison, #11 (comment), used the Adafruit library on both boards for fairness.

Bodmer avatar Bodmer commented on May 20, 2024

I had a look at the Adafruit library and it does contain Teensy optimised code.

I added some Pico optimised code to my Adafruit library and it makes the graphics test 6x faster. If you have the latest version of Adafruit_GFX (1.10.7) and replace 2 of the files with these then you can try it to see if the performance boost is adequate.
[Attachment: Adafruit_SPITFT.zip]

If you use images and the colours look wrong then let me know and I can add endianness support to image rendering.

Bodmer avatar Bodmer commented on May 20, 2024

Here are the results now for the graphics test with the Pico and using Adafruit_GFX with 62.5MHz SPI clock:

Benchmark,                Time (microseconds)
Screen fill (5 times),    115395  (43.33 fps)
Text,                     50392
Lines,                    331519
Horiz/Vert Lines,         18628
Rectangles (outline),     10980
Rectangles (filled),      241718
Circles (filled),         88662
Circles (outline),        156140
Triangles (outline),      74920
Triangles (filled),       121274
Rounded rects (outline),  50472
Rounded rects (filled),   258238
Total = 1518338us
Total = 1.5183s

Some functions now run 10x faster, so it depends which ones you use.

earlephilhower avatar earlephilhower commented on May 20, 2024

@Bodmer if you have a fork on GH, maybe we can add it to the included libraries, at least until Adafruit does their magic. Or, worst case, we can put it in the README.

Bodmer avatar Bodmer commented on May 20, 2024

Yes, I can do that after some more tests.

One problem is that there is no board-specific #define that I can find that flags that the board in use is an RP2040. If there is one then let me know, as I have hardwired the change at the moment with a #define RP2040.

Bodmer avatar Bodmer commented on May 20, 2024

Actually I had to use #define PICO_RP2040 to avoid a clash with a class.

earlephilhower avatar earlephilhower commented on May 20, 2024

ARDUINO_ARCH_RP2040 is the flag that's set now in the build system.
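
A library could use that flag to select an RP2040-specific code path at compile time, along these lines. The function name and the returned strings are purely illustrative.

```cpp
#include <string>

// Reports which code path was compiled in. ARDUINO_ARCH_RP2040 is the flag
// named above; any other target falls through to the portable path.
std::string spiBackendName() {
#if defined(ARDUINO_ARCH_RP2040)
    return "rp2040-optimised";
#else
    return "generic";
#endif
}
```

Built for any other target (including a desktop compiler), the function takes the generic branch.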

Bodmer avatar Bodmer commented on May 20, 2024

Ah, excellent, thanks!

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Hi, I use ARDUINO_RPI_PICO as my board flag.

I will test your mods soon, seem similar to mine for SW SPI.

Marcel

Bodmer avatar Bodmer commented on May 20, 2024

@earlephilhower , @marcelvanherk

I have used the new flag ARDUINO_ARCH_RP2040 to invoke the code in a new Adafruit_GFX fork.

RP2040 optimisations have been added to this Adafruit_GFX fork.

This Adafruit_ILI9341 fork only has minor tweaks to the graphicstest and pictureEmbed examples so they run on my Pico setup and to show how to connect a RPi Pico to the SPI ILI9341 pins.

Only tested with graphicstest, pictureEmbed and with TJPG_Decoder at the moment and all looks good and much faster.

It should be possible to program the PIO to really quickly spew out the setAddrWindow commands which use 8bit + 16 bit transfers and toggle the DC line automatically, but that is a project for another day!

Post any issues on those repositories.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Your first version compiles and works fine, but runs 'only' about 40% faster; my code has a lot of single-pixel (text) writes. The Nova 1200 simulation runs at 0.14 MIPS with HW SPI.

In SW SPI, calls like SPI_SCK_LOW/HIGH and SPI_MOSI_LOW/HIGH can also use e.g. sio_hw->gpio_clr/gpio_set, which makes it a lot faster.
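
The idea of driving the pins through set/clear style registers can be sketched on a desktop as follows. FakeGpio is a hypothetical stand-in for such a register block (it is not the SDK's sio_hw), and it doubles as a fake SPI slave that samples MOSI on each rising clock edge so the result can be checked.

```cpp
#include <cstdint>

// Desktop stand-in for a set/clear style GPIO block: set() and clr() change
// only the pins named in the mask. It also acts as the SPI slave, sampling
// MOSI on every rising edge of SCK (SPI mode 0).
struct FakeGpio {
    uint32_t pins = 0;      // current output levels
    uint32_t sckMask;       // bit mask of the clock pin
    uint32_t mosiMask;      // bit mask of the data-out pin
    uint32_t received = 0;  // bits captured by the fake slave
    int edges = 0;          // rising SCK edges seen

    FakeGpio(uint32_t sck, uint32_t mosi) : sckMask(sck), mosiMask(mosi) {}

    void set(uint32_t mask) {
        bool rising = (mask & sckMask) && !(pins & sckMask);
        pins |= mask;
        if (rising) {
            received = (received << 1) | ((pins & mosiMask) ? 1u : 0u);
            ++edges;
        }
    }
    void clr(uint32_t mask) { pins &= ~mask; }
};

// Bit-banged 16-bit write, MSB first, clock idle low.
void softSpiWrite16(FakeGpio& io, uint16_t word) {
    for (int bit = 15; bit >= 0; --bit) {
        if (word & (1u << bit)) io.set(io.mosiMask);
        else                    io.clr(io.mosiMask);
        io.set(io.sckMask);  // slave samples MOSI here
        io.clr(io.sckMask);  // return the clock to idle
    }
}
```

Each bit costs at most three register writes and no function-call overhead per pin, which is where the speed-up over digitalWrite would come from.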

However, SW SPI is broken right now, not sure why. It crashes the Pico.

marcelvanherk avatar marcelvanherk commented on May 20, 2024

Thanks and good night!

Bodmer avatar Bodmer commented on May 20, 2024

No problem. Yes, single-pixel writes are bad news as it takes 13 bytes to send one 16-bit pixel; this, combined with the delays between bytes, means bit bashing will be faster. I hope to get the PIO handling all this eventually.
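
One plausible breakdown of those 13 bytes, assuming the usual ILI9341 sequence of column address set, page address set and memory write for a 1x1 window, is 1 + 4 + 1 + 4 + 1 + 2. The helper name below is made up, and toggling of the D/C line between command and data bytes is not modelled.

```cpp
#include <cstdint>
#include <vector>

// ILI9341 command codes used for a windowed write.
constexpr uint8_t CMD_CASET = 0x2A;  // column address set
constexpr uint8_t CMD_PASET = 0x2B;  // page (row) address set
constexpr uint8_t CMD_RAMWR = 0x2C;  // memory write

// Builds the byte stream for drawing one pixel: a 1x1 address window
// followed by a single 16-bit colour, all sent high byte first.
std::vector<uint8_t> singlePixelBytes(uint16_t x, uint16_t y, uint16_t color) {
    auto hi = [](uint16_t v) { return static_cast<uint8_t>(v >> 8); };
    auto lo = [](uint16_t v) { return static_cast<uint8_t>(v & 0xFF); };
    return {
        CMD_CASET, hi(x), lo(x), hi(x), lo(x),  // start column, end column
        CMD_PASET, hi(y), lo(y), hi(y), lo(y),  // start row, end row
        CMD_RAMWR, hi(color), lo(color)         // the pixel itself
    };
}
```

Only 2 of the 13 bytes carry pixel data, so the per-transfer gap is paid many times over for each pixel drawn this way.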

Bodmer avatar Bodmer commented on May 20, 2024

Yes, I have been investigating.

Grabbing the SPI bus seems to take ages, so it is best to use:

tft.startWrite();

before a lot of single pixel operations. For the Adafruit library this makes a x3 speed improvement for a simple test.

To release the bus again:

tft.endWrite();

Bodmer avatar Bodmer commented on May 20, 2024

PS. After tft.startWrite(); you have to use writePixel(...) rather than drawPixel(...) as the latter has the bus grab code.

Tests indicate a better than x6 speed improvement. Whether you see this improvement is dependent on your project implementation and how many pixels you draw.
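
The difference between the two calling styles can be shown with a small mock. MockTft is hypothetical: its method names mirror the ones mentioned above, but the bodies only count how often the bus is claimed and how many pixels are pushed.

```cpp
// Minimal mock of a display driver that counts how often the bus is claimed.
class MockTft {
public:
    int busGrabs = 0;    // times the bus was claimed
    int pixelsSent = 0;  // pixels pushed

    void startWrite() { ++busGrabs; }
    void endWrite() {}
    // Assumes the bus is already held by the caller.
    void writePixel(int, int, unsigned) { ++pixelsSent; }
    // Self-contained: grabs and releases the bus around every pixel.
    void drawPixel(int x, int y, unsigned c) {
        startWrite();
        writePixel(x, y, c);
        endWrite();
    }
};

// One bus grab per pixel.
void drawRowSlow(MockTft& tft, int y, int width, unsigned color) {
    for (int x = 0; x < width; ++x) tft.drawPixel(x, y, color);
}

// One bus grab for the whole row.
void drawRowBatched(MockTft& tft, int y, int width, unsigned color) {
    tft.startWrite();
    for (int x = 0; x < width; ++x) tft.writePixel(x, y, color);
    tft.endWrite();
}
```

For a 240-pixel row the first style claims the bus 240 times and the second only once, while sending the same number of pixels.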
