OPT proposal about unzip-cpm-z80 (CLOSED)

agn453 commented on July 21, 2024
OPT proposal

from unzip-cpm-z80.

Comments (7)

zx70 commented on July 21, 2024

I'd also suggest using HL' for bleft and bitbuf.

zx70 commented on July 21, 2024

I created a pull request with the code parts I was able to test.
Using EXX seems trickier than expected and should probably stay optional (we never know whether the BIOS leaves the alternate registers untouched).

agn453 commented on July 21, 2024

    ; push de
    ; ld de,-4
    ; add iy,de
    ; pop de

    dec iy
    dec iy
    dec iy
    dec iy

    ; push de
    ; ld de,4
    ; add iy,de
    ; pop de

    inc iy
    inc iy
    inc iy
    inc iy

...why was the addition preferred? For the flags?

If you wish to optimise for size, the current method of PUSH, LD DE, ADD IY, POP uses 7 bytes versus 8 bytes for the INC IY or DEC IY solution. Squeezing every byte from the code was the goal at the time. In other places such as within loops, the code is optimised for speed.
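
For reference, the trade-off can be tallied per instruction. The byte and T-state figures below are the standard documented Z80 values, not numbers taken from the UNZIP source, so treat this as a sketch of the comparison rather than a measurement:

    ; size-optimised form: 7 bytes, 46 T-states in total
        push de         ; 1 byte,  11 T-states
        ld   de,-4      ; 3 bytes, 10 T-states
        add  iy,de      ; 2 bytes, 15 T-states (alters the C, H and N flags)
        pop  de         ; 1 byte,  10 T-states

    ; speed-optimised form: 8 bytes, 40 T-states in total
        dec  iy         ; 2 bytes, 10 T-states, no flags affected
        dec  iy         ; 2 bytes, 10 T-states
        dec  iy         ; 2 bytes, 10 T-states
        dec  iy         ; 2 bytes, 10 T-states

On these figures the DEC IY (or INC IY) sequence costs one extra byte and saves 6 T-states each time it runs, and it leaves both the flags and the stack untouched.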

zx70 commented on July 21, 2024

Well, thinking about ways to save memory, we could, for example, put redundant code parts in a subroutine, like:

    open_wr:
        ld	de,opbuf
        ld	c,setdma
        call	bdos
        call	setout
        ld	de,opfcb
        ld	c,fwrite
        call	bdos
        ret
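
As a possible further saving, a `call` immediately followed by `ret` can usually be folded into a single `jp`, provided nothing in the callee inspects the return address. This is a sketch of the same routine with that one change, not something taken from the UNZIP source:

    open_wr:
        ld   de,opbuf
        ld   c,setdma
        call bdos
        call setout
        ld   de,opfcb
        ld   c,fwrite
        jp   bdos       ; tail call: BDOS returns straight to open_wr's caller

The `call`/`ret` pair occupies 4 bytes and the `jp` occupies 3, so the routine shrinks by one byte and skips one return.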

Another possible approach is to remove (or keep optional) those compression methods which are not used anymore.

agn453 commented on July 21, 2024

I've merged your #12 changes and bumped the CP/M UNZIP version to v1.5-7. Thanks.

zx70 commented on July 21, 2024

This one is a little bit extreme, but in my test case it reduces the timing count (z88dk-ticks) from 28210377 to 26158350.
It can be extended to the whole "getbits" logic and would probably save a little more if correctly implemented.
The only problem is that we must trust the BIOS (and BDOS) not to touch the alternate register set, which should be the case if they are well written. Otherwise we should preserve HL' before making the BDOS calls.

    ;
    nextsymbol:
        ld	(treep),hl
    
        exx
        ld	hl,(bitbuf)	; keep bitbuf in L, bleft in H
        exx
    
    nsloop:
    ;	push	hl
        exx
        ;ld	hl,(bitbuf)	; keep bitbuf in L, bleft in H
        dec	h
        jp	p,$+9		; jump to "xor a", past jp op plus 6 bytes:
        call	getbyte		; (3 bytes)
        ld	l,a		; (1 byte)  new bitbuf
        ld	h,7		; (2 bytes) 8 bits left, pre-dec'd
        xor	a		; jp op above jumps here
        rr	l
    ;	ld	(bitbuf),hl	; update bitbuf/bleft
        exx
        ;ld	h,a		; A still zero
        rla			; return bit in HL and A
        ;ld	l,a

    ;	pop	hl
        or	a
        jr	z,nsleft
        inc	hl
        inc	hl
    nsleft:
        ld	e,(hl)
        inc	hl
        ld	d,(hl)
    
        ld	a,d
        cp	10h
        jr	nc,nsleaf
        or	e
        ;ret	z
        jr	z,nsexit
    
        ld	hl,(treep)
        add	hl,de
        add	hl,de
        add	hl,de
        add	hl,de
        jr	nsloop
    
    nsleaf:	and	0fh
        ld	d,a
    nsexit:
    
        exx
        ld	(bitbuf),hl	; write back bitbuf (L) and bleft (H)
        exx
        ret
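
If the BIOS or BDOS cannot be trusted with the alternate registers, one way to preserve HL' is a small wrapper around the BDOS entry point. This is only a sketch: `bdos_safe` is a hypothetical name, and it assumes BDOS returns with the same register set selected as on entry.

    bdos_safe:
        exx
        push hl         ; save HL' (bitbuf in L, bleft in H)
        exx
        call bdos       ; C = function number, DE = argument, as usual
        exx
        pop  hl         ; restore HL'
        exx
        ret             ; A and the main HL still hold the BDOS result

EXX swaps only BC, DE and HL with their alternates, so the value BDOS returns in A (and in the main HL) passes through the wrapper unchanged. The cost is a few extra bytes plus four EXX instructions per call, which is why keeping the alternate-register path optional may still be the safer default.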

zx70 commented on July 21, 2024

One step further...

    ;	rd1bit
    ;	push	af
    ;
    ;	ld	a,2
    ;	call	rdbybits
    ;	or	a
    
        ld	a,3		; better to gather 3 bits at once, it's faster and smaller
        call	rdbybits
        
        ld	l,a
        and	1		; keep the first bit
        push	af
    
        ld	a,l		; now onto the next 2 bits
        srl	a
