hamvocke commented on September 26, 2024

Alex, happy to see that you're back on this! I only use this script occasionally so I don't always catch those changes right away, so thanks so much for investigating!

The current script already tries to skip the CSV headers, using a pretty naive heuristic to find the actual transaction lines. This is what it's currently doing:

def transaction_lines(file):
    lines = file.readlines()
    i = 1
    for line in lines:
        if "Betrag" in line:
            return lines[i:]
        i = i + 1

    raise ValueError("Can't convert CSV file without header line")

This approach works okay, but it's super naive and breaks with this change. It iterates over the lines of a .csv file and skips them until it finds one that contains the word Betrag. Both the Cash and the Visa CSV contain the word Betrag in the line that precedes their transaction lines, so this was a shortcut to use the same function for both files.

Now the problem is that the new CSV format uses the word Betrag in its headers, leading the script to falsely assume that it's arrived at the transaction lines.

The simplest solution that'd work for now would be to change the transaction_lines() function from stopping at a line containing the word Betrag to stopping at a line containing the word Wertstellung (as this seems to be another word that's present in both Cash and Visa files, but not in the CSV header).
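A minimal sketch of that hotfix, assuming the rest of the function stays as it is. The `marker` parameter and the sample rows are made up for illustration and are not taken from a real DKB export:

```python
import io


def transaction_lines(file, marker="Wertstellung"):
    # Skip everything up to and including the first line that contains
    # `marker`; only the lines after it are treated as transactions.
    lines = file.readlines()
    for i, line in enumerate(lines, start=1):
        if marker in line:
            return lines[i:]
    raise ValueError("Can't convert CSV file without header line")


# Made-up sample: the metadata header mentions "Betrag", which would have
# stopped the old heuristic too early.
sample = io.StringIO(
    '"Kontonummer:";"(made-up)";\n'
    '"Betrag in EUR:";"100,00";\n'
    '\n'
    '"Buchungstag";"Wertstellung";"Betrag (EUR)";\n'
    '"01.01.2020";"01.01.2020";"-10,00";\n'
)
rows = transaction_lines(sample)
```

With this sample, only the last line ends up in `rows`, because the metadata line containing Betrag no longer matches.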

What do you think?

from dkb2homebank.

hamvocke commented on September 26, 2024

> in the long run, it might make sense to look into next() and just throw away the first n lines that are not needed

That's pretty much what the transaction_lines() function is doing. It's iterating over the lines in a given CSV file until it finds the line that precedes the transaction lines and only returns the lines following that specific line.

I agree that a comment would help here, and the function name could use some tweaking, too (find_transaction_lines() or skip_header_lines() might be good alternatives).


Ablesius commented on September 26, 2024

> The simplest solution that'd work for now would be to change the transaction_lines() function from stopping at a line containing the word Betrag to stopping at a line containing the word Wertstellung (as this seems to be another one that's present in both Cash and Visa files, but not in the CSV header).

This is at least a hotfix solution we could do (I might even be able to do that myself later). But in the long run, it might make sense to look into next() and just throw away the first n lines that are not needed. To me that looks simpler for debugging. I'm not entirely sure though, maybe there's an even better way.
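A rough sketch of the next() idea, assuming the number of header lines is known for each export format. The helper name `skip_lines` and the value of `n` are hypothetical:

```python
import io


def skip_lines(file, n):
    """Discard the first n lines of `file` and return the rest as a list."""
    for _ in range(n):
        # The default value keeps next() from raising StopIteration
        # when the file has fewer than n lines.
        next(file, None)
    return list(file)


# Made-up example: two header lines followed by one data line.
remaining = skip_lines(io.StringIO("header 1\nheader 2\ndata\n"), 2)
```

The trade-off is that a fixed `n` has to be maintained per file type and breaks silently if the bank adds or removes a header line, whereas the keyword search adapts to that as long as the keyword stays unique.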

One way or another, we should add a comment to this section of the code to make it clear why this is being done. I didn't find it when I was trying to debug the code (I might not have looked hard enough, true), so a comment would surely add a bit of clarity.


Ablesius commented on September 26, 2024

Another idea: maybe we search for a line containing actual numbers/dates before we try to process them, because the lines before can't contain relevant data.
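One possible way to sketch this, assuming every transaction line starts with a date in DD.MM.YYYY form, optionally inside double quotes. That date format, the regular expression, and the function name are assumptions for illustration and have not been checked against real export files:

```python
import io
import re

# Assumption: a transaction line begins with a date such as 01.01.2020,
# with or without a leading double quote.
DATE_AT_START = re.compile(r'^"?\d{2}\.\d{2}\.\d{4}')


def lines_from_first_date(file):
    # Return all lines starting at the first one that begins with a date.
    lines = file.readlines()
    for i, line in enumerate(lines):
        if DATE_AT_START.match(line):
            return lines[i:]
    raise ValueError("No line starting with a date found")


# Made-up sample: header lines may mention dates, but not at the line start.
sample_with_dates = io.StringIO(
    '"Zeitraum:";"01.01.2020 - 31.01.2020";\n'
    '"Buchungstag";"Wertstellung";"Betrag (EUR)";\n'
    '"02.01.2020";"02.01.2020";"-5,00";\n'
)
data_rows = lines_from_first_date(sample_with_dates)
```

This would not depend on any particular header keyword, but it would fail if a header line ever began with a date.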


Ablesius commented on September 26, 2024

#7
