Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Single and double quotation marks are not properly escaped from a CSV import #1692

Open
1 of 2 tasks
sup3rgiu opened this issue Jan 25, 2025 · 5 comments
Open
1 of 2 tasks
Labels
Community This is a great issue for community members to work on 3️⃣ Low priority Contributions accepted, but Maybe team will not be working on this in the near term

Comments

@sup3rgiu
Copy link

sup3rgiu commented Jan 25, 2025

Where did this bug occur? (required)

  • I am a self-hosted user reporting a bug from my self hosted app
  • I am a user of Maybe's paid app

Describe the bug
Consider this simple csv:

Date;Description;Amount (EUR);
09.01.2025;Bar "La collina";-19,50;

I'm currently facing 3 issues with the "Import transactions" feature:

  1. Quotation marks cannot be present within the csv content.
    As you can see, I have some " in the "Description" column. If there are, the upload fails with this message:

Image

  1. In the "Date format" field, the format DD.MM.YYYY is not present. When using DD.MM.YY, the sample transaction is imported as "9 January 2020"

Image

  1. No options to change the decimal separator. When imported, the sample transaction is registered as -1950€.

Note that I found these bugs when importing a real CSV provided by my bank (UniCredit).

@verymilan
Copy link

verymilan commented Jan 25, 2025

Just ran into all three of these as well. :(
I have tried to adjust my CSV accordingly, but changing "something" to 'something' makes Maybe interpret items including the single quote ... and the import still fails. Moving quotation out of that file altogether presents a problem with numbers as i am currently not sure how efficiently adjust their decimal point unless i do some bash scripting perhaps.

Also just fyi, the application ran into a 500 could not obtain a connection from the pool within 5.000 seconds (waited 5.006 seconds); all pooled connections were in use when doing a testimport with a modified file. (its a Pi5, but only around 30 transactions)

Edit regarding 500: deleted the account and tuned postgres, now even a whole year of csv imported in a second

@zachgoll
Copy link
Collaborator

@sup3rgiu thanks for the report and info. In order of priority, we should probably address:

  1. Decimal separator - this is a big one, definitely need to get this handled. Since we have no way of telling what the value is separated by, we'll probably need to introduce 1 more configuration option for this
  2. Date format - should be a very quick and simple fix
  3. Quotation marks - we should be able to escape quotations and other characters from the raw input fairly easily

@justinfar we should probably throw the decimal separator configuration in this row right?

Image

@zachgoll
Copy link
Collaborator

Also just fyi, the application ran into a 500 could not obtain a connection from the pool within 5.000 seconds (waited 5.006 seconds); all pooled connections were in use when doing a testimport with a modified file. (its a Pi5, but only around 30 transactions)
Edit regarding 500: deleted the account and tuned postgres, now even a whole year of csv imported in a second

@verymilan just curious, was your self-hosted instance running a version prior to this commit? 3140835

My expectation is that instances prior to that commit would suffer the all pooled connections were in use error, while instances after that commit should never experience that error.

If you were on the latest, I'm wondering if that commit had the correct math for allocating the DB pool.

@zachgoll
Copy link
Collaborator

@sup3rgiu I have opened 2 separate issues to address the number formats + date formats:

We'll use this issue for the quotation marks (I've updated the title to reflect that).

@zachgoll zachgoll changed the title Bug: Quotation Marks, Date Format and Currency Separator in transactions import Bug: Single and double quotation marks are not properly escaped from a CSV import Jan 27, 2025
@zachgoll zachgoll added 3️⃣ Low priority Contributions accepted, but Maybe team will not be working on this in the near term Community This is a great issue for community members to work on labels Jan 27, 2025
@verymilan
Copy link

@verymilan just curious, was your self-hosted instance running a version prior to this commit? 3140835

My expectation is that instances prior to that commit would suffer the all pooled connections were in use error, while instances after that commit should never experience that error.

If you were on the latest, I'm wondering if that commit had the correct math for allocating the DB pool.

Yes, I am using docker on 'latest' when I give new releases a shot. :) – just now noticed that latest does not only contains releases, so i might change that.

Just inspected my image and it is the second one from 5 days ago, while the fix appears to be younger:

       "RepoDigests": [
            "ghcr.io/maybe-finance/maybe@sha256:4a2d07bedbd0b56f0c27abc69b6cdc2eb721bd5363deefea565b8360fea7c989"
        ],
        "Parent": "",
        "Comment": "buildkit.dockerfile.v0",
        "Created": "2025-01-23T16:17:43.395401671Z",

(sorry, should have been its own issue perhaps - it actually happened multiple times after tweaking postgres.. kinda randomly/unexpectedly at times)

Will test again when the CSV issues are closed, thanks for your work 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community This is a great issue for community members to work on 3️⃣ Low priority Contributions accepted, but Maybe team will not be working on this in the near term
Projects
None yet
Development

No branches or pull requests

3 participants