Trim spaces in the location fields and regenerate location ids #318

Closed
davidgamez opened this issue Feb 27, 2024 · 7 comments

@davidgamez
Member

Description

Some location IDs in the location table end with a _. The municipality value may have trailing spaces. Example: US-Washington-Wenatchee_

Proposed solution

Trim spaces at the beginning and end of the municipality values. As part of this issue, we should also trim spaces in country_code and subdivision_name to avoid the same problem.
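A minimal sketch of the proposed trimming, not the project's actual populate code. It assumes the ID is built by joining the three fields with a dash and replacing remaining spaces with underscores, which is only inferred from the example ID above; build_location_id is a hypothetical name.

```python
from typing import Optional


def build_location_id(
    country_code: Optional[str],
    subdivision_name: Optional[str],
    municipality: Optional[str],
) -> str:
    """Hypothetical helper: trim each field, then join them into one ID.

    Assumption (inferred from the example ID in this issue): inner spaces
    become underscores and the fields are joined with a dash.
    """
    parts = [country_code, subdivision_name, municipality]
    # strip() removes leading and trailing whitespace; None is treated as empty.
    cleaned = [(part or "").strip().replace(" ", "_") for part in parts]
    return "-".join(cleaned)


# A trailing space in the municipality no longer leaks into the ID.
trimmed_id = build_location_id("US", "Washington", "Wenatchee ")
```

With this sketch, the value "Wenatchee " yields US-Washington-Wenatchee instead of US-Washington-Wenatchee_.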

@emmambd emmambd added the bug Something isn't working label Apr 15, 2024
@emmambd emmambd removed the bug Something isn't working label Apr 22, 2024
@emmambd
Contributor

emmambd commented Apr 22, 2024

Tasks:

  • Identify all feeds with a missing municipality and/or subdivision field (2 underscores in the middle, or an underscore at the beginning or end of the ID)
  • Modify the concatenation code in the populate script to trim spaces
  • Update the database (manually or with Liquibase)
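The first task could be approached with a small pattern check like the one below. This is a sketch under assumptions: the pattern encodes only the symptoms listed above (an underscore at the start or end, or two underscores in a row), and the extra check for an underscore next to a dash is an addition for illustration that assumes the dash is the field separator. is_suspect_location_id is a hypothetical name.

```python
import re

# Symptoms from the task list: leading underscore, trailing underscore,
# or two consecutive underscores.
# Added assumption (not from the issue): an underscore directly next to
# the dash separator also signals an untrimmed field.
SUSPECT_PATTERN = re.compile(r"^_|_$|__|-_|_-")


def is_suspect_location_id(location_id: str) -> bool:
    """Return True when the ID shows signs of untrimmed or missing fields."""
    return bool(SUSPECT_PATTERN.search(location_id))


# Illustrative IDs only; the first one is the example from this issue.
candidates = ["US-Washington-Wenatchee_", "US-Washington-Wenatchee"]
suspects = [c for c in candidates if is_suspect_location_id(c)]
```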

Resources:

update database script locally

@davidgamez
Member Author

Blocked by #402

@jcpitre
Contributor

jcpitre commented Apr 25, 2024

Because the dash is a legitimate character in subdivision_name and municipality, I suggest changing the separator to a character that will be disallowed when setting the data, for example |.

Question: why is it a problem if the ID ends with the separator (with |, an example is US|New-York|) or has two separators one after the other (e.g. US||)?
That is as good an ID as any other, plus it conveys some information.
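To illustrate the separator concern, here is a small sketch. The IDs are made up for illustration and split_location_id is a hypothetical helper; the only assumption is that | never appears inside a field value.

```python
from typing import List


def split_location_id(location_id: str, separator: str) -> List[str]:
    """Hypothetical helper: split an ID back into its fields."""
    return location_id.split(separator)


# With a dash separator, a dash inside a name makes the split ambiguous.
dash_parts = split_location_id("CA-Quebec-Saint-Jean-sur-Richelieu", "-")

# With a separator that is disallowed in the data, the fields stay intact.
pipe_parts = split_location_id("CA|Quebec|Saint-Jean-sur-Richelieu", "|")

# Empty fields still parse into three positions, so US|| remains readable.
empty_parts = split_location_id("US||", "|")
```

The dash version produces six parts for a three-field ID, while the | versions always produce exactly three, including when fields are empty.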

@emmambd emmambd removed the blocked label Apr 29, 2024
@jcpitre
Contributor

jcpitre commented Apr 29, 2024

In the current DEV db, there are:

  • 5 subdivision_name values starting or ending with a space
  • 6 municipality values starting or ending with a space

I also tested for two or more consecutive spaces in either field, but found none.
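The two checks described above can be expressed as simple predicates. This is only a sketch: the sample values are invented, and the real check was run against the DEV database, whose query is not shown in this thread.

```python
def has_edge_space(value: str) -> bool:
    """True when the value starts or ends with a space."""
    return value.startswith(" ") or value.endswith(" ")


def has_repeated_spaces(value: str) -> bool:
    """True when the value contains two or more consecutive spaces."""
    return "  " in value


# Invented sample data, for illustration only.
sample_municipalities = ["Wenatchee ", " Seattle", "New York", "Spokane"]

edge_count = sum(has_edge_space(v) for v in sample_municipalities)
repeated_count = sum(has_repeated_spaces(v) for v in sample_municipalities)
```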

@jcpitre
Contributor

jcpitre commented Apr 30, 2024

The trimming of spaces has already been done in #406.
I tested this by running the populate script without trimming, then the same script with trimming. The result is that there are duplicate entries in the location table.
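A small simulation of why the two runs leave duplicates behind. It is a sketch, not the populate script: make_location_id is hypothetical, and the ID format (dash separator, spaces replaced by underscores) is inferred from the example ID in this issue.

```python
from typing import Dict, Tuple

Row = Tuple[str, str, str]


def make_location_id(row: Row, trim: bool) -> str:
    """Hypothetical ID builder with trimming switched on or off."""
    fields = [field.strip() if trim else field for field in row]
    return "-".join(field.replace(" ", "_") for field in fields)


# The location table is simulated as a dict keyed by location ID.
location_table: Dict[str, Row] = {}
source_row: Row = ("US", "Washington", "Wenatchee ")

# First run without trimming, second run with trimming.
for trim in (False, True):
    location_table.setdefault(make_location_id(source_row, trim), source_row)

duplicate_ids = sorted(location_table)
```

Both IDs now exist for the same place, and the untrimmed one is no longer produced by the script, so it stays in the table as an unused entry.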

image

I suggest we manually remove these entries from the dev database once #406 is released.

@jcpitre jcpitre added the blocked label May 1, 2024
@jcpitre
Contributor

jcpitre commented May 1, 2024

Blocked until after the release.
We will then have to remove the entries in the location table by hand.

Or maybe we should automatically remove all unused entries in the location table once in a while?
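One way the automatic cleanup could look, sketched with an in-memory SQLite database. The table and column names (location, feed_location, location_id) are assumptions for illustration and may not match the real schema; delete_unused_locations is a hypothetical function.

```python
import sqlite3


def delete_unused_locations(conn: sqlite3.Connection) -> int:
    """Delete locations that no feed references; return how many were removed.

    Assumed schema (hypothetical): location(id) and
    feed_location(feed_id, location_id).
    """
    cursor = conn.execute(
        "DELETE FROM location WHERE NOT EXISTS ("
        " SELECT 1 FROM feed_location"
        " WHERE feed_location.location_id = location.id)"
    )
    conn.commit()
    return cursor.rowcount


# Demo on throwaway data: one referenced location, one orphan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE feed_location (feed_id TEXT, location_id TEXT)")
conn.executemany(
    "INSERT INTO location (id) VALUES (?)",
    [("US-Washington-Wenatchee",), ("US-Washington-Wenatchee_",)],
)
conn.execute(
    "INSERT INTO feed_location VALUES (?, ?)",
    ("feed-1", "US-Washington-Wenatchee"),
)
deleted = delete_unused_locations(conn)
remaining = [r[0] for r in conn.execute("SELECT id FROM location ORDER BY id")]
```

NOT EXISTS is used rather than NOT IN so that a NULL location_id in the referencing table cannot silently prevent any deletion.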

@emmambd emmambd removed the blocked label May 21, 2024
@jcpitre jcpitre closed this as completed May 22, 2024
@jcpitre
Contributor

jcpitre commented May 22, 2024

Deleted 15 unused locations by hand in PROD.


3 participants