-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove extra spaces from user full name in staging tables #1365
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good. I just have comments/questions
In case I missed it, did we agree to clean the data in the mart in our meeting yesterday? @blarghmatey The staging seems to be a place to do that at least that's what we have been doing according to
https://github.com/mitodl/ol-data-platform/blob/main/src/ol_dbt/docs/staging_guidelines.md
Running the code reveals that the extra space exists not only in edX.org, but also on each platform. See examples.
https://xpro.mit.edu/admin/users/user/145180/change/
https://mitxonline.mit.edu/admin/users/user/103141/change/
https://studio.mitx.mit.edu/admin/auth/user/27502/change/
@pdpinch Do we need to create issues to address it in the platform? Not sure if its worth it
I don't think it's worth it. It doesn't actually cause any problems in the apps. |
I agree that doing the cleaning in staging is the right place to apply the fixes. |
@rachellougee when you're back could you take a look at this if you have time? I changed it to modify the staging tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good and table ran fine
What are the relevant tickets?
https://github.com/mitodl/hq/issues/5832
Description (What does it do?)
Remove extra spaces from the user full name column in staging table. As discussed during the data platform meeting today it's better to make this change in the table vs. the query/superset dashboard since it's considered cleaning the data.
How can this be tested?
dbt build --select stg__mitxresidential__openedx__auth_user
dbt build --select stg__edxorg__bigquery__mitx_user_info_combo
dbt build --select stg__mitxresidential__openedx__auth_userprofile
dbt build --select stg__micromasters__app__postgres__profiles_profile
dbt build --select stg__mitxonline__app__postgres__users_user
dbt build --select stg__mitxpro__app__postgres__users_user
dbt build --select stg__bootcamps__app__postgres__profiles_profile