feat(name): add hyphenated patterns to lastName() #691

Closed
ejcheng opened this issue Mar 26, 2022 · 19 comments · Fixed by #1819 or #1872
Labels
c: locale Permutes locale definitions good first issue Good for newcomers help wanted Extra attention is needed m: person Something is referring to the person module p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug
Comments

@ejcheng
Member

ejcheng commented Mar 26, 2022

Clear and concise description of the problem

The ability to generate hyphenated last names, so that the generated names seem a little more realistic. Currently, the function only generates single last names.

Suggested solution

Add a boolean option hyphenated?

Alternative

No response

Additional context

No response

@ejcheng ejcheng added c: feature Request for new feature p: 1-normal Nothing urgent s: needs decision Needs team/maintainer decision labels Mar 26, 2022
@ejcheng ejcheng added this to the v6.2 - New small features milestone Mar 26, 2022
@ejcheng ejcheng moved this to Todo in Faker Roadmap Mar 26, 2022
@ejcheng ejcheng removed the status in Faker Roadmap Mar 26, 2022
@ejcheng ejcheng moved this to Todo in Faker Roadmap Mar 26, 2022
@ST-DDT
Member

ST-DDT commented Mar 27, 2022

IMO there are three possible solutions.

  1. Extend the locales
  2. Add parameter
  3. Random chance

As I'm not sure whether this is actually a thing in all locales, maybe we should add it to the locale files directly?

@ejcheng
Member Author

ejcheng commented Apr 27, 2022

@ST-DDT Yes, I don't think hyphenated last names are a thing in all locales. On another somewhat related note, what about locales with two or more surnames? Some Spanish names have multiple surnames.

@ST-DDT
Member

ST-DDT commented Apr 27, 2022

On another somewhat related note, what about locales with two or more surnames? Some Spanish names have multiple surnames.

IMO if you request a single surname, you only get one (hyphenated or not).
We could/should use a name pattern for that.

'#{prefix} #{first_name} #{last_name}',
'#{first_name} #{last_name} #{suffix}',
'#{first_name} #{last_name}',
'#{first_name} #{last_name}',
'#{male_first_name} #{last_name}',
'#{female_first_name} #{last_name}',

This is the old format that was never usable.
But I think, we should revive it using faker.fake and treat all {{patterns}} that don't refer to methods as references to the locale data. (Basically an automatic faker.random.arrayElement(faker.definitions.<pattern>))

That way we can generate firstname lastname, lastname firstname or firstname lastname lastname easily per locale.
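
A minimal sketch of that idea, assuming `{{module.key}}` placeholders and a tiny hand-written locale table. The names `LocaleDefinitions`, `definitions`, `arrayElement`, and `resolvePattern` are hypothetical and are not Faker's actual implementation; the sketch only illustrates resolving every placeholder by picking a random entry from the locale data.

```typescript
// Hypothetical, heavily simplified locale data (real locale files are much larger).
type LocaleDefinitions = Record<string, Record<string, string[]>>;

const definitions: LocaleDefinitions = {
  person: {
    first_name: ["Olivia", "Andrew"],
    last_name: ["Newton", "John", "Webber"],
  },
};

// Picks one element. `random` is injectable so the sketch can be tested deterministically.
function arrayElement<T>(values: readonly T[], random: () => number = Math.random): T {
  if (values.length === 0) {
    throw new Error("Cannot pick from an empty array");
  }
  return values[Math.floor(random() * values.length)];
}

// Replaces every {{module.key}} placeholder with a random entry from the locale data.
function resolvePattern(
  pattern: string,
  defs: LocaleDefinitions,
  random: () => number = Math.random,
): string {
  return pattern.replace(/\{\{(\w+)\.(\w+)\}\}/g, (_match, moduleName: string, key: string) => {
    const entries = defs[moduleName]?.[key];
    if (entries === undefined) {
      throw new Error(`No locale data for ${moduleName}.${key}`);
    }
    return arrayElement(entries, random);
  });
}

// Different per-locale patterns give different name shapes from the same data.
const fullName = resolvePattern("{{person.first_name}} {{person.last_name}}", definitions);
const doubleSurname = resolvePattern(
  "{{person.first_name}} {{person.last_name}} {{person.last_name}}",
  definitions,
);
```

Under this scheme, supporting "lastname firstname" or two surnames for a locale would only require a different pattern string, not new code.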

@ejcheng
Member Author

ejcheng commented Apr 27, 2022

On another somewhat related note, what about locales with two or more surnames? Some Spanish names have multiple surnames.

IMO if you request a single surname, you only get one (hyphenated or not). We could/should use a name pattern for that.

'#{prefix} #{first_name} #{last_name}',
'#{first_name} #{last_name} #{suffix}',
'#{first_name} #{last_name}',
'#{first_name} #{last_name}',
'#{male_first_name} #{last_name}',
'#{female_first_name} #{last_name}',

This is the old format that was never usable. But I think, we should revive it using faker.fake and treat all {{patterns}} that don't refer to methods as references to the locale data. (Basically an automatic faker.random.arrayElement(faker.definitions.<pattern>))

That way we can generate firstname lastname, lastname firstname or firstname lastname lastname easily per locale.

Yes, this method would work really well.

@ST-DDT
Member

ST-DDT commented May 5, 2022

This is the old format that was never usable. But I think, we should revive it using faker.fake and treat all {{patterns}} that don't refer to methods as references to the locale data. (Basically an automatic faker.random.arrayElement(faker.definitions.<pattern>))
That way we can generate firstname lastname, lastname firstname or firstname lastname lastname easily per locale.

Yes, this method would work really well.

I created PR #927 for that (depends on #884)

@xDivisionByZerox xDivisionByZerox added the m: person Something is referring to the person module label Jul 30, 2022
@ST-DDT ST-DDT added good first issue Good for newcomers help wanted Extra attention is needed s: accepted Accepted feature / Confirmed bug c: locale Permutes locale definitions and removed s: needs decision Needs team/maintainer decision labels Sep 8, 2022
@Shinigami92 Shinigami92 removed the c: feature Request for new feature label Sep 8, 2022
@ST-DDT
Member

ST-DDT commented Sep 8, 2022

The locale data/fake name pattern needs to be updated accordingly.

@ejcheng ejcheng changed the title feat: add hyphenated option to name.lastName() feat(name): add hyphenated option to lastName() Sep 11, 2022
@conner-c
Contributor

I am interested in this issue. May I be assigned, please?

@ejcheng
Member Author

ejcheng commented Oct 18, 2022

I am interested in this issue. May I be assigned, please?

Sure!

@ST-DDT ST-DDT changed the title feat(name): add hyphenated option to lastName() feat(name): add hyphenated patterns to lastName() Oct 18, 2022
@markscamilleri

@ST-DDT Yes, I don't think hyphenated last names are a thing in all locales. On another somewhat related note, what about locales with two or more surnames? Some Spanish names have multiple surnames.

Sorry to just pop in, I just wanted to add that some locales (United Kingdom for example) use both hyphenated and non-hyphenated (with the latter having terrible support sadly) - for example: Olivia Newton-John and Andrew Lloyd Webber. There are also some compounded surnames with more than 2 surnames which also may or may not be hyphenated (although as far as I am aware, these are less common)

IMO if you request a single surname, you only get one (hyphenated or not). We could/should use a name pattern for that.

'#{prefix} #{first_name} #{last_name}',
'#{first_name} #{last_name} #{suffix}',
'#{first_name} #{last_name}',
'#{first_name} #{last_name}',
'#{male_first_name} #{last_name}',
'#{female_first_name} #{last_name}',

This is the old format that was never usable. But I think, we should revive it using faker.fake and treat all {{patterns}} that don't refer to methods as references to the locale data. (Basically an automatic faker.random.arrayElement(faker.definitions.<pattern>))

That way we can generate firstname lastname, lastname firstname or firstname lastname lastname easily per locale.

I think this would mean that such combinations can be generated by specifying different patterns?

@ST-DDT
Member

ST-DDT commented Oct 31, 2022

@markscamilleri Yes, but then it would only work for the fullName method. I think if two or more names (hyphenated or space-separated) are possible for surnames in the UK, then we should add some samples to the en(_GB?).person.last_name locale data directly.
However, if these are multiple surnames, as I think Spanish has (father's and mother's surname?), then they should be added to the pattern list.

@conner-c Any progress so far?

@markscamilleri

@markscamilleri Yes, but then it would only work for the fullName method. I think if two or more names (hyphenated or space-separated) are possible for surnames in the UK, then we should add some samples to the en(_GB?).person.last_name locale data directly. However, if these are multiple surnames, as I think Spanish has (father's and mother's surname?), then they should be added to the pattern list.

Thanks @ST-DDT! Well, technically, they're multiple surnames combined into one (usually to keep both family names after marriage). However, on legal documentation (birth certificates, passports, ID cards and driver's licenses) these would be treated as a single surname. So I guess it would make more sense to have these as samples in the locale data.

Such surnames are also allowed in other en locales so I think it makes more sense to add this to the en sample list rather than the GB specific one.

I think this would be different to what @conner-c is working on? If it is, I'm happy to open up another issue and work on it.

@ST-DDT
Member

ST-DDT commented Nov 1, 2022

I think this would be different to what @conner-c is working on? If it is, I'm happy to open up another issue and work on it.

No, I think that is exactly what conner-c is working on. If they don't react in a week or so, you can take over.

@conner-c
Contributor

conner-c commented Nov 8, 2022

I have been busy, so I have not had much time to work on it. It seems like @markscamilleri is currently displaying more interest than me and seems to have a better understanding of the issue. Feel free to unassign me and assign them to the issue. Sorry for such a late response.

@ejcheng ejcheng assigned markscamilleri and unassigned conner-c Nov 8, 2022
@matthewmayer
Contributor

Now that the fullName pattern code has landed, this can be revisited.

It seems there are two possible strategies:

  1. In some locales, add a hyphenated name pattern with two lastNames and a low percentage weight (similar to how it already works for es with double surnames)

  2. Add a few sample hyphenated patterns directly into the last_name definitions.
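
The two strategies could look roughly like this. This is only a sketch: the `WeightedPattern` shape, the `weightedPick` helper, the 9:1 weights, and the sample surnames are placeholders chosen for illustration, not values taken from Faker's locale data.

```typescript
interface WeightedPattern {
  value: string;
  weight: number;
}

// Strategy 1 (hypothetical data): a rare hyphenated pattern next to the plain one.
// The 9:1 weights are placeholders; only their ratio matters.
const lastNamePatterns: WeightedPattern[] = [
  { value: "{{person.last_name}}", weight: 9 },
  { value: "{{person.last_name}}-{{person.last_name}}", weight: 1 },
];

// Strategy 2 (hypothetical data): hyphenated samples stored directly among the surnames.
const lastNames: string[] = ["Smith", "Jones", "Newton-John"];

// Picks a pattern with probability proportional to its weight.
// `random` is injectable so the behaviour can be checked deterministically.
function weightedPick(
  patterns: readonly WeightedPattern[],
  random: () => number = Math.random,
): string {
  const total = patterns.reduce((sum, pattern) => sum + pattern.weight, 0);
  if (total <= 0) {
    throw new Error("Patterns need a positive total weight");
  }
  let threshold = random() * total;
  for (const pattern of patterns) {
    threshold -= pattern.weight;
    if (threshold < 0) {
      return pattern.value;
    }
  }
  // Guards against floating point edge cases; normally unreachable.
  return patterns[patterns.length - 1].value;
}

const chosenPattern = weightedPick(lastNamePatterns);
```

Strategy 1 combines any two existing surnames, so it needs no new data but can produce unusual pairs. Strategy 2 only ever returns curated names, at the cost of maintaining them per locale.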

@ST-DDT
Member

ST-DDT commented Feb 11, 2023

This could be fixed via #1819

@github-project-automation github-project-automation bot moved this from Todo to Done in Faker Roadmap Feb 21, 2023
@matthewmayer matthewmayer reopened this Feb 21, 2023
@matthewmayer
Contributor

We should discuss if we want hyphenated patterns in the default en locale?

@matthewmayer
Contributor

A few samples from major English speaking countries.

Australia - 3% https://mccrindle.com.au/article/blog/last-name-mash-ups-give-fascinating-insights-into-changing-social-trends/
US - 5-6% https://www.nytimes.com/2011/11/24/fashion/babies-surnames-to-hyphenate-or-not.html / https://www.researchgate.net/publication/340388866_Women's_Marital_Surname_Change_by_Bride's_Age_and_Jurisdiction_of_Residence_A_Replication
UK - 11% https://www.theguardian.com/lifeandstyle/2017/nov/02/keeping-up-with-smith-joneses-no-longer-posh-double-barrelled-surname
India - ??% https://timesofindia.indiatimes.com/city/delhi/whats-in-a-sires-name/articleshow/2982.cms https://en.wikipedia.org/wiki/Double-barrelled_name#Non-Western_surname_traditions

My feeling is we should probably have {{person.last_name}}-{{person.last_name}} with, say, a 5% weight for en, allow other en-FOO locales to override it if needed, and have {{person.last_name}} with 100% weight for locales with no pattern at the moment (e.g. ja) so they don't fall back to the en locale rules unexpectedly. Then, if we get additional information about other locales, we could add double or hyphenated patterns for them later.
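
To make the fallback concern concrete, here is a rough sketch. The table layout, the `patternsFor` and `hyphenatedShare` helpers, and the single-step fallback to `en` are simplifying assumptions, not Faker's real locale resolution. The en weights follow the 5% suggestion above; the en_GB weights merely reuse the 11% UK figure quoted earlier as an illustrative override.

```typescript
interface WeightedPattern {
  value: string;
  weight: number;
}

// Hypothetical per-locale pattern tables (illustrative weights, see the lead-in).
const lastNamePatterns: Record<string, WeightedPattern[] | undefined> = {
  en: [
    { value: "{{person.last_name}}", weight: 95 },
    { value: "{{person.last_name}}-{{person.last_name}}", weight: 5 },
  ],
  en_GB: [
    { value: "{{person.last_name}}", weight: 89 },
    { value: "{{person.last_name}}-{{person.last_name}}", weight: 11 },
  ],
  // An explicit single-name pattern keeps ja from inheriting the en rules.
  ja: [{ value: "{{person.last_name}}", weight: 100 }],
};

// Returns the locale's own patterns, falling back to `en` only when it defines none.
function patternsFor(locale: string): WeightedPattern[] {
  return lastNamePatterns[locale] ?? lastNamePatterns["en"] ?? [];
}

// Expected fraction of generated surnames that would be hyphenated.
function hyphenatedShare(patterns: readonly WeightedPattern[]): number {
  const total = patterns.reduce((sum, pattern) => sum + pattern.weight, 0);
  const hyphenated = patterns
    .filter((pattern) => pattern.value.includes("}}-{{"))
    .reduce((sum, pattern) => sum + pattern.weight, 0);
  return total === 0 ? 0 : hyphenated / total;
}

const jaShare = hyphenatedShare(patternsFor("ja"));
const fallbackShare = hyphenatedShare(patternsFor("de")); // no entry here, so it uses en
```

In this sketch a locale without its own entry silently picks up the en share of hyphenated names, which is exactly the surprise the explicit 100% entry for ja avoids.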

@markscamilleri

Please accept my apologies for not working on this. I wanted to work on it, but I went through some life events that left me unable to dedicate the time to it. I understand I should have spoken up earlier, but I was still hoping I could contribute to this.

@ST-DDT
Member

ST-DDT commented Mar 25, 2023

Thanks for letting us know. If you ever have time and interest to contribute again, feel free to do so.
I hope you will have a great time.

Status: Done
7 participants