Add (Modern Standard) Arabic language #3
Comments
Are these Modern Standard Arabic, too? Adding them would be a matter of one line each.
Seeds for crawling a language corpus in Moroccan Arabic (BCP47 language code
|
For Algerian Arabic (BCP47 language code |
Yes, @brawer. These two are definitely both Standard Arabic: About the country-specific news services, I can't tell if they are in local dialects of Standard Arabic or in the regional Arabic. So I think we need to ask for help reviewing them one by one.
Actually, |
So far, I’ve tried to follow the BCP47 language tags as per Unicode conventions. There, macrolanguage codes stand for the individual language that “everyone” (a typical webmaster or programmer who isn’t deeply rooted in the internationalization scene) means when they see that code. For example, according to Unicode/ICU/CLDR, the code for Estonian is |
Cool! Yeah, that's what I thought was happening here, but I wasn't sure. About the other links, hopefully regional Arabic sources, I'll send an update as soon as I get more info.
Regarding |
hespress.com is definitely MSA (including most comments on the random articles I checked). Actually, you are unlikely to find any newspapers in local dialects; your best bet would be forums and the like, which are considered less “formal”.
Is there any work being done regarding any Arabic dialects?
We can start with http://www.dw.com/ar/, which is Modern Standard Arabic. I think MSA is a good start, and we can add regional dialects later. Please list here any source you think we should add, for MSA or regional dialects.
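To make it easier to collect suggestions, here is a rough sketch of how proposed seeds could be grouped by language tag. This is only an illustration, not the project's actual seed format: the `SEEDS` table and the helpers `seeds_for` and `hosts_for` are hypothetical names, the use of the tag `ar` for MSA is an assumption, and the `http://` scheme on hespress.com is assumed as well.

```python
from urllib.parse import urlparse

# Hypothetical seed table: language tag -> list of seed URLs.
# Assumption: "ar" is used here for Modern Standard Arabic.
SEEDS = {
    "ar": [
        "http://www.dw.com/ar/",   # named in this issue as MSA
        "http://hespress.com/",    # reported in the comments as MSA; scheme assumed
    ],
}


def seeds_for(tag):
    """Return a copy of the seed URLs registered for a language tag."""
    return list(SEEDS.get(tag, []))


def hosts_for(tag):
    """Return the sorted, de-duplicated host names of a tag's seeds."""
    return sorted({urlparse(url).netloc for url in seeds_for(tag)})


if __name__ == "__main__":
    for host in hosts_for("ar"):
        print(host)
```

With a layout like this, adding a regional dialect later would mean adding one more key with its own list of URLs.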