Urdu-resource-NLP

This repo contains a preprocessor, stopwords, and other utilities needed for Urdu NLP work.

  1. urdu.py defines URDU_DIACRITICS, URDU_DIGIT, URDU_PUNCTUATIONS, URDU_EXTRA_CHARACTER, URDU_ALPHABET, and URDU_STOPWORDS.

  2. The notebook preprocessor.ipynb contains some examples of preprocessing.

  3. capture_phone_or_email_from_text.py contains two functions that accept a string and report whether a phone number or an email address is present in the text. Each returns a boolean value:
    0 -> Not found
    1 -> Found
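
The phone and email check described in item 3 could look roughly like the sketch below. This is not the code from capture_phone_or_email_from_text.py: the function names `has_email` and `has_phone` and both regular expressions are assumptions made for illustration, and the phone pattern is deliberately loose (10 to 13 digits, optionally separated by spaces or hyphens).

```python
import re

# Hypothetical patterns; the repo's actual expressions are not shown in this README.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Optional leading "+", then 10 to 13 digits with optional space/hyphen separators.
# In Python 3 string patterns, \d matches any Unicode decimal digit,
# so Urdu digits (۰-۹) are matched as well as ASCII 0-9.
PHONE_RE = re.compile(r"\+?\d(?:[\s-]?\d){9,12}")


def has_email(text: str) -> int:
    """Return 1 if the text appears to contain an email address, else 0."""
    return int(EMAIL_RE.search(text) is not None)


def has_phone(text: str) -> int:
    """Return 1 if the text appears to contain a phone number, else 0."""
    return int(PHONE_RE.search(text) is not None)


if __name__ == "__main__":
    print(has_email("رابطہ: test@example.com"))
    print(has_phone("فون 0300-1234567"))
```

Returning `int` rather than `bool` keeps the 0/1 convention listed above. A stricter phone pattern would need to encode the number formats the project actually cares about.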