SortingHatLib

Library for ML feature type inference: https://github.com/pvn25/MLDataPrepZoo/tree/master/MLFeatureTypeInference

Due to git-lfs limits, the resource files have been moved to: https://drive.google.com/drive/folders/1eC8F5pO2hSoQf4RQM7zww49y2ZbLIvqG

By default, these resources are downloaded automatically the first time you run the program. If that does not work for some reason, you can download them manually from the link above.

  1. Clone the repository and install the package with pip:
git clone https://github.com/pvn25/SortingHatLib.git

pip install SortingHatLib/
  2. Import the library:
import sortinghat.pylib as pl
  3. Read in the CSV file using pandas:
import pandas as pd
dataDownstream = pd.read_csv('adult.csv')
  4. Perform base featurization of the raw CSV file:
dataFeaturized = pl.FeaturizeFile(dataDownstream)
  5. Extract bigram features for the Random Forest model:
dataFeaturized1 = pl.FeatureExtraction(dataFeaturized)
  6. Finally, load the Random Forest model for prediction:
y_RF = pl.Load_RF(dataFeaturized1)
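
The steps above can be chained into a single helper. The following is a minimal sketch and not part of the library: the function names `predict_feature_types` and `predict_from_csv` are hypothetical, and it assumes only the three calls shown above (`FeaturizeFile`, `FeatureExtraction`, `Load_RF`) with the arguments shown there.

```python
def predict_feature_types(raw_frame, pl):
    """Run the featurization and prediction steps in order.

    raw_frame: the pandas DataFrame read from the raw CSV file.
    pl: the imported `sortinghat.pylib` module, passed in explicitly
        so this (hypothetical) helper has no hard dependency of its own.
    """
    featurized = pl.FeaturizeFile(raw_frame)             # base featurization
    bigram_features = pl.FeatureExtraction(featurized)   # bigram features for Random Forest
    return pl.Load_RF(bigram_features)                   # load the model for prediction


def predict_from_csv(csv_path):
    """Hypothetical convenience wrapper: read a CSV file and run the steps on it."""
    # Imported lazily so the helper above can be used without
    # pandas or SortingHatLib being installed.
    import pandas as pd
    import sortinghat.pylib as pl

    return predict_feature_types(pd.read_csv(csv_path), pl)
```

With the package installed, `predict_from_csv('adult.csv')` would run the same calls as the steps above and return whatever `pl.Load_RF` returns.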
