# CoffeeMaker ## Update: Ciao, I have pushed the version of our nanoaod Analyzer reading on multiple nanoaod files hosted in eospublic. The notebook (Zpeak_nano_multipledataset_v2.ipynb) equipped with: a.) The Hadoop-Xrootd connector, declared in spark context (EOS.jar) to enable reading nanoaods files hosts in eospublic. b.) The code structure is almost identical to Melo's z-mumu-cr.ipynb. By aggregating the derived quantities such as the kinematics of the reconstructed Z boson in a column and save the state variable (pass/fail muon selection). c.) It is a work in progress, we have not figure how to implement the plotting end due to the known issues we have discussed. ## To-do: a.) The weights of each processes does not save at first evaluation, looking for a better implementation. b.) Plot a histogram with weight from a collection which facilitated with histogrammar features. c.) If the plotting issue resolved, we will scale up the operation by topping up more nanoaod MC and datasetsa. d.) Re-optimize the code and conduct an assessment on how much we gain from spark in term of time and efficiencies. If the Zpeak exercise is a success, we can move forward to port other analysis into spark based analytical framework. ## How to run ### Prerequisites * Python 2.7 * numpy version 1.13.1 * jupyter 4.3+ * ipython version 5.7+ * histbook * vega version 1.1 * uproot ``` pip install numpy==1.13 pip install jupyter --user pip install ipython pip install histbook --user pip install vega==1.1 --user pip install uproot --user ``` Note: you might need to upgrade pip to version 18 to get Jupyter 4.3 or above (`pip install --upgrade pip`) ### Setup Once you have installed the prerequisites, set up the striped client ``` git clone http://cdcvs.fnal.gov/projects/nosql-ldrd striped cd striped python setup.py install --user ``` Now on a different directory, clone Coffea repository ``` git clone git@github.com:LPC-DM/CoffeaMaker.git ``` ### Run Launch Jypyter Notebook ``` cd CoffeaMaker/ jupyter notebook ``` It should open a new page in your default browser. If not , you can follow the link displayed in the terminal. At the page you will find the directories and files from the location you ran jupyter notebook. select one of the samples to run it.