-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathnotes.txt
138 lines (101 loc) · 7.46 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
The CDS&E program seems to focus more on large data sets that already exist, but some of the programs within CCF (like the Cyberinfrastructure for Sustained Scientific Innovation and possibly the Software and Hardware Foundations) look like such a project might be in scope.
BIGDATA: https://www.nsf.gov/pubs/2018/nsf18539/nsf18539.htm
CSSI: https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=505505
CISE: https://www.nsf.gov/dir/index.jsp?org=CISE
==> don't focus too much on computation. need to get my science funded.
==> want to write "core program" proposals on science
==> funding streams for computational tools come and go
==> stand alone, be unique and recognizable
==> not having science grants won't hurt my chances with CSSI/etc applications
==> NSF has 2 month support for PI.
==> some communities have multiple sources of funding e.g. materials science
==> Dark matter is listed as HEP
==> NSF careful about funding the same person twice, work together with DOE. Should talk to program officers about what they usually do about this.
==> At the end of the day what's important is to have a productive career and be happy :)
==> core programs *do* support software development
==> LHC requires software
==> if a community would benefit from a particular tool, Office of Infrastructure is the place to go. OIC works with domain (physics) to fund. Expects best practices, documentation, outreach to make people aware of the tool.
==> OIC works *across* NSF programs
==> Equivalent of OSCA at DOE
==> webinar, doing it again in a couple days
==> cities and scientists: need to connect to community so people know what we're doing so they keep funding us
==> evaluation criteria: 1. intellectual merit, 2. broader impacts
==> avoid jargon
==> *have to* make the science case. physics would be co-funding! if this doesn't get funded, what happens?
==> panels have to rank proposals. make sure people know why your proposal is critical.
==> CSSI has "solicitation-specific review criteria", listed in solicitation
==> if it's relevant to CDMS then get a letter that says they'll work with me. two-liner on standard letterhead is all that's needed.
==> see if the solicitation requires you to specify which track I'm applying to. Look for "CSSI-HDR" or "CSSI-SI2".
==> can look in the funded projects database and see if it was a DIBBS fund (data) or SI2 fund (software)
==> boundaries are pretty porous.
==> used to have "elements", SSE, these were the small ones. These are internal categories.
The NSF offers a ``Software Infrastructure for Sustained Innovation - SSE \& SSI'' grant
(\url{http://www.nsf.gov/pubs/2016/nsf16532/nsf16532.htm}). Work on specific software is most appropriate to the SSE grant, which typically awards \$500K over three years and is due in the spring. SSI grants have a broader scope and typically award \$1M over five years; SSI grants are due in the fall.
Full Proposal Deadline(s) (due by 5 p.m. proposer's local time):
April 26, 2016
SSE Proposals
September 20, 2016
SSI Proposals
February 21, 2017
SSE Proposals
September 19, 2017
SSI Proposals
The NSF awards about 25 SSE and SSI grants, in total, each year. Up to 19 SSE awards and 10 SSI awards are allowed each year.
The success rate for ``Software Infrastructure'' proposals - both SSE and SSI - is about 20\%.
These software grants generally support work that benefits
an entire field; it's rare that work focused on a single collaboration recieves
funding. To get a sense of typically-funded projects, see
\url{https://sites.google.com/site/softwarecyberinfrastructure/software/software} and
\url{http://www.nsf.gov/awardsearch/simpleSearchResult?queryText=SI2&ActiveAwards=true}.
software infrastructure (SSE, SSI) solicitation:
http://www.nsf.gov/pubs/2014/nsf14520/nsf14520.htm
software infrastructure for sustained innovation program summary:
http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504865
physics-division contact for Software Infrastructure grants:
http://www.nsf.gov/staff/staff_bio.jsp?lan=bmihaila&org=NSF&from_org=NSF
A list of SSE/SSI grant awardees:
https://sites.google.com/site/softwarecyberinfrastructure/software/software
http://www.nsf.gov/awardsearch/showAward?AWD_ID=1339757 (partial-file FTP!)
http://www.dataturbine.org/sites/default/files/biblio/0012-9623-93%252E3%252E242.pdf (data streams)
http://www.nsf.gov/awardsearch/simpleSearchResult?queryText=SI2&ActiveAwards=true
NSF software vision statement(s):
http://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf
http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817
Fermilab event-processing framework, art: http://art.fnal.gov/
Some interesting notes on data-intensive applications:
*https://www.confluent.io/blog/using-logs-to-build-a-solid-data-infrastructure-or-why-dual-writes-are-a-bad-idea/
* https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
* http://dataintensive.net/
Connecting different programs:
* http://www.swig.org/
These guys are an organization that's working on a shared data science infrastructure:
* https://ursalabs.org/tech/
Other software coalitions in physics:
* LArSoft (http://larsoft.org/larsoft-workshop-report-august-2016/) "develops and supports a shared base of physics software across Liquid Argon (LAr) Time Projection Chamber (TPC) experiments"
* HSF (http://hepsoftwarefoundation.org/) "facilitates coordination and common efforts in high energy physics (HEP) software and computing internationally"
FPGAs provide users with an API that gaurantees data synchronicity!
https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/manual/mnl_avalon_spec.pdf
BONSAI is an event-processing framework developed by neuroscientists for combining heterogenous streams of data:
http://journal.frontiersin.org/article/10.3389/fninf.2015.00007/full
Programs that do similar things
-- DFDL, daffodil
-- Kaitai.io
-- python construct
-- python structures
-- bindat, a lisp program?
-- ASN.1 ECN?
-- FlexT? From Hmelnov?
-- ruby bindata
-- telegram? A messaging service?
-- I *know* there's a commercial version of this out there!
Intersting Paper? Optimizing Binary Serialization with an Independent Data Definition Format
mosh is a latency-friendly remote-terminal protocol. The "State Syncrhonization Protocol" they use might be relevant for us! https://mosh.org/#techinfo
Apache Thrift allows communication between programs written in different languages using an Interface Definition Layer (IDL) written by the programmer. Supports quite a few languages, including C++, Python, and Rust.
Would the University of Washington Interactive Data Lab be interested in working with us? https://idl.cs.washington.edu/
And of course there's the Stanford Infolab, whose interesting projects include Trio (data provenance, http://infolab.stanford.edu/trio/) and Weld (Rust-backed data-processing DSL, http://dawn.cs.stanford.edu/blog/weld.html)
Paper on electrophysiology data and open-source software, hardware, and DAQ (!) efforts https://www.sciencedirect.com/science/article/pii/S0959438814002268?via%3Dihub
BESA is a software suite that operates on a wide range of neuroscience formats (but only MEF v2). Would they be interested in this data formatting tool? http://www.besa.de/products/overview/
reana, a platform for reproducible analyses.
=> uses the Common Workflow Language (!)
=> which is an open standard, for which openstand.org apparently has guidelines (!)