Scikit-HEP project - welcome!
The Scikit-HEP project is a community-driven and community-oriented project with the aim of providing Particle Physics at large with an ecosystem for data analysis in Python. The project started in Autumn 2016 and is under active development.
It is not just about providing core and common tools for the community. It is also about improving the interoperability between HEP tools and the scientific ecosystem in Python, and about improving the discoverability of utility packages and projects.
In terms of overall structure, the project should be seen as a toolset rather than a toolkit. It defines a set of five pillars, which together cover the major topics involved in a physicist's work. These are:
- Datasets: data from various sources, such as ROOT, NumPy/pandas, or databases, wrapped in a common interface.
- Aggregations: e.g. histograms that summarize or project a dataset.
- Modeling: data models and fitting utilities.
- Simulation: wrappers for Monte Carlo engines and other generators of simulated data.
- Visualization: interface to graphics engines, including ROOT, Matplotlib, and beyond.
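To make the Aggregations pillar concrete, here is a minimal sketch in plain Python of a histogram summarizing one variable of a small dataset. It does not use any Scikit-HEP package: the function name `fill_histogram`, the event records, and the binning are hypothetical and chosen only for illustration.

```python
def fill_histogram(values, low, high, nbins):
    """Count values in nbins equal-width bins covering [low, high)."""
    width = (high - low) / nbins
    counts = [0] * nbins
    for value in values:
        if low <= value < high:  # values outside the range are ignored
            # min() guards against floating-point rounding at the upper edge
            index = min(int((value - low) / width), nbins - 1)
            counts[index] += 1
    return counts


# Hypothetical dataset: one record per event, with made-up values.
events = [
    {"mass": 81.0, "pt": 10.0},
    {"mass": 86.0, "pt": 22.5},
    {"mass": 87.0, "pt": 31.0},
    {"mass": 91.2, "pt": 45.0},
    {"mass": 99.0, "pt": 12.0},
    {"mass": 105.0, "pt": 8.0},
]

# Project the dataset onto a single variable, then aggregate it.
masses = [event["mass"] for event in events]
counts = fill_histogram(masses, low=80.0, high=100.0, nbins=4)
print(counts)
```

The dedicated histogramming packages in the toolset cover the same idea with far richer axis types, storage, and performance; this sketch only shows what "summarize or project a dataset" means in practice.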
Toolset packages
To get started, have a look at our GitHub repository.
The list of presently available packages follows, together with a very short description of their goals:
Interoperability and data manipulation:
- root_numpy: Interface between ROOT and NumPy.
- formulate: Easy conversions between different styles of expressions.
- root_pandas: Module for conveniently loading/saving ROOT files as pandas DataFrames.
- uproot-methods: Pythonic behaviours for non-I/O related ROOT classes.
- uproot: Minimalist ROOT I/O in pure Python and NumPy.
Interface to HEP libraries:
Event processing:
- awkward-array: Manipulate arrays of complex data structures as easily as NumPy.
Particles and decays:
- Particle: PDG particle data and identification codes.
- DecayLanguage: Describe and convert particle decays between digital representations.
Histogramming:
- boost-histogram: Python bindings for the C++14 Boost::Histogram library.
- histbook: Versatile, high-performance histogram toolkit for NumPy.
Fitting:
- iminuit: MINUIT from Python - Fitting like a boss.
- probfit: Cost function builder for fitting distributions.
Simulation:
- pyhepmc: Next generation Python bindings for HepMC3.
Machine Learning:
- NNDrone: Collection of tools and algorithms to enable conversion of HEP ML to mass usage model.
Visualization:
- vegascope: View Vega/Vega-Lite plots in your web browser from local or remote Python processes.
Units and constants:
- hepunits: Units and constants in the HEP system of units.
Miscellaneous:
- scikit-hep-testdata: Common package to provide example files (e.g., ROOT) for testing and developing packages against.
- scikit-hep: Toolset of interfaces and tools for Particle Physics. To become a metapackage.
Some of these packages serve to bridge different technologies and/or popular packages from the Python scientific software stack.