The Scientific Python ecosystem is a collection of open-source scientific software packages written in Python. It is a broad and ever-expanding set of algorithms and data structures that grew around NumPy, SciPy, and matplotlib.
The ecosystem includes a wide variety of tools: some more specialized to specific domains such as biological imaging or astronomy, and others quite general for tasks such as data management and high-performance computing.
It includes projects such as Pandas (for data analysis), NetworkX (for graph computation), scikit-learn (for machine learning), and scikit-image (for image processing).
Ecosystem Packages#
Here is a curated selection of packages available in the ecosystem:
Core#
- NumPy, the fundamental package for numerical computation. NumPy defines the n-dimensional array data structure, the most common way of exchanging data within packages in the ecosystem.
- SciPy, a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics, and much more.
- Matplotlib, a mature and popular plotting package that provides flexible, publication-quality 2-D and 3-D visualization.
Data and computation#
- pandas, providing high-performance, easy-to-use data structures.
- SymPy, for symbolic mathematics and computer algebra.
- NetworkX, is a collection of tools for analyzing complex networks.
- scikit-image is a collection of algorithms for image processing.
- scikit-learn is a collection of algorithms and tools for machine learning.
Productivity and high-performance computing#
- IPython, a command-line interface to Python, for interactively exploring code, processing data, and testing code ideas.
- Jupyter Lab provides computational notebooks that combine interactive code with descriptive text in your web browser, useful especially for teaching and documenting research.
- Joblib, Dask, or Ray for distributed processing with a focus on numerical data.