Data Science: Getting Started

Core Libraries

Numeric

  • NumPy - Numerical Python: n-dimensional arrays and mathematical operations on them. The array is the primary data container passed between algorithms. Arrays are more efficient than built-in Python lists, and code written in lower-level languages can operate on them without copying the data.
  • SciPy - High-level numerical routines: optimisation, regression and interpolation
  • Matplotlib - 2D visualisations and interactive plots
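
A minimal sketch of the NumPy points above: element-wise arithmetic, a reduction along an axis, and a slice that shares memory with the original array instead of copying it. The values and variable names are made up for illustration.

```python
import numpy as np

# A small 2-D array of float64 values (illustrative data only).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Arithmetic is applied element by element, without an explicit Python loop.
# The result is a new array; `a` is left unchanged.
scaled = a * 10

# Reductions run along a chosen axis; axis=0 collapses the rows,
# giving one mean per column.
column_means = a.mean(axis=0)

# Basic slicing returns a view onto the same memory rather than a copy.
first_row = a[0]
first_row[0] = 99.0   # this also changes a[0, 0]

print(scaled)
print(column_means)
print(a[0, 0])
```

The view behaviour in the last step is the same mechanism that lets other libraries work on an array's buffer without duplicating it; call `a.copy()` when an independent array is needed.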

Interactive environments

  • IPython - A robust interactive shell. Useful for working with data interactively and for plotting with Matplotlib.
  • Jupyter Notebooks - Shareable documents that combine code, output and text; a common way to share data science work.

Domain-specific packages

  • Mayavi - 3D visualisations
  • pandas - Rich data structures and functions for working with structured data. The primary object is the DataFrame. It combines the high-performance array computing of NumPy with the flexible data manipulation of spreadsheets and relational databases.
  • SymPy - symbolic computing
  • scikit-image - image processing
  • scikit-learn - machine learning
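
As a rough illustration of the pandas entry in the list above, the sketch below builds a small DataFrame, filters it the way one might in a spreadsheet, aggregates it in the style of a SQL GROUP BY, and pulls a column out as a NumPy array. The column names and figures are hypothetical.

```python
import pandas as pd

# Hypothetical sales figures; the column names are invented for this example.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10.0, 12.0, 7.0, 9.0],
})

# Spreadsheet-style filtering: keep only the rows where the mask is True.
recent = df[df["year"] == 2021]

# Database-style aggregation, comparable to
# SELECT city, SUM(sales) ... GROUP BY city.
totals = df.groupby("city")["sales"].sum()

# A column can be handed to NumPy-based code as a plain array.
values = df["sales"].to_numpy()

print(recent)
print(totals)
print(values.mean())
```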

Sources