[ML] Machine Learning Libraries

유일리 2022. 10. 13. 14:42

1. Pandas

  • A library for managing and displaying data
  • The name “Pandas” comes from “panel data,” an econometrics term for datasets that track the same subjects over multiple time periods; the resulting panels can be pictured as sheets in Excel.
  • Pandas can be used to organize structured data as a dataframe, which is a two-dimensional data structure (tabular dataset) with labeled rows and columns, similar to a spreadsheet or SQL table.
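
To make the dataframe idea concrete, here is a minimal sketch. The housing records and the column names (`suburb`, `rooms`, `price`) are invented for illustration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical housing records, invented purely for illustration.
df = pd.DataFrame(
    {
        "suburb": ["Abbotsford", "Brunswick", "Carlton"],
        "rooms": [2, 3, 4],
        "price": [850_000, 1_020_000, 1_480_000],
    }
)

print(df)         # tabular view with labeled rows (index) and columns
print(df.dtypes)  # each column keeps its own data type

# Select rows with a boolean condition, much like a WHERE clause in SQL.
expensive = df[df["price"] > 1_000_000]

# Column-wise summary statistic.
mean_price = df["price"].mean()
```

Each column can hold a different data type (text for `suburb`, integers for `rooms` and `price`), which is what makes the structure resemble a spreadsheet or SQL table.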

example of Pandas' dataframe

2. NumPy

  • A Python library that makes it easy to work with matrices and, more generally, large multi-dimensional arrays
  • NumPy is short for “Numerical Python” and is often used in combination with Pandas.
  • NumPy manages multi-dimensional arrays and matrices, supports merging and slicing datasets, and offers a collection of mathematical functions including min, max, mean, standard deviation, and variance.
  • A Pandas dataframe is more suitable for managing a mix of data types, whereas a NumPy array is designed for numerical data, especially multi-dimensional data.
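
The merging, slicing, and summary functions mentioned above can be sketched as follows. The array values are arbitrary and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Two small numeric arrays; the values are arbitrary.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([[7.0, 8.0, 9.0]])

# Merge: stack b underneath a along the row axis.
merged = np.concatenate([a, b], axis=0)

# Slice: keep every row, but only the first two columns.
first_two_cols = merged[:, :2]

# Built-in mathematical functions over the whole array.
stats = {
    "min": merged.min(),
    "max": merged.max(),
    "mean": merged.mean(),
    "std": merged.std(),  # population standard deviation (ddof=0) by default
    "var": merged.var(),  # population variance (ddof=0) by default
}
print(stats)
```

Note that `std` and `var` compute the population versions by default; pass `ddof=1` if the sample versions are wanted.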

3. Scikit-learn

  • A free-software machine learning library for the Python programming language
  • Scikit-learn is the core library for general machine learning.
  • It offers an extensive repository of learning algorithms (logistic regression, decision trees, linear regression, gradient boosting, and more), a broad range of evaluation metrics such as mean absolute error, and data partitioning methods including split validation (a train/test split) and cross-validation.
  • Scikit-learn is also used for core machine learning tasks, including training a model and using the trained model to make predictions on test data.
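
The workflow described above (partition the data, train, predict, evaluate) can be sketched like this. The data is synthetic and generated on the spot; the choice of linear regression, the 70/30 split, and the 5 folds are assumptions made for the example, not recommendations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data (an assumption for this sketch): y depends linearly on two features plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, size=200)

# Split validation: hold out 30% of the rows as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Train the model on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)

# Use the trained model to predict the held-out test data.
y_pred = model.predict(X_test)

# Evaluate with mean absolute error.
mae = mean_absolute_error(y_test, y_pred)
print("MAE on test data:", mae)

# Cross-validation: 5 folds, scored with negated MAE (higher is better).
cv_scores = cross_val_score(
    LinearRegression(), X, y, cv=5, scoring="neg_mean_absolute_error"
)
print("Cross-validated MAE:", -cv_scores.mean())
```

Scikit-learn reports error metrics in negated form during cross-validation so that a larger score always means a better model, hence the sign flip when printing.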

Common terms and functions from Scikit-learn used in machine learning

4. Matplotlib

  • A plotting library (data visualization) built on the Python programming language and its numerical extension, NumPy
  • Matplotlib is a visualization library you can use to generate scatterplots, histograms, pie charts, bar charts, error charts, and other visual charts with just a few lines of code.
  • Matplotlib is generally used in conjunction with Seaborn themes.
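
As a rough illustration of the "few lines of code" claim, the sketch below draws a scatterplot and a histogram side by side. The data is random, and the `Agg` backend is selected only so the script also runs on a machine without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use

import matplotlib.pyplot as plt
import numpy as np

# Random data, generated only for this example.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)

# One figure with two panels.
fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_scatter.scatter(x, y, s=10)
ax_scatter.set_title("Scatterplot")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")

# hist returns the bin counts, the bin edges, and the drawn bars.
counts, bin_edges, _ = ax_hist.hist(x, bins=10)
ax_hist.set_title("Histogram")

fig.tight_layout()
# Call fig.savefig("plot.png") to write the figure to disk, or plt.show() to display it.
```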

5. Seaborn

  • A visualization package that adds a variety of color themes, statistical charts, and other features on top of Matplotlib
  • Seaborn is a popular Python visualization library based on Matplotlib.
  • This library comes with numerous built-in themes for visualization and complex visual techniques including color visualization of dependent and independent variables, sophisticated heatmaps, cluster maps, and pairplots.

6. TensorFlow

  • An open-source software library for dataflow programming across a range of tasks
  • While Scikit-learn offers a broad set of popular shallow algorithms, TensorFlow is the library of choice for deep learning and artificial neural networks (ANNs).
  • TensorFlow was created at Google and supports various advanced distributed numerical computation techniques.
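
For comparison with the Scikit-learn workflow, here is a minimal sketch of a small neural network built with TensorFlow's Keras API. The data is synthetic, and the layer sizes, optimizer, and epoch count are arbitrary assumptions chosen to keep the example fast, not tuned settings.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary classification data (an assumption for this sketch):
# the label is 1 when the four features sum to a positive number.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small feed-forward network: one hidden layer, one sigmoid output.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]
)

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train briefly; verbose=0 suppresses the progress output.
history = model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Predicted probabilities for the first three rows.
probs = model.predict(X[:3], verbose=0)
print(probs)
```

The `fit` and `predict` steps mirror the train-then-predict pattern from Scikit-learn, with the model architecture defined explicitly layer by layer.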