TEAM EDA

Day1 : Introduction

김현우 2018. 11. 16. 08:56

Ch1. Introduction


*Slides*

https://lagunita.stanford.edu/c4x/HumanitiesScience/StatLearning/asset/introduction.pdf



*Lectures*

Opening Remarks and Examples (18:18)

https://www.youtube.com/watch?v=2wLfFB_6SKI&list=PL5-da3qGB5ICcUhueCyu25slvsGp8IDTa


Supervised and Unsupervised Learning (12:12)

https://www.youtube.com/watch?v=LvaTokhYnDw&list=PL5-da3qGB5ICcUhueCyu25slvsGp8IDTa



Slide Summary


Statistical Learning Problems


  • Identify the risk factors for prostate cancer.
  • Classify a recorded phoneme based on a log-periodogram.
  • Predict whether someone will have a heart attack on the basis of demographic, diet and clinical measurements
  • Customize an email spam detection system.
  • Identify the numbers in a handwritten zip code.
  • Classify a tissue sample into one of several cancer classes, based on a gene expression profile.
  • Establish the relationship between salary and demographic variables in population survey data.
  • Classify the pixels in a LANDSAT image, by usage.



The Supervised Learning Problem


  • Outcome measurement Y (also called dependent variable, response, target)
  • Vector of p predictor measurements X (also called inputs, regressors, covariates, features, independent variables)
  • In the regression problem, Y is quantitative (e.g. price, blood pressure).
  • In the classification problem, Y takes values in a finite, unordered set (survived/died, digit 0-9, cancer class of tissue sample).
  • We have training data (x1, y1), ..., (xN, yN). These are observations (examples, instances) of these measurements.
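
As a rough illustration of the setup above, the sketch below builds a tiny training set and predicts both a quantitative and a categorical outcome for a new input. The data values, the helper names (`nearest_index`, `predict`), and the choice of 1-nearest-neighbor as the prediction rule are all assumptions made for this example; the slides do not prescribe a method at this point.

```python
from math import dist

# Hypothetical training data: each x_i is a vector of p = 2 predictor measurements.
X_train = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]

# Regression: the outcome Y is quantitative (made-up values).
y_quant = [3.1, 2.9, 7.2, 6.8]
# Classification: the outcome Y comes from a finite, unordered set.
y_class = ["died", "died", "survived", "survived"]


def nearest_index(x_new, X):
    """Return the index of the training point closest to x_new (Euclidean distance)."""
    return min(range(len(X)), key=lambda i: dist(x_new, X[i]))


def predict(x_new, X, y):
    """1-nearest-neighbor prediction: copy the outcome of the closest x_i."""
    return y[nearest_index(x_new, X)]


x_new = (3.0, 3.5)
reg_pred = predict(x_new, X_train, y_quant)  # quantitative prediction
cls_pred = predict(x_new, X_train, y_class)  # class-label prediction
print(reg_pred, cls_pred)
```

The same inputs X serve both problems; only the type of Y decides whether the task is regression or classification.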



Unsupervised Learning


  • No outcome variable, just a set of predictors (features) measured on a set of samples.
  • The objective is fuzzier: find groups of samples that behave similarly, find features that behave similarly, find linear combinations of features with the most variation.
  • It is difficult to know how well you are doing.
  • Different from supervised learning, but can be useful as a pre-processing step for supervised learning.
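
To make "find groups of samples that behave similarly" concrete, here is a minimal sketch that groups unlabeled points with plain k-means. The data, the function names (`mean_point`, `k_means`), the fixed starting centers, and the choice of k-means itself are assumptions for illustration only; the slides name the goal, not a specific algorithm.

```python
from math import dist

# Hypothetical unlabeled data: predictors only, no outcome Y.
X = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(X, centers, n_iter=10):
    """Plain k-means: alternate between assigning points and updating centers."""
    labels = []
    for _ in range(n_iter):
        # Assign each sample to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: dist(x, centers[k]))
            for x in X
        ]
        # Move each center to the mean of its assigned samples
        # (keep the old center if a group ends up empty).
        new_centers = []
        for k in range(len(centers)):
            members = [x for x, lab in zip(X, labels) if lab == k]
            new_centers.append(mean_point(members) if members else centers[k])
        centers = new_centers
    return labels, centers


# Start from two of the samples as initial centers (an arbitrary choice).
labels, centers = k_means(X, centers=[X[0], X[3]])
print(labels)
```

Note that nothing in the data says whether two groups is the "right" answer, which is exactly why it is hard to judge how well an unsupervised method is doing.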


Statistical Learning Vs Machine Learning


  • Machine learning arose as a subfield of A.I.
  • Statistical learning arose as a subfield of Statistics.
  • There is much overlap - both fields focus on supervised and unsupervised problems:
      • Machine learning has a greater emphasis on large scale applications and prediction accuracy.
      • Statistical learning emphasizes models and their interpretability, and precision and uncertainty.
  • But the distinction has become more and more blurred, and there is a great deal of "cross-fertilization".
  • Machine learning has the upper hand in Marketing!


Lecture Summary


Opening Remarks and Examples (18:18) - Same content as the slides above.

Supervised and Unsupervised Learning (12:12) - Same content as the slides above.


