TEAM EDA
Instant Gratification
Preprocessing
- Feature selection with variance
- Kernel PCA

Tried and failed: PCA, SVD, AutoEncoder, DAE, etc.
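The variance-based feature selection step could be sketched as follows. This is a minimal illustration on synthetic data; the threshold, column counts, and standard deviations are assumptions for the sketch, not the competition's actual values:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Synthetic stand-in: 40 informative columns with large spread,
# mixed with 215 near-unit-variance noise columns.
rng = np.random.RandomState(42)
X = np.hstack([rng.normal(0, 3.0, (512, 40)),    # useful features, std ~ 3
               rng.normal(0, 1.0, (512, 215))])  # noise features, std ~ 1

# Keep only columns whose sample variance exceeds the threshold;
# the informative columns stand out by their larger spread.
selector = VarianceThreshold(threshold=2.0)
X_sel = selector.fit_transform(X)
print(X_sel.shape)  # (512, 40)
```

The same idea works with a plain `X[:, X.std(axis=0) > t]` mask; `VarianceThreshold` just packages it as a transformer.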
Feature Engineering
We used the following three methods to create features that capture the distribution of the dataset:

- GMM_PRED
- GMM_SCORE
- HIST_PRED
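The two GMM-based features could be sketched like this with scikit-learn's `GaussianMixture` (the data, component count, and feature layout are toy assumptions; HIST_PRED, a histogram-based variant, is not shown):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy stand-in for one data subset: two Gaussian clusters in 5 dimensions
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-2, 1, (200, 5)),
               rng.normal(2, 1, (200, 5))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

gmm_pred = gmm.predict_proba(X)    # GMM_PRED: soft cluster-membership probabilities
gmm_score = gmm.score_samples(X)   # GMM_SCORE: per-sample log-likelihood under the fit

# Append both as new columns next to the raw features
features = np.hstack([X, gmm_pred, gmm_score[:, None]])
print(features.shape)  # (400, 8)
```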
Incidentally, one surprising thing we found was that duplicating the GMM_PRED and GMM_SCORE features increased the score. The results below are still a mystery to our team:
- Gmm_pred 1 + Gmm_score 0 : LB 0.97465
- Gmm_pred 2 + Gmm_score 0 : LB 0.97475
- Gmm_pred 3 + Gmm_score 0 : LB 0.97479
- Gmm_pred 4 + Gmm_score 0 : LB 0.97480
- Gmm_pred 5 + Gmm_score 0 : LB 0.97481
- Gmm_pred 5 + Gmm_score 1 : LB 0.97482
- Gmm_pred 5 + Gmm_score 2 : LB 0.97482
- Gmm_pred 5 + Gmm_score 3 : LB 0.97473
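A sketch of what "Gmm_pred 5"-style duplication might look like, assuming it means appending extra copies of the same columns (the helper, column names, and data here are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical GMM_PRED features: 4 samples, 3 cluster-probability columns
gmm_pred = pd.DataFrame(np.arange(12).reshape(4, 3),
                        columns=[f'gmm_pred_{i}' for i in range(3)])

def duplicate(df, times):
    # Append `times` extra copies of the same columns,
    # renamed with a suffix so the column names stay unique.
    copies = [df.add_suffix(f'_dup{k}') for k in range(times)]
    return pd.concat([df] + copies, axis=1)

X = duplicate(gmm_pred, 4)  # "Gmm_pred 5" = original + 4 copies
print(X.shape)  # (4, 15)
```

Duplicated columns carry no new information, so any score change presumably comes from how downstream models weight repeated inputs.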
Model
- 1st layer : Nusvc + Nusvc2 + qda + svc + knn + lr / Stratified with GMM_LABEL
- 2nd layer : Lgbm + mlp / Stratified
- 3rd layer : 1st layer + 2nd layer / Average
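The three-layer stack above could be sketched as follows. This is a simplified illustration with synthetic data: LightGBM and the GMM_LABEL-based stratification are omitted to keep the sketch self-contained, and only one NuSVC is used:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, StratifiedKFold
from sklearn.svm import NuSVC, SVC
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
cv = StratifiedKFold(5, shuffle=True, random_state=0)

# 1st layer: out-of-fold probabilities from each base model
layer1_models = [NuSVC(probability=True, random_state=0),
                 QuadraticDiscriminantAnalysis(),
                 SVC(probability=True, random_state=0),
                 KNeighborsClassifier(),
                 LogisticRegression(max_iter=1000)]
layer1 = np.column_stack(
    [cross_val_predict(m, X, y, cv=cv, method='predict_proba')[:, 1]
     for m in layer1_models])

# 2nd layer: a meta-model trained on the 1st-layer predictions
layer2 = cross_val_predict(MLPClassifier(max_iter=500, random_state=0),
                           layer1, y, cv=cv, method='predict_proba')[:, 1]

# 3rd layer: average the 1st- and 2nd-layer predictions
final = (layer1.mean(axis=1) + layer2) / 2
print(final.shape)  # (300,)
```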
Private Test with make_classification

The process is as follows:

- Estimate the number that each private magic subset n (0~511) has, using the linear relationship between that number and the AUC (e.g., if private magic 0 has an AUC of 0.97xx, then the number is 250).
- Create public and private data by seed using make_classification with the estimated magic number.
- Test the model on the generated public and private datasets and check the CV, public LB, and private LB scores.
- Finally, choose a model whose CV, public, and private scores are all close to 0.975; anything else we considered overfitting.

But in practice, this did not work out.
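The data-simulation step could be sketched like this. Every parameter value here (the magic number, sample count, feature count, and label-noise rate) is an assumption for illustration, not a recovered competition setting:

```python
from sklearn.datasets import make_classification

# Assumed value from the hypothetical magic-vs-AUC linear fit
ESTIMATED_MAGIC = 40

def make_block(seed, n_informative=ESTIMATED_MAGIC):
    # One synthetic block standing in for a single magic subset:
    # 255 columns, of which n_informative are useful and the rest are noise.
    return make_classification(n_samples=1024, n_features=255,
                               n_informative=n_informative, n_redundant=0,
                               flip_y=0.05, random_state=seed)

# Vary the seed to generate stand-ins for public and private splits
X, y = make_block(seed=0)
print(X.shape, y.shape)  # (1024, 255) (1024,)
```

A model can then be scored on many such seeded blocks to see how its CV, public, and private estimates move together.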
Resource
The code is below:
- kernel public 0.97491 : https://www.kaggle.com/chocozzz/hyun-stacking?scriptVersionId=15820611