Data Analysis Using Statistical Analysis and Machine Learning

A/B Test: Free Trial Screener Experiment

At the time of this experiment, Udacity courses had two options on the home page: “start free trial” and “access course materials”. A student who clicked “start free trial” was asked to enter credit card information and was then enrolled in a free trial for the paid version of the course. After 14 days, they were automatically charged unless they canceled first. A student who clicked “access course materials” could view the videos and take the quizzes for free, but did not receive coaching support or a verified certificate and could not submit a final project for feedback.

Socio-Economic Status and Performance of Mathematics

I visualized the relationship between socio-economic status and mathematics performance using the Programme for International Student Assessment (PISA) dataset. The explanatory question is: What would the average performance be if all students had the OECD-average socio-economic status?

Analyzing the NYC Subway Ridership

I investigated subway ridership in New York City. Do more people ride the NYC subway when it is raining or when it is not raining? There seems to be little tendency for people to ride the subway when it is raining. I used the Mann-Whitney U test and linear regression to support my analysis.
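
As a rough illustration of the kind of comparison described above, the sketch below computes a Mann-Whitney U statistic for two groups of ridership counts. The numbers are invented for illustration and are not taken from the project; the function name mann_whitney_u is hypothetical, and the p-value uses a plain normal approximation with no tie or continuity correction, which is a simplification rather than the method the analysis necessarily used.

```python
from math import erf, sqrt


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value via normal approximation)."""
    n1, n2 = len(x), len(y)
    # U counts the (x, y) pairs where the x value is larger; ties count half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    # Standard deviation of U under the null, without a tie correction.
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    # Two-sided p-value from the standard normal CDF (no continuity correction).
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return u1, p


# Hypothetical hourly entry counts, invented for illustration only.
rainy = [1200, 950, 1430, 1100, 1310]
dry = [1000, 900, 1250, 1050, 980]

u, p = mann_whitney_u(rainy, dry)
print(f"U = {u}, p = {p:.3f}")
```

With samples this small an exact test would be more appropriate; a library routine such as scipy.stats.mannwhitneyu handles that case and also corrects for ties.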

Identify Fraud from Enron Email

Enron was once a successful and dominant company in the energy business, but it collapsed after trying to hide company losses and avoid taxes by creating made-up entities, in what has since become known as the Enron scandal. The most infamous name associated with this scandal is Kenneth Lay, who was on trial as of January 2006. He was charged on 11 counts ranging from insider trading to bank fraud.

Explore and Summarize White Wine Dataset

I used R and applied exploratory data analysis techniques to explore relationships ranging from one variable to multiple variables, and to examine a white wine dataset for distributions, outliers, and anomalies.

OpenStreetMap Data Wrangling with MongoDB

Working with Seoul map data from OpenStreetMap, I used data munging techniques, such as assessing the quality of the data for validity, accuracy, completeness, consistency, and uniformity, to clean the OpenStreetMap data for a part of the world that I care about. I used MongoDB and applied a new data schema to the project.

Investigate a Titanic Dataset

I investigated the Titanic dataset using NumPy and Pandas. I went through the entire data analysis process, starting by posing a question and finishing by sharing my findings. In this report, the passengers' survival rate is analyzed according to passenger class, age, and sex.
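
A minimal sketch of how a survival rate per group can be computed with Pandas is shown below. The six rows are made up for illustration, and the column names (Pclass, Sex, Survived) assume the common Kaggle layout of the Titanic data, which may differ from the file used in the report.

```python
import pandas as pd

# Tiny made-up sample; column names are an assumption (Kaggle-style layout).
passengers = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Survived": [1, 0, 1, 0, 1, 0],
})

# The mean of a 0/1 column is the survival rate within each group.
rate_by_class = passengers.groupby("Pclass")["Survived"].mean()
rate_by_sex = passengers.groupby("Sex")["Survived"].mean()

print(rate_by_class)
print(rate_by_sex)
```

Grouping by a list of columns, for example ["Pclass", "Sex"], gives the rate for each combination in the same way.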

Test a Perceptual Phenomenon

I used descriptive statistics and a statistical test to analyze the Stroop effect, a classic result of experimental psychology. I tried to give a good intuition for the data and use statistical inference to draw a conclusion based on the results.
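
The description above does not say which statistical test was used. As one plausible example, the sketch below assumes a paired (dependent-samples) t-test on per-participant completion times under the congruent and incongruent conditions. The timings are invented for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical completion times in seconds, one pair per participant.
congruent = [12.1, 16.8, 9.6, 8.6, 14.5]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8]

# A paired test works on the within-participant differences.
diffs = [i - c for c, i in zip(congruent, incongruent)]
n = len(diffs)

mean_diff = mean(diffs)
sd_diff = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
t_stat = mean_diff / (sd_diff / sqrt(n))

print(f"mean difference = {mean_diff:.2f} s, t = {t_stat:.2f}, df = {n - 1}")
```

The t statistic would then be compared against a critical value from the t distribution with n - 1 degrees of freedom to decide whether the difference is statistically significant.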