IDRIPR Word.docx

School

University Canada West

Course

650

Subject

Economics

Date

Feb 20, 2024

Type

docx

Pages

17

Uploaded by DukeInternetPony38 on coursehero.com

Individual Descriptive Report
University Canada West
Contents

Table of Figures
Introduction
Background
Methodology
Result and Discussion
  Data Cleaning
  Data Exploration
  Regression Model
  Some intriguing Questions
Conclusion
References
ZeroGPT
Table of Figures

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Introduction

In this report, we predict insurance charges for policyholders based on their age, smoking habits, body mass index (BMI) and region. This is useful to insurance companies, which can price new customers more accurately using data from their previous experience. The background section gives more details of this study; previous reports in this field and their conclusions are covered in the literature review. The methodology section describes the methods used in this report, the result and discussion section presents the tables and analyses the data, and finally we give our conclusion.

Background

We have data on the charges an insurance company billed to individuals with different characteristics: age, sex, BMI, children, smoker and region. This report analyses the data and then gives a prediction of insurance charges.

Methodology

The analysis was done in Excel:

- Filters and boxplots to inspect the data.
- Pivot tables to create line and column charts.
- Functions and formulas to find quartiles, standard deviation, correlation (CORREL), random numbers (RAND), mean squared error (MSE), mean absolute error (MAE), etc.
- Dummy variables to convert the characteristics smoker, sex and region to numeric values.
- The Data Analysis tool for regression.
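The dummy-variable and error-measure steps listed in the methodology can be sketched outside Excel as follows. This is a minimal illustration in Python, not the workbook used in the report: the category labels (`"no"`/`"yes"`, the four region names) and the helper names `dummy_encode`, `mse` and `mae` are assumptions made for the example. Each categorical variable with k levels is turned into k-1 indicator columns, with the first level as the baseline, which is the usual way to avoid perfectly collinear columns in a regression.

```python
def dummy_encode(value, categories):
    """Return 0/1 indicators for every category except the first (the baseline)."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories[1:]]


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical category labels; the report does not list the actual values.
SMOKER = ["no", "yes"]
REGION = ["northeast", "northwest", "southeast", "southwest"]

row = {"smoker": "yes", "region": "southeast"}
encoded = dummy_encode(row["smoker"], SMOKER) + dummy_encode(row["region"], REGION)
print(encoded)  # [1, 0, 1, 0]
```

For example, `mse([100, 200], [110, 180])` gives 250.0 and `mae([100, 200], [110, 180])` gives 15.0; the numbers are made up and only show how the two measures weigh errors differently.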
Result and Discussion

Data Cleaning

Data cleaning makes data more reliable by correcting or removing inaccurate, redundant or duplicate data (Stedman, 2022). The following steps were applied to the dataset:

1. Checking for duplicates: 1 duplicate was found and removed.
2. Checking for blank cells: there was no missing data.
3. Changing the charges from a plain value to currency format.
4. Checking for outliers in charges: we calculated the min and max whiskers and removed outliers greater than the max whisker, i.e. $34679.

The following boxplot was created from the price data.

Figure 1
Note - Outliers are shown above the max whisker in the boxplot.
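The outlier step above can be sketched as follows. The report does not say how the max whisker was computed, so this example assumes the common box-plot rule of Q3 + 1.5 × IQR and assumes exclusive quartiles (Python's `method="exclusive"`, which should correspond to Excel's QUARTILE.EXC). The function names and the sample charges are made up for illustration.

```python
from statistics import quantiles


def upper_fence(values, k=1.5):
    """Upper box-plot fence: Q3 + k * IQR, using exclusive quartiles (assumed)."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return q3 + k * (q3 - q1)


def drop_high_outliers(values, k=1.5):
    """Keep only the values at or below the upper fence."""
    fence = upper_fence(values, k)
    return [v for v in values if v <= fence]


# Made-up charges, not taken from the report's dataset.
charges = [1200, 2500, 3100, 4000, 5200, 6100, 7300, 8800, 60000]

print(upper_fence(charges))         # 15925.0
print(drop_high_outliers(charges))  # the 60000 entry is dropped
```

Only values above the upper fence are removed here, matching the report's decision to drop charges greater than the max whisker.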
After removing outliers, 1200 records remain.

Data Exploration

This section gives a descriptive analysis of charges against variables such as age, BMI and smoker status. Insurance prices increase with age: older people are more likely to be hospitalized, so they pay a higher premium (Walker, 2022).

Figure 2 (line chart of average charge by age; x-axis: Age, 18 to 64; y-axis: Avg Charge, 0 to 25000)
Note - Average insurance charge by age.

From the line chart, we can see an upward trend in price as age rises. Three age groups with comparable price ranges stand out: young people (16-35), middle-aged people (35-50) and older people (>50), as there is a steep rise at the end of each group.

Companies also use a person's BMI to set insurance charges: anybody with a BMI of more than 30 is categorised as obese and has to pay more, as they are more prone to obesity-related diseases like diabetes (GoodRX, n.d.).
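The pivot-table summary behind Figure 2 and the BMI rule can be sketched in a few lines. This is an illustrative equivalent, not the report's Excel workbook: the records are made up, and the names `average_charge_by_age` and `is_obese` are hypothetical. The obesity check uses the "more than 30" threshold stated above.

```python
from collections import defaultdict


def average_charge_by_age(records):
    """Group (age, charge) pairs by age and return the mean charge per age."""
    totals = defaultdict(lambda: [0.0, 0])  # age -> [sum of charges, count]
    for age, charge in records:
        totals[age][0] += charge
        totals[age][1] += 1
    return {age: s / n for age, (s, n) in sorted(totals.items())}


def is_obese(bmi, threshold=30.0):
    """True when BMI is above the threshold quoted in the text."""
    return bmi > threshold


# Made-up (age, charge) records for illustration only.
records = [(18, 2000.0), (18, 3000.0), (40, 8000.0), (64, 15000.0), (64, 13000.0)]

print(average_charge_by_age(records))  # {18: 2500.0, 40: 8000.0, 64: 14000.0}
print(is_obese(31.2), is_obese(24.0))  # True False
```

Plotting the resulting averages against age would give a line chart of the same kind as Figure 2.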