
Load & check the data:
1. Load the data into a pandas dataframe named data_firstname, where firstname is your first name.
2. Carry out some initial investigations:
a. Check the names and types of columns.
b. Check the missing values.
c. Check the statistics of the numeric fields (mean, min, max, median, count, etc.).
d. In your written response, write a paragraph explaining your findings about each column.
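Steps 1 and 2 could be sketched roughly as follows. This is only an illustration: the inline CSV text and the column names `thickness` and `class` are hypothetical stand-ins (the brief names only the `ID` and `bare` columns), and with the real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the assignment's data file (assumption, not the real data).
csv_text = """ID,thickness,bare,class
1001,5,1,2
1002,5,10,2
1003,3,?,2
1004,8,4,4
1005,1,1,2
1006,7,?,4
"""

# Step 1: load the data into a dataframe (replace firstname with your own name).
data_firstname = pd.read_csv(io.StringIO(csv_text))

# Step 2a: names and types of the columns.
print(data_firstname.dtypes)

# Step 2b: missing values per column. Placeholders such as '?' are NOT counted
# as missing here, so they are counted separately.
missing_counts = data_firstname.isnull().sum()
question_marks = (data_firstname["bare"] == "?").sum()
print(missing_counts)
print("'?' entries in 'bare':", question_marks)

# Step 2c: statistics of the numeric fields; describe() reports count, mean,
# min, max and quartiles, and the 50% row is the median.
summary = data_firstname.describe()
medians = data_firstname.median(numeric_only=True)
print(summary)
print(medians)
```

Note that a column containing '?' is read as text rather than numbers, so it does not appear in the `describe()` output until it has been cleaned.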
Pre-process and visualize the data
3. Replace the ‘?’ mark in the ‘bare’ column with np.nan and change the type to ‘float’.
4. Fill any missing data with the median of the column.
5. Drop the ID column.
6. Using pandas, Matplotlib, or seaborn (any one or a mix), generate 3-5 plots and add them
to your written response, explaining the key insights and findings from the plots.
7. Separate the features from the class.
8. Split your data into 80% train and 20% test; use the last two digits of your student number
as the seed.
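A minimal sketch of steps 3 to 5 and 7 to 8, assuming a small made-up dataframe in place of the real data. The column names `thickness` and `class` and the seed value 42 are assumptions for illustration; use your own class column name and the last two digits of your student number.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the real file (assumption).
data_firstname = pd.DataFrame({
    "ID": range(1001, 1011),
    "thickness": [5, 5, 3, 8, 1, 7, 2, 10, 4, 6],
    "bare": ["1", "10", "?", "4", "1", "?", "2", "10", "1", "8"],
    "class": [2, 2, 2, 4, 2, 4, 2, 4, 2, 4],
})

# Step 3: replace '?' with np.nan and convert the column to float.
data_firstname["bare"] = data_firstname["bare"].replace("?", np.nan).astype(float)

# Step 4: fill missing values with the median of each column.
data_firstname = data_firstname.fillna(data_firstname.median(numeric_only=True))

# Step 5: drop the ID column, which carries no predictive information.
data_firstname = data_firstname.drop(columns="ID")

# Step 7: separate the features from the class.
X = data_firstname.drop(columns="class")
y = data_firstname["class"]

# Step 8: 80% train / 20% test split with a fixed seed.
SEED = 42  # assumption: replace with the last two digits of your student number
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED
)
print(X_train.shape, X_test.shape)
```

The median is computed after the '?' entries become NaN, so the placeholders do not distort it.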
Build Classification Models
Support vector machine classifier with linear kernel
9. Train an SVM classifier using the training data, set the kernel to linear, and set the regularization
parameter to C = 0.1. Name the classifier clf_linear_firstname.
10. Print out two accuracy scores: one for the model on the training set (X_train, y_train) and the
other on the testing set (X_test, y_test). Record both results in your written response.
11. Generate the accuracy matrix. Record the results in your written response.
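Steps 9 to 11 might look roughly like the sketch below. Two assumptions are made here: the data is a synthetic set from `make_classification`, standing in for the preprocessed X and y, and "accuracy matrix" is read as the confusion matrix. The seeds are arbitrary placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed features and class (assumption).
X, y = make_classification(n_samples=200, n_features=9, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 9: linear-kernel SVM with regularization parameter C = 0.1.
clf_linear_firstname = SVC(kernel="linear", C=0.1)
clf_linear_firstname.fit(X_train, y_train)

# Step 10: accuracy on the training set and on the testing set.
train_acc = accuracy_score(y_train, clf_linear_firstname.predict(X_train))
test_acc = accuracy_score(y_test, clf_linear_firstname.predict(X_test))
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")

# Step 11: confusion matrix on the testing set (rows: true class, columns: predicted).
cm = confusion_matrix(y_test, clf_linear_firstname.predict(X_test))
print(cm)
```

A training accuracy far above the testing accuracy would suggest overfitting, which is worth noting in the written response.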
Support vector machine classifier with “rbf” kernel
12. Repeat steps 9 to 11; in step 9, change the kernel to “rbf” and do not set any value for C.
Support vector machine classifier with “poly” kernel
13. Repeat steps 9 to 11; in step 9, change the kernel to “poly” and do not set any value for C.
Support vector machine classifier with “sigmoid” kernel
14. Repeat steps 9 to 11; in step 9, change the kernel to “sigmoid” and do not set any value for C.
(Optional: for steps 9 to 14 you can consider a loop)
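One possible shape for the optional loop is sketched below, again on synthetic data from `make_classification` as a stand-in for the real features (an assumption). C is set to 0.1 only for the linear kernel; the other kernels keep scikit-learn's default C, as the steps require. The dictionary names `classifiers` and `results` are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed features and class (assumption).
X, y = make_classification(n_samples=200, n_features=9, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

classifiers = {}  # fitted models, keyed by the required naming pattern
results = {}      # accuracy scores and confusion matrix per kernel

for kernel in ["linear", "rbf", "poly", "sigmoid"]:
    params = {"kernel": kernel}
    if kernel == "linear":
        params["C"] = 0.1  # only the linear model sets C explicitly

    clf = SVC(**params).fit(X_train, y_train)
    classifiers[f"clf_{kernel}_firstname"] = clf

    test_pred = clf.predict(X_test)
    results[kernel] = {
        "train_accuracy": accuracy_score(y_train, clf.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, test_pred),
        "confusion_matrix": confusion_matrix(y_test, test_pred),
    }

for kernel, scores in results.items():
    print(kernel, round(scores["train_accuracy"], 3), round(scores["test_accuracy"], 3))
    print(scores["confusion_matrix"])
```

Collecting the scores in one dictionary makes it straightforward to compare the four kernels side by side when writing the recommendation.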
By now you have the results of four SVM classifiers with different kernels recorded in your written
report. Examine them and write a short paragraph indicating which classifier you would recommend
and why.
 
Note: Python programming.