
# Computer Science Assignment Help | ST260 Data Analysis

## ST260 Course Overview

Description
Introduction to the use of basic statistical concepts in business applications. Topics include extensive graphing; descriptive statistics; measures of central tendency and variation; regression, including transformations for curvature; sampling techniques; designs; conditional probability; random variables; probability distributions; sampling distributions; confidence intervals; and statistical inference. Computer software applications are utilized extensively. Emphasis throughout the course is on interpretation. Computing proficiency is required for a passing grade in this course. Students are limited to three attempts for this course, excluding withdrawals.

Credits
3

## Prerequisites

Recent Professors
Marcus Perry, Brad Casselman, Jennifer Moore, Bruce Barrett, Xuwen Zhu, Subha Chakraborti, Jennifer McMillan, Mohammed Alzahrani, Aqi Dong, Yang-Li Liao, Danhyang Lee, Yuhui Yao, Brian Gray, Yang Wang, Bradley Casselman, Subhabrata Chakraborti

Recent Semesters
Fall 2023, Spring 2023, Fall 2022, Spring 2022, Fall 2021

## ST260 Data Analysis HELP (EXAM HELP, ONLINE TUTOR)

Using the data in Table 5.12 in Agresti (2007), do the following:
(a) Fit the loglinear model of homogeneous association. Report the estimated conditional odds ratio between smoking and lung cancer. Obtain a 99% confidence interval for the true odds ratio. Interpret.
(b) Test goodness-of-fit of the model. Interpret.
(c) Consider the simpler model of conditional independence between smoking and lung cancer, given city. Compare the fit to the homogeneous association model, and interpret.
(d) Fit a logit model containing effects of smoking and city on lung cancer. Use the smoking effect to estimate the conditional odds ratio between smoking and lung cancer. How does this compare to the estimate from the log-linear model?
(e) Use (b) to test the hypothesis of a common odds ratio between smoking and lung cancer for these eight studies. How does the result compare to the Breslow-Day test?
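
For part (a), one common way to get the interval is a Wald interval on the log scale: under the usual indicator coding for two binary factors, the smoking-by-lung-cancer interaction term in the homogeneous association model is the conditional log odds ratio, so exponentiating the estimate plus or minus a normal quantile times its standard error gives the interval for the odds ratio. The sketch below uses only the Python standard library; the function name `odds_ratio_wald_ci` and the numeric inputs are hypothetical placeholders, not results from Table 5.12.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(log_or, se, level=0.99):
    """Return (odds ratio, lower, upper) from a log odds ratio and its standard error."""
    # Two-sided normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


# Hypothetical inputs for illustration only; substitute the interaction
# estimate and standard error reported by your own model fit.
estimate, lower, upper = odds_ratio_wald_ci(log_or=0.8, se=0.1)
print(f"OR = {estimate:.3f}, 99% CI = ({lower:.3f}, {upper:.3f})")
```

If the interval excludes 1, the data are inconsistent, at that confidence level, with conditional independence of smoking and lung cancer given city.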

Add part (c): Consider Alcohol, Cigarette and Marijuana use to be the “response” variables and Race and Gender to be “explanatory”. Analyze the data. For this part, be sure to

• Describe your model selection strategy.
• Give a full interpretation of the final model you select.
• Summarize your findings in non-technical language (i.e., give two or three lines that could be used as the opening of a newspaper or magazine article reporting the results).

What were the two most commonly awarded levels of educational attainment between 2000 and 2010 (inclusive)? Use the mean percent over the years to compare the education levels in order to find the two largest. For this computation, you should use the rows for the ‘A’ sex. Call this method top_2_2000s and return a Series with the top two values (the index should be the degree names and the values should be the percent).

For example, assuming we have parsed hw3-nces-ed-attainment.csv and stored it in a variable called data, then top_2_2000s(data) will return the following Series (shows the index on the left, then the value on the right):
high school    87.557143
associate’s    38.757143
Hint: The Series class also has a method nlargest that behaves similarly to the one for the DataFrame, but does not take a column parameter (as Series objects don’t have columns).
Our assert_equals only checks that floating point numbers are within 0.001 of each other, so your floats do not have to match exactly.
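
A minimal sketch of top_2_2000s, assuming the parsed DataFrame has columns named `Year`, `Sex`, `Min degree`, and `Total` (these column names are an assumption, since the text above does not list them). The small `toy` frame is made-up data used only to exercise the function; it is not taken from hw3-nces-ed-attainment.csv.

```python
import pandas as pd


def top_2_2000s(data):
    """Mean percent per degree for sex 'A' over 2000-2010, top two values."""
    # Column names below are assumed, not confirmed by the assignment text.
    in_range = data['Year'].between(2000, 2010)  # inclusive on both ends
    all_sexes = data['Sex'] == 'A'
    means = data[in_range & all_sexes].groupby('Min degree')['Total'].mean()
    # Series.nlargest takes no column argument, as the hint notes.
    return means.nlargest(2)


# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Year': [1999, 2000, 2010, 2000, 2010, 2000, 2010, 2000],
    'Sex': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'F'],
    'Min degree': ['high school', 'high school', 'high school',
                   "associate's", "associate's",
                   "bachelor's", "bachelor's", 'high school'],
    'Total': [10.0, 80.0, 90.0, 30.0, 40.0, 20.0, 30.0, 0.0],
})
result = top_2_2000s(toy)
print(result)
```

Filtering before grouping keeps the 1999 row and the ‘F’ row out of the means, so only the ‘A’ rows from 2000 through 2010 contribute.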

Problem 3: percent_change_bachelors_2000s
What is the difference between the total percent of bachelor’s degrees received in 2000 as compared to 2010? Take a sex parameter so the client can specify ‘M’, ‘F’, or ‘A’ for evaluating. If a call does not specify the sex to evaluate, you should evaluate the percent change for all students (sex = ‘A’). Call this method percent_change_bachelors_2000s and return the difference (the percent in 2010 minus the percent in 2000) as a float.

For example, assuming we have parsed hw3-nces-ed-attainment.csv and stored it in a variable called data, then the call percent_change_bachelors_2000s(data) will return 2.599999999999998. Our assert_equals only checks that floating point numbers are within 0.001 of each other, so your floats do not have to match exactly.

Hint: For this problem you will need to use the squeeze() function on a Series to get a single value from a Series of length 1.
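
A possible sketch of percent_change_bachelors_2000s, again assuming columns named `Year`, `Sex`, `Min degree`, and `Total`, and assuming the degree label is spelled `bachelor's` in the data (both are assumptions not stated above). The `toy` frame holds made-up values for illustration only.

```python
import pandas as pd


def percent_change_bachelors_2000s(data, sex='A'):
    """Percent of bachelor's degrees in 2010 minus the percent in 2000."""
    # Assumed column names and degree label; adjust to match the real file.
    rows = data[(data['Min degree'] == "bachelor's") & (data['Sex'] == sex)]
    # Each filter should leave a Series of length 1; squeeze() turns it into a scalar.
    pct_2000 = rows.loc[rows['Year'] == 2000, 'Total'].squeeze()
    pct_2010 = rows.loc[rows['Year'] == 2010, 'Total'].squeeze()
    return float(pct_2010 - pct_2000)


# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Year': [2000, 2010, 2000, 2010, 2000, 2010],
    'Sex': ['A', 'A', 'M', 'M', 'A', 'A'],
    'Min degree': ["bachelor's", "bachelor's", "bachelor's", "bachelor's",
                   'high school', 'high school'],
    'Total': [20.0, 25.0, 18.0, 21.0, 80.0, 90.0],
})
print(percent_change_bachelors_2000s(toy))
print(percent_change_bachelors_2000s(toy, 'M'))
```

The default `sex='A'` covers the case where the caller does not specify a sex.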
