Question 1
Calculate the coefficient of determination.
Choose one answer.
a. 0.37
b. 0.43
c. 0.85
d. 0.12
.
.
Question 2
If sx and sy are the standard deviations of x and y, and rxy is the sample correlation coefficient between x and y, which of the following is the slope of the simple linear regression y = α + βx?
Choose one answer.
a. rxy
b. rxy sy/sx
c. rxy (sy/sx)^2
d. sy/sx
.
.
Question 3
The following data, x = [18 20 22 24] and y = [76.1 77.0 78.1 79.7], are fitted with a linear regression. What is the slope of the line?
Choose one answer.
a. 0.60
b. 0.1
c. 0.9
d. 0.7
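A minimal sketch for checking a candidate slope numerically, assuming an ordinary least-squares fit with an intercept. The helper name ols_slope is illustrative and not part of the quiz.

```python
def ols_slope(x, y):
    """Least-squares slope, computed as Sxy / Sxx."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


x = [18, 20, 22, 24]
y = [76.1, 77.0, 78.1, 79.7]
slope = ols_slope(x, y)
print(round(slope, 2))
```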
.
.
Question 4
The sample mean and the variance for X are 0 and 1. The sample mean and the variance of Y are 2 and 2. The sample correlation is 0.7. What is the intercept of the least squares line?
Choose one answer.
a. 2
b. 0.99
c. 1
d. 0
.
.
Question 5
The sample variance for X and Y are 1 and 4, respectively. The sample correlation is 0.5. What is the slope of the least squares line?
Choose one answer.
a. 1
b. 2
c. 0.5
d. 0.25
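A short sketch of the relationship slope = r · sy / sx, where sx and sy are the standard deviations (square roots of the given variances). The function name slope_from_moments is hypothetical.

```python
import math


def slope_from_moments(var_x, var_y, r):
    """Least-squares slope from sample variances and the sample correlation."""
    # Standard deviations are the square roots of the variances
    return r * math.sqrt(var_y) / math.sqrt(var_x)


slope = slope_from_moments(var_x=1.0, var_y=4.0, r=0.5)
print(slope)
```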
.
.
Question 6
What are the consequences of heteroscedasticity in ordinary least squares (OLS) estimates?
Choose one answer.
a. Heteroscedasticity results in biased parameter estimates.
b. Heteroscedasticity leads to inefficient estimation.
c. Heteroscedasticity leads to inconsistency.
d. All of the above.
.
.
Question 7
What is ANOVA used for?
Choose one answer.
a. To determine if the means of two samples are equal
b. To determine if the variances of two or more samples are equal
c. To determine if the variances of more than two samples are equal
d. To determine if the means of two or more populations are equal
.
.
Question 8
What is the F ratio in a one-way ANOVA?
Choose one answer.
a. MSTR/MSE
b. MST/MSE
c. MSE/MSTR
d. MSE/MST
.
.
Question 9
When there are only two groups in a one-way ANOVA, what is the relationship between the F-statistic and the t-statistic?
Choose one answer.
a. F = t
b. F = t^2
c. F > t
d. F < t
.
.
Question 10
Which of the following assumptions is NOT necessary for conducting a one-way ANOVA comparing two population means?
Choose one answer.
a. The variances of the response variables of the two populations are equal.
b. The values of the response variables are normally distributed.
c. The samples from the two populations are randomly selected, independent samples.
d. The sample sizes of two populations are equal.
.
.
Question 11
Which of the following is NOT a component of a summary table for ANOVA?
Choose one answer.
a. F-statistic
b. Degrees of freedom
c. Correlation coefficients
d. Sum of squares
.
.
Question 12
Which of the following is not true about the correlation coefficient?
Choose one answer.
a. The correlation coefficient is also known as Pearson’s r, named after its inventor Karl Pearson.
b. The value of a correlation coefficient computed from a sample always lies between -1 and +1.
c. A significant correlation indicates a causal relationship between two random variables.
d. When a sample correlation is significant, the null hypothesis of no linear association can be rejected.
.
.
Question 13
Which of the following statements about the chi-square test is NOT true?
Choose one answer.
a. A chi-square goodness of fit test is valid if each of the expected cell frequencies is less than five.
b. A contingency table is a type of table in a matrix format that displays the frequency distribution of the variables.
c. The smaller the value of the chi-square test statistic, the more likely the null hypothesis will be rejected.
d. The statistic is zero if the observed frequencies are equal to the expected frequencies.
.
.
Question 14
Which of the following statements is NOT true?
Choose one answer.
a. The variance of the residuals has no influence on the uncertainty in estimating the regression coefficients.
b. In simple linear regression, the slope of the regression line is proportional to the correlation between X and Y.
c. The R^2 for a regression of Y onto X and the R^2 for the regression of X onto Y are equal.
d. Least squares residuals are not correlated with the fitted values.
.
.
Question 15
Which of the following values could not represent a correlation coefficient?
Choose one answer.
a. 0.5
b. 0.9
c. -0.5
d. 1.5
.
.
Question 16
Which quantity measures the variability of the observed values of the response variable around their respective treatment means?
Choose one answer.
a. Mean
b. Error sum of squares
c. Correlation
d. None of the above
.
.
Question 17
You conducted a 2 (gender) x 4 (treatment group) analysis of responses to 4 different treatment regimens for depression. You conducted a two-way ANOVA and found that the F-value for the corrected model is significant. Which conclusion can you make based on this analysis?
Choose one answer.
a. There is a significant difference between at least two of the groups.
b. There is a significant difference between genders.
c. There is a significant difference between at least two of the four treatment regimens.
d. There is a significant difference between all eight groups.
.
.
Question 18
You have three groups of patients, and each group is composed of ten males and ten females. The dependent variable is the reduction in cholesterol following a treatment. You predict that Group A will respond to treatment better than Group C, and that males will respond to treatment less than females. Which method do you use to confirm your prediction?
Choose one answer.
a. two-way ANOVA
b. one-way ANOVA
c. t-test
d. chi-square test
.
.
Question 19
A multiple regression model has p independent variables. What are the degrees of freedom for error if the data contain N observations?
Choose one answer.
a. N-p-1
b. N-p+1
c. N-p
d. N-1
.
.
Question 20
Calculate R^2 for a multiple regression model, with SSR = 100 and SSE = 200.
Choose one answer.
a. 0.333
b. 0.5
c. 0.300
d. 0.75
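A brief sketch of the computation, assuming a least-squares model with an intercept so that the total sum of squares decomposes as SST = SSR + SSE. The function name r_squared is illustrative.

```python
def r_squared(ssr, sse):
    """Coefficient of determination from regression and error sums of squares."""
    sst = ssr + sse  # assumes SST = SSR + SSE (least squares with intercept)
    return ssr / sst


r2 = r_squared(ssr=100.0, sse=200.0)
print(round(r2, 3))
```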
.
.
Question 21
Compute the value of F0.05 with 8 numerator and 19 denominator degrees of freedom.
Choose one answer.
a. 2.12
b. 2.48
c. 7.50
d. 1.56
.
.
Question 22
Determine the value of F0.05 with 6 numerator and 60 denominator degrees of freedom.
Choose one answer.
a. 2.3
b. 5.2
c. 2.5
d. 1.2
.
.
Question 23
Fill in the blank. ________ is used to test the significance of a multiple regression model.
Choose one answer.
a. The overall F-test
b. The t-test
c. The partial F-test
d. The chi-square test
.
.
Question 24
In order to test for the significance of a regression model involving 5 independent variables and 40 observations, the denominator degrees of freedom for the critical value of F are:
Choose one answer.
a. 5
b. 35
c. 45
d. 34
.
.
Question 25
Let yi be the observed values in a dataset and fi be the modeled values from a regression. Let ymean be the mean of the observed data. Which of the following is the regression sum of squares (SSR)?
Choose one answer.
a. ∑(yi - ymean)^2
b. ∑(fi - ymean)^2
c. ∑(fi - yi)^2
d. None of the above
.
.
Question 26
R^2 is computed to be 0.79 for a multiple regression analysis on fifty independent variables. What is the appropriate interpretation of R^2 = 0.79?
Choose one answer.
a. The model predicts outcomes 79% of the time.
b. Twenty-one percent of the independent variables should be removed from the analysis.
c. Seventy-nine percent of variations in the observed values of the dependent variable are explained by the independent variables.
d. Only seventy-nine percent of the independent variables are significant.
.
.
Question 27
The correlation coefficient between two random variables is 0.85, which indicates
Choose one answer.
a. negative linear correlation
b. some linear correlation
c. no linear correlation
d. perfect linear correlation
.
.
Question 28
What is the objective of multiple regression?
Choose one answer.
a. Quantify correlations between predictors
b. Identify causal relationships between variables
c. Explain variation in one variable based on variations of other variables
d. All of the above
.
.
Question 29
When do you need to use dummy variables in a multiple regression?
Choose one answer.
a. When performing residual analysis
b. When correcting for multicollinearity
c. When qualitative variables are used in the model
d. None of the above
.
.
Question 30
Which factors does the adjusted coefficient of determination adjust for?
Choose one answer.
a. Number of independent variables and sample size
b. Number of dependent variables
c. Sample size
d. Significance level and sample size
.
.
Question 31
Which is the range of a correlation coefficient r?
Choose one answer.
a. r > 1
b. -1≤r≤1
c. r ≤1
d. 0≤r≤1
.
.
Question 32
Which of the following is always true when we add an independent variable to a multiple regression model?
Choose one answer.
a. Adjusted coefficient of determination decreases.
b. Unadjusted coefficient of determination increases.
c. Adjusted coefficient of determination increases.
d. Unadjusted coefficient of determination decreases.
.
.
Question 33
Which of the following is NOT true about principal component analysis (PCA)?
Choose one answer.
a. PCA transforms data to a new coordinate system, with each coordinate being referred to as a principal component.
b. In PCA, each of the principal components is a nonlinear combination of the original variables.
c. The principal components are arranged in order of decreasing variance.
d. The most informative principal component is the first component.
.
.
Question 34
Which of the following is the definition of R^2? SSR is the regression sum of squares, SST is the total sum of squares, and SSE is the sum of squares of residuals.
Choose one answer.
a. SSR/SST
b. SSE/SSR
c. SSE/SST
d. SST/SSE
.
.
Question 35
Which of the following statements is NOT true?
Choose one answer.
a. The sum of the residuals for a least squares line is zero.
b. R^2 is equal to the square of the sample correlation between the observed values and the fitted values.
c. R^2 = SSR/SST.
d. If the correlation rX,Y is 0, there is no relationship between X and Y.
.
.
Question 36
You were asked to conduct a regression analysis to determine the relationship between a dependent variable and 8 independent variables, using 56 observations. You obtained the following information from the computer output: R Square = 0.80 and SSR = 4,280. Determine the value of the F-statistic for the regression.
Choose one answer.
a. 12.4
b. 24.5
c. 11.2
d. 27.3
.
.
Question 37
You were asked to perform a regression analysis on a sample of thirty-five observations to quantify the relationship between one dependent variable and four independent variables. You obtained R^2 = 0.6 and a regression sum of squares (SSR) of 4800. Determine the F-statistic for the regression and whether the model is significant at the α = 0.05 level of significance.
Choose one answer.
a. 29.55 and the model is significant
b. 11.25 and the model is significant
c. 2.5 and the model is not significant
d. 7.5 and the model is not significant
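A minimal sketch of the overall F-statistic written in terms of R^2, assuming the usual form F = (R^2 / p) / ((1 - R^2) / (n - p - 1)) for p predictors and n observations. The function name overall_f is hypothetical. Judging significance still requires comparing the result with a tabulated critical value of F at the chosen α for the returned degrees of freedom, which this sketch does not compute.

```python
def overall_f(r2, n_obs, n_predictors):
    """Overall F-statistic and its degrees of freedom from R^2."""
    df_model = n_predictors
    df_error = n_obs - n_predictors - 1
    # Ratio of explained to unexplained variation, each per degree of freedom
    f_stat = (r2 / df_model) / ((1.0 - r2) / df_error)
    return f_stat, df_model, df_error


f_stat, df1, df2 = overall_f(r2=0.6, n_obs=35, n_predictors=4)
print(round(f_stat, 2), df1, df2)
```

Note that SSR is not needed once R^2, n, and p are known, because the sums of squares cancel in the ratio.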
.
.
Question 38
You were asked to perform a regression analysis on a sample of forty-five observations to quantify the relationship between one dependent variable and four independent variables. You obtained R^2 = 0.8 and a regression sum of squares (SSR) of 680. Determine the degrees of freedom and F-statistic for the regression.
Choose one answer.
a. 4 and 45
b. 4 and 40
c. 5 and 20
d. 2 and 10
.
.
Question 39
Fill in the blank. A variable Z, whose value is Z = X1X2, is introduced to a general linear model in order to account for potential ________ of two variables X1 and X2 acting together.
Choose one answer.
a. interactions
b. multicollinearity
c. residuals
d. autocorrelation
.
.
Question 40
Fill in the blank. In multiple regression analysis, ________ occurs when independent predictors are correlated with one another.
Choose one answer.
a. heteroscedasticity
b. multicollinearity
c. elasticity
d. homoscedasticity
.
.
Question 41
Fill in the blank. Multicollinearity occurs when ________.
Choose one answer.
a. two or more predictor variables in a multiple regression are correlated
b. the probability distribution for the response variable has the same standard deviation regardless of the value of the predictors
c. the values of a variable at different points in time are correlated with itself
d. None of the above
.
.
Question 42
Fill in the blank. ______ is used to determine whether an additional variable makes a significant contribution to a multiple regression model.
Choose one answer.
a. An F test
b. A Z test
c. A t test
d. A chi-square test
.
.
Question 43
For a multiple regression problem with four predictors, how many possible models are there?
Choose one answer.
a. 25
b. 16
c. 15
d. 10
.
.
Question 44
In multiple regression, why do you want to calculate partial correlation?
Choose one answer.
a. To investigate the relationship between two variables, controlled for the effects of other variables.
b. To investigate the relationship between two variables within a portion of the sample.
c. To remove independent variables from the regression.
d. To find out which independent variables are the most predictive.
.
.
Question 45
Rj^2 is the coefficient of determination obtained by regressing the jth predictor on all the other predictors. For x1, R1^2 = 0.95. Calculate the variance inflation factor for x1 and determine whether there is a problem with multicollinearity.
Choose one answer.
a. 1.05 and there is no problem with multicollinearity
b. 25 and there is no problem with multicollinearity
c. 20 and there is a problem with multicollinearity
d. 10 and there is a problem with multicollinearity
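A short sketch of the variance inflation factor, taking Rj^2 as the R^2 from regressing predictor j on the remaining predictors (the usual definition). The function name vif is illustrative. A common rule of thumb, not stated in the quiz, treats VIF values above about 10 as a sign of problematic multicollinearity.

```python
def vif(r2_j):
    """Variance inflation factor: 1 / (1 - Rj^2)."""
    return 1.0 / (1.0 - r2_j)


v = vif(0.95)
print(round(v, 1))
```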
.
.
Question 46
Rj^2 is the coefficient of determination obtained by regressing the jth predictor on all the other predictors. Which of the following formulas is used to calculate the variance inflation factor (VIF)?
Choose one answer.
a. 1/(1 - Rj^2)
b. 1 - Rj^2
c. 1 + Rj^2
d. 1/(1 + Rj^2)
.
.
Question 47
What are the consequences of heteroscedasticity in ordinary least squares (OLS) estimates?
Choose one answer.
a. Heteroscedasticity results in biased parameter estimates.
b. Heteroscedasticity leads to inefficient estimation.
c. Heteroscedasticity leads to inconsistency.
d. All of the above.
.
.
Question 48
Which of the following assumptions about the error term of a multiple regression model is NOT true?
Choose one answer.
a. Constant variation
b. Independent
c. A mean of zero
d. Exponentially distributed
.
.
Question 49
Which of the following criteria is used to determine the best regression model?
Choose one answer.
a. The best model minimizes prediction errors.
b. The best model should be as simple as possible, with the least number of independent variables.
c. The best model only includes predictors that make a significant contribution to the model.
d. All of the above.
.
.
Question 50
Which of the following is NOT true about multicollinearity?
Choose one answer.
a. Multicollinearity is a result of strong correlations between independent variables.
b. Multicollinearity reduces the variances of the parameter estimates.
c. Multicollinearity can be reduced by combining the involved variables into one.
d. Multicollinearity can lead to wrong signs and magnitudes of regression coefficient estimates.
.
.
Question 51
Which of the following procedures is the backward elimination procedure for selecting models?
Choose one answer.
a. The starting model is the one with all the predictors in it, and at each step the procedure tries to drop out one nonsignificant predictor, stopping when all predictors are significant.
b. The procedure starts without any predictors and tries to add them, step by step, while it also tries to drop out predictors at each step.
c. The procedure begins with the model having no predictors at all and adds the best available predictor at each step.
d. None of the above.
.
.
Question 52
Which of the following statements describes “redundancy” in multiple regression?
Choose one answer.
a. Redundancy occurs when two or more independent variables convey approximately the same predictive information about the dependent variable, consequently, the model using these predictors has predictive power similar to those models using only one of the predictors.
b. Redundancy occurs when the number of predictors is greater than ten.
c. Redundancy occurs when adding another variable does not increase predictive power of the regression.
d. None of the above
.
.
Question 53
You are tasked with predicting the heights of sons, which are influenced by genes from both sides of the family. You already have data on the heights of their mothers. You are now allowed to measure a second predictor variable. Choose among the following predictors a second variable that would help most with the predictions.
Choose one answer.
a. The height of an uncle on the mother’s side
b. The height of an aunt on the father’s side
c. The height of an in-law
d. The height of a friend on the father’s side
.
.
Question 54
Determine the coefficients of an exponential regression y = α·β^x for the following data: x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and y = 63, 76, 92, 105.7, 122.8, 131.7, 151.3, 179.3, 203.3, 226.54, 248.7, 281.4.
Choose one answer.
a. α = 4.3 and β = 0.12
b. α = 71.5 and β = 1.13
c. α = 12.5 and β = 10.2
d. α = 11.5 and β = 0.13
.
.
Question 55
In a logistic regression, what is a logit?
Choose one answer.
a. The natural logarithm of the odds ratio
b. The natural logarithm of the probability
c. The natural logarithm of the risk factors
d. None of the above
.
.
Question 56
In a period of eight years, the revenue of a company (in $ millions) was 25, 26, 28, 27, 26, 29, 29, and 34. For the same period, the spending on advertisement (in $ thousands) was 16, 13, 15, 14, 16, 17, 19, and 22. Estimate the percentage of variation in revenue explained by variation in advertising.
Choose one answer.
a. 71%
b. 10%
c. 25%
d. 41%
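A minimal sketch for this kind of estimate, assuming a simple linear regression of revenue on advertising, in which case the share of variation explained equals the squared Pearson correlation. The helper name pearson_r is hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


revenue = [25, 26, 28, 27, 26, 29, 29, 34]       # $ millions
advertising = [16, 13, 15, 14, 16, 17, 19, 22]   # $ thousands

r2 = pearson_r(advertising, revenue) ** 2
print(f"{100 * r2:.0f}% of the variation explained")
```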
.
.
Question 57
In logistic regression, which test is used to reject the null hypothesis that β = 0?
Choose one answer.
a. t-test
b. F-test
c. Wald test
d. Mann-Whitney test
.
.
Question 58
In the context of Generalized Linear Models (GLM), what is the link function for logistic regression?
Choose one answer.
a. g(p) = p
b. g(p) = ln(p)
c. g(p) = p^(-1)
d. g(p) = ln(p/(1-p))
.
.
Question 59
In the context of Generalized Linear Models, which of the following statements is NOT true?
Choose one answer.
a. The link function provides the relationship between the linear predictor and the mean of the distribution function.
b. Each outcome of the dependent variables is assumed to be generated from a distribution in the exponential family.
c. Generalized Linear Models and General Linear Models refer to the same type of model.
d. Generalized Linear Models include logistic regression, exponential regression, and multiple linear regression.
.
.
Question 60
In the model y = a0 + a1x + a2x^2, what is the value of a2 if the relationship between y and x is linear?
Choose one answer.
a. 1
b. a0
c. 0
d. -(a0 + a1)
.
.
Question 61
In the model y = a0 + a1x + a2x^2, what is the value of x at which the mean of y takes its maximum if the parabola is mound-shaped or its minimum if the parabola is bowl-shaped?
Choose one answer.
a. -a1/(2a2)
b. a1/a2
c. 0
d. a1/(2a2)
.
.
Question 62
What does heteroscedasticity mean?
Choose one answer.
a. The mean of the errors is not zero.
b. The variances of the predictors are not constant.
c. The variance of the dependent variable is small.
d. The variance of the errors is not constant.
.
.
Question 63
What is the principle of parsimony when building models?
Choose one answer.
a. Models should have no more parameters than necessary to represent the relationship adequately.
b. All models are wrong; some are useful.
c. The best regression models are linear.
d. When building models, one should try to use as many predictors as possible.
.
.
Question 64
What is the reason for using polynomial regression?
Choose one answer.
a. Polynomial regression is used to address multicollinearity.
b. The shape of data is not always linear and an nth polynomial can be used to create a better fit.
c. When your dependent variable is categorical, polynomial regression will provide a better fit.
d. None of the above.
.
.
Question 65
Which of the following is an exponential regression model?
Choose one answer.
a. y = a0 + a1x + a2x^2 + … + amx^m
b. y = a0 + a1x1
c. y = log(a0 + a1x1) + ε
d. y = a0 + a1a2^x
.
.
Question 66
Which of the following is a polynomial regression model?
Choose one answer.
a. y = a0 + a1x + a2x^2 + … + amx^m
b. y = a0 + a1x1
c. y = log(a0 + a1x1) + ε
d. y = a0 + a1a2^x
.
.
Question 67
Which of the following is the equation for logistic regression?
Choose one answer.
a. p(x) = β0 + β ∙x
b. log(p(x)/(1-p(x))) = β0 + β ∙x
c. log(p(x))= β0 + β ∙x
d. p(x)/(1-p(x)) = β0 + β ∙x
.
.
Question 68
Which of the following is a Type I error?
Choose one answer.
a. A Type I error is made when a test fails to reject a false null hypothesis.
b. A Type I error is made when a test rejects a true null hypothesis.
c. A Type I error is made when a test rejects a false null hypothesis.
d. None of the above.
.
.
Question 69
Which of the following methods is used in data mining?
Choose one answer.
a. Logistic regression
b. Support vector machine
c. Naïve Bayes classification
d. All of the above
.
.
Question 70
In an epidemiological study focusing on a biomarker A, blood samples were collected from two different ethnic groups, “Caucasian” and “Asian.” For the Caucasian group, the measured values of biomarker A in seven individuals are 42, 41, 47, 42, 37, 41, and 35. For the Asian group, the values of biomarker A in seven individuals are 16, 31, 20, 35, 47, 27, and 31. The null hypothesis is that the probability distributions of biomarker A in Caucasian and Asian populations are identical. You are asked to perform a Wilcoxon rank-sum test to reject or confirm the null hypothesis. What are the sums of the ranks for observations in the Caucasian and Asian groups, respectively?
Choose one answer.
a. 30 and 50
b. 80 and 10
c. 60 and 30
d. 70 and 35
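A sketch of how the rank sums could be computed, assuming tied observations receive the average of the ranks they span (midranks), which is the usual convention for the Wilcoxon rank-sum test. The helper name midranks is illustrative.

```python
def midranks(values):
    """1-based ranks of the pooled values, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of tied values starting at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


caucasian = [42, 41, 47, 42, 37, 41, 35]
asian = [16, 31, 20, 35, 47, 27, 31]

ranks = midranks(caucasian + asian)
sum_caucasian = sum(ranks[:len(caucasian)])
sum_asian = sum(ranks[len(caucasian):])
print(sum_caucasian, sum_asian)
```

As a sanity check, the two sums should add up to N(N+1)/2 for N pooled observations.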
.
.
Question 71
Which non-parametric test would you use to assess whether there is a significant difference between the mean ranks of two conditions?
Choose one answer.
a. Wilcoxon
b. Kruskal-Wallis
c. F-test
d. None of the above
.
.
Question 72
Which nonparametric test is equivalent to a parametric t-test for independent sample means?
Choose one answer.
a. Wilcoxon signed-rank test
b. Mann-Whitney test
c. Kruskal-Wallis test
d. None of the above
.
.
Question 73
Which of the following is a method of multivariate analysis?
Choose one answer.
a. Principal Component Analysis
b. Multiple regression
c. Hypothesis testing
d. All of the above
.
.
Question 74
Which of the following is not a nonparametric test?
Choose one answer.
a. Sign test
b. Wilcoxon signed-rank test
c. Kolmogorov–Smirnov test
d. Mann-Whitney-Wilcoxon test
.
.
Question 75
Which of the following is NOT a parametric procedure?
Choose one answer.
a. Pearson’s correlation coefficient
b. ANOVA
c. A sign test
d. Logistic regression
.
.
Question 76
Which of the following is NOT true about nonparametric tests?
Choose one answer.
a. Nonparametric tests are more likely to reject the null hypothesis, compared with their parametric equivalents.
b. Nonparametric tests use the Z-distribution for large samples.
c. Nonparametric tests are cumbersome to use for large sample sizes.
d. Nonparametric tests are more conservative as compared with their parametric equivalents.
.
.
Question 77
Which of the following statements is correct?
Choose one answer.
a. All variables in a multiple regression analysis must be quantitative.
b. All variables in a multiple regression analysis must be positive.
c. All variables in a multiple regression analysis must be qualitative.
d. None of the above.
.
.
Question 78
Which of the following statements about nonparametric tests is NOT true?
Choose one answer.
a. If the sampled populations are normally distributed, parametric tests such as F and t tests are more powerful than their nonparametric counterparts.
b. The Wilcoxon rank-sum test requires that we take independent random samples.
c. Nonparametric tests can be easily extended to multiple regression.
d. The Kruskal-Wallis test is the nonparametric counterpart of the one-way ANOVA test.
.
.
Question 79
Which test can be used to test normality of data?
Choose one answer.
a. t-test
b. Kolmogorov–Smirnov test
c. Mann-Whitney test
d. F-test
.
.
Question 80
A group of shapes is classified into five circles and nine rectangles. Determine the expected information (entropy) needed to classify a tuple in the group.
Choose one answer.
a. 0.14 bits
b. 0.74 bits
c. 0.94 bits
d. 0.54 bits
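A brief sketch of the expected-information calculation, assuming the Shannon entropy in bits, H = -Σ p_i log2(p_i), over the class proportions. The function name entropy_bits is hypothetical.

```python
import math


def entropy_bits(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    # Classes with zero count contribute nothing to the sum
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


h = entropy_bits([5, 9])  # five circles, nine rectangles
print(round(h, 2))
```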
.
.
Question 81
An event happens at a constant hazard rate of 0.5 events per second. What is the survival function at 1 second since the last event?
Choose one answer.
a. 0.61
b. 0.39
c. 0.41
d. 0.59
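A minimal sketch, assuming the standard result that a constant hazard rate λ gives an exponential survival function S(t) = exp(-λt). The function name survival_constant_hazard is illustrative.

```python
import math


def survival_constant_hazard(rate, t):
    """Survival probability at time t under a constant hazard rate."""
    return math.exp(-rate * t)


s = survival_constant_hazard(rate=0.5, t=1.0)
print(round(s, 2))
```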
.
.
Question 82
D has fourteen shapes, classified into five circles and nine rectangles. Determine the Gini coefficient for D.
Choose one answer.
a. 0.22
b. 1.50
c. 0.92
d. 0.46
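A short sketch, assuming the Gini impurity used in decision tree induction, Gini(D) = 1 - Σ p_i^2, where p_i is the proportion of class i in D. The function name gini_impurity is hypothetical.

```python
def gini_impurity(counts):
    """Gini impurity of a class distribution given as counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


g = gini_impurity([5, 9])  # five circles, nine rectangles
print(round(g, 2))
```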
.
.
Question 83
Fill in the blank. In data mining, ______ is used to predict categorical class labels.
Choose one answer.
a. Classification
b. Numerical prediction
c. Hypothesis testing
d. None of the above
.
.
Question 84
Fill in the blank. In decision tree induction, ______ is used to measure impurity.
Choose one answer.
a. Information gain
b. Gini coefficient
c. Entropy
d. Number of nodes
.
.
Question 85
Fill in the blank. In naïve Bayes classification, Bayes’ theorem can be informally written as ______.
Choose one answer.
a. posterior = likelihood × prior / evidence
b. prior = likelihood × posterior / evidence
c. posterior = evidence × prior / likelihood
d. posterior = likelihood × evidence / prior
.
.
Question 86
In principal component analysis (PCA), what does “reduction of dimensionality” mean?
Choose one answer.
a. PCA reduces a large number of interrelated variables to a small number of uncorrelated principal components.
b. PCA linearizes the problem.
c. PCA reduces the variance of variables.
d. PCA identifies the number of independent variables in the original problem.
.
.
Question 87
In structural equation modeling (SEM), what are endogenous variables?
Choose one answer.
a. Explanatory variables, only appearing in the structural equations
b. Errors in the structural equations
c. Response variables whose values are determined by the model
d. Variables uncorrelated with the error of the structural equations
.
.
Question 88
In a support vector machine, which term is used to refer to the separation between classes?
Choose one answer.
a. Margin
b. Hyperplane
c. Support vectors
d. None of the above
.
.
Question 89
In survival analysis, if the probability of surviving to time t is P(t), what is the survival function?
Choose one answer.
a. P(t)
b. 1-P(t)
c. 1+P(t)
d. P(t)^2
.
.
Question 90
Which component of a time series refers to the long-run decline or growth in a time series?
Choose one answer.
a. trend
b. seasonal
c. irregular
d. cyclical
.
.
Question 91
Which of the following activities is considered to be part of data mining?
Choose one answer.
a. Classification
b. Outlier analysis
c. Clustering
d. All of the above
.
.
Question 92
Which of the following approaches to modeling time series is the Box-Jenkins approach?
Choose one answer.
a. The approach is based on a linear regression of the current value of the series against the white noise of one or more prior values of the series.
b. Frequency domain is used to analyze the time series.
c. The time series is decomposed into trend, seasonal, and residual components.
d. The approach combines moving average and autoregressive approaches.
.
.
Question 93
Which of the following is NOT true about exponential smoothing in time-series analysis?
Choose one answer.
a. Recent observations are given relatively more weight in forecasting than the older observations.
b. The weights assigned to the observations are equal to 1/N, where N is the number of observations.
c. In single exponential smoothing, the weights decrease geometrically.
d. Single exponential smoothing does follow the data well when there is a trend.
.
.
Question 94
Which of the following is NOT true about factor loadings in factor analysis?
Choose one answer.
a. Factor loadings are the correlation coefficients between the variables and the factors.
b. Factor loading of a variable is the percentage of variance in that variable explained by the factor.
c. In practice, factor loadings should be 0.95 or higher to confirm that independent variables are represented by a particular factor.
d. If the communality (the sum of the squared factor loadings for all factors for a given variable) exceeds 1.0, there is a spurious solution.
.
.
Question 95
Which of the following is NOT true about the hazard function?
Choose one answer.
a. A hazard function can be interpreted as the expected number of events per individual per unit of time.
b. For an exponential survival function, the hazard rate is a constant.
c. The probability of an event is the cumulative hazard function.
d. The hazard function can be derived from the survival function.
.
.
Question 96
Which of the following is NOT true about principal component analysis (PCA)?
Choose one answer.
a. PCA transforms data to a new coordinate system, with each coordinate being referred to as a principal component.
b. In PCA, each of the principal components is a nonlinear combination of the original variables.
c. The principal components are arranged in order of decreasing variance.
d. The most informative principal component is the first component.
.
.
Question 97
Which of the following is not true about structural equation models (SEM)?
Choose one answer.
a. Structural-equation models are multiple-equation regression models in which the response variable in one regression equation might be an explanatory variable in another equation.
b. Path diagrams can be used to represent an SEM in the form of a causal graph.
c. In a recursive SEM, causation in the model is bidirectional.
d. SEMs can include variables that are indirectly measured through their effects.
.
.
Question 98
Which of the following is NOT true about supervised learning?
Choose one answer.
a. Observations or measurements are labeled with predefined classes.
b. In supervised learning, class labels of the data are not known.
c. Supervised learning consists of two steps: training and testing.
d. Decision tree induction is a supervised learning algorithm.
.
.
Question 99
Which of the following is true about endogenous and exogenous variables in structural equation models (SEMs)?
Choose one answer.
a. There is one structural equation for each endogenous variable.
b. An endogenous variable may appear as an explanatory variable in other structural equations.
c. Exogenous variables are determined outside of the model.
d. All of the above.
.
.
Question 100
Which of the following is true about the Cox proportional hazard model?
Choose one answer.
a. The Cox proportional hazard model is a parametric model.
b. The Cox proportional hazard model is a semi-parametric model.
c. The Cox proportional hazard model cannot accommodate time-dependent covariates.
d. All of the above.
.
.
Question 101
Which of the following is(are) potential problem(s) for a Cox regression model in survival analysis?
Choose one answer.
a. Violation of the assumption of proportional hazards
b. Influential observations
c. Nonlinearity in the relationship between the covariates and the log-hazard
d. All of the above
.
.
Question 102
Which of the following software packages is an implementation of support vector machine in R?
Choose one answer.
a. e1071
b. lm
c. arima
d. None of the above
.
.