bi-9

 Excel is probably the most popular spreadsheet software for PCs. Why? What can we do with this package that makes it so attractive for modeling efforts?

Need 400-500 words

Cloud Computing Paper

Select an organization that has leveraged Cloud Computing technologies in an attempt to improve profitability or to give them a competitive advantage.  Research the organization to understand the challenges that they faced and how they intended to use Cloud Computing to overcome their challenges.  The paper should include the following sections each called out with a header.   

• Company Overview:  The section should include the company name, the industry they are in and a general overview of the organization.    
• Challenges: Discuss the challenges the organization had that limited their profitability and/or competitiveness and how they planned to leverage Cloud Computing to overcome their challenges.    
• Solution:  Describe the organization’s Cloud Computing implementation and the benefits they realized from the implementation.  What was the result of implementing Cloud Computing? Did they meet their objectives or fall short?
• Conclusion:  Summarize the most important ideas from the paper and also make recommendations on how they might have achieved even greater success.

Requirements:   
    The paper must adhere to APA guidelines, including Title and Reference pages.  There should be at least three scholarly sources listed on the reference page.  Each source should be cited in the body of the paper to give credit where due.  Per APA, the paper should use a 12-point Times New Roman font, should be double spaced throughout, and the first line of each paragraph should be indented 0.5 inches.  The body of the paper should be 3 – 5 pages in length.  The Title and Reference pages do not count towards the page count requirements.
 

Week 9

Assignment 1: Email Harassment

Due Week 9 and worth 125 points

Suppose you are an internal investigator for a large software development company. The Human Resources Department has requested you investigate the accusations that one employee has been harassing another over both the corporate Exchange email system and Internet-based Google Gmail email.

Prepare a report in which you:

  1. Create an outline of the steps you would take in examining the email accusations that have been identified.
  2. Describe the information that can be discovered in email headers and determine how this information could potentially be used as evidence in the investigation. 
  3. Analyze differences between forensic analysis on the corporate Exchange system and the Internet-based Google Gmail email system. Use this analysis to determine the challenges that exist for an investigator when analyzing email sent from an Internet-based email system outside of the corporate network.
  4. Select one (1) software-based forensic tool for email analysis that you would utilize in this investigation. Describe its use, features, and how it would assist in this scenario.
  5. Use at least three (3) quality resources in this assignment. Note: Wikipedia and similar websites do not qualify as quality resources. 
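As a rough illustration of item 2, the sketch below reads the headers of a single message with Python's standard `email` package. The message text, addresses, and the helper name `summarize_headers` are hypothetical and exist only for this example; a real investigation would work from a forensically acquired copy of the mailbox rather than a hand-typed string.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical message; hosts and addresses use reserved documentation ranges.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
\tby mx.example.com with ESMTP id abc123; Tue, 05 Mar 2019 10:15:02 -0500
Received: from [198.51.100.23] (unknown [198.51.100.23])
\tby mail.example.org with ESMTPSA id xyz789; Tue, 05 Mar 2019 10:15:00 -0500
From: "Sender Name" <sender@example.org>
To: recipient@example.com
Subject: Test message
Date: Tue, 05 Mar 2019 10:14:58 -0500
Message-ID: <20190305101458.1234@example.org>

Body text.
"""


def summarize_headers(raw_message):
    """Collect the header fields an investigator typically looks at first."""
    msg = message_from_string(raw_message)
    # Each relay prepends its own Received line, so reversing the list
    # gives the hops in origin-first order. Folded lines are flattened.
    hops = [" ".join(h.split()) for h in reversed(msg.get_all("Received", []))]
    return {
        "from": parseaddr(msg.get("From", ""))[1],
        "to": parseaddr(msg.get("To", ""))[1],
        "message_id": msg.get("Message-ID"),
        "sent": parsedate_to_datetime(msg["Date"]),
        "hops": hops,
    }


summary = summarize_headers(RAW_MESSAGE)
print(summary["from"], "->", summary["to"], "at", summary["sent"].isoformat())
for number, hop in enumerate(summary["hops"], start=1):
    print(number, hop)
```

Header fields such as From and Date are supplied by the sender and can be forged, so the Received lines added by servers the organization controls are usually the more dependable part of the trace.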

Your assignment must follow these formatting requirements:

  • Be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides; citations and references must follow APA or school-specific format. Check with your professor for any additional instructions.
  • Include a cover page containing the title of the assignment, the student’s name, the professor’s name, the course title, and the date. 

physical security, security planning and Influence of Physical Design

Write a research paper (6 pages) on approaches to physical security:

The paper should contain an abstract, introduction, body, and conclusion.

APA format.

No plagiarism.

CHAPTERS- 

Chapter 3 – Influence of Physical Design

Chapter 4 – Approaches to Physical Security

Chapter 5 – Security Lighting

Chapter 6 – Focus on Electronic Devices for Entry into locations.

Chapter 7 – Use of Locks – Focus on Key Operated Mechanisms

Chapter 8 – Explain why employees steal

Chapter 9 – External Threats and Countermeasures

Chapter 10 – Biometrics in Criminal Justice System and Society Today

Chapter 11 – Access Control Systems and Identification Badges

Course Objectives/Learner Outcomes:

Upon completion of this course, the student will:

· Recognize basic threats to an organization's physical security and identify the security mechanisms used in securing an enterprise environment.

·  Identify the security mechanisms and strategies used to protect the perimeter of a facility.

·  Identify the appropriate physical security mechanisms to implement in a given scenario.

·  Identify the appropriate mechanisms and controls for securing the inside of a building or facility.

·  Select the most appropriate intrusion detection technology for a scenario.

Subject resources:

Fennelly, Lawrence. Effective Physical Security. ELSEVIER, 2017. 

Print ISBN: 978-0-12-804462-9

Other articles and readings may be assigned by course professor.

Recommended Materials/Resources 

Harris, Shon. All in One CISSP Exam Guide, Sixth Edition. McGraw-Hill, 2013.

• International Information Systems Security Certification Consortium, Inc., (ISC)²® – This Web site provides access to current industry information. It also provides opportunities in networking and contains valuable career tools. http://www.isc2.org/ 

• ISACA – This Web site provides access to original research, practical education, career-enhancing certification, industry-leading standards, and best practices. It also provides a network of likeminded colleagues and contains professional resources and technical/managerial publications. https://www.isaca.org/Pages/default.aspx

BS W2

In 300 words

OWASP (Open Web Application Security Project) Vulnerabilities.

Please describe the below in 300 words.

  • Broken Authentication – OWASP Vulnerabilities.

Discussions 8

 

This discussion topic is meant to be reflective and written in your own words, not a compilation of direct citations from other papers or sources. You can use citations in your posts, but this discussion exercise should be about what you have learned, from your own viewpoint, and not a rehash of the Data Science & Big Data Analysis course or of any particular article, topic, or the book.

Items to include in the initial thread: 

  • “Interesting Readings” – What reading or readings did you find the most interesting and why?
  • “Perspective” – How has this course changed your perspective? 
  • “Course Feedback” – What topics or activities would you add to the course, or should we focus on some areas more than others? 

Statistical analysis using R

  

Final Exam DPEE

Note: 

· For demonstrating conceptual understanding, you are required to work on the model that is easier to handle or compute, not necessarily the more suitable (or more complicated) model for the dataset. Follow the question description. 

· You don’t need to check the assumptions of a model unless the question asks for it. For example, if the question asks you to make a prediction based on a model, you don’t need to check the assumptions of the model before making the prediction.

· For any hypothesis-testing problem, define Ho/Ha, compute the test statistic, report the exact p-value, and state the conclusion. The default alpha value is 5%, unless specified otherwise.

· Elaborate your reasoning clearly and show relevant plots, R results, and tables to support your opinion in each step and conclusion. 

· Submit the Rmd file and the corresponding PDF file knitted from it, along with your answers. This format is similar to your homework.

· The data is real, just like the project you are working on. Hence it is possible that even after a remedial method has been applied, the model is still not perfect. When this happens, evaluation will be based on how well you execute the methods covered in Stat512 to improve the model. Don’t worry if your model is not perfect; try your best to demonstrate the skill set you learned in this class.

Study the data with a linear analysis and complete the problems. The data set, dataDPEE.csv, has three continuous predictors and two categorical predictors.

  

Problem 1. Consider only the first-order model with X1, X2 and X3, and perform the following hypothesis tests.

a. (10) Test whether X1 can be dropped from the full model.

> dpeemod <- lm(y ~ x1 + x2 + x3)
> plot(dpeemod)
> summary(dpeemod)

Call:
lm(formula = y ~ x1 + x2 + x3)

Residuals:
    Min      1Q  Median      3Q     Max
-15.948 -11.640  -1.480   6.402  31.650

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 90.79017   22.07408   4.113  0.00106 **
x1          -0.68731    0.47959  -1.433  0.17377
x2          -0.47047    0.24227  -1.942  0.07254 .
x3          -0.06845    0.46523  -0.147  0.88513

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 14.75 on 14 degrees of freedom
Multiple R-squared: 0.6257, Adjusted R-squared: 0.5455
F-statistic: 7.8 on 3 and 14 DF, p-value: 0.002652

 

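Ho: β1 = 0 (X1 can be dropped from the model).

Ha: β1 ≠ 0 (X1 cannot be dropped from the model).

Using the model Y ~ X1 + X2 + X3, the summary output gives t = -1.433 for x1 with a p-value of 0.17377. Since 0.17377 > 0.05, we fail to reject Ho and conclude that X1 can be dropped from the full model.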

b. (10) Test whether X1 can be dropped from the model containing only X1 and X2.

  

Problem 2 (10) Consider the first order model with X1, X2 and X3, simultaneously estimate parameters (beta1, beta2 and beta3)  with a confidence level of 75%. 
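One common way to build simultaneous intervals is the Bonferroni procedure; whether the course intends this method or another one is an assumption here. With g = 3 parameters, a family confidence level of 1 − α = 0.75, and the 14 error degrees of freedom reported for the first-order model, the intervals would take the form:

```latex
b_k \pm B \, s\{b_k\}, \qquad k = 1, 2, 3,
\qquad
B = t\!\left(1 - \frac{\alpha}{2g};\; n - p\right)
  = t\!\left(1 - \frac{0.25}{6};\; 14\right)
  \approx t(0.9583;\; 14)
```

Here b_k and s{b_k} are the estimate and standard error of each coefficient from the fitted model, and B is the corresponding t quantile (in R, `qt(1 - 0.25/6, 14)`).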

  

Problem 3 (20) Perform appropriate analysis to diagnose the potential issues with the first-order full model with X1, X2 and X3, and improve the model as much as possible with the methods covered in Stat512. You should also check the assumptions for your revised model.

  

Problem 4 

a. (10) Compute AIC, BIC, and PRESS_p to compare the following two models.

· The model on the first order terms for X1 and X2 and the interaction term X1X2.

· The model on the first order terms for X1, X2 and X3 

Do all three criteria select the same model as the better one? If not, explain.
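For reference, the three criteria are often written in the textbook forms below, where n is the number of observations, p the number of regression parameters including the intercept, SSE_p the error sum of squares, e_i the i-th residual, and h_ii the i-th leverage. That the course uses exactly these forms is an assumption.

```latex
\mathrm{AIC}_p = n \ln \mathrm{SSE}_p - n \ln n + 2p
\\[4pt]
\mathrm{BIC}_p = n \ln \mathrm{SSE}_p - n \ln n + p \ln n
\\[4pt]
\mathrm{PRESS}_p = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_{i(i)} \right)^2
                 = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2
```

Smaller values are better for all three. R's built-in `AIC()` and `BIC()` differ from these forms only by an additive constant that depends on n, so they rank models fitted to the same data in the same order.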

b. (10) Select the model that you think is better for predicting the mean response value, then predict the mean response for the following case at a confidence level of 99%.

  

x1    x2    x3
45    36    45

Problem 5 

X4 and X5 are two factors on Y.

a. (10) Is there any significant interaction effect between X4 and X5 on Y? 

b. (10) With the ANOVA method, compute the 95% confidence interval for the following difference, respectively: 

D1= The difference in the mean of Y when (X4=high, X5=less) and (X4=high, X5=more) 

D2= The difference in the mean of Y when (X4=low, X5=less) and (X4=low, X5=more) 

c. (10) With the ANOVA method, compute the 95% confidence interval for D1 - D2, where D1 and D2 are defined in part b.

How is your result related to a? 
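As a hint at the connection, the difference can be written out in cell means. The sketch assumes X4 has only the two levels high/low and X5 only the two levels less/more, which the problem statement suggests but does not state outright.

```latex
D_1 - D_2
  = \left( \mu_{\text{high},\text{less}} - \mu_{\text{high},\text{more}} \right)
  - \left( \mu_{\text{low},\text{less}} - \mu_{\text{low},\text{more}} \right)
  = \mu_{\text{high},\text{less}} - \mu_{\text{high},\text{more}}
  - \mu_{\text{low},\text{less}} + \mu_{\text{low},\text{more}}
```

Under that assumption this is the interaction contrast of a 2×2 layout: it equals zero exactly when the effect of X5 is the same at both levels of X4, so a 95% interval that contains zero is consistent with finding no significant interaction in part a.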

Basic Optimization

ISYE 6740 Homework 3

Total 100 points.

1. Basic optimization. (30 points.)

Consider a simplified logistic regression problem. Given $m$ training samples $(x_i, y_i)$, $i = 1, \ldots, m$. The data $x_i \in \mathbb{R}$ (note that we only have one feature for each sample), and $y_i \in \{0, 1\}$. To fit a logistic regression model for classification, we solve the following optimization problem, where $\theta \in \mathbb{R}$ is a parameter we aim to find:

$$\max_{\theta} \ \ell(\theta), \tag{1}$$

where the log-likelihood function is

$$\ell(\theta) = \sum_{i=1}^{m} \left\{ -\log\left(1 + \exp\{-\theta x_i\}\right) + (y_i - 1)\,\theta x_i \right\}.$$

(a) (10 points) Show the step-by-step mathematical derivation of the gradient of the cost function $\ell(\theta)$ in (1) and write pseudo-code for performing gradient descent to find the optimizer $\theta^*$. This is essentially what the training procedure does. (Pseudo-code means you write down the steps of the algorithm, not necessarily in any specific programming language.)

(b) (10 points) Present a stochastic gradient descent algorithm to solve the training problem of logistic regression (1).

(c) (10 points) We will show that the training problem in basic logistic regression is concave. Derive the Hessian matrix of $\ell(\theta)$ and, based on this, show that the training problem (1) is concave (note that in this case, since we only have one feature, the Hessian matrix is just a scalar). Explain why the problem can be solved efficiently and why gradient descent will achieve a unique global optimizer, as we discussed in class.
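To make the three parts concrete, here is a minimal sketch in plain Python, not a reference solution. Differentiating (1) term by term gives the gradient sum of x_i (y_i − σ(θ x_i)), with σ the logistic function, and the second derivative −sum of x_i² σ(θ x_i)(1 − σ(θ x_i)) ≤ 0, which is why the problem is concave. Because (1) is a maximization, the update below is gradient ascent (equivalently, gradient descent on −ℓ). The function names, step sizes, and toy data are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(theta, xs, ys):
    """l(theta) = sum_i { -log(1 + exp(-theta x_i)) + (y_i - 1) theta x_i }."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = theta * x
        # Stable evaluation of log(1 + exp(-z)).
        log_term = max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += -log_term + (y - 1) * z
    return total


def gradient(theta, xs, ys):
    """dl/dtheta = sum_i x_i (y_i - sigmoid(theta x_i))."""
    return sum(x * (y - sigmoid(theta * x)) for x, y in zip(xs, ys))


def gradient_ascent(xs, ys, step=0.1, max_iters=10000, tol=1e-9):
    """Full-batch updates until the gradient is (nearly) zero."""
    theta = 0.0
    for _ in range(max_iters):
        g = gradient(theta, xs, ys)
        if abs(g) < tol:
            break
        theta += step * g
    return theta


def sgd(xs, ys, step=0.05, epochs=500, seed=0):
    """Stochastic variant: one sample per update, reshuffled every epoch."""
    rng = random.Random(seed)
    order = list(range(len(xs)))
    theta = 0.0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            theta += step * xs[i] * (ys[i] - sigmoid(theta * xs[i]))
    return theta


# Toy data (hypothetical); the classes overlap, so the maximizer is finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
theta_hat = gradient_ascent(xs, ys)
print("batch:", theta_hat, "sgd:", sgd(xs, ys))
```

With a constant step size the stochastic version only settles into a small neighborhood of the batch solution; a decaying step size is the usual remedy when tighter agreement is needed.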

2. Comparing Bayes, logistic, and KNN classifiers. (30 points)

In lectures we learned three different classifiers. This question asks you to implement and compare them. We suggest using Scikit-learn, a commonly used and powerful Python library with various machine learning tools, but you may also use a similar library in another language of your choice to perform the tasks.

Part One (Divorce classification/prediction). (20 points)

This dataset is about participants who completed the personal information form and a divorce predictors scale.

The data is a modified version of the publicly available dataset at https://archive.ics.uci.edu/ml/datasets/Divorce+Predictors+data+set (noise has been injected, so you will not replicate the results on the UCI website). There are 170 participants and 54 attributes (or predictor variables) that are all real-valued. The dataset is marriage.csv. The last column of the CSV file is the label y (1 means “divorce”, 0 means “no divorce”). Each column is one feature (predictor variable), and each row is a sample (participant). A detailed explanation of each feature (predictor variable) can be found at the website link above. Our goal is to build a classifier using training data, such that given a test sample, we can classify (or essentially predict) whether its label is 0 (“no divorce”) or 1 (“divorce”).

Build three classifiers (Naive Bayes, Logistic Regression, KNN). Use the first 80% of the data for training and the remaining 20% for testing. If you use scikit-learn, you can use train_test_split to split the dataset.

Remark: Please note that, here, for Naive Bayes, this means that we have to estimate the variance for each individual feature from the training data. When estimating the variance, if the variance is zero or close to zero (meaning that there is very little variability in the feature), you can set the variance to a small number, e.g., $\epsilon = 10^{-3}$. We do not want to include zero or nearly zero variance in Naive Bayes. This tip holds for both Part One and Part Two of this question.

(a) (10 points) Report the testing accuracy for each of the three classifiers. Comment on their performance: which performs the best, and make a guess as to why it performs the best in this setting.

(b) (10 points) Use the first two features to train three new classifiers. Plot the data points and the decision boundary of each classifier. Comment on the differences between the decision boundaries of the three classifiers. Please clearly represent the data points with different labels using different colors.

Part Two (Handwritten digits classification). (10 points) Repeat the above using the MNIST data from our Homework 2. Here, give digit “6” the label y = 1, and give digit “2” the label y = 0. All the pixels in each image are the features (predictor variables) for that sample (i.e., image). Our goal is to build a classifier such that, given a new test sample, we can tell whether it is a 2 or a 6. Use the first 80% of the samples for training and the remaining 20% for testing. Report the classification accuracy on the testing data for each of the three classifiers. Comment on their performance: which performs the best, and make a guess as to why it performs the best in this setting.

3. Naive Bayes for spam filtering. (40 points)

In this problem we will use the Naive Bayes algorithm to fit a spam filter by hand. This will enhance your understanding of the Bayes classifier and build intuition. This question does not involve any programming, only derivation and hand calculation.

Spam filters are used in all email services to classify received emails as “Spam” or “Not Spam”. A simple approach involves maintaining a vocabulary of words that commonly occur in “Spam” emails and classifying an email as “Spam” if the number of words from the dictionary that are present in the email is over a certain threshold. We are given a vocabulary consisting of 15 words:

V = {secret, offer, low, price, valued, customer, today, dollar, million, sports, is, for, play, healthy, pizza}.

We will use $V_i$ to represent the $i$th word in $V$. As our training dataset, we are also given 3 example spam messages,

• million dollar offer

• secret offer today

• secret is secret

and 4 example non-spam messages

• low price for valued customer

• play secret sports today

• sports is healthy

• low price pizza

Recall that the Naive Bayes classifier assumes the probability of an input depends on its input features. The feature vector for each sample is defined as $x^{(i)} = [x^{(i)}_1, x^{(i)}_2, \ldots, x^{(i)}_d]^T$, $i = 1, \ldots, m$, and the class of the $i$th sample is $y^{(i)}$. In our case the length of the input vector is $d = 15$, which is equal to the number of words in the vocabulary $V$. Each entry $x^{(i)}_j$ is equal to the number of times word $V_j$ occurs in the $i$-th message.

(a) (5 points) Calculate the class priors P(y = 0) and P(y = 1) from the training data, where y = 0 corresponds to spam messages and y = 1 corresponds to non-spam messages. Note that these class priors essentially correspond to the frequency of each class in the training sample.

(b) (10 points) Write down the feature vectors for each spam and non-spam message.

(c) (15 points) In the Naive Bayes model, assuming the keywords are independent of each other (this is a simplification), the likelihood of a sentence with feature vector $x$ given a class $c$ is

$$P(x \mid y = c) = \prod_{k=1}^{d} \theta_{c,k}^{x_k}, \quad c \in \{0, 1\},$$

where $0 \le \theta_{c,k} \le 1$ is the probability of word $k$ appearing in class $c$, which satisfies

$$\sum_{k=1}^{d} \theta_{c,k} = 1, \quad \forall c.$$

Given this, the complete log-likelihood function for our training data is

$$\ell(\theta_{0,1}, \ldots, \theta_{0,d}, \theta_{1,1}, \ldots, \theta_{1,d}) = \sum_{i=1}^{m} \sum_{k=1}^{d} x^{(i)}_k \log \theta_{y^{(i)},k}.$$

(In this example, m = 7.) Calculate the maximum likelihood estimates of $\theta_{0,1}$, $\theta_{0,7}$, $\theta_{1,1}$, $\theta_{1,15}$ by maximizing the log-likelihood function above. (Hint: We are solving a constrained maximization problem. To do this, remember that you need to introduce two Lagrange multipliers because you have two constraints.)

(d) (10 points) Given a test message “today is secret”, use the Naive Bayes classifier that you have trained in Parts (a)-(c) to calculate the posterior and decide whether it is spam or not spam.
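Although the problem is meant to be worked by hand, a short script is a convenient way to check the arithmetic. The sketch below relies on the closed form that the Lagrangian yields: each θ for a class is the count of that word in the class divided by the total word count of the class. It uses exact fractions and no smoothing, so a word never seen in a class gets probability zero. The function names `feature_vector`, `fit`, and `posterior` are hypothetical, and list positions are 0-based (the first vocabulary word sits at index 0).

```python
from collections import Counter
from fractions import Fraction

VOCAB = ["secret", "offer", "low", "price", "valued", "customer", "today",
         "dollar", "million", "sports", "is", "for", "play", "healthy", "pizza"]

SPAM = ["million dollar offer", "secret offer today", "secret is secret"]  # y = 0
NON_SPAM = ["low price for valued customer", "play secret sports today",
            "sports is healthy", "low price pizza"]                        # y = 1


def feature_vector(message):
    """Count how often each vocabulary word occurs in the message."""
    counts = Counter(message.split())
    return [counts[word] for word in VOCAB]


def fit(messages_by_class):
    """Return class priors and per-class word probabilities (count ratios)."""
    total_messages = sum(len(msgs) for msgs in messages_by_class.values())
    priors, thetas = {}, {}
    for label, messages in messages_by_class.items():
        priors[label] = Fraction(len(messages), total_messages)
        # Column sums of the feature vectors give per-word counts for the class.
        word_counts = [sum(col) for col in zip(*(feature_vector(m) for m in messages))]
        class_total = sum(word_counts)
        thetas[label] = [Fraction(n, class_total) for n in word_counts]
    return priors, thetas


def posterior(message, priors, thetas):
    """Posterior P(y = c | x) for every class c, by Bayes' rule."""
    x = feature_vector(message)
    joint = {}
    for label, prior in priors.items():
        p = prior
        for count, theta in zip(x, thetas[label]):
            p *= theta ** count
        joint[label] = p
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}


priors, thetas = fit({0: SPAM, 1: NON_SPAM})
print("priors:", priors)
print("P(secret | spam):", thetas[0][0], " P(today | spam):", thetas[0][6])
print("posterior for 'today is secret':", posterior("today is secret", priors, thetas))
```

On this training set the posterior probability of spam for “today is secret” works out to 125/137, roughly 0.912, so the message would be labeled spam.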

Artificial Neural Network

Define and explain Artificial Neural Network. Explain how Artificial Neurons decide whether to activate synapses and how to evaluate the performance of the final model.

In developing your initial response, be sure to draw from, explore, and cite credible reference materials, including at least one scholarly peer-reviewed reference. In responding to your classmates’ posts, you are encouraged to examine their opinions, offering supporting and/or opposing views.

Firewalls and Host and Intrusion Prevention Systems

 

  • Based on this week’s course objectives of securing resources in a network and describing firewalls and host and intrusion prevention systems, discuss whether both of these items are necessary for the network. Security networking devices have many advantages in the business environment. What are some of these advantages, and, from an information security perspective, what are some of the standard practices for securing these networks? Describe the significant reasons to secure the network and protect access to the Web. Feel free to research the internet and provide resource links and APA-cited references to support your comments.