Intro to Data Mining

Use proper APA citations for any content that is not your own work.

Question 1

Suppose that you are employed as a data mining consultant for an Internet search engine company. Describe how data mining can help the company by giving specific examples of how techniques such as clustering, classification, association rule mining, and anomaly detection can be applied.

Question 2

Identify at least two advantages and two disadvantages of using color to visually represent information.

Question 3

Consider the XOR problem where there are four training points: (1, 1, −), (1, 0, +), (0, 1, +), (0, 0, −). Transform the data into the following feature space:

$\Phi = (1, \sqrt{2}x_1, \sqrt{2}x_2, \sqrt{2}x_1x_2, x_1^2, x_2^2)$.

Find the maximum margin linear decision boundary in the transformed space.
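As a starting point, the feature map can be applied to the four training points in a few lines of Python. This is only a sketch of the transformation step, not the margin computation itself. The function names `phi` and `poly_kernel` are hypothetical, and the kernel check relies on a standard identity that the question does not state: the dot product of two mapped points equals (1 + x·y)^2.

```python
import math

def phi(x1, x2):
    """Map a 2-D point into the 6-D feature space from the question."""
    r2 = math.sqrt(2)
    return (1.0, r2 * x1, r2 * x2, r2 * x1 * x2, x1 ** 2, x2 ** 2)

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    """Degree-2 polynomial kernel (assumed identity, not in the question)."""
    return (1 + x[0] * y[0] + x[1] * y[1]) ** 2

# The four XOR training points with labels encoded as +1 / -1.
points = [((1, 1), -1), ((1, 0), +1), ((0, 1), +1), ((0, 0), -1)]

for x, label in points:
    print(label, [round(v, 3) for v in phi(*x)])

# Sanity check: the explicit map and the kernel agree on every pair.
for x, _ in points:
    for y, _ in points:
        assert abs(dot(phi(*x), phi(*y)) - poly_kernel(x, y)) < 1e-9
```

Only the fourth coordinate, √2·x1·x2, is non-zero for (1, 1) alone among the four points, which is worth keeping in mind when looking for a separating hyperplane.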

Question 4

Consider the following set of candidate 3-itemsets: {1, 2, 3}, {1, 2, 6}, {1, 3, 4}, {2, 3, 4}, {2, 4, 5}, {3, 4, 6}, {4, 5, 6}

Construct a hash tree for the above candidate 3-itemsets. Assume the tree uses a hash function where all odd-numbered items are hashed to the left child of a node, while the even-numbered items are hashed to the right child. A candidate k-itemset is inserted into the tree by hashing on each successive item in the candidate and then following the appropriate branch of the tree according to the hash value. Once a leaf node is reached, the candidate is inserted based on one of the following conditions:

Condition 1: If the depth of the leaf node is equal to k (the root is assumed to be at depth 0), then the candidate is inserted regardless of the number of itemsets already stored at the node.

Condition 2: If the depth of the leaf node is less than k, then the candidate can be inserted as long as the number of itemsets stored at the node is less than maxsize. Assume maxsize = 2 for this question.

Condition 3: If the depth of the leaf node is less than k and the number of itemsets stored at the node is equal to maxsize, then the leaf node is converted into an internal node. New leaf nodes are created as children of the old leaf node. Candidate itemsets previously stored in the old leaf node are distributed to the children based on their hash values. The new candidate is also hashed to its appropriate leaf node.
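The three conditions can be sketched in Python to cross-check a hand-drawn tree. This is a minimal sketch under stated assumptions, not a reference implementation: the names `Node`, `branch`, `insert`, and `count_nodes` are hypothetical, the root is assumed to start as an empty leaf, candidates are inserted in the order listed, and a child node is created only when an itemset first hashes to it.

```python
class Node:
    def __init__(self, depth):
        self.depth = depth
        self.itemsets = []    # candidates stored while the node is a leaf
        self.children = None  # becomes a dict {"L": Node, "R": Node} after a split

    @property
    def is_leaf(self):
        return self.children is None

def branch(item):
    """Odd items hash to the left child, even items to the right child."""
    return "L" if item % 2 == 1 else "R"

def insert(node, itemset, k, maxsize):
    # Follow internal nodes, hashing on the item at position `depth`.
    while not node.is_leaf:
        key = branch(itemset[node.depth])
        if key not in node.children:
            node.children[key] = Node(node.depth + 1)  # assumption: lazy creation
        node = node.children[key]
    # Condition 1 (depth == k) or Condition 2 (leaf still has room).
    if node.depth == k or len(node.itemsets) < maxsize:
        node.itemsets.append(itemset)
        return
    # Condition 3: convert the full leaf into an internal node and
    # redistribute its itemsets together with the new candidate.
    old, node.itemsets, node.children = node.itemsets, [], {}
    for s in old + [itemset]:
        insert(node, s, k, maxsize)

def count_nodes(node):
    """Return (number of leaves, number of internal nodes) below `node`."""
    if node.is_leaf:
        return 1, 0
    leaves, internals = 0, 1
    for child in node.children.values():
        l, i = count_nodes(child)
        leaves, internals = leaves + l, internals + i
    return leaves, internals

candidates = [(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)]
root = Node(0)
for c in candidates:
    insert(root, c, k=3, maxsize=2)

leaves, internals = count_nodes(root)
print("leaves:", leaves, "internal nodes:", internals)
```

If your course expects both children to be created on every split, even when one stays empty, change the lazy-creation step accordingly and compare the counts.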

How many leaf nodes are there in the candidate hash tree? How many internal nodes are there?

Consider a transaction that contains the following items: {1, 2, 3, 5, 6}. Using the hash tree constructed above, which leaf nodes will be checked against the transaction? What are the candidate 3-itemsets contained in the transaction?
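The second part of that question can be cross-checked without the tree by brute force: enumerate every 3-item subset of the transaction and keep those that are also candidates. This is a sketch for verifying an answer, not the hash tree traversal the question asks you to describe, and the variable names are illustrative.

```python
from itertools import combinations

# Candidate 3-itemsets from the question, stored as sorted tuples.
candidates = {(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)}

transaction = (1, 2, 3, 5, 6)

# All 3-item subsets of the transaction, in sorted order.
subsets = set(combinations(sorted(transaction), 3))

# A candidate is contained in the transaction if it is one of those subsets.
contained = sorted(subsets & candidates)
print(contained)
```

The hash tree exists to avoid exactly this exhaustive comparison, so the leaf nodes visited during traversal should cover at least the candidates this check reports.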

Question 5

Consider a group of documents that has been selected from a much larger set of diverse documents so that the selected documents are as dissimilar from one another as possible. If we consider documents that are not highly related (connected, similar) to one another as being anomalous, then all of the documents that we have selected might be classified as anomalies. Is it possible for a data set to consist only of anomalous objects or is this an abuse of the terminology?

Written Assignment

Law enforcement professionals and investigators use digital forensic methods to solve crimes every day. Locate one current news article that explains how investigators may have used these techniques to solve a crime. Explain the crime that was solved and the methods used to determine how the crime was committed. Some examples of crimes solved may include locating missing children, finding criminals who have fled the scene of a crime, or unsolved crimes from the past that have been solved due to the use of new techniques (such as DNA testing).

PowerPoint Presentation

This assignment will be submitted to Turnitin®.

Instructions

Pick one of the operating systems below, present information on it, and give your thoughts on how it compares with other operating systems.

  • Windows 
  • Linux 
  • Unix
  • Android
  • iOS

Due Date: Oct 22.

Serious bidders only.
You must be a Computer Science major to do this task.

Assignment

Select any visualisation or infographic and look at one of the individual charts it includes. Try to extract and write down, in plain language, what this chart shows in terms of the angle, the framing, and (where relevant) the focus.

Does the definition you have arrived at feel consistent with the aims or claims of the chart as published? In other words, does the chart show and include what you think it is supposed to, or is there a disconnect?

Assignment Link: http://book.visualisingdata.com/chapter/chapter5

Assignment Length (word count): At least 500 words (not including direct quotes).

References: At least two peer-reviewed, scholarly journal references.

Applied Machine Learning

There are 21 questions in total.

You must use Jupyter under Anaconda Navigator…

DOWNLOAD LINK: 

https://www.anaconda.com/products/individual

Open Source (Free Individual)

I have added two Excel database sheets and an .ipynb folder.

Research Paper

This week’s reading centered on Bitcoin economics. For this week’s research paper, search the Internet and explain why some organizations are accepting and other organizations are rejecting the use of Bitcoin as a standard form of currency. Your paper needs to identify two major companies that have adopted Bitcoin technology as well as one that has refused to accept Bitcoin as a form of currency. Be sure to discuss each organization, how they adopted (or why they won’t adopt) Bitcoin, and what recommendations you have for them to continue to support Bitcoin (or why they should support Bitcoin).

Your paper should meet these requirements:

  • Be approximately four to six pages in length, not including the required cover page and reference page.
  • Follow APA 7 guidelines. Your paper should include an introduction, a body with fully developed content, and a conclusion.
  • Support your answers with the readings from the course and at least two scholarly journal articles to support your positions, claims, and observations, in addition to your textbook. The UC Library is a great place to find resources.
  • Be clear, well written, concise, and logical, using excellent grammar and style techniques. You are being graded in part on the quality of your writing.

Lab


In this lab you’ll set up a local environment that you’ll use in later labs to build Ethereum and Hyperledger Fabric development and deployment environments. When you finish this lab, you’ll have the infrastructure in place to build many different types of virtual environments on your own computer.

Written Assignment

 In this module, you learned that random numbers (or, at least, pseudorandom numbers) are essential in cryptography, but it is extremely difficult even for powerful hardware and software to generate them. Go online and conduct research on random number generators. What are the different uses of these tools besides cryptography? How do they work?
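To make the "how do they work" question concrete, here is a minimal sketch of one classic pseudorandom generator, a linear congruential generator (LCG). The class name `LCG` and its methods are hypothetical; the multiplier and increment are the widely published Numerical Recipes constants. An LCG is fine for illustration and simple simulations but is not suitable for cryptography, so the sketch also shows Python's standard `secrets` module, which draws from the operating system's secure source.

```python
import secrets

class LCG:
    """Linear congruential generator: state = (a * state + c) mod m."""
    A = 1664525        # multiplier (Numerical Recipes constants)
    C = 1013904223     # increment
    M = 2 ** 32        # modulus

    def __init__(self, seed):
        self.state = seed % self.M

    def next_u32(self):
        """Advance the state and return it as a 32-bit integer."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state

    def random(self):
        """Return a float in the half-open interval [0, 1)."""
        return self.next_u32() / self.M

# Same seed, same sequence: the output is fully determined by the seed.
a, b = LCG(42), LCG(42)
assert [a.next_u32() for _ in range(5)] == [b.next_u32() for _ in range(5)]

# Non-cryptographic use: simulating rolls of a six-sided die.
rng = LCG(2024)
rolls = [1 + int(rng.random() * 6) for _ in range(10)]
print(rolls)

# Cryptographic use: ask the operating system for unpredictable values.
print(secrets.randbelow(6) + 1)   # one secure die roll
print(secrets.token_hex(16))      # 16 random bytes as hex, e.g. for a token
```

The reproducibility that makes a seeded generator useful for simulations, games, and testing is exactly what makes it unsafe for keys and tokens.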