Pick a dataset (see Resources below) and define a problem you want to solve

General information:


General rule: There may be some underspecified parts in the project description. This is on purpose! In those cases, make your own design choices and document them!

You can choose from the following project types:

Software: Pick a dataset (see Resources below) and define a problem you want to solve.

Select as many data mining techniques as you would like to use in order to solve the problem, implement them from scratch, clean and analyze the data, compare the results from the different techniques, and present the findings. Alternatively, you may define a problem associated with data that can be obtained from a website or web platform. Write a crawler that scrapes data from that platform, making sure that you respect all the crawling policies that the website has in place (usually by checking the robots.txt file of the website, or its data policy). Choose the data mining techniques you want to use, as in the first option; if you have to do non-trivial implementation work for the crawling and the data preparation, it is OK if you implement one technique fewer.
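As an illustration of the robots.txt check mentioned above, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent string `course-project-bot`, the `example.com` URLs, and the sample robots.txt rules are all hypothetical placeholders, not part of the assignment; a real crawler would load the rules from the target site instead of a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here so the sketch runs offline.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

# Hypothetical user-agent name for the crawler.
USER_AGENT = "course-project-bot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules allow USER_AGENT to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)
for url in ("https://example.com/public/page.html",
            "https://example.com/private/data.html"):
    print(url, "allowed" if allowed(parser, url) else "disallowed")

# The requested delay between requests (in seconds), if the site declares one.
print("crawl delay:", parser.crawl_delay(USER_AGENT))
```

Against a live site, one would typically call `set_url(...)` with the site's robots.txt address followed by `read()` instead of `parse(...)`, and sleep for the declared crawl delay between requests. The site's written data policy still needs to be read by hand.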


Research: In this project type you can propose your own idea along the lines of your own research or along the lines of improving the state of the art in solving an existing data mining problem. After you propose the idea, if there are any well-known techniques for that problem, you should use them as baselines, and you should propose a novel solution to that problem. You must implement at least one method, as in the Software option. This project type can earn extra credit.



Project Deliverables:


Project Proposal

Description:

In the proposal you must introduce your project briefly and clearly. In particular, you have to clearly define the problem your project proposes to solve. You should be able to distill the essence of your proposal to a statement like:


Given <dataset, website, …> Use <data mining technique(s)> To <achieve “KDD outcome”>


For example:

Given Netflix data Use Collaborative Filtering algorithms To recommend new movies to users


Given Twitter data Use Matrix Factorization To detect fake followers


In special cases, you may be able to relax the above format for the problem statement, but it is fairly generic and applies to a wide variety of problem statements. In any case, make sure you define what problem you are going to solve, and very importantly, describe how you are planning to evaluate your approach.


In addition to the above, make sure you include:

  1. The type of the project.
  2. The evaluation plan.


Depending on the project type you chose, you need to clearly describe your plan for obtaining the data that you will use.

– Here is how I will find labeled data

– Given labeled data, here’s what I’ll do

– Without labeled data, here’s what I’ll do


The page limit for the proposal is 2 pages, single column.


Final Project Deliverable

Description:


The final project deliverable should include:

  1. The project report in .pdf format.
  2. The code for your implementation.
  3. If you collected any dataset(s) for your project, include it/them in your deliverable, if that is possible. If the dataset comes with restrictions, there is no need to include it.



Details for the report:

Your final report should resemble a KDD paper (download the ACM “tight” format here: http://www.acm.org/publications/proceedings-template) and the page limit is 10 pages in double column format including the references.

For all project types you have to include:

  1. An Introduction where you describe and motivate the problem, give an outline of your contributions, and motivate your approach. If you chose Research, you also have to argue that your proposed approach is sufficiently novel with respect to the state of the art, by explaining how existing methods do not adequately address the problem you are solving.
  2. A Related Work section where you outline relevant papers that work on the same problem.
  3. A Proposed Method section where you describe the method(s) you used to solve the problem.
  4. An Experimental Evaluation section where you compare the methods used. If you chose Research, you have to further demonstrate that the proposed approach outperforms the baselines (at least in some cases); this can earn extra credit.
  5. A Discussion & Conclusions section where you draw the conclusions of your paper and outline potential future research directions.


For the code, make sure you include:

  1. All source files you wrote with comments that explain your implementation.
  2. A README file that describes what each file does.


Page limit: 5 pages + 1 for references (KDD-style double column format, ACM “tight” style)


Project Implementation


You need to implement one method. “Implementation” means writing the code for the method from scratch. For those implementations, you may use packages like Pandas, NumPy, etc., but only for their basic functionality. You may not use an existing library implementation of the method itself.


If you find a website/tutorial/blog that outlines the implementation, you may use it as inspiration/guide but anything you submit must be your own implementation. Verbatim (or nearly verbatim) copies will not be allowed or tolerated (see the academic integrity section below).
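To illustrate what "from scratch, using NumPy only for basic functionality" might look like, here is a minimal sketch of k-means clustering (Lloyd's algorithm). The choice of k-means is only an example, not a required technique, and the function name `kmeans`, its parameters, and the toy data are hypothetical. NumPy is used only for array arithmetic, means, and random sampling; no library clustering routine is called.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X into k groups with Lloyd's algorithm.

    Returns (centroids, labels). Illustrative sketch only.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()

    def assign(points, centers):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving: converged
        centroids = new_centroids

    return centroids, assign(X, centroids)


# Toy data: two well-separated groups of three points each.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

By contrast, calling a ready-made clustering routine from a machine learning library would count as using an existing implementation rather than writing your own.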


There are some techniques for which, by exception, you may use existing implementations in packages:

  • Neural Networks: You may use packages like TensorFlow, PyTorch, etc., and as part of your implementation you should experiment thoroughly with different architectures.
  • You may use an existing implementation of the Singular Value Decomposition.
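As an example of the Singular Value Decomposition exception, the sketch below uses NumPy's existing `np.linalg.svd` as a building block for a low-rank approximation. The function name `low_rank_approximation` and the small ratings-style matrix are hypothetical and only for illustration; the method built on top of the SVD would still be your own code.

```python
import numpy as np


def low_rank_approximation(matrix, rank):
    """Return the best rank-`rank` approximation of `matrix` (in the
    least-squares sense), computed from a library SVD."""
    # full_matrices=False gives the compact ("economy") decomposition.
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


# Hypothetical toy matrix (rows: users, columns: items).
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

approx = low_rank_approximation(ratings, rank=2)
error = np.linalg.norm(ratings - approx)  # Frobenius norm of the residual
print(np.round(approx, 2))
print("reconstruction error:", error)
```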

