6 steps to a successful data science project

Do you have a data science project ahead of you, or are you simply curious about how data mining works? This article walks through CRISP-DM (Cross Industry Standard Process for Data Mining), an established cross-industry guideline for leading such projects to success in a well-structured way.

The method was developed in 1996 by well-known companies (Daimler AG, NCR Corporation, etc.) with the aim of establishing a uniform standard for data mining projects. It divides a project into six phases. The model is a cycle, and the phases are not strictly sequential: going back to an earlier phase is an expected part of the process.

1. Business understanding (task definition)

The first step, business understanding, focuses on the problem or question. Which problem should be solved, and how much potential does the project have? The answer determines how much money can reasonably be invested in the project. The easiest way to settle this is to define economic target criteria.

2. Data understanding (selection of relevant data sets)

Which data sources are available to reach this goal? Is all the necessary information at hand, or does data have to be obtained first? Depending on the answers, it may make sense to reformulate the goal at this point.
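One way to answer the question of whether all necessary information is available is to count missing values per field. The following is only a minimal sketch: the function `profile_missing`, the field names, and the sample records are hypothetical and invented for illustration.

```python
from collections import Counter


def profile_missing(records, required_fields):
    """Count, per required field, how many records lack a usable value."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat absent keys, None, and empty strings as missing.
            if value is None or value == "":
                missing[field] += 1
    return {field: missing[field] for field in required_fields}


# Hypothetical sample data, invented for illustration only.
customers = [
    {"customer_id": 1, "age": 34, "revenue": 120.0},
    {"customer_id": 2, "age": None, "revenue": 80.5},
    {"customer_id": 3, "revenue": ""},
]

report = profile_missing(customers, ["customer_id", "age", "revenue"])
print(report)
```

A field with many gaps is a signal either to obtain more data or to reconsider the goal.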

3. Data preparation (cleaning and transformation of the data)

Once the data has been compiled, it is time to inspect and prepare it. This is usually one of the most laborious parts of the project, because the data generally has to be cleaned and transformed before it can be used.
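Cleaning and transforming can be sketched in a few lines. The example below is an illustration, not a general recipe. It assumes hypothetical fields `age` and `revenue` that arrive as strings, drops incomplete rows, casts the types, and scales revenue to the range 0 to 1 (min-max scaling, chosen here only as one common transformation).

```python
def prepare(records):
    """Drop incomplete rows, cast types, and min-max scale revenue to [0, 1]."""
    cleaned = []
    for record in records:
        age = record.get("age")
        revenue = record.get("revenue")
        # Cleaning: skip rows where a value is missing.
        if age in (None, "") or revenue in (None, ""):
            continue
        # Transformation: cast the raw strings to numeric types.
        cleaned.append({"age": int(age), "revenue": float(revenue)})

    if not cleaned:
        return cleaned

    low = min(row["revenue"] for row in cleaned)
    high = max(row["revenue"] for row in cleaned)
    span = high - low
    for row in cleaned:
        # Guard against division by zero when all revenues are equal.
        row["revenue_scaled"] = (row["revenue"] - low) / span if span else 0.0
    return cleaned


# Hypothetical raw input, invented for illustration only.
raw = [
    {"age": "34", "revenue": "120.0"},
    {"age": "", "revenue": "80.5"},
    {"age": "51", "revenue": "200"},
    {"age": "28", "revenue": None},
]

prepared = prepare(raw)
print(prepared)
```

Dropping rows is the simplest way to handle gaps. Whether it is appropriate depends on how much data is lost in the process.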

4. Modelling (selection and application of data mining methods)

This is where the algorithm comes in. A data scientist builds a model and usually uses simple metrics to check whether it is suitable for the task. At this point, it is often necessary to make adjustments and take a step back to data preparation.
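As a rough illustration of "build a model, then check it with a simple metric", the sketch below fits a straight line by ordinary least squares and computes the mean absolute error on held-out data. The choice of model and metric, and all numbers, are assumptions made for this example. CRISP-DM itself does not prescribe any particular algorithm.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(xs, ys, slope, intercept):
    """Average absolute difference between predictions and actual values."""
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training and test data, invented for illustration only.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]
test_x = [5.0, 6.0]
test_y = [10.0, 13.0]

slope, intercept = fit_line(train_x, train_y)
mae = mean_absolute_error(test_x, test_y, slope, intercept)
print(slope, intercept, mae)
```

If the error on the held-out data is too large, that is the cue to revisit the features or go back to data preparation.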

5. Evaluation (evaluation and interpretation of results)

If the model looks sound at this stage, it must be checked against the target: can the previously defined goals be achieved with this model? If not, the goal or the model must be adapted accordingly.
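Checking a model against previously defined goals can be made explicit in code. The sketch below assumes the goals were written down as numeric thresholds. The class `TargetCriterion`, the metric names, and the values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TargetCriterion:
    """A target defined up front, expressed as a threshold on one metric."""
    name: str
    threshold: float
    higher_is_better: bool = True

    def is_met(self, value):
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def evaluate(metrics, criteria):
    """Map each criterion name to whether the measured metric meets it."""
    return {c.name: c.is_met(metrics[c.name]) for c in criteria}


# Hypothetical targets and measured values, invented for illustration only.
criteria = [
    TargetCriterion("precision", 0.80),
    TargetCriterion("mean_absolute_error", 1.0, higher_is_better=False),
]
metrics = {"precision": 0.76, "mean_absolute_error": 0.5}

results = evaluate(metrics, criteria)
ready_for_deployment = all(results.values())
print(results, ready_for_deployment)
```

In this example one target is missed, so either the model or the target would have to be adapted before moving on.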

6. Deployment (application of results)

If the evaluation shows that the quality requirements are met, the model is put into operation. At this point, a process for ongoing monitoring is usually set up as well, to check whether the model still fits the goals.
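Ongoing monitoring can be as simple as tracking the recent prediction error and raising a flag when it exceeds an agreed limit. The following is a minimal sketch under that assumption. The class `ErrorMonitor`, the window size, and the threshold are hypothetical and would need to be chosen per project.

```python
from collections import deque


class ErrorMonitor:
    """Track the mean absolute error over the most recent predictions."""

    def __init__(self, max_mean_error, window_size=3):
        self.max_mean_error = max_mean_error
        self.errors = deque(maxlen=window_size)  # oldest entries drop out automatically

    def record(self, predicted, actual):
        self.errors.append(abs(predicted - actual))

    def needs_review(self):
        # Only judge once the window is full, to avoid reacting to single outliers.
        if len(self.errors) < self.errors.maxlen:
            return False
        return sum(self.errors) / len(self.errors) > self.max_mean_error


# Hypothetical prediction/actual pairs, invented for illustration only.
monitor = ErrorMonitor(max_mean_error=1.0, window_size=3)
for predicted, actual in [(10.0, 10.5), (12.0, 11.0), (14.0, 14.2)]:
    monitor.record(predicted, actual)
flag_while_stable = monitor.needs_review()

for predicted, actual in [(16.0, 19.0), (18.0, 22.0)]:
    monitor.record(predicted, actual)
flag_after_drift = monitor.needs_review()

print(flag_while_stable, flag_after_drift)
```

When the flag is raised, the cycle starts again: the goals, the data, or the model are reviewed.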

