Structure of a Dataset

< 1 min read

A dataset is a simple, flat table containing historical data of the scenario for which predictions are to be made. Each row, referred to as an example or an instance, contains data for a single observation about a problem. Each column, referred to as an attribute, a variable or a dimension contains a feature of the observation.

In supervised machine learning columns are divided into a set of descriptive features also called standard features, regular features or simply features, and a single target feature also called a label or an outcome. The dataset for an unsupervised learning task does not have a label, i.e., the dataset is “unlabeled“.

The features that come directly from raw data sources, e.g., Origin, Dest, and DayOfWeek in the above diagram are referred to as raw features while those that do not exist in any raw data source or are created from one or more features in a data source, e.g., Duration (if calculated from “call start time” and “call end time”) and TimeOfDay (if derived from the actual time on the timestamp) are referred to as derived features.

Join WhatsApp group here
Join Facebook group here

What are your Feelings

Still stuck? How can we help?

Updated on 3 June, 2022

1. Introduction to Artificial Intelligence and Machine Learning

2. Machine Learning Success Factors

3. Build and Use a Quick Machine Learning Model

4. Defining the Business Problem

5. Data Understanding

6. Data Preparation

7. Modeling

8. Predictive Modeling

9. Model Validation

10. Model Deployment

Structure of a Dataset

What are your Feelings

Leave a Reply Cancel reply

How can we help you?

What are your Feelings

Share post:

What's on your mind?

Leave a Reply Cancel reply