import os

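# Load the Jupyter AI magics and point the `chatgpt` alias at a model,
# unless the notebook is being executed by nbgrader.
if not os.getenv("NBGRADER_EXECUTION"):
    %load_ext jupyter_ai
    %ai update chatgpt dive:chat
    # %ai update chatgpt dive-azure:gpt4o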

In this notebook, you will learn to use Weka to complete [Witten11] Exercises 17.1.3 to 17.1.10. You may refer to Chapter 10 for a more detailed introduction to Weka. The following tip should be useful for providing your answers in this tutorial notebook and completing projects later on in the course.

Dataset Editor

After loading the data in the preprocess panel, we can inspect or change the data using the dataset editor shown in Figure 1.

Figure 1: Dataset editor.

YOUR ANSWER HERE

YOUR ANSWER HERE

YOUR ANSWER HERE

Applying a Filter

We can also modify the data using filters. After selecting a filter,

  • left-click the filter to change its configuration, or
  • right-click the filter to copy its configuration to the clipboard, as shown in Figure 2.

Figure 2: Applying a filter.

YOUR ANSWER HERE

Classify Panel

To train a classifier, use the classify panel shown in Figure 3 to select a classification algorithm and start the training.

Figure 3: Classify panel.

  1. The default test option is 10-fold cross-validation, but we can instead choose to
    • use the training set for testing,
    • supply a separate dataset for testing, or
    • use only a specified percentage of the original data for training and hold out the remaining data for testing.
  2. After training, we can right-click the result in the result list to visualize the classifier errors.
  3. For a decision tree classifier, we can sometimes visualize the tree in addition to its text representation in the classifier output.
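
To make the difference between the test options concrete, the following is a minimal Python sketch of how k-fold cross-validation and a percentage split partition the instance indices. It is not Weka's implementation: the function names `k_fold_indices` and `percentage_split`, the seed, and the 66% figure are illustrative assumptions, and the sketch does not stratify the folds by class.

```python
import random


def k_fold_indices(n, k=10, seed=1):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; no stratification is done.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Train on every fold except the i-th, test on the i-th.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def percentage_split(n, train_percent=66, seed=1):
    """Return (train, test) index lists for a single holdout split."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_percent / 100)
    return indices[:cut], indices[cut:]


n = 20  # assumed number of instances
splits = list(k_fold_indices(n, k=10))
train, test = percentage_split(n, train_percent=66)
print(len(splits), len(train), len(test))
```

With cross-validation every instance is used for testing exactly once, whereas a percentage split tests only on the held-out portion.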

YOUR ANSWER HERE

YOUR ANSWER HERE

YOUR ANSWER HERE

%%ai chatgpt -f text
Regardless of the test options, Weka runs the learning algorithm on the full
dataset to obtain the model to deploy. Wouldn't this cause overfitting?
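
As a starting point for thinking about this question, the sketch below separates the two roles of the data: the accuracy estimate is computed only from held-out folds, while the model to deploy is fit afterwards on all instances. It is an illustration under stated assumptions, not Weka's code: the classifier is a simple majority-class predictor (in the spirit of Weka's ZeroR), the labels are made up, and the folds are interleaved rather than shuffled or stratified.

```python
from collections import Counter


def majority_class(labels):
    """Fit a majority-class predictor: return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validated_accuracy(labels, k):
    """Estimate accuracy using only predictions on held-out folds."""
    n = len(labels)
    correct = 0
    for i in range(k):
        test_idx = set(range(i, n, k))  # interleaved fold i
        train = [labels[j] for j in range(n) if j not in test_idx]
        prediction = majority_class(train)  # fit without the test fold
        correct += sum(labels[j] == prediction for j in test_idx)
    return correct / n


# Hypothetical class labels, for illustration only.
labels = ["yes"] * 9 + ["no"] * 5

estimate = cross_validated_accuracy(labels, k=7)  # from held-out data only
deployed = majority_class(labels)                 # fit on the full dataset
print(estimate, deployed)
```

The full-data model is never scored on its own training instances here, so the reported estimate does not come from data the evaluated models have seen.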
Footnotes
  1. See the GitHub issue for vscode-drawio in typesetting math.