import os

if not os.getenv("NBGRADER_EXECUTION"):
    # Load the jupyter_ai extension and map the `chatgpt` alias to the
    # dive:chat model (skipped when nbgrader executes the notebook).
    %load_ext jupyter_ai
    %ai update chatgpt dive:chat
    # %ai update chatgpt dive-azure:gpt4o
In this notebook, you will learn to use Weka to complete [Witten11] Exercises 17.1.3 to 17.1.10. You may refer to Chapter 10 for a more detailed introduction to Weka. The following tips should be useful for providing your answers in this tutorial notebook and for completing the projects later in the course.
Dataset Editor¶
After loading the data in the preprocess panel, we can inspect or change the data using the dataset editor shown in Figure 1.
Figure 1: Dataset editor.
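If you prefer to inspect the data programmatically inside the notebook, the following is a minimal sketch that loads an ARFF file with SciPy and views it as a pandas DataFrame; the file name here is only a placeholder, not a dataset provided with this notebook.
import pandas as pd
from scipy.io import arff

# Load a hypothetical ARFF file (replace "data.arff" with your own file).
data, meta = arff.loadarff("data.arff")
print(meta)              # attribute names and types
df = pd.DataFrame(data)  # nominal values appear as byte strings
df.head()                # quick look at the first few instances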
YOUR ANSWER HERE
YOUR ANSWER HERE
YOUR ANSWER HERE
Applying a Filter¶
We can also modify the data using filters. After selecting a filter,
- left-click the filter to change its configuration, or
- right-click the filter configuration in Weka to copy it to the clipboard, as shown in Figure 2.
Figure 2: Applying a filter.
YOUR ANSWER HERE
Weka’s CLI
The configuration can be run in Weka’s command line interface (CLI). See the documentation for details.
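For example, a filter configuration copied from the GUI can be run as a Java command directly from this notebook. The sketch below is only an illustration: the jar path, the Remove configuration, and the file names are placeholders you would replace with your own.
import subprocess

weka_jar = "/path/to/weka.jar"  # assumption: adjust to your Weka installation
# Hypothetical configuration copied from the GUI: remove the first attribute.
config = "weka.filters.unsupervised.attribute.Remove -R 1"

# Apply the filter in batch mode: -i reads the input ARFF, -o writes the result.
subprocess.run(
    ["java", "-cp", weka_jar, *config.split(),
     "-i", "data.arff", "-o", "data-filtered.arff"],
    check=True,
)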
Classify Panel¶
To train a classifier, use the classify panel shown in Figure 3 to select a classification algorithm and start the training.
Figure 3: Classify panel.
- The default test option uses 10-fold cross-validation, but we can choose to
  - use the training set for testing,
  - supply a separate dataset for testing, or
  - use only a specified percentage of the original data for training and hold out the remaining data for testing (see the sketch after this list for the equivalent command-line evaluation).
- After training, we can right-click the result in the result list to visualize the classifier errors.
- For a decision tree classifier, we can sometimes visualize the tree in addition to its text representation in the Classifier output.
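As a minimal sketch of the same evaluation outside the GUI, the snippet below runs J48 with 10-fold cross-validation through Weka's command line; the jar path and the dataset file name are assumptions, not files supplied with this notebook.
import subprocess

weka_jar = "/path/to/weka.jar"  # assumption: adjust to your Weka installation

# Evaluate J48 with 10-fold cross-validation (-x 10) on a hypothetical
# training file; -T <file> would supply a separate test set instead.
result = subprocess.run(
    ["java", "-cp", weka_jar, "weka.classifiers.trees.J48",
     "-t", "data.arff", "-x", "10"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # summary statistics similar to the Classifier output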
YOUR ANSWER HERE
YOUR ANSWER HERE
YOUR ANSWER HERE
%%ai chatgpt -f text
Regardless of the test options, Weka runs the learning algorithm on the full
dataset to obtain the model to deploy. Wouldn't this cause overfitting?
LLMs can hallucinate, especially on concepts that require critical thinking, so you should verify their answers with rigorous proofs or reasoning. Try modifying the prompt to force the LLM to go through proper reasoning. You may also need to clear the chat history so that the LLM is not overly influenced by the previous prompts:
%ai reset
See the GitHub issue for vscode-drawio on its limitation in typesetting math.