# initialization
import os

if not os.getenv(
    "NBGRADER_EXECUTION"
):  # Skip the code or auto-grading may take too long to complete.
    %load_ext jupyter_ai
    # Set LLM alias
    %ai update chatgpt dive:chat

Load data into Weka Explorer Interface

How to start Weka in JupyterHub?

  • Open the Launcher (File->New Launcher)
  • Start a Desktop from the Launcher.
  • Start a Terminal from the menu on the top left.
  • Run the command weka and click the Explorer button.
  • Load data from the folder /data/ under the Linux root directory.
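
Before opening the Explorer, it can help to check which data files are available. The following is a minimal sketch, assuming the datasets are stored as `.arff` files somewhere under `/data/`; the helper name `list_arff_files` is hypothetical and is not part of Weka or of the course setup.

```python
from pathlib import Path


def list_arff_files(root="/data"):
    """Return the sorted paths of all .arff files under root (empty list if root is missing)."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(str(p) for p in root_path.rglob("*.arff"))


# Print the ARFF files that could be loaded in the Explorer.
for path in list_arff_files():
    print(path)
```

If the list is empty, the data may be stored elsewhere or in another format, so the file dialog in the Explorer remains the authoritative way to browse the folder.
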
%%ai chatgpt
How is Weka implemented and what is its main advantage over other data mining 
tools?

Use Weka to do [Witten11] Exercises 17.1.1 and 17.1.2.

YOUR ANSWER HERE

YOUR ANSWER HERE

Create an ARFF file

# YOUR CODE HERE
raise NotImplementedError

# Write `content` to the file AND.arff.
# `filename` is defined first so that later cells can refer to it
# even when `content` is undefined.
filename = "AND.arff"
try:
    content
except NameError:
    print("AND.arff not generated because `content` is undefined.")
else:
    with open(filename, "w") as f:
        f.write(content)
    print("AND.arff generated.")

Run the following test cell to check whether your file is a valid ARFF file. You may also download the ARFF file and load it into Weka to check for syntax errors.

# test
print('Content of AND.arff:')
with open(filename) as f:
    print(f.read())

from scipy.io import arff
import pandas as pd

d = arff.loadarff(filename)
df = pd.DataFrame(d[0]).astype(int)
df.head()
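
As a rough illustration of what a valid ARFF file looks like, the sketch below builds a small ARFF text for a hypothetical `toy_weather` relation (not the AND data of the exercise) and runs a simple structural check on it. The helper `check_arff_structure` is an assumption made for illustration: it is not a full ARFF parser and only handles plain comma-separated rows without quoted values or sparse data.

```python
# A hypothetical example relation; the names and values are made up for illustration.
example = """\
% Lines starting with % are comments.
@RELATION toy_weather

@ATTRIBUTE temperature NUMERIC
@ATTRIBUTE windy {TRUE,FALSE}
@ATTRIBUTE play {yes,no}

@DATA
30,FALSE,yes
15,TRUE,no
"""


def check_arff_structure(text):
    """Return (number of attributes, number of data rows); raise ValueError if malformed."""
    n_attributes, n_rows = 0, 0
    has_relation = in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        keyword = line.split()[0].lower()  # ARFF keywords are case-insensitive
        if in_data:
            # Each data row must supply one value per declared attribute.
            if len(line.split(",")) != n_attributes:
                raise ValueError(f"wrong number of values: {line!r}")
            n_rows += 1
        elif keyword == "@relation":
            has_relation = True
        elif keyword == "@attribute":
            n_attributes += 1
        elif keyword == "@data":
            in_data = True
        else:
            raise ValueError(f"unexpected header line: {line!r}")
    if not (has_relation and in_data and n_attributes):
        raise ValueError("missing @relation, @attribute, or @data section")
    return n_attributes, n_rows


print(check_arff_structure(example))
```

The header declares the relation name and the type of each attribute, and every row after `@DATA` supplies one value per attribute in the declared order. Loading the file with `scipy.io.arff` or with Weka remains the more reliable validity test.
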
%%ai chatgpt
How does ARFF compare to CSV, and why is the former better or more popular
than the latter for use with Weka?