# initialization
import os

if not os.getenv("NBGRADER_EXECUTION"):
    # Skip the code or auto-grading may take too long to complete.
    %load_ext jupyter_ai
    # Set LLM alias
    %ai update chatgpt dive:chat
Load data into Weka Explorer Interface
How to start Weka in JupyterHub?
- Open the Launcher (File -> New Launcher).
- Start a Desktop from the Launcher.
- Start a Terminal from the menu on the top left.
- Run the command weka and click the Explorer button.
- Load data from the folder /data/ under the Linux root directory.
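Before opening the Explorer, it can help to check which datasets are available from a notebook cell. The following is a minimal sketch, assuming the datasets are stored as `.arff` files directly inside `/data/`; the helper name `list_arff_files` is hypothetical and not part of Weka or the course material.

```python
from pathlib import Path


def list_arff_files(folder):
    """Return the sorted names of the .arff files directly inside `folder`.

    Returns an empty list if the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("*.arff"))


# Assumption: the datasets sit directly under /data/ as described above.
print(list_arff_files("/data"))
```

If the list comes back empty, the datasets may be stored in a subfolder or in a different format, so browsing the folder from the Desktop is the safer check.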
%%ai chatgpt
How is Weka implemented and what is its main advantage over other data mining
tools?
Other ways to run Weka
- To run Weka locally on your computer:
- For the computers in CSC teaching studios, you can start Weka as follows:
  - Click the shortcut Work Desk from the desktop.
  - Click the link Weka 3.8.x for CS Department.
  - Load the dataset from C:\Program Files\Weka-3-8\data\.
- For the computers in CS labs, you can start Weka as follows:
  - Execute G:\weka\3.8\run.bat,
  - Click the Explorer button, and
  - Load the dataset from C:\temp\Weka-3.8\data or G:\weka\3.8\files\data.
Use Weka to do [Witten11] Exercises 17.1.1 and 17.1.2.
YOUR ANSWER HERE
YOUR ANSWER HERE
Create an ARFF file
# YOUR CODE HERE
raise NotImplementedError
# write the content of text to the file
try:
    content
except NameError:
    print("AND.arff not generated because `content` is undefined.")
else:
    filename = 'AND.arff'
    with open(filename, 'w') as f:
        f.write(content)
    print("AND.arff generated.")
Run the following test cell to see if your file is a valid ARFF file. You may also download and load the ARFF file into WEKA to see if there is any syntax error.
# test
print('Content of AND.arff:')
with open(filename) as f:
    print(f.read())
from scipy.io import arff
import pandas as pd
d = arff.loadarff(filename)
df = pd.DataFrame(d[0]).astype(int)
df.head()
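For reference, an ARFF file has a header (an `@relation` line followed by one `@attribute` line per column) and a `@data` section with one comma-separated row per instance. The sketch below shows that layout using only the standard library. The helper `to_arff` and the XOR example are hypothetical illustrations, not the expected answer to the exercise.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text from a relation name, attribute specs, and data rows.

    `attributes` is a list of (name, kind) pairs, where kind is either an
    ARFF type keyword such as "numeric" or a list of nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            spec = kind  # e.g. "numeric"
        else:
            # Nominal attribute: list the allowed values in braces.
            spec = "{" + ",".join(str(v) for v in kind) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical example: the XOR truth table with nominal attributes.
xor_text = to_arff(
    "XOR",
    [("X1", [0, 1]), ("X2", [0, 1]), ("Y", [0, 1])],
    [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
)
print(xor_text)
```

Declaring the columns as nominal `{0,1}` rather than `numeric` tells Weka to treat them as categories, which matters for classifiers that require a nominal class attribute.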
%%ai chatgpt
How does ARFF compare to CSV, and why is one format better or more popular
than the other?