Binary classification of strings in WEKA

Jack McKenzie
I'm building a binary classifier in WEKA. It will be used to check whether the strings in the test data are Swing UI components or not.

Example
@relation SwingComponents

@attribute DataTypeText string
@attribute SwingComponent {yes,no}

@data

JPanel, yes
FooBar, no
JButton, yes

Currently, I'm experimenting with the StringToWordVector filter followed by, for example, Naive Bayes. I'm not sure if I'm doing it correctly. The goal is to pass in some test data and get back a classified set, in other words the set of Swing UI strings.

Test example
@relation SwingComponents

@attribute DataTypeText string
@attribute SwingComponent {yes,no}

@data

JPanel, ?
SomeOtherFooBar, ?
JButton, yes

If possible, please advise on which filters and which algorithm I should use.

The result should look something like this:
JPanel, yes
SomeOtherFooBar, no
JButton, yes

I'll use this in an app that scans all lines of a .java file and determines which lines belong to the UI. This lets us later copy all UI lines from the .java file into a new file. We are doing this to extract the relevant Swing UI components without extracting any application logic.
Top Expert 2016
Commented:
I don't know this API, but you could just search the runtime jar for matches on '^javax.swing.*' to produce the dataset.
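A minimal sketch of that idea, under some assumptions: it targets JDK 9 or later, where the runtime image (read through the built-in `jrt:/` file system) replaced the old runtime jar, and where Swing lives in the `java.desktop` module. The class name `SwingClassLister` and its methods are made up for illustration. It lists every class file under `javax/swing`, including non-public ones, and prints each name as a positive ARFF row. On Java 8 the same name filter could be applied to the entries of the runtime jar instead.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of WEKA or the JDK.
public class SwingClassLister {

    // Turns "javax/swing/JPanel.class" into "JPanel". Returns null for entries
    // outside javax/swing, for nested or anonymous classes, and for package-info.
    public static String toSimpleName(String entry) {
        if (!entry.startsWith("javax/swing/") || !entry.endsWith(".class")) {
            return null;
        }
        String name = entry.substring(entry.lastIndexOf('/') + 1,
                                      entry.length() - ".class".length());
        if (name.contains("$") || name.equals("package-info")) {
            return null;
        }
        return name;
    }

    // Walks the java.desktop module of the running JDK (assumes JDK 9+).
    public static List<String> listSwingClasses() throws IOException {
        FileSystem jrt = FileSystems.getFileSystem(URI.create("jrt:/"));
        Path root = jrt.getPath("/modules/java.desktop");
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.map(p -> root.relativize(p).toString())
                        .map(SwingClassLister::toSimpleName)
                        .filter(Objects::nonNull)
                        .distinct()
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Emit rows in the same shape as the @data section of the question.
        for (String name : listSwingClasses()) {
            System.out.println(name + ", yes");
        }
    }
}
```

The output only covers the positive class; the "no" rows would still have to come from somewhere else, for example class names from other packages.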
Jack McKenzie, Chief Executive Officer, http://mybigdata.co.uk

Author

Commented:
Thanks a lot! Swing components, including custom ones, are always imported from some package. That makes them easy to identify without AI.
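A rough sketch of the import-based approach, with assumptions spelled out: the class `SwingLineFilter` and its methods are invented for illustration, only explicit single-class imports under `javax.swing` are recognized (wildcard imports such as `javax.swing.*` and fully qualified names in the code are not resolved), and a line counts as a UI line if it mentions one of the imported names as a whole word. Custom components would need their own package prefix added to the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
public class SwingLineFilter {

    // Matches e.g. "import javax.swing.JButton;" and captures "JButton".
    private static final Pattern SWING_IMPORT = Pattern.compile(
            "^\\s*import\\s+(javax\\.swing(?:\\.\\w+)*)\\.(\\w+)\\s*;");

    // Collects the simple names of classes explicitly imported from javax.swing.
    public static Set<String> importedSwingNames(List<String> lines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = SWING_IMPORT.matcher(line);
            if (m.find()) {
                names.add(m.group(2));
            }
        }
        return names;
    }

    // Returns every line that mentions an imported Swing name as a whole word.
    // The import lines themselves are included, since they mention the name too.
    public static List<String> swingLines(List<String> lines) {
        List<Pattern> patterns = new ArrayList<>();
        for (String name : importedSwingNames(lines)) {
            patterns.add(Pattern.compile("\\b" + Pattern.quote(name) + "\\b"));
        }
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            for (Pattern p : patterns) {
                if (p.matcher(line).find()) {
                    result.add(line);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "import javax.swing.JButton;",
                "import java.util.List;",
                "JButton ok = new JButton(\"OK\");",
                "int total = 0;");
        for (String line : swingLines(source)) {
            System.out.println(line);
        }
    }
}
```

Matching on names alone is a heuristic: a line that uses a variable declared as a Swing type without repeating the type name would be missed, so real use would likely need a parser rather than regular expressions.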
Top Expert 2016

Commented:
:)
