mmccy
asked on
2D Array of cleaning and normalization !
306,362,348,265,114,88,110,90,110,91,103,81,908,475,581,4248,416936,1
..............
320,378,361,298,113,89,109,90,113,93,105,90,926,469,590,4478,426774,1
346,387,349,278,92,80,95,87,86,78,78,70,868,428,538,4279,365147,2
1) I have a problem with data cleaning and normalization. I need to read the text file (in the above format) into a 2D ArrayList. The last column indicates the userid and the first 16 values of each row are the user's sample; treating the last column as a userid for indication in the 2D array is OK!
2) I need to compare the data in each column with the median of that column (16 columns in total for the calculations, excluding the userid column, which is still used for indication only). If a cell contains a value > median x 3, then the whole ROW of the data sample will be deleted (cleaning).
3) After that I need to normalize the data in the cleaned ArrayList, using min-max matching scores, that is
normalized data(each cell)=[s - min(S)]/[max(S)-min(S)]
where s is the original data, min(S) is the minimum value of each COLUMN and max(S) is the maximum value of each COLUMN!!
4) Finally, once the cleaned and normalized ArrayList has been created, print it out to a text file again!!
Any ideas for the above? Some sample code is preferred!
Thank you very much !
2) It's definitely the median for the *column* i.e. reading vertically?
ASKER
yes !!
ASKER
By the way, have you heard of 10-fold cross-validation and Euclidean distance?
I saw your previous posts.
What is the structure you use?
ArrayList, [][], or other?
Giant.
If you use an ArrayList you can delete a row directly with the remove method of ArrayList, without worrying about compacting the list.
If you use [][] you must use System.arraycopy to copy the remaining rows when deleting a particular row.
Giant.
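To make the difference concrete, here is a minimal sketch of both options. The class and method names (RowRemovalSketch, removeRow) are made up for illustration, and it assumes a Java version with generics rather than the 1.4 style used elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class RowRemovalSketch {
    // Returns a new array without the row at index 'row' (hypothetical helper).
    static float[][] removeRow(float[][] data, int row) {
        float[][] result = new float[data.length - 1][];
        // Copy the rows before the removed one, then the rows after it.
        System.arraycopy(data, 0, result, 0, row);
        System.arraycopy(data, row + 1, result, row, data.length - row - 1);
        return result;
    }

    public static void main(String[] args) {
        float[][] grid = { {1f, 2f}, {3f, 4f}, {5f, 6f} };
        float[][] smaller = removeRow(grid, 1);
        System.out.println(smaller.length); // 2

        List<float[]> rows = new ArrayList<>();
        rows.add(new float[] {1f, 2f});
        rows.add(new float[] {3f, 4f});
        rows.remove(0);                     // the list closes the gap itself
        System.out.println(rows.size());    // 1
    }
}
```

With the array version every deletion allocates a new outer array, which is one reason the list is the more convenient choice when many rows may be removed.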
ASKER
Basically it doesn't matter; whichever is easier for me to understand is OK!!
Now I know how to do it with an ArrayList of ArrayList (or an ArrayList of int[])! Can I use array[][]?
ASKER
oic !
I prefer an ArrayList of int[] then, because I need to do other calculations after the normalization !!
Sure, but it's a little more difficult because you must write some methods that are already implemented in ArrayList (for example).
This is the Euclidean Distance definition:
http://www.nist.gov/dads/HTML/euclidndstnc.html
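As a rough illustration of that definition, the distance between two rows could be computed as below. This is only a sketch: the name EuclideanSketch and the featureCount parameter (used to leave out a trailing userid column) are assumptions, not something from the posts.

```java
final class EuclideanSketch {
    // Distance between two rows over the first 'featureCount' columns,
    // so a trailing userid column can be left out (an assumption about the layout).
    static double distance(float[] a, float[] b, int featureCount) {
        double sum = 0.0;
        for (int i = 0; i < featureCount; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        float[] r1 = {0f, 0f, 7f};
        float[] r2 = {3f, 4f, 9f};
        System.out.println(distance(r1, r2, 2)); // 5.0
    }
}
```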
ASKER
An ArrayList of int[] makes it easier for me to calculate the distance between two different rows !!
>Arraylist of int[] makes me easier to calculate the distance between two different rows !!
yes.
ASKER
yes ED !! this is what I will finally do with the normalised arraylist !! I understand the concept but I need to put it into programming in a very short time !!
public ArrayList normalize(ArrayList original,int[] averages){
    int i=0;
    while (original.size()>i){
        int[] el=(int[])original.get(i);
        boolean remove=false;
        for (int k=0;k<el.length;k++){
            if (el[k]>(averages[k]*3)){remove=true; break;}
        }//end for k
        if (remove){original.remove(i);}
        else {i++;}
    }//end while
    return original;
}
What did you mean by "S" and "s"?
This reads the file and inserts the data into an ArrayList of int[]:
public ArrayList readFile(String fileName) {
    ArrayList ret = new ArrayList();
    try {
        RandomAccessFile fos = new RandomAccessFile(fileName, "r");
        fos.seek(0);
        String line = fos.readLine();
        while (line != null) {
            StringTokenizer t = new StringTokenizer(line, ",");
            ArrayList lineArray = new ArrayList();
            while (t.hasMoreTokens()) {
                lineArray.add(t.nextToken());
            }
            line = fos.readLine();
            int[] lineInt=new int[lineArray.size()];
            for (int i=0;i<lineArray.size();i++){
                lineInt[i]=Integer.parseInt(lineArray.get(i).toString());
            }
            ret.add(lineArray);
        }
        fos.close();
    } catch (IOException ex) {
        System.out.println(ex);
    }
    return ret;
}
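Note that the method above adds lineArray (the list of string tokens) to ret, not the parsed lineInt array, so the returned list holds ArrayLists of strings. For comparison, a minimal sketch that stores the parsed numbers could look like this. It assumes a Java version with generics and try-with-resources, and the names ReadRowsSketch, parseLine and readRows are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ReadRowsSketch {
    // Parses one comma-separated line into a float[]; tokens are trimmed first.
    static float[] parseLine(String line) {
        String[] tokens = line.split(",");
        float[] row = new float[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            row[i] = Float.parseFloat(tokens[i].trim());
        }
        return row;
    }

    // Reads every non-blank line of the file into a list of float[] rows.
    static List<float[]> readRows(String fileName) throws IOException {
        List<float[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) continue;
                rows.add(parseLine(line)); // store the parsed array, not the tokens
            }
        }
        return rows;
    }
}
```

Keeping the userid as the last element of each float[] matches the "for indication only" layout described in the question; the later steps then just have to skip that last column.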
ASKER
s means the original data (before normalization)
min(S) means the minimum value of one column and max(S) means the maximum value of one column! A column is formed by index i of each int[i]! If I have 1000 rows then I will have 1000 values in one column; find the minimum and maximum of the column and normalize the data in each cell!!
Something like this to write:
public void writeFile(String fileName,ArrayList original){
    try {
        RandomAccessFile fos = new RandomAccessFile(fileName, "rw");
        fos.seek(0);
        for (int i=0;i<original.size();i++) {
            String line=original.get(i).toString()+"\n";//or other thing you want to write
            fos.write(line.getBytes());
        }
        fos.close();
    } catch (IOException ex) {
        System.out.println(ex);
    }
}
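One thing to watch with the toString() call: if the list elements are float[] arrays, toString() gives an identity string such as "[F@..." rather than the values, so each row has to be joined by hand. A minimal sketch under that assumption follows; WriteRowsSketch, formatRow and writeRows are hypothetical names, and the three-decimal format is just a choice for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;
import java.util.Locale;

final class WriteRowsSketch {
    // Joins one row into a comma-separated line with three decimals per value.
    static String formatRow(float[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(String.format(Locale.ROOT, "%.3f", row[i]));
        }
        return sb.toString();
    }

    // Writes every row on its own line; an existing file is overwritten.
    static void writeRows(String fileName, List<float[]> rows) throws IOException {
        try (PrintWriter out = new PrintWriter(fileName)) {
            for (float[] row : rows) {
                out.println(formatRow(row));
            }
        }
    }
}
```

For example, formatRow(new float[] {0.5f, 1f}) gives the line 0.500,1.000.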
ASKER
In my program I am trying to use float[], because decimal values are calculated by the normalization !!
This gets the minimum array and the maximum array:
public float[] maxims(ArrayList list){
    float[] maxs= new float[((float[])list.get(0)).length];
    for (int i = 0; i < list.size(); i++) {
        float[] a = (float[]) list.get(i);
        for (int j = 0; j < a.length; j++) {
            if (maxs[j]<a[j])maxs[j]=a[j];
        }
    }
    return maxs;
}
public float[] minims(ArrayList list){
    float[] mins= new float[((float[])list.get(0)).length];
    for (int i = 0; i < list.size(); i++) {
        float[] a = (float[]) list.get(i);
        for (int j = 0; j < a.length; j++) {
            if (mins[j]>a[j])mins[j]=a[j];
        }
    }
    return mins;
}
Hope this helps you.
Bye, Giant.
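A caveat on the code above: a new float[] starts out filled with zeros, so minims can only ever move below 0 and, for all-positive data like the sample rows, would return 0 for every column. A minimal sketch that seeds the arrays with infinities instead is shown here; the class and method names are hypothetical and it assumes generics are available.

```java
import java.util.Arrays;
import java.util.List;

final class ColumnRangeSketch {
    // Smallest value found in each column.
    static float[] columnMins(List<float[]> rows) {
        float[] mins = new float[rows.get(0).length];
        Arrays.fill(mins, Float.POSITIVE_INFINITY); // so the first value always wins
        for (float[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                if (row[j] < mins[j]) mins[j] = row[j];
            }
        }
        return mins;
    }

    // Largest value found in each column.
    static float[] columnMaxs(List<float[]> rows) {
        float[] maxs = new float[rows.get(0).length];
        Arrays.fill(maxs, Float.NEGATIVE_INFINITY);
        for (float[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                if (row[j] > maxs[j]) maxs[j] = row[j];
            }
        }
        return maxs;
    }
}
```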
ASKER
public static void main( String args[] )
{
    readfile_test1 process = new readfile_test1();
    String file="testFile";
    ArrayList uncleanedlist;
    try
    {
        uncleanedlist = readfile_test1.readFile(file);
        ///how can I pass the uncleanedlist into the normalise with the float[] ??
    }
    catch ( FileNotFoundException e )
    {
        e.printStackTrace( );
    }
}
public static void main( String args[] ){
    FileReadAndTokenize process=new FileReadAndTokenize();
    String file="testFile";
    ArrayList uncleanedlist=new ArrayList();
    try
    {
        uncleanedlist = process.readFile(file);
        ///how can I pass the uncleanedlist into the normalise with the float[] ??
        float[] averages=process.average(uncleanedlist);
        ArrayList cleanedlist=process.normalize(uncleanedlist,averages);
        float[] maxims=process.maxims(cleanedlist);
        float[] minims=process.minims(cleanedlist);
        ArrayList normalizedlist=process.normalize2(cleanedlist,maxims,minims);
        process.writeFile("outFile",normalizedlist);
    }
    catch ( FileNotFoundException e )
    {
        e.printStackTrace( );
    }
}
I called the class FileReadAndTokenize, so in main I create an instance of it (named process) with:
FileReadAndTokenize process = new FileReadAndTokenize();
Hope this helps you.
Giant.
ASKER
oh !! yes !! I understand !!
But there seems to be no method called average there?
ASKER
Also, there seems to be no need to use the average !!
public float[] average(ArrayList list) {
    float[] tot = new float[((float[]) list.get(0)).length];
    float[] averages = new float[tot.length];
    for (int i = 0; i < list.size(); i++) {
        float[] a = (float[]) list.get(i);
        for (int j = 0; j < a.length; j++) {
            tot[j] += a[j];
        }
    }
    System.out.println("AVERAGES");
    int numRows = list.size();
    for (int i = 0; i < tot.length; i++) {
        averages[i] = tot[i] / numRows;
        System.out.println("column " + i + " average=" + averages[i]);
    }
    return averages;
}
ASKER
oh !! this is the first time when I try to learn how to read data into 2DArray !! it is not related !!!
Sorry about that !!! basically I need to find the median of the column instead of the mean of the column !!!
>median of the column instead of the mean of the column !!!
??
ASKER
something like
static boolean delete( float[] listOfIntegers )
{
boolean delete = false;
Arrays.sort( listOfIntegers );
float median = listOfIntegers[ listOfIntegers.length / 2 ];
float targetValue = median * 3;
for ( int i = 0; ( i < listOfIntegers.length ) && ( !delete ); i++ )
{
float testValue = listOfIntegers[ i ];
delete = testValue > targetValue;
System.out.println( "test=" + testValue + ", target=" + targetValue );
}
return delete;
}
ASKER
I think this one is finding the median of one row !! am I right ?
ASKER
I want to find the median of the column and once the data(in a cell) is greater than median x3 , delete the whole ROW !!
ah! Ok.
public float[] medium(ArrayList list) {
    float[] medium= new float[((float[]) list.get(0)).length];
    int mediumPosition=list.size()/2;
    for (int j = 0; j < a.length; j++) {
        medium[j]=((float[])list.get(mediumPosition))[j];
    }
    return medium;
}
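The method above returns the values of the row at position list.size()/2, that is, the middle row of the file, which is not the same as the median of each column unless the rows happen to be sorted. A minimal sketch of a per-column median plus the "delete the row if any cell > median x 3" rule might look like the following. The names are hypothetical, generics are assumed, featureCount is an assumed parameter for skipping the trailing userid column, and for an even number of rows it takes the upper of the two middle values (the same length / 2 choice as in the asker's own snippet).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class MedianCleaningSketch {
    // Median of each of the first 'featureCount' columns (upper middle for even counts).
    static float[] columnMedians(List<float[]> rows, int featureCount) {
        float[] medians = new float[featureCount];
        float[] column = new float[rows.size()];
        for (int j = 0; j < featureCount; j++) {
            for (int i = 0; i < rows.size(); i++) {
                column[i] = rows.get(i)[j]; // gather column j
            }
            Arrays.sort(column);
            medians[j] = column[column.length / 2];
        }
        return medians;
    }

    // Removes every row that has a feature value greater than 3 x the column median.
    static void removeOutlierRows(List<float[]> rows, float[] medians) {
        for (Iterator<float[]> it = rows.iterator(); it.hasNext();) {
            float[] row = it.next();
            for (int j = 0; j < medians.length; j++) {
                if (row[j] > medians[j] * 3) {
                    it.remove();
                    break;
                }
            }
        }
    }
}
```

The medians are computed once, before any row is removed, so the threshold does not shift while rows are being deleted.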
And this is the main:
public static void main( String args[] ){
    FileReadAndTokenize process=new FileReadAndTokenize();
    String file="testFile";
    ArrayList uncleanedlist=new ArrayList();
    try
    {
        uncleanedlist = process.readFile(file);
        ///how can I pass the uncleanedlist into the normalise with the float[] ??
        float[] medium=process.medium(uncleanedlist);
        ArrayList cleanedlist=process.normalize(uncleanedlist,medium);
        float[] maxims=process.maxims(cleanedlist);
        float[] minims=process.minims(cleanedlist);
        ArrayList normalizedlist=process.normalize2(cleanedlist,maxims,minims);
        process.writeFile("outFile",normalizedlist);
    }
    catch ( FileNotFoundException e )
    {
        e.printStackTrace( );
    }
}
Hope this is what you are looking for.
Bye, Giant.
ASKER
public float[] medium(ArrayList list) {
    float[] medium= new float[((float[]) list.get(0)).length];
    int mediumPosition=list.size()/2;
    for (int j = 0; j < a.length; j++) {
        medium[j]=((float[])list.get(mediumPosition))[j];
    }
    return medium;
}
is it a = list ?
ASKER
oh !! a should be medium ! right ?
ASKER
null
press any key to continue............
such error after I run the program , any idea ?
Can you show us what output comes from running the answer on those first three rows of data?
...meaning the normalization process
Thanks for accepting.
>null
>press any key to continue............
>such error after I run the program
What do you use to run the program?
ASKER
I am using java 1.4.2_02
I run it by typing java classify in the jdk\bin\ (I put the class file and the text file in it)
classify is the class name !!
Are there any exceptions?
ASKER
java.lang.ClassCastException
at classify.medium(classify.java:50)
at classify.main(classify.java:115)
Press any key to continue...
Look at line 50 of the class classify.
What is that line? (Post it, please.)
ASKER
float[] medium= new float[((float[]) list.get(0)).length];
Try replacing it with these lines:
System.out.println(list.get(0));
float[] medium= new float[((float[]) list.get(0)).length];
and tell me what it displays (I believe the error will be at line 51).
ASKER
[321, 371, 361, 305, 112, 88, 109, 89, 115, 97, 101, 91, 922, 468, 586, 4333, 415047, 1]
java.lang.ClassCastException
at classify.medium(classify.java:53)
at classify.main(classify.java:118)
Press any key to continue...
What is the list object?
I believe the list object is an ArrayList of float[].
From the last post I understood it was an ArrayList of int[], isn't it?
ASKER
yes Arraylist of float[] !
because I need to calculate decimal numbers !
try this:
public float[] medium(ArrayList list) {
    System.out.println((list.get(0)).getClass().getName());
    float[] medium= new float[((float[]) list.get(0)).length];
    int mediumPosition=list.size()/2;
    for (int j = 0; j < medium.length; j++) {
        medium[j]=((float[])list.get(mediumPosition))[j];
    }
    return medium;
}
ASKER
java.util.ArrayList
java.lang.ClassCastException
at classify.medium(classify.java:62)
at classify.main(classify.java:128)
Press any key to continue...
So you have an ArrayList of ArrayList. Is that so?
public float[] medium(ArrayList list) {
    float[] medium= new float[((ArrayList) list.get(0)).size()];
    int mediumPosition=list.size()/2;
    for (int j = 0; j < medium.length; j++) {
        medium[j]=(float)(((ArrayList)list.get(mediumPosition)).get(j));
    }
    return medium;
}
ASKER
inconvertible types error after compile at
medium[j]=(float)(((ArrayList)list.get(mediumPosition)).get(j));
?????
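The inner lists here hold the string tokens produced by StringTokenizer, so each element has to be parsed into a number rather than cast to float. A small sketch of that conversion follows; TokenToFloatSketch and toFloat are hypothetical names, and generics are assumed.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenToFloatSketch {
    // Converts a stored token (any Object whose text is a number) to a float.
    static float toFloat(Object token) {
        return Float.parseFloat(token.toString().trim());
    }

    public static void main(String[] args) {
        List<Object> line = new ArrayList<>();
        line.add("321");
        line.add("4.5");
        System.out.println(toFloat(line.get(0)) + toFloat(line.get(1))); // 325.5
    }
}
```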
Could you post the method you use to read the data from the file?
ASKER
public ArrayList readFile(String fileName) {//read in the text file into the Arraylist of float[]
    ArrayList ret = new ArrayList();
    try {
        RandomAccessFile fos = new RandomAccessFile(fileName, "r");
        fos.seek(0);
        String line = fos.readLine();
        while (line != null) {
            StringTokenizer t = new StringTokenizer(line, ",");
            ArrayList lineArray = new ArrayList();
            while (t.hasMoreTokens()) {
                lineArray.add(t.nextToken());
            }
            line = fos.readLine();
            float[] lineInt=new float[lineArray.size()];
            for (int i=0;i<lineArray.size();i++){
                lineInt[i]=Float.parseFloat(lineArray.get(i).toString());
            }
            ret.add(lineArray);
        }
        fos.close();
    } catch (IOException ex) {
        System.out.println(ex);
    }
    return ret;
}
public static void main( String args[] ){
    classify process=new classify();
    ArrayList uncleanedlist=new ArrayList();
    try
    {
        uncleanedlist = process.readFile("testFile");
        float[] medium2=process.medium(uncleanedlist);
        ArrayList cleanedlist=process.cleaning(uncleanedlist,medium2);
        float[] maxims=process.maxims(cleanedlist);
        float[] minims=process.minims(cleanedlist);
        ArrayList normalizedlist=process.normalize(cleanedlist,maxims,minims);
        process.writeFile("outFile",normalizedlist);
        System.out.print("total line read after normalizedlist is " + normalizedlist.size());
    }
    catch ( Exception e )
    {
        e.printStackTrace( );
    }
}
These are the correct methods:
public float[] medium(ArrayList list) {
    //System.out.println((list.get(0)).getClass().getName());
    float[] medium = new float[((ArrayList)list.get(0)).size()];
    int mediumPosition = list.size() / 2;
    for (int j = 0; j < medium.length; j++) {
        medium[j] = Float.valueOf((((ArrayList)list.get(mediumPosition)).get(j)).toString()).floatValue();
    }
    return medium;
}
public float[] maxims(ArrayList list) {
    float[] maxs = new float[((ArrayList)list.get(0)).size()];
    for (int i = 0; i < list.size(); i++) {
        ArrayList a = (ArrayList) list.get(i);
        for (int j = 0; j < a.size(); j++) {
            if (maxs[j] < Float.valueOf((String)a.get(j)).floatValue())
                maxs[j] = Float.valueOf((String)a.get(j)).floatValue();
        }
    }
    return maxs;
}
public float[] minims(ArrayList list) {
    float[] mins = new float[((ArrayList)list.get(0)).size()];
    for (int i = 0; i < list.size(); i++) {
        ArrayList a = (ArrayList) list.get(i);
        for (int j = 0; j < a.size(); j++) {
            if (mins[j] > Float.valueOf((String)a.get(j)).floatValue())
                mins[j] = Float.valueOf((String)a.get(j)).floatValue();
        }
    }
    return mins;
}
public ArrayList readFile(String fileName) throws IOException {
    ArrayList ret = new ArrayList();
    RandomAccessFile fos = new RandomAccessFile(fileName, "r");
    fos.seek(0);
    String line = fos.readLine();
    while (line != null) {
        StringTokenizer t = new StringTokenizer(line, ",");
        ArrayList lineArray = new ArrayList();
        while (t.hasMoreTokens()) {
            lineArray.add(t.nextToken());
        }
        line = fos.readLine();
        float[] lineInt = new float[lineArray.size()];
        for (int i = 0; i < lineArray.size(); i++) {
            lineInt[i] = Float.parseFloat(lineArray.get(i).toString());
        }
        ret.add(lineArray);
    }
    fos.close();
    return ret;
}
public void writeFile(String fileName, ArrayList original) throws IOException {
    RandomAccessFile fos = new RandomAccessFile(fileName, "rw");
    fos.seek(0);
    for (int i = 0; i < original.size(); i++) {
        String line = original.get(i).toString() + "\n"; //or other thing you want to write
        fos.write(line.getBytes());
    }
    fos.close();
}
public ArrayList normalize(ArrayList original, float[] averages) {
    int i = 0;
    while (original.size() > i) {
        ArrayList el = (ArrayList) original.get(i);
        boolean remove = false;
        for (int k = 0; k < el.size(); k++) {
            if (Float.valueOf((String)el.get(k)).floatValue() > (averages[k] * 3)) {
                remove = true;
                break;
            }
        } //end for k
        if (remove) {
            original.remove(i);
        } else {
            i++;
        }
    } //end while
    return original;
}
public ArrayList normalize2(ArrayList original, float[] maxs, float[] mins) {
    for (int i = 0; i < original.size(); i++) {
        ArrayList el = (ArrayList) original.get(i);
        for (int k = 0; k < maxs.length; k++) {
            el.set(k, String.valueOf((Float.valueOf((String)el.get(k)).floatValue() - mins[k]) / (maxs[k] - mins[k])));
        } //end for k
    } //end for i
    return original;
}
public ArrayList[] divide(ArrayList original, int numOfSubset) {
    ArrayList[] ret = new ArrayList[numOfSubset];
    int subset = 0;
    int nrlen = original.size() / numOfSubset;
    int pos = 0;
    while (subset < numOfSubset && pos < original.size()) {
        ret[subset].add(original.get(pos));
        pos++;
        if (pos == nrlen)
            subset++;
    }
    if (pos < original.size()) {
        for (int i = pos; i < original.size(); i++)
            ret[numOfSubset - 1].add(original.get(i));
    }
    return ret;
}
Tell me if all is OK now.
Giant.
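One note on divide as posted: new ArrayList[numOfSubset] creates an array whose elements are all null, so ret[subset].add(...) would throw a NullPointerException until each slot is given its own list. Below is a minimal sketch of a split for k-fold cross-validation that creates the lists first. It deals the rows out round-robin, which is a different scheme from the contiguous blocks attempted above; the names are hypothetical and generics are assumed.

```java
import java.util.ArrayList;
import java.util.List;

final class FoldSplitSketch {
    // Splits the rows into numOfSubsets lists, dealing rows out round-robin.
    static List<List<float[]>> divide(List<float[]> rows, int numOfSubsets) {
        List<List<float[]>> folds = new ArrayList<>();
        for (int f = 0; f < numOfSubsets; f++) {
            folds.add(new ArrayList<float[]>()); // every fold gets a real list
        }
        for (int i = 0; i < rows.size(); i++) {
            folds.get(i % numOfSubsets).add(rows.get(i));
        }
        return folds;
    }
}
```

With round-robin dealing the fold sizes differ by at most one row, so no separate step is needed for the leftover rows. If the file is grouped by userid, shuffling the rows first may be worth considering.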
ASKER
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.RangeCheck(ArrayList.java:507)
at java.util.ArrayList.get(ArrayList.java:324)
at classify1.normalize(classify1.java:97)
at classify1.main(classify1.java:133)
Press any key to continue...
What is line 97 of the class classify1?
ASKER
el.set(k, String.valueOf((Float.valueOf((String)el.get(k)).floatValue() - mins[k]) / (maxs[k] - mins[k])));
If I remember well, the correct code is:
Try:
for (int k = 0; k < el.size(); k++) {
    el.set(k, String.valueOf((Float.valueOf((String)el.get(k)).floatValue() - mins[k]) / (maxs[k] - mins[k])));
} //end for k
Tell me if it's OK.
ASKER
yes it is !! I changed normalize2 to normalize and normalize to cleaning !!
public ArrayList normalize(ArrayList original, float[] maxs, float[] mins) {
    for (int i = 0; i < original.size(); i++) {
        ArrayList el = (ArrayList) original.get(i);
        for (int k = 0; k < maxs.length; k++) {
            el.set(k, String.valueOf((Float.valueOf((String)el.get(k)).floatValue() - mins[k]) / (maxs[k] - mins[k])));
        } //end for k
    } //end for i
    return original;
}
ok.
Try what I posted:
public ArrayList normalize(ArrayList original, float[] maxs, float[] mins) {
    for (int i = 0; i < original.size(); i++) {
        ArrayList el = (ArrayList) original.get(i);
        for (int k = 0; k < el.length; k++) {//here I changed
            el.set(k, String.valueOf((Float.valueOf((String)el.get(k)).floatValue() - mins[k]) / (maxs[k] - mins[k])));
        } //end for k
    } //end for i
    return original;
}
ASKER
for (int k = 0; k < el.length; k++)
incompatible type variable length !
is it el.size() ?
>is it el.size()
Yes, it is.
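For reference, here is the same min-max step written against rows stored as float[] instead of lists of strings. It is only a sketch under assumptions: the names are hypothetical, generics are assumed, featureCount is an assumed parameter so that a trailing userid column is left untouched, and a column whose max equals its min is mapped to 0 to avoid dividing by zero (a choice the thread does not discuss).

```java
import java.util.List;

final class MinMaxSketch {
    // Scales the first 'featureCount' values of every row to [0, 1] in place.
    static void normalize(List<float[]> rows, float[] mins, float[] maxs, int featureCount) {
        for (float[] row : rows) {
            for (int k = 0; k < featureCount; k++) {
                float range = maxs[k] - mins[k];
                // A constant column has range 0; map it to 0 instead of dividing by zero.
                row[k] = (range == 0f) ? 0f : (row[k] - mins[k]) / range;
            }
        }
    }
}
```

For two rows {10, 5, 1} and {20, 5, 2} with mins {10, 5}, maxs {20, 5} and featureCount 2, the rows become {0, 0, 1} and {1, 0, 2}.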
0.000,0.000,0.000,0.000,1.000,0.889,1.000,1.000,0.889,0.867,0.926,0.550,0.690,1.000,0.827,0.000,0.840
0.350,0.640,1.000,1.000,0.955,1.000,0.933,1.000,1.000,1.000,1.000,1.000,1.000,0.872,1.000,1.000,1.000
1.000,1.000,0.077,0.394,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.135,0.000
is what I get for those three rows of normalized data to 3 dp