csv2arff's Issues

Spaces in column names in the att file produce an invalid ARFF file; missing values are not written to the ARFF file; patch posted here

Index: csv2arff.py
===================================================================
--- csv2arff.py (revision 3)
+++ csv2arff.py (working copy)
@@ -13,7 +13,7 @@
     dom1 = minidom.parse(file_xml)
     for node in dom1.getElementsByTagName('attribute'):
         out.append({
-            'name': node.getAttribute('name') ,
+            'name': node.getAttribute('name').replace(' ','_') ,
             'atype': node.getAttribute('atype'),
             'format':node.getAttribute('format'),
             'skip':node.getAttribute('skip')
@@ -31,7 +31,8 @@
         delimiter=node.getAttribute('delimiter');
     if(len(delimiter)==0):
         delimiter=';';
-    print delimiter   
+    out = out.replace(' ', '_')
+    print out, delimiter   
     return out, delimiter


@@ -47,8 +48,8 @@
         classes = []

         #read attribute
-        self.relation_name, self.delimiter = get_relation(attribute_file)
-        attributes_list = get_attributes(attribute_file)
+        self.relation_name, self.delimiter = get_relation(self.attribute_file)
+        attributes_list = get_attributes(self.attribute_file)
         arff_data = '@RELATION ' + self.relation_name + '\n\n'


@@ -66,9 +67,10 @@


         arff_data += '\n@DATA\n'
-        print classes 
+        print classes
+        print self.delimiter
         #open csv
-        reader = csv.reader(open(self.csv_file), delimiter=self.delimiter, quoting=csv.QUOTE_NONE)
+        reader = csv.reader(open(self.csv_file)) #, delimiter=self.delimiter, quoting=csv.QUOTE_NONE)

         rnum = 0     

@@ -77,16 +79,19 @@
             #print row
             buff = ''
             pos = 0
+            num_written = 0
             #print len(row)
             #occhio alla lunghezza riga
             for j in range(0, len(row)-1):
                 field = row[j]

                 if(attributes_list[pos]['skip'] != 'yes'):
-                
-                    if (pos > 0):
+                    
+                    if (num_written > 0):
                         buff += ','
-                    if(attributes_list[pos]['atype'] == 'string'):
+                    if (len(field) == 0):
+                        field = '?'
+                    if(attributes_list[pos]['atype'] == 'string') and not(field == '?'):
                         field = "'" + field + "'"
                     buff += field
                     #se è una classe raccolgo i valori
@@ -95,7 +100,7 @@
                             classes[pos]+= ','+ field
                         else:
                             classes[pos]+=  field
-                        
+                    num_written += 1   
                 pos += 1
             buff += '\n'
             arff_data += buff
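
The row-handling part of the patch above can be read in isolation as three rules: skipped columns must not affect comma placement, empty fields become the ARFF missing-value marker `?`, and only non-missing string fields are quoted. The following is a minimal Python 3 sketch of those rules, not the patched script itself (which is Python 2). The names `sanitize_name` and `format_row` are hypothetical; the `'atype'` and `'skip'` keys mirror the attribute dicts shown in the patch. Unlike the original loop, which iterates over `range(0, len(row)-1)`, this sketch processes every column that has a matching attribute, and like the patch it does not escape quote characters inside string values.

```python
def sanitize_name(name):
    """Replace spaces with underscores, as the patch does for attribute
    and relation names, so the name is a single unquoted ARFF token."""
    return name.replace(' ', '_')


def format_row(row, attributes):
    """Build one @DATA line from a list of CSV fields.

    `attributes` is assumed to be a list of dicts with 'atype' and 'skip'
    keys, one per CSV column, like those built in the patch.
    """
    fields = []
    for field, attr in zip(row, attributes):
        if attr['skip'] == 'yes':
            continue  # skipped columns contribute neither a value nor a comma
        if len(field) == 0:
            field = '?'  # ARFF marker for a missing value, never quoted
        elif attr['atype'] == 'string':
            field = "'" + field + "'"
        fields.append(field)
    # Joining only the written fields plays the role of the patch's
    # num_written counter: no leading comma when the first column is skipped.
    return ','.join(fields)
```

For example, with a skipped first column, `format_row(['id7', 'a b', ''], attrs)` where `attrs` marks the columns as skipped, string, and numeric returns `'a b',?` with no leading comma.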

Original issue reported on code.google.com by [email protected] on 17 May 2009 at 7:44
