This workflow is related to my need to ingest ~40 csv files per day;
Ah, well, you didn't say it wasn't a one-time thing. :-) You can do that in 9 with a script as well, just not with the default .csv dataport. Read the file in as text, manipulate it, write it out, and then import it. Or, since it is just text anyway, go ahead and create a table from it when you bring it in the first time. Use whatever scripting language you want. I prefer V8, but hey, if Python is your thing, no problem...

It's always tricky when somebody breaks a format in some weird way. In this case, as far as I can see from the example you've provided, it is not even just a matter of being able to specify a custom end-of-record character: your data uses CR LF as the end of record, but an LF on its own is not one.

I'd like to see the CSV dataport extended to allow specifying a custom end-of-record character. You could then specify CR. You'd still have to clean up the extra LF characters that turn up, to the tune of 40 csv files a day, but then if your field personnel are damaging data here and there you'll have some extra work in any case. While we're on the topic, I'd also like to see the "CSV" dataport extended to handle fixed-width fields and fixed-length records, so that you could extract data even without any field or record separators.
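For what it's worth, here is a rough Python sketch of the "read it as text, fix it, write it out" step. It assumes the only damage is stray LF characters inside records, that every real record ends in CR LF, and that swapping a stray LF for a space is acceptable for your data. The function names and the folder layout are made up for illustration; nothing here is a Manifold API.

```python
import re
from pathlib import Path

# A bare LF is an LF that is NOT preceded by CR. CR LF pairs are left alone,
# so they keep working as the end-of-record marker.
BARE_LF = re.compile(rb"(?<!\r)\n")

def clean_records(raw: bytes) -> bytes:
    """Replace stray LF characters with a space; keep CR LF record ends."""
    return BARE_LF.sub(b" ", raw)

def clean_folder(src_dir: str, dst_dir: str) -> int:
    """Clean every .csv in src_dir into dst_dir (hypothetical folder names)."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(Path(src_dir).glob("*.csv")):
        # Work in bytes so Python does not translate line endings on its own.
        (out / src.name).write_bytes(clean_records(src.read_bytes()))
        count += 1
    return count
```

Run `clean_folder` over the day's files and then point the regular CSV import at the cleaned copies. If a stray LF should be dropped rather than turned into a space, change the replacement to `b""`.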