scsijon
Joined: 23 May 2007 Posts: 920 Location: the australian mallee
|
Posted: Wed 13 Feb 2013, 00:01 Post subject:
[closed] Do you already have an XML to comma-delimited script? |
|
before I start to try to build it! I am NOT a good base coder, although I can adapt existing code fairly well.
Does anyone already have a script that can take an XML file and turn it into a comma-delimited file? I'm starting on an opensuse2ppm script, similar to barryk's mageia2ppm, and would like to start with an automated step1, as there are 19098 packages in the openSUSE set at the moment (up from 18880 last month). I'm actually thinking of stripping out lines we don't need as a step0, as the source file is some 76 MB and that should make the rest quicker to process.
The source XML is in this format, if anyone's interested:
| Code: |
<package type="rpm">
  <name>844-ksc-pcf</name>
  <arch>noarch</arch>
  <version epoch="0" ver="19990207" rel="784.1.1"/>
  <checksum type="sha256" pkgid="YES">ec26988a001df41bd1752aeb035608edbf1ef5ec646569d63a7d938228a6ff4d</checksum>
  <summary>Korean 8x4x4 Johab Fonts</summary>
  <description>Korean 8x4x4 johab fonts.</description>
  <packager>http://bugs.opensuse.org</packager>
  <url>http://www.debian.or.kr/~cwryu/archive/fonttools/</url>
  <time file="1319310585" build="1319310562"/>
  <size package="2592518" installed="4382509" archive="4403032"/>
  <location href="noarch/844-ksc-pcf-19990207-784.1.1.noarch.rpm"/>
  <format>
    <rpm:license>Public Domain, Freeware</rpm:license>
    <rpm:vendor>openSUSE</rpm:vendor>
    <rpm:group>System/X11/Fonts</rpm:group>
    <rpm:buildhost>build25</rpm:buildhost>
    <rpm:sourcerpm>844-ksc-pcf-19990207-784.1.1.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="872" end="39087"/>
    <rpm:provides>
      <rpm:entry name="locale(xorg-x11:ko)"/>
      <rpm:entry name="844-ksc-pcf" flags="EQ" epoch="0" ver="19990207" rel="784.1.1"/>
    </rpm:provides>
    <rpm:requires>
      <rpm:entry name="perl" pre="1"/>
      <rpm:entry name="/bin/sh"/>
      <rpm:entry name="aaa_base" pre="1"/>
      <rpm:entry name="/bin/sh" pre="1"/>
    </rpm:requires>
  </format>
</package>
|
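A minimal sketch of the step0 idea, assuming the file keeps one tag per line as in the sample above. The function name `strip_unneeded` and the list of tags to drop are only illustrative guesses; which lines ppm really doesn't need isn't settled here.

```shell
#!/bin/sh
# step0 sketch: drop every line whose opening tag is on a "not needed" list.
# ASSUMPTIONS: one tag per line; the tag list below is a guess, adjust to taste.
strip_unneeded() {
    grep -v -E '<(checksum|packager|url|time|rpm:vendor|rpm:buildhost|rpm:sourcerpm|rpm:header-range)[ >]' "$@"
}

# Usage (file names are hypothetical):
#   strip_unneeded packages.xml > packages-small.xml
```

Because this is a plain line filter, it never has to parse the XML, so it should stay quick even on a file of that size.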
regards
scsijon
Last edited by scsijon on Mon 18 Feb 2013, 23:30; edited 1 time in total
|
amigo
Joined: 02 Apr 2007 Posts: 1757
|
Posted: Wed 13 Feb 2013, 12:57 Post subject:
|
|
Yeah, before trying to build -parsing xml is 'heavy lifting'. A search would have gotten you lots of hits:
https://duckduckgo.com/?q=convert+xml+to+CSV
|
scsijon
Joined: 23 May 2007 Posts: 920 Location: the australian mallee
|
Posted: Wed 13 Feb 2013, 19:14 Post subject:
|
|
| amigo wrote: | Yeah, before trying to build -parsing xml is 'heavy lifting'. A search would have gotten you lots of hits:
https://duckduckgo.com/?q=convert+xml+to+CSV |
Thanks amigo. I had already done a search via SourceForge and looked at the results (all 52 pages of them). I found everything except something that actually did what was wanted; the few that said they did were for Windows or Mac, or required so many additional packages (for a puppyan) that it didn't make sense to use them. I shall try your link and see if I can do any better.
|
amigo
Joined: 02 Apr 2007 Posts: 1757
|
Posted: Thu 14 Feb 2013, 04:24 Post subject:
|
|
Using xslt would be the most obvious:
http://stackoverflow.com/questions/2516858/convert-xml-file-to-csv
Other standard XML tools are: XMLStarlet, xsltproc and perl xpath
This is interesting:
http://www.freesoftwaremagazine.com/articles/convert_xml_csv_ugly_way_unix_utilities_linux
http://stackoverflow.com/questions/893585/how-to-parse-xml-in-bash
|
musher0

Joined: 04 Jan 2009 Posts: 2193 Location: Gatineau (Qc), Canada
|
Posted: Thu 14 Feb 2013, 12:26 Post subject:
|
|
Hello, scsijon.
If you're into Java, you may want to try one of these:
http://www.wenzlaff.de/xmltocsv.html
or
http://code.google.com/p/xml2csv-conv/
Best regards.
musher0
_________________
"...the computer industry will need only very little time to bring humanity back to cave paintings." (M. Goebbel, Order of the Command Line; [my trans.])
|
seaside
Joined: 11 Apr 2007 Posts: 832
|
Posted: Thu 14 Feb 2013, 13:02 Post subject:
|
|
scsijon,
You may find the following XML utilities of interest. Below is a quote from the simple-icon-tray thread about a weather program made with xml-printf.
| Quote: | Since there are libraries for many languages to handle xml code parsing, I was wondering if any xml help existed for shell programs in Linux and ran across this -
http://xml-coreutils.sourceforge.net/
It's a set of utilities aimed at emulating the standard shell text tools like sed, tr, cat, printf, find, etc... but specifically for xml.
I compiled xml-printf which can be used to capture the content between tags and made a tray icon program for weather. |
Here's a download link for tweather.pet, which contains xml-printf.
http://murga-linux.com/puppy/viewtopic.php?mode=attach&id=63046
Regards,
s
|
technosaurus

Joined: 18 May 2008 Posts: 3843
|
Posted: Fri 15 Feb 2013, 02:52 Post subject:
|
|
xml is the most ridiculous format, I've no idea how it caught on.
That being said, if you are dealing with one tag per line, awk is pretty useful
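awk '
BEGIN{FS="<|>"}                      # split each line on < and >
/<name>/{name=...}                   # remember the current package name
/<arch>/{pkgs[name,"arch"]=...}      # plain name here: $name would be a field reference
/ var=/{pkgs[name,"var"]=...}        # name,"key" subscripts work in any awk
END{}
' file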
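For what it's worth, here is a filled-in sketch of the same one-tag-per-line awk idea. The function name `xml_to_csv`, the four columns chosen, and the quoting helper `q` are assumptions for illustration, not anything from the thread. XML entities such as `&amp;` are left undecoded, and it only works while each tag sits on its own line.

```shell
#!/bin/sh
# Sketch: one-tag-per-line package XML -> CSV rows of name,arch,version,summary.
# ASSUMPTIONS: each tag sits on its own line and every record ends with </package>.
xml_to_csv() {
    awk '
    # Quote a value for CSV: double any embedded quotes, then wrap in quotes.
    function q(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }

    BEGIN { FS = "[<>]" }                 # $2 is the tag, $3 is its text content

    /<package[ >]/ { name = arch = ver = summary = "" }   # new record: reset
    /<name>/       { name = $3 }
    /<arch>/       { arch = $3 }
    # The version number lives in the ver="..." attribute of the version tag.
    /<version /    { if (match($0, / ver="[^"]*"/)) { ver = substr($0, RSTART + 6, RLENGTH - 7) } }
    /<summary>/    { summary = $3 }
    /<\/package>/  { print q(name) "," q(arch) "," q(ver) "," q(summary) }
    ' "$@"
}
```

Run it as `xml_to_csv packages.xml > packages.csv` (both file names hypothetical). For the sample package in the first post it would print `"844-ksc-pcf","noarch","19990207","Korean 8x4x4 Johab Fonts"`. Matching on `<version ` rather than on ` ver=` keeps the `rpm:entry` lines, which carry their own `ver` attributes, from overwriting the package version.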
_________________ Puppy Web Desktop Now with pet packages - Pet Packaging 100 & 101
|
scsijon
Joined: 23 May 2007 Posts: 920 Location: the australian mallee
|
Posted: Fri 15 Feb 2013, 21:29 Post subject:
|
|
thank you all, and I agree technosaurus.
Unfortunately it's the best of two worlds for openSUSE's packages! The other requires three SQL files to be opened and integrated before extraction for ppm, and together they come to 300 MB+.
I think I have a lead from one of amigo's new links for something that will work easily. Thank you.
But I'm not marking this solved quite yet!
_________________ Mage2 in final Beta! http://www.murga-linux.com/puppy/viewtopic.php?t=72565
|
jamesbond
Joined: 26 Feb 2007 Posts: 1531 Location: The Blue Marble
|
Posted: Sun 17 Feb 2013, 07:12 Post subject:
|
|
I'm a little late to the game, but this is the tool that I use for getting stuff out of XML data: http://www.ofb.net/~egnor/xml2/. It converts XML into a flat file format that you can run grep, sed, and awk on.
_________________ Fatdog64, Slacko and Puppeee user. Puppy user since 2.13
|
scsijon
Joined: 23 May 2007 Posts: 920 Location: the australian mallee
|
Posted: Mon 18 Feb 2013, 23:32 Post subject:
|
|
thanks all,
I have changed methods and feel more than a bit of a fool after it was pointed out to me that all I need to do is modify the rpm2ppm script to match the opensuse format.
|