<?php

/*
(c) 2000 Hans Anderson Corporation. All Rights Reserved.
You are free to use and modify this class under the same
guidelines found in the PHP License.

-----------

bugs/me:
http://www.hansanderson.com/php/
me@hansanderson.com

-----------

Version 1.0

- 1.0 is the first actual release of the class. It's
  finally what I was hoping it would be, though there
  are likely to still be some bugs in it. This is
  a much changed version, and if you have downloaded
  a previous version, this WON'T work with your existing
  scripts! You'll need to make some SIMPLE changes.

- .92 fixed a bug that left out tag attributes

  (to use attributes, add _attributes[array_index]
  to the end of the tag in question:
  $xml_html_head_body_img would become
  $xml_html_head_body_img_attributes[0],
  for example)

  -- Thanks to Nick Winfield <nick@wirestation.co.uk>
  for reporting this bug.

- .91 No Longer requires PHP4!

- .91 now all elements are arrays. Using objects has
  been discontinued.

-----------

What class.xml.php is:

A very, very easy to use XML parser class. It uses PHP's XML functions
for you, filling one array that has all the tag information. The only
hard part is figuring out the syntax of the tags!

-----------

Sample use:

    require('class.xml.php');
    $file = "data.xml";
    $data = implode("",file($file)) or die("could not open XML input file");
    $obj = new xml($data,"xml");


    print $xml["hans"][0]->num_results[0];
    for($i=0;$i<sizeof($xml["hans"]);$i++) {
        print $xml["hans"][$i]->tag[0] . "\n\n";
    }

To print url attributes (if they exist):

    print $xml["hans"][0]->attributes[0]["size"]; # where "size" was an attr name

(that's it! slick, huh?)
-----------

Two ways to call xml class:

    $obj = new xml($data);
    - or -
    $obj = new xml($data,"jellyfish");

Don't assign the new object to the same name as the array (for example
$xml = new xml($data);), or the assignment overwrites the results.

The second argument (jellyfish) is optional. Default is 'xml'.
All the second argument does is give you a chance to name the global array
that holds the results something besides "xml" (in case you are already
using that name). Normal PHP variable name rules apply.
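
To make the effect of the second argument concrete, here is a minimal
sketch of the naming mechanism only, not of the class itself. The helper
publish_tree() and the sample data are hypothetical; the one thing taken
from the class is that results are written to $GLOBALS[$identifier].

```php
<?php
// Hypothetical helper: publishes a tree under a caller-chosen global
// name, mirroring how the class writes to $GLOBALS[$identifier].
function publish_tree(array $tree, string $identifier = 'xml'): void
{
    $GLOBALS[$identifier] = $tree;
}

// Made-up sample data shaped like the class's output.
$tree = ['hans' => [(object) ['tag' => ['first result']]]];

publish_tree($tree);               // default name: $xml
publish_tree($tree, 'jellyfish');  // custom name:  $jellyfish

print $GLOBALS['xml']['hans'][0]->tag[0] . "\n";
print $GLOBALS['jellyfish']['hans'][0]->tag[0] . "\n";
```

Both prints read the same data; only the name of the global differs.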

----------

Explanation of xml class:

This class takes valid XML data as an argument and
stores all the information in a complex but loopable array.

Here's how it works:

Data:

    <html>
        <head>
            <title>Hans Anderson's XML Class</title>
        </head>
        <body>
        </body>
    </html>

Run the data through my class, then access the title like this:

    $xml["html_head"][0]->title[0];

Or, loop through them:

    for($i=0;$i<sizeof($xml["html_head"]);$i++) {
        print $xml["html_head"][$i]->title[0] . "\n";
    }

Yes, the variable names *are* long and messy, but it's
the best way to create the tree, IMO.


Here is a complex explanation I sent to one class.xml.php user:

---------

> Now I've run into another problem:
>
> <STORY TIMESTAMP="2000-12-15T20:08:00,0">
>   <SECTION>Markets</SECTION>
>   <BYLINE>By <BYLINE_AUTHOR ID="378">Aaron L. Task</BYLINE_AUTHOR><BR/>Senior
>   Writer</BYLINE>
> </STORY>
>
> How do I get BYLINE_AUTHOR?

    print $xml["STORY_BYLINE"][0]->BYLINE_AUTHOR[0];

> And just a little question: Is there an easy way to get TIMESTAMP?

    print $xml["STORY"][0]->attributes[0]["TIMESTAMP"];

This is confusing, I know, but it's the only way I could really do
this. Here's the rundown:

The $xml part is an array -- an array of arrays. The first index is the
path of tags leading down to the parent of the tag you want -- in the
first case above, that is STORY and, below that, BYLINE, joined as
"STORY_BYLINE". The second index says which occurrence of that parent
you mean. The first one is index [0] in the second part of the
two-dimensional array.

Even if there is only *one* byline author, it's still an array, and you
still have to use the [0]. Now, the two-dimensional array is storing
dynamic structures -- objects in this case. So, we need to dereference
the object, hence the ->. The BYLINE_AUTHOR is the tag you want, and it
is an array in that object. The reason for the array is that if there is
more than one BYLINE_AUTHOR for the tags STORY, BYLINE, we would have a
[0] and [1] in the array. In your case there is just the one.

*** This is very confusing, I know, but once you understand it, the power
of this method will be more apparent. You have access to *every* bit of
information in the XML file, without having to do anything but understand
how to refer to the variables. ***

EVERY variable will look like this:

    print $xml["STORY_BYLINE"][0]->BYLINE_AUTHOR[0];

The trick is understanding how to get the variable to give you the
information. This is an array of arrays of objects holding arrays!

Any tag that has attributes will have them stored in a special object
array named "attributes" and will be called this way:

    print $xml["STORY"][0]->attributes[0]["TIMESTAMP"];

If you aren't sure whether a tag has attributes, test the example above
with isset() or is_array() first. If it is set, you can loop over it with
while(list($k,$v) = each($xml...)) to get the names and values (on PHP 8
and later, where each() no longer exists, use foreach instead).
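
The attribute check can be sketched as follows. This is only an
illustration under assumptions: the tree is built by hand to look like
what the class stores for the STORY tag, tag_attributes() is a
hypothetical helper that is not part of the class, and foreach stands in
for the while/each idiom, since each() was removed in PHP 8.

```php
<?php
// Hand-built stand-in for what the class stores for
// <STORY TIMESTAMP="2000-12-15T20:08:00,0">; no parsing happens here.
$story = new stdClass();
$story->attributes = [['TIMESTAMP' => '2000-12-15T20:08:00,0']];
$xml = ['STORY' => [$story]];

// Hypothetical helper: returns the attribute array of the $i-th element
// stored under $key, or an empty array if that tag had no attributes.
function tag_attributes(array $tree, string $key, int $i = 0): array
{
    if (isset($tree[$key][$i]->attributes[0])
        && is_array($tree[$key][$i]->attributes[0])) {
        return $tree[$key][$i]->attributes[0];
    }
    return [];
}

// Print every attribute name and value of the first STORY tag.
foreach (tag_attributes($xml, 'STORY') as $name => $value) {
    print "$name = $value\n";
}
```

Because the helper returns an empty array for a tag without attributes,
the loop simply does nothing in that case.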


                     array of
                     objects
                         |
                         |
    $xml["STORY_BYLINE"][0]->BYLINE_AUTHOR[0];
     ^                            ^
    array of                      ^
    arrays                        ^
                                  ^
                              array in
                              object

In general, to get the value of this:

    <STATE>
        <STATENAME></STATENAME>
        <COUNTY>
            <COUNTYNAME></COUNTYNAME>
            <CITY></CITY>
            <CITY></CITY>
        </COUNTY>
        <COUNTY>
            <COUNTYNAME></COUNTYNAME>
            <CITY></CITY>
            <CITY></CITY>
        </COUNTY>
    </STATE>

You would look for what you want, say "CITY", then go UP one level, to
COUNTY (COUNTYNAME is on the same 'level'), for your first array:

$xml["STATE_COUNTY"] -- ALL tags pushed together are separated with
"_". Otherwise tags are as they were -- spaces, dashes, CaSe, etc.

Now, you want the first COUNTY, though there are two, so we do this:

$xml["STATE_COUNTY"][0] -- to get the second, we'd use [1] instead of
[0]. You could also do a for() loop through it, using sizeof() to figure
out how big it is.

So, we have the STATE,COUNTY we want -- the first one. It's an
object, and we know we want the CITY. So, we dereference the object. The
name of the array we want is, of course, CITY:

$xml["STATE_COUNTY"][0]->CITY[0] (the first one, the second one would be
[1]).

And that's it. Basically, find what you want, and go up a level.
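
The "go up a level" rule can be written down as a small sketch. The
helper locate() is hypothetical and not part of the class, the tree is
built by hand, and the city names are made-up sample values.

```php
<?php
// Hypothetical helper: splits a full tag path such as
// ['STATE', 'COUNTY', 'CITY'] into the array key built from the parent
// tags ("STATE_COUNTY") and the name of the wanted tag ("CITY").
function locate(array $path): array
{
    $tag = array_pop($path);      // the tag you want
    $key = implode('_', $path);   // everything above it, joined by "_"
    return [$key, $tag];
}

// Hand-built stand-in for the parsed tree of the STATE example.
$county = new stdClass();
$county->CITY = ['Springfield', 'Shelbyville'];
$xml = ['STATE_COUNTY' => [$county]];

list($key, $tag) = locate(['STATE', 'COUNTY', 'CITY']);

// First CITY of the first COUNTY.
print $xml[$key][0]->{$tag}[0] . "\n";
```

The braces in ->{$tag} are needed because the property name comes from a
variable.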

You could do some complex for loops to go through them all, too:

    for($i=0;$i<sizeof($xml["STATE_COUNTY"]);$i++) {

        for($j=0;$j<sizeof($xml["STATE_COUNTY"][$i]->CITY);$j++) {

            print $xml["STATE_COUNTY"][$i]->CITY[$j];

        }

    }
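
The same traversal can also be sketched with foreach, which avoids the
index bookkeeping. As before, this is an illustration on a hand-built
tree, and the county and city names are made-up sample values.

```php
<?php
// Hand-built stand-in for the parsed STATE example.
$xml = ['STATE_COUNTY' => [
    (object) ['COUNTYNAME' => ['North'], 'CITY' => ['Alpha', 'Beta']],
    (object) ['COUNTYNAME' => ['South'], 'CITY' => ['Gamma']],
]];

// Collect "county: city" lines. Each county supplies its own CITY list,
// so counties with different numbers of cities are handled correctly.
$lines = [];
foreach ($xml['STATE_COUNTY'] as $county) {
    foreach ($county->CITY as $city) {
        $lines[] = $county->COUNTYNAME[0] . ': ' . $city;
    }
}

print implode("\n", $lines) . "\n";
```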

-----------

Whew. I hope that helps, not hurts.

*/


/* used to store the parsed information */
class xml_container {

    function store($k,$v) {
        $this->{$k}[] = $v;
    }

}


/* parses the information */
class xml {

    // initialize some variables
    var $current_tag=array();
    var $xml_parser;
    var $Version = 1.0;
    var $tagtracker = array();
    var $tagdata = array();   // character data collected per tag path
    var $identifier;          // name of the global array that receives the results

    /* Here are the XML functions needed by expat */


    /* when expat hits an opening tag, it fires up this function */

    function startElement($parser, $name, $attrs) {

        array_push($this->current_tag, $name); // add tag to the cur. tag array

        $curtag = implode("_",$this->current_tag); // piece together tag

        /* this tracks what array index we are on for this tag */

        if(isset($this->tagtracker["$curtag"])) {
            $this->tagtracker["$curtag"]++;
        } else {
            $this->tagtracker["$curtag"]=0;
        }


        /* if there are attributes for this tag, we set them here. */

        if(count($attrs)>0) {
            $j = $this->tagtracker["$curtag"];
            if(!$j) $j = 0;

            // isset() avoids reading an index that does not exist yet
            if(!isset($GLOBALS[$this->identifier]["$curtag"][$j])) {
                $GLOBALS[$this->identifier]["$curtag"][$j] = new xml_container;
            }

            $GLOBALS[$this->identifier]["$curtag"][$j]->store("attributes",$attrs);
        }

    } // end function startElement



    /* when expat hits a closing tag, it fires up this function */

    function endElement($parser, $name) {

        $curtag = implode("_",$this->current_tag); // piece together tag
                                                   // before we pop it off,
                                                   // so we can get the correct
                                                   // cdata

        // isset() plus a strict comparison, so that data such as "0" is kept
        if(!isset($this->tagdata["$curtag"]) || $this->tagdata["$curtag"] === '') {
            $popped = array_pop($this->current_tag); // or else we screw up where we are
            return; // if we have no data for the tag
        } else {
            $TD = $this->tagdata["$curtag"];
            unset($this->tagdata["$curtag"]);
        }

        $popped = array_pop($this->current_tag);
                                                   // we want the tag name for
                                                   // the tag above this, it
                                                   // allows us to group the
                                                   // tags together in a more
                                                   // intuitive way.

        if(sizeof($this->current_tag) == 0) return; // if we aren't in a tag

        $curtag = implode("_",$this->current_tag); // piece together tag
                                                   // this time for the arrays

        $j = $this->tagtracker["$curtag"];
        if(!$j) $j = 0;

        // isset() avoids reading an index that does not exist yet
        if(!isset($GLOBALS[$this->identifier]["$curtag"][$j])) {
            $GLOBALS[$this->identifier]["$curtag"][$j] = new xml_container;
        }

        $GLOBALS[$this->identifier]["$curtag"][$j]->store($name,$TD);
        unset($TD);
        return TRUE;
    }



    /* when expat finds some internal tag character data,
       it fires up this function */

    function characterData($parser, $cdata) {
        $curtag = implode("_",$this->current_tag); // piece together tag
        if(!isset($this->tagdata["$curtag"])) {
            $this->tagdata["$curtag"] = ''; // start empty so .= never reads an unset index
        }
        $this->tagdata["$curtag"] .= $cdata;
    }


    /* this is the constructor: automatically called when the class is initialized */

    function __construct($data,$identifier='xml') {

        $this->identifier = $identifier;

        // create parser object
        $this->xml_parser = xml_parser_create();

        // set up some options and handlers
        xml_set_object($this->xml_parser,$this);
        xml_parser_set_option($this->xml_parser,XML_OPTION_CASE_FOLDING,0);
        xml_set_element_handler($this->xml_parser, "startElement", "endElement");
        xml_set_character_data_handler($this->xml_parser, "characterData");

        if (!xml_parse($this->xml_parser, $data, TRUE)) {
            // report the parse error instead of failing silently
            trigger_error(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($this->xml_parser)),
                xml_get_current_line_number($this->xml_parser)), E_USER_WARNING);
        }

        // we are done with the parser, so let's free it
        xml_parser_free($this->xml_parser);

    } // end constructor: function __construct()

    /* PHP 4 only knows constructors named after the class, so that name
       stays as a thin wrapper around __construct() */
    function xml($data,$identifier='xml') {
        $this->__construct($data,$identifier);
    }


} // thus, we end our class xml

?>