Things a Chameleon Would Say

Web design

XML encoding (utf-8, ascii)
By: Ryan Wright 07/31/2011

XML is a markup language similar to HTML. It was designed to transport data. Once data has been encoded as XML, it can be read easily by many different systems. As a result, it is widely used in web services to transfer data.

Recently, I was working on a web service that required us to parse the data from an XML feed and store it in the database. Normally this is a simple task, which can be achieved by using PHP's SimpleXML extension to parse the data. However, if the document has not been encoded properly, SimpleXML will generate an XML error.
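To see what such a failure looks like, here is a minimal sketch that parses a string with SimpleXML and collects the libxml errors instead of letting them surface as PHP warnings. The function name parse_feed and the sample feed are made up for illustration; they are not from the web service described here.

```php
<?php
// Hypothetical helper: parse an XML string and return [document or null, error messages].
function parse_feed(string $xml): array
{
    // Collect libxml errors internally rather than printing PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = simplexml_load_string($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode so other code is unaffected.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$doc === false ? null : $doc, $messages];
}

// "\xE9" is a lone Latin-1 byte, which is not valid UTF-8 (sample data, assumed).
[$doc, $errors] = parse_feed("<?xml version=\"1.0\" encoding=\"UTF-8\"?><job>caf\xE9</job>");

if ($doc === null) {
    echo "Parse failed:\n", implode("\n", $errors), "\n";
}
```

The exact error text comes from libxml and may vary between versions, so it is safer to check for a null document than to match on the message.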

Whenever an XML document is encoded, the encoding used should be declared in the document.
If the document was encoded with a Unicode encoding, for example UTF-8, the following would be the first line of the document:
<?xml version="1.0" encoding="UTF-8"?>
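A declaration is only a label, so it can be worth checking that the bytes actually match it before parsing. The sketch below is one possible way to do that; declared_encoding and is_valid_utf8 are hypothetical helper names, and the check relies on PCRE's /u modifier rejecting subjects that are not well-formed UTF-8.

```php
<?php
// Hypothetical helper: read the encoding named in the XML declaration, if any.
function declared_encoding(string $xml): ?string
{
    $pattern = '/^<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']/';
    if (preg_match($pattern, $xml, $m)) {
        return strtoupper($m[1]);
    }
    return null; // No encoding stated in the declaration.
}

// Hypothetical helper: true when the bytes are well-formed UTF-8.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match returns false for invalid UTF-8 input.
    return preg_match('//u', $bytes) === 1;
}

// Sample data (assumed): labeled UTF-8, but "\xE9" is a bare Latin-1 byte.
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><job>caf\xE9</job>";
$mismatch = declared_encoding($xml) === 'UTF-8' && !is_valid_utf8($xml);

if ($mismatch) {
    echo "Document is labeled UTF-8 but contains bytes that are not valid UTF-8\n";
}
```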

In my case, the XML document was labeled as UTF-8; however, it was really an ASCII document that also contained a few non-ASCII characters, which the parser could not read as UTF-8. This created a major problem with the parser. The quick solution is to strip the non-ASCII characters from the document. Note that this discards those characters rather than converting them.
This can be achieved with the following PHP code:

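// Read the raw feed, then drop every byte that is not a tab, line break, or printable ASCII.
$jobs = file_get_contents('/home/mydir/doc.xml');
$jobs = preg_replace('/[^\x09\x0A\x0D\x20-\x7E]+/', '', $jobs);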
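
As a rough end-to-end illustration, the sketch below wraps the stripping step in a function and then parses the cleaned string. The function name strip_non_ascii and the sample feed are assumptions for the example, and the character class used here keeps tabs and line breaks along with printable ASCII.

```php
<?php
// Hypothetical helper: keep tab, LF, CR and printable ASCII; drop everything else.
function strip_non_ascii(string $bytes): string
{
    return preg_replace('/[^\x09\x0A\x0D\x20-\x7E]+/', '', $bytes);
}

// Sample data (assumed): labeled UTF-8, but "\xE9" is a bare Latin-1 byte.
$raw = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<job><title>Caf\xE9 manager</title></job>";

$clean = strip_non_ascii($raw);
$doc = simplexml_load_string($clean);

echo (string) $doc->title, "\n"; // prints "Caf manager"
```

The output shows the trade-off: the document now parses, but the accented character is gone. If the original encoding of the stray bytes is known, converting them to UTF-8 instead would preserve the text.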
