Tricks and Tips for storing html in xml


Question

Hi everyone. I'm using C# to create xml files of data and I've never had a problem doing what I've been doing until I incorporated a new type of data to include, which is an html description block. Once I did this all hell broke loose.

Is there a guide somewhere, like a set of tricks, for making sure your html is xml safe so you can do something like <MyElement Description="my html description here" />? I keep running into roadblocks.

Thanks :)

9 answers to this question

  • 0

You could surround the HTML string in a CDATA section, i.e. <![CDATA[ my_html_description ]]>. However, if the html itself contains a CDATA block, its closing ]]> will end the outer section early and the XML will be messed up.
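
A minimal sketch of the CDATA approach, written in Python for brevity rather than the asker's C#. The helper name `wrap_in_cdata` and the `Description` child element are assumptions made for illustration. Note that CDATA sections are only allowed in element content, so this does not work for the attribute form in the question. The usual workaround for an embedded `]]>` is to split it across two adjacent CDATA sections:

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(html: str) -> str:
    """Wrap html in a CDATA section (hypothetical helper).

    Any ']]>' inside the html is split across two adjacent CDATA
    sections so it cannot terminate the outer section early.
    """
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return f"<![CDATA[{safe}]]>"


# HTML that itself contains a CDATA block, the problem case from the post.
html = "<p>Tom & Jerry</p><script>//<![CDATA[\nvar x = 1;\n//]]></script>"

document = (
    f"<MyElement><Description>{wrap_in_cdata(html)}</Description></MyElement>"
)

# The parser joins the adjacent CDATA sections back into one text node.
restored = ET.fromstring(document).find("Description").text
print(restored == html)
```

The parser hands back the original string, so no manual reverse step is needed when reading.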

You could also scan through your HTML replacing special XML characters with their equivalent XML escape code:

" to &quot;

< to &lt;

> to &gt;

& to &amp;

Obviously you'll need to reverse this process when you remove the HTML.
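
Here is a rough sketch of that escaping step, in Python rather than C#, using the standard library's `xml.sax.saxutils`. The helper names `to_xml_safe` and `from_xml_safe` are made up for this example. One detail matters if you write the replacements by hand: `&` has to be replaced first, otherwise the ampersands in the other escape codes get escaped a second time. Also, an XML parser undoes the escaping for you when it reads the attribute, so the manual reverse step is only needed if you read the file as raw text.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# escape() handles &, < and > itself (and does & first);
# the double quote is added here because the value goes into an attribute.
EXTRA_ESCAPES = {'"': "&quot;"}
EXTRA_UNESCAPES = {"&quot;": '"'}


def to_xml_safe(html: str) -> str:
    """Replace the XML special characters with their escape codes."""
    return escape(html, EXTRA_ESCAPES)


def from_xml_safe(text: str) -> str:
    """Reverse to_xml_safe() for text read without an XML parser."""
    return unescape(text, EXTRA_UNESCAPES)


html = '<a href="page.html?a=1&b=2">link</a>'
safe = to_xml_safe(html)
print(safe)
# &lt;a href=&quot;page.html?a=1&amp;b=2&quot;&gt;link&lt;/a&gt;

document = f'<MyElement Description="{safe}" />'

# A parser decodes the escape codes on its own when reading the attribute.
parsed = ET.fromstring(document).get("Description")
print(parsed == html)
```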

  • 0

^ Base64 encoding is probably a little more complicated than simply having the description as its own element:

<MyElement>
  <description>
    <p>Here is a <strong>formatted</strong> description</p>
  </description>
</MyElement>

If you enforce xhtml syntax on your descriptions, you can ensure that the html description is structurally sound. There are downsides to that: if the html is not well formed, the whole xml stream will hit errors while being parsed. You need to weigh the risks of each design.
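
To make the trade-off concrete, here is a small sketch (Python rather than C#) that checks a description for well-formedness before nesting it. The function names `is_well_formed` and `build_element` are assumptions for illustration, not anything from the thread:

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a wrapper element."""
    try:
        ET.fromstring(f"<description>{fragment}</description>")
        return True
    except ET.ParseError:
        return False


def build_element(fragment: str) -> ET.Element:
    """Nest an XHTML fragment under <MyElement><description>.

    Raises ET.ParseError if the fragment is not well-formed.
    """
    description = ET.fromstring(f"<description>{fragment}</description>")
    element = ET.Element("MyElement")
    element.append(description)
    return element


good = "<p>Here is a <strong>formatted</strong> description</p>"
bad = "<p>An unclosed line break<br></p>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False: <br> is never closed

print(ET.tostring(build_element(good), encoding="unicode"))
```

Ordinary HTML habits such as `<br>` without a slash are enough to fail the check, which is the risk described above when the text comes from end users.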

  • 0

Thanks for all your help so far. I don't know why base64 never occurred to me. I just added 2 static functions to my program that do the replacement, although, come to think of it, base64 encoding might be the better option for large descriptions.

Thanks to all of you; you've all given me something to think about, and either way you all provided answers. I would do it the way you said, Antaris, but since the description is entered manually by the end user, I have no guarantee that it's well-formed. I tested it that way and ran into hundreds of parsing errors.

  • 0

The best solution of course is to use a tag soup parser to construct a DOM, then parse it back out again (valid HTML will be 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)
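
As a very rough illustration of the tag soup idea, the sketch below uses Python's built-in `html.parser` to read sloppy HTML and write it back out as well-formed markup. It is a crude stand-in and not a real tag soup parser: the class name `SoupNormalizer`, the `normalize` helper, and the closing rules are all assumptions for this example, it drops comments and doctypes, and it does not apply the HTML5 error-recovery rules that a proper library would.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SoupNormalizer(HTMLParser):
    """Re-emit sloppy HTML as well-formed markup (illustrative only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output pieces
        self.open_tags = []  # stack of elements still waiting for a close

    def handle_starttag(self, tag, attrs):
        # dict() drops duplicate attribute names, which XML forbids;
        # valueless attributes such as "disabled" get their name as value.
        unique = dict((name, name if value is None else value)
                      for name, value in attrs)
        attr_text = "".join(f" {name}={quoteattr(value)}"
                            for name, value in unique.items())
        if tag in VOID_TAGS:
            self.parts.append(f"<{tag}{attr_text}/>")
        else:
            self.parts.append(f"<{tag}{attr_text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.parts.append(f"</{closed}>")
            if closed == tag:
                break

    def handle_data(self, data):
        self.parts.append(escape(data))

    def result(self) -> str:
        self.close()  # flush any buffered text
        closing = "".join(f"</{tag}>" for tag in reversed(self.open_tags))
        return "".join(self.parts) + closing


def normalize(html: str) -> str:
    parser = SoupNormalizer()
    parser.feed(html)
    return parser.result()


messy = "<p>Hello <b>world<br>Tom & Jerry"
clean = normalize(messy)
print(clean)

# The cleaned fragment can now be nested directly in the XML document.
description = ET.fromstring(f"<description>{clean}</description>")
```

For real data, an established tag soup or HTML5 parsing library is the safer choice; this only shows the shape of the round trip.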

  • 0
  On 10/03/2010 at 17:31, The_Decryptor said:

The best solution of course is to use a tag soup parser to construct a DOM, then parse it back out again (valid HTML will be 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)

This. ;)

  • 0
  On 10/03/2010 at 17:17, Rob said:

Part of XML's charm is that it is human-readable, to an extent. Base64 encoding it removes that.

He didn't say that it needed "charm." Why go through the pain and pitfalls of a tag parser if you don't need it? If all he's doing is storing the HTML in the XML attribute, I still think encoding is the best idea because it's the simplest and would reduce to zero the chance of screwing up his XML. If he needs validation of the HTML, he can plug that in after the decoding step.
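
A minimal sketch of the base64 route, in Python rather than C#. The helper names `encode_description` and `decode_description` are hypothetical. The base64 alphabet contains none of the XML special characters, so the encoded value is safe in an attribute as-is; the cost is that the output is roughly a third larger than the input and no longer human-readable.

```python
import base64
import xml.etree.ElementTree as ET


def encode_description(html: str) -> str:
    """Encode the html as base64 text that is safe in an XML attribute."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def decode_description(value: str) -> str:
    """Reverse encode_description()."""
    return base64.b64decode(value).decode("utf-8")


# Deliberately messy, not well-formed HTML.
html = '<p>5 < 6 & "quotes" <br> unclosed'

element = ET.Element("MyElement", {"Description": encode_description(html)})
document = ET.tostring(element, encoding="unicode")
print(document)

restored = decode_description(ET.fromstring(document).get("Description"))
print(restored == html)
```

Any validation of the HTML would slot in after `decode_description`, as suggested above.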
This topic is now closed to further replies.