Tricks and Tips for storing html in xml


Question

Hi everyone. I'm using C# to create XML files of data, and I never had a problem with it until I added a new type of data: an HTML description block. Once I did this, all hell broke loose.

Is there a guide somewhere, or a set of tricks, for making sure your HTML is XML-safe, so that you can do something like <MyElement Description="my html description here" />? I keep running into roadblocks.

Thanks :)

9 answers to this question


  • 0

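You could surround the HTML string in a CDATA section, i.e. <![CDATA[ my_html_description ]]>. Note that CDATA is only allowed in element content, not inside an attribute value. Also, if the HTML itself contains a CDATA block, its closing ]]> will end your section early and break the XML.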
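A minimal sketch of the CDATA approach, in Python for illustration only (the thread itself is about C#). The helper name wrap_cdata and the Description child element are made up for this example. It works around the nested-CDATA problem by splitting every ]]> across two adjacent CDATA sections, which a parser reads back as one continuous string.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(html: str) -> str:
    """Wrap text in a CDATA section (hypothetical helper).

    Any ']]>' in the input is split across two sections so it cannot
    terminate the CDATA section early.
    """
    return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# HTML that contains characters XML cares about, plus its own CDATA markers.
html = (
    "<p>Tom & Jerry <b>rock</b></p>"
    "<script>/* <![CDATA[ */ var x = 1; /* ]]> */</script>"
)

# CDATA can only live in element content, so the description is a child
# element here rather than an attribute.
doc = "<MyElement><Description>" + wrap_cdata(html) + "</Description></MyElement>"

root = ET.fromstring(doc)
restored = root.find("Description").text
```

Here restored should be identical to the original html string, since the parser joins the adjacent CDATA sections when it builds the text value.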

You could also scan through your HTML replacing special XML characters with their equivalent XML escape code:

" to &quot;

< to &lt;

> to &gt;

& to &amp;

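Replace & before the others, otherwise you'll double-escape the ampersands you just added. Obviously you'll need to reverse this process when you read the HTML back out, unless you read it through an XML parser, which undoes the escaping for you.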
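The escaping route can be sketched like this, again in Python purely as an illustration. It shows two variants: escaping by hand with the standard library's escape function, and simply handing the raw string to an XML serializer, which does the same job for you. The element and attribute names are taken from the question; the sample HTML is invented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

html = '<p class="intro">Fish & chips</p>'

# Manual route: escape() handles &, < and > (ampersand first); the extra
# mapping adds double quotes, which matter inside an attribute value.
escaped = escape(html, {'"': "&quot;"})
manual_doc = '<MyElement Description="' + escaped + '" />'

# Library route: pass the raw string and let the serializer escape it.
elem = ET.Element("MyElement", {"Description": html})
library_doc = ET.tostring(elem, encoding="unicode")

# Reading either document back through a parser returns the original HTML;
# no manual unescaping step is needed.
manual_roundtrip = ET.fromstring(manual_doc).get("Description")
library_roundtrip = ET.fromstring(library_doc).get("Description")
```

The library route is usually the safer habit, because hand-built XML strings are where most of these escaping bugs come from.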

  • 0

^ Base64 encoding is probably a little more complicated than simply having the description as its own element:

<MyElement>
  <description>
    <p>Here is a <strong>formatted</strong> description</p>
  </description>
</MyElement>

If you enforce XHTML syntax on your descriptions, you can ensure that the HTML description is well-formed. There are downsides to that: if the HTML is not well-formed, the whole XML stream will fail to parse. You need to weigh the risks of each design.
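One way to manage that risk is to check each description before embedding it, and fall back to escaped text when it does not parse. The sketch below is Python and only illustrative: the function name embed_description and the format marker attribute are assumptions made up for this example, not part of any schema in the thread.

```python
import xml.etree.ElementTree as ET


def embed_description(parent, html):
    """Attach html as real child elements if it is well-formed XML,
    otherwise store it as plain (serializer-escaped) text.

    Returns True when the markup was embedded as elements.
    """
    desc = ET.SubElement(parent, "description")
    try:
        # Wrap in a throwaway root so fragments with several top-level
        # nodes or leading text still parse.
        fragment = ET.fromstring("<wrapper>" + html + "</wrapper>")
    except ET.ParseError:
        desc.text = html                 # escaped on serialization
        desc.set("format", "escaped")    # hypothetical marker attribute
        return False
    desc.text = fragment.text
    desc.extend(list(fragment))
    desc.set("format", "xhtml")          # hypothetical marker attribute
    return True


good = ET.Element("MyElement")
good_ok = embed_description(
    good, "<p>Here is a <strong>formatted</strong> description</p>"
)

bad = ET.Element("MyElement")
bad_ok = embed_description(bad, "<p>unclosed <br> tag")

good_xml = ET.tostring(good, encoding="unicode")
bad_xml = ET.tostring(bad, encoding="unicode")
```

Both results are well-formed documents, so one bad description no longer takes the whole file down. Note that HTML-only entities such as &nbsp; also count as parse errors here and would take the fallback path.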

  • 0

Thanks for all your help so far. I don't know why base64 never occurred to me. I just added 2 static functions to my program that convert the characters, although come to think of it, it might be more efficient to base64 encode it when it comes to large descriptions.

Thanks to all of you; you've each given me something to think about, and either way you all provided answers. I would do it the way you said, Antaris, but since the description is entered manually by the end user, I have no guarantee that it's well-formed. I tested it that way and ran into hundreds of parsing errors.
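For comparison, the base64 idea mentioned above might look like the following Python sketch. The two helper names are invented for this example. The point is that the encoded value contains only letters, digits, +, / and =, so malformed user input cannot break the XML. The cost is that the stored value is roughly a third larger than the original and no longer human-readable.

```python
import base64
import xml.etree.ElementTree as ET


def encode_description(html: str) -> str:
    """Encode HTML as base64 text that is safe in any XML position."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def decode_description(value: str) -> str:
    """Reverse encode_description."""
    return base64.b64decode(value).decode("utf-8")


# Deliberately malformed, user-style input.
html = '<p>Unclosed <b>tag & "quotes"'

elem = ET.Element("MyElement", {"Description": encode_description(html)})
doc = ET.tostring(elem, encoding="unicode")

restored = decode_description(ET.fromstring(doc).get("Description"))
```

Any HTML validation or cleanup can then happen after decoding, completely separate from the XML handling.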

  • 0

The best solution of course is to use a tag soup parser to construct a DOM, then serialize it back out again (valid HTML will come out 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)
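To make the tag soup idea concrete, here is a toy normalizer in Python built on the standard library's HTMLParser. It is a rough sketch and not a real tag soup parser: the class name, the tidy helper and the short list of void elements are assumptions for this example, it drops comments and doctypes, and it does not handle duplicate attributes or HTML's implied-tag rules. It only shows the principle of reading sloppy markup and writing it back out well-formed.

```python
from html import escape
from html.parser import HTMLParser

# Partial list of elements that never have a closing tag (assumption:
# enough for this demo, not the full HTML list).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class Tidy(HTMLParser):
    """Toy normalizer: re-emits HTML with every tag properly closed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") get their name as value.
        attr_text = "".join(
            ' %s="%s"' % (name, escape(value if value is not None else name))
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append("<%s%s />" % (tag, attr_text))
        else:
            self.out.append("<%s%s>" % (tag, attr_text))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def tidy(html):
    parser = Tidy()
    parser.feed(html)
    parser.close()
    # Close anything the author never closed.
    while parser.stack:
        parser.out.append("</%s>" % parser.stack.pop())
    return "".join(parser.out)


clean = tidy("<p>Fish & chips<br><b>bold <i>both</b> text")
unchanged = tidy("<p>Here is a <strong>formatted</strong> description</p>")
```

Well-formed input passes through as-is, while the sloppy input comes out with its ampersand escaped and its tags closed. For real user content a proper HTML5-aware parser would be the better choice, since browsers apply many more recovery rules than this.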

  • 0
  On 10/03/2010 at 17:31, The_Decryptor said:

The best solution of course is to use a tag soup parser to construct a DOM, then serialize it back out again (valid HTML will come out 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)

This. ;)

  • 0
  On 10/03/2010 at 17:17, Rob said:

Part of XML's charm is that it is human-readable, to an extent. Base64 encoding it removes that.

He didn't say that it needed "charm." Why go through the pain and pitfalls of a tag soup parser if you don't need it? If all he's doing is storing the HTML in an XML attribute, I still think encoding is the best idea, because it's the simplest and would reduce to zero the chance of screwing up his XML. If he needs validation of the HTML, he can plug that in after the decoding step.