• 0

Tricks and Tips for storing html in xml


Question

Hi everyone. I'm using C# to create XML files of data, and I never had a problem with it until I added a new type of data: an HTML description block. Once I did this, all hell broke loose.

Is there a guide somewhere, or a set of tricks, to make sure your HTML is XML-safe so you can do something like <MyElement Description="my html description here" />? Because I keep running into roadblocks.

Thanks :)

9 answers to this question


  • 0

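You could surround the HTML string in a CDATA section, i.e. <![CDATA[ my_html_description ]]>. However, if the HTML itself contains a CDATA block, its closing ]]> will end your section early and break the XML. Note that CDATA only works as element content, not inside an attribute value.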
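A minimal sketch of the CDATA route, written in Python rather than the asker's C#. It assumes the description lives in a child element (here a made-up `description` element), and the helper name `wrap_cdata` is hypothetical. The only sequence that can break a CDATA section is `]]>`, so the helper splits it across two adjacent sections and the parser stitches the text back together.

```python
import xml.etree.ElementTree as ET

def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any embedded ']]>' so it cannot end the section early."""
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

# Sample HTML that itself contains a CDATA block (the problem case).
html = '<p>Tom & Jerry <script>//<![CDATA[ x ]]></script></p>'
xml_text = "<MyElement><description>" + wrap_cdata(html) + "</description></MyElement>"

# The parser hands back the original HTML, markup characters and all.
restored = ET.fromstring(xml_text).find("description").text
```

Most XML writers have their own way to emit CDATA, so in practice you would probably lean on that rather than build the string by hand.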

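You could also scan through your HTML replacing special XML characters with their equivalent XML escape codes (replace & first, so the ampersands introduced by the other replacements aren't escaped a second time):

" to &quot;

< to &lt;

> to &gt;

& to &amp;

An XML parser reverses this for you when it reads the value back; you only need to undo it yourself if you read the file as raw text.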
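The escaping approach could look something like this in Python (not the asker's C#; the sample HTML and variable names are made up). It shows both the manual replacement and letting the XML library escape the attribute value on its own, which is usually the less error-prone option.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

html = '<p class="x">Tom & Jerry</p>'

# Manual route: escape() handles &, < and > (ampersand first); the extra map adds the quote.
escaped = escape(html, {'"': "&quot;"})
manual_xml = '<MyElement Description="' + escaped + '" />'

# Library route: let the XML writer escape the attribute value itself.
element = ET.Element("MyElement", {"Description": html})
library_xml = ET.tostring(element, encoding="unicode")

# Reading back: the parser undoes the escaping for you.
from_manual = ET.fromstring(manual_xml).get("Description")
from_library = ET.fromstring(library_xml).get("Description")

# Only needed if you handle the escaped text yourself, outside a parser.
round_trip = unescape(escaped, {"&quot;": '"'})
```

If you build the XML through a writer or DOM API instead of string concatenation, the escaping generally happens for free and the roadblocks mostly go away.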

  • 0

^ Base64 encoding is probably a little more complicated than simply having the description as its own element:

<MyElement>
  <description>
    <p>Here is a <strong>formatted</strong> description</p>
  </description>
</MyElement>

If you enforce XHTML syntax on your descriptions, you can ensure that the HTML description is well-formed and sits in the document as real markup. There are downsides to that: if the HTML is not well-formed, the whole XML stream will fail to parse. You need to weigh the risks of each design.
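A rough Python sketch of the trade-off described above (the function name `embed_xhtml` is made up, and it assumes the fragment has a single root element). A well-formed fragment becomes real child elements; a fragment with an unclosed tag is rejected before it can corrupt the file.

```python
import xml.etree.ElementTree as ET

def embed_xhtml(fragment: str) -> str:
    """Embed a single-rooted XHTML fragment as real child elements of <description>."""
    root = ET.Element("MyElement")
    description = ET.SubElement(root, "description")
    # ET.fromstring raises ParseError if the fragment is not well-formed XML.
    description.append(ET.fromstring(fragment))
    return ET.tostring(root, encoding="unicode")

good = "<p>Here is a <strong>formatted</strong> description</p>"
bad = "<p>Line one<br>Line two</p>"  # <br> is never closed

embedded = embed_xhtml(good)

try:
    embed_xhtml(bad)
    bad_accepted = True
except ET.ParseError:
    bad_accepted = False
```

Validating each description on its own like this means one bad entry fails early instead of taking the whole XML file down with it.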

  • 0

Thanks for all your help so far. I don't know why base64 never occurred to me. I just added 2 static functions in my program that escape the markup, although, come to think of it, base64 encoding might be more efficient for large descriptions.

Thanks to all of you; you've all given me something to think about, and either way you all provided answers. I would do it the way you said, Antaris, but since the description is entered manually by the end user, I have no guarantee that it's well-formed. I tested it that way and ran into hundreds of parsing errors.

  • 0

The best solution of course is to use a tag soup parser to construct a DOM, then serialize it back out again (valid HTML will be 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)
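To give a feel for the idea, here is a deliberately simplified Python sketch built on the standard library's tolerant `html.parser` tokenizer. It is not a real tag soup parser: the class `SoupToXml` and function `tidy` are made up, comments and doctypes are dropped, and odd tag or attribute names are not checked. It closes unclosed tags, self-closes void elements such as `<br>`, and escapes stray ampersands, which is enough to make many hand-typed descriptions well-formed.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class SoupToXml(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of well-formed output
        self.open_tags = []  # stack of elements still waiting for a close tag

    def handle_starttag(self, tag, attrs):
        unique = {}
        for name, value in attrs:
            # Keep the first of any duplicated attribute; valueless ones get name="name".
            unique.setdefault(name, name if value is None else value)
        attr_text = "".join(f" {n}={quoteattr(v)}" for n, v in unique.items())
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: drop it
        # Close everything opened after the matching start tag, then the tag itself.
        while True:
            open_tag = self.open_tags.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

def tidy(html: str) -> str:
    parser = SoupToXml()
    parser.feed(html)
    parser.close()  # flush any buffered text
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)

messy = "<p>Hello <b>world<br>Tom & Jerry"
tidied = tidy(messy)
wrapped = ET.fromstring("<description>" + tidied + "</description>")
```

A dedicated tag soup library will recover from far more kinds of broken markup than this, so treat it as an illustration of the round trip rather than something to ship.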

  • 0
  On 10/03/2010 at 17:31, The_Decryptor said:

The best solution of course is to use a tag soup parser to construct a DOM, then serialize it back out again (valid HTML will be 1:1, invalid HTML will come out valid)

Failing that, storing it as CDATA would be the next best (assuming you can't enforce XHTML, which is quite possible)

This. ;)

  • 0
  On 10/03/2010 at 17:17, Rob said:

Part of XML's charm is that it is human-readable, to an extent. Base64 encoding it removes that.

He didn't say that it needed "charm." Why go through the pain and pitfalls of a tag parser if you don't need it? If all he's doing is storing the HTML in the XML attribute, I still think encoding is the best idea because it's the simplest and would reduce to zero the chance of screwing up his XML. If he needs validation of the HTML, he can plug that in after the decoding step.
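For comparison, a small Python sketch of the base64 route (helper names are made up; UTF-8 is an assumed encoding). The encoded value contains only letters, digits, `+`, `/` and `=`, so nothing in it can collide with XML syntax, even when the HTML is badly broken.

```python
import base64
import xml.etree.ElementTree as ET

def encode_description(html: str) -> str:
    """Encode HTML as base64 text that is safe inside an XML attribute."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")

def decode_description(encoded: str) -> str:
    """Reverse encode_description."""
    return base64.b64decode(encoded).decode("utf-8")

html = '<p class="x">Tom & Jerry<br>unclosed tags are fine here'

element = ET.Element("MyElement", {"Description": encode_description(html)})
xml_text = ET.tostring(element, encoding="unicode")

restored = decode_description(ET.fromstring(xml_text).get("Description"))
```

The costs are that the stored value is about a third larger than the original and is no longer readable in the file, which is the trade-off argued over above.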