• 0

Extracting select data from CSV file (one column)


Question

I have a CSV file w/ about 15 columns when opened in a spreadsheet (I also have a couple of "CSV editor" prgms).

Want to take 1 of the columns (by what ever method), & get it into a text config file in another prgm.

When you look at any CSV file in a text editor, there are NO spaces after each comma & before next data (datum).

The syntax for the prgm (text file) I'm trying to enter the data into, shows no spaces to be used between each comma & next data. Just a command then data list, w/ each piece of data separated by commas, but no spaces. It appears when I enter the data (including spaces) in the other prgm's text file, it does NOT use it correctly. I'm * assuming * it's because of the spaces - but I'm no expert on this.

Everything I've tried w/ Open Office, Excel (older ver), the CSV editors, Notepad++, always winds up putting a space after each comma, before the next piece of data.

Even when I delete everything from the orig CSV file except the column I want, once opened in the spreadsheets, then save it in .csv format, then reopen it in something other than a spreadsheet (text editor, etc.), it either has no commas at all, or puts spaces between data. In short, the spreadsheets aren't saving the edited files in the EXACT same CSV format (w/ no spaces) as the original CSV file.

Any ideas how to do it the way (I think) I need to do it, w/o introducing spaces? Thanks.

6 answers to this question

  • 0

You will need to craft a Regular Expression to match the data you need and to insert it into the destination file.

I haven't read this page, but it seems like it might be a good explanation of how to use RegEx in Notepad++

http://markantoniou.blogspot.com/2008/06/notepad-how-to-use-regular-expressions.html
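
For anyone who would rather script it than use the editor, here is a minimal sketch of the same regular expression idea in Python. The function name and the sample line are made up for illustration; the pattern simply removes spaces or tabs on either side of every comma. Note that it would also touch commas inside quoted values, so it only suits simple lists like the one described in the question.

```python
import re

def strip_spaces_around_commas(text: str) -> str:
    # Hypothetical helper: drop any spaces or tabs that sit directly
    # before or after a comma, leaving the comma itself in place.
    return re.sub(r"[ \t]*,[ \t]*", ",", text)

# Made-up sample line, not taken from the original poster's file.
print(strip_spaces_around_commas("10, 20 ,30 , 40"))
```

In Notepad++ itself, the rough equivalent should be to put the same pattern in the "Find what" box with the search mode set to "Regular expression" and a single comma in "Replace with".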

  • 0

Thanks Frazell, took a quick look at the page linked. (It has links to more detailed pages on using regular expressions & examples).

Caught a bit about using "search & replace" in Notepad ++. It was talking about adding spaces, so figured might work in reverse - it did. For this, I didn't have to use a Regular Expression (which I know nothing about). Used Notepad++ "find & replace" function, to find all blank spaces followed by comma, & replace w/ just a comma. Worked fine.

Depending on which spreadsheet I copy one column into Notepad ++, each line in the data copied in to Notepad ++ may / may not have an actual comma (& space) before each value.

First had to use the Edit>Column Editor function to add a comma in front of each value (there was one value on each line, in column form). Put the cursor at the place where want to enter the comma (or what ever want to add).

But, as said, after that, when use "unwrap" in Notepad ++, there's a space between every comma & next value, which I had to remove.

Not sure why the spreadsheets aren't smart enough to (or don't) save BACK to a CSV format & eliminate spaces between commas & values - the same way orig CSV file was before the spread sheet opened it. Seems like would be very common use to open & edit a CSV file in a spread sheet, then save back to CSV format (without spaces) - to open / use in other prgms, instead of going through extra steps of eliminating the spaces. Maybe not - maybe most uses for edited / saved CSV files prefer spaces between separators? Very little experience w/ them.

Prgms like Excel, OO Calc do warn (to the effect): "this file (may?) contain content or formatting that can't be saved in text .CSV format..."

Odd that they just opened a CSV file that contained no spaces between separators, but when saving it back to .CSV, they put spaces between all separators??

  • 0

Interesting, jocaaa. I tried older ver of Excel (office 2000), but newest Open Office Calc. May well be operator error (ignorance), but saw nothing in options how to save the file, once edited in spread sheet(s), back to CSV file, whether to include / exclude spaces (or anything else).

When you opened an original CSV file (which assume contained no spaces) in Excel '07, did you do any editing before saving back to CSV (or maybe trying saving w/o doing editing)?

Maybe there are options / settings in Excel that specify properties for saving in CSV format, that exclude spaces?

In your example, assuming when you opened it in Excel, it showed 3 cols & 2 rows in spread sheet. Then when saved back to CSV format file, & you looked at that saved file in a text editor, had no spaces?

  • 0

There was no editing at all, and yes, I checked it in a text editor and a hex editor: no (unwanted) spaces as you describe. I tried several scenarios: first opening an existing .csv file and exporting (save as) it to another .csv file, then entering values manually in Excel and exporting them to .csv. No spaces in either case. In your case, double-check that the string values in your Excel cells don't themselves contain space characters.
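
One way to do that check outside the spreadsheet is a short script that lists every field with leading or trailing whitespace. This is only a sketch: the function name and the sample text are invented for illustration, and it uses Python's standard `csv` module, which keeps a space after a comma as part of the value by default.

```python
import csv
import io

def fields_with_stray_spaces(csv_text):
    # Hypothetical helper: return (row, column, value) for every field
    # that starts or ends with whitespace. Rows and columns count from 1.
    hits = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        for col_no, value in enumerate(row, start=1):
            if value != value.strip():
                hits.append((row_no, col_no, value))
    return hits

# Made-up sample: the second row has a space before "2" and after "3".
sample = "a,b,c\n1, 2,3 \n"
print(fields_with_stray_spaces(sample))
```

An empty result would suggest the spaces are not stored in the cells and are being introduced at a later step.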

  • 0

I can't find anything about "string values w/o space characters" in the options or help files of either Excel '00 or the latest Open Office Calc.

But, think I see what you mean. The key part of my OP you overlooked was,

  Quote
Want to take 1 of the columns (by what ever method), & get it into a text config file in another prgm.

Yes, if I edit it down to 2 cols, then export to CSV, each value * in a row * has no spaces between commas / values.

But if save only ONE column to a CSV file, then try to manipulate it in text editor (like Notepad ++), & unwrap the rows, so it appears like a CSV file, THEN all values have a space between them.

When I open that saved CSV file w/ only 1 col left, in most editors, it initially shows up as 1 col, w/ 1 value in each row (go figure). But, there are obviously hidden spaces after each value, because when unwrap the col, it does separate each value w/ comma, but also has a space.

I'm assuming this is part of the formatting of a CSV file to specify when to make a new row (for multi row files). That was the problem I had - dealing w/ using only 1 col AND getting rid of the spaces between values.
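
A likely explanation, offered as an assumption rather than something confirmed in this thread: a one-column CSV has no commas at all, only a line break after each value, and joining the lines in a text editor typically turns each line break into a space. Skipping the spreadsheet and the editor avoids that step. Below is a minimal sketch using Python's standard `csv` module; the function name, the `skip_header` parameter, and the sample data are all invented for illustration.

```python
import csv
import io

def column_as_list(csv_text, column_index, skip_header=True):
    # Hypothetical helper: pull one column (counting from 0) out of CSV
    # text and join its values with commas and no spaces.
    reader = csv.reader(io.StringIO(csv_text))
    rows = list(reader)
    if skip_header:
        rows = rows[1:]
    # Ignore rows that are too short to contain the requested column.
    values = [row[column_index].strip() for row in rows if len(row) > column_index]
    return ",".join(values)

# Made-up sample with three columns; the middle one is extracted.
sample = "name,value,unit\nA,10,mm\nB,20,mm\nC,30,mm\n"
print(column_as_list(sample, 1))
```

For a real file, the text could be read with `open(path, newline="")` and the resulting line pasted after the command in the other prgm's config file.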

This topic is now closed to further replies.