Cron Task to Download CSV and Dump Into MySQL


Recommended Posts

Hey,
Ok..Trying to wrap my head around this..
Basically, what I'm trying to do is every thirty minutes, download a csv file, and then dump it into MySQL database, so that I can view the results from my PHP website...
I haven't tested this yet (Never used a cron before), but from my searching, to setup the cron, I would use:
 

0,30 * * * * wget -O - http://example.com/test.csv

But..Once I have the csv file..I can't seem to figure out what to do with it...Maybe I'm just tired, but I have no idea how to get it into the database...A push in the right direction would help a lot..

 

(In case it matters, using Debian, everything up-to-date)

 

Thanks!

You would then run a sql query via MySQL command line. So you would set up another cron to run a minute or an hour later. Depending on how long it takes the csv to download. So you figure out the time it takes to download the csv and then run a bash script at that time interval. I Think it's cleaner to point cron to a bash script that runs the sql query instead of inside the cron file.

Look up MySQL queries view command line.

Example

$ mysql -u vivek -p -e 'SELECT COUNT(*) FROM quotes' cbzquotes

Source: http://www.cyberciti.biz/faq/run-sql-query-directly-on-the-command-line/

So you would just run an import into the table your php script parses for the data.

A bash script..That's a good idea...Never used one of those either, I'll have to read up on those, thank for the suggestion!

And I should have figured out MySQL from command line...Think I've been staring at lines of code for too long today..Think I was partly trying to over-complicate things...lol Never liked it too much though...I still need to get more comfortable with command line.

But I'll look into the bash script idea, thanks!

Bash scripts are easy and wonderful. Throw a quick file together and name it .sh. Make it executable by running chmod +x filename.sh

Example:

#!/bin/sh

MySQL -u user -p password -h localhost -e 'query'

Put the above 2 lines in a file, save it and have cron run the file.

Same cron entry like above with obviously different interval. The command will be /path/to/script.sh

Cron run it as root so you don't have to worry about permissions just make sure it's executable.

Thanks :)

I had thought about learning about Bash scripts when I was trying to learn Linux, but could never think of a use for it, so never got around to it lol Looks fairly straight forward..Hopefully this weekend I'll have time to play around with it, I'll let you know how it goes!

Ok, finally had a chance to work on this..Had to work on something else for the project quickly.

Anyways, I'm getting commands not found...

#!/bin/sh

wget "http://www.example.com/test.csv"

sleep 40

MySQL -u user -p password

use databseName

load data local infile 'table.csv' into table tableName
IGNORE 1 LINES
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
(column1, column2, column3, column4, column5, column6, column7)

And I get this returned:

/var/www/html/scripts/test.sh: line 7: MySQL: command not found
/var/www/html/scripts/test.sh: line 9: use: command not found
/var/www/html/scripts/test.sh: line 11: load: command not found
/var/www/html/scripts/test.sh: line 12: IGNORE: command not found
/var/www/html/scripts/test.sh: line 13: fields: command not found
/var/www/html/scripts/test.sh: line 14: enclosed: command not found
/var/www/html/scripts/test.sh: line 15: lines: command not found
/var/www/html/scripts/test.sh: line 16: column1,: command not found

It seemed to have downloaded the file..However when I check /var/www/html/scripts/, the csv file isn't there..

But still not sure what to do about the commands not found..What did I do wrong..?

Thanks! Using something similar..(I had quotes in there, so changed it to QUERY_INPUT)

#!/bin/sh

wget -O table.csv "http://www.example.com/table.csv"

sleep 40

mysql -u<user> -p<password> dataBase <<QUERY_INPUT

load data local infile 'table.csv' into table testing
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
IGNORE 1 ROWS
(column1, column2, column3, column4, column5, column6, column7)
QUERY_INPUT

the 'mysql' part needed to be lowercase as well.

That's the final file for anyone who needs it.

The 'sleep 40' is there to make sure the file downloads before loading MySQL.

'Ignore 1 Rows' is to ignore the first row

 

Thanks for the help!

 

Marking Sikh's as the answer, however thank you Haggis as well!

Didn't get notifications for this, gurrrr. Thanks haggis for helping out.

 

Glad you figured it out. I would make the sleep a minute or two just incase mysql gets tied up for some reason. Its always good to overestimate then it is to underestimate because if you don't get email notifications that your script is running or failed to run, the next time you look at it, it might have not been working for 2 weeks or 2 days.

 

So I recommend up'ing the sleep

This topic is now closed to further replies.
  • Posts

    • I use two of these in RAID0 for video games and other things, together they are capable of 2.8 million IOPS and 15 GB/s on Gen4. At this price, 4 TB of Gen4 is faster and less expensive than a single 2 TB Gen5 NVMe, not to mention easier to cool off. Highly recommend.
    • that is a normal sign in, they just put in a dumb location to try to hide it... and yes I think the whole MS account by default is BS too, the first question should be do you want an online profile or a local one
    • From our past comments, it looks like some ppl are defo enjoying these stories.
    • You are missing the point, we should not have to do that. Should have a do you want an MS account option on the normal sign in.
    • Hynix 2TB Platinum P41 NVMe Gen4 2280 SSD is back at a mouth-watering price by Sayan Sen If you are in the market for an SSD, especially an M.2 2280 NVMe drive, then take a look at the Platinum P41 from SK hynix, which is currently back to its lowest price in the last six or so months (purchase link down below). This is a Gen4 SSD that promises sequential reads and writes of up to 7000MB/s and 6500MB/s, respectively. Meanwhile, the sequential read and write speeds are rated at up to 1400K IOPS and 1300K IOPS, respectively. The 2TB variant of the P41 features a 2GB DDR4 DRAM cache as well, which is meant to improve write caching and random access times with quicker metadata look-ups. Hynix says that the drive can operate at temperatures of up to 70 °C and thus you should opt for a heatsink if you intend to do data transfers for longer periods at a time. That should not be a problem, as the P41 Platinum is based on 176-layer TLC NAND and has a rated endurance of up to 1200 TBW (terabytes written) for 2TB. Make sure to update the firmware for the Platinum P41, as it fixes a throttling issue wherein the SSD would drop speeds after being used for several months (via Lower-Tone-3503 on Reddit). SSD firmware updates can often resolve critical problems, like in the case of WD/SanDisk SSDs, for example, which are still blocking Windows 11 24H2 due to inappropriate firmware. Get the SK hynix P41 at the link below: SK hynix Platinum P41 M.2 2280 NVMe SSD without heatsink: $129.99 + $10 off w/ promo code SACET23323, limited offer => $119.99 (Newegg US) You can also check out the other SSD deals we have covered in these articles here. This Amazon deal is US-specific and not available in other regions unless specified. If you don't like it or want to look at more options, check out the Amazon US deals page here. Get Prime (SNAP), Prime Video, Audible Plus or Kindle / Music Unlimited. Free for 30 days. As an Amazon Associate, we earn from qualifying purchases.
  • Recent Achievements

    • Dedicated
      tesla maxwell earned a badge
      Dedicated
    • Dedicated
      Camlann earned a badge
      Dedicated
    • Week One Done
      fredss earned a badge
      Week One Done
    • Dedicated
      fabioc earned a badge
      Dedicated
    • Week One Done
      GoForma earned a badge
      Week One Done
  • Popular Contributors

    1. 1
      +primortal
      629
    2. 2
      Michael Scrip
      224
    3. 3
      ATLien_0
      219
    4. 4
      +FloatingFatMan
      142
    5. 5
      Xenon
      134
  • Tell a friend

    Love Neowin? Tell a friend!