Recommended Posts

So I woke up late today, around 1300, to find that my Nextcloud instance was down.  I'm hosting it on Debian Bullseye via the regular old tarball manually set up with Apache, MariaDB/MySQL, PHP, etc.  It's been running great for literally years across multiple in-place upgrades to both Nextcloud and Debian.

 

After doing some tinkering it came to my attention that I MySQL was complaining it couldn't connect to the database.  Easy enough I figured, I'll just log into MySQL and see what's wrong.  Upon trying to launch the MySQL shell though it would ask for the password and then error out saying it couldn't connect to the server.

"ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/run/mysqld/mysqld.sock' (111)"

So I thought maybe the .sock file got messed with during an update or something and wasn't being removed properly, so I verified the location of the correct file by looking at the configs, all of which pointed to the same file, and I then deleted that mysqld.sock file and tried restarting MySQL, but still no dice.  I tried rebooting the whole server just for kicks, no luck.

 

I tried reinstalling MariaDB/MySQL but that apparently doesn't get rid of the existing configuration files, so what I ended up doing was apt purge --autoremove on mariadb-server, deleting /var/run/mysql, then reinstalling it and re-importing my most recent database backup (yesterday).  It's just a personal instance with myself, my wife and kids on it, and I've got it scheduled to do daily backups of the database, so it wasn't a huge issue.  What I'm curious about is why it crapped out in the first place.

 

While poking around in syslog I found the following line:

mariadbd[1115]: 2022-01-01 11:55:02 0 [ERROR] [FATAL] InnoDB: You should dump + drop + reimport the table to fix the corruption.


That timestamp is hours after any kind of automatic update/reboot would have taken place.

 

So something crazy happened that corrupted the actual database, but why would that have broken my ability to log into the MySQL shell to try and correct it?  It's saying I should dump and reimport the database, but I couldn't do that without having access to the MySQL shell.

 

I've checked the logs for apt and I don't see any kind of updates that would have been applied by unattended-upgrades; my last automatic update was December 18th.

 

Did anybody else have anything happen today with their database?  I guess it's definitely possible that Nextcloud encountered some kind of bug and corrupted its database.  I've done short SMART tests on all the drives in the system and found no issues, and the server is running on an UPS so there shouldn't have been any kind of power fluctuation or outage to cause any issues.  My UPS is reporting no events since the 17th either.

 

I guess I'm posting all this just to try and fish for thoughts from any of you who may have encountered this kind of thing in the past, or who may have some idea as to what happened.  I've restored a backup and everything is fine, but if there's something I can do to prevent the issue in the future, I'd like to do so.

Link to comment
https://www.neowin.net/forum/topic/1414223-mariadbmysql-took-a-dump-last-night/
Share on other sites

Well, I have a bunch of MySQL / MariaDB 5.x and MariaDB 10.x instances which are all running without issue right now.

 

I've had things like that happen before though. One cause is if the filesystem temporarily ran out of diskspace which can cause a table to require fixing. I've a suspicion that MySQL doesn't behave well if the data filesystem is briefly marked as read-only but it's just a hunch.

 

Table corruption can stop MySQL from starting though. That's a thing unfortunately.

 

Personally I'd recommend enabling the binlog and adding "--master-data=2" to your mysqldump line so that you can recover the database right up to the point where corruption occured. If you backup both the database dump file and the associated binlog files then you're pretty well sorted in terms of data recovery I think.

I just checked my install of MySQL running on Raspbian and all is well. With having to do a complete wipe and restore, the last entries are more than likely gone to see what the last thing that was modified or added/removed. The last time I had any corruption on my setup was testing new additions and was completely my own doing. Have you checked any connection logging to see if any weird connections were seen around the after the last time you knew it was working?

 

@DonC has a great point as that missing data between the last backup could be vital to see what happened.

The log entry immediately prior to the error messages is Nextcloud invoking its cron.php script, so I'm guessing it has something to do with that.  I've made copies of syslog from that timeframe so I may dig into it some more later, but I'm tired of reading logs since everything is back up and working I'll save it for later.

Good luck and I am happy that at least everything is back up and going. Keep us posted if you do dig into this. I am interested to see what you find if you do

On 01/01/2022 at 21:05, Gerowen said:

 

Did anybody else have anything happen today with their database? 

 

chinese hackers

On 01/01/2022 at 22:44, Marujan said:

chinese hackers

I thought about hackers of some sort, but there was no indications of any files missing or modified, no suspicious entries in auth.log, nothing banned by Fail2Ban, etc.  On top of that, all the various services hosted by the server are all hosted by their own non-root user accounts/groups and SSH is not open to the world and enforces public key authentication.  I'm fairly certain it was just some weird-ness with the database during the execution of Nextcloud's cron script.

 

Besides, with only 4 users, outside of some script kiddie who happened across a public share link I've posted somewhere, there's not really any incentive to try and bother my personal server.

Edited by Gerowen

So here's a piece of syslog.  You can see that at 11:50 the cron.php script executes and there are no problems.  5 minutes later it runs again (this is scheduled/expected), and this is where the problems begin.  So in the block of time between 11:50 and 11:55, something screwy happened.  I was asleep at the time, so I personally wasn't doing anything on the server directly, but we've all got cell phones and PCs connected to it all the time, plus I've shared several public links for photo albums and such with family members over Facebook, so even if I wasn't logged in, Nextcloud is constantly doing "something" in the background.

image.thumb.png.b4d6e17c3252dbf55a75302c9e5a5541.png

 

Here's the contents of auth.log for that particular block of time.  Nothing suspicious, root running cron and www-data running Nextcloud's cron.php script.

image.thumb.png.f22bf5f0c8231a95da077253eca4d1af.png

 

The database and the Nextcloud server files are stored on the main system drive which is a Western Digital Blue 2.5" SSD.  The actual data directory (user files) is stored however on a separate, encrypted RAID 5 "storage" partition.  Both drives have plenty of free space available.

image.png.f3fed2caa212019ffd5d112db6aa65d9.png

 

image.png.da1083f5449bd822d948673846af89b9.png

 

I never thought to keep a backup copy of the corrupted database for further inspection but once I got the backup copy up and running I deleted it.  I even had a copy of /var/run/mysql as a backup in the event that purging/re-installing MariaDB didn't fix the issue, but once it was clear everything was working again I deleted it.  But as far as I can tell, everything looks fine.  All I can figure is that I encountered some kind of weird bug/edge case.  I am running an older system.  The "server" originally started out as an old HP Pavilion P6803W tower PC that I bought ages ago.  Since then it has received an upgrade to a 6 core AMD Phenom II processor, 16GB of RAM, a new power supply, new case, etc.  The only original part is the motherboard.  However, all of the hardware in it is old and used, and the RAM isn't ECC, so it's totally possible that there was some sort of a bit flip or some other hardware issue.  I haven't had any issues in the past, but that doesn't mean they can't start, especially since the system has been running basically 24/7/365 for going on a decade now.  The temps have always been in great shape because I put an over-sized 125 watt cooler on a 95 watt chip.

image.png.2f78d8657437b73c344eb786920159d6.png

 

There's no indications of this being any kind of an attack either.  No changes made to my firewall rules, no new packages installed or removed, no modifications to any of my systemd service files, no files apparently tampered with or bothered, nobody banned by Fail2Ban, no unexpected auth attempts or blocked traffic on my firewall, no weird entries in syslog/kern.log, (at least that I've noticed) etc.

 

On the hardware front all the drives check out after running some short SMART tests, but I will see about doing a memtest scan on it at some point just to verify whether there are any issues with the RAM.  I'm gonna hope that it was just a software bug and I don't encounter it again because even though I don't mind replacing the server, I'm kind of attached to the old girl, :p  I will also verify that I don't have any other services hogging up my RAM as well just to be safe.

Edited by Gerowen
added screenshot as evidence of free space

At least from the quick views, nothing looks out of place. Knowing the the hardware is as old as it is could be just a really unfortunately timed hiccup. If the drivs check out and no bad sectors found, my next check would be the ram.

 

Keep up on those backs to be safe and I hope it does not happen again.🤞

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Posts

    • Dragon's Dogma 2: Dark Arisen expansion to bring snowy region, new updates also coming by Pulasthi Ariyasinghe Capcom had a surprise waiting for Dragon's Dogma fans today in the Nintendo Direct presentation. The company revealed an expansion for the second installment with a name that should be familiar to series veterans. Coming later this year, Dragon's Dogma 2: Dark Arisen is promising a massive new region to explore, new monsters, fresh skills to learn, and more. The studio says players will be heading to the Northern region of the world, named Norgan, to find new secrets about an undying "Fallen Dragon." There will be forgotten relics that the protagonist can find to unlock fresh weapons and skills the expansion is introducing. Players will also be able to find mysterious equipment from a previous Arisen as a part of the expansion, all part of 12 Lost Rites Dungeon Challenges they must complete to gain access. In Neowin's own review, I found Dragon's Dogma 2 to be an impressive RPG when it launched back in 2024, giving the title an 8.5/10 for its class variants, companion system, and immersive exploration. "Once a prosperous region of the kingdom of Vermund, it was abandoned many years ago for reasons unknown," says Capcom about the new region. "Long has it been since any soul traveled its paths. Blanketed in heavy snow, these frigid lands are home to savage hordes and creatures of unbelievable power. Those who are capable of vanquishing such fearsome foes, or those who possess a keen eye for exploration, will find themselves rewarded with powerful relics." Dragon’s Dogma 2: Dark Arisen expansion launches on October 9, 2026, with a $29.99 price tag. Ahead of the expansion release, Capcom is also planning to release two free updates to the base game. The first will land tomorrow, June 10, bringing more accessible fast travel with an Eternal Ferrystone and other quality-of-life adjustments. The second update will land sometime in August, aiming to improve frame rates, add more save slots, and bring even more community-requested adjustments. This expanded Dark Arisen edition is also launching on the Nintendo Switch 2 on the same day the content comes to PC, Xbox Series X|S, and PlayStation 5.
    • Classic themes are just the colors on the bar like the olden days, if you use the image themes, it does fancy transparent backgrounds and it makes the elements of the app look like they are transparent bubbles. This sample image shows what it looks like.  
    • Good point, unfortunately. NextDNS has far more filters and workarounds than uBlock, and it's easy to implement.
    • Windows 10 KB5094127 Patch Tuesday improves File Explorer search and more by Taras Buria The June 2026 Patch Tuesday updates are here, bringing mandatory patches to users with PCs enrolled in the Extended Security Update program for Windows 10. Microsoft is rolling out KB5094127, with build numbers 19045.7417 and 19044.7417. Changelog includes the following: [File Explorer] This update improves File Explorer search, including support for Chinese text, and UTF 8–encoded files without a byte order mark (BOM). Text now displays more clearly and consistently across search results, Content view, and tooltips. [Secure Boot] This update enables dynamic status reporting for Secure Boot states in Windows Security App. This update adds a new policy setting, LimitSecureBootRequiredServiceData, under Computer Configuration > Administrative Templates > Windows Components > Secure Boot. When this setting is enabled, Windows limits the Secure Boot service data it sends by suppressing the event normally sent to Microsoft. This policy is also included in the Windows Restricted Traffic Limited Functionality Baseline package. For information about the policy, see Manage connections from Windows 10 and Windows 11 operating system components to Microsoft services. With this update, Windows quality updates include additional high confidence device targeting data, increasing coverage of devices eligible to automatically receive new Secure Boot certificates. Devices receive the new certificates only after demonstrating sufficient successful update signals, maintaining a controlled and phased rollout. As for known bugs, Microsoft has the following to say: A workaround is available in the official documentation. Today's updates are available for PCs enrolled in the Extended Security Updates program only. If your PC is eligible, you can download the update from Settings > Windows Update or from the Microsoft Update Catalog here.
    • Then the solution is to not let children have easy access to smart phones or internet until they are older, not mass surveillance. Only this would require parents to do actual parenting, most likely, as with any good solution to the problem.
  • Recent Achievements

    • Week One Done
      rubentuben8 earned a badge
      Week One Done
    • Week One Done
      ARaclen earned a badge
      Week One Done
    • One Year In
      jojodbn earned a badge
      One Year In
    • One Month Later
      jojodbn earned a badge
      One Month Later
    • Week One Done
      jojodbn earned a badge
      Week One Done
  • Popular Contributors

    1. 1
      +primortal
      522
    2. 2
      PsYcHoKiLLa
      231
    3. 3
      +Edouard
      124
    4. 4
      ATLien_0
      87
    5. 5
      Steven P.
      83
  • Tell a friend

    Love Neowin? Tell a friend!