So I woke up late today, around 1300, to find that my Nextcloud instance was down.  I'm hosting it on Debian Bullseye via the regular old tarball manually set up with Apache, MariaDB/MySQL, PHP, etc.  It's been running great for literally years across multiple in-place upgrades to both Nextcloud and Debian.

 

After doing some tinkering it came to my attention that MySQL was complaining it couldn't connect to the database.  Easy enough I figured, I'll just log into MySQL and see what's wrong.  Upon trying to launch the MySQL shell though it would ask for the password and then error out saying it couldn't connect to the server.

"ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/run/mysqld/mysqld.sock' (111)"

So I thought maybe the .sock file got messed with during an update or something and wasn't being removed properly, so I verified the location of the correct file by looking at the configs, all of which pointed to the same file, and I then deleted that mysqld.sock file and tried restarting MySQL, but still no dice.  I tried rebooting the whole server just for kicks, no luck.

 

I tried reinstalling MariaDB/MySQL but that apparently doesn't get rid of the existing configuration files, so what I ended up doing was apt purge --autoremove on mariadb-server, deleting /var/run/mysql, then reinstalling it and re-importing my most recent database backup (yesterday).  It's just a personal instance with myself, my wife and kids on it, and I've got it scheduled to do daily backups of the database, so it wasn't a huge issue.  What I'm curious about is why it crapped out in the first place.

 

While poking around in syslog I found the following line:

mariadbd[1115]: 2022-01-01 11:55:02 0 [ERROR] [FATAL] InnoDB: You should dump + drop + reimport the table to fix the corruption.


That timestamp is hours after any kind of automatic update/reboot would have taken place.

 

So something crazy happened that corrupted the actual database, but why would that have broken my ability to log into the MySQL shell to try and correct it?  It's saying I should dump and reimport the database, but I couldn't do that without having access to the MySQL shell.
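For anyone hitting the same ERROR 2002, here is a minimal sketch of how to tell the cases apart. The "(111)" in the message is the Linux errno for "connection refused", which means nothing was listening on the socket, so the client never got as far as checking a password. If the server process aborts during startup (for example on a fatal InnoDB error), there is no listener at all. The function name `probe_unix_socket` is made up for illustration; the path is the one from the error message above.

```python
import errno
import os
import socket


def probe_unix_socket(path: str) -> str:
    """Classify a Unix socket path as missing, refused, or listening."""
    if not os.path.exists(path):
        # No socket file at all: the server never created it or it was deleted.
        return "missing"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # File exists but no process is accepting connections (errno 111 on Linux).
            return "refused"
        return f"error: {exc.strerror}"
    finally:
        sock.close()
    return "listening"


if __name__ == "__main__":
    # Path taken from the ERROR 2002 message; adjust if your config differs.
    print(probe_unix_socket("/run/mysqld/mysqld.sock"))
```

A result of "missing" or "refused" suggests the problem is the server process itself rather than the client or credentials, so the server's own log is the next place to look.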

 

I've checked the logs for apt and I don't see any kind of updates that would have been applied by unattended-upgrades; my last automatic update was December 18th.

 

Did anybody else have anything happen today with their database?  I guess it's definitely possible that Nextcloud encountered some kind of bug and corrupted its database.  I've done short SMART tests on all the drives in the system and found no issues, and the server is running on a UPS so there shouldn't have been any kind of power fluctuation or outage to cause any issues.  My UPS is reporting no events since the 17th either.

 

I guess I'm posting all this just to try and fish for thoughts from any of you who may have encountered this kind of thing in the past, or who may have some idea as to what happened.  I've restored a backup and everything is fine, but if there's something I can do to prevent the issue in the future, I'd like to do so.

Well, I have a bunch of MySQL / MariaDB 5.x and MariaDB 10.x instances which are all running without issue right now.

 

I've had things like that happen before though. One cause is the filesystem temporarily running out of disk space, which can leave a table in need of repair. I also have a suspicion that MySQL doesn't behave well if the data filesystem is briefly marked as read-only, but that's just a hunch.
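A rough sketch of a free-space check that could run from cron, in case it helps anyone rule this out. The helper names and the 10% threshold are arbitrary choices for illustration, and `/var/lib/mysql` is only the default Debian data directory, so treat it as an assumption and substitute your own.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as full.
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path: str, threshold: float = 0.10) -> bool:
    """True when less than `threshold` of the filesystem is free."""
    return free_fraction(path) < threshold


if __name__ == "__main__":
    data_dir = "/var/lib/mysql"  # assumed default data directory
    if is_low_on_space(data_dir):
        print(f"WARNING: less than 10% free on the filesystem holding {data_dir}")
```

This only catches a shortage at the moment it runs. A brief spike between runs would still go unnoticed, so it complements the logs rather than replacing them.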

 

Table corruption can stop MySQL from starting though. That's a thing unfortunately.

 

Personally I'd recommend enabling the binlog and adding "--master-data=2" to your mysqldump line so that you can recover the database right up to the point where corruption occurred. If you back up both the database dump file and the associated binlog files then you're pretty well sorted in terms of data recovery I think.
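As a sketch of what that backup line might look like when wrapped in a script: the function names, the `root` user, and the output path are placeholders, and credentials are assumed to come from an option file such as `~/.my.cnf`. `--master-data=2` writes the binlog file name and position into the dump as a comment, and it only works if binary logging (`log_bin`) is already enabled on the server. `--single-transaction` is an extra I've added so InnoDB tables are dumped from a consistent snapshot.

```python
import subprocess
from pathlib import Path


def build_dump_command(user: str = "root") -> list[str]:
    """Build a mysqldump command line that records the binlog position."""
    return [
        "mysqldump",
        f"--user={user}",
        "--all-databases",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        "--master-data=2",       # binlog coordinates written as a comment
    ]


def run_backup(dest: Path, user: str = "root") -> None:
    """Run the dump and write it to `dest`; raises if mysqldump fails."""
    with dest.open("wb") as out:
        subprocess.run(build_dump_command(user), stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical destination; point this at your real backup location.
    run_backup(Path("/tmp/all-databases.sql"))
```

To recover past the dump, you would restore the dump first and then replay the binlog files from the recorded position with `mysqlbinlog`, stopping just before the event that caused the damage.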

I just checked my install of MySQL running on Raspbian and all is well. After a complete wipe and restore, the last entries that would show what was most recently modified, added, or removed are more than likely gone. The last time I had any corruption on my setup was while testing new additions, and it was completely my own doing. Have you checked any connection logging to see if any weird connections showed up around or after the last time you knew it was working?

 

@DonC has a great point as that missing data between the last backup could be vital to see what happened.

The log entry immediately prior to the error messages is Nextcloud invoking its cron.php script, so I'm guessing it has something to do with that.  I've made copies of syslog from that timeframe so I may dig into it some more later, but I'm tired of reading logs, and since everything is back up and working I'll save it for later.
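For digging through a saved copy of syslog later, a small filter like this can pull out just the MariaDB error lines. It is a sketch: the regular expression is modelled on the single log line quoted earlier in the thread, so other line formats may not match, and the file name in the usage part is hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class ServerError(NamedTuple):
    pid: int
    timestamp: str
    message: str


# Matches lines such as:
# mariadbd[1115]: 2022-01-01 11:55:02 0 [ERROR] [FATAL] InnoDB: ...
_PATTERN = re.compile(
    r"mariadbd\[(?P<pid>\d+)\]: "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ "
    r"\[ERROR\] (?P<msg>.*)"
)


def find_errors(lines: Iterable[str]) -> Iterator[ServerError]:
    """Yield one ServerError per mariadbd [ERROR] line."""
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            yield ServerError(
                int(match["pid"]), match["ts"], match["msg"].strip()
            )


if __name__ == "__main__":
    # Hypothetical file name for the saved copy of syslog.
    with open("syslog.copy", encoding="utf-8", errors="replace") as handle:
        for err in find_errors(handle):
            print(err.timestamp, err.message)
```

Sorting the output by timestamp and reading the first few errors is usually more useful than the last one, since the final "dump + drop + reimport" message is the consequence rather than the cause.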

Good luck, and I am happy that at least everything is back up and going. Keep us posted if you do dig into this. I am interested to see what you find if you do.

On 01/01/2022 at 21:05, Gerowen said:

Did anybody else have anything happen today with their database?

chinese hackers

On 01/01/2022 at 22:44, Marujan said:

chinese hackers

I thought about hackers of some sort, but there were no indications of any files missing or modified, no suspicious entries in auth.log, nothing banned by Fail2Ban, etc.  On top of that, the various services hosted by the server all run under their own non-root user accounts/groups, and SSH is not open to the world and enforces public key authentication.  I'm fairly certain it was just some weirdness with the database during the execution of Nextcloud's cron script.

 

Besides, with only 4 users, outside of some script kiddie who happened across a public share link I've posted somewhere, there's not really any incentive to try and bother my personal server.

Edited by Gerowen

So here's a piece of syslog.  You can see that at 11:50 the cron.php script executes and there are no problems.  5 minutes later it runs again (this is scheduled/expected), and this is where the problems begin.  So in the block of time between 11:50 and 11:55, something screwy happened.  I was asleep at the time, so I personally wasn't doing anything on the server directly, but we've all got cell phones and PCs connected to it all the time, plus I've shared several public links for photo albums and such with family members over Facebook, so even if I wasn't logged in, Nextcloud is constantly doing "something" in the background.

[Screenshot: syslog excerpt covering the 11:50 and 11:55 cron.php runs]

 

Here's the contents of auth.log for that particular block of time.  Nothing suspicious, root running cron and www-data running Nextcloud's cron.php script.

[Screenshot: auth.log for the same block of time]

 

The database and the Nextcloud server files are stored on the main system drive, which is a Western Digital Blue 2.5" SSD.  The actual data directory (user files), however, is stored on a separate, encrypted RAID 5 "storage" partition.  Both drives have plenty of free space available.

[Two screenshots, including one showing free space on the drives]


I never thought to keep a copy of the corrupted database for further inspection; once I got the backup copy up and running I deleted it.  I even had a copy of /var/run/mysql as a backup in the event that purging/re-installing MariaDB didn't fix the issue, but once it was clear everything was working again I deleted that too.  As far as I can tell, everything looks fine.  All I can figure is that I encountered some kind of weird bug/edge case.

I am running an older system.  The "server" originally started out as an old HP Pavilion P6803W tower PC that I bought ages ago.  Since then it has received an upgrade to a 6 core AMD Phenom II processor, 16GB of RAM, a new power supply, new case, etc.  The only original part is the motherboard.  However, all of the hardware in it is old and used, and the RAM isn't ECC, so it's totally possible that there was some sort of a bit flip or some other hardware issue.  I haven't had any issues in the past, but that doesn't mean they can't start, especially since the system has been running basically 24/7/365 for going on a decade now.  The temps have always been in great shape because I put an over-sized 125 watt cooler on a 95 watt chip.

[Screenshot attached in the original post]

 

There are no indications of this being any kind of an attack either.  No changes made to my firewall rules, no new packages installed or removed, no modifications to any of my systemd service files, no files apparently tampered with or bothered, nobody banned by Fail2Ban, no unexpected auth attempts or blocked traffic on my firewall, and no weird entries in syslog/kern.log (at least that I've noticed).

 

On the hardware front all the drives check out after running some short SMART tests, but I will see about doing a memtest scan on it at some point just to verify whether there are any issues with the RAM.  I'm gonna hope that it was just a software bug and I don't encounter it again because even though I don't mind replacing the server, I'm kind of attached to the old girl, :p  I will also verify that I don't have any other services hogging up my RAM as well just to be safe.

Edited by Gerowen
added screenshot as evidence of free space

At least from the quick views, nothing looks out of place. Given that the hardware is as old as it is, it could be just a really unfortunately timed hiccup. If the drives check out and no bad sectors are found, my next check would be the RAM.

Keep up on those backups to be safe and I hope it does not happen again.🤞
