PM41 issues

Ask any Prometeus-related question!
S_xda
Posts: 2
Joined: Wed Feb 13, 2013 8:14 am

PM41 issues

Post by S_xda » Tue Apr 02, 2013 3:31 am

I host a busy community forum on PM41.

I appreciate that you took quick action and restored from a 14-hour-old backup, but I had newer backups (I take them hourly), and they are now stale because of the posts and threads made after the restore. I couldn't do anything at the time because I was asleep.

Anyway, PM41 (previously PM31) has been really unstable. There have been spurious reboots and downtime, all attributed to the vSwap instability. I miss the legendary stability and uptime Prometeus is known for.

Oh, and you guys have incorrect cookie settings on this board, which is why persistent logins don't work. To fix it: ACP -> General tab -> Server Configuration -> Cookie settings -> set the cookie domain to board.prometeus.net
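
For anyone wondering why a wrong cookie domain breaks persistent logins: the browser only sends a cookie back when the host it is talking to matches the cookie's domain. Below is a minimal sketch of that matching rule, loosely following the domain-match idea in RFC 6265. The function name and the example.com value are made up for illustration (the thread does not say what the board's cookie domain was actually set to), and the IP-address special case is left out.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Rough check of whether a browser would send a cookie to `host`.

    Simplified sketch: case-insensitive, ignores a leading dot on the
    cookie domain, and does not handle IP-address hosts.
    """
    host = host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # Either an exact match, or host is a subdomain of the cookie domain.
    return host == domain or host.endswith("." + domain)


board = "board.prometeus.net"

# Cookie domain set as suggested in the post: the cookie is sent back.
print(domain_matches(board, "board.prometeus.net"))

# Hypothetical leftover domain from a proof-of-concept host: the cookie
# is never sent back to the board, so the login does not persist.
print(domain_matches(board, "vps.example.com"))
```

With a mismatched domain the login cookie is simply never returned to the board, which would look exactly like "remember me" not working.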

Admin
Site Admin
Posts: 490
Joined: Wed Jul 25, 2012 10:54 pm

Re: PM41 issues

Post by Admin » Wed Apr 03, 2013 7:19 pm

Hello!

pm41 is not the former pm31. Containers have been migrated here, but it is a different machine (although the specs are the same).
It seems the motherboard had a hidden defect that eventually cut power to the RAID array in an uncontrolled fashion, creating a lot of errors. It will be sent in under warranty since it is almost new.
There was one previous reboot on pm41 caused by the same power problem, though it was not evident at the time, since the restart fixed it and the errors were repaired successfully.
We have months of stability on many servers with OVZ. There is an unknown combination of factors that makes it lock up from time to time, perhaps related to the software being run, but we have not been able to find the cause yet. What we do know is that the .18 (burst) kernels are much more stable, but they lack important features for modern apps, so we decided that a few reboots across our server farm are worth the extra set of features.
KVM, VMware and Xen hosts were never rebooted (except for one Xen server that was shut down by human error instead of the neighboring machine).
pm19 (KVM, 24 cores, 64 GB RAM):
[root@pm19 ~]# uptime
19:07:38 up 314 days, 5:21, 1 user, load average: 5.00, 4.47, 4.33
pm13 (Xen, similar specs):
[root@pm13 ~]# uptime
19:09:40 up 263 days, 7:04, 1 user, load average: 0.00, 0.03, 0.00
I can probably find higher uptime on VMware.
There are reasons why OpenVZ is cheaper. We also have OVZ biz plans, which run on special small servers with 16 GB RAM where heavy games and DDoS-prone services are not allowed, and these have similar uptime:
[root@pm17 ~]# uptime
19:14:03 up 173 days, 19:22, 2 users, load average: 0.16, 0.31, 0.37
Even some regular OVZ SSD servers have good uptime, some since launch:
[root@pm33 ~]# uptime
19:16:15 up 82 days, 14:48, 1 user, load average: 3.69, 3.68, 3.37
We must find a way to offer service to gamers too, on a platform that is cheap and as stable as possible given the usage; some compromises have to be made.
Bottom line: if you want great stability and clockwork performance, go with Xen/KVM. If you want a good price and plenty of resources, go with OVZ.
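
For anyone who wants to compare the uptime figures quoted above, here is a small sketch that parses those lines and ranks the nodes. The sample strings are copied from this post; the regex and the function name are my own illustration and only handle the "N days, H:MM" form of uptime output, not the shorter forms shown for hosts that have been up less than a day.

```python
import re

# Uptime lines as quoted in this post (node name -> output line).
SAMPLES = {
    "pm19": "19:07:38 up 314 days, 5:21, 1 user, load average: 5.00, 4.47, 4.33",
    "pm13": "19:09:40 up 263 days, 7:04, 1 user, load average: 0.00, 0.03, 0.00",
    "pm17": "19:14:03 up 173 days, 19:22, 2 users, load average: 0.16, 0.31, 0.37",
    "pm33": "19:16:15 up 82 days, 14:48, 1 user, load average: 3.69, 3.68, 3.37",
}

# Matches "up <days> days, <hours>:<minutes>," only.
UPTIME_RE = re.compile(r"up\s+(\d+)\s+days?,\s+(\d+):(\d+),")


def uptime_minutes(line: str) -> int:
    """Return the uptime in minutes for a 'N days, H:MM' style line."""
    match = UPTIME_RE.search(line)
    if match is None:
        raise ValueError(f"unsupported uptime format: {line!r}")
    days, hours, minutes = (int(part) for part in match.groups())
    return (days * 24 + hours) * 60 + minutes


# Nodes ordered from longest to shortest uptime.
ranked = sorted(SAMPLES, key=lambda node: uptime_minutes(SAMPLES[node]), reverse=True)
for node in ranked:
    print(node, uptime_minutes(SAMPLES[node]) // (24 * 60), "days")
```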

The cookie is my fault: I set the board up on my VPS as a proof of concept, uncle redirected the domain and I forgot to adjust the cookie... Thanks!

S_xda
Posts: 2
Joined: Wed Feb 13, 2013 8:14 am

Re: PM41 issues

Post by S_xda » Sat Apr 13, 2013 7:06 pm

Thanks for your reply. I'm considering going with a business plan and maybe setting up some sort of HA system between the two VPSes.

Admin
Site Admin
Posts: 490
Joined: Wed Jul 25, 2012 10:54 pm

Re: PM41 issues

Post by Admin » Mon Apr 15, 2013 6:06 pm

We plan to offer a real HA cloud, since these problems tend to multiply as the number of nodes grows.
We have yet another RAID rebuilding now, and another migration will be needed on another node just to be safe. If this continues, we will discontinue local storage plans and offer only HA ones.
