@VirMach said: if you know you previously had a functional OS, especially don't do it to fix I/O issues, and we'll probably work on preventing people from doing it if the node is facing potential disk anomalies (until they edit the disabled button and do it anyway.)
That seems counter-intuitive to me. I know I had a working system, but one of the disks failed and you fixed that (per network status). I don't really need the old files - I'll just reinstall. I would really expect that this mythical LVM you're talking about, which I have no idea about (I just press buttons in the panel, duh), was fixed along with the disk, and that a reinstall would just work.
Especially when support is overwhelmed and takes weeks to answer, reinstalling the system seems like the 'safe & correct' thing to try before engaging support - it should place me on another disk, or on the fixed one, or something. Nobody knows that SolusVM is a 10+ year old piece of crap that takes years to make sense of.
You can't really be mad at people for trying to reinstall. You can be mad at people spamming reinstall every minute ;')
Yes, but I can't think of any non-rare situation where an OS re-install would help when the OS/VM was previously functional and abruptly stopped functioning to that level, without anything having been done first. Let me know if you can think of any cases where it would.
There's a difference between doing it after something did occur or was announced to occur.
I understand there was a point in time when the network status hadn't been posted yet either, but a portion of these also occurred after it was announced. If there's ever an issue with a server, it's generally a big no-no to use any functions. I'm not blaming anyone though, I get it. It's just something we need to watch out for more in Tokyo, as apparently there will be a high rate of that going on. And of course people can and will do it, and it'll just extend their downtime every time, because logically the node always has to be fixed first before we can move on to the next step of regenerating everyone whose reinstall left them with no LVM volume.
Again, I know it may have been helpful after migrations or after a specific OS didn't install the first time, but that was in response to an event that occurred. And of course a small portion will be those that just needed an OS re-install anyway.
(edit) So yeah, I definitely don't think it's counter-intuitive but I also don't expect people to follow the suggestion. That is the correct suggestion though, kind of like our previous suggestion to not use the Ryzen migrate button if the node is reported as having issues.
@lxm said: @VirMach AZ004 network has become terrible, packet loss
CHI2 as well; it should improve once the swaps are complete. These two locations still need switch configuration changes, so while they have the additional blocks there will be additional packet loss until the old IPs are no longer used.
One additional, but expected, issue: global routing will take a few hours to a day or so to improve.
Bleh, I must wake up before diagnosing issues. The network interface was ens3 instead of eth0, so the auto fix didn't work, and then I made a couple of typos fixing it all for a NYC VPS.
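For anyone else who hits the same thing after a migration: the guest config still pointing at eth0 while the new virtual NIC comes up as ens3 is the usual culprit. A rough sketch for a Debian-style ifupdown setup (device names are examples; check what ip link actually reports on your VM):

    # see what the interface is actually called now
    ip link show

    # option A: point the existing config at the new name
    sed -i 's/\beth0\b/ens3/g' /etc/network/interfaces
    ifdown --force eth0 ; ifup ens3

    # option B: keep the old eth0 naming scheme instead
    # add "net.ifnames=0 biosdevname=0" to GRUB_CMDLINE_LINUX in /etc/default/grub, then:
    update-grub && reboot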
@VirMach Just a note on ATLZ007B.
I can ping the new gateway with the new IP, but when I try to go anywhere else I get:
"From 128.136.83.193 icmp_seq=X Packet filtered"
My experience after the Ryzen migration: almost every time I installed Debian 10 or 11 on those migrated VPS, it worked fine with the initial OS installation but immediately lost the disk after a full OS upgrade and reboot (yes, only apt-get upgrade). I could only spot this pattern after many failed Debian setups on those migrated nodes. In the end almost all my VPS were reinstalled with Ubuntu, which has no issue of losing the disk after the upgrade and reboot. That was the case at least a few weeks ago; it may no longer be the case now, but for reference.
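If anyone wants to narrow that down before risking it again, one cautious approach (just a sketch, since the actual cause here is unknown) is to hold the kernel so the post-upgrade reboot stays on the known-good one, and to check that the rebuilt initramfs still contains the virtio disk driver before rebooting:

    # keep the currently working kernel through the upgrade
    apt-mark hold linux-image-amd64

    apt-get update && apt-get upgrade

    # the initramfs used at next boot should still include the virtio block driver
    lsinitramfs /boot/initrd.img-$(uname -r) | grep -i virtio

    # reboot only once that looks sane; undo later with: apt-mark unhold linux-image-amd64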
It appears that LAXA031 has had its IPs reconfigured, but I still cannot access the control panel for my VM (90s timeout). I manually reconfigured the IP and set it as main a few days before this mass migration. Is this still in limbo or should I create a ticket?
We were able to get some people off but it went offline pretty quickly. I've already requested the DC try to bring it back up and monitoring so we can continue the emergency migrations.
@VirMach
What is wrong with SJCZ005?
It's been a long time, what is your plan for SJCZ005?
I was actually about to post an update on it. And yes planned on moving people off a few days ago, doing that now, and we'll be crediting people for a month or extending service by a month.
Also wanted to update everyone else:
Some of these already sent out as emails, network updates, etc.
Old LAX servers that failed: All have been regenerated on Ryzen. We'll be crediting people 6 months or extending by 6 months. They still need IP addresses assigned, will most likely get that part done by Monday, couldn't get to it today.
LAXA031: Maintenance scheduled with the DC. We thought it had stabilized after maintenance, but there was a miscommunication there; it just happened to stabilize on its own and then ran into issues again around last week. We've scheduled for them to perform the originally intended hardware maintenance, and that should hopefully stabilize it. Crediting people 1 month or extending.
SEAZ009 - Scheduled maintenance, it may fix the issue. Extending/crediting 1 month.
SEAZ010 - This one has been extended because we're having trouble getting DC hands to perform what we requested. It's "only" been about a week now but I'm completely blind on this one at the moment, we asked them to just ship it out to us so we can get the data off and people migrated sooner if they can't perform our request (I asked them to check what's being displayed last and got back "this has been completed.") No auto credit/extension if we can get this one fixed by Monday/Tuesday but you can of course request SLA credits in a ticket.
SJCZ005 - Also covered above. Trying to migrate people now. If that doesn't work I have a final idea/attempt at getting it fixed permanently as well. Migrations will be to LAX as we have limited space in SJC. Also 1 month credit/extension here.
@FrankZ said: @VirMach Just a note on ATLZ007B.
I can ping the new gateway with the new IP, but when I try to go anywhere else I get:
"From 128.136.83.193 icmp_seq=X Packet filtered"
128.136.83.193 is a Flexential IP
So nothing in or out beyond the gateway.
The 8 other IP changes went just fine.
Yeah, the network's being weird, but ATLZ003 was also completely going wacko in multiple ways, so I had to focus most of the time on that and on everything else that had to be completed today. I am seeing bursts out, but I also verified the issue you describe several hours earlier; I didn't have any time to update, especially since I'm trying to figure out the timeline. It does look like it was announced, so it must be a misconfiguration on the network equipment. I swear I cleared it though, so it must have been working, and they also checked on their end; we'll see.
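For reference, the "Packet filtered" in FrankZ's ping output is ICMP "communication administratively prohibited" (destination unreachable, code 13), i.e. something upstream is actively rejecting the traffic rather than silently dropping it. A quick way to see which hop the filter sits at (generic sketch; 1.1.1.1 is just an example target, and mtr may need installing first):

    ping -c 4 1.1.1.1        # the "From x.x.x.x ... Packet filtered" line names the filtering hop
    mtr -rnc 20 1.1.1.1      # report mode; the last responding hop is usually right before the filter
    traceroute -n 1.1.1.1    # same idea if mtr isn't available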
I have VPS on SJC2005 (down) and SJC2008 (older server, working again after being down for months, but idling and expiring soon since I don't need two VPS at that location). I'd rather not get migrated to LAX if there is some hope of SJC2008 becoming reliable.
1) Instead of migrating the SJC2005 vps to LAX could it be transferred to SJC2008, cancelling the existing SJC2008 vps, leaving just one?
2) Rather than service extensions could it be possible to have a resource increase for that remaining vps, from 768MB to 1.5GB?
If either of these vps gets migrated or transferred, it's not necessary to actually transfer the data, if it's easier at your end to just delete the old vps and create a new one that I can reinstall on. Thanks.
@VirMach said: Yeah, the network's being weird, but ATLZ003 was also completely going wacko in multiple ways, so I had to focus most of the time on that and on everything else that had to be completed today.
Ah the "remote hands with no brains attached" thing again. I fled Dallas but my Atlanta one is working well usually. And I don't need more in NY lol
This has been an interesting adventure to see just how badly some DC and remote hands setup can be. Pretty sure Virmach would describe it in slightly less interesting and more rage inducing terms though.
@Daevien said:
This has been an interesting adventure to see just how badly some DC and remote hands setup can be. Pretty sure Virmach would describe it in slightly less interesting and more rage inducing terms though.
Most remote hands techs are idiots across the board. If you need their help often or for something, most of the time it's cheaper to fly to the DC yourself or just to hire contractors. I can think of maybe 10 experiences I've had with datacenter hands that I'd consider positive, out of hundreds of interactions.
In positive news, I see that the new IP address for my Buffalo (NYC?) VPS seems to not be blacklisted by Comcast. I hope it keeps working.
Also, regarding the earlier post: I should add that it's OK (and expected) if the 2 year renewal cost goes to $40. I can't edit that post any more to mention this.
Followup to the above: I spoke too soon, Comcast blocks the new Buffalo IP too. It's annoying. It is one of the old 384MB specials and I have a small personal website on it. I guess I can try to migrate it to another location? I'm not sure if that is supposed to work.
Does anyone know the current state of VPSShared? That's where my site was before, but it stopped working for a while due to the shared IP being null routed. I saw in the control panel that the IP changed after a while, but the new one doesn't work either.
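(Side note, in case it helps anyone checking a newly assigned IP: Comcast does its own filtering, so there's no authoritative public lookup, but a quick DNSBL query at least tells you whether the address sits on the common public blocklists. Rough sketch, reverse the octets of your own IP; 192.0.2.45 is just a placeholder:)

    dig +short 45.2.0.192.zen.spamhaus.org
    dig +short 45.2.0.192.bl.spamcop.net
    # any 127.0.0.x answer means listed; an empty result / NXDOMAIN means not listed there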
@fluttershy said:
Most remote hands techs are idiots across the board. If you need their help often or for something, most of the time it's cheaper to fly to the DC yourself or just to hire contractors. I can think of maybe 10 experiences I've had with datacenter hands that I'd consider positive, out of hundreds of interactions.
True I guess, I haven't used them many times personally, usually just depending on either my own ipkvm or getting one connected temp.
When I think about it, I've probably been the remote hands to help out someone more times than I've used them myself
@willie said:
Followup to above: I spoke too soon, Comcast blocks the new Buffalo IP too. It's annoying. It is one of the old 384MB specials and I have a small personal website on it. I guess I can try to migrate it to another location? I'm not sure if that is supposed to work.
Comcrap just flat out blocks the IP entirely? That's pretty bad even for them.
Does anyone know the current state of VPSShared? That's where my site was before, but it stopped working for a while due to the shared IP being null routed. I saw in the control panel that the IP changed after a while, but the new one doesn't work either.
Last I remember was an update about the changeover in IP and I thought it was working again, I don't use them myself.
@willie said:
I have VPS on SJC2005 (down) and SJC2008 (older server, working again after being down for months, but idling and expiring soon since I don't need two VPS at that location). I'd rather not get migrated to LAX if there is some hope of SJC2008 becoming reliable.
1) Instead of migrating the SJC2005 vps to LAX could it be transferred to SJC2008, cancelling the existing SJC2008 vps, leaving just one?
2) Rather than service extensions could it be possible to have a resource increase for that remaining vps, from 768MB to 1.5GB?
If either of these vps gets migrated or transferred, it's not necessary to actually transfer the data, if it's easier at your end to just delete the old vps and create a new one that I can reinstall on. Thanks.
SJC is full because SJCZ004 is also having more serious issues, so we used up the remaining space there. It hasn't exactly been smooth sailing for that location, so there's only so much room we have left to move things around within SJC.
SJCZ005 is technically more salvageable at the moment. It has to do with a loose or malfunctioning cable. It was hard to diagnose for a while because of the sporadic manner in which it manifested itself, but I finally was able to catch it. We did have maintenance done on this node before, so perhaps when they opened it up to fix the previous issues a cable shifted and it worked fine (until it didn't.) A resource increase would be too complicated to do in bulk, but I'll do it for you specifically (@willie) since you've been stuck with us through the worst string of luck ever.
@Daevien said: This has been an interesting adventure to see just how badly some DC and remote hands setup can be. Pretty sure Virmach would describe it in slightly less interesting and more rage inducing terms though.
Yeah I had a lot more written but realized I was going off on a weird upset tangent so I cut it out. There's a reason I'm unsurprised that such a thing would happen in Atlanta, let's just leave it there.
It runs as well as my new 10gb nic does. Oh sorry, my new 10gb paperweight. sigh Here's hoping the others incoming work better lol
LAXA014 was so close to being done around Thursday/Friday. Then it crapped out mid-migration and the SolusVM migration feature fully malfunctioned as a result, which made it more difficult because we were doing it in batches rather than with a bulk script, since we knew it was unstable. So the first batch was about 70% done, then it crashed and made the system lock up. I fixed it in every way we knew possible, but now we're facing another weird bug with the migration tool, so we have to plan out semi-bulk manual batches instead. That complicated it enough that, with everything else going on, it definitely didn't get done by Friday.
@willie said:
Followup to above: I spoke too soon, Comcast blocks the new Buffalo IP too. It's annoying. It is one of the old 384MB specials and I have a small personal website on it. I guess I can try to migrate it to another location? I'm not sure if that is supposed to work.
Does anyone know the current state of VPSShared? That's where my site was before, but it stopped working for a while due to the shared IP being null routed. I saw in the control panel that the IP changed after a while, but the new one doesn't work either.
The null thing... oh god, I don't want to even get into it but I'll just say one thing: we tried. Many times.
The new one shouldn't get nullrouted like that. The IP change for those also went through, and there shouldn't be any problems for a long time. We're just still setting up rDNS for mailing, but other than that everything looks good, and if it doesn't it'll be an easy fix, since it isn't facing any weird hardware issues or anything else as of right now.
@willie said: Followup to above: I spoke too soon, Comcast blocks the new Buffalo IP too. It's annoying. It is one of the old 384MB specials and I have a small personal website on it. I guess I can try to migrate it to another location? I'm not sure if that is supposed to work.
I have no idea what's going on with Comcast recently but I've had a ton of issues with them and routing from LAX office to pretty much everywhere. What's the IP start with?
Old LAX nodes that had catastrophic failure are getting their new IPs now. You can manually change the main IP to get it working sooner. I may be able to get the script working to do the main IP swaps for them but if something goes wrong that'll get done tomorrow.
(And the credits/extensions will be done tomorrow, it requires a lot of DB queries that I don't want to try to remember right now while I'm doing a lot of other things, sorry.)
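For anyone doing the manual main-IP change before the script runs: it's the usual static-IP edit inside the guest. A sketch for a Debian/ifupdown layout with placeholder values; substitute the new IP, gateway and interface name shown in the panel:

    # /etc/network/interfaces -- example values only
    auto eth0
    iface eth0 inet static
        address 203.0.113.25
        netmask 255.255.255.0
        gateway 203.0.113.1

    # or apply it non-persistently first to test:
    ip addr add 203.0.113.25/24 dev eth0
    ip route replace default via 203.0.113.1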
NYC is working now, updating network status after confirming all nodes.
So far all IP changes went smoothly on all affected servers.
TYOC040 can't boot a Debian 11 custom install installed via netboot.xyz.
It boots to the GRUB command line.
The Debian 11 template is fine.
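If it drops to a bare grub> prompt, you can sometimes boot the installed system by hand and then repair GRUB from inside it. A generic sketch; the partition and kernel file names are examples, use ls at the prompt to find the real ones:

    grub> ls                              # lists disks/partitions, e.g. (hd0) (hd0,msdos1)
    grub> ls (hd0,msdos1)/boot            # find the actual vmlinuz / initrd.img names
    grub> set root=(hd0,msdos1)
    grub> linux /boot/vmlinuz-5.10.0-16-amd64 root=/dev/vda1 ro
    grub> initrd /boot/initrd.img-5.10.0-16-amd64
    grub> boot

    # once booted, make it stick:
    update-grub && grub-install /dev/vda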
TYOC026 failure occurred: disk I/O error.
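For what it's worth, a quick way to confirm from inside the guest whether the virtual disk is actually throwing I/O errors, and whether the root filesystem got remounted read-only as a result (rough sketch):

    dmesg -T | grep -iE 'i/o error|blk_update_request|ext4.*error'
    grep ' / ' /proc/mounts     # "ro," in the options means the root fs has gone read-only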
LAXA014 stay strong 🥲
DAL008 and AZ004 network still not fixed
Eagerly waiting for it