Multiplayer & Multiplayer Lobby improvements in Unofficial Patch 1.5 after February 2021

Dr.MonaLisa
High Representative
Posts: 8041
Joined: 17 Jun 2010, 11:21
Location: Poland


Post by Dr.MonaLisa »

Hi there. As everyone knows, I like to keep things transparent and to explain in depth every change and feature available in Unofficial Patch 1.5. This helps the community understand the issues and learn about their root causes. Below I'll try to explain the important multiplayer improvements that have been available to UP1.5 users since February 2021.

Minor Update 158006

* Fix for the "empty slot" bug, which caused a drastic multiplayer game slowdown whenever it occurred.

Available in [158006 - 2021-02-02] https://www.ee2.eu/patch/changes/#158006
— Fixed a bug (present since ver 1.0), where a multiplayer game could drastically slow down:
* It usually occurred when a player (who was in a slot between other players) left the hosted game room before the game started. Multiplayer users worked around this bug by asking players to re-join, but that didn't always work, because the EE2ENet thread has its own slots, unrelated to the slots we see in the game room.
* It could occasionally occur when any player quit the active game.
* The fix skips checking the number of remaining players, in a function which calculates the maximum ping (because it used to ignore the other active EE2ENet slots), and verifies if the slot is correct using a different method.
This bug needs further explanation, along with details of how my fix works.
1. The bug was present since ver. 1.0. It also wasn't fixed in EE2: AoS.
2. The game uses the unique "EE2ENet" thread for multiplayer connections. This thread has its own 12 available slots for connected users.
3. The slots in the "EE2ENet" thread are not related to the slots we can see in the game. For example, if you host a multiplayer game and open game room slot number 10, the first connected player will still be in slot number 1 of the "EE2ENet" thread. In short: this bug didn't depend on the in-game slots, but the in-game slots were usually helpful in detecting whether the bug would occur. This is why multiplayer users asked the players in the last game slots to re-join the hosted game.
4. When a game started with an empty EE2ENet slot in the middle of other players, there was a mismatch in game functions that calculate the maximum ping from all players in game.
5. When a user left an already started game, the EE2ENet thread slot was filled with the player slot ID of "-1", which helped the game understand that this slot was unused but still existing. However, in rare cases (I don't know exactly why, possibly due to host migration) the slot was removed instead of getting the updated ID, and the same bug occurred as if the EE2ENet slot had been missing before the game started.
6. The game uses a function that determines the maximum ping among all players present in the game. Because of this, every player in the game has the same command lag as everyone else. When someone loses the connection, the game "freezes" and shows the "Waiting for..." message. This is exactly what was happening when the "missing slot" bug occurred, but on a much greater scale.
7. The maximum ping is the ping of the player with the highest delay. Let's say the maximum ping is "350". The game multiplies this number by "1.5", so the final command lag is "525" milliseconds. This multiplier was most likely added to allow slightly longer responses from connected players, which may be caused, for example, by higher CPU usage during the calculation of a game "tick".
8. The function that checks for the maximum ping was not working correctly, because its loop was bounded by the number of players in the game. Here is a simple example:
* The game was hosted with 3 slots. Each player joined in the order that matches the game slot.
* Player 2 has ping "50", Player 3 has ping "350".
* The player from slot number 2 leaves the game, but the host starts it anyway to play 1v1. The player from slot number 3 does not re-join.
* The game checks the maximum ping among 2 players. Since the player from the third slot has ENet slot ID 3, his ping is never checked.
* As a result, the game simulates the clock / ticks as if the ping were "7 - 20" (most likely a CPU speed dependent delay). It expects slot number 3 to send updated data every 20 ms, but the data arrives after 350 ms! As a result, the game pauses a few times per second. It doesn't display the "Waiting for..." message, because that message is shown only after 2 seconds (2000 ms) of missing data from another player.
9. How has it been fixed? I rewrote the function that was skipping the ping check for the last slot. Instead of the player-count check, I added a check which is "JUMP if SlotID == -1". Thanks to this, the game won't crash when someone quits a started game, while most of the remaining ENet thread slots will have their ping included in the maximum ping check. So the final solution is rather simple, but it took me over 30 hours to figure out what was wrong.
10. This fix required an update release, because it contains additional lines / bytes of executable code.
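
The example from point 8 and the fix from point 9 can be sketched in a few lines of Python. This is only an illustration under assumptions: the slot table layout (a list of `(player_slot_id, ping_ms)` pairs, with `None` for a missing slot) and all function names are hypothetical, and the real fix was made in the game's machine code.

```python
UNUSED = -1   # player slot ID written when someone quits a started game
SCALE = 1.5   # multiplier applied to the maximum ping (point 7)

def max_ping_buggy(slots, player_count):
    """Old behaviour (assumed model): only the first `player_count` ENet slots are checked."""
    checked = slots[:player_count]
    return max((s[1] for s in checked if s is not None and s[0] != UNUSED), default=0)

def max_ping_fixed(slots):
    """Fixed behaviour (assumed model): check every ENet slot, skipping SlotID == -1."""
    return max((s[1] for s in slots if s is not None and s[0] != UNUSED), default=0)

def command_lag(max_ping_ms):
    """Command lag is the maximum ping multiplied by the scale factor."""
    return max_ping_ms * SCALE

# Host in ENet slot 1, gap left by Player 2, Player 3 (ping 350) in ENet slot 3.
slots = [(1, 0), None, (3, 350)]

print(max_ping_buggy(slots, player_count=2))  # 0   -> Player 3 is never checked
print(max_ping_fixed(slots))                  # 350
print(command_lag(max_ping_fixed(slots)))     # 525.0
```

With the old loop the game believes the maximum ping is close to zero, which matches the short freezes described above. With the new check, the 350 ms player is found and the command lag becomes 525 ms.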

Keep-Alive BETA fixes after 2nd of February 2021

Keep-Alive fixes are applied to the game process by the Unofficial Patch 1.5 Launcher. They're usually executed on game start, but we also use the same method for features like: "Observers can chat with players in game". More details on how the Keep-Alive system works are in this topic: viewtopic.php?f=58&t=5202

* Changed the automatic servers list (list of hosted games) refresh time from 10 seconds to 5 seconds.

In the old GameSpy Lobby, games without free slots and games that had already started were not shown on the list of games. So players never knew who was in which game, or how long ago a game had started. With the new EE2.eu Lobby I tweaked the game status to always present itself as "open", and modified the maximum number of players to show the current number of players when the game status is "closed" or "closed playing". Thanks to this, all games are visible: full or started games in gray, open games in black.

Unfortunately, this nice improvement caused another problem: rooms were grayed out instantly, but returning to black when a new slot became available took up to 10 seconds (sometimes more). Impatient multiplayer users were re-logging to the Lobby to work around this bug.

The only solution for this issue was to reduce the automatic games list refresh time. I initially reduced it to 1 second, but that caused a bug with a blinking tooltip (when hovering the mouse over a game room to display details) and sent unnecessary requests to the server. Thankfully, I found another fix, so the auto-refresh time was reduced to 5 seconds.

* Repaired returning to the black color after clicking on the "Refresh List of Games" button.

As mentioned above, the main problem was that the color could be changed from gray to black only by the auto-refresh function. The Refresh button itself had a bug: the color update was executed only when the entry was about to be grayed out. With my fix, it now executes the color update function every time and pushes the correct status argument (EAX instead of 0), so the bug is repaired.
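
The difference between the old and the new Refresh button behaviour can be modelled roughly as below. It is a simplified sketch: the row dictionary, the status strings and the function names are made up for the illustration, and the real change was made in the game's assembly.

```python
def color_for(status):
    """Open games are shown in black, everything else in gray."""
    return "black" if status == "open" else "gray"

def on_refresh_buggy(row, status):
    """Old button handler (assumed model): recolors only when the row turns gray."""
    if color_for(status) == "gray":
        row["color"] = "gray"

def on_refresh_fixed(row, status):
    """New button handler (assumed model): always recolors using the real status."""
    row["color"] = color_for(status)

row = {"name": "Example game", "color": "gray"}  # the room was full a moment ago

on_refresh_buggy(row, "open")
print(row["color"])  # gray  -> stays gray although a slot is free again

on_refresh_fixed(row, "open")
print(row["color"])  # black
```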

The new Keep-Alive fixes are still being tested.

Why did I use the KeepAlive method instead of adding the fixes to Minor Update 158006?
1. I focused on this problem after the Minor Update 158006 release, so it was not fixed before that update came out.
2. KeepAlive fixes cannot change the number of bytes in the code. Existing bytes are overwritten, but nothing new can be added.
3. KeepAlive fixes can be disabled at any time (or tweaked, like the auto-refresh interval). I prefer this method because it doesn't require users to install updates all the time. Generally, the KeepAlive method is like an online service. It works like a server-controlled Cheat Engine script.
4. It allows features to be tested before they are included in UP1.5. Also, these are multiplayer fixes, so an Internet connection is required anyway. In this case, including the improvements directly in the game itself is just pointless and potentially dangerous for stability.
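
The "overwrite only, never add" rule from point 2 can be illustrated with a small Python sketch. The byte values, the offset and the `apply_patch` helper are invented for this example; it only shows the constraint that a patch must have exactly the same length as the bytes it replaces, not how the Launcher really writes to the game process.

```python
def apply_patch(image: bytearray, offset: int, old: bytes, new: bytes) -> None:
    """Overwrite `old` with `new` in place; refuse any patch that changes the size."""
    if len(old) != len(new):
        raise ValueError("patch must not change the number of bytes")
    if image[offset:offset + len(old)] != old:
        raise ValueError("unexpected bytes at offset, refusing to patch")
    image[offset:offset + len(new)] = new

# Made-up code fragment: two filler bytes, a 32-bit refresh interval (10), one more byte.
image = bytearray(b"\x90\x90" + (10).to_bytes(4, "little") + b"\xc3")
size_before = len(image)

# Change the interval from 10 to 5 seconds without moving anything else.
apply_patch(image, 2, (10).to_bytes(4, "little"), (5).to_bytes(4, "little"))

print(int.from_bytes(image[2:6], "little"))  # 5
print(len(image) == size_before)             # True
```

A fix that needs extra instructions (like the one in Minor Update 158006) does not fit this model, which is why it had to be shipped as an update.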

Minor Update 158007

The KeepAlive fixes mentioned above have been included in the patch after confirming that they don't cause any game problems.
In Update 158007, the multiplayer game slowdown bug fixes have been significantly improved. They now use UP15_GameHelper.dll, which also made it possible to print a DEBUG message when the fix is activated. This helps to understand when the game would otherwise have experienced problems. In the upcoming update 158008, those DEBUG messages will be visible to all players, not only the host.

In 158007, the same game slowdown has also been fixed for the case after host migration (host quit). It was in a similar location to the previous problem, but slightly different.
After host migration, player slots again mismatch the ENet thread slots of the new host, so the ping of certain players is not checked. The new fix detects this problem and forces a re-check based on the player slot ID instead of the ENet slot ID... And thankfully it fixes all the issues.


Please comment below if you have any additional questions or need further clarification.
Best regards,
Dr.MonaLisa
Ministry of Game Affairs
Department of Control and Complains

IndieRock00
Posts: 52
Joined: 16 Jun 2019, 17:38

Re: Multiplayer & Multiplayer Lobby improvements in Unofficial Patch 1.5 after February 2021

Post by IndieRock00 »

Great post, because as mentioned in point 7, the game multiplies the highest ping by 1.5?
Long life and prosperity. 🖖
Dr.MonaLisa
High Representative
Posts: 8041
Joined: 17 Jun 2010, 11:21
Location: Poland

Re: Multiplayer & Multiplayer Lobby improvements in Unofficial Patch 1.5 after February 2021

Post by Dr.MonaLisa »

IndieRock00 wrote: 03 Feb 2021, 13:35 Great post, because as mentioned in point 7, the game multiplies the highest ping by 1.5?
Yes, it does. I can reduce it for test purposes, but then "wilparra's lag" could be even worse than now.

Not sure if you remember, but I was experimenting with changing this value in a game I was observing. I wrote in chat that I was trying to reduce the lag. Maybe you weren't in that game.

Generally, I could also set this "mpRoundScaleFactor" to 0.5. Then, with ping 350, the command lag would be 175 ms. However, the game would then start "freezing" just as it did when the empty slot bug occurred. We would see units stop moving for a short time and then resume. That's because the game would expect a response from all connected players within 175 ms, but someone would send it with a 350 ms delay and cause an additional 175 ms "freeze". When the value is higher, we minimize the risk of these freezes: even if a ping jumps up by 50 ms, we still won't see a freeze, since the tolerance would be 350 * 1.5 = 525 ms.
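
The two cases above can be checked with a tiny calculation. This is a simplified model, assuming the freeze is just the part of the real delay that exceeds the command lag; the function names are made up, and only "mpRoundScaleFactor" comes from the game.

```python
def command_lag(max_ping_ms, scale_factor):
    """Time the game waits for everyone: maximum ping times mpRoundScaleFactor."""
    return max_ping_ms * scale_factor

def freeze_ms(actual_delay_ms, max_ping_ms, scale_factor):
    """Extra waiting time when a response arrives later than the command lag allows."""
    return max(0.0, actual_delay_ms - command_lag(max_ping_ms, scale_factor))

# Factor 0.5: the game expects data within 175 ms, but it arrives after 350 ms.
print(command_lag(350, 0.5))     # 175.0
print(freeze_ms(350, 350, 0.5))  # 175.0

# Factor 1.5: a ping jump from 350 to 400 ms still fits inside 525 ms.
print(command_lag(350, 1.5))     # 525.0
print(freeze_ms(400, 350, 1.5))  # 0.0
```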

Also, users with ancient processors (like Wilparra's 1 GHz) need more time to "simulate" a game tick on their computers. So there is a danger that reducing "mpRoundScaleFactor" would make wilparra slow down our games even more. But that's just my suspicion. It's possible that this is unrelated and that the hardware lag is accounted for by a different function.

Generally, this 1.5 "mpRoundScaleFactor" was set by the developers in 2004. Back then, processors were much slower than now. I think it would be safe to set it to just 1.1 or even 1.0. However, that would require long observation and testing.
One idea I had was to use player statistics from the Launcher (CPU name) and then calculate the optimal "mpRoundScaleFactor" for the players in the current game. Unfortunately, this would be a lot of unnecessary coding, and we don't even know whether the final result would be good enough. I would need to define the "power" of all available processors manually, which would take extremely long to finish and would require updates whenever new processors are released.

In my opinion this "mpRoundScaleFactor" doesn't make much sense, because it scales the maximum ping. For example, with a max ping of 400 we get 600 ms command lag, but with a max ping of 120 we get 180 ms command lag. 180 - 120 = 60, but 600 - 400 = 200, so the difference is 200 - 60 = 140. So when we connect to somebody in Australia, we get too high a "tolerance", and we feel an even greater command lag than we should. It's unlikely that pings change dynamically in 2021. In most cases it would be a difference of maybe +- 40 ms... The question is what would be more annoying: a bigger command lag, or potentially frequent unit freezes?
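
The same arithmetic as a short Python snippet, for anyone who wants to try other ping values. The helper name `tolerance_ms` is made up; it is simply the command lag minus the maximum ping for the default factor of 1.5.

```python
def tolerance_ms(max_ping_ms, scale_factor=1.5):
    """Extra margin on top of the maximum ping: command lag minus the ping itself."""
    return max_ping_ms * scale_factor - max_ping_ms

print(tolerance_ms(120))                      # 60.0
print(tolerance_ms(400))                      # 200.0
print(tolerance_ms(400) - tolerance_ms(120))  # 140.0
```

The margin grows with the ping, even though the expected ping fluctuation (around 40 ms) stays roughly the same.
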
Best regards,
Dr.MonaLisa
Ministry of Game Affairs
Department of Control and Complains