It's been both the home page and a game in the past. This time it was the home page.
webDiplomacy.net new server
Re: webDiplomacy.net new server
-
- Gold Donator
- Posts: 420
- Joined: Tue Aug 21, 2018 7:36 am
- Contact:
Re: webDiplomacy.net new server
So I'm seeing Timed Out Connections, and or Connection Rejected errors since the switch to the new server much more frequently than before. Typically about 1-2 times an hour I'm unable to access the site for 10-15 minutes. I'm also seeing a lot more unexpected Draws in Bot Games, typically during a phase where only the bots have orders to input, seems to happen in close proximity to connectivity issues as well.
-
- Posts: 24
- Joined: Wed Jan 03, 2018 3:17 am
- Contact:
Re: webDiplomacy.net new server
It became impossible to play live games...
https://webdiplomacy.net/board.php?gameID=364826
was broken up by connection issues today (site would not respond and then we both got dinged)
https://webdiplomacy.net/board.php?gameID=364826
was broken up by connection issues today (site would not respond and then we both got dinged)
Re: webDiplomacy.net new server
Since about 3:00pm Eastern (it's now 4:39pm) the site has been accessible for about 30 of the last 100 minutes, split into several even shorter spans. And I just had another Bot game draw for no orders entered. The site is significantly less functional than prior to the server migration.
Re: webDiplomacy.net new server
Dang it.. Thanks for reporting these issues guys, we're collecting the error logs and will sort out the deadlocking issues
Re: webDiplomacy.net new server
Just so you know, I'm still seeing frequent outages. Enough to raise my blood pressure when playing bot games. LOL
Re: webDiplomacy.net new server
Is it perhaps just simply a routing issue?
I tried doing routing tests, but the results are quite weird. It seems like the packets are bouncing back and forth for quite some time in roughly the same area.
I tried doing routing tests, but the results are quite weird. It seems like the packets are bouncing back and forth for quite some time in roughly the same area.
Re: webDiplomacy.net new server
Addendum:
I used https://www.uptrends.com/tools/traceroute
For example, especially from Europe (Berlin, Amsterdam) the traces either contain time-outs or take almost 2000 ms in total.
When I try to do a trace from my own pc (for those that want to do this: in cmd type [tracert webdiplomacy.net] without []) I had a timeout as well.
Hope this helps.
I used https://www.uptrends.com/tools/traceroute
For example, especially from Europe (Berlin, Amsterdam) the traces either contain time-outs or take almost 2000 ms in total.
When I try to do a trace from my own pc (for those that want to do this: in cmd type [tracert webdiplomacy.net] without []) I had a timeout as well.
Hope this helps.
Re: webDiplomacy.net new server
Thanks for coming up with some ideas Tjappator, but we're pretty confident this is due to locking issues, and we're fairly confident that is because the query optimizer has changed a lot since the system was written and there may be some queries that weren't designed with locking in mind, and we're getting deadlock errors as a result.
I've got four days off next week to look into this, and I'm expecting it to be resolved this time next week. (Fingers crossed)
I've got four days off next week to look into this, and I'm expecting it to be resolved this time next week. (Fingers crossed)
-
- Gold Donator
- Posts: 56
- Joined: Sat Feb 29, 2020 9:08 pm
- Contact:
Re: webDiplomacy.net new server
No, I think it is Fairbot becoming sentient. Be careful next week, @kestasjk.we're pretty confident this is due to locking issues
Re: webDiplomacy.net new server
Yeah, I'm sorry. It appears my analysis was wrong.kestasjk wrote: ↑Fri Jul 02, 2021 8:02 amThanks for coming up with some ideas Tjappator, but we're pretty confident this is due to locking issues, and we're fairly confident that is because the query optimizer has changed a lot since the system was written and there may be some queries that weren't designed with locking in mind, and we're getting deadlock errors as a result.
I've got four days off next week to look into this, and I'm expecting it to be resolved this time next week. (Fingers crossed)
I'm a embedded software engineer myself (C/C++ mainly), so I know how annoying debugging can be at times and I wish you good luck with it :)
Re: webDiplomacy.net new server
Interesting and promising results from analyzing the error logs to extract all deadlocking queries:
These are the queries that are locking up when these deadlocks occur, so they don't necessarily point to where the issue is coming from but they do point to the tables involved and likely problems, a few good leads. Definitely it shows there are too many places game processing is occurring, map.php and board.php are requiring locks they probably shouldn't need, the timeouts are way too long when waiting for timeout.
So far I've changed it so game processing can only happen from one place, and I've changed the query that looks up forum times so that it doesn't link to the forum if using a third party DB.
I'm also trialing a new service: https://status.webdiplomacy.net , which will keep us better informed of any issues via alerts, and can be checked to see how well we're doing.
Code: Select all
A)
98x : sql => UPDATE wD_Members SET timeLoggedIn = 1624155999 WHERE id = 1680167 file => /var/www/webdiplomacy.net/public_html/board.php
B)
105x : sql => SELECT g.id, g.variantID, LOWER(HEX(g.password)) as password, g.turn, g.phase, g.processTime, g.name, g.gameOver, g.attempts, g.pot, g.potType, g.phaseMinutes, g.nextPhaseMinutes, g.phaseSwitchPeriod, g.processStatus, g.pauseTimeRemaining, g.minimumBet, g.anon, g.pressType, g.missingPlayerPolicy, g.drawType, g.minimumReliabilityRating, g.excusedMissedTurns, g.playerTypes, g.startTime FROM wD_Games g WHERE g.id=103892 FOR UPDATE file => /var/www/webdiplomacy.net/public_html/map.php
C)
3x: sql => SELECT g.id, g.variantID, LOWER(HEX(g.password)) as password, g.turn, g.phase, g.processTime, g.name, g.gameOver, g.attempts, g.pot, g.potType, g.phaseMinutes, g.nextPhaseMinutes, g.phaseSwitchPeriod, g.processStatus, g.pauseTimeRemaining, g.minimumBet, g.anon, g.pressType, g.missingPlayerPolicy, g.drawType, g.minimumReliabilityRating, g.excusedMissedTurns, g.playerTypes, g.startTime FROM wD_Games g WHERE g.id=364883 FOR UPDATE User => resource => Trace: file => /var/www/webdiplomacy.net/public_html/ajax.php
3x: sql => SELECT g.id, g.variantID, LOWER(HEX(g.password)) as password, g.turn, g.phase, g.processTime, g.name, g.gameOver, g.attempts, g.pot, g.potType, g.phaseMinutes, g.nextPhaseMinutes, g.phaseSwitchPeriod, g.processStatus, g.pauseTimeRemaining, g.minimumBet, g.anon, g.pressType, g.missingPlayerPolicy, g.drawType, g.minimumReliabilityRating, g.excusedMissedTurns, g.playerTypes, g.startTime FROM wD_Games g WHERE g.id=365381 FOR UPDATE file => /var/www/webdiplomacy.net/public_html/board.php
D)
4x : sql => SELECT m.id AS id, m.userID AS userID, m.gameID AS gameID, m.countryID AS countryID, m.status AS status, m.orderStatus AS orderStatus, m.bet AS bet, m.missedPhases as missedPhases, m.timeLoggedIn as timeLoggedIn, m.newMessagesFrom AS newMessagesFrom, m.votes AS votes, m.supplyCenterNo as supplyCenterNo, m.unitNo as unitNo, m.excusedMissedTurns as excusedMissedTurns, u.username AS username, u.points AS points, m.pointsWon as pointsWon, IF(s.userID IS NULL,0,1) as online, u.type as userType FROM wD_Members m INNER JOIN wD_Users u ON ( m.userID = u.id ) LEFT JOIN wD_Sessions s ON ( u.id = s.userID ) WHERE m.gameID = 359461 ORDER BY m.status ASC, m.supplyCenterNo DESC, m.countryID ASC FOR UPDATE file => /var/www/webdiplomacy.net/public_html/gamemaster.php
3x : sql => SELECT m.id AS id, m.userID AS userID, m.gameID AS gameID, m.countryID AS countryID, m.status AS status, m.orderStatus AS orderStatus, m.bet AS bet, m.missedPhases as missedPhases, m.timeLoggedIn as timeLoggedIn, m.newMessagesFrom AS newMessagesFrom, m.votes AS votes, m.supplyCenterNo as supplyCenterNo, m.unitNo as unitNo, m.excusedMissedTurns as excusedMissedTurns, u.username AS username, u.points AS points, m.pointsWon as pointsWon, IF(s.userID IS NULL,0,1) as online, u.type as userType FROM wD_Members m INNER JOIN wD_Users u ON ( m.userID = u.id ) LEFT JOIN wD_Sessions s ON ( u.id = s.userID ) WHERE m.gameID = 364551 ORDER BY m.status ASC, m.supplyCenterNo DESC, u.points DESC FOR UPDATE file => /var/www/webdiplomacy.net/public_html/board.php
1x : sql => SELECT m.id AS id, m.userID AS userID, m.gameID AS gameID, m.countryID AS countryID, m.status AS status, m.orderStatus AS orderStatus, m.bet AS bet, m.missedPhases as missedPhases, m.timeLoggedIn as timeLoggedIn, m.newMessagesFrom AS newMessagesFrom, m.votes AS votes, m.supplyCenterNo as supplyCenterNo, m.unitNo as unitNo, m.excusedMissedTurns as excusedMissedTurns, u.username AS username, u.points AS points, m.pointsWon as pointsWon, IF(s.userID IS NULL,0,1) as online, u.type as userType FROM wD_Members m INNER JOIN wD_Users u ON ( m.userID = u.id ) LEFT JOIN wD_Sessions s ON ( u.id = s.userID ) WHERE m.gameID = 365391 ORDER BY m.status ASC, m.supplyCenterNo DESC, u.points DESC FOR UPDATE file => /var/www/webdiplomacy.net/public_html/gamemaster.php
1x : sql => SELECT m.id AS id, m.userID AS userID, m.gameID AS gameID, m.countryID AS countryID, m.status AS status, m.orderStatus AS orderStatus, m.bet AS bet, m.missedPhases as missedPhases, m.timeLoggedIn as timeLoggedIn, m.newMessagesFrom AS newMessagesFrom, m.votes AS votes, m.supplyCenterNo as supplyCenterNo, m.unitNo as unitNo, m.excusedMissedTurns as excusedMissedTurns, u.username AS username, u.points AS points, m.pointsWon as pointsWon, IF(s.userID IS NULL,0,1) as online, u.type as userType FROM wD_Members m INNER JOIN wD_Users u ON ( m.userID = u.id ) LEFT JOIN wD_Sessions s ON ( u.id = s.userID ) WHERE m.gameID = 362565 ORDER BY m.status ASC, m.supplyCenterNo DESC, m.countryID ASC FOR UPDATE file => /var/www/webdiplomacy.net/public_html/map.php
E)
8x: sql => UPDATE wD_Games SET attempts=attempts+1 WHERE id=364284 file => /var/www/webdiplomacy.net/public_html/board.php
5x: sql => UPDATE wD_Games SET attempts=attempts+1 WHERE id=354566 file => /var/www/webdiplomacy.net/public_html/gamemaster.php
5x: sql => UPDATE wD_Games SET attempts=attempts+1 WHERE id=364793 User => Trace: file => /var/www/webdiplomacy.net/public_html/ajax.php
F)
1x: FROM wD_Members WHERE gameID = 365549 AND countryID=2 FOR UPDATE User => Trace: file => /var/www/webdiplomacy.net/public_html/ajax.php
1x: FROM wD_Members WHERE gameID = 365656 AND countryID=6 FOR UPDATE file => /var/www/webdiplomacy.net/public_html/board.php
G)
3x : sql => UPDATE wD_Misc SET value = 311943569 WHERE name = 'Hits' file => /var/www/webdiplomacy.net/public_html/gamemaster.php
1x : sql => UPDATE wD_Misc SET value = 144 WHERE name = 'ErrorLogs' file => /var/www/webdiplomacy.net/public_html/gamemaster.php
1x : sql => UPDATE wD_Misc SET value = 1624270235 WHERE name = 'LastProcessTime' file => /var/www/webdiplomacy.net/public_html/gamemaster.php
H)
4x : sql => UPDATE wD_Users SET timeLastSessionEnded = 1624211262, lastMessageIDViewed = (SELECT MAX(f.id) FROM wD_ForumMessages f) WHERE id IN (133701, 131126, 114734, 44860, 129540, 43473, 98756, 130817, 17272, 114189, 132775, 95704, 44661, 133942, 133876, 118741, 25560, 132519, 133933, 114232, 49505, 104098, 128803, 125810, 125879, 125974, 133957, 117790, 33239, 23113, 107946, 126452, 133894, 132907) file => /var/www/webdiplomacy.net/public_html/gamemaster.php
So far I've changed it so game processing can only happen from one place, and I've changed the query that looks up forum times so that it doesn't link to the forum if using a third party DB.
I'm also trialing a new service: https://status.webdiplomacy.net , which will keep us better informed of any issues via alerts, and can be checked to see how well we're doing.
Re: webDiplomacy.net new server
I don't know if it's expected or not, but I have a bot game going (https://www.webdiplomacy.net/board.php?gameID=366255) that doesn't seem to want to process even after several minutes of waiting after readying my moves.
Re: webDiplomacy.net new server
Hmm.. not expected but that could well be due to changes made to reduce simultaneous game processing. How many minutes was it delayed for? It looks like it has moved on now, so did it eventually move on because of the votes or did it move on due to timing out?
The current processor is running on a 30-second cycle; so if you had a 30 second delay that would be understandable, but if it was definitely over a minute that is odd.
The current processor is running on a 30-second cycle; so if you had a 30 second delay that would be understandable, but if it was definitely over a minute that is odd.
Re: webDiplomacy.net new server
Still right where I left it, Spring 1904, and it has been a few hours now. No worries anout the game itself, just want to provide info if there's a potential issue.kestasjk wrote: ↑Mon Jul 05, 2021 11:01 pmHmm.. not expected but that could well be due to changes made to reduce simultaneous game processing. How many minutes was it delayed for? It looks like it has moved on now, so did it eventually move on because of the votes or did it move on due to timing out?
The current processor is running on a 30-second cycle; so if you had a 30 second delay that would be understandable, but if it was definitely over a minute that is odd.
Re: webDiplomacy.net new server
Also, out of curiosity, I started another bot game (https://www.webdiplomacy.net/board.php?gameID=366352) and gave it a couple of minutes... it says 7 players have joined and the game will start on the next processing cycle.
-
- Posts: 1
- Joined: Tue Jun 22, 2021 3:16 am
- Contact:
Re: webDiplomacy.net new server
I too have a bot game going and despite readying up, has not processed in the past 4 hours.
https://webdiplomacy.net/board.php?game ... #gamePanel
https://webdiplomacy.net/board.php?game ... #gamePanel
Re: webDiplomacy.net new server
Thanks guys good feedback. I thought the main gamemaster could recognize vote and order complete conditions but it looks like it doesn't. I've made a change so that the board can trigger a process by setting the process time to now, so that it'll then get picked up by the gamemaster within a few moments.
Let me know how it goes.
I'm expecting once we've really nailed all the deadlocking issues we'll be able to carefully turn some of this back on so that games process instantly again. (and there have been some other discoveries like the bots have been using more resources than all the other users on the site combined due to the rapid-fire requests)
edit: Definitely making progress, finally caught a deadlock in the act:
and it was because a bot triggered a game to get processed at the same time as the gamemaster via a path I hadn't stopped properly
Let me know how it goes.
I'm expecting once we've really nailed all the deadlocking issues we'll be able to carefully turn some of this back on so that games process instantly again. (and there have been some other discoveries like the bots have been using more resources than all the other users on the site combined due to the rapid-fire requests)
edit: Definitely making progress, finally caught a deadlock in the act:
Code: Select all
------------------------
LATEST DETECTED DEADLOCK
------------------------
2021-07-06 04:25:02 0x7f3b681c6700
*** (1) TRANSACTION:
TRANSACTION 4827537, ACTIVE 0 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 93956, OS thread handle 139892861830912, query id 3225261 localhost webdiplomacy Creating sort index
SELECT m.id AS id,
\t\t\t\tm.userID AS userID,
\t\t\t\tm.gameID AS gameID,
\t\t\t\tm.countryID AS countryID,
\t\t\t\tm.status AS status,
\t\t\t\tm.orderStatus AS orderStatus,
\t\t\t\tm.bet AS bet,
\t\t\t\tm.missedPhases as missedPhases,
\t\t\t\tm.timeLoggedIn as timeLoggedIn,
\t\t\t\tm.newMessagesFrom AS newMessagesFrom,
\t\t\t\tm.votes AS votes,
\t\t\t\tm.supplyCenterNo as supplyCenterNo,
\t\t\t\tm.unitNo as unitNo,
\t\t\t\tm.excusedMissedTurns as excusedMissedTurns,
\t\t\t\tu.username AS username,
\t\t\t\tu.points AS points,
\t\t\t\tm.pointsWon as pointsWon,
\t\t\t\tIF(s.userID IS NULL,0,1) as online,
\t\t\t\tu.type as userType
\t\t\tFROM wD_Members m
\t\t\tINNER JOIN wD_Users u ON ( m.userID = u.id )
\t\t\tLEFT JOIN wD_Sessions s ON ( u.id = s.userID )
\t\t\tWHERE m.gameID = 366356
\t\t\tORDER BY m.status ASC, m.supplyCenterNo DESC, m.countryID ASC FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1039 page no 8416 n bits 824 index gid of table `webdiplomacy`.`wD_Members` trx id 4827537 lock_mode X locks rec but not gap waiting
Record lock, heap no 741 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d69d; asc ;;
*** (2) TRANSACTION:
TRANSACTION 4827535, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1128, 16 row lock(s), undo log entries 9
MySQL thread id 93955, OS thread handle 139893126489856, query id 3225262 localhost webdiplomacy Updating
UPDATE wD_Games SET attempts=0 WHERE id=366356
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1039 page no 8416 n bits 824 index gid of table `webdiplomacy`.`wD_Members` trx id 4827535 lock_mode X locks rec but not gap
Record lock, heap no 739 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d6a0; asc ;;
Record lock, heap no 740 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d69f; asc ;;
Record lock, heap no 741 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d69d; asc ;;
Record lock, heap no 744 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d6a1; asc ;;
Record lock, heap no 745 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 3; hex 19d69e; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1038 page no 4167 n bits 96 index PRIMARY of table `webdiplomacy`.`wD_Games` trx id 4827535 lock_mode X locks rec but not gap waiting
Record lock, heap no 28 PHYSICAL RECORD: n_fields 31; compact format; info bits 0
0: len 3; hex 059714; asc ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
3: len 1; hex 09; asc ;;
4: len 2; hex 0001; asc ;;
5: len 1; hex 03; asc ;;
6: len 4; hex 60e3dfce; asc ` ;;
7: len 3; hex 800019; asc ;;
12: len 1; hex 01; asc ;;
13: SQL NULL;
14: SQL NULL;
15: len 2; hex 0014; asc ;;
16: len 4; hex 000005a0; asc ;;
17: len 4; hex 7fffffff; asc ;;
18: len 1; hex 01; asc ;;
19: len 1; hex 01; asc ;;
20: len 2; hex 0002; asc ;;
21: len 1; hex 01; asc ;;
22: len 3; hex 000000; asc ;;
23: len 1; hex 01; asc ;;
24: len 1; hex 00; asc ;;
25: len 1; hex 01; asc ;;
26: len 4; hex 00000004; asc ;;
27: SQL NULL;
28: len 1; hex 01; asc ;;
29: len 4; hex 60e3cd0e; asc ` ;;
30: len 4; hex 80000000; asc ;;
*** WE ROLL BACK TRANSACTION (1)
Who is online
Users browsing this forum: No registered users