Solving a bug: ZooKeeper cluster denial of service

January 6, 2022

## Preface

ZooKeeper serves as the registry for Dubbo, which makes it a top priority: any disturbance in the production ZK cluster puts everyone on edge. Recently I ran into a case where, after the production ZK leader went down, leader election kept failing and the whole ZK cluster refused service. I am writing this case up to share it (based on ZooKeeper 3.4.5).

## The bug scene

One morning a call came out of the blue: a physical machine hosting ZooKeeper had gone down, and the remaining machines all reported:

```
sh zkServer.sh status
it is probably not running
```

I checked the monitoring. The machine that had gone down was the ZK leader. In this 3-node ZK cluster, after the leader went down, the other two nodes never managed to elect a leader. Even after the crashed machine was urgently brought back up, the election still failed, so the ZK cluster refused service entirely!

![](https://static001.geekbang.org/infoq/82/829c6385ca6959c92b81278ed9e3c417.png)

## Business impact

If Dubbo cannot connect to ZK, it keeps using its cached invocation metadata, so request invocation is not really affected. The trouble is that applications cannot be restarted or released while ZK is refusing service. If an emergency had required a restart (release), the impact would have been significant. Fortunately, for high availability we had built peer data centers, so we calmly switched the traffic to data center B.

![](https://static001.geekbang.org/infoq/85/856adafca1fec5530f67b9b280602cf6.png)

Building dual data centers pays off: one-click switching! After the switch there was plenty of time to recover the cluster in data center A. With the recovery calmly under way, the author started the analysis.

## What the logs show

First, check the logs. There were a lot of client connection errors, which naturally get filtered out to avoid interference.

```
grep -v 'client xxx' zookeeper.out > /tmp/1.txt
```

The first thing I saw was the following log:

### ZK-A machine log

```
ZK-A machine:
2021-06-16 03:32:35 … New election. My id=3
2021-06-16 03:32:46 … QuorumPeer] LEADING // note: the election succeeded here
2021-06-16 03:32:46 … QuorumPeer] LEADING - LEADER ELECTION TOOK - 7878
2021-06-16 03:32:48 … QuorumPeer] Reading snapshot /xxx/snapshot.xxx
2021-06-16 03:32:54 … QuorumPeer] Snapshotting xxx to /xxx/snapshot.xxx
2021-06-16 03:33:08 … Follower sid ZK-B.IP
2021-06-16 03:33:08 … Unexpected exception causing shutdown while sock still open
java.io.EOFException
	at java.io.DataInputStream.readInt
	……
	at quorum.LearnerHandler.run
2021-06-16 03:33:08 ******* GOODBYE ZK-B.IP *******
2021-06-16 03:33:27 Shutting down
```

According to this log the election succeeded, but communication with the other machines then went wrong, which led to a shutdown and a new election.

### ZK-B machine log

```
2021-06-16 03:32:48 New election. My id=2
2021-06-16 03:32:48 QuorumPeer] FOLLOWING
2021-06-16 03:32:48 QuorumPeer] FOLLOWING - LEADER ELECTION TOOK - 222
2021-06-16 03:33:08.833 QuorumPeer] Exception when following the leader
java.net.SocketTimeoutException: Read time out
	at java.net.SocketInputStream.socketRead0
	……
	at org.apache.zookeeper.server.quorum.Follower.followLeader
2021-06-16 03:33:08.380 Shutting down
```

This log also shows that the election succeeded and that this node was in the FOLLOWING state. The leader simply never answered, so the node timed out and shut down.

## Sequence diagram

I drew the logs above as a sequence diagram to make the analysis easier:

![](https://static001.geekbang.org/infoq/9f/9f1f515cd140d8082019a1de077ba957.png)

The ZK-B log shows that after becoming a follower it kept waiting for the leader until the read timed out. The ZK-A log shows that after becoming LEADING it only received the packet sent by the follower ZK-B at 33:08,803, whereas ZK-B had already hit its read timeout at 33:08,301.

### First, the follower (ZK-B)

We know that ZK-B became a follower at 03:32:48 and reported the read timeout at 03:33:08, exactly 20s later. So the author went to the ZooKeeper source code to find out how long the read timeout is set to.

```
// Learner
protected void connectToLeader(InetSocketAddress addr) {
	……
	sock = new Socket();
	// self.tickTime = 2000, self.initLimit = 10
	sock.setSoTimeout(self.tickTime * self.initLimit);
	……
}
```

The read timeout is derived from these configuration items in zoo.cfg:

```
# self.tickTime
tickTime=2000
# self.initLimit
initLimit=10
syncLimit=5
```

Clearly, after ZK-B became a follower, the leader for some reason only responded after 20s. So the next step is to analyze the leader.

### Analyzing the leader (ZK-A)

First, let's look at the leader's initialization logic:

```
quorumPeer
	|-> print LEADING
	|-> makeLeader
		|-> new ServerSocket, listen and bind
	|-> leader.lead()
		|-> print LEADER ELECTION TOOK
		|-> loadData
			|-> loadDataBase
				|-> restore, prints "Reading snapshot"
			|-> takeSnapshot
				|-> save, prints "Snapshotting"
		|-> cnxAcceptor, starts accepting and processing requests
```

As you can see, between opening the listening port and actually processing requests, ZK also reads a snapshot ("Reading snapshot") and writes one ("Snapshotting"). The log shows that the read took a little over 6s and the write a little over 14s, which adds up to a 20s processing gap, as shown below:

![](https://static001.geekbang.org/infoq/77/7703295c4313f4900c0570cbc5b76173.png)

Because the leader only starts processing data 20s after the socket starts listening, the connection that ZK-B established successfully is still sitting in the kernel's TCP full-connection queue (backlog). From the kernel's point of view the three-way handshake has completed, so it can receive the data sent by the follower ZK-B normally and keep it in the buffer. After 20s, when ZK-A actually gets around to processing, it takes the data ZK-B sent 20s earlier out of the buffer, and when it finishes and sends the reply it finds that ZK-B has already closed the connection. The other follower failed for the same reason (by this time we had brought the crashed machine back up, so there were 3 machines). The leader, getting no response from the other machines, concluded that it did not have the support of more than 1/2 of the cluster, shut down, and started a new election.

## Why the snapshot takes so long

So what makes reading and writing the snapshot so slow? The author looked at the snapshot file size: it was close to 1 GB.

## Increase initLimit

In this case we only need to increase initLimit, and we should be able to get past this hurdle.

```
# zoo.cfg
# Leave tickTime alone: it is tied to ZK's heartbeat mechanism
tickTime=2000
# Raised straight to 100, i.e. 200s!
initLimit=100
```

## Is 20s just a coincidence?

Was it just a coincidence that every election round got stuck right at the 20s limit? With the election repeating this many times, there should be another condition involved. The leader has a 1/2 (majority) check that reports an error and exits:

```
// Leader.lead() (excerpt): fewer than 1/2 of the followers are synced, report an error and exit
if (!tickSkip && !self.getQuorumVerifier().containsQuorum(syncedSet)) {
	shutdown("Only " + syncedSet.size() + " followers, need " + (self.getVotingView().size() / 2));
	return;
}
```

The essence of this error is that the set of nodes synchronized with the leader, syncedSet, is smaller than the required 1/2 of the cluster, so the leader shuts down. The code also shows that membership in syncedSet is decided by LearnerHandler.synced(). Let's keep reading the code:

```
// LearnerHandler
public boolean synced() {
	// isAlive here is the thread's isAlive
	return isAlive() && tickOfLastAck >= leader.self.tick - leader.self.syncLimit;
}
```

Clearly, the follower is treated as out of sync once its last ack is older than leader.self.syncLimit ticks, that is 5 * 2 = 10s.

```
# zoo.cfg
tickTime=2000
syncLimit=5
```

So how does this tick (tickOfLastAck) get updated? The answer: once the follower has responded to the UPTODATE packet, that is, once it has finished synchronizing with the leader, every packet from the follower updates it. Before that it is not updated.

![](https://static001.geekbang.org/infoq/4b/4bfe5cdfe4eb2b57a917f27e43528770.png)

Reasoning further: our follower took more than 10s to process the leader's packets, so the tick was not updated in time, syncedSet fell below the required size, and the leader shut down.

![](https://static001.geekbang.org/infoq/1a/1a61fcdba4e9694d74d836efac82ca48.png)

### The follower (ZK-B), second case

With this conclusion in hand, the author went back through the follower (ZK-B) log (note: ZK-C behaves the same way).

```
2021-06-16 03:38:24 New election. My id = 3
2021-06-16 03:38:24 FOLLOWING
2021-06-16 03:38:24 FOLLOWING - LEADER ELECTION TOOK - 8004
2021-06-16 03:38:42 Getting a diff from the leader
2021-06-16 03:38:42 Snapshotting
2021-06-16 03:38:57 Snapshotting
2021-06-16 03:39:12 Got zxid xxx
2021-06-16 03:39:12 Exception when following the leader
java.net.SocketException: Broken pipe
```

Snapshot again! This time we can see that each snapshot takes about 15s, far more than syncLimit. From the source code we can see that every snapshot is immediately followed by a writePacket (that is, a response), but the first response is not the one sent for the UPTODATE packet, so it does not update the tick on the leader's side:

```
// Learner
protected void syncWithLeader(…) {
	outerloop:
	while (self.isRunning()) {
		readPacket(qp);
		switch (qp.getType()) {
		case Leader.UPTODATE:
			if (!snapshotTaken) {
				zk.takeSnapshot();
				……
			}
			break outerloop;
		case Leader.NEWLEADER:
			zk.takeSnapshot();
			……
			writePacket(……); // the leader updates its tick when it receives this
			break;
		}
	}
	……
	writePacket(ack, true); // the leader updates its tick when it receives this
}
```

Note that the ZK-B log prints Snapshotting twice. As for why twice, it is probably a subtle bug (the official 3.4.5 release notes list it as fixed, but it was still printed twice here). I did not dig into it. The whole sequence diagram looks like this:

![](https://static001.geekbang.org/infoq/5c/5c23c2e2c179be0949b0462bd3e4c900.png)

So the second case fails too. This time the timing is not merely borderline: the sync takes close to 30s to finish, while syncedSet only allows 10s (2 * 5). In both cases the ZK cluster kept re-electing until someone intervened.

## Increase syncLimit

In this case we only need to increase syncLimit, and we should be able to get past this hurdle.

```
# zoo.cfg
# Leave tickTime alone: it is tied to ZK's heartbeat mechanism
tickTime=2000
# Raised straight to 50, i.e. 100s!
syncLimit=50
```

## Offline reproduction

Of course, analysis alone is not enough. We also need to reproduce the problem and verify our conclusions. We built an offline ZooKeeper with a snapshot of about 1 GB (1024 MB) to test with. With initLimit=10 and syncLimit=5, exactly the same two phenomena appeared. Then the author adjusted the parameters:

```
# zoo.cfg
tickTime=2000
# 200s
initLimit=100
# 100s
syncLimit=50
```

The ZooKeeper cluster finally came up normally.

## Trying to reproduce offline with the newer version 3.4.13

We also tried to reproduce the problem offline with the newer version 3.4.13. Without any parameter changes, ZooKeeper quickly elected a leader and served requests normally. I went through the source code and found that takeSnapshot had simply been removed from the Leader.lead() phase and from the syncWithLeader phase (when a DIFF is used). This prevents snapshot handling from taking so long that the cluster cannot serve requests.

```
// ZooKeeper 3.4.13

// ZooKeeperServer.java
public void loadData() {
	…
	// takeSnapshot();  the takeSnapshot call on the last line was removed
}

// Learner.java
protected void syncWithLeader(…) {
	boolean snapshotNeeded = true;
	if (qp.getType() == Leader.DIFF) {
		……
		snapshotNeeded = false;
	}
	……
	if (snapshotNeeded) {
		zk.takeSnapshot();
	}
	……
}
```

Upgrading to a newer version really is the reliable option. This version also fixes the confusing log output!

## Why does Dubbo's ZK hold so much data?

The last question is why the ZK that Dubbo depends on holds so much data! I used ZK's

```
org.apache.zookeeper.server.SnapshotFormatter
```

tool to dump the snapshot and aggregated the output with some shell (awk, uniq). Dubbo accounted for only 1/4 of the total. 1/2 belonged to Solar's ZooKeeper usage (already migrated away; this was what it had left behind). The remaining 1/4 came from a bug in one system's distributed lock, which kept writing nodes and never deleted them (they have been asked to fix it). So separating dubbo-zk from the ZK data of other systems is really important! Abuse can lead to major incidents!

## Summary

ZooKeeper is an important metadata management system, and its failure to serve can have an immeasurable impact. Thanks to the dual data centers, we had enough time and a calm state of mind to deal with this problem. Also, although ZK's election is complicated, as long as you settle down and analyze it slowly, you can always find clues and then a breakthrough!
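The timeout arithmetic that runs through the analysis above can be condensed into a small check. This is only a sketch under stated assumptions: the class `ZkTimeoutCheck` and its methods are hypothetical helpers, not part of ZooKeeper; the durations are the approximate values read off the logs in this article (about 20s for the leader to read and write its snapshot, close to 30s for the follower sync); and treating a step as safe only when it finishes strictly inside the limit is a simplification of the real socket and tick logic.

```java
public class ZkTimeoutCheck {
    // Read timeout a follower uses while connecting to the leader:
    // tickTime * initLimit, as in the connectToLeader excerpt quoted in the article.
    static long initTimeoutMs(int tickTimeMs, int initLimit) {
        return (long) tickTimeMs * initLimit;
    }

    // Window in which the leader still counts a follower as synced:
    // syncLimit ticks, i.e. tickTime * syncLimit.
    static long syncWindowMs(int tickTimeMs, int syncLimit) {
        return (long) tickTimeMs * syncLimit;
    }

    // Assumption: a step is safe only if it finishes strictly inside the limit.
    static boolean fits(long observedMs, long limitMs) {
        return observedMs < limitMs;
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;
        long leaderLoadMs = 6_000 + 14_000; // snapshot read + write on the leader
        long followerSyncMs = 30_000;       // roughly two 15s snapshots on the follower

        System.out.println("leader load fits initLimit=10:  "
                + fits(leaderLoadMs, initTimeoutMs(tickTimeMs, 10)));
        System.out.println("leader load fits initLimit=100: "
                + fits(leaderLoadMs, initTimeoutMs(tickTimeMs, 100)));
        System.out.println("follower sync fits syncLimit=5:  "
                + fits(followerSyncMs, syncWindowMs(tickTimeMs, 5)));
        System.out.println("follower sync fits syncLimit=50: "
                + fits(followerSyncMs, syncWindowMs(tickTimeMs, 50)));
    }
}
```

Plugging in measured snapshot times for your own cluster gives a quick indication of whether the configured initLimit and syncLimit leave any headroom. It does not replace reproducing the election offline as described above.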
