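When I previously invoked the GFSH shutdown command on the GemFire cluster in my development environment, I still had to wait for the remaining GemFire cache server members when starting it up again. Why?
I thought that when I invoke the GFSH shutdown command, all members are running and all online disk stores are synchronized before shutting down, so they all hold the most recent copy of the data. Therefore, every cache server member should have the latest records.
Setup
For example, say I invoke the GFSH shutdown command on my GemFire cluster and then shut down all the machines. Then I start 2 locators and 3 cache servers. Will it wait for the remaining cache servers?
Addendum
In the cache server log: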
[info ...19:27:02.327...... <main> tid=0x1] Created oplog#9 drf for disk store pdxMetaDataStore
[info ...19:27:02.327...... <main> tid=0x1] Created oplog#9 crf for disk store pdxMetaDataStore
[info ...19:27:02.327...... <main> tid=0x1] Deleted oplog#8 crf for disk store pdxMetaDataStore
[info ...19:27:02.327...... <main> tid=0x1] Deleted oplog#8 drf for disk store pdxMetaDataStore
[info ...19:27:02.329...... <main> tid=0x1] recovery region initialization took 17 ms
[info ...19:27:02.355...... <main> tid=0x1] Initializing region PdxTypes
[info ...19:31:31.509...... <unicast-receiver,gf-1> receive new view: View[148.88.88.100....
.....
[info ...19:31:31.514...... Admitting member...
[info ...19:31:31.514...... Region PdxTypes requesting initial image from 148.88.88.100...
[info ...19:31:31.514...... PdxTypes is done getting image from 148.88.88.100.
Initialization of region PdxTypes completes only once another cache server has started.
server-cache.xml
<disk-store name="pdxMetaDataStore" compaction-threshold="40" auto-compact="false" allow-force-compaction="true" max-oplog-size="75" queue-size="10000" time-interval="15" write-buffer-size="65535">
<disk-dirs>
<disk-dir dir-size="3000">/gemfire/store</disk-dir>
</disk-dirs>
</disk-store>
<pdx read-serialized="true" disk-store-name="pdxMetaDataStore" persistent="true"/>
GFSH
Disk Store ID | Host | Directory
--------------------------------------------------------------------
66asdf-asdf-asdf-asdf-asdfafadfasfC | 148.88.88.100 | /gemfire/store
Thread dump snippet
"Asynchronous disk writer for region pdxMetaDataStore" #55 daemon prio=5 os_prio=0 tid=0x0007ffcess nid=0x225d in Object.wait() [0x001....]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:460)
at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.waitUntilFlushIsReady(DiskStoreImpl.java:1647)
- Locked <0x00000123123> (a java.lang.Object)
at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.doAsyncFlush(DiskStoreImpl.java:1706)
at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1696)
"main" #1 prio=5........ in Object.wait() [0x......]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.apache.geode.internal.cache.persistence.MembershipChangeListener.waitForChange(MembershipChangeListener.java:62)
- Locked (0x....) (a org.apache.geode.internal.cache.persistence.MembershipChangeListener)
at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.waitForMembershipChangeForMissingDiskStores(PersistenceInitialImageAdvisor.java:218)
at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.getAdvice(PersistenceInitialImageAdvisor.java:118)
at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.getInitialImageAdvice(PersistenceAdvisorImpl.java:835)
at org.apache.geode.internal.cache.persistence.CreatePersistentRegionProcessor.getInitialImageAdvice(CreatePersistentRegionProcessor.java:52)
at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1196)
at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1076)
......
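I just saw the updated description and log excerpts. Those messages are common in a GemFire cluster whenever a member leaves or joins the distributed system (they come from an internal synchronization mechanism). They do not mean that the server is waiting for other servers to come up before it finishes initializing. If that were the case, you would see something like the following: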
Region /MyRegion has potentially stale data.
It is waiting for another member to recover the latest data.
My persistent id:
DiskStore ID: 6893751ee74d4fbd-b4780d844e6d5ce7
Name: server1
Location: /192.0.2.0:/home/dsmith/server1/.
Members with potentially new data:
[
DiskStore ID: 160d415538c44ab0-9f7d97bae0a2f8de
Name: server2
Location: /192.0.2.0:/home/dsmith/server2/.
]
Use the "gfsh show missing-disk-stores" command to see all disk stores
that are being waited on by other members.
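To tell this message apart from ordinary membership-view chatter when going through a long server log, a small script can help. The following is only a minimal Python sketch: it looks for the waiting message quoted above and extracts the disk store IDs listed under "Members with potentially new data". The function name `find_awaited_disk_stores` and the regular expression are my own assumptions based on the message layout shown above. They are not part of GemFire or gfsh, and the sketch only inspects the first such message in the text.

```python
import re

# Matches lines such as "DiskStore ID: 160d415538c44ab0-9f7d97bae0a2f8de".
# The pattern is an assumption derived from the sample message, not a GemFire API.
DISK_STORE_ID = re.compile(r"DiskStore ID:\s*([0-9a-fA-F-]+)")
WAITING_MARKER = "waiting for another member to recover the latest data"
NEWER_DATA_MARKER = "Members with potentially new data:"


def find_awaited_disk_stores(log_text):
    """Return disk store IDs listed under 'Members with potentially new data:'.

    Returns an empty list when the waiting message is not present at all.
    """
    if WAITING_MARKER not in log_text:
        return []
    # Look only at the text after the marker, so the member's own
    # persistent id (printed earlier in the message) is not included.
    _, _, tail = log_text.partition(NEWER_DATA_MARKER)
    # The list of members is closed by the first "]" after the marker.
    block, _, _ = tail.partition("]")
    return DISK_STORE_ID.findall(block)
```

If this returns an empty list for your server log, the server is probably not blocked on a missing disk store. You can cross-check with the `gfsh show missing-disk-stores` command mentioned in the message.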
You can take a look at this article for more information about the get-initial-image sequence.
Cheers.