baikgd's blog

DRBD8 with two primaries on Debian etch (reposted)

2010-10-25 23:23:16 | Category: DRBD

The goal of this post is to set up an architecture of two nodes that can both have simultaneous read/write access to a synchronized partition. Each node accesses the synchronized partition like any ordinary partition; the only difference is that every modification is also forwarded to the other node.

DRBD8 configuration file

drbd.conf

System requirements

A 2.6.24 kernel at least:

[tux]# apt-get update
[tux]# apt-get install linux-image-2.6.24-etchnhalf.1-686
[tux]# apt-get install linux-headers-2.6.24-etchnhalf.1-686
[tux]# reboot
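
After the reboot, it may be worth confirming that the running kernel really meets the 2.6.24 minimum stated above. The following is only a sketch: `version_at_least` is a hypothetical helper written for this example, not part of any package, and it compares plain dotted numeric versions only.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 is at least version $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_at_least() {
    have=$1 want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        h=${have%%.*} w=${want%%.*}
        h=${h:-0} w=${w:-0}
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
        # Drop the field that was just compared.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
    done
    return 0
}

# Keep only the leading numeric part, e.g. "2.6.24-etchnhalf.1-686" -> "2.6.24".
running=$(uname -r | sed 's/[^0-9.].*//')

if version_at_least "$running" "2.6.24"; then
    echo "kernel $running is recent enough"
else
    echo "kernel $running is older than 2.6.24"
fi
```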

Configuration requirements

Adjust the hostname: the name in /etc/hostname must be the same as the one used in /etc/drbd.conf.
Every node has to be reachable by its hostname, so define every hostname in /etc/hosts:

[tux]# vi /etc/hosts
192.168.9.xx hostname1.domain.org hostname1
192.168.9.xx hostname2.domain.org hostname2
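
A mismatch between the node name and the names in drbd.conf is easy to overlook, so a quick check can help. This is a minimal sketch, assuming the configuration declares hosts with lines of the form `on <name> {` as in the fragment shown later in this post; `drbd_conf_hosts` and `host_declared` are hypothetical helper names, and `DRBD_CONF` is an illustrative override variable.

```shell
#!/bin/sh
# List the host names declared by "on <name> {" sections of a drbd.conf-style file.
drbd_conf_hosts() {
    sed -n 's/^[[:space:]]*on[[:space:]]\{1,\}\([^[:space:]{]\{1,\}\).*/\1/p' "$1"
}

# Succeed when host name $2 is declared in configuration file $1.
host_declared() {
    drbd_conf_hosts "$1" | grep -qxF -- "$2"
}

# Only report when a readable configuration file is actually present.
conf=${DRBD_CONF:-/etc/drbd.conf}
if [ -r "$conf" ]; then
    if host_declared "$conf" "$(uname -n)"; then
        echo "$(uname -n) is declared in $conf"
    else
        echo "$(uname -n) is NOT declared in $conf"
    fi
fi
```

The comparison uses `uname -n` because that is the node name the running system reports.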

Update your source list to be able to get packages from debian backports

Add the following line to your /etc/apt/sources.list:

deb http://www.backports.org/debian etch-backports main contrib non-free
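
If this step is scripted, it helps to avoid appending the same line twice when the script is run again. A small sketch, working on a scratch file rather than the real /etc/apt/sources.list; `add_line_once` is a hypothetical helper name.

```shell
#!/bin/sh
# Append line $2 to file $1 only if the file does not already contain it verbatim.
add_line_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demonstrate on a temporary file; point it at the real sources.list at your own discretion.
list=$(mktemp)
entry="deb http://www.backports.org/debian etch-backports main contrib non-free"
add_line_once "$list" "$entry"
add_line_once "$list" "$entry"   # second call is a no-op
cat "$list"
```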

If you are using etch and want apt to verify the downloaded backports packages, import the backports.org archive key into apt, then run apt-get update:

[tux]# apt-get install debian-backports-keyring
[tux]# apt-get update

Install DRBD8

Install DRBD8 source and DRBD8 utils:

[tux]# apt-get install drbd8-source drbd8-utils

Build DRBD8 module from previously installed sources using the module-assistant:

[tux]# module-assistant auto-install drbd8

Prepare the partition that will be synchronized thanks to DRBD

It is generally better to plan for the DRBD partition early and create a dedicated partition during operating system installation. Since my machines are hosted on VMware, I simply added a new virtual disk. We will now proceed to configure it. As usual, do the same on both nodes!

[tux]# fdisk -l

Disk /dev/sda: 6442 MB, 6442450944 bytes
255 heads, 63 sectors/track, 783 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         743     5968116   83  Linux
/dev/sda2             744         783      321300    5  Extended
/dev/sda5             744         783      321268+  82  Linux swap / Solaris

Disk /dev/sdb: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Let’s configure the newly added disk, /dev/sdb. First, create an extended partition that fills the whole disk, then a logical partition that fills the extended one:

[tux]# fdisk /dev/sdb

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
e
Partition number (1-4): 1
First cylinder (1-130, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-130, default 130):
Using default value 130

Command (m for help): n
Command action
   l   logical (5 or over)
   p   primary partition (1-4)
l
First cylinder (1-130, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-130, default 130):
Using default value 130

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

In our case, the logical partition that DRBD will synchronize is /dev/sdb5:

[tux]# fdisk -l /dev/sdb

Disk /dev/sdb: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         130     1044193+   5  Extended
/dev/sdb5               1         130     1044162   83  Linux

/dev/sdb5 is ready to be used by DRBD. Thanks to Mark's comment, we know that if there is no existing data on the device, we should not create a filesystem on the underlying device! It will be created later on the DRBD device itself.

Configure DRBD8

Adjust drbd.conf to your system, that is, specify your local partition as well as the correct hostnames and IP addresses, then copy it to the /etc directory.

on hostname1 {
  device /dev/drbd0;
  disk /dev/sdb5;
  address 192.168.9.xx:7789;
  meta-disk internal;
}
on hostname2 {
  device /dev/drbd0;
  disk /dev/sdb5;
  address 192.168.9.xx:7789;
  meta-disk internal;
}
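
The post shows only the two host sections of the configuration; the rest of the linked drbd.conf is not reproduced here. For orientation, the sketch below writes one plausible complete resource definition to a scratch file. The `resource r0` wrapper, `protocol C` and the `net { allow-two-primaries; }` section are assumptions on my part: the resource name and protocol are consistent with the status output in this post, and `allow-two-primaries` is the DRBD 8 net option for dual-primary mode, but the original file may differ. The `192.168.9.xx` addresses are the post's placeholders and must be replaced.

```shell
#!/bin/sh
# Write a sample resource definition to a scratch file, NOT to /etc/drbd.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
resource r0 {
  protocol C;                # synchronous replication (assumed, see lead-in)
  net {
    allow-two-primaries;     # assumed: needed for both nodes to be primary
  }
  on hostname1 {
    device    /dev/drbd0;
    disk      /dev/sdb5;
    address   192.168.9.xx:7789;   # placeholder from the post
    meta-disk internal;
  }
  on hostname2 {
    device    /dev/drbd0;
    disk      /dev/sdb5;
    address   192.168.9.xx:7789;   # placeholder from the post
    meta-disk internal;
  }
}
EOF
echo "sample written to $conf"
```

Review the file by hand and compare it with the drbd.conf you downloaded before copying anything into /etc.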

Create DRBD resource

First of all, create your resource. The resource name is the one you specified in your drbd.conf (here, r0). The DRBD partition must have the same size on both nodes to work correctly.
If there is no filesystem on /dev/sdb5, the following command should do the trick (on both nodes):

[tux]# drbdadm create-md r0
You want me to create a v08 style flexible-size internal meta data block.
There apears to be a v08 flexible-size internal meta data block
already in place on /dev/sdb5 at byte offset 106917888
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes

Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.

If an error like the following one occurs:

[tux]# drbdadm create-md r0
md_offset 1069215744
al_offset 1069182976
bm_offset 1069150208

Found ext3 filesystem which uses 1044160 kB
current configuration leaves usable 1044092 kB

Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
  * use external meta data (recommended)
  * shrink that filesystem first
  * zero out the device (destroy the filesystem)
Operation refused.

Command 'drbdmeta /dev/drbd0 v08 /dev/sdb5 internal create-md'
terminated with exit code 40
drbdadm aborting

It means there was some data on /dev/sdb5, or at least a filesystem. In this scenario there are two ways to solve the problem: shrink the existing filesystem, or back up and erase the existing data.

Shrinking the filesystem is the most dangerous option! Use it at your own risk. Many people have encountered ‘access beyond end of device’ errors when the device was almost full (thank you for your comment, Mark). If you still want to do it, use the "current configuration leaves usable" value reported by the previous command (1044092 kB) as the new filesystem size. Please run the same commands on each node.

[tux]# e2fsck -f /dev/sdb5 && resize2fs /dev/sdb5 1044092K
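
To avoid copying the size by hand, the value can be pulled out of saved `drbdadm create-md` output. This is a sketch only: `usable_kb` is a hypothetical helper, the sample text is taken from the error output shown above, and the resize command is printed rather than executed. In this particular output the usable size also equals `bm_offset` converted from bytes to kB, which the last line checks.

```shell
#!/bin/sh
# Extract the "usable" size in kB from saved drbdadm create-md output.
usable_kb() {
    sed -n 's/.*leaves usable \([0-9][0-9]*\) kB.*/\1/p' "$1"
}

# Sample lines copied from the error output in this post.
log=$(mktemp)
cat > "$log" <<'EOF'
bm_offset 1069150208
Found ext3 filesystem which uses 1044160 kB
current configuration leaves usable 1044092 kB
EOF

size=$(usable_kb "$log")
# Only print the command; running it is left to the reader.
echo "e2fsck -f /dev/sdb5 && resize2fs /dev/sdb5 ${size}K"
# bm_offset is given in bytes; dividing by 1024 yields the same kB figure here.
echo $((1069150208 / 1024))
```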

The second option is the safer one: back up the existing data to another device, then erase the current content of /dev/sdb5:

[tux]# dd if=/dev/zero of=/dev/sdb5

And then you can successfully create your resource on both machines:

[tux]# drbdadm create-md r0

Start DRBD8 and ensure it works correctly

You can start DRBD with its init script. Expect an error on the first start, because one node must first be forced into the primary state:

[hostname1]# /etc/init.d/drbd start
[hostname2]# /etc/init.d/drbd start

Or do it manually (attach and connect the resource r0):

[hostname1]# drbdadm attach r0
[hostname1]# drbdadm connect r0
[hostname2]# drbdadm attach r0
[hostname2]# drbdadm connect r0

In order to get consistent data, force the DRBD partition into the primary state on only one of the two nodes; this triggers the synchronization process. ("drbdadm -- --overwrite-data-of-peer primary r0" is equivalent to the following command.)

[hostname1]# drbdsetup /dev/drbd0 primary -o

Wait for the synchronization to finish.

[hostname1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by phil@fat-tyre, 2008-08-04 15:28:07
m:res  cs          st                 ds                     p  mounted  fstype
...    sync'ed:    20.8%              (830012/1044092)K
0:r0   SyncSource  Primary/Secondary  UpToDate/Inconsistent  C
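
Rather than re-running the status command by hand, the wait can be scripted. The sketch below assumes that the module exposes its state in /proc/drbd and that, as in DRBD 8.0, the connection state appears there as `cs:<State>` with `SyncSource` or `SyncTarget` during a resynchronization. The function names and the `POLL_SECONDS` variable are made up for this example.

```shell
#!/bin/sh
# Succeed while the given status file still shows a resynchronization in progress.
# Assumption: the connection state is printed as "cs:SyncSource" or "cs:SyncTarget".
sync_in_progress() {
    grep -q 'cs:Sync' "$1"
}

# Block until the status file (default: /proc/drbd) no longer reports a sync.
wait_for_sync() {
    status_file=${1:-/proc/drbd}
    while sync_in_progress "$status_file"; do
        sleep "${POLL_SECONDS:-5}"
    done
}
```

Calling `wait_for_sync` on the node that was made primary returns once no resource is reported as syncing any more; the check is deliberately coarse and does not distinguish between resources.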

When 100% is reached, you can put the secondary node into the primary state as well, in order to get two primaries:

[hostname2]# drbdadm primary r0

Verify that everything works fine:

[tux]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by phil@fat-tyre, 2008-08-04 15:28:07
m:res  cs         st               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C

Now it’s time to create your ext3 filesystem on the DRBD device (only if there was no existing data on /dev/sdb5):

[hostname1]# mke2fs -j /dev/drbd0

You can finally check if your system works correctly. First create a new directory:

[hostname1]# mkdir /synchronized
[hostname2]# mkdir /synchronized

Then mount the DRBD partition at this location. You should be able to access it read/write on both machines simultaneously.

[hostname1]# mount /dev/drbd0 /synchronized
[hostname2]# mount /dev/drbd0 /synchronized

If you are unable to mount your partition and get the following error message, make sure you have created a valid filesystem on the DRBD device as explained before.

[tux]# mount /dev/drbd0 /synchronized/
mount: you must specify the filesystem type

If you followed every step carefully, your system should be set up correctly and the DRBD status command should return the following state on both nodes. You can see that the DRBD device is mounted on /synchronized.

[hostname1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by phil@fat-tyre, 2008-08-04 15:28:07
m:res  cs         st               ds                 p  mounted        fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C  /synchronized  ext3

According to the DRBD specifications, our system and configuration are valid. We can try to create files from both nodes to see whether it works as it should.

[hostname1]# touch /synchronized/from_host1
[hostname2]# touch /synchronized/from_host2
[hostname1]# ls /synchronized
lost+found from_host1
[hostname2]# ls /synchronized
lost+found from_host2

As you can see, the system does not work as expected: neither node sees the file created on the other. I did not find a way to make it work properly with a standard ext3 filesystem. In general, you have to unmount the partition, put it into the secondary state, put it back into the primary state and mount it again to be sure to see the other node's modifications. The DRBD documentation specifies that, to use the software safely with two primaries, DRBD has to be coupled with a specific filesystem that ensures integrity. I will go further and assert that a two-primary DRBD setup cannot work properly without such a filesystem, even if there is no corruption risk.
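
The unmount, secondary, primary, mount cycle described above can be written down as a small script. This is a sketch under the names used in this post (resource r0, device /dev/drbd0, mount point /synchronized); `refresh_node` and the `RUN` switch are hypothetical. By default it only prints the commands, and it does nothing to make concurrent ext3 mounts safe.

```shell
#!/bin/sh
# Print (default) or execute the refresh sequence for one node.
# Set RUN=1 to actually execute; otherwise each command is only echoed.
refresh_node() {
    resource=$1 device=$2 mountpoint=$3
    for cmd in \
        "umount $mountpoint" \
        "drbdadm secondary $resource" \
        "drbdadm primary $resource" \
        "mount $device $mountpoint"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Unquoted on purpose so the string splits into command and arguments.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# Dry run with the names used in this post.
refresh_node r0 /dev/drbd0 /synchronized
```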

Other considerations

You have to keep in mind that there is no integrity guarantee between the two machines. We are therefore forced in any case to implement a lock mechanism to avoid data corruption. DRBD8 deals only with synchronization between nodes. To work around this drawback, the DRBD-synchronized partition must use a specific filesystem, GFS2 for example, if you want to see modifications from both nodes almost in real time. A future post will go into further detail and explain how to build a shared-nothing architecture between two nodes using DRBD8 + GFS2 on Debian etch. I can already say that this time it will work as expected, so do not become discouraged.
