ABXZone Computer Forums



06-15-2008, 03:09 PM   #1
ephekt
Creating near-enterprise level storage at home with ZFS

What is ZFS?
Quote:
ZFS is a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually a pleasure to use.

ZFS presents a pooled storage model that completely eliminates the concept of volumes and the associated problems of partitions, provisioning, wasted bandwidth and stranded storage. Thousands of file systems can draw from a common storage pool, each one consuming only as much space as it actually needs. The combined I/O bandwidth of all devices in the pool is available to all filesystems at all times.

All operations are copy-on-write transactions, so the on-disk state is always valid. There is no need to fsck(1M) a ZFS file system, ever. Every block is checksummed to prevent silent data corruption, and the data is self-healing in replicated (mirrored or RAID) configurations. If one copy is damaged, ZFS detects it and uses another copy to repair it.

ZFS introduces a new data replication model called RAID-Z. It is similar to RAID-5 but uses variable stripe width to eliminate the RAID-5 write hole (stripe corruption due to loss of power between data and parity updates). All RAID-Z writes are full-stripe writes. There's no read-modify-write tax, no write hole, and — the best part — no need for NVRAM in hardware. ZFS loves cheap disks.

But cheap disks can fail, so ZFS provides disk scrubbing. Like ECC memory scrubbing, the idea is to read all data to detect latent errors while they're still correctable. A scrub traverses the entire storage pool to read every copy of every block, validate it against its 256-bit checksum, and repair it if necessary. All this happens while the storage pool is live and in use.

ZFS has a pipelined I/O engine, similar in concept to CPU pipelines. The pipeline operates on I/O dependency graphs and provides scoreboarding, priority, deadline scheduling, out-of-order issue and I/O aggregation. I/O loads that bring other file systems to their knees are handled with ease by the ZFS I/O pipeline.

ZFS provides unlimited constant-time snapshots and clones. A snapshot is a read-only point-in-time copy of a filesystem, while a clone is a writable copy of a snapshot. Clones provide an extremely space-efficient way to store many copies of mostly-shared data such as workspaces, software installations, and diskless clients.

ZFS backup and restore are powered by snapshots. Any snapshot can generate a full backup, and any pair of snapshots can generate an incremental backup. Incremental backups are so efficient that they can be used for remote replication — e.g. to transmit an incremental update every 10 seconds.

There are no arbitrary limits in ZFS. You can have as many files as you want; full 64-bit file offsets; unlimited links, directory entries, snapshots, and so on.

ZFS provides built-in compression. In addition to reducing space usage by 2-3x, compression also reduces the amount of I/O by 2-3x. For this reason, enabling compression actually makes some workloads go faster.

In addition to file systems, ZFS storage pools can provide volumes for applications that need raw-device semantics. ZFS volumes can be used as swap devices, for example. And if you enable compression on a swap volume, you now have compressed virtual memory.

ZFS administration is both simple and powerful. Please see the zpool(1M) and zfs(1M) man pages for more information — and be sure to check out the Getting Started section for a whirlwind tour.

ZFS is already quite snappy on most workloads — and we're just getting started.
ZFS Storage pools
Quote:
Unlike traditional file systems, which reside on single devices and thus require a volume manager to use more than one device, ZFS filesystems are built on top of virtual storage pools called zpools. A zpool is constructed of virtual devices (vdevs), which are themselves constructed of block devices: files, hard drive partitions, or entire drives, with the last being the recommended usage.[6] Block devices within a vdev may be configured in different ways, depending on needs and space available: non-redundantly (similar to RAID 0), as a mirror (RAID 1) of two or more devices, as a RAID-Z group of three or more devices, or as a RAID-Z2 group of four or more devices.[7] The storage capacity of all vdevs is available to all of the file system instances in the zpool.

A quota can be set to limit the amount of space a file system instance can occupy, and a reservation can be set to guarantee that space will be available to a file system instance.



Creating the array

Before I get into creating the array, I'd like to point out that only one command is required to create both the pool and file system. I'm adding extra commands here to help you understand what's going on behind the scenes. Where pertinent, I will assume no prior knowledge of UNIX.


Before I can create the array, I need to get the disk names with the format command. Below you can see the five 750GB drives listed as disks 0-4, with the boot drive at disk 5. (In case you're a UNIX noob, the naming scheme is as follows: c = controller ID, d = disk ID [LUN target].)

Code:
justin@spice:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5d0 <DEFAULT cyl 45598 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
       1. c5d1 <DEFAULT cyl 45598 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@0/cmdk@1,0
       2. c6d0 <DEFAULT cyl 45598 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
       3. c6d1 <DEFAULT cyl 45598 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,2/ide@1/cmdk@1,0
       4. c7d0 <DEFAULT cyl 45598 alt 2 hd 255 sec 126>
          /pci@0,0/pci-ide@1f,5/ide@0/cmdk@0,0
       5. c8d0 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1f,5/ide@1/cmdk@0,0
Specify disk (enter its number): ^C
justin@spice:~#
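To make the naming scheme concrete, here is a minimal sketch that splits a name such as c5d0 into its controller and disk numbers. The function names parse_disk and is_number are my own, and the sketch only handles the plain cNdM form seen in this listing, not names that also carry target or slice parts.

```shell
#!/bin/sh
# Succeeds only when the argument is a non-empty string of digits.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
}

# Split a disk name of the form cNdM into controller and disk numbers.
# Assumption: only the cNdM form from the listing above is supported.
parse_disk() {
    name=$1
    case $name in
        c*d*) ;;
        *) echo "unrecognized disk name: $name" >&2; return 1 ;;
    esac
    rest=${name#c}            # drop the leading "c"
    controller=${rest%%d*}    # everything before the first "d"
    disk=${rest#*d}           # everything after the first "d"
    if is_number "$controller" && is_number "$disk"; then
        echo "controller=$controller disk=$disk"
    else
        echo "unrecognized disk name: $name" >&2
        return 1
    fi
}

parse_disk c5d0
parse_disk c8d0
```

So c5d0 and c5d1 are two disks on the same controller, while the boot drive c8d0 sits on a controller of its own.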
Now I create a zpool named raid with raidz redundancy (striping+parity, basically RAID5) with the disks from above. (This is the one command required to create the redundant pool and zfs file system.)

Code:
justin@spice:~# zpool create raid raidz c5d0 c5d1 c6d0 c6d1 c7d0
Then check the status:
Code:
justin@spice:~# zpool status raid
  pool: raid
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c5d0    ONLINE       0     0     0
            c5d1    ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c6d1    ONLINE       0     0     0
            c7d0    ONLINE       0     0     0

errors: No known data errors
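If you want to keep an eye on that status from a script, something along these lines might do. It is only a sketch: pool_state and check_pool are names I made up, and it relies on nothing more than the "state:" line that zpool status prints.

```shell
#!/bin/sh
# Read `zpool status` text on stdin and print the value of the "state:" line.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Succeeds only when the reported state is ONLINE, so a cron job can flag anything else.
check_pool() {
    state=$(pool_state)
    [ "$state" = "ONLINE" ]
}

# Usage on the server (not run here):
#   zpool status raid | check_pool || echo "raid needs attention"
```

Reading the text from stdin keeps the parsing separate from the zpool call, which makes it easy to try against saved output first.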
We can check the available space in the pool like so.
Code:
justin@spice:/# zpool list raid
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
raid  3.41T   1.5M  3.41T   0%  ONLINE
I can now check the file system ZFS created for me earlier with the zfs list command.
Code:
justin@spice:/raid# zfs list raid
NAME   USED  AVAIL  REFER  MOUNTPOINT
raid   292K  2.68T  28.8K  /raid
Pretty easy, right? The RAID pool and file system are already set up, and it only took about 2 minutes. Now I just need to install CIFS and set up some shares.
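The two commands report different numbers because zpool list counts the raw space of all five disks, parity included, while zfs list shows what is left for data. A rough sketch of the arithmetic, assuming each drive holds exactly 750 x 10^9 bytes (gb_to_tib is a helper name of my own):

```shell
#!/bin/sh
# Convert a vendor capacity in decimal GB (10^9 bytes) to binary TiB (2^40 bytes).
gb_to_tib() {
    awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb * 1e9 / (1024 ^ 4) }'
}

disks=5        # drives in the raidz group, from the zpool status output
size_gb=750    # nominal size of each drive (assumed exact)
parity=1       # single-parity raidz spends one disk's worth of space on parity

echo "raw:    $(gb_to_tib $((disks * size_gb))) TiB"
echo "usable: $(gb_to_tib $(((disks - parity) * size_gb))) TiB"
```

The raw figure comes out at 3.41, matching zpool list. The usable figure comes out at 2.73, a little above the 2.68T that zfs list reports; I'd guess the difference is space ZFS keeps for its own bookkeeping.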



Sharing the file system

I added the CIFS/SMB packages with the package manager. You might need to reboot after this.
Code:
justin@spice:/raid# pkg install SUNWsmbs
justin@spice:/raid# pkg install SUNWsmbskr
Then I started the CIFS server:
Code:
justin@spice:/# svcadm enable -r smb/server
Next I created the share filesystem. (The casesensitivity property can only be set when a file system is created.)
Code:
justin@spice:/# zfs create -o casesensitivity=mixed raid
Then I shared that filesystem out via SMB and checked its status.
Code:
justin@spice:/# zfs set sharesmb=on raid
justin@spice:/# sharemgr show -vp
default nfs=()
zfs
    zfs/raid smb=()
          raid=/raid
Next I joined the workgroup:
Code:
justin@spice:/# smbadm join -w workgroup
Successfully joined workgroup 'workgroup'
Then I just needed to edit /etc/pam.conf and add this line at the end of the config.
Code:
other password required pam_smb_passwd.so.1 nowarn
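If you'd rather script that edit than open an editor, here is one way it could look. The helper name ensure_line is made up, and the demo runs against a scratch file rather than the real /etc/pam.conf.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; on the server the target would be /etc/pam.conf.
conf=$(mktemp)
ensure_line "$conf" "other password required pam_smb_passwd.so.1 nowarn"
ensure_line "$conf" "other password required pam_smb_passwd.so.1 nowarn"
cat "$conf"    # the second call found the line and added nothing
rm -f "$conf"
```

Checking for the exact line first means you can rerun the setup without piling up duplicate entries.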


Save that, then generate a CIFS password for the account you wish to use by running the passwd command. This is as simple as changing the account's password.
Code:
justin@spice:~# passwd
passwd: Changing password for justin
New Password:
Next I opened up the permissions on the share (777 grants read, write and execute to everyone) to prevent any access issues.
Code:
justin@spice:/# chmod 777 /raid
Now all you have to do is map the network drive in Windows.
Code:
C:\Users\justin>net use r: \\server\share /user:server\username password /p:y
The command completed successfully.
Now the files can be accessed from Windows.



Snapshots

Creating a snapshot is as easy as running:
Code:
zfs snapshot raid@snapshot
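Since a snapshot name is just dataset@label, putting the date in the label is a handy way to keep several around. A small sketch (snapshot_name is a hypothetical helper, not part of ZFS):

```shell
#!/bin/sh
# Build a snapshot name of the form <dataset>@<YYYY-MM-DD>.
# The date defaults to today but can be passed in explicitly.
snapshot_name() {
    dataset=$1
    day=${2:-$(date +%Y-%m-%d)}
    echo "${dataset}@${day}"
}

snapshot_name raid 2008-06-15

# On the server this could feed the real command (not run here):
#   zfs snapshot "$(snapshot_name raid)"
```

Date-stamped labels also sort chronologically, which helps when picking one to roll back to.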
Snapshots are stored in the .zfs directory. Here you can see the snapshot in action:
Code:
justin@spice:~# ls /raid
2k3.txt Docs Backup catalog.txt MOVIES Music Screenshot.png TV
justin@spice:~# ls /raid/.zfs/snapshot/snapshot/
2k3.txt Docs Backup catalog.txt MOVIES Music Screenshot.png TV
justin@spice:~#
Snapshots only consume space for data that has changed since the snapshot was created. Below you can see that my snapshot is only taking up 34.6MB.
Code:
justin@spice:~# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
raid            941G  1.76T   941G  /raid
raid@snapshot  34.6M      -   941G  -

Restoring from a specific snapshot is as easy as using the rollback feature.
Code:
zfs rollback raid@snapshot
One caveat is that rolling back to an older snapshot requires the -r option, which destroys all newer snapshots. You can get around this by simply copying the snapshot's contents back onto the array.
Code:
cp -rp /raid/.zfs/snapshot/snapshot/. /raid/
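Often you only want one file or folder back rather than the whole snapshot. A sketch of how that could be wrapped up, with restore_from_snapshot being my own name and the arguments being mountpoint, snapshot label and a path relative to the mountpoint:

```shell
#!/bin/sh
# Copy one file or directory out of a snapshot back into the live file system.
# Arguments: mountpoint, snapshot label (the part after @), path relative to the mountpoint.
restore_from_snapshot() {
    mountpoint=$1
    snap=$2
    relpath=$3
    src="$mountpoint/.zfs/snapshot/$snap/$relpath"
    if [ ! -e "$src" ]; then
        echo "not in snapshot: $src" >&2
        return 1
    fi
    # Copy into the parent directory of the live path:
    # -r handles directories, -p keeps timestamps and permissions.
    cp -rp "$src" "$(dirname "$mountpoint/$relpath")/"
}

# On the server (not run here):
#   restore_from_snapshot /raid snapshot 2k3.txt
```

This overwrites the live copy with the snapshot's version and leaves every snapshot intact, unlike a rollback with -r.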
Destroying a snapshot can be achieved with the destroy command.
Code:
zfs destroy raid@snapshot



Moving a pool

Moving a pool is as easy as exporting it on one system (you have to move the disks as well, obviously):
Code:
zpool export pool_name
And importing it on the new one:
Code:
zpool import pool_name
The export is really just a management step and can be skipped if needed; in that case the import on the new system has to be forced with -f. All the export does is write a flag to the pool and remove it from zpool.cache.

06-15-2008, 04:22 PM   #2
Gorganzola

Nice work. I especially like the fact that you added some extra detail to give us a bit more knowledge of what was going on.

What kind of hardware would you expect something like this to run properly on?
__________________
TTFN.

I wasn't asleep at the switch, I was drunk. -- Homer J. Simpson

Q. How many dull people does it take to change a lightbulb?

A. One.

A very useful tool on these forums:

You can Meebo in public.

(Offline)   Reply With Quote

06-15-2008, 05:14 PM   #3
ephekt

Quote:
Originally Posted by Gorganzola
What kind of hardware would you expect something like this to run properly on?
Normal commodity hardware. Currently OpenSolaris's hardware support isn't as good as Linux's, but you can check compatibility here or with the liveCD installer. Most common chipsets for NIC, SATA and graphics are supported, though.

Here's the hardware list for the system I made this tutorial with. My server (this one was for a friend) runs on an Intel board with Intel server NIC... Intel stuff is very well supported.

Samsung Spinpoint 750GB x5
Supermicro 5 bay hot-swap SATA backplane
DFI INFINITY P965-S motherboard
Intel Core 2 Duo E4500 Allendale 2.2GHz
Nvidia 6200

06-15-2008, 10:42 PM   #4
Gorganzola

That's newer than what I was intending.

ASUS PC-DL i875p
2x 2.66GHz Xeon
3x 320GB SATA
3x 250GB IDE

The 3x250GB drives are already running on a Promise SX4000 RAID 5 controller with 256MB of memory.

I'd like to centralize my storage and put the machine in a corner and forget about it. The SATA backplane piques my interest.

06-16-2008, 11:09 AM   #5
ephekt

There's actually a better chance of older hardware being supported. That being (I believe) an Intel board, there's a pretty good chance the chipsets are supported even if the board officially isn't. The Sun HCL site and OpenSolaris forums would be a good place to check.

You wouldn't need the hardware RAID functionality of the card, as ZFS does an excellent job at softRAID (far better than md+LVM imo), but you would probably want to check compatibility for the controller since you presumably intend to use it for the ports. If you wish to stick with hardware RAID, then ZFS probably isn't for you and you should honestly just pick your favorite OS for this task.

RAM is another issue since ZFS uses a lot of cache, but for the pool size you'll get out of those disks, 1GB is probably more than enough.

With that setup, you could create two RAID-Z vdevs (3x320GB and 3x250GB) and add them to the pool for a total of about 1TB of storage. Even though the data would be on two physical stripe sets, it would appear as a single 'volume' with the total amount of space being accessible to all file systems created on the pool.
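Here's roughly where the 1TB figure comes from, as a sketch. pool_usable_gb is a name I made up, and it assumes single-parity RAID-Z, where each vdev gives up one disk's worth of space to parity.

```shell
#!/bin/sh
# Sum the usable space of single-parity raidz vdevs.
# Each argument is COUNTxSIZE_GB, e.g. 3x320 for three 320GB disks.
pool_usable_gb() {
    total=0
    for vdev in "$@"; do
        count=${vdev%x*}    # number of disks in the vdev
        size=${vdev#*x}     # size of each disk in GB
        total=$((total + (count - 1) * size))   # one disk per vdev goes to parity
    done
    echo "$total"
}

pool_usable_gb 3x320 3x250

# Both vdevs can go into one pool in a single command (placeholder names, not run here):
#   zpool create <pool> raidz <disk1> <disk2> <disk3> raidz <disk4> <disk5> <disk6>
```

That works out to 640GB from the SATA drives plus 500GB from the IDE drives, so 1140GB before any file system overhead.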

That backplane is a very cool little device. I love them. I'm not sure what kind of offerings there are for 6 drives though, as those are generally built to consume 3 5.25" bays.