In preparation for moving the most recent Wikipedia Traffic data to a public dataset, I moved the contents of my S3 storage bucket to EBS. First, though, I had to create the EBS volume and copy the S3 files to it. Here are some notes on creating, partitioning, formatting and mounting an Amazon EBS volume.
Overview
* create the EBS volume
* attach the EBS volume to an instance
* validate attachment through dmesg, system logs
* partition the volume
* format the partition
* create a mount point
* mount
* copy files to the new volume
Detail
1) create the EBS volume
Make sure you create the EBS volume in the same Availability Zone as your Amazon EC2 instance; a volume can only be attached to an instance in its own zone.
2) attach the EBS volume to an instance
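I did not record the commands for steps 1 and 2, so here is only a rough sketch of how they could look with the AWS CLI. The AWS CLI itself is an assumption (any console or tool works), and the instance ID, volume ID, zone and size are hypothetical placeholders. The script only prints the commands unless DRY_RUN is set to 0.

```shell
# Sketch only: assumes the AWS CLI is installed and configured.
# Every value below is a hypothetical placeholder, not taken from the post.
INSTANCE_ID="i-0123456789abcdef0"
VOLUME_ID="vol-0123456789abcdef0"   # in practice, taken from the create-volume output
ZONE="us-east-1a"                   # must match the instance's Availability Zone
SIZE_GIB=150
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run aws ec2 create-volume --availability-zone "$ZONE" --size "$SIZE_GIB"
run aws ec2 attach-volume --volume-id "$VOLUME_ID" --instance-id "$INSTANCE_ID" --device /dev/sdf
```

The device name /dev/sdf matches the one that shows up in the kernel log in the next step.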
3) review the system message log to verify it got attached
ip-10-32-69-206:~ # dmesg | tail
[ 1.073737] kjournald starting. Commit interval 15 seconds
[ 1.097264] EXT3-fs (sda2): using internal journal
[ 1.097280] EXT3-fs (sda2): mounted filesystem with ordered data mode
[ 1.125686] JBD: barrier-based sync failed on sda1 - disabling barriers
[ 83.353564] sdf: unknown partition table
The "unknown partition table" message shows that sdf is attached but is still raw storage, so we'll need to partition the volume.
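If the kernel log is long, filtering it for the device name is easier than reading the tail. A minimal sketch follows; the helper name device_lines is my own, and the sample text is copied from the dmesg output above.

```shell
# Print kernel log lines that mention a given block device (e.g. "sdf").
# Reads log text on stdin, so it works with dmesg or a saved log file.
device_lines() {
    grep -E "(^|[[:space:]])$1:"
}

# Sample lines copied from the dmesg output above.
sample='[ 1.097264] EXT3-fs (sda2): using internal journal
[ 83.353564] sdf: unknown partition table'

printf '%s\n' "$sample" | device_lines sdf
# On the instance itself: dmesg | device_lines sdf
```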
4) partition the EBS volume
ip-10-32-69-206:~ # fdisk /dev/sdf
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-19581, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-19581, default 19581):
Using default value 19581
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
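The same fdisk session can be scripted by piping the answers in, which helps if you do this often. This is a sketch, not something I ran for this post: the function name fdisk_answers is hypothetical, and prompts can differ between fdisk versions, so check the sequence against your own interactive session first.

```shell
# Emit the answers typed in the interactive session above, one per line:
# n (new), p (primary), 1 (partition number), two empty lines to accept the
# default first and last cylinder, then w (write the table and exit).
fdisk_answers() {
    printf 'n\np\n1\n\n\nw\n'
}

# Show the keystrokes without touching any disk.
fdisk_answers | sed 's/^$/<enter>/'

# To apply for real (destructive, run as root):
#   fdisk_answers | fdisk /dev/sdf
```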
And of course, format!
5) format the newly created partition
ip-10-32-69-206:~ # mkfs.ext3 /dev/sdf1
mke2fs 1.41.11 (14-Mar-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
9830400 inodes, 39321087 blocks
1966054 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
1200 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 34 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
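A quick sanity check on the mkfs numbers, using plain shell arithmetic. All inputs are copied from the output above; only the variable names are mine.

```shell
blocks=39321087          # "39321087 blocks"
block_size=4096          # "Block size=4096"
groups=1200              # "1200 block groups"
inodes_per_group=8192    # "8192 inodes per group"

size_mib=$(( blocks * block_size / 1024 / 1024 ))   # filesystem size in MiB
reserved=$(( blocks * 5 / 100 ))                    # 5% reserved for the super user
inodes=$(( groups * inodes_per_group ))             # total inode count

echo "size: ${size_mib} MiB"
echo "reserved blocks: ${reserved}"
echo "inodes: ${inodes}"
```

The size comes out at 153597 MiB, just under 150 GiB, and the reserved block and inode counts agree with the figures mkfs printed (1966054 and 9830400).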
6) create a mount point for the partition
ip-10-32-69-206:~ # mkdir /mnt/data
7) mount the partition
ip-10-32-69-206:~ # mount -t ext3 /dev/sdf1 /mnt/data
Good to go for copy!
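Before starting a four-hour copy it is worth confirming the mount really took. Here is a small sketch that checks a mount table in /proc/mounts format; the helper name is_mounted_on is my own and the sample line is a hypothetical illustration, not output captured from the instance.

```shell
# Succeed if the given device is mounted on the given directory.
# Reads a mount table in /proc/mounts format on stdin.
is_mounted_on() {
    awk -v dev="$1" -v dir="$2" '$1 == dev && $2 == dir { found = 1 } END { exit !found }'
}

# Hypothetical sample line, for illustration only.
sample='/dev/sdf1 /mnt/data ext3 rw,relatime 0 0'

if printf '%s\n' "$sample" | is_mounted_on /dev/sdf1 /mnt/data; then
    echo "mounted"
fi
# On the instance itself: is_mounted_on /dev/sdf1 /mnt/data < /proc/mounts
```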
8) copy the S3 storage bucket data over to the EBS volume
ip-10-32-69-206:/data/wikistats # s3cmd get s3://sodotrendingtopics/wikistats/* /data
s3://sodotrendingtopics/wikistats/pagecounts-20110101-000000.gz -> ./pagecounts-20110101-000000.gz [1 of 2161]
..
The 2161 files took about four hours to copy over from S3. Amazon does not charge for data transfer within its network between S3 and EBS, in either direction, as long as both are in the same region.
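With that many files, a quick count afterwards confirms nothing was skipped. This is a sketch: the function name count_pagecounts is hypothetical, and you should pass whichever directory the files actually landed in (it defaults to the current directory).

```shell
# Count the pagecounts archives directly inside a directory.
count_pagecounts() {
    find "$1" -maxdepth 1 -type f -name 'pagecounts-*.gz' | wc -l | tr -d ' '
}

expected=2161         # total reported by s3cmd above ("1 of 2161")
target="${1:-.}"      # directory to check, current directory by default

if [ -d "$target" ]; then
    found=$(count_pagecounts "$target")
    echo "found ${found} of ${expected} files in ${target}"
fi
```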